Unknown Key-Share Attacks on Uses of TLS with the Session Description Protocol (SDP)

Mozilla, mt@lowentropy.net
Mozilla, ekr@rtfm.com

Keywords: Unknown Key-Share Attack, SDP, DTLS-SRTP, WebRTC, SIP identity

This document describes unknown key-share attacks on the use of
Datagram Transport Layer Security for the Secure Real-Time Transport
Protocol (DTLS-SRTP). Similar attacks are described on the use of
DTLS-SRTP with the identity bindings used in Web Real-Time
Communications (WebRTC) and SIP identity. These attacks are difficult
to mount, but they cause a victim to be misled about the identity of a
communicating peer. This document defines mitigation techniques that
implementations of RFC 8122 are encouraged to deploy.

Introduction

The use of Transport Layer Security (TLS) with the Session Description Protocol (SDP) is defined in . Further use with Datagram Transport Layer Security
(DTLS) and the Secure
Real-Time Transport Protocol (SRTP) is defined as DTLS-SRTP .

In these specifications, key agreement is performed using TLS or DTLS, with
authentication being tied back to the session description (or SDP) through the
use of certificate fingerprints. Communication peers check that a hash, or
fingerprint, provided in the SDP matches the certificate that is used in the TLS
or DTLS handshake.

WebRTC identity (see )
and SIP identity both provide a mechanism that binds an
external identity to the certificate fingerprints from a session description.
However, this binding is not integrity protected and is therefore vulnerable to an
identity misbinding attack, also known as an unknown key-share (UKS) attack, where the
attacker binds their identity to the fingerprint of another entity. A
successful attack leads to the creation of sessions where peers are confused
about the identity of the participants.

This document describes a TLS extension that can be used in combination with
these identity bindings to prevent this attack.

A similar attack is possible with the use of certificate fingerprints alone.
Though attacks in this setting are likely infeasible in existing
deployments due to the narrow preconditions
(see ), this document also
describes mitigations for this attack.

The mechanisms defined in this document are intended to strengthen the protocol
by preventing the use of unknown key-share attacks in combination with other protocol
or implementation vulnerabilities. RFC 8122 is updated by this
document to recommend the use of these mechanisms.

This document assumes that signaling is integrity protected. However, as
explains, many deployments that use SDP do not
guarantee integrity of session signaling and so are vulnerable to other attacks.
offers key continuity mechanisms as a potential means of
reducing exposure to attack in the absence of integrity protection.
provides some analysis of the effect of key continuity in
relation to the described attacks.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14
when, and only when, they appear in all capitals, as shown here.

Unknown Key-Share Attack

In an unknown key-share attack , a malicious participant in a protocol
claims to control a key that is in reality controlled by some other actor. This
arises when the identity associated with a key is not properly bound to the key.

An endpoint that can acquire the certificate fingerprint of another entity can
advertise that fingerprint as their own in SDP.
An attacker can use a copy of that fingerprint to cause a victim to
communicate with another unaware victim, even though the first victim believes
that it is communicating with the attacker.
When the identity of communicating peers is established by higher-layer
signaling constructs, such as those in SIP identity or WebRTC
, this allows an attacker to bind their own identity to a session
with any other entity.

The attacker obtains an identity assertion for an identity it controls, but
binds that to the fingerprint of one peer. The attacker is then able to cause a
TLS connection to be established where two victim endpoints communicate. The
victim that has its fingerprint copied by the attack correctly believes that it
is communicating with the other victim; however, the other victim incorrectly
believes that it is communicating with the attacker.

An unknown key-share attack does not result in the attacker having access to any
confidential information exchanged between victims. However, the failure in
mutual authentication can enable other attacks. A victim might send information
to the wrong entity as a result. Where information is interpreted in context,
misrepresenting that context could lead to the information being misinterpreted.

A similar attack can be mounted based solely on the SDP fingerprint attribute
without compromising the integrity of the signaling channel.

This attack is an aspect of SDP-based protocols upon which the technique known as
third-party call control (3PCC) relies. 3PCC exploits the
potential for the identity of a signaling peer to be different than the media
peer, allowing the media peer to be selected by the signaling peer.
describes the consequences of the mitigations described here for
systems that use 3PCC.

Limits on Attack Feasibility

The use of TLS with SDP depends on the integrity of session signaling. Assuming
signaling integrity limits the capabilities of an attacker in several ways. In
particular:

1.  An attacker can only modify the parts of the session signaling that they are
    responsible for producing, namely their own offers and answers.

2.  No entity will successfully establish a session with a peer unless they are
    willing to participate in a session with that peer.

The combination of these two constraints makes the spectrum of possible attacks
quite limited. An attacker is only able to switch its own certificate
fingerprint for a valid certificate that is acceptable to its peer. Attacks
therefore rely on joining two separate sessions into a single session.
describes an attack on SDP signaling under these constraints.

Systems that rely on strong identity bindings, such as those defined in
or , have a different threat model, which admits the
possibility of attack by an entity with access to the signaling channel.
Attacks under these conditions are more feasible as an attacker is assumed to be
able both to observe and to modify signaling messages. describes an attack
that assumes this threat model.

Interactions with Key Continuity

Systems that use key continuity (as defined in
or as recommended in ) might be able to detect an
unknown key-share attack if a session with either the attacker or the genuine
peer (i.e., the victim whose fingerprint was copied by an attacker) was
established in the past. Whether this is possible depends on how key continuity
is implemented.

Implementations that maintain a single database of identities with an index of
peer keys could discover that the identity saved for the peer key does not match
the claimed identity. Such an implementation could notice the disparity between
the actual keys (those copied from a victim) and the expected keys (those of the
attacker).

In comparison, implementations that first match based on peer identity could
treat an unknown key-share attack as though their peer had used a
newly configured device. The apparent addition of a new device could generate
user-visible notices (e.g., "Mallory appears to have a new device"). However,
such an event is not always considered alarming; some implementations might
silently save a new key.

Third-Party Call Control

Third-party call control (3PCC) is a technique where a signaling
peer establishes a call that is terminated by a different entity. An unknown
key-share attack is very similar in effect to some 3PCC practices, so use of
3PCC could appear to be an attack. However, 3PCC that follows RFC 3725 guidance
is unaffected, and peers that are aware of changes made by a 3PCC controller can
correctly distinguish actions of a 3PCC controller from an attack.

3PCC as described in RFC 3725 is incompatible with SIP identity , as
SIP identity relies on creating a binding between SIP requests and SDP. The
controller is the only entity that generates SIP requests in RFC 3725.
Therefore, in a 3PCC context, only the use of the fingerprint attribute
without additional bindings or WebRTC identity is possible.

The attack mitigation mechanisms described in this document will prevent the use
of 3PCC if peers have different views of the involved identities or the value
of SDP tls-id attributes.

For 3PCC to work with the proposed mechanisms, TLS peers need to be aware of the
signaling so that they can correctly generate and check the TLS extensions. For
a connection to be successfully established, a 3PCC controller needs either to
forward SDP without modification or to avoid modifications to fingerprint,
tls-id, and identity attributes. A controller that follows the best
practices in RFC 3725 is expected to forward SDP without modification, thus
ensuring the integrity of these attributes.

Unknown Key-Share Attack with Identity Bindings

The identity assertions used for WebRTC
() and the
Personal Assertion Token (PASSporT) used in SIP identity (, ) are bound
to the certificate fingerprint of an endpoint. An attacker can cause an identity
binding to be created that binds an identity they control to the fingerprint of
a first victim.

An attacker can thereby cause a second victim to believe that they are
communicating with an attacker-controlled identity, when they are really talking
to the first victim. The attacker does this by creating an identity assertion
that covers a certificate fingerprint of the first victim.

A variation on the same technique can be used to cause both victims to
believe they are talking to the attacker when they are talking to each other.
In this case, the attacker performs the identity misbinding once for each
victim.

The authority certifying the identity binding is not required to
verify that the entity requesting the binding actually controls the
keys associated with the fingerprints, and this might appear to be
the cause of the problem. SIP and WebRTC identity providers are not
required to perform this validation.

A simple solution to this problem is suggested by . The identity of
endpoints is included under a message authentication code (MAC) during the
cryptographic handshake. Endpoints then validate that their peer has provided
an identity that matches their expectations. In TLS, the Finished message
provides a MAC over the entire handshake, so that including the identity in a
TLS extension is sufficient to implement this solution.

Rather than include a complete identity binding, which could be
sizable, a collision- and preimage-resistant hash of the binding is included
in a TLS extension as described in . Endpoints then need
only validate that the extension contains a hash of the identity binding they
received in signaling. If the identity binding is successfully validated, the
identity of a peer is verified and bound to the session.

This form of unknown key-share attack is possible without compromising signaling
integrity, unless the defenses described in are used. In order to
prevent both forms of attack, endpoints MUST use the external_session_id
extension (see ) in addition to the external_id_hash
() so that two calls between the same parties can't be
altered by an attacker.

Example

In the example shown in , it is assumed that the attacker
also controls the signaling channel.

Mallory (the attacker) presents two victims, Norma and Patsy, with two separate
sessions. In the first session, Norma is presented with the option to
communicate with Mallory; a second session with Norma is presented to Patsy.

The attack requires that Mallory obtain an identity binding for her own identity
with the fingerprints presented by Patsy (P), which Mallory might have obtained
previously. This false binding is then presented to Norma ('Signaling1' in
).

Patsy could be similarly duped, but in this example, a correct binding between
Norma's identity and fingerprint (N) is faithfully presented by Mallory. This
session ('Signaling2' in ) can be entirely legitimate.

A DTLS session is established directly between Norma and Patsy.
In order for this to happen, Mallory can substitute transport-level
information in both sessions, though this is not necessary if Mallory
is on the network path between Norma and Patsy.
As a result, Patsy correctly believes that she is communicating with Norma.
However, Norma incorrectly believes that she is talking to Mallory. As stated in
, Mallory cannot access media, but Norma might send information to Patsy
that Norma might not intend or that Patsy might misinterpret.

The external_id_hash TLS Extension

The external_id_hash TLS extension carries a hash of the identity assertion
that the endpoint sending the extension has asserted to its peer. Both peers
include a hash of their own identity assertion.

The extension_data for the external_id_hash extension contains an
ExternalIdentityHash struct, described below using the syntax defined in
:

   struct {
      opaque binding_hash<0..32>;
   } ExternalIdentityHash;

Where an identity assertion has been asserted by a peer, this extension includes
a SHA-256 hash of the assertion. An empty value is used to indicate support for
the extension.
Note:
For both types of identity assertion, if SHA-256 should prove to be inadequate
in the future (see ), a new TLS extension
that uses a different hash function can be defined.
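
As an illustration of the encoding described above, the following Python sketch builds the octets of an external_id_hash extension. It is a minimal sketch under stated assumptions, not a reference implementation: the function name and constant name are hypothetical, the code point 55 is the one registered in the IANA Considerations of this document, and the one-octet length prefix on binding_hash assumes the usual TLS presentation-language encoding for a vector of at most 255 octets.

```python
import hashlib
import struct
from typing import Optional

# Code point registered for external_id_hash in this document.
EXTERNAL_ID_HASH = 55


def encode_external_id_hash(assertion_octets: Optional[bytes]) -> bytes:
    """Return extension_type, extension length, and extension_data.

    assertion_octets is the decoded identity assertion that this endpoint
    asserted to its peer, or None if the endpoint has no identity binding.
    """
    if assertion_octets is None:
        # An empty binding_hash only signals support for the extension.
        binding_hash = b""
    else:
        binding_hash = hashlib.sha256(assertion_octets).digest()

    # Assumption: binding_hash is a vector with a one-octet length prefix.
    extension_data = bytes([len(binding_hash)]) + binding_hash
    header = struct.pack("!HH", EXTERNAL_ID_HASH, len(extension_data))
    return header + extension_data
```

An endpoint without an identity binding would therefore send five octets in total: the two-octet type, a two-octet length of 1, and a single zero octet for the empty binding_hash.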
Identity bindings might be provided by only one peer. An endpoint that does not
produce an identity binding MUST generate an empty external_id_hash extension
in its ClientHello or -- if a client provides the extension -- in ServerHello or
EncryptedExtensions. An empty extension has a zero-length binding_hash field.

A peer that receives an external_id_hash extension that does not match the
value of the identity binding from its peer MUST immediately fail the TLS
handshake with an illegal_parameter alert. The absence of an identity binding
does not relax this requirement; if a peer provided no identity binding, a
zero-length extension MUST be present to be considered valid.

Implementations written prior to the definition of the extensions in this
document will not support this extension for some time. A peer that receives an
identity binding but does not receive an external_id_hash extension MAY accept
a TLS connection rather than fail a connection where the extension is absent.
The endpoint performs the validation of the external_id_hash extension
in addition to the validation required by and any verification
of the identity assertion .
An endpoint MUST validate any external_session_id value that is present; see .
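
The receiver-side rules in this section, including the length rule stated in the next paragraph, can be sketched as a single decision function. This is a hedged illustration only: the function name, its arguments, and the convention of returning an alert name as a string are hypothetical, and accepting a handshake when the extension is absent is one policy that the MAY above permits, not a requirement.

```python
import hashlib
import hmac
from typing import Optional


def check_external_id_hash(received: Optional[bytes],
                           peer_assertion: Optional[bytes]) -> Optional[str]:
    """Return None to continue the handshake, or the fatal alert to send.

    received is the binding_hash from the peer's extension, or None if the
    peer sent no external_id_hash extension at all.
    peer_assertion is the decoded identity assertion received from the peer
    in signaling, or None if the peer provided no identity binding.
    """
    if received is None:
        # Extension absent: this sketch chooses to interoperate (MAY accept).
        return None

    if len(received) not in (0, 32):
        # Any length other than 0 or 32 is invalid.
        return "decode_error"

    if peer_assertion is None:
        # No identity binding from the peer: only an empty value is valid.
        expected = b""
    else:
        expected = hashlib.sha256(peer_assertion).digest()

    if not hmac.compare_digest(received, expected):
        return "illegal_parameter"
    return None
```

For example, a 32-octet binding_hash received from a peer that supplied no identity binding in signaling yields "illegal_parameter", because only a zero-length value is valid in that case.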
An external_id_hash extension with a binding_hash field
that is any length other than 0 or 32 is invalid
and MUST cause the receiving endpoint to generate a fatal decode_error alert.

In TLS 1.3, an external_id_hash extension sent by a server MUST be sent in the
EncryptedExtensions message.

Calculating external_id_hash for WebRTC Identity

A WebRTC identity assertion
() is provided as a JSON
object that is encoded into a JSON text. The JSON text is
encoded using UTF-8 as described by
.
The content of the external_id_hash extension is produced by hashing the
resulting octets with SHA-256 . This produces the 32 octets of
the binding_hash parameter, which is the sole contents of the extension.

The SDP identity attribute includes the base64 encoding of
the UTF-8 encoding of the same JSON text. The external_id_hash extension is
validated by performing base64 decoding on the value of the SDP identity
attribute, hashing the resulting octets using SHA-256, and comparing the results
with the content of the extension. In pseudocode form, using the
identity-assertion-value field from the SDP identity attribute grammar as
defined in :

   external_id_hash = SHA-256(b64decode(identity-assertion-value))

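
The same calculation can be written as a short Python sketch using only the standard library. The function name is hypothetical, and the argument is assumed to be the identity-assertion-value string already extracted from the SDP identity attribute; parsing the SDP itself is not shown.

```python
import base64
import hashlib


def webrtc_binding_hash(identity_assertion_value: str) -> bytes:
    """Compute binding_hash from the SDP identity attribute value."""
    # Decode first, so that the hash covers the UTF-8 JSON text itself
    # rather than one particular base64 rendering of it.
    assertion_octets = base64.b64decode(identity_assertion_value)
    # All decoded octets are hashed; no canonicalization is applied.
    return hashlib.sha256(assertion_octets).digest()
```

A receiver would compare the 32 octets returned here with the binding_hash carried in the peer's external_id_hash extension.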
Note:
The base64 of the SDP identity attribute is decoded to avoid capturing
variations in padding. The base64-decoded identity assertion could include
leading or trailing whitespace octets. WebRTC identity assertions are not
canonicalized; all octets are hashed.

Calculating external_id_hash for PASSporT

Where the compact form of PASSporT
is used, it MUST be expanded
into the full form. The base64 encoding used in the SIP Identity (or 'y')
header field MUST be decoded then used as input to SHA-256. This produces the
32-octet binding_hash value used for creating or validating the extension. In
pseudocode, using the signed-identity-digest parameter from the Identity header field grammar
defined :

   external_id_hash = SHA-256(b64decode(signed-identity-digest))

Unknown Key-Share Attack with Fingerprints

An attack on DTLS-SRTP is possible because the identity of peers involved is not
established prior to establishing the call. Endpoints use certificate
fingerprints as a proxy for authentication, but as long as fingerprints are used
in multiple calls, they are vulnerable to attack.

Even if the integrity of session signaling can be relied upon, an attacker might
still be able to create a session where there is confusion about the
communicating endpoints by substituting the fingerprint of a communicating
endpoint.

An endpoint that is configured to reuse a certificate can be attacked if it is
willing to initiate two calls at the same time, one of which is with an
attacker. The attacker can arrange for the victim to incorrectly believe that
it is calling the attacker when it is in fact calling a second party. The
second party correctly believes that it is talking to the victim.

As with the attack on identity bindings, this can be used to cause two victims
to both believe they are talking to the attacker when they are talking to each
other.

Example

To mount this attack, two sessions need to be created with the same endpoint at
almost precisely the same time. One of those sessions is initiated with the
attacker; the second session is created toward another honest endpoint. The
attacker convinces the first endpoint that their session with the attacker has
been successfully established, but media is exchanged with the other honest
endpoint. The attacker permits the session with the other honest endpoint to
complete only to the extent necessary to convince the other honest endpoint to
participate in the attacked session.

In addition to the constraints described in , the attacker in this
example also needs the ability to view and drop packets between victims.
That is, the attacker needs to be on path for media.

The attack shown in depends on a somewhat implausible set
of conditions. It is intended to demonstrate what sort of attack is possible
and what conditions are necessary to exploit this weakness in the protocol.

In this scenario, there are two sessions initiated at the same time by Norma.
Signaling is shown with single lines ('-'), DTLS and media with double lines
('=').

The first session is established with Mallory, who falsely uses Patsy's
certificate fingerprint (denoted with 'fp=P'). A second session is initiated
between Norma and Patsy. Signaling for both sessions is permitted to complete.

Once signaling is complete on the first session, a DTLS connection is
established. Ostensibly, this connection is between Mallory and Norma, but
Mallory forwards DTLS and media packets sent to her by Norma to Patsy. These
packets are denoted 'DTLS1' because Norma associates these with the first
signaling session ('Signaling1').

Mallory also intercepts packets from Patsy and forwards those to Norma at the
transport address that Norma associates with Mallory. These packets are denoted
'DTLS2' to indicate that Patsy associates these with the second signaling
session ('Signaling2'); however, Norma will interpret these as being associated
with the first signaling session ('Signaling1').

The second signaling exchange ('Signaling2'), which is between Norma and Patsy, is
permitted to continue to the point where Patsy believes that it has succeeded.
This ensures that Patsy believes that she is communicating with Norma. In the
end, Norma believes that she is communicating with Mallory, when she is really
communicating with Patsy. Just like the example in , Mallory
cannot access media, but Norma might send information to Patsy that Norma
might not intend or that Patsy might misinterpret.

Though Patsy needs to believe that the second signaling session has been
successfully established, Mallory has no real interest in seeing that session
also be established. Mallory only needs to ensure that Patsy maintains the
active session and does not abandon the session prematurely. For this reason,
it might be necessary to permit the signaling from Patsy to reach Norma in order to allow
Patsy to receive a call setup completion signal, such as a SIP ACK. Once the
second session is established, Mallory might cause DTLS packets sent by Norma to
Patsy to be dropped. However, if Mallory allows DTLS packets to pass, it is
likely that Patsy will discard them as Patsy will already have a successful DTLS
connection established.

For the attacked session to be sustained beyond the point that Norma detects
errors in the second session, Mallory also needs to block any signaling that
Norma might send to Patsy asking for the call to be abandoned. Otherwise, Patsy
might receive a notice that the call has failed and thereby abort the call.

This attack creates an asymmetry in the beliefs about the identity of peers.
However, this attack is only possible if the victim (Norma) is willing to
conduct two sessions nearly simultaneously; if the attacker (Mallory) is on the
network path between the victims; and if the same certificate -- and therefore
the SDP fingerprint attribute value -- is used by Norma for both sessions.

Where Interactive Connectivity Establishment (ICE)
is used, Mallory also needs to ensure that
connectivity checks between Patsy and Norma succeed, either by forwarding checks
or by answering and generating the necessary messages.

Unique Session Identity Solution

The solution to this problem is to assign a new identifier to communicating
peers. Each endpoint assigns their peer a unique identifier during call
signaling. The peer echoes that identifier in the TLS handshake, binding that
identity into the session. Including this new identity in the TLS handshake
means that it will be covered by the TLS Finished message, which is necessary to
authenticate it (see ).

Successfully validating that the identifier matches the expected value means that
the connection corresponds to the signaled session and is therefore established
between the correct two endpoints.

This solution relies on the unique identifier given to DTLS sessions using the
SDP tls-id attribute . This field is
already required to be unique. Thus, no two offers or answers from the same
client will have the same value.

A new external_session_id extension is added to the TLS or DTLS handshake for
connections that are established as part of the same call or real-time session.
This carries the value of the tls-id attribute and provides integrity
protection for its exchange as part of the TLS or DTLS handshake.

The external_session_id TLS Extension

The external_session_id TLS extension carries the unique identifier that an
endpoint selects. When used with SDP, the value MUST include the tls-id
attribute from the SDP that the endpoint generated when negotiating the session.
This document only defines use of this extension for SDP; other methods of
external session negotiation can use this extension to include a unique session
identifier.

The extension_data for the external_session_id extension contains an
ExternalSessionId struct, described below using the syntax defined in
:

   struct {
      opaque session_id<20..255>;
   } ExternalSessionId;

For SDP, the session_id field of the extension includes the value of the
tls-id SDP attribute as defined in
(that is, the tls-id-value ABNF production). The value of the tls-id
attribute is encoded using ASCII .

Where RTP and RTCP are not multiplexed, it is possible that the
two separate DTLS connections carrying RTP and RTCP can be switched. This is
considered benign since these protocols are designed to be distinguishable as
SRTP provides key separation. Using RTP/RTCP multiplexing
further avoids this problem.

The external_session_id extension is included in a ClientHello, and if the
extension is present in the ClientHello, either ServerHello (for TLS and DTLS
versions older than 1.3) or EncryptedExtensions (for TLS 1.3).

Endpoints MUST check that the session_id parameter in the extension that they
receive includes the tls-id attribute value that they received in their peer's
session description. Endpoints can perform string comparison by ASCII decoding
the TLS extension value and comparing it to the SDP attribute value or by comparing
the encoded TLS extension octets with the encoded SDP attribute value. An
endpoint that receives an external_session_id extension that is not identical
to the value that it expects MUST abort the connection with a fatal
illegal_parameter alert.
The endpoint performs the validation of the external_session_id extension in
addition to the validation required by .
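
The encoding and comparison described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function and constant names are hypothetical, the code point 56 is the one registered in the IANA Considerations of this document, the one-octet length prefix assumes the usual TLS presentation-language encoding for a vector of at most 255 octets, and continuing when the extension is absent is one policy the text permits rather than a requirement.

```python
import struct
from typing import Optional

# Code point registered for external_session_id in this document.
EXTERNAL_SESSION_ID = 56


def encode_external_session_id(tls_id: str) -> bytes:
    """Build the extension from the tls-id value this endpoint put in its SDP."""
    session_id = tls_id.encode("ascii")
    if len(session_id) > 255:
        raise ValueError("tls-id value too long for a one-octet length prefix")
    # Assumption: session_id is a vector with a one-octet length prefix.
    extension_data = bytes([len(session_id)]) + session_id
    header = struct.pack("!HH", EXTERNAL_SESSION_ID, len(extension_data))
    return header + extension_data


def check_external_session_id(received: Optional[bytes],
                              peer_tls_id: str) -> Optional[str]:
    """Return None to continue, or the fatal alert to send.

    received is the session_id from the peer's extension, or None if the
    peer did not send the extension. peer_tls_id is the tls-id attribute
    value taken from the peer's session description.
    """
    if received is None:
        # Peer does not implement the extension: this sketch continues.
        return None
    # Compare encoded octets with the ASCII-encoded SDP attribute value.
    if received != peer_tls_id.encode("ascii"):
        return "illegal_parameter"
    return None
```

Comparing encoded octets, as done here, is one of the two comparison strategies the text allows; decoding the received value as ASCII and comparing strings would be equivalent.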
If an endpoint communicates with a peer that does not support this
extension, it will receive a ClientHello, ServerHello, or EncryptedExtensions message that
does not include this extension. An endpoint MAY choose to continue a session
without this extension in order to interoperate with peers that do not implement
this specification.

In TLS 1.3, an external_session_id extension sent by a server MUST be sent in
the EncryptedExtensions message.

This defense is not effective if an attacker can rewrite tls-id values in
signaling. Only the mechanism in external_id_hash is able to defend against
an attacker that can compromise session integrity.

Session Concatenation

Use of session identifiers does not prevent an attacker from
establishing two concurrent sessions with different peers and
forwarding signaling from those peers to each other. Concatenating
two signaling sessions in this way creates two signaling sessions,
with two session identifiers, but only the TLS connections from a
single session are established as a result. In doing so, the
attacker creates a situation where both peers believe that they are
talking to the attacker when they are talking to each other.

In the absence of any higher-level concept of peer identity, an
attacker who is able to copy the session identifier from
one signaling session to another can cause the peers to establish a
direct TLS connection even while they think that they are connecting
to the attacker. This differs from the attack described in the
previous section in that there is only one TLS connection rather than
two. This kind of attack is prevented by systems that enable peer
authentication, such as WebRTC identity or SIP identity
; however, these systems do not prevent establishing
two back-to-back connections as described in the previous paragraph.

Use of the external_session_id does not guarantee that the identity of the
peer at the TLS layer is the same as the identity of the signaling peer. The
advantage that an attacker gains by concatenating sessions is limited unless data is
exchanged based on the assumption that signaling and TLS peers are the same. If a
secondary protocol uses the signaling channel with the assumption that the
signaling and TLS peers are the same, then that protocol is vulnerable to attack.
While out of scope for this document, a signaling system that can defend against session concatenation
requires that the signaling layer is authenticated and bound to any TLS connections.

It is important to note that multiple connections can be created within the same
signaling session. An attacker might concatenate only part of a session,
choosing to terminate some connections (and optionally forward data) while
arranging to have peers interact directly for other connections. It is even
possible to have different peers interact for each connection. This means that
the actual identity of the peer for one connection might differ from the peer on
another connection.

Critically, information about the identity of TLS peers provides no assurances
about the identity of signaling peers and does not transfer between TLS
connections in the same session. Information extracted from a TLS connection
therefore MUST NOT be used in a secondary protocol outside of that connection if
that protocol assumes that the signaling protocol has the same peers.
Similarly, security-sensitive information from one TLS connection MUST NOT be
used in other TLS connections even if they are established as a result of the
same signaling session.

Security Considerations

When combined with identity assertions, the mitigations in this document ensure
that there is no opportunity to misrepresent the identity of TLS peers. This
assurance is provided even if an attacker can modify signaling messages.

Without identity assertions, the mitigations in this document prevent the
session splicing attack described in . Defense against session
concatenation () additionally requires that protocol peers are not able to
claim the certificate fingerprints of other entities.

IANA Considerations

This document registers two extensions in the "TLS ExtensionType Values"
registry established in :

*  The external_id_hash extension defined in has been
   assigned a code point of 55; it is recommended and is marked as "CH, EE"
   in TLS 1.3.

*  The external_session_id extension defined in has
   been assigned a code point of 56; it is recommended and is marked as
   "CH, EE" in TLS 1.3.

References

Normative References

WebRTC Security Architecture

Session Description Protocol (SDP) Offer/Answer Considerations for
Datagram Transport Layer Security (DTLS) and Transport Layer Security
(TLS)

Informative References

Unknown Key-Share Attacks on the Station-to-Station (STS) Protocol. Public Key Cryptography, Lecture Notes in Computer Science, Vol. 1560.

SIGMA: The 'SIGn-and-MAc' Approach to Authenticated Diffie-Hellman and Its Use in the IKE Protocols. Advances in Cryptology -- CRYPTO 2003, Lecture Notes in Computer Science, Vol. 2729.

WebRTC 1.0: Real-time Communication Between Browsers. W3C Proposed Recommendation.

Acknowledgements

This problem would not have been discovered if it weren't for
discussions with , , and . A
solution similar to the one presented here was first proposed by
, who provided valuable input on
this document. assisted with
a formal model of the solution. and
provided significant review and
input.