The purpose of this framework is to define a means through which media privacy is ensured when communicating within a conferencing environment consisting of one or more Media Distributors that only switch, and hence do not terminate, media. It does not otherwise attempt to hide the fact that a conference between endpoints is taking place.
This framework reuses several specified RTP security technologies, including the Secure Real-time Transport Protocol (SRTP) [RFC 3711], Encrypted Key Transport (EKT) [RFC 8870], and DTLS-SRTP.
This solution framework focuses on the E2E privacy and integrity of the participant's media by restricting access to the E2E key used for authenticated E2E encryption to trusted entities only. However, this framework does give a Media Distributor access to RTP header fields and header extensions, as well as the ability to modify a certain subset of the header fields and to add or change header extensions. Packets received by a Media Distributor or an endpoint are authenticated hop by hop.
To enable all of the above, this framework defines the use of two security contexts and two associated encryption keys: an "inner" key (a distinct E2E key for each transmitted media flow) for authenticated encryption of RTP media between endpoints and an "outer" key (a HBH key) known only to a Media Distributor or the adjacent endpoint for the hop between an endpoint and a Media Distributor or peer endpoint. An endpoint will receive one or more E2E keys from every other endpoint in the conference that correspond to the media flows transmitted by those other endpoints, while HBH keys are derived from the DTLS-SRTP association with the Key Distributor. Two communicating Media Distributors use DTLS-SRTP associations directly with each other to obtain the HBH keys they will use. See Section 4.5 for more details on key exchange.
+-------------+                                +-------------+
|             |################################|             |
|    Media    |------------------------ *----->|    Media    |
| Distributor |<----------------------*-|------| Distributor |
|      X      |#####################*#|#|######|      Y      |
|             |                     | | |      |             |
+-------------+                     | | |      +-------------+
   #  ^ |  #          HBH Key (XY) -+ | |         #  ^ |  #
   #  | |  #           E2E Key (B) ---+ |         #  | |  #
   #  | |  #           E2E Key (A) -----+         #  | |  #
   #  | |  #                                      #  | |  #
   #  | |  #                                      #  | |  #
   #  | |  *---- HBH Key (AX)    HBH Key (YB) ----*  | |  #
   #  | |  #                                      #  | |  #
   #  *--------- E2E Key (A)      E2E Key (A) ---------*  #
   #  | *------- E2E Key (B)      E2E Key (B) -------* |  #
   #  | |  #                                      #  | |  #
   #  | v  #                                      #  | v  #
+-------------+                                +-------------+
| Endpoint A  |                                | Endpoint B  |
+-------------+                                +-------------+
The double transform [RFC 8723] enables endpoints to perform encryption using both the E2E and HBH contexts while still preserving the same overall interface as other SRTP transforms. The Media Distributor simply uses the corresponding normal (single) AES-GCM transform, keyed with the appropriate HBH keys. See Section 6.1 for a description of the keys used in PERC and Section 7 for a diagram of how encrypted RTP packets appear on the wire.
RTCP is only encrypted hop by hop -- not end to end. This framework does not provide an additional step for authenticated encryption of RTCP. Rather, implementations utilize the existing procedures specified in [RFC 3711]; those procedures use the same outer, HBH cryptographic context chosen in the double transform operation described above. For this reason, endpoints MUST NOT send confidential information via RTCP.
To ensure the confidentiality of E2E keys shared between endpoints, endpoints use a common Key Encryption Key (KEK) that is known only by the trusted entities in a conference. That KEK, defined in the EKT specification [RFC 8870] as the EKT Key, is then used to encrypt the SRTP master key used for E2E-authenticated encryption of media sent by a given endpoint. Each endpoint in the conference creates an SRTP master key for E2E-authenticated encryption and keeps track of the E2E keys received via the Full EKT Tag for each distinct synchronization source (SSRC) in the conference so that it can properly decrypt received media. An endpoint may change its E2E key at any time and advertise that new key to the conference as specified in [RFC 8870].
Any given RTP media flow is identified by its SSRC, and an endpoint might send more than one flow at a time and change the mix of media flows transmitted during the lifetime of a conference. Thus, an endpoint MUST maintain a list of SSRCs from received RTP flows and each SSRC's associated E2E key information. An endpoint MUST discard old E2E keys no later than when it leaves the conference.
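As an illustration only, the per-SSRC bookkeeping described above could be sketched as follows. The class and field names (`E2EKeyInfo`, `ReceiverKeyTable`, `learn`, `candidates`, `leave_conference`) are hypothetical and are not defined by this framework or by [RFC 8870]; the sketch assumes that key information has already been recovered from a Full EKT Tag.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class E2EKeyInfo:
    """E2E key information learned for one SSRC (illustrative fields)."""
    master_key: bytes
    spi: int  # EKT SPI under which the key was received


@dataclass
class ReceiverKeyTable:
    """Maps each received SSRC to the E2E keys learned for it, newest first."""
    by_ssrc: Dict[int, List[E2EKeyInfo]] = field(default_factory=dict)

    def learn(self, ssrc: int, info: E2EKeyInfo) -> None:
        # A sender may change its E2E key at any time; keep the newest
        # key first and leave earlier keys in place for late packets.
        self.by_ssrc.setdefault(ssrc, []).insert(0, info)

    def candidates(self, ssrc: int) -> List[E2EKeyInfo]:
        # Keys to try when decrypting media from this SSRC.
        return list(self.by_ssrc.get(ssrc, []))

    def leave_conference(self) -> None:
        # Old E2E keys are discarded no later than when the endpoint leaves.
        self.by_ssrc.clear()
```

A receiver would call `learn` each time a Full EKT Tag yields a key for an SSRC and `leave_conference` when it exits; how long superseded keys stay in the list is a local policy choice that this sketch does not model.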
If a packet is to contain RTP header extensions, it should be noted that those extensions are only encrypted hop by hop per [RFC 8723]. For this reason, endpoints MUST NOT transmit confidential information via RTP header extensions.
To ensure the integrity of transmitted media packets, it is REQUIRED that every packet be authenticated hop by hop between an endpoint and a Media Distributor, as well as between Media Distributors. The authentication key used for HBH authentication is derived from an SRTP master key shared only on the respective hop. Each HBH key is distinct per hop, and no two hops ever use the same SRTP master key.
While endpoints also perform HBH authentication, the ability of the endpoints to reconstruct the original RTP header also enables the endpoints to authenticate RTP packets end to end. This design yields flexibility to the Media Distributor to change certain RTP header values as packets are forwarded. Values that the Media Distributor can change in the RTP header are defined in [RFC 8723]. RTCP can only be encrypted hop by hop, giving the Media Distributor the flexibility to (1) forward RTCP content unchanged, (2) transmit compound RTCP packets, (3) initiate RTCP packets for reporting statistics, or (4) convey other information. Performing HBH authentication for all RTP and RTCP packets also helps provide replay protection (see Section 8). The use of the replay protection mechanism specified in Section 3.3.2 of RFC 3711 is REQUIRED at each hop.
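The replay protection referenced above is a sliding-window check over packet indices. The following is a minimal sketch of that idea, not a normative implementation: the class name `ReplayWindow` and its methods are assumptions made for illustration, and the sketch assumes the caller has already computed the SRTP packet index and calls `update` only after the packet has been authenticated on the hop.

```python
class ReplayWindow:
    """Sliding-window replay check over SRTP packet indices (sketch)."""

    def __init__(self, size: int = 64) -> None:
        # RFC 3711 requires a replay window of at least 64 packets.
        if size < 64:
            raise ValueError("replay window must be at least 64")
        self.size = size
        self.highest = -1  # highest authenticated index seen so far
        self.bitmap = 0    # bit i set => index (highest - i) was seen

    def check(self, index: int) -> bool:
        """Return True if index is neither too old nor already seen."""
        if index > self.highest:
            return True
        offset = self.highest - index
        if offset >= self.size:
            return False  # fell out of the window: treat as replayed
        return not (self.bitmap >> offset) & 1

    def update(self, index: int) -> None:
        """Record index; call only after HBH authentication succeeds."""
        if index > self.highest:
            shift = index - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = index
        else:
            offset = self.highest - index
            if offset < self.size:
                self.bitmap |= 1 << offset
```

Each hop (endpoint or Media Distributor) would keep one such window per received cryptographic context, rejecting a packet when `check` returns False and recording it with `update` once its authentication tag verifies.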
If there is a need to encrypt one or more RTP header extensions hop by hop, the endpoint derives an encryption key from the HBH SRTP master key to encrypt header extensions as per [RFC 6904]. This still gives the Media Distributor visibility into header extensions, such as the one used to determine the audio level [RFC 6464] of conference participants. Note that when RTP header extensions are encrypted, all hops need to decrypt and re-encrypt these encrypted header extensions. Please refer to Sections 5.1, 5.2, and 5.3 of [RFC 8723] for procedures to perform RTP header extension encryption and decryption.
In brief, the keys used by any given endpoint are determined as follows:
- The HBH keys that the endpoint uses to send and receive SRTP media are derived from a DTLS handshake that the endpoint performs with the Key Distributor (following normal DTLS-SRTP procedures).
- The E2E key that an endpoint uses to send SRTP media can be either set from the DTLS-SRTP association with the Key Distributor or chosen by the endpoint. It is then distributed to other endpoints in a Full EKT Tag, encrypted under an EKT Key provided to the client by the Key Distributor within the DTLS channel they negotiated. Note that an endpoint MAY create a different E2E key per media flow, where a media flow is identified by its SSRC value.
- Each E2E key that an endpoint uses to receive SRTP media is set by receiving a Full EKT Tag from another endpoint.
- The HBH keys used between two Media Distributors are derived via DTLS-SRTP procedures employed directly between them.
The Media Distributor maintains a tunnel with the Key Distributor (e.g., using the tunnel protocol defined in [PERC-DTLS]), making it possible for the Media Distributor to facilitate the establishment of a secure DTLS association between each endpoint and the Key Distributor as shown in Figure 3. The DTLS association between endpoints and the Key Distributor enables each endpoint to generate E2E and HBH keys and receive the KEK. At the same time, the Key Distributor securely provides the HBH key information to the Media Distributor. The key information summarized here may include the SRTP master key, the SRTP master salt, and the negotiated cryptographic transform.
                          +-----------+
                 KEK info |    Key    | HBH Key info to
             to Endpoints |Distributor| Endpoints & Media Distributor
                          +-----------+
                             # ^ ^ #
                             # | | #--- Tunnel
                             # | | #
+-----------+             +-----------+             +-----------+
| Endpoint  |    DTLS     |   Media   |    DTLS     | Endpoint  |
|    KEK    |<------------|Distributor|------------>|    KEK    |
|  HBH Key  | to Key Dist | HBH Keys  | to Key Dist |  HBH Key  |
+-----------+             +-----------+             +-----------+
In addition to the secure tunnel between the Media Distributor and the Key Distributor, there are two additional types of security associations utilized as a part of the key exchange, as discussed in the following paragraphs. One is a DTLS-SRTP association between an endpoint and the Key Distributor (with packets passing through the Media Distributor), and the other is a DTLS-SRTP association between peer Media Distributors.
Endpoints establish a DTLS-SRTP association over the RTP session with the Media Distributor and its media ports for the purposes of key information exchange with the Key Distributor. The Media Distributor does not terminate the DTLS signaling but instead forwards DTLS packets received from an endpoint on to the Key Distributor (and vice versa) via a tunnel established between the Media Distributor and the Key Distributor.
When establishing the DTLS association between endpoints and the Key Distributor, the endpoint MUST act as the DTLS client, and the Key Distributor MUST act as the DTLS server. The Key Distributor conveys the KEK to endpoints over the DTLS association in the EKTKey message, following the procedures defined in EKT [RFC 8870].
The Key Distributor MUST NOT establish DTLS-SRTP associations with endpoints without first authenticating the Media Distributor tunneling the DTLS-SRTP packets from the endpoint.
Note that following DTLS-SRTP procedures for the cipher defined in [RFC 8723], the endpoint generates both E2E and HBH encryption keys and salt values. Endpoints MUST either use the DTLS-SRTP-generated E2E key for transmission or generate a fresh E2E key. In either case, the generated SRTP master salt for E2E encryption MUST be replaced with the salt value provided by the Key Distributor via the EKTKey message. That is because every endpoint in the conference uses the same SRTP master salt. The endpoint only transmits the SRTP master key (not the salt) used for E2E encryption to other endpoints in RTP/RTCP packets per [RFC 8870].
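The key and salt selection rule in the preceding paragraph can be sketched roughly as follows. The names `SrtpKeying` and `choose_e2e_keying` are hypothetical, and the sketch assumes that the DTLS-SRTP-derived E2E key material and the salt from the EKTKey message have already been extracted by the DTLS and EKT layers.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class SrtpKeying:
    """An SRTP master key and master salt pair (illustrative type)."""
    master_key: bytes
    master_salt: bytes


def choose_e2e_keying(dtls_e2e: SrtpKeying, ekt_salt: bytes,
                      fresh: bool) -> SrtpKeying:
    """Pick the E2E keying an endpoint uses for transmission.

    dtls_e2e -- E2E key and salt produced by the DTLS-SRTP exchange
    ekt_salt -- conference-wide salt received in the EKTKey message
    fresh    -- True to generate a new key instead of the DTLS one
    """
    if fresh:
        # Generate a fresh key of the same length as the negotiated one.
        key = secrets.token_bytes(len(dtls_e2e.master_key))
    else:
        key = dtls_e2e.master_key
    # In either case the DTLS-derived salt is replaced by the salt
    # provided by the Key Distributor.
    return SrtpKeying(master_key=key, master_salt=ekt_salt)
```

Only `master_key` from the result would be advertised to other endpoints; the salt is never sent, because every endpoint already holds the same value from the Key Distributor.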
Media Distributors use DTLS-SRTP directly with a peer Media Distributor to establish the HBH key for transmitting RTP and RTCP packets to that peer Media Distributor. The Key Distributor does not facilitate establishing a HBH key for use between Media Distributors.
Following the initial key information exchange with the Key Distributor, an endpoint is able to encrypt media end to end with an E2E key, sending that E2E key to other endpoints encrypted with the KEK, and is able to encrypt and authenticate RTP packets using a HBH key. This framework does not allow the Media Distributor to gain access to the KEK information, preventing it from gaining access to any endpoint's E2E key and subsequently decrypting media.
The KEK may need to change from time to time during the lifetime of a conference, such as when a participant joins or leaves the conference. Dictating if, when, or how often a conference is to be rekeyed is outside the scope of this document, but this framework does accommodate rekeying during the lifetime of a conference.
When a Key Distributor decides to rekey a conference, it transmits a new EKTKey message containing the new EKT Key to each of the conference participants. Upon receipt of the new EKT Key, the endpoint MUST create a new SRTP master key and prepare to send that key inside a FullEKTField using the new EKT Key as per Section 4.5 of RFC 8870. In order to allow time for all endpoints in the conference to receive the new keys, the sender should follow the recommendations in Section 4.6 of RFC 8870. On receiving a new EKT Key, endpoints MUST be prepared to decrypt EKT Tags using the new key. The EKT Security Parameter Index (SPI) field is used to differentiate between EKT Tags encrypted with the old and new keys.
After rekeying, an endpoint SHOULD retain prior SRTP master keys and EKT Keys long enough to decrypt late-arriving or out-of-order packets, as well as packets from other endpoints that continued to use the prior keys for a period of time after rekeying began. An endpoint MAY retain old keys until the end of the conference.
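One way to picture SPI-based key selection together with retention of prior keys is the sketch below. Everything in it is an assumption made for illustration: the class `EktKeyStore`, its method names, and the 30-second default retention period are not taken from this framework or from [RFC 8870], which leave the retention period to the implementation.

```python
import time
from typing import Dict, Optional, Tuple


class EktKeyStore:
    """Holds EKT Keys indexed by SPI and retains superseded keys briefly."""

    def __init__(self, retention_seconds: float = 30.0) -> None:
        self.retention = retention_seconds  # assumed local policy value
        self.current_spi: Optional[int] = None
        # spi -> (ekt_key, time the key was superseded, or None if current)
        self._keys: Dict[int, Tuple[bytes, Optional[float]]] = {}

    def install(self, spi: int, ekt_key: bytes,
                now: Optional[float] = None) -> None:
        """Install a key received in an EKTKey message as the current key."""
        now = time.monotonic() if now is None else now
        if self.current_spi is not None and self.current_spi != spi:
            old_key, _ = self._keys[self.current_spi]
            self._keys[self.current_spi] = (old_key, now)
        self._keys[spi] = (ekt_key, None)
        self.current_spi = spi

    def key_for(self, spi: int,
                now: Optional[float] = None) -> Optional[bytes]:
        """Return the EKT Key matching the SPI of a received EKT Tag."""
        now = time.monotonic() if now is None else now
        entry = self._keys.get(spi)
        if entry is None:
            return None
        key, superseded_at = entry
        if superseded_at is not None and now - superseded_at > self.retention:
            del self._keys[spi]  # retention period has elapsed
            return None
        return key
```

A receiver would look up the key with the SPI carried in each EKT Tag, so tags encrypted under the old and new EKT Keys can both be processed while the conference converges on the new key.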
Endpoints MAY follow the procedures in Section 5.2 of RFC 5764 to renegotiate HBH keys as desired. If new HBH keys are generated, the new keys are also delivered to the Media Distributor, for example, by following the procedures defined in [PERC-DTLS].
At any time, endpoints MAY change the E2E encryption key being used. An endpoint MUST generate a new E2E encryption key whenever it receives a new EKT Key. After switching to a new key, the new key is conveyed to other endpoints in the conference in RTP/RTCP packets per [RFC 8870].