4. Examples
In this section, we provide examples showing how to use the SDP Capability Negotiation.

4.1. Multiple Transport Protocols
The following example illustrates how to use the SDP Capability Negotiation extensions to negotiate use of one out of several possible transport protocols. The offerer uses the expected least-common-denominator (plain RTP) as the actual configuration, and the alternative transport protocols as the potential configurations.
The example is illustrated by the offer/answer exchange below, where Alice sends an offer to Bob:

          Alice                               Bob

            | (1) Offer (RTP/[S]AVP[F])        |
            |--------------------------------->|
            |                                  |
            | (2) Answer (RTP/AVPF)            |
            |<---------------------------------|
            |                                  |
            | (3) Offer (RTP/AVPF)             |
            |--------------------------------->|
            |                                  |
            | (4) Answer (RTP/AVPF)            |
            |<---------------------------------|
            |                                  |

Alice's offer includes plain RTP (RTP/AVP), RTP with RTCP-based feedback (RTP/AVPF), Secure RTP (RTP/SAVP), and Secure RTP with RTCP-based feedback (RTP/SAVPF) as alternatives. RTP is the default, with RTP/SAVPF, RTP/SAVP, and RTP/AVPF as the alternatives and preferred in the order listed:

      v=0
      o=- 25678 753849 IN IP4 192.0.2.1
      s=
      c=IN IP4 192.0.2.1
      t=0 0
      m=audio 53456 RTP/AVP 0 18
      a=tcap:1 RTP/SAVPF RTP/SAVP RTP/AVPF
      a=acap:1 crypto:1 AES_CM_128_HMAC_SHA1_80 inline:WVNfX19zZW1jdGwgKCkgewkyMjA7fQp9CnVubGVz|2^20|1:4 FEC_ORDER=FEC_SRTP
      a=acap:2 rtcp-fb:0 nack
      a=pcfg:1 t=1 a=1,[2]
      a=pcfg:2 t=2 a=1
      a=pcfg:3 t=3 a=[2]

The "m=" line indicates that Alice is offering to use plain RTP with PCMU or G.729. The capabilities are provided by the "a=tcap" and "a=acap" attributes. The "tcap" capability indicates that Secure RTP with RTCP-based feedback (RTP/SAVPF), Secure RTP (RTP/SAVP), and RTP with RTCP-based feedback are supported. The first "acap" attribute provides an attribute capability with a handle of 1. The capability is a "crypto" attribute, which provides the keying material for SRTP using SDP security descriptions [RFC4568]. The second "acap" attribute provides an attribute capability with a handle of 2. The
capability is an "rtcp-fb" attribute, which is used by the RTCP-based feedback profiles to indicate that payload type 0 (PCMU) supports feedback type "nack". The "a=pcfg" attributes provide the potential configurations included in the offer by reference to the capabilities. There are three potential configurations:

o  Potential configuration 1, which is the most preferred potential configuration, specifies use of transport protocol capability 1 (RTP/SAVPF) and attribute capabilities 1 (the "crypto" attribute) and 2 (the "rtcp-fb" attribute). Support for the first one is mandatory whereas support for the second one is optional.

o  Potential configuration 2, which is the second most preferred potential configuration, specifies use of transport protocol capability 2 (RTP/SAVP) and mandatory attribute capability 1 (the "crypto" attribute).

o  Potential configuration 3, which is the least preferred potential configuration (but the second least preferred configuration overall, since the actual configuration provided by the "m=" line is always the least preferred configuration), specifies use of transport protocol capability 3 (RTP/AVPF) and optional attribute capability 2 (the "rtcp-fb" attribute).

Bob receives the SDP session description offer from Alice. Bob does not support any Secure RTP profiles; however, he supports plain RTP and RTP with RTCP-based feedback, as well as the SDP Capability Negotiation extensions, and hence he accepts the potential configuration for RTP with RTCP-based feedback provided by Alice:

      v=0
      o=- 24351 621814 IN IP4 192.0.2.2
      s=
      c=IN IP4 192.0.2.2
      t=0 0
      m=audio 54568 RTP/AVPF 0 18
      a=rtcp-fb:0 nack
      a=acfg:3 t=3 a=[2]

Bob includes the "a=acfg" attribute in the answer to inform Alice that he based his answer on an offer containing the potential configuration with transport protocol capability 3 and optional attribute capability 2 from the offer SDP session description (i.e., the RTP/AVPF profile using the "rtcp-fb" value provided).
Bob also includes an "rtcp-fb" attribute with the value "nack" for RTP payload type 0.
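
The selection Bob performs in step 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a complete implementation of the framework: the function names (parse_offer, split_caps, select_configuration) are hypothetical, every "a=pcfg" line is assumed to carry a "t=" parameter, and alternatives ("|"), delete-attributes, and extension parameters are not handled. An answerer is modelled simply as a set of supported transport protocols and a set of supported attribute names.

```python
import re

# Capability and configuration lines copied from Alice's offer in step 1.
OFFER = [
    "a=tcap:1 RTP/SAVPF RTP/SAVP RTP/AVPF",
    "a=acap:1 crypto:1 AES_CM_128_HMAC_SHA1_80 "
    "inline:WVNfX19zZW1jdGwgKCkgewkyMjA7fQp9CnVubGVz|2^20|1:4 FEC_ORDER=FEC_SRTP",
    "a=acap:2 rtcp-fb:0 nack",
    "a=pcfg:1 t=1 a=1,[2]",
    "a=pcfg:2 t=2 a=1",
    "a=pcfg:3 t=3 a=[2]",
]


def parse_offer(lines):
    """Collect transport capabilities, attribute capability names, and pcfgs."""
    tcaps, acaps, pcfgs = {}, {}, []
    for line in lines:
        name, _, value = line[2:].partition(":")  # drop the leading "a="
        if name == "tcap":
            # One tcap line numbers its protocols consecutively from the handle.
            first, *protos = value.split()
            for i, proto in enumerate(protos):
                tcaps[int(first) + i] = proto
        elif name == "acap":
            num, attr = value.split(" ", 1)
            acaps[int(num)] = attr.partition(":")[0]  # keep the attribute name only
        elif name == "pcfg":
            num, *params = value.split()
            fields = dict(p.split("=", 1) for p in params)
            pcfgs.append((int(num), int(fields["t"]), fields.get("a", "")))
    return tcaps, acaps, pcfgs


def split_caps(attr_list):
    """Split e.g. "1,[2]" into mandatory [1] and optional [2] capability numbers."""
    optional = [int(n) for group in re.findall(r"\[([^\]]*)\]", attr_list)
                for n in group.split(",") if n]
    remainder = re.sub(r"\[[^\]]*\]", "", attr_list)
    mandatory = [int(n) for n in remainder.split(",") if n]
    return mandatory, optional


def select_configuration(lines, transports, attributes):
    """Return (pcfg number, transport) of the most preferred usable pcfg, or None."""
    tcaps, acaps, pcfgs = parse_offer(lines)
    for num, t, attr_list in sorted(pcfgs):  # lower number means more preferred
        mandatory, _optional = split_caps(attr_list)
        if tcaps[t] in transports and all(acaps[n] in attributes for n in mandatory):
            return num, tcaps[t]
    return None  # no potential configuration fits; use the actual configuration


# Bob: no Secure RTP support, but plain RTP and RTP/AVPF with "rtcp-fb".
bob_choice = select_configuration(OFFER, {"RTP/AVP", "RTP/AVPF"}, {"rtcp-fb"})
```

With Bob's capabilities, potential configurations 1 and 2 are rejected because their transport protocols are unsupported, so the sketch settles on potential configuration 3 (RTP/AVPF), in line with the answer shown above. Optional capabilities never cause a configuration to be rejected in this sketch.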
When Alice receives Bob's answer, session negotiation has completed; however, Alice nevertheless chooses to generate a new offer using the actual configuration. This is done purely to assist any intermediaries that may reside between Alice and Bob but do not support the SDP Capability Negotiation framework (and hence may not understand the negotiation that just took place).

Alice's updated offer includes only RTP/AVPF, and it is not using the SDP Capability Negotiation framework (Alice could have included the capabilities as well if she wanted):

      v=0
      o=- 25678 753850 IN IP4 192.0.2.1
      s=
      c=IN IP4 192.0.2.1
      t=0 0
      m=audio 53456 RTP/AVPF 0 18
      a=rtcp-fb:0 nack

The "m=" line now indicates that Alice is offering to use RTP with RTCP-based feedback and using PCMU or G.729. The "rtcp-fb" attribute provides the feedback type "nack" for payload type 0 again (but as part of the actual configuration).

Bob receives the SDP session description offer from Alice, which he accepts, and then generates an answer to Alice:

      v=0
      o=- 24351 621815 IN IP4 192.0.2.2
      s=
      c=IN IP4 192.0.2.2
      t=0 0
      m=audio 54568 RTP/AVPF 0 18
      a=rtcp-fb:0 nack

Bob includes the same "rtcp-fb" attribute as before, and the session proceeds without change. Although Bob did not include any capabilities in his answer, he could have done so if he wanted.

Note that in this particular example, the answerer supported the SDP Capability Negotiation framework and hence the attributes and procedures defined here; however, had he not, the answerer would simply have ignored the new attributes received in step 1 and accepted the offer to use normal RTP. In that case, the following answer would have been generated in step 2 instead:

      v=0
      o=- 24351 621814 IN IP4 192.0.2.2
      s=
      c=IN IP4 192.0.2.2
      t=0 0
      m=audio 54568 RTP/AVP 0 18

4.2. DTLS-SRTP or SRTP with Media-Level Security Descriptions
The following example illustrates how to use the SDP Capability Negotiation framework to negotiate use of SRTP using either SDP security descriptions or DTLS-SRTP. The offerer (Alice) wants to establish a Secure RTP audio stream but is willing to use plain RTP. Alice prefers to use DTLS-SRTP as the key management protocol, but supports SDP security descriptions as well (note that [RFC5763] contains additional DTLS-SRTP examples).

The example is illustrated by the offer/answer exchange below, where Alice sends an offer to Bob:

          Alice                                     Bob

            | (1) Offer (RTP/[S]AVP,SDES | DTLS-SRTP)|
            |--------------------------------------->|
            |                                        |
            |<--------- DTLS-SRTP handshake -------->|
            |                                        |
            | (2) Answer (DTLS-SRTP)                 |
            |<---------------------------------------|
            |                                        |
            | (3) Offer (DTLS-SRTP)                  |
            |--------------------------------------->|
            |                                        |
            | (4) Answer (DTLS-SRTP)                 |
            |<---------------------------------------|
            |                                        |

Alice's offer includes an audio stream that offers use of plain RTP and Secure RTP as alternatives. For the Secure RTP stream, it can be established using either DTLS-SRTP or SDP security descriptions:

      v=0
      o=- 25678 753849 IN IP4 192.0.2.1
      s=
      t=0 0
      c=IN IP4 192.0.2.1
      a=acap:1 setup:actpass
      a=acap:2 fingerprint: SHA-1 \
            4A:AD:B9:B1:3F:82:18:3B:54:02:12:DF:3E:5D:49:6B:19:E5:7C:AB
      a=tcap:1 UDP/TLS/RTP/SAVP RTP/SAVP
      m=audio 59000 RTP/AVP 98
      a=rtpmap:98 AMR/8000
      a=acap:3 crypto:1 AES_CM_128_HMAC_SHA1_32 inline:NzB4d1BINUAvLEw6UzF3WSJ+PSdFcGdUJShpX1Zj|2^20|1:32
      a=pcfg:1 t=1 a=1,2
      a=pcfg:2 t=2 a=3

The first (and preferred) potential configuration for the audio stream specifies use of transport capability 1 (UDP/TLS/RTP/SAVP), i.e., DTLS-SRTP, and attribute capabilities 1 and 2 (active/passive mode and certificate fingerprint), both of which must be supported to choose this potential configuration. The second (and less preferred) potential configuration specifies use of transport capability 2 (RTP/SAVP) and mandatory attribute capability 3, i.e., the SDP security description.

Bob receives the SDP session description offer from Alice. Bob supports DTLS-SRTP, as preferred by Alice, and now initiates the DTLS-SRTP handshake to establish the DTLS-SRTP session (see [RFC5764] for details).

Bob also sends back an answer to Alice as follows:

      v=0
      o=- 24351 621814 IN IP4 192.0.2.2
      s=
      a=setup:active
      a=fingerprint: SHA-1 \
            FF:FF:FF:B1:3F:82:18:3B:54:02:12:DF:3E:5D:49:6B:19:E5:7C:AB
      t=0 0
      c=IN IP4 192.0.2.2
      m=audio 54568 UDP/TLS/RTP/SAVP 98
      a=rtpmap:98 AMR/8000
      a=acfg:1 t=1 a=1,2

For the audio stream, Bob accepted the use of DTLS-SRTP, and hence the profile in the "m=" line is "UDP/TLS/RTP/SAVP". Bob also includes a "setup:active" attribute to indicate he is the active
endpoint for the DTLS-SRTP session as well as the fingerprint for Bob's certificate. Bob's "acfg" attribute indicates that he chose potential configuration 1 from Alice's offer.

When Alice receives Bob's answer, session negotiation has completed (and Alice can verify the DTLS handshake using Bob's certificate fingerprint in the answer); however, Alice nevertheless chooses to generate a new offer using the actual configuration. This is done purely to assist any intermediaries that may reside between Alice and Bob but do not support the capability negotiation extensions (and hence may not understand the negotiation that just took place).

Alice's updated offer includes only DTLS-SRTP for the audio stream, and it is not using the SDP Capability Negotiation framework (Alice could have included the capabilities as well if she wanted):

      v=0
      o=- 25678 753850 IN IP4 192.0.2.1
      s=
      t=0 0
      c=IN IP4 192.0.2.1
      a=setup:actpass
      a=fingerprint: SHA-1 \
            4A:AD:B9:B1:3F:82:18:3B:54:02:12:DF:3E:5D:49:6B:19:E5:7C:AB
      m=audio 59000 UDP/TLS/RTP/SAVP 98
      a=rtpmap:98 AMR/8000

The "m=" line for the audio stream now indicates that Alice is offering to use DTLS-SRTP in active/passive mode using her certificate fingerprint provided.

Bob receives the SDP session description offer from Alice, which he accepts, and then generates an answer to Alice:

      v=0
      o=- 24351 621814 IN IP4 192.0.2.2
      s=
      a=setup:active
      a=fingerprint: SHA-1 \
            FF:FF:FF:B1:3F:82:18:3B:54:02:12:DF:3E:5D:49:6B:19:E5:7C:AB
      t=0 0
      c=IN IP4 192.0.2.2
      m=audio 54568 UDP/TLS/RTP/SAVP 98
      a=rtpmap:98 AMR/8000
      a=acfg:1 t=1 a=1,2

Bob includes the same "setup:active" and fingerprint attributes as before, and the session proceeds without change. Although Bob did not include any capabilities in his answer, he could have done so if he wanted.

Note that in this particular example, the answerer supported the capability extensions defined here; however, had he not, the answerer would simply have ignored the new attributes received in step 1 and accepted the offer to use normal RTP. In that case, the following answer would have been generated in step 2 instead:

      v=0
      o=- 24351 621814 IN IP4 192.0.2.2
      s=
      t=0 0
      c=IN IP4 192.0.2.2
      m=audio 54568 RTP/AVP 98
      a=rtpmap:98 AMR/8000

Finally, if Bob had chosen to use SDP security descriptions instead of DTLS-SRTP, the following answer would have been generated:

      v=0
      o=- 24351 621814 IN IP4 192.0.2.2
      s=
      t=0 0
      c=IN IP4 192.0.2.2
      m=audio 54568 RTP/SAVP 98
      a=rtpmap:98 AMR/8000
      a=crypto:1 AES_CM_128_HMAC_SHA1_32 inline:WSJ+PSdFcGdUJShpX1ZjNzB4d1BINUAvLEw6UzF3|2^20|1:32
      a=acfg:2 t=2 a=3

4.3. Best-Effort SRTP with Session-Level MIKEY and Media-Level Security Descriptions
The following example illustrates how to use the SDP Capability Negotiation extensions to support so-called Best-Effort Secure RTP as well as alternative keying mechanisms, more specifically MIKEY [RFC3830] and SDP security descriptions. The offerer (Alice) wants to establish an audio and video session. Alice prefers to use session-level MIKEY as the key management protocol, but supports SDP security descriptions as well. The example is illustrated by the offer/answer exchange below, where Alice sends an offer to Bob:

          Alice                                     Bob

            | (1) Offer (RTP/[S]AVP[F], SDES|MIKEY)  |
            |--------------------------------------->|
            |                                        |
            | (2) Answer (RTP/SAVP, SDES)            |
            |<---------------------------------------|
            |                                        |
            | (3) Offer (RTP/SAVP, SDES)             |
            |--------------------------------------->|
            |                                        |
            | (4) Answer (RTP/SAVP, SDES)            |
            |<---------------------------------------|
            |                                        |

Alice's offer includes an audio and a video stream. The audio stream offers use of plain RTP and Secure RTP as alternatives, whereas the video stream offers use of plain RTP, RTP with RTCP-based feedback, Secure RTP, and Secure RTP with RTCP-based feedback as alternatives:

      v=0
      o=- 25678 753849 IN IP4 192.0.2.1
      s=
      t=0 0
      c=IN IP4 192.0.2.1
      a=acap:1 key-mgmt:mikey AQAFgM0XflABAAAAAAAAAAAAAAsAyO...
      a=tcap:1 RTP/SAVPF RTP/SAVP RTP/AVPF
      m=audio 59000 RTP/AVP 98
      a=rtpmap:98 AMR/8000
      a=acap:2 crypto:1 AES_CM_128_HMAC_SHA1_32 inline:NzB4d1BINUAvLEw6UzF3WSJ+PSdFcGdUJShpX1Zj|2^20|1:32
      a=pcfg:1 t=2 a=1|2
      m=video 52000 RTP/AVP 31
      a=rtpmap:31 H261/90000
      a=acap:3 crypto:1 AES_CM_128_HMAC_SHA1_80 inline:d0RmdmcmVCspeEc3QGZiNWpVLFJhQX1cfHAwJSoj|2^20|1:32
      a=acap:4 rtcp-fb:* nack
      a=pcfg:1 t=1 a=1,4|3,4
      a=pcfg:2 t=2 a=1|3
      a=pcfg:3 t=3 a=4

The potential configuration for the audio stream specifies use of transport capability 2 (RTP/SAVP) and either attribute capability 1 (session-level MIKEY as the keying mechanism) or 2 (SDP security descriptions as the keying mechanism). Support for either of these attribute capabilities is mandatory. There are three potential configurations for the video stream.

o  The first configuration with configuration number 1 uses transport capability 1 (RTP/SAVPF) with either attribute capabilities 1 and 4 (session-level MIKEY and the "rtcp-fb" attribute) or attribute capabilities 3 and 4 (SDP security descriptions and the "rtcp-fb" attribute). In this example, the offerer insists on not only the keying mechanism being supported, but also that the "rtcp-fb" attribute is supported with the value indicated. Consequently, all the attribute capabilities are marked as mandatory in this potential configuration.

o  The second configuration with configuration number 2 uses transport capability 2 (RTP/SAVP) and either attribute capability 1 (session-level MIKEY) or attribute capability 3 (SDP security descriptions). Both attribute capabilities are mandatory in this configuration.

o  The third configuration with configuration number 3 uses transport capability 3 (RTP/AVPF) and mandatory attribute capability 4 (the "rtcp-fb" attribute).

Bob receives the SDP session description offer from Alice. Bob supports Secure RTP, Secure RTP with RTCP-based feedback, and the SDP Capability Negotiation extensions. Bob also supports SDP security descriptions, but not MIKEY, and hence he generates the following answer:

      v=0
      o=- 24351 621814 IN IP4 192.0.2.2
      s=
      t=0 0
      c=IN IP4 192.0.2.2
      m=audio 54568 RTP/SAVP 98
      a=rtpmap:98 AMR/8000
      a=crypto:1 AES_CM_128_HMAC_SHA1_32 inline:WSJ+PSdFcGdUJShpX1ZjNzB4d1BINUAvLEw6UzF3|2^20|1:32
      a=acfg:1 t=2 a=2
      m=video 55468 RTP/SAVPF 31
      a=rtpmap:31 H261/90000
      a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:AwWpVLFJhQX1cfHJSojd0RmdmcmVCspeEc3QGZiN|2^20|1:32
      a=rtcp-fb:* nack
      a=acfg:1 t=1 a=3,4

For the audio stream, Bob accepted the use of Secure RTP, and hence the profile in the "m=" line is "RTP/SAVP". Bob also includes a "crypto" attribute with his own keying material, and an "acfg" attribute identifying actual configuration 1 for the audio media stream from the offer, using transport capability 2 (RTP/SAVP) and
attribute capability 2 (the "crypto" attribute from the offer).

For the video stream, Bob accepted the use of Secure RTP with RTCP-based feedback, and hence the profile in the "m=" line is "RTP/SAVPF". Bob also includes a "crypto" attribute with his own keying material, and an "acfg" attribute identifying actual configuration 1 for the video stream from the offer, using transport capability 1 (RTP/SAVPF) and attribute capabilities 3 (the "crypto" attribute from the offer) and 4 (the "rtcp-fb" attribute from the offer).

When Alice receives Bob's answer, session negotiation has completed; however, Alice nevertheless chooses to generate a new offer using the actual configuration. This is done purely to assist any intermediaries that may reside between Alice and Bob but do not support the capability negotiation extensions (and hence may not understand the negotiation that just took place).

Alice's updated offer includes only SRTP for the audio stream and SRTP with RTCP-based feedback for the video stream, and it is not using the SDP Capability Negotiation framework (Alice could have included the capabilities as well if she wanted):

      v=0
      o=- 25678 753850 IN IP4 192.0.2.1
      s=
      t=0 0
      c=IN IP4 192.0.2.1
      m=audio 59000 RTP/SAVP 98
      a=rtpmap:98 AMR/8000
      a=crypto:1 AES_CM_128_HMAC_SHA1_32 inline:NzB4d1BINUAvLEw6UzF3WSJ+PSdFcGdUJShpX1Zj|2^20|1:32
      m=video 52000 RTP/SAVPF 31
      a=rtpmap:31 H261/90000
      a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:d0RmdmcmVCspeEc3QGZiNWpVLFJhQX1cfHAwJSoj|2^20|1:32
      a=rtcp-fb:* nack

The "m=" line for the audio stream now indicates that Alice is offering to use Secure RTP with AMR (payload type 98), whereas the "m=" line for the video stream indicates that Alice is offering to use Secure RTP with RTCP-based feedback and H.261. Each media stream includes a "crypto" attribute, which provides the SRTP keying material, with the same value again.
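
The way Bob arrived at "a=2" for the audio stream and "a=3,4" for the video stream in step 2 can be sketched as below. This is an illustrative sketch only: the helper name choose_alternative is hypothetical, all listed capabilities are treated as mandatory (no bracketed optional entries), and alternatives separated by "|" are assumed to be tried from left to right, which matches Alice's stated preference for MIKEY. The capability table reduces each "a=acap" line of the offer to its attribute name.

```python
# Attribute names behind the acap handles in Alice's offer for this example.
ACAP_NAMES = {1: "key-mgmt", 2: "crypto", 3: "crypto", 4: "rtcp-fb"}

# Bob supports SDP security descriptions and RTCP feedback, but not MIKEY.
BOB_SUPPORTS = {"crypto", "rtcp-fb"}


def choose_alternative(attr_config, acap_names, supported):
    """Return the first "|"-separated alternative that is fully supported.

    attr_config is the value of the "a=" parameter of a pcfg attribute,
    e.g. "1,4|3,4". The result is a list of capability numbers, or None
    when no alternative can be satisfied.
    """
    for alternative in attr_config.split("|"):
        caps = [int(n) for n in alternative.split(",")]
        if all(acap_names[n] in supported for n in caps):
            return caps
    return None


audio_caps = choose_alternative("1|2", ACAP_NAMES, BOB_SUPPORTS)
video_caps = choose_alternative("1,4|3,4", ACAP_NAMES, BOB_SUPPORTS)

# Format the chosen video alternative the way it appears in Bob's answer.
video_acfg = "a=acfg:1 t=1 a=" + ",".join(str(n) for n in video_caps)
```

Because the MIKEY alternatives (capability 1) fail for Bob, the sketch falls through to the SDP security descriptions alternatives, giving the "acfg" values used in the answer. An answerer that supported MIKEY would stop at the first alternative instead.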

Bob receives the SDP session description offer from Alice, which he accepts, and then generates an answer to Alice:

      v=0
      o=- 24351 621815 IN IP4 192.0.2.2
      s=
      t=0 0
      c=IN IP4 192.0.2.2
      m=audio 54568 RTP/SAVP 98
      a=rtpmap:98 AMR/8000
      a=crypto:1 AES_CM_128_HMAC_SHA1_32 inline:WSJ+PSdFcGdUJShpX1ZjNzB4d1BINUAvLEw6UzF3|2^20|1:32
      m=video 55468 RTP/SAVPF 31
      a=rtpmap:31 H261/90000
      a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:AwWpVLFJhQX1cfHJSojd0RmdmcmVCspeEc3QGZiN|2^20|1:32
      a=rtcp-fb:* nack

Bob includes the same "crypto" attribute as before, and the session proceeds without change. Although Bob did not include any capabilities in his answer, he could have done so if he wanted.

Note that in this particular example, the answerer supported the capability extensions defined here; however, had he not, the answerer would simply have ignored the new attributes received in step 1 and accepted the offer to use normal RTP. In that case, the following answer would have been generated in step 2 instead:

      v=0
      o=- 24351 621814 IN IP4 192.0.2.2
      s=
      t=0 0
      c=IN IP4 192.0.2.2
      m=audio 54568 RTP/AVP 98
      a=rtpmap:98 AMR/8000
      m=video 55468 RTP/AVP 31
      a=rtpmap:31 H261/90000
      a=rtcp-fb:* nack

Finally, if Bob had chosen to use session-level MIKEY instead of SDP security descriptions, the following answer would have been generated:

      v=0
      o=- 24351 621814 IN IP4 192.0.2.2
      s=
      t=0 0
      c=IN IP4 192.0.2.2
      a=key-mgmt:mikey AQEFgM0XflABAAAAAAAAAAAAAAYAyO...
      m=audio 54568 RTP/SAVP 98
      a=rtpmap:98 AMR/8000
      a=acfg:1 t=2 a=1
      m=video 55468 RTP/SAVPF 31
      a=rtpmap:31 H261/90000
      a=rtcp-fb:* nack
      a=acfg:1 t=1 a=1,4

It should be noted that, although Bob could have chosen session-level MIKEY for one media stream and SDP security descriptions for another media stream, there are no well-defined offerer processing rules for the resulting answer, and hence the offerer may incorrectly assume use of MIKEY for both streams. To avoid this, if the answerer chooses session-level MIKEY, then all Secure RTP-based media streams SHOULD use MIKEY (this applies irrespective of whether or not SDP Capability Negotiation is being used). Use of media-level MIKEY does not have a similar constraint.

4.4. SRTP with Session-Level MIKEY and Media-Level Security Descriptions as Alternatives
The following example illustrates how to use the SDP Capability Negotiation framework to negotiate use of either MIKEY or SDP security descriptions, when one of them is included as part of the actual configuration, and the other one is being selected. The offerer (Alice) wants to establish an audio and video session. Alice prefers to use session-level MIKEY as the key management protocol, but supports SDP security descriptions as well.

The example is illustrated by the offer/answer exchange below, where Alice sends an offer to Bob:

          Alice                                     Bob

            | (1) Offer (RTP/[S]AVP[F], SDES|MIKEY)  |
            |--------------------------------------->|
            |                                        |
            | (2) Answer (RTP/SAVP, SDES)            |
            |<---------------------------------------|
            |                                        |

Alice's offer includes an audio and a video stream. Both the audio and the video stream offer use of Secure RTP:

      v=0
      o=- 25678 753849 IN IP4 192.0.2.1
      s=
      t=0 0
      c=IN IP4 192.0.2.1
      a=key-mgmt:mikey AQAFgM0XflABAAAAAAAAAAAAAAsAyO...
      m=audio 59000 RTP/SAVP 98
      a=rtpmap:98 AMR/8000
      a=acap:1 crypto:1 AES_CM_128_HMAC_SHA1_32 inline:NzB4d1BINUAvLEw6UzF3WSJ+PSdFcGdUJShpX1Zj|2^20|1:32
      a=pcfg:1 a=-s:1
      m=video 52000 RTP/SAVP 31
      a=rtpmap:31 H261/90000
      a=acap:2 crypto:1 AES_CM_128_HMAC_SHA1_80 inline:d0RmdmcmVCspeEc3QGZiNWpVLFJhQX1cfHAwJSoj|2^20|1:32
      a=pcfg:1 a=-s:2

Alice does not know whether Bob supports MIKEY or SDP security descriptions. She could include attributes for both; however, the resulting procedures and potential interactions are not well-defined. Instead, she places a session-level "key-mgmt" attribute for MIKEY in the actual configuration with SDP security descriptions as an alternative in the potential configuration. The potential configuration for the audio stream specifies that all session-level attributes are to be deleted (i.e., the session-level "a=key-mgmt" attribute) and that mandatory attribute capability 1 is to be used (i.e., the "crypto" attribute). The potential configuration for the video stream is similar, except it uses its own mandatory "crypto" attribute capability (2). Note how the deletion of the session-level attributes does not affect the media-level attributes.

Bob receives the SDP session description offer from Alice. Bob supports Secure RTP and the SDP Capability Negotiation framework. Bob also supports both SDP security descriptions and MIKEY. Since the potential configuration is more preferred than the actual configuration, Bob (conceptually) generates an internal potential configuration SDP session description that contains the "crypto" attributes for the audio and video stream, but not the "key-mgmt" attribute for MIKEY, thereby avoiding any ambiguity between the two keying mechanisms. As a result, he generates the following answer:

      v=0
      o=- 24351 621814 IN IP4 192.0.2.2
      s=
      t=0 0
      c=IN IP4 192.0.2.2
      m=audio 54568 RTP/SAVP 98
      a=rtpmap:98 AMR/8000
      a=crypto:1 AES_CM_128_HMAC_SHA1_32 inline:WSJ+PSdFcGdUJShpX1ZjNzB4d1BINUAvLEw6UzF3|2^20|1:32
      a=acfg:1 a=-s:1
      m=video 55468 RTP/SAVP 31
      a=rtpmap:31 H261/90000
      a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:AwWpVLFJhQX1cfHJSojd0RmdmcmVCspeEc3QGZiN|2^20|1:32
      a=acfg:1 a=-s:2

For the audio stream, Bob accepted the use of Secure RTP using SDP security descriptions. Bob therefore includes a "crypto" attribute with his own keying material, and an "acfg" attribute identifying the actual configuration 1 for the audio media stream from the offer, with the delete-attributes ("-s") and attribute capability 1 (the "crypto" attribute from the offer).

For the video stream, Bob also accepted the use of Secure RTP using SDP security descriptions. Bob therefore includes a "crypto" attribute with his own keying material, and an "acfg" attribute identifying actual configuration 1 for the video stream from the offer, with the delete-attributes ("-s") and attribute capability 2.

Below, we illustrate the offer SDP session description, when Alice instead offers the "crypto" attribute as the actual configuration keying mechanism and "key-mgmt" as the potential configuration:

      v=0
      o=- 25678 753849 IN IP4 192.0.2.1
      s=
      t=0 0
      c=IN IP4 192.0.2.1
      a=acap:1 key-mgmt:mikey AQAFgM0XflABAAAAAAAAAAAAAAsAyO...
      m=audio 59000 RTP/SAVP 98
      a=rtpmap:98 AMR/8000
      a=crypto:1 AES_CM_128_HMAC_SHA1_32 inline:NzB4d1BINUAvLEw6UzF3WSJ+PSdFcGdUJShpX1Zj|2^20|1:32
      a=acap:2 rtpmap:98 AMR/8000
      a=pcfg:1 a=-m:1,2
      m=video 52000 RTP/SAVP 31
      a=rtpmap:31 H261/90000
      a=acap:3 crypto:1 AES_CM_128_HMAC_SHA1_80 inline:d0RmdmcmVCspeEc3QGZiNWpVLFJhQX1cfHAwJSoj|2^20|1:32
      a=acap:4 rtpmap:31 H261/90000
      a=pcfg:1 a=-m:1,4

Note how, this time, we need to perform delete-attributes at the media level instead of the session level. When doing that, all attributes from the actual configuration SDP session description, including the rtpmaps provided, are removed. Consequently, we had to include these rtpmaps as capabilities as well, and then include them in the potential configuration, thereby effectively recreating the original "rtpmap" attributes in the resulting potential configuration SDP session description.

5. Security Considerations
The SDP Capability Negotiation framework is defined to be used within the context of the offer/answer model, and hence all the offer/answer security considerations apply here as well [RFC3264]. Similarly, the Session Initiation Protocol (SIP) uses SDP and the offer/answer model, and hence, when used in that context, the SIP security considerations apply as well [RFC3261]. However, SDP Capability Negotiation introduces additional security issues. Its use as a mechanism to enable alternative transport protocol negotiation (secure and non-secure) as well as its ability to negotiate use of more or less secure keying methods and material warrant further security considerations. Also, the (continued) support for receiving media before answer combined with negotiation of alternative transport protocols (secure and non-secure) warrants further security considerations. We discuss these issues below.
The SDP Capability Negotiation framework allows for an offered media stream to both indicate and support various levels of security for that media stream. Different levels of security can for example be negotiated by use of alternative attribute capabilities each indicating more or less secure keying methods as well as more or less strong ciphers. Since the offerer indicates support for each of these alternatives, he will presumably accept the answerer seemingly selecting any of the offered alternatives. If an attacker can modify the SDP session description offer, he can thereby force the negotiation of the weakest security mechanism that the offerer is willing to accept. This may enable the attacker to compromise the security of the negotiated media stream. Similarly, if the offerer wishes to negotiate use of a secure media stream (e.g., Secure RTP), but includes a non-secure media stream (e.g., plain RTP) as a valid (but less preferred) alternative, then an attacker that can modify the offered SDP session description will be able to force the establishment of an insecure media stream.

The solution to both of these problems involves the use of integrity protection over the SDP session description. Ideally, this integrity protection provides end-to-end integrity protection in order to protect from any man-in-the-middle attack; secure multiparts such as Secure/Multipurpose Internet Mail Extensions (S/MIME) [RFC5751] provide one such solution; however, S/MIME requires use and availability of a Public Key Infrastructure (PKI). A slightly less secure alternative when using SIP, but generally much easier to deploy in practice, is to use SIP Identity [RFC4474]; this requires the existence of an authentication service (see [RFC4474]). Although this mechanism still requires a PKI, it only requires that servers (as opposed to end-users) have third-party validatable certificates, which significantly reduces the barrier to entry by ordinary users.
Yet another, and considerably less secure, alternative is to use hop-by-hop security only, e.g., TLS or IPsec, thereby ensuring the integrity of the offered SDP session description on a hop-by-hop basis. This is less secure because SIP allows partially trusted intermediaries on the signaling path, and such intermediaries processing the SIP request at each hop would be able to perform a man-in-the-middle attack by modifying the offered SDP session description. In simple architectures where the two UA's proxies communicate directly, the security provided by this method is roughly comparable to that provided by the previously discussed signature-based mechanisms.

Per the normal offer/answer procedures, as soon as the offerer has generated an offer, the offerer must be prepared to receive media in accordance with that offer. The SDP Capability Negotiation preserves that behavior for the actual configuration in the offer; however, the offerer has no way of knowing which configuration (actual or potential) was selected by the answerer, until an answer indication is received. This opens up a new security issue where an attacker
may be able to interject media towards the offerer until the answer is received. For example, the offerer may use plain RTP as the actual configuration and Secure RTP as an alternative potential configuration. Even though the answerer selects Secure RTP, the offerer will not know that until he receives the answer, and hence an attacker will be able to send media to the offerer meanwhile. The easiest protection against such an attack is to not offer use of the non-secure media stream in the actual configuration; however, that may in itself have undesirable side effects: If the answerer does not support the secure media stream and also does not support the capability negotiation framework, then negotiation of the media stream will fail.

Alternatively, SDP security preconditions [RFC5027] can be used. This will ensure that media is not flowing until session negotiation has completed and hence the selected configuration is known. Use of preconditions, however, requires both sides to support them. If they don't, and use of them is required, the session will fail. As a (limited) workaround to this, it is RECOMMENDED that SIP entities generate an answer SDP session description and send it to the offerer as soon as possible, for example, in a 183 Session Progress message. This will limit the time during which an attacker can send media to the offerer. Section 3.9 presents other alternatives as well.

Additional security considerations apply to the answer SDP session description as well. The actual configuration attribute tells the offerer on which potential configuration the answer was based, and hence an attacker that can either modify or remove the actual configuration attribute in the answer can cause session failure as well as extend the time window during which the offerer will accept incoming media that does not conform to the actual answer.
The solutions to this SDP session description answer integrity problem are the same as for the offer, i.e., use of end-to-end integrity protection, SIP identity, or hop-by-hop protection. The mechanism to use depends on the mechanisms supported by the offerer as well as the acceptable security trade-offs.

As described in Sections 3.1 and 3.11, SDP Capability Negotiation conceptually allows an offerer to include many different offers in a single SDP session description. This can cause the answerer to process a large number of alternative potential offers, which can consume significant memory and CPU resources. An attacker can use this amplification feature to launch a denial-of-service attack against the answerer. The answerer must protect itself from such attacks. As explained in Section 3.11, the answerer can help reduce the effects of such an attack by first discarding all potential configurations that contain unsupported transport protocols, unsupported or invalid mandatory attribute capabilities, or unsupported mandatory extension configurations. The answerer should
also look out for potential configurations that are designed to pass the above test, but nevertheless produce a large number of potential configuration SDP session descriptions that cannot be supported.

A possible way of achieving that is for an attacker to find a valid session-level attribute that causes conflicts or otherwise interferes with individual media description configurations. At the time of publication of this document, we do not know of such an SDP attribute; however, this does not mean it does not exist, or that it will not exist in the future. If such attributes are found to exist, implementers should explicitly protect against them.

A significant number of valid and supported potential configurations may remain. However, since all of those contain only valid and supported transport protocols and attributes, it is expected that only a few of them will need to be processed on average. Still, the answerer must ensure that it does not needlessly consume large amounts of memory or CPU resources when processing those as well as be prepared to handle the case where a large number of potential configurations still need to be processed.

6. IANA Considerations
6.1. New SDP Attributes
The IANA has registered the following new SDP attributes:

   Attribute name:      csup
   Long form name:      Supported capability negotiation extensions
   Type of attribute:   Session-level and media-level
   Subject to charset:  No
   Purpose:             Option tags for supported SDP Capability Negotiation extensions
   Appropriate values:  See Section 3.3.1 of RFC 5939
   Contact name:        Flemming Andreasen, fandreas@cisco.com

   Attribute name:      creq
   Long form name:      Required capability negotiation extensions
   Type of attribute:   Session-level and media-level
   Subject to charset:  No
   Purpose:             Option tags for required SDP Capability Negotiation extensions
   Appropriate values:  See Section 3.3.2 of RFC 5939
   Contact name:        Flemming Andreasen, fandreas@cisco.com
   Attribute name:      acap
   Long form name:      Attribute capability
   Type of attribute:   Session-level and media-level
   Subject to charset:  No
   Purpose:             Attribute capability containing an attribute name and associated value
   Appropriate values:  See Section 3.4.1 of RFC 5939
   Contact name:        Flemming Andreasen, fandreas@cisco.com

   Attribute name:      tcap
   Long form name:      Transport Protocol Capability
   Type of attribute:   Session-level and media-level
   Subject to charset:  No
   Purpose:             Transport protocol capability listing one or more transport protocols
   Appropriate values:  See Section 3.4.2 of RFC 5939
   Contact name:        Flemming Andreasen, fandreas@cisco.com

   Attribute name:      pcfg
   Long form name:      Potential Configuration
   Type of attribute:   Media-level
   Subject to charset:  No
   Purpose:             Potential configuration for SDP Capability Negotiation
   Appropriate values:  See Section 3.5.1 of RFC 5939
   Contact name:        Flemming Andreasen, fandreas@cisco.com

   Attribute name:      acfg
   Long form name:      Actual configuration
   Type of attribute:   Media-level
   Subject to charset:  No
   Purpose:             Actual configuration for SDP Capability Negotiation
   Appropriate values:  See Section 3.5.2 of RFC 5939
   Contact name:        Flemming Andreasen, fandreas@cisco.com

6.2. New SDP Capability Negotiation Option Tag Registry
The IANA has created a new SDP Capability Negotiation Option Tag registry. An IANA SDP Capability Negotiation Option Tag registration MUST be documented in an RFC in accordance with the [RFC5226] IETF Review policy. The RFC MUST provide the name of the option tag, a syntax, and a semantic specification of any new SDP attributes and any extensions to the potential configuration ("a=pcfg") and actual configuration ("a=acfg") attributes provided in this document. If the extension defines any new SDP attributes that are intended to be capabilities for use by the capability negotiation framework (e.g., similar to "a=acap"), those capabilities MUST adhere to the guidelines provided in Section 3.4.3. Extensions to the potential and actual configuration attributes MUST adhere to the syntax provided in Sections 3.5.1 and 3.5.2.

The option tag "cap-v0" is defined in this document, and the IANA has registered this option tag.

6.3. New SDP Capability Negotiation Potential Configuration Parameter Registry
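The parameters tracked by this registry are the keys that appear in an "a=pcfg" value, for example "t" and "a" in "a=pcfg:1 t=1 a=1,[2]" from Section 4.1. As a rough illustration only, the Python sketch below splits such a value into its configuration number, the parameters registered by this document, and anything else, which it treats as an extension. The function name parse_pcfg and the KNOWN_PARAMETERS set are hypothetical, the leading "+" is assumed to mark a mandatory extension parameter as in the syntax of Section 3.5.1, and the code does not interpret the parameter values or implement the full grammar.

```python
# Parameter names registered by this document: "a" (attribute) and
# "t" (transport protocol). Any other name is treated as an extension.
KNOWN_PARAMETERS = {"a", "t"}


def parse_pcfg(value):
    """Split the value of an "a=pcfg" attribute into its parts.

    Hypothetical helper: `value` is the text after "a=pcfg:", for example
    "1 t=1 a=1,[2]". Only the simple "name=value" form is handled.
    """
    tokens = value.split()
    config_number = int(tokens[0])
    known, extensions = {}, {}
    for token in tokens[1:]:
        name, _, param_value = token.partition("=")
        # Assumption: a leading "+" marks a mandatory extension parameter.
        mandatory = name.startswith("+")
        name = name.lstrip("+")
        if name in KNOWN_PARAMETERS:
            known[name] = param_value
        else:
            extensions[name] = {"value": param_value, "mandatory": mandatory}
    return config_number, known, extensions


# Usage with the first potential configuration from Section 4.1.
number, known, extensions = parse_pcfg("1 t=1 a=1,[2]")
print(number, known, extensions)
```

An answerer that finds a mandatory entry in the extensions dictionary for a parameter it does not support would, under these assumptions, discard that potential configuration rather than try to process it.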
The IANA has created a new SDP Capability Negotiation Potential Configuration Parameter registry. An IANA SDP Capability Negotiation Potential Configuration registration MUST be documented in an RFC in accordance with the [RFC5226] IETF Review policy. The RFC MUST define the syntax and semantics of each new potential configuration parameter. The syntax MUST adhere to the syntax provided for extensions in Section 3.5.1, and the semantics MUST adhere to the semantics provided for extensions in Sections 3.5.1 and 3.5.2. Each registration MUST include the encoding name for the parameter as well as a short descriptive name for it.

The potential configuration parameters "a" for "attribute" and "t" for "transport protocol" are defined in this document, and the IANA has registered them.

7. Acknowledgments
The SDP Capability Negotiation solution defined in this document draws on the overall capability negotiation framework that was defined by [SDPng]. Also, the SDP Capability Negotiation solution is heavily influenced by the discussions and work done by the SDP Capability Negotiation Design Team.

The following people in particular provided useful comments and suggestions to either the document itself or the overall direction of the solution defined here: Francois Audet, John Elwell, Roni Even, Miguel Garcia, Robert Gilman, Cullen Jennings, Jonathan Lennox, Matt Lepinski, Jean-Francois Mule, Joerg Ott, Colin Perkins, Jonathan Rosenberg, Thomas Stach, and Dan Wing.

General Area review comments were provided by Christian Vogt, and Stephen Kent provided Security Directorate review comments. Eric Rescorla provided textual input to the Security Considerations. Alexey Melnikov, Robert Sparks, and Magnus Westerlund provided several review comments as well.
8. References
8.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with Session Description Protocol (SDP)", RFC 3264, June 2002.

[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.

[RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", BCP 26, RFC 5226, May 2008.

[RFC5234] Crocker, D., Ed., and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, January 2008.

[RFC5245] Rosenberg, J., "Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols", RFC 5245, April 2010.

8.2. Informative References
[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.

[RFC3312] Camarillo, G., Ed., Marshall, W., Ed., and J. Rosenberg, "Integration of Resource Management and Session Initiation Protocol (SIP)", RFC 3312, October 2002.

[RFC3262] Rosenberg, J. and H. Schulzrinne, "Reliability of Provisional Responses in Session Initiation Protocol (SIP)", RFC 3262, June 2002.

[RFC3407] Andreasen, F., "Session Description Protocol (SDP) Simple Capability Declaration", RFC 3407, October 2002.

[RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control", STD 65, RFC 3551, July 2003.
[RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. Norrman, "The Secure Real-time Transport Protocol (SRTP)", RFC 3711, March 2004.

[RFC3830] Arkko, J., Carrara, E., Lindholm, F., Naslund, M., and K. Norrman, "MIKEY: Multimedia Internet KEYing", RFC 3830, August 2004.

[RFC4145] Yon, D. and G. Camarillo, "TCP-Based Media Transport in the Session Description Protocol (SDP)", RFC 4145, September 2005.

[RFC4474] Peterson, J. and C. Jennings, "Enhancements for Authenticated Identity Management in the Session Initiation Protocol (SIP)", RFC 4474, August 2006.

[RFC4567] Arkko, J., Lindholm, F., Naslund, M., Norrman, K., and E. Carrara, "Key Management Extensions for Session Description Protocol (SDP) and Real Time Streaming Protocol (RTSP)", RFC 4567, July 2006.

[RFC4568] Andreasen, F., Baugher, M., and D. Wing, "Session Description Protocol (SDP) Security Descriptions for Media Streams", RFC 4568, July 2006.

[RFC4585] Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey, "Extended RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585, July 2006.

[RFC4588] Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R. Hakenberg, "RTP Retransmission Payload Format", RFC 4588, July 2006.

[RFC4756] Li, A., "Forward Error Correction Grouping Semantics in Session Description Protocol", RFC 4756, November 2006.

[RFC5027] Andreasen, F. and D. Wing, "Security Preconditions for Session Description Protocol (SDP) Media Streams", RFC 5027, October 2007.

[RFC5124] Ott, J. and E. Carrara, "Extended Secure RTP Profile for Real-time Transport Control Protocol (RTCP)-Based Feedback (RTP/SAVPF)", RFC 5124, February 2008.

[RFC5751] Ramsdell, B. and S. Turner, "Secure/Multipurpose Internet Mail Extensions (S/MIME) Version 3.2 Message Specification", RFC 5751, January 2010.
[RFC5763] Fischl, J., Tschofenig, H., and E. Rescorla, "Framework for Establishing a Secure Real-time Transport Protocol (SRTP) Security Context Using Datagram Transport Layer Security (DTLS)", RFC 5763, May 2010.

[RFC5764] McGrew, D. and E. Rescorla, "Datagram Transport Layer Security (DTLS) Extension to Establish Keys for the Secure Real-time Transport Protocol (SRTP)", RFC 5764, May 2010.

[RFC5888] Camarillo, G. and H. Schulzrinne, "The Session Description Protocol (SDP) Grouping Framework", RFC 5888, June 2010.

[BESRTP] Kaplan, H. and F. Audet, "Session Description Protocol (SDP) Offer/Answer Negotiation For Best-Effort Secure Real-Time Transport Protocol", Work in Progress, October 2006.

[ICETCP] Rosenberg, J., Keranen, A., Lowekamp, B., and A. Roach, "TCP Candidates with Interactive Connectivity Establishment (ICE)", Work in Progress, September 2010.

[SDPMedCap] Gilman, R., Even, R., and F. Andreasen, "SDP media capabilities Negotiation", Work in Progress, July 2010.

[SDPng] Kutscher, D., Ott, J., and C. Bormann, "Session Description and Capability Negotiation", Work in Progress, February 2005.

Author's Address
   Flemming Andreasen
   Cisco Systems
   Iselin, NJ 08830
   USA

   EMail: fandreas@cisco.com