Internet Engineering Task Force (IETF)                      F. Andreasen
Request for Comments: 5939                                 Cisco Systems
Category: Standards Track                                 September 2010
ISSN: 2070-1721

       Session Description Protocol (SDP) Capability Negotiation

Abstract

The Session Description Protocol (SDP) was intended to describe multimedia sessions for the purposes of session announcement, session invitation, and other forms of multimedia session initiation. SDP was not intended to provide capability indication or capability negotiation; however, over the years, SDP has seen widespread adoption and as a result it has been gradually extended to provide limited support for these, notably in the form of the offer/answer model defined in RFC 3264. SDP does not define how to negotiate one or more alternative transport protocols (e.g., RTP profiles) or attributes. This makes it difficult to deploy new RTP profiles such as Secure RTP or RTP with RTCP-based feedback, negotiate use of different security keying mechanisms, etc. It also presents problems for some forms of media negotiation. The purpose of this document is to address these shortcomings by extending SDP with capability negotiation parameters and associated offer/answer procedures to use those parameters in a backwards compatible manner. The document defines a general SDP Capability Negotiation framework. It also specifies how to provide attributes and transport protocols as capabilities and negotiate them using the framework. Extensions for other types of capabilities (e.g., media types and media formats) may be provided in other documents.

Status of This Memo

This is an Internet Standards Track document.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc5939.

Copyright Notice

Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Table of Contents

   1. Introduction
   2. Conventions Used in This Document
   3. SDP Capability Negotiation Solution
      3.1. SDP Capability Negotiation Model
      3.2. Solution Overview
      3.3. Version and Extension Indication Attributes
      3.4. Capability Attributes
      3.5. Configuration Attributes
      3.6. Offer/Answer Model Extensions
      3.7. Interactions with ICE
      3.8. Interactions with SIP Option Tags
      3.9. Processing Media before Answer
      3.10. Indicating Bandwidth Usage
      3.11. Dealing with Large Number of Potential Configurations
      3.12. SDP Capability Negotiation and Intermediaries
      3.13. Considerations for Specific Attribute Capabilities
      3.14. Relationship to RFC 3407
   4. Examples
      4.1. Multiple Transport Protocols
      4.2. DTLS-SRTP or SRTP with Media-Level Security Descriptions
      4.3. Best-Effort SRTP with Session-Level MIKEY and Media-Level
           Security Descriptions
      4.4. SRTP with Session-Level MIKEY and Media-Level Security
           Descriptions as Alternatives
   5. Security Considerations
   6. IANA Considerations
      6.1. New SDP Attributes
      6.2. New SDP Capability Negotiation Option Tag Registry
      6.3. New SDP Capability Negotiation Potential Configuration
           Parameter Registry
   7. Acknowledgments
   8. References
      8.1. Normative References
      8.2. Informative References

1. Introduction

The Session Description Protocol (SDP) was intended to describe multimedia sessions for the purposes of session announcement, session invitation, and other forms of multimedia session initiation. An SDP session description contains one or more media stream descriptions with information such as IP address and port, type of media stream (e.g., audio or video), transport protocol (possibly including profile information, e.g., RTP/AVP or RTP/SAVP), media formats (e.g., codecs), and various other session and media stream parameters that define the session.

Simply providing media stream descriptions is sufficient for session announcements for a broadcast application, where the media stream parameters are fixed for all participants. A participant who wants to join the session obtains the session announcement and uses the media descriptions provided, e.g., joins a multicast group and receives media packets in the encoding format specified. A participant who does not support the media stream description is unable to receive the media.

Such restrictions are generally not acceptable for multimedia session invitations, where two or more entities attempt to establish a media session that uses a set of media stream parameters acceptable to all participants. First, each entity must inform the other of its receive address; second, the entities need to agree on the media stream parameters to use for the session, e.g., transport protocols and codecs. To solve this, RFC 3264 [RFC3264] defined the offer/answer model, whereby an offerer constructs an offer SDP session description that lists the media streams, codecs, and other SDP parameters that the offerer is willing to use. This offer session description is sent to the answerer, which chooses from among the media streams, codecs, and other session description parameters provided, and generates an answer session description with its own parameters, based on that choice.
The answer session description is sent back to the offerer thereby completing the session negotiation and enabling the establishment of the negotiated media streams.

Taking a step back, we can make a distinction between the capabilities supported by each participant, the way in which those capabilities can be supported, and the parameters that can actually be used for the session. More generally, we can say that we have the following:

o  A set of capabilities for the session and its associated media stream components, supported by each side. The capability indications by themselves do not imply a commitment to use the capabilities in the session. Capabilities can, for example, be that the "RTP/SAVP" profile is supported, that the "PCMU" (Pulse Code Modulation mu-law) codec is supported, or that the "crypto" attribute is supported with a particular value.

o  A set of potential configurations indicating which combinations of those capabilities can be used for the session and its associated media stream components. Potential configurations are not ready for use. Instead, they provide an alternative that may be used, subject to further negotiation. A potential configuration can, for example, indicate that the "PCMU" codec and the "RTP/SAVP" transport protocol are not only supported (i.e., listed as capabilities), but they are offered for potential use in the session.

o  An actual configuration for the session and its associated media stream components, that specifies which combinations of session parameters and media stream components can be used currently and with what parameters. Use of an actual configuration does not require any further negotiation. An actual configuration can, for example, be that the "PCMU" codec and the "RTP/SAVP" transport protocol are offered for use currently.

o  A negotiation process that takes the set of actual and potential configurations (combinations of capabilities) as input and provides the negotiated actual configurations as output.

SDP by itself was designed to provide only one of these, namely listing of the actual configurations; however, over the years, use of SDP has been extended beyond its original scope. Of particular importance are the session negotiation semantics that were defined by the offer/answer model in RFC 3264. In this model, both the offer and the answer contain actual configurations; separate capabilities and potential configurations are not supported.

Other relevant extensions have been defined as well. RFC 3407 [RFC3407] defined simple capability declarations, which extends SDP with a simple and limited set of capability descriptions.
Similarly, RFC 5888 [RFC5888] defined grouping of media lines, which allows media lines in SDP to have semantics other than the traditional "simultaneous media streams" semantics. Each of these extensions was designed to solve a specific limitation of SDP. Since SDP had already been stretched beyond its original intent, a more comprehensive capability declaration and negotiation process was intentionally not defined. Instead, work on a "next generation" of a protocol to provide session description and capability negotiation was initiated [SDPng]. SDPng defined a comprehensive capability negotiation framework and protocol that was not bound by existing SDP constraints. SDPng was not designed to be backwards compatible with existing SDP and hence required both sides to support it, with a graceful fallback to legacy operation when needed. This, combined with lack of ubiquitous multipart MIME support in the protocols that would carry SDP or SDPng, made it challenging to migrate towards SDPng. In practice, SDPng has not gained traction and, as of the time of publication of this document, work on SDPng has stopped.

Existing real-time multimedia communication protocols such as SIP, Real Time Streaming Protocol (RTSP), Megaco, and Media Gateway Control Protocol (MGCP) continue to use SDP. However, SDP does not address an increasingly important problem: the ability to negotiate one or more alternative transport protocols (e.g., RTP profiles) and associated parameters (e.g., SDP attributes). This makes it difficult to deploy new RTP profiles such as Secure RTP (SRTP) [RFC3711] and RTP with RTCP-based feedback [RFC4585]. The problem is exacerbated by the fact that RTP profiles are defined independently. When a new profile is defined and N other profiles already exist, there is a potential need for defining N additional profiles, since profiles cannot be combined automatically. For example, in order to support the plain and secure versions of RTP, each with and without RTCP-based feedback, four separate profiles (and hence profile definitions) are needed: RTP/AVP [RFC3551], RTP/SAVP [RFC3711], RTP/AVPF [RFC4585], and RTP/SAVPF [RFC5124].

In addition to the pressing profile negotiation problem, other important real-life limitations have been found as well. Keying material and other parameters, for example, need to be negotiated with some of the transport protocols, but not others. Similarly, some media formats and types of media streams need to negotiate a variety of different parameters.

The purpose of this document is to define a mechanism that enables SDP to provide limited support for indicating capabilities and their associated potential configurations, and negotiate the use of those potential configurations as actual configurations. It is not the intent to provide a full-fledged capability indication and negotiation mechanism along the lines of SDPng or ITU-T H.245. Instead, the focus is on addressing a set of well-known real-life limitations. More specifically, this document provides a general SDP Capability Negotiation framework that is backwards compatible with existing SDP. It also defines specifically how to provide attributes and transport protocols as capabilities and negotiate them using the framework. Extensions for other types of capabilities (e.g., media types and formats) may be provided in other documents.

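The distinction drawn in this section between capabilities, potential configurations, and actual configurations can be pictured as a small data model. The Python sketch below is illustrative only and is not part of the mechanism defined in this document: the class names, the "kind" labels, the numeric preference field, and the rule that unspecified parameters are inherited from the actual configuration are all assumptions made for the example. The "PCMU" and "RTP/SAVP" values are the ones used as examples in the text above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Capability:
    """Something a party supports; listing it implies no commitment to use it."""
    kind: str   # hypothetical labels: "transport" or "codec"
    value: str


@dataclass(frozen=True)
class PotentialConfiguration:
    """A combination of capabilities offered for possible use, pending negotiation."""
    preference: int                      # assumption: lower value = more preferred
    capabilities: Tuple[Capability, ...]


@dataclass(frozen=True)
class ActualConfiguration:
    """Parameters that can be used right away, with no further negotiation."""
    transport: str
    codec: str


def promote(potential: PotentialConfiguration,
            fallback: ActualConfiguration) -> ActualConfiguration:
    """Turn a negotiated potential configuration into an actual configuration.

    Assumption of this sketch: any parameter the potential configuration
    does not mention is inherited from the fallback actual configuration.
    """
    chosen = {cap.kind: cap.value for cap in potential.capabilities}
    return ActualConfiguration(
        transport=chosen.get("transport", fallback.transport),
        codec=chosen.get("codec", fallback.codec),
    )


savp = Capability("transport", "RTP/SAVP")
actual = ActualConfiguration(transport="RTP/AVP", codec="PCMU")
potential = PotentialConfiguration(preference=1, capabilities=(savp,))
negotiated = promote(potential, actual)
```

Here the offerer keeps "RTP/AVP" with "PCMU" as the configuration usable immediately, while "RTP/SAVP" is only a candidate until the negotiation process selects it.
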
As mentioned above, SDP is used by several protocols, and hence the mechanism should be usable by all of these. One particularly important protocol for this problem is the Session Initiation Protocol (SIP) [RFC3261]. SIP uses the offer/answer model [RFC3264] (which is not specific to SIP) to negotiate sessions and hence the mechanism defined here provides the offer/answer procedures to use for the capability negotiation framework.

The rest of the document is structured as follows. In Section 3, we present the SDP Capability Negotiation solution, which consists of new SDP attributes and associated offer/answer procedures. In Section 4, we provide examples illustrating its use. In Section 5, we provide the security considerations.

2. Conventions Used in This Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

3. SDP Capability Negotiation Solution

In this section, we first present the conceptual model behind the SDP Capability Negotiation framework followed by an overview of the SDP Capability Negotiation solution. We then define new SDP attributes for the solution and provide its associated updated offer/answer procedures.

3.1. SDP Capability Negotiation Model

Our model uses the concepts of

o  Capabilities

o  Potential Configurations

o  Actual Configurations

o  Negotiation Process

as defined in Section 1.

Conceptually, we want to offer not just the actual configuration SDP session description (which is done with the offer/answer model defined in [RFC3264]), but the actual configuration SDP session description as well as one or more alternative SDP session descriptions, i.e., potential configurations. The answerer must choose either the actual configuration or one of the potential configurations, and generate an answer SDP session description based on that. The offerer may need to perform processing on the answer, which depends on the offer that was chosen (actual or potential configuration). The answerer therefore informs the offerer which configuration the answerer chose. The process can be viewed *conceptually* as follows:

   Offerer                                   Answerer
   =======                                   ========

   1) Generate offer with actual
      configuration and alternative
      potential configurations
   2) Send offer with all configurations

   +------------+
   |  SDP o1    |
   |  (actual   |
   |  config    |
   |            |-+         Offer
   +------------+ |         ----->   3) Process offered configurations
     | SDP o2     |                     in order of preference indicated
     | (potential |                  4) Generate answer based on chosen
     |  config 1) |-+                   configuration (e.g., o2), and
     +------------+ |                   inform offerer which one was
       | SDP o3     |                   chosen
       | (potential |
       |  config 2) |-+
       +------------+ |
         | SDP ...    |
         :            :

                                        +------------+
                                        |  SDP a1    |
                             Answer     |  (actual   |
                             <-----     |  config,o2)|
                                        |            |
   5) Process answer based on           +------------+
      the configuration that was
      chosen (o2), as indicated in
      the answer

The above illustrates the conceptual model: the actual solution uses a single SDP session description, which contains the actual configuration (as with existing SDP session descriptions and the offer/answer model defined in [RFC3264]) and several new attributes and associated procedures, that encode the capabilities and potential configurations. A more accurate depiction of the actual offer SDP session description is therefore as follows:

           +--------------------+
           | SDP o1             |
           | (actual            |
           |  config            |
           |                    |
           | +-------------+    |
           | | capability 1|    |
           | | capability 2|    |
           | | ...         |    |
           | +-------------+    |    Offer
           |                    |    ----->
           | +-------------+    |
           | | potential   |    |
           | | config 1    |    |
           | | potential   |    |
           | | config 2    |    |
           | | ...         |    |
           | +-------------+    |
           |                    |
           +--------------------+

The above structure is used for two reasons:

o  Backwards compatibility: As noted above, support for multipart MIME is not ubiquitous. By encoding both capabilities and potential configurations in SDP attributes, we can represent everything in a single SDP session description thereby avoiding any multipart MIME support issues. Furthermore, since unknown SDP attributes are ignored by the SDP recipient, we ensure that entities that do not support the framework simply perform the regular RFC 3264 offer/answer procedures. This provides us with seamless backwards compatibility.

o  Message size efficiency: When we have multiple media streams, each of which may potentially use two or more different transport protocols with a variety of different associated parameters, the number of potential configurations can be large. If each possible alternative is represented as a complete SDP session description in an offer, we can easily end up with large messages. By providing a more compact encoding, we get more efficient message sizes.

In the next section, we describe the exact structure and specific SDP parameters used to represent this.
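
The five conceptual steps and the backwards-compatibility property described in this section can be walked through in a short Python sketch. It is a rough illustration under stated assumptions, not the encoding or the procedures defined by this document: the dictionary layout, the function names, the configuration numbers, and the rule that potential configurations are tried in list order before the actual configuration are all hypothetical.

```python
def build_offer():
    """Steps 1-2: a single offer carrying the actual configuration
    together with the alternative potential configurations."""
    return {
        "actual": {"transport": "RTP/AVP"},
        # Hypothetical numbering; list order stands in for the
        # offerer's order of preference in this sketch.
        "potential": [
            (1, {"transport": "RTP/SAVP"}),
            (2, {"transport": "RTP/SAVPF"}),
        ],
    }


def answer(offer, supported_transports, understands_framework=True):
    """Steps 3-4: choose a configuration and report which one was chosen."""
    if understands_framework:
        for number, config in offer["potential"]:
            if config["transport"] in supported_transports:
                return {"config": config, "chosen": number}
    # An answerer that does not support the framework never looks at the
    # potential configurations and answers the actual configuration, as in
    # regular offer/answer. A supporting answerer also ends up here when no
    # potential configuration is acceptable.
    if offer["actual"]["transport"] in supported_transports:
        return {"config": offer["actual"], "chosen": None}
    return None  # nothing acceptable in this sketch


def process_answer(offer, ans):
    """Step 5: the offerer maps the answer back to the configuration chosen."""
    if ans["chosen"] is None:
        return offer["actual"]
    return dict(offer["potential"])[ans["chosen"]]


offer = build_offer()
new_style = answer(offer, {"RTP/AVP", "RTP/SAVPF"})
legacy = answer(offer, {"RTP/AVP", "RTP/SAVP"}, understands_framework=False)
```

In this sketch the supporting answerer selects potential configuration 2 because it does not support "RTP/SAVP", while the legacy answerer falls back to the actual configuration even though it could have used "RTP/SAVP".
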