RFC 5939

Session Description Protocol (SDP) Capability Negotiation

Pages: 77
Proposed Standard
Updated by: 6871

RFC5939 - Page 1

Internet Engineering Task Force (IETF)                      F. Andreasen
Request for Comments: 5939                                 Cisco Systems
Category: Standards Track                                 September 2010
ISSN: 2070-1721


       Session Description Protocol (SDP) Capability Negotiation

Abstract

   The Session Description Protocol (SDP) was intended to describe
   multimedia sessions for the purposes of session announcement, session
   invitation, and other forms of multimedia session initiation.  SDP
   was not intended to provide capability indication or capability
   negotiation; however, over the years, SDP has seen widespread
   adoption and as a result it has been gradually extended to provide
   limited support for these, notably in the form of the offer/answer
   model defined in RFC 3264.  SDP does not define how to negotiate one
   or more alternative transport protocols (e.g., RTP profiles) or
   attributes.  This makes it difficult to deploy new RTP profiles such
   as Secure RTP or RTP with RTCP-based feedback, negotiate use of
   different security keying mechanisms, etc.  It also presents problems
   for some forms of media negotiation.

   The purpose of this document is to address these shortcomings by
   extending SDP with capability negotiation parameters and associated
   offer/answer procedures to use those parameters in a backwards
   compatible manner.

   The document defines a general SDP Capability Negotiation framework.
   It also specifies how to provide attributes and transport protocols
   as capabilities and negotiate them using the framework.  Extensions
   for other types of capabilities (e.g., media types and media formats)
   may be provided in other documents.

Status of This Memo

   This is an Internet Standards Track document.

   This document is a product of the Internet Engineering Task Force
   (IETF).  It represents the consensus of the IETF community.  It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG).  Further information on
   Internet Standards is available in Section 2 of RFC 5741.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc5939.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1. Introduction ....................................................4
   2. Conventions Used in This Document ...............................7
   3. SDP Capability Negotiation Solution .............................7
      3.1. SDP Capability Negotiation Model ...........................7
      3.2. Solution Overview .........................................10
      3.3. Version and Extension Indication Attributes ...............14
      3.4. Capability Attributes .....................................17
      3.5. Configuration Attributes ..................................22
      3.6. Offer/Answer Model Extensions .............................32
      3.7. Interactions with ICE .....................................45
      3.8. Interactions with SIP Option Tags .........................47
      3.9. Processing Media before Answer ............................48
      3.10. Indicating Bandwidth Usage ...............................49
      3.11. Dealing with Large Number of Potential Configurations ....50
      3.12. SDP Capability Negotiation and Intermediaries ............51
      3.13. Considerations for Specific Attribute Capabilities .......52
      3.14. Relationship to RFC 3407 .................................54
   4. Examples .......................................................54
      4.1. Multiple Transport Protocols ..............................54
      4.2. DTLS-SRTP or SRTP with Media-Level Security Descriptions...58
      4.3. Best-Effort SRTP with Session-Level MIKEY and Media-Level
           Security Descriptions .....................................61
      4.4. SRTP with Session-Level MIKEY and Media-Level Security
           Descriptions as Alternatives ..............................66
   5. Security Considerations ........................................69
   6. IANA Considerations ............................................72
      6.1. New SDP Attributes ........................................72
      6.2. New SDP Capability Negotiation Option Tag Registry ........73
      6.3. New SDP Capability Negotiation Potential
           Configuration Parameter Registry ..........................74
   7. Acknowledgments ................................................74
   8. References .....................................................75
      8.1. Normative References ......................................75
      8.2. Informative References ....................................75

1.  Introduction

   The Session Description Protocol (SDP) was intended to describe
   multimedia sessions for the purposes of session announcement, session
   invitation, and other forms of multimedia session initiation.  An SDP
   session description contains one or more media stream descriptions
   with information such as IP address and port, type of media stream
   (e.g., audio or video), transport protocol (possibly including
   profile information, e.g., RTP/AVP or RTP/SAVP), media formats (e.g.,
   codecs), and various other session and media stream parameters that
   define the session.

   Simply providing media stream descriptions is sufficient for session
   announcements for a broadcast application, where the media stream
   parameters are fixed for all participants.  When a participant wants
   to join the session, he obtains the session announcement and uses the
   media descriptions provided, e.g., joins a multicast group and
   receives media packets in the encoding format specified.  If the
   media stream description is not supported by the participant, he is
   unable to receive the media.

   Such restrictions are not generally acceptable for multimedia session
   invitations, where two or more entities attempt to establish a media
   session that uses a set of media stream parameters acceptable to all
   participants.  First of all, each entity must inform the other of its
   receive address, and secondly, the entities need to agree on the
   media stream parameters to use for the session, e.g., transport
   protocols and codecs.  To solve this, RFC 3264 [RFC3264] defined the
   offer/answer model, whereby an offerer constructs an offer SDP
   session description that lists the media streams, codecs, and other
   SDP parameters that the offerer is willing to use.  This offer
   session description is sent to the answerer, which chooses from among
   the media streams, codecs, and other session description parameters
   provided, and generates an answer session description with its own
   parameters, based on that choice.  The answer session description is
   sent back to the offerer thereby completing the session negotiation
   and enabling the establishment of the negotiated media streams.
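
   As a rough illustration of the exchange just described, the
   following Python sketch models one offered media stream and an
   answerer that keeps only the codecs it supports.  The names
   MediaOffer and answer_stream are hypothetical and are not defined by
   RFC 3264 or by this document; real offer/answer processing involves
   many more rules than are shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaOffer:
    """One media stream: media type, transport, and codecs in preference order."""
    media: str
    transport: str
    codecs: tuple


def answer_stream(offer, supported_transports, supported_codecs):
    """Return the answerer's description of the stream, or None to reject it."""
    # A plain offer names exactly one transport per stream, so an
    # unsupported transport leaves the answerer nothing to choose from.
    if offer.transport not in supported_transports:
        return None
    # Keep the offered codecs the answerer supports, in the offerer's order.
    chosen = tuple(c for c in offer.codecs if c in supported_codecs)
    if not chosen:
        return None
    return MediaOffer(offer.media, offer.transport, chosen)


offer = MediaOffer("audio", "RTP/AVP", ("PCMU", "G729"))
answer = answer_stream(offer, {"RTP/AVP"}, {"PCMU", "PCMA"})
print(answer)
```

   Note that the sketch can select among codecs but has no way to select
   among transports, which is the kind of limitation this document sets
   out to address.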

   Taking a step back, we can make a distinction between the
   capabilities supported by each participant, the way in which those
   capabilities can be supported, and the parameters that can actually
   be used for the session.  More generally, we can say that we have the
   following:

   o  A set of capabilities for the session and its associated media
      stream components, supported by each side.  The capability
      indications by themselves do not imply a commitment to use the
      capabilities in the session.

      Capabilities can, for example, be that the "RTP/SAVP" profile is
      supported, that the "PCMU" (Pulse Code Modulation mu-law) codec is
      supported, or that the "crypto" attribute is supported with a
      particular value.

   o  A set of potential configurations indicating which combinations of
      those capabilities can be used for the session and its associated
      media stream components.  Potential configurations are not ready
      for use.  Instead, they provide an alternative that may be used,
      subject to further negotiation.

      A potential configuration can, for example, indicate that the
      "PCMU" codec and the "RTP/SAVP" transport protocol are not only
      supported (i.e., listed as capabilities), but they are offered for
      potential use in the session.

   o  An actual configuration for the session and its associated media
      stream components that specifies which combinations of session
      parameters and media stream components can be used currently and
      with what parameters.  Use of an actual configuration does not
      require any further negotiation.

      An actual configuration can, for example, be that the "PCMU" codec
      and the "RTP/SAVP" transport protocol are offered for use
      currently.

   o  A negotiation process that takes the set of actual and potential
      configurations (combinations of capabilities) as input and
      provides the negotiated actual configurations as output.
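
   The distinction drawn in the list above can be pictured with a small
   data model.  This is only a sketch: the class names Capability and
   Configuration, the "kind" labels, and the helper undeclared are
   assumptions made for illustration and do not correspond to any
   syntax defined in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """Something a side supports; listing it is not a commitment to use it."""
    kind: str   # hypothetical labels: "transport", "codec", "attribute"
    value: str


@dataclass(frozen=True)
class Configuration:
    """A combination of capabilities offered for the session."""
    capabilities: frozenset
    usable_now: bool  # True: actual configuration; False: potential


def undeclared(config, declared):
    """Return capabilities a configuration uses that were never declared."""
    return config.capabilities - declared


pcmu = Capability("codec", "PCMU")
avp = Capability("transport", "RTP/AVP")
savp = Capability("transport", "RTP/SAVP")
declared = frozenset({pcmu, avp, savp})

# Offered for use right away, with no further negotiation required.
actual = Configuration(frozenset({pcmu, avp}), usable_now=True)
# An alternative that may be used, subject to further negotiation.
potential = Configuration(frozenset({pcmu, savp}), usable_now=False)

print(undeclared(potential, declared))
```

   In this model, a potential configuration is well-formed only if it
   combines capabilities that have been declared, which is what the
   check at the end verifies.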

   SDP by itself was designed to provide only one of these, namely
   listing of the actual configurations; however, over the years, use of
   SDP has been extended beyond its original scope.  Of particular
   importance are the session negotiation semantics that were defined by
   the offer/answer model in RFC 3264.  In this model, both the offer
   and the answer contain actual configurations; separate capabilities
   and potential configurations are not supported.

   Other relevant extensions have been defined as well.  RFC 3407
   [RFC3407] defined simple capability declarations, which extend SDP
   with a simple and limited set of capability descriptions.  Grouping
   of media lines, which defines how media lines in SDP can have other
   semantics than the traditional "simultaneous media streams"
   semantics, was defined in RFC 5888 [RFC5888], etc.

   Each of these extensions was designed to solve a specific limitation
   of SDP.  Since SDP had already been stretched beyond its original
   intent, a more comprehensive capability declaration and negotiation
   process was intentionally not defined.  Instead, work on a "next
   generation" of a protocol to provide session description and
   capability negotiation was initiated [SDPng].  SDPng defined a
   comprehensive capability negotiation framework and protocol that was
   not bound by existing SDP constraints.  SDPng was not designed to be
   backwards compatible with existing SDP and hence required both sides
   to support it, with a graceful fallback to legacy operation when
   needed.  This, combined with lack of ubiquitous multipart MIME
   support in the protocols that would carry SDP or SDPng, made it
   challenging to migrate towards SDPng.  In practice, SDPng has not
   gained traction and, as of the time of publication of this document,
   work on SDPng has stopped.

   Existing real-time multimedia communication protocols such as SIP,
   Real Time Streaming Protocol (RTSP), Megaco, and Media Gateway
   Control Protocol (MGCP) continue to use SDP.  However, SDP does not
   address an increasingly important problem: the ability to negotiate
   one or more alternative transport protocols (e.g., RTP profiles) and
   associated parameters (e.g., SDP attributes).  This makes it
   difficult to deploy new RTP profiles such as Secure RTP (SRTP)
   [RFC3711], RTP with RTCP-based feedback [RFC4585], etc.  The problem
   is exacerbated by the fact that RTP profiles are defined
   independently.  When a new profile is defined and N other profiles
   already exist, there is a potential need for defining N additional
   profiles, since profiles cannot be combined automatically.  For
   example, in order to support both plain and Secure RTP, each with and
   without RTCP-based feedback, four separate profiles (and hence
   profile definitions) are needed: RTP/AVP [RFC3551], RTP/SAVP
   [RFC3711], RTP/AVPF [RFC4585], and RTP/SAVPF [RFC5124].

   In addition to the pressing profile negotiation problem, other
   important real-life limitations have been found as well.  Keying
   material and other parameters, for example, need to be negotiated
   with some of the transport protocols, but not others.  Similarly,
   some media formats and types of media streams need to negotiate a
   variety of different parameters.
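
   The profile multiplication described above (a new profile combined
   with N existing ones potentially requires N additional profiles) can
   be made concrete with a short Python sketch.  The function names
   profile_name and profiles_needed are hypothetical, and the doubling
   rule assumes that each new feature is independent of and combinable
   with every existing profile, which need not hold in practice.

```python
from itertools import product


def profile_name(secure, feedback):
    """Build the RTP profile name for one combination of the two features."""
    return "RTP/" + ("S" if secure else "") + "AVP" + ("F" if feedback else "")


def profiles_needed(independent_features):
    """Profiles required if every on/off feature combination needs its own."""
    return 2 ** independent_features


# Two on/off features (security, RTCP-based feedback) give four profiles.
profiles = [profile_name(s, f) for s, f in product((False, True), repeat=2)]
print(profiles)
print(profiles_needed(2), profiles_needed(3))
```

   Under that assumption, a third independent feature would bring the
   total from four profiles to eight, which illustrates why defining
   profiles one combination at a time scales poorly.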

   The purpose of this document is to define a mechanism that enables
   SDP to provide limited support for indicating capabilities and their
   associated potential configurations, and negotiate the use of those
   potential configurations as actual configurations.  It is not the
   intent to provide a full-fledged capability indication and
   negotiation mechanism along the lines of SDPng or ITU-T H.245.
   Instead, the focus is on addressing a set of well-known real-life
   limitations.  More specifically, the solution provided in this
   document provides a general SDP Capability Negotiation framework that
   is backwards compatible with existing SDP.  It also defines
   specifically how to provide attributes and transport protocols as
   capabilities and negotiate them using the framework.  Extensions for
   other types of capabilities (e.g., media types and formats) may be
   provided in other documents.

   As mentioned above, SDP is used by several protocols, and hence the
   mechanism should be usable by all of these.  One particularly
   important protocol for this problem is the Session Initiation
   Protocol (SIP) [RFC3261].  SIP uses the offer/answer model [RFC3264]
   (which is not specific to SIP) to negotiate sessions and hence the
   mechanism defined here provides the offer/answer procedures to use
   for the capability negotiation framework.

   The rest of the document is structured as follows.  In Section 3, we
   present the SDP Capability Negotiation solution, which consists of
   new SDP attributes and associated offer/answer procedures.  In
   Section 4, we provide examples illustrating its use.  In Section 5,
   we provide the security considerations.

2.  Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  SDP Capability Negotiation Solution

   In this section, we first present the conceptual model behind the SDP
   Capability Negotiation framework followed by an overview of the SDP
   Capability Negotiation solution.  We then define new SDP attributes
   for the solution and provide its associated updated offer/answer
   procedures.

3.1.  SDP Capability Negotiation Model

   Our model uses the concepts of

   o  Capabilities

   o  Potential Configurations

   o  Actual Configurations

   o  Negotiation Process

   as defined in Section 1.  Conceptually, we want to offer not just the
   actual configuration SDP session description (which is done with the
   offer/answer model defined in [RFC3264]), but the actual
   configuration SDP session description as well as one or more
   alternative SDP session descriptions, i.e., potential configurations.
   The answerer must choose either the actual configuration or one of
   the potential configurations, and generate an answer SDP session
   description based on that.  The offerer may need to perform
   processing on the answer, which depends on the offer that was chosen
   (actual or potential configuration).  The answerer therefore informs
   the offerer which configuration the answerer chose.  The process can
   be viewed *conceptually* as follows:

        Offerer                           Answerer
        =======                           ========

   1) Generate offer with actual
      configuration and alternative
      potential configurations
   2) Send offer with all configurations

   +------------+
   | SDP o1     |
   | (actual    |
   |  config)   |
   |            |-+      Offer
   +------------+ |      ----->   3) Process offered configurations
     | SDP o2     |                  in order of preference indicated
     | (potential |               4) Generate answer based on chosen
     |  config 1) |-+                configuration (e.g., o2), and
     +------------+ |                inform offerer which one was
       | SDP o3     |                chosen
       | (potential |
       |  config 2) |-+
       +------------+ |
         | SDP ...    |
         :            :

                                      +------------+
                                      | SDP a1     |
                        Answer        | (actual    |
                        <-----        |  config,o2)|
                                      |            |
   5) Process answer based on         +------------+
      the configuration that was
      chosen (o2), as indicated in
      the answer
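
   The five conceptual steps in the figure can be sketched in Python as
   follows.  The sketch is an assumption-laden illustration, not the
   procedure defined later in this document: the function names are
   hypothetical, each configuration is reduced to the set of
   capabilities it requires, and the position of a configuration in the
   list is assumed to express the offerer's preference.

```python
def choose_configuration(offer, supported):
    """Answerer, steps 3 and 4: return the label of the first usable entry."""
    for label, required in offer:
        # A configuration is usable only if everything it requires is supported.
        if required <= supported:
            return label
    return None


def process_answer(offer, chosen_label):
    """Offerer, step 5: recover the configuration the answer refers to."""
    return dict(offer)[chosen_label]


# Steps 1 and 2: one offer carrying all configurations, most preferred first
# (this ordering is an assumption of the sketch).
offer = [
    ("o2", {"RTP/SAVP", "PCMU", "crypto"}),   # potential configuration 1
    ("o3", {"RTP/SAVPF", "PCMU", "crypto"}),  # potential configuration 2
    ("o1", {"RTP/AVP", "PCMU"}),              # actual configuration
]

secure_answerer = {"RTP/AVP", "RTP/SAVP", "PCMU", "crypto"}
plain_answerer = {"RTP/AVP", "PCMU"}

chosen = choose_configuration(offer, secure_answerer)
print(chosen, sorted(process_answer(offer, chosen)))
print(choose_configuration(offer, plain_answerer))
```

   Returning the label alongside the answer mirrors the point made
   above: the offerer can process the answer correctly only if it is
   told which of its configurations the answerer chose.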

   The above illustrates the conceptual model: the actual solution uses
   a single SDP session description, which contains the actual
   configuration (as with existing SDP session descriptions and the
   offer/answer model defined in [RFC3264]) and several new attributes
   and associated procedures that encode the capabilities and potential
   configurations.  A more accurate depiction of the actual offer SDP
   session description is therefore as follows:

          +--------------------+
          | SDP o1             |
          | (actual            |
          |  config)           |
          |                    |
          | +-------------+    |
          | | capability 1|    |
          | | capability 2|    |
          | | ...         |    |
          | +-------------+    |   Offer
          |                    |   ----->
          | +-------------+    |
          | | potential   |    |
          | |   config 1  |    |
          | | potential   |    |
          | |   config 2  |    |
          | | ...         |    |
          | +-------------+    |
          |                    |
          +--------------------+

   The above structure is used for two reasons:

   o  Backwards compatibility:   As noted above, support for multipart
      MIME is not ubiquitous.  By encoding both capabilities and
      potential configurations in SDP attributes, we can represent
      everything in a single SDP session description thereby avoiding
      any multipart MIME support issues.  Furthermore, since unknown SDP
      attributes are ignored by the SDP recipient, we ensure that
      entities that do not support the framework simply perform the
      regular RFC 3264 offer/answer procedures.  This provides us with
      seamless backwards compatibility.

   o  Message size efficiency:   When we have multiple media streams,
      each of which may potentially use two or more different transport
      protocols with a variety of different associated parameters, the
      number of potential configurations can be large.  If each possible
      alternative is represented as a complete SDP session description
      in an offer, we can easily end up with large messages.  By
      providing a more compact encoding, we get more efficient message
      sizes.
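
   The backwards-compatibility argument can be illustrated with a
   sketch of how an endpoint that does not support the framework would
   see an offer.  The attribute names "tcap", "acap", and "pcfg" are
   those defined later in this document (Sections 3.4 and 3.5); the
   addresses, ports, and key placeholder are illustrative only, and the
   LEGACY_ATTRIBUTES set and the function names are hypothetical.

```python
OFFER = """\
v=0
o=- 25678 753849 IN IP4 192.0.2.1
s=-
c=IN IP4 192.0.2.1
t=0 0
m=audio 53456 RTP/AVP 0
a=rtpmap:0 PCMU/8000
a=tcap:1 RTP/SAVP
a=acap:1 crypto:1 AES_CM_128_HMAC_SHA1_80 inline:...
a=pcfg:1 t=1 a=1
"""

# Attributes a hypothetical legacy endpoint understands (illustrative set).
LEGACY_ATTRIBUTES = {"rtpmap", "fmtp", "ptime", "sendrecv"}


def attribute_name(line):
    """Return the name of an 'a=' line, or None for any other line type."""
    if not line.startswith("a="):
        return None
    return line[2:].split(":", 1)[0]


def legacy_view(sdp):
    """Return the lines left after unknown attributes have been ignored."""
    kept = []
    for line in sdp.splitlines():
        name = attribute_name(line)
        if name is None or name in LEGACY_ATTRIBUTES:
            kept.append(line)
    return kept


for line in legacy_view(OFFER):
    print(line)
```

   What remains is an ordinary RTP/AVP offer that can be handled with
   the regular RFC 3264 procedures.  The example also hints at the size
   argument: the secure alternative costs three attribute lines here
   rather than a second complete session description.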

   In the next section, we describe the exact structure and specific SDP
   parameters used to represent this.
