This section provides further details to the overview in Section 4. First, formal syntax is provided in Section 5.1, followed by the rest of the SDP attribute definition in Section 5.2. Section 5.5 provides the definition of the RTP/RTCP mechanisms used. The section concludes with a number of examples.
This document defines a new SDP media-level "a=simulcast" attribute, with value according to the syntax in Figure 3, which uses ABNF [RFC 5234] and its update, [RFC 7405]:
sc-value = ( sc-send [SP sc-recv] ) / ( sc-recv [SP sc-send] )
sc-send = %s"send" SP sc-str-list
sc-recv = %s"recv" SP sc-str-list
sc-str-list = sc-alt-list *( ";" sc-alt-list )
sc-alt-list = sc-id *( "," sc-id )
sc-id-paused = "~"
sc-id = [sc-id-paused] rid-id
; SP defined in [RFC5234]
; rid-id defined in [RFC8851]
The "a=simulcast" attribute has a parameter in the form of one or two simulcast stream descriptions, each consisting of a direction ("send" or "recv"), followed by a list of one or more simulcast streams. Each simulcast stream consists of one or more alternative simulcast formats. Each simulcast format is identified by a simulcast stream identifier (rid-id). The rid-id
MUST have the form of an RTP stream identifier, as described by [
RFC 8851].
In the list of simulcast streams, each simulcast stream is separated by a semicolon (";"). Each simulcast stream can, in turn, be offered in one or more alternative formats, represented by rid-ids, separated by commas (","). Each rid-id can also be specified as initially paused [RFC 7728], indicated by prepending a "~" to the rid-id. The reason to allow separate initial pause states for each rid-id is that pause capability can be specified individually for each RTP payload type referenced by a rid-id. Since pause capability specified via the "a=rtcp-fb" attribute applies only to specified payload types, and a rid-id specified by "a=rid" can refer to multiple different payload types, it is unfeasible to pause streams with rid-id where any of the related RTP payload type(s) do not have pause capability.
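The grammar and the description above can be illustrated with a small parser. This is a minimal sketch, not a normative implementation: the names `SimulcastId` and `parse_simulcast` are hypothetical, the input is assumed to be the text following "a=simulcast:", and the character set of each rid-id (defined in [RFC 8851]) is deliberately not validated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulcastId:
    """One sc-id: a rid-id plus its initial pause flag (hypothetical type)."""
    rid: str
    paused: bool = False


def parse_simulcast(value):
    """Parse an sc-value such as 'send 1;2;~4,3 recv 3'.

    Returns {direction: [stream, ...]}, where each stream is a list of
    SimulcastId alternatives in the order they were listed.
    """
    tokens = value.split(" ")
    if len(tokens) not in (2, 4):
        raise ValueError("expected one or two '<direction> <list>' pairs")
    result = {}
    for direction, str_list in zip(tokens[0::2], tokens[1::2]):
        if direction not in ("send", "recv"):
            raise ValueError(f"unknown direction: {direction!r}")
        if direction in result:
            raise ValueError(f"direction {direction!r} occurs twice")
        streams = []
        for alt_list in str_list.split(";"):      # sc-str-list
            alternatives = []
            for sc_id in alt_list.split(","):     # sc-alt-list
                paused = sc_id.startswith("~")    # sc-id-paused
                rid = sc_id[1:] if paused else sc_id
                if not rid:
                    raise ValueError("empty rid-id")
                alternatives.append(SimulcastId(rid, paused))
            streams.append(alternatives)
        result[direction] = streams
    return result
```

For the value "send 1;2;~4,3", the sketch yields three simulcast streams in the "send" direction, the third of which has two alternatives, with rid-id 4 marked as initially paused.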
Simulcast capability is expressed through a new media-level SDP attribute, "a=simulcast" (Section 5.1). The use of this attribute at the session level is undefined. Implementations of this specification
MUST NOT use it at the session level and
MUST ignore it if received at the session level. Extensions to this specification may define such session-level usage. Each SDP media description
MUST contain at most one "a=simulcast" line.
There are separate and independent sets of simulcast streams in the "send" and "receive" directions. When listing multiple directions, each direction
MUST NOT occur more than once on the same line.
Simulcast streams using undefined rid-ids
MUST NOT be used as valid simulcast streams by an RTP stream receiver. The direction for a rid-id
MUST be aligned with the direction specified for the corresponding RTP stream identifier on the "a=rid" line.
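The two rid-id requirements above can be checked mechanically. The following is a sketch under stated assumptions: `check_rids` is a hypothetical helper, the "a=simulcast" value is assumed to be already parsed into a dictionary of rid-id lists (with any "~" prefix removed), and the "a=rid" lines of the same media description are assumed to be summarized as a mapping from rid-id to declared direction.

```python
def check_rids(simulcast, rid_directions):
    """Return a list of problems found in a parsed "a=simulcast" value.

    simulcast:      e.g. {"send": [["1"], ["2"]], "recv": [["3"]]}
    rid_directions: e.g. {"1": "send", "2": "send", "3": "recv"},
                    taken from the "a=rid" lines (assumed input shape).
    """
    problems = []
    for direction, streams in simulcast.items():
        for alternatives in streams:
            for rid in alternatives:
                declared = rid_directions.get(rid)
                if declared is None:
                    # Undefined rid-ids must not be treated as valid streams.
                    problems.append(
                        f"rid-id {rid} is not defined by any a=rid line")
                elif declared != direction:
                    # Direction must match the one on the "a=rid" line.
                    problems.append(
                        f"rid-id {rid} is declared as {declared} "
                        f"but listed under {direction}")
    return problems
```

An empty result means that every listed rid-id is defined and that its direction matches the corresponding "a=rid" line.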
The listed number of simulcast streams for a direction sets a limit to the number of supported simulcast streams in that direction. The order of the listed simulcast streams in the "send" direction suggests a proposed order of preference, in decreasing order: the rid-id listed first is the most preferred, and subsequent streams have progressively lower preference. The order of the listed rid-ids in the "recv" direction expresses which simulcast streams are preferred, with the leftmost being most preferred. This can be of importance if the number of actually sent simulcast streams has to be reduced for some reason.
rid-ids that have explicit dependencies [RFC 5583] [RFC 8851] to other rid-ids (even in the same media description) MAY be used.
Use of more than a single, alternative simulcast format for a simulcast stream
MAY be specified as part of the attribute parameters by expressing the simulcast stream as a comma-separated list of alternative rid-ids. The order of the rid-id alternatives within a simulcast stream is significant; the rid-id alternatives are listed from (left) most preferred to (right) least preferred. For the use of simulcast, this overrides the normal codec preference as expressed by format-type ordering on the "m=" line, using regular SDP rules. This is to enable a separation of general codec preferences and simulcast-stream configuration preferences. However, the choice of which alternative to use per simulcast stream is independent, and there is currently no mechanism for the offerer to force the answerer to choose the same alternative for multiple simulcast streams.
A simulcast stream can use a codec defined such that the same RTP synchronization source (SSRC) can change RTP payload type multiple times during a session, possibly even on a per-packet basis. A typical example is a speech codec that makes use of formats for Comfort Noise (CN) [RFC 3389] and/or dual-tone multifrequency (DTMF) [RFC 4733].
If RTP stream pause/resume [RFC 7728] is supported, any rid-id
MAY be prefixed by a "~" character to indicate that the corresponding simulcast stream is paused already from the start of the RTP session. In this case, support for RTP stream pause/resume
MUST also be included under the same "m=" line where "a=simulcast" is included. All RTP payload types related to such an initially paused simulcast stream
MUST be listed in the SDP as pause/resume capable as specified by [
RFC 7728] -- e.g., by using the "*" wildcard format for "a=rtcp-fb".
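The payload-type requirement above amounts to a subset check. The sketch below assumes hypothetical inputs: a mapping from each initially paused rid-id to the set of RTP payload types it can use, and the set of payload types for which an "a=rtcp-fb" line advertises "ccm pause" (with "*" standing for the wildcard format). Neither the function name nor the input shapes come from this specification.

```python
def paused_rids_lacking_pause_support(paused_rid_pts, pause_capable_pts):
    """Return the initially paused rid-ids that violate the requirement.

    paused_rid_pts:    {rid-id: set of payload types it may use}
    pause_capable_pts: payload types listed as pause/resume capable;
                       the string "*" represents the wildcard format.
    """
    if "*" in pause_capable_pts:
        # The wildcard covers every payload type in the media description.
        return []
    return sorted(
        rid for rid, pts in paused_rid_pts.items()
        if not pts <= pause_capable_pts   # some payload type is not covered
    )
```

A non-empty result identifies rid-ids that cannot be marked with "~" unless pause/resume capability is added for the missing payload types.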
An initially paused simulcast stream in the "send" direction for the endpoint sending the SDP
MUST be considered equivalent to an unsolicited locally paused stream and handled accordingly. Initially paused simulcast streams are resumed as described by the RTP pause/resume specification. An RTP stream receiver that wishes to resume an unsolicited locally paused stream needs to know the SSRC of that stream. The SSRC of an initially paused simulcast stream can be obtained from an RTP stream sender RTCP Sender Report (SR) or Receiver Report (RR) that includes the desired SSRC as initial SSRC in the source description (SDES) chunk, optionally a MID SDES item [RFC 8843] (if used and if rid-ids are not unique across "m=" lines), and the rid-id value in an RtpStreamId RTCP SDES item [RFC 8852].
If the endpoint sending the SDP includes a "recv"-direction simulcast stream that is initially paused, then the remote RTP sender receiving the SDP
SHOULD put its RTP stream in an unsolicited locally paused state. The simulcast stream sender does not put the stream in the locally paused state if there are other RTP stream receivers in the session that do not mark the simulcast stream as initially paused. However, in centralized conferencing, the RTP sender usually does not see the SDP signaling from RTP receivers and cannot make this determination. The reason for requiring that an initially paused "recv" stream be considered locally paused by the remote RTP sender instead of making it equivalent to implicitly sending a pause request is that the pausing RTP sender cannot know which receiving SSRC owns the restriction when Temporary Maximum Media Stream Bit Rate Request (TMMBR) and Temporary Maximum Media Stream Bit Rate Notification (TMMBN) are used for pause/resume signaling (
Section 5.6 of
RFC 7728); this is because the RTP receiver's SSRC in the "send" direction is sometimes not yet known.
Use of the redundant audio data format [
RFC 2198] could be seen as a form of simulcast for loss-protection purposes, but it is not considered conflicting with the mechanisms described in this memo and
MAY therefore be used as any other format. In this case, the "red" format, rather than the carried formats,
SHOULD be the one to list as a simulcast stream on the "a=simulcast" line.
The media formats and corresponding characteristics of simulcast streams
SHOULD be chosen such that they are different -- e.g., as different SDP formats with differing "a=rtpmap" and/or "a=fmtp" lines, or as differently defined RTP payload format restrictions. If this difference is not required, it is
RECOMMENDED to use RTP duplication procedures [
RFC 7104] instead of simulcast. To avoid complications in implementations, a single rid-id
MUST NOT occur more than once per "a=simulcast" line. Note that this does not eliminate use of simulcast as an RTP duplication mechanism, since it is possible to define multiple different rid-ids that are effectively equivalent.
Note: The inclusion of "a=simulcast" or the use of simulcast does not change any of the interpretation or Offer/Answer procedures for other SDP attributes, such as "a=fmtp" or "a=rid".
An offerer wanting to use simulcast for a media description
SHALL include one "a=simulcast" attribute in that media description in the offer. An offerer listing a set of receive simulcast streams and/or alternative formats as rid-ids in the offer
MUST be prepared to receive RTP streams for any of those simulcast streams and/or alternative formats from the answerer.
An answerer that does not understand the concept of simulcast will also not know the attribute and will remove it in the SDP answer, as defined in existing SDP offer/answer procedures [
RFC 3264]. Since SDP session-level simulcast is undefined in this memo, an answerer that receives an offer with the "a=simulcast" attribute on the SDP session level
SHALL remove it in the answer. An answerer that understands the attribute but receives multiple "a=simulcast" attributes in the same media description
SHALL disable use of simulcast by removing all "a=simulcast" lines for that media description in the answer.
An answerer that does understand the attribute and wants to support simulcast in an indicated direction
SHALL reverse directionality of the unidirectional direction parameters -- "send" becomes "recv" and vice versa -- and include it in the answer.
An answerer that receives an offer with simulcast containing an "a=simulcast" attribute listing alternative rid-ids
MAY keep all the alternative rid-ids in the answer, but it
MAY also choose to remove any nondesirable alternative rid-ids in the answer. The answerer
MUST NOT add any alternative rid-ids in the "send" direction in the answer that were not present in the offer receive direction. The answerer
MUST be prepared to receive any of the receive-direction rid-id alternatives and
MAY send any of the "send"-direction alternatives that are part of the answer.
An answerer that receives an offer with simulcast that lists a number of simulcast streams
MAY reduce the number of simulcast streams in the answer, but it
MUST NOT add simulcast streams.
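The answerer rules above (reverse the directions, optionally drop alternatives or whole streams, never add any) can be sketched as follows. The names `make_answer` and `format_simulcast` and the `keep` callback are hypothetical; rid-ids are plain strings, and a leading "~" is carried through unchanged.

```python
SWAP = {"send": "recv", "recv": "send"}


def make_answer(offer, keep=lambda direction, rid: True):
    """Build the answer's simulcast description from the offer's.

    offer: e.g. {"send": [["1"], ["2"]], "recv": [["3"]]}
    keep:  callback deciding, per answer direction and rid-id, whether an
           alternative is retained (an assumption made for this sketch).
    """
    answer = {}
    for direction, streams in offer.items():
        answer_direction = SWAP[direction]   # "send" becomes "recv" and vice versa
        kept_streams = []
        for alternatives in streams:
            kept = [rid for rid in alternatives if keep(answer_direction, rid)]
            if kept:                         # a stream with no alternatives is dropped
                kept_streams.append(kept)
        if kept_streams:
            answer[answer_direction] = kept_streams
    return answer


def format_simulcast(description):
    """Serialize a description back to the attribute value."""
    return " ".join(
        direction + " " + ";".join(",".join(alts) for alts in streams)
        for direction, streams in description.items()
    )
```

Because the answer is built only by filtering what the offer contains, it cannot add simulcast streams or alternatives. With an offer of "send 1;2 recv 3" and the default `keep`, the result serializes as "recv 1;2 send 3".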
An answerer that receives an offer without RTP stream pause/resume capability
MUST NOT mark any simulcast streams as initially paused in the answer.
An RTP stream answerer capable of pause/resume that receives an offer with RTP stream pause/resume capability
MAY mark any rid-ids that refer to pause/resume capable formats as initially paused in the answer.
An answerer that receives indication in an offer of a rid-id being initially paused
SHOULD mark that rid-id as initially paused also in the answer, regardless of direction, unless it has good reason for the rid-id not being initially paused. One reason to remove an initial pause in the answer compared to the offer could be, for example, that all "receive"-direction simulcast streams for a media source the answerer accepts in the answer would otherwise be paused.
An offerer that receives an answer without "a=simulcast"
MUST NOT use simulcast towards the answerer. An offerer that receives an answer with "a=simulcast" without any rid-id in a specified direction
MUST NOT use simulcast in that direction.
An offerer that receives an answer where some rid-id alternatives are kept
MUST be prepared to receive any of the kept "send"-direction rid-id alternatives and
MAY send any of the kept "receive"-direction rid-id alternatives.
An offerer that receives an answer where some of the rid-ids are removed compared to the offer
MAY release the corresponding resources (codec, transport, etc) in its "receive" direction and
MUST NOT send any RTP packets corresponding to the removed rid-ids.
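When processing the answer, the offerer needs to know which rid-ids were removed. The sketch below compares the two descriptions; `compare_answer` is a hypothetical helper, both inputs are assumed to be already parsed into dictionaries of rid-id lists, and any "~" prefix is ignored for the comparison.

```python
SWAP = {"send": "recv", "recv": "send"}


def _rids(streams):
    """Collect all rid-ids in a list of streams, ignoring pause markers."""
    return {rid.lstrip("~") for alternatives in streams for rid in alternatives}


def compare_answer(offer, answer):
    """Return (removed, added), both keyed by the offerer's direction.

    removed: rid-ids in the offer that are absent from the answer; the
             offerer stops sending these or may release receive resources.
    added:   rid-ids present only in the answer, which the answerer is
             not allowed to introduce.
    """
    removed, added = {}, {}
    for direction in ("send", "recv"):
        offered = _rids(offer.get(direction, []))
        # The answer lists the same streams under the reversed direction.
        answered = _rids(answer.get(SWAP[direction], []))
        removed[direction] = sorted(offered - answered)
        added[direction] = sorted(answered - offered)
    return removed, added
```

If the answer carries no description for a direction at all, every offered rid-id for that direction is reported as removed, which corresponds to not using simulcast in that direction.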
An offerer that offered some of its rid-ids as initially paused and receives an answer that does not indicate RTP stream pause/resume capability
MUST NOT initially pause any simulcast streams.
An offerer with RTP stream pause/resume capability that receives an answer where some rid-ids are marked as initially paused
SHOULD initially pause those RTP streams, even if they were marked as initially paused also in the offer, unless it has good reason for those RTP streams not being initially paused. One such reason could be, for example, that the answerer would otherwise initially not receive any media of that type at all.
Offers inside an existing session follow the same rules as for initial SDP offer, with these additions:
- rid-ids marked as initially paused in the offerer's "send" direction SHALL reflect the offerer's opinion of the current pause state at the time of creating the offer. This is purely informational, and RTP stream pause/resume signaling [RFC 7728] in the ongoing session SHALL take precedence in case of any conflict or ambiguity.
- rid-ids marked as initially paused in the offerer's "receive" direction SHALL (as in an initial offer) reflect the offerer's desired rid-id pause state. Except for the case where the offerer already paused the corresponding RTP stream through [RFC 7728] signaling, this is identical to the conditions at an initial offer.
Creation of SDP answers and processing of SDP answers inside an existing session follow the same rules as described above for initial SDP offer/answer.
Session modification restrictions in
Section 6.5 of
RFC 8851 also apply.
This document does not define the use of "a=simulcast" in declarative SDP, partly because use of the "a=rid" attribute [RFC 8851] is not defined for use in declarative SDP. If concrete use cases for simulcast in declarative SDP are identified in the future, the authors of this memo expect that additional specifications will address such use.
Simulcast RTP streams
MUST be related on the RTP level through RtpStreamId [RFC 8852], as specified in the SDP "a=simulcast" attribute (Section 5.2) parameters. This is sufficient as long as there is only a single media source per SDP media description. When using BUNDLE [RFC 8843], where multiple SDP media descriptions jointly specify a single RTP session, the SDES MID (Media Identification) mechanism in BUNDLE allows relating RTP streams back to individual media descriptions, after which the RtpStreamId relations described above can be used. Use of the RTP header extension for RTCP source description items [RFC 7941] for both MID and RtpStreamId identifications can be important to ensure rapid initial reception, required to correctly interpret and process the RTP streams. Implementers of this specification
MUST support the RTCP source description (SDES) item method and
SHOULD support RTP header extension method to signal RtpStreamId on the RTP level.
NOTE: For the case where it is clear from SDP that the RTP PT uniquely maps to a corresponding RtpStreamId, an RTP receiver can use RTP PT to relate simulcast streams. This can sometimes enable decoding even in advance of receiving RtpStreamId information in RTCP SDES and/or RTP header extensions.
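Whether a payload type maps uniquely to an RtpStreamId can be derived from the "a=rid" lines. The following sketch assumes a hypothetical input, a mapping from each rid-id to the set of payload types it may use (an "a=rid" line without a "pt" parameter would have to be expanded to all payload types of the media description before calling it), and returns only the unambiguous mappings.

```python
def unique_pt_to_rid(rid_pts):
    """Map each payload type used by exactly one rid-id to that rid-id.

    rid_pts: {rid-id: set of RTP payload types}, e.g. {"1": {97}, "2": {98}}
    """
    owners = {}
    for rid, pts in rid_pts.items():
        for pt in pts:
            owners.setdefault(pt, []).append(rid)
    # Payload types shared by several rid-ids cannot identify a stream.
    return {pt: rids[0] for pt, rids in owners.items() if len(rids) == 1}
```

A receiver could use such a table to relate an incoming RTP stream to a simulcast stream before any RtpStreamId information has arrived; payload types absent from the table still require RtpStreamId.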
RTP streams
MUST only use a single alternative rid-id at a time (based on RTP timestamps) but
MAY change format (and rid-id) on a per-RTP packet basis. This corresponds to the existing (nonsimulcast) SDP offer/answer case when multiple formats are included on the "m=" line in the SDP answer, enabling per-RTP packet change of RTP payload type.
These examples describe a client-to-video-conference service, using a centralized media topology with an RTP mixer.
+---+ +-----------+ +---+
| A |<---->| |<---->| B |
+---+ | | +---+
| Mixer |
+---+ | | +---+
| F |<---->| |<---->| J |
+---+ +-----------+ +---+
Alice is calling in to the mixer with a simulcast-enabled client capable of a single media source per media type. The client can send a simulcast of 2 video resolutions and frame rates: HD 1280x720p 30fps and thumbnail 320x180p 15fps. This is defined below using the "imageattr" attribute [RFC 6236]. In this example, only the "pt" "a=rid" parameter is used to describe simulcast stream formats, effectively achieving a 1:1 mapping between RtpStreamId and media formats (RTP payload types).
Alice's Offer:
v=0
o=alice 2362969037 2362969040 IN IP4 192.0.2.156
s=Simulcast-Enabled Client
c=IN IP4 192.0.2.156
t=0 0
m=audio 49200 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 49300 RTP/AVP 97 98
a=rtpmap:97 H264/90000
a=rtpmap:98 H264/90000
a=fmtp:97 profile-level-id=42c01f;max-fs=3600;max-mbps=108000
a=fmtp:98 profile-level-id=42c00b;max-fs=240;max-mbps=3600
a=imageattr:97 send [x=1280,y=720] recv [x=1280,y=720]
a=imageattr:98 send [x=320,y=180] recv [x=320,y=180]
a=rid:1 send pt=97
a=rid:2 send pt=98
a=rid:3 recv pt=97
a=simulcast:send 1;2 recv 3
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
The only thing in the SDP that indicates simulcast capability is the line in the video media description containing the "simulcast" attribute. The included "a=fmtp" and "a=imageattr" parameters indicate that sent simulcast streams can differ in video resolution. The RTP header extension for RtpStreamId is offered to avoid issues with the initial binding between RTP streams (SSRCs) and the RtpStreamId identifying the simulcast stream and its format.
The answer from the server indicates that it, too, is simulcast capable. Should it not have been simulcast capable, the "a=simulcast" line would not have been present, and communication would have started with the media negotiated in the SDP. Also, the usage of the RtpStreamId RTP header extension is accepted.
v=0
o=server 823479283 1209384938 IN IP4 192.0.2.2
s=Answer to Simulcast-Enabled Client
c=IN IP4 192.0.2.43
t=0 0
m=audio 49672 RTP/AVP 0
a=rtpmap:0 PCMU/8000
m=video 49674 RTP/AVP 97 98
a=rtpmap:97 H264/90000
a=rtpmap:98 H264/90000
a=fmtp:97 profile-level-id=42c01f;max-fs=3600;max-mbps=108000
a=fmtp:98 profile-level-id=42c00b;max-fs=240;max-mbps=3600
a=imageattr:97 send [x=1280,y=720] recv [x=1280,y=720]
a=imageattr:98 send [x=320,y=180] recv [x=320,y=180]
a=rid:1 recv pt=97
a=rid:2 recv pt=98
a=rid:3 send pt=97
a=simulcast:recv 1;2 send 3
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
Since the server is the simulcast media receiver, it reverses the direction of the "simulcast" and "rid" attribute parameters.
Fred is calling in to the same conference as in the example above with a two-camera, two-display system, thus capable of handling two separate media sources in each direction, where each media source is simulcast enabled in the "send" direction. Fred's client is restricted to a single media source per media description.
The first two simulcast streams for the first media source use different codecs, H264-SVC [RFC 6190] and H264 [RFC 6184]. These two simulcast streams also have a temporal dependency. Two different video codecs, VP8 [RFC 7741] and H264, are offered as alternatives for the third simulcast stream for the first media source. Only the highest-fidelity simulcast stream is sent from start, the lower-fidelity streams being initially paused.
The second media source is offered with three different simulcast streams. All video streams of this second media source are loss protected by RTP retransmission [RFC 4588]. In addition, all but the highest-fidelity simulcast stream are initially paused. Note that the lower resolution is more prioritized than the medium-resolution simulcast stream.
Fred's client is also using BUNDLE to send all RTP streams from all media descriptions in the same RTP session on a single media transport. Although many different simulcast streams are used in this example, the use of RtpStreamId as simulcast stream identification enables use of a low number of RTP payload types. Note that when using both BUNDLE [RFC 8843] and "a=rid" [RFC 8851], it is recommended to use the RTP header extension for RTCP source description items [RFC 7941] for carrying these RTP stream-identification fields, which is consequently also included in the SDP. Note also that for "a=rid", the corresponding RtpStreamId SDES attribute RTP header extension is named "rtp-stream-id" [RFC 8852].
v=0
o=fred 238947129 823479223 IN IP6 2001:db8::c000:27d
s=Offer from Simulcast-Enabled Multi-Source Client
c=IN IP6 2001:db8::c000:27d
t=0 0
a=group:BUNDLE foo bar zen
m=audio 49200 RTP/AVP 99
a=mid:foo
a=rtpmap:99 G722/8000
m=video 49600 RTP/AVPF 100 101 103
a=mid:bar
a=rtpmap:100 H264-SVC/90000
a=rtpmap:101 H264/90000
a=rtpmap:103 VP8/90000
a=fmtp:100 profile-level-id=42400d;max-fs=3600;max-mbps=216000; \
mst-mode=NI-TC
a=fmtp:101 profile-level-id=42c00d;max-fs=3600;max-mbps=108000
a=fmtp:103 max-fs=900; max-fr=30
a=rid:1 send pt=100;max-width=1280;max-height=720;max-fps=60;depend=2
a=rid:2 send pt=101;max-width=1280;max-height=720;max-fps=30
a=rid:3 send pt=101;max-width=640;max-height=360
a=rid:4 send pt=103;max-width=640;max-height=360
a=depend:100 lay bar:101
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=rtcp-fb:* ccm pause nowait
a=simulcast:send 1;2;~4,3
m=video 49602 RTP/AVPF 96 104
a=mid:zen
a=rtpmap:96 VP8/90000
a=fmtp:96 max-fs=3600; max-fr=30
a=rtpmap:104 rtx/90000
a=fmtp:104 apt=96;rtx-time=200
a=rid:1 send max-fs=921600;max-fps=30
a=rid:2 send max-fs=614400;max-fps=15
a=rid:3 send max-fs=230400;max-fps=30
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
a=rtcp-fb:* ccm pause nowait
a=simulcast:send 1;~3;~2
The example in this section looks at applying simulcast with audio and video redundancy formats. The audio media description uses codec and bitrate restrictions, combined with the RTP payload format for redundant audio data [RFC 2198] for enhanced packet-loss resilience. The video media description applies both resolution and bitrate restrictions, combined with Forward Error Correction (FEC) in the form of flexible FEC [RFC 8627] and RTP retransmission [RFC 4588].
The audio source is offered to be sent as two simulcast streams. The first simulcast stream is encoded with Opus, restricted to 64 kbps (rid-id=1), and the second simulcast stream (rid-id=2) is encoded with either G.711, or G.711 combined with linear predictive coding (LPC) for redundancy and explicit comfort noise (CN). Both simulcast streams include telephone-event capability. In this example, stand-alone LPC is not offered as a possible payload type for the second simulcast stream's RID, which could be motivated by, for example, not providing sufficient quality.
The video source is offered to be sent as two simulcast streams, both with two alternative simulcast formats. Redundancy and repair are offered in the form of both flexible FEC and RTP retransmission. The flexible FEC is not bound to any particular RTP streams and is therefore able to be used across all RTP streams that are being sent as part of this media description.
v=0
o=fred 238947129 823479223 IN IP6 2001:db8::c000:27d
s=Offer from Simulcast-Enabled Client using Redundancy
c=IN IP6 2001:db8::c000:27d
t=0 0
a=group:BUNDLE foo bar
m=audio 49200 RTP/AVP 97 98 99 100 101 102
a=mid:foo
a=rtpmap:97 G711/8000
a=rtpmap:98 LPC/8000
a=rtpmap:99 OPUS/48000/1
a=rtpmap:100 RED/8000/1
a=rtpmap:101 CN/8000
a=rtpmap:102 telephone-event/8000
a=fmtp:99 useinbandfec=1;usedtx=0
a=fmtp:100 97/98
a=fmtp:102 0-15
a=ptime:20
a=maxptime:40
a=rid:1 send pt=99,102;max-br=64000
a=rid:2 send pt=100,97,101,102
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=simulcast:send 1;2
m=video 49600 RTP/AVPF 103 104 105 106 107
a=mid:bar
a=rtpmap:103 H264/90000
a=rtpmap:104 VP8/90000
a=rtpmap:105 rtx/90000
a=rtpmap:106 rtx/90000
a=rtpmap:107 flexfec/90000
a=fmtp:103 profile-level-id=42c00d;max-fs=3600;max-mbps=108000
a=fmtp:104 max-fs=3600; max-fr=30
a=fmtp:105 apt=103;rtx-time=200
a=fmtp:106 apt=104;rtx-time=200
a=fmtp:107 repair-window=100000
a=rid:1 send pt=103;max-width=1280;max-height=720;max-fps=30
a=rid:2 send pt=104;max-width=1280;max-height=720;max-fps=30
a=rid:3 send pt=103;max-width=640;max-height=360;max-br=300000
a=rid:4 send pt=104;max-width=640;max-height=360;max-br=300000
a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id
a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id
a=rtcp-fb:* ccm pause nowait
a=simulcast:send 1,2;3,4