This section describes the specific procedures to be followed when creating and parsing SDP objects.
JSEP implementations MUST comply with the specifications listed below that govern the creation and processing of offers and answers.
All session descriptions handled by JSEP implementations, both local and remote, MUST indicate support for the following specifications. If any of these are absent, this omission MUST be treated as an error.
-
ICE, as specified in [RFC 8445], MUST be used. Note that the remote endpoint may use a lite implementation; implementations MUST properly handle remote endpoints that use ICE-lite. The remote endpoint may also use an older version of ICE; implementations MUST properly handle remote endpoints that use ICE as specified in [RFC 5245].
-
DTLS [RFC 6347] [RFC 9147] or DTLS-SRTP [RFC 5763] MUST be used, as appropriate for the media type, as specified in [RFC 8827]. Note: RFC 8827 requires implementations to support DTLS 1.2 [RFC 6347] and permits the use of DTLS 1.3 [RFC 9147].
The SDP security descriptions mechanism for Secure Real-time Transport Protocol (SRTP) keying [RFC 4568] MUST NOT be used, as discussed in [RFC 8827].
For media "m=" sections, JSEP implementations
MUST support the "UDP/TLS/RTP/SAVPF" profile specified in [
RFC 5764] as well as the "TCP/DTLS/RTP/SAVPF" profile specified in [
RFC 7850] and
MUST indicate one of these profiles for each media "m=" line they produce in an offer. For data "m=" sections, implementations
MUST support the "UDP/DTLS/SCTP" profile as well as the "TCP/DTLS/SCTP" profile and
MUST indicate one of these profiles for each data "m=" line they produce in an offer. The exact profile to use is determined by the protocol associated with the current default or selected ICE candidate, as described in
RFC 8839,
Section 4.2.1.2.
Unfortunately, in an attempt at compatibility, some endpoints generate other profile strings even when they mean to support one of these profiles. For instance, an endpoint might generate "RTP/AVP" but supply "a=fingerprint" and "a=rtcp-fb" attributes, indicating its willingness to support "UDP/TLS/RTP/SAVPF" or "TCP/DTLS/RTP/SAVPF". To simplify compatibility with such endpoints, JSEP implementations MUST apply the following rules when processing the media "m=" sections in a received offer:
-
Any profile in the offer matching one of the following MUST be accepted:
-
The profile in any "m=" line in any generated answer MUST exactly match the profile provided in the offer.
-
Because DTLS-SRTP is REQUIRED, the choice of SAVP or AVP has no effect; support for DTLS-SRTP is determined by the presence of one or more "a=fingerprint" attributes. Note that lack of an "a=fingerprint" attribute will lead to negotiation failure.
-
The use of AVPF or AVP simply controls the timing rules used for RTCP feedback. If AVPF is provided or an "a=rtcp-fb" attribute is present, assume AVPF timing, i.e., a default value of "trr-int=0". Otherwise, assume that AVPF is being used in an AVP-compatible mode and use a value of "trr-int=4000".
-
For data "m=" sections, implementations MUST support receiving the "UDP/DTLS/SCTP", "TCP/DTLS/SCTP", or "DTLS/SCTP" (for backwards compatibility) profiles.
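The RTCP timing and DTLS-SRTP rules above can be sketched in a few lines. This is a minimal illustration, not a full offer parser: the function names are hypothetical, and the attribute lines are assumed to be the raw "a=" lines of a single media "m=" section.

```python
from typing import List


def rtcp_trr_int_ms(proto: str, attributes: List[str]) -> int:
    """Return the assumed trr-int value (in ms) for a received media section."""
    # Both "AVPF" and "SAVPF" profile strings end in "AVPF".
    uses_avpf = proto.upper().endswith("AVPF")
    has_rtcp_fb = any(a.startswith("a=rtcp-fb") for a in attributes)
    # AVPF timing if either signal is present; AVP-compatible timing otherwise.
    return 0 if (uses_avpf or has_rtcp_fb) else 4000


def supports_dtls_srtp(attributes: List[str]) -> bool:
    """DTLS-SRTP support is signaled by at least one "a=fingerprint" line."""
    return any(a.startswith("a=fingerprint:") for a in attributes)
```

Note that the profile string itself plays no part in `supports_dtls_srtp`, which mirrors the statement that the choice of SAVP or AVP has no effect.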
Note that re-offers by JSEP implementations MUST use the correct profile strings even if the initial offer/answer exchange used an (incorrect) older profile string. This simplifies JSEP behavior, with minimal downside, as any remote endpoint that fails to handle such a re-offer will also fail to handle a JSEP endpoint's initial offer.
When createOffer is called, a new SDP description MUST be created that includes the functionality specified in [RFC 8834]. The exact details of this process are explained below.
When createOffer is called for the first time, the result is known as the initial offer.
The first step in generating an initial offer is to generate session-level attributes, as specified in
RFC 4566,
Section 5. Specifically:
-
The first SDP line MUST be "v=0" as defined in RFC 4566, Section 5.1.
-
The second SDP line MUST be an "o=" line as defined in RFC 4566, Section 5.2. The value of the <username> field SHOULD be "-". The <sess-id> MUST be representable by a 64-bit signed integer, and the value MUST be less than (2^63)-1. This is to ensure that the <sess-id> value, when expressed as a string, is always a non-negative integer, as some SDP parsers may fail to parse a negative <sess-id>. It is RECOMMENDED that the <sess-id> be constructed by generating a 64-bit quantity with the highest bit set to zero and the remaining 63 bits being cryptographically random. The value of the <nettype> <addrtype> <unicast-address> tuple SHOULD be set to a non-meaningful address, such as IN IP4 0.0.0.0, to prevent leaking a local IP address in this field; this problem is discussed in [RFC 8828]. As mentioned in [RFC 4566], the entire "o=" line needs to be unique, but selecting a random number for <sess-id> is sufficient to accomplish this.
-
The third SDP line MUST be an "s=" line as defined in RFC 4566, Section 5.3; to match the "o=" line, a single dash SHOULD be used as the session name, e.g., "s=-". Note that this differs from the advice in [RFC 4566], which proposes a single space, but as both "o=" and "s=" are meaningless in JSEP, having the same meaningless value seems clearer.
-
Session Information ("i="), URI ("u="), Email Address ("e="), Phone Number ("p="), Repeat Times ("r="), and Time Zones ("z=") lines are not useful in this context and SHOULD NOT be included.
-
Encryption Keys ("k=") lines do not provide sufficient security and MUST NOT be included.
-
A "t=" line MUST be added, as specified in RFC 4566, Section 5.9; both <start-time> and <stop-time> SHOULD be set to zero, e.g., "t=0 0".
-
An "a=ice-options" line with the "trickle" and "ice2" options MUST be added, as specified in RFC 8840, Section 4.1.1 and RFC 8445, Section 10.
-
If WebRTC identity is being used, an "a=identity" line MUST be added, as described in RFC 8827, Section 5.
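As an illustration of the session-level rules above, the following sketch builds the leading lines of an initial offer. The function names and the `identity` parameter are hypothetical, and the starting <session-version> is left to the caller because the list above does not fix its initial value.

```python
import secrets
from typing import List, Optional


def generate_sess_id() -> int:
    # randbelow(n) returns a value in [0, n), so the result is non-negative
    # and strictly less than 2**63 - 1, as the "o=" rule requires.
    return secrets.randbelow(2**63 - 1)


def session_level_lines(sess_id: int, session_version: int,
                        identity: Optional[str] = None) -> List[str]:
    lines = [
        "v=0",
        # "-" username and a non-meaningful address avoid leaking local data.
        f"o=- {sess_id} {session_version} IN IP4 0.0.0.0",
        "s=-",
        "t=0 0",
        "a=ice-options:trickle ice2",
    ]
    if identity is not None:
        # Only present when WebRTC identity is in use.
        lines.append(f"a=identity:{identity}")
    return lines
```

Using `secrets` rather than `random` keeps the <sess-id> cryptographically random, which is what makes the whole "o=" line unique in practice.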
The next step is to generate "m=" sections, as specified in
RFC 4566,
Section 5.14. An "m=" section is generated for each RtpTransceiver that has been added to the PeerConnection, excluding any stopped RtpTransceivers; this is done in the order the RtpTransceivers were added to the PeerConnection. If there are no such RtpTransceivers, no "m=" sections are generated; more can be added later, as discussed in
RFC 3264,
Section 5.
For each "m=" section generated for an RtpTransceiver, establish a mapping between the transceiver and the index of the generated "m=" section.
Each "m=" section, provided it is not marked as bundle-only,
MUST contain a unique set of ICE credentials and a unique set of ICE candidates. Bundle-only "m=" sections
MUST NOT contain any ICE credentials and
MUST NOT gather any candidates.
For DTLS, all "m=" sections MUST use any and all certificates that have been specified for the PeerConnection; as a result, they MUST all have the same fingerprint value or values [RFC 8122], or these values MUST be session-level attributes.
Each "m=" section
MUST be generated as specified in
RFC 4566,
Section 5.14. For the "m=" line itself, the following rules
MUST be followed:
-
If the "m=" section is marked as bundle-only, then the <port> value MUST be set to zero. Otherwise, the <port> value is set to the port of the default ICE candidate for this "m=" section, but given that no candidates are available yet, the default <port> value of 9 (Discard) MUST be used, as indicated in RFC 8840, Section 4.1.1.
-
To properly indicate use of DTLS, the <proto> field MUST be set to "UDP/TLS/RTP/SAVPF", as specified in RFC 5764, Section 8.
-
If codec preferences have been set for the associated transceiver, media formats MUST be generated in the corresponding order and MUST exclude any codecs not present in the codec preferences.
-
Unless excluded by the above restrictions, the media formats MUST include the mandatory audio/video codecs as specified in RFC 7874, Section 3 and RFC 7742, Section 5.
The "m=" line
MUST be followed immediately by a "c=" line, as specified in
RFC 4566,
Section 5.7. Again, as no candidates are available yet, the "c=" line
MUST contain the default value "IN IP4 0.0.0.0", as defined in
RFC 8840,
Section 4.1.1.
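The "m=" and "c=" line rules for an initial offer can be sketched as follows. The helper name is hypothetical, and the payload types are assumed to have already been ordered and filtered according to any codec preferences.

```python
from typing import List


def initial_media_lines(media: str, payload_types: List[int],
                        bundle_only: bool) -> List[str]:
    # Bundle-only sections use port zero; otherwise the Discard port (9)
    # stands in for a default candidate that has not been gathered yet.
    port = 0 if bundle_only else 9
    fmts = " ".join(str(pt) for pt in payload_types)
    return [
        f"m={media} {port} UDP/TLS/RTP/SAVPF {fmts}",
        "c=IN IP4 0.0.0.0",
    ]
```

For example, `initial_media_lines("audio", [96, 0], False)` yields an "m=audio 9 UDP/TLS/RTP/SAVPF 96 0" line followed by the default "c=" line.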
[RFC 8859] groups SDP attributes into different categories. To avoid unnecessary duplication when bundling, attributes of category IDENTICAL or TRANSPORT MUST NOT be repeated in bundled "m=" sections, repeating the guidance from RFC 9143, Section 7.1.3. This includes "m=" sections for which bundling has been negotiated and is still desired, as well as "m=" sections marked as bundle-only.
The following attributes, which are of a category other than IDENTICAL or TRANSPORT,
MUST be included in each "m=" section:
-
An "a=mid" line, as specified in RFC 5888, Section 4. All MID values MUST be generated in a fashion that does not leak user information, e.g., randomly or using a per-PeerConnection counter, and SHOULD be 3 bytes or less, to allow them to efficiently fit into the RTP header extension defined in RFC 9143, Section 15.2. Note that this does not set the RtpTransceiver mid property, as that only occurs when the description is applied. The generated MID value can be considered a "proposed" MID at this point.
-
A direction attribute that is the same as that of the associated transceiver.
-
For each media format on the "m=" line, "a=rtpmap" and "a=fmtp" lines, as specified in RFC 4566, Section 6 and RFC 3264, Section 5.1.
-
For each primary codec where RTP retransmission should be used, a corresponding "a=rtpmap" line indicating "rtx" with the clock rate of the primary codec and an "a=fmtp" line that references the payload type of the primary codec, as specified in RFC 4588, Section 8.1.
-
For each Forward Error Correction (FEC) mechanism supported by the application, "a=rtpmap" and "a=fmtp" lines, as specified in RFC 4566, Section 6. The FEC mechanisms that MUST be supported are specified in RFC 8854, Section 7, and specific usage for each media type is outlined in Sections 4 and 5 of [RFC 8854].
-
If this "m=" section is for media with configurable durations of media per packet, e.g., audio, an "a=maxptime" line, indicating the maximum amount of media, specified in milliseconds, that can be encapsulated in each packet, as specified in RFC 4566, Section 6. This value is set to the smallest of the maximum duration values across all the codecs included in the "m=" section.
-
If this "m=" section is for video media and there are known limitations on the size of images that can be decoded, an "a=imageattr" line, as specified in Section 3.6.
-
For each RTP header extension supported by the application, an "a=extmap" line, as specified in RFC 5285, Section 5. The list of header extensions that SHOULD/MUST be supported is specified in RFC 8834, Section 5.2. Any header extensions that require encryption MUST be specified as indicated in RFC 6904, Section 4.
-
For each RTCP feedback mechanism supported by the application, an "a=rtcp-fb" line, as specified in RFC 4585, Section 4.2. The list of RTCP feedback mechanisms that SHOULD/MUST be supported is specified in RFC 8834, Section 5.1.
-
If the RtpTransceiver has a "sendrecv" or "sendonly" direction:
-
For each MediaStream that was associated with the transceiver when it was created via addTrack or addTransceiver, an "a=msid" line, as specified in RFC 8830, Section 2, but omitting the "appdata" field.
-
If the RtpTransceiver has a "sendrecv" or "sendonly" direction, and the application has specified a rid-id for an encoding, or has specified more than one encoding in the RtpSender's parameters, an "a=rid" line for each encoding specified. The "a=rid" line is specified in [RFC 8851], and its direction MUST be "send". If the application has chosen a rid-id, it MUST be used; otherwise, a rid-id MUST be generated by the implementation. rid-ids MUST be generated in a fashion that does not leak user information, e.g., randomly or using a per-PeerConnection counter (see guidance at the end of RFC 8852, Section 3.3), and SHOULD be 3 bytes or less, to allow them to efficiently fit into the RTP header extensions defined in RFC 8852, Section 3.3. If no encodings have been specified, or only one encoding is specified but without a rid-id, then no "a=rid" lines are generated.
-
If the RtpTransceiver has a "sendrecv" or "sendonly" direction and more than one "a=rid" line has been generated, an "a=simulcast" line, with direction "send", as defined in RFC 8853, Section 5.1. The associated set of rid-ids MUST include all of the rid-ids used in the "a=rid" lines for this "m=" section.
-
If (1) the bundle policy for this PeerConnection is set to "must-bundle" and this is not the first "m=" section or (2) the bundle policy is set to "balanced" and this is not the first "m=" section for this media type, an "a=bundle-only" line.
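The last rule in the list above, which decides where "a=bundle-only" appears, can be sketched as below. The function name is hypothetical, the input is assumed to be the media types of the "m=" sections in offer order, and any policy other than the two named above is treated as producing no bundle-only sections.

```python
from typing import List


def bundle_only_flags(media_types: List[str], bundle_policy: str) -> List[bool]:
    flags = []
    for index, media_type in enumerate(media_types):
        if bundle_policy == "must-bundle":
            # Every section except the first is bundle-only.
            flags.append(index > 0)
        elif bundle_policy == "balanced":
            # Bundle-only if an earlier section has the same media type.
            flags.append(media_type in media_types[:index])
        else:
            flags.append(False)
    return flags
```

With `["audio", "video", "audio"]`, the "balanced" policy marks only the second audio section, whereas "must-bundle" marks everything after the first section.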
The following attributes, which are of category IDENTICAL or TRANSPORT,
MUST appear only in "m=" sections that either have a unique address or are associated with the BUNDLE-tag. (In initial offers, this means those "m=" sections that do not contain an "a=bundle-only" attribute.)
-
"a=ice-ufrag" and "a=ice-pwd" lines, as specified in RFC 8839, Section 5.4.
-
For each desired digest algorithm, one or more "a=fingerprint" lines for each of the endpoint's certificates, as specified in RFC 8122, Section 5.
-
An "a=setup" line, as specified in RFC 4145, Section 4 and clarified for use in DTLS-SRTP scenarios in RFC 5763, Section 5. The role value in the offer MUST be "actpass".
-
An "a=tls-id" line, as specified in RFC 8842, Section 5.2.
-
An "a=rtcp" line, as specified in RFC 3605, Section 2.1, containing the default value "9 IN IP4 0.0.0.0", because no candidates have yet been gathered.
-
An "a=rtcp-mux" line, as specified in RFC 5761, Section 5.1.3.
-
If the RTP/RTCP multiplexing policy is "require", an "a=rtcp-mux-only" line, as specified in RFC 8858, Section 4.
-
An "a=rtcp-rsize" line, as specified in RFC 5506, Section 5.
Lastly, if a data channel has been created, an "m=" section MUST be generated for data. The <media> field MUST be set to "application", and the <proto> field MUST be set to "UDP/DTLS/SCTP" [RFC 8841]. The <fmt> value MUST be set to "webrtc-datachannel" as specified in RFC 8841, Section 4.4.2.
Within the data "m=" section, an "a=mid" line
MUST be generated and included as described above, along with an "a=sctp-port" line referencing the SCTP port number, as defined in
RFC 8841,
Section 5.1; and, if appropriate, an "a=max-message-size" line, as defined in
RFC 8841,
Section 6.1.
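A data "m=" section built along these lines might look like the sketch below. The function name is hypothetical, and the default port of 9 is an assumption made by analogy with the media "m=" line rules (a bundle-only data section would pass 0 instead).

```python
from typing import List, Optional


def data_section_lines(mid: str, sctp_port: int,
                       max_message_size: Optional[int] = None,
                       port: int = 9) -> List[str]:
    lines = [
        f"m=application {port} UDP/DTLS/SCTP webrtc-datachannel",
        f"a=mid:{mid}",
        f"a=sctp-port:{sctp_port}",
    ]
    if max_message_size is not None:
        # Included only "if appropriate", so it is optional here.
        lines.append(f"a=max-message-size:{max_message_size}")
    return lines
```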
As discussed above, the following attributes of category IDENTICAL or TRANSPORT are included only if the data "m=" section either has a unique address or is associated with the BUNDLE-tag (e.g., if it is the only "m=" section):
-
"a=ice-ufrag"
-
"a=ice-pwd"
-
"a=fingerprint"
-
"a=setup"
-
"a=tls-id"
Once all "m=" sections have been generated, a session-level "a=group" attribute MUST be added as specified in [RFC 5888]. This attribute MUST have semantics "BUNDLE" and MUST include the MID identifiers of each "m=" section. The effect of this is that the JSEP implementation offers all "m=" sections as one bundle group. However, whether the "m=" sections are bundle-only or not depends on the bundle policy.
The next step is to generate session-level lip sync groups as defined in
RFC 5888,
Section 7. For each MediaStream referenced by more than one RtpTransceiver (by passing those MediaStreams as arguments to the addTrack and addTransceiver methods), a group of type "LS"
MUST be added that contains the MID values for each RtpTransceiver.
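The session-level grouping for an initial offer can be sketched as follows. The function name and the input shape, a list of (MID, MediaStream ids) pairs in "m=" section order, are assumptions made for illustration; a data "m=" section would appear with an empty stream list.

```python
from typing import Dict, List, Tuple


def offer_group_lines(sections: List[Tuple[str, List[str]]]) -> List[str]:
    lines = []
    if sections:
        # All "m=" sections are offered as a single bundle group.
        lines.append("a=group:BUNDLE " + " ".join(mid for mid, _ in sections))
    by_stream: Dict[str, List[str]] = {}
    for mid, stream_ids in sections:
        for stream_id in stream_ids:
            by_stream.setdefault(stream_id, []).append(mid)
    for mids in by_stream.values():
        # A lip sync group needs more than one transceiver per MediaStream.
        if len(mids) > 1:
            lines.append("a=group:LS " + " ".join(mids))
    return lines
```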
Attributes that SDP permits to be at either the session level or the media level
SHOULD generally be at the media level even if they are identical. This assists development and debugging by making it easier to understand individual media sections, especially if one of a set of initially identical attributes is subsequently changed. However, implementations
MAY choose to aggregate attributes at the session level, and JSEP implementations
MUST be prepared to receive attributes in either location.
Attributes other than the ones specified above MAY be included, except for the following attributes, which are specifically incompatible with the requirements of [RFC 8834] and MUST NOT be included:
-
"a=crypto"
-
"a=key-mgmt"
-
"a=ice-lite"
Note that when bundle is used, any additional attributes that are added
MUST follow the advice in [
RFC 8859] on how those attributes interact with bundle.
Note that these requirements are in some cases stricter than those of SDP. Implementations
MUST be prepared to accept compliant SDP even if it would not conform to the requirements for generating SDP in this specification.
When createOffer is called a second (or later) time or is called after a local description has already been installed, the processing is somewhat different than for an initial offer.
If the previous offer was not applied using setLocalDescription, meaning the PeerConnection is still in the "stable" state, the steps for generating an initial offer
MUST be followed, subject to the following restriction:
-
The fields of the "o=" line MUST stay the same except for the <session-version> field, which MUST increment by one on each call to createOffer if the offer might differ from the output of the previous call to createOffer; implementations MAY opt to increment <session-version> on every call. The value of the generated <session-version> is independent of the <session-version> of the current local description; in particular, in the case where the current version is N, an offer is created and applied with version N+1, and then that offer is rolled back so that the current version is again N, the next generated offer will still have version N+2.
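The rollback behavior described above follows naturally if <session-version> is driven by a counter that tracks the last generated offer rather than the current local description. A minimal sketch, with a hypothetical class name, that takes the permitted option of incrementing on every call:

```python
class SessionVersionCounter:
    """Tracks the <session-version> of the most recently generated offer."""

    def __init__(self, initial_version: int) -> None:
        self._last_generated = initial_version

    def next_version(self) -> int:
        # Incremented on every createOffer call; never decremented on rollback.
        self._last_generated += 1
        return self._last_generated


counter = SessionVersionCounter(initial_version=5)  # current version N = 5
applied = counter.next_version()       # offer applied with version N+1
# ... the offer is rolled back; the counter is deliberately left untouched ...
after_rollback = counter.next_version()  # next offer has version N+2
```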
Note that if the application creates an offer by reading currentLocalDescription instead of calling createOffer, the returned SDP may be different than when setLocalDescription was originally called, due to the addition of gathered ICE candidates, but the <session-version> will not have changed. There are no known scenarios in which this causes problems, but if this is a concern, the solution is simply to use createOffer to ensure a unique <session-version>.
If the previous offer was applied using setLocalDescription, but a corresponding answer from the remote side has not yet been applied, meaning the PeerConnection is still in the "have-local-offer" state, an offer is generated by following the steps in the "stable" state above, along with these exceptions:
-
The "s=" and "t=" lines MUST stay the same.
-
If any RtpTransceiver has been added and there exists an "m=" section with a zero port in the current local description or the current remote description, that "m=" section MUST be recycled by generating an "m=" section for the added RtpTransceiver as if the "m=" section were being added to the session description (including a new MID value) and placing it at the same index as the "m=" section with a zero port.
-
If an RtpTransceiver is stopped and is not associated with an "m=" section, an "m=" section MUST NOT be generated for it. This prevents adding back RtpTransceivers whose "m=" sections were recycled and used for a new RtpTransceiver in a previous offer/answer exchange, as described above.
-
If an RtpTransceiver has been stopped and is associated with an "m=" section, and the "m=" section is not being recycled as described above, an "m=" section MUST be generated for it with the port set to zero and all "a=msid" lines removed.
-
For RtpTransceivers that are not stopped, the "a=msid" line or lines MUST stay the same if they are present in the current description, regardless of changes to the transceiver's direction or track. If no "a=msid" line is present in the current description, "a=msid" line(s) MUST be generated according to the same rules as for an initial offer.
-
Each "m=" and "c=" line MUST be filled in with the port, relevant RTP profile, and address of the default candidate for the "m=" section, as described in RFC 8839, Section 4.2.1.2 and clarified in Section 5.1.2. If no RTP candidates have yet been gathered, default values MUST still be used, as described above.
-
Each "a=mid" line MUST stay the same.
-
Each "a=ice-ufrag" and "a=ice-pwd" line MUST stay the same, unless the ICE configuration has changed (e.g., changes to either the supported STUN/TURN servers or the ICE candidate policy) or the IceRestart option (Section 5.2.3.1) was specified. If the "m=" section is bundled into another "m=" section, it still MUST NOT contain any ICE credentials.
-
If the "m=" section is not bundled into another "m=" section, its "a=rtcp" attribute line MUST be filled in with the port and address of the default RTCP candidate, as indicated in RFC 5761, Section 5.1.3. If no RTCP candidates have yet been gathered, default values MUST be used, as described in Section 5.2.1 above.
-
If the "m=" section is not bundled into another "m=" section, for each candidate that has been gathered during the most recent gathering phase (see Section 3.5.1), an "a=candidate" line MUST be added, as defined in RFC 8839, Section 5.1. If candidate gathering for the section has completed, an "a=end-of-candidates" attribute MUST be added, as described in RFC 8840, Section 8.2. If the "m=" section is bundled into another "m=" section, both "a=candidate" and "a=end-of-candidates" MUST be omitted.
-
For RtpTransceivers that are still present, the "a=rid" lines MUST stay the same.
-
For RtpTransceivers that are still present, any "a=simulcast" line MUST stay the same.
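The recycling rule in the list above amounts to choosing an "m=" section index for each newly added RtpTransceiver. The sketch below is a simplification under two assumptions: the function name is hypothetical, and every zero port in the input is taken to belong to a recyclable section in the current description.

```python
from typing import List


def indices_for_new_transceivers(current_ports: List[int],
                                 new_count: int) -> List[int]:
    # Zero-port sections are reused first, in order of appearance.
    recyclable = [i for i, port in enumerate(current_ports) if port == 0]
    indices = recyclable[:new_count]
    # Any remaining transceivers get fresh sections appended at the end.
    next_index = len(current_ports)
    while len(indices) < new_count:
        indices.append(next_index)
        next_index += 1
    return indices
```

Each recycled index still receives a freshly generated "m=" section, including a new MID value, as if the section were being added to the session description.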
If the previous offer was applied using setLocalDescription, and a corresponding answer from the remote side has been applied using setRemoteDescription, meaning the PeerConnection is in the "have-remote-pranswer" state or the "stable" state, an offer is generated based on the negotiated session descriptions by following the steps mentioned for the "have-local-offer" state above.
In addition, for each existing, non-recycled, non-rejected "m=" section in the new offer, the following adjustments are made based on the contents of the corresponding "m=" section in the current local or remote description, as appropriate:
-
The "m=" line and corresponding "a=rtpmap" and "a=fmtp" lines MUST only include media formats that have not been excluded by the codec preferences of the associated transceiver and also MUST include all currently available formats. Media formats that were previously offered but are no longer available (e.g., a shared hardware codec) MAY be excluded.
-
Unless codec preferences have been set for the associated transceiver, the media formats on the "m=" line MUST be generated in the same order as in the most recent answer. Any media formats that were not present in the most recent answer MUST be added after all existing formats.
-
The RTP header extensions MUST only include those that are supported by the application on the associated transceiver.
-
The RTCP feedback mechanisms MUST only include those that are supported by the application on the associated transceiver.
-
The "a=rtcp" line MUST NOT be added if the most recent answer included an "a=rtcp-mux" line.
-
The "a=rtcp-mux" line MUST be the same as that in the most recent answer.
-
The "a=rtcp-mux-only" line MUST NOT be added.
-
The "a=rtcp-rsize" line MUST NOT be added unless present in the most recent answer.
-
An "a=bundle-only" line, as defined in RFC 9143, Section 6, MUST NOT be added. Instead, JSEP implementations MUST simply omit parameters in the IDENTICAL and TRANSPORT categories for bundled "m=" sections, as described in RFC 9143, Section 7.1.3.
-
Note that if media "m=" sections are bundled into a data "m=" section, then certain TRANSPORT and IDENTICAL attributes may appear in the data "m=" section even if they would otherwise only be appropriate for a media "m=" section (e.g., "a=rtcp-mux"). This cannot happen in initial offers because in the initial offer JSEP implementations always list media "m=" sections (if any) before the data "m=" section (if any), and at least one of those media "m=" sections will not have the "a=bundle-only" attribute. Therefore, in initial offers, any "a=bundle-only" "m=" sections will be bundled into a preceding non-bundle-only media "m=" section.
The "a=group:BUNDLE" attribute
MUST include the MID identifiers specified in the bundle group in the most recent answer, minus any "m=" sections that have been marked as rejected, plus any newly added or re-enabled "m=" sections. In other words, the bundle attribute
MUST contain all "m=" sections that were previously bundled, as long as they are still alive, as well as any new "m=" sections.
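The bundle group rule for subsequent offers reduces to simple list arithmetic. A sketch, with a hypothetical function name and MIDs represented as plain strings:

```python
from typing import List, Set


def subsequent_bundle_mids(answer_mids: List[str], rejected_mids: Set[str],
                           added_mids: List[str]) -> List[str]:
    # Keep previously bundled sections that are still alive...
    kept = [mid for mid in answer_mids if mid not in rejected_mids]
    # ...then append newly added or re-enabled sections, without duplicates.
    return kept + [mid for mid in added_mids if mid not in kept]
```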
Note that if bundling has been negotiated, unbundling is no longer possible, and media sections will not be marked as bundle-only. Although this is by design, it could cause issues in the rare case of sending a subsequent offer as an initial offer to a non-bundle-aware endpoint via Third Party Call Control (3PCC), as discussed in
RFC 9143,
Section 7.6.
"a=group:LS" attributes are generated in the same way as for initial offers, with the additional stipulation that any lip sync groups that were present in the most recent answer
MUST continue to exist and
MUST contain any previously existing MID identifiers, as long as the identified "m=" sections still exist and are not rejected, and the group still contains at least two MID identifiers. This ensures that any synchronized "recvonly" "m=" sections continue to be synchronized in the new offer.
The createOffer method takes as a parameter an RTCOfferOptions object. Special processing is performed when generating an SDP description if the following options are present.
If the IceRestart option is specified, with a value of "true", the offer
MUST indicate an ICE restart by generating new ICE ufrag and pwd attributes, as specified in
RFC 8839,
Section 4.4.1.1.1. If this option is specified on an initial offer, it has no effect (since a new ICE ufrag and pwd are already generated). Similarly, if the ICE configuration has changed, this option has no effect, since new ufrag and pwd attributes will be generated automatically. This option is primarily useful for reestablishing connectivity in cases where failures are detected by the application.
Silence suppression, also known as discontinuous transmission ("DTX"), can reduce the bandwidth used for audio by switching to a special encoding when voice activity is not detected, at the cost of some fidelity.
If the VoiceActivityDetection option is specified, with a value of "true", the offer
MUST indicate support for silence suppression in the audio it receives by including comfort noise ("CN") codecs for each offered audio codec, as specified in
RFC 3389,
Section 5.1, except for codecs that have their own internal silence suppression support. For codecs that have their own internal silence suppression support, the appropriate fmtp parameters for each such codec
MUST be specified to indicate that silence suppression for received audio is desired. For example, when using the Opus codec [
RFC 6716], the "usedtx=1" parameter, specified in [
RFC 7587], would be used in the offer.
If the VoiceActivityDetection option is specified, with a value of "false", the JSEP implementation
MUST NOT emit "CN" codecs. For codecs that have their own internal silence suppression support, the appropriate fmtp parameters for each such codec
MUST be specified to indicate that silence suppression for received audio is not desired. For example, when using the Opus codec, the "usedtx=0" parameter would be specified in the offer. In addition, the implementation
MUST NOT use silence suppression for media it generates, regardless of whether the "CN" codecs or related fmtp parameters appear in the peer's description. The impact of these rules is that silence suppression in JSEP depends on mutual agreement of both sides, which ensures consistent handling regardless of which codec is used.
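The per-codec effect of the VoiceActivityDetection option can be sketched as a small decision function. The function name and return shape are hypothetical, and only Opus is mapped to a concrete fmtp parameter because it is the only example given above; other codecs with internal silence suppression would need their own parameters.

```python
from typing import Optional, Tuple


def silence_suppression_for_codec(codec: str, has_internal_dtx: bool,
                                  vad: bool) -> Tuple[bool, Optional[str]]:
    """Return (offer_cn_codec, fmtp_parameter) for one offered audio codec."""
    if has_internal_dtx:
        fmtp = None
        if codec.lower() == "opus":
            fmtp = "usedtx=1" if vad else "usedtx=0"
        # Codecs with internal support never get a paired "CN" codec.
        return (False, fmtp)
    # Other codecs get "CN" only when the option is "true".
    return (vad, None)
```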
The VoiceActivityDetection option does not have any impact on the setting of the "vad" value in the signaling of the client-to-mixer audio level header extension described in
RFC 6464,
Section 4.
When createAnswer is called, a new SDP description
MUST be created that is compatible with the supplied remote description as well as the requirements specified in [
RFC 8834]. The exact details of this process are explained below.
When createAnswer is called for the first time after a remote description has been provided, the result is known as the initial answer. If no remote description has been installed, an answer cannot be generated, and an error
MUST be returned.
Note that the remote description SDP may not have been created by a JSEP endpoint and may not conform to all the requirements listed in
Section 5.2. For many cases, this is not a problem. However, if any mandatory SDP attributes are missing or functionality listed as mandatory-to-use above is not present, this
MUST be treated as an error and
MUST cause the affected "m=" sections to be marked as rejected.
The first step in generating an initial answer is to generate session-level attributes. The process here is identical to that indicated in
Section 5.2.1 above, except that the "a=ice-options" line, with the "trickle" option as specified in
RFC 8840,
Section 4.1.3 and the "ice2" option as specified in
RFC 8445,
Section 10, is only included if such an option was present in the offer.
The next step is to generate session-level lip sync groups, as defined in
RFC 5888,
Section 7. For each group of type "LS" present in the offer, select the local RtpTransceivers that are referenced by the MID values in the specified group, and determine which of them either reference a common local MediaStream (specified in the calls to addTrack/addTransceiver used to create them) or have no MediaStream to reference because they were not created by addTrack/addTransceiver. If at least two such RtpTransceivers exist, a group of type "LS" with the MID values of these RtpTransceivers
MUST be added. Otherwise, the offered "LS" group
MUST be ignored and no corresponding group generated in the answer.
As a simple example, consider the following offer of a single audio and single video track contained in the same MediaStream. SDP lines not relevant to this example have been removed for clarity. As explained in
Section 5.2, a group of type "LS" has been added that references each track's RtpTransceiver.
a=group:LS a1 v1
m=audio 10000 UDP/TLS/RTP/SAVPF 0
a=mid:a1
a=msid:ms1
m=video 10001 UDP/TLS/RTP/SAVPF 96
a=mid:v1
a=msid:ms1
If the answerer uses a single MediaStream when it adds its tracks, both of its transceivers will reference this stream, and so the subsequent answer will contain an "LS" group identical to that in the offer, as shown below:
a=group:LS a1 v1
m=audio 20000 UDP/TLS/RTP/SAVPF 0
a=mid:a1
a=msid:ms2
m=video 20001 UDP/TLS/RTP/SAVPF 96
a=mid:v1
a=msid:ms2
However, if the answerer groups its tracks into separate MediaStreams, its transceivers will reference different streams, and so the subsequent answer will not contain an "LS" group.
m=audio 20000 UDP/TLS/RTP/SAVPF 0
a=mid:a1
a=msid:ms2a
m=video 20001 UDP/TLS/RTP/SAVPF 96
a=mid:v1
a=msid:ms2b
Finally, if the answerer does not add any tracks, its transceivers will not reference any MediaStreams, causing the preferences of the offerer to be maintained, and so the subsequent answer will contain an identical "LS" group.
a=group:LS a1 v1
m=audio 20000 UDP/TLS/RTP/SAVPF 0
a=mid:a1
a=recvonly
m=video 20001 UDP/TLS/RTP/SAVPF 96
a=mid:v1
a=recvonly
The example in
Section 7.2 shows a more involved case of "LS" group generation.
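The three simple examples above can be reproduced with the sketch below. It is one possible reading of the answer-side rule, not a definitive algorithm: the function name is hypothetical, every offered MID is assumed to have a local RtpTransceiver, and when several local MediaStreams compete, the one shared by the most transceivers is chosen.

```python
from typing import Dict, List, Optional


def answer_ls_group(offered_mids: List[str],
                    local_streams: Dict[str, List[str]]) -> Optional[str]:
    # Transceivers with no local MediaStream keep the offerer's preference.
    no_stream = [m for m in offered_mids if not local_streams.get(m)]
    by_stream: Dict[str, List[str]] = {}
    for mid in offered_mids:
        for stream_id in local_streams.get(mid, []):
            by_stream.setdefault(stream_id, []).append(mid)
    # Assumption: pick the local stream referenced by the most transceivers.
    common = max(by_stream.values(), key=len, default=[])
    members = [m for m in offered_mids if m in common or m in no_stream]
    if len(members) >= 2:
        return "a=group:LS " + " ".join(members)
    return None  # the offered "LS" group is ignored
```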
The next step is to generate an "m=" section for each "m=" section that is present in the remote offer, as specified in
RFC 3264,
Section 6. For the purposes of this discussion, any session-level attributes in the offer that are also valid as media-level attributes are considered to be present in each "m=" section. Each offered "m=" section will have an associated RtpTransceiver, as described in
Section 5.10. If there are more RtpTransceivers than there are "m=" sections, the unmatched RtpTransceivers will need to be associated in a subsequent offer.
For each offered "m=" section, if any of the following conditions are true, the corresponding "m=" section in the answer
MUST be marked as rejected by setting the <port> in the "m=" line to zero, as indicated in
RFC 3264,
Section 6, and further processing for this "m=" section can be skipped:
-
The associated RtpTransceiver has been stopped.
-
There is no offered media format that is both supported and, if applicable, allowed by codec preferences.
-
The bundle policy is "must-bundle", and this is not the first "m=" section or in the same bundle group as the first "m=" section.
-
The bundle policy is "balanced", and this is not the first "m=" section for this media type or in the same bundle group as the first "m=" section for this media type.
-
This "m=" section is in a bundle group, and the group's offerer tagged "m=" section is being rejected due to one of the above reasons. This requires all "m=" sections in the bundle group to be rejected, as specified in RFC 9143, Section 7.3.3.
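The rejection conditions listed above can be sketched as a two-pass check. This is an illustration rather than a definitive implementation: `OfferedSection`, `rejection_reason`, and `rejected_sections` are hypothetical names, and the caller is assumed to supply a mapping from each bundle group id to the index of that group's offerer tagged "m=" section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OfferedSection:
    """Hypothetical summary of one offered "m=" section (illustration only)."""
    index: int                   # position in the offer, starting at 0
    media: str                   # e.g., "audio" or "video"
    stopped: bool                # is the associated RtpTransceiver stopped?
    has_usable_format: bool      # was a supported, allowed format offered?
    bundle_group: Optional[int]  # id of its bundle group, or None


def rejection_reason(section, sections, bundle_policy):
    """Return a reason string if this section alone must be rejected."""
    if section.stopped:
        return "transceiver stopped"
    if not section.has_usable_format:
        return "no usable media format"
    if bundle_policy == "must-bundle":
        anchor = sections[0]
    elif bundle_policy == "balanced":
        anchor = next(s for s in sections if s.media == section.media)
    else:
        return None
    same_group = (section.bundle_group is not None
                  and section.bundle_group == anchor.bundle_group)
    if section is not anchor and not same_group:
        return "not bundled with the first section"
    return None


def rejected_sections(sections, bundle_policy, tagged_index_by_group):
    """Return the indices of all sections to reject (port set to zero)."""
    rejected = {s.index for s in sections
                if rejection_reason(s, sections, bundle_policy) is not None}
    # Second pass: a rejected tagged section takes its whole group with it.
    for s in sections:
        if tagged_index_by_group.get(s.bundle_group) in rejected:
            rejected.add(s.index)
    return rejected
```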
Otherwise, each "m=" section in the answer
MUST then be generated as specified in
RFC 3264,
Section 6.1. For the "m=" line itself, the following rules
MUST be followed:
-
The <port> value would normally be set to the port of the default ICE candidate for this "m=" section, but given that no candidates are available yet, the default <port> value of 9 (Discard) MUST be used, as indicated in RFC 8840, Section 4.1.1.
-
The <proto> field MUST be set to exactly match the <proto> field for the corresponding "m=" line in the offer.
-
If codec preferences have been set for the associated transceiver, media formats MUST be generated in the corresponding order, regardless of what was offered, and MUST exclude any codecs not present in the codec preferences.
-
Otherwise, the media formats on the "m=" line MUST be generated in the same order as those offered in the current remote description, excluding any currently unsupported formats. Any currently available media formats that are not present in the current remote description MUST be added after all existing formats.
-
In either case, the media formats in the answer MUST include at least one format that is present in the offer but MAY include formats that are locally supported but not present in the offer, as mentioned in RFC 3264, Section 6.1. If no common format exists, the "m=" section is rejected as described above.
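The ordering rules for media formats can be illustrated with a small sketch. It uses codec names in place of payload types for readability; the `answer_formats` helper and its parameters are hypothetical, and filtering the codec preferences against the locally supported list is an assumption made to keep the example self-contained.

```python
def answer_formats(offered, supported, preferences=None):
    """Order the media formats for the answer's "m=" line.

    Returns None when no offered format can be used, in which case the
    "m=" section would be rejected instead.
    """
    if preferences is not None:
        # Codec preferences win: keep their order, regardless of the offer.
        formats = [c for c in preferences if c in supported]
    else:
        # Follow the offer's order, skipping unsupported formats ...
        formats = [c for c in offered if c in supported]
        # ... then append locally available formats that were not offered.
        formats += [c for c in supported if c not in offered]
    # At least one format must also be present in the offer.
    if not any(c in offered for c in formats):
        return None
    return formats
```

For an offer of opus, PCMU, and G722 and local support for PCMU, opus, and telephone-event, the sketch keeps the offerer's order for the common formats and appends telephone-event at the end.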
The "m=" line
MUST be followed immediately by a "c=" line, as specified in
RFC 4566,
Section 5.7. Again, as no candidates are available yet, the "c=" line
MUST contain the default value "IN IP4 0.0.0.0", as defined in
RFC 8840,
Section 4.1.3.
If the offer supports bundle, all "m=" sections to be bundled
MUST use the same ICE credentials and candidates; all "m=" sections not being bundled
MUST use unique ICE credentials and candidates. Each "m=" section
MUST contain the following attributes (which are of attribute types other than IDENTICAL or TRANSPORT):
-
If and only if present in the offer, an "a=mid" line, as specified in RFC 5888, Section 9.1. The MID value MUST match that specified in the offer.
-
A direction attribute, determined by applying the rules regarding the offered direction specified in RFC 3264, Section 6.1, and then intersecting with the direction of the associated RtpTransceiver. For example, in the case where an "m=" section is offered as "sendonly" and the local transceiver is set to "sendrecv", the result in the answer is a "recvonly" direction.
-
For each media format on the "m=" line, "a=rtpmap" and "a=fmtp" lines, as specified in RFC 4566, Section 6 and RFC 3264, Section 6.1.
-
If "rtx" is present in the offer, for each primary codec where RTP retransmission should be used, a corresponding "a=rtpmap" line indicating "rtx" with the clock rate of the primary codec and an "a=fmtp" line that references the payload type of the primary codec, as specified in RFC 4588, Section 8.1.
-
For each FEC mechanism supported by the application, "a=rtpmap" and "a=fmtp" lines, as specified in RFC 4566, Section 6. The FEC mechanisms that MUST be supported are specified in RFC 8854, Section 7, and specific usage for each media type is outlined in Sections 4 and 5 of [RFC 8854].
-
If this "m=" section is for media with configurable durations of media per packet, e.g., audio, an "a=maxptime" line, as described in Section 5.2.
-
If this "m=" section is for video media and there are known limitations on the size of images that can be decoded, an "a=imageattr" line, as specified in Section 3.6.
-
For each RTP header extension supported by the application and present in the offer, an "a=extmap" line, as specified in RFC 5285, Section 5. The list of header extensions that SHOULD/MUST be supported is specified in RFC 8834, Section 5.2. Any header extensions that require encryption MUST be specified as indicated in RFC 6904, Section 4.
-
For each RTCP feedback mechanism supported by the application and present in the offer, an "a=rtcp-fb" line, as specified in RFC 4585, Section 4.2. The list of RTCP feedback mechanisms that SHOULD/MUST be supported is specified in RFC 8834, Section 5.1.
-
If the RtpTransceiver has a "sendrecv" or "sendonly" direction:
-
For each MediaStream that was associated with the transceiver when it was created via addTrack or addTransceiver, an "a=msid" line, as specified in RFC 8830, Section 2, but omitting the "appdata" field.
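The direction rule in the list above (mirror the offered direction, then intersect it with the direction of the local RtpTransceiver) can be sketched as follows. The helper name `answer_direction` is hypothetical, and the sketch models each direction as a set of capabilities purely for illustration.

```python
_CAPS = {
    "sendrecv": frozenset({"send", "recv"}),
    "sendonly": frozenset({"send"}),
    "recvonly": frozenset({"recv"}),
    "inactive": frozenset(),
}
_NAMES = {caps: name for name, caps in _CAPS.items()}
_MIRROR = {"send": "recv", "recv": "send"}


def answer_direction(offered, local):
    """Return the direction attribute to place in the answer."""
    # What the offerer sends, the answerer may receive, and vice versa.
    mirrored = frozenset(_MIRROR[c] for c in _CAPS[offered])
    # Only keep what the local transceiver is also willing to do.
    return _NAMES[mirrored & _CAPS[local]]
```

Calling `answer_direction("sendonly", "sendrecv")` yields "recvonly", which is the case described in the text.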
Each "m=" section that is not bundled into another "m=" section
MUST contain the following attributes (which are of category IDENTICAL or TRANSPORT):
-
"a=ice-ufrag" and "a=ice-pwd" lines, as specified in RFC 8839, Section 5.4.
-
For each desired digest algorithm, one or more "a=fingerprint" lines for each of the endpoint's certificates, as specified in RFC 8122, Section 5.
-
An "a=setup" line, as specified in RFC 4145, Section 4 and clarified for use in DTLS-SRTP scenarios in RFC 5763, Section 5. The role value in the answer MUST be "active" or "passive". When the offer contains the "actpass" value, as will always be the case with JSEP endpoints, the answerer SHOULD use the "active" role. Offers from non-JSEP endpoints MAY send other values for "a=setup", in which case the answer MUST use a value consistent with the value in the offer.
-
An "a=tls-id" line, as specified in RFC 8842, Section 5.3.
-
If present in the offer, an "a=rtcp-mux" line, as specified in RFC 5761, Section 5.1.3. Otherwise, an "a=rtcp" line, as specified in RFC 3605, Section 2.1, containing the default value "9 IN IP4 0.0.0.0" (because no candidates have yet been gathered).
-
If present in the offer, an "a=rtcp-rsize" line, as specified in RFC 5506, Section 5.
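Two of the transport-level choices above lend themselves to a brief sketch: selecting the "a=setup" role and choosing between "a=rtcp-mux" and the default "a=rtcp" line. The helper names are hypothetical. The sketch picks "active" for an "actpass" offer (the recommended choice, although "passive" is also permitted) and treats any offered value other than "actpass", "active", or "passive" as an error, which is an assumption of this example.

```python
_ANSWER_ROLE = {
    "actpass": "active",   # recommended role for the answerer
    "active": "passive",   # offerer initiates, so the answerer listens
    "passive": "active",   # offerer listens, so the answerer initiates
}


def answer_setup(offered_setup):
    """Return the "a=setup" role value to use in the answer."""
    try:
        return _ANSWER_ROLE[offered_setup]
    except KeyError:
        raise ValueError(f"cannot answer a=setup:{offered_setup}") from None


def answer_rtcp_line(offer_has_rtcp_mux):
    """Return the RTCP-related attribute line for an initial answer."""
    if offer_has_rtcp_mux:
        return "a=rtcp-mux"
    # No candidates have been gathered yet, so the default value is used.
    return "a=rtcp:9 IN IP4 0.0.0.0"
```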
If a data channel "m=" section has been offered, an "m=" section
MUST also be generated for data. The <media> field
MUST be set to "application", and the <proto> and <fmt> fields
MUST be set to exactly match the fields in the offer.
Within the data "m=" section, an "a=mid" line
MUST be generated and included as described above, along with an "a=sctp-port" line referencing the SCTP port number, as defined in
RFC 8841,
Section 5.1; and, if appropriate, an "a=max-message-size" line, as defined in
RFC 8841,
Section 6.1.
As discussed above, the following attributes of category IDENTICAL or TRANSPORT are included only if the data "m=" section is not bundled into another "m=" section:
-
"a=ice-ufrag"
-
"a=ice-pwd"
-
"a=fingerprint"
-
"a=setup"
-
"a=tls-id"
Note that if media "m=" sections are bundled into a data "m=" section, then certain TRANSPORT and IDENTICAL attributes may also appear in the data "m=" section even if they would otherwise only be appropriate for a media "m=" section (e.g., "a=rtcp-mux").
If "a=group" attributes with semantics "BUNDLE" are offered, corresponding session-level "a=group" attributes
MUST be added as specified in [RFC 5888]. These attributes
MUST have semantics "BUNDLE" and
MUST include all MID identifiers from the offered bundle groups that have not been rejected. Note that regardless of the presence of "a=bundle-only" in the offer, all "m=" sections in the answer
MUST NOT have an "a=bundle-only" line.
Attributes that are common between all "m=" sections
MAY be moved to the session level if explicitly defined to be valid at the session level.
The attributes prohibited in the creation of offers are also prohibited in the creation of answers.
When createAnswer is called a second (or later) time or is called after a local description has already been installed, the processing is somewhat different than for an initial answer.
If the previous answer was not applied using setLocalDescription, meaning the PeerConnection is still in the "have-remote-offer" state, the steps for generating an initial answer
MUST be followed, subject to the following restriction:
-
The fields of the "o=" line MUST stay the same except for the <session-version> field, which MUST increment if the session description changes in any way from the previously generated answer.
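As a minimal sketch of this restriction, the following hypothetical helper rewrites only the <session-version> field of a previously generated "o=" line. How the caller decides that the description has changed is left open here and is passed in as a flag.

```python
def next_origin(previous_o_line, description_changed):
    """Return the "o=" line for a regenerated answer."""
    if not previous_o_line.startswith("o="):
        raise ValueError("not an o= line")
    # o=<username> <sess-id> <sess-version> <nettype> <addrtype> <address>
    fields = previous_o_line[2:].split(" ")
    if len(fields) != 6:
        raise ValueError("o= line must have six fields")
    if description_changed:
        # Only the session version moves; every other field is preserved.
        fields[2] = str(int(fields[2]) + 1)
    return "o=" + " ".join(fields)
```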
If any session description was previously supplied to setLocalDescription, an answer is generated by following the steps in the "have-remote-offer" state above, along with these exceptions:
-
The "s=" and "t=" lines MUST stay the same.
-
Each "m=" and "c=" line MUST be filled in with the port and address of the default candidate for the "m=" section, as described in RFC 8839, Section 4.2.1.2. Note that in certain cases, the "m=" line protocol may not match that of the default candidate, because the "m=" line protocol value MUST match what was supplied in the offer, as described above.
-
Each "a=ice-ufrag" and "a=ice-pwd" line MUST stay the same, unless the "m=" section is restarting, in which case new ICE credentials MUST be created as specified in RFC 8839, Section 4.4.1.1.1. If the "m=" section is bundled into another "m=" section, it still MUST NOT contain any ICE credentials.
-
Each "a=tls-id" line MUST stay the same, unless the offerer's "a=tls-id" line changed, in which case a new tls-id value MUST be created, as described in RFC 8842, Section 5.2.
-
Each "a=setup" line MUST use an "active" or "passive" role value consistent with the existing DTLS association, if the association is being continued by the offerer.
-
RTCP multiplexing MUST be used, and an "a=rtcp-mux" line inserted if and only if the "m=" section previously used RTCP multiplexing.
-
If the "m=" section is not bundled into another "m=" section and RTCP multiplexing is not active, an "a=rtcp" attribute line MUST be filled in with the port and address of the default RTCP candidate. If no RTCP candidates have yet been gathered, default values MUST be used, as described in Section 5.3.1 above.
-
If the "m=" section is not bundled into another "m=" section, for each candidate that has been gathered during the most recent gathering phase (see Section 3.5.1), an "a=candidate" line MUST be added, as defined in RFC 8839, Section 5.1. If candidate gathering for the section has completed, an "a=end-of-candidates" attribute MUST be added, as described in RFC 8840, Section 8.2. If the "m=" section is bundled into another "m=" section, both "a=candidate" and "a=end-of-candidates" MUST be omitted.
-
For RtpTransceivers that are not stopped, the "a=msid" line(s) MUST stay the same, regardless of changes to the transceiver's direction or track. If no "a=msid" line is present in the current description, "a=msid" line(s) MUST be generated according to the same rules as for an initial answer.
The createAnswer method takes as a parameter an RTCAnswerOptions object. The set of parameters for RTCAnswerOptions differs from that supported in RTCOfferOptions; the IceRestart option is unnecessary, as ICE credentials will automatically be changed for all "m=" sections where the offerer chose to perform ICE restart.
The following option is supported in RTCAnswerOptions: VoiceActivityDetection.
Silence suppression in the answer is handled as described in
Section 5.2.3.2, with one exception: if support for silence suppression was not indicated in the offer, the VoiceActivityDetection parameter has no effect, and the answer
MUST be generated as if VoiceActivityDetection was set to "false". This is done on a per-codec basis (e.g., if the offerer somehow offered support for CN but set "usedtx=0" for Opus, setting VoiceActivityDetection to "true" would result in an answer with "CN" codecs and "usedtx=0"). The impact of this rule is that an answerer will not try to use silence suppression with any endpoint that does not offer it, making silence suppression support bilateral even with non-JSEP endpoints.
The SDP returned from createOffer or createAnswer
MUST NOT be changed before passing it to setLocalDescription. If precise control over the SDP is needed, the aforementioned createOffer/createAnswer options or RtpTransceiver APIs
MUST be used.
After calling setLocalDescription with an offer or answer, the application
MAY modify the SDP to reduce its capabilities before sending it to the far side, as long as it follows the rules above that define a valid JSEP offer or answer. Likewise, an application that has received an offer or answer from a peer
MAY modify the received SDP, subject to the same constraints, before calling setRemoteDescription.
As always, the application is solely responsible for what it sends to the other party, and all incoming SDP will be processed by the JSEP implementation to the extent of its capabilities. It is an error to assume that all SDP is well formed; however, one should be able to assume that any implementation of this specification will be able to process, as a remote offer or answer, unmodified SDP coming from any other implementation of this specification.
When a SessionDescription is supplied to setLocalDescription, the following steps
MUST be performed:
-
If the description is of type "rollback", follow the processing defined in Section 5.7 and skip the processing described in the rest of this section.
-
Otherwise, the type of the SessionDescription is checked against the current state of the PeerConnection:
-
If the type is "offer", the PeerConnection state MUST be either "stable" or "have-local-offer".
-
If the type is "pranswer" or "answer", the PeerConnection state MUST be either "have-remote-offer" or "have-local-pranswer".
-
If the type is not correct for the current state, processing MUST stop and an error MUST be returned.
-
The SessionDescription is then checked to ensure that its contents are identical to those generated in the last call to createOffer/createAnswer, and thus have not been altered, as discussed in Section 5.4; otherwise, processing MUST stop and an error MUST be returned.
-
Next, the SessionDescription is parsed into a data structure, as described in Section 5.8 below.
-
Finally, the parsed SessionDescription is applied as described in Section 5.9 below.
When a SessionDescription is supplied to setRemoteDescription, the following steps
MUST be performed:
-
If the description is of type "rollback", follow the processing defined in Section 5.7 and skip the processing described in the rest of this section.
-
Otherwise, the type of the SessionDescription is checked against the current state of the PeerConnection:
-
If the type is "offer", the PeerConnection state MUST be either "stable" or "have-remote-offer".
-
If the type is "pranswer" or "answer", the PeerConnection state MUST be either "have-local-offer" or "have-remote-pranswer".
-
If the type is not correct for the current state, processing MUST stop and an error MUST be returned.
-
Next, the SessionDescription is parsed into a data structure, as described in Section 5.8 below. If parsing fails for any reason, processing MUST stop and an error MUST be returned.
-
Finally, the parsed SessionDescription is applied as described in Section 5.10 below.
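The state checks in the two lists above can be summarized as a lookup table. The sketch below is illustrative; the `type_allowed` helper and the "local"/"remote" labels are hypothetical, and descriptions of type "rollback" are assumed to have been handled before this check is reached.

```python
_ALLOWED_STATES = {
    ("local", "offer"): {"stable", "have-local-offer"},
    ("local", "pranswer"): {"have-remote-offer", "have-local-pranswer"},
    ("local", "answer"): {"have-remote-offer", "have-local-pranswer"},
    ("remote", "offer"): {"stable", "have-remote-offer"},
    ("remote", "pranswer"): {"have-local-offer", "have-remote-pranswer"},
    ("remote", "answer"): {"have-local-offer", "have-remote-pranswer"},
}


def type_allowed(side, description_type, signaling_state):
    """Return True if the description type is valid in the current state.

    side is "local" for setLocalDescription, "remote" for
    setRemoteDescription.
    """
    allowed = _ALLOWED_STATES.get((side, description_type))
    if allowed is None:
        raise ValueError(f"unexpected type {description_type!r} for {side!r}")
    return signaling_state in allowed
```

When the function returns False, processing stops and an error is returned to the application.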
A rollback may be performed if the PeerConnection is in any state except for "stable". This means that both offers and provisional answers can be rolled back. Rollback can only be used to cancel proposed changes; there is no support for rolling back from a "stable" state to a previous "stable" state. If a rollback is attempted in the "stable" state, processing
MUST stop and an error
MUST be returned. Note that this implies that once the answerer has performed setLocalDescription with its answer, this cannot be rolled back.
The effect of rollback
MUST be the same regardless of whether setLocalDescription or setRemoteDescription is called.
In order to process rollback, a JSEP implementation abandons the current offer/answer transaction, sets the signaling state to "stable", and sets the pending local and/or remote description (see Sections 4.1.14 and 4.1.16) to null. Any resources or candidates that were allocated by the abandoned local description are discarded; any media that is received is processed according to the previous local and remote descriptions.
A rollback disassociates any RtpTransceivers that were associated with "m=" sections by the application of the rolled-back session description (see Sections 5.10 and 5.9). This means that some RtpTransceivers that were previously associated will no longer be associated with any "m=" section; in such cases, the value of the RtpTransceiver's mid property
MUST be set to null, and the mapping between the transceiver and its "m=" section index
MUST be discarded. RtpTransceivers that were created by applying a remote offer that was subsequently rolled back
MUST be stopped and removed from the PeerConnection. However, an RtpTransceiver
MUST NOT be removed if a track was attached to the RtpTransceiver via the addTrack method. This is so that an application may call addTrack, then call setRemoteDescription with an offer, then roll back that offer, then call createOffer and have an "m=" section for the added track appear in the generated offer.
The SDP contained in the session description object consists of a sequence of text lines, each containing a key-value expression, as described in
RFC 4566,
Section 5. The SDP is read, line by line, and converted to a data structure that contains the deserialized information. However, SDP allows many types of lines, not all of which are relevant to JSEP applications. For each line, the implementation will first ensure that it is syntactically correct according to its defining ABNF, check that it conforms to the semantics used in [RFC 4566] and [RFC 3264], and then either parse and store or discard the provided value, as described below.
If any line is not well formed or cannot be parsed as described, the parser
MUST stop with an error and reject the session description, even if the value is to be discarded. This ensures that implementations do not accidentally misinterpret ambiguous SDP.
First, the session-level lines are checked and parsed. These lines
MUST occur in a specific order, and with a specific syntax, as defined in
RFC 4566,
Section 5. Note that while the specific line types (e.g., "v=", "c=")
MUST occur in the defined order, lines of the same type (typically "a=") can occur in any order.
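The ordering requirement can be checked with a simple rank comparison, sketched below. The helper is hypothetical and deliberately narrow: it takes only the session-level lines (those before the first "m=" line), treats "t=" and "r=" as one time-description block, and does not check that mandatory lines are present or that each line matches its ABNF.

```python
_RANK = {
    "v": 0, "o": 1, "s": 2, "i": 3, "u": 4, "e": 5, "p": 6,
    "c": 7, "b": 8, "t": 9, "r": 9, "z": 10, "k": 11, "a": 12,
}


def session_lines_in_order(lines):
    """Return True if the session-level line types appear in SDP order."""
    last = -1
    for line in lines:
        # Every SDP line has the form <type>=<value> with a known type.
        if len(line) < 2 or line[1] != "=" or line[0] not in _RANK:
            return False
        rank = _RANK[line[0]]
        if rank < last:
            return False
        # Equal ranks are fine: repeated "a=" lines may be in any order.
        last = rank
    return True
```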
The following non-attribute lines are not meaningful in the JSEP context and
MAY be discarded once they have been checked.
-
The "c=" line MUST be checked for syntax, but its value is only used for ICE mismatch detection, as defined in RFC 8445, Section 5.4. Note that JSEP implementations should never encounter this condition because ICE is required for WebRTC.
-
The "i=", "u=", "e=", "p=", "t=", "r=", "z=", and "k=" lines MUST be checked for syntax, but their values are not otherwise used.
The remaining non-attribute lines are processed as follows:
Finally, the attribute lines are processed. Specific processing
MUST be applied for the following session-level attribute ("a=") lines:
-
Any "a=group" lines are parsed as specified in RFC 5888, Section 5, and the group's semantics and MID values are stored.
-
If present, a single "a=ice-lite" line is parsed as specified in RFC 8839, Section 5.3, and a value indicating the presence of an "a=ice-lite" line is stored.
-
If present, a single "a=ice-ufrag" line is parsed as specified in RFC 8839, Section 5.4, and the ufrag value is stored.
-
If present, a single "a=ice-pwd" line is parsed as specified in RFC 8839, Section 5.4, and the password value is stored.
-
If present, a single "a=ice-options" line is parsed as specified in RFC 8839, Section 5.6, and the set of specified options is stored.
-
Any "a=fingerprint" lines are parsed as specified in RFC 8122, Section 5, and the set of fingerprint and algorithm values is stored.
-
If present, a single "a=setup" line is parsed as specified in RFC 4145, Section 4, and the setup value is stored.
-
If present, a single "a=tls-id" line is parsed as specified in RFC 8842, Section 5, and the attribute value is stored.
-
Any "a=identity" lines are parsed and the identity values stored for subsequent verification, as specified in RFC 8827, Section 5.
-
Any "a=extmap" lines are parsed as specified in RFC 5285, Section 5, and their values are stored.
Other attributes that are not relevant to JSEP may also be present, and implementations
SHOULD process any that they recognize. As required by
RFC 4566,
Section 5.13, unknown attribute lines
MUST be ignored.
Once all the session-level lines have been parsed, processing continues with the lines in "m=" sections.
Like the session-level lines, the media section lines
MUST occur in the specific order and with the specific syntax defined in
RFC 4566,
Section 5.
The "m=" line itself
MUST be parsed as described in
RFC 4566,
Section 5.14, and the <media>, <port>, <proto>, and <fmt> values stored.
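A minimal sketch of this step is shown below. The `parse_m_line` helper and the dictionary layout are hypothetical; the sketch accepts the optional "<port>/<number of ports>" form but stores only the port, and it does not validate the <proto> or <fmt> values any further.

```python
def parse_m_line(line):
    """Split an "m=" line into its <media>, <port>, <proto>, <fmt> values."""
    if not line.startswith("m="):
        raise ValueError("not an m= line")
    parts = line[2:].split(" ")
    # At least one <fmt> value is required, and no field may be empty.
    if len(parts) < 4 or "" in parts:
        raise ValueError("malformed m= line")
    media, port_field, proto = parts[:3]
    port_text, slash, count_text = port_field.partition("/")
    if not port_text.isdigit() or (slash and not count_text.isdigit()):
        raise ValueError("malformed port field")
    return {
        "media": media,
        "port": int(port_text),
        "proto": proto,
        "fmt": parts[3:],
    }
```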
Following the "m=" line, specific processing
MUST be applied for the following non-attribute lines:
-
As with the "c=" line at the session level, the "c=" line MUST be parsed according to RFC 4566, Section 5.7, but its value is not used.
-
The "b=" line, if present, MUST be parsed as specified in RFC 4566, Section 5.8, and the bwtype and bandwidth values stored.
Specific processing
MUST also be applied for the following attribute lines:
-
If present, a single "a=ice-ufrag" line is parsed as specified in RFC 8839, Section 5.4, and the ufrag value is stored.
-
If present, a single "a=ice-pwd" line is parsed as specified in RFC 8839, Section 5.4, and the password value is stored.
-
If present, a single "a=ice-options" line is parsed as specified in RFC 8839, Section 5.6, and the set of specified options is stored.
-
Any "a=candidate" attributes MUST be parsed as specified in RFC 8839, Section 5.1, and their values stored.
-
Any "a=remote-candidates" attributes MUST be parsed as specified in RFC 8839, Section 5.2, but their values are ignored.
-
If present, a single "a=end-of-candidates" attribute MUST be parsed as specified in RFC 8840, Section 8.1, and its presence or absence flagged and stored.
-
Any "a=fingerprint" lines are parsed as specified in RFC 8122, Section 5, and the set of fingerprint and algorithm values is stored.
If the "m=" <proto> value indicates use of RTP, as described in
Section 5.1.2 above, the following attribute lines
MUST be processed:
-
The "m=" <fmt> value MUST be parsed as specified in RFC 4566, Section 5.14, and the individual values stored.
-
Any "a=rtpmap" or "a=fmtp" lines MUST be parsed as specified in RFC 4566, Section 6, and their values stored.
-
If present, a single "a=ptime" line MUST be parsed as described in RFC 4566, Section 6, and its value stored.
-
If present, a single "a=maxptime" line MUST be parsed as described in RFC 4566, Section 6, and its value stored.
-
If present, a single direction attribute line (e.g., "a=sendrecv") MUST be parsed as described in RFC 4566, Section 6, and its value stored.
-
Any "a=ssrc" attributes MUST be parsed as specified in RFC 5576, Section 4.1, and their values stored.
-
Any "a=extmap" attributes MUST be parsed as specified in RFC 5285, Section 5, and their values stored.
-
Any "a=rtcp-fb" attributes MUST be parsed as specified in RFC 4585, Section 4.2, and their values stored.
-
If present, a single "a=rtcp-mux" attribute MUST be parsed as specified in RFC 5761, Section 5.1.3, and its presence or absence flagged and stored.
-
If present, a single "a=rtcp-mux-only" attribute MUST be parsed as specified in RFC 8858, Section 3, and its presence or absence flagged and stored.
-
If present, a single "a=rtcp-rsize" attribute MUST be parsed as specified in RFC 5506, Section 5, and its presence or absence flagged and stored.
-
If present, a single "a=rtcp" attribute MUST be parsed as specified in RFC 3605, Section 2.1, but its value is ignored, as this information is superfluous when using ICE.
-
If present, "a=msid" attributes MUST be parsed as specified in RFC 8830, Section 3.2, and their values stored, ignoring any "appdata" field. If no "a=msid" attributes are present, a random msid-id value is generated for a "default" MediaStream for the session, if not already present, and this value is stored.
-
Any "a=imageattr" attributes MUST be parsed as specified in RFC 6236, Section 3, and their values stored.
-
Any "a=rid" lines MUST be parsed as specified in RFC 8851, Section 10, and their values stored.
-
If present, a single "a=simulcast" line MUST be parsed as specified in [RFC 8853], and its values stored.
Otherwise, if the "m=" <proto> value indicates use of SCTP, the following attribute lines
MUST be processed:
-
The "m=" <fmt> value MUST be parsed as specified in RFC 8841, Section 4.3, and the application protocol value stored.
-
An "a=sctp-port" attribute MUST be present, and it MUST be parsed as specified in RFC 8841, Section 5.2, and the value stored.
-
If present, a single "a=max-message-size" attribute MUST be parsed as specified in RFC 8841, Section 6, and the value stored. Otherwise, use the specified default.
Other attributes that are not relevant to JSEP may also be present, and implementations
SHOULD process any that they recognize. As required by
RFC 4566,
Section 5.13, unknown attribute lines
MUST be ignored.
Assuming that parsing completes successfully, the parsed description is then evaluated to ensure internal consistency as well as proper support for mandatory features. Specifically, the following checks are performed:
-
For each "m=" section, valid values for each of the mandatory-to-use features enumerated in Section 5.1.1 MUST be present. These values MAY be either present at the media level or inherited from the session level.
-
ICE ufrag and password values, which MUST comply with the size limits specified in RFC 8839, Section 5.4.
-
A tls-id value, which MUST be set according to RFC 8842, Section 5. If this is a re-offer or a response to a re-offer and the tls-id value is different from that presently in use, the DTLS connection is not being continued and the remote description MUST be part of an ICE restart, together with new ufrag and password values.
-
A DTLS setup value, which MUST be set according to the rules specified in RFC 5763, Section 5 and MUST be consistent with the selected role of the current DTLS connection, if one exists and is being continued.
-
DTLS fingerprint values, where at least one fingerprint MUST be present.
-
All rid-ids referenced in an "a=simulcast" line MUST exist as "a=rid" lines.
-
Each "m=" section is also checked to ensure that prohibited features are not used.
-
If the RTP/RTCP multiplexing policy is "require", each "m=" section MUST contain an "a=rtcp-mux" attribute. If an "m=" section contains an "a=rtcp-mux-only" attribute, that section MUST also contain an "a=rtcp-mux" attribute.
-
If an "m=" section was present in the previous answer, the state of RTP/RTCP multiplexing MUST match what was previously negotiated.
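Two of the checks above are easy to sketch. The helpers are hypothetical, and the limits used for ICE credentials (4 to 256 characters for the ufrag, 22 to 256 for the password, drawn from letters, digits, "+", and "/") reflect one reading of RFC 8839, Section 5.4, and should be confirmed against that document. The simulcast check assumes the "a=simulcast" value alternates direction and stream-list tokens, with ";" and "," separating rid-ids and "~" marking a paused stream.

```python
import re

_ICE_CHARS = re.compile(r"[A-Za-z0-9+/]+")


def ice_credentials_ok(ufrag, pwd):
    """Check the ICE ufrag and password against the assumed size limits."""
    return (bool(_ICE_CHARS.fullmatch(ufrag)) and 4 <= len(ufrag) <= 256
            and bool(_ICE_CHARS.fullmatch(pwd)) and 22 <= len(pwd) <= 256)


def simulcast_rids_defined(section_lines):
    """Return True if every rid-id named in "a=simulcast" has an "a=rid" line."""
    defined, referenced = set(), set()
    for line in section_lines:
        if line.startswith("a=rid:"):
            # The rid-id is the first token after the colon.
            defined.add(line[len("a=rid:"):].split(" ")[0])
        elif line.startswith("a=simulcast:"):
            tokens = line[len("a=simulcast:"):].split(" ")
            # Tokens alternate: direction, stream list, direction, stream list.
            for stream_list in tokens[1::2]:
                for rid in re.split(r"[;,]", stream_list):
                    referenced.add(rid.lstrip("~"))
    return referenced <= defined
```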
If this session description is of type "pranswer" or "answer", the following additional checks are applied:
-
The session description MUST follow the rules defined in RFC 3264, Section 6, including the requirement that the number of "m=" sections MUST exactly match the number of "m=" sections in the associated offer.
-
For each "m=" section, the media type and protocol values MUST exactly match the media type and protocol values in the corresponding "m=" section in the associated offer.
If any of the preceding checks failed, processing
MUST stop and an error
MUST be returned.
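The additional checks for a "pranswer" or "answer" can be sketched by comparing the "m=" lines of the two descriptions pairwise. The helper below is hypothetical and assumes the "m=" lines have already been validated for syntax; it compares only the number of sections and their media type and protocol values, so a rejected section with a zero port still matches.

```python
def answer_mirrors_offer(offer_m_lines, answer_m_lines):
    """Return True if the answer's "m=" lines line up with the offer's."""

    def media_and_proto(m_line):
        # m=<media> <port> <proto> <fmt> ...
        media, _port, proto = m_line[2:].split(" ")[:3]
        return media, proto

    # The number of "m=" sections must match exactly.
    if len(offer_m_lines) != len(answer_m_lines):
        return False
    # Media type and protocol must match section by section.
    return all(media_and_proto(o) == media_and_proto(a)
               for o, a in zip(offer_m_lines, answer_m_lines))
```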
The following steps are performed at the media engine level to apply a local description. If an error is returned, the session
MUST be restored to the state it was in before performing these steps.
First, "m=" sections are processed. For each "m=" section, the following steps
MUST be performed; if any parameters are out of bounds or cannot be applied, processing
MUST stop and an error
MUST be returned.
-
If this "m=" section is new, begin gathering candidates for it, as defined in RFC 8445, Section 5.1.1, unless it is definitively being bundled (either (1) this is an offer and the "m=" section is marked as bundle-only or (2) it is an answer and the "m=" section is bundled into another "m=" section).
-
Or, if the ICE ufrag and password values have changed, trigger the ICE agent to start an ICE restart as described in RFC 8445, Section 9, and begin gathering new candidates for the "m=" section. If this description is an answer, also start checks on that media section.
-
If the "m=" section <proto> value indicates use of RTP:
-
If there is no RtpTransceiver associated with this "m=" section, find one and associate it with this "m=" section according to the following steps. Note that this situation will only occur when applying an offer.
-
Find the RtpTransceiver that corresponds to this "m=" section, using the mapping between transceivers and "m=" section indices established when creating the offer.
-
Set the value of this RtpTransceiver's mid property to the MID of the "m=" section.
-
If RTCP mux is indicated, prepare to demux RTP and RTCP from the RTP ICE component, as specified in RFC 5761, Section 5.1.3.
-
For each specified RTP header extension, establish a mapping between the extension ID and URI, as described in RFC 5285, Section 6.
-
If the MID header extension is supported, prepare to demux RTP streams intended for this "m=" section based on the MID header extension, as described in RFC 9143, Section 15.
-
For each specified media format, establish a mapping between the payload type and the actual media format, as described in RFC 3264, Section 6.1. In addition, prepare to demux RTP streams intended for this "m=" section based on the media formats supported by this "m=" section, as described in RFC 9143, Section 9.2.
-
For each specified "rtx" media format, establish a mapping between the RTX payload type and its associated primary payload type, as described in Sections 8.6 and 8.7 of [RFC 4588].
-
If the direction attribute is of type "sendrecv" or "recvonly", enable receipt and decoding of media.
Finally, if this description is of type "pranswer" or "answer", follow the processing defined in
Section 5.11 below.
The following steps are performed to apply a remote description. If an error is returned, the session
MUST be restored to the state it was in before performing these steps.
If the answer contains any "a=ice-options" attributes where "trickle" is listed as an attribute, update the PeerConnection canTrickleIceCandidates property to be "true". Otherwise, set this property to "false".
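As a brief illustration, the value of canTrickleIceCandidates could be derived from the lines of the description as follows. The helper name is hypothetical, and the sketch assumes that the options in an "a=ice-options" line are separated by single spaces.

```python
def can_trickle(sdp_lines):
    """Return True if any "a=ice-options" line lists the "trickle" option."""
    prefix = "a=ice-options:"
    for line in sdp_lines:
        if line.startswith(prefix):
            # Options are space-separated tokens after the colon.
            if "trickle" in line[len(prefix):].split(" "):
                return True
    return False
```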
The following steps
MUST be performed for attributes at the session level; if any parameters are out of bounds or cannot be applied, processing
MUST stop and an error
MUST be returned.
-
For any specified "CT" bandwidth value, set this value as the limit for the maximum total bitrate for all "m=" sections, as specified in RFC 4566, Section 5.8. Within this overall limit, the implementation can dynamically decide how to best allocate the available bandwidth between "m=" sections, respecting any specific limits that have been specified for individual "m=" sections.
-
For any specified "RR" or "RS" bandwidth values, handle as specified in RFC 3556, Section 2.
-
Any "AS" bandwidth value (RFC 4566, Section 5.8) MUST be ignored, as the meaning of this construct at the session level is not well defined.
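The session-level bandwidth rules above might be applied along the following lines. This is only a sketch: the function name `session_bandwidth` is hypothetical, the input is assumed to be the session-level lines of the description, and values are returned exactly as signaled without unit conversion.

```python
def session_bandwidth(session_lines):
    """Collect session-level "b=" values that are acted on (hypothetical helper).

    "CT", "RR", and "RS" values are kept; "AS" is ignored at the session level.
    A malformed value raises ValueError, standing in for the error that the
    text requires when a parameter cannot be applied.
    """
    limits = {}
    for line in session_lines:
        if not line.startswith("b="):
            continue
        bwtype, _, value = line[2:].partition(":")
        if bwtype == "AS":
            continue  # meaning at the session level is not well defined
        if bwtype in ("CT", "RR", "RS"):
            limits[bwtype] = int(value)
    return limits
```

With the lines "b=CT:2000", "b=AS:512", and "b=RR:2400", the sketch keeps the "CT" and "RR" values and drops the "AS" value.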
For each "m=" section, the following steps MUST be performed; if any parameters are out of bounds or cannot be applied, processing MUST stop and an error MUST be returned.
-
If the ICE ufrag or password changed from the previous remote description:
-
If the description is of type "offer", the implementation MUST note that an ICE restart is needed, as described in RFC 8839, Section 4.4.1.1.1.
-
If the description is of type "answer" or "pranswer", then check to see if the current local description is an ICE restart, and if not, generate an error. If the PeerConnection state is "have-remote-pranswer" and the ICE ufrag or password changed from the previous provisional answer, then signal the ICE agent to discard any previous ICE checklist state for the "m=" section. Finally, signal the ICE agent to begin checks.
-
If the current local description indicates an ICE restart but neither the ICE ufrag nor the password has changed from the previous remote description (as prescribed by RFC 8445, Section 9), generate an error.
-
Configure the ICE components associated with this media section to use the supplied ICE remote ufrag and password for their connectivity checks.
-
Pair any supplied ICE candidates with any gathered local candidates, as described in RFC 8445, Section 6.1.2, and start connectivity checks with the appropriate credentials.
-
If an "a=end-of-candidates" attribute is present, process the end-of-candidates indication as described in RFC 8838, Section 14.
-
If the "m=" section <proto> value indicates use of RTP:
-
If the "m=" section is being recycled (see Section 5.2.2), disassociate the currently associated RtpTransceiver by setting its mid property to null, and discard the mapping between the transceiver and its "m=" section index.
-
If the "m=" section is not associated with any RtpTransceiver (possibly because it was disassociated in the previous step), either find an RtpTransceiver or create one according to the following steps:
-
If the "m=" section is "sendrecv" or "recvonly", and there are RtpTransceivers of the same type that were added to the PeerConnection by addTrack and are not associated with any "m=" section and are not stopped, find the first (according to the canonical order described in Section 5.2.1) such RtpTransceiver.
-
If no RtpTransceiver was found in the previous step, create one with a "recvonly" direction.
-
Associate the found or created RtpTransceiver with the "m=" section by setting the value of the RtpTransceiver's mid property to the MID of the "m=" section, and establish a mapping between the transceiver and the index of the "m=" section. If the "m=" section does not include a MID (i.e., the remote endpoint does not support the MID extension), generate a value for the RtpTransceiver mid property, following the guidance for "a=mid" mentioned in Section 5.2.1.
-
For each specified media format that is also supported by the local application, establish a mapping between the specified payload type and the media format, as described in RFC 3264, Section 6.1. Specifically, this means that the implementation records the payload type to be used in outgoing RTP packets when sending each specified media format, as well as the relative preference for each format that is indicated in their ordering. If any indicated media format is not supported by the local application, it MUST be ignored.
-
For each specified "rtx" media format, establish a mapping between the RTX payload type and its associated primary payload type, as described in RFC 4588, Section 4. If any referenced primary payload types are not present, this MUST result in an error. Note that RTX payload types may refer to primary payload types that are not supported by the local media implementation, in which case the RTX payload type MUST also be ignored.
-
For each specified fmtp parameter that is supported by the local application, enable it on the associated media formats.
-
For each specified Synchronization Source (SSRC) that is signaled in the "m=" section, prepare to demux RTP streams intended for this "m=" section using that SSRC, as described in RFC 9143, Section 9.2.
-
For each specified RTP header extension that is also supported by the local application, establish a mapping between the extension ID and URI, as described in RFC 5285, Section 5. Specifically, this means that the implementation records the extension ID to be used in outgoing RTP packets when sending each specified header extension. If any indicated RTP header extension is not supported by the local application, it MUST be ignored.
-
For each specified RTCP feedback mechanism that is also supported by the local application, enable it on the associated media formats.
-
For any specified "TIAS" ("Transport Independent Application Specific (maximum)") bandwidth value, set this value as a constraint on the maximum RTP bitrate to be used when sending media, as specified in [RFC 3890]. If a "TIAS" value is not present but an "AS" value is specified, generate a "TIAS" value using this formula:
TIAS = AS * 1000 * 0.95 - (50 * 40 * 8)
The 1000 changes the unit from kbps to bps (as required by TIAS), and the 0.95 is to allocate 5% to RTCP. An estimate of header overhead is then subtracted out, in which the 50 is based on 50 packets per second, the 40 is based on typical header size (in bytes), and the 8 converts bytes to bits. Note that "TIAS" is preferred over "AS" because it provides more accurate control of bandwidth.
-
For any "RR" or "RS" bandwidth values, handle as specified in RFC 3556, Section 2.
-
Any specified "CT" bandwidth value MUST be ignored, as the meaning of this construct at the media level is not well defined.
-
If the "m=" section is of type "audio":
-
For each specified "CN" media format, configure silence suppression for all supported media formats with the same clock rate, as described in RFC 3389, Section 5, except for formats that have their own internal silence suppression mechanisms. Silence suppression for such formats (e.g., Opus) is controlled via fmtp parameters, as discussed in Section 5.2.3.2.
-
For each specified "telephone-event" media format, enable dual-tone multifrequency (DTMF) transmission for all supported media formats with the same clock rate, as described in RFC 4733, Section 2.5.1.2. If there are any supported media formats that do not have a corresponding telephone-event format, disable DTMF transmission for those formats.
-
For any specified "ptime" value, configure the available media formats to use the specified packet size when sending. If the specified size is not supported for a media format, use the next closest value instead.
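As a worked example of the "TIAS" formula given in the list above, the following sketch converts an "AS" value into a "TIAS" value. The function name `tias_from_as` is hypothetical, and the use of integer arithmetic (multiplying by 95 and dividing by 100 rather than multiplying by 0.95) is an assumption made here to avoid floating-point rounding; the text itself does not prescribe a rounding rule.

```python
def tias_from_as(as_kbps):
    """Derive a TIAS value (bps) from an AS value (kbps) using the formula above."""
    rtp_share_bps = as_kbps * 1000 * 95 // 100   # kbps to bps, 5% set aside for RTCP
    header_overhead_bps = 50 * 40 * 8            # 50 packets/s, 40-byte headers, 8 bits/byte
    return rtp_share_bps - header_overhead_bps

# b=AS:512 -> 512 * 1000 * 0.95 - 16000 = 470400
example_tias = tias_from_as(512)
```

Note that for very small "AS" values the formula yields a result at or below zero; how to treat that case is not addressed by the text, and the sketch simply returns the computed value.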
Finally, if this description is of type "pranswer" or "answer", follow the processing defined in Section 5.11 below.
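The "rtx" rules in the list above (a missing primary payload type is an error, while a primary payload type that is not locally supported causes the RTX payload type to be ignored) can be sketched as follows. The function name `usable_rtx` and its three inputs are hypothetical; the sketch assumes the RTX-to-primary associations have already been extracted from the "m=" section.

```python
def usable_rtx(rtx_to_primary, present_pts, supported_pts):
    """Filter RTX mappings for one "m=" section (hypothetical helper).

    rtx_to_primary: {rtx_payload_type: primary_payload_type} as signaled
    present_pts:    payload types listed in the "m=" section
    supported_pts:  payload types whose media format is locally supported
    """
    usable = {}
    for rtx_pt, primary_pt in rtx_to_primary.items():
        if primary_pt not in present_pts:
            # The referenced primary payload type is absent: this is an error.
            raise ValueError(
                f"rtx payload type {rtx_pt} references missing payload type {primary_pt}"
            )
        if primary_pt in supported_pts:
            usable[rtx_pt] = primary_pt
        # Otherwise the primary format is unsupported, so the RTX type is ignored.
    return usable
```

For instance, with RTX payload types 97 and 99 referring to 96 and 98, and only 96 supported locally, the sketch keeps the 97-to-96 association and silently drops 99.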
In addition to the steps mentioned above for processing a local or remote description, the following steps are performed when processing a description of type "pranswer" or "answer".
For each "m=" section, the following steps MUST be performed:
-
If the "m=" section has been rejected (i.e., the <port> value is set to zero in the answer), stop any reception or transmission of media for this section, and, unless a non-rejected "m=" section is bundled with this "m=" section, discard any associated ICE components, as described in the second bullet item in RFC 8839, Section 4.4.3.1.
-
If the remote DTLS fingerprint has been changed or the value of the "a=tls-id" attribute has changed, tear down the DTLS connection. This includes the case when the PeerConnection state is "have-remote-pranswer". If a DTLS connection needs to be torn down but the answer does not indicate an ICE restart or, in the case of "have-remote-pranswer", new ICE credentials, an error MUST be generated. If an ICE restart is performed without a change in the tls-id value or fingerprint, then the same DTLS connection is continued over the new ICE channel. Note that although JSEP requires that answerers change the tls-id value if and only if the offerer does, non-JSEP answerers are permitted to change the tls-id value as long as the offer contained an ICE restart. Thus, JSEP implementations that process DTLS data prior to receiving an answer MUST be prepared to receive either a ClientHello or data from the previous DTLS connection.
-
If no valid DTLS connection exists, prepare to start a DTLS connection, using the specified roles and fingerprints, on any underlying ICE components, once they are active.
-
If the "m=" section <proto> value indicates use of RTP:
-
If the "m=" section references RTCP feedback mechanisms that were not present in the corresponding "m=" section in the offer, this indicates a negotiation problem and MUST result in an error. However, new media formats and new RTP header extension values are permitted in the answer, as described in RFC 3264, Section 7 and RFC 5285, Section 6.
-
If the "m=" section has RTCP mux enabled, discard the RTCP ICE component, if one exists, and begin or continue muxing RTCP over the RTP ICE component, as specified in RFC 5761, Section 5.1.3. Otherwise, prepare to transmit RTCP over the RTCP ICE component; if no RTCP ICE component exists because RTCP mux was previously enabled, this MUST result in an error.
-
If the "m=" section has Reduced-Size RTCP enabled, configure the RTCP transmission for this "m=" section to use Reduced-Size RTCP, as specified in [RFC 5506].
-
If the direction attribute in the answer indicates that the JSEP implementation should be sending media ("sendonly" for local answers, "recvonly" for remote answers, or "sendrecv" for either type of answer), choose the media format to send as the most preferred media format from the remote description that is also locally supported, as discussed in Sections 6.1 and 7 of [RFC 3264], and start transmitting RTP media using that format once the underlying transport layers have been established. If an SSRC has not already been chosen for this outgoing RTP stream, choose a unique random one. If media is already being transmitted, the same SSRC SHOULD be used unless the clock rate of the new codec is different, in which case a new SSRC MUST be chosen, as specified in RFC 7160, Section 4.1.
-
The payload type mapping from the remote description is used to determine payload types for the outgoing RTP streams, including the payload type for the send media format chosen above. Any RTP header extensions that were negotiated should be included in the outgoing RTP streams, using the extension mapping from the remote description. If the MID header extension has been negotiated, include it in the outgoing RTP streams, as indicated in RFC 9143, Section 15. If the RtpStreamId or RepairedRtpStreamId header extensions have been negotiated and rid-ids have been established, include these header extensions in the outgoing RTP streams, as indicated in RFC 8851, Section 4.
-
If the "m=" section is of type "audio", and silence suppression was (1) configured for the send media format as a result of processing the remote description and (2) also enabled for that format in the local description, use silence suppression for outgoing media, in accordance with the guidance in Section 5.2.3.2. If these conditions are not met, silence suppression MUST NOT be used for outgoing media.
-
If simulcast has been negotiated, send the appropriate number of Source RTP Streams as specified in RFC 8853, Section 5.3.3.
-
If the send media format chosen above has a corresponding "rtx" media format or a FEC mechanism has been negotiated, establish a redundancy RTP stream with a unique random SSRC for each Source RTP Stream, and start or continue transmitting RTX/FEC packets as needed.
-
If the send media format chosen above has a corresponding "red" media format of the same clock rate, allow redundant encoding using the specified format for resiliency purposes, as discussed in RFC 8854, Section 3.2. Note that unlike RTX or FEC media formats, the "red" format is transmitted on the Source RTP Stream, not the redundancy RTP stream.
-
Enable the RTCP feedback mechanisms referenced in the media section for all Source RTP Streams using the specified media formats. Specifically, begin or continue sending the requested feedback types and reacting to received feedback, as specified in RFC 4585, Section 4.2. When sending RTCP feedback, follow the rules and recommendations from RFC 8108, Section 5.4.1 to select which SSRC to use.
-
If the direction attribute in the answer indicates that the JSEP implementation should not be sending media ("recvonly" for local answers, "sendonly" for remote answers, or "inactive" for either type of answer), stop transmitting all RTP media, but continue sending RTCP, as described in RFC 3264, Section 5.1.
-
If the "m=" section <proto> value indicates use of SCTP:
-
If an SCTP association exists and the remote SCTP port has changed, discard the existing SCTP association. This includes the case when the PeerConnection state is "have-remote-pranswer".
-
If no valid SCTP association exists, prepare to initiate an SCTP association over the associated ICE component and DTLS connection, using the local SCTP port value from the local description and the remote SCTP port value from the remote description, as described in RFC 8841, Section 10.2.
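The two direction rules in the list above, together with the choice of send media format, can be summarized in a short sketch. The names `should_send` and `choose_send_format` are hypothetical, and the format list is assumed to contain only primary media formats (not "rtx", "red", or similar) in the order given by the remote description.

```python
def should_send(direction, is_local_answer):
    """Return True if the answer's direction means this endpoint sends media."""
    if direction == "sendrecv":
        return True
    if direction == "inactive":
        return False
    # "sendonly" in our own answer, or "recvonly" in the peer's answer,
    # both mean that this endpoint is the sender.
    return direction == ("sendonly" if is_local_answer else "recvonly")

def choose_send_format(remote_formats, locally_supported):
    """Pick the most preferred remote format that is also supported locally."""
    for fmt in remote_formats:  # the remote ordering expresses preference
        if fmt in locally_supported:
            return fmt
    return None
```

As a usage example, a remote answer with direction "recvonly" means this endpoint sends, and with remote formats ordered VP9, VP8, H264 and local support for VP8 and H264 only, VP8 is chosen.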
If the answer contains valid bundle groups, discard any ICE components for the "m=" sections that will be bundled onto the primary ICE components in each bundle, and begin muxing these "m=" sections accordingly, as described in RFC 9143, Section 7.4.
If the description is of type "answer" and there are still remaining candidates in the ICE candidate pool, discard them.
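Returning to the DTLS steps in the per-section list above, the decision of whether to tear down, continue, or start a DTLS connection can be sketched as a small function. The name `dtls_action`, its boolean inputs, and the returned labels are all hypothetical; in particular, `new_ice_credentials` stands for "the answer indicates an ICE restart or, in the "have-remote-pranswer" case, carries new ICE credentials".

```python
def dtls_action(fingerprint_changed, tls_id_changed, new_ice_credentials, has_connection):
    """Return a label for the DTLS handling implied by an answer (hypothetical helper)."""
    if fingerprint_changed or tls_id_changed:
        if not new_ice_credentials:
            # A teardown without an accompanying ICE restart is an error.
            raise ValueError("DTLS teardown required without an ICE restart")
        # After the teardown no valid connection exists, so a new one is prepared.
        return "tear-down-then-start"
    if has_connection:
        # Unchanged tls-id and fingerprint: keep using the same DTLS connection,
        # even if ICE itself was restarted.
        return "continue"
    return "start"
```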