4. RTP Payload Format
4.1. RTP Header Usage
In addition to Section 5.1 of [RFC6184], the following rules apply.

o Setting of the M bit: The M bit of an RTP packet for which the packet payload is an NI-MTAP MUST be equal to 1 if the last NAL unit, in decoding order, of the access unit associated with the RTP timestamp is contained in the packet.

o Setting of the RTP timestamp: For an RTP packet for which the packet payload is an empty NAL unit, the RTP timestamp must be set according to Section 4.10. For an RTP packet for which the packet payload is a PACSI NAL unit, the RTP timestamp MUST be equal to the NALU-time of the next non-PACSI NAL unit in transmission order. Recall that the NALU-time of a NAL unit in an MTAP is defined in [RFC6184] as the value that the RTP timestamp would have if that NAL unit would be transported in its own RTP packet.

o Setting of the SSRC: For both SST and MST, the SSRC values MUST be set according to [RFC3550].

4.2. NAL Unit Extension and Header Usage
4.2.1. NAL Unit Extension
This memo specifies a NAL unit extension mechanism to allow for introduction of new types of NAL units, beyond the three NAL unit types left undefined in [RFC6184] (i.e., 0, 30, and 31). The extension mechanism utilizes the NAL unit type value 31 and is specified as follows. When the NAL unit type value is equal to 31, the one-byte NAL unit header consisting of the F, NRI, and Type fields as specified in Section 1.1.3 is extended by one additional octet, which consists of a 5-bit field named Subtype and three 1-bit fields named J, K, and L, respectively. The additional octet is shown in the following figure.

   +---------------+
   |0|1|2|3|4|5|6|7|
   +-+-+-+-+-+-+-+-+
   | Subtype |J|K|L|
   +---------------+

The Subtype value determines the (extended) NAL unit type of this NAL unit. The interpretation of the fields J, K, and L depends on the Subtype. The semantics of the fields are as follows.

When Subtype is equal to 1, the NAL unit is an empty NAL unit as specified in Section 4.10.

When Subtype is equal to 2, the NAL unit is an NI-MTAP NAL unit as specified in Section 4.7.1.

All other values of Subtype (0, 3-31) are reserved for future extensions, and receivers MUST ignore the entire NAL unit when Subtype is equal to any of these reserved values.

4.2.2. NAL Unit Header Usage
The structure and semantics of the NAL unit header according to the H.264 specification [H.264] were introduced in Section 1.1.3. This section specifies the extended semantics of the NAL unit header fields F, NRI, I, PRID, DID, QID, TID, U, and D, according to this memo. When the Type field is equal to 31, the semantics of the fields in the extension NAL unit header were specified in Section 4.2.1.

The semantics of F specified in Section 5.3 of [RFC6184] also apply in this memo. That is, a value of 0 for F indicates that the NAL unit type octet and payload should not contain bit errors or other syntax violations, whereas a value of 1 for F indicates that the NAL unit type octet and payload may contain bit errors or other syntax violations. MANEs SHOULD set the F bit to indicate bit errors in the NAL unit.

For NRI, for a bitstream conforming to one of the profiles defined in Annex A of [H.264] and transported using [RFC6184], the semantics specified in Section 5.3 of [RFC6184] apply, i.e., NRI also indicates the relative importance of NAL units. For a bitstream conforming to one of the profiles defined in Annex G of [H.264] and transported using this memo, in addition to the semantics specified in Annex G of [H.264], NRI also indicates the relative importance of NAL units within a layer.

For I, in addition to the semantics specified in Annex G of [H.264], according to this memo, MANEs MAY use this information to protect NAL units with I equal to 1 better than NAL units with I equal to 0. MANEs MAY also utilize information of NAL units with I equal to 1 to decide when to forward more packets for an RTP packet stream. For example, when it is detected that spatial layer switching has happened such that the operation point has changed to a higher value of DID, MANEs MAY start to forward NAL units with the higher value of DID only after forwarding a NAL unit with I equal to 1 with the higher value of DID.

Note that, in the context of this section, "protecting a NAL unit" means any RTP or network transport mechanism that could improve the probability of successful delivery of the packet conveying the NAL unit, including applying a Quality of Service (QoS) enabled network, Forward Error Correction (FEC), retransmissions, and advanced scheduling behavior, whenever possible.

For PRID, the semantics specified in Annex G of [H.264] apply. Note that MANEs implementing unequal error protection MAY use this information to protect NAL units with smaller PRID values better than those with larger PRID values, for example, by including only the more important NAL units in a FEC protection mechanism. The importance for the decoding process decreases as the PRID value increases.

For DID, QID, or TID, in addition to the semantics specified in Annex G of [H.264], according to this memo, values of DID, QID, or TID indicate the relative importance in their respective dimension. A lower value of DID, QID, or TID indicates a higher importance if the other two components are identical. MANEs MAY use this information to protect more important NAL units better than less important NAL units.

For U, in addition to the semantics specified in Annex G of [H.264], according to this memo, MANEs MAY use this information to protect NAL units with U equal to 1 better than NAL units with U equal to 0.
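
As an illustration of how a MANE might read these fields, the following Python sketch unpacks a four-byte SVC NAL unit header and keeps only the NAL units whose PRID is at or below a threshold, e.g., as candidates for an FEC group. The bit layout assumed here is the one drawn for the first four octets in Figure 3 (Section 4.9). The names SvcHeader, parse_svc_header, and select_for_fec, and the PRID threshold itself, are hypothetical and are not defined by this memo.

```python
from typing import List, NamedTuple


class SvcHeader(NamedTuple):
    """Fields of the four-byte SVC NAL unit header (hypothetical container)."""
    f: int
    nri: int
    type: int
    i: int
    prid: int
    n: int
    did: int
    qid: int
    tid: int
    u: int
    d: int
    o: int


def parse_svc_header(buf: bytes) -> SvcHeader:
    """Unpack F, NRI, Type and the three-byte SVC header extension."""
    if len(buf) < 4:
        raise ValueError("SVC NAL unit header needs four bytes")
    b0, b1, b2, b3 = buf[0], buf[1], buf[2], buf[3]
    return SvcHeader(
        f=b0 >> 7, nri=(b0 >> 5) & 0x03, type=b0 & 0x1F,
        # Octet 2: R (1 bit, ignored), I (1 bit), PRID (6 bits)
        i=(b1 >> 6) & 0x01, prid=b1 & 0x3F,
        # Octet 3: N (1 bit), DID (3 bits), QID (4 bits)
        n=b2 >> 7, did=(b2 >> 4) & 0x07, qid=b2 & 0x0F,
        # Octet 4: TID (3 bits), U, D, O (1 bit each), RR (2 bits, ignored)
        tid=b3 >> 5, u=(b3 >> 4) & 0x01, d=(b3 >> 3) & 0x01,
        o=(b3 >> 2) & 0x01,
    )


def select_for_fec(headers: List[SvcHeader], max_prid: int) -> List[SvcHeader]:
    """Keep NAL units with PRID <= max_prid (smaller PRID = more important)."""
    return [h for h in headers if h.prid <= max_prid]
```

The threshold policy is only one possible reading of "protect NAL units with smaller PRID values better"; a real MANE could equally weight DID, QID, TID, I, or U.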

For D, in addition to the semantics specified in Annex G of [H.264], according to this memo, MANEs MAY use this information to determine whether a given NAL unit is required for successfully decoding a certain Operation Point of the SVC bitstream, hence to decide whether to forward the NAL unit.

4.3. Payload Structures
The NAL unit structure is central to H.264/AVC, [RFC6184], as well as SVC and this memo. In H.264/AVC and SVC, all coded bits for representing a video signal are encapsulated in NAL units. In [RFC6184], each RTP packet payload is structured as a NAL unit, which contains one or a part of one NAL unit specified in H.264/AVC, or aggregates one or more NAL units specified in H.264/AVC.

[RFC6184] specifies three basic payload structures (in Section 5.2 of [RFC6184]): single NAL unit packet, aggregation packet, fragmentation unit, and six new types (24 to 29) of NAL units. The value of the Type field of the RTP packet payload header (i.e., the first byte of the payload) may be equal to any value from 1 to 23 for a single NAL unit packet, any value from 24 to 27 for an aggregation packet, and 28 or 29 for a fragmentation unit.

In addition to the NAL unit types defined originally for H.264/AVC, SVC defines three new NAL unit types specifically for SVC: coded slice in scalable extension NAL units (type 20), prefix NAL units (type 14), and subset sequence parameter set NAL units (type 15), as described in Section 1.1. This memo further introduces three new types of NAL units, PACSI NAL unit (NAL unit type 30) as specified in Section 4.9, empty NAL unit (type 31, subtype 1) as specified in Section 4.10, and NI-MTAP NAL unit (type 31, subtype 2) as specified in Section 4.7.1.

The RTP packet payload structure in [RFC6184] is maintained with slight extensions in this memo, as follows. Each RTP packet payload is still structured as a NAL unit, which contains one or a part of one NAL unit specified in H.264/AVC and SVC, or contains one PACSI NAL unit or one empty NAL unit, or aggregates zero or more NAL units specified in H.264/AVC and SVC, zero or one PACSI NAL unit, and zero or more empty NAL units.

In this memo, one of the three basic payload structures, fragmentation unit, remains the same as in [RFC6184], and the other two, single NAL unit packet and aggregation packet, are extended as follows. The value of the Type field of the payload header may be equal to any value from 1 to 23, inclusive, and 30 to 31, inclusive, for a single NAL unit packet, and any value from 24 to 27, inclusive, and 31, for an aggregation packet.
When the Type field of the payload header is equal to 31 and the Subtype field of the payload header is equal to 2, the packet is an aggregation packet (containing an NI-MTAP NAL unit). When the Type field of the payload header is equal to 31 and the Subtype field of the payload header is equal to 1, the packet is a single NAL unit packet (containing an empty NAL unit). Note that, in this memo, the length of the payload header varies depending on the value of the Type field in the first byte of the RTP packet payload. If the value is equal to 14, 20, or 30, the first four bytes of the packet payload form the payload header; otherwise, if the value is equal to 31, the first two bytes of the payload form the payload header; otherwise, the payload header is the first byte of the packet payload.
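
The Type and Subtype rules above can be sketched in Python as follows. This is a minimal illustration, assuming the payload is available as a bytes object and that Subtype occupies the five most significant bits of the second octet as drawn in Section 4.2.1. The function names and the returned strings are hypothetical.

```python
def payload_header_length(payload: bytes) -> int:
    """Length in bytes of the payload header, chosen by the Type field."""
    if not payload:
        raise ValueError("empty RTP payload")
    nal_type = payload[0] & 0x1F
    if nal_type in (14, 20, 30):
        return 4  # four-byte SVC NAL unit header
    if nal_type == 31:
        return 2  # one-byte header plus the Subtype/J/K/L octet
    return 1


def basic_payload_structure(payload: bytes) -> str:
    """Classify a payload as single / aggregation / fragmentation / reserved."""
    if not payload:
        raise ValueError("empty RTP payload")
    nal_type = payload[0] & 0x1F
    if 1 <= nal_type <= 23 or nal_type == 30:
        return "single NAL unit packet"
    if 24 <= nal_type <= 27:
        return "aggregation packet"
    if nal_type in (28, 29):
        return "fragmentation unit"
    if nal_type == 31:
        if len(payload) < 2:
            raise ValueError("type 31 requires the extension octet")
        subtype = payload[1] >> 3  # upper five bits; J, K, L are the low three
        if subtype == 1:
            return "single NAL unit packet"  # empty NAL unit
        if subtype == 2:
            return "aggregation packet"  # NI-MTAP
    return "reserved"  # type 0, or type 31 with subtype 0 or 3-31
```

A receiver following Section 4.2.1 would ignore any NAL unit for which this sketch returns "reserved" because of a reserved Subtype.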

Table 1 lists the NAL unit types introduced in SVC and this memo and where they are described in this memo. Table 2 summarizes the basic payload structure types for all NAL unit types when they are directly used as RTP packet payloads according to this memo. Table 3 summarizes the NAL unit types allowed to be aggregated (i.e., used as aggregation units in aggregation packets) or fragmented (i.e., carried in fragmentation units) according to this memo.

   Table 1. NAL unit types introduced in SVC and this memo

   Type  Subtype  NAL Unit Name                      Section Numbers
   -----------------------------------------------------------------
   14    -        Prefix NAL unit                    1.1
   15    -        Subset sequence parameter set      1.1
   20    -        Coded slice in scalable extension  1.1
   30    -        PACSI NAL unit                     4.9
   31    0        reserved                           4.2.1
   31    1        Empty NAL unit                     4.10
   31    2        NI-MTAP                            4.7.1
   31    3-31     reserved                           4.2.1

   Table 2. Basic payload structure types for all NAL unit types when they are directly used as RTP packet payloads

   Type   Subtype  Basic Payload Structure
   ------------------------------------------
   0      -        reserved
   1-23   -        Single NAL Unit Packet
   24-27  -        Aggregation Packet
   28-29  -        Fragmentation Unit
   30     -        Single NAL Unit Packet
   31     0        reserved
   31     1        Single NAL Unit Packet
   31     2        Aggregation Packet
   31     3-31     reserved

   Table 3. Summary of the NAL unit types allowed to be aggregated or fragmented (yes = allowed, no = disallowed, - = not applicable/not specified)

   Type   Subtype  STAP-A  STAP-B  MTAP16  MTAP24  FU-A  FU-B  NI-MTAP
   -------------------------------------------------------------------
   0      -        -       -       -       -       -     -     -
   1-23   -        yes     yes     yes     yes     yes   yes   yes
   24-29  -        no      no      no      no      no    no    no
   30     -        yes     yes     yes     yes     no    no    yes
   31     0        -       -       -       -       -     -     -
   31     1        yes     no      no      no      no    no    yes
   31     2        no      no      no      no      no    no    no
   31     3-31     -       -       -       -       -     -     -

4.4. Transmission Modes
This memo enables transmission of an SVC bitstream over one or more RTP sessions. If only one RTP session is used for transmission of the SVC bitstream, the transmission mode is referred to as single-session transmission (SST); otherwise (more than one RTP session is used for transmission of the SVC bitstream), the transmission mode is referred to as multi-session transmission (MST).

SST SHOULD be used for point-to-point unicast scenarios, while MST SHOULD be used for point-to-multipoint multicast scenarios where different receivers require different operation points of the same SVC bitstream, to improve bandwidth utilization efficiency.

If the OPTIONAL mst-mode media type parameter (see Section 7.1) is not present, SST MUST be used; otherwise (mst-mode is present), MST MUST be used.

4.5. Packetization Modes
4.5.1. Packetization Modes for Single-Session Transmission
When SST is in use, Section 5.4 of [RFC6184] applies with the following extensions. The packetization modes specified in Section 5.4 of [RFC6184], namely, single NAL unit mode, non-interleaved mode, and interleaved mode, are also referred to as session packetization modes. Table 4 summarizes the allowed session packetization modes for SST.

   Table 4. Summary of allowed session packetization modes (denoted as "Session Mode" for simplicity) for SST (yes = allowed, no = disallowed)

   Session Mode           Allowed
   -------------------------------------
   Single NAL Unit Mode   yes
   Non-Interleaved Mode   yes
   Interleaved Mode       yes

For NAL unit types in the range of 0 to 29, inclusive, the NAL unit types allowed to be directly used as packet payloads for each session packetization mode are the same as specified in Section 5.4 of [RFC6184]. For other NAL unit types, which are newly introduced in this memo, the NAL unit types allowed to be directly used as packet payloads for each session packetization mode are summarized in Table 5.

   Table 5. New NAL unit types allowed to be directly used as packet payloads for each session packetization mode (yes = allowed, no = disallowed, - = not applicable/not specified)

   Type  Subtype  Single NAL  Non-Interleaved  Interleaved
                  Unit Mode   Mode             Mode
   -------------------------------------------------------------
   30    -        yes         no               no
   31    0        -           -                -
   31    1        yes         yes              no
   31    2        no          yes              no
   31    3-31     -           -                -

4.5.2. Packetization Modes for Multi-Session Transmission
For MST, this memo specifies four MST packetization modes:

o Non-interleaved timestamp based mode (NI-T);

o Non-interleaved cross-session decoding order number (CS-DON) based mode (NI-C);

o Non-interleaved combined timestamp and CS-DON mode (NI-TC); and

o Interleaved CS-DON (I-C) mode.

These four modes differ in two ways. First, they differ in terms of whether NAL units are required to be transmitted within each RTP session in decoding order (i.e., non-interleaved), or they are allowed to be transmitted in a different order (i.e., interleaved). Second, they differ in the mechanisms they provide in order to recover the correct decoding order of the NAL units across all RTP sessions involved.

The NI-T, NI-C, and NI-TC modes do not allow interleaving, and are thus targeted for systems that require relatively low end-to-end latency, e.g., conversational systems. The I-C mode allows interleaving and is thus targeted for systems that do not require very low end-to-end latency. The benefits of interleaving are the same as those of the interleaved mode specified in [RFC6184].

The NI-T mode uses timestamps to recover the decoding order of NAL units, whereas the NI-C and I-C modes both use the CS-DON mechanism (explained later) to do so. The NI-TC mode provides both timestamps and the CS-DON method; receivers in this case may choose to use either method for performing decoding order recovery.

The MST packetization mode in use MUST be signaled by the value of the OPTIONAL mst-mode media type parameter. The used MST packetization mode governs which session packetization modes are allowed in the associated RTP sessions, which in turn govern which NAL unit types are allowed to be directly used as RTP packet payloads.

Table 6 summarizes the allowed session packetization modes for NI-T, NI-C, and NI-TC. Table 7 summarizes the allowed session packetization modes for I-C.

   Table 6. Summary of allowed session packetization modes (denoted as "Session Mode" for simplicity) for NI-T, NI-C, and NI-TC (yes = allowed, no = disallowed)

   Session Mode           Base Session   Enhancement Session
   -----------------------------------------------------------
   Single NAL Unit Mode   yes            no
   Non-Interleaved Mode   yes            yes
   Interleaved Mode       no             no

   Table 7. Summary of allowed session packetization modes (denoted as "Session Mode" for simplicity) for I-C (yes = allowed, no = disallowed)

   Session Mode           Base Session   Enhancement Session
   -----------------------------------------------------------
   Single NAL Unit Mode   no             no
   Non-Interleaved Mode   no             no
   Interleaved Mode       yes            yes
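
The contents of Tables 6 and 7, together with the decoding order recovery mechanism each MST packetization mode provides, can be captured in a small lookup structure. The Python sketch below is one possible encoding. The dictionary keys reuse the mode abbreviations from this section; the constant names, the "base"/"enhancement" labels, and the helper function are hypothetical.

```python
SINGLE_NAL = "single-nal-unit"
NON_INTERLEAVED = "non-interleaved"
INTERLEAVED = "interleaved"

# Allowed session packetization modes per MST packetization mode
# (Table 6 for NI-T, NI-C, NI-TC; Table 7 for I-C).
ALLOWED_SESSION_MODES = {
    "NI-T": {"base": {SINGLE_NAL, NON_INTERLEAVED},
             "enhancement": {NON_INTERLEAVED}},
    "NI-C": {"base": {SINGLE_NAL, NON_INTERLEAVED},
             "enhancement": {NON_INTERLEAVED}},
    "NI-TC": {"base": {SINGLE_NAL, NON_INTERLEAVED},
              "enhancement": {NON_INTERLEAVED}},
    "I-C": {"base": {INTERLEAVED},
            "enhancement": {INTERLEAVED}},
}

# Mechanisms available for recovering decoding order across sessions.
ORDER_RECOVERY = {
    "NI-T": {"timestamp"},
    "NI-C": {"cs-don"},
    "NI-TC": {"timestamp", "cs-don"},
    "I-C": {"cs-don"},
}


def session_mode_allowed(mst_mode: str, session: str, session_mode: str) -> bool:
    """True if session_mode may be used in the given session ("base" or
    "enhancement") under the given MST packetization mode."""
    return session_mode in ALLOWED_SESSION_MODES[mst_mode][session]
```

For instance, session_mode_allowed("NI-C", "enhancement", SINGLE_NAL) evaluates to False, matching the "no" entry for enhancement sessions in Table 6.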

For NAL unit types in the range of 0 to 29, inclusive, the NAL unit types allowed to be directly used as packet payloads for each session packetization mode are the same as specified in Section 5.4 of [RFC6184]. For other NAL unit types, which are newly introduced in this memo, the NAL unit types allowed to be directly used as packet payloads for each allowed session packetization mode for NI-T, NI-C, NI-TC, and I-C are summarized in Tables 8, 9, 10, and 11, respectively.

   Table 8. New NAL unit types allowed to be directly used as packet payloads for each allowed session packetization mode when NI-T is in use (yes = allowed, no = disallowed, - = not applicable/not specified)

   Type  Subtype  Single NAL  Non-Interleaved
                  Unit Mode   Mode
   ---------------------------------------------------
   30    -        yes         no
   31    0        -           -
   31    1        yes         yes
   31    2        no          yes
   31    3-31     -           -

   Table 9. New NAL unit types allowed to be directly used as packet payloads for each allowed session packetization mode when NI-C is in use (yes = allowed, no = disallowed, - = not applicable/not specified)

   Type  Subtype  Single NAL  Non-Interleaved
                  Unit Mode   Mode
   ---------------------------------------------------
   30    -        yes         yes
   31    0        -           -
   31    1        no          no
   31    2        no          yes
   31    3-31     -           -

   Table 10. New NAL unit types allowed to be directly used as packet payloads for each allowed session packetization mode when NI-TC is in use (yes = allowed, no = disallowed, - = not applicable/not specified)

   Type  Subtype  Single NAL  Non-Interleaved
                  Unit Mode   Mode
   ---------------------------------------------------
   30    -        yes         yes
   31    0        -           -
   31    1        yes         yes
   31    2        no          yes
   31    3-31     -           -

   Table 11. New NAL unit types allowed to be directly used as packet payloads for the allowed session packetization mode when I-C is in use (yes = allowed, no = disallowed, - = not applicable/not specified)

   Type  Subtype  Interleaved Mode
   ------------------------------------
   30    -        no
   31    0        -
   31    1        no
   31    2        no
   31    3-31     -

When MST is in use and the MST packetization mode in use is NI-C, empty NAL units (type 31, subtype 1) MUST NOT be used, i.e., no RTP packet is allowed to contain one or more empty NAL units.

When MST is in use and the MST packetization mode in use is I-C, both empty NAL units (type 31, subtype 1) and NI-MTAP NAL units (type 31, subtype 2) MUST NOT be used, i.e., no RTP packet is allowed to contain one or more empty NAL units or an NI-MTAP NAL unit.

4.6. Single NAL Unit Packets
Section 5.6 of [RFC6184] applies with the following extensions. The payload of a single NAL unit packet MAY be a PACSI NAL unit (Type 30) or an empty NAL unit (Type 31 and Subtype 1), in addition to a NAL unit with NAL unit type equal to any value from 1 to 23, inclusive.
If the Type field of the first byte of the payload is not equal to 31, the payload header is the first byte of the payload. Otherwise (the Type field of the first byte of the payload is equal to 31), the payload header is the first two bytes of the payload.

4.7. Aggregation Packets
In addition to Section 5.7 of [RFC6184], the following applies in this memo.

4.7.1. Non-Interleaved Multi-Time Aggregation Packets (NI-MTAPs)
One new NAL unit type introduced in this memo is the non-interleaved multi-time aggregation packet (NI-MTAP). An NI-MTAP consists of one or more non-interleaved multi-time aggregation units. The NAL units contained in NI-MTAPs MUST be aggregated in decoding order.

A non-interleaved multi-time aggregation unit for the NI-MTAP consists of 16 bits of unsigned size information of the following NAL unit (in network byte order), and 16 bits (in network byte order) of timestamp offset (TS offset) for the NAL unit. The structure is presented in Figure 1. The starting or ending position of an aggregation unit within a packet may or may not be on a 32-bit word boundary. The NAL units in the NI-MTAP are ordered in NAL unit decoding order.

The Type field of the NI-MTAP MUST be set equal to "31".

The F bit MUST be set to 0 if all the F bits of the aggregated NAL units are zero; otherwise, it MUST be set to 1.

The value of NRI MUST be the maximum value of NRI across all NAL units carried in the NI-MTAP packet.

The field Subtype MUST be equal to 2.

If the field J is equal to 1, the optional DON field MUST be present for each of the non-interleaved multi-time aggregation units. For SST, the J field MUST be equal to 0. For MST, in the NI-T mode the J field MUST be equal to 0, whereas in the NI-C or NI-TC mode the J field MUST be equal to 1. When the NI-C or NI-TC mode is in use, the DON field, when present, MUST represent the CS-DON value for the particular NAL unit as defined in Section 6.2.2.

The fields K and L MUST be both equal to 0.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :        NAL unit size          |        TS offset              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        DON (optional)         |                               |
   |-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+       NAL unit                |
   |                                                               |
   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 1. Non-interleaved multi-time aggregation unit for NI-MTAP

Let TS be the RTP timestamp of the packet carrying the NAL unit. Recall that the NALU-time of a NAL unit in an MTAP is defined in [RFC6184] as the value that the RTP timestamp would have if that NAL unit would be transported in its own RTP packet. The timestamp offset field MUST be set to a value equal to the value of the following formula:

   if NALU-time >= TS, TS offset = NALU-time - TS
   else, TS offset = NALU-time + (2^32 - TS)

For the "earliest" multi-time aggregation unit in an NI-MTAP, the timestamp offset MUST be zero. Hence, the RTP timestamp of the NI-MTAP itself is identical to the earliest NALU-time.

   Informative note: The "earliest" multi-time aggregation unit is the one that would have the smallest extended RTP timestamp among all the aggregation units of an NI-MTAP if the aggregation units were encapsulated in single NAL unit packets. An extended timestamp is a timestamp that has more than 32 bits and is capable of counting the wraparound of the timestamp field, thus enabling one to determine the smallest value if the timestamp wraps. Such an "earliest" aggregation unit may or may not be the first one in the order in which the aggregation units are encapsulated in an NI-MTAP. The "earliest" NAL unit need not be the same as the first NAL unit in the NAL unit decoding order either.

Figure 2 presents an example of an RTP packet that contains an NI-MTAP that contains two non-interleaved multi-time aggregation units, labeled as 1 and 2 in the figure.
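
The timestamp offset rule for NI-MTAP aggregation units, including the 32-bit wraparound case, can be sketched in Python as follows. The function names are hypothetical. The range check reflects only the fact that the TS offset field is 16 bits wide; how a sender avoids exceeding that range is outside this sketch.

```python
RTP_TS_MODULUS = 2 ** 32


def ts_offset(nalu_time: int, ts: int) -> int:
    """TS offset of an aggregation unit, given its NALU-time and the RTP
    timestamp TS of the packet carrying it (both 32-bit values)."""
    if nalu_time >= ts:
        offset = nalu_time - ts
    else:
        # NALU-time has wrapped around relative to TS.
        offset = nalu_time + (RTP_TS_MODULUS - ts)
    if offset > 0xFFFF:
        raise ValueError("offset does not fit the 16-bit TS offset field")
    return offset


def nalu_time_from_offset(ts: int, offset: int) -> int:
    """Receiver-side inverse: recover the NALU-time from TS and TS offset."""
    return (ts + offset) % RTP_TS_MODULUS
```

With TS = 2^32 - 100 and NALU-time = 50, the second branch applies and the offset is 50 + 100 = 150; adding 150 back to TS modulo 2^32 yields 50 again.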

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          RTP Header                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|NRI|  Type   | Subtype |J|K|L|                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
   |                                                               |
   |        Non-interleaved multi-time aggregation unit #1         |
   :                                                               :
   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |  Non-interleaved multi-time   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
   |                      aggregation unit #2                      |
   :                                                               :
   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               :...OPTIONAL RTP padding        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 2. An RTP packet including an NI-MTAP containing two non-interleaved multi-time aggregation units

4.8. Fragmentation Units (FUs)
Section 5.8 of [RFC6184] applies.

   Informative note: In case a NAL unit with the four-byte SVC NAL unit header is fragmented, the three-byte SVC-specific header extension is considered as part of the NAL unit payload. That is, the three-byte SVC-specific header extension is only available in the first fragment of the fragmented NAL unit.

4.9. Payload Content Scalability Information (PACSI) NAL Unit
Another new type of NAL unit specified in this memo is the payload content scalability information (PACSI) NAL unit. The Type field of PACSI NAL units MUST be equal to 30 (a NAL unit type value left unspecified in [H.264] and [RFC6184]). A PACSI NAL unit MAY be carried in a single NAL unit packet or an aggregation packet, and MUST NOT be fragmented.

PACSI NAL units may be used for the following purposes:

o To enable MANEs to decide whether to forward, process, or discard aggregation packets, by checking in PACSI NAL units the scalability information and other characteristics of the aggregated NAL units, rather than looking into the aggregated NAL units themselves, which are defined by the video coding specification;

o To enable correct decoding order recovery in MST using the NI-C or NI-TC mode, with the help of the CS-DON information included in PACSI NAL units; and

o To improve resilience to packet losses, e.g., by utilizing the following data or information included in PACSI NAL units: repeated Supplemental Enhancement Information (SEI) messages, information regarding the start and end of layer representations, and the indices to layer representations of the lowest temporal subset.

PACSI NAL units MAY be ignored in the NI-T mode without affecting the decoding order recovery process.

When a PACSI NAL unit is present in an aggregation packet, the following applies.

o The PACSI NAL unit MUST be the first aggregated NAL unit in the aggregation packet.

o There MUST be at least one additional aggregated NAL unit in the aggregation packet.

o The RTP header fields and the payload header fields of the aggregation packet are set as if the PACSI NAL unit was not included in the aggregation packet.

o If the aggregation packet is an MTAP16, MTAP24, or NI-MTAP with the J field equal to 1, the decoding order number (DON) for the PACSI NAL unit MUST be set to indicate that the PACSI NAL unit has an identical DON to the first NAL unit in decoding order among the remaining NAL units in the aggregation packet.

When a PACSI NAL unit is included in a single NAL unit packet, it is associated with the next non-PACSI NAL unit in transmission order, and the RTP header fields of the packet are set as if the next non-PACSI NAL unit in transmission order was included in a single NAL unit packet.

The PACSI NAL unit structure is as follows. The first four octets are exactly the same as the four-byte SVC NAL unit header discussed in Section 1.1.3. They are followed by one octet containing several flags, then five optional octets, and finally zero or more SEI NAL units. Each SEI NAL unit is preceded by a 16-bit unsigned size field (in network byte order) that indicates the size of the following NAL unit in bytes (excluding these two octets, but including the NAL unit header octet of the SEI NAL unit). Figure 3 illustrates the PACSI NAL unit structure and an example of a PACSI NAL unit containing two SEI NAL units.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |F|NRI|  Type   |R|I|   PRID    |N| DID |  QID  | TID |U|D|O| RR|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |X|Y|T|A|P|C|S|E| TL0PICIDX (o) |        IDRPICID (o)           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          DONC (o)             |        NAL unit size 1        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                        SEI NAL unit 1                         |
   |                                                               |
   |               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               |        NAL unit size 2        |               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               |
   |                                                               |
   |                        SEI NAL unit 2                         |
   |                               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Figure 3. PACSI NAL unit structure. Fields suffixed by "(o)" are OPTIONAL.

The bits A, P, and C are specified only if the bit X is equal to 1. The bits S and E are specified, and the fields TL0PICIDX and IDRPICID are present, only if the bit Y is equal to 1. The field DONC is present only if the bit T is equal to 1. The field T MUST be equal to 0 if the PACSI NAL unit is contained in an STAP-B, MTAP16, MTAP24, or NI-MTAP with the J field equal to 1.

The values of the fields in PACSI NAL unit MUST be set as follows.

o The F bit MUST be set to 1 if the F bit in at least one of the remaining NAL units in the aggregation packet is equal to 1 (when the PACSI NAL unit is included in an aggregation packet) or if the next non-PACSI NAL unit in transmission order has the F bit equal to 1 (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the F bit MUST be set to 0.

o The NRI field MUST be set to the highest value of NRI field among all the remaining NAL units in the aggregation packet (when the PACSI NAL unit is included in an aggregation packet) or the value of the NRI field of the next non-PACSI NAL unit in transmission order (when the PACSI NAL unit is included in a single NAL unit packet).

o The Type field MUST be set to 30.

o The R bit MUST be set to 1. Receivers MUST ignore the value of R.

o The I bit MUST be set to 1 if the I bit of at least one of the remaining NAL units in the aggregation packet is equal to 1 (when the PACSI NAL unit is included in an aggregation packet) or if the I bit of the next non-PACSI NAL unit in transmission order is equal to 1 (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the I bit MUST be set to 0.

o The PRID field MUST be set to the lowest value of the PRID values of the remaining NAL units in the aggregation packet (when the PACSI NAL unit is included in an aggregation packet) or the PRID value of the next non-PACSI NAL unit in transmission order (when the PACSI NAL unit is included in a single NAL unit packet).

o The N bit MUST be set to 1 if the N bit of all the remaining NAL units in the aggregation packet is equal to 1 (when the PACSI NAL unit is included in an aggregation packet) or if the N bit of the next non-PACSI NAL unit in transmission order is equal to 1 (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the N bit MUST be set to 0.

o The DID field MUST be set to the lowest value of the DID values of the remaining NAL units in the aggregation packet (when the PACSI NAL unit is included in an aggregation packet) or the DID value of the next non-PACSI NAL unit in transmission order (when the PACSI NAL unit is included in a single NAL unit packet).

o The QID field MUST be set to the lowest value of the QID values of the remaining NAL units with the lowest value of DID in the aggregation packet (when the PACSI NAL unit is included in an aggregation packet) or the QID value of the next non-PACSI NAL unit in transmission order (when the PACSI NAL unit is included in a single NAL unit packet).

o The TID field MUST be set to the lowest value of the TID values of the remaining NAL units with the lowest value of DID in the aggregation packet (when the PACSI NAL unit is included in an aggregation packet) or the TID value of the next non-PACSI NAL unit in transmission order (when the PACSI NAL unit is included in a single NAL unit packet).

o The U bit MUST be set to 1 if the U bit of at least one of the remaining NAL units in the aggregation packet is equal to 1 (when the PACSI NAL unit is included in an aggregation packet) or if the U bit of the next non-PACSI NAL unit in transmission order is equal to 1 (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the U bit MUST be set to 0.

o The D bit MUST be set to 1 if the D value of all the remaining NAL units in the aggregation packet is equal to 1 (when the PACSI NAL unit is included in an aggregation packet) or if the D bit of the next non-PACSI NAL unit in transmission order is equal to 1 (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the D bit MUST be set to 0.

o The O bit MUST be set to 1 if the O bit of at least one of the remaining NAL units in the aggregation packet is equal to 1 (when the PACSI NAL unit is included in an aggregation packet) or if the O bit of the next non-PACSI NAL unit in transmission order is equal to 1 (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the O bit MUST be set to 0.

o The RR field MUST be set to "11" (in binary form). Receivers MUST ignore the value of RR.

o If the X bit is equal to 1, the bits A, P, and C are specified as below. Otherwise, the bits A, P, and C are unspecified, and receivers MUST ignore the values of these bits. The X bit SHOULD be identical for all the PACSI NAL units in all the RTP sessions carrying the same SVC bitstream.

o If the Y bit is equal to 1, the OPTIONAL fields TL0PICIDX and IDRPICID MUST be present and specified as below, and the bits S and E are also specified as below. Otherwise, the fields TL0PICIDX and IDRPICID MUST NOT be present, while the S and E bits are unspecified and receivers MUST ignore the values of these bits.
The Y bit MUST be identical for all the PACSI NAL units in all the RTP sessions carrying the same SVC bitstream. The Y bit MUST be equal to 0 when the parameter packetization-mode is equal to 2. o If the T bit is equal to 1, the OPTIONAL field DONC MUST be present and specified as below. Otherwise, the field DONC MUST NOT be present. The field T MUST be equal to 0 if the PACSI NAL unit is contained in an STAP-B, MTAP16, MTAP24, or NI-MTAP.
o The A bit MUST be set to 1 if at least one of the remaining NAL units in the aggregation packet belongs to an anchor layer representation (when the PACSI NAL unit is included in an aggregation packet) or if the next non-PACSI NAL unit in transmission order belongs to an anchor layer representation (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the A bit MUST be set to 0.

   Informative note: The A bit indicates whether CGS or spatial layer switching at a non-IDR layer representation (a layer representation with nal_unit_type not equal to 5 and idr_flag not equal to 1) can be performed. With some picture coding structures a non-IDR intra layer representation can be used for random access. Compared to using only IDR layer representations, higher coding efficiency can be achieved. The H.264/AVC or SVC solution to indicate the random accessibility of a non-IDR intra layer representation is using a recovery point SEI message. The A bit offers direct access to this information, without having to parse the recovery point SEI message, which may be buried deeply in an SEI NAL unit. Furthermore, the SEI message may or may not be present in the bitstream.

o The P bit MUST be set to 1 if all the remaining NAL units in the aggregation packet have redundant_pic_cnt greater than 0 (when the PACSI NAL unit is included in an aggregation packet) or the next non-PACSI NAL unit in transmission order has redundant_pic_cnt greater than 0 (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the P bit MUST be set to 0.

   Informative note: The P bit indicates whether a packet can be discarded because it contains only redundant slice NAL units. Without this bit, the corresponding information can be obtained from the syntax element redundant_pic_cnt, which is contained in the variable-length coded slice header.

o The C bit MUST be set to 1 if at least one of the remaining NAL units in the aggregation packet belongs to an intra layer representation (when the PACSI NAL unit is included in an aggregation packet) or if the next non-PACSI NAL unit in transmission order belongs to an intra layer representation (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the C bit MUST be set to 0.

   Informative note: The C bit indicates whether a packet contains intra slices, which may be the only packets to be forwarded, e.g., when the network conditions are particularly adverse.
o The S bit MUST be set to 1 if the first NAL unit following the PACSI NAL unit in an aggregation packet is the first VCL NAL unit, in decoding order, of a layer representation (when the PACSI NAL unit is included in an aggregation packet) or if the next non-PACSI NAL unit in transmission order is the first VCL NAL unit, in decoding order, of a layer representation (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the S bit MUST be set to 0.

o The E bit MUST be set to 1 if the last NAL unit following the PACSI NAL unit in an aggregation packet is the last VCL NAL unit, in decoding order, of a layer representation (when the PACSI NAL unit is included in an aggregation packet) or if the next non-PACSI NAL unit in transmission order is the last VCL NAL unit, in decoding order, of a layer representation (when the PACSI NAL unit is included in a single NAL unit packet). Otherwise, the E bit MUST be set to 0.

   Informative note: In an aggregation packet it is always possible to detect the beginning or end of a layer representation by detecting changes in the values of dependency_id, quality_id, and temporal_id in NAL unit headers, except for the first and last NAL units of a packet. The S and E bits provide this information, for both single NAL unit and aggregation packets, so that previous or following packets do not have to be examined. This enables MANEs to detect slice loss and take proper action, such as requesting a retransmission as soon as possible, and allows efficient playout buffer handling similar to that enabled by the M bit in the RTP header. The M bit in the RTP header still indicates the end of an access unit, not the end of a layer representation.
o When present, the TL0PICIDX field MUST be set equal to tl0_dep_rep_idx as specified in Annex G of [H.264] for the layer representation containing the first NAL unit following the PACSI NAL unit in the aggregation packet (when the PACSI NAL unit is included in an aggregation packet) or containing the next non-PACSI NAL unit in transmission order (when the PACSI NAL unit is included in a single NAL unit packet).

o When present, the IDRPICID field MUST be set equal to effective_idr_pic_id as specified in Annex G of [H.264] for the layer representation containing the first NAL unit following the PACSI NAL unit in the aggregation packet (when the PACSI NAL unit is included in an aggregation packet) or containing the next non-PACSI NAL unit in transmission order (when the PACSI NAL unit is included in a single NAL unit packet).
   Informative note: The TL0PICIDX and IDRPICID fields enable the detection of the loss of layer representations in the most important temporal layer (with temporal_id equal to 0) by receivers as well as MANEs. SVC provides a solution that uses SEI messages, which are harder to parse and may or may not be present in the bitstream. When the PACSI NAL unit is part of an NI-MTAP packet, it is possible to infer the correct values of tl0_dep_rep_idx and idr_pic_id for all layer representations contained in the NI-MTAP by following the rules that specify how these parameters are set as given in Annex G of [H.264] and by detecting the different layer representations contained in the NI-MTAP packet by detecting changes in the values of dependency_id, quality_id, and temporal_id in the NAL unit headers as well as using the S and E flags. The only exception is if NAL units of an IDR picture are present in the NI-MTAP in a position other than the first NAL unit following the PACSI NAL unit, in which case the value of idr_pic_id cannot be inferred. In this case the NAL unit has to be partially parsed to obtain the idr_pic_id. Note that, due to the large size of IDR pictures, their inclusion in an NI-MTAP, and especially in a position other than the first NAL unit following the PACSI NAL unit, may be neither practical nor useful.

o When present, the field DONC indicates the cross-session decoding order number (CS-DON) for the first of the remaining NAL units in the aggregation packet (when the PACSI NAL unit is included in an aggregation packet) or the CS-DON of the next non-PACSI NAL unit in transmission order (when the PACSI NAL unit is included in a single NAL unit packet). CS-DON is further discussed in Section 4.11.

The PACSI NAL unit MAY include a subset of the SEI NAL units associated with the access unit to which the first non-PACSI NAL unit in the aggregation packet belongs, and MUST NOT contain SEI NAL units associated with any other access unit.
   Informative note: In H.264/AVC and SVC, within each access unit, SEI NAL units must appear before any VCL NAL unit in decoding order. Therefore, without using PACSI NAL units, SEI messages are typically only conveyed in the first of the packets carrying an access unit. Senders may repeat SEI NAL units in PACSI NAL units, so that they are repeated in more than one packet and thus increase robustness against packet losses. Receivers may use the repeated SEI messages in place of missing SEI messages.

For a PACSI NAL unit included in an aggregation packet, an SEI message SHOULD NOT be included in the PACSI NAL unit and also included in one of the remaining NAL units contained in the same aggregation packet.
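The aggregation rules for the PACSI header fields follow two patterns: "lowest value among the units with the lowest DID" (QID, TID) and "any" or "all" over the remaining NAL units (U and O use "at least one", D uses "all"). The following Python sketch illustrates those patterns only. The NalInfo record and the pacsi_fields() helper are hypothetical names introduced for illustration; they are not defined by this memo. For a PACSI NAL unit in a single NAL unit packet, the list would contain just the next non-PACSI NAL unit in transmission order.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class NalInfo:
    """Hypothetical summary of the header fields of one non-PACSI NAL unit."""
    did: int  # dependency_id
    qid: int  # quality_id
    tid: int  # temporal_id
    u: int    # U bit
    d: int    # D bit
    o: int    # O bit


def pacsi_fields(remaining: List[NalInfo]) -> Dict[str, int]:
    """Derive QID, TID, U, D, and O for a PACSI NAL unit.

    'remaining' holds the non-PACSI NAL units of the aggregation packet,
    or the single next non-PACSI NAL unit for a single NAL unit packet.
    """
    lowest_did = min(n.did for n in remaining)
    lowest_layer = [n for n in remaining if n.did == lowest_did]
    return {
        # Lowest QID/TID among the units that have the lowest DID.
        "QID": min(n.qid for n in lowest_layer),
        "TID": min(n.tid for n in lowest_layer),
        # "At least one" rules.
        "U": int(any(n.u == 1 for n in remaining)),
        "O": int(any(n.o == 1 for n in remaining)),
        # "All" rule.
        "D": int(all(n.d == 1 for n in remaining)),
    }
```

When the X bit is 1, the A and C bits follow the same "at least one" pattern as U and O, and the P bit follows the same "all" pattern as D.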
4.10. Empty NAL unit
An empty NAL unit MAY be included in a single NAL unit packet, an STAP-A or an NI-MTAP packet. Empty NAL units MUST have an RTP timestamp (when transported in a single NAL unit packet) or NALU-time (when transported in an aggregation packet) that is associated with an access unit for which there exists at least one NAL unit of type 1, 5, or 20. When MST is used, the type 1, 5, or 20 NAL unit may be in a different RTP session. Empty NAL units may be used in the decoding order recovery process of the NI-T mode as described in Section 5.2.1.

The packet structure is shown in the following figure.

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|F|NRI|  Type   | Subtype |J|K|L|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 4. Empty NAL unit structure.

The fields MUST be set as follows:

F MUST be equal to 0
NRI MUST be equal to 3
Type MUST be equal to 31
Subtype MUST be equal to 1
J MUST be equal to 0
K MUST be equal to 0
L MUST be equal to 0

4.11. Decoding Order Number (DON)
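As an illustration of the empty NAL unit of Section 4.10, the field values listed in Figure 4 pack into exactly two octets. The Python sketch below shows one way to build and recognize that pattern. The function names are hypothetical and chosen for this example; the bit positions follow the figures (F is the most significant bit of the first octet, Subtype occupies the five most significant bits of the second).

```python
def build_empty_nal_unit() -> bytes:
    """Pack the two-octet empty NAL unit using the field values of Figure 4."""
    f, nri, nal_type = 0, 3, 31        # first octet: F (1 bit), NRI (2), Type (5)
    subtype, j, k, l = 1, 0, 0, 0      # second octet: Subtype (5 bits), J, K, L
    first = (f << 7) | (nri << 5) | nal_type
    second = (subtype << 3) | (j << 2) | (k << 1) | l
    return bytes([first, second])


def is_empty_nal_unit(data: bytes) -> bool:
    """Return True if the NAL unit header is Type 31 with Subtype 1."""
    if len(data) < 2:
        return False
    nal_type = data[0] & 0x1F
    subtype = data[1] >> 3
    return nal_type == 31 and subtype == 1
```

The check deliberately looks only at Type and Subtype, since those two fields are what identify the extended NAL unit type.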
The DON concept is introduced in [RFC6184] and is used to recover the decoding order when interleaving is used within a single session. Section 5.5 of [RFC6184] applies when using SST.

When using MST, it is necessary to recover the decoding order across the various RTP sessions regardless of whether interleaving is used. In addition to the timestamp mechanism described later, the CS-DON mechanism is an extension of the DON facility that can be used for this purpose, and is defined in the following section.

4.11.1. Cross-Session DON (CS-DON) for Multi-Session Transmission
The cross-session decoding order number (CS-DON) is a number that indicates the decoding order of NAL units across all RTP sessions involved in MST. It is similar to the DON concept in [RFC6184], but contrary to [RFC6184] where the DON was used only for interleaved packetization, in this memo it is used not only in the interleaved MST mode (I-C) but also in two of the non-interleaved MST modes (NI-C and NI-TC).

When the NI-C or NI-TC MST modes are in use, the packetization of each session MUST be as specified in Section 5.2.2. In PACSI NAL units the CS-DON value is explicitly coded in the field DONC. For non-PACSI NAL units the CS-DON value is derived as follows. Let SN indicate the RTP sequence number of a packet.

o For each non-PACSI NAL unit carried in a session using the single NAL unit session packetization mode, the CS-DON value of the NAL unit is equal to (DONC_prev_PACSI + SN_diff - 1) % 65536, wherein "%" is the modulo operation, DONC_prev_PACSI is the DONC value of the previous PACSI NAL unit with the same NALU-time as the current NAL unit, and SN_diff is calculated as follows:

   if SN1 > SN2, SN_diff = SN1 - SN2
   else SN_diff = SN1 + 65536 - SN2

   where SN1 and SN2 are the SNs of the current NAL unit and the previous PACSI NAL unit with the same NALU-time, respectively.

o For non-PACSI NAL units carried in a session using the non-interleaved session packetization mode, the CS-DON value of each non-PACSI NAL unit is derived as follows.

   For a non-PACSI NAL unit in a single NAL unit packet, the following applies.

      If the previous PACSI NAL unit is contained in a single NAL unit packet, the CS-DON value of the NAL unit is calculated as above;

      otherwise (the previous PACSI NAL unit is contained in an STAP-A packet), the CS-DON value of the NAL unit is calculated as above, with DONC_prev_PACSI being replaced by the CS-DON value of the previous non-PACSI NAL unit in decoding order (i.e., the CS-DON value of the last NAL unit of the STAP-A packet).

   For a non-PACSI NAL unit in an STAP-A packet, the following applies.

      If the non-PACSI NAL unit is the first non-PACSI NAL unit in the STAP-A packet, the CS-DON value of the NAL unit is equal to DONC of the PACSI NAL unit in the STAP-A packet;

      otherwise (the non-PACSI NAL unit is not the first non-PACSI NAL unit in the STAP-A packet), the CS-DON value of the NAL unit is equal to: (the CS-DON value of the previous non-PACSI NAL unit in decoding order + 1) % 65536, wherein "%" is the modulo operation.

   For a non-PACSI NAL unit in a number of FU-A packets, the CS-DON value of the NAL unit is calculated the same way as when the single NAL unit session packetization mode is in use, with SN1 being the SN value of the first FU-A packet.

   For a non-PACSI NAL unit in an NI-MTAP packet, the CS-DON value is equal to the value of the DON field of the non-interleaved multi-time aggregation unit.

When the I-C MST packetization mode is in use, the DON values derived according to [RFC6184] for all the NAL units in each of the RTP sessions MUST indicate CS-DON values.

5. Packetization Rules
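The CS-DON derivation of Section 4.11.1 for the single NAL unit and STAP-A cases can be sketched in Python as follows. This is a minimal illustration, not a normative algorithm. It assumes that SN_diff is meant to be the forward distance, modulo 65536, from the packet carrying the previous PACSI NAL unit to the packet carrying the current NAL unit, so that a NAL unit sent in the packet right after its PACSI NAL unit receives CS-DON equal to DONC. All function names are hypothetical.

```python
SEQ_MOD = 65536  # RTP sequence numbers and CS-DON values are 16-bit


def sn_diff(sn_current: int, sn_prev_pacsi: int) -> int:
    """Forward distance between two RTP sequence numbers, allowing wraparound.

    Assumption: the two packets are distinct, so the result is never 0
    in practice.
    """
    return (sn_current - sn_prev_pacsi) % SEQ_MOD


def cs_don_single_nal(donc_prev_pacsi: int, sn_current: int, sn_prev_pacsi: int) -> int:
    """CS-DON of a non-PACSI NAL unit carried in a single NAL unit packet."""
    return (donc_prev_pacsi + sn_diff(sn_current, sn_prev_pacsi) - 1) % SEQ_MOD


def cs_don_stap_a(donc: int, count: int) -> list:
    """CS-DON values of the 'count' non-PACSI NAL units of one STAP-A packet.

    The first unit takes DONC of the PACSI NAL unit in the same packet;
    each later unit takes the previous value plus 1, modulo 65536.
    """
    return [(donc + i) % SEQ_MOD for i in range(count)]
```

For a NAL unit carried in FU-A packets, cs_don_single_nal() would be called with the sequence number of the first FU-A packet as sn_current.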
Section 6 of [RFC6184] applies in this memo, with the following additions.

5.1. Packetization Rules for Single-Session Transmission
All receivers MUST support the single NAL unit packetization mode to provide backward compatibility to endpoints supporting only the single NAL unit mode of [RFC6184]. However, the use of single NAL unit packetization mode (packetization-mode equal to 0) SHOULD be avoided whenever possible, because encapsulating NAL units of small sizes in their own packets (e.g., small NAL units containing parameter sets, prefix NAL units, or SEI messages) is less efficient due to the packet header overhead.

All receivers MUST support the non-interleaved mode.

   Informative note: The non-interleaved mode of [RFC6184] does allow an application to encapsulate a single NAL unit in a single RTP packet. Historically, the single NAL unit mode has been included in [RFC6184] only for compatibility with ITU-T Rec. H.241 Annex A [H.241]. There is no point in carrying this historic ballast towards a new application space such as the one provided with SVC. The implementation complexity increase for supporting the additional mechanisms of the non-interleaved mode (namely, STAP-A and FU-A) is minor, whereas the benefits are significant. As a result, the support of STAP-A and FU-A is required. Additionally, support for two of the three NAL unit types defined in this memo, namely, empty NAL units and NI-MTAP is needed, as specified in Section 4.5.1.

A NAL unit of small size SHOULD be encapsulated in an aggregation packet together with one or more other NAL units. For example, non-VCL NAL units such as access unit delimiters, parameter sets, or SEI NAL units are typically small.

A prefix NAL unit and the NAL unit with which it is associated, and which follows the prefix NAL unit in decoding order, SHOULD be included in the same aggregation packet whenever an aggregation packet is used for the associated NAL unit, unless this would violate session MTU constraints or if fragmentation units are used for the associated NAL unit.

   Informative note: Although the prefix NAL unit is ignored by an H.264/AVC decoder, it is necessary in the SVC decoding process. Given the small size of the prefix NAL unit, it is best if it is transported in the same RTP packet as its associated NAL unit.

When only an H.264/AVC compatible subset of the SVC base layer is transmitted in an RTP session, the subset MUST be encapsulated according to [RFC6184]. This way, an [RFC6184] receiver will be able to receive the H.264/AVC compatible bitstream subset.

When a set of layers including one or more SVC enhancement layers is transmitted in an RTP session, the set SHOULD be carried in one RTP stream that SHOULD be encapsulated according to this memo.

5.2. Packetization Rules for Multi-Session Transmission
When MST is used, the packetization rules specified in Section 5.1 still apply. In addition, the following packetization rules MUST be followed, to ensure that decoding order of NAL units carried in the sessions can be correctly recovered for each of the MST packetization modes using the de-packetization process specified in Section 6.2.

The NI-T and NI-TC modes both use timestamps to recover the decoding order. In order to be able to do so, it is necessary for the RTP packet stream to contain data for all sampling instances of a given RTP session in all enhancement RTP sessions that depend on the given RTP session. The NI-C and I-C modes do not have this limitation, and use the CS-DON values as a means to explicitly indicate decoding order, either directly coded in PACSI NAL units, or inferred from them using the packetization rules. It is noted that the NI-TC mode offers both alternatives and it is up to the receiver to select which one to use.

5.2.1. NI-T/NI-TC Packetization Rules
When using the NI-T mode and a PACSI NAL unit is present, the T bit MUST be equal to 0, i.e., the DONC field MUST NOT be present.

When using the NI-T mode, the optional parameters sprop-mst-remux-buf-size, sprop-remux-buf-req, remux-buf-cap, sprop-remux-init-buf-time, and sprop-mst-max-don-diff MUST NOT be present.

When the NI-T or NI-TC MST mode is in use, the following applies. If one or more NAL units of an access unit of sampling time instance t is present in RTP session A, then one or more NAL units of the same access unit MUST be present in any enhancement RTP session that depends on RTP session A.

   Informative note: The mapping between RTP and NTP format timestamps is conveyed in RTCP SR packets. In addition, the mechanisms for faster media timestamp synchronization discussed in [RFC6051] may be used to speed up the acquisition of the RTP-to-wall-clock mapping.

   Informative note: The rule above may require the insertion of NAL units, typically when temporal scalability is used, i.e., an enhancement RTP session does not contain any NAL units for an access unit with a particular NTP timestamp (media timestamp), which, however, is present in a lower enhancement RTP session or the base RTP session. There are two ways to insert additional NAL units in order to satisfy this rule:

   - One option for adding additional NAL units is to use empty NAL units (defined in Section 4.10), which can be used by the process described in Section 6.2.1 for the access unit reordering process.

   - Additional NAL units may also be added by the encoder itself, for example, by transmitting coded data that simply instruct the decoder to repeat the previous picture. This option, however, may be difficult to use with pre-encoded content.
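The presence rule for the NI-T and NI-TC modes can be checked mechanically. The Python sketch below finds, for each enhancement RTP session, the sampling instants that appear in a session it depends on but not in the enhancement session itself; these are the instants at which a NAL unit (for example, an empty NAL unit) would have to be inserted. The data layout (session names mapped to sets of sampling instants, and a dependency map) and the function name are assumptions made for this example only.

```python
from typing import Dict, List, Set


def missing_sampling_instants(
    sessions: Dict[str, Set[int]],
    depends_on: Dict[str, List[str]],
) -> Dict[str, List[int]]:
    """Return, per enhancement session, the sampling instants that need insertion.

    sessions:   session name -> sampling instants for which it carries NAL units
    depends_on: enhancement session name -> all sessions it depends on
    """
    missing = {}
    for enhancement, lower_sessions in depends_on.items():
        required = set()
        for lower in lower_sessions:
            required |= sessions[lower]
        gap = required - sessions[enhancement]
        if gap:
            missing[enhancement] = sorted(gap)
    return missing
```

For instance, with a base session A carrying instants 0 to 3 and an enhancement session B carrying only 0 and 2 (a typical temporal scalability layout), the function reports that B needs inserted NAL units at instants 1 and 3.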
If a packet must be inserted in order to satisfy the above rule, e.g., in case of a MANE generating multiple RTP streams out of a single RTP stream, the inserted packet must have an RTP timestamp that maps to the same wall-clock time (in NTP format) as the RTP timestamp of any packet of the access unit present in any lower enhancement RTP session or the base RTP session. This is easy to accomplish if the NAL unit or the packet can be inserted at the time of the RTP stream generation, since the media timestamp (NTP timestamp) must be the same for the inserted packet and the packet of the corresponding access unit. If there is no knowledge of the media time at RTP stream generation or if the RTP streams are not generated at the same instant, this can also be applied later in the transmission process. In this case the NTP timestamp of the inserted packet can be calculated as follows.

Assume that a packet A2 of an access unit with RTP timestamp TS_A2 is present in base RTP session A, and that no packet of that access unit is present in enhancement RTP session B, as shown in Figure 5. Thus, a packet B2 must be inserted into session B following the rule above. The most recent RTCP sender report in session A carries NTP timestamp NTP_A and the RTP timestamp TS_A. The sender report in session B whose NTP timestamp is lower than NTP_A carries NTP timestamp NTP_B and RTP timestamp TS_B.

RTP session B:..B0........B1........(B2)......................
RTCP session B:.....SR(NTP_B,TS_B).............................
RTP session A:..A0........A1........A2........................
RTCP session A:..................SR(NTP_A,TS_A)................
-----------------|--x------|-----x---|------------------------>
                                                       NTP time
--------------------+<---------->+<->+------------------------>
                          t1      t2             RTP TS(B) time

Figure 5. Example calculation of RTP timestamp for packet insertion in an enhancement layer RTP session

The vertical bars ("|") in the NTP time line in the figure above indicate that access unit data is present in at least one of the sessions. The "x" marks indicate the times of the sender reports. The RTP timestamp time line for session B, shown right below the NTP time line, indicates two time segments, t1 and t2.
t1 is the time difference between the sender reports of the two sessions, expressed in RTP timestamp clock ticks, and t2 is the time difference from the session A sender report to the A2 packet, again expressed in RTP timestamp clock ticks. The sum of these differences is added to the RTP timestamp of the sender report from session B in order to derive the correct RTP timestamp for the inserted packet B2. In other words:

TS_B2 = TS_B + t1 + t2

Let toRTP() be a function that calculates the RTP time difference (in clock ticks of the used clock) given an NTP timestamp difference, and effRTPdiff() be a function that calculates the effective difference between two timestamps, including wraparounds:

effRTPdiff( ts1, ts2 ):
   if( ts1 >= ts2 ) then
      effRTPdiff := ts1 - ts2
   else
      effRTPdiff := (4294967296 + ts1) - ts2

We have:

t1 = toRTP(NTP_A - NTP_B) and t2 = effRTPdiff(TS_A2, TS_A)

Hence, the RTP timestamp TS_B2 for the inserted packet B2 can be calculated as follows.

TS_B2 = TS_B + toRTP(NTP_A - NTP_B) + effRTPdiff(TS_A2, TS_A)

5.2.2. NI-C/NI-TC Packetization Rules
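The timestamp calculation of Section 5.2.1 for an inserted packet B2 can be sketched in Python as follows. This is an illustration under stated assumptions, not a normative procedure: NTP timestamps are represented as floating-point seconds, the RTP clock rate defaults to 90000 Hz (the rate used for H.264 video in [RFC6184]), effRTPdiff is taken to be the forward difference ts1 - ts2 modulo 2^32 as described by the definition of t2, and the final sum is reduced modulo 2^32 because RTP timestamps are 32-bit values. The function names mirror toRTP() and effRTPdiff() but are otherwise hypothetical.

```python
RTP_TS_MOD = 2 ** 32  # RTP timestamps are 32-bit


def to_rtp(ntp_diff_seconds: float, clock_rate: int) -> int:
    """Convert an NTP time difference (in seconds) to RTP clock ticks."""
    return round(ntp_diff_seconds * clock_rate)


def eff_rtp_diff(ts1: int, ts2: int) -> int:
    """Forward difference ts1 - ts2 between RTP timestamps, allowing wraparound."""
    return (ts1 - ts2) % RTP_TS_MOD


def inserted_timestamp(ts_b: int, ntp_a: float, ntp_b: float,
                       ts_a2: int, ts_a: int, clock_rate: int = 90000) -> int:
    """Compute TS_B2 = TS_B + t1 + t2 for the inserted packet B2."""
    t1 = to_rtp(ntp_a - ntp_b, clock_rate)  # between the two sender reports
    t2 = eff_rtp_diff(ts_a2, ts_a)          # from session A's report to packet A2
    return (ts_b + t1 + t2) % RTP_TS_MOD
```

For example, with TS_B = 1000, sender reports half a second apart, and A2 lying 3000 ticks after session A's sender report, t1 is 45000 ticks and the inserted packet receives timestamp 49000.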
When the NI-C or NI-TC MST mode is in use, the following applies for each of the RTP sessions.

o For each single NAL unit packet containing a non-PACSI NAL unit, the previous packet, if present, MUST have the same RTP timestamp as the single NAL unit packet, and the following applies.

   o If the NALU-time of the non-PACSI NAL unit is not equal to the NALU-time of the previous non-PACSI NAL unit in decoding order, the previous packet MUST contain a PACSI NAL unit containing the DONC field.

o In an STAP-A packet the first NAL unit in the STAP-A packet MUST be a PACSI NAL unit containing the DONC field.

o For an FU-A packet the previous packet MUST have the same RTP timestamp as the FU-A packet, and the following applies.

   o If the FU-A packet is the start of the fragmented NAL unit, the following applies.

      o If the NALU-time of the fragmented NAL unit is not equal to the NALU-time of the previous non-PACSI NAL unit in decoding order, the previous packet MUST contain a PACSI NAL unit containing the DONC field;

      o Otherwise (the NALU-time of the fragmented NAL unit is equal to the NALU-time of the previous non-PACSI NAL unit in decoding order), the previous packet MAY contain a PACSI NAL unit containing the DONC field.

   o Otherwise, if the FU-A packet is the end of the fragmented NAL unit, the following applies.

      o If the next non-PACSI NAL unit in decoding order has NALU-time equal to the NALU-time of the fragmented NAL unit, and is carried in a number of FU-A packets or a single NAL unit packet, the next packet MUST be a single NAL unit packet containing a PACSI NAL unit containing the DONC field.

   o Otherwise (the FU-A packet is neither the start nor the end of the fragmented NAL unit), the previous packet MUST be an FU-A packet.

o For each single NAL unit packet containing a PACSI NAL unit, if present, the PACSI NAL unit MUST contain the DONC field.

o When the optional media type parameter sprop-mst-csdon-always-present is equal to 1, the session packetization mode in use MUST be the non-interleaved mode, and only STAP-A and NI-MTAP packets can be used.

5.2.3. I-C Packetization Rules
When the I-C MST packetization mode is in use, the following applies.

o When a PACSI NAL unit is present, the T bit MUST be equal to 0, i.e., the DONC field is not present, and the Y bit MUST be equal to 0, i.e., the TL0PICIDX and IDRPICID are not present.

5.2.4. Packetization Rules for Non-VCL NAL Units
NAL units that do not directly encode video slices are known in H.264 as non-VCL NAL units. Non-VCL units that are only used by, or only relevant to, enhancement RTP sessions SHOULD be sent in the lowest session to which they are relevant.
Some senders, however, such as those sending pre-encoded data, may be unable to easily determine which non-VCL units are relevant to which session. Thus, non-VCL NAL units MAY, instead, be sent in a session on which the session using these non-VCL NAL units depends (e.g., the base RTP session).

If a non-VCL unit is relevant to more than one RTP session, neither of which depends on the other(s), the NAL unit MAY be sent in another session on which all these sessions depend.

5.2.5. Packetization Rules for Prefix NAL Units
Section 5.1 of this memo applies, with the following addition. If the base layer is sent in a base RTP session using [RFC6184], prefix NAL units MAY be sent in the lowest enhancement RTP session rather than in the base RTP session.