0  Introduction

TR 26.998 (5G Glass-type AR/MR) identified multiple aspects of normative work to support "5G/AR Real-time Communication" (clause 8.4). It identified normative work needed to support delivery of immersive media via RTP for IMS-based and WebRTC-based conversational services. To support XR split rendering as described in clause 8.6 of TR 26.998, RTP is also needed to transport immersive media and metadata between the edge and the device.
To improve support for the above XR services and enablers, it is necessary to configure RTP with specific settings and features that enable immersive experiences. Further improvements in performance and QoE over the 5G system can be achieved by specifying RTP configurations that are integrated and optimized for the 5G system, and leverage cross-layer optimizations used by other 3GPP specifications.
As these RTP configurations will be specified for use by multiple services, service enablers, and potentially application developers, they should not introduce unnecessary complexity that would discourage commercial deployment. Therefore, technologies specified here should be commercially relevant and should not add implementation or interoperability complexity without clearly demonstrating performance gains or new relevant functionality.

1  Scope

The present document focuses on RTP (RFC 3550) over UDP (RFC 768) for eXtended Reality in 5G.
RTP Header Extensions and RTCP Feedback Reporting are introduced for real-time immersive media and associated metadata for use in 5G Systems.

2  References

TR 21.905: "Vocabulary for 3GPP Specifications".
ITU-T Rec H.264 (08/2021): "Advanced video coding for generic audiovisual services" | ISO/IEC 14496-10:2022: "Information technology - Coding of audio-visual objects - Part 10: Advanced Video Coding".
ITU-T Rec H.265 (08/2021): "High efficiency video coding" | ISO/IEC 23008-2:2023: "High Efficiency Coding and Media Delivery in Heterogeneous Environments - Part 2: High Efficiency Video Coding".
RFC 3550 (2003): "RTP: A Transport Protocol for Real-Time Applications", H. Schulzrinne, S. Casner, R. Frederick, V. Jacobson.
RFC 6184 (2011): "RTP Payload Format for H.264 Video", Y.-K. Wang, R. Even, T. Kristensen, R. Jesup.
RFC 7798 (2016): "RTP Payload Format for High Efficiency Video Coding (HEVC)", Y.-K. Wang, Y. Sanchez, T. Schierl, S. Wenger, M. M. Hannuksela.
TR 26.928: "Extended Reality (XR) in 5G".
TR 26.998: "Support of 5G glass-type Augmented Reality / Mixed Reality (AR/MR) devices".
RFC 768 (1980): "User Datagram Protocol", J. Postel.
RFC 5761 (2010): "Multiplexing RTP Data and Control Packets on a Single Port", C. Perkins, M. Westerlund.
RFC 8285 (2017): "A General Mechanism for RTP Header Extensions", D. Singer, H. Desineni, R. Even.
TS 23.501: "System architecture for the 5G System (5GS)".
RFC 5905 (2010): "Network Time Protocol Version 4: Protocol and Algorithms Specification", D. Mills, J. Martin, J. Burbank, W. Kasch.
IEEE 1588-2019: "IEEE Standard for a Precision Clock Synchronization Protocol for Networked Measurement and Control Systems", June 2020.
RFC 4574 (2006): "The Session Description Protocol (SDP) Label Attribute", O. Levin, G. Camarillo.
RFC 3611 (2003): "RTP Control Protocol Extended Reports (RTCP XR)", T. Friedman, R. Caceres, A. Clark.
TS 26.119: "Media Capabilities for Augmented Reality".
RFC 7656 (2015): "A Taxonomy of Semantics and Mechanisms for Real-Time Transport Protocol (RTP) Sources", J. Lennox, K. Gross, S. Nandakumar, G. Salgueiro, B. Burman.
RFC 5888 (2010): "The Session Description Protocol (SDP) Grouping Framework", G. Camarillo, H. Schulzrinne.
ISO/IEC 60559:2020: "Floating-point arithmetic".

3  Definitions of terms, symbols and abbreviations

3.1  Terms

For the purposes of the present document, the terms given in TR 21.905 and the following apply. A term defined in the present document takes precedence over the definition of the same term, if any, in TR 21.905.
Age of content:
The time duration between the moment the content is created and the time it is presented.
Time when the pose was estimated.
Data Burst:
A data burst is a set of multiple PDUs generated and sent by the application such that there is an idle period between two data bursts. A Data Burst can be composed of one or multiple PDU Sets.
Multimedia Session:
An association among a group of participants engaged in the communication via one or more RTP sessions, as defined in Section 2.2.4 of RFC 7656.
Orientation quaternion:
Quaternion used to represent the orientation of an object.
PDU Set:
One or more PDUs carrying the payload of one unit of information generated at the application level (e.g. frame(s), video slice(s), metadata, etc.).
PDU Set marking:
Marking the PDUs carrying a payload with the PDU Set Information.
Rendered pose:
An XR pose sent from a server to a client that was used for rendering at the server.
Roundtrip interaction delay:
The sum of the age of content and the user interaction delay.
Scene Update Time:
Time when the scene manager starts processing.
Time of completing a rendering.
Split rendering server:
A server that performs remote rendering.
Time of starting a rendering.
User interaction delay:
The time duration between the moment at which a user action is initiated and the time such an action is taken into account by the content creation engine.
XR Pose:
A position and orientation in space relative to an XR Space.
XR Service:
A service supporting an XR use case as defined in clause 5 of TR 26.928.
XR Space:
A frame of reference in which an application chooses to track the real world. An XR Space provides a relation of the user's physical environment with other tracked entities.
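
As an informal illustration of how the delay terms defined above combine, the following Python sketch computes the roundtrip interaction delay from four timestamps. The function name, parameter names, and the use of millisecond timestamps on a common clock are assumptions made for this example only; they are not defined by the present document.

```python
def roundtrip_interaction_delay(action_initiated_ms, action_applied_ms,
                                content_created_ms, content_presented_ms):
    """Return (user_interaction_delay, age_of_content, roundtrip) in ms.

    All four timestamps are hypothetical inputs assumed to share one clock.
    """
    # User interaction delay: from the user action being initiated to the
    # action being taken into account by the content creation engine.
    user_interaction_delay = action_applied_ms - action_initiated_ms
    # Age of content: from content creation to its presentation.
    age_of_content = content_presented_ms - content_created_ms
    if user_interaction_delay < 0 or age_of_content < 0:
        raise ValueError("timestamps are not in chronological order")
    # Roundtrip interaction delay: the sum of the two terms above.
    return (user_interaction_delay, age_of_content,
            user_interaction_delay + age_of_content)


if __name__ == "__main__":
    print(roundtrip_interaction_delay(0, 15, 20, 50))
```

In this sketch the two component delays are returned alongside their sum so that each contribution can be inspected separately.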

3.2  Symbols

For the purposes of the present document, the following symbols apply:
IP header overhead
Number of RTP packets
RTP payload data size
RTP header overhead, including any RTP header extensions
RTP packet maximum SDU size
UDP header overhead
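
As a rough illustration of how the quantities listed above relate to one another, the following Python sketch estimates the number of RTP packets and the total size on the wire for one unit of application data. The function and parameter names are hypothetical. The default header sizes (20-byte IPv4 header without options, 8-byte UDP header, 12-byte fixed RTP header without header extensions) and the reading of the maximum SDU size as a bound on the whole RTP packet (header plus payload) are assumptions for this example; the formulas of the present document take precedence.

```python
def estimate_packetization(payload_bytes, max_rtp_sdu_bytes,
                           rtp_header_bytes=12, udp_header_bytes=8,
                           ip_header_bytes=20):
    """Return (number_of_rtp_packets, total_bytes_on_the_wire).

    Assumes max_rtp_sdu_bytes bounds the RTP header plus RTP payload.
    """
    per_packet_payload = max_rtp_sdu_bytes - rtp_header_bytes
    if payload_bytes < 0 or per_packet_payload <= 0:
        raise ValueError("invalid payload or SDU size")
    # Ceiling division: packets needed to carry the whole payload.
    n_packets = (payload_bytes + per_packet_payload - 1) // per_packet_payload
    # Every packet carries its own RTP, UDP and IP header.
    per_packet_overhead = rtp_header_bytes + udp_header_bytes + ip_header_bytes
    total_bytes = payload_bytes + n_packets * per_packet_overhead
    return n_packets, total_bytes


if __name__ == "__main__":
    print(estimate_packetization(3000, 1212))
```

If RTP header extensions are in use, their size would be added to `rtp_header_bytes`, which both reduces the payload carried per packet and increases the per-packet overhead.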

3.3  Abbreviations

For the purposes of the present document, the abbreviations given in TR 21.905 and the following apply. An abbreviation defined in the present document takes precedence over the definition of the same abbreviation, if any, in TR 21.905.
AP  Aggregation Packet
AVC  Advanced Video Coding
BLA  Broken Link Access
CRA  Clean Random Access
DoF  Degrees of Freedom
FU  Fragmentation Unit
HE  (RTP) Header Extension
HEVC  High Efficiency Video Coding
IDR  Instantaneous Decoder Refresh
IRAP  Intra Random Access Picture
NAL  Network Abstraction Layer
NTP  Network Time Protocol
OS  Operating System
PCI  Payload Content Information
PPS  Picture Parameter Set
PSI  PDU Set Importance
PTP  Precision Time Protocol
RADL  Random Access Decodable Leading
RASL  Random Access Skipped Leading
RTCP  RTP Control Protocol
RTCP XR  RTCP eXtended Report
SPS  Sequence Parameter Set
SRS  Split Rendering Server
SRTP  Secure RTP
TID  Temporal Identifier
UPF  User Plane Function
UDP  User Datagram Protocol
VCL  Video Coding Layer
VPS  Video Parameter Set
XR  eXtended Reality

