Based on the use cases, the following formats, codecs and packaging formats are of relevance for cognitive immersive media distribution of AR:
- Scene graph and scene description
- Spatial description
- 2D video formats
- 3D formats such as static and dynamic point clouds or meshes
- 2D video formats with depth
- Audio formats supporting mono, stereo, and/or spatial audio
- Several video decoding instances
- Decoding tools for such formats
- Encoding tools for 2D formats
- Low-latency downlink and uplink real-time streaming of the above media
- Uplink streaming of pose information (see the sketch after this list)
- Uplink streaming of media
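As an illustration of the uplink pose information item above, the following is a minimal sketch of a pose sample and its binary serialization in Python. The field names, units, and the fixed binary layout are assumptions made for illustration only; no pose payload format is defined by this document.

```python
# Hypothetical uplink pose sample; field names, units, and the "<Q3f4f"
# layout are illustrative assumptions, not defined by this document.
import struct
import time
from dataclasses import dataclass


@dataclass
class PoseSample:
    """A single 6DoF pose sample captured on the AR device."""
    timestamp_us: int   # capture time in microseconds
    position: tuple     # (x, y, z) in metres, device coordinate frame
    orientation: tuple  # unit quaternion (x, y, z, w)

    def to_bytes(self) -> bytes:
        """Pack the sample into a compact fixed-size binary record."""
        return struct.pack(
            "<Q3f4f",
            self.timestamp_us,
            *self.position,
            *self.orientation,
        )


sample = PoseSample(
    timestamp_us=int(time.time() * 1e6),
    position=(0.10, 1.45, -0.30),
    orientation=(0.0, 0.0, 0.0, 1.0),
)
payload = sample.to_bytes()  # 8 + 12 + 16 = 36 bytes per pose sample
```

A fixed-size record of this kind keeps the per-sample overhead small and predictable, which is helpful for low-latency uplink streaming of pose information.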
In the downlink, this scenario is equivalent to the one in clause 6.3.6, and similar KPIs and QoS aspects apply. For the uplink, the above scenarios relate to the corresponding cases in clause 6 of TR 26.928. In particular, details on uplink streaming of sensor data are for further study.
The following list of potential standardization areas has been collected:
- Similar functionalities as identified in clause 6.3.7 for downlink
- For the uplink, streaming of sensor information to the network
- Low-latency streaming protocols to support latencies in the range of 50 ms to 500 ms, typically using RTP-based real-time streaming
- Simple 2D media formats to be streamed together with matching AR sensor data
- Payload format to be mapped into RTP streams (see the sketch after this list)
- Capability exchange mechanism and relevant signalling
- Protocol stack and content delivery protocol
- Cross-layer design, radio and 5G system optimizations for QoS support
- Spatial description format for downlink and uplink
- Required QoE metrics
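To illustrate the items on RTP payload mapping and low-latency RTP-based streaming, the sketch below prepends an RFC 3550 fixed header to an opaque payload (for example, the pose record sketched earlier). The dynamic payload type value (112), the 32-bit timestamp handling, and the absence of header extensions or CSRC entries are assumptions for illustration; the actual payload formats and their signalling are among the standardization areas listed above.

```python
# Minimal sketch of mapping an opaque payload into an RTP packet.
# Header layout follows RFC 3550; the payload type 112 is an
# illustrative assumption, not a value defined by this document.
import struct


def build_rtp_packet(payload: bytes, seq: int, timestamp: int,
                     ssrc: int, payload_type: int = 112) -> bytes:
    """Prepend a 12-byte RTP fixed header (version 2, no padding,
    no extension, no CSRC, marker bit cleared)."""
    byte0 = (2 << 6)                 # V=2, P=0, X=0, CC=0
    byte1 = payload_type & 0x7F      # M=0, PT=payload_type
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc)
    return header + payload


packet = build_rtp_packet(payload=b"\x00" * 36, seq=1,
                          timestamp=0, ssrc=0x1234ABCD)
```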