This clause documents and clusters potential standardisation areas in the context of this Technical Report.
As documented in clause 4.5, XR-centric devices are the key enablers for XR services. Key aspects for XR devices are:
- Rendering-centric device architectures using sophisticated GPU functionalities, see clause 4.4.
- Support for tracking, in particular inside-out tracking in the device.
- Heavy power constraints, at least for certain form factors.
- Support for multiple decoding formats and parallel decoding of media streams, following the challenges documented in clause 4.5.2.
In addition, device characteristics can differ considerably. Hence, the device types developed in clause 4.8 serve as a starting point. A more formal definition of XR device types is considered useful.
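As one way such a formal definition could be expressed, the following is a minimal sketch with hypothetical attribute names; the tethering modes and the XR5G-A5 transmit power range follow the values cited later in this clause (Table 4.3-1), while the structure and remaining values are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class XRDeviceType:
    """Hypothetical formal definition of an XR device type (cf. Table 4.3-1)."""
    type_id: str                 # e.g. "XR5G-A5"
    tethering: str               # "standalone", "wired" or "wireless"
    inside_out_tracking: bool    # device-side tracking support
    parallel_decoders: int       # assumed number of parallel media decoders
    tx_power_w: Optional[Tuple[float, float]]  # typical max transmit power range

# Illustrative instances; only the XR5G-A5 power range is taken from
# Table 4.3-1 as cited later in this clause, other values are assumed.
XR5G_A2 = XRDeviceType("XR5G-A2", "wireless", True, 2, None)
XR5G_A5 = XRDeviceType("XR5G-A5", "standalone", True, 4, (0.5, 2.0))
```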
In any work considered in 3GPP, end-points compatible with Khronos-based graphics and XR functionalities should be considered. A framework for interfacing device-centric XR functionalities with 5G System and radio functionalities is a relevant standardisation effort.
Streaming of XR and 6DoF content is considered in several use cases evaluated in this Technical Report. With the establishment of 5G Media Streaming in Release-16 in TS 26.501 and the stage-3 specifications, extensions of 5G Media Streaming to support XR experiences are useful. MPEG develops several new formats and codecs specifically addressing 3D and XR, but proprietary formats also exist that make use of regular, hardware-supported, standardized media codecs and rendering functionalities. All of them have in common that they rely on existing and emerging device architectures that make use of existing video codecs and GPU rendering. In addition, the use of multiple video codecs in parallel is commonly applied. XR/6DoF streaming is based on CDNs and HTTP delivery; however, new functionalities are required.
Extensions to 5G Media Streaming may be made in order to support the delivery of different XR/6DoF media in 5G Systems. Relevant aspects are:
- Additional media decoding capabilities, including higher profiles and levels, the use of multiple decoders in parallel to interface with rendering architectures, as well as more flexible formats following clause 4.5.2.
- Support for more flexible delivery protocols allowing parallel download of different objects and parts of objects (see the sketch after this list).
- Viewport-dependent streaming as documented in clause 6.2.3.
- Potentially new 5QIs and radio capabilities to support higher-bitrate streaming.
- Biometrics and emotion metadata definition and transport.
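As an illustration of the parallel-download aspect above, the following is a minimal sketch assuming a hypothetical CDN URL, object names and byte ranges; it fetches parts of several objects concurrently over HTTP Range requests, in the spirit of CDN-based XR/6DoF streaming. It is not a normative delivery protocol.

```python
import concurrent.futures
import urllib.request

# Hypothetical example: URL, object names and byte ranges are placeholders,
# not part of any 3GPP or MPEG specification.
BASE_URL = "https://cdn.example.com/xr-scene"
OBJECTS = [
    ("mesh.bin", (0, 65535)),         # first 64 KiB of a mesh object
    ("texture_0.hevc", (0, 131071)),  # part of a texture video stream
    ("audio.mp4", (0, 32767)),        # audio object segment
]

def fetch_range(name, byte_range):
    """Download one byte range of one object via an HTTP Range request."""
    start, end = byte_range
    req = urllib.request.Request(
        f"{BASE_URL}/{name}",
        headers={"Range": f"bytes={start}-{end}"},
    )
    with urllib.request.urlopen(req) as resp:
        return name, resp.read()

# Parallel download of different objects and parts of objects.
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    for name, data in pool.map(lambda a: fetch_range(*a), OBJECTS):
        print(f"{name}: {len(data)} bytes received")
```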
Split rendering is a promising technology to support online gaming on power- and resource-constrained devices. Split rendering requires the use of edge computing, as the pose-to-render-to-photon latency is expected to be below 50 ms. Rendering of rasterized frame buffers in the network makes it possible to support XR devices with existing codecs using 5G System and radio capabilities. Relevant aspects for single-buffer split rendering include:
- A simple XR split rendering application framework where a single frame buffer per media per eye is shared and final pose correction is done in the XR device.
- 2D video encoders and decoders capable of encoding and decoding 2K per eye at 90 fps, as well as typical 2D rasterized graphics-centric formats.
- Integration of audio into split rendering architectures.
- Formats and protocols for XR pose information delivery, and possibly other metadata, in the uplink at sufficiently high frequency (see the sketch after this list).
- Content Delivery protocols that support split rendering.
- 5QIs and other 5GS/Radio capabilities that support split rendering.
- Edge computing discovery and capability discovery based on work in SA2 and SA6 (see clause 4.3.6).
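To make the uplink pose aspect concrete, below is a minimal sketch assuming a hypothetical binary pose message and UDP transport to an edge renderer; the field layout, address, and the 90 Hz rate (matching the 90 fps render loop above) are illustrative assumptions, not a specified format.

```python
import socket
import struct
import time

# Hypothetical pose message layout (not a standardized format):
# timestamp (double) + position x/y/z (floats) + orientation quaternion
# x/y/z/w (floats) -- 36 bytes per message.
POSE_FORMAT = "<d3f4f"

EDGE_RENDERER = ("192.0.2.10", 50000)  # placeholder edge address
SEND_RATE_HZ = 90                      # matches a 90 fps render loop

def pack_pose(pos, quat):
    return struct.pack(POSE_FORMAT, time.time(), *pos, *quat)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for _ in range(90):  # one second of pose updates
    # In a real device the pose would come from inside-out tracking.
    msg = pack_pose(pos=(0.0, 1.6, 0.0), quat=(0.0, 0.0, 0.0, 1.0))
    sock.sendto(msg, EDGE_RENDERER)
    time.sleep(1.0 / SEND_RATE_HZ)
```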
XR conversational applications within real or computer-generated virtual environments are considered in several use cases evaluated in this Technical Report. Some work is already ongoing in IVAS and ITT4RT; however, potential additional normative work includes:
- Study of the mapping of XR conference applications to the 5G system architecture, including Media Streaming and MTSI.
- Support for media processing in the network (e.g. NBMP), for example foreground/background segmentation of the user capture, or replacement of the user with a photo-realistic representation of their face.
- 6DoF metadata framework and a 6DoF-capable renderer for immersive voice and audio (see the sketch after this list).
- Support of formats and transport for static/dynamic 3D objects for real-time sharing.
- Transport of collected data from multiple sensors (e.g. for spatial mapping).
- Format for storing and sharing spatial information (e.g. indoor spatial data).
- Content Delivery protocols that support XR conversational cases.
- 5QIs and other 5GS/Radio capabilities that support XR conversational cases.
- Edge computing discovery and capability discovery based on work in SA2 and SA6 (see clause 4.3.6).
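As one possible shape for such 6DoF metadata, the following is a minimal sketch with hypothetical field names; it describes the pose of an audio source relative to a shared scene origin so that a 6DoF-capable renderer could spatialize it. The actual metadata framework is for the normative work to define.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class AudioSourcePose:
    """Hypothetical 6DoF metadata for one immersive audio source."""
    source_id: str
    position: tuple      # (x, y, z) in metres, scene coordinates
    orientation: tuple   # quaternion (x, y, z, w)
    directivity: str     # e.g. "omni" or a directivity pattern id
    timestamp_us: int    # capture time in microseconds

# Example: a talker placed 2 m in front of the scene origin at head height.
talker = AudioSourcePose(
    source_id="participant-1/voice",
    position=(0.0, 1.6, -2.0),
    orientation=(0.0, 0.0, 0.0, 1.0),
    directivity="omni",
    timestamp_us=1_000_000,
)
print(json.dumps(asdict(talker)))
```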
Augmented reality was discussed in detail in this report. Glass-type AR/MR UEs with standalone capability, i.e. that can be connected directly to 3GPP networks, are interesting emerging devices. Such a type is classified as XR5G-A5 in Table 4.3-1. However, it would also be necessary to consider situations where XR5G-A5 UEs have to fall back to XR5G-A1 or XR5G-A2, i.e. to the wired or wirelessly tethered modes, e.g. from NR Uu to 5G sidelink or IEEE 802.11ad/ay. Furthermore, an evolution path for devices under the categories XR5G-A3 and XR5G-A4 should be considered. Further studies are encouraged on, among others:
- Basic use cases: a set of use cases relevant for XR5G-A5 can be selected from Table 5.1-1. The preferred cases are those capable of delivering experiences that previous or existing services could not support, e.g. real-time sharing or streaming of 3D objects. They should also be realizable within the constraints of glasses, which are tighter than those of phones.
- Media formats and profiles: for the selected use cases, available formats and profiles of the media/data can be discussed. Sharing of XR and 3D data is of interest for personal and enterprise use cases, as documented in scenario 5.2. Properly generated XR data can be used in AR applications on smartphone devices as well as on AR glasses. Exchange formats for AR-based applications are relevant, for example for services such as MMS.
- Transport technologies and protocols: in case the selected use cases include relocation or delivery of 3D or XR media/data over 3GPP networks, combinations of transport protocols, radio access and core network technologies that support the use cases at the relevant QoS can be discussed. If existing technologies and protocols cannot serve the use cases properly, such gaps can be taken into account in the consideration of normative work.
- Form factor-related issues: in Table 4.3-1, the typical maximum transmit power of XR5G-A5 is 0.5-2 W, while phone types transmit at 3-5 W. However, if XR5G-A5 is implemented in the form factor of typical glasses, i.e. smaller than goggles or HMDs and with a weight below 100 g, its cellular modems and antennas are located near the face. In this case, XR5G-A5 UEs can have more constraints on transmit power, and it would be necessary to develop solutions to overcome them, e.g. considering situations where XR5G-A5 UEs have to fall back to XR5G-A2, from NR Uu to 5G sidelink or IEEE 802.11ad/ay (see the sketch after this list). Furthermore, an evolution path for devices under the categories XR5G-A3 (3-7 W) and XR5G-A4 (2-4 W) should be considered.
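Purely as an illustration of such a fallback policy, the following is a minimal sketch; the 2 W bound reflects the XR5G-A5 range from Table 4.3-1 cited above, while the decision logic and threshold comparison are invented for illustration and are not specified behaviour.

```python
# Hypothetical fallback policy for a glass-type AR UE (XR5G-A5).
# Thresholds and logic are illustrative assumptions, not specified behaviour.

MAX_TX_POWER_W = 2.0   # upper end of the XR5G-A5 range in Table 4.3-1

def select_connectivity(required_tx_power_w: float, tether_available: bool) -> str:
    """Pick a connectivity mode given the power the NR Uu link would need."""
    if required_tx_power_w <= MAX_TX_POWER_W:
        return "NR Uu (standalone, XR5G-A5)"
    if tether_available:
        # Fall back to a tethered mode (XR5G-A2), e.g. 5G sidelink
        # or IEEE 802.11ad/ay to a companion device.
        return "tethered (XR5G-A2): sidelink or 802.11ad/ay"
    return "degrade service (reduce bitrate / viewport quality)"

print(select_connectivity(required_tx_power_w=3.0, tether_available=True))
```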
As identified in the course of the development of this report, there is significant interest in 3GPP radio and system groups in the traffic characteristics of XR services. Collecting realistic traffic characteristics for typical XR services should therefore be a priority work item for 3GPP. Of specific interest to other groups in 3GPP is a characterization of the traffic of an XR service in the following domains:
- Downlink data rate ranges
- Uplink data rate ranges
- Maximum packet delay budget in uplink and downlink
- Maximum Packet Error Rate
- Maximum Round Trip Time
- Traffic characteristics at IP level in uplink and downlink, in terms of packet sizes and temporal characteristics (see the sketch after this list)
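A minimal sketch of how such IP-level characteristics could be derived from a packet trace follows; the trace format (timestamped packet sizes) and the reported statistics are assumptions for illustration, not an agreed methodology.

```python
import statistics

# Hypothetical downlink trace: (arrival_time_s, packet_size_bytes) tuples,
# e.g. exported from a packet capture of an XR streaming session.
trace = [(0.000, 1400), (0.002, 1400), (0.011, 900),
         (0.013, 1400), (0.022, 1200), (0.024, 600)]

sizes = [size for _, size in trace]
gaps = [t1 - t0 for (t0, _), (t1, _) in zip(trace, trace[1:])]
duration = trace[-1][0] - trace[0][0]

print(f"mean packet size  : {statistics.mean(sizes):.0f} bytes")
print(f"max packet size   : {max(sizes)} bytes")
print(f"mean inter-arrival: {statistics.mean(gaps) * 1000:.1f} ms")
# Coarse data-rate estimate: total bytes over the observed interval.
print(f"data rate         : {sum(sizes) * 8 / duration / 1e6:.2f} Mbit/s")
```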
Such characteristics are expected to be available for at least the following applications:
- Viewport-independent 6DoF streaming
- Viewport-dependent 6DoF streaming
- Single-buffer split rendering for online cloud gaming
- XR conversational services
The Technical Report on Typical Traffic Characteristics in TR 26.925 should be updated with any findings in order to support the 3GPP groups.
Social XR is used as an umbrella term for combining, delivering, decoding and rendering XR objects (avatars, conversational streams, sound sources, live streamed content, etc.) originating from different sources into a single user experience. Social XR may be VR-centric, but may also apply to AR and MR.
Social XR is expected to integrate multiple XR functionalities, such as 6DoF streaming with XR conversational services. Some normative work may include:
- Social XR components: merging of avatar and conversational streams with the original media (e.g. overlays).
- Parallel decoding of multiple independently generated sources.
- Proper annotation and metadata for each object to place it into the scene (see the sketch below).
- Description and rendering of multiple objects into a Social XR experience.
Details are FFS.
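As an illustration of per-object placement annotation, below is a minimal sketch with hypothetical field names; it tags each Social XR object with the pose and scale needed to compose it into a shared scene. It is not a proposed metadata format.

```python
from dataclasses import dataclass

@dataclass
class SceneObjectAnnotation:
    """Hypothetical placement metadata for one Social XR object."""
    object_id: str       # e.g. "user-2/avatar" or "stream/live-concert"
    source_uri: str      # where the object's media originates (placeholder)
    position: tuple      # (x, y, z) in shared scene coordinates, metres
    orientation: tuple   # quaternion (x, y, z, w)
    scale: float         # uniform scale factor in the scene

scene = [
    SceneObjectAnnotation("user-2/avatar", "rtp://peer-2/av",
                          (1.0, 0.0, -1.5), (0.0, 0.0, 0.0, 1.0), 1.0),
    SceneObjectAnnotation("stream/live-concert", "https://cdn.example.com/live",
                          (0.0, 2.0, -5.0), (0.0, 0.0, 0.0, 1.0), 3.0),
]
for obj in scene:
    print(f"place {obj.object_id} at {obj.position} (scale {obj.scale})")
```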
Edge/cloud processing and rendering is a promising technology to support online gaming on power- and resource-constrained devices. Relevant aspects for generalized cloud/split rendering include:
- A generalized XR cloud and split rendering application framework based on a scene description.
- Support for 3D formats in split and cloud rendering approaches.
- Formats and protocols for XR pose information delivery, and possibly other metadata, in the uplink at sufficiently high frequency.
- Content Delivery protocols that support generalized split/cloud rendering.
- Distribution of processing across different resources in the 5G system network, in the application provider domain (cloud) and in the XR device.
- Support for establishing processing workflows across distributed resources and managing them (see the sketch after this list).
- 5QIs and other 5GS/Radio capabilities that support generalized split/cloud rendering, in coordination with other groups.
- Edge computing discovery and capability discovery based on work in SA2 and SA6 (see clause 4.3.6).
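As an illustration of a distributed processing workflow, below is a minimal sketch, loosely inspired by NBMP-style workflow descriptions but using invented field names; it declares rendering tasks with placement hints across device, edge and cloud. The actual workflow description is for any normative work to define.

```python
# Hypothetical workflow description for generalized split/cloud rendering.
# Task names, placement hints and connections are illustrative only,
# loosely inspired by NBMP-style workflow descriptions.
workflow = {
    "workflow_id": "xr-split-render-001",
    "tasks": [
        {"name": "scene-management", "placement": "cloud",
         "input": ["scene-description"]},
        {"name": "3d-render", "placement": "edge",
         "input": ["scene-management", "xr-pose"]},
        {"name": "encode-2d", "placement": "edge",
         "input": ["3d-render"], "codec": "hevc", "fps": 90},
        {"name": "decode-and-reproject", "placement": "device",
         "input": ["encode-2d", "xr-pose"]},
    ],
}

# A workflow manager would map each task to actual resources; here we
# simply check that every placement hint names one of the known tiers.
TIERS = {"device", "edge", "cloud"}
assert all(task["placement"] in TIERS for task in workflow["tasks"])
for task in workflow["tasks"]:
    print(f'{task["name"]:24s} -> {task["placement"]}')
```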
It is recommended that this area is studied in more detail to identify key issues.