TR 26.998, Word version 18.0.0


0  Introduction

Augmented Reality (AR) and Mixed Reality (MR) promise new experiences for immersive media services. The form factors of devices for these services are expected to stay close to those of ordinary glasses, leaving less physical space for required components such as sensors, circuit boards, antennas, cameras, and batteries than is available in a typical smartphone. These physical limitations also constrain the media processing and communication capabilities that AR/MR devices can support, in some cases requiring the devices to offload certain processing functions to a tethered device and/or a server.
This report addresses the integration of such new devices into 5G system networks and identifies potential needs for specifications to support AR glasses and AR/MR experiences in 5G.
The focus of this document is on general system aspects, especially visual rendering on glasses; its treatment of other media types and components (e.g. haptics, GPUs) may be less balanced or less precise.

1  Scope

The present document collects information on glass-type AR/MR devices in the context of 5G radio and network services. The primary scope of this Technical Report is the documentation of the following aspects:
  • Providing formal definitions for the functional structures of AR glasses, including their capabilities and constraints,
  • Documenting core use cases for AR services over 5G and defining relevant processing functions and reference architectures,
  • Identifying media exchange formats and profiles relevant to the core use cases,
  • Identifying necessary content delivery transport protocols and capability exchange mechanisms, as well as suitable 5G system functionalities (including device, edge, and network) and required QoS (including radio access and core network technologies),
  • Identifying key performance indicators and quality of experience factors,
  • Identifying relevant radio and system parameters (required bitrates, latencies, loss rates, range, etc.) to support the identified AR use cases and the required QoE,
  • Providing a detailed overall power analysis for AR media-related processing and communication.

2  References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.
  • References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
  • For a specific reference, subsequent revisions do not apply.
  • For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.
[1]
TR 21.905: "Vocabulary for 3GPP Specifications".
[2]
TR 26.928: "Extended Reality (XR) in 5G"
[3]
Wireless Broadband Alliance, "5G and Wi-Fi RAN Convergence", April 2021.
[4]
Khronos Group, The OpenXR Specification, 1.0, https://www.khronos.org/registry/OpenXR/specs/1.0/html/xrspec.html
[5]
W3C, WebXR Device API, W3C Working Group Draft, https://www.w3.org/TR/webxr/
[6]
ISO/IEC 23090-13:2022 DIS: "Information technology - Coded representation of immersive media - Part 13: Video Decoding Interface for Immersive Media"
[7]
Microsoft Azure Kinect™ DK documentation, https://docs.microsoft.com/en-us/azure/kinect-dk/
[8]
Google ARCore™: Use Depth in your Android app, https://developers.google.com/ar/develop/java/depth/developer-guide
[9]
Microsoft Azure Kinect™ DK documentation: Use Azure Kinect Sensor SDK image transformations, https://docs.microsoft.com/en-us/azure/kinect-dk/use-image-transformation#overview
[10]
Daniel Wagner, Louahab Noui, Adrian Stannard, "Why is making good AR displays so hard?", LinkedIn Blog, August 7, 2019, https://www.linkedin.com/pulse/why-making-good-ar-displays-so-hard-daniel-wagner/
[11]
Daniel Wagner, "MOTION TO PHOTON LATENCY IN MOBILE AR AND VR", Medium Blog, August 20, 2018, https://medium.com/@DAQRI/motion-to-photon-latency-in-mobile-ar-and-vr-99f82c480926
[12]
Yodayoda, "Why loop closure is so important for global mapping", Medium Blog, December 24, 2020, https://medium.com/yodayoda/why-loop-closure-is-so-important-for-global-mapping-34ff136be08f
[13]
TS 22.261: "Service requirements for the 5G system"
[14]
TR 22.873: "Study on evolution of the IP Multimedia Subsystem (IMS) multimedia telephony service"
[15]
TS 26.114: "IP Multimedia Subsystem (IMS); Multimedia telephony; Media handling and interaction"
[16]
TR 38.838: "Study on XR (Extended Reality) evaluations for NR"
[17]
ISO/IEC 23090-2:2021: "Information technology - Coded representation of immersive media - Part 2: Omnidirectional media format"
[18]
ISO/IEC 23090-3:2021: "Information technology - Coded representation of immersive media - Part 3: Versatile video coding"
[19]
ISO/IEC 23090-5:2021: "Information technology - Coded representation of immersive media - Part 5: Visual volumetric video-based coding (V3C) and video-based point cloud compression (V-PCC)"
[20]
ISO/IEC 23090-8:2020: "Information technology - Coded representation of immersive media - Part 8: Network based media processing"
[21]
ETSI GS ISG ARF 003 v1.1.1 (2020-03): "Augmented Reality Framework (ARF) AR framework architecture"
[22]
Khronos Group, The GL Transmission Format (glTF) 2.0 Specification, https://github.com/KhronosGroup/glTF/tree/master/specification/2.0/
[23]
ISO/IEC 23090-14:2022: "Information technology - Coded representation of immersive media - Part 14: Scene Description for MPEG-I Media"
[24]
ISO/IEC 23090-10:2021: "Information technology - Coded representation of immersive media - Part 10: Carriage of Visual Volumetric Video-based Coding Data"
[25]
ISO/IEC 23090-18:2021: "Information technology - Coded representation of immersive media - Part 18: Carriage of Geometry-based Point Cloud Compression Data"
[26]
TS 26.501: "5G Media Streaming (5GMS); General description and architecture"
[27]
H. Chen, Y. Dai, H. Meng, Y. Chen and T. Li, "Understanding the Characteristics of Mobile Augmented Reality Applications," 2018 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), 2018, pp. 128-138.
[28]
S. Kang, H. Choi, "Fire in Your Hands: Understanding Thermal Behavior of Smartphones", The 25th Annual International Conference on Mobile Computing and Networking (MobiCom '19)
[29]
T. Chihara, A. Seo, "Evaluation of physical workload affected by mass and center of mass of head-mounted display", Applied Ergonomics, Volume 68, pp. 204-212, 2018
[30]
Void.
[31]
T. Ebner, O. Schreer, I. Feldmann, P. Kauff, T. v. Unger, "m42921 HHI Point cloud dataset of boxing trainer", MPEG 123rd meeting, Ljubljana, Slovenia
[32]
Void.
[33]
Serhan Gül, Dimitri Podborski, Jangwoo Son, Gurdeep Singh Bhullar, Thomas Buchholz, Thomas Schierl, Cornelius Hellge, "Cloud Rendering-based Volumetric Video Streaming System for Mixed Reality Services", Proceedings of the 11th ACM Multimedia Systems Conference (MMSys'20), June 2020
[34]
Void.
[35]
Void.
[36]
Void.
[37]
S. N. B. Gunkel, H. M. Stokking, M. J. Prins, N. van der Stap, F. B. T. Haar, and O. A. Niamut, "Virtual Reality Conferencing: Multi-user immersive VR experiences on the web", in Proceedings of the 9th ACM Multimedia Systems Conference, June 2018, pp. 498-501, ACM.
[38]
S. Dijkstra-Soudarissanane et al., "Multi-sensor capture and network processing for virtual reality conferencing", Proceedings of the 10th ACM Multimedia Systems Conference, 2019.
[39]
VRTogether, a media project funded by the European Commission as part of the H2020 program, https://vrtogether.eu/, November 2020.
[40]
MPEG131 Press Release: Point Cloud Compression - WG11 (MPEG) promotes a Video-based Point Cloud Compression Technology to the FDIS stage: https://multimediacommunication.blogspot.com/2020/07/mpeg131-press-release-point-cloud.html
[41]
TR 23.701: "Study on Web Real Time Communication (WebRTC) access to IP Multimedia Subsystem (IMS); Stage 2"
[42]
TR 23.706: "Study on enhancements to Web Real Time Communication (WebRTC) access to IP Multimedia Subsystem (IMS); Stage 2"
[43]
TS 24.371: "Web Real-Time Communications (WebRTC) access to the IP Multimedia (IM) Core Network (CN) subsystem (IMS); Stage 3; Protocol specification"
[44]
RFC 8831:  WebRTC Data Channels
[45]
RFC 8864:  Negotiation Data Channels Using the Session Description Protocol (SDP)
[46]
RFC 8827:  WebRTC Security Architecture
[47]
ISO/IEC 23090-6:2021: "Information technology - Coded representation of immersive media - Part 6: Immersive media metrics"
[48]
TR 26.926: "Traffic Models and Quality Evaluation Methods for Media and XR Services in 5G Systems"
[49]
Void.
[50]
Oscar Falmer: "Mobile AR Features Landscape", 20th September 2021, https://docs.google.com/spreadsheets/d/1S1qEyDRCqH_UkcSS4xVQLgcMSEpIu_mPtfHjsN02GNw/edit#gid=0
[51]
Void.
[52]
Google WebRTC project update & Stadia review, https://www.youtube.com/watch?v=avtlQeaxd_I&t=438s
[53]
Joint MPEG/Khronos/3GPP Workshop on "Streamed Media in Immersive Scene Descriptions", September 29/30, 2021, http://mpeg-sd.org/workshop.html
[54]
MPEG Systems Output WG3 N21042 "Report of Joint Workshop on Streamed Media in Immersive Scene Descriptions", MPEG#136, October 2021, https://www.mpegstandards.org/wp-content/uploads/mpeg_meetings/136_OnLine/w21042.zip
[55]
TS 23.501: "System architecture for the 5G System (5GS)"
[56]
TS 26.260: "Objective test methodologies for the evaluation of immersive audio systems"
[57]
TS 26.261: "Terminal audio quality performance requirements for immersive audio services"
[58]
Younes, Georges, et al., "Keyframe-based monocular SLAM: design, survey, and future directions", Robotics and Autonomous Systems 98 (2017): 67-88. https://arxiv.org/abs/1607.00470
[59]
David G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints". https://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf
[60]
Herbert Bay et al., "SURF: Speeded Up Robust Features". https://people.ee.ethz.ch/~surf/eccv06.pdf
[61]
E. Rublee, V. Rabaud, K. Konolige and G. Bradski, "ORB: An efficient alternative to SIFT or SURF," 2011 International Conference on Computer Vision, 2011, pp. 2564-2571, doi: 10.1109/ICCV.2011.6126544.
[62]
ETSI GS ISG ARF 004 v1.1.1 (2020-08): "Augmented Reality Framework (ARF) Interoperability Requirements for AR components, systems and services"
[63]
Void.

3  Definitions, symbols and abbreviations

3.1  Definitions

For the purposes of the present document, the terms and definitions given in TR 21.905 and the following apply. A term defined in the present document takes precedence over the definition of the same term, if any, in TR 21.905.
5G AR/MR media service enabler:
A service enabler that supports an AR/MR application in providing an AR/MR experience, using 5G System tools at least in part.
5G System (Uu):
Modem and system functionalities to support 5G-based delivery using the Uu radio interface.
AR/MR Application:
A software application that integrates audio-visual content into the user's real-world environment.
AR/MR content:
AR/MR content consists of a scene, typically with one or more AR objects, and is agnostic to any specific service.
AR Data:
Data generated by the AR Runtime that is accessible through API by an AR/MR application such as pose information, sensors outputs, and camera outputs.
AR Media Delivery Pipeline:
A pipeline for accessing AR scenes and related media over the network.
AR/MR object:
An AR/MR object is a component of an AR scene, agnostic to renderer capabilities.
AR Runtime:
A set of functions that interfaces with a platform to perform commonly required operations such as accessing controller/peripheral state, getting current and/or predicted tracking positions, performing general spatial computing, and submitting rendered frames to the display processing unit.
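NOTE: As an informative illustration only, the following C sketch shows the frame-loop operations named above using the OpenXR API [4]: waiting for the runtime's predicted display time, querying predicted view poses (the pose information exposed as AR Data), and submitting a rendered frame. Instance, session and reference-space creation are omitted, error handling is elided, and all variable names are illustrative.

#include <openxr/openxr.h>

/* One iteration of an AR Runtime frame loop (informative sketch). */
void render_one_frame(XrSession session, XrSpace space)
{
    /* Block until the runtime predicts the next display time. */
    XrFrameWaitInfo waitInfo = { XR_TYPE_FRAME_WAIT_INFO };
    XrFrameState frameState = { XR_TYPE_FRAME_STATE };
    xrWaitFrame(session, &waitInfo, &frameState);

    XrFrameBeginInfo beginInfo = { XR_TYPE_FRAME_BEGIN_INFO };
    xrBeginFrame(session, &beginInfo);

    /* Query the predicted eye poses for that display time. */
    XrViewLocateInfo locateInfo = { XR_TYPE_VIEW_LOCATE_INFO };
    locateInfo.viewConfigurationType = XR_VIEW_CONFIGURATION_TYPE_PRIMARY_STEREO;
    locateInfo.displayTime = frameState.predictedDisplayTime;
    locateInfo.space = space;
    XrViewState viewState = { XR_TYPE_VIEW_STATE };
    XrView views[2] = { { XR_TYPE_VIEW }, { XR_TYPE_VIEW } };
    uint32_t viewCount = 0;
    xrLocateViews(session, &locateInfo, &viewState, 2, &viewCount, views);

    /* ... the application renders one image per view using views[i].pose ... */

    /* Submit the rendered frame for display; composition layers are
       omitted in this sketch. */
    XrFrameEndInfo endInfo = { XR_TYPE_FRAME_END_INFO };
    endInfo.displayTime = frameState.predictedDisplayTime;
    endInfo.environmentBlendMode = XR_ENVIRONMENT_BLEND_MODE_ADDITIVE;
    endInfo.layerCount = 0;
    endInfo.layers = NULL;
    xrEndFrame(session, &endInfo);
}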
Lightweight Scene Manager:
A scene manager that can handle only a limited set of 3D media and typically requires some form of pre-rendering in a network element such as an edge or cloud server.
Media Access Function:
A set of functions that enables access to media and other AR-related data needed by the Scene Manager or AR Runtime in order to provide an AR experience.
Peripherals:
The collection of sensors, cameras, displays and other functionalities on the device that provide a physical connection to the environment.
Scene Manager:
A set of functions that supports the application in arranging the logical and spatial representation of a multisensorial scene with support of the AR Runtime.
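NOTE: As an informative illustration only, a minimal scene-graph node in C, mirroring the node/transform/children structure of glTF 2.0 [22], which a Scene Manager traverses to arrange the logical and spatial representation of a scene. The type and field names are illustrative assumptions.

#include <stdint.h>

typedef struct SceneNode {
    float matrix[16];              /* local transform (column-major 4x4, as in glTF) */
    struct SceneNode **children;   /* child nodes: the logical/spatial hierarchy */
    uint32_t childCount;
    int32_t meshIndex;             /* index of attached renderable media, -1 if none */
} SceneNode;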
Simplified Entry Point:
An entry point generated by 5G cloud/edge processes that supports offloading processing workloads from the UE by lowering the complexity of the AR/MR content.
Spatial Computing:
AR functions which process sensor data to generate information about the real-world 3D space surrounding the AR user.
XR Spatial Description:
A data structure describing the spatial organisation of the real world using anchors, trackables, camera parameters and visual features.
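NOTE: As an informative illustration only, the following C types sketch one possible in-memory layout for the elements named in this definition (anchors, trackables, camera parameters and visual features, e.g. ORB descriptors [61]). All names are illustrative assumptions; no cited specification defines this layout.

#include <stdint.h>

typedef struct { float x, y, z, qx, qy, qz, qw; } Pose;  /* position + quaternion orientation */

typedef struct {
    float fx, fy, cx, cy;          /* pinhole intrinsics of the capturing camera */
    uint32_t width, height;        /* image resolution */
} CameraParameters;

typedef struct {
    uint8_t descriptor[32];        /* e.g. a 256-bit ORB descriptor [61] */
    float u, v;                    /* keypoint position in the image */
} VisualFeature;

typedef struct {
    uint64_t id;
    Pose pose;                     /* pose in the world coordinate system */
    CameraParameters camera;
    VisualFeature *features;       /* features used for (re-)localisation */
    uint32_t featureCount;
} Trackable;

typedef struct {
    uint64_t id;
    uint64_t trackableId;          /* the trackable this anchor is attached to */
    Pose offset;                   /* anchor pose relative to the trackable */
} Anchor;

typedef struct {
    Trackable *trackables;  uint32_t trackableCount;
    Anchor    *anchors;     uint32_t anchorCount;
} XrSpatialDescription;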
XR Spatial Compute Pipeline:
A pipeline that uses sensor data to provide an understanding of the physical space surrounding the device, drawing on XR Spatial Description information from the network.
XR Spatial Compute server:
An edge or cloud server that provides spatial computing AR functions.
XR Spatial Description server:
A cloud server for storing, updating and retrieving XR Spatial Descriptions.

3.2  Symbols

Void

3.3  Abbreviations

For the purposes of the present document, the abbreviations given in TR 21.905 and the following apply. An abbreviation defined in the present document takes precedence over the definition of the same abbreviation, if any, in TR 21.905.
5GMS  5G Media Streaming
AAC  Advanced Audio Coding
AF  Application Function
AGW  Access GateWay
API  Application Programming Interface
AR  Augmented Reality
ARF  Augmented Reality Framework
AS  Application Server
ATIAS  Terminal Audio quality performance and Test methods for Immersive Audio Services
ATW  Asynchronous Time Warp
AVC  Advanced Video Coding
BLE  Bluetooth Low Energy
BMFF  Base Media File Format
BoW  Bag-Of-visual-Words
CAD  Computer Aided Design
CGI  Computer Generated Imagery
CMAF  Common Media Application Format
CoM  Centre of Mass
CPU  Central Processing Unit
CSCF  Call Session Control Function
DASH  Dynamic Adaptive Streaming over HTTP
DC  Data Channel
DCMTSI  Data Channel Multimedia Telephony Service over IMS
DIS  Draft International Standard
DoF  Degrees of Freedom
DRX  Discontinuous Reception
DTLS  Datagram Transport Layer Security
EAS  Edge Application Server
EDGAR  EDGe-Dependent AR
EEL  End-to-End Latency
eMMTEL  Evolution of IMS Multimedia Telephony Service
EMSA  Streaming Architecture extensions for Edge processing
ERP  Equirectangular Projection
EVC  Essential Video Coding
FDIS  Final Draft International Standard
FFS  For Further Study
FLUS  Framework for Live Uplink Streaming
FoV  Field of View
FPS  Frames Per Second
G-PCC  Geometry-based Point Cloud Compression
GBR  Guaranteed Bit Rate
glTF  Graphics Library Transmission Format
GPS  Global Positioning System
GPU  Graphics Processing Unit
HDCA  High Density Camera Array
HEVC  High Efficiency Video Coding
HLS  HTTP Live Streaming
HMD  Head-Mounted Display
HOA  Higher-Order Ambisonics
HTML  HyperText Markup Language
HTTP  HyperText Transfer Protocol
ICE  Interactive Connectivity Establishment
IMS  IP Multimedia Subsystem
IMU  Inertial Measurement Unit
ISG  Industry Specification Group
ISOBMFF  ISO Base Media File Format
ITT4RT  Immersive Teleconferencing and Telepresence for Remote Terminals
IVAS  Immersive Voice and Audio Services
JPEG  Joint Photographic Experts Group
JSON  JavaScript Object Notation
KPI  Key Performance Indicator
LI  Lawful Interception
LIDAR  LIght Detection And Ranging
LSR  Late Stage Reprojection
MAF  Media Access Function
MBS  Multicast and Broadcast Services
MIoT  Mobile Internet of Things
MMT  MPEG Media Transport
MPD  Media Presentation Description
MPR  Maximum Power Reduction
MR  Mixed Reality
MRF  Multimedia Resource Function
MSH  Media Session Handler
MTSI  Multimedia Telephony Service over IMS
NAT  Network Address Translation
NBMP  Network Based Media Processing
OHMD  Optical Head-Mounted Display
OMAF  Omnidirectional MediA Format
OpenGL  Open Graphics Library
OTT  Over-The-Top
PBR  Physically-Based Rendering
PCC  Point Cloud Compression
PCF  Policy Control Function
PDU  Protocol Data Unit
PLY  PoLYgon file format
PRACK  Provisional Response Acknowledgement
RAN  Radio Access Network
RFC  Request For Comments
RP  Reference Point
RTC  Real-Time Communication
RTP  Real-time Transport Protocol
SDK  Software Development Kit
SDP  Session Description Protocol
SIFT  Scale Invariant Feature Transform
SID  Study Item Description
SIP  Session Initiation Protocol
SLAM  Simultaneous Localization And Mapping
SRTCP  Secure Real-time Transport Control Protocol
SRTP  Secure Real-time Transport Protocol
SSE  Server-Sent Events
STAR  STandalone AR
STUN  Session Traversal Utilities for NAT
SURF  Speeded Up Robust Features
TCP  Transmission Control Protocol
TLS  Transport Layer Security
ToF  Time of Flight
TPU  Tensor Processing Unit
TURN  Traversal Using Relays around NAT
UE  User Equipment
USB  Universal Serial Bus
V3C  Visual Volumetric Video-based Coding
V-PCC  Video-based Point Cloud Compression
VDE  Video Decoding Engine
VDI  Video Decoding Interface
VPU  Vision Processing Unit
VVC  Versatile Video Coding
WebRTC  Web Real-Time Communication
WLAR  WireLess tethered AR
WTAR  Wired Tethered AR
XHR  XML HTTP Request
XMPP  eXtensible Messaging and Presence Protocol
XR  eXtended Reality