This clause provides an overview of the core 2D video compression technologies available on mobile platforms, as well as their performance. For power efficiency and best performance, encoding and decoding are preferably carried out exclusively in hardware. This clause reviews the 3GPP specifications, actual hardware availability, and the performance of codecs.
As of today, two codecs are prominently referenced and available, namely H.264/AVC [30] and H.265/HEVC [31]. Both codecs are defined as part of the TV Video Profiles in TS 26.116 and are also the foundation of the VR Video Profiles in TS 26.118. The highest defined profiles are:
These profiles and levels permit the delivery of video formats up to 4K at 60 frames per second. In modern mobile CPUs, the above profile/level combinations are supported, and support has recently even been extended to 8K video.
An overview of typical coding performance is provided in Table 4.5-1.
A more detailed analysis of video codec performance is FFS.
Work on video compression technologies beyond the capabilities of HEVC [31] is being continued by MPEG and ITU-T. For example, the Joint Video Experts Team (JVET) initiated work on the development of a new video coding standard, to be known as Versatile Video Coding (VVC). In addition, in January 2019 MPEG started working on a new video coding standard to be known as MPEG-5 Essential Video Coding (EVC). Also noteworthy is the improvement of encoders over time, even for existing standards, which likewise leads to bitrate reductions at the same quality.
Based on this information, it can be expected that within the time frame until 2025, video compression technology will permit bitrate reductions of about 50% at the same quality compared to what is possible today with HEVC [31].
In addition to regular lossy video compression algorithms, low-latency, low-complexity and near-lossless codecs are important for certain applications. As an example, JPEG XS is a recent standard for visually lossless, low-latency, lightweight image coding. According to https://jpeg.org/static/whitepapers/jpeg-xs-whitepaper.pdf, such a codec permits simple yet efficient coding, keeps latency and complexity very low, and at the same time achieves visually lossless quality at compression ratios of up to 10:1.
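To give a feeling for what a 10:1 compression ratio means in practice, the following minimal sketch computes the uncompressed bitrate of a video signal and the bitrate remaining after compression. The chosen format (3840x2160, 60 frames per second, 10 bits per sample, 4:2:2 chroma sampling) and all function names are assumptions made purely for illustration; they are not taken from the JPEG XS whitepaper.

```python
# Average number of samples per pixel for common chroma sampling formats.
SAMPLES_PER_PIXEL = {"4:4:4": 3.0, "4:2:2": 2.0, "4:2:0": 1.5}


def raw_bitrate_bps(width, height, fps, bit_depth, chroma="4:2:2"):
    """Uncompressed bitrate in bit/s for the given video format."""
    return width * height * fps * bit_depth * SAMPLES_PER_PIXEL[chroma]


def compressed_bitrate_bps(raw_bps, ratio):
    """Bitrate in bit/s after compression at `ratio`:1."""
    return raw_bps / ratio


# Assumed example format: 4K, 60 fps, 10 bit, 4:2:2 (illustrative only).
raw = raw_bitrate_bps(3840, 2160, 60, 10, "4:2:2")
compressed = compressed_bitrate_bps(raw, 10)

print(f"raw:        {raw / 1e9:.2f} Gbit/s")         # raw:        9.95 Gbit/s
print(f"compressed: {compressed / 1e6:.1f} Mbit/s")  # compressed: 995.3 Mbit/s
```

Under these assumptions, a 10:1 ratio brings a signal of roughly 10 Gbit/s down to roughly 1 Gbit/s, which is still far above typical bitrates of regular lossy video codecs.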
Furthermore, for XR formats beyond regular 2D, two different approaches are taken for compression:
- usage of existing 2D codecs, with pre- and post-processing in order to convert the signals to 3D signals
- usage of dedicated compression technologies for specific formats.
More details on these issues are discussed in clause 4.6.
In XR-type applications, when buffers are processed by rendering engines, existing video codecs may be used to compress them efficiently when they need to be transmitted over the network. As a huge amount of data is typically exchanged and operation needs to be power-efficient in constrained environments (see clause 4.8), XR applications rely on the video codecs that already exist on mobile platforms, for example those codecs defined in 3GPP specifications (see clause 4.5.1). While serving an immediate need and providing a kickstart for XR-type services, such video codecs may not be fully suitable for XR applications for different reasons, some of which are listed below.
First of all, the formats of the buffers in XR and graphics applications may be different and more variety exists, see clause 4.4.4. Also, in certain cases, not only textures need to be supported, but also 3D formats, see clause 4.6.
Beyond this, XR applications may require that multiple buffers are served and synchronized in order to render an XR experience. This results in requirements for parallel decoding of multiple streams for multiple buffers (texture, geometry, etc.) as well as for multiple objects. In many cases, these buffers need to be made available to the rendering engine in a synchronized manner to ensure the highest quality of the rendered scene. Furthermore, the number of streams and the amount of data to be processed may vary heavily over the duration of an XR session, which requires flexible video decoding architectures that also take into account efficient and low-latency processing.
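The synchronization requirement described above can be illustrated with a minimal sketch. It assumes that each decoder output carries a presentation timestamp and that the rendering engine should only receive a set of buffers once every expected stream has delivered a buffer for the same timestamp. The class name, method names and the policy of dropping older incomplete sets are hypothetical choices for illustration, not part of any 3GPP or MPEG specification.

```python
from collections import defaultdict


class BufferSynchronizer:
    """Hands out decoded buffers only as complete, timestamp-aligned sets."""

    def __init__(self, stream_ids):
        self.stream_ids = frozenset(stream_ids)
        self.pending = defaultdict(dict)  # pts -> {stream_id: buffer}
        self.last_released = None         # pts of the last complete set

    def push(self, stream_id, pts, buffer):
        """Store one decoded buffer; return a complete set or None."""
        if stream_id not in self.stream_ids:
            raise KeyError(f"unknown stream: {stream_id}")
        # Assumed policy: buffers arriving too late are discarded.
        if self.last_released is not None and pts <= self.last_released:
            return None
        self.pending[pts][stream_id] = buffer
        if set(self.pending[pts]) != self.stream_ids:
            return None  # still waiting for at least one stream
        complete = self.pending.pop(pts)
        # Older incomplete sets can no longer be rendered in order.
        for stale in [t for t in self.pending if t < pts]:
            del self.pending[stale]
        self.last_released = pts
        return complete


# Usage: a texture stream and a geometry stream decoded in parallel.
sync = BufferSynchronizer(["texture", "geometry"])
assert sync.push("texture", 0, b"T0") is None   # geometry for pts 0 missing
assert sync.push("texture", 1, b"T1") is None
frame = sync.push("geometry", 1, b"G1")         # pts 1 is now complete
assert frame == {"texture": b"T1", "geometry": b"G1"}
assert sync.push("geometry", 0, b"G0") is None  # too late, discarded
```

A real architecture would additionally have to handle a number of streams that changes during the session and bound the waiting time, which this sketch deliberately leaves out.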
As an example, MPEG is addressing several of these challenges as part of its MPEG-I project on immersive media coding. In particular, given the variety of applications, a flexible and powerful hardware-based decoding and processing architecture is desirable.