For the purpose of defining interfaces to a conforming video decoder, video operation points are defined. In this context, the following definitions apply:
Operation Point: A collection of discrete combinations of different content formats including spatial and temporal resolutions, colour mapping, transfer functions, VR-specific rendering metadata, etc., and the encoding format.
Receiver: A receiver that can decode and render any bitstream that is conforming to a certain Operation Point.
Bitstream: A video bitstream that conforms to a video encoding format and a certain Operation Point, including VR rendering metadata.
This clause focuses on the interoperability point to a media decoder as indicated in Figure 5.1-1. This clause does not deal with the access engine and file parser, which address how the video bitstream is delivered.
In all video operation points, the VR Presentation can be rendered using a single media decoder which provides decoded signals and rendering metadata by decoding relevant SEI messages.
This clause defines the potential parameters of Visual Operation Points. This includes the video decoder profile and levels with additional restrictions, conventional video signal parameters and VR rendering metadata. The requirements are defined from the perspective of the video decoder and renderer.
Parameters for a Visual Operation Point include:
Codec, Profile and level requirements
Restrictions on regular video parameters, typically expressed in the Video Usability Information (VUI)
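The parameter groups listed above can be pictured as a simple record. The following is a minimal, non-normative sketch: the class name `VisualOperationPoint`, its field names, and the `permits` helper are hypothetical and are not defined by the present document, and the values used in any instance are placeholders rather than the normative values of an Operation Point.

```python
from dataclasses import dataclass, field


@dataclass
class VisualOperationPoint:
    """Hypothetical container for the parameter groups of an Operation Point."""

    name: str      # illustrative label, not a normative identifier
    codec: str     # video codec, e.g. "H.265/HEVC"
    profile: str   # codec profile the decoder is required to support
    level: str     # codec level the decoder is required to support
    # Restrictions on regular video parameters (typically signalled in the VUI),
    # stored as: parameter name -> set of permitted values.
    vui_restrictions: dict = field(default_factory=dict)

    def permits(self, parameter: str, value) -> bool:
        """Return True if the value is allowed or the parameter is unrestricted."""
        allowed = self.vui_restrictions.get(parameter)
        return allowed is None or value in allowed


# Placeholder values for illustration only.
example = VisualOperationPoint(
    name="example-point",
    codec="H.265/HEVC",
    profile="placeholder-profile",
    level="placeholder-level",
    vui_restrictions={"frame_rate_hz": {25, 30, 50, 60}},
)
```

A receiver-side check could then call `example.permits("frame_rate_hz", 60)` before accepting a bitstream; parameters without an entry are treated as unrestricted in this sketch.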
The present document defines several operation points for different target applications and scenarios. In particular, two legacy operation points are defined that use existing video codecs H.264/AVC and H.265/HEVC to enable distribution of up to 4K full 360 mono video signals up to 60 Hz by using simple equirectangular projection.
In addition, one operation point for each codec is defined that enables enhanced features, in particular stereo video, up to 8K mono, higher frame rates and HDR.
Furthermore, one additional operation point is defined that uses H.265/HEVC to enable distribution of up to 8K full 360 mono video signals up to 60 Hz and with HDR using equirectangular projection.
Table 5.1-1 summarizes the Operation Points; the detailed definitions are provided in the remainder of clause 5.1. Here, 3k refers to 2880 × 1440 pixels, 4k to 4096 × 2048 pixels, 6k to 6144 × 3072 pixels and 8k to 8192 × 4096 pixels (expressed as luminance pixel width × luminance pixel height).
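As a non-normative illustration of the resolution labels above, the following sketch maps each label to its luminance dimensions and derives the number of luminance samples per picture. The names `RESOLUTIONS`, `luma_samples` and `is_two_to_one` are hypothetical helpers introduced only for this example. All four formats have a 2:1 width-to-height ratio, which is what a full 360° × 180° equirectangular picture would be expected to use.

```python
# Resolution labels as given in the text: (luma width, luma height) in pixels.
RESOLUTIONS = {
    "3k": (2880, 1440),
    "4k": (4096, 2048),
    "6k": (6144, 3072),
    "8k": (8192, 4096),
}


def luma_samples(label: str) -> int:
    """Return the number of luminance samples in one picture of the given label."""
    width, height = RESOLUTIONS[label]
    return width * height


def is_two_to_one(label: str) -> bool:
    """Return True if the picture width is exactly twice its height."""
    width, height = RESOLUTIONS[label]
    return width == 2 * height


for label in RESOLUTIONS:
    print(label, RESOLUTIONS[label], luma_samples(label), is_two_to_one(label))
```

Under these definitions an 8k picture carries four times the luminance samples of a 4k picture, which gives a rough feel for the relative decoding load between the corresponding operation points.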
Restrictions on source formats such as resolution and frame rates, content generation and encoding guidelines are provided in Annex A.
VR Rendering metadata in the Operation Points is carried in SEI messages. Receivers are expected to be able to process the VR metadata carried in SEI messages. However, the same VR metadata may be duplicated at system level. In this case, the Receiver may rely on system-level processing to extract the relevant VR Rendering metadata rather than extracting it from the SEI messages.
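The choice between the two metadata sources can be sketched as follows. This is only an illustration under stated assumptions: the function `select_vr_metadata`, its parameters, and the representation of metadata as dictionaries are hypothetical, and the preference for system-level metadata is a receiver implementation choice that the text permits but does not require.

```python
from typing import Optional


def select_vr_metadata(
    sei_metadata: Optional[dict],
    system_metadata: Optional[dict],
    prefer_system_level: bool = True,
) -> Optional[dict]:
    """Pick the VR rendering metadata source a receiver hands to its renderer.

    sei_metadata: metadata extracted from SEI messages in the bitstream.
    system_metadata: the same metadata duplicated at system level, if any.
    prefer_system_level: assumed receiver policy; not mandated by the text.
    """
    if prefer_system_level and system_metadata is not None:
        # The receiver relies on system-level processing when a duplicate exists.
        return system_metadata
    # Otherwise use the SEI messages, falling back to the system level if needed.
    return sei_metadata if sei_metadata is not None else system_metadata


# Placeholder metadata values for illustration only.
from_sei = {"source": "sei", "projection": "equirectangular"}
from_system = {"source": "system", "projection": "equirectangular"}
chosen = select_vr_metadata(from_sei, from_system)
```

Because the system-level copy is described as a duplicate, both sources are expected to carry the same values; the sketch only decides where the receiver reads them from.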