TS 26.119 version 18.0.0

B  XR Runtime interface

B.1  Introduction

This annex describes the XR Runtime functions to be used with the 3GPP capabilities defined in the present document. Clause B.2.1 provides the mapping of the 3GPP capabilities to the OpenXR runtime. Clause B.2.2 extracts relevant information from the OpenXR specification [5] regarding the rendering operations.

B.2  Capability mapping to OpenXR

B.2.1  Mapping overview

Capability | Corresponding OpenXR capability | Parameters | Corresponding OpenXR object
---------- | ------------------------------- | ---------- | ---------------------------
Create an XR System | xrGetSystem() | xrSystemIdentifier | XrSystemId* systemId;
Query XR System's graphics properties | xrGetSystemProperties() | swapchainSupported | Implicit, since the OpenXR specification supports swapchains by design.
 | | maxSwapchainImageHeight | uint32_t maxSwapchainImageHeight;
 | | maxSwapchainImageWidth | uint32_t maxSwapchainImageWidth;
 | | maxLayerCount | uint32_t maxLayerCount;
Query XR System's tracking properties | xrGetSystemProperties() | orientationTracking | XrBool32 orientationTracking;
 | | positionTracking | XrBool32 positionTracking;
Enumerate XR System's supported environment blend modes | xrEnumerateEnvironmentBlendModes() | Value 'opaque' of blendMode | XrEnvironmentBlendMode* environmentBlendModes; there is one element of environmentBlendModes whose value is equal to XR_ENVIRONMENT_BLEND_MODE_OPAQUE.
 | | Value 'additive' of blendMode | XrEnvironmentBlendMode* environmentBlendModes; there is one element of environmentBlendModes whose value is equal to XR_ENVIRONMENT_BLEND_MODE_ADDITIVE.
 | | Value 'alpha_blend' of blendMode | XrEnvironmentBlendMode* environmentBlendModes; there is one element of environmentBlendModes whose value is equal to XR_ENVIRONMENT_BLEND_MODE_ALPHA_BLEND.
Enumerate supported view configuration types | xrEnumerateViewConfigurations() | Value 'monoscopic' of viewConfiguration | XrViewConfigurationType* viewConfigurationTypes; there is one element of viewConfigurationTypes whose value is equal to XR_VIEW_CONFIGURATION_TYPE_PRIMARY_MONO.
 | | Value 'stereoscopic' of viewConfiguration | XrViewConfigurationType* viewConfigurationTypes; there is one element of viewConfigurationTypes whose value is equal to XR_VIEW_CONFIGURATION_TYPE_PRIMARY_STEREO.
 | | Value 'other' of viewConfiguration | XrViewConfigurationType* viewConfigurationTypes; there is one element of viewConfigurationTypes whose value is strictly greater than XR_VIEW_CONFIGURATION_TYPE_PRIMARY_STEREO and strictly lower than XR_VIEW_CONFIGURATION_TYPE_MAX_ENUM.
Enumerate the view configuration properties | xrEnumerateViewConfigurationViews() | recommendedImageRectWidth | uint32_t recommendedImageRectWidth;
 | | maxImageRectWidth | uint32_t maxImageRectWidth;
 | | recommendedImageRectHeight | uint32_t recommendedImageRectHeight;
 | | maxImageRectHeight | uint32_t maxImageRectHeight;
 | | recommendedSwapchainSampleCount | uint32_t recommendedSwapchainSampleCount;
 | | maxSwapchainSampleCount | uint32_t maxSwapchainSampleCount;
Enumerate reference space types | xrEnumerateReferenceSpaces() | Value 'view' of referenceSpace | XrReferenceSpaceType* spaces; there is one element of spaces whose value is equal to XR_REFERENCE_SPACE_TYPE_VIEW.
 | | Value 'local' of referenceSpace | XrReferenceSpaceType* spaces; there is one element of spaces whose value is equal to XR_REFERENCE_SPACE_TYPE_LOCAL.
 | | Value 'stage' of referenceSpace | XrReferenceSpaceType* spaces; there is one element of spaces whose value is equal to XR_REFERENCE_SPACE_TYPE_STAGE.
 | | Value 'unbounded' of referenceSpace | XrReferenceSpaceType* spaces; there is one element of spaces whose value is equal to XR_REFERENCE_SPACE_TYPE_UNBOUNDED_MSFT.
 | | Value 'user_defined' of referenceSpace | 
Query the spatial range boundaries | xrGetReferenceSpaceBoundsRect() | 2DSpatialRangeBoundaries | XrExtent2Df* bounds;
Enumerate swapchain image formats | xrEnumerateSwapchainFormats() | swapchainImageFormatIdentifier | int64_t* formats;
Enumerate swapchain images | xrEnumerateSwapchainImages() | numberSwapchainImages | uint32_t* imageCountOutput;
 | | swapchainImages | XrSwapchainImageBaseHeader* images;
Enumerate composition layer type | N/A | Value 'projection' of compositionLayer | Part of the core specification
 | N/A | Value 'quad' of compositionLayer | Part of the core specification
 | xrEnumerateInstanceExtensionProperties() | Value 'cylinder' of compositionLayer | XrStructureType type; the variable type has the value XR_TYPE_COMPOSITION_LAYER_CYLINDER_KHR.
 | | Value 'cube' of compositionLayer | XrStructureType type; the variable type has the value XR_TYPE_COMPOSITION_LAYER_CUBE_KHR.
 | | Value 'equirectangular' of compositionLayer | XrStructureType type; the variable type has the value XR_TYPE_COMPOSITION_LAYER_EQUIRECT_KHR or XR_TYPE_COMPOSITION_LAYER_EQUIRECT2_KHR.
 | | Value 'depth' of compositionLayer | XrStructureType type; the variable type has the value XR_TYPE_COMPOSITION_LAYER_DEPTH_INFO_KHR.
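The mapping above can be exercised directly through the OpenXR API. The following is a minimal sketch, in C, of querying an XR System and its graphics, tracking and environment-blend-mode capabilities. It assumes a valid XrInstance has already been created; error handling is reduced to a single macro for brevity, and the CHK macro name is illustrative.

```c
#include <stdio.h>
#include <openxr/openxr.h>

#define CHK(x) do { XrResult r = (x); \
    if (XR_FAILED(r)) { printf("OpenXR error %d\n", r); return; } } while (0)

void query_system_capabilities(XrInstance instance)
{
    /* Create (obtain) an XR System for a head-mounted form factor. */
    XrSystemGetInfo getInfo = { XR_TYPE_SYSTEM_GET_INFO };
    getInfo.formFactor = XR_FORM_FACTOR_HEAD_MOUNTED_DISPLAY;
    XrSystemId systemId;
    CHK(xrGetSystem(instance, &getInfo, &systemId));

    /* Graphics and tracking properties come back in one structure. */
    XrSystemProperties props = { XR_TYPE_SYSTEM_PROPERTIES };
    CHK(xrGetSystemProperties(instance, systemId, &props));
    printf("max swapchain image: %u x %u, maxLayerCount: %u\n",
           props.graphicsProperties.maxSwapchainImageWidth,
           props.graphicsProperties.maxSwapchainImageHeight,
           props.graphicsProperties.maxLayerCount);
    printf("orientationTracking: %u, positionTracking: %u\n",
           props.trackingProperties.orientationTracking,
           props.trackingProperties.positionTracking);

    /* Enumerate environment blend modes with the usual two-call idiom. */
    uint32_t count = 0;
    CHK(xrEnumerateEnvironmentBlendModes(instance, systemId,
        XR_VIEW_CONFIGURATION_TYPE_PRIMARY_STEREO, 0, &count, NULL));
    XrEnvironmentBlendMode modes[8];
    if (count > 8) count = 8;              /* fixed buffer for the sketch */
    CHK(xrEnumerateEnvironmentBlendModes(instance, systemId,
        XR_VIEW_CONFIGURATION_TYPE_PRIMARY_STEREO, count, &count, modes));
    for (uint32_t i = 0; i < count; i++)
        if (modes[i] == XR_ENVIRONMENT_BLEND_MODE_OPAQUE)
            printf("'opaque' blend mode supported\n");
}
```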

B.2.2  XR views and rendering loop

Composition layers are drawn in a specified order, with the 0th layer drawn first. Layers are drawn with a "painter's algorithm": each successive layer potentially overwrites the destination layers, whether or not the new layers are virtually closer to the viewer. Composition layers are subject to blending with other layers. Blending of layers can be controlled by the alpha channel information present in the image buffer of each layer. In addition, the image buffers of the layers may be limited by a maximum width and a maximum height when rendering, such that they fit into the capabilities of the swapchains.
For visual rendering, the following applies:
  1. To present images to the user, the runtime provides images organized in swapchains for the application to render into.
  2. The XR Runtime may support different swapchain image formats and the supported image formats may be provided to the application through the runtime API. XR Runtimes typically support at least sRGB formats. Details may depend on the graphics API specified when creating the session.
  3. Swapchain images may be 2D or 2D Array. Arrays allow a subset of the 2D images to be extracted for rendering. Multiple swapchain handles may exist simultaneously, up to some limit imposed by the XR Runtime. Swapchain parameters include the following (see the swapchain creation sketch after this list):
    • texture format identifier, a graphics-API-specific version of a format, for example sRGB
    • width and height, expressing the pixel count of the images sent to the swapchain
    • faceCount, being the number of faces, which can be either 6 (for cubemaps) or 1
    • an indication whether the swapchain is dynamic, i.e. updated as part of the XR rendering loop, or static, i.e. the application releases only one image to this swapchain over its entire lifetime
    • access protection, indicating that the swapchain's images are protected from CPU access
  4. Once a session is running and in focused state as introduced in clause 4.1.2, the following rendering loop is executed, following Figure 4.1.4 (a sketch of one loop iteration is given after this list):
    1. The XR Application retrieves the action state, e.g. the status of the controllers and their associated pose. The application also establishes the location of different trackables.
    2. Before an application can begin writing to a swapchain image, it first waits on the image to avoid writing to it before the Compositor has finished reading from it. Then an XR application synchronizes its rendering loop to the runtime. In the common case that an XR application has pipelined frame submissions, the application is expected to compute the appropriate target display time using both the predicted display time and predicted display interval. An XR Runtime is expected to provide and operate a swapchain that supports a specific frame rate.
    3. Once the wait time completes, the application initiates the rendering process. In order to support the application in rendering the different views, the XR Runtime provides access to the viewer pose and projection parameters that are needed to render them. The view and projection info is provided for a particular display time within a specified XR space, typically the target/predicted display time for a given frame.
    4. The application then performs its rendering work. Rendering work may be very simple, for example directly copying data from the application into the swapchain, or complex, for example iterating over the scene graph nodes and rendering complex objects. Once all views/layers are rendered, the application sends them to the XR Runtime for final compositing, including the expected display time as well as the associated render pose.
    5. An XR Runtime typically supports (i) planar projected images rendered from the eye point of each eye using a perspective projection, typically used to render the virtual world from the user's perspective, and (ii) a quad layer type describing a posable planar rectangle in the virtual world for displaying two-dimensional content. Other projection types such as cubemap, equirectangular or cylindrical projection may also be supported.
    6. The XR application offloads the composition of the final image to an XR Runtime-supplied compositor. This significantly reduces rendering complexity, since details such as frame-rate interpolation and distortion correction are handled by the XR Runtime. It is assumed that the XR Runtime provides a compositor functionality for device mapping. A compositor in the runtime is responsible for taking all the received layers, performing any necessary corrections such as pose correction and lens distortion, compositing them, and then sending the final frame to the display. An application may use multiple composition layers for its rendering; as described above, layers are drawn in a specified order with a painter's algorithm and are subject to blending with other layers. Blending of layers can be controlled by layer per-texel source alpha. Layer swapchain textures may contain an alpha channel. Composition and blending is done in RGBA.
    7. After the compositor has blended and flattened all layers, it presents this image to the system's display. The composited image is then blended with the user's view of the physical world behind the displays in one of three modes, based on the application's chosen environment blend mode:
      • OPAQUE: The composition layers are displayed with no view of the physical world behind them. The composited image is interpreted as an RGB image, ignoring the composited alpha channel. This is the typical mode for VR experiences, although this mode can also be supported on devices that support video passthrough.
      • ADDITIVE: The composition layers are additively blended with the real world behind the display. The composited image is interpreted as an RGB image, ignoring the composited alpha channel during the additive blending. This is the typical mode for an AR experience on a see-through headset with an additive display, although this mode can also be supported on devices that support video passthrough.
      • ALPHA_BLEND: The composition layers are alpha-blended with the real world behind the display. The composited image is interpreted as an RGBA image, with the composited alpha channel determining each pixel's level of blending with the real world behind the display. This is the typical mode for an AR experience on a phone or headset that supports video passthrough.
    8. Meanwhile, as the XR Runtime uses the submitted frame for compositing and display, a new rendering process may be kicked off for a different swapchain image.
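The swapchain parameters listed in item 3 map directly onto XrSwapchainCreateInfo. The following is a minimal sketch, assuming a running XrSession and a format value already chosen from xrEnumerateSwapchainFormats(); the function name and the array size of 2 (one slice per eye view) are illustrative choices, not mandated by the present document.

```c
#include <openxr/openxr.h>

XrSwapchain create_color_swapchain(XrSession session, int64_t chosenFormat,
                                   uint32_t width, uint32_t height)
{
    XrSwapchainCreateInfo info = { XR_TYPE_SWAPCHAIN_CREATE_INFO };
    info.usageFlags  = XR_SWAPCHAIN_USAGE_COLOR_ATTACHMENT_BIT |
                       XR_SWAPCHAIN_USAGE_SAMPLED_BIT;
    info.format      = chosenFormat;  /* graphics-API specific, e.g. an sRGB format */
    info.sampleCount = 1;
    info.width       = width;         /* bounded by maxSwapchainImageWidth */
    info.height      = height;        /* bounded by maxSwapchainImageHeight */
    info.faceCount   = 1;             /* 6 for cubemap swapchains */
    info.arraySize   = 2;             /* 2D Array: one layer per eye view */
    info.mipCount    = 1;
    /* For a static swapchain (one release over its lifetime), add:
       info.createFlags |= XR_SWAPCHAIN_CREATE_STATIC_IMAGE_BIT;
       For access protection:
       info.createFlags |= XR_SWAPCHAIN_CREATE_PROTECTED_CONTENT_BIT; */
    XrSwapchain swapchain = XR_NULL_HANDLE;
    xrCreateSwapchain(session, &info, &swapchain);
    return swapchain;
}
```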
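The steps of item 4 correspond to the following hedged sketch of one loop iteration (wait, begin, locate views, render, release, submit). It assumes a running, focused session, a previously created projection swapchain, and an application-provided render_view() callback; real code adds error handling, a graphics binding, and a check of frameState.shouldRender.

```c
#include <openxr/openxr.h>

void render_one_frame(XrSession session, XrSpace localSpace,
                      XrSwapchain swapchain, XrExtent2Di viewSize,
                      void (*render_view)(uint32_t imageIndex, const XrView*))
{
    /* Step 2: synchronize with the runtime, get the predicted display time. */
    XrFrameWaitInfo waitInfo = { XR_TYPE_FRAME_WAIT_INFO };
    XrFrameState frameState = { XR_TYPE_FRAME_STATE };
    xrWaitFrame(session, &waitInfo, &frameState);

    XrFrameBeginInfo beginInfo = { XR_TYPE_FRAME_BEGIN_INFO };
    xrBeginFrame(session, &beginInfo);

    /* Step 3: viewer pose and projection parameters for the target time. */
    XrView views[2] = { { XR_TYPE_VIEW }, { XR_TYPE_VIEW } };
    XrViewState viewState = { XR_TYPE_VIEW_STATE };
    XrViewLocateInfo locateInfo = { XR_TYPE_VIEW_LOCATE_INFO };
    locateInfo.viewConfigurationType = XR_VIEW_CONFIGURATION_TYPE_PRIMARY_STEREO;
    locateInfo.displayTime = frameState.predictedDisplayTime;
    locateInfo.space = localSpace;
    uint32_t viewCount = 0;
    xrLocateViews(session, &locateInfo, &viewState, 2, &viewCount, views);

    /* Step 4: acquire/wait/release the swapchain image around rendering. */
    uint32_t imageIndex = 0;
    xrAcquireSwapchainImage(swapchain, NULL, &imageIndex);
    XrSwapchainImageWaitInfo imgWait = { XR_TYPE_SWAPCHAIN_IMAGE_WAIT_INFO };
    imgWait.timeout = XR_INFINITE_DURATION;
    xrWaitSwapchainImage(swapchain, &imgWait);

    XrCompositionLayerProjectionView projViews[2];
    for (uint32_t i = 0; i < viewCount; i++) {
        render_view(imageIndex, &views[i]);      /* application rendering work */
        projViews[i] = (XrCompositionLayerProjectionView){
            XR_TYPE_COMPOSITION_LAYER_PROJECTION_VIEW };
        projViews[i].pose = views[i].pose;
        projViews[i].fov  = views[i].fov;
        projViews[i].subImage.swapchain = swapchain;
        projViews[i].subImage.imageRect.extent = viewSize;
        projViews[i].subImage.imageArrayIndex = i; /* one array slice per eye */
    }
    xrReleaseSwapchainImage(swapchain, NULL);

    /* Steps 5-7: hand the projection layer to the compositor. */
    XrCompositionLayerProjection layer = { XR_TYPE_COMPOSITION_LAYER_PROJECTION };
    layer.space = localSpace;
    layer.viewCount = viewCount;
    layer.views = projViews;
    const XrCompositionLayerBaseHeader* layers[] =
        { (const XrCompositionLayerBaseHeader*)&layer };

    XrFrameEndInfo endInfo = { XR_TYPE_FRAME_END_INFO };
    endInfo.displayTime = frameState.predictedDisplayTime;
    endInfo.environmentBlendMode = XR_ENVIRONMENT_BLEND_MODE_OPAQUE;
    endInfo.layerCount = 1;
    endInfo.layers = layers;
    xrEndFrame(session, &endInfo);
}
```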

B.2.3  Available Visualization Space implementation

B.2.3.1  Using OpenXR XR_FB_scene

The OpenXR XR_FB_scene extension allows the room boundary, as well as boundary spaces and objects within the space, to be defined (a sketch is given after this list):
  1. xrGetSpaceBoundingBox3DFB provides the defined rectangular cuboid XrRect3DfFB by defining the offset XrOffset3DfFB values x, y, z and the extent XrExtent3DfFB values width, height and depth in the x, y, z dimensions.
  2. xrGetSpaceSemanticLabelsFB optionally provides a way to describe the semantic meaning of a space entity. It is recommended to use the label "3GPP-AvailableVisualizationSpace" when it is used to describe an available visualization space.
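A minimal sketch of reading an available visualization space via this extension follows. OpenXR extension functions are loaded through xrGetInstanceProcAddr; the sketch assumes the extension was enabled at instance creation and that `space` refers to the scene entity of interest. The function name and buffer size are illustrative.

```c
#include <stdio.h>
#include <string.h>
#include <openxr/openxr.h>

void read_visualization_space(XrInstance instance, XrSession session, XrSpace space)
{
    PFN_xrGetSpaceBoundingBox3DFB getBox = NULL;
    PFN_xrGetSpaceSemanticLabelsFB getLabels = NULL;
    xrGetInstanceProcAddr(instance, "xrGetSpaceBoundingBox3DFB",
                          (PFN_xrVoidFunction*)&getBox);
    xrGetInstanceProcAddr(instance, "xrGetSpaceSemanticLabelsFB",
                          (PFN_xrVoidFunction*)&getLabels);
    if (!getBox || !getLabels) return;

    /* Item 1: rectangular cuboid as offset (x, y, z) plus extent (w, h, d). */
    XrRect3DfFB box;
    if (XR_SUCCEEDED(getBox(session, space, &box)))
        printf("offset (%f, %f, %f), extent %f x %f x %f\n",
               box.offset.x, box.offset.y, box.offset.z,
               box.extent.width, box.extent.height, box.extent.depth);

    /* Item 2: check for the recommended semantic label. */
    char buffer[256] = "";
    XrSemanticLabelsFB labels = { XR_TYPE_SEMANTIC_LABELS_FB };
    labels.bufferCapacityInput = sizeof(buffer);
    labels.buffer = buffer;
    if (XR_SUCCEEDED(getLabels(session, space, &labels)) &&
        strstr(buffer, "3GPP-AvailableVisualizationSpace"))
        printf("space entity labeled as available visualization space\n");
}
```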

B.2.3.2  Using xrComputeNewSceneMSFT

The XR_MSFT_scene_understanding extension allows the bounding volume to be defined in several forms, including:
  1. XrSceneSphereBoundMSFT for defining a spherical available visualization space
  2. XrSceneOrientedBoxBoundMSFT for defining a cuboid available visualization space. Note that the bounding box is defined by its center and its edge-to-edge dimensions around that center. Therefore, these values shall be translated to the values defined in clause 6.2.4.
Also note that scene components outside of the available visualization space may be excluded from rendering by the runtime. A sketch of constraining scene computation to such a bounding volume is given below.
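The following is a hedged sketch of form 2, assuming the extension is enabled, an XrSceneObserverMSFT was created earlier, and the cuboid values were already translated per clause 6.2.4; the center and extent values are illustrative only.

```c
#include <openxr/openxr.h>

void compute_scene_in_box(XrInstance instance, XrSceneObserverMSFT observer,
                          XrSpace space, XrTime time)
{
    PFN_xrComputeNewSceneMSFT computeNewScene = NULL;
    xrGetInstanceProcAddr(instance, "xrComputeNewSceneMSFT",
                          (PFN_xrVoidFunction*)&computeNewScene);
    if (!computeNewScene) return;

    /* Oriented box: defined by its center pose and edge-to-edge extents. */
    XrSceneOrientedBoxBoundMSFT box;
    box.pose.orientation = (XrQuaternionf){ 0.0f, 0.0f, 0.0f, 1.0f };
    box.pose.position    = (XrVector3f){ 0.0f, 1.0f, 0.0f };  /* center */
    box.extents          = (XrVector3f){ 4.0f, 3.0f, 4.0f };  /* w, h, d */

    XrSceneComputeFeatureMSFT features[] =
        { XR_SCENE_COMPUTE_FEATURE_VISUAL_MESH_MSFT };
    XrNewSceneComputeInfoMSFT info = { XR_TYPE_NEW_SCENE_COMPUTE_INFO_MSFT };
    info.requestedFeatureCount = 1;
    info.requestedFeatures = features;
    info.consistency = XR_SCENE_COMPUTE_CONSISTENCY_SNAPSHOT_COMPLETE_MSFT;
    info.bounds.space = space;
    info.bounds.time = time;
    info.bounds.boxCount = 1;
    info.bounds.boxes = &box;   /* components outside may be excluded */
    computeNewScene(observer, &info);
}
```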
