This scenario is shown in Figure 4.1-10 and corresponds to a type of "5G EDGe-Dependent AR (EDGAR) UE" in TR 26.998. Its main characteristic is that the binaural rendering process is shared between the Cloud/Edge and the End Device (e.g. type 1 or 2). The 5G UE connects to the Cloud/Edge through an embedded 5G modem. The 5G UE and the End Device connect through WiFi or 5G sidelink, or possibly through Bluetooth for audio. The End Device sends Pose Information to the Cloud/Edge if needed, and the Cloud/Edge and the End Device together provide the decoding and rendering capabilities. The 5G UE acts only as a relay device. More specifically, immersive audio decoding, pre-rendering, and re-encoding are performed in the Cloud/Edge. The re-encoding may use an intermediate ISAR format supporting head-tracked binaural post-rendering. The End Device features a decoder, a post-renderer and a pose estimator. Motion-to-sound latency can be at least partially compensated, since the End Device can provide pose correction and head-tracked binaural rendering (ISAR Decoder & Post-Renderer).
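The pose correction mentioned above can be illustrated with a minimal sketch. It assumes a yaw-only head rotation and a hypothetical helper name (`residual_yaw_deg`); the actual ISAR format and post-renderer are not specified here and would handle full 3DoF orientation. The idea is that the Cloud/Edge pre-renders for the pose it last received, and the End Device corrects for the head movement that happened in the meantime.

```python
def residual_yaw_deg(prerender_yaw_deg: float, current_yaw_deg: float) -> float:
    """Return the yaw correction (degrees) left for the post-renderer.

    prerender_yaw_deg: yaw the Cloud/Edge used when pre-rendering (assumed
                       to be carried alongside the coded audio).
    current_yaw_deg:   latest yaw from the End Device pose estimator.
    """
    diff = current_yaw_deg - prerender_yaw_deg
    # Wrap into [-180, 180) so the correction follows the shorter rotation.
    return (diff + 180.0) % 360.0 - 180.0


# Example: the pre-rendered frame assumed 350 degrees, the head is now at 10.
correction = residual_yaw_deg(350.0, 10.0)
print(correction)
```

The smaller this residual, the less work the post-renderer has to do, which is why low-latency pose feedback to the Cloud/Edge still matters even with local correction.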
In Variation A, as depicted in Figure 4.1-11, the End Device features an ISAR decoder and post-renderer, built-in loudspeakers for binaural audio playback, and a pose estimator.
In Variation B, as depicted in Figure 4.1-12, a pair of TWS Earbuds/Headphones is used to play back the binaural audio instead of the built-in loudspeakers used in Variation A. The End Device performs pose estimation, ISAR decoding and head-tracked binaural post-rendering, followed by stereo re-encoding of the binaural audio signal. The pose information is sent to the 5G UE, which relays it to the Cloud/Edge. The TWS Earbuds/Headphones decode the binaural audio signal and perform audio playback. Variation B is expected to be more prevalent than Variation C described below, because the End Device possibly offers better pose estimation capability.
Variation B.1, as depicted in Figure 4.1-13, is like Variation B, except that the TWS Earbuds/Headphones are ISAR Decoder capable. The End Device relays the coded audio (ISAR format) from the 5G UE to the TWS Earbuds/Headphones and provides pose information to the TWS Earbuds/Headphones. Alternatively, the 5G UE can pass the coded audio directly to the TWS Earbuds/Headphones.
In Variation C, as depicted in Figure 4.1-14, the TWS Earbuds/Headphones perform ISAR decoding and head-tracked binaural post-rendering of audio and play back the binaural audio. In addition, they perform pose estimation and provide pose information to the End Device or directly to the 5G UE. The End Device and the 5G UE may be used to relay pose information and coded audio between the TWS Earbuds/Headphones and the 5G Cloud/Edge.
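The functional split across Variations A, B, B.1 and C can be summarised in a small lookup, sketched below. The dictionary keys and the helper `variations_where` are hypothetical names chosen for illustration. The placement of post-rendering on the TWS Earbuds/Headphones in Variation B.1 is an assumption inferred from the pose information being provided to them; the text only states that they are ISAR Decoder capable.

```python
END_DEVICE = "End Device"
TWS = "TWS Earbuds/Headphones"

# Which device performs each function, per variation (as read from the text).
PLACEMENT = {
    "A": {
        "pose_estimation": END_DEVICE,
        "isar_decoding": END_DEVICE,
        "post_rendering": END_DEVICE,
        "playback": END_DEVICE,  # built-in loudspeakers
    },
    "B": {
        "pose_estimation": END_DEVICE,
        "isar_decoding": END_DEVICE,
        "post_rendering": END_DEVICE,
        "playback": TWS,  # after stereo re-encoding on the End Device
    },
    "B.1": {
        "pose_estimation": END_DEVICE,
        "isar_decoding": TWS,
        "post_rendering": TWS,  # assumption, see lead-in
        "playback": TWS,
    },
    "C": {
        "pose_estimation": TWS,
        "isar_decoding": TWS,
        "post_rendering": TWS,
        "playback": TWS,
    },
}


def variations_where(function: str, device: str) -> list[str]:
    """List the variations in which `device` performs `function`."""
    return sorted(v for v, roles in PLACEMENT.items() if roles[function] == device)


print(variations_where("pose_estimation", TWS))
print(variations_where("isar_decoding", END_DEVICE))
```

In all four variations the immersive audio decoding, pre-rendering and re-encoding remain in the Cloud/Edge, so they are left out of the table.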