
Content for  TR 26.933  Word version:  19.0.0


 

0  Introduction

Providing immersive voice and audio services by end-user devices is becoming more and more practicable with the development of 4G/5G technologies. Related requirements have been investigated in TR 22.891. Several use cases for VR are envisioned in TR 26.918; in these use cases, the corresponding audio capturing systems are generally considered. As such, capturing capability is crucial for delivering truly immersive voice and audio experiences.
Due to physical constraints on their shapes and sizes, end-user devices are usually configured with varying numbers of microphones and different microphone setups. Therefore, different audio capturing capabilities are expected. Based on this, the present document describes diverse audio capturing systems.

1  Scope

The goal of this technical report is to study diverse audio capturing methods and applicable audio formats for the end-user device, considering the current physical and software constraints. The scope of the work is shown in Figure 1-1.
Figure 1-1: Scope of TR 26.933 under the scope of the FS_DaCED study item
This document addresses audio capturing configurations for end-user devices, with the aim of equipping these devices with audio capturing capability to provide a truly immersive voice and audio service.
The document aims to study the following aspects:
  1. Factors related to audio capture in different UE categories.
  2. Components used in audio capture.
  3. Acoustic design for audio capture.
  4. Signal processing, such as microphone array beamforming and AEC.
  5. Examples of audio capture processing solutions.

2  References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.
  • References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
  • For a specific reference, subsequent revisions do not apply.
  • For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.
[1]  TR 21.905: "Vocabulary for 3GPP Specifications".
[2]  TR 26.891: "5G enhanced mobile broadband; Media distribution".
[3]  TS 26.258: "Codec for Immersive Voice and Audio Services (IVAS); C code (floating point)".
[4]  H. Wittek and G. Theile, "Development and application of a stereophonic multichannel recording technique for 3D Audio and VR," in AES Convention 143, New York, 2017.
[5]  Wittek, Haut, Keinath: "Double M/S - a Surround recording technique put to test", 24. Tonmeistertagung 2006.
[6]  P. Geluso, "Capturing Height: The Addition of Z Microphones to Stereo and Surround Microphone Arrays," presented at the 132nd Convention of the Audio Engineering Society (2012 Apr.), convention paper 8595.
[7]  Fischer, C., Zingler, D., Medina Victoria, J.: "MS-3D: Extending the Double-MS Array for 3D-Audio applications" (29th Tonmeistertagung - VDT International Convention, November 2016).
[8]  J. Benesty, D. R. Morgan and M. M. Sondhi, "A better understanding and an improved solution to the specific problems of stereophonic acoustic echo cancellation," in IEEE Transactions on Speech and Audio Processing, vol. 6, no. 2, pp. 156-165, March 1998.
[9]  A. W. H. Khong, J. Benesty and P. A. Naylor, "Stereophonic acoustic echo cancellation: analysis of the misalignment in the frequency domain," in IEEE Signal Processing Letters, vol. 13, no. 1, pp. 33-36, Jan. 2006.
[10]  TS 26.250: "Codec for Immersive Voice and Audio Services (IVAS); General overview".
[11]  TS 26.131: "Terminal acoustic characteristics for telephony; Requirements".
[12]  C. D. Salvador, S. Sakamoto, J. Treviño, and Y. Suzuki, "Enhancement of spatial sound recordings by adding," J. Inf. Hiding Multimedia Signal Process., vol. 8, no. 6, pp. 1392-1404, 2017.
[13]  M. Geier, J. Ahrens, and S. Spors, "Enhancing binaural reconstruction from rigid circular microphone array recordings by using virtual microphones," in Proc. AES: Conf. Audio Virtual Augmented Reality, 2018, pp. 194-202.
[14]  Lübeck, T.; Arend, J.M.; Pörschmann, C. Spatial Upsampling of Sparse Spherical Microphone Array Signals. IEEE/ACM Trans. Audio Speech Lang. Process. 2023, 31, 1163-1174.
[15]  Chen, X.; Ma, F.; Bastine, A.; Samarasinghe, P.; Sun, H. Sound Field Estimation around a Rigid Sphere with Physics-informed Neural Network. arXiv 2023, arXiv:2307.14013.
[16]  TS 26.253: "Codec for Immersive Voice and Audio Services (IVAS); Detailed Algorithmic Description including RTP payload format and SDP parameter definitions".

3  Definitions of terms, symbols and abbreviations

3.1  Terms

For the purposes of the present document, the terms given in TR 21.905 and the following apply. A term defined in the present document takes precedence over the definition of the same term, if any, in TR 21.905.

3.2  Symbols

Void

3.3  Abbreviations

For the purposes of the present document, the abbreviations given in TR 21.905 and the following apply. An abbreviation defined in the present document takes precedence over the definition of the same abbreviation, if any, in TR 21.905.
ACN     Ambisonic Channel Number
ADC     Analog-to-Digital Converter
AEC     Acoustical Echo Cancellation
AGC     Automatic Gain Control
AR      Augmented Reality
BLD     Back-Left-Down
BRU     Back-Right-Up
CI      Confidence Interval
DMA     Differential Microphone Array
DOA     Direction Of Arrival
FLU     Front-Left-Up
FOA     First Order Ambisonics
FRD     Front-Right-Down
HOA     High Order Ambisonics
HOA2    Second Order Ambisonics
HOA3    Third Order Ambisonics
IVAS    Immersive Voice and Audio Services
IRT     Institute for Radio Technology
MASA    Metadata-Assisted Spatial Audio
MASP    Microphone Array Signal Processing
MEMS    Micro-Electro-Mechanical Systems
M/S     Mid-Side
ORTF    Office de Radiodiffusion-Télévision Française
OSS     Optimal Stereo System
SN3D    Spherical Harmonics Normalization 3D
SNR     Signal-to-Noise Ratio
TWS     True Wireless Stereo
UE      User Equipment
VR      Virtual Reality