Content for TS 26.253 (Word version: 18.1.1)
5
Functional description of the encoder
p. 36
5.1
Encoder overview
p. 36
5.2
Common processing and coding tools
p. 38
5.2.1
High-pass filtering
p. 38
5.2.2
Core-coder processing
p. 38
5.2.2.1
Overview
p. 38
5.2.2.2
Core-coder front pre-processing
p. 38
5.2.2.2.1
General
p. 38
5.2.2.2.2
Sample rate conversion
p. 38
5.2.2.2.3
Pre-emphasis
p. 39
5.2.2.2.4
Spectral analysis
p. 39
5.2.2.2.5
Signal activity detection
p. 40
5.2.2.2.6
Bandwidth detector
p. 40
5.2.2.2.7
Time domain transient detection
p. 41
5.2.2.2.8
Linear prediction analysis
p. 42
5.2.2.2.9
Open-loop pitch analysis
p. 42
5.2.2.2.10
Coding mode determination
p. 42
5.2.2.2.11
Speech/Music classification
p. 43
5.2.2.2.12
Core-coder technology pre-selection
p. 53
5.2.2.3
Core-coder modules
p. 60
5.2.2.3.1
Core-coder pre-processing
p. 60
5.2.2.3.2
LP based coding
p. 65
5.2.2.3.3
MDCT based coding
p. 93
5.2.2.3.4
Switching coding modes
p. 109
5.2.2.3.5
DTX/CNG operation
p. 112
5.2.3
Common audio coding tools
p. 114
5.2.3.1
General
p. 114
5.2.3.2
Single Channel Element (SCE)
p. 114
5.2.3.3
Channel Pair Element (CPE)
p. 115
5.2.3.4
Multichannel Coding Tool (MCT)
p. 117
5.2.3.4.1
Summary of the system
p. 117
5.2.3.4.2
MCT configurations
p. 118
5.2.3.4.3
Encoder single channel preprocessing up to whitened spectrum
p. 118
5.2.3.4.4
Joint channel Encoding System Description
p. 119
5.2.3.4.5
IGF for multi-channel processing
p. 124
5.2.3.4.6
Bitrate distribution
p. 124
5.2.3.4.7
Quantization and coding of each channel
p. 125
5.2.4
Common spatial metadata coding tools
p. 125
5.2.4.1
General
p. 125
5.2.4.2
Spatial metadata composition
p. 125
5.2.4.3
Direction metadata coding tools
p. 126
5.2.4.3.1
Overview
p. 126
5.2.4.3.2
Direction metadata quantization
p. 126
5.2.4.3.3
Direction metadata raw encoding
p. 129
5.2.4.3.4
Direction metadata entropy encoding tools
p. 129
5.2.4.4
Diffuseness and energy ratio coding methods
p. 130
5.2.4.4.1
Diffuseness and energy ratio definitions
p. 130
5.2.4.4.2
Diffuseness parameter quantization
p. 130
5.2.4.4.3
Diffuseness and energy ratio indices coding
p. 131
5.2.4.4.4
Diffuseness and energy ratio coding with two concurrent directions
p. 132
5.2.4.5
Direction metadata coding methods
p. 134
5.2.4.5.1
Entropy coding 1 (EC1)
p. 134
5.2.4.5.2
Entropy coding 2 (EC2)
p. 139
5.2.4.5.3
Entropy coding 3 (EC3)
p. 140
5.2.4.6
Coherence coding
p. 144
5.2.4.6.1
Spread coherence coding
p. 144
5.2.4.6.2
Surround coherence coding
p. 147
5.2.4.7
DTX coding
p. 147
5.2.5
Modified Discrete Fourier Transform (MDFT) Analysis Filter Bank
p. 148
5.2.5.1
General
p. 148
5.2.5.2
Modified Discrete Fourier Transform (MDFT)
p. 148
5.2.5.3
Filter bank responses
p. 148
5.2.5.3.1
General
p. 148
5.2.5.3.2
Magnitude responses for 5 ms stride operation
p. 149
5.2.5.3.3
Responses for sampling rates below 48 kHz
p. 150
5.3
Stereo audio operation
p. 151
5.3.1
Stereo format overview
p. 151
5.3.2
Unified stereo
p. 152
5.3.2.1
Overview
p. 152
5.3.2.2
Common unified stereo tools
p. 152
5.3.2.2.1
Inter-Channel Bandwidth Extension (IC-BWE)
p. 152
5.3.2.2.2
Front VAD
p. 163
5.3.2.3
TD-based stereo
p. 164
5.3.2.3.1
Inter-channel Alignment (ICA)
p. 164
5.3.2.3.2
TD stereo coder overview
p. 176
5.3.2.3.3
Time-domain stereo downmix
p. 177
5.3.2.3.4
Near out-of-phase operation (NOOP)
p. 179
5.3.2.3.4.1
NOOP signal detection
p. 179
5.3.2.3.4.2
NOOP sub-mode selection
p. 181
5.3.2.3.4.3
NOOP signal coding
p. 182
5.3.2.3.4.4
Adaptive mixing ratio for the NOOP signal
p. 183
5.3.2.3.5
LP filter coherence
p. 184
5.3.2.3.6
Open-loop pitch coherence
p. 186
5.3.2.3.7
Excitation coding in the secondary channel using two or four subframes
p. 188
5.3.2.3.8
Bit allocation
p. 190
5.3.2.3.9
Quantization of LP coefficients for the secondary channel
p. 191
5.3.2.4
DFT-based stereo
p. 193
5.3.2.4.1
General
p. 193
5.3.2.4.2
Time Domain ITD compensation
p. 194
5.3.2.4.3
STFT analysis
p. 195
5.3.2.4.4
GCC-PHAT ITD estimation
p. 196
5.3.2.4.4.5
Refined ITD control mechanism
p. 200
5.3.2.4.5
FD circular time-shift
p. 201
5.3.2.4.6
Stereo parameters estimation
p. 202
5.3.2.4.7
IPD calculation, stabilization and encoding scheme
p. 202
5.3.2.4.8
Calculation of the side and residual prediction gains
p. 206
5.3.2.4.9
Stereo parameter coding
p. 207
5.3.2.4.10
Active Downmix
p. 211
5.3.2.4.11
STFT synthesis
p. 212
5.3.2.4.12
Residual coding
p. 214
5.3.2.4.12.1
Overview
p. 214
5.3.2.4.12.2
Adaptive residual signal encoding
p. 217
5.3.2.4.12.2.1
Adaptive residual signal encoding parameter
p. 217
5.3.2.4.12.2.2
Adaptive downmix for stereo coding
p. 217
5.3.2.4.12.2.3
Calculation of downmixed signal and residual signal during transitional frame
p. 218
5.3.2.4.12.2.4
Adaptive downmix
p. 219
5.3.2.4.13
Reverberation gain parameter determination
p. 219
5.3.2.5
Stereo classifier
p. 220
5.3.2.5.1
General
p. 220
5.3.2.5.2
Classification of uncorrelated content
p. 220
5.3.2.5.3
Cross-talk detection using logistic regression
p. 223
5.3.2.5.4
Cross-talk detection based on the GCC-PHAT
p. 230
5.3.2.5.5
Stereo mode selection
p. 233
5.3.3
MDCT-based stereo
p. 236
5.3.3.1
General Overview
p. 236
5.3.3.2
ITD compensation
p. 236
5.3.3.3
MDCT core pre-processing
p. 237
5.3.3.3.1
General
p. 237
5.3.3.3.2
TCX-LTP parameter estimation
p. 237
5.3.3.3.3
Time-to-frequency transformations
p. 237
5.3.3.3.4
Mid-high bitrate pre-processing
p. 240
5.3.3.3.5
Temporal noise shaping (TNS)
p. 246
5.3.3.3.6
Spectral Noise Shaping (SNS)
p. 247
5.3.3.4
Stereo processing
p. 257
5.3.3.4.1
General
p. 257
5.3.3.4.2
Stereo spectral bands
p. 257
5.3.3.4.3
Global ILD normalization
p. 259
5.3.3.4.4
Conditional ILD correction
p. 259
5.3.3.4.5
Band-wise M/S decision
p. 260
5.3.3.4.6
M/S decision for IGF
p. 261
5.3.3.4.7
M/S Transformation
p. 262
5.3.3.4.8
Encoding of the stereo parameters
p. 263
5.3.3.5
Power spectrum calculation and spectrum noise measure
p. 263
5.3.3.6
Intelligent gap filling (IGF)
p. 263
5.3.3.6.1
General overview
p. 263
5.3.3.6.2
Stereo IGF encoding
p. 264
5.3.3.7
Bitrate distribution
p. 265
5.3.3.8
Quantization and encoding
p. 265
5.3.3.8.1
Spectrum quantization and coding
p. 265
5.3.3.8.2
Global gain estimator
p. 266
5.3.3.8.3
Rate-loop for constant bit rate and global gain
p. 266
5.3.3.8.4
Stereo noise level estimation
p. 266
5.3.4
Switching between stereo modes
p. 266
5.3.4.1
Overview
p. 266
5.3.4.2
General
p. 267
5.3.4.3
Memory handling
p. 269
5.3.4.4
Switching between DFT and TD stereo modes
p. 270
5.3.4.4.1
Switching from TD stereo to DFT stereo
p. 270
5.3.4.4.2
Switching from DFT stereo to TD stereo
p. 271
5.3.4.5
Switching between DFT and MDCT stereo modes
p. 274
5.3.4.5.1
Switching from DFT stereo to MDCT stereo
p. 274
5.3.4.5.2
Switching from MDCT stereo to DFT stereo
p. 274
5.3.4.6
Switching between TD and MDCT stereo modes
p. 274
5.3.4.6.1
Switching from TD stereo to MDCT stereo
p. 274
5.3.4.6.2
Switching from MDCT stereo to TD stereo
p. 275
5.3.5
DTX operation
p. 275
5.3.5.1
DTX in Unified stereo
p. 275
5.3.5.1.1
Signal activity detection in Unified stereo
p. 275
5.3.5.1.2
General
p. 276
5.3.5.1.3
Stereo CNG side gain encoding
p. 277
5.3.5.1.4
Stereo CNG ITD and IPD encoding
p. 278
5.3.5.1.5
Stereo CNG coherence encoding
p. 278
5.3.5.2
DTX in MDCT-based stereo
p. 281
5.3.5.2.1
Overview
p. 281
5.3.5.2.2
VAD in MDCT-based Stereo
p. 281
5.3.5.2.3
Coherence estimation
p. 281
5.3.5.2.4
Noise parameter estimation and encoding
p. 282
5.4
Scene-based audio (SBA) operation
p. 283
5.4.1
SBA format overview
p. 283
5.4.2
Combined DirAC and SPAR based SBA coding
p. 283
5.4.2.1
Combined DirAC and SPAR based SBA coding overview
p. 283
5.4.2.2
FOA signal coding at all bitrates and HOA2, HOA3 signal coding at bitrates below 256 kbps
p. 285
5.4.2.2.1
Overview
p. 285
5.4.2.2.2
Metadata parameter analysis
p. 285
5.4.2.2.3
Downmixing and core-coding
p. 286
5.4.2.3
HOA2 and HOA3 signal coding overview at 256 kbps
p. 286
5.4.2.3.1
Overview
p. 286
5.4.2.3.2
Metadata parameter analysis
p. 286
5.4.2.3.3
Downmixing and core-coding
p. 287
5.4.2.4
HOA2 and HOA3 signal coding overview at 384 kbps
p. 287
5.4.2.4.1
Overview
p. 287
5.4.2.4.2
Metadata parameter generation
p. 287
5.4.2.4.3
Downmixing and core-coding
p. 287
5.4.2.5
HOA2 and HOA3 signal coding overview at 512 kbps
p. 288
5.4.2.5.1
Overview
p. 288
5.4.2.5.2
Metadata parameter generation
p. 288
5.4.2.5.3
Downmixing and core-coding
p. 288
5.4.3
SBA parameter estimation
p. 289
5.4.3.1
SBA front VAD
p. 289
5.4.3.2
Transient Detection
p. 289
5.4.3.3
Analysis windowing and MDFT
p. 289
5.4.3.4
DirAC parameter estimation
p. 291
5.4.3.4.1
General
p. 291
5.4.3.4.2
High-order operation mode
p. 291
5.4.3.4.3
Low-order operation mode
p. 295
5.4.3.5
Hybrid encoder
p. 296
5.4.3.6
Mono signal detection
p. 296
5.4.3.6.1
Overview
p. 296
5.4.3.6.2
Signalling of Mono through the bitstream
p. 300
5.4.3.7
SPAR Metadata computation
p. 301
5.4.3.7.1
Overview
p. 301
5.4.3.7.2
Banded covariance computation
p. 302
5.4.3.7.3
Active W downmix detector
p. 303
5.4.3.7.4
Banded covariance re-mixing
p. 304
5.4.3.7.5
SPAR parameter estimation
p. 305
5.4.3.7.6
Passive-W channel prediction coefficients computation
p. 305
5.4.3.7.7
Active-W channel prediction coefficients computation
p. 305
5.4.3.7.8
Cross-Prediction coefficients computation
p. 308
5.4.3.7.9
Decorrelation coefficients computation
p. 309
5.4.3.7.10
Overview of quantization, bitrate distribution and coding of SPAR parameters
p. 309
5.4.3.7.11
SPAR bitrate distribution table
p. 310
5.4.3.7.12
Quantization and coding process
p. 313
5.4.3.7.13
Spectral filling in the side channels at 13.2 kbps and 16.4 kbps
p. 316
5.4.3.7.14
Time differential coding
p. 316
5.4.3.7.15
Band interleaving and 40ms MD update rate at 13.2 kbps and 16.4 kbps
p. 317
5.4.3.7.16
Entropy coding with Arithmetic coders
p. 318
5.4.3.7.17
Base2 coding with Huffman coder
p. 320
5.4.3.8
DirAC and SPAR parameter merge
p. 321
5.4.3.8.1
General
p. 321
5.4.3.8.2
DirAC to SPAR parameter conversion
p. 321
5.4.3.8.3
Covariance estimation from DirAC parameters
p. 321
5.4.3.8.4
Directional diffuseness
p. 323
5.4.3.8.5
SPAR parameter estimation from DirAC parameters
p. 323
5.4.3.8.6
Active W prediction from DirAC parameters
p. 323
5.4.3.8.7
Passive W prediction from DirAC parameters
p. 324
5.4.4
Downmix matrix calculation
p. 324
5.4.4.1
SPAR parameters to downmix matrix conversion
p. 324
5.4.4.2
Energy compensation in DirAC bands
p. 325
5.4.4.3
Band Interleaving of downmix matrix at 13.2 and 16.4 kbps
p. 325
5.4.5
Downmix generation
p. 325
5.4.5.1
Overview
p. 325
5.4.5.2
Time domain to MDFT conversion
p. 325
5.4.5.3
Filterbank mixing
p. 326
5.4.5.4
MDFT to Time domain conversion
p. 326
5.4.5.5
Windowing and crossfading
p. 326
5.4.5.6
Time domain mixing
p. 327
5.4.6
Principal Component Analysis (PCA)
p. 327
5.4.7
Automatic Gain Control (AGC)
p. 333
5.4.8
Core-coder encoding
p. 335
5.4.8.1
General
p. 335
5.4.8.2
MCT bitrate distribution in SBA high bitrate mode
p. 335
5.4.9
DTX operation
p. 336
5.4.9.1
Overview
p. 336
5.4.9.2
DirAC parameter estimation
p. 336
5.4.9.3
SPAR parameter estimation
p. 336
5.4.9.3.1
General
p. 336
5.4.9.3.2
Banded covariance smoothing
p. 337
5.4.9.3.3
Quantization and coding in VAD inactive frames
p. 337
5.4.9.4
SPAR and DirAC parameter merge
p. 337
5.4.9.5
CNG parameters encoding
p. 338
5.4.10
SBA bitrate switching
p. 338
5.5
Metadata-assisted spatial audio (MASA) operation
p. 339
5.5.1
MASA format overview
p. 339
5.5.2
MASA format metadata input
p. 340
5.5.2.1
Obtaining the MASA format metadata
p. 340
5.5.2.2
Direction index deindexing
p. 340
5.5.2.3
TF tile based energy calculation
p. 342
5.5.3
Coding of MASA format input data
p. 342
5.5.3.1
Overview
p. 342
5.5.3.2
MASA metadata pre-encoding configuration, processing, and signalling
p. 342
5.5.3.2.1
Overview
p. 342
5.5.3.2.2
Bitrate dependent parameter settings
p. 343
5.5.3.2.3
Metadata composition detection, alignment, and check for validity
p. 344
5.5.3.2.4
Common MASA metadata codec configuration
p. 350
5.5.3.2.5
Energy ratio compensation
p. 352
5.5.3.2.6
Merging of MASA spatial parameter metadata across MASA frequency bands and subframes
p. 353
5.5.3.2.7
Combining of MASA spatial audio metadata across multiple directions
p. 355
5.5.3.2.8
Metadata reductions for low rate
p. 357
5.5.3.2.9
Reordering of directional data with two concurrent directions
p. 358
5.5.3.2.10
Signalling bits
p. 359
5.5.3.2.11
Adjustment of highest transmitted MASA metadata band
p. 359
5.5.3.3
MASA metadata quantization and encoding
p. 360
5.5.3.3.1
Overview
p. 360
5.5.3.3.2
Energy ratio encoding
p. 361
5.5.3.3.3
Direction encoding
p. 361
5.5.3.3.4
Spread coherence encoding
p. 362
5.5.3.3.5
Surround coherence encoding
p. 362
5.5.3.3.6
Coding of the second direction parameters
p. 362
5.5.4
Encoding of MASA audio transport channels
p. 363
5.5.5
DTX operation
p. 363
5.5.6
Bitstream structure
p. 363
5.5.7
MASA bitrate switching
p. 364
5.6
Object-based audio (ISM) operation
p. 364
5.6.1
ISM format overview
p. 364
5.6.2
Discrete ISM coding mode
p. 365
5.6.2.1
Overview
p. 365
5.6.2.2
DiscISM encoding system
p. 365
5.6.2.3
Bitrates distribution between audio streams
p. 366
5.6.2.3.1
Bitrate distribution algorithm
p. 366
5.6.2.3.2
Bitrate adaptation based on ISM importance
p. 367
5.6.3
Parametric ISM coding mode
p. 368
5.6.3.1
General
p. 368
5.6.3.2
Parameters
p. 368
5.6.3.3
Noisy Speech
p. 369
5.6.3.4
Downmix
p. 370
5.6.3.5
Encoded Data
p. 372
5.6.4
ISM metadata coding
p. 372
5.6.4.1
General
p. 372
5.6.4.2
Direction metadata encoding and quantization
p. 373
5.6.4.2.1
General
p. 373
5.6.4.2.2
Intra-object metadata coding logic
p. 374
5.6.4.2.3
Inter-object metadata coding logic
p. 374
5.6.4.3
Extended metadata encoding and quantization
p. 375
5.6.4.4
Panning gain in non-diegetic rendering
p. 375
5.6.5
Bitstream structure
p. 375
5.6.5.1
Overview
p. 375
5.6.5.2
ISM signalling
p. 376
5.6.5.3
Coded metadata payload
p. 376
5.6.5.4
Audio streams payload
p. 376
5.6.6
DTX operation
p. 376
5.6.6.1
Overview
p. 376
5.6.6.2
ISM DTX operation overview
p. 377
5.6.6.3
Classification of Inactive Frames in ISM Encoder
p. 377
5.6.6.3.1
General
p. 377
5.6.6.3.2
Global SID Counter
p. 378
5.6.6.4
SID Metadata Analysis, Quantization and Coding
p. 379
5.6.6.5
CNG parameters encoding
p. 380
5.6.6.6
SID bitstream structure
p. 381
5.6.6.6.1
Overview
p. 381
5.6.6.6.2
ISM Common Signalling in SID Frame
p. 382
5.6.6.6.3
SID Data Payload
p. 382
5.6.6.6.4
SID Audio Streams Payload
p. 382
5.6.7
ISM bitrate switching
p. 383
5.7
Multi-channel audio (MC) operation
p. 383
5.7.1
MC format overview
p. 383
5.7.2
LFE channel encoding
p. 383
5.7.2.1
Low-pass filtering
p. 383
5.7.2.2
MDCT based LFE encoding
p. 384
5.7.2.2.1
Overview
p. 384
5.7.2.2.2
Windowing and MDCT
p. 384
5.7.2.2.3
Quantization
p. 385
5.7.2.2.4
Coding
p. 386
5.7.3
Multi-channel MASA (McMASA) coding mode
p. 387
5.7.3.1
McMASA coding mode overview
p. 387
5.7.3.2
LFE energy computation
p. 389
5.7.3.2.1
LFE energy computation in normal sub-mode
p. 389
5.7.3.2.2
LFE energy computation in separate-channel sub-mode
p. 390
5.7.3.3
McMASA spatial audio parameter estimation
p. 390
5.7.3.4
Even loudspeaker setup determination
p. 393
5.7.3.5
McMASA transport audio signal generation
p. 394
5.7.3.6
McMASA metadata encoding
p. 395
5.7.3.6.1
McMASA metadata pre-encoding configuration and processing
p. 395
5.7.3.6.2
Spatial metadata encoding
p. 395
5.7.3.6.3
LFE-to-total energy ratio encoding
p. 396
5.7.3.7
McMASA transport audio signal encoding
p. 397
5.7.3.8
Bitstream structure
p. 398
5.7.4
Parametric MC coding mode
p. 398
5.7.4.1
ParamMC coding mode overview
p. 398
5.7.4.2
ParamMC parameter frame index initialization and update
p. 399
5.7.4.3
ParamMC transport channel and parameter band configuration
p. 399
5.7.4.4
ParamMC transport audio signal generation
p. 400
5.7.4.5
ParamMC time domain transient detection
p. 401
5.7.4.6
ParamMC analysis windowing and MDFT
p. 401
5.7.4.7
ParamMC covariance estimation
p. 402
5.7.4.8
ParamMC default settings for the LFE channel
p. 403
5.7.4.9
ParamMC parameter based transient detection
p. 403
5.7.4.10
ParamMC band combining in case of transients
p. 403
5.7.4.11
ParamMC complex valued to real valued covariance conversion
p. 403
5.7.4.12
ParamMC LFE activity detection
p. 404
5.7.4.13
ParamMC parameter quantization
p. 404
5.7.4.13.1
Parameter quantizers
p. 404
5.7.4.13.2
Interchannel Level differences (ICLDs)
p. 404
5.7.4.13.3
Inter-channel coherences (ICCs)
p. 406
5.7.4.14
ParamMC parameter encoding
p. 406
5.7.4.14.1
Common ParamMC parameter encoding
p. 406
5.7.4.14.2
ICC and ICLD Parameter quantization indices encoding
p. 407
5.7.4.15
ParamMC transport audio signal encoding
p. 410
5.7.4.16
ParamMC bit rate switching
p. 410
5.7.5
Parametric upmix MC coding mode
p. 411
5.7.5.1
General
p. 411
5.7.5.2
Sectioning and downmixing
p. 411
5.7.5.3
Metadata parameter calculation
p. 411
5.7.5.4
Quantization
p. 413
5.7.5.5
Entropy Coding
p. 414
5.7.6
Discrete MC coding mode
p. 415
5.7.7
MC bitrate switching
p. 415
5.8
Combined Object-based audio and SBA (OSBA) operation
p. 415
5.8.1
OSBA format overview
p. 415
5.8.2
Low-bitrate pre-rendering OSBA coding mode
p. 415
5.8.3
High-bitrate discrete OSBA coding mode
p. 416
5.8.4
OSBA bitrate switching
p. 416
5.9
Combined Object-based audio and MASA (OMASA) operation
p. 417
5.9.1
OMASA format overview
p. 417
5.9.2
OMASA format configurations
p. 417
5.9.3
OMASA pre-coding processing tools
p. 418
5.9.3.1
OMASA spatial audio parameter analysis
p. 418
5.9.3.2
OMASA MASA metadata combining
p. 419
5.9.3.3
OMASA audio signals downmix
p. 420
5.9.3.4
Determination of an object to be separated
p. 421
5.9.3.5
Separation of an object from other objects
p. 422
5.9.4
Low-bitrate pre-rendering (Rend OMASA) coding mode
p. 423
5.9.4.1
Overview
p. 423
5.9.4.2
Low-bitrate pre-rendering coding method
p. 423
5.9.5
One object with MASA representation (One MASA) coding mode
p. 423
5.9.5.1
Overview
p. 423
5.9.5.2
One object with MASA representation coding method
p. 424
5.9.6
Parametric one object (Param OMASA) coding mode
p. 424
5.9.6.1
Overview
p. 424
5.9.6.2
Determination of OMASA parametric information
p. 425
5.9.6.3
One object with parametric representation coding method
p. 425
5.9.6.3.1
Coding method overview
p. 425
5.9.6.3.2
Encoding of MASA-to-total energy ratios
p. 426
5.9.6.3.3
Encoding of ISM energy ratios
p. 426
5.9.6.3.4
Encoding of ISM metadata
p. 430
5.9.7
Discrete (Disc OMASA) coding mode
p. 431
5.9.8
Inter-format bitrate adaptation
p. 431
5.9.9
OMASA bitrate switching
p. 434
5.9.10
OMASA bitstream structure
p. 434
5.10
EVS-compatible mono audio operation
p. 434
5.11
Stereo downmix operation for EVS mono coding
p. 434
5.11.1
Overview
p. 434
5.11.2
Phase-only correlation (POC) mode
p. 435
5.11.2.1
Overview
p. 435
5.11.2.2
ITD analysis
p. 435
5.11.2.2.1
Cross talk adding procedure in the frequency domain
p. 435
5.11.2.2.2
Approximated phase spectrum
p. 435
5.11.2.2.3
Calculation of phase only correlation
p. 436
5.11.2.2.4
Calculation of ITD and weighting coefficients
p. 436
5.11.2.3
Downmix process
p. 438
5.11.3
Phase compensation (PHA) mode
p. 439
5.11.3.1
Overview
p. 439
5.11.4
Mode selection rule
p. 443
5.11.5
Handling of mode switching
p. 443