The present document gives an overview of the requirements for background acoustic noise evaluation, noise parameter encoding/decoding and comfort noise generation for the Enhanced Voice Services (EVS) speech codec during Discontinuous Transmission (DTX) operation.
The following documents contain provisions which, through reference in this text, constitute provisions of the present document.
- References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
- For a specific reference, subsequent revisions do not apply.
- For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.
[1] TR 21.905: "Vocabulary for 3GPP Specifications".
[2] TS 26.445: "Codec for Enhanced Voice Services (EVS); Codec Detailed Algorithmic Description".
[3] TS 26.442: "Codec for Enhanced Voice Services (EVS); EVS Codec ANSI C code (fixed-point)".
[4] TS 26.443: "Codec for Enhanced Voice Services (EVS); EVS Codec ANSI C code (floating-point)".
[5] TS 26.452: "Codec for Enhanced Voice Services (EVS); ANSI C code; Alternative fixed-point using updated basic operators".
For the purposes of the present document, the abbreviations given in TR 21.905 and the following apply. An abbreviation defined in the present document takes precedence over the definition of the same abbreviation, if any, in TR 21.905.
CNG      Comfort Noise Generation
EVS      Enhanced Voice Services
FD-CNG   Frequency Domain based CNG
LP-CNG   Linear Prediction based CNG
SID      Silence Insertion Descriptor
The procedure of the present document is mandatory for implementation in all network entities and UEs supporting the EVS codec.
In the case of discrepancy between the EVS comfort noise aspects described in the present document and its ANSI-C code specification contained in TS 26.442, the procedure defined by TS 26.442 prevails. In the case of discrepancy between the procedure described in the present document and its ANSI-C code specification contained in TS 26.443, the procedure defined by TS 26.443 prevails. In the case of discrepancy between the procedure described in the present document and its ANSI-C code specification contained in TS 26.452, the procedure defined by TS 26.452 prevails.
A basic problem when using DTX is that the background acoustic noise, which is transmitted together with the speech, would disappear when the transmission is cut, resulting in discontinuities of the background noise. Since the DTX switching can take place rapidly, it has been found that this effect can be very annoying for the listener, especially in a car environment with high background noise levels. In severe cases, the speech may be barely intelligible.
The present document specifies the way to overcome this problem by generating, on the receive (RX) side, synthetic noise similar to the background noise on the transmit (TX) side. The comfort noise parameters are estimated on the TX side and transmitted to the RX side at a regular rate when speech is not present. This allows the comfort noise to adapt to changes of the noise on the TX side.
The Enhanced Voice Services (EVS) speech codec supports two Comfort Noise Generation (CNG) schemes: a linear prediction based scheme (LP-CNG) and a frequency domain based scheme (FD-CNG). The selection of one of the two schemes is performed within the transmit side functions on the basis of the input signal. The parameters for generating the comfort noise are packed as a Silence Insertion Descriptor (SID) payload.
By default, the command line fixes the CNG update interval at 8 frames. However, the update rate of the SID payload can be configured by a command line parameter, either to a fixed interval or to a mode in which the interval is adapted to the background noise. In the fixed rate mode the interval can be set between 1 and 100 frames, while in the adaptive rate mode it is limited to between 8 and 50 frames depending on the noise behaviour.
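The limits on the SID update interval stated above can be illustrated with a minimal sketch. The function and constant names below are hypothetical and do not come from the reference code in TS 26.442, TS 26.443 or TS 26.452. The way the adaptive mode derives an interval from the noise behaviour is not covered here, so the sketch only assumes that some proposed interval is available and restricts it to the stated range.

```python
from typing import Optional

# Values taken from the text; the names are illustrative only.
DEFAULT_SID_INTERVAL = 8      # default update interval, in frames
FIXED_MIN, FIXED_MAX = 1, 100       # limits of the fixed rate mode
ADAPTIVE_MIN, ADAPTIVE_MAX = 8, 50  # limits of the adaptive rate mode


def fixed_sid_interval(requested: Optional[int] = None) -> int:
    """Return the SID update interval for the fixed rate mode.

    With no request the default of 8 frames is used; a request outside
    1..100 frames is rejected (rejecting is an assumption of this sketch).
    """
    if requested is None:
        return DEFAULT_SID_INTERVAL
    if not FIXED_MIN <= requested <= FIXED_MAX:
        raise ValueError(
            f"fixed SID interval must be {FIXED_MIN}..{FIXED_MAX} frames"
        )
    return requested


def clamp_adaptive_interval(proposed: int) -> int:
    """Restrict a proposed adaptive interval to 8..50 frames.

    `proposed` stands in for whatever the noise analysis suggests;
    that analysis is outside the scope of this sketch.
    """
    return max(ADAPTIVE_MIN, min(ADAPTIVE_MAX, proposed))
```

For example, `fixed_sid_interval()` yields the default of 8 frames, while `clamp_adaptive_interval(70)` is restricted to 50 frames.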
As the functions of the CNG processing are highly integrated into the speech codec and make use of other coding parameters, the present document only provides an overview of the functions. The relevant references to the algorithmic descriptions are provided in the following.
A computational description of comfort noise encoding and generation in the form of ANSI-C source code is given in TS 26.442 and TS 26.452 for the two fixed-point implementations, which use different sets of basic operators, and in TS 26.443 for the floating-point implementation.
For the EVS primary modes, the SID payload consists of 48 bits. The first bit of the payload determines the CNG scheme, where 0 stands for the LP-CNG and 1 for the FD-CNG.
For the EVS AMR-WB IO modes, the SID payload consists of 35 bits.
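As an illustration of the SID payload sizes and the scheme bit described above, the following sketch reads the CNG scheme from a primary-mode SID payload. It assumes that the 48 bits are supplied as 6 octets and that the "first bit" is the most significant bit of the first octet; this bit ordering is an assumption of the sketch, not a statement of the present document. All names are hypothetical and are not taken from the reference code.

```python
from enum import Enum

# Payload sizes stated in the text.
PRIMARY_SID_BITS = 48     # EVS primary modes
AMR_WB_IO_SID_BITS = 35   # EVS AMR-WB IO modes (not parsed in this sketch)


class CngScheme(Enum):
    LP_CNG = 0  # first payload bit is 0
    FD_CNG = 1  # first payload bit is 1


def primary_sid_cng_scheme(payload: bytes) -> CngScheme:
    """Return the CNG scheme signalled by a primary-mode SID payload.

    Assumes MSB-first bit packing, so the first payload bit is bit 7
    of payload[0].
    """
    if len(payload) * 8 != PRIMARY_SID_BITS:
        raise ValueError(
            f"primary SID payload must be {PRIMARY_SID_BITS} bits, "
            f"got {len(payload) * 8}"
        )
    first_bit = (payload[0] >> 7) & 1
    return CngScheme(first_bit)
```

A payload whose first octet is `0x80` would be classified as FD-CNG under these assumptions, and an all-zero payload as LP-CNG. The remaining 47 bits carry the scheme-specific noise parameters, whose layout is outside the scope of this sketch.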