Content for TR 46.055 Word version: 18.0.0

...

0 Introduction

The SMG2 Speech Expert Group (SEG) started its activity early in 1995 for the standardization of an Enhanced Full Rate speech codec. To assess the performance of the submitted candidates, the Group produced a test plan for the first phase of testing (the Pre-selection Phase), which is described in permanent document SEG 4 (ETSI SMG2 SEG: SEG 4 (v 1.0) "A Subjective Pre-Selection Test Plan for the Enhanced Full Rate Speech Coding Algorithm"). This test plan is based on the general knowledge gained from past ITU-T and ETSI activities on codec evaluation (for instance, the GSM half rate and the recent ITU-T 8 kbit/s exercises). At the end of the Pre-selection Phase, SMG decided to standardize the PCS 1900 codec, known as the US 1 codec, and no formal characterisation testing was performed for the selected codec.

The present document therefore reports the results from the Pre-selection and Verification Phases of testing only. Consequently, the results reported here are less detailed, and their confidence intervals are wider, than those obtained for the GSM half rate standardization (GSM 06.08 [3]), where specific and detailed characterisation testing was performed. In addition, not all laboratories followed the same pre-selection test plan, which further complicates the interpretation of the results.

The following experiments included in SEG 4 were carried out by several laboratories in the Pre-selection Phase:

Experiment 1: Quality under error and tandeming conditions (A-law, Modified IRS);
Experiment 2: Quality under background noise conditions (Vehicular noise, UPCM, NoIRS);
Experiment 3: Quality under background noise conditions (Background music, UPCM, NoIRS);
Experiment 4: Talker Dependency (UPCM, NoIRS);
Experiment 5: Quality under high error conditions - EP3 (A-law, Modified IRS).

A practical 'indirect' method of comparing performance between different results was adopted, using the Modulated Noise Reference Unit (MNRU) (see note) as a reference degradation. The MNRU also allows results to be normalised across different laboratories carrying out the same experiment, through the conversion of MOS scores to Equivalent Q (dB). The Q (dB) values introduced in a test normally range from 0 to 50 dB. In SEG 4, Experiment 1 and Experiment 5, which address error conditions, both cover this range; the other experiments do not.
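
The conversion of MOS scores to Equivalent Q (dB) can be illustrated with a short sketch. It assumes that the MNRU conditions of one laboratory and experiment provide (Q, MOS) anchor points, and that a codec score is mapped onto the Q scale by linear interpolation between neighbouring anchors. The interpolation method, the function name equivalent_q and the anchor values are illustrative assumptions; they are not taken from SEG 4 or from any laboratory's data.

```python
from bisect import bisect_left


def equivalent_q(mos, mnru_points):
    """Map a MOS value to an Equivalent Q (dB) value.

    mnru_points: (q_db, mos) pairs measured for the MNRU conditions in the
    same laboratory and experiment. Linear interpolation between anchors is
    an assumption made for this sketch; scores outside the anchored range
    are clamped to the nearest anchor.
    """
    pts = sorted(mnru_points, key=lambda p: p[1])  # order anchors by MOS
    scores = [m for _, m in pts]
    if mos <= scores[0]:
        return float(pts[0][0])
    if mos >= scores[-1]:
        return float(pts[-1][0])
    i = bisect_left(scores, mos)  # first anchor whose MOS is >= mos
    (q_lo, m_lo), (q_hi, m_hi) = pts[i - 1], pts[i]
    return q_lo + (q_hi - q_lo) * (mos - m_lo) / (m_hi - m_lo)


# Hypothetical anchors, for illustration only.
anchors = [(10, 1.5), (20, 2.5), (30, 3.5), (40, 4.5)]
print(equivalent_q(3.0, anchors))  # 25.0
```

Because each laboratory is normalised against its own MNRU anchors, laboratories whose listeners differ in absolute MOS can still yield comparable Equivalent Q values for the same condition.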

Only four laboratories ran tests which followed the Pre-selection Test Plan described in SEG 4 (BT/lab1, CNET/lab2, Tele Denmark/lab3, NEC/lab4). MOTOROLA/lab5 participated in the Pre-selection Phase but their experiments did not comply with SEG 4. TI/lab8 ran only one experiment from SEG 4. Results produced by COMSAT/lab6, following a NOKIA-designed test plan, are part of the standardization of the codec in North America, and NOKIA/lab7 performed complementary experiments during the ETSI Pre-selection Phase.

As no further analysis has been undertaken to allow the averaging of scores across the different laboratories, results are reported in the annex on a laboratory-by-laboratory basis. For error and tandeming conditions, results are reported in terms of Equivalent Q (dB) values. For background noise conditions and talker dependency, results are reported in terms of DMOS values with either Confidence Interval (CI) or Standard Deviation (SD), as there is insufficient data available to normalise across laboratories via MNRU conditions.
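
As an illustration of how the two reported statistics relate, the sketch below computes a mean score, its Standard Deviation and a Confidence Interval from a list of individual votes. The use of the sample standard deviation, the normal approximation with z = 1.96 (a 95 % interval), the function name summarize_votes and the example votes are all assumptions made for this sketch; the source does not state how each laboratory derived its CI.

```python
import math
import statistics


def summarize_votes(votes, z=1.96):
    """Return (mean, sd, (ci_low, ci_high)) for a list of opinion scores.

    Assumes a normal approximation: half-width = z * sd / sqrt(n).
    """
    n = len(votes)
    mean = statistics.fmean(votes)
    sd = statistics.stdev(votes)  # sample standard deviation (n - 1)
    half_width = z * sd / math.sqrt(n)
    return mean, sd, (mean - half_width, mean + half_width)


# Hypothetical votes on a five-point scale, for illustration only.
mean, sd, ci = summarize_votes([3, 4, 4, 5])
print(mean, sd, ci)
```

Converting an SD into a CI requires the number of votes, so results reported with SD and results reported with CI cannot be compared directly unless the number of votes is also known.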

The quality performance of the EFR codec is compared to High and Low references introduced in permanent documents SEG 3 (ETSI SMG2 SEG: SEG 3 "Selection Criteria for the Enhanced Full Rate Speech Coding Algorithm - Speech Quality Requirements") and SEG 4 (ETSI SMG2 SEG: SEG 4 (v 1.0) "A Subjective Pre-Selection Test Plan for the Enhanced Full Rate Speech Coding Algorithm", Section 7). The Low and High references were chosen as representative of the "minimum" and "objective" performance targets respectively, and are reported in Table 1.

Table 1: References per condition: High Ref., Low Ref. and G.728

EXPERIMENTS (SEG 4)	Conditions	High Ref	Low Ref
EXP#1	EP0	G.728	G.728
EXP#1	EP1	MNRU 24 dB	TCH-FS (EP1)
EXP#1	EP2	TCH-FS (EP1)	TCH-FS (EP2)
EXP#5	EP3	TCH-FS (EP2)	TCH-FS (EP3)
EXP#1	EP0 (tandem)	G.728	G.728
EXP#1	EP1 (tandem)	TCH-FS (EP1)	TCH-FS (EP1 tandem)
EXP#2	Vehicle 10	G.728	G.728
EXP#3	Music 20	G.728	G.728
EXP#4	Male Talkers	G.728	G.728
EXP#4	Female Talkers	G.728	G.728
EXP#4	Children	G.728	G.728

A figure showing the general trend of the EFR behaviour under error conditions in a noise-free environment, compared to the high (G.728) and low (TCH-FS) references, is added to the individual laboratories' quantitative results (Figure 15). The general quality performance of the EFR codec is summarised in Table 15.

In the Verification Phase, the behaviour of the EFR codec under the following test conditions was tested:

behaviour of the DTX System;
performance with DTMF tones;
performance with network information tones;
performance with special input signals;
performance with music signals;
performance with noise signals;
performance with different languages;
delay of the TCH-EFR;
frequency response;
complexity.

The results of these tests are also included in this report under the respective clauses.

Furthermore, the EFR codec was checked for correct functioning for the following items:

test of overload point;
SID frame encoding;
muting behaviour;
idle channel behaviour.

No artefact or malfunctioning was detected for these items.

1 Scope

The present document gives background information on the performance of the GSM enhanced full rate speech codec. Experimental results from the Pre-selection and Verification tests carried out during the standardization process by the SEG (Speech Expert Group) are reported to give a more detailed picture of the behaviour of the GSM enhanced full rate speech codec under different conditions of operation.

2 References

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.

References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
For a specific reference, subsequent revisions do not apply.
For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.

[1]

GSM 03.05: "Digital cellular telecommunications system (Phase 2+); Technical performance objectives".

[2]

GSM 03.50: "Digital cellular telecommunications system (Phase 2+); Transmission planning aspects of the speech service in the GSM Public Land Mobile Network (PLMN) system".

[3]

GSM 06.08: "Digital cellular telecommunications system (Phase 2+); Half rate speech; Performance of the GSM half rate speech codec".

[4]

GSM 06.10: "Digital cellular telecommunications system (Phase 2+); Full rate speech transcoding".

[5]

GSM 06.20: "Digital cellular telecommunications system (Phase 2+); Half rate speech transcoding".

3 Abbreviations

For the purposes of the present document, the following abbreviations apply:

A/D

Analogue to Digital

ADPCM

Adaptive Differential Pulse Code Modulation

ACR

Absolute Category Rating

BSC

Base Station Controller

BTS

Base Transceiver Station

C/I

Carrier-to-Interferer ratio

CI

Confidence Interval

CNI

Comfort Noise Insertion

CRC

Cyclic Redundancy Check

D/A

Digital to Analogue

DAT

Digital Audio Tape

DCR

Degradation Category Rating

DSP

Digital Signal Processor

DTMF

Dual Tone Multi Frequency

DTX

Discontinuous Transmission for power consumption and interference reduction

EFR

Enhanced Full Rate

ESP

Product of E (Efficiency), S (Speed) and P (Percentage of Power) of the DSP

FR

Full Rate

GBER

Average gross bit error rate

GSM

Global System for Mobile communications

HR

Half Rate

IRS

Intermediate Reference System (No IRS = rather flat frequency response)

ITU-T

International Telecommunication Union - Telecommunication Standardization Sector

MNRU

Modulated Noise Reference Unit

Mod. IRS

Modified IRS

MOPS

Million Operations per Second

MOS

Mean Opinion Score

MS

Mobile Station

MSC

Mobile Switching Centre

PCM

Pulse Code Modulation

PSTN

Public Switched Telecommunications Network

Q

Speech-to-speech correlated noise power ratio in dB

SD

Standard Deviation

SEG

Speech Expert Group

SID

Silence Descriptor

SMG

Special Mobile Group

TCH-EFS

Traffic Channel Enhanced Full rate Speech

TCH-FS

Traffic Channel Full rate Speech

TCH-HS

Traffic Channel Half rate Speech

TDMA

Time Division Multiple Access

TMOPS

True Million Operations per Second

UPCM

Uniform or Linear PCM

VAD

Voice Activity Detector

WMOPS

Weighted Million Operations per Second

Four different Error Patterns (EP0, EP1, EP2 and EP3) were used, where:

EP0

without channel errors

EP1

C/I = 10 dB; 5% GBER (well inside a cell)

EP2

C/I = 7 dB; 8% GBER (at a cell boundary)

EP3

C/I = 4 dB; 13% GBER (outside a cell)
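
For illustration, the four Error Patterns defined above can be written down as a small lookup table. The sketch below also flips bits at the stated average gross bit error rate. The names ErrorPattern, ERROR_PATTERNS and apply_errors are hypothetical, and the independent (memoryless) bit-flip model is a strong simplification assumed only for this example: the source gives an average GBER per pattern, not the error distribution, and errors on a real radio channel are typically bursty.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ErrorPattern:
    name: str
    ci_db: Optional[float]  # carrier-to-interferer ratio in dB; None when error-free
    gber: float             # average gross bit error rate, as a fraction


# Values transcribed from the Error Pattern definitions above.
ERROR_PATTERNS = {
    "EP0": ErrorPattern("EP0", None, 0.00),
    "EP1": ErrorPattern("EP1", 10.0, 0.05),
    "EP2": ErrorPattern("EP2", 7.0, 0.08),
    "EP3": ErrorPattern("EP3", 4.0, 0.13),
}


def apply_errors(bits: List[int], pattern: ErrorPattern,
                 rng: random.Random) -> List[int]:
    """Flip each bit independently with probability pattern.gber.

    Independent flips are an assumption of this sketch, not a model of the
    channel simulation used in the tests.
    """
    return [b ^ int(rng.random() < pattern.gber) for b in bits]


rng = random.Random(0)
corrupted = apply_errors([0] * 100_000, ERROR_PATTERNS["EP3"], rng)
print(sum(corrupted) / len(corrupted))  # observed error rate, close to 0.13
```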