SA4 has been working on the selection of a codec to recommend for Speech Enabled Services since October 2002 under the WID for SES [9]. The usual process of agreeing "design constraints" [10], a "test and processing plan" [7] and "recommendation criteria" [8] was followed and completed before the candidates were evaluated.
Two candidate codecs were proposed and evaluated:
- ETSI Standard for the DSR Extended Advanced Front-end (ES 202 212)
- AMR and AMR-WB audio codec
The performance evaluations were conducted by two leading companies in the area of speech recognition, IBM and Scansoft. Results from these evaluations were presented at SA4#30 in February 2004 and are summarised here. The "recommendation criteria" have been applied and SA4 recommends the DSR codec for Speech Enabled Services. SES codecs are introduced in packet switched conversational services in TS 26.235 and TS 26.236.
This technical report provides information on the recognition performance of the DSR Extended Advanced Front End, as evaluated by the speech recognition vendors IBM and Scansoft for the selection of a codec for Speech Enabled Services. The performance results are provided both as absolute word error rates for DSR and AMR-NB/AMR-WB on an extensive range of evaluation databases and as relative word error rate reductions for DSR when compared to both the AMR-NB and AMR-WB codecs.
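As an illustration of how the two kinds of figure relate, the sketch below computes a relative word error rate reduction from two absolute word error rates. It assumes the conventional definition (the difference between the reference and candidate error rates, divided by the reference error rate); the exact computation used in the evaluations is not stated here. The function name and the numbers are hypothetical and are not results from the evaluations.

```python
def relative_wer_reduction(wer_reference: float, wer_candidate: float) -> float:
    """Return the relative WER reduction, in percent, of a candidate codec
    (e.g. DSR) with respect to a reference codec (e.g. AMR-NB or AMR-WB).

    Assumes the conventional definition:
        100 * (WER_reference - WER_candidate) / WER_reference
    A positive value means the candidate makes fewer word errors.
    """
    if wer_reference <= 0:
        raise ValueError("reference WER must be positive")
    return 100.0 * (wer_reference - wer_candidate) / wer_reference


# Illustrative values only (absolute WERs in percent), not measured results.
reduction = relative_wer_reduction(wer_reference=10.0, wer_candidate=8.0)
print(f"{reduction:.1f}% relative WER reduction")
```

A negative result would indicate that the candidate codec gives a higher word error rate than the reference on that database.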