RFC 6787

Media Resource Control Protocol Version 2 (MRCPv2)

Pages: 224
Proposed Standard

10.5.  Recorder Message Body

   If the RECORD request did not have a Record-URI header field, the
   STOP response or the RECORD-COMPLETE event MUST contain a message
   body carrying the captured audio.  In this case, the message carrying
   the audio content has a Record-URI header field with a Content ID
   value pointing to the message body entity that contains the recorded
   audio.  See Section 10.4.7 for details.

10.6.  RECORD

   The RECORD request places the recorder resource in the recording
   state.  Depending on the header fields specified in the RECORD
   method, the resource may start recording the audio immediately or
   wait for the endpointing functionality to detect speech in the audio.
   The audio is then made available to the client either in the message
   body or as specified by Record-URI.

   The server MUST support the 'https' URI scheme and MAY support other
   schemes.  Note that, due to the sensitive nature of voice recordings,
   any protocols used for dereferencing SHOULD employ integrity and
   confidentiality, unless other means, such as use of a controlled
   environment (see Section 4.2), are employed.

   If a RECORD operation is already in progress, invoking this method
   causes the server to issue a response having a status-code of 402
   "Method not valid in this state" and a request-state of COMPLETE.

   If the Record-URI is not valid, a status-code of 404 "Illegal Value
   for Header Field" is returned in the response.  If it is impossible
   for the server to create the requested stored content, a status-code
   of 407 "Method or Operation Failed" is returned.

   If the type specified in the Media-Type header field is not
   supported, the server MUST respond with a status-code of 409
   "Unsupported Header Field Value" with the Media-Type header field in
   its response.
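
   The failure cases above can be summarized as a lookup from
   status-code to reason phrase.  The following Python sketch is
   illustrative only; the names RECORD_STATUS, describe_record_status,
   and record_started are assumptions made for this example and are not
   defined by this specification.

```python
# Status-codes a RECORD response may carry, as described in the text above.
# The table name and both helper functions are hypothetical.
RECORD_STATUS = {
    200: "Success",
    402: "Method not valid in this state",
    404: "Illegal Value for Header Field",
    407: "Method or Operation Failed",
    409: "Unsupported Header Field Value",
}


def describe_record_status(status_code: int) -> str:
    """Return the reason phrase for a RECORD status-code, if listed above."""
    return RECORD_STATUS.get(status_code, "Unrecognized status-code")


def record_started(status_code: int, request_state: str) -> bool:
    """A recording is under way only for a 200 response in IN-PROGRESS state."""
    return status_code == 200 and request_state == "IN-PROGRESS"
```

   A client might call record_started(200, "IN-PROGRESS") before it
   starts waiting for the START-OF-INPUT and RECORD-COMPLETE events.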

   When the recording operation is initiated, the response indicates an
   IN-PROGRESS request state.  The server MAY generate a subsequent
   START-OF-INPUT event when speech is detected.  Upon completion of the
   recording operation, the server generates a RECORD-COMPLETE event.

   C->S:  MRCP/2.0 ... RECORD 543257
          Channel-Identifier:32AECB23433802@recorder
          Record-URI:<file://mediaserver/recordings/myfile.wav>
          Media-Type:audio/wav
          Capture-On-Speech:true
          Final-Silence:300
          Max-Time:6000

   S->C:  MRCP/2.0 ... 543257 200 IN-PROGRESS
          Channel-Identifier:32AECB23433802@recorder

   S->C:  MRCP/2.0 ... START-OF-INPUT 543257 IN-PROGRESS
          Channel-Identifier:32AECB23433802@recorder

   S->C:  MRCP/2.0 ... RECORD-COMPLETE 543257 COMPLETE
          Channel-Identifier:32AECB23433802@recorder
          Completion-Cause:000 success-silence
          Record-URI:<file://mediaserver/recordings/myfile.wav>;
                     size=242552;duration=25645

                              RECORD Example
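
   The Record-URI value returned in the RECORD-COMPLETE event above
   carries the URI together with "size" and "duration" parameters.  A
   minimal Python sketch of how a client might split that value is
   given below.  The names RecordUri and parse_record_uri are
   hypothetical, the sketch assumes that folded continuation lines
   belong to the same header value, and it takes the units to be octets
   and milliseconds, as Section 10.4.7 is understood to define them.

```python
import re
from typing import NamedTuple, Optional


class RecordUri(NamedTuple):
    uri: str
    size: Optional[int]      # assumed to be octets
    duration: Optional[int]  # assumed to be milliseconds


def parse_record_uri(value: str) -> RecordUri:
    """Parse a value such as '<file://host/a.wav>;size=242552;duration=25645'."""
    match = re.match(r"\s*<([^>]*)>(.*)$", value, re.DOTALL)
    if match is None:
        raise ValueError("Record-URI value must start with '<uri>'")
    uri, rest = match.group(1), match.group(2)
    params = {}
    for item in rest.split(";"):
        item = item.strip()
        if not item:
            continue  # skip the empty piece before the first ';'
        name, _, number = item.partition("=")
        params[name.strip().lower()] = int(number)
    return RecordUri(uri, params.get("size"), params.get("duration"))
```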

10.7.  STOP

   The STOP method moves the recorder from the recording state back to
   the idle state.  If a RECORD request is active and the STOP request
   successfully terminates it, then the STOP response MUST contain an
   Active-Request-Id-List header field containing the RECORD request-id
   that was terminated.  In this case, no RECORD-COMPLETE event is sent
   for the terminated request.  If there was no recording active, then
   the response MUST NOT contain an Active-Request-Id-List header field.
   If the recording was a success, the STOP response MUST contain a
   Record-URI header field pointing to the recorded audio content or to
   a typed entity in the body of the STOP response containing the
   recorded audio.  The STOP method MAY have a Trim-Length header field,
   in which case the specified length of audio is trimmed from the end
   of the recording after the stop.  In any case, the response MUST
   contain a status-code of 200 "Success".

   C->S:  MRCP/2.0 ... RECORD 543257
          Channel-Identifier:32AECB23433802@recorder
          Record-URI:<file://mediaserver/recordings/myfile.wav>
          Capture-On-Speech:true
          Final-Silence:300
          Max-Time:6000

   S->C:  MRCP/2.0 ... 543257 200 IN-PROGRESS
          Channel-Identifier:32AECB23433802@recorder

   S->C:  MRCP/2.0 ... START-OF-INPUT 543257 IN-PROGRESS
          Channel-Identifier:32AECB23433802@recorder

   C->S:  MRCP/2.0 ... STOP 543258
          Channel-Identifier:32AECB23433802@recorder
          Trim-Length:200

   S->C:  MRCP/2.0 ... 543258 200 COMPLETE
          Channel-Identifier:32AECB23433802@recorder
          Record-URI:<file://mediaserver/recordings/myfile.wav>;
                     size=324253;duration=24561
          Active-Request-Id-List:543257

                               STOP Example
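
   A client can use the Active-Request-Id-List header field of the STOP
   response to learn whether its RECORD request was terminated and,
   therefore, whether a RECORD-COMPLETE event is still to be expected.
   The following Python sketch illustrates this.  The function names
   and the dictionary representation of the header fields are
   assumptions made for this example, as is the comma-separated form of
   the list.

```python
from typing import Dict, List


def stopped_request_ids(stop_headers: Dict[str, str]) -> List[int]:
    """Return the request-ids listed in Active-Request-Id-List, if any."""
    value = stop_headers.get("Active-Request-Id-List", "")
    return [int(token) for token in value.split(",") if token.strip()]


def terminated_by_stop(record_request_id: int,
                       stop_headers: Dict[str, str]) -> bool:
    """True if STOP ended this RECORD, so no RECORD-COMPLETE will follow."""
    return record_request_id in stopped_request_ids(stop_headers)
```

   If the header field is absent, no recording was active when the STOP
   request was processed, and the function returns False.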

10.8.  RECORD-COMPLETE

   If the recording completes due to no input, silence after speech, or
   reaching the max-time, the server MUST generate the RECORD-COMPLETE
   event to the client with a request-state of COMPLETE.  If the
   recording was a success, the RECORD-COMPLETE event contains a Record-
   URI header field pointing to the recorded audio file on the server or
   to a typed entity in the message body containing the recorded audio.

   C->S:  MRCP/2.0 ... RECORD 543257
          Channel-Identifier:32AECB23433802@recorder
          Record-URI:<file://mediaserver/recordings/myfile.wav>
          Capture-On-Speech:true
          Final-Silence:300
          Max-Time:6000

   S->C:  MRCP/2.0 ... 543257 200 IN-PROGRESS
          Channel-Identifier:32AECB23433802@recorder

   S->C:  MRCP/2.0 ... START-OF-INPUT 543257 IN-PROGRESS
          Channel-Identifier:32AECB23433802@recorder

   S->C:  MRCP/2.0 ... RECORD-COMPLETE 543257 COMPLETE
          Channel-Identifier:32AECB23433802@recorder
          Completion-Cause:000 success-silence
          Record-URI:<file://mediaserver/recordings/myfile.wav>;
                     size=325325;duration=24652

                          RECORD-COMPLETE Example

10.9.  START-INPUT-TIMERS

   This request is sent from the client to the recorder resource when
   the client discovers that a kill-on-barge-in prompt has finished
   playing (see Section 8.4.2).  This is useful when the recorder and
   synthesizer resources are not in the same MRCPv2 session.  When a
   kill-on-barge-in prompt is being played, the client wants the RECORD
   request to be simultaneously active so that it can detect and
   implement kill-on-barge-in.  But at the same time, the client doesn't
   want the recorder resource to start the no-input timers until the
   prompt is finished.  The Start-Input-Timers header field in the
   RECORD request allows the client to say if the timers should be
   started or not.  In the above case, the recorder resource does not
   start the timers until the client sends a START-INPUT-TIMERS method
   to the recorder.
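
   The interaction between the Start-Input-Timers header field and the
   START-INPUT-TIMERS method can be pictured with a toy model of the
   recorder's no-input timer.  The Python sketch below is illustrative
   only.  The class and attribute names are hypothetical, and the
   default of "true" for Start-Input-Timers is an assumption of this
   example.

```python
class RecorderTimerModel:
    """Toy model: tracks only whether the no-input timers are running."""

    def __init__(self) -> None:
        self.recording = False
        self.timers_running = False

    def record(self, start_input_timers: bool = True) -> None:
        """RECORD: timers start at once unless the client says otherwise."""
        self.recording = True
        self.timers_running = start_input_timers

    def start_input_timers(self) -> None:
        """START-INPUT-TIMERS: sent once the prompt has finished playing."""
        if not self.recording:
            raise RuntimeError("no RECORD request is active")
        self.timers_running = True
```

   In the barge-in scenario, the client would call record(False) while
   the prompt plays and start_input_timers() when playback ends.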

10.10.  START-OF-INPUT

   The START-OF-INPUT event is returned from the server to the client
   once the server has detected speech.  This event is always returned
   by the recorder resource when speech has been detected.  The recorder
   resource also MUST send a Proxy-Sync-Id header field with a unique
   value for this event.

   S->C:  MRCP/2.0 ... START-OF-INPUT 543259 IN-PROGRESS
          Channel-Identifier:32AECB23433801@recorder
          Proxy-Sync-Id:987654321

11.  Speaker Verification and Identification

   This section describes the methods, responses and events employed by
   MRCPv2 for doing speaker verification/identification.

   Speaker verification is a voice authentication methodology that can
   be used to identify the speaker in order to grant the user access to
   sensitive information and transactions.  Because speech is a
   biometric, a number of essential security considerations related to
   biometric authentication technologies apply to its implementation and
   usage.  Implementers should carefully read Section 12 in this
   document and the corresponding section of the SPEECHSC requirements
   [RFC4313].  Implementers and deployers of this technology are
   strongly encouraged to check the state of the art for any new risks
   and solutions that might have been developed.

   In speaker verification, a recorded utterance is compared to a
   previously stored voiceprint, which is in turn associated with a
   claimed identity for that user.  Verification typically consists of
   two phases: a designation phase to establish the claimed identity of
   the caller and an execution phase in which a voiceprint is either
   created (training) or used to authenticate the claimed identity
   (verification).

   Speaker identification is the process of associating an unknown
   speaker with a member in a population.  It does not employ a claim of
   identity.  When an individual claims to belong to a group (e.g., one
   of the owners of a joint bank account), a group authentication is
   performed.  This is generally implemented as a kind of verification
   involving comparison with more than one voice model.  It is sometimes
   called 'multi-verification'.  If the individual speaker can be
   identified from the group, this may be useful for applications where
   multiple users share the same access privileges to some data or
   application.  Speaker identification and group authentication are
   also done in two phases, a designation phase and an execution phase.
   Note that, from a functionality standpoint, identification can be
   thought of as a special case of group authentication (if the
   individual is identified) where the group is the entire population,
   although the implementation of speaker identification may be
   different from the way group authentication is performed.  To
   accommodate single-voiceprint verification, verification against
   multiple voiceprints, group authentication, and identification, this
   specification provides a single set of methods that can take a list
   of identifiers, called "voiceprint identifiers", and return a list of
   identifiers, with a score for each that represents how well the input
   speech matched each identifier.  The input and output lists of
   identifiers do not have to match, allowing a vendor-specific group
   identifier to be used as input to indicate that identification is to
   be performed.  In this specification, the terms "identification" and
   "multi-verification" are used to indicate that the input represents a
   group (potentially the entire population) and that results for
   multiple voiceprints may be returned.

   It is possible for a verifier resource to share the same session with
   a recognizer resource or to operate independently.  In order to share
   the same session, the verifier and recognizer resources MUST be
   allocated from within the same SIP dialog.  Otherwise, an independent
   verifier resource, running on the same physical server or a separate
   one, will be set up.  Note that, in addition to allowing both
   resources to be allocated in the same INVITE, it is possible to
   allocate one initially and the other later via a re-INVITE.

   Some of the speaker verification methods, described below, apply only
   to a specific mode of operation.

   The verifier resource has a verification buffer associated with it
   (see Section 11.4.14).  This allows the storage of speech utterances
   for the purposes of verification, identification, or training from
   the buffered speech.  This buffer is owned by the verifier resource,
   but other input resources (such as the recognizer resource or
   recorder resource) may write to it.  This allows the speech received
   as part of a recognition or recording operation to be later used for
   verification, identification, or training.  Access to the buffer is
   limited to one operation at a time.  Hence, when the resource is doing
   read, write, or delete operations, such as a RECOGNIZE with
   ver-buffer-utterance turned on, another operation involving the
   buffer fails with a status-code of 402.  The verification buffer can
   be cleared by a CLEAR-BUFFER request from the client and is freed
   when the verifier resource is deallocated or the session with the
   server terminates.
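
   The one-operation-at-a-time rule can be modeled as a buffer that
   refuses a second operation while one is in progress.  The following
   Python sketch is a simplification under stated assumptions.  The
   class and method names are hypothetical, and treating CLEAR-BUFFER
   as one of the operations that fails with 402 while the buffer is in
   use is an interpretation of the paragraph above.

```python
class VerificationBuffer:
    """Toy model of exclusive access to the verification buffer."""

    def __init__(self) -> None:
        self._busy = False
        self._utterances = []

    def begin_operation(self) -> int:
        """Claim the buffer; returns an MRCPv2 status-code (200 or 402)."""
        if self._busy:
            return 402  # "Method not valid in this state"
        self._busy = True
        return 200

    def end_operation(self) -> None:
        """Release the buffer when the read, write, or delete finishes."""
        self._busy = False

    def write(self, utterance: bytes) -> None:
        """Append an utterance; the caller must hold the buffer."""
        if not self._busy:
            raise RuntimeError("begin_operation() must succeed first")
        self._utterances.append(utterance)

    def clear(self) -> int:
        """CLEAR-BUFFER: drop all utterances unless an operation is active."""
        if self._busy:
            return 402
        self._utterances.clear()
        return 200

    def __len__(self) -> int:
        return len(self._utterances)
```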

   The verification buffer is different from collecting waveforms and
   processing them using either the real-time audio stream or stored
   audio, because this buffering mechanism does not simply accumulate
   speech to a buffer.  The verification buffer MAY contain additional
   information gathered by the recognizer resource that serves to
   improve verification performance.

11.1.  Speaker Verification State Machine

   Speaker verification may operate in a training or a verification
   session.  Starting one of these sessions does not change the state of
   the verifier resource, i.e., it remains idle.  Once a verification or
   training session is started, then utterances are trained or verified
   by calling the VERIFY or VERIFY-FROM-BUFFER method.  The state of the
   verifier resource goes from IDLE to VERIFYING state each time VERIFY
   or VERIFY-FROM-BUFFER is called.

     Idle              Session Opened       Verifying/Training
     State             State                State
      |                   |                         |
      |--START-SESSION--->|                         |
      |                   |                         |
      |                   |----------|              |
      |                   |     START-SESSION       |
      |                   |<---------|              |
      |                   |                         |
      |<--END-SESSION-----|                         |
      |                   |                         |
      |                   |---------VERIFY--------->|
      |                   |                         |
      |                   |---VERIFY-FROM-BUFFER--->|
      |                   |                         |
      |                   |----------|              |
      |                   |  VERIFY-ROLLBACK        |
      |                   |<---------|              |
      |                   |                         |
      |                   |                |--------|
      |                   | GET-INTERMEDIATE-RESULT |
      |                   |                |------->|
      |                   |                         |
      |                   |                |--------|
      |                   |     START-INPUT-TIMERS  |
      |                   |                |------->|
      |                   |                         |
      |                   |                |--------|
      |                   |         START-OF-INPUT  |
      |                   |                |------->|
      |                   |                         |
      |                   |<-VERIFICATION-COMPLETE--|
      |                   |                         |
      |                   |<--------STOP------------|
      |                   |                         |
      |                   |----------|              |
      |                   |         STOP            |
      |                   |<---------|              |
      |                   |                         |
      |----------|        |                         |
      |         STOP      |                         |
      |<---------|        |                         |
      |                   |                         |
      |                   |----------|              |
      |                   |    CLEAR-BUFFER         |
      |                   |<---------|              |
      |                   |                         |
      |----------|        |                         |
      |   CLEAR-BUFFER    |                         |
      |<---------|        |                         |
      |                   |                         |
      |                   |----------|              |
      |                   |   QUERY-VOICEPRINT      |
      |                   |<---------|              |
      |                   |                         |
      |----------|        |                         |
      | QUERY-VOICEPRINT  |                         |
      |<---------|        |                         |
      |                   |                         |
      |                   |----------|              |
      |                   |  DELETE-VOICEPRINT      |
      |                   |<---------|              |
      |                   |                         |
      |----------|        |                         |
      | DELETE-VOICEPRINT |                         |
      |<---------|        |                         |

                      Verifier Resource State Machine
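
   The diagram above can be transcribed into a transition table.  The
   Python sketch below is one such transcription and is not normative.
   The state names and the next_state helper are assumptions made for
   this example, and the table follows the three columns of the diagram
   rather than the prose of this section.

```python
from typing import Optional

IDLE, OPENED, VERIFYING = "idle", "session-opened", "verifying"

# (current state, method or event) -> next state, read off the diagram.
TRANSITIONS = {
    (IDLE, "START-SESSION"): OPENED,
    (OPENED, "START-SESSION"): OPENED,
    (OPENED, "END-SESSION"): IDLE,
    (OPENED, "VERIFY"): VERIFYING,
    (OPENED, "VERIFY-FROM-BUFFER"): VERIFYING,
    (OPENED, "VERIFY-ROLLBACK"): OPENED,
    (VERIFYING, "GET-INTERMEDIATE-RESULT"): VERIFYING,
    (VERIFYING, "START-INPUT-TIMERS"): VERIFYING,
    (VERIFYING, "START-OF-INPUT"): VERIFYING,        # event
    (VERIFYING, "VERIFICATION-COMPLETE"): OPENED,    # event
    (VERIFYING, "STOP"): OPENED,
}

# Methods drawn as self-loops on both the idle and session-opened states.
for _state in (IDLE, OPENED):
    for _method in ("STOP", "CLEAR-BUFFER",
                    "QUERY-VOICEPRINT", "DELETE-VOICEPRINT"):
        TRANSITIONS[(_state, _method)] = _state


def next_state(state: str, message: str) -> Optional[str]:
    """Return the state after `message`, or None if the diagram has no arc."""
    return TRANSITIONS.get((state, message))
```

   For example, next_state(IDLE, "VERIFY") returns None because the
   diagram shows no VERIFY arc leaving the idle state.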

11.2.  Speaker Verification Methods

   The verifier resource supports the following methods.

   verifier-method          =  "START-SESSION"
                            / "END-SESSION"
                            / "QUERY-VOICEPRINT"
                            / "DELETE-VOICEPRINT"
                            / "VERIFY"
                            / "VERIFY-FROM-BUFFER"
                            / "VERIFY-ROLLBACK"
                            / "STOP"
                            / "CLEAR-BUFFER"
                            / "START-INPUT-TIMERS"
                            / "GET-INTERMEDIATE-RESULT"

   These methods allow the client to control the mode and target of
   verification or identification operations within the context of a
   session.  All the verification input operations that occur within a
   session can be used to create, update, or validate against the
   voiceprint specified during the session.  At the beginning of each
   session, the verifier resource is reset to the state it had prior to
   any previous verification session.

   Verification/identification operations can be executed against live
   or buffered audio.  The verifier resource provides methods for
   collecting and evaluating live audio data, and methods for
   controlling the verifier resource and adjusting its configured
   behavior.

   There are no dedicated methods for collecting buffered audio data.
   This is accomplished by calling VERIFY, RECOGNIZE, or RECORD as
   appropriate for the resource, with the header field
   Ver-Buffer-Utterance.  Then, when the following method is called,
   verification is performed using the set of buffered audio.

   1.  VERIFY-FROM-BUFFER

   The following methods are used for verification of live audio
   utterances:

   1.  VERIFY

   2.  START-INPUT-TIMERS

   The following methods are used for configuring the verifier resource
   and for establishing resource states:

   1.  START-SESSION

   2.  END-SESSION

   3.  QUERY-VOICEPRINT

   4.  DELETE-VOICEPRINT

   5.  VERIFY-ROLLBACK

   6.  STOP

   7.  CLEAR-BUFFER

   The following method allows the polling of a verification in progress
   for intermediate results.

   1.  GET-INTERMEDIATE-RESULT

11.3.  Verification Events

   The verifier resource generates the following events.

   verifier-event       =  "VERIFICATION-COMPLETE"
                        /  "START-OF-INPUT"

11.4.  Verification Header Fields

   A verifier resource message can contain header fields containing
   request options and information to augment the Request, Response, or
   Event message it is associated with.

   verification-header      =  repository-uri
                            /  voiceprint-identifier
                            /  verification-mode
                            /  adapt-model
                            /  abort-model
                            /  min-verification-score
                            /  num-min-verification-phrases
                            /  num-max-verification-phrases
                            /  no-input-timeout
                            /  save-waveform
                            /  media-type
                            /  waveform-uri
                            /  voiceprint-exists
                            /  ver-buffer-utterance
                            /  input-waveform-uri
                            /  completion-cause
                            /  completion-reason
                            /  speech-complete-timeout
                            /  new-audio-channel
                            /  abort-verification
                            /  start-input-timers

11.4.1.  Repository-URI

   This header field specifies the voiceprint repository to be used or
   referenced during speaker verification or identification operations.
   This header field is required in the START-SESSION, QUERY-VOICEPRINT,
   and DELETE-VOICEPRINT methods.

   repository-uri           =  "Repository-URI" ":" uri CRLF

11.4.2.  Voiceprint-Identifier

   This header field specifies the claimed identity for verification
   applications.  The claimed identity MAY be used to specify an
   existing voiceprint or to establish a new voiceprint.  This header
   field MUST be present in the QUERY-VOICEPRINT and DELETE-VOICEPRINT
   methods.  The Voiceprint-Identifier MUST be present in the START-
   SESSION method for verification operations.  For identification or
   multi-verification operations, this header field MAY contain a list
   of voiceprint identifiers separated by semicolons.  For
   identification operations, the client MAY also specify a voiceprint
   group identifier instead of a list of voiceprint identifiers.

   voiceprint-identifier        =  "Voiceprint-Identifier" ":"
                                   vid *[";" vid] CRLF
   vid                          =  1*VCHAR ["." 1*VCHAR]
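
   As an illustration of the list form, the Python sketch below splits
   a Voiceprint-Identifier value on semicolons.  The function name and
   the sample identifiers used with it are hypothetical, and the sketch
   assumes that a semicolon never occurs inside an identifier.

```python
from typing import List


def parse_voiceprint_identifiers(value: str) -> List[str]:
    """Split a Voiceprint-Identifier value into its individual identifiers."""
    identifiers = [item.strip() for item in value.split(";")]
    if any(not item for item in identifiers):
        raise ValueError("empty voiceprint identifier in list")
    return identifiers
```

   A single-voiceprint verification yields a one-element list, while an
   identification or multi-verification request yields several.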

11.4.3.  Verification-Mode

   This header field specifies the mode of the verifier resource and is
   set by the START-SESSION method.  Acceptable values indicate whether
   the verification session will train a voiceprint ("train") or verify/
   identify using an existing voiceprint ("verify").

   Training and verification sessions both require the voiceprint
   Repository-URI to be specified in the START-SESSION.  In many usage
   scenarios, however, the system does not know the speaker's claimed
   identity until a recognition operation has, for example, recognized
   an account number to which the user desires access.  In order to
   allow the first few utterances of a dialog to be both recognized and
   verified, the verifier resource on the MRCPv2 server retains a
   buffer.  In this buffer, the MRCPv2 server accumulates recognized
   utterances.  The client can later execute a verification method and
   apply the buffered utterances to the current verification session.

   Some voice user interfaces may require additional user input that
   should not be subject to verification.  For example, the user's input
   may have been recognized with low confidence and thus require a
   confirmation cycle.  In such cases, the client SHOULD NOT execute the
   VERIFY or VERIFY-FROM-BUFFER methods to collect and analyze the
   caller's input.  A separate recognizer resource can analyze the
   caller's response without any participation by the verifier resource.

   Once the following conditions have been met:

   1.  the voiceprint identity has been successfully established through
       the Voiceprint-Identifier header fields of the START-SESSION
       method, and

   2.  the verification mode has been set to one of "train" or "verify",

   the verifier resource can begin providing verification information
   during verification operations.  If the verifier resource does not
   reach one of the two major states ("train" or "verify"), it MUST
   report an error condition in the MRCPv2 status code to indicate why
   the verifier resource is not ready for the corresponding usage.

   The value of verification-mode is persistent within a verification
   session.  If the client attempts to change the mode during a
   verification session, the verifier resource reports an error and the
   mode retains its current value.

   verification-mode            =  "Verification-Mode" ":"
                                   verification-mode-string

   verification-mode-string     =  "train"
                                /  "verify"
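
   The persistence rule for the mode can be sketched as a session
   object whose mode is fixed at creation.  The Python below is a
   minimal illustration.  The class and method names are hypothetical,
   and a Boolean return value stands in for the error report because
   this section does not name a specific status-code.

```python
VALID_MODES = ("train", "verify")


class VerificationSession:
    """Holds the Verification-Mode chosen when the session was started."""

    def __init__(self, mode: str) -> None:
        if mode not in VALID_MODES:
            raise ValueError(f"unsupported Verification-Mode: {mode!r}")
        self._mode = mode

    @property
    def mode(self) -> str:
        return self._mode

    def request_mode(self, mode: str) -> bool:
        """Return True if `mode` matches; False signals an error.

        The stored mode is never changed during the session.
        """
        return mode == self._mode
```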

11.4.4.  Adapt-Model

   This header field indicates the desired behavior of the verifier
   resource after a successful verification operation.  If the value of
   this header field is "true", the server SHOULD use audio collected
   during the verification session to update the voiceprint to account
   for ongoing changes in a speaker's incoming speech characteristics,
   unless local policy prohibits updating the voiceprint.  If the value
   is "false" (the default), the server MUST NOT update the voiceprint.
   This header field MAY occur in the START-SESSION method.

   adapt-model              = "Adapt-Model" ":" BOOLEAN CRLF

11.4.5.  Abort-Model

   The Abort-Model header field indicates the desired behavior of the
   verifier resource upon session termination.  If the value of this
   header field is "true", the server MUST discard any pending changes
   to a voiceprint due to verification training or verification
   adaptation.  If the value is "false" (the default), the server MUST
   commit any pending changes for a training session or a successful
   verification session to the voiceprint repository.  A value of "true"
   for Abort-Model overrides a value of "true" for the Adapt-Model
   header field.  This header field MAY occur in the END-SESSION method.

   abort-model             = "Abort-Model" ":" BOOLEAN CRLF
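
   The interplay of Abort-Model and Adapt-Model at session termination
   can be expressed as a single predicate.  The Python sketch below is
   illustrative.  The function and parameter names are hypothetical,
   the sketch assumes that a verification session has pending changes
   only when adaptation was requested, and it ignores any local policy
   that prohibits updating the voiceprint.

```python
def should_commit(mode: str,
                  abort_model: bool = False,
                  adapt_model: bool = False,
                  verification_succeeded: bool = False) -> bool:
    """Decide whether pending voiceprint changes are committed at END-SESSION."""
    if abort_model:
        # Abort-Model "true" discards everything and overrides Adapt-Model.
        return False
    if mode == "train":
        return True
    # Verification session: assumed to have pending changes only when
    # adaptation was requested and the verification was successful.
    return adapt_model and verification_succeeded
```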

11.4.6.  Min-Verification-Score

   The Min-Verification-Score header field, when used with a verifier
   resource through a SET-PARAMS, GET-PARAMS, or START-SESSION method,
   determines the minimum verification score for which a verification
   decision of "accepted" may be declared by the server.  This is a
   float value between -1.0 and 1.0.  The default value for this header
   field is implementation specific.

   min-verification-score  = "Min-Verification-Score" ":"
                             [ %x2D ] FLOAT CRLF

11.4.7.  Num-Min-Verification-Phrases

   The Num-Min-Verification-Phrases header field is used to specify the
   minimum number of valid utterances before a positive decision is
   given for verification.  The value for this header field is an
   integer and the default value is 1.  The verifier resource MUST NOT
   declare a verification 'accepted' unless Num-Min-Verification-Phrases
   valid utterances have been received.  The minimum value is 1.  This
   header field MAY occur in START-SESSION, SET-PARAMS, or GET-PARAMS.

   num-min-verification-phrases =  "Num-Min-Verification-Phrases" ":"
                                   1*19DIGIT CRLF

11.4.8.  Num-Max-Verification-Phrases

   The Num-Max-Verification-Phrases header field is used to specify the
   number of valid utterances required before a decision is forced for
   verification.  The verifier resource MUST NOT return a decision of
   'undecided' once Num-Max-Verification-Phrases have been collected and
   used to determine a verification score.  The value for this header
   field is an integer and the minimum value is 1.  The default value is
   implementation specific.  This header field MAY occur in START-
   SESSION, SET-PARAMS, or GET-PARAMS.

   num-max-verification-phrases =  "Num-Max-Verification-Phrases" ":"
                                    1*19DIGIT CRLF
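
   The constraints that Min-Verification-Score,
   Num-Min-Verification-Phrases, and Num-Max-Verification-Phrases place
   on the decision can be combined as in the Python sketch below.  This
   is one possible reading and not a prescribed algorithm.  The
   function name is hypothetical, and rejecting once the maximum is
   reached without a passing score is an assumption, since a real
   verifier may apply its own criteria within these constraints.

```python
from typing import Optional


def decide(score: float,
           valid_utterances: int,
           min_score: float,
           num_min: int = 1,
           num_max: Optional[int] = None) -> str:
    """Return 'accepted', 'rejected', or 'undecided' under the constraints."""
    enough = valid_utterances >= num_min
    forced = num_max is not None and valid_utterances >= num_max
    if enough and score >= min_score:
        return "accepted"   # never before num_min valid utterances
    if forced:
        return "rejected"   # 'undecided' is not allowed at num_max
    return "undecided"
```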

11.4.9.  No-Input-Timeout

   The No-Input-Timeout header field sets the length of time from the
   start of the verification timers (see START-INPUT-TIMERS) until the
   VERIFICATION-COMPLETE server event message declares that no input has
   been received (i.e., has a Completion-Cause of no-input-timeout).
   The value is in milliseconds.  This header field MAY occur in VERIFY,
   SET-PARAMS, or GET-PARAMS.  The value for this header field ranges
   from 0 to an implementation-specific maximum value.  The default
   value for this header field is implementation specific.

   no-input-timeout         = "No-Input-Timeout" ":" 1*19DIGIT CRLF

11.4.10.  Save-Waveform

   This header field allows the client to request that the verifier
   resource save the audio stream that was used for verification/
   identification.  The verifier resource MUST attempt to record the
   audio and make it available to the client in the form of a URI
   returned in the Waveform-URI header field in the VERIFICATION-
   COMPLETE event.  If there was an error in recording the stream, or
   the audio content is otherwise not available, the verifier resource
   MUST return an empty Waveform-URI header field.  The default value
   for this header field is "false".  This header field MAY appear in
   the VERIFY method.  Note that this header field does not appear in
   the VERIFY-FROM-BUFFER method since it only controls whether or not
   to save the waveform for live verification/identification operations.

   save-waveform            =  "Save-Waveform" ":" BOOLEAN CRLF

11.4.11.  Media-Type

   This header field MAY be specified in the SET-PARAMS, GET-PARAMS, or
   the VERIFY methods and tells the server resource the media type of
   the captured audio or video such as the one captured and returned by
   the Waveform-URI header field.

   media-type               =  "Media-Type" ":" media-type-value
                               CRLF

11.4.12.  Waveform-URI

   If the Save-Waveform header field is set to "true", the verifier
   resource MUST attempt to record the incoming audio stream of the
   verification into a file and provide a URI for the client to access
   it.  This header field MUST be present in the VERIFICATION-COMPLETE
   event if the Save-Waveform header field was set to true by the
   client.  The value of the header field MUST be empty if there was
   some error condition preventing the server from recording.
   Otherwise, the URI generated by the server MUST be globally unique
   across the server and all its verification sessions.  The content
   MUST be available via the URI until the verification session ends.
   Since the Save-Waveform header field applies only to live
   verification/identification operations, the server can return the
   Waveform-URI only in the VERIFICATION-COMPLETE event for live
   verification/identification operations.

   The server MUST also return the size in octets and the duration in
   milliseconds of the recorded audio waveform as parameters associated
   with the header field.

   waveform-uri             =  "Waveform-URI" ":" ["<" uri ">"
                               ";" "size" "=" 1*19DIGIT
                               ";" "duration" "=" 1*19DIGIT] CRLF

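   As an illustration of the grammar for the Waveform-URI header field
   given above, the Python sketch below shows how a server might
   serialize the header field, including the empty form used when
   recording failed.  The function name is hypothetical, and the URI
   used in the usage note is a placeholder.

```python
from typing import Optional


def format_waveform_uri(uri: Optional[str] = None,
                        size: Optional[int] = None,
                        duration: Optional[int] = None) -> str:
    """Build a Waveform-URI header line; pass no URI if recording failed."""
    if uri is None:
        return "Waveform-URI:\r\n"  # empty value signals an error
    if size is None or duration is None or size < 0 or duration < 0:
        raise ValueError("size (octets) and duration (ms) are required")
    return f"Waveform-URI:<{uri}>;size={size};duration={duration}\r\n"
```

   For example, format_waveform_uri("https://example.com/w.wav", 1000,
   500) produces a header line carrying both parameters.
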
11.4.13.  Voiceprint-Exists

   This header field MUST be returned in QUERY-VOICEPRINT and DELETE-
   VOICEPRINT responses.  This is the status of the voiceprint specified
   in the QUERY-VOICEPRINT method.  For the DELETE-VOICEPRINT method,
   this header field indicates the status of the voiceprint at the
   moment the method execution started.

   voiceprint-exists    =  "Voiceprint-Exists" ":" BOOLEAN CRLF

11.4.14.  Ver-Buffer-Utterance

   This header field is used to indicate that this utterance could be
   later considered for speaker verification.  This way, a client can
   request the server to buffer utterances while doing regular
   recognition or verification activities, and speaker verification can
   later be requested on the buffered utterances.  This header field is
   optional in the RECOGNIZE, VERIFY, and RECORD methods.  The default
   value for this header field is "false".

   ver-buffer-utterance     = "Ver-Buffer-Utterance" ":" BOOLEAN
                              CRLF

11.4.15.  Input-Waveform-URI

   This header field specifies stored audio content that the client
   requests the server to fetch and process according to the current
   verification mode, either to train the voiceprint or verify a claimed
   identity.  This header field enables the client to implement the
   buffering use case where the recognizer and verifier resources are in
   different sessions and the verification buffer technique cannot be
   used.  It MAY be specified on the VERIFY request.

   input-waveform-uri           =  "Input-Waveform-URI" ":" uri CRLF

11.4.16.  Completion-Cause

   This header field MUST be part of a VERIFICATION-COMPLETE event from
   the verifier resource to the client.  This indicates the cause of
   VERIFY or VERIFY-FROM-BUFFER method completion.  This header field
   MUST be sent in the VERIFY, VERIFY-FROM-BUFFER, and QUERY-VOICEPRINT
   responses, if they return with a failure status and a COMPLETE state.
   In the ABNF below, the 'cause-code' contains a numerical value
   selected from the Cause-Code column of the following table.  The
   'cause-name' contains the corresponding token selected from the
   Cause-Name column.

   completion-cause         =  "Completion-Cause" ":" cause-code SP
                               cause-name CRLF
   cause-code               =  3DIGIT
   cause-name               =  *VCHAR

   +------------+--------------------------+---------------------------+
   | Cause-Code | Cause-Name               | Description               |
   +------------+--------------------------+---------------------------+
   | 000        | success                  | VERIFY or                 |
   |            |                          | VERIFY-FROM-BUFFER        |
   |            |                          | request completed         |
   |            |                          | successfully. The verify  |
   |            |                          | decision can be           |
   |            |                          | "accepted", "rejected",   |
   |            |                          | or "undecided".           |
   | 001        | error                    | VERIFY or                 |
   |            |                          | VERIFY-FROM-BUFFER        |
   |            |                          | request terminated        |
   |            |                          | prematurely due to a      |
   |            |                          | verifier resource or      |
   |            |                          | system error.             |
   | 002        | no-input-timeout         | VERIFY request completed  |
   |            |                          | with no result due to a   |
   |            |                          | no-input-timeout.         |
   | 003        | too-much-speech-timeout  | VERIFY request completed  |
   |            |                          | with no result due to too |
   |            |                          | much speech.              |
   | 004        | speech-too-early         | VERIFY request completed  |
   |            |                          | with no result due to     |
   |            |                          | speech too soon.          |
   | 005        | buffer-empty             | VERIFY-FROM-BUFFER        |
   |            |                          | request completed with no |
   |            |                          | result due to empty       |
   |            |                          | buffer.                   |
   | 006        | out-of-sequence          | Verification operation    |
   |            |                          | failed due to             |
   |            |                          | out-of-sequence method    |
   |            |                          | invocations, for example, |
   |            |                          | calling VERIFY before     |
   |            |                          | QUERY-VOICEPRINT.         |
   | 007        | repository-uri-failure   | Failure accessing         |
   |            |                          | Repository URI.           |
   | 008        | repository-uri-missing   | Repository-URI is not     |
   |            |                          | specified.                |
   | 009        | voiceprint-id-missing    | Voiceprint-Identifier is  |
   |            |                          | not specified.            |
   | 010        | voiceprint-id-not-exist  | Voiceprint-Identifier     |
   |            |                          | does not exist in the     |
   |            |                          | voiceprint repository.    |
   | 011        | speech-not-usable        | VERIFY request completed  |
   |            |                          | with no result because    |
   |            |                          | the speech was not usable |
   |            |                          | (too noisy, too short,    |
   |            |                          | etc.)                     |
   +------------+--------------------------+---------------------------+
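
   The table above lends itself to a simple lookup.  The following is a
   sketch only: parse_completion_cause and VERIFIER_CAUSES are
   hypothetical names, and rejecting a code and name that do not match
   the table is a choice made for this sketch, not a requirement stated
   in this section.

```python
from typing import Tuple

# Cause codes and names copied from the Completion-Cause table.
VERIFIER_CAUSES = {
    0: "success",
    1: "error",
    2: "no-input-timeout",
    3: "too-much-speech-timeout",
    4: "speech-too-early",
    5: "buffer-empty",
    6: "out-of-sequence",
    7: "repository-uri-failure",
    8: "repository-uri-missing",
    9: "voiceprint-id-missing",
    10: "voiceprint-id-not-exist",
    11: "speech-not-usable",
}


def parse_completion_cause(value: str) -> Tuple[int, str]:
    """Split a Completion-Cause value into (cause-code, cause-name)."""
    code_text, _, name = value.strip().partition(" ")
    # cause-code is exactly three digits in the ABNF.
    if len(code_text) != 3 or not (code_text.isascii() and code_text.isdigit()):
        raise ValueError(f"cause-code must be 3 digits: {value!r}")
    code = int(code_text)
    # Assumption of this sketch: the pair must match the table.
    if VERIFIER_CAUSES.get(code) != name:
        raise ValueError(f"unknown or mismatched cause: {value!r}")
    return code, name
```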

11.4.17.  Completion-Reason

   This header field MAY be specified in a VERIFICATION-COMPLETE event
   coming from the verifier resource to the client.  It contains text
   describing the reason for the VERIFY request completion, in
   particular the reason for a failure.

   The completion reason text is provided for client use in logs and for
   debugging and instrumentation purposes.  Clients MUST NOT interpret
   the completion reason text.

   completion-reason        =  "Completion-Reason" ":"
                               quoted-string CRLF

11.4.18.  Speech-Complete-Timeout

   This header field is the same as the one described for the Recognizer
   resource.  See Section 9.4.15.  This header field MAY occur in
   VERIFY, SET-PARAMS, or GET-PARAMS.

11.4.19.  New-Audio-Channel

   This header field is the same as the one described for the Recognizer
   resource.  See Section 9.4.23.  This header field MAY be specified in
   a VERIFY request.

11.4.20.  Abort-Verification

   This header field MUST be sent in a STOP request to indicate whether
   or not to abort a VERIFY method in progress.  A value of "true"
   requests the server to discard the results.  A value of "false"
   requests the server to return in the STOP response the verification
   results obtained up to the point it received the STOP request.

   abort-verification   =  "Abort-Verification" ":" BOOLEAN CRLF

11.4.21.  Start-Input-Timers

   This header field MAY be sent as part of a VERIFY request.  A value
   of "false" tells the verifier resource to start the VERIFY operation
   but not to start the no-input timer yet.  The verifier resource MUST
   NOT start the timers until the client sends a START-INPUT-TIMERS
   request to the resource.  This is useful in the scenario when the
   verifier and synthesizer resources are not part of the same session.
   In this scenario, when a kill-on-barge-in prompt is being played, the
   client may want the VERIFY request to be simultaneously active so
   that it can detect and implement kill-on-barge-in (see
   Section 8.4.2).  But at the same time, the client doesn't want the
   verifier resource to start the no-input timers until the prompt is
   finished.  The default value is "true".

   start-input-timers       =  "Start-Input-Timers" ":"
                               BOOLEAN CRLF

11.5.  Verification Message Body

   A verification response or event message can carry additional data as
   described in the following subsection.

11.5.1.  Verification Result Data

   Verification results are returned to the client in the message body
   of the VERIFICATION-COMPLETE event or the GET-INTERMEDIATE-RESULT
   response message as described in Section 6.3.  Element and attribute
   descriptions for the verification portion of the NLSML format are
   provided in Section 11.5.2 with a normative definition of the schema
   in Section 16.3.

11.5.2.  Verification Result Elements

   All verification elements are contained within a single
   <verification-result> element under <result>.  The elements are
   described below and have the schema defined in Section 16.3.  The
   following elements are defined:

   1.   <voiceprint>

   2.   <incremental>

   3.   <cumulative>

   4.   <decision>

   5.   <utterance-length>

   6.   <device>

   7.   <gender>

   8.   <adapted>

   9.   <verification-score>

   10.  <vendor-specific-results>

11.5.2.1.  <voiceprint> Element

   This element in the verification results provides information on how
   the speech data matched a single voiceprint.  The result data
   returned MAY have more than one such entity in the case of
   identification or multi-verification.  Each <voiceprint> element and
   the XML data within the element describe verification result
   information for how well the speech data matched that particular
   voiceprint.  The <voiceprint> elements are ordered according to their
   cumulative verification match scores, with the highest score first.

11.5.2.2.  <cumulative> Element

   Within each <voiceprint> element there MUST be a <cumulative> element
   with the cumulative scores of how well multiple utterances matched
   the voiceprint.

11.5.2.3.  <incremental> Element

   The first <voiceprint> element MAY contain an <incremental> element
   with the incremental scores of how well the last utterance matched
   the voiceprint.

11.5.2.4.  <decision> Element

   This element is found within the <incremental> or <cumulative>
   element within the verification results.  Its value indicates the
   verification decision.  It can have the values of "accepted",
   "rejected", or "undecided".

11.5.2.5.  <utterance-length> Element

   This element MAY occur within either the <incremental> or the
   <cumulative> element within the first <voiceprint> element.  Its
   value indicates the duration in milliseconds of the last utterance
   or of the cumulated set of utterances, respectively.

11.5.2.6.  <device> Element

   This element is found within the <incremental> or <cumulative>
   element within the verification results.  Its value indicates the
   apparent type of device used by the caller as determined by the
   verifier resource.  It can have the values of "cellular-phone",
   "electret-phone", "carbon-button-phone", or "unknown".

11.5.2.7.  <gender> Element

   This element is found within the <incremental> or <cumulative>
   element within the verification results.  Its value indicates the
   apparent gender of the speaker as determined by the verifier
   resource.  It can have the values of "male", "female", or "unknown".

11.5.2.8.  <adapted> Element

   This element is found within the first <voiceprint> element within
   the verification results.  When verification is trying to confirm the
   voiceprint, this indicates if the voiceprint has been adapted as a
   consequence of analyzing the source utterances.  It is not returned
   during verification training.  The value can be "true" or "false".

11.5.2.9.  <verification-score> Element

   This element is found within the <incremental> or <cumulative>
   element within the verification results.  Its value indicates the
   score of the last utterance as determined by verification.

   During verification, the higher the score, the more likely it is that
   the speaker is the same one as the one who spoke the voiceprint
   utterances.  During training, the higher the score, the more likely
   the speaker is to have spoken all of the analyzed utterances.  The
   value is a floating-point number between -1.0 and 1.0.  If there are
   no such utterances, the score is -1.  Note that the verification
   score is not a probability value.

11.5.2.10.  <vendor-specific-results> Element

   MRCPv2 servers MAY send verification results that contain
   implementation-specific data that augment the information provided by
   the MRCPv2-defined elements.  Such data might be useful to clients
   who have private knowledge of how to interpret these schema
   extensions.  Implementation-specific additions to the verification
   results schema MUST belong to the vendor's own namespace.  In the
   result structure, either they MUST be indicated by a namespace prefix
   declared within the result, or they MUST be children of an element
   identified as belonging to the respective namespace.

   The following example shows the results of three voiceprints.  Note
   that the first one has crossed the verification score threshold, and
   the speaker has been accepted.  The voiceprint was also adapted with
   the most recent utterance.

   <?xml version="1.0"?>
   <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
           grammar="What-Grammar-URI">
     <verification-result>
       <voiceprint id="johnsmith">
         <adapted> true </adapted>
         <incremental>
           <utterance-length> 500 </utterance-length>
           <device> cellular-phone </device>
           <gender> male </gender>
           <decision> accepted </decision>
           <verification-score> 0.98514 </verification-score>
         </incremental>
         <cumulative>
           <utterance-length> 10000 </utterance-length>
           <device> cellular-phone </device>
           <gender> male </gender>
           <decision> accepted </decision>
           <verification-score> 0.96725 </verification-score>
         </cumulative>
       </voiceprint>
       <voiceprint id="marysmith">
         <cumulative>
           <verification-score> 0.93410 </verification-score>
         </cumulative>
       </voiceprint>
       <voiceprint id="juniorsmith">
         <cumulative>
           <verification-score> 0.74209 </verification-score>
         </cumulative>
       </voiceprint>
     </verification-result>
   </result>

                      Verification Results Example 1
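
   As an illustration of how a client might consume such a result, the
   following sketch extracts the cumulative score and decision for each
   <voiceprint> element using Python's standard ElementTree module.
   The function name cumulative_scores and the abbreviated SAMPLE
   document are assumptions made for the sketch; only the namespace and
   element names are taken from the example above.

```python
import xml.etree.ElementTree as ET

NS = {"m": "urn:ietf:params:xml:ns:mrcpv2"}

# Abbreviated form of Verification Results Example 1.
SAMPLE = """<?xml version="1.0"?>
<result xmlns="urn:ietf:params:xml:ns:mrcpv2" grammar="What-Grammar-URI">
  <verification-result>
    <voiceprint id="johnsmith">
      <cumulative>
        <decision> accepted </decision>
        <verification-score> 0.96725 </verification-score>
      </cumulative>
    </voiceprint>
    <voiceprint id="marysmith">
      <cumulative>
        <verification-score> 0.93410 </verification-score>
      </cumulative>
    </voiceprint>
  </verification-result>
</result>"""


def cumulative_scores(nlsml: str) -> list:
    """Return (voiceprint id, cumulative score, decision or None) tuples."""
    root = ET.fromstring(nlsml)
    rows = []
    for vp in root.findall("m:verification-result/m:voiceprint", NS):
        # Each <voiceprint> MUST contain a <cumulative> element.
        cumulative = vp.find("m:cumulative", NS)
        score_text = cumulative.findtext("m:verification-score", namespaces=NS)
        decision = cumulative.findtext("m:decision", default="", namespaces=NS)
        rows.append((vp.get("id"), float(score_text.strip()), decision.strip() or None))
    return rows


if __name__ == "__main__":
    for row in cumulative_scores(SAMPLE):
        print(row)
```

   Element text is stripped before conversion because the examples pad
   the values with spaces.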

   In this next example, the verifier has enough information to decide
   to reject the speaker.

   <?xml version="1.0"?>
   <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
           xmlns:xmpl="http://www.example.org/2003/12/mrcpv2"
           grammar="What-Grammar-URI">
     <verification-result>
       <voiceprint id="johnsmith">
         <incremental>
           <utterance-length> 500 </utterance-length>
           <device> cellular-phone </device>
           <gender> male </gender>
           <verification-score> 0.88514 </verification-score>
           <xmpl:raspiness> high </xmpl:raspiness>
           <xmpl:emotion> sadness </xmpl:emotion>
         </incremental>
         <cumulative>
           <utterance-length> 10000 </utterance-length>
           <device> cellular-phone </device>
           <gender> male </gender>
           <decision> rejected </decision>
           <verification-score> 0.9345 </verification-score>
         </cumulative>
       </voiceprint>
     </verification-result>
   </result>

                      Verification Results Example 2

11.6.  START-SESSION

   The START-SESSION method starts a speaker verification or speaker
   identification session.  Execution of this method places the verifier
   resource into its initial state.  If this method is called during an
   ongoing verification session, the previous session is implicitly
   aborted.  If this method is invoked when VERIFY or VERIFY-FROM-BUFFER
   is active, the method fails and the server returns a status-code of
   402.

   Upon completion of the START-SESSION method, the verifier resource
   MUST have terminated any ongoing verification session and cleared any
   voiceprint designation.

   A verification session is associated with the voiceprint repository
   to be used during the session.  This is specified through the
   Repository-URI header field (see Section 11.4.1).

   The START-SESSION method also establishes, through the Voiceprint-
   Identifier header field, which voiceprints are to be matched or
   trained during the verification session.  If this is an
   Identification session or if the client wants to do Multi-
   Verification, the Voiceprint-Identifier header field contains a list
   of semicolon-separated voiceprint identifiers.

   The Adapt-Model header field MAY also be present in the START-SESSION
   request to indicate whether or not to adapt a voiceprint based on
   data collected during the session (if the voiceprint verification
   phase succeeds).  By default, the voiceprint model MUST NOT be
   adapted with data from a verification session.

   The START-SESSION method also determines whether the session is used
   to train or to verify a voiceprint.  Hence, the Verification-Mode
   header field MUST be sent in every START-SESSION request.  The value
   of the Verification-Mode header field MUST be either "train" or
   "verify".

   Before a verification/identification session is started, the client
   may only request that VERIFY-ROLLBACK and generic SET-PARAMS and
   GET-PARAMS operations be performed on the verifier resource.  The
   server MUST return status-code 402 "Method not valid in this state"
   for all other verification operations.

   A verifier resource MUST NOT have more than a single session active
   at one time.

   C->S:  MRCP/2.0 ... START-SESSION 314161
          Channel-Identifier:32AECB23433801@speakverify
          Repository-URI:http://www.example.com/voiceprintdbase/
          Verification-Mode:verify
          Voiceprint-Identifier:johnsmith.voiceprint
          Adapt-Model:true

   S->C:  MRCP/2.0 ... 314161 200 COMPLETE
          Channel-Identifier:32AECB23433801@speakverify
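
   The header field rules for START-SESSION described in this section
   can be sketched as a small validation routine.  This is an
   illustration under stated assumptions: SessionConfig and
   check_start_session are hypothetical names, header field names are
   compared case-insensitively, and an absent Adapt-Model is treated as
   "false" because the voiceprint is not adapted by default.

```python
from typing import Dict, List, NamedTuple


class SessionConfig(NamedTuple):
    mode: str
    voiceprint_ids: List[str]
    adapt_model: bool


def check_start_session(headers: Dict[str, str]) -> SessionConfig:
    """Validate START-SESSION header fields and return the session setup."""
    # Assumption: names are matched case-insensitively.
    fields = {name.lower(): value.strip() for name, value in headers.items()}

    # Verification-Mode MUST be present and be "train" or "verify".
    mode = fields.get("verification-mode")
    if mode not in ("train", "verify"):
        raise ValueError("Verification-Mode must be 'train' or 'verify'")

    # Identification and multi-verification use a semicolon-separated list.
    raw_ids = fields.get("voiceprint-identifier", "")
    voiceprint_ids = [v.strip() for v in raw_ids.split(";") if v.strip()]

    # By default the voiceprint model is not adapted.
    adapt_model = fields.get("adapt-model", "false").lower() == "true"
    return SessionConfig(mode, voiceprint_ids, adapt_model)
```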

11.7.  END-SESSION

   The END-SESSION method terminates an ongoing verification session and
   releases the verification voiceprint resources.  The session may
   terminate in one of three ways:

   1.  abort - the voiceprint adaptation or creation may be aborted so
       that the voiceprint remains unchanged (or is not created).

   2.  commit - when terminating a voiceprint training session, the new
       voiceprint is committed to the repository.

   3.  adapt - an existing voiceprint is modified using a successful
       verification.

   The Abort-Model header field MAY be included in the END-SESSION to
   control whether or not to abort any pending changes to the
   voiceprint.  The default behavior is to commit (not abort) any
   pending changes to the designated voiceprint.
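
   One possible reading of the three termination outcomes is sketched
   below.  The function end_session_outcome and the extra "unchanged"
   label are hypothetical; the sketch assumes that adaptation happens
   only in "verify" mode, when Adapt-Model was "true" and verification
   succeeded.

```python
def end_session_outcome(mode: str,
                        abort_model: bool = False,
                        adapt_model: bool = False,
                        verified: bool = False) -> str:
    """Return how END-SESSION leaves the designated voiceprint."""
    if abort_model:
        # Abort-Model:true discards any pending changes.
        return "abort"
    if mode == "train":
        # Default behavior: commit the newly trained voiceprint.
        return "commit"
    if mode == "verify" and adapt_model and verified:
        # Existing voiceprint is modified using a successful verification.
        return "adapt"
    # Nothing pending (label introduced by this sketch).
    return "unchanged"
```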

   The END-SESSION method may be safely executed multiple times without
   first executing the START-SESSION method.  Any additional executions
   of this method without an intervening use of the START-SESSION method
   have no effect on the verifier resource.

   The following example assumes there is either a training session or a
   verification session in progress.

   C->S:  MRCP/2.0 ... END-SESSION 314174
          Channel-Identifier:32AECB23433801@speakverify
          Abort-Model:true

   S->C:  MRCP/2.0 ... 314174 200 COMPLETE
          Channel-Identifier:32AECB23433801@speakverify

11.8.  QUERY-VOICEPRINT

   The QUERY-VOICEPRINT method is used to get status information on a
   particular voiceprint and can be used by the client to ascertain if a
   voiceprint or repository exists and if it contains trained
   voiceprints.

   The response to the QUERY-VOICEPRINT request contains an indication
   of the status of the designated voiceprint in the Voiceprint-Exists
   header field, allowing the client to determine whether to use the
   current voiceprint for verification, train a new voiceprint, or
   choose a different voiceprint.

   A voiceprint is completely specified by providing a repository
   location and a voiceprint identifier.  The particular voiceprint or
   identity within the repository is specified by a string identifier
   that is unique within the repository.  The Voiceprint-Identifier
   header field carries this unique voiceprint identifier within a given
   repository.

   The following example assumes a verification session is in progress
   and the voiceprint exists in the voiceprint repository.

   C->S:  MRCP/2.0 ... QUERY-VOICEPRINT 314168
          Channel-Identifier:32AECB23433801@speakverify
          Repository-URI:http://www.example.com/voiceprints/
          Voiceprint-Identifier:johnsmith.voiceprint

   S->C:  MRCP/2.0 ... 314168 200 COMPLETE
          Channel-Identifier:32AECB23433801@speakverify
          Repository-URI:http://www.example.com/voiceprints/
          Voiceprint-Identifier:johnsmith.voiceprint
          Voiceprint-Exists:true

   The following example assumes that the URI provided in the
   Repository-URI header field is a bad URI.

   C->S:  MRCP/2.0 ... QUERY-VOICEPRINT 314168
          Channel-Identifier:32AECB23433801@speakverify
          Repository-URI:http://www.example.com/bad-uri/
          Voiceprint-Identifier:johnsmith.voiceprint

   S->C:  MRCP/2.0 ... 314168 405 COMPLETE
          Channel-Identifier:32AECB23433801@speakverify
          Repository-URI:http://www.example.com/bad-uri/
          Voiceprint-Identifier:johnsmith.voiceprint
          Completion-Cause:007 repository-uri-failure

11.9.  DELETE-VOICEPRINT

   The DELETE-VOICEPRINT method removes a voiceprint from a repository.
   This method MUST carry the Repository-URI and Voiceprint-Identifier
   header fields.

   An MRCPv2 server MUST reject a DELETE-VOICEPRINT request with a 401
   status code unless the MRCPv2 client has been authenticated and
   authorized.  Note that MRCPv2 does not have a standard mechanism for
   this.  See Section 12.8.

   If the corresponding voiceprint does not exist, the DELETE-VOICEPRINT
   method MUST return a 200 status code.

   The following example demonstrates a DELETE-VOICEPRINT operation to
   remove a specific voiceprint.

   C->S:  MRCP/2.0 ... DELETE-VOICEPRINT 314168
          Channel-Identifier:32AECB23433801@speakverify
          Repository-URI:http://www.example.com/voiceprints/
          Voiceprint-Identifier:johnsmith.voiceprint

   S->C:  MRCP/2.0 ... 314168 200 COMPLETE
          Channel-Identifier:32AECB23433801@speakverify

11.10.  VERIFY

   The VERIFY method is used to request that the verifier resource
   either train/adapt the voiceprint or verify/identify a claimed
   identity.  If the voiceprint is new or was deleted by a previous
   DELETE-VOICEPRINT method, the VERIFY method trains the voiceprint.
   If the voiceprint already exists, it is adapted and not retrained by
   the VERIFY method.

   C->S:  MRCP/2.0 ... VERIFY 543260
          Channel-Identifier:32AECB23433801@speakverify

   S->C:  MRCP/2.0 ... 543260 200 IN-PROGRESS
          Channel-Identifier:32AECB23433801@speakverify

   When the VERIFY request completes, the MRCPv2 server MUST send a
   VERIFICATION-COMPLETE event to the client.

11.11.  VERIFY-FROM-BUFFER

   The VERIFY-FROM-BUFFER method directs the verifier resource to verify
   buffered audio against a voiceprint.  Only one VERIFY or VERIFY-FROM-
   BUFFER method may be active for a verifier resource at a time.

   The buffered audio is not consumed by this method and thus VERIFY-
   FROM-BUFFER may be invoked multiple times by the client to attempt
   verification against different voiceprints.

   For the VERIFY-FROM-BUFFER method, the server MAY return an
   IN-PROGRESS response before the VERIFICATION-COMPLETE event.

   When the VERIFY-FROM-BUFFER method is invoked and the verification
   buffer is in use by another resource sharing it, the server MUST
   return an IN-PROGRESS response and wait until the buffer is available
   to it.  The verification buffer is owned by the verifier resource but
   is shared with write access from other input resources on the same
   session.  Hence, it is considered to be in use if there is a read or
   write operation such as a RECORD or RECOGNIZE with the
   Ver-Buffer-Utterance header field set to "true" on a resource that
   shares this buffer.  Note that if a RECORD or RECOGNIZE method
   returns with a failure cause code, the VERIFY-FROM-BUFFER request
   waiting to process that buffer MUST also fail with a Completion-Cause
   of 005 (buffer-empty).

   The following example illustrates the usage of some buffering
   methods.  In this scenario, the client first performs a live
   verification, but the utterance is rejected.  In the meantime, the
   utterance is also saved to the audio buffer.  Then, another
   voiceprint is used to do verification against the audio buffer, and
   the utterance is accepted.  For the example, we assume both
   Num-Min-Verification-Phrases and Num-Max-Verification-Phrases are 1.

   C->S:  MRCP/2.0 ... START-SESSION 314161
          Channel-Identifier:32AECB23433801@speakverify
          Verification-Mode:verify
          Adapt-Model:true
          Repository-URI:http://www.example.com/voiceprints
          Voiceprint-Identifier:johnsmith.voiceprint

   S->C:  MRCP/2.0 ... 314161 200 COMPLETE
          Channel-Identifier:32AECB23433801@speakverify

   C->S:  MRCP/2.0 ... VERIFY 314162
          Channel-Identifier:32AECB23433801@speakverify
          Ver-Buffer-Utterance:true

   S->C:  MRCP/2.0 ... 314162 200 IN-PROGRESS
          Channel-Identifier:32AECB23433801@speakverify

   S->C:  MRCP/2.0 ... VERIFICATION-COMPLETE 314162 COMPLETE
          Channel-Identifier:32AECB23433801@speakverify
          Completion-Cause:000 success
          Content-Type:application/nlsml+xml
          Content-Length:...

          <?xml version="1.0"?>
          <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
                  grammar="What-Grammar-URI">
            <verification-result>
              <voiceprint id="johnsmith">
                <incremental>
                  <utterance-length> 500 </utterance-length>
                  <device> cellular-phone </device>
                  <gender> female </gender>
                  <decision> rejected </decision>
                  <verification-score> 0.05465 </verification-score>
                </incremental>
                <cumulative>
                  <utterance-length> 500 </utterance-length>
                  <device> cellular-phone </device>
                  <gender> female </gender>
                  <decision> rejected </decision>
                  <verification-score> 0.05465 </verification-score>
                </cumulative>
              </voiceprint>
            </verification-result>
          </result>

   C->S:  MRCP/2.0 ... QUERY-VOICEPRINT 314163
          Channel-Identifier:32AECB23433801@speakverify
          Repository-URI:http://www.example.com/voiceprints/
          Voiceprint-Identifier:johnsmith.voiceprint

   S->C:  MRCP/2.0 ... 314163 200 COMPLETE
          Channel-Identifier:32AECB23433801@speakverify
          Repository-URI:http://www.example.com/voiceprints/
          Voiceprint-Identifier:johnsmith.voiceprint
          Voiceprint-Exists:true

   C->S:  MRCP/2.0 ... START-SESSION 314164
          Channel-Identifier:32AECB23433801@speakverify
          Verification-Mode:verify
          Adapt-Model:true
          Repository-URI:http://www.example.com/voiceprints
          Voiceprint-Identifier:marysmith.voiceprint

   S->C:  MRCP/2.0 ... 314164 200 COMPLETE
          Channel-Identifier:32AECB23433801@speakverify

   C->S:  MRCP/2.0 ... VERIFY-FROM-BUFFER 314165
          Channel-Identifier:32AECB23433801@speakverify

   S->C:  MRCP/2.0 ... 314165 200 IN-PROGRESS
          Channel-Identifier:32AECB23433801@speakverify

   S->C:  MRCP/2.0 ... VERIFICATION-COMPLETE 314165 COMPLETE
          Channel-Identifier:32AECB23433801@speakverify
          Completion-Cause:000 success
          Content-Type:application/nlsml+xml
          Content-Length:...

          <?xml version="1.0"?>
          <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
                  grammar="What-Grammar-URI">
            <verification-result>
              <voiceprint id="marysmith">
                <incremental>
                  <utterance-length> 1000 </utterance-length>
                  <device> cellular-phone </device>
                  <gender> female </gender>
                  <decision> accepted </decision>
                  <verification-score> 0.98 </verification-score>
                </incremental>
                <cumulative>
                  <utterance-length> 1000 </utterance-length>
                  <device> cellular-phone </device>
                  <gender> female </gender>
                  <decision> accepted </decision>
                  <verification-score> 0.98 </verification-score>
                </cumulative>
              </voiceprint>
            </verification-result>
          </result>


   C->S:  MRCP/2.0 ... END-SESSION 314166
          Channel-Identifier:32AECB23433801@speakverify

   S->C:  MRCP/2.0 ... 314166 200 COMPLETE
          Channel-Identifier:32AECB23433801@speakverify

                        VERIFY-FROM-BUFFER Example

11.12.  VERIFY-ROLLBACK

   The VERIFY-ROLLBACK method discards the last buffered utterance or
   discards the last live utterances (when the mode is "train" or
   "verify").  The client will likely want to invoke this method when
   the user provides undesirable input such as non-speech noises, side-
   speech, out-of-grammar utterances, commands, etc.  Note that this
   method does not provide a stack of rollback states.  If
   VERIFY-ROLLBACK is executed twice in succession without an
   intervening recognition operation, the second invocation has no
   effect.

   C->S:  MRCP/2.0 ... VERIFY-ROLLBACK 314165
          Channel-Identifier:32AECB23433801@speakverify

   S->C:  MRCP/2.0 ... 314165 200 COMPLETE
          Channel-Identifier:32AECB23433801@speakverify

                          VERIFY-ROLLBACK Example
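
   The single-level rollback behavior can be modeled with a toy class.
   This is a sketch for illustration only: UtteranceHistory is a
   hypothetical name, and a real verifier resource would discard audio
   and scores rather than strings.

```python
class UtteranceHistory:
    """Toy model of a rollback that keeps no stack of earlier states."""

    def __init__(self) -> None:
        self.utterances = []
        self._rollback_available = False

    def add(self, utterance: str) -> None:
        """Record a new utterance; this re-arms the rollback."""
        self.utterances.append(utterance)
        self._rollback_available = True

    def rollback(self) -> bool:
        """Discard the last utterance; return False if nothing was done."""
        if not self._rollback_available:
            # A second rollback without an intervening operation is a no-op.
            return False
        self.utterances.pop()
        self._rollback_available = False
        return True
```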

11.13.  STOP

   The STOP method from the client to the server tells the verifier
   resource to stop the VERIFY or VERIFY-FROM-BUFFER request if one is
   active.  If such a request is active and the STOP request
   successfully terminated it, then the response header section contains
   an Active-Request-Id-List header field containing the request-id of
   the VERIFY or VERIFY-FROM-BUFFER request that was terminated.  In
   this case, no VERIFICATION-COMPLETE event is sent for the terminated
   request.  If there was no verify request active, then the response
   MUST NOT contain an Active-Request-Id-List header field.  Either way,
   the response MUST contain a status-code of 200 "Success".

   The STOP method can carry an Abort-Verification header field, which
   specifies if the verification result until that point should be
   discarded or returned.  If this header field is not present or if the
   value is "true", the verification result is discarded and the STOP
   response does not contain any result data.  If the header field is
   present and its value is "false", the STOP response MUST contain a
   Completion-Cause header field and carry the Verification result data
   in its body.

   An aborted VERIFY request does an automatic rollback and hence does
   not affect the cumulative score.  A VERIFY request that was stopped
   with no Abort-Verification header field or with the Abort-
   Verification header field set to "false" does affect cumulative
   scores and would need to be explicitly rolled back if the client does
   not want the verification result considered in the cumulative scores.
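
   The two paragraphs above can be summarized in a short sketch that
   encodes them literally.  StopEffect and stop_effect are hypothetical
   names; None stands for a STOP request that carries no
   Abort-Verification header field.

```python
from typing import NamedTuple, Optional


class StopEffect(NamedTuple):
    response_has_results: bool  # STOP response carries result data
    affects_cumulative: bool    # stopped VERIFY counts toward cumulative scores


def stop_effect(abort_verification: Optional[bool]) -> StopEffect:
    """Map the Abort-Verification value (or its absence) to the effects."""
    # Results are returned only when the header is present and "false".
    response_has_results = abort_verification is False
    # Only an explicit abort ("true") triggers the automatic rollback.
    affects_cumulative = abort_verification is not True
    return StopEffect(response_has_results, affects_cumulative)
```

   Note that, read this way, a STOP without the header field returns no
   results but still affects the cumulative scores, so a client that
   wants neither would need to follow it with VERIFY-ROLLBACK.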

   The following example assumes a voiceprint identity has already been
   established.

   C->S:  MRCP/2.0 ... VERIFY 314177
          Channel-Identifier:32AECB23433801@speakverify

   S->C:  MRCP/2.0 ... 314177 200 IN-PROGRESS
          Channel-Identifier:32AECB23433801@speakverify

   C->S:  MRCP/2.0 ... STOP 314178
          Channel-Identifier:32AECB23433801@speakverify

   S->C:  MRCP/2.0 ... 314178 200 COMPLETE
          Channel-Identifier:32AECB23433801@speakverify
          Active-Request-Id-List:314177

                         STOP Verification Example
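
   The rules for the content of the STOP response can be sketched as
   follows.  This is a minimal illustration, not a server
   implementation: the StopResponse class, the handle_stop function,
   and the default Completion-Cause value are hypothetical names and
   choices made for this example.  It follows the paragraph on result
   data (an absent Abort-Verification header field is treated like
   "true") and does not model cumulative-score rollback.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class StopResponse:
    """Hypothetical container for the parts of a STOP response discussed above."""
    status_code: int
    headers: Dict[str, str] = field(default_factory=dict)
    body: Optional[str] = None


def handle_stop(active_request_id: Optional[int],
                abort_verification: Optional[bool],
                result_xml: Optional[str],
                completion_cause: str = "000 success") -> StopResponse:
    """Build the STOP response for a verifier resource.

    active_request_id  -- request-id of the running VERIFY or
                          VERIFY-FROM-BUFFER request, or None if idle
    abort_verification -- value of the Abort-Verification header field,
                          or None if the field was absent
    result_xml         -- verification result gathered so far
    completion_cause   -- assumed value; the right cause depends on the server
    """
    # Either way, the status-code is 200 "Success".
    response = StopResponse(status_code=200)

    if active_request_id is None:
        # Nothing was active: no Active-Request-Id-List header field.
        return response

    response.headers["Active-Request-Id-List"] = str(active_request_id)

    if abort_verification is False:
        # Explicit "false": return the result gathered up to this point.
        response.headers["Completion-Cause"] = completion_cause
        response.body = result_xml
    # Absent or "true": the result is discarded, so no result data is sent.
    return response
```

   With the request-ids from the example above,
   handle_stop(314177, None, ...) yields only the
   Active-Request-Id-List header field, matching the STOP response
   shown.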

11.14.  START-INPUT-TIMERS

   This request is sent from the client to the verifier resource to
   start the no-input timer, usually once the client has ascertained
   that any audio prompts to the user have played to completion.

   C->S:  MRCP/2.0 ... START-INPUT-TIMERS 543260
          Channel-Identifier:32AECB23433801@speakverify

   S->C:  MRCP/2.0 ... 543260 200 COMPLETE
          Channel-Identifier:32AECB23433801@speakverify
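
   The examples in this section elide the message-length field of the
   start-line as "...".  The following sketch shows one way a client
   might serialize the START-INPUT-TIMERS request above.  It assumes
   that message-length counts every octet of the message, including
   the start-line itself, and that lines end in CRLF; build_request is
   a hypothetical helper name, not part of any MRCP library.

```python
def build_request(method, request_id, headers, body=b""):
    """Serialize an MRCPv2 request as bytes.

    Assumption of this sketch: the message-length field covers the whole
    message, start-line included, so the length depends on its own number
    of digits and is found by iterating until it stops changing.
    """
    header_block = "".join(f"{name}:{value}\r\n" for name, value in headers)
    header_block = header_block.encode("utf-8") + b"\r\n"  # blank line ends headers
    rest_len = len(header_block) + len(body)

    length = 0
    while True:
        start_line = f"MRCP/2.0 {length} {method} {request_id}\r\n".encode("utf-8")
        total = len(start_line) + rest_len
        if total == length:
            break  # the declared length now matches the real length
        length = total

    return start_line + header_block + body


message = build_request(
    "START-INPUT-TIMERS",
    543260,
    [("Channel-Identifier", "32AECB23433801@speakverify")],
)
```

   The loop is needed because writing the length into the start-line
   can itself change the length when the digit count grows.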

11.15.  VERIFICATION-COMPLETE

   The VERIFICATION-COMPLETE event follows a call to VERIFY or VERIFY-
   FROM-BUFFER and is used to communicate the verification results to
   the client.  The event message body contains only verification
   results.

   S->C:  MRCP/2.0 ... VERIFICATION-COMPLETE 543259 COMPLETE
          Channel-Identifier:32AECB23433801@speakverify
          Completion-Cause:000 success
          Content-Type:application/nlsml+xml
          Content-Length:...

          <?xml version="1.0"?>
          <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
                  grammar="What-Grammar-URI">
            <verification-result>
              <voiceprint id="johnsmith">
                <incremental>
                  <utterance-length> 500 </utterance-length>
                  <device> cellular-phone </device>
                  <gender> male </gender>
                  <decision> accepted </decision>
                  <verification-score> 0.85 </verification-score>
                </incremental>
                <cumulative>
                  <utterance-length> 1500 </utterance-length>
                  <device> cellular-phone </device>
                  <gender> male </gender>
                  <decision> accepted </decision>
                  <verification-score> 0.75 </verification-score>
                </cumulative>
              </voiceprint>
            </verification-result>
          </result>
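
   A client receiving this event typically needs the decision and score
   from the body.  The following sketch reads them with the Python
   standard library.  The function name parse_verification_result and
   the shape of the returned dictionary are assumptions made for this
   example, and SAMPLE_RESULT is an abbreviated copy of the body above.

```python
import xml.etree.ElementTree as ET

# Namespace used by the <result> element in the example above.
NS = {"m": "urn:ietf:params:xml:ns:mrcpv2"}

# Abbreviated copy of the example body (length, device, and gender omitted).
SAMPLE_RESULT = """<?xml version="1.0"?>
<result xmlns="urn:ietf:params:xml:ns:mrcpv2" grammar="What-Grammar-URI">
  <verification-result>
    <voiceprint id="johnsmith">
      <incremental>
        <decision> accepted </decision>
        <verification-score> 0.85 </verification-score>
      </incremental>
      <cumulative>
        <decision> accepted </decision>
        <verification-score> 0.75 </verification-score>
      </cumulative>
    </voiceprint>
  </verification-result>
</result>"""


def parse_verification_result(xml_text):
    """Return {voiceprint id: {"incremental" / "cumulative": {...}}}."""
    root = ET.fromstring(xml_text)
    results = {}
    for voiceprint in root.findall("m:verification-result/m:voiceprint", NS):
        entry = {}
        for kind in ("incremental", "cumulative"):
            node = voiceprint.find(f"m:{kind}", NS)
            if node is None:
                continue
            # Element text in the example is padded with spaces, so strip it.
            decision = node.findtext("m:decision", default="", namespaces=NS).strip()
            score = node.findtext("m:verification-score", default="", namespaces=NS).strip()
            entry[kind] = {
                "decision": decision,
                "score": float(score) if score else None,
            }
        results[voiceprint.get("id")] = entry
    return results


parsed = parse_verification_result(SAMPLE_RESULT)
```

   The same function applies to any message body that carries only
   verification results in this format.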

11.16.  START-OF-INPUT

   The START-OF-INPUT event is returned from the server to the client
   once the server has detected speech.  The verifier resource always
   returns this event when speech has been detected, regardless of
   whether the recognizer and verifier resources share the same session.

   S->C:  MRCP/2.0 ... START-OF-INPUT 543259 IN-PROGRESS
          Channel-Identifier:32AECB23433801@speakverify

11.17.  CLEAR-BUFFER

   The CLEAR-BUFFER method can be used to clear the verification buffer.
   This buffer holds speech captured during recognition, record, or
   verification operations that may later be used by VERIFY-FROM-BUFFER.
   As noted before, the buffer associated with the verifier resource is
   shared by other input resources like recognizers and recorders.
   Hence, a CLEAR-BUFFER request fails if the verification buffer is in
   use.  This can happen when any one of the input resources that share
   this buffer has an active read or write operation such as RECORD,
   RECOGNIZE, or VERIFY with the Ver-Buffer-Utterance header field set
   to "true".

   C->S:  MRCP/2.0 ... CLEAR-BUFFER 543260
          Channel-Identifier:32AECB23433801@speakverify

   S->C:  MRCP/2.0 ... 543260 200 COMPLETE
          Channel-Identifier:32AECB23433801@speakverify
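
   The "fails if the verification buffer is in use" rule can be
   pictured as a buffer that tracks which operations currently hold
   it.  The sketch below is illustrative only: the class
   SharedVerificationBuffer and its methods are hypothetical, and the
   status-code a server returns on failure is deliberately not
   modeled.

```python
class SharedVerificationBuffer:
    """Hypothetical buffer shared by verifier, recognizer, and recorder."""

    def __init__(self):
        self._audio = bytearray()
        self._active = set()  # request-ids of operations using the buffer

    def begin(self, request_id):
        """Mark an operation (e.g., RECORD or VERIFY) as using the buffer."""
        self._active.add(request_id)

    def end(self, request_id):
        """Mark an operation as finished with the buffer."""
        self._active.discard(request_id)

    def write(self, request_id, chunk):
        """Append audio on behalf of an operation that holds the buffer."""
        if request_id not in self._active:
            raise ValueError("operation has not begun using the buffer")
        self._audio.extend(chunk)

    def clear(self):
        """Handle CLEAR-BUFFER: return False if the buffer is in use."""
        if self._active:
            return False
        self._audio.clear()
        return True

    def __len__(self):
        return len(self._audio)
```

   In this model, a CLEAR-BUFFER request that arrives between begin()
   and end() of any sharing operation is refused and the buffered
   audio is left untouched.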

11.18.  GET-INTERMEDIATE-RESULT

   A client can use the GET-INTERMEDIATE-RESULT method to poll for
   intermediate results of a verification request that is in progress.
   Invoking this method does not change the state of the resource.  The
   verifier resource collects the accumulated verification results and
   returns the information in the method response.  The message body in
   the response to a GET-INTERMEDIATE-RESULT request contains only
   verification results.  The method response MUST NOT contain a
   Completion-Cause header field as the request is not yet complete.  If
   the resource does not have a verification in progress, the response
   has a status-code of 402 "Method not valid in this state" and no
   result in the body.

   C->S:  MRCP/2.0 ... GET-INTERMEDIATE-RESULT 543260
          Channel-Identifier:32AECB23433801@speakverify

   S->C:  MRCP/2.0 ... 543260 200 COMPLETE
          Channel-Identifier:32AECB23433801@speakverify
          Content-Type:application/nlsml+xml
          Content-Length:...

          <?xml version="1.0"?>
          <result xmlns="urn:ietf:params:xml:ns:mrcpv2"
                  grammar="What-Grammar-URI">
            <verification-result>
              <voiceprint id="marysmith">
                <incremental>
                  <utterance-length> 50 </utterance-length>
                  <device> cellular-phone </device>
                  <gender> female </gender>
                  <decision> undecided </decision>
                  <verification-score> 0.85 </verification-score>
                </incremental>
                <cumulative>
                  <utterance-length> 150 </utterance-length>
                  <device> cellular-phone </device>
                  <gender> female </gender>
                  <decision> undecided </decision>
                  <verification-score> 0.65 </verification-score>
                </cumulative>
              </voiceprint>
            </verification-result>
          </result>
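
   The two outcomes described for GET-INTERMEDIATE-RESULT can be
   sketched as a small response builder.  The function name
   build_intermediate_result_response and the use of None to signal
   "no verification in progress" are conventions of this example
   only.

```python
def build_intermediate_result_response(channel_id, result_xml):
    """Return (status_code, headers, body) for a GET-INTERMEDIATE-RESULT response.

    result_xml is the accumulated verification result as text, or None
    when no verification is in progress (a convention of this sketch).
    """
    headers = [("Channel-Identifier", channel_id)]

    if result_xml is None:
        # No verification in progress: 402 and no result in the body.
        return 402, headers, b""

    body = result_xml.encode("utf-8")
    headers.append(("Content-Type", "application/nlsml+xml"))
    headers.append(("Content-Length", str(len(body))))
    # Deliberately no Completion-Cause: the verify request is still running.
    return 200, headers, body
```

   Because the method does not change the state of the resource, the
   builder only reads the accumulated result and never modifies it.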
