
RFC 6189

ZRTP: Media Path Key Agreement for Unicast Secure RTP

Pages: 115
Informational

Top   ToC   RFC6189 - Page 1
Internet Engineering Task Force (IETF)                     P. Zimmermann
Request for Comments: 6189                                 Zfone Project
Category: Informational                                 A. Johnston, Ed.
ISSN: 2070-1721                                                    Avaya
                                                               J. Callas
                                                             Apple, Inc.
                                                              April 2011


         ZRTP: Media Path Key Agreement for Unicast Secure RTP

Abstract

This document defines ZRTP, a protocol for media path Diffie-Hellman exchange to agree on a session key and parameters for establishing unicast Secure Real-time Transport Protocol (SRTP) sessions for Voice over IP (VoIP) applications. The ZRTP protocol is media path keying because it is multiplexed on the same port as RTP and does not require support in the signaling protocol. ZRTP does not assume a Public Key Infrastructure (PKI) or require the complexity of certificates in end devices. For the media session, ZRTP provides confidentiality, protection against man-in-the-middle (MiTM) attacks, and, in cases where the signaling protocol provides end-to-end integrity protection, authentication. ZRTP can utilize a Session Description Protocol (SDP) attribute to provide discovery and authentication through the signaling channel. To provide best effort SRTP, ZRTP utilizes normal RTP/AVP (Audio-Visual Profile) profiles. ZRTP secures media sessions that include a voice media stream and can also secure media sessions that do not include voice by using an optional digital signature.

Status of This Memo

This document is not an Internet Standards Track specification; it is published for informational purposes.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc6189.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction ....................................................4
   2. Terminology .....................................................5
   3. Overview ........................................................6
      3.1. Key Agreement Modes ........................................7
           3.1.1. Diffie-Hellman Mode Overview ........................7
           3.1.2. Preshared Mode Overview .............................9
           3.1.3. Multistream Mode Overview ...........................9
   4. Protocol Description ...........................................10
      4.1. Discovery .................................................10
           4.1.1. Protocol Version Negotiation .......................11
           4.1.2. Algorithm Negotiation ..............................13
      4.2. Commit Contention .........................................14
      4.3. Matching Shared Secret Determination ......................15
           4.3.1. Calculation and Comparison of Hashes of
                  Shared Secrets .....................................17
           4.3.2. Handling a Shared Secret Cache Mismatch ............18
      4.4. DH and Non-DH Key Agreements ..............................19
           4.4.1. Diffie-Hellman Mode ................................19
                  4.4.1.1. Hash Commitment in Diffie-Hellman Mode ....20
                  4.4.1.2. Responder Behavior in
                           Diffie-Hellman Mode .......................21
                  4.4.1.3. Initiator Behavior in
                           Diffie-Hellman Mode .......................22
                  4.4.1.4. Shared Secret Calculation for DH Mode .....22
           4.4.2. Preshared Mode .....................................25
                  4.4.2.1. Commitment in Preshared Mode ..............25
                  4.4.2.2. Initiator Behavior in Preshared Mode ......26
                  4.4.2.3. Responder Behavior in Preshared Mode ......26
                  4.4.2.4. Shared Secret Calculation for
                           Preshared Mode ............................27
           4.4.3. Multistream Mode ...................................28
                  4.4.3.1. Commitment in Multistream Mode ............29
                  4.4.3.2. Shared Secret Calculation for
                           Multistream Mode ..........................29
      4.5. Key Derivations ...........................................31
           4.5.1. The ZRTP Key Derivation Function ...................31
           4.5.2. Deriving ZRTPSess Key and SAS in DH or
                  Preshared Modes ....................................32
           4.5.3. Deriving the Rest of the Keys from s0 ..............33
      4.6. Confirmation ..............................................35
           4.6.1. Updating the Cache of Shared Secrets ...............35
                  4.6.1.1. Cache Update Following a Cache Mismatch ...36
      4.7. Termination ...............................................37
           4.7.1. Termination via Error Message ......................37
           4.7.2. Termination via GoClear Message ....................37
                  4.7.2.1. Key Destruction for GoClear Message .......39
           4.7.3. Key Destruction at Termination .....................40
      4.8. Random Number Generation ..................................40
      4.9. ZID and Cache Operation ...................................41
           4.9.1. Cacheless Implementations ..........................42
   5. ZRTP Messages ..................................................42
      5.1. ZRTP Message Formats ......................................44
           5.1.1. Message Type Block .................................44
           5.1.2. Hash Type Block ....................................45
                  5.1.2.1. Negotiated Hash and MAC Algorithm .........46
                  5.1.2.2. Implicit Hash and MAC Algorithm ...........47
           5.1.3. Cipher Type Block ..................................47
           5.1.4. Auth Tag Type Block ................................48
           5.1.5. Key Agreement Type Block ...........................49
           5.1.6. SAS Type Block .....................................51
           5.1.7. Signature Type Block ...............................52
      5.2. Hello Message .............................................53
      5.3. HelloACK Message ..........................................56
      5.4. Commit Message ............................................56
      5.5. DHPart1 Message ...........................................60
      5.6. DHPart2 Message ...........................................62
      5.7. Confirm1 and Confirm2 Messages ............................63
      5.8. Conf2ACK Message ..........................................66
      5.9. Error Message .............................................66
      5.10. ErrorACK Message .........................................68
      5.11. GoClear Message ..........................................68
      5.12. ClearACK Message .........................................69
      5.13. SASrelay Message .........................................69
      5.14. RelayACK Message .........................................72
      5.15. Ping Message .............................................72
      5.16. PingACK Message ..........................................73
   6. Retransmissions ................................................74
   7. Short Authentication String ....................................77
      7.1. SAS Verified Flag .........................................78
      7.2. Signing the SAS ...........................................79
           7.2.1. OpenPGP Signatures .................................81
           7.2.2. ECDSA Signatures with X.509v3 Certs ................82
           7.2.3. Signing the SAS without a PKI ......................83
      7.3. Relaying the SAS through a PBX ............................84
           7.3.1. PBX Enrollment and the PBX Enrollment Flag .........87
   8. Signaling Interactions .........................................89
      8.1. Binding the Media Stream to the Signaling Layer
           via the Hello Hash ........................................90
           8.1.1. Integrity-Protected Signaling Enables
                  Integrity-Protected DH Exchange ....................92
      8.2. Deriving the SRTP Secret (srtps) from the
           Signaling Layer ...........................................93
      8.3. Codec Selection for Secure Media ..........................94
   9. False ZRTP Packet Rejection ....................................95
   10. Intermediary ZRTP Devices .....................................97
   11. The ZRTP Disclosure Flag ......................................98
      11.1. Guidelines on Proper Implementation of the
            Disclosure Flag .........................................100
   12. Mapping between ZID and AOR (SIP URI) ........................100
   13. IANA Considerations ..........................................102
   14. Media Security Requirements ..................................102
   15. Security Considerations ......................................104
      15.1. Self-Healing Key Continuity Feature .....................107
   16. Acknowledgments ..............................................108
   17. References ...................................................109
      17.1. Normative References ....................................109
      17.2. Informative References ..................................111

1. Introduction

ZRTP is a key agreement protocol that performs a Diffie-Hellman key exchange during call setup in the media path and is transported over the same port as the Real-time Transport Protocol (RTP) [RFC3550] media stream which has been established using a signaling protocol such as Session Initiation Protocol (SIP) [RFC3261]. This generates a shared secret, which is then used to generate keys and salt for a Secure RTP (SRTP) [RFC3711] session. ZRTP borrows ideas from [PGPfone]. A reference implementation of ZRTP is available in [Zfone].

The ZRTP protocol has some nice cryptographic features lacking in many other approaches to media session encryption. Although it uses a public key algorithm, it does not rely on a public key infrastructure (PKI). In fact, it does not use persistent public keys at all. It uses ephemeral Diffie-Hellman (DH) with hash
   commitment and allows the detection of man-in-the-middle (MiTM)
   attacks by displaying a short authentication string (SAS) for the
   users to read and verbally compare over the phone.  It has Perfect
   Forward Secrecy, meaning the keys are destroyed at the end of the
   call, which precludes retroactively compromising the call by future
   disclosures of key material.  But even if the users are too lazy to
   bother with short authentication strings, we still get reasonable
   authentication against a MiTM attack, based on a form of key
   continuity.  It does this by caching some key material to use in the
   next call, to be mixed in with the next call's DH shared secret,
   giving it key continuity properties analogous to Secure SHell (SSH).
   All this is done without reliance on a PKI, key certification, trust
   models, certificate authorities, or key management complexity that
   bedevils the email encryption world.  It also does not rely on SIP
   signaling for the key management, and in fact, it does not rely on
   any servers at all.  It performs its key agreements and key
   management in a purely peer-to-peer manner over the RTP packet
   stream.

   ZRTP can be used and discovered without being declared or indicated
   in the signaling path.  This provides a best effort SRTP capability.
   Also, this reduces the complexity of implementations and minimizes
   interdependency between the signaling and media layers.  However,
   when ZRTP is indicated in the signaling via the zrtp-hash SDP
   attribute, ZRTP has additional useful properties.  By sending a hash
   of the ZRTP Hello message in the signaling, ZRTP provides a useful
   binding between the signaling and media paths, which is explained in
   Section 8.1.  When this is done through a signaling path that has
   end-to-end integrity protection, the DH exchange is automatically
   protected from a MiTM attack, which is explained in Section 8.1.1.
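
   As a rough illustration of the binding described above, the
   following Python sketch hashes a Hello message and formats the
   result as an SDP attribute line, then checks a received Hello
   against it.  The attribute layout (a version string followed by a
   hex digest), the choice of SHA-256, the version string "1.10", and
   both function names are assumptions made for this example;
   Section 8.1 gives the normative definition.  The Hello bytes are a
   placeholder, not a real ZRTP message.

```python
import hashlib
import hmac


def zrtp_hash_attribute(hello_message: bytes, version: str = "1.10") -> str:
    """Format an SDP attribute carrying a hash of the Hello message.

    The layout and hash algorithm are assumed for illustration.
    """
    digest = hashlib.sha256(hello_message).hexdigest()
    return f"a=zrtp-hash:{version} {digest}"


def hello_matches(attribute: str, received_hello: bytes) -> bool:
    """Check a Hello received on the media path against the signaled hash."""
    expected = attribute.rsplit(" ", 1)[1]
    actual = hashlib.sha256(received_hello).hexdigest()
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(expected, actual)


# Placeholder bytes standing in for an encoded Hello message.
attribute = zrtp_hash_attribute(b"placeholder hello message")
```

   A Hello that was altered or injected in the media path would fail
   the hello_matches() check, provided the attribute itself arrived
   over a signaling path with end-to-end integrity protection.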

   ZRTP is designed for unicast media sessions in which there is a voice
   media stream.  For multiparty secure conferencing, separate ZRTP
   sessions may be negotiated between each party and the conference
   bridge.  For sessions lacking a voice media stream, MiTM protection
   may be provided by the mechanisms in Sections 8.1.1 or 7.2.  In terms
   of the RTP topologies defined in [RFC5117], ZRTP is designed for
   Point-to-Point topologies only.

2. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

In this document, a "call" is synonymous with a "session".

3. Overview

This section provides a description of how ZRTP works. This description is non-normative in nature but is included to build understanding of the protocol.

ZRTP is negotiated the same way a conventional RTP session is negotiated in an offer/answer exchange using the standard RTP/AVP profile. The ZRTP protocol begins after two endpoints have utilized a signaling protocol, such as SIP, and are ready to exchange media. If Interactive Connectivity Establishment (ICE) [RFC5245] is being used, ZRTP begins after ICE has completed its connectivity checks.

ZRTP is multiplexed on the same ports as RTP. It uses a unique header that makes it clearly differentiable from RTP or Session Traversal Utilities for NAT (STUN).

ZRTP support can be discovered in the signaling path by the presence of a ZRTP SDP attribute. However, even in cases where this is not received in the signaling, an endpoint can still send ZRTP Hello messages to see if a response is received. If a response is not received, no more ZRTP messages will be sent during this session. This is safe because ZRTP has been designed to be clearly different from RTP and have a similar structure to STUN packets received (sometimes by non-supporting endpoints) during an ICE exchange.

Both ZRTP endpoints begin the ZRTP exchange by sending a ZRTP Hello message to the other endpoint. The purpose of the Hello message is to confirm that the endpoint supports the protocol and to see what algorithms the two ZRTP endpoints have in common.

The Hello message contains the SRTP configuration options and the ZID. Each instance of ZRTP has a unique 96-bit random ZRTP ID or ZID that is generated once at installation time. ZIDs are discovered during the Hello message exchange. The received ZID is used to look up retained shared secrets from previous ZRTP sessions with the endpoint.

A response to a ZRTP Hello message is a ZRTP HelloACK message. The HelloACK message simply acknowledges receipt of the Hello.

Since RTP commonly uses best effort UDP transport, ZRTP has retransmission timers in case of lost datagrams. There are two timers, both with exponential backoff mechanisms. One timer is used for retransmissions of Hello messages and the other is used for retransmissions of all other messages after receipt of a HelloACK.

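The two-timer arrangement can be pictured with a small Python sketch of a capped exponential backoff schedule. The function name and the starting interval, cap, and retry count used for each timer are hypothetical values chosen for illustration; Section 6 gives the normative timer parameters.

```python
from typing import Iterator


def backoff_schedule(initial_ms: int, cap_ms: int, max_sends: int) -> Iterator[int]:
    """Yield the wait before each retransmission, doubling up to a cap."""
    interval = initial_ms
    for _ in range(max_sends):
        yield interval
        interval = min(interval * 2, cap_ms)


# Hypothetical parameters for illustration only; see Section 6.
HELLO_TIMER = dict(initial_ms=50, cap_ms=200, max_sends=6)
OTHER_TIMER = dict(initial_ms=150, cap_ms=1200, max_sends=6)

hello_waits = list(backoff_schedule(**HELLO_TIMER))
other_waits = list(backoff_schedule(**OTHER_TIMER))
```

Keeping the schedule separate from the sending logic lets one function serve both timers, with only the parameters differing between Hello and non-Hello messages.
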
   If an integrity-protected signaling channel is available, a hash of
   the Hello message can be sent.  This allows rejection of false ZRTP
   Hello messages injected by an attacker.

   Hello and other ZRTP messages also contain a hash image that is used
   to link the messages together.  This allows rejection of false ZRTP
   messages injected during an exchange.
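
   One way to picture how a hash image can link messages together is
   a short hash chain, in which each message reveals the preimage of
   the value carried by the message before it.  The Python sketch
   below is illustrative only: the chain length, the use of SHA-256,
   and the function names are assumptions for this example.  The
   actual construction is specified in Section 9.

```python
import hashlib
import secrets
from typing import List


def build_chain(length: int = 4) -> List[bytes]:
    """Return [h0, h1, ...] where each element is the SHA-256 of the previous one."""
    chain = [secrets.token_bytes(32)]  # h0 is a fresh random value
    for _ in range(length - 1):
        chain.append(hashlib.sha256(chain[-1]).digest())
    return chain


def links_to(previous_image: bytes, revealed: bytes) -> bool:
    """True if `revealed` hashes to the image carried by the earlier message."""
    return hashlib.sha256(revealed).digest() == previous_image


h0, h1, h2, h3 = build_chain()
# The sender publishes h3 in its first message, then reveals h2, h1,
# and h0 in later messages.  The receiver checks each revealed value
# against the image it already holds.
first_link_ok = links_to(h3, h2)
second_link_ok = links_to(h2, h1)
```

   An attacker who has seen only h3 cannot produce a value that passes
   links_to(h3, ...) without inverting the hash, so a message injected
   later in the exchange can be rejected.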

3.1. Key Agreement Modes

After both endpoints exchange Hello and HelloACK messages, the key agreement exchange can begin with the ZRTP Commit message. ZRTP supports a number of key agreement modes including both Diffie-Hellman and non-Diffie-Hellman modes as described in the following sections.

The Commit message may be sent immediately after both endpoints have completed the Hello/HelloACK discovery handshake, or it may be deferred until later in the call, after the participants engage in some unencrypted conversation. The Commit message may be manually activated by a user interface element, such as a GO SECURE button, which becomes enabled after the Hello/HelloACK discovery phase. This emulates the user experience of a number of secure phones in the Public Switched Telephone Network (PSTN) world [comsec]. However, it is expected that most simple ZRTP user agents will omit such buttons and proceed directly to secure mode by sending a Commit message immediately after the Hello/HelloACK handshake.

3.1.1. Diffie-Hellman Mode Overview

An example ZRTP call flow is shown in Figure 1. Note that the order of the Hello/HelloACK exchanges in F1/F2 and F3/F4 may be reversed. That is, either Alice or Bob might send the first Hello message. Note that the endpoint that sends the Commit message is considered the initiator of the ZRTP session and drives the key agreement exchange.

The Diffie-Hellman public values are exchanged in the DHPart1 and DHPart2 messages. SRTP keys and salts are then calculated. The initiator needs to generate its ephemeral key pair before sending the Commit, and the responder generates its key pair before sending DHPart1.

   Alice                                                Bob
    |                                                   |
    |      Alice and Bob establish a media session.     |
    |         They initiate ZRTP on media ports         |
    |                                                   |
    | F1 Hello (version, options, Alice's ZID)          |
    |-------------------------------------------------->|
    |                                       HelloACK F2 |
    |<--------------------------------------------------|
    |            Hello (version, options, Bob's ZID) F3 |
    |<--------------------------------------------------|
    | F4 HelloACK                                       |
    |-------------------------------------------------->|
    |                                                   |
    |             Bob acts as the initiator.            |
    |                                                   |
    |        Commit (Bob's ZID, options, hash value) F5 |
    |<--------------------------------------------------|
    | F6 DHPart1 (pvr, shared secret hashes)            |
    |-------------------------------------------------->|
    |            DHPart2 (pvi, shared secret hashes) F7 |
    |<--------------------------------------------------|
    |                                                   |
    |     Alice and Bob generate SRTP session key.      |
    |                                                   |
    | F8 Confirm1 (MAC, D,A,V,E flags, sig)             |
    |-------------------------------------------------->|
    |             Confirm2 (MAC, D,A,V,E flags, sig) F9 |
    |<--------------------------------------------------|
    | F10 Conf2ACK                                      |
    |-------------------------------------------------->|
    |                    SRTP begins                    |
    |<=================================================>|
    |                                                   |

           Figure 1: Establishment of an SRTP Session Using ZRTP

   ZRTP authentication uses a Short Authentication String (SAS), which
   is ideally displayed for the human user.  Alternatively, the SAS can
   be authenticated by exchanging an optional digital signature (sig)
   over the SAS in the Confirm1 or Confirm2 messages (described in
   Section 7.2).

   The ZRTP Confirm1 and Confirm2 messages are sent for a number of
   reasons, not the least of which is that they confirm that all the key
   agreement calculations were successful and thus the encryption will
   work.  They also carry other information such as the Disclosure flag
   (D), the Allow Clear flag (A), the SAS Verified flag (V), and the
   Private Branch Exchange (PBX) Enrollment flag (E).  All flags are
   encrypted to shield them from a passive observer.
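
   The core of steps F5 through F7 in Figure 1, a hash commitment
   followed by a Diffie-Hellman exchange, can be sketched in Python as
   follows.  This is a simplified toy, not the ZRTP computation: the
   modulus and generator are illustrative and far too small for real
   use, the commitment here covers only the initiator's public value
   rather than a full message, and the names svi, svr, and hvi are
   local to this example.  Sections 4.4.1.1 through 4.4.1.4 define the
   actual procedure.

```python
import hashlib
import secrets

# Toy parameters for illustration only: far too small for real use and
# not one of the groups ZRTP negotiates.
P = 2**127 - 1
G = 3


def keypair():
    """Return (secret exponent, public value) for the toy group."""
    secret = secrets.randbelow(P - 3) + 2  # secret lies in [2, P - 2]
    return secret, pow(G, secret, P)


def commitment(public_value: int) -> bytes:
    """Hash of a public value, sent ahead of the value itself."""
    return hashlib.sha256(public_value.to_bytes(16, "big")).digest()


# F5: the initiator generates its key pair and sends only a commitment.
svi, pvi = keypair()
hvi = commitment(pvi)

# F6: the responder generates its key pair and sends its public value.
svr, pvr = keypair()

# F7: the initiator reveals pvi; the responder checks it against hvi.
commitment_ok = commitment(pvi) == hvi

# Both sides now compute the same shared value.
shared_initiator = pow(pvr, svi, P)
shared_responder = pow(pvi, svr, P)
```

   Because the initiator is bound to pvi before it sees pvr, it cannot
   choose its public value as a function of the responder's.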

3.1.2. Preshared Mode Overview

In the Preshared mode, endpoints can skip the DH calculation if they have a shared secret from a previous ZRTP session. Preshared mode is indicated in the Commit message and results in the same call flow as Multistream mode. The principal difference between Multistream mode and Preshared mode is that Preshared mode uses a previously cached shared secret, rs1, instead of an active ZRTP Session key as the initial keying material.

This mode could be useful for slow processor endpoints so that a DH calculation does not need to be performed every session. Or, this mode could be used to rapidly re-establish an earlier session that was recently torn down or interrupted without the need to perform another DH calculation.

Preshared mode has forward secrecy properties. If a phone's cache is captured by an opponent, the cached shared secrets cannot be used to recover earlier encrypted calls, because the shared secrets are replaced with new ones in each new call, as in DH mode. However, the captured secrets can be used by a passive wiretapper in the media path to decrypt the next call, if the next call is in Preshared mode. This differs from DH mode, which requires an active MiTM wiretapper to exploit captured secrets in the next call. However, if the next call is missed by the wiretapper, he cannot wiretap any further calls. Thus, it preserves most of the self-healing properties (Section 15.1) of key continuity enjoyed by DH mode.

3.1.3. Multistream Mode Overview

Multistream mode is an alternative key agreement method used when two endpoints have an established SRTP media stream between them with an active ZRTP Session key. ZRTP can derive multiple SRTP keys from a single DH exchange. For example, an established secure voice call that adds a video stream uses Multistream mode to quickly initiate the video stream without a second DH exchange.

When Multistream mode is indicated in the Commit message, a call flow similar to Figure 1 is used, but no DH calculation is performed by either endpoint and the DHPart1 and DHPart2 messages are omitted. The Confirm1, Confirm2, and Conf2ACK messages are still sent. Since the cache is not affected during this mode, multiple Multistream ZRTP exchanges can be performed in parallel between two endpoints.

   When adding additional media streams to an existing call, only
   Multistream mode is used.  Only one DH operation is performed, just
   for the first media stream.
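
   The idea of deriving keys for several streams from one session key,
   without repeating the DH exchange, can be sketched with a keyed
   hash.  The Python code below is a generic HMAC-based derivation
   written for illustration.  The label strings, the 16-byte output
   length, and the function name are hypothetical, and the random
   session key is a stand-in; the actual ZRTP key derivation function
   is defined in Section 4.5.1.

```python
import hashlib
import hmac
import secrets


def derive_stream_key(session_key: bytes, label: str, length: int = 16) -> bytes:
    """Derive a per-stream key from a session key and a distinguishing label."""
    mac = hmac.new(session_key, label.encode("ascii"), hashlib.sha256)
    return mac.digest()[:length]


# Stand-in for a session key left over from an earlier DH exchange.
session_key = secrets.token_bytes(32)

# Each added stream gets its own key with no further DH calculation.
audio_key = derive_stream_key(session_key, "audio stream")
video_key = derive_stream_key(session_key, "video stream")
```

   Distinct labels give unrelated keys, while the same label and
   session key always give the same key, so both endpoints arrive at
   matching keys independently.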


