Network Working Group                                   C. Jennings, Ed.
Request for Comments: 5626                                 Cisco Systems
Updates: 3261, 3327                                         R. Mahy, Ed.
Category: Standards Track                                   Unaffiliated
                                                           F. Audet, Ed.
                                                              Skype Labs
                                                            October 2009

Managing Client-Initiated Connections in the Session Initiation Protocol (SIP)

Abstract
The Session Initiation Protocol (SIP) allows proxy servers to initiate TCP connections or to send asynchronous UDP datagrams to User Agents in order to deliver requests. However, in a large number of real deployments, many practical considerations, such as the existence of firewalls and Network Address Translators (NATs) or the use of TLS with server-provided certificates, prevent servers from connecting to User Agents in this way. This specification defines behaviors for User Agents, registrars, and proxy servers that allow requests to be delivered on existing connections established by the User Agent. It also defines keep-alive behaviors needed to keep NAT bindings open and specifies the usage of multiple connections from the User Agent to its registrar.

Status of This Memo

This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the BSD License.

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Table of Contents
1. Introduction
2. Conventions and Terminology
   2.1. Definitions
3. Overview
   3.1. Summary of Mechanism
   3.2. Single Registrar and UA
   3.3. Multiple Connections from a User Agent
   3.4. Edge Proxies
   3.5. Keep-Alive Technique
      3.5.1. CRLF Keep-Alive Technique
      3.5.2. STUN Keep-Alive Technique
4. User Agent Procedures
   4.1. Instance ID Creation
   4.2. Registrations
      4.2.1. Initial Registrations
      4.2.2. Subsequent REGISTER Requests
      4.2.3. Third-Party Registrations
   4.3. Sending Non-REGISTER Requests
   4.4. Keep-Alives and Detecting Flow Failure
      4.4.1. Keep-Alive with CRLF
      4.4.2. Keep-Alive with STUN
   4.5. Flow Recovery
5. Edge Proxy Procedures
   5.1. Processing Register Requests
   5.2. Generating Flow Tokens
   5.3. Forwarding Non-REGISTER Requests
      5.3.1. Processing Incoming Requests
      5.3.2. Processing Outgoing Requests
   5.4. Edge Proxy Keep-Alive Handling
6. Registrar Procedures
7. Authoritative Proxy Procedures: Forwarding Requests
8. STUN Keep-Alive Processing
   8.1. Use with SigComp
9. Example Message Flow
   9.1. Subscription to Configuration Package
   9.2. Registration
   9.3. Incoming Call and Proxy Crash
   9.4. Re-Registration
   9.5. Outgoing Call
10. Grammar
11. IANA Considerations
   11.1. Flow-Timer Header Field
   11.2. "reg-id" Contact Header Field Parameter
   11.3. SIP/SIPS URI Parameters
   11.4. SIP Option Tag
   11.5. 430 (Flow Failed) Response Code
   11.6. 439 (First Hop Lacks Outbound Support) Response Code
   11.7. Media Feature Tag
12. Security Considerations
13. Operational Notes on Transports
14. Requirements
15. Acknowledgments
16. References
   16.1. Normative References
   16.2. Informative References
Appendix A. Default Flow Registration Backoff Times
Appendix B. ABNF

1. Introduction

There are many environments for SIP [RFC3261] deployments in which the User Agent (UA) can form a connection to a registrar or proxy but in which connections in the reverse direction to the UA are not possible. This can happen for several reasons, but the most likely is a NAT or a firewall in between the SIP UA and the proxy. Many such devices will only allow outgoing connections. This specification allows a SIP User Agent behind such a firewall or NAT to receive inbound traffic associated with registrations or dialogs that it initiates.

Most IP phones and personal computers get their network configurations dynamically via a protocol such as the Dynamic Host Configuration Protocol (DHCP) [RFC2131]. These systems typically do not have a useful name in the Domain Name System (DNS) [RFC1035], and they almost never have a long-term, stable DNS name that is appropriate for use in the subjectAltName of a certificate, as required by [RFC3261]. However, these systems can still act as a Transport Layer Security (TLS) [RFC5246] client and form outbound connections to a proxy or registrar that authenticates with a server certificate. The server can authenticate the UA using a shared secret in a digest challenge (as defined in Section 22 of RFC 3261) over that TLS connection. This specification allows a SIP User Agent who has to initiate the TLS connection to receive inbound traffic associated with registrations or dialogs that it initiates.

The key idea of this specification is that when a UA sends a REGISTER request or a dialog-forming request, the proxy can later use this same network "flow" -- whether this is a bidirectional stream of UDP datagrams, a TCP connection, or an analogous concept in another transport protocol -- to forward any incoming requests that need to go to this UA in the context of the registration or dialog.

For a UA to receive incoming requests, the UA has to connect to a server. Since the server can't connect to the UA, the UA has to make sure that a flow is always active. This requires the UA to detect when a flow fails. Since such detection takes time and leaves a window of opportunity for missed incoming requests, this mechanism allows the UA to register over multiple flows at the same time.

This specification also defines two keep-alive schemes. The keep-alive mechanism is used to keep NAT bindings fresh, and to allow the UA to detect when a flow has failed.

2. Conventions and Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

2.1. Definitions

Authoritative Proxy: A proxy that handles non-REGISTER requests for a specific Address-of-Record (AOR), performs the logical Location Server lookup described in [RFC3261], and forwards those requests to specific Contact URIs. (In [RFC3261], the role that is authoritative for REGISTER requests for a specific AOR is a Registration Server.)

Edge Proxy: An edge proxy is any proxy that is located topologically between the registering User Agent and the Authoritative Proxy. The "first" edge proxy refers to the first edge proxy encountered when a UA sends a request.

Flow: A Flow is a transport-layer association between two hosts that is represented by the network address and port number of both ends and by the transport protocol. For TCP, a flow is equivalent to a TCP connection. For UDP a flow is a bidirectional stream of datagrams between a single pair of IP addresses and ports of both peers. With TCP, a flow often has a one-to-one correspondence with a single file descriptor in the operating system.

Flow Token: An identifier that uniquely identifies a flow which can be included in a SIP URI (Uniform Resource Identifier [RFC3986]).

reg-id: This refers to the value of a new header field parameter value for the Contact header field. When a UA registers multiple times, each for a different flow, each concurrent registration gets a unique reg-id value.

instance-id: This specification uses the word instance-id to refer to the value of the "sip.instance" media feature tag which appears as a "+sip.instance" Contact header field parameter. This is a Uniform Resource Name (URN) that uniquely identifies this specific UA instance.

"ob" Parameter: The "ob" parameter is a SIP URI parameter that has a different meaning depending on context. In a Path header field value, it is used by the first edge proxy to indicate that a flow token was added to the URI. In a Contact or Route header field value, it indicates that the UA would like other requests in the same dialog to be routed over the same flow.

outbound-proxy-set: A set of SIP URIs (Uniform Resource Identifiers) that represents each of the outbound proxies (often edge proxies) with which the UA will attempt to maintain a direct flow. The first URI in the set is often referred to as the primary outbound proxy and the second as the secondary outbound proxy. There is no difference between any of the URIs in this set, nor does the primary/secondary terminology imply that one is preferred over the other.

3. Overview

The mechanisms defined in this document are useful in several scenarios discussed below, including the simple co-located registrar and proxy, a User Agent desiring multiple connections to a resource (for redundancy, for example), and a system that uses edge proxies.

This entire section is non-normative.

3.1. Summary of Mechanism

Each UA has a unique instance-id that stays the same for this UA even if the UA reboots or is power cycled. Each UA can register multiple times over different flows for the same SIP Address of Record (AOR) to achieve high reliability. Each registration includes the instance-id for the UA and a reg-id label that is different for each flow. The registrar can use the instance-id to recognize that two different registrations both correspond to the same UA. The registrar can use the reg-id label to recognize whether a UA is creating a new flow or refreshing or replacing an old one, possibly after a reboot or a network failure.

When a proxy goes to route a message to a UA for which it has a binding, it can use any one of the flows on which a successful registration has been completed. A request that fails to be delivered on a particular flow can be tried again on an alternate flow. Proxies can determine which flows go to the same UA by comparing the instance-id. Proxies can tell that a flow replaces a previously abandoned flow by looking at the reg-id.

When sending a dialog-forming request, a UA can also ask its first edge proxy to route subsequent requests in that dialog over the same flow. This is necessary whether the UA has registered or not.

UAs use a simple periodic message as a keep-alive mechanism to keep their flow to the proxy or registrar alive. For connection-oriented transports such as TCP this is based on carriage-return and line-feed sequences (CRLF), while for transports that are not connection oriented, this is accomplished by using a SIP-specific usage profile of STUN (Session Traversal Utilities for NAT) [RFC5389].

3.2. Single Registrar and UA

In the topology shown below, a single server is acting as both a registrar and proxy.

      +-----------+
      | Registrar |
      | Proxy     |
      +-----+-----+
            |
            |
       +----+--+
       | User  |
       | Agent |
       +-------+

User Agents that form only a single flow continue to register normally but include the instance-id as described in Section 4.1. The UA also includes a "reg-id" Contact header field parameter that is used to allow the registrar to detect and avoid keeping invalid contacts when a UA reboots or reconnects after its old connection has failed for some reason.

For clarity, here is an example. Bob's UA creates a new TCP flow to the registrar and sends the following REGISTER request.

   REGISTER sip:example.com SIP/2.0
   Via: SIP/2.0/TCP 192.0.2.2;branch=z9hG4bK-bad0ce-11-1036
   Max-Forwards: 70
   From: Bob <sip:bob@example.com>;tag=d879h76
   To: Bob <sip:bob@example.com>
   Call-ID: 8921348ju72je840.204
   CSeq: 1 REGISTER
   Supported: path, outbound
   Contact: <sip:line1@192.0.2.2;transport=tcp>; reg-id=1;
    +sip.instance="<urn:uuid:00000000-0000-1000-8000-000A95A0E128>"
   Content-Length: 0

The registrar challenges this registration to authenticate Bob. When the registrar adds an entry for this contact under the AOR for Bob, the registrar also keeps track of the connection over which it received this registration.

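The registrar state implied by this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Flow`, `Binding`, and `BindingTable` are hypothetical and do not come from this specification, the registrar address 192.0.2.1 and the port numbers are invented, and expiry, authentication, and Path handling are left out. Bindings are keyed on the AOR, instance-id, and reg-id, so a second registration carrying the same triple overwrites the stored Contact URI and flow.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Flow:
    """Transport protocol plus the address/port pair of both ends."""
    transport: str
    local: Tuple[str, int]
    remote: Tuple[str, int]


@dataclass
class Binding:
    contact_uri: str
    instance_id: str
    reg_id: int
    flow: Flow


class BindingTable:
    """Hypothetical in-memory registrar state (no expiry, no persistence)."""

    def __init__(self) -> None:
        # Key: (AOR, instance-id, reg-id)
        self._bindings: Dict[Tuple[str, str, int], Binding] = {}

    def register(self, aor: str, binding: Binding) -> bool:
        """Store a binding; return True if it replaced an older one."""
        key = (aor, binding.instance_id, binding.reg_id)
        replaced = key in self._bindings
        self._bindings[key] = binding
        return replaced

    def flows_for(self, aor: str, instance_id: str) -> List[Flow]:
        """All flows that reach the same UA instance under this AOR."""
        return [b.flow for (a, i, _), b in self._bindings.items()
                if a == aor and i == instance_id]


# Bob's UA registers, loses its connection, and registers again with the
# same instance-id and reg-id over a new TCP connection (assumed addresses).
AOR = "sip:bob@example.com"
INSTANCE = "urn:uuid:00000000-0000-1000-8000-000A95A0E128"
CONTACT = "sip:line1@192.0.2.2;transport=tcp"
old_flow = Flow("tcp", ("192.0.2.1", 5060), ("192.0.2.2", 49152))
new_flow = Flow("tcp", ("192.0.2.1", 5060), ("192.0.2.2", 49153))

table = BindingTable()
first = table.register(AOR, Binding(CONTACT, INSTANCE, 1, old_flow))
second = table.register(AOR, Binding(CONTACT, INSTANCE, 1, new_flow))
```

After the second call only the new flow remains for Bob's UA, so a proxy sharing this table would forward requests over the new connection. A registration with a different reg-id would add a second flow for the same instance-id.
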
The registrar saves the instance-id ("urn:uuid:00000000-0000-1000-8000-000A95A0E128") and reg-id ("1") along with the rest of the Contact header field. If the instance-id and reg-id are the same as a previous registration for the same AOR, the registrar replaces the old Contact URI and flow information. This allows a UA that has rebooted to replace its previous registration for each flow with minimal impact on overall system load.

When Alice sends a request to Bob, his authoritative proxy selects the target set. The proxy forwards the request to elements in the target set based on the proxy's policy. The proxy looks at the target set and uses the instance-id to understand if two targets both end up routing to the same UA. When the proxy goes to forward a request to a given target, it looks and finds the flows over which it received the registration. The proxy then forwards the request over an existing flow, instead of resolving the Contact URI using the procedures in [RFC3263] and trying to form a new flow to that contact.

As described in the next section, if the proxy has multiple flows that all go to this UA, the proxy can choose any one of the registration bindings for this AOR that has the same instance-id as the selected UA.

3.3. Multiple Connections from a User Agent

There are various ways to deploy SIP to build a reliable and scalable system. This section discusses one such design that is possible with the mechanisms in this specification. Other designs are also possible.

In the example system below, the logical outbound proxy/registrar for the domain is running on two hosts that share the appropriate state and can both provide registrar and outbound proxy functionality for the domain. The UA will form connections to two of the physical hosts that can perform the authoritative proxy/registrar function for the domain. Reliability is achieved by having the UA form two TCP connections to the domain.

       +-------------------+
       | Domain            |
       | Logical Proxy/Reg |
       |                   |
       |+-----+     +-----+|
       ||Host1|     |Host2||
       |+-----+     +-----+|
       +---\------------/--+
            \          /
             \        /
              \      /
               \    /
              +------+
              | User |
              | Agent|
              +------+

The UA is configured with multiple outbound proxy registration URIs. These URIs are configured into the UA through whatever the normal mechanism is to configure the proxy address and AOR in the UA. If the AOR is alice@example.com, the outbound-proxy-set might look something like "sip:primary.example.com" and "sip:secondary.example.com". Note that each URI in the outbound-proxy-set could resolve to several different physical hosts. The administrative domain that created these URIs should ensure that the two URIs resolve to separate hosts. These URIs are handled according to normal SIP processing rules, so mechanisms like DNS SRV [RFC2782] can be used to do load-balancing across a proxy farm. The approach in this document does not prevent future extensions, such as the SIP UA configuration framework [CONFIG-FMWK], from adding other ways for a User Agent to discover its outbound-proxy-set.

The domain also needs to ensure that a request for the UA sent to Host1 or Host2 is then sent across the appropriate flow to the UA. The domain might choose to use the Path header approach (as described in the next section) to store this internal routing information on Host1 or Host2.

When a single server fails, all the UAs that have a flow through it will detect a flow failure and try to reconnect. This can cause large loads on the server. When large numbers of hosts reconnect nearly simultaneously, this is referred to as the avalanche restart problem, and is further discussed in Section 4.5. The multiple flows to many servers help reduce the load caused by the avalanche restart. If a UA has multiple flows, and one of the servers fails, the UA delays a recommended amount of time before trying to form a new connection to replace the flow to the server that failed. By spreading out the time used for all the UAs to reconnect to a server, the load on the server farm is reduced.

Scalability is achieved by using DNS SRV [RFC2782] to load-balance the primary connection across a set of machines that can service the primary connection, and also using DNS SRV to load-balance across a separate set of machines that can service the secondary connection. The deployment here requires that DNS is configured with one entry that resolves to all the primary hosts and another entry that resolves to all the secondary hosts. While this introduces additional DNS configuration, the approach works and requires no additional SIP extensions to [RFC3263].

Another motivation for maintaining multiple flows between the UA and its registrar is related to multihomed UAs. Such UAs can benefit from multiple connections from different interfaces to protect against the failure of an individual access link.

3.4. Edge Proxies

Some SIP deployments use edge proxies such that the UA sends the REGISTER to an edge proxy that then forwards the REGISTER to the registrar. There could be a NAT or firewall between the UA and the edge proxy.

              +---------+
              |Registrar|
              |Proxy    |
              +---------+
               /      \
              /        \
             /          \
          +-----+     +-----+
          |Edge1|     |Edge2|
          +-----+     +-----+
             \           /
              \         /
   ----------------------------NAT/FW
                \     /
                 \   /
                +------+
                |User  |
                |Agent |
                +------+

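In this topology each edge proxy has to remember which flow a registration arrived on and be able to find that flow again when a request is later routed through it. The Python sketch below shows one way to hold that state, using the flow token and "ob" parameter from Section 2.1. It rests on several assumptions made purely for illustration: the class name `FlowTokenTable`, the host name `edge1.example.com`, the use of random tokens, and the in-memory dictionary are all hypothetical, and a real edge proxy may generate its tokens quite differently.

```python
import secrets
from typing import Dict, Optional, Tuple

# (transport, local address/port, remote address/port)
FlowKey = Tuple[str, Tuple[str, int], Tuple[str, int]]


class FlowTokenTable:
    """Hypothetical edge-proxy state mapping flow tokens to flows."""

    def __init__(self, host: str) -> None:
        self._host = host
        self._flows: Dict[str, FlowKey] = {}

    def path_uri_for(self, flow: FlowKey) -> str:
        """Allocate a fresh token and return a Path URI that carries it.

        A new random token is drawn on every call, so a later flow that
        reuses the same addresses and ports never shares a token with an
        earlier one.
        """
        token = secrets.token_urlsafe(16)
        self._flows[token] = flow
        # The token goes in the user part of a loose-route URI; "ob"
        # signals that a flow token was added.
        return f"<sip:{token}@{self._host};lr;ob>"

    def flow_for(self, route_uri: str) -> Optional[FlowKey]:
        """Map the user part of a received Route URI back to its flow."""
        _, _, rest = route_uri.strip("<>").partition(":")
        user, at, _ = rest.partition("@")
        return self._flows.get(user) if at else None

    def flow_closed(self, flow: FlowKey) -> None:
        """Forget every token that pointed at a flow that has gone away."""
        for token in [t for t, f in self._flows.items() if f == flow]:
            del self._flows[token]


edge = FlowTokenTable("edge1.example.com")
ua_flow: FlowKey = ("tcp", ("192.0.2.10", 5060), ("192.0.2.2", 49152))
path_uri = edge.path_uri_for(ua_flow)
```

Looking up a token that was never issued, or one whose flow has closed, returns `None`. In that case the proxy has no flow on which to deliver the request.
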
The edge proxy includes a Path header [RFC3327] so that when the proxy/registrar later forwards a request to this UA, the request is routed through the edge proxy.

These systems can use effectively the same mechanism as described in the previous sections but need to use the Path header. When the edge proxy receives a registration, it needs to create an identifier value that is unique to this flow (and not a subsequent flow with the same addresses) and put this identifier in the Path header URI. This identifier has two purposes. First, it allows the edge proxy to map future requests back to the correct flow. Second, because the identifier will only be returned if the user authenticates with the registrar successfully, it allows the edge proxy to indirectly check the user's authentication information via the registrar. The identifier is placed in the user portion of a loose route in the Path header. If the registration succeeds, the edge proxy needs to map future requests (that are routed to the identifier value from the Path header) to the associated flow.

The term edge proxy is often used to refer to deployments where the edge proxy is in the same administrative domain as the registrar. However, in this specification we use the term to refer to any proxy between the UA and the registrar. For example, the edge proxy may be inside an enterprise that requires its use, and the registrar could be from a service provider with no relationship to the enterprise. Regardless of whether they are in the same administrative domain, this specification requires that registrars and edge proxies support the Path header mechanism in [RFC3327].

3.5. Keep-Alive Technique

This document describes two keep-alive mechanisms: a CRLF keep-alive and a STUN keep-alive. Each of these mechanisms uses a client-to-server "ping" keep-alive and a corresponding server-to-client "pong" message. This ping-pong sequence allows the client, and optionally the server, to tell if its flow is still active and useful for SIP traffic. The server responds to pings by sending pongs. If the client does not receive a pong in response to its ping (allowing for retransmission for STUN as described in Section 4.4.2), it declares the flow dead and opens a new flow in its place.

This document also suggests timer values for these client keep-alive mechanisms. These timer values were chosen to keep most NAT and firewall bindings open, to detect unresponsive servers within 2 minutes, and to mitigate against the avalanche restart problem. However, the client may choose different timer values to suit its needs, for example to optimize battery life. In some environments, the server can also keep track of the time since a ping was received over a flow to guess the likelihood that the flow is still useful for delivering SIP messages.

When the UA detects that a flow has failed or that the flow definition has changed, the UA needs to re-register and will use the back-off mechanism described in Section 4.5 to provide congestion relief when a large number of agents simultaneously reboot.

A keep-alive mechanism needs to keep NAT bindings refreshed; for connections, it also needs to detect failure of a connection; and for connectionless transports, it needs to detect flow failures including changes to the NAT public mapping.

For connection-oriented transports such as TCP [RFC0793] and SCTP [RFC4960], this specification describes a keep-alive approach based on sending CRLFs. For connectionless transport, such as UDP [RFC0768], this specification describes using STUN [RFC5389] over the same flow as the SIP traffic to perform the keep-alive. UAs and Proxies are also free to use native transport keep-alives; however, the application may not be able to set these timers on a per-connection basis, and the server certainly cannot make any assumption about what values are used. Use of native transport keep-alives is outside the scope of this document.

3.5.1. CRLF Keep-Alive Technique

This approach can only be used with connection-oriented transports such as TCP or SCTP. The client periodically sends a double-CRLF (the "ping") then waits to receive a single CRLF (the "pong"). If the client does not receive a "pong" within an appropriate amount of time, it considers the flow failed.

   Note: Sending a CRLF over a connection-oriented transport is backwards compatible (because of requirements in Section 7.5 of [RFC3261]), but only implementations which support this specification will respond to a "ping" with a "pong".

3.5.2. STUN Keep-Alive Technique

This approach can only be used for connection-less transports, such as UDP.

For connection-less transports, a flow definition could change because a NAT device in the network path reboots and the resulting public IP address or port mapping for the UA changes. To detect this, STUN requests are sent over the same flow that is being used for the SIP traffic. The proxy or registrar acts as a limited Session Traversal Utilities for NAT (STUN) [RFC5389] server on the SIP signaling port.

   Note: The STUN mechanism is very robust and allows the detection of a changed IP address and port. Many other options were considered, but the SIP Working Group selected the STUN-based approach. Approaches using SIP requests were abandoned because many believed that good performance and full backwards compatibility using this method were mutually exclusive.
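
To make the STUN exchange more concrete, the following Python sketch builds a STUN Binding Request, tells STUN datagrams apart from SIP text arriving on the same port, and reads the reflexive address out of a Binding success response so that a change in the NAT mapping can be noticed. The message layout (20-byte header, magic cookie 0x2112A442, XOR-MAPPED-ADDRESS attribute) follows [RFC5389]. The function names are hypothetical, only IPv4 is handled, and retransmission, timers, and socket I/O are deliberately left out, so this is a sketch under those assumptions, not a complete client.

```python
import os
import socket
import struct
from typing import Optional, Tuple

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request() -> Tuple[bytes, bytes]:
    """Return (message, transaction_id) for an attribute-less request."""
    txid = os.urandom(12)
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + txid, txid


def is_stun(datagram: bytes) -> bool:
    """STUN starts with two zero bits and carries the magic cookie;
    a SIP message starts with ASCII text and matches neither test."""
    return (len(datagram) >= 20
            and (datagram[0] & 0xC0) == 0
            and struct.unpack("!I", datagram[4:8])[0] == MAGIC_COOKIE)


def mapped_address(response: bytes, txid: bytes) -> Optional[Tuple[str, int]]:
    """Extract the IPv4 XOR-MAPPED-ADDRESS from a matching success response."""
    if not is_stun(response) or response[8:20] != txid:
        return None
    msg_type, length = struct.unpack("!HH", response[:4])
    if msg_type != BINDING_SUCCESS:
        return None
    pos, end = 20, min(20 + length, len(response))
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", response[pos:pos + 4])
        value = response[pos + 4:pos + 4 + attr_len]
        if (attr_type == XOR_MAPPED_ADDRESS and len(value) == 8
                and value[1] == 0x01):  # family 0x01 is IPv4
            port = struct.unpack("!H", value[2:4])[0] ^ (MAGIC_COOKIE >> 16)
            addr = struct.unpack("!I", value[4:8])[0] ^ MAGIC_COOKIE
            return socket.inet_ntoa(struct.pack("!I", addr)), port
        pos += 4 + attr_len + (-attr_len % 4)  # values are padded to 4 bytes
    return None


def flow_changed(previous: Optional[Tuple[str, int]],
                 current: Optional[Tuple[str, int]]) -> bool:
    """True once a known mapping is replaced by a different one."""
    return previous is not None and current != previous
```

A UA using this sketch would remember the address returned for its first successful request on a flow and compare each later result against it. A different address or port suggests that the NAT mapping, and therefore the flow definition, has changed.
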