5. Use of TLS
5.1. Overview
XMPP includes a method for securing the stream from tampering and eavesdropping. This channel encryption method makes use of the Transport Layer Security (TLS) protocol [TLS], along with a "STARTTLS" extension that is modelled after similar extensions for the IMAP [IMAP], POP3 [POP3], and ACAP [ACAP] protocols as described in RFC 2595 [USINGTLS]. The namespace name for the STARTTLS extension is 'urn:ietf:params:xml:ns:xmpp-tls'. An administrator of a given domain MAY require the use of TLS for client-to-server communications, server-to-server communications, or both. Clients SHOULD use TLS to secure the streams prior to attempting the completion of SASL negotiation (Section 6), and servers SHOULD use TLS between two domains for the purpose of securing server-to-server communications.
The following rules apply: 1. An initiating entity that complies with this specification MUST include the 'version' attribute set to a value of "1.0" in the initial stream header. 2. If the TLS negotiation occurs between two servers, communications MUST NOT proceed until the Domain Name System (DNS) hostnames asserted by the servers have been resolved (see Server-to-Server Communications (Section 14.4)). 3. When a receiving entity that complies with this specification receives an initial stream header that includes the 'version' attribute set to a value of at least "1.0", after sending a stream header in reply (including the version flag), it MUST include a <starttls/> element (qualified by the 'urn:ietf:params:xml:ns:xmpp-tls' namespace) along with the list of other stream features it supports. 4. If the initiating entity chooses to use TLS, TLS negotiation MUST be completed before proceeding to SASL negotiation; this order of negotiation is required to help safeguard authentication information sent during SASL negotiation, as well as to make it possible to base the use of the SASL EXTERNAL mechanism on a certificate provided during prior TLS negotiation. 5. During TLS negotiation, an entity MUST NOT send any white space characters (matching production [3] content of [XML]) within the root stream element as separators between elements (any white space characters shown in the TLS examples below are included for the sake of readability only); this prohibition helps to ensure proper security layer byte precision. 6. The receiving entity MUST consider the TLS negotiation to have begun immediately after sending the closing ">" character of the <proceed/> element. The initiating entity MUST consider the TLS negotiation to have begun immediately after receiving the closing ">" character of the <proceed/> element from the receiving entity. 7. The initiating entity MUST validate the certificate presented by the receiving entity; see Certificate Validation (Section 14.2) regarding certificate validation procedures. 8. Certificates MUST be checked against the hostname as provided by the initiating entity (e.g., a user), not the hostname as resolved via the Domain Name System; e.g., if the user specifies a hostname of "example.com" but a DNS SRV [SRV] lookup returned
"im.example.com", the certificate MUST be checked as "example.com". If a JID for any kind of XMPP entity (e.g., client or server) is represented in a certificate, it MUST be represented as a UTF8String within an otherName entity inside the subjectAltName, using the [ASN.1] Object Identifier "id-on-xmppAddr" specified in Section 5.1.1 of this document. 9. If the TLS negotiation is successful, the receiving entity MUST discard any knowledge obtained in an insecure manner from the initiating entity before TLS takes effect. 10. If the TLS negotiation is successful, the initiating entity MUST discard any knowledge obtained in an insecure manner from the receiving entity before TLS takes effect. 11. If the TLS negotiation is successful, the receiving entity MUST NOT offer the STARTTLS extension to the initiating entity along with the other stream features that are offered when the stream is restarted. 12. If the TLS negotiation is successful, the initiating entity MUST continue with SASL negotiation. 13. If the TLS negotiation results in failure, the receiving entity MUST terminate both the XML stream and the underlying TCP connection. 14. See Mandatory-to-Implement Technologies (Section 14.7) regarding mechanisms that MUST be supported.5.1.1. ASN.1 Object Identifier for XMPP Address
The [ASN.1] Object Identifier "id-on-xmppAddr" described above is defined as follows: id-pkix OBJECT IDENTIFIER ::= { iso(1) identified-organization(3) dod(6) internet(1) security(5) mechanisms(5) pkix(7) } id-on OBJECT IDENTIFIER ::= { id-pkix 8 } -- other name forms id-on-xmppAddr OBJECT IDENTIFIER ::= { id-on 5 } XmppAddr ::= UTF8String This Object Identifier MAY also be represented in the dotted display format as "1.3.6.1.5.5.7.8.5".
5.2. Narrative
When an initiating entity secures a stream with a receiving entity using TLS, the steps involved are as follows: 1. The initiating entity opens a TCP connection and initiates the stream by sending the opening XML stream header to the receiving entity, including the 'version' attribute set to a value of at least "1.0". 2. The receiving entity responds by opening a TCP connection and sending an XML stream header to the initiating entity, including the 'version' attribute set to a value of at least "1.0". 3. The receiving entity offers the STARTTLS extension to the initiating entity by including it with the list of other supported stream features (if TLS is required for interaction with the receiving entity, it SHOULD signal that fact by including a <required/> element as a child of the <starttls/> element). 4. The initiating entity issues the STARTTLS command (i.e., a <starttls/> element qualified by the 'urn:ietf:params:xml:ns:xmpp-tls' namespace) to instruct the receiving entity that it wishes to begin a TLS negotiation to secure the stream. 5. The receiving entity MUST reply with either a <proceed/> element or a <failure/> element qualified by the 'urn:ietf:params:xml:ns:xmpp-tls' namespace. If the failure case occurs, the receiving entity MUST terminate both the XML stream and the underlying TCP connection. If the proceed case occurs, the entities MUST attempt to complete the TLS negotiation over the TCP connection and MUST NOT send any further XML data until the TLS negotiation is complete. 6. The initiating entity and receiving entity attempt to complete a TLS negotiation in accordance with [TLS]. 7. If the TLS negotiation is unsuccessful, the receiving entity MUST terminate the TCP connection. If the TLS negotiation is successful, the initiating entity MUST initiate a new stream by sending an opening XML stream header to the receiving entity (it is not necessary to send a closing </stream> tag first, since the receiving entity and initiating entity MUST consider the original stream to be closed upon successful TLS negotiation).
8. Upon receiving the new stream header from the initiating entity, the receiving entity MUST respond by sending a new XML stream header to the initiating entity along with the available features (but not including the STARTTLS feature).5.3. Client-to-Server Example
The following example shows the data flow for a client securing a stream using STARTTLS (note: the alternate steps shown below are provided to illustrate the protocol for failure cases; they are not exhaustive and would not necessarily be triggered by the data sent in the example). Step 1: Client initiates stream to server: <stream:stream xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams' to='example.com' version='1.0'> Step 2: Server responds by sending a stream tag to client: <stream:stream xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams' id='c2s_123' from='example.com' version='1.0'> Step 3: Server sends the STARTTLS extension to client along with authentication mechanisms and any other stream features: <stream:features> <starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'> <required/> </starttls> <mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> <mechanism>DIGEST-MD5</mechanism> <mechanism>PLAIN</mechanism> </mechanisms> </stream:features> Step 4: Client sends the STARTTLS command to server: <starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>
Step 5: Server informs client that it is allowed to proceed: <proceed xmlns='urn:ietf:params:xml:ns:xmpp-tls'/> Step 5 (alt): Server informs client that TLS negotiation has failed and closes both stream and TCP connection: <failure xmlns='urn:ietf:params:xml:ns:xmpp-tls'/> </stream:stream> Step 6: Client and server attempt to complete TLS negotiation over the existing TCP connection. Step 7: If TLS negotiation is successful, client initiates a new stream to server: <stream:stream xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams' to='example.com' version='1.0'> Step 7 (alt): If TLS negotiation is unsuccessful, server closes TCP connection. Step 8: Server responds by sending a stream header to client along with any available stream features: <stream:stream xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams' from='example.com' id='c2s_234' version='1.0'> <stream:features> <mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> <mechanism>DIGEST-MD5</mechanism> <mechanism>PLAIN</mechanism> <mechanism>EXTERNAL</mechanism> </mechanisms> </stream:features> Step 9: Client continues with SASL negotiation (Section 6).
5.4. Server-to-Server Example
The following example shows the data flow for two servers securing a stream using STARTTLS (note: the alternate steps shown below are provided to illustrate the protocol for failure cases; they are not exhaustive and would not necessarily be triggered by the data sent in the example). Step 1: Server1 initiates stream to Server2: <stream:stream xmlns='jabber:server' xmlns:stream='http://etherx.jabber.org/streams' to='example.com' version='1.0'> Step 2: Server2 responds by sending a stream tag to Server1: <stream:stream xmlns='jabber:server' xmlns:stream='http://etherx.jabber.org/streams' from='example.com' id='s2s_123' version='1.0'> Step 3: Server2 sends the STARTTLS extension to Server1 along with authentication mechanisms and any other stream features: <stream:features> <starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'> <required/> </starttls> <mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> <mechanism>DIGEST-MD5</mechanism> <mechanism>KERBEROS_V4</mechanism> </mechanisms> </stream:features> Step 4: Server1 sends the STARTTLS command to Server2: <starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'/> Step 5: Server2 informs Server1 that it is allowed to proceed: <proceed xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>
Step 5 (alt): Server2 informs Server1 that TLS negotiation has failed and closes stream: <failure xmlns='urn:ietf:params:xml:ns:xmpp-tls'/> </stream:stream> Step 6: Server1 and Server2 attempt to complete TLS negotiation via TCP. Step 7: If TLS negotiation is successful, Server1 initiates a new stream to Server2: <stream:stream xmlns='jabber:server' xmlns:stream='http://etherx.jabber.org/streams' to='example.com' version='1.0'> Step 7 (alt): If TLS negotiation is unsuccessful, Server2 closes TCP connection. Step 8: Server2 responds by sending a stream header to Server1 along with any available stream features: <stream:stream xmlns='jabber:server' xmlns:stream='http://etherx.jabber.org/streams' from='example.com' id='s2s_234' version='1.0'> <stream:features> <mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> <mechanism>DIGEST-MD5</mechanism> <mechanism>KERBEROS_V4</mechanism> <mechanism>EXTERNAL</mechanism> </mechanisms> </stream:features> Step 9: Server1 continues with SASL negotiation (Section 6).
6. Use of SASL
6.1. Overview
XMPP includes a method for authenticating a stream by means of an XMPP-specific profile of the Simple Authentication and Security Layer (SASL) protocol [SASL]. SASL provides a generalized method for adding authentication support to connection-based protocols, and XMPP uses a generic XML namespace profile for SASL that conforms to the profiling requirements of [SASL]. The following rules apply: 1. If the SASL negotiation occurs between two servers, communications MUST NOT proceed until the Domain Name System (DNS) hostnames asserted by the servers have been resolved (see Server-to-Server Communications (Section 14.4)). 2. If the initiating entity is capable of SASL negotiation, it MUST include the 'version' attribute set to a value of at least "1.0" in the initial stream header. 3. If the receiving entity is capable of SASL negotiation, it MUST advertise one or more authentication mechanisms within a <mechanisms/> element qualified by the 'urn:ietf:params:xml:ns:xmpp-sasl' namespace in reply to the opening stream tag received from the initiating entity (if the opening stream tag included the 'version' attribute set to a value of at least "1.0"). 4. During SASL negotiation, an entity MUST NOT send any white space characters (matching production [3] content of [XML]) within the root stream element as separators between elements (any white space characters shown in the SASL examples below are included for the sake of readability only); this prohibition helps to ensure proper security layer byte precision. 5. Any XML character data contained within the XML elements used during SASL negotiation MUST be encoded using base64, where the encoding adheres to the definition in Section 3 of RFC 3548 [BASE64]. 6. If provision of a "simple username" is supported by the selected SASL mechanism (e.g., this is supported by the DIGEST-MD5 and CRAM-MD5 mechanisms but not by the EXTERNAL and GSSAPI mechanisms), during authentication the initiating entity SHOULD provide as the simple username its sending domain (IP address or fully qualified domain name as contained in a domain identifier)
in the case of server-to-server communications or its registered account name (user or node name as contained in an XMPP node identifier) in the case of client-to-server communications. 7. If the initiating entity wishes to act on behalf of another entity and the selected SASL mechanism supports transmission of an authorization identity, the initiating entity MUST provide an authorization identity during SASL negotiation. If the initiating entity does not wish to act on behalf of another entity, it MUST NOT provide an authorization identity. As specified in [SASL], the initiating entity MUST NOT provide an authorization identity unless the authorization identity is different from the default authorization identity derived from the authentication identity as described in [SASL]. If provided, the value of the authorization identity MUST be of the form <domain> (i.e., a domain identifier only) for servers and of the form <node@domain> (i.e., node identifier and domain identifier) for clients. 8. Upon successful SASL negotiation that involves negotiation of a security layer, the receiving entity MUST discard any knowledge obtained from the initiating entity which was not obtained from the SASL negotiation itself. 9. Upon successful SASL negotiation that involves negotiation of a security layer, the initiating entity MUST discard any knowledge obtained from the receiving entity which was not obtained from the SASL negotiation itself. 10. See Mandatory-to-Implement Technologies (Section 14.7) regarding mechanisms that MUST be supported.6.2. Narrative
When an initiating entity authenticates with a receiving entity using SASL, the steps involved are as follows: 1. The initiating entity requests SASL authentication by including the 'version' attribute in the opening XML stream header sent to the receiving entity, with the value set to "1.0". 2. After sending an XML stream header in reply, the receiving entity advertises a list of available SASL authentication mechanisms; each of these is a <mechanism/> element included as a child within a <mechanisms/> container element qualified by the 'urn:ietf:params:xml:ns:xmpp-sasl' namespace, which in turn is a child of a <features/> element in the streams namespace. If Use of TLS (Section 5) needs to be established before a particular
authentication mechanism may be used, the receiving entity MUST NOT provide that mechanism in the list of available SASL authentication mechanisms prior to TLS negotiation. If the initiating entity presents a valid certificate during prior TLS negotiation, the receiving entity SHOULD offer the SASL EXTERNAL mechanism to the initiating entity during SASL negotiation (refer to [SASL]), although the EXTERNAL mechanism MAY be offered under other circumstances as well. 3. The initiating entity selects a mechanism by sending an <auth/> element qualified by the 'urn:ietf:params:xml:ns:xmpp-sasl' namespace to the receiving entity and including an appropriate value for the 'mechanism' attribute. This element MAY contain XML character data (in SASL terminology, the "initial response") if the mechanism supports or requires it; if the initiating entity needs to send a zero-length initial response, it MUST transmit the response as a single equals sign ("="), which indicates that the response is present but contains no data. 4. If necessary, the receiving entity challenges the initiating entity by sending a <challenge/> element qualified by the 'urn:ietf:params:xml:ns:xmpp-sasl' namespace to the initiating entity; this element MAY contain XML character data (which MUST be computed in accordance with the definition of the SASL mechanism chosen by the initiating entity). 5. The initiating entity responds to the challenge by sending a <response/> element qualified by the 'urn:ietf:params:xml:ns:xmpp-sasl' namespace to the receiving entity; this element MAY contain XML character data (which MUST be computed in accordance with the definition of the SASL mechanism chosen by the initiating entity). 6. If necessary, the receiving entity sends more challenges and the initiating entity sends more responses. This series of challenge/response pairs continues until one of three things happens: 1. The initiating entity aborts the handshake by sending an <abort/> element qualified by the 'urn:ietf:params:xml:ns:xmpp-sasl' namespace to the receiving entity. Upon receiving an <abort/> element, the receiving entity SHOULD allow a configurable but reasonable number of retries (at least 2), after which it MUST terminate the TCP connection; this enables the initiating entity (e.g., an end-user client) to tolerate incorrectly-provided credentials (e.g., a mistyped password) without being forced to reconnect.
2. The receiving entity reports failure of the handshake by sending a <failure/> element qualified by the 'urn:ietf:params:xml:ns:xmpp-sasl' namespace to the initiating entity (the particular cause of failure SHOULD be communicated in an appropriate child element of the <failure/> element as defined under SASL Errors (Section 6.4)). If the failure case occurs, the receiving entity SHOULD allow a configurable but reasonable number of retries (at least 2), after which it MUST terminate the TCP connection; this enables the initiating entity (e.g., an end-user client) to tolerate incorrectly-provided credentials (e.g., a mistyped password) without being forced to reconnect. 3. The receiving entity reports success of the handshake by sending a <success/> element qualified by the 'urn:ietf:params:xml:ns:xmpp-sasl' namespace to the initiating entity; this element MAY contain XML character data (in SASL terminology, "additional data with success") if required by the chosen SASL mechanism. Upon receiving the <success/> element, the initiating entity MUST initiate a new stream by sending an opening XML stream header to the receiving entity (it is not necessary to send a closing </stream> tag first, since the receiving entity and initiating entity MUST consider the original stream to be closed upon sending or receiving the <success/> element). Upon receiving the new stream header from the initiating entity, the receiving entity MUST respond by sending a new XML stream header to the initiating entity, along with any available features (but not including the STARTTLS and SASL features) or an empty <features/> element (to signify that no additional features are available); any such additional features not defined herein MUST be defined by the relevant extension to XMPP.6.3. SASL Definition
The profiling requirements of [SASL] require that the following information be supplied by a protocol definition: service name: "xmpp" initiation sequence: After the initiating entity provides an opening XML stream header and the receiving entity replies in kind, the receiving entity provides a list of acceptable authentication methods. The initiating entity chooses one method from the list and sends it to the receiving entity as the value of the 'mechanism' attribute possessed by an <auth/> element, optionally including an initial response to avoid a round trip.
exchange sequence: Challenges and responses are carried through the exchange of <challenge/> elements from receiving entity to initiating entity and <response/> elements from initiating entity to receiving entity. The receiving entity reports failure by sending a <failure/> element and success by sending a <success/> element; the initiating entity aborts the exchange by sending an <abort/> element. Upon successful negotiation, both sides consider the original XML stream to be closed and new stream headers are sent by both entities. security layer negotiation: The security layer takes effect immediately after sending the closing ">" character of the <success/> element for the receiving entity, and immediately after receiving the closing ">" character of the <success/> element for the initiating entity. The order of layers is first [TCP], then [TLS], then [SASL], then XMPP. use of the authorization identity: The authorization identity may be used by xmpp to denote the non-default <node@domain> of a client or the sending <domain> of a server.6.4. SASL Errors
The following SASL-related error conditions are defined: o <aborted/> -- The receiving entity acknowledges an <abort/> element sent by the initiating entity; sent in reply to the <abort/> element. o <incorrect-encoding/> -- The data provided by the initiating entity could not be processed because the [BASE64] encoding is incorrect (e.g., because the encoding does not adhere to the definition in Section 3 of [BASE64]); sent in reply to a <response/> element or an <auth/> element with initial response data. o <invalid-authzid/> -- The authzid provided by the initiating entity is invalid, either because it is incorrectly formatted or because the initiating entity does not have permissions to authorize that ID; sent in reply to a <response/> element or an <auth/> element with initial response data. o <invalid-mechanism/> -- The initiating entity did not provide a mechanism or requested a mechanism that is not supported by the receiving entity; sent in reply to an <auth/> element.
o <mechanism-too-weak/> -- The mechanism requested by the initiating entity is weaker than server policy permits for that initiating entity; sent in reply to a <response/> element or an <auth/> element with initial response data. o <not-authorized/> -- The authentication failed because the initiating entity did not provide valid credentials (this includes but is not limited to the case of an unknown username); sent in reply to a <response/> element or an <auth/> element with initial response data. o <temporary-auth-failure/> -- The authentication failed because of a temporary error condition within the receiving entity; sent in reply to an <auth/> element or <response/> element.6.5. Client-to-Server Example
The following example shows the data flow for a client authenticating with a server using SASL, normally after successful TLS negotiation (note: the alternate steps shown below are provided to illustrate the protocol for failure cases; they are not exhaustive and would not necessarily be triggered by the data sent in the example). Step 1: Client initiates stream to server: <stream:stream xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams' to='example.com' version='1.0'> Step 2: Server responds with a stream tag sent to client: <stream:stream xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams' id='c2s_234' from='example.com' version='1.0'> Step 3: Server informs client of available authentication mechanisms: <stream:features> <mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> <mechanism>DIGEST-MD5</mechanism> <mechanism>PLAIN</mechanism> </mechanisms> </stream:features>
Step 4: Client selects an authentication mechanism: <auth xmlns='urn:ietf:params:xml:ns:xmpp-sasl' mechanism='DIGEST-MD5'/> Step 5: Server sends a [BASE64] encoded challenge to client: <challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> cmVhbG09InNvbWVyZWFsbSIsbm9uY2U9Ik9BNk1HOXRFUUdtMmhoIixxb3A9ImF1dGgi LGNoYXJzZXQ9dXRmLTgsYWxnb3JpdGhtPW1kNS1zZXNzCg== </challenge> The decoded challenge is: realm="somerealm",nonce="OA6MG9tEQGm2hh",\ qop="auth",charset=utf-8,algorithm=md5-sess Step 5 (alt): Server returns error to client: <failure xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> <incorrect-encoding/> </failure> </stream:stream> Step 6: Client sends a [BASE64] encoded response to the challenge: <response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> dXNlcm5hbWU9InNvbWVub2RlIixyZWFsbT0ic29tZXJlYWxtIixub25jZT0i T0E2TUc5dEVRR20yaGgiLGNub25jZT0iT0E2TUhYaDZWcVRyUmsiLG5jPTAw MDAwMDAxLHFvcD1hdXRoLGRpZ2VzdC11cmk9InhtcHAvZXhhbXBsZS5jb20i LHJlc3BvbnNlPWQzODhkYWQ5MGQ0YmJkNzYwYTE1MjMyMWYyMTQzYWY3LGNo YXJzZXQ9dXRmLTgK </response> The decoded response is: username="somenode",realm="somerealm",\ nonce="OA6MG9tEQGm2hh",cnonce="OA6MHXh6VqTrRk",\ nc=00000001,qop=auth,digest-uri="xmpp/example.com",\ response=d388dad90d4bbd760a152321f2143af7,charset=utf-8 Step 7: Server sends another [BASE64] encoded challenge to client: <challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZmZmZAo= </challenge>
The decoded challenge is: rspauth=ea40f60335c427b5527b84dbabcdfffd Step 7 (alt): Server returns error to client: <failure xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> <temporary-auth-failure/> </failure> </stream:stream> Step 8: Client responds to the challenge: <response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/> Step 9: Server informs client of successful authentication: <success xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/> Step 9 (alt): Server informs client of failed authentication: <failure xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> <temporary-auth-failure/> </failure> </stream:stream> Step 10: Client initiates a new stream to server: <stream:stream xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams' to='example.com' version='1.0'> Step 11: Server responds by sending a stream header to client along with any additional features (or an empty features element): <stream:stream xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams' id='c2s_345' from='example.com' version='1.0'> <stream:features> <bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'/> <session xmlns='urn:ietf:params:xml:ns:xmpp-session'/> </stream:features>
6.6. Server-to-Server Example
The following example shows the data flow for a server authenticating with another server using SASL, normally after successful TLS negotiation (note: the alternate steps shown below are provided to illustrate the protocol for failure cases; they are not exhaustive and would not necessarily be triggered by the data sent in the example). Step 1: Server1 initiates stream to Server2: <stream:stream xmlns='jabber:server' xmlns:stream='http://etherx.jabber.org/streams' to='example.com' version='1.0'> Step 2: Server2 responds with a stream tag sent to Server1: <stream:stream xmlns='jabber:server' xmlns:stream='http://etherx.jabber.org/streams' from='example.com' id='s2s_234' version='1.0'> Step 3: Server2 informs Server1 of available authentication mechanisms: <stream:features> <mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> <mechanism>DIGEST-MD5</mechanism> <mechanism>KERBEROS_V4</mechanism> </mechanisms> </stream:features> Step 4: Server1 selects an authentication mechanism: <auth xmlns='urn:ietf:params:xml:ns:xmpp-sasl' mechanism='DIGEST-MD5'/> Step 5: Server2 sends a [BASE64] encoded challenge to Server1: <challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> cmVhbG09InNvbWVyZWFsbSIsbm9uY2U9Ik9BNk1HOXRFUUdtMmhoIixxb3A9 ImF1dGgiLGNoYXJzZXQ9dXRmLTgsYWxnb3JpdGhtPW1kNS1zZXNz </challenge>
The decoded challenge is: realm="somerealm",nonce="OA6MG9tEQGm2hh",\ qop="auth",charset=utf-8,algorithm=md5-sess Step 5 (alt): Server2 returns error to Server1: <failure xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> <incorrect-encoding/> </failure> </stream:stream> Step 6: Server1 sends a [BASE64] encoded response to the challenge: <response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> dXNlcm5hbWU9ImV4YW1wbGUub3JnIixyZWFsbT0ic29tZXJlYWxtIixub25j ZT0iT0E2TUc5dEVRR20yaGgiLGNub25jZT0iT0E2TUhYaDZWcVRyUmsiLG5j PTAwMDAwMDAxLHFvcD1hdXRoLGRpZ2VzdC11cmk9InhtcHAvZXhhbXBsZS5v cmciLHJlc3BvbnNlPWQzODhkYWQ5MGQ0YmJkNzYwYTE1MjMyMWYyMTQzYWY3 LGNoYXJzZXQ9dXRmLTgK </response> The decoded response is: username="example.org",realm="somerealm",\ nonce="OA6MG9tEQGm2hh",cnonce="OA6MHXh6VqTrRk",\ nc=00000001,qop=auth,digest-uri="xmpp/example.org",\ response=d388dad90d4bbd760a152321f2143af7,charset=utf-8 Step 7: Server2 sends another [BASE64] encoded challenge to Server1: <challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> cnNwYXV0aD1lYTQwZjYwMzM1YzQyN2I1NTI3Yjg0ZGJhYmNkZmZmZAo= </challenge> The decoded challenge is: rspauth=ea40f60335c427b5527b84dbabcdfffd Step 7 (alt): Server2 returns error to Server1: <failure xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> <invalid-authzid/> </failure> </stream:stream>
Step 8: Server1 responds to the challenge: <response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/> Step 8 (alt): Server1 aborts negotiation: <abort xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/> Step 9: Server2 informs Server1 of successful authentication: <success xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/> Step 9 (alt): Server2 informs Server1 of failed authentication: <failure xmlns='urn:ietf:params:xml:ns:xmpp-sasl'> <aborted/> </failure> </stream:stream> Step 10: Server1 initiates a new stream to Server2: <stream:stream xmlns='jabber:server' xmlns:stream='http://etherx.jabber.org/streams' to='example.com' version='1.0'> Step 11: Server2 responds by sending a stream header to Server1 along with any additional features (or an empty features element): <stream:stream xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams' from='example.com' id='s2s_345' version='1.0'> <stream:features/>7. Resource Binding
After SASL negotiation (Section 6) with the receiving entity, the initiating entity MAY want or need to bind a specific resource to that stream. In general this applies only to clients: in order to conform to the addressing format (Section 3) and stanza delivery rules (Section 10) specified herein, there MUST be a resource identifier associated with the <node@domain> of the client (which is
either generated by the server or provided by the client application); this ensures that the address for use over that stream is a "full JID" of the form <node@domain/resource>. Upon receiving a success indication within the SASL negotiation, the client MUST send a new stream header to the server, to which the server MUST respond with a stream header as well as a list of available stream features. Specifically, if the server requires the client to bind a resource to the stream after successful SASL negotiation, it MUST include an empty <bind/> element qualified by the 'urn:ietf:params:xml:ns:xmpp-bind' namespace in the stream features list it presents to the client upon sending the header for the response stream sent after successful SASL negotiation (but not before): Server advertises resource binding feature to client: <stream:stream xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams' id='c2s_345' from='example.com' version='1.0'> <stream:features> <bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'/> </stream:features> Upon being so informed that resource binding is required, the client MUST bind a resource to the stream by sending to the server an IQ stanza of type "set" (see IQ Semantics (Section 9.2.3)) containing data qualified by the 'urn:ietf:params:xml:ns:xmpp-bind' namespace. If the client wishes to allow the server to generate the resource identifier on its behalf, it sends an IQ stanza of type "set" that contains an empty <bind/> element: Client asks server to bind a resource: <iq type='set' id='bind_1'> <bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'/> </iq> A server that supports resource binding MUST be able to generate a resource identifier on behalf of a client. A resource identifier generated by the server MUST be unique for that <node@domain>.
If the client wishes to specify the resource identifier, it sends an IQ stanza of type "set" that contains the desired resource identifier as the XML character data of a <resource/> element that is a child of the <bind/> element: Client binds a resource: <iq type='set' id='bind_2'> <bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'> <resource>someresource</resource> </bind> </iq> Once the server has generated a resource identifier for the client or accepted the resource identifier provided by the client, it MUST return an IQ stanza of type "result" to the client, which MUST include a <jid/> child element that specifies the full JID for the connected resource as determined by the server: Server informs client of successful resource binding: <iq type='result' id='bind_2'> <bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'> <jid>somenode@example.com/someresource</jid> </bind> </iq> A server SHOULD accept the resource identifier provided by the client, but MAY override it with a resource identifier that the server generates; in this case, the server SHOULD NOT return a stanza error (e.g., <forbidden/>) to the client but instead SHOULD communicate the generated resource identifier to the client in the IQ result as shown above. When a client supplies a resource identifier, the following stanza error conditions are possible (see Stanza Errors (Section 9.3)): o The provided resource identifier cannot be processed by the server in accordance with Resourceprep (Appendix B). o The client is not allowed to bind a resource to the stream (e.g., because the node or user has reached a limit on the number of connected resources allowed). o The provided resource identifier is already in use but the server does not allow binding of multiple connected resources with the same identifier.
The protocol for these error conditions is shown below. Resource identifier cannot be processed: <iq type='error' id='bind_2'> <bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'> <resource>someresource</resource> </bind> <error type='modify'> <bad-request xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/> </error> </iq> Client is not allowed to bind a resource: <iq type='error' id='bind_2'> <bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'> <resource>someresource</resource> </bind> <error type='cancel'> <not-allowed xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/> </error> </iq> Resource identifier is in use: <iq type='error' id='bind_2'> <bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'> <resource>someresource</resource> </bind> <error type='cancel'> <conflict xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/> </error> </iq> If, before completing the resource binding step, the client attempts to send an XML stanza other than an IQ stanza with a <bind/> child qualified by the 'urn:ietf:params:xml:ns:xmpp-bind' namespace, the server MUST NOT process the stanza and SHOULD return a <not-authorized/> stanza error to the client.
8. Server Dialback
8.1. Overview
The Jabber protocols from which XMPP was adapted include a "server dialback" method for protecting against domain spoofing, thus making it more difficult to spoof XML stanzas. Server dialback is not a security mechanism, and results in weak verification of server identities only (see Server-to-Server Communications (Section 14.4) regarding this method's security characteristics). Domains requiring robust security SHOULD use TLS and SASL; see Server-to-Server Communications (Section 14.4) for details. If SASL is used for server-to-server authentication, dialback SHOULD NOT be used since it is unnecessary. Documentation of dialback is included mainly for the sake of backward-compatibility with existing implementations and deployments. The server dialback method is made possible by the existence of the Domain Name System (DNS), since one server can (normally) discover the authoritative server for a given domain. Because dialback depends on DNS, inter-domain communications MUST NOT proceed until the Domain Name System (DNS) hostnames asserted by the servers have been resolved (see Server-to-Server Communications (Section 14.4)). Server dialback is uni-directional, and results in (weak) verification of identities for one stream in one direction. Because server dialback is not an authentication mechanism, mutual authentication is not possible via dialback. Therefore, server dialback MUST be completed in each direction in order to enable bi-directional communications between two domains. The method for generating and verifying the keys used in server dialback MUST take into account the hostnames being used, the stream ID generated by the receiving server, and a secret known by the authoritative server's network. The stream ID is security-critical in server dialback and therefore MUST be both unpredictable and non-repeating (see [RANDOM] for recommendations regarding randomness for security purposes). Any error that occurs during dialback negotiation MUST be considered a stream error, resulting in termination of the stream and of the underlying TCP connection. The possible error conditions are specified in the protocol description below. The following terminology applies: o Originating Server -- the server that is attempting to establish a connection between two domains.
o Receiving Server -- the server that is trying to authenticate that the Originating Server represents the domain which it claims to be. o Authoritative Server -- the server that answers to the DNS hostname asserted by the Originating Server; for basic environments this will be the Originating Server, but it could be a separate machine in the Originating Server's network.8.2. Order of Events
The following is a brief summary of the order of events in dialback: 1. The Originating Server establishes a connection to the Receiving Server. 2. The Originating Server sends a 'key' value over the connection to the Receiving Server. 3. The Receiving Server establishes a connection to the Authoritative Server. 4. The Receiving Server sends the same 'key' value to the Authoritative Server. 5. The Authoritative Server replies that key is valid or invalid. 6. The Receiving Server informs the Originating Server whether it is authenticated or not.
We can represent this flow of events graphically as follows: Originating Receiving Server Server ----------- --------- | | | establish connection | | ----------------------> | | | | send stream header | | ----------------------> | | | | send stream header | | <---------------------- | | | Authoritative | send dialback key | Server | ----------------------> | ------------- | | | | establish connection | | ----------------------> | | | | send stream header | | ----------------------> | | | | send stream header | | <---------------------- | | | | send verify request | | ----------------------> | | | | send verify response | | <---------------------- | | | report dialback result | | <---------------------- | | |8.3. Protocol
The detailed protocol interaction between the servers is as follows: 1. The Originating Server establishes TCP connection to the Receiving Server.
2. The Originating Server sends a stream header to the Receiving Server: <stream:stream xmlns:stream='http://etherx.jabber.org/streams' xmlns='jabber:server' xmlns:db='jabber:server:dialback'> Note: The 'to' and 'from' attributes are OPTIONAL on the root stream element. The inclusion of the xmlns:db namespace declaration with the name shown indicates to the Receiving Server that the Originating Server supports dialback. If the namespace name is incorrect, then the Receiving Server MUST generate an <invalid-namespace/> stream error condition and terminate both the XML stream and the underlying TCP connection. 3. The Receiving Server SHOULD send a stream header back to the Originating Server, including a unique ID for this interaction: <stream:stream xmlns:stream='http://etherx.jabber.org/streams' xmlns='jabber:server' xmlns:db='jabber:server:dialback' id='457F9224A0...'> Note: The 'to' and 'from' attributes are OPTIONAL on the root stream element. If the namespace name is incorrect, then the Originating Server MUST generate an <invalid-namespace/> stream error condition and terminate both the XML stream and the underlying TCP connection. Note well that the Receiving Server SHOULD reply but MAY silently terminate the XML stream and underlying TCP connection depending on security policies in place; however, if the Receiving Server desires to proceed, it MUST send a stream header back to the Originating Server. 4. The Originating Server sends a dialback key to the Receiving Server: <db:result to='Receiving Server' from='Originating Server'> 98AF014EDC0... </db:result> Note: This key is not examined by the Receiving Server, since the Receiving Server does not keep information about the Originating Server between sessions. The key generated by the Originating Server MUST be based in part on the value of the ID provided by the
Receiving Server in the previous step, and in part on a secret shared by the Originating Server and Authoritative Server. If the value of the 'to' address does not match a hostname recognized by the Receiving Server, then the Receiving Server MUST generate a <host-unknown/> stream error condition and terminate both the XML stream and the underlying TCP connection. If the value of the 'from' address matches a domain with which the Receiving Server already has an established connection, then the Receiving Server MUST maintain the existing connection until it validates whether the new connection is legitimate; additionally, the Receiving Server MAY choose to generate a <not-authorized/> stream error condition for the new connection and then terminate both the XML stream and the underlying TCP connection related to the new request. 5. The Receiving Server establishes a TCP connection back to the domain name asserted by the Originating Server, as a result of which it connects to the Authoritative Server. (Note: As an optimization, an implementation MAY reuse an existing connection here.) 6. The Receiving Server sends the Authoritative Server a stream header: <stream:stream xmlns:stream='http://etherx.jabber.org/streams' xmlns='jabber:server' xmlns:db='jabber:server:dialback'> Note: The 'to' and 'from' attributes are OPTIONAL on the root stream element. If the namespace name is incorrect, then the Authoritative Server MUST generate an <invalid-namespace/> stream error condition and terminate both the XML stream and the underlying TCP connection. 7. The Authoritative Server sends the Receiving Server a stream header: <stream:stream xmlns:stream='http://etherx.jabber.org/streams' xmlns='jabber:server' xmlns:db='jabber:server:dialback' id='1251A342B...'> Note: If the namespace name is incorrect, then the Receiving Server MUST generate an <invalid-namespace/> stream error condition and terminate both the XML stream and the underlying TCP connection between it and the Authoritative Server. If a stream error occurs between the Receiving Server and the Authoritative Server, then the Receiving Server MUST generate a <remote-connection-failed/> stream
error condition and terminate both the XML stream and the underlying TCP connection between it and the Originating Server. 8. The Receiving Server sends the Authoritative Server a request for verification of a key: <db:verify from='Receiving Server' to='Originating Server' id='457F9224A0...'> 98AF014EDC0... </db:verify> Note: Passed here are the hostnames, the original identifier from the Receiving Server's stream header to the Originating Server in Step 3, and the key that the Originating Server sent to the Receiving Server in Step 4. Based on this information, as well as shared secret information within the Authoritative Server's network, the key is verified. Any verifiable method MAY be used to generate the key. If the value of the 'to' address does not match a hostname recognized by the Authoritative Server, then the Authoritative Server MUST generate a <host-unknown/> stream error condition and terminate both the XML stream and the underlying TCP connection. If the value of the 'from' address does not match the hostname represented by the Receiving Server when opening the TCP connection (or any validated domain thereof, such as a validated subdomain of the Receiving Server's hostname or another validated domain hosted by the Receiving Server), then the Authoritative Server MUST generate an <invalid-from/> stream error condition and terminate both the XML stream and the underlying TCP connection. 9. The Authoritative Server verifies whether the key was valid or invalid: <db:verify from='Originating Server' to='Receiving Server' type='valid' id='457F9224A0...'/> or <db:verify from='Originating Server' to='Receiving Server' type='invalid' id='457F9224A0...'/>
Note: If the ID does not match that provided by the Receiving Server in Step 3, then the Receiving Server MUST generate an <invalid-id/> stream error condition and terminate both the XML stream and the underlying TCP connection. If the value of the 'to' address does not match a hostname recognized by the Receiving Server, then the Receiving Server MUST generate a <host-unknown/> stream error condition and terminate both the XML stream and the underlying TCP connection. If the value of the 'from' address does not match the hostname represented by the Originating Server when opening the TCP connection (or any validated domain thereof, such as a validated subdomain of the Originating Server's hostname or another validated domain hosted by the Originating Server), then the Receiving Server MUST generate an <invalid-from/> stream error condition and terminate both the XML stream and the underlying TCP connection. After returning the verification to the Receiving Server, the Authoritative Server SHOULD terminate the stream between them. 10. The Receiving Server informs the Originating Server of the result: <db:result from='Receiving Server' to='Originating Server' type='valid'/> Note: At this point, the connection has either been validated via a type='valid', or reported as invalid. If the connection is invalid, then the Receiving Server MUST terminate both the XML stream and the underlying TCP connection. If the connection is validated, data can be sent by the Originating Server and read by the Receiving Server; before that, all XML stanzas sent to the Receiving Server SHOULD be silently dropped. The result of the foregoing is that the Receiving Server has verified the identity of the Originating Server, so that the Originating Server can send, and the Receiving Server can accept, XML stanzas over the "initial stream" (i.e., the stream from the Originating Server to the Receiving Server). In order to verify the identities of the entities using the "response stream" (i.e., the stream from the Receiving Server to the Originating Server), dialback MUST be completed in the opposite direction as well. After successful dialback negotiation, the Receiving Server SHOULD accept subsequent <db:result/> packets (e.g., validation requests sent to a subdomain or other hostname serviced by the Receiving Server) from the Originating Server over the existing validated connection; this enables "piggybacking" of the original validated connection in one direction.
Even if dialback negotiation is successful, a server MUST verify that all XML stanzas received from the other server include a 'from' attribute and a 'to' attribute; if a stanza does not meet this restriction, the server that receives the stanza MUST generate an <improper-addressing/> stream error condition and terminate both the XML stream and the underlying TCP connection. Furthermore, a server MUST verify that the 'from' attribute of stanzas received from the other server includes a validated domain for the stream; if a stanza does not meet this restriction, the server that receives the stanza MUST generate an <invalid-from/> stream error condition and terminate both the XML stream and the underlying TCP connection. Both of these checks help to prevent spoofing related to particular stanzas.