SPAKE2 is a two-round protocol, wherein the first round establishes a shared secret between A and B, and the second round serves as key confirmation. Prior to invocation, A and B are provisioned with information, such as the input password needed to run the protocol. We assume that the roles of A and B are agreed upon by both sides: A goes first and uses M, and B goes second and uses N. If this assignment of roles is not possible, a symmetric variant MUST be used, as described later in Section 5. For instance, A may be the client when using TCP or TLS as an underlying protocol, and B may be the server. Most protocols have such a distinction. During the first round, A sends a public value pA to B, and B responds with its own public value pB. Both A and B then derive a shared secret used to produce encryption and authentication keys. The latter are used during the second round for key confirmation. (Section 4 details the key derivation and confirmation steps.) In particular, A sends a key confirmation message cA to B, and B responds with its own key confirmation message cB. A MUST NOT consider the protocol complete until it receives and verifies cB. Likewise, B MUST NOT consider the protocol complete until it receives and verifies cA.
This sample flow is shown below.
                 A                       B
                 |                       |
                 |                       |
   (compute pA)  |          pA           |
                 |---------------------->|
                 |          pB           | (compute pB)
                 |<----------------------|
                 |                       |
                 |   (derive secrets)    |
                 |                       |
   (compute cA)  |          cA           |
                 |---------------------->|
                 |          cB           | (compute cB)
                 |                       | (check cA)
                 |<----------------------|
   (check cB)    |                       |
Let G be a group in which the gap Diffie-Hellman (GDH) problem is hard. Suppose G has order p*h, where p is a large prime and h will be called the cofactor. Let I be the unit element in G, e.g., the point at infinity if G is an elliptic curve group. We denote the operations in the group additively. We assume there is a representation of elements of G as byte strings: common choices would be SEC1 [SEC1] uncompressed or compressed for elliptic curve groups or big-endian integers of a fixed (per-group) length for prime field DH. Applications MUST specify this encoding, typically by referring to the document defining the group. We fix two elements, M and N, in the prime-order subgroup of G, as defined in Table 1 of this document for common groups, as well as generator P of the (large) prime-order subgroup of G. In the case of a composite order group, we will work in the quotient group. For common groups used in this document, P is specified in the document defining the group, so we do not repeat it here.
For elliptic curves other than the ones in this document, the methods described in [RFC 9380] SHOULD be used to generate M and N, e.g., via M = hash_to_curve("M SPAKE2 seed OID x") and N = hash_to_curve("N SPAKE2 seed OID x"), where x is an OID for the curve. Applications MAY include a domain separation tag (DST) in this step, as specified in [RFC 9380], though this is not required.
|| denotes concatenation of byte strings. We also let len(S) denote the length of a string in bytes, represented as an eight-byte little-endian number. Finally, let nil represent an empty string, i.e., len(nil) = 0. Text strings in double quotes are treated as their ASCII encodings throughout this document.
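The length-prefix convention above can be sketched in a few lines of Python. The helper name len_prefix is hypothetical and introduced only for illustration. The two assertions show why the prefix matters: plain concatenation cannot distinguish different splits of the same bytes, whereas length-prefixed concatenation can.

```python
def len_prefix(s: bytes) -> bytes:
    """Return len(s) as an eight-byte little-endian number, followed by s.

    Hypothetical helper name; the encoding follows the text above.
    """
    return len(s).to_bytes(8, "little") + s


nil = b""  # the empty string, len(nil) = 0

# Plain concatenation is ambiguous: two different splits give the same bytes.
assert b"ab" + b"c" == b"a" + b"bc"

# With length prefixes, the same two splits encode differently.
assert len_prefix(b"ab") + len_prefix(b"c") != len_prefix(b"a") + len_prefix(b"bc")
```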
KDF(ikm, salt, info, L) is a key-derivation function that takes as input keying material (IKM), a salt, an info string, and derived key length L to derive a cryptographic key of length L. MAC(key, message) is a Message Authentication Code algorithm that takes a secret key and message as input to produce an output. Let Hash be a hash function from arbitrary strings to bit strings of a fixed length that is at least 256 bits long. Common choices for Hash are SHA-256 or SHA-512 [RFC 6234]. Let MHF be a memory-hard hash function designed to slow down brute-force attackers. Scrypt [RFC 7914] is a common example of this function. The output length of MHF matches that of Hash. Parameter selection for MHF is out of scope for this document. Section 6 specifies variants of KDF, MAC, and Hash that are suitable for use with the protocols contained herein.
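As an illustration of the KDF and MAC interfaces, the following Python sketch assumes SHA-256 as Hash, HMAC-SHA-256 as MAC, and an HKDF-style extract-and-expand construction as KDF. These are assumptions made for the example, not choices mandated by the text above, and the function names kdf and mac are hypothetical.

```python
import hashlib
import hmac

HASH = hashlib.sha256  # assumption: SHA-256 plays the role of Hash


def mac(key: bytes, message: bytes) -> bytes:
    """MAC(key, message), here instantiated as HMAC with HASH (an assumption)."""
    return hmac.new(key, message, HASH).digest()


def kdf(ikm: bytes, salt: bytes, info: bytes, length: int) -> bytes:
    """KDF(ikm, salt, info, L) as an HKDF-style extract-then-expand sketch."""
    hash_len = HASH().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested length is too large for this construction")
    # Extract: an empty salt is replaced by hash_len zero bytes.
    prk = mac(salt or bytes(hash_len), ikm)
    # Expand: chain HMAC blocks, each tagged with a one-byte counter.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = mac(prk, block + info + bytes([counter]))
        okm += block
        counter += 1
    return okm[:length]
```

For example, kdf(b"input keying material", b"", b"label", 32) returns 32 bytes that depend on all three inputs; a different info string yields an unrelated key.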
Let A and B be two parties. A and B may also have digital representations of the parties' identities, such as Media Access Control addresses or other names (hostnames, usernames, etc.). A and B may share additional authenticated data (AAD) of a length that is at most 2^16 - 128 bits and separate from their identities, which they may want to include in the protocol execution. One example of AAD is a list of supported protocol versions if SPAKE2 were used in a higher-level protocol that negotiates use of a particular PAKE. Including this list would ensure that both parties agree upon the same set of supported protocols and therefore prevents downgrade attacks. We also assume A and B share an integer w; typically, w = MHF(pw) mod p for a user-supplied password, pw. Standards, such as [NIST.SP.800-56Ar3], suggest taking mod p of a hash value that is 64 bits longer than that needed to represent p to remove statistical bias introduced by the modular reduction. Protocols using this specification MUST define the method used to compute w. In some cases, it may be necessary to carry out various forms of normalization of the password before hashing [RFC 8265]. The hashing algorithm SHOULD be an MHF so as to slow down brute-force attackers.
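One possible way to compute w is sketched below in Python, using scrypt from the standard hashlib module as the MHF. Several details are assumptions made for the example and are not fixed by the text above: the function name derive_w, the use of a salt, the scrypt cost parameters, the big-endian interpretation of the MHF output, and the choice of the P-256 group order as p. The output is requested 64 bits longer than p so that the bias from the reduction mod p is negligible.

```python
import hashlib

# Order of the P-256 prime-order group, used here only as an example value of p.
P256_ORDER = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551


def derive_w(pw: bytes, salt: bytes, p: int,
             n: int = 2**14, r: int = 8, par: int = 1) -> int:
    """Sketch of w = MHF(pw) mod p with scrypt as the MHF.

    The salt and the cost parameters (n, r, par) are illustrative assumptions;
    a protocol using SPAKE2 has to define its own method for computing w.
    """
    # Request 64 bits more than needed to represent p, rounded up to bytes.
    out_len = (p.bit_length() + 64 + 7) // 8
    digest = hashlib.scrypt(pw, salt=salt, n=n, r=r, p=par, dklen=out_len)
    # Assumption: the MHF output is read as a big-endian integer.
    return int.from_bytes(digest, "big") % p


w = derive_w(b"correct horse battery staple", b"example-salt", P256_ORDER)
assert 0 <= w < P256_ORDER
```

Any password normalization would have to be applied to pw before this call; it is omitted here.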
To begin, A picks x randomly and uniformly from the integers in [0,p) and calculates X=x*P and pA=w*M+X. Then, it transmits pA to B.
B selects y randomly and uniformly from the integers in [0,p) and calculates Y=y*P and pB=w*N+Y. Then, it transmits pB to A.
Both A and B calculate group element K. A calculates it as h*x*(pB-w*N), while B calculates it as h*y*(pA-w*M). A knows pB because it has received it, and likewise B knows pA. The multiplication by h prevents small subgroup confinement attacks by computing a unique value in the quotient group.
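The exchange above can be sketched in Python over a toy prime-field group. Everything about the group is an assumption made for illustration: the modulus q = 2039 = 2*1019 + 1, the subgroup order p = 1019, the cofactor h = 2, and the elements P, M, and N are far too small to be secure, and the toy M and N do not have the unknown discrete logarithms that real parameters require. Because this group is written multiplicatively, the additive expression w*M + X becomes M^w * P^x mod q, and subtraction becomes multiplication by an inverse. The function names are hypothetical.

```python
import secrets

# TOY parameters for illustration only; NOT secure.
q = 2039            # safe prime, q = 2*p + 1
p = 1019            # prime order of the subgroup of squares mod q
h = 2               # cofactor
P, M, N = 4, 9, 25  # squares mod q, so each lies in the order-p subgroup


def first_message(w: int, blind: int) -> tuple:
    """Pick a scalar s uniformly from [0, p); return (s, blind^w * P^s mod q).

    With blind = M this is pA = w*M + x*P in the additive notation of the text.
    """
    s = secrets.randbelow(p)
    return s, (pow(blind, w, q) * pow(P, s, q)) % q


def shared_element(s: int, w: int, peer_msg: int, peer_blind: int) -> int:
    """Compute K = h*s*(peer_msg - w*peer_blind), written multiplicatively."""
    unblinded = (peer_msg * pow(peer_blind, -w, q)) % q  # strip the w*blind term
    return pow(unblinded, h * s, q)


w = 123                              # shared integer derived from the password
x, pA = first_message(w, M)          # A uses M
y, pB = first_message(w, N)          # B uses N
K_A = shared_element(x, w, pB, N)    # A computes h*x*(pB - w*N)
K_B = shared_element(y, w, pA, M)    # B computes h*y*(pA - w*M)
assert K_A == K_B                    # both sides arrive at the same K
```

Both sides end with P raised to h*x*y, which is why the values agree whenever the same w is used.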
K is a shared value, though it MUST NOT be used or output as a shared secret from the protocol. Both A and B must derive two additional shared secrets from the protocol transcript, which includes K. This use of the transcript ensures any manipulation of the messages sent is reflected in the keys. The transcript TT is encoded as follows:
        TT = len(A)  || A
          || len(B)  || B
          || len(pA) || pA
          || len(pB) || pB
          || len(K)  || K
          || len(w)  || w
Here, w is encoded as a big-endian number padded to the length of p. This representation prevents timing attacks that otherwise would reveal the length of w. len(w) is thus a constant for a given group. We include it for consistency.
If an identity is absent, it is encoded as a zero-length string. This MUST only be done for applications in which identities are implicit. Otherwise, the protocol risks unknown key-share attacks, where both sides of a connection disagree over who is authenticated.
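The transcript encoding can be sketched in Python as follows. The function names are hypothetical; the sketch assumes pA, pB, and K have already been serialized to byte strings using the group's element encoding, and that an absent identity is passed as an empty byte string.

```python
def len_prefix(s: bytes) -> bytes:
    """Return len(s) as an eight-byte little-endian number, followed by s."""
    return len(s).to_bytes(8, "little") + s


def encode_w(w: int, p: int) -> bytes:
    """Encode w as a big-endian number padded to the byte length of p."""
    return w.to_bytes((p.bit_length() + 7) // 8, "big")


def transcript(id_a: bytes, id_b: bytes, pA: bytes, pB: bytes,
               K: bytes, w: int, p: int) -> bytes:
    """Build TT = len(A)||A || len(B)||B || len(pA)||pA || ... || len(w)||w."""
    parts = [id_a, id_b, pA, pB, K, encode_w(w, p)]
    return b"".join(len_prefix(part) for part in parts)


# Illustrative values only: short placeholders stand in for encoded elements.
TT = transcript(b"client", b"server", b"\x01", b"\x02", b"\x03", w=5, p=1019)
```

Because w is always padded to the length of p, the final field has the same size for every password in a given group.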
Upon completion of this protocol, A and B compute shared secrets Ke, KcA, and KcB, as specified in Section 4. A MUST send B a key confirmation message so that both parties agree upon these shared secrets. The confirmation message cA is computed as a MAC over the protocol transcript TT, using KcA as follows: cA = MAC(KcA, TT). Similarly, B MUST send A a confirmation message using a MAC that is computed equivalently, except with the use of KcB. Key confirmation verification requires computing cA (or cB, respectively) and checking for equality against that which was received.
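A minimal Python sketch of the confirmation step follows, assuming HMAC-SHA-256 as the MAC and taking KcA, KcB, and TT as already-computed inputs. The function names are hypothetical. The comparison uses hmac.compare_digest from the standard library so that the equality check does not leak, through timing, how many leading bytes matched; the text above only requires an equality check, so this is a design choice of the sketch.

```python
import hashlib
import hmac


def confirmation(kc: bytes, tt: bytes) -> bytes:
    """Compute a confirmation message, e.g. cA = MAC(KcA, TT).

    Assumption: MAC is instantiated as HMAC-SHA-256.
    """
    return hmac.new(kc, tt, hashlib.sha256).digest()


def verify_confirmation(kc: bytes, tt: bytes, received: bytes) -> bool:
    """Recompute the expected message and compare it in constant time."""
    return hmac.compare_digest(confirmation(kc, tt), received)


# Placeholder keys and transcript, for illustration only.
KcA, KcB, TT = b"\x11" * 16, b"\x22" * 16, b"example transcript"

cA = confirmation(KcA, TT)                     # sent by A
cB = confirmation(KcB, TT)                     # sent by B
assert verify_confirmation(KcA, TT, cA)        # B checks cA
assert verify_confirmation(KcB, TT, cB)        # A checks cB
assert not verify_confirmation(KcA, TT, cB)    # a message under the wrong key fails
```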