6. Radix-64 Conversions As stated in the introduction, OpenPGP's underlying native representation for objects is a stream of arbitrary octets, and some systems desire these objects to be immune to damage caused by character set translation, data conversions, etc. In principle, any printable encoding scheme that met the requirements of the unsafe channel would suffice, since it would not change the underlying binary bit streams of the native OpenPGP data structures. The OpenPGP standard specifies one such printable encoding scheme to ensure interoperability. OpenPGP's Radix-64 encoding is composed of two parts: a base64 encoding of the binary data, and a checksum. The base64 encoding is identical to the MIME base64 content-transfer-encoding [RFC2231, Section 6.8]. An OpenPGP implementation MAY use ASCII Armor to protect the raw binary data. The checksum is a 24-bit CRC converted to four characters of radix-64 encoding by the same MIME base64 transformation, preceded by an equals sign (=). The CRC is computed by using the generator 0x864CFB and an initialization of 0xB704CE. The accumulation is done on the data before it is converted to radix-64, rather than on the converted data. A sample implementation of this algorithm is in the next section. The checksum with its leading equal sign MAY appear on the first line after the Base64 encoded data. Rationale for CRC-24: The size of 24 bits fits evenly into printable base64. The nonzero initialization can detect more errors than a zero initialization.
6.1. An Implementation of the CRC-24 in "C" #define CRC24_INIT 0xb704ceL #define CRC24_POLY 0x1864cfbL typedef long crc24; crc24 crc_octets(unsigned char *octets, size_t len) { crc24 crc = CRC24_INIT; int i; while (len--) { crc ^= (*octets++) << 16; for (i = 0; i < 8; i++) { crc <<= 1; if (crc & 0x1000000) crc ^= CRC24_POLY; } } return crc & 0xffffffL; } 6.2. Forming ASCII Armor When OpenPGP encodes data into ASCII Armor, it puts specific headers around the data, so OpenPGP can reconstruct the data later. OpenPGP informs the user what kind of data is encoded in the ASCII armor through the use of the headers. Concatenating the following data creates ASCII Armor: - An Armor Header Line, appropriate for the type of data - Armor Headers - A blank (zero-length, or containing only whitespace) line - The ASCII-Armored data - An Armor Checksum - The Armor Tail, which depends on the Armor Header Line. An Armor Header Line consists of the appropriate header line text surrounded by five (5) dashes ('-', 0x2D) on either side of the header line text. The header line text is chosen based upon the type of data that is being encoded in Armor, and how it is being encoded. Header line texts include the following strings:
BEGIN PGP MESSAGE Used for signed, encrypted, or compressed files. BEGIN PGP PUBLIC KEY BLOCK Used for armoring public keys BEGIN PGP PRIVATE KEY BLOCK Used for armoring private keys BEGIN PGP MESSAGE, PART X/Y Used for multi-part messages, where the armor is split amongst Y parts, and this is the Xth part out of Y. BEGIN PGP MESSAGE, PART X Used for multi-part messages, where this is the Xth part of an unspecified number of parts. Requires the MESSAGE-ID Armor Header to be used. BEGIN PGP SIGNATURE Used for detached signatures, OpenPGP/MIME signatures, and natures following clearsigned messages. Note that PGP 2.x s BEGIN PGP MESSAGE for detached signatures. The Armor Headers are pairs of strings that can give the user or the receiving OpenPGP implementation some information about how to decode or use the message. The Armor Headers are a part of the armor, not a part of the message, and hence are not protected by any signatures applied to the message. The format of an Armor Header is that of a key-value pair. A colon (':' 0x38) and a single space (0x20) separate the key and value. OpenPGP should consider improperly formatted Armor Headers to be corruption of the ASCII Armor. Unknown keys should be reported to the user, but OpenPGP should continue to process the message. Currently defined Armor Header Keys are: - "Version", that states the OpenPGP Version used to encode the message. - "Comment", a user-defined comment. - "MessageID", a 32-character string of printable characters. The string must be the same for all parts of a multi-part message that uses the "PART X" Armor Header. MessageID strings should be
unique enough that the recipient of the mail can associate all the parts of a message with each other. A good checksum or cryptographic hash function is sufficient. - "Hash", a comma-separated list of hash algorithms used in this message. This is used only in clear-signed messages. - "Charset", a description of the character set that the plaintext is in. Please note that OpenPGP defines text to be in UTF-8 by default. An implementation will get best results by translating into and out of UTF-8. However, there are many instances where this is easier said than done. Also, there are communities of users who have no need for UTF-8 because they are all happy with a character set like ISO Latin-5 or a Japanese character set. In such instances, an implementation MAY override the UTF-8 default by using this header key. An implementation MAY implement this key and any translations it cares to; an implementation MAY ignore it and assume all text is UTF-8. The MessageID SHOULD NOT appear unless it is in a multi-part message. If it appears at all, it MUST be computed from the finished (encrypted, signed, etc.) message in a deterministic fashion, rather than contain a purely random value. This is to allow the legitimate recipient to determine that the MessageID cannot serve as a covert means of leaking cryptographic key information. The Armor Tail Line is composed in the same manner as the Armor Header Line, except the string "BEGIN" is replaced by the string "END." 6.3. Encoding Binary in Radix-64 The encoding process represents 24-bit groups of input bits as output strings of 4 encoded characters. Proceeding from left to right, a 24-bit input group is formed by concatenating three 8-bit input groups. These 24 bits are then treated as four concatenated 6-bit groups, each of which is translated into a single digit in the Radix-64 alphabet. When encoding a bit stream with the Radix-64 encoding, the bit stream must be presumed to be ordered with the most-significant-bit first. That is, the first bit in the stream will be the high-order bit in the first 8-bit octet, and the eighth bit will be the low-order bit in the first 8-bit octet, and so on.
+--first octet--+-second octet--+--third octet--+ |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0| +-----------+---+-------+-------+---+-----------+ |5 4 3 2 1 0|5 4 3 2 1 0|5 4 3 2 1 0|5 4 3 2 1 0| +--1.index--+--2.index--+--3.index--+--4.index--+ Each 6-bit group is used as an index into an array of 64 printable characters from the table below. The character referenced by the index is placed in the output string. Value Encoding Value Encoding Value Encoding Value Encoding 0 A 17 R 34 i 51 z 1 B 18 S 35 j 52 0 2 C 19 T 36 k 53 1 3 D 20 U 37 l 54 2 4 E 21 V 38 m 55 3 5 F 22 W 39 n 56 4 6 G 23 X 40 o 57 5 7 H 24 Y 41 p 58 6 8 I 25 Z 42 q 59 7 9 J 26 a 43 r 60 8 10 K 27 b 44 s 61 9 11 L 28 c 45 t 62 + 12 M 29 d 46 u 63 / 13 N 30 e 47 v 14 O 31 f 48 w (pad) = 15 P 32 g 49 x 16 Q 33 h 50 y The encoded output stream must be represented in lines of no more than 76 characters each. Special processing is performed if fewer than 24 bits are available at the end of the data being encoded. There are three possibilities: 1. The last data group has 24 bits (3 octets). No special processing is needed. 2. The last data group has 16 bits (2 octets). The first two 6-bit groups are processed as above. The third (incomplete) data group has two zero-value bits added to it, and is processed as above. A pad character (=) is added to the output. 3. The last data group has 8 bits (1 octet). The first 6-bit group is processed as above. The second (incomplete) data group has four zero-value bits added to it, and is processed as above. Two pad characters (=) are added to the output.
6.4. Decoding Radix-64 Any characters outside of the base64 alphabet are ignored in Radix-64 data. Decoding software must ignore all line breaks or other characters not found in the table above. In Radix-64 data, characters other than those in the table, line breaks, and other white space probably indicate a transmission error, about which a warning message or even a message rejection might be appropriate under some circumstances. Because it is used only for padding at the end of the data, the occurrence of any "=" characters may be taken as evidence that the end of the data has been reached (without truncation in transit). No such assurance is possible, however, when the number of octets transmitted was a multiple of three and no "=" characters are present. 6.5. Examples of Radix-64 Input data: 0x14fb9c03d97e Hex: 1 4 f b 9 c | 0 3 d 9 7 e 8-bit: 00010100 11111011 10011100 | 00000011 11011001 11111110 6-bit: 000101 001111 101110 011100 | 000000 111101 100111 111110 Decimal: 5 15 46 28 0 61 37 62 Output: F P u c A 9 l + Input data: 0x14fb9c03d9 Hex: 1 4 f b 9 c | 0 3 d 9 8-bit: 00010100 11111011 10011100 | 00000011 11011001 pad with 00 6-bit: 000101 001111 101110 011100 | 000000 111101 100100 Decimal: 5 15 46 28 0 61 36 pad with = Output: F P u c A 9 k = Input data: 0x14fb9c03 Hex: 1 4 f b 9 c | 0 3 8-bit: 00010100 11111011 10011100 | 00000011 pad with 0000 6-bit: 000101 001111 101110 011100 | 000000 110000 Decimal: 5 15 46 28 0 48 pad with = = Output: F P u c A w = =
6.6. Example of an ASCII Armored Message -----BEGIN PGP MESSAGE----- Version: OpenPrivacy 0.99 yDgBO22WxBHv7O8X7O/jygAEzol56iUKiXmV+XmpCtmpqQUKiQrFqclFqUDBovzS vBSFjNSiVHsuAA== =njUN -----END PGP MESSAGE----- Note that this example is indented by two spaces. 7. Cleartext signature framework It is desirable to sign a textual octet stream without ASCII armoring the stream itself, so the signed text is still readable without special software. In order to bind a signature to such a cleartext, this framework is used. (Note that RFC 2015 defines another way to clear sign messages for environments that support MIME.) The cleartext signed message consists of: - The cleartext header '-----BEGIN PGP SIGNED MESSAGE-----' on a single line, - One or more "Hash" Armor Headers, - Exactly one empty line not included into the message digest, - The dash-escaped cleartext that is included into the message digest, - The ASCII armored signature(s) including the '-----BEGIN PGP SIGNATURE-----' Armor Header and Armor Tail Lines. If the "Hash" armor header is given, the specified message digest algorithm is used for the signature. If there are no such headers, MD5 is used, an implementation MAY omit them for V2.x compatibility. If more than one message digest is used in the signature, the "Hash" armor header contains a comma-delimited list of used message digests. Current message digest names are described below with the algorithm IDs. 7.1. Dash-Escaped Text The cleartext content of the message must also be dash-escaped.
Dash escaped cleartext is the ordinary cleartext where every line starting with a dash '-' (0x2D) is prefixed by the sequence dash '-' (0x2D) and space ' ' (0x20). This prevents the parser from recognizing armor headers of the cleartext itself. The message digest is computed using the cleartext itself, not the dash escaped form. As with binary signatures on text documents, a cleartext signature is calculated on the text using canonical <CR><LF> line endings. The line ending (i.e. the <CR><LF>) before the '-----BEGIN PGP SIGNATURE-----' line that terminates the signed text is not considered part of the signed text. Also, any trailing whitespace (spaces, and tabs, 0x09) at the end of any line is ignored when the cleartext signature is calculated. 8. Regular Expressions A regular expression is zero or more branches, separated by '|'. It matches anything that matches one of the branches. A branch is zero or more pieces, concatenated. It matches a match for the first, followed by a match for the second, etc. A piece is an atom possibly followed by '*', '+', or '?'. An atom followed by '*' matches a sequence of 0 or more matches of the atom. An atom followed by '+' matches a sequence of 1 or more matches of the atom. An atom followed by '?' matches a match of the atom, or the null string. An atom is a regular expression in parentheses (matching a match for the regular expression), a range (see below), '.' (matching any single character), '^' (matching the null string at the beginning of the input string), '$' (matching the null string at the end of the input string), a '\' followed by a single character (matching that character), or a single character with no other significance (matching that character). A range is a sequence of characters enclosed in '[]'. It normally matches any single character from the sequence. If the sequence begins with '^', it matches any single character not from the rest of the sequence. If two characters in the sequence are separated by '-', this is shorthand for the full list of ASCII characters between them (e.g. '[0-9]' matches any decimal digit). To include a literal ']' in the sequence, make it the first character (following a possible '^'). To include a literal '-', make it the first or last character.
9. Constants This section describes the constants used in OpenPGP. Note that these tables are not exhaustive lists; an implementation MAY implement an algorithm not on these lists. See the section "Notes on Algorithms" below for more discussion of the algorithms. 9.1. Public Key Algorithms ID Algorithm -- --------- 1 - RSA (Encrypt or Sign) 2 - RSA Encrypt-Only 3 - RSA Sign-Only 16 - Elgamal (Encrypt-Only), see [ELGAMAL] 17 - DSA (Digital Signature Standard) 18 - Reserved for Elliptic Curve 19 - Reserved for ECDSA 20 - Elgamal (Encrypt or Sign) 21 - Reserved for Diffie-Hellman (X9.42, as defined for IETF-S/MIME) 100 to 110 - Private/Experimental algorithm. Implementations MUST implement DSA for signatures, and Elgamal for encryption. Implementations SHOULD implement RSA keys. Implementations MAY implement any other algorithm. 9.2. Symmetric Key Algorithms ID Algorithm -- --------- 0 - Plaintext or unencrypted data 1 - IDEA [IDEA] 2 - Triple-DES (DES-EDE, as per spec - 168 bit key derived from 192) 3 - CAST5 (128 bit key, as per RFC 2144) 4 - Blowfish (128 bit key, 16 rounds) [BLOWFISH] 5 - SAFER-SK128 (13 rounds) [SAFER] 6 - Reserved for DES/SK 7 - Reserved for AES with 128-bit key
8 - Reserved for AES with 192-bit key 9 - Reserved for AES with 256-bit key 100 to 110 - Private/Experimental algorithm. Implementations MUST implement Triple-DES. Implementations SHOULD implement IDEA and CAST5.Implementations MAY implement any other algorithm. 9.3. Compression Algorithms ID Algorithm -- --------- 0 - Uncompressed 1 - ZIP (RFC 1951) 2 - ZLIB (RFC 1950) 100 to 110 - Private/Experimental algorithm. Implementations MUST implement uncompressed data. Implementations SHOULD implement ZIP. Implementations MAY implement ZLIB. 9.4. Hash Algorithms ID Algorithm Text Name -- --------- ---- ---- 1 - MD5 "MD5" 2 - SHA-1 "SHA1" 3 - RIPE-MD/160 "RIPEMD160" 4 - Reserved for double-width SHA (experimental) 5 - MD2 "MD2" 6 - Reserved for TIGER/192 "TIGER192" 7 - Reserved for HAVAL (5 pass, 160-bit) "HAVAL-5-160" 100 to 110 - Private/Experimental algorithm. Implementations MUST implement SHA-1. Implementations SHOULD implement MD5. 10. Packet Composition OpenPGP packets are assembled into sequences in order to create messages and to transfer keys. Not all possible packet sequences are meaningful and correct. This describes the rules for how packets should be placed into sequences. 10.1. Transferable Public Keys OpenPGP users may transfer public keys. The essential elements of a transferable public key are:
- One Public Key packet - Zero or more revocation signatures - One or more User ID packets - After each User ID packet, zero or more signature packets (certifications) - Zero or more Subkey packets - After each Subkey packet, one signature packet, optionally a revocation. The Public Key packet occurs first. Each of the following User ID packets provides the identity of the owner of this public key. If there are multiple User ID packets, this corresponds to multiple means of identifying the same unique individual user; for example, a user may have more than one email address, and construct a User ID for each one. Immediately following each User ID packet, there are zero or more signature packets. Each signature packet is calculated on the immediately preceding User ID packet and the initial Public Key packet. The signature serves to certify the corresponding public key and user ID. In effect, the signer is testifying to his or her belief that this public key belongs to the user identified by this user ID. After the User ID packets there may be one or more Subkey packets. In general, subkeys are provided in cases where the top-level public key is a signature-only key. However, any V4 key may have subkeys, and the subkeys may be encryption-only keys, signature-only keys, or general-purpose keys. Each Subkey packet must be followed by one Signature packet, which should be a subkey binding signature issued by the top level key. Subkey and Key packets may each be followed by a revocation Signature packet to indicate that the key is revoked. Revocation signatures are only accepted if they are issued by the key itself, or by a key that is authorized to issue revocations via a revocation key subpacket in a self-signature by the top level key. Transferable public key packet sequences may be concatenated to allow transferring multiple public keys in one operation.
10.2. OpenPGP Messages An OpenPGP message is a packet or sequence of packets that corresponds to the following grammatical rules (comma represents sequential composition, and vertical bar separates alternatives): OpenPGP Message :- Encrypted Message | Signed Message | Compressed Message | Literal Message. Compressed Message :- Compressed Data Packet. Literal Message :- Literal Data Packet. ESK :- Public Key Encrypted Session Key Packet | Symmetric-Key Encrypted Session Key Packet. ESK Sequence :- ESK | ESK Sequence, ESK. Encrypted Message :- Symmetrically Encrypted Data Packet | ESK Sequence, Symmetrically Encrypted Data Packet. One-Pass Signed Message :- One-Pass Signature Packet, OpenPGP Message, Corresponding Signature Packet. Signed Message :- Signature Packet, OpenPGP Message | One-Pass Signed Message. In addition, decrypting a Symmetrically Encrypted Data packet and decompressing a Compressed Data packet must yield a valid OpenPGP Message. 10.3. Detached Signatures Some OpenPGP applications use so-called "detached signatures." For example, a program bundle may contain a file, and with it a second file that is a detached signature of the first file. These detached signatures are simply a signature packet stored separately from the data that they are a signature of. 11. Enhanced Key Formats 11.1. Key Structures The format of an OpenPGP V3 key is as follows. Entries in square brackets are optional and ellipses indicate repetition.
RSA Public Key [Revocation Self Signature] User ID [Signature ...] [User ID [Signature ...] ...] Each signature certifies the RSA public key and the preceding user ID. The RSA public key can have many user IDs and each user ID can have many signatures. The format of an OpenPGP V4 key that uses two public keys is similar except that the other keys are added to the end as 'subkeys' of the primary key. Primary-Key [Revocation Self Signature] [Direct Key Self Signature...] User ID [Signature ...] [User ID [Signature ...] ...] [[Subkey [Binding-Signature-Revocation] Primary-Key-Binding-Signature] ...] A subkey always has a single signature after it that is issued using the primary key to tie the two keys together. This binding signature may be in either V3 or V4 format, but V4 is preferred, of course. In the above diagram, if the binding signature of a subkey has been revoked, the revoked binding signature may be removed, leaving only one signature. In a key that has a main key and subkeys, the primary key MUST be a key capable of signing. The subkeys may be keys of any other type. There may be other constructions of V4 keys, too. For example, there may be a single-key RSA key in V4 format, a DSA primary key with an RSA encryption key, or RSA primary key with an Elgamal subkey, etc. It is also possible to have a signature-only subkey. This permits a primary key that collects certifications (key signatures) but is used only used for certifying subkeys that are used for encryption and signatures. 11.2. Key IDs and Fingerprints For a V3 key, the eight-octet key ID consists of the low 64 bits of the public modulus of the RSA key. The fingerprint of a V3 key is formed by hashing the body (but not the two-octet length) of the MPIs that form the key material (public modulus n, followed by exponent e) with MD5.
A V4 fingerprint is the 160-bit SHA-1 hash of the one-octet Packet Tag, followed by the two-octet packet length, followed by the entire Public Key packet starting with the version field. The key ID is the low order 64 bits of the fingerprint. Here are the fields of the hash material, with the example of a DSA key: a.1) 0x99 (1 octet) a.2) high order length octet of (b)-(f) (1 octet) a.3) low order length octet of (b)-(f) (1 octet) b) version number = 4 (1 octet); c) time stamp of key creation (4 octets); d) algorithm (1 octet): 17 = DSA (example); e) Algorithm specific fields. Algorithm Specific Fields for DSA keys (example): e.1) MPI of DSA prime p; e.2) MPI of DSA group order q (q is a prime divisor of p-1); e.3) MPI of DSA group generator g; e.4) MPI of DSA public key value y (= g**x where x is secret). Note that it is possible for there to be collisions of key IDs -- two different keys with the same key ID. Note that there is a much smaller, but still non-zero probability that two different keys have the same fingerprint. Also note that if V3 and V4 format keys share the same RSA key material, they will have different key ids as well as different fingerprints. 12. Notes on Algorithms 12.1. Symmetric Algorithm Preferences The symmetric algorithm preference is an ordered list of algorithms that the keyholder accepts. Since it is found on a self-signature, it is possible that a keyholder may have different preferences. For example, Alice may have TripleDES only specified for "alice@work.com" but CAST5, Blowfish, and TripleDES specified for "alice@home.org".
Note that it is also possible for preferences to be in a subkey's binding signature. Since TripleDES is the MUST-implement algorithm, if it is not explicitly in the list, it is tacitly at the end. However, it is good form to place it there explicitly. Note also that if an implementation does not implement the preference, then it is implicitly a TripleDES-only implementation. An implementation MUST not use a symmetric algorithm that is not in the recipient's preference list. When encrypting to more than one recipient, the implementation finds a suitable algorithm by taking the intersection of the preferences of the recipients. Note that the MUST-implement algorithm, TripleDES, ensures that the intersection is not null. The implementation may use any mechanism to pick an algorithm in the intersection. If an implementation can decrypt a message that a keyholder doesn't have in their preferences, the implementation SHOULD decrypt the message anyway, but MUST warn the keyholder than protocol has been violated. (For example, suppose that Alice, above, has software that implements all algorithms in this specification. Nonetheless, she prefers subsets for work or home. If she is sent a message encrypted with IDEA, which is not in her preferences, the software warns her that someone sent her an IDEA-encrypted message, but it would ideally decrypt it anyway.) An implementation that is striving for backward compatibility MAY consider a V3 key with a V3 self-signature to be an implicit preference for IDEA, and no ability to do TripleDES. This is technically non-compliant, but an implementation MAY violate the above rule in this case only and use IDEA to encrypt the message, provided that the message creator is warned. Ideally, though, the implementation would follow the rule by actually generating two messages, because it is possible that the OpenPGP user's implementation does not have IDEA, and thus could not read the message. Consequently, an implementation MAY, but SHOULD NOT use IDEA in an algorithm conflict with a V3 key. 12.2. Other Algorithm Preferences Other algorithm preferences work similarly to the symmetric algorithm preference, in that they specify which algorithms the keyholder accepts. There are two interesting cases that other comments need to be made about, though, the compression preferences and the hash preferences.
12.2.1. Compression Preferences Compression has been an integral part of PGP since its first days. OpenPGP and all previous versions of PGP have offered compression. And in this specification, the default is for messages to be compressed, although an implementation is not required to do so. Consequently, the compression preference gives a way for a keyholder to request that messages not be compressed, presumably because they are using a minimal implementation that does not include compression. Additionally, this gives a keyholder a way to state that it can support alternate algorithms. Like the algorithm preferences, an implementation MUST NOT use an algorithm that is not in the preference vector. If the preferences are not present, then they are assumed to be [ZIP(1), UNCOMPRESSED(0)]. 12.2.2. Hash Algorithm Preferences Typically, the choice of a hash algorithm is something the signer does, rather than the verifier, because a signer does not typically know who is going to be verifying the signature. This preference, though, allows a protocol based upon digital signatures ease in negotiation. Thus, if Alice is authenticating herself to Bob with a signature, it makes sense for her to use a hash algorithm that Bob's software uses. This preference allows Bob to state in his key which algorithms Alice may use. 12.3. Plaintext Algorithm 0, "plaintext", may only be used to denote secret keys that are stored in the clear. Implementations must not use plaintext in Symmetrically Encrypted Data Packets; they must use Literal Data Packets to encode unencrypted or literal data. 12.4. RSA There are algorithm types for RSA-signature-only, and RSA-encrypt- only keys. These types are deprecated. The "key flags" subpacket in a signature is a much better way to express the same idea, and generalizes it to all algorithms. An implementation SHOULD NOT create such a key, but MAY interpret it. An implementation SHOULD NOT implement RSA keys of size less than 768 bits.
It is permissible for an implementation to support RSA merely for backward compatibility; for example, such an implementation would support V3 keys with IDEA symmetric cryptography. Note that this is an exception to the other MUST-implement rules. An implementation that supports RSA in V4 keys MUST implement the MUST-implement features. 12.5. Elgamal If an Elgamal key is to be used for both signing and encryption, extra care must be taken in creating the key. An ElGamal key consists of a generator g, a prime modulus p, a secret exponent x, and a public value y = g^x mod p. The generator and prime must be chosen so that solving the discrete log problem is intractable. The group g should generate the multiplicative group mod p-1 or a large subgroup of it, and the order of g should have at least one large prime factor. A good choice is to use a "strong" Sophie-Germain prime in choosing p, so that both p and (p-1)/2 are primes. In fact, this choice is so good that implementors SHOULD do it, as it avoids a small subgroup attack. In addition, a result of Bleichenbacher [BLEICHENBACHER] shows that if the generator g has only small prime factors, and if g divides the order of the group it generates, then signatures can be forged. In particular, choosing g=2 is a bad choice if the group order may be even. On the other hand, a generator of 2 is a fine choice for an encryption-only key, as this will make the encryption faster. While verifying Elgamal signatures, note that it is important to test that r and s are less than p. If this test is not done then signatures can be trivially forged by using large r values of approximately twice the length of p. This attack is also discussed in the Bleichenbacher paper. Details on safe use of Elgamal signatures may be found in [MENEZES], which discusses all the weaknesses described above. If an implementation allows Elgamal signatures, then it MUST use the algorithm identifier 20 for an Elgamal public key that can sign. An implementation SHOULD NOT implement Elgamal keys of size less than 768 bits. For long-term security, Elgamal keys should be 1024 bits or longer.
12.6. DSA An implementation SHOULD NOT implement DSA keys of size less than 768 bits. Note that present DSA is limited to a maximum of 1024 bit keys, which are recommended for long-term use. 12.7. Reserved Algorithm Numbers A number of algorithm IDs have been reserved for algorithms that would be useful to use in an OpenPGP implementation, yet there are issues that prevent an implementor from actually implementing the algorithm. These are marked in the Public Algorithms section as "(reserved for)". The reserved public key algorithms, Elliptic Curve (18), ECDSA (19), and X9.42 (21) do not have the necessary parameters, parameter order, or semantics defined. The reserved symmetric key algorithm, DES/SK (6), does not have semantics defined. The reserved hash algorithms, TIGER192 (6), and HAVAL-5-160 (7), do not have OIDs. The reserved algorithm number 4, reserved for a double-width variant of SHA1, is not presently defined. We have reserver three algorithm IDs for the US NIST's Advanced Encryption Standard. This algorithm will work with (at least) 128, 192, and 256-bit keys. We expect that this algorithm will be selected from the candidate algorithms in the year 2000. 12.8. OpenPGP CFB mode OpenPGP does symmetric encryption using a variant of Cipher Feedback Mode (CFB mode). This section describes the procedure it uses in detail. This mode is what is used for Symmetrically Encrypted Data Packets; the mechanism used for encrypting secret key material is similar, but described in those sections above. OpenPGP CFB mode uses an initialization vector (IV) of all zeros, and prefixes the plaintext with ten octets of random data, such that octets 9 and 10 match octets 7 and 8. It does a CFB "resync" after encrypting those ten octets. Note that for an algorithm that has a larger block size than 64 bits, the equivalent function will be done with that entire block. For example, a 16-octet block algorithm would operate on 16 octets, and then produce two octets of check, and then work on 16-octet blocks.
Step by step, here is the procedure: 1. The feedback register (FR) is set to the IV, which is all zeros. 2. FR is encrypted to produce FRE (FR Encrypted). This is the encryption of an all-zero value. 3. FRE is xored with the first 8 octets of random data prefixed to the plaintext to produce C1-C8, the first 8 octets of ciphertext. 4. FR is loaded with C1-C8. 5. FR is encrypted to produce FRE, the encryption of the first 8 octets of ciphertext. 6. The left two octets of FRE get xored with the next two octets of data that were prefixed to the plaintext. This produces C9-C10, the next two octets of ciphertext. 7. (The resync step) FR is loaded with C3-C10. 8. FR is encrypted to produce FRE. 9. FRE is xored with the first 8 octets of the given plaintext, now that we have finished encrypting the 10 octets of prefixed data. This produces C11-C18, the next 8 octets of ciphertext. 10. FR is loaded with C11-C18 11. FR is encrypted to produce FRE. 12. FRE is xored with the next 8 octets of plaintext, to produce the next 8 octets of ciphertext. These are loaded into FR and the process is repeated until the plaintext is used up. 13. Security Considerations As with any technology involving cryptography, you should check the current literature to determine if any algorithms used here have been found to be vulnerable to attack. This specification uses Public Key Cryptography technologies. Possession of the private key portion of a public-private key pair is assumed to be controlled by the proper party or parties. Certain operations in this specification involve the use of random numbers. An appropriate entropy source should be used to generate these numbers. See RFC 1750.
The MD5 hash algorithm has been found to have weaknesses (pseudo- collisions in the compress function) that make some people deprecate its use. They consider the SHA-1 algorithm better. Many security protocol designers think that it is a bad idea to use a single key for both privacy (encryption) and integrity (signatures). In fact, this was one of the motivating forces behind the V4 key format with separate signature and encryption keys. If you as an implementor promote dual-use keys, you should at least be aware of this controversy. The DSA algorithm will work with any 160-bit hash, but it is sensitive to the quality of the hash algorithm, if the hash algorithm is broken, it can leak the secret key. The Digital Signature Standard (DSS) specifies that DSA be used with SHA-1. RIPEMD-160 is considered by many cryptographers to be as strong. An implementation should take care which hash algorithms are used with DSA, as a weak hash can not only allow a signature to be forged, but could leak the secret key. These same considerations about the quality of the hash algorithm apply to Elgamal signatures. If you are building an authentication system, the recipient may specify a preferred signing algorithm. However, the signer would be foolish to use a weak algorithm simply because the recipient requests it. Some of the encryption algorithms mentioned in this document have been analyzed less than others. For example, although CAST5 is presently considered strong, it has been analyzed less than Triple- DES. Other algorithms may have other controversies surrounding them. Some technologies mentioned here may be subject to government control in some countries. 14. Implementation Nits This section is a collection of comments to help an implementer, particularly with an eye to backward compatibility. Previous implementations of PGP are not OpenPGP-compliant. Often the differences are small, but small differences are frequently more vexing than large differences. Thus, this list of potential problems and gotchas for a developer who is trying to be backward-compatible. * PGP 5.x does not accept V4 signatures for anything other than key material. * PGP 5.x does not recognize the "five-octet" lengths in new-format headers or in signature subpacket lengths.
* PGP 5.0 rejects an encrypted session key if the keylength differs from the S2K symmetric algorithm. This is a bug in its validation function. * PGP 5.0 does not handle multiple one-pass signature headers and trailers. Signing one will compress the one-pass signed literal and prefix a V3 signature instead of doing a nested one-pass signature. * When exporting a private key, PGP 2.x generates the header "BEGIN PGP SECRET KEY BLOCK" instead of "BEGIN PGP PRIVATE KEY BLOCK". All previous versions ignore the implied data type, and look directly at the packet data type. * In a clear-signed signature, PGP 5.0 will figure out the correct hash algorithm if there is no "Hash:" header, but it will reject a mismatch between the header and the actual algorithm used. The "standard" (i.e. Zimmermann/Finney/et al.) version of PGP 2.x rejects the "Hash:" header and assumes MD5. There are a number of enhanced variants of PGP 2.6.x that have been modified for SHA-1 signatures. * PGP 5.0 can read an RSA key in V4 format, but can only recognize it with a V3 keyid, and can properly use only a V3 format RSA key. * Neither PGP 5.x nor PGP 6.0 recognize Elgamal Encrypt and Sign keys. They only handle Elgamal Encrypt-only keys. * There are many ways possible for two keys to have the same key material, but different fingerprints (and thus key ids). Perhaps the most interesting is an RSA key that has been "upgraded" to V4 format, but since a V4 fingerprint is constructed by hashing the key creation time along with other things, two V4 keys created at different times, yet with the same key material will have different fingerprints. * If an implementation is using zlib to interoperate with PGP 2.x, then the "windowBits" parameter should be set to -13.
15. Authors and Working Group Chair The working group can be contacted via the current chair: John W. Noerenberg, II Qualcomm, Inc 6455 Lusk Blvd San Diego, CA 92131 USA Phone: +1 619-658-3510 EMail: jwn2@qualcomm.com The principal authors of this memo are: Jon Callas Network Associates, Inc. 3965 Freedom Circle Santa Clara, CA 95054, USA Phone: +1 408-346-5860 EMail: jon@pgp.com, jcallas@nai.com Lutz Donnerhacke IKS GmbH Wildenbruchstr. 15 07745 Jena, Germany Phone: +49-3641-675642 EMail: lutz@iks-jena.de Hal Finney Network Associates, Inc. 3965 Freedom Circle Santa Clara, CA 95054, USA EMail: hal@pgp.com Rodney Thayer EIS Corporation Clearwater, FL 33767, USA EMail: rodney@unitran.com
This memo also draws on much previous work from a number of other authors who include: Derek Atkins, Charles Breed, Dave Del Torto, Marc Dyksterhouse, Gail Haspert, Gene Hoffman, Paul Hoffman, Raph Levien, Colin Plumb, Will Price, William Stallings, Mark Weaver, and Philip R. Zimmermann. 16. References [BLEICHENBACHER] Bleichenbacher, Daniel, "Generating ElGamal signatures without knowing the secret key," Eurocrypt 96. Note that the version in the proceedings has an error. A revised version is available at the time of writing from <ftp://ftp.inf.ethz.ch/pub/publications/papers/ti/isc /ElGamal.ps> [BLOWFISH] Schneier, B. "Description of a New Variable-Length Key, 64-Bit Block Cipher (Blowfish)" Fast Software Encryption, Cambridge Security Workshop Proceedings (December 1993), Springer-Verlag, 1994, pp191-204 <http://www.counterpane.com/bfsverlag.html> [DONNERHACKE] Donnerhacke, L., et. al, "PGP263in - an improved international version of PGP", ftp://ftp.iks- jena.de/mitarb/lutz/crypt/software/pgp/ [ELGAMAL] T. ElGamal, "A Public-Key Cryptosystem and a Signature Scheme Based on Discrete Logarithms," IEEE Transactions on Information Theory, v. IT-31, n. 4, 1985, pp. 469-472. [IDEA] Lai, X, "On the design and security of block ciphers", ETH Series in Information Processing, J.L. Massey (editor), Vol. 1, Hartung-Gorre Verlag Knostanz, Technische Hochschule (Zurich), 1992 [ISO-10646] ISO/IEC 10646-1:1993. International Standard -- Information technology -- Universal Multiple-Octet Coded Character Set (UCS) -- Part 1: Architecture and Basic Multilingual Plane. UTF-8 is described in Annex R, adopted but not yet published. UTF-16 is described in Annex Q, adopted but not yet published. [MENEZES] Alfred Menezes, Paul van Oorschot, and Scott Vanstone, "Handbook of Applied Cryptography," CRC Press, 1996.
[RFC822] Crocker, D., "Standard for the format of ARPA Internet text messages", STD 11, RFC 822, August 1982. [RFC1423] Balenson, D., "Privacy Enhancement for Internet Electronic Mail: Part III: Algorithms, Modes, and Identifiers", RFC 1423, October 1993. [RFC1641] Goldsmith, D. and M. Davis, "Using Unicode with MIME", RFC 1641, July 1994. [RFC1750] Eastlake, D., Crocker, S. and J. Schiller, "Randomness Recommendations for Security", RFC 1750, December 1994. [RFC1951] Deutsch, P., "DEFLATE Compressed Data Format Specification version 1.3.", RFC 1951, May 1996. [RFC1983] Malkin, G., "Internet Users' Glossary", FYI 18, RFC 1983, August 1996. [RFC1991] Atkins, D., Stallings, W. and P. Zimmermann, "PGP Message Exchange Formats", RFC 1991, August 1996. [RFC2015] Elkins, M., "MIME Security with Pretty Good Privacy (PGP)", RFC 2015, October 1996. [RFC2231] Borenstein, N. and N. Freed, "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies.", RFC 2231, November 1996. [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Level", BCP 14, RFC 2119, March 1997. [RFC2144] Adams, C., "The CAST-128 Encryption Algorithm", RFC 2144, May 1997. [RFC2279] Yergeau., F., "UTF-8, a transformation format of Unicode and ISO 10646", RFC 2279, January 1998. [RFC2313] Kaliski, B., "PKCS #1: RSA Encryption Standard version 1.5", RFC 2313, March 1998. [SAFER] Massey, J.L. "SAFER K-64: One Year Later", B. Preneel, editor, Fast Software Encryption, Second International Workshop (LNCS 1008) pp212-241, Springer-Verlag 1995
17. Full Copyright Statement Copyright (C) The Internet Society (1998). All Rights Reserved. This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English. The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns. This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.