Network Working Group                                          J. Callas
Request for Comments: 2440                            Network Associates
Category: Standards Track                                 L. Donnerhacke
                                      IN-Root-CA Individual Network e.V.
                                                               H. Finney
                                                      Network Associates
                                                               R. Thayer
                                                         EIS Corporation
                                                           November 1998

                         OpenPGP Message Format

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements. Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol. Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (1998). All Rights Reserved.

IESG Note

   This document defines many tag values, yet it doesn't describe a
   mechanism for adding new tags (for new features). Traditionally the
   Internet Assigned Numbers Authority (IANA) handles the allocation of
   new values for future expansion and RFCs usually define the procedure
   to be used by the IANA. However, there are subtle (and not so subtle)
   interactions that may occur in this protocol between new features and
   existing features which result in a significant reduction in overall
   security. Therefore, this document does not define an extension
   procedure. Instead requests to define new tag values (say for new
   encryption algorithms for example) should be forwarded to the IESG
   Security Area Directors for consideration or forwarding to the
   appropriate IETF Working Group for consideration.

Abstract

   This document is maintained in order to publish all necessary
   information needed to develop interoperable applications based on the
   OpenPGP format. It is not a step-by-step cookbook for writing an
   application. It describes only the format and methods needed to read,
   check, generate, and write conforming packets crossing any network.
   It does not deal with storage and implementation questions. It does,
   however, discuss implementation issues necessary to avoid security
   flaws.

   OpenPGP software uses a combination of strong public-key and
   symmetric cryptography to provide security services for electronic
   communications and data storage. These services include
   confidentiality, key management, authentication, and digital
   signatures. This document specifies the message formats used in
   OpenPGP.

Table of Contents

   Status of this Memo
   IESG Note
   Abstract
   Table of Contents
   1. Introduction
     1.1. Terms
   2. General functions
     2.1. Confidentiality via Encryption
     2.2. Authentication via Digital signature
     2.3. Compression
     2.4. Conversion to Radix-64
     2.5. Signature-Only Applications
   3. Data Element Formats
     3.1. Scalar numbers
     3.2. Multi-Precision Integers
     3.3. Key IDs
     3.4. Text
     3.5. Time fields
     3.6. String-to-key (S2K) specifiers
       3.6.1. String-to-key (S2K) specifier types
         3.6.1.1. Simple S2K
         3.6.1.2. Salted S2K
         3.6.1.3. Iterated and Salted S2K
       3.6.2. String-to-key usage
         3.6.2.1. Secret key encryption
         3.6.2.2. Symmetric-key message encryption
   4. Packet Syntax
     4.1. Overview
     4.2. Packet Headers
       4.2.1. Old-Format Packet Lengths
       4.2.2. New-Format Packet Lengths
         4.2.2.1. One-Octet Lengths
         4.2.2.2. Two-Octet Lengths
         4.2.2.3. Five-Octet Lengths
         4.2.2.4. Partial Body Lengths
       4.2.3. Packet Length Examples
     4.3. Packet Tags
   5. Packet Types
     5.1. Public-Key Encrypted Session Key Packets (Tag 1)
     5.2. Signature Packet (Tag 2)
       5.2.1. Signature Types
       5.2.2. Version 3 Signature Packet Format
       5.2.3. Version 4 Signature Packet Format
         5.2.3.1. Signature Subpacket Specification
         5.2.3.2. Signature Subpacket Types
         5.2.3.3. Signature creation time
         5.2.3.4. Issuer
         5.2.3.5. Key expiration time
         5.2.3.6. Preferred symmetric algorithms
         5.2.3.7. Preferred hash algorithms
         5.2.3.8. Preferred compression algorithms
         5.2.3.9. Signature expiration time
         5.2.3.10. Exportable Certification
         5.2.3.11. Revocable
         5.2.3.12. Trust signature
         5.2.3.13. Regular expression
         5.2.3.14. Revocation key
         5.2.3.15. Notation Data
         5.2.3.16. Key server preferences
         5.2.3.17. Preferred key server
         5.2.3.18. Primary user id
         5.2.3.19. Policy URL
         5.2.3.20. Key Flags
         5.2.3.21. Signer's User ID
         5.2.3.22. Reason for Revocation
       5.2.4. Computing Signatures
         5.2.4.1. Subpacket Hints
     5.3. Symmetric-Key Encrypted Session-Key Packets (Tag 3)
     5.4. One-Pass Signature Packets (Tag 4)
     5.5. Key Material Packet
       5.5.1. Key Packet Variants
         5.5.1.1. Public Key Packet (Tag 6)
         5.5.1.2. Public Subkey Packet (Tag 14)
         5.5.1.3. Secret Key Packet (Tag 5)
         5.5.1.4. Secret Subkey Packet (Tag 7)
       5.5.2. Public Key Packet Formats
       5.5.3. Secret Key Packet Formats
     5.6. Compressed Data Packet (Tag 8)
     5.7. Symmetrically Encrypted Data Packet (Tag 9)
     5.8. Marker Packet (Obsolete Literal Packet) (Tag 10)
     5.9. Literal Data Packet (Tag 11)
     5.10. Trust Packet (Tag 12)
     5.11. User ID Packet (Tag 13)
   6. Radix-64 Conversions
     6.1. An Implementation of the CRC-24 in "C"
     6.2. Forming ASCII Armor
     6.3. Encoding Binary in Radix-64
     6.4. Decoding Radix-64
     6.5. Examples of Radix-64
     6.6. Example of an ASCII Armored Message
   7. Cleartext signature framework
     7.1. Dash-Escaped Text
   8. Regular Expressions
   9. Constants
     9.1. Public Key Algorithms
     9.2. Symmetric Key Algorithms
     9.3. Compression Algorithms
     9.4. Hash Algorithms
   10. Packet Composition
     10.1. Transferable Public Keys
     10.2. OpenPGP Messages
     10.3. Detached Signatures
   11. Enhanced Key Formats
     11.1. Key Structures
     11.2. Key IDs and Fingerprints
   12. Notes on Algorithms
     12.1. Symmetric Algorithm Preferences
     12.2. Other Algorithm Preferences
       12.2.1. Compression Preferences
       12.2.2. Hash Algorithm Preferences
     12.3. Plaintext
     12.4. RSA
     12.5. Elgamal
     12.6. DSA
     12.7. Reserved Algorithm Numbers
     12.8. OpenPGP CFB mode
   13. Security Considerations
   14. Implementation Nits
   15. Authors and Working Group Chair
   16. References
   17. Full Copyright Statement

1. Introduction

   This document provides information on the message-exchange packet
   formats used by OpenPGP to provide encryption, decryption, signing,
   and key management functions. It builds on the foundation provided in
   RFC 1991 "PGP Message Exchange Formats."
1.1. Terms

     * OpenPGP - This is a definition for security software that uses
       PGP 5.x as a basis.

     * PGP - Pretty Good Privacy. PGP is a family of software systems
       developed by Philip R. Zimmermann on which OpenPGP is based.

     * PGP 2.6.x - This version of PGP has many variants, hence the term
       PGP 2.6.x. It used only RSA, MD5, and IDEA for its cryptographic
       transforms. An informational RFC, RFC 1991, was written
       describing this version of PGP.

     * PGP 5.x - This version of PGP was formerly known as "PGP 3" in
       the community and also in the predecessor of this document, RFC
       1991. It has new formats and corrects a number of problems in the
       PGP 2.6.x design. It is referred to here as PGP 5.x because that
       software was the first release of the "PGP 3" code base.

   "PGP", "Pretty Good", and "Pretty Good Privacy" are trademarks of
   Network Associates, Inc. and are used with permission.

   This document uses the terms "MUST", "SHOULD", and "MAY" as defined
   in RFC 2119, along with the negated forms of those terms.

2. General functions

   OpenPGP provides data integrity services for messages and data files
   by using these core technologies:

     - digital signatures

     - encryption

     - compression

     - radix-64 conversion

   In addition, OpenPGP provides key management and certificate
   services, but many of these are beyond the scope of this document.

2.1. Confidentiality via Encryption

   OpenPGP uses two encryption methods to provide confidentiality:
   symmetric-key encryption and public-key encryption. With public-key
   encryption, the object is encrypted using a symmetric encryption
   algorithm. Each symmetric key is used only once. A new "session key"
   is generated as a random number for each message. Since it is used
   only once, the session key is bound to the message and transmitted
   with it. To protect the key, it is encrypted with the receiver's
   public key. The sequence is as follows:

   1. The sender creates a message.

   2. The sending OpenPGP generates a random number to be used as a
      session key for this message only.

   3. The session key is encrypted using each recipient's public key.
      These "encrypted session keys" start the message.

   4. The sending OpenPGP encrypts the message using the session key,
      which forms the remainder of the message. Note that the message
      is also usually compressed.

   5. The receiving OpenPGP decrypts the session key using the
      recipient's private key.

   6. The receiving OpenPGP decrypts the message using the session key.
      If the message was compressed, it will be decompressed.

   With symmetric-key encryption, an object may be encrypted with a
   symmetric key derived from a passphrase (or other shared secret), or
   a two-stage mechanism similar to the public-key method described
   above in which a session key is itself encrypted with a symmetric
   algorithm keyed from a shared secret.

   Both digital signature and confidentiality services may be applied to
   the same message. First, a signature is generated for the message and
   attached to the message. Then, the message plus signature is
   encrypted using a symmetric session key. Finally, the session key is
   encrypted using public-key encryption and prefixed to the encrypted
   block.

2.2. Authentication via Digital signature

   The digital signature uses a hash code or message digest algorithm,
   and a public-key signature algorithm. The sequence is as follows:

   1. The sender creates a message.

   2. The sending software generates a hash code of the message.

   3. The sending software generates a signature from the hash code
      using the sender's private key.

   4. The binary signature is attached to the message.
   5. The receiving software keeps a copy of the message signature.

   6. The receiving software generates a new hash code for the received
      message and verifies it using the message's signature. If the
      verification is successful, the message is accepted as authentic.

2.3. Compression

   OpenPGP implementations MAY compress the message after applying the
   signature but before encryption.

2.4. Conversion to Radix-64

   OpenPGP's underlying native representation for encrypted messages,
   signature certificates, and keys is a stream of arbitrary octets.
   Some systems only permit the use of blocks consisting of seven-bit,
   printable text. For transporting OpenPGP's native raw binary octets
   through channels that are not safe for raw binary data, a printable
   encoding of these binary octets is needed. OpenPGP provides the
   service of converting the raw 8-bit binary octet stream to a stream
   of printable ASCII characters, called Radix-64 encoding or ASCII
   Armor.

   Implementations SHOULD provide Radix-64 conversions.

   Note that many applications, particularly messaging applications,
   will want more advanced features as described in the OpenPGP-MIME
   document, RFC 2015. An application that implements OpenPGP for
   messaging SHOULD implement OpenPGP-MIME.

2.5. Signature-Only Applications

   OpenPGP is designed for applications that use both encryption and
   signatures, but there are a number of problems that are solved by a
   signature-only implementation. Although this specification requires
   both encryption and signatures, it is reasonable for there to be
   subset implementations that are non-conformant only in that they omit
   encryption.

3. Data Element Formats

   This section describes the data elements used by OpenPGP.
3.1. Scalar numbers

   Scalar numbers are unsigned, and are always stored in big-endian
   format. Using n[k] to refer to the kth octet being interpreted, the
   value of a two-octet scalar is ((n[0] << 8) + n[1]). The value of a
   four-octet scalar is ((n[0] << 24) + (n[1] << 16) + (n[2] << 8) +
   n[3]).

3.2. Multi-Precision Integers

   Multi-Precision Integers (also called MPIs) are unsigned integers
   used to hold large integers such as the ones used in cryptographic
   calculations.

   An MPI consists of two pieces: a two-octet scalar that is the length
   of the MPI in bits followed by a string of octets that contain the
   actual integer.

   These octets form a big-endian number; a big-endian number can be
   made into an MPI by prefixing it with the appropriate length.

   Examples:

   (all numbers are in hexadecimal)

   The string of octets [00 01 01] forms an MPI with the value 1. The
   string [00 09 01 FF] forms an MPI with the value of 511.

   Additional rules:

   The size of an MPI is ((MPI.length + 7) / 8) + 2 octets.

   The length field of an MPI describes the length starting from its
   most significant non-zero bit. Thus, the MPI [00 02 01] is not formed
   correctly. It should be [00 01 01].

3.3. Key IDs

   A Key ID is an eight-octet scalar that identifies a key.
   Implementations SHOULD NOT assume that Key IDs are unique. The
   section "Enhanced Key Formats" below describes how Key IDs are
   formed.

3.4. Text

   The default character set for text is the UTF-8 [RFC2279] encoding of
   Unicode [ISO10646].
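   As a non-normative illustration of sections 3.1 and 3.2, the
   following C sketch reads scalars and inspects MPIs. The function
   names are hypothetical and not part of this specification, and
   treating a bit count of zero as an MPI with no value octets is an
   assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a two-octet big-endian scalar (section 3.1). */
static uint16_t read_u16(const uint8_t *n)
{
    return (uint16_t)((n[0] << 8) | n[1]);
}

/* Read a four-octet big-endian scalar (section 3.1). */
static uint32_t read_u32(const uint8_t *n)
{
    return ((uint32_t)n[0] << 24) | ((uint32_t)n[1] << 16) |
           ((uint32_t)n[2] << 8) | (uint32_t)n[3];
}

/* Encoded size of an MPI in octets: ((length + 7) / 8) + 2. */
static size_t mpi_encoded_size(const uint8_t *mpi)
{
    return ((size_t)read_u16(mpi) + 7) / 8 + 2;
}

/* Return 1 if the bit count names the most significant non-zero bit. */
static int mpi_is_well_formed(const uint8_t *mpi)
{
    unsigned bits = read_u16(mpi);
    unsigned top_bits;

    if (bits == 0)
        return 1; /* assumption: zero bits means no value octets follow */

    /* Number of significant bits in the first value octet, 1..8. */
    top_bits = ((bits - 1) % 8) + 1;

    /* That bit must be set, and no higher bit may be set. */
    return (mpi[2] >> (top_bits - 1)) == 1;
}

/* Value of a small MPI (at most 32 bits), for illustration only. */
static uint32_t mpi_small_value(const uint8_t *mpi)
{
    size_t n = mpi_encoded_size(mpi) - 2;
    uint32_t v = 0;
    size_t i;

    assert(n <= 4);
    for (i = 0; i < n; i++)
        v = (v << 8) | mpi[2 + i];
    return v;
}
```

   Applied to the examples above, [00 09 01 FF] occupies four octets and
   has the value 511, while [00 02 01] is rejected as not well formed.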
3.5. Time fields

   A time field is an unsigned four-octet number containing the number
   of seconds elapsed since midnight, 1 January 1970 UTC.

3.6. String-to-key (S2K) specifiers

   String-to-key (S2K) specifiers are used to convert passphrase strings
   into symmetric-key encryption/decryption keys. They are used in two
   places, currently: to encrypt the secret part of private keys in the
   private keyring, and to convert passphrases to encryption keys for
   symmetrically encrypted messages.

3.6.1. String-to-key (S2K) specifier types

   There are three types of S2K specifiers currently supported, as
   follows:

3.6.1.1. Simple S2K

   This directly hashes the string to produce the key data. See below
   for how this hashing is done.

       Octet 0:        0x00
       Octet 1:        hash algorithm

   Simple S2K hashes the passphrase to produce the session key. The
   manner in which this is done depends on the size of the session key
   (which will depend on the cipher used) and the size of the hash
   algorithm's output. If the hash size is greater than or equal to the
   session key size, the high-order (leftmost) octets of the hash are
   used as the key.

   If the hash size is less than the key size, multiple instances of the
   hash context are created -- enough to produce the required key data.
   These instances are preloaded with 0, 1, 2, ... octets of zeros (that
   is to say, the first instance has no preloading, the second gets
   preloaded with 1 octet of zero, the third is preloaded with two
   octets of zeros, and so forth).

   As the data is hashed, it is given independently to each hash
   context. Since the contexts have been initialized differently, they
   will each produce different hash output. Once the passphrase is
   hashed, the output data from the multiple hashes is concatenated,
   first hash leftmost, to produce the key data, with any excess octets
   on the right discarded.
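   The Simple S2K construction above can be sketched in C as follows.
   This is an illustration under stated assumptions, not a reference
   implementation: toy_hash is a stand-in (32-bit FNV-1a, which is NOT
   cryptographic) used only so that the sketch is self-contained, and
   the names simple_s2k, TOY_HASH_LEN, and MAX_INPUT are hypothetical.
   Each "preloaded context" is modelled by hashing the zero octets
   followed by the passphrase in one call, which gives the same result
   as feeding the passphrase to a context that was preloaded earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_HASH_LEN 4   /* output size of the stand-in hash, in octets */
#define MAX_INPUT    256 /* arbitrary input bound chosen for this sketch */

/* Stand-in hash (32-bit FNV-1a). NOT cryptographic; a real implementation
 * would use the hash algorithm named in octet 1 of the specifier. */
static void toy_hash(const uint8_t *in, size_t len, uint8_t out[TOY_HASH_LEN])
{
    uint32_t h = 2166136261u;
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= in[i];
        h *= 16777619u;
    }
    out[0] = (uint8_t)(h >> 24);
    out[1] = (uint8_t)(h >> 16);
    out[2] = (uint8_t)(h >> 8);
    out[3] = (uint8_t)h;
}

/* Derive key_len octets from a passphrase using the Simple S2K layout. */
static void simple_s2k(const uint8_t *pass, size_t pass_len,
                       uint8_t *key, size_t key_len)
{
    uint8_t input[MAX_INPUT];
    uint8_t digest[TOY_HASH_LEN];
    size_t produced = 0;
    size_t preload;

    /* Context number `preload` is preloaded with `preload` zero octets. */
    for (preload = 0; produced < key_len; preload++) {
        size_t take = key_len - produced;

        assert(preload + pass_len <= MAX_INPUT);
        memset(input, 0, preload);
        memcpy(input + preload, pass, pass_len);
        toy_hash(input, preload + pass_len, digest);

        /* Concatenate outputs, first hash leftmost; excess octets on the
         * right are discarded. */
        if (take > TOY_HASH_LEN)
            take = TOY_HASH_LEN;
        memcpy(key + produced, digest, take);
        produced += take;
    }
}
```

   With a 4-octet hash and a 10-octet key, three contexts are used and
   only the first two octets of the third hash output are kept.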
3.6.1.2. Salted S2K

   This includes a "salt" value in the S2K specifier -- some arbitrary
   data -- that gets hashed along with the passphrase string, to help
   prevent dictionary attacks.

       Octet 0:        0x01
       Octet 1:        hash algorithm
       Octets 2-9:     8-octet salt value

   Salted S2K is exactly like Simple S2K, except that the input to the
   hash function(s) consists of the 8 octets of salt from the S2K
   specifier, followed by the passphrase.

3.6.1.3. Iterated and Salted S2K

   This includes both a salt and an octet count. The salt is combined
   with the passphrase and the resulting value is hashed repeatedly.
   This further increases the amount of work an attacker must do to try
   dictionary attacks.

       Octet  0:        0x03
       Octet  1:        hash algorithm
       Octets 2-9:      8-octet salt value
       Octet  10:       count, a one-octet, coded value

   The count is coded into a one-octet number using the following
   formula:

       #define EXPBIAS 6
           count = ((Int32)16 + (c & 15)) << ((c >> 4) + EXPBIAS);

   The above formula is in C, where "Int32" is a type for a 32-bit
   integer, and the variable "c" is the coded count, Octet 10.

   Iterated-Salted S2K hashes the passphrase and salt data multiple
   times. The total number of octets to be hashed is specified in the
   encoded count in the S2K specifier. Note that the resulting count
   value is an octet count of how many octets will be hashed, not an
   iteration count.

   Initially, one or more hash contexts are set up as with the other S2K
   algorithms, depending on how many octets of key data are needed.
   Then the salt, followed by the passphrase data is repeatedly hashed
   until the number of octets specified by the octet count has been
   hashed. The one exception is that if the octet count is less than
   the size of the salt plus passphrase, the full salt plus passphrase
   will be hashed even though that is greater than the octet count.
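   The count formula and the exception described above can be checked
   with a short C sketch. The function names are hypothetical, and
   uint32_t from <stdint.h> is used here in place of the "Int32" type
   named in the formula.

```c
#include <assert.h>
#include <stdint.h>

#define EXPBIAS 6

/* Decode the coded count (Octet 10) into the number of octets to hash. */
static uint32_t s2k_decode_count(uint8_t c)
{
    return ((uint32_t)16 + (c & 15)) << ((c >> 4) + EXPBIAS);
}

/* Octets actually hashed: never less than one full salt plus passphrase. */
static uint32_t s2k_octets_hashed(uint8_t c, uint32_t passphrase_len)
{
    uint32_t count = s2k_decode_count(c);
    uint32_t one_pass;

    assert(passphrase_len <= UINT32_MAX - 8);
    one_pass = 8 + passphrase_len; /* 8-octet salt followed by passphrase */
    return count < one_pass ? one_pass : count;
}
```

   Under this formula the coded value 0x00 decodes to 1024 octets, 0x60
   to 65536 octets, and 0xFF to 65011712 octets, which is the largest
   count that can be expressed.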
   After the hashing is done the data is unloaded from the hash
   context(s) as with the other S2K algorithms.

3.6.2. String-to-key usage

   Implementations SHOULD use salted or iterated-and-salted S2K
   specifiers, as simple S2K specifiers are more vulnerable to
   dictionary attacks.

3.6.2.1. Secret key encryption

   An S2K specifier can be stored in the secret keyring to specify how
   to convert the passphrase to a key that unlocks the secret data.
   Older versions of PGP just stored a cipher algorithm octet preceding
   the secret data or a zero to indicate that the secret data was
   unencrypted. The MD5 hash function was always used to convert the
   passphrase to a key for the specified cipher algorithm.

   For compatibility, when an S2K specifier is used, the special value
   255 is stored in the position where the hash algorithm octet would
   have been in the old data structure. This is then followed
   immediately by a one-octet algorithm identifier, and then by the S2K
   specifier as encoded above.

   Therefore, preceding the secret data there will be one of these
   possibilities:

       0:           secret data is unencrypted (no pass phrase)
       255:         followed by algorithm octet and S2K specifier
       Cipher alg:  use Simple S2K algorithm using MD5 hash

   This last possibility, the cipher algorithm number with an implicit
   use of MD5 and IDEA, is provided for backward compatibility; it MAY
   be understood, but SHOULD NOT be generated, and is deprecated.

   These are followed by an 8-octet Initial Vector for the decryption of
   the secret values, if they are encrypted, and then the secret key
   values themselves.

3.6.2.2. Symmetric-key message encryption

   OpenPGP can create a Symmetric-key Encrypted Session Key (ESK) packet
   at the front of a message. This is used to allow S2K specifiers to be
   used for the passphrase conversion or to create messages with a mix
   of symmetric-key ESKs and public-key ESKs. This allows a message to
   be decrypted either with a passphrase or a public key.
   PGP 2.X always used IDEA with Simple string-to-key conversion when
   encrypting a message with a symmetric algorithm. This is deprecated,
   but MAY be used for backward-compatibility.

4. Packet Syntax

   This section describes the packets used by OpenPGP.

4.1. Overview

   An OpenPGP message is constructed from a number of records that are
   traditionally called packets. A packet is a chunk of data that has a
   tag specifying its meaning. An OpenPGP message, keyring, certificate,
   and so forth consists of a number of packets. Some of those packets
   may contain other OpenPGP packets (for example, a compressed data
   packet, when uncompressed, contains OpenPGP packets).

   Each packet consists of a packet header, followed by the packet body.
   The packet header is of variable length.

4.2. Packet Headers

   The first octet of the packet header is called the "Packet Tag." It
   determines the format of the header and denotes the packet contents.
   The remainder of the packet header is the length of the packet.

   Note that the most significant bit is the left-most bit, called bit
   7. A mask for this bit is 0x80 in hexadecimal.

              +---------------+
         PTag |7 6 5 4 3 2 1 0|
              +---------------+
         Bit 7 -- Always one
         Bit 6 -- New packet format if set

   PGP 2.6.x only uses old format packets. Thus, software that
   interoperates with those versions of PGP must only use old format
   packets. If interoperability is not an issue, either format may be
   used. Note that old format packets have four bits of content tags,
   and new format packets have six; some features cannot be used and
   still be backward-compatible.

   Old format packets contain:

         Bits 5-2 -- content tag
         Bits 1-0 -- length-type
   New format packets contain:

         Bits 5-0 -- content tag

4.2.1. Old-Format Packet Lengths

   The meaning of the length-type in old-format packets is:

   0 - The packet has a one-octet length. The header is 2 octets long.

   1 - The packet has a two-octet length. The header is 3 octets long.

   2 - The packet has a four-octet length. The header is 5 octets long.

   3 - The packet is of indeterminate length. The header is 1 octet
       long, and the implementation must determine how long the packet
       is. If the packet is in a file, this means that the packet
       extends until the end of the file. In general, an implementation
       SHOULD NOT use indeterminate length packets except where the end
       of the data will be clear from the context, and even then it is
       better to use a definite length, or a new-format header. The
       new-format headers described below have a mechanism for precisely
       encoding data of indeterminate length.

4.2.2. New-Format Packet Lengths

   New format packets have four possible ways of encoding length:

   1. A one-octet Body Length header encodes packet lengths of up to 191
      octets.

   2. A two-octet Body Length header encodes packet lengths of 192 to
      8383 octets.

   3. A five-octet Body Length header encodes packet lengths of up to
      4,294,967,295 (0xFFFFFFFF) octets in length. (This actually
      encodes a four-octet scalar number.)

   4. When the length of the packet body is not known in advance by the
      issuer, Partial Body Length headers encode a packet of
      indeterminate length, effectively making it a stream.
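   The Packet Tag layout of section 4.2 and the old-format length-types
   of section 4.2.1 can be sketched in C as follows. The structure and
   function names are hypothetical and are not defined by this
   document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ptag {
    int new_format;       /* bit 6 */
    unsigned tag;         /* content tag: 4 bits (old) or 6 bits (new) */
    unsigned length_type; /* bits 1-0; meaningful for old format only */
};

/* Split a Packet Tag octet into its fields. Returns 0 if bit 7 is clear,
 * in which case the octet is not a valid Packet Tag. */
static int parse_ptag(uint8_t octet, struct ptag *out)
{
    if ((octet & 0x80) == 0)
        return 0;

    out->new_format = (octet & 0x40) != 0;
    if (out->new_format) {
        out->tag = octet & 0x3F;         /* bits 5-0 */
        out->length_type = 0;
    } else {
        out->tag = (octet >> 2) & 0x0F;  /* bits 5-2 */
        out->length_type = octet & 0x03; /* bits 1-0 */
    }
    return 1;
}

/* Header size in octets of an old-format packet, from its length-type.
 * Length-type 3 (indeterminate length) has only the Packet Tag octet. */
static size_t old_header_size(unsigned length_type)
{
    static const size_t sizes[4] = {2, 3, 5, 1};

    assert(length_type < 4);
    return sizes[length_type];
}
```

   For example, the octet 0x99 parses as an old-format packet with
   content tag 6 and length-type 1, so its header is 3 octets long.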
4.2.2.1. One-Octet Lengths

   A one-octet Body Length header encodes a length of 0 to 191 octets.
   This type of length header is recognized because the one octet value
   is less than 192. The body length is equal to:

       bodyLen = 1st_octet;

4.2.2.2. Two-Octet Lengths

   A two-octet Body Length header encodes a length of 192 to 8383
   octets. It is recognized because its first octet is in the range 192
   to 223. The body length is equal to:

       bodyLen = ((1st_octet - 192) << 8) + (2nd_octet) + 192

4.2.2.3. Five-Octet Lengths

   A five-octet Body Length header consists of a single octet holding
   the value 255, followed by a four-octet scalar. The body length is
   equal to:

       bodyLen = (2nd_octet << 24) | (3rd_octet << 16) |
                 (4th_octet << 8)  | 5th_octet

4.2.2.4. Partial Body Lengths

   A Partial Body Length header is one octet long and encodes the length
   of only part of the data packet. This length is a power of 2, from 1
   to 1,073,741,824 (2 to the 30th power). It is recognized by its one
   octet value that is greater than or equal to 224, and less than 255.
   The partial body length is equal to:

       partialBodyLen = 1 << (1st_octet & 0x1f);

   Each Partial Body Length header is followed by a portion of the
   packet body data. The Partial Body Length header specifies this
   portion's length. Another length header (of one of the four types --
   one-octet, two-octet, five-octet, or partial) follows that portion.
   The last length header in the packet MUST NOT be a Partial Body
   Length header. Partial Body Length headers may only be used for the
   non-final parts of the packet.

4.2.3. Packet Length Examples

   These examples show ways that new-format packets might encode the
   packet lengths.
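   For reference when reading the examples, the four new-format length
   encodings of section 4.2.2 can be decoded with a C sketch such as
   the following. The structure and function names are hypothetical,
   and the bounds check on the input buffer is an addition of the
   sketch rather than a rule taken from this document.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct body_len {
    uint32_t length;      /* body length, or partial length if partial */
    int partial;          /* 1 for a Partial Body Length header */
    size_t header_octets; /* octets occupied by the length header */
};

/* Decode one new-format length header from buf, of which `avail` octets
 * are readable. Returns 1 on success, 0 if buf is too short. */
static int decode_new_length(const uint8_t *buf, size_t avail,
                             struct body_len *out)
{
    uint8_t first;

    assert(buf != NULL && out != NULL);
    if (avail < 1)
        return 0;

    first = buf[0];
    out->partial = 0;

    if (first < 192) {                    /* one-octet length */
        out->length = first;
        out->header_octets = 1;
    } else if (first < 224) {             /* two-octet length */
        if (avail < 2)
            return 0;
        out->length = ((uint32_t)(first - 192) << 8) + buf[1] + 192;
        out->header_octets = 2;
    } else if (first < 255) {             /* partial body length */
        out->length = (uint32_t)1 << (first & 0x1F);
        out->partial = 1;
        out->header_octets = 1;
    } else {                              /* five-octet length */
        if (avail < 5)
            return 0;
        out->length = ((uint32_t)buf[1] << 24) | ((uint32_t)buf[2] << 16) |
                      ((uint32_t)buf[3] << 8) | (uint32_t)buf[4];
        out->header_octets = 5;
    }
    return 1;
}
```

   A reader of a streamed packet would call such a function repeatedly,
   consuming `length` octets of body after each header, until a header
   with `partial` equal to 0 has been processed.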
   A packet with length 100 may have its length encoded in one octet:
   0x64. This is followed by 100 octets of data.

   A packet with length 1723 may have its length coded in two octets:
   0xC5, 0xFB. This header is followed by the 1723 octets of data.

   A packet with length 100000 may have its length encoded in five
   octets: 0xFF, 0x00, 0x01, 0x86, 0xA0.

   It might also be encoded in the following octet stream: 0xEF, first
   32768 octets of data; 0xE1, next two octets of data; 0xE0, next one
   octet of data; 0xF0, next 65536 octets of data; 0xC5, 0xDD, last 1693
   octets of data. This is just one possible encoding, and many
   variations are possible on the size of the Partial Body Length
   headers, as long as a regular Body Length header encodes the last
   portion of the data. Note also that the last Body Length header can
   be a zero-length header.

   An implementation MAY use Partial Body Lengths for data packets, be
   they literal, compressed, or encrypted. The first partial length MUST
   be at least 512 octets long. Partial Body Lengths MUST NOT be used
   for any other packet types.

   Please note that in all of these explanations, the total length of
   the packet is the length of the header(s) plus the length of the
   body.

4.3. Packet Tags

   The packet tag denotes what type of packet the body holds. Note that
   old format headers can only have tags less than 16, whereas new
   format headers can have tags as great as 63. The defined tags (in
   decimal) are:

       0        -- Reserved - a packet tag must not have this value
       1        -- Public-Key Encrypted Session Key Packet
       2        -- Signature Packet
       3        -- Symmetric-Key Encrypted Session Key Packet
       4        -- One-Pass Signature Packet
       5        -- Secret Key Packet
       6        -- Public Key Packet
       7        -- Secret Subkey Packet
       8        -- Compressed Data Packet
       9        -- Symmetrically Encrypted Data Packet
       10       -- Marker Packet
       11       -- Literal Data Packet
       12       -- Trust Packet
       13       -- User ID Packet
       14       -- Public Subkey Packet
       60 to 63 -- Private or Experimental Values
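   Combining the tag values above with the new-format length rules of
   section 4.2.2, a header for a packet of known length could be built
   as in the following C sketch. The function names are hypothetical.
   Choosing the shortest length encoding that fits is an assumption of
   the sketch; this document permits the longer forms as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write a new-format header (Packet Tag octet plus a definite Body Length
 * header) into out, which must have room for 6 octets. Returns the number
 * of octets written. */
static size_t encode_new_header(unsigned tag, uint32_t body_len,
                                uint8_t out[6])
{
    size_t n = 0;

    assert(tag >= 1 && tag <= 63);         /* tag 0 is reserved */
    out[n++] = (uint8_t)(0x80 | 0x40 | tag); /* bit 7 set, bit 6 = new format */

    if (body_len < 192) {                  /* one-octet length */
        out[n++] = (uint8_t)body_len;
    } else if (body_len <= 8383) {         /* two-octet length */
        uint32_t v = body_len - 192;
        out[n++] = (uint8_t)((v >> 8) + 192);
        out[n++] = (uint8_t)(v & 0xFF);
    } else {                               /* five-octet length */
        out[n++] = 0xFF;
        out[n++] = (uint8_t)(body_len >> 24);
        out[n++] = (uint8_t)(body_len >> 16);
        out[n++] = (uint8_t)(body_len >> 8);
        out[n++] = (uint8_t)body_len;
    }
    return n;
}

/* Old-format headers carry only four tag bits, so only tags below 16 fit. */
static int fits_old_format(unsigned tag)
{
    return tag >= 1 && tag < 16;
}
```

   For a Literal Data Packet (tag 11), the body lengths 100, 1723, and
   100000 yield the same length octets as the examples in section
   4.2.3, each preceded by the Packet Tag octet 0xCB.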