Network Working Group                                          J. Callas
Request for Comments: 4880                               PGP Corporation
Obsoletes: 1991, 2440                                     L. Donnerhacke
Category: Standards Track                                       IKS GmbH
                                                               H. Finney
                                                         PGP Corporation
                                                                 D. Shaw
                                                               R. Thayer
                                                           November 2007

                         OpenPGP Message Format

Status of This Memo

This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited.

Abstract
This document is maintained in order to publish all necessary information needed to develop interoperable applications based on the OpenPGP format. It is not a step-by-step cookbook for writing an application. It describes only the format and methods needed to read, check, generate, and write conforming packets crossing any network. It does not deal with storage and implementation questions. It does, however, discuss implementation issues necessary to avoid security flaws.

OpenPGP software uses a combination of strong public-key and symmetric cryptography to provide security services for electronic communications and data storage. These services include confidentiality, key management, authentication, and digital signatures. This document specifies the message formats used in OpenPGP.
Table of Contents
1. Introduction
   1.1. Terms
2. General functions
   2.1. Confidentiality via Encryption
   2.2. Authentication via Digital Signature
   2.3. Compression
   2.4. Conversion to Radix-64
   2.5. Signature-Only Applications
3. Data Element Formats
   3.1. Scalar Numbers
   3.2. Multiprecision Integers
   3.3. Key IDs
   3.4. Text
   3.5. Time Fields
   3.6. Keyrings
   3.7. String-to-Key (S2K) Specifiers
      3.7.1. String-to-Key (S2K) Specifier Types
         3.7.1.1. Simple S2K
         3.7.1.2. Salted S2K
         3.7.1.3. Iterated and Salted S2K
      3.7.2. String-to-Key Usage
         3.7.2.1. Secret-Key Encryption
         3.7.2.2. Symmetric-Key Message Encryption
4. Packet Syntax
   4.1. Overview
   4.2. Packet Headers
      4.2.1. Old Format Packet Lengths
      4.2.2. New Format Packet Lengths
         4.2.2.1. One-Octet Lengths
         4.2.2.2. Two-Octet Lengths
         4.2.2.3. Five-Octet Lengths
         4.2.2.4. Partial Body Lengths
      4.2.3. Packet Length Examples
   4.3. Packet Tags
5. Packet Types
   5.1. Public-Key Encrypted Session Key Packets (Tag 1)
   5.2. Signature Packet (Tag 2)
      5.2.1. Signature Types
      5.2.2. Version 3 Signature Packet Format
      5.2.3. Version 4 Signature Packet Format
         5.2.3.1. Signature Subpacket Specification
         5.2.3.2. Signature Subpacket Types
         5.2.3.3. Notes on Self-Signatures
         5.2.3.4. Signature Creation Time
         5.2.3.5. Issuer
         5.2.3.6. Key Expiration Time
         5.2.3.7. Preferred Symmetric Algorithms
         5.2.3.8. Preferred Hash Algorithms
         5.2.3.9. Preferred Compression Algorithms
         5.2.3.10. Signature Expiration Time
         5.2.3.11. Exportable Certification
         5.2.3.12. Revocable
         5.2.3.13. Trust Signature
         5.2.3.14. Regular Expression
         5.2.3.15. Revocation Key
         5.2.3.16. Notation Data
         5.2.3.17. Key Server Preferences
         5.2.3.18. Preferred Key Server
         5.2.3.19. Primary User ID
         5.2.3.20. Policy URI
         5.2.3.21. Key Flags
         5.2.3.22. Signer's User ID
         5.2.3.23. Reason for Revocation
         5.2.3.24. Features
         5.2.3.25. Signature Target
         5.2.3.26. Embedded Signature
      5.2.4. Computing Signatures
         5.2.4.1. Subpacket Hints
   5.3. Symmetric-Key Encrypted Session Key Packets (Tag 3)
   5.4. One-Pass Signature Packets (Tag 4)
   5.5. Key Material Packet
      5.5.1. Key Packet Variants
         5.5.1.1. Public-Key Packet (Tag 6)
         5.5.1.2. Public-Subkey Packet (Tag 14)
         5.5.1.3. Secret-Key Packet (Tag 5)
         5.5.1.4. Secret-Subkey Packet (Tag 7)
      5.5.2. Public-Key Packet Formats
      5.5.3. Secret-Key Packet Formats
   5.6. Compressed Data Packet (Tag 8)
   5.7. Symmetrically Encrypted Data Packet (Tag 9)
   5.8. Marker Packet (Obsolete Literal Packet) (Tag 10)
   5.9. Literal Data Packet (Tag 11)
   5.10. Trust Packet (Tag 12)
   5.11. User ID Packet (Tag 13)
   5.12. User Attribute Packet (Tag 17)
      5.12.1. The Image Attribute Subpacket
   5.13. Sym. Encrypted Integrity Protected Data Packet (Tag 18)
   5.14. Modification Detection Code Packet (Tag 19)
6. Radix-64 Conversions
   6.1. An Implementation of the CRC-24 in "C"
   6.2. Forming ASCII Armor
   6.3. Encoding Binary in Radix-64
   6.4. Decoding Radix-64
   6.5. Examples of Radix-64
   6.6. Example of an ASCII Armored Message
7. Cleartext Signature Framework
   7.1. Dash-Escaped Text
8. Regular Expressions
9. Constants
   9.1. Public-Key Algorithms
   9.2. Symmetric-Key Algorithms
   9.3. Compression Algorithms
   9.4. Hash Algorithms
10. IANA Considerations
   10.1. New String-to-Key Specifier Types
   10.2. New Packets
      10.2.1. User Attribute Types
         10.2.1.1. Image Format Subpacket Types
      10.2.2. New Signature Subpackets
         10.2.2.1. Signature Notation Data Subpackets
         10.2.2.2. Key Server Preference Extensions
         10.2.2.3. Key Flags Extensions
         10.2.2.4. Reason For Revocation Extensions
         10.2.2.5. Implementation Features
      10.2.3. New Packet Versions
   10.3. New Algorithms
      10.3.1. Public-Key Algorithms
      10.3.2. Symmetric-Key Algorithms
      10.3.3. Hash Algorithms
      10.3.4. Compression Algorithms
11. Packet Composition
   11.1. Transferable Public Keys
   11.2. Transferable Secret Keys
   11.3. OpenPGP Messages
   11.4. Detached Signatures
12. Enhanced Key Formats
   12.1. Key Structures
   12.2. Key IDs and Fingerprints
13. Notes on Algorithms
   13.1. PKCS#1 Encoding in OpenPGP
      13.1.1. EME-PKCS1-v1_5-ENCODE
      13.1.2. EME-PKCS1-v1_5-DECODE
      13.1.3. EMSA-PKCS1-v1_5
   13.2. Symmetric Algorithm Preferences
   13.3. Other Algorithm Preferences
      13.3.1. Compression Preferences
      13.3.2. Hash Algorithm Preferences
   13.4. Plaintext
   13.5. RSA
   13.6. DSA
   13.7. Elgamal
   13.8. Reserved Algorithm Numbers
   13.9. OpenPGP CFB Mode
   13.10. Private or Experimental Parameters
   13.11. Extension of the MDC System
   13.12. Meta-Considerations for Expansion
14. Security Considerations
15. Implementation Nits
16. References
   16.1. Normative References
   16.2. Informative References

1. Introduction
This document provides information on the message-exchange packet formats used by OpenPGP to provide encryption, decryption, signing, and key management functions. It is a revision of RFC 2440, "OpenPGP Message Format", which itself replaces RFC 1991, "PGP Message Exchange Formats" [RFC1991] [RFC2440].

1.1. Terms
* OpenPGP - This is a term for security software that uses PGP 5.x as a basis, formalized in RFC 2440 and this document.

* PGP - Pretty Good Privacy. PGP is a family of software systems developed by Philip R. Zimmermann on which OpenPGP is based.

* PGP 2.6.x - This version of PGP has many variants, hence the term PGP 2.6.x. It used only RSA, MD5, and IDEA for its cryptographic transforms. An informational RFC, RFC 1991, was written describing this version of PGP.

* PGP 5.x - This version of PGP was formerly known as "PGP 3" in the community and also in the predecessor of this document, RFC 1991. It has new formats and corrects a number of problems in the PGP 2.6.x design. It is referred to here as PGP 5.x because that software was the first release of the "PGP 3" code base.

* GnuPG - GNU Privacy Guard, also called GPG. GnuPG is an OpenPGP implementation that avoids all encumbered algorithms. Consequently, early versions of GnuPG did not include RSA public keys. GnuPG may or may not have (depending on version) support for IDEA or other encumbered algorithms.

"PGP", "Pretty Good", and "Pretty Good Privacy" are trademarks of PGP Corporation and are used with permission. The term "OpenPGP" refers to the protocol described in this and related documents.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

The key words "PRIVATE USE", "HIERARCHICAL ALLOCATION", "FIRST COME FIRST SERVED", "EXPERT REVIEW", "SPECIFICATION REQUIRED", "IESG APPROVAL", "IETF CONSENSUS", and "STANDARDS ACTION" that appear in this document when used to describe namespace allocation are to be interpreted as described in [RFC2434].

2. General functions
OpenPGP provides data integrity services for messages and data files by using these core technologies:

   - digital signatures

   - encryption

   - compression

   - Radix-64 conversion

In addition, OpenPGP provides key management and certificate services, but many of these are beyond the scope of this document.

2.1. Confidentiality via Encryption
OpenPGP combines symmetric-key encryption and public-key encryption to provide confidentiality. When made confidential, first the object is encrypted using a symmetric encryption algorithm. Each symmetric key is used only once, for a single object. A new "session key" is generated as a random number for each object (sometimes referred to as a session). Since it is used only once, the session key is bound to the message and transmitted with it. To protect the key, it is encrypted with the receiver's public key. The sequence is as follows:

   1. The sender creates a message.

   2. The sending OpenPGP generates a random number to be used as a session key for this message only.

   3. The session key is encrypted using each recipient's public key. These "encrypted session keys" start the message.

   4. The sending OpenPGP encrypts the message using the session key, which forms the remainder of the message. Note that the message is also usually compressed.

   5. The receiving OpenPGP decrypts the session key using the recipient's private key.

   6. The receiving OpenPGP decrypts the message using the session key. If the message was compressed, it will be decompressed.

With symmetric-key encryption, an object may be encrypted with a symmetric key derived from a passphrase (or other shared secret), or a two-stage mechanism similar to the public-key method described above in which a session key is itself encrypted with a symmetric algorithm keyed from a shared secret.

Both digital signature and confidentiality services may be applied to the same message. First, a signature is generated for the message and attached to the message. Then the message plus signature is encrypted using a symmetric session key. Finally, the session key is encrypted using public-key encryption and prefixed to the encrypted block.

2.2. Authentication via Digital Signature
The digital signature uses a hash code or message digest algorithm, and a public-key signature algorithm. The sequence is as follows:

   1. The sender creates a message.

   2. The sending software generates a hash code of the message.

   3. The sending software generates a signature from the hash code using the sender's private key.

   4. The binary signature is attached to the message.

   5. The receiving software keeps a copy of the message signature.

   6. The receiving software generates a new hash code for the received message and verifies it using the message's signature. If the verification is successful, the message is accepted as authentic.

2.3. Compression
OpenPGP implementations SHOULD compress the message after applying the signature but before encryption.
If an implementation does not implement compression, its authors should be aware that most OpenPGP messages in the world are compressed. Thus, it may even be wise for a space-constrained implementation to implement decompression, but not compression.

Furthermore, compression has the added side effect that some types of attacks can be thwarted by the fact that slightly altered, compressed data rarely uncompresses without severe errors. This is hardly rigorous, but it is operationally useful. These attacks can be rigorously prevented by implementing and using Modification Detection Codes as described in the sections that follow.

2.4. Conversion to Radix-64
OpenPGP's underlying native representation for encrypted messages, signature certificates, and keys is a stream of arbitrary octets. Some systems only permit the use of blocks consisting of seven-bit, printable text. For transporting OpenPGP's native raw binary octets through channels that are not safe for raw binary data, a printable encoding of these binary octets is needed. OpenPGP provides the service of converting the raw 8-bit binary octet stream to a stream of printable ASCII characters, called Radix-64 encoding or ASCII Armor.

Implementations SHOULD provide Radix-64 conversions.

2.5. Signature-Only Applications
OpenPGP is designed for applications that use both encryption and signatures, but there are a number of problems that are solved by a signature-only implementation. Although this specification requires both encryption and signatures, it is reasonable for there to be subset implementations that are non-conformant only in that they omit encryption.

3. Data Element Formats
This section describes the data elements used by OpenPGP.

3.1. Scalar Numbers
Scalar numbers are unsigned and are always stored in big-endian format. Using n[k] to refer to the kth octet being interpreted, the value of a two-octet scalar is ((n[0] << 8) + n[1]). The value of a four-octet scalar is ((n[0] << 24) + (n[1] << 16) + (n[2] << 8) + n[3]).
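The two formulas above translate directly into C. The following is a minimal sketch, not part of the specification; the function names read_scalar16 and read_scalar32 are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a two-octet big-endian scalar: (n[0] << 8) + n[1]. */
uint16_t read_scalar16(const uint8_t *n)
{
    assert(n != NULL);
    return (uint16_t)(((uint16_t)n[0] << 8) | n[1]);
}

/* Read a four-octet big-endian scalar:
 * (n[0] << 24) + (n[1] << 16) + (n[2] << 8) + n[3]. */
uint32_t read_scalar32(const uint8_t *n)
{
    assert(n != NULL);
    return ((uint32_t)n[0] << 24) | ((uint32_t)n[1] << 16) |
           ((uint32_t)n[2] << 8)  |  (uint32_t)n[3];
}
```

Each octet is widened to an unsigned type before shifting, so the result does not depend on the host byte order or on the width of int.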
3.2. Multiprecision Integers
Multiprecision integers (also called MPIs) are unsigned integers used to hold large integers such as the ones used in cryptographic calculations.

An MPI consists of two pieces: a two-octet scalar that is the length of the MPI in bits followed by a string of octets that contain the actual integer.

These octets form a big-endian number; a big-endian number can be made into an MPI by prefixing it with the appropriate length.

Examples:

(all numbers are in hexadecimal)

The string of octets [00 01 01] forms an MPI with the value 1. The string [00 09 01 FF] forms an MPI with the value of 511.

Additional rules:

The size of an MPI is ((MPI.length + 7) / 8) + 2 octets.

The length field of an MPI describes the length starting from its most significant non-zero bit. Thus, the MPI [00 02 01] is not formed correctly. It should be [00 01 01].

Unused bits of an MPI MUST be zero.

Also note that when an MPI is encrypted, the length refers to the plaintext MPI. It may be ill-formed in its ciphertext.

3.3. Key IDs
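As an illustration of the MPI rules in Section 3.2, the sketch below computes the encoded size of an MPI and checks that the declared bit length agrees with the most significant set bit. The names mpi_encoded_size and mpi_check are hypothetical, and accepting a zero bit length as a header-only MPI is an assumption of this sketch that the text above does not address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Octets occupied by an MPI, header included: ((bits + 7) / 8) + 2. */
size_t mpi_encoded_size(uint16_t bit_length)
{
    return ((size_t)bit_length + 7) / 8 + 2;
}

/* Check that buf[0..len) starts with a well-formed plaintext MPI.
 * Returns the encoded size in octets, or 0 if the MPI is truncated or
 * its length field does not start at the most significant set bit. */
size_t mpi_check(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 2)
        return 0;
    uint16_t bits = (uint16_t)((buf[0] << 8) | buf[1]);
    size_t size = mpi_encoded_size(bits);
    if (len < size)
        return 0;
    if (bits == 0)
        return size;  /* assumption: a zero-length MPI is header only */
    /* The first value octet carries ((bits - 1) % 8) + 1 significant
     * bits. The highest of them must be set, and the unused bits above
     * it must be zero. */
    unsigned top_bits = (unsigned)((bits - 1) % 8) + 1;
    if ((buf[2] >> (top_bits - 1)) != 1)
        return 0;
    return size;
}
```

With the examples from Section 3.2, [00 01 01] and [00 09 01 FF] are accepted with sizes of 3 and 4 octets, while [00 02 01] is rejected.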
A Key ID is an eight-octet scalar that identifies a key. Implementations SHOULD NOT assume that Key IDs are unique. The section "Enhanced Key Formats" below describes how Key IDs are formed.

3.4. Text
Unless otherwise specified, the character set for text is the UTF-8 [RFC3629] encoding of Unicode [ISO10646].
3.5. Time Fields
A time field is an unsigned four-octet number containing the number of seconds elapsed since midnight, 1 January 1970 UTC.

3.6. Keyrings
A keyring is a collection of one or more keys in a file or database. Traditionally, a keyring is simply a sequential list of keys, but may be any suitable database. It is beyond the scope of this standard to discuss the details of keyrings or other databases.

3.7. String-to-Key (S2K) Specifiers
String-to-key (S2K) specifiers are used to convert passphrase strings into symmetric-key encryption/decryption keys. They are used in two places, currently: to encrypt the secret part of private keys in the private keyring, and to convert passphrases to encryption keys for symmetrically encrypted messages.

3.7.1. String-to-Key (S2K) Specifier Types
There are three types of S2K specifiers currently supported, and some reserved values:

   ID          S2K Type
   --          --------
   0           Simple S2K
   1           Salted S2K
   2           Reserved value
   3           Iterated and Salted S2K
   100 to 110  Private/Experimental S2K

These are described in Sections 3.7.1.1 - 3.7.1.3.

3.7.1.1. Simple S2K
This directly hashes the string to produce the key data. See below for how this hashing is done.

   Octet 0:        0x00
   Octet 1:        hash algorithm

Simple S2K hashes the passphrase to produce the session key. The manner in which this is done depends on the size of the session key (which will depend on the cipher used) and the size of the hash algorithm's output. If the hash size is greater than the session key size, the high-order (leftmost) octets of the hash are used as the key.

If the hash size is less than the key size, multiple instances of the hash context are created -- enough to produce the required key data. These instances are preloaded with 0, 1, 2, ... octets of zeros (that is to say, the first instance has no preloading, the second gets preloaded with 1 octet of zero, the third is preloaded with two octets of zeros, and so forth).

As the data is hashed, it is given independently to each hash context. Since the contexts have been initialized differently, they will each produce different hash output. Once the passphrase is hashed, the output data from the multiple hashes is concatenated, first hash leftmost, to produce the key data, with any excess octets on the right discarded.

3.7.1.2. Salted S2K
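The Simple S2K derivation of Section 3.7.1.1, including the zero-octet preloading of additional hash contexts, can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: hash_fn is a hypothetical one-shot hash interface, s2k_simple is a hypothetical name, and toy_hash is a non-cryptographic placeholder (4-octet FNV-1a) that stands in for a real OpenPGP hash algorithm only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical one-shot hash interface: digest of data[0..len) to out. */
typedef void (*hash_fn)(const uint8_t *data, size_t len, uint8_t *out);

/* Placeholder digest (FNV-1a, 4 octets). NOT cryptographic and NOT an
 * OpenPGP hash algorithm; a real implementation would use one. */
#define TOY_HASH_LEN 4
void toy_hash(const uint8_t *data, size_t len, uint8_t *out)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    out[0] = (uint8_t)(h >> 24);
    out[1] = (uint8_t)(h >> 16);
    out[2] = (uint8_t)(h >> 8);
    out[3] = (uint8_t)h;
}

/* Simple S2K: context i hashes i zero octets followed by the passphrase.
 * Digests are concatenated, first context leftmost, and excess octets on
 * the right are discarded. Returns 0 on success, -1 on allocation
 * failure. */
int s2k_simple(hash_fn hash, size_t hash_len,
               const uint8_t *pass, size_t pass_len,
               uint8_t *key, size_t key_len)
{
    assert(hash != NULL && hash_len > 0 && key != NULL);
    assert(pass != NULL || pass_len == 0);
    size_t contexts = (key_len + hash_len - 1) / hash_len;
    size_t max_pad = contexts ? contexts - 1 : 0;
    /* Buffer layout: max_pad zero octets, then the passphrase. */
    uint8_t *input = calloc(max_pad + pass_len + 1, 1);
    uint8_t *digest = malloc(hash_len);
    if (input == NULL || digest == NULL) {
        free(input);
        free(digest);
        return -1;
    }
    if (pass_len)
        memcpy(input + max_pad, pass, pass_len);
    for (size_t i = 0; i < contexts; i++) {
        /* Start i octets before the passphrase to pick up i zeros. */
        hash(input + max_pad - i, i + pass_len, digest);
        size_t take = key_len - i * hash_len;
        if (take > hash_len)
            take = hash_len;
        memcpy(key + i * hash_len, digest, take);
    }
    free(input);
    free(digest);
    return 0;
}
```

A one-shot interface is used here for brevity; an implementation with a streaming hash would feed each context incrementally instead of building a buffer.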
This includes a "salt" value in the S2K specifier -- some arbitrary data -- that gets hashed along with the passphrase string, to help prevent dictionary attacks.

   Octet 0:        0x01
   Octet 1:        hash algorithm
   Octets 2-9:     8-octet salt value

Salted S2K is exactly like Simple S2K, except that the input to the hash function(s) consists of the 8 octets of salt from the S2K specifier, followed by the passphrase.

3.7.1.3. Iterated and Salted S2K
This includes both a salt and an octet count. The salt is combined with the passphrase and the resulting value is hashed repeatedly. This further increases the amount of work an attacker must do to try dictionary attacks.

   Octet 0:        0x03
   Octet 1:        hash algorithm
   Octets 2-9:     8-octet salt value
   Octet 10:       count, a one-octet, coded value
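The three specifier layouts (Simple, Salted, and Iterated and Salted) can be read with a single routine. The sketch below is illustrative only: struct s2k_spec and s2k_parse are hypothetical names, the count octet is stored in its coded form, and reserved or private/experimental type IDs are simply rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory form of an S2K specifier. */
struct s2k_spec {
    uint8_t type;         /* 0 simple, 1 salted, 3 iterated and salted */
    uint8_t hash_alg;     /* Octet 1: hash algorithm identifier */
    uint8_t salt[8];      /* Octets 2-9, types 1 and 3 only */
    uint8_t coded_count;  /* Octet 10, type 3 only, still coded */
};

/* Parse a specifier from buf[0..len). Returns the number of octets
 * consumed (2, 10, or 11), or 0 if it is truncated or unsupported. */
size_t s2k_parse(const uint8_t *buf, size_t len, struct s2k_spec *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 2)
        return 0;
    size_t need;
    switch (buf[0]) {
    case 0:  need = 2;  break;
    case 1:  need = 10; break;
    case 3:  need = 11; break;
    default: return 0;  /* reserved (2), private (100-110), unknown */
    }
    if (len < need)
        return 0;
    memset(out, 0, sizeof *out);
    out->type = buf[0];
    out->hash_alg = buf[1];
    if (need >= 10)
        memcpy(out->salt, buf + 2, 8);
    if (need == 11)
        out->coded_count = buf[10];
    return need;
}
```

Returning the consumed length lets a caller continue parsing whatever follows the specifier in the enclosing structure.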
The count is coded into a one-octet number using the following formula:

    #define EXPBIAS 6
        count = ((Int32)16 + (c & 15)) << ((c >> 4) + EXPBIAS);

The above formula is in C, where "Int32" is a type for a 32-bit integer, and the variable "c" is the coded count, Octet 10.

Iterated-Salted S2K hashes the passphrase and salt data multiple times. The total number of octets to be hashed is specified in the encoded count in the S2K specifier. Note that the resulting count value is an octet count of how many octets will be hashed, not an iteration count.

Initially, one or more hash contexts are set up as with the other S2K algorithms, depending on how many octets of key data are needed. Then the salt, followed by the passphrase data, is repeatedly hashed until the number of octets specified by the octet count has been hashed. The one exception is that if the octet count is less than the size of the salt plus passphrase, the full salt plus passphrase will be hashed even though that is greater than the octet count. After the hashing is done, the data is unloaded from the hash context(s) as with the other S2K algorithms.

3.7.2. String-to-Key Usage
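The count formula of Section 3.7.1.3 can be wrapped in a small function, together with the stated exception for short counts. This is a sketch; s2k_decode_count and s2k_octets_hashed are hypothetical names, and uint32_t is used in place of the "Int32" type mentioned with the formula.

```c
#include <stdint.h>

#define EXPBIAS 6

/* Decode the one-octet coded count (Octet 10) into an octet count. */
uint32_t s2k_decode_count(uint8_t c)
{
    return ((uint32_t)16 + (c & 15)) << ((c >> 4) + EXPBIAS);
}

/* Octets actually hashed per context: the decoded count, or the full
 * salt plus passphrase when that is longer than the count. */
uint32_t s2k_octets_hashed(uint8_t c, uint32_t salt_plus_pass_len)
{
    uint32_t count = s2k_decode_count(c);
    return count > salt_plus_pass_len ? count : salt_plus_pass_len;
}
```

Under this formula the coded values 0 and 255 decode to 1024 and 65011712 octets respectively, which bounds the range a one-octet count can express.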
Implementations SHOULD use salted or iterated-and-salted S2K specifiers, as simple S2K specifiers are more vulnerable to dictionary attacks.

3.7.2.1. Secret-Key Encryption
An S2K specifier can be stored in the secret keyring to specify how to convert the passphrase to a key that unlocks the secret data. Older versions of PGP just stored a cipher algorithm octet preceding the secret data or a zero to indicate that the secret data was unencrypted. The MD5 hash function was always used to convert the passphrase to a key for the specified cipher algorithm.

For compatibility, when an S2K specifier is used, the special value 254 or 255 is stored in the position where the hash algorithm octet would have been in the old data structure. This is then followed immediately by a one-octet algorithm identifier, and then by the S2K specifier as encoded above.
Therefore, preceding the secret data there will be one of these possibilities:

   0:           secret data is unencrypted (no passphrase)
   255 or 254:  followed by algorithm octet and S2K specifier
   Cipher alg:  use Simple S2K algorithm using MD5 hash

This last possibility, the cipher algorithm number with an implicit use of MD5 and IDEA, is provided for backward compatibility; it MAY be understood, but SHOULD NOT be generated, and is deprecated.

These are followed by an Initial Vector of the same length as the block size of the cipher for the decryption of the secret values, if they are encrypted, and then the secret-key values themselves.

3.7.2.2. Symmetric-Key Message Encryption
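The three possibilities listed in Section 3.7.2.1 for the octet preceding the secret data amount to a simple dispatch. The following sketch shows one way to express it; the enum s2k_usage and the function s2k_classify_usage are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* Hypothetical classification of the octet preceding the secret data. */
enum s2k_usage {
    S2K_USAGE_UNENCRYPTED,   /* 0: no passphrase */
    S2K_USAGE_SPECIFIER,     /* 254 or 255: algorithm octet and S2K
                                specifier follow */
    S2K_USAGE_LEGACY_CIPHER  /* other: cipher algorithm number, Simple
                                S2K with MD5 (deprecated) */
};

enum s2k_usage s2k_classify_usage(uint8_t octet)
{
    if (octet == 0)
        return S2K_USAGE_UNENCRYPTED;
    if (octet == 254 || octet == 255)
        return S2K_USAGE_SPECIFIER;
    return S2K_USAGE_LEGACY_CIPHER;
}
```

An implementation following the text would accept the legacy form when reading but would not generate it.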
OpenPGP can create a Symmetric-key Encrypted Session Key (ESK) packet at the front of a message. This is used to allow S2K specifiers to be used for the passphrase conversion or to create messages with a mix of symmetric-key ESKs and public-key ESKs. This allows a message to be decrypted either with a passphrase or a public-key pair.

PGP 2.X always used IDEA with Simple string-to-key conversion when encrypting a message with a symmetric algorithm. This is deprecated, but MAY be used for backward-compatibility.