RFC 2440

OpenPGP Message Format

Pages: 65
Obsoleted by: 4880

Part 3 of 3 – Pages 41 to 65


6. Radix-64 Conversions

   As stated in the introduction, OpenPGP's underlying native
   representation for objects is a stream of arbitrary octets, and some
   systems desire these objects to be immune to damage caused by
   character set translation, data conversions, etc.

   In principle, any printable encoding scheme that met the requirements
   of the unsafe channel would suffice, since it would not change the
   underlying binary bit streams of the native OpenPGP data structures.
   The OpenPGP standard specifies one such printable encoding scheme to
   ensure interoperability.

   OpenPGP's Radix-64 encoding is composed of two parts: a base64
   encoding of the binary data, and a checksum.  The base64 encoding is
   identical to the MIME base64 content-transfer-encoding [RFC2231,
   Section 6.8]. An OpenPGP implementation MAY use ASCII Armor to
   protect the raw binary data.

   The checksum is a 24-bit CRC converted to four characters of radix-64
   encoding by the same MIME base64 transformation, preceded by an
   equals sign (=).  The CRC is computed by using the generator 0x864CFB
   and an initialization of 0xB704CE.  The accumulation is done on the
   data before it is converted to radix-64, rather than on the converted
   data.  A sample implementation of this algorithm is in the next
   section.

   The checksum with its leading equal sign MAY appear on the first line
   after the Base64 encoded data.
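
   As a minimal sketch of the conversion described above (not part of
   the specification), the following C fragment turns an already
   computed 24-bit CRC value into the five-character checksum string.
   The function name crc24_to_armor and the table name R64 are
   hypothetical.

```c
#include <stddef.h>

/* Radix-64 alphabet, identical to MIME base64. */
static const char R64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper: write '=' followed by the four radix-64
 * characters of a 24-bit CRC into out (6 bytes, NUL-terminated). */
void crc24_to_armor(unsigned long crc, char out[6])
{
    out[0] = '=';
    out[1] = R64[(crc >> 18) & 0x3f];   /* most significant 6 bits */
    out[2] = R64[(crc >> 12) & 0x3f];
    out[3] = R64[(crc >> 6) & 0x3f];
    out[4] = R64[crc & 0x3f];           /* least significant 6 bits */
    out[5] = '\0';
}
```

   For instance, the CRC value 0x9E350D is written as "=njUN", which
   is the checksum line that appears in the example of section 6.6.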

   Rationale for CRC-24: The size of 24 bits fits evenly into printable
   base64.  The nonzero initialization can detect more errors than a
   zero initialization.


6.1. An Implementation of the CRC-24 in "C"

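       #include <stddef.h>   /* size_t */

       #define CRC24_INIT 0xb704ceL
       #define CRC24_POLY 0x1864cfbL   /* generator, with bit 24 set */

       typedef long crc24;

       /* Compute the CRC-24 of len octets of unencoded binary data. */
       crc24 crc_octets(unsigned char *octets, size_t len)
       {
           crc24 crc = CRC24_INIT;
           int i;

           while (len--) {
               /* Fold the next octet into the top of the register. */
               crc ^= (*octets++) << 16;
               for (i = 0; i < 8; i++) {
                   crc <<= 1;
                   /* Reduce when a bit shifts out of bit 23. */
                   if (crc & 0x1000000)
                       crc ^= CRC24_POLY;
               }
           }
           return crc & 0xffffffL;
       }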

6.2. Forming ASCII Armor

   When OpenPGP encodes data into ASCII Armor, it puts specific headers
   around the data, so OpenPGP can reconstruct the data later. OpenPGP
   informs the user what kind of data is encoded in the ASCII armor
   through the use of the headers.

   Concatenating the following data creates ASCII Armor:

     - An Armor Header Line, appropriate for the type of data

     - Armor Headers

     - A blank (zero-length, or containing only whitespace) line

     - The ASCII-Armored data

     - An Armor Checksum

     - The Armor Tail, which depends on the Armor Header Line.

   An Armor Header Line consists of the appropriate header line text
   surrounded by five (5) dashes ('-', 0x2D) on either side of the
   header line text.  The header line text is chosen based upon the type
   of data that is being encoded in Armor, and how it is being encoded.
   Header line texts include the following strings:


   BEGIN PGP MESSAGE
       Used for signed, encrypted, or compressed files.

   BEGIN PGP PUBLIC KEY BLOCK
       Used for armoring public keys.

   BEGIN PGP PRIVATE KEY BLOCK
       Used for armoring private keys.

   BEGIN PGP MESSAGE, PART X/Y
       Used for multi-part messages, where the armor is split amongst Y
       parts, and this is the Xth part out of Y.

   BEGIN PGP MESSAGE, PART X
       Used for multi-part messages, where this is the Xth part of an
       unspecified number of parts. Requires the MESSAGE-ID Armor Header
       to be used.

   BEGIN PGP SIGNATURE
       Used for detached signatures, OpenPGP/MIME signatures, and
       signatures following clearsigned messages. Note that PGP 2.x
       uses BEGIN PGP MESSAGE for detached signatures.
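
   The construction of an Armor Header Line can be sketched as
   follows.  This is an illustration only; the function name
   armor_line is hypothetical.  Since the Armor Tail Line is framed
   in the same way, the same helper serves for a text such as "END
   PGP MESSAGE".

```c
#include <stdio.h>

/* Hypothetical helper: surround a header line text such as
 * "BEGIN PGP MESSAGE" with five dashes on either side.
 * Returns 0 on success, -1 if out (capacity cap) is too small. */
int armor_line(const char *text, char *out, size_t cap)
{
    int n = snprintf(out, cap, "-----%s-----", text);

    if (n < 0 || (size_t)n >= cap)
        return -1;
    return 0;
}
```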

   The Armor Headers are pairs of strings that can give the user or the
   receiving OpenPGP implementation some information about how to decode
   or use the message.  The Armor Headers are a part of the armor, not a
   part of the message, and hence are not protected by any signatures
   applied to the message.

   The format of an Armor Header is that of a key-value pair.  A colon
   (':' 0x3A) and a single space (0x20) separate the key and value.
   OpenPGP should consider improperly formatted Armor Headers to be
   corruption of the ASCII Armor.  Unknown keys should be reported to
   the user, but OpenPGP should continue to process the message.
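
   A minimal sketch of splitting one Armor Header into its key and
   value, assuming the line has already been stripped of its line
   ending.  The function name parse_armor_header and its return
   convention are hypothetical; a return of -1 marks a line that a
   caller would treat as corrupt armor.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split "Key: Value" at the first colon-space.
 * On success, copies the key into key (capacity keycap), stores a
 * pointer to the value in *value, and returns 0.  Returns -1 for an
 * improperly formatted header. */
int parse_armor_header(const char *line, char *key, size_t keycap,
                       const char **value)
{
    const char *sep = strstr(line, ": ");
    size_t klen;

    if (sep == NULL || sep == line)
        return -1;                 /* no separator, or empty key */
    klen = (size_t)(sep - line);
    if (klen >= keycap)
        return -1;                 /* key does not fit */
    memcpy(key, line, klen);
    key[klen] = '\0';
    *value = sep + 2;              /* skip the colon and the space */
    return 0;
}
```

   Whether the key is one of the defined Armor Header Keys is left to
   the caller, since unknown keys are reported but do not stop
   processing.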

   Currently defined Armor Header Keys are:

     - "Version", that states the OpenPGP Version used to encode the
       message.

     - "Comment", a user-defined comment.

     - "MessageID", a 32-character string of printable characters.  The
       string must be the same for all parts of a multi-part message
       that uses the "PART X" Armor Header.  MessageID strings should be
       unique enough that the recipient of the mail can associate all
       the parts of a message with each other. A good checksum or
       cryptographic hash function is sufficient.

     - "Hash", a comma-separated list of hash algorithms used in this
       message. This is used only in clear-signed messages.

     - "Charset", a description of the character set that the plaintext
       is in. Please note that OpenPGP defines text to be in UTF-8 by
       default. An implementation will get best results by translating
       into and out of UTF-8. However, there are many instances where
       this is easier said than done. Also, there are communities of
       users who have no need for UTF-8 because they are all happy with
       a character set like ISO Latin-5 or a Japanese character set. In
       such instances, an implementation MAY override the UTF-8 default
       by using this header key. An implementation MAY implement this
       key and any translations it cares to; an implementation MAY
       ignore it and assume all text is UTF-8.

       The MessageID SHOULD NOT appear unless it is in a multi-part
       message. If it appears at all, it MUST be computed from the
       finished (encrypted, signed, etc.) message in a deterministic
       fashion, rather than contain a purely random value.  This is to
       allow the legitimate recipient to determine that the MessageID
       cannot serve as a covert means of leaking cryptographic key
       information.

   The Armor Tail Line is composed in the same manner as the Armor
   Header Line, except the string "BEGIN" is replaced by the string
   "END".

6.3. Encoding Binary in Radix-64

   The encoding process represents 24-bit groups of input bits as output
   strings of 4 encoded characters. Proceeding from left to right, a
   24-bit input group is formed by concatenating three 8-bit input
   groups. These 24 bits are then treated as four concatenated 6-bit
   groups, each of which is translated into a single digit in the
   Radix-64 alphabet. When encoding a bit stream with the Radix-64
   encoding, the bit stream must be presumed to be ordered with the
   most-significant-bit first. That is, the first bit in the stream will
   be the high-order bit in the first 8-bit octet, and the eighth bit
   will be the low-order bit in the first 8-bit octet, and so on.


         +--first octet--+-second octet--+--third octet--+
         |7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|7 6 5 4 3 2 1 0|
         +-----------+---+-------+-------+---+-----------+
         |5 4 3 2 1 0|5 4 3 2 1 0|5 4 3 2 1 0|5 4 3 2 1 0|
         +--1.index--+--2.index--+--3.index--+--4.index--+

   Each 6-bit group is used as an index into an array of 64 printable
   characters from the table below. The character referenced by the
   index is placed in the output string.

     Value Encoding  Value Encoding  Value Encoding  Value Encoding
         0 A            17 R            34 i            51 z
         1 B            18 S            35 j            52 0
         2 C            19 T            36 k            53 1
         3 D            20 U            37 l            54 2
         4 E            21 V            38 m            55 3
         5 F            22 W            39 n            56 4
         6 G            23 X            40 o            57 5
         7 H            24 Y            41 p            58 6
         8 I            25 Z            42 q            59 7
         9 J            26 a            43 r            60 8
        10 K            27 b            44 s            61 9
        11 L            28 c            45 t            62 +
        12 M            29 d            46 u            63 /
        13 N            30 e            47 v
        14 O            31 f            48 w         (pad) =
        15 P            32 g            49 x
        16 Q            33 h            50 y

   The encoded output stream must be represented in lines of no more
   than 76 characters each.

   Special processing is performed if fewer than 24 bits are available
   at the end of the data being encoded. There are three possibilities:

    1. The last data group has 24 bits (3 octets). No special
       processing is needed.

    2. The last data group has 16 bits (2 octets). The first two 6-bit
       groups are processed as above. The third (incomplete) data group
       has two zero-value bits added to it, and is processed as above.
       A pad character (=) is added to the output.

    3. The last data group has 8 bits (1 octet). The first 6-bit group
       is processed as above. The second (incomplete) data group has
       four zero-value bits added to it, and is processed as above. Two
       pad characters (=) are added to the output.
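
   The encoding steps and the three end-of-data cases can be sketched
   in C as follows.  This is an illustration under stated
   assumptions, not a reference implementation: the names
   radix64_encode and R64 are hypothetical, and line breaking and the
   checksum are left to the caller.

```c
#include <stddef.h>

static const char R64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper: encode len octets from in as radix-64 into
 * out, which must hold 4 * ((len + 2) / 3) + 1 bytes. */
void radix64_encode(const unsigned char *in, size_t len, char *out)
{
    size_t i;

    for (i = 0; i + 2 < len; i += 3) {       /* full 24-bit groups */
        unsigned long g = ((unsigned long)in[i] << 16) |
                          ((unsigned long)in[i + 1] << 8) | in[i + 2];
        *out++ = R64[(g >> 18) & 0x3f];
        *out++ = R64[(g >> 12) & 0x3f];
        *out++ = R64[(g >> 6) & 0x3f];
        *out++ = R64[g & 0x3f];
    }
    if (len - i == 2) {                      /* 16 bits remain */
        unsigned long g = ((unsigned long)in[i] << 16) |
                          ((unsigned long)in[i + 1] << 8);
        *out++ = R64[(g >> 18) & 0x3f];
        *out++ = R64[(g >> 12) & 0x3f];
        *out++ = R64[(g >> 6) & 0x3f];       /* two zero bits added */
        *out++ = '=';
    } else if (len - i == 1) {               /* 8 bits remain */
        unsigned long g = (unsigned long)in[i] << 16;
        *out++ = R64[(g >> 18) & 0x3f];
        *out++ = R64[(g >> 12) & 0x3f];      /* four zero bits added */
        *out++ = '=';
        *out++ = '=';
    }
    *out = '\0';
}
```

   Applied to the four octets 0x14 0xfb 0x9c 0x03, this sketch yields
   "FPucAw==".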


6.4. Decoding Radix-64

   Any characters outside of the base64 alphabet are ignored in Radix-64
   data. Decoding software must ignore all line breaks or other
   characters not found in the table above.

   In Radix-64 data, characters other than those in the table, line
   breaks, and other white space probably indicate a transmission error,
   about which a warning message or even a message rejection might be
   appropriate under some circumstances.

   Because it is used only for padding at the end of the data, the
   occurrence of any "=" characters may be taken as evidence that the
   end of the data has been reached (without truncation in transit). No
   such assurance is possible, however, when the number of octets
   transmitted was a multiple of three and no "=" characters are
   present.
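
   A minimal decoding sketch, assuming the input is a NUL-terminated
   string holding only the Radix-64 data (the Armor Header Line,
   Armor Headers, and Armor Tail already removed).  The names
   radix64_decode and R64 are hypothetical.  The sketch silently
   skips characters outside the alphabet and stops at the first "=";
   it does not issue the warnings discussed above.

```c
#include <stddef.h>
#include <string.h>

static const char R64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical helper: decode radix-64 text into out and return the
 * number of octets written.  out must hold at least
 * 3 * strlen(in) / 4 bytes. */
size_t radix64_decode(const char *in, unsigned char *out)
{
    unsigned long acc = 0;   /* bit accumulator */
    int bits = 0;            /* number of undelivered bits in acc */
    size_t n = 0;

    for (; *in != '\0' && *in != '='; in++) {
        const char *p = strchr(R64, *in);
        if (p == NULL)
            continue;                    /* not in the alphabet */
        acc = (acc << 6) | (unsigned long)(p - R64);
        bits += 6;
        if (bits >= 8) {                 /* a full octet is ready */
            bits -= 8;
            out[n++] = (unsigned char)((acc >> bits) & 0xff);
        }
    }
    return n;
}
```

   Because decoding stops at the first "=", the sketch also stops at
   an Armor Checksum line when the data itself carries no padding.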

6.5. Examples of Radix-64

       Input data:  0x14fb9c03d97e
       Hex:     1   4    f   b    9   c     | 0   3    d   9    7   e
       8-bit:   00010100 11111011 10011100  | 00000011 11011001 01111110
       6-bit:   000101 001111 101110 011100 | 000000 111101 100101 111110
       Decimal: 5      15     46     28       0      61     37     62
       Output:  F      P      u      c        A      9      l      +

       Input data:  0x14fb9c03d9
       Hex:     1   4    f   b    9   c     | 0   3    d   9
       8-bit:   00010100 11111011 10011100  | 00000011 11011001
                                                       pad with 00
       6-bit:   000101 001111 101110 011100 | 000000 111101 100100
       Decimal: 5      15     46     28       0      61     36
                                                          pad with =
       Output:  F      P      u      c        A      9      k      =

       Input data:  0x14fb9c03
       Hex:     1   4    f   b    9   c     | 0   3
       8-bit:   00010100 11111011 10011100  | 00000011
                                              pad with 0000
       6-bit:   000101 001111 101110 011100 | 000000 110000
       Decimal: 5      15     46     28       0      48
                                                   pad with =      =
       Output:  F      P      u      c        A      w      =      =


6.6. Example of an ASCII Armored Message


  -----BEGIN PGP MESSAGE-----
  Version: OpenPrivacy 0.99

  yDgBO22WxBHv7O8X7O/jygAEzol56iUKiXmV+XmpCtmpqQUKiQrFqclFqUDBovzS
  vBSFjNSiVHsuAA==
  =njUN
  -----END PGP MESSAGE-----

   Note that this example is indented by two spaces.

7. Cleartext signature framework

   It is desirable to sign a textual octet stream without ASCII armoring
   the stream itself, so the signed text is still readable without
   special software. In order to bind a signature to such a cleartext,
   this framework is used.  (Note that RFC 2015 defines another way to
   clear sign messages for environments that support MIME.)

   The cleartext signed message consists of:

     - The cleartext header '-----BEGIN PGP SIGNED MESSAGE-----' on a
       single line,

     - One or more "Hash" Armor Headers,

     - Exactly one empty line not included into the message digest,

     - The dash-escaped cleartext that is included into the message
       digest,

     - The ASCII armored signature(s) including the '-----BEGIN PGP
       SIGNATURE-----' Armor Header and Armor Tail Lines.

   If the "Hash" armor header is given, the specified message digest
   algorithm is used for the signature. If there are no such headers,
   MD5 is used; an implementation MAY omit them for V2.x compatibility.
   If more than one message digest is used in the signature, the "Hash"
   armor header contains a comma-delimited list of used message digests.

   Current message digest names are described below with the algorithm
   IDs.

7.1. Dash-Escaped Text

   The cleartext content of the message must also be dash-escaped.


   Dash escaped cleartext is the ordinary cleartext where every line
   starting with a dash '-' (0x2D) is prefixed by the sequence dash '-'
   (0x2D) and space ' ' (0x20). This prevents the parser from
   recognizing armor headers of the cleartext itself. The message digest
   is computed using the cleartext itself, not the dash escaped form.
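
   Dash-escaping a single line can be sketched as follows.  The
   function name dash_escape_line is hypothetical, and the line is
   assumed to be passed without its line ending.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: dash-escape one line of cleartext into out.
 * Returns 0 on success, -1 if out (capacity cap) is too small. */
int dash_escape_line(const char *line, char *out, size_t cap)
{
    size_t len = strlen(line);
    size_t extra = (line[0] == '-') ? 2 : 0;   /* room for "- " */

    if (len + extra + 1 > cap)
        return -1;
    if (extra)
        memcpy(out, "- ", 2);                  /* dash and space */
    memcpy(out + extra, line, len + 1);        /* includes the NUL */
    return 0;
}
```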

   As with binary signatures on text documents, a cleartext signature is
   calculated on the text using canonical <CR><LF> line endings.  The
   line ending (i.e. the <CR><LF>) before the '-----BEGIN PGP
   SIGNATURE-----' line that terminates the signed text is not
   considered part of the signed text.

   Also, any trailing whitespace (spaces, and tabs, 0x09) at the end of
   any line is ignored when the cleartext signature is calculated.
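
   The form of a line that enters the message digest can be sketched
   as follows, assuming each line is handed over without its line
   ending and before dash-escaping.  The function name canonical_line
   is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: drop trailing spaces and tabs from line and
 * append a canonical <CR><LF>.  out must hold strlen(line) + 3
 * bytes.  Returns the number of octets to feed to the digest. */
size_t canonical_line(const char *line, char *out)
{
    size_t len = strlen(line);

    while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t'))
        len--;                      /* ignore trailing whitespace */
    memcpy(out, line, len);
    out[len] = '\r';
    out[len + 1] = '\n';
    out[len + 2] = '\0';
    return len + 2;
}
```

   For the last line of the signed text, a caller would feed only the
   first (return value - 2) octets, since the final <CR><LF> is not
   considered part of the signed text.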

8. Regular Expressions

   A regular expression is zero or more branches, separated by '|'. It
   matches anything that matches one of the branches.

   A branch is zero or more pieces, concatenated. It matches a match for
   the first, followed by a match for the second, etc.

   A piece is an atom possibly followed by '*', '+', or '?'. An atom
   followed by '*' matches a sequence of 0 or more matches of the atom.
   An atom followed by '+' matches a sequence of 1 or more matches of
   the atom. An atom followed by '?' matches a match of the atom, or the
   null string.

   An atom is a regular expression in parentheses (matching a match for
   the regular expression), a range (see below), '.' (matching any
   single character), '^' (matching the null string at the beginning of
   the input string), '$' (matching the null string at the end of the
   input string), a '\' followed by a single character (matching that
   character), or a single character with no other significance
   (matching that character).

   A range is a sequence of characters enclosed in '[]'. It normally
   matches any single character from the sequence. If the sequence
   begins with '^', it matches any single character not from the rest of
   the sequence. If two characters in the sequence are separated by '-',
   this is shorthand for the full list of ASCII characters between them
   (e.g. '[0-9]' matches any decimal digit). To include a literal ']' in
   the sequence, make it the first character (following a possible '^').
   To include a literal '-', make it the first or last character.


9. Constants

   This section describes the constants used in OpenPGP.

   Note that these tables are not exhaustive lists; an implementation
   MAY implement an algorithm not on these lists.

   See the section "Notes on Algorithms" below for more discussion of
   the algorithms.

9.1. Public Key Algorithms

       ID           Algorithm
       --           ---------
       1          - RSA (Encrypt or Sign)
       2          - RSA Encrypt-Only
       3          - RSA Sign-Only
       16         - Elgamal (Encrypt-Only), see [ELGAMAL]
       17         - DSA (Digital Signature Standard)
       18         - Reserved for Elliptic Curve
       19         - Reserved for ECDSA
       20         - Elgamal (Encrypt or Sign)
       21         - Reserved for Diffie-Hellman (X9.42,
                    as defined for IETF-S/MIME)
       100 to 110 - Private/Experimental algorithm.

   Implementations MUST implement DSA for signatures, and Elgamal for
   encryption. Implementations SHOULD implement RSA keys.
   Implementations MAY implement any other algorithm.

9.2. Symmetric Key Algorithms

       ID           Algorithm
       --           ---------
       0          - Plaintext or unencrypted data
       1          - IDEA [IDEA]
       2          - Triple-DES (DES-EDE, as per spec -
                    168 bit key derived from 192)
       3          - CAST5 (128 bit key, as per RFC 2144)
       4          - Blowfish (128 bit key, 16 rounds) [BLOWFISH]
       5          - SAFER-SK128 (13 rounds) [SAFER]
       6          - Reserved for DES/SK
       7          - Reserved for AES with 128-bit key
       8          - Reserved for AES with 192-bit key
       9          - Reserved for AES with 256-bit key
       100 to 110 - Private/Experimental algorithm.

   Implementations MUST implement Triple-DES. Implementations SHOULD
   implement IDEA and CAST5. Implementations MAY implement any other
   algorithm.

9.3. Compression Algorithms

       ID           Algorithm
       --           ---------
       0          - Uncompressed
       1          - ZIP (RFC 1951)
       2          - ZLIB (RFC 1950)
       100 to 110 - Private/Experimental algorithm.

   Implementations MUST implement uncompressed data. Implementations
   SHOULD implement ZIP. Implementations MAY implement ZLIB.

9.4. Hash Algorithms

       ID           Algorithm                              Text Name
       --           ---------                              ---- ----
       1          - MD5                                    "MD5"
       2          - SHA-1                                  "SHA1"
       3          - RIPE-MD/160                            "RIPEMD160"
       4          - Reserved for double-width SHA (experimental)
       5          - MD2                                    "MD2"
       6          - Reserved for TIGER/192                 "TIGER192"
       7          - Reserved for HAVAL (5 pass, 160-bit)   "HAVAL-5-160"
       100 to 110 - Private/Experimental algorithm.

   Implementations MUST implement SHA-1. Implementations SHOULD
   implement MD5.
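
   As an illustration of how the text names in the table above might
   be used, the following sketch maps a name found in a "Hash" Armor
   Header to its algorithm ID.  The names hash_names and
   hash_id_from_name are hypothetical, and the comparison is assumed
   to be case-sensitive.

```c
#include <stddef.h>
#include <string.h>

/* Text names and IDs copied from the hash algorithm table. */
static const struct { const char *name; int id; } hash_names[] = {
    { "MD5", 1 }, { "SHA1", 2 }, { "RIPEMD160", 3 }, { "MD2", 5 },
    { "TIGER192", 6 }, { "HAVAL-5-160", 7 }
};

/* Hypothetical helper: return the algorithm ID for a text name, or
 * -1 if the name is not in the table. */
int hash_id_from_name(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof hash_names / sizeof hash_names[0]; i++)
        if (strcmp(hash_names[i].name, name) == 0)
            return hash_names[i].id;
    return -1;
}
```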

10. Packet Composition

   OpenPGP packets are assembled into sequences in order to create
   messages and to transfer keys.  Not all possible packet sequences are
   meaningful and correct.  This section describes the rules for how
   packets should be placed into sequences.

10.1. Transferable Public Keys

   OpenPGP users may transfer public keys. The essential elements of a
   transferable public key are:


     - One Public Key packet

     - Zero or more revocation signatures

     - One or more User ID packets

     - After each User ID packet, zero or more signature packets
       (certifications)

     - Zero or more Subkey packets

     - After each Subkey packet, one signature packet, optionally a
       revocation.

   The Public Key packet occurs first.  Each of the following User ID
   packets provides the identity of the owner of this public key.  If
   there are multiple User ID packets, this corresponds to multiple
   means of identifying the same unique individual user; for example, a
   user may have more than one email address, and construct a User ID
   for each one.

   Immediately following each User ID packet, there are zero or more
   signature packets. Each signature packet is calculated on the
   immediately preceding User ID packet and the initial Public Key
   packet. The signature serves to certify the corresponding public key
   and user ID.  In effect, the signer is testifying to his or her
   belief that this public key belongs to the user identified by this
   user ID.

   After the User ID packets there may be one or more Subkey packets.
   In general, subkeys are provided in cases where the top-level public
   key is a signature-only key.  However, any V4 key may have subkeys,
   and the subkeys may be encryption-only keys, signature-only keys, or
   general-purpose keys.

   Each Subkey packet must be followed by one Signature packet, which
   should be a subkey binding signature issued by the top level key.

   Subkey and Key packets may each be followed by a revocation Signature
   packet to indicate that the key is revoked.  Revocation signatures
   are only accepted if they are issued by the key itself, or by a key
   that is authorized to issue revocations via a revocation key
   subpacket in a self-signature by the top level key.

   Transferable public key packet sequences may be concatenated to allow
   transferring multiple public keys in one operation.


10.2. OpenPGP Messages

   An OpenPGP message is a packet or sequence of packets that
   corresponds to the following grammatical rules (comma represents
   sequential composition, and vertical bar separates alternatives):

   OpenPGP Message :- Encrypted Message | Signed Message |
                      Compressed Message | Literal Message.

   Compressed Message :- Compressed Data Packet.

   Literal Message :- Literal Data Packet.

   ESK :- Public Key Encrypted Session Key Packet |
          Symmetric-Key Encrypted Session Key Packet.

   ESK Sequence :- ESK | ESK Sequence, ESK.

   Encrypted Message :- Symmetrically Encrypted Data Packet |
               ESK Sequence, Symmetrically Encrypted Data Packet.

   One-Pass Signed Message :- One-Pass Signature Packet,
               OpenPGP Message, Corresponding Signature Packet.

   Signed Message :- Signature Packet, OpenPGP Message |
               One-Pass Signed Message.

   In addition, decrypting a Symmetrically Encrypted Data packet and
   decompressing a Compressed Data packet must yield a valid OpenPGP
   Message.
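
   The grammar above can be checked with a small recursive-descent
   recognizer.  The following is a sketch only: the enumeration pkt
   and the function names are hypothetical, the input is assumed to
   be a list of packet kinds that has already been parsed, and the
   contents of encrypted or compressed packets are not examined.

```c
#include <stddef.h>

/* Hypothetical packet kinds, named after the grammar terminals. */
enum pkt { PKESK, SKESK, SED, SIG, OPS, COMPRESSED, LITERAL };

/* Consume one OpenPGP Message starting at p[*pos]; 1 on success. */
static int parse_message(const enum pkt *p, size_t n, size_t *pos)
{
    if (*pos >= n)
        return 0;
    switch (p[*pos]) {
    case COMPRESSED:
    case LITERAL:
    case SED:                    /* Encrypted Message without ESKs */
        (*pos)++;
        return 1;
    case PKESK:
    case SKESK:                  /* ESK Sequence, then the data */
        while (*pos < n && (p[*pos] == PKESK || p[*pos] == SKESK))
            (*pos)++;
        if (*pos < n && p[*pos] == SED) {
            (*pos)++;
            return 1;
        }
        return 0;
    case SIG:                    /* Signature Packet, OpenPGP Message */
        (*pos)++;
        return parse_message(p, n, pos);
    case OPS:                    /* One-Pass Signed Message */
        (*pos)++;
        if (!parse_message(p, n, pos))
            return 0;
        if (*pos < n && p[*pos] == SIG) {
            (*pos)++;
            return 1;
        }
        return 0;
    }
    return 0;
}

/* Returns 1 if the whole sequence is exactly one OpenPGP Message. */
int is_openpgp_message(const enum pkt *p, size_t n)
{
    size_t pos = 0;
    return parse_message(p, n, &pos) && pos == n;
}
```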

10.3. Detached Signatures

   Some OpenPGP applications use so-called "detached signatures." For
   example, a program bundle may contain a file, and with it a second
   file that is a detached signature of the first file. A detached
   signature is simply a signature packet stored separately from the
   data that it signs.

11. Enhanced Key Formats

11.1. Key Structures

   The format of an OpenPGP V3 key is as follows.  Entries in square
   brackets are optional and ellipses indicate repetition.


           RSA Public Key
              [Revocation Self Signature]
               User ID [Signature ...]
              [User ID [Signature ...] ...]

   Each signature certifies the RSA public key and the preceding user
   ID. The RSA public key can have many user IDs and each user ID can
   have many signatures.

   The format of an OpenPGP V4 key that uses two public keys is similar
   except that the other keys are added to the end as 'subkeys' of the
   primary key.

           Primary-Key
              [Revocation Self Signature]
              [Direct Key Self Signature...]
               User ID [Signature ...]
              [User ID [Signature ...] ...]
              [[Subkey [Binding-Signature-Revocation]
                      Primary-Key-Binding-Signature] ...]

   A subkey always has a single signature after it that is issued using
   the primary key to tie the two keys together.  This binding signature
   may be in either V3 or V4 format, but V4 is preferred, of course.

   In the above diagram, if the binding signature of a subkey has been
   revoked, the revoked binding signature may be removed, leaving only
   one signature.

   In a key that has a main key and subkeys, the primary key MUST be a
   key capable of signing. The subkeys may be keys of any other type.
   There may be other constructions of V4 keys, too. For example, there
   may be a single-key RSA key in V4 format, a DSA primary key with an
   RSA encryption key, or RSA primary key with an Elgamal subkey, etc.

   It is also possible to have a signature-only subkey. This permits a
   primary key that collects certifications (key signatures) but is
   used only for certifying subkeys that are used for encryption and
   signatures.

11.2. Key IDs and Fingerprints

   For a V3 key, the eight-octet key ID consists of the low 64 bits of
   the public modulus of the RSA key.

   The fingerprint of a V3 key is formed by hashing the body (but not
   the two-octet length) of the MPIs that form the key material (public
   modulus n, followed by exponent e) with MD5.


   A V4 fingerprint is the 160-bit SHA-1 hash of the one-octet Packet
   Tag, followed by the two-octet packet length, followed by the entire
   Public Key packet starting with the version field.  The key ID is the
   low order 64 bits of the fingerprint.  Here are the fields of the
   hash material, with the example of a DSA key:

  a.1) 0x99 (1 octet)

  a.2) high order length octet of (b)-(e) (1 octet)

  a.3) low order length octet of (b)-(e) (1 octet)

    b) version number = 4 (1 octet);

    c) time stamp of key creation (4 octets);

    d) algorithm (1 octet): 17 = DSA (example);

    e) Algorithm specific fields.

   Algorithm Specific Fields for DSA keys (example):

  e.1) MPI of DSA prime p;

  e.2) MPI of DSA group order q (q is a prime divisor of p-1);

  e.3) MPI of DSA group generator g;

  e.4) MPI of DSA public key value y (= g**x where x is secret).

   Note that it is possible for there to be collisions of key IDs -- two
   different keys with the same key ID. Note that there is a much
   smaller, but still non-zero probability that two different keys have
   the same fingerprint.

   Also note that if V3 and V4 format keys share the same RSA key
   material, they will have different key ids as well as different
   fingerprints.
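
   Two small pieces of the V4 computation can be sketched without a
   hash implementation: the three leading octets of the hash material
   (fields a.1 to a.3) and the extraction of the key ID from a
   finished fingerprint.  The function names are hypothetical, and
   the SHA-1 computation itself is assumed to be done elsewhere.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: fill in fields a.1 to a.3 for a Public Key
 * packet body of body_len octets.  Returns -1 if body_len does not
 * fit in two octets. */
int v4_hash_prefix(size_t body_len, unsigned char out[3])
{
    if (body_len > 0xffff)
        return -1;
    out[0] = 0x99;
    out[1] = (unsigned char)(body_len >> 8);     /* high order octet */
    out[2] = (unsigned char)(body_len & 0xff);   /* low order octet */
    return 0;
}

/* Hypothetical helper: the V4 key ID is the low order 64 bits, that
 * is, the last eight octets of the 20-octet fingerprint. */
void v4_key_id(const unsigned char fpr[20], unsigned char id[8])
{
    memcpy(id, fpr + 12, 8);
}
```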

12. Notes on Algorithms

12.1. Symmetric Algorithm Preferences

   The symmetric algorithm preference is an ordered list of algorithms
   that the keyholder accepts. Since it is found on a self-signature, a
   keyholder may have several different preference lists, one for each
   self-signature. For example, Alice may have TripleDES only specified
   for "alice@work.com" but CAST5, Blowfish, and TripleDES specified for
   "alice@home.org".


   Note that it is also possible for preferences to be in a subkey's
   binding signature.

   Since TripleDES is the MUST-implement algorithm, if it is not
   explicitly in the list, it is tacitly at the end. However, it is good
   form to place it there explicitly. Note also that if an
   implementation does not implement the preference, then it is
   implicitly a TripleDES-only implementation.

   An implementation MUST NOT use a symmetric algorithm that is not in
   the recipient's preference list. When encrypting to more than one
   recipient, the implementation finds a suitable algorithm by taking
   the intersection of the preferences of the recipients. Note that the
   MUST-implement algorithm, TripleDES, ensures that the intersection is
   not null. The implementation may use any mechanism to pick an
   algorithm in the intersection.
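
   Since any mechanism may be used to pick from the intersection, the
   following is only one possible policy, sketched under stated
   assumptions: each recipient's preferences are given as an array of
   symmetric algorithm IDs, TripleDES (ID 2) is treated as tacitly
   present in every list, and the first recipient's ordering decides.
   The names SYM_3DES, accepts, and pick_symmetric are hypothetical.
   The sketch does not check which algorithms the sender implements.

```c
#include <stddef.h>

#define SYM_3DES 2   /* Triple-DES, the MUST-implement algorithm */

/* Is alg in the list, counting Triple-DES as tacitly present? */
static int accepts(const unsigned char *prefs, size_t len, int alg)
{
    size_t i;

    if (alg == SYM_3DES)
        return 1;
    for (i = 0; i < len; i++)
        if (prefs[i] == alg)
            return 1;
    return 0;
}

/* Hypothetical policy: walk the first recipient's list in order and
 * return the first algorithm that every recipient accepts, falling
 * back to Triple-DES.  prefs[r] has lens[r] entries. */
int pick_symmetric(const unsigned char *const *prefs,
                   const size_t *lens, size_t nrcpt)
{
    size_t i, r;

    if (nrcpt == 0)
        return SYM_3DES;
    for (i = 0; i < lens[0]; i++) {
        for (r = 1; r < nrcpt; r++)
            if (!accepts(prefs[r], lens[r], prefs[0][i]))
                break;
        if (r == nrcpt)
            return prefs[0][i];     /* accepted by all recipients */
    }
    return SYM_3DES;
}
```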

   If an implementation can decrypt a message that a keyholder doesn't
   have in their preferences, the implementation SHOULD decrypt the
   message anyway, but MUST warn the keyholder that the protocol has been
   violated. (For example, suppose that Alice, above, has software that
   implements all algorithms in this specification. Nonetheless, she
   prefers subsets for work or home. If she is sent a message encrypted
   with IDEA, which is not in her preferences, the software warns her
   that someone sent her an IDEA-encrypted message, but it would ideally
   decrypt it anyway.)

   An implementation that is striving for backward compatibility MAY
   consider a V3 key with a V3 self-signature to be an implicit
   preference for IDEA, and no ability to do TripleDES. This is
   technically non-compliant, but an implementation MAY violate the
   above rule in this case only and use IDEA to encrypt the message,
   provided that the message creator is warned. Ideally, though, the
   implementation would follow the rule by actually generating two
   messages, because it is possible that the OpenPGP user's
   implementation does not have IDEA, and thus could not read the
   message. Consequently, an implementation MAY, but SHOULD NOT use IDEA
   in an algorithm conflict with a V3 key.

12.2. Other Algorithm Preferences

   Other algorithm preferences work similarly to the symmetric algorithm
   preference, in that they specify which algorithms the keyholder
   accepts. Two cases deserve further comment, though: the compression
   preferences and the hash preferences.


12.2.1. Compression Preferences

   Compression has been an integral part of PGP since its first days.
   OpenPGP and all previous versions of PGP have offered compression.
   And in this specification, the default is for messages to be
   compressed, although an implementation is not required to do so.
   Consequently, the compression preference gives a way for a keyholder
   to request that messages not be compressed, presumably because they
   are using a minimal implementation that does not include compression.
   Additionally, this gives a keyholder a way to state that it can
   support alternate algorithms.

   Like the algorithm preferences, an implementation MUST NOT use an
   algorithm that is not in the preference vector. If the preferences
   are not present, then they are assumed to be [ZIP(1),
   UNCOMPRESSED(0)].

12.2.2. Hash Algorithm Preferences

   Typically, the choice of a hash algorithm is something the signer
   does, rather than the verifier, because a signer does not typically
   know who is going to be verifying the signature. This preference,
   though, eases negotiation in protocols based upon digital
   signatures.

   Thus, if Alice is authenticating herself to Bob with a signature, it
   makes sense for her to use a hash algorithm that Bob's software uses.
   This preference allows Bob to state in his key which algorithms Alice
   may use.

12.3. Plaintext

   Algorithm 0, "plaintext", may only be used to denote secret keys that
   are stored in the clear. Implementations must not use plaintext in
   Symmetrically Encrypted Data Packets; they must use Literal Data
   Packets to encode unencrypted or literal data.

12.4. RSA

   There are algorithm types for RSA-signature-only, and RSA-encrypt-
   only keys. These types are deprecated. The "key flags" subpacket in a
   signature is a much better way to express the same idea, and
   generalizes it to all algorithms. An implementation SHOULD NOT create
   such a key, but MAY interpret it.

   An implementation SHOULD NOT implement RSA keys of size less than 768
   bits.


   It is permissible for an implementation to support RSA merely for
   backward compatibility; for example, such an implementation would
   support V3 keys with IDEA symmetric cryptography. Note that this is
   an exception to the other MUST-implement rules. An implementation
   that supports RSA in V4 keys MUST implement the MUST-implement
   features.

12.5. Elgamal

   If an Elgamal key is to be used for both signing and encryption,
   extra care must be taken in creating the key.

   An ElGamal key consists of a generator g, a prime modulus p, a secret
   exponent x, and a public value y = g^x mod p.

   The generator and prime must be chosen so that solving the discrete
   log problem is intractable.  The generator g should generate the
   multiplicative group mod p (which has order p-1) or a large subgroup
   of it, and the order of g should have at least one large prime
   factor.  A good choice is to use a "strong" Sophie-Germain prime in
   choosing p, so that both p and (p-1)/2 are primes. In fact, this
   choice is so good that implementors SHOULD do it, as it avoids a
   small subgroup attack.

   In addition, a result of Bleichenbacher [BLEICHENBACHER] shows that
   if the generator g has only small prime factors, and if g divides the
   order of the group it generates, then signatures can be forged.  In
   particular, choosing g=2 is a bad choice if the group order may be
   even. On the other hand, a generator of 2 is a fine choice for an
   encryption-only key, as this will make the encryption faster.

   While verifying Elgamal signatures, note that it is important to test
   that r and s are less than p.  If this test is not done then
   signatures can be trivially forged by using large r values of
   approximately twice the length of p.  This attack is also discussed
   in the Bleichenbacher paper.
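
   The range test described above might be sketched in C as follows.
   This is only an illustration: the names be_cmp and
   elgamal_sig_in_range are hypothetical, and the values are assumed to
   be held as big-endian octet strings, possibly with leading zero
   octets.  It covers only the "r and s are less than p" test from the
   paragraph above; a real verifier would perform it before the
   signature equation is evaluated and may apply further checks.

```c
#include <assert.h>
#include <stddef.h>

/* Compare two big-endian unsigned integers; returns <0, 0 or >0. */
static int be_cmp(const unsigned char *a, size_t alen,
                  const unsigned char *b, size_t blen)
{
    size_t i;

    /* Skip leading zero octets so that lengths are comparable. */
    while (alen > 0 && *a == 0) { a++; alen--; }
    while (blen > 0 && *b == 0) { b++; blen--; }

    if (alen != blen)
        return alen < blen ? -1 : 1;
    for (i = 0; i < alen; i++)
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    return 0;
}

/* Return 1 if both r < p and s < p, otherwise 0. */
int elgamal_sig_in_range(const unsigned char *r, size_t rlen,
                         const unsigned char *s, size_t slen,
                         const unsigned char *p, size_t plen)
{
    assert(p != NULL && plen > 0);

    return be_cmp(r, rlen, p, plen) < 0 &&
           be_cmp(s, slen, p, plen) < 0;
}
```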

   Details on safe use of Elgamal signatures may be found in [MENEZES],
   which discusses all the weaknesses described above.

   If an implementation allows Elgamal signatures, then it MUST use the
   algorithm identifier 20 for an Elgamal public key that can sign.

   An implementation SHOULD NOT implement Elgamal keys of size less than
   768 bits. For long-term security, Elgamal keys should be 1024 bits or
   longer.

12.6. DSA

   An implementation SHOULD NOT implement DSA keys of size less than 768
   bits. Note that present DSA is limited to a maximum of 1024 bit keys,
   which are recommended for long-term use.

12.7. Reserved Algorithm Numbers

   A number of algorithm IDs have been reserved for algorithms that
   would be useful to use in an OpenPGP implementation, yet there are
   issues that prevent an implementor from actually implementing the
   algorithm. These are marked in the Public Algorithms section as
   "(reserved for)".

   The reserved public key algorithms, Elliptic Curve (18), ECDSA (19),
   and X9.42 (21) do not have the necessary parameters, parameter order,
   or semantics defined.

   The reserved symmetric key algorithm, DES/SK (6), does not have
   semantics defined.

   The reserved hash algorithms, TIGER192 (6), and HAVAL-5-160 (7), do
   not have OIDs. The reserved algorithm number 4, reserved for a
   double-width variant of SHA1, is not presently defined.

   We have reserved three algorithm IDs for the US NIST's Advanced
   Encryption Standard. This algorithm will work with (at least) 128,
   192, and 256-bit keys. We expect that this algorithm will be selected
   from the candidate algorithms in the year 2000.
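
   An implementation that wants to refuse the reserved numbers listed
   above could use a predicate along the following lines.  This is a
   sketch only: the enum alg_class and the function
   is_reserved_algorithm are hypothetical names, and the three AES
   identifiers are deliberately left out because their numbers are not
   given in this section.

```c
#include <assert.h>

/* Which registry an algorithm number belongs to (illustrative). */
enum alg_class { ALG_PUBKEY, ALG_SYMMETRIC, ALG_HASH };

/* Return 1 if id is one of the reserved numbers named in this section. */
int is_reserved_algorithm(enum alg_class cls, unsigned id)
{
    assert(cls == ALG_PUBKEY || cls == ALG_SYMMETRIC || cls == ALG_HASH);

    switch (cls) {
    case ALG_PUBKEY:      /* Elliptic Curve, ECDSA, X9.42 */
        return id == 18 || id == 19 || id == 21;
    case ALG_SYMMETRIC:   /* DES/SK */
        return id == 6;
    case ALG_HASH:        /* double-width SHA, TIGER192, HAVAL-5-160 */
        return id == 4 || id == 6 || id == 7;
    }
    return 0;
}
```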

12.8. OpenPGP CFB mode

   OpenPGP does symmetric encryption using a variant of Cipher Feedback
   Mode (CFB mode). This section describes the procedure it uses in
   detail. This mode is what is used for Symmetrically Encrypted Data
   Packets; the mechanism used for encrypting secret key material is
   similar, but described in those sections above.

   OpenPGP CFB mode uses an initialization vector (IV) of all zeros, and
   prefixes the plaintext with ten octets of random data, such that
   octets 9 and 10 match octets 7 and 8.  It does a CFB "resync" after
   encrypting those ten octets.

   Note that for an algorithm that has a larger block size than 64 bits,
   the equivalent function will be done with that entire block.  For
   example, a 16-octet block algorithm would operate on 16 octets, and
   then produce two octets of check, and then work on 16-octet blocks.

   Step by step, here is the procedure:

   1.  The feedback register (FR) is set to the IV, which is all zeros.

   2.  FR is encrypted to produce FRE (FR Encrypted).  This is the
       encryption of an all-zero value.

   3.  FRE is xored with the first 8 octets of random data prefixed to
       the plaintext to produce C1-C8, the first 8 octets of ciphertext.

   4.  FR is loaded with C1-C8.

   5.  FR is encrypted to produce FRE, the encryption of the first 8
       octets of ciphertext.

   6.  The left two octets of FRE get xored with the next two octets of
       data that were prefixed to the plaintext.  This produces C9-C10,
       the next two octets of ciphertext.

   7.  (The resync step) FR is loaded with C3-C10.

   8.  FR is encrypted to produce FRE.

   9.  FRE is xored with the first 8 octets of the given plaintext, now
       that we have finished encrypting the 10 octets of prefixed data.
       This produces C11-C18, the next 8 octets of ciphertext.

   10.  FR is loaded with C11-C18.

   11.  FR is encrypted to produce FRE.

   12.  FRE is xored with the next 8 octets of plaintext, to produce the
       next 8 octets of ciphertext.  These are loaded into FR and the
       process is repeated until the plaintext is used up.
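
   The twelve steps above might be sketched in C as follows, for an
   8-octet block.  This is an illustration under stated assumptions,
   not a reference implementation.  The names block_encrypt_fn,
   toy_encrypt, openpgp_cfb_encrypt and openpgp_cfb_decrypt are
   hypothetical.  The function toy_encrypt is a placeholder that is NOT
   a real cipher and is NOT secure; it exists only so that the sketch
   runs, and a real implementation would supply the block encryption of
   the chosen algorithm.  The caller is assumed to supply eight random
   octets; the sketch appends the two repeated octets itself.  The
   decryption routine's rejection of a mismatch in the repeated octets
   is likewise an assumption about how a receiver might use them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BS 8   /* block size in octets, as in the walk-through above */

/* Hypothetical hook: encrypt one BS-octet block under key. */
typedef void (*block_encrypt_fn)(const unsigned char *key,
                                 const unsigned char *in,
                                 unsigned char *out);

/* Placeholder only: NOT a real cipher, NOT secure. */
void toy_encrypt(const unsigned char *key, const unsigned char *in,
                 unsigned char *out)
{
    unsigned char acc = 0x5a;
    int i;
    for (i = 0; i < BS; i++) {
        acc = (unsigned char)(acc * 31 + (in[i] ^ key[i]) + i);
        out[i] = acc;
    }
}

/* out needs room for len + BS + 2 octets; returns octets written. */
size_t openpgp_cfb_encrypt(block_encrypt_fn enc, const unsigned char *key,
                           const unsigned char *rnd, /* BS random octets */
                           const unsigned char *plain, size_t len,
                           unsigned char *out)
{
    unsigned char fr[BS], fre[BS];
    size_t i, n, off = BS + 2;

    assert(enc != NULL && out != NULL);
    memset(fr, 0, BS);                      /* step 1: FR = zero IV   */
    enc(key, fr, fre);                      /* step 2                 */
    for (i = 0; i < BS; i++)                /* step 3: C1-C8          */
        out[i] = fre[i] ^ rnd[i];
    memcpy(fr, out, BS);                    /* step 4                 */
    enc(key, fr, fre);                      /* step 5                 */
    out[BS]     = fre[0] ^ rnd[BS - 2];     /* step 6: C9-C10 carry a */
    out[BS + 1] = fre[1] ^ rnd[BS - 1];     /* repeat of octets 7-8   */
    memcpy(fr, out + 2, BS);                /* step 7: resync, C3-C10 */
    while (len > 0) {                       /* steps 8-12             */
        n = len < BS ? len : BS;
        enc(key, fr, fre);
        for (i = 0; i < n; i++)
            out[off + i] = fre[i] ^ plain[i];
        memcpy(fr, out + off, n);
        plain += n; off += n; len -= n;
    }
    return off;
}

/* Returns plaintext length, or (size_t)-1 on short input or when the
   repeated prefix octets do not match. */
size_t openpgp_cfb_decrypt(block_encrypt_fn enc, const unsigned char *key,
                           const unsigned char *in, size_t len,
                           unsigned char *out)
{
    unsigned char fr[BS], fre[BS], p7, p8;
    size_t i, n, total;

    if (len < BS + 2)
        return (size_t)-1;
    memset(fr, 0, BS);
    enc(key, fr, fre);
    p7 = fre[BS - 2] ^ in[BS - 2];          /* prefix octets 7 and 8  */
    p8 = fre[BS - 1] ^ in[BS - 1];
    enc(key, in, fre);                      /* FR = C1-C8             */
    if ((fre[0] ^ in[BS]) != p7 || (fre[1] ^ in[BS + 1]) != p8)
        return (size_t)-1;
    memcpy(fr, in + 2, BS);                 /* resync on C3-C10       */
    in += BS + 2; len -= BS + 2; total = len;
    while (len > 0) {
        n = len < BS ? len : BS;
        enc(key, fr, fre);
        memcpy(fr, in, n);                  /* next FR is ciphertext  */
        for (i = 0; i < n; i++)
            out[i] = fre[i] ^ in[i];
        in += n; out += n; len -= n;
    }
    return total;
}
```

   Only the encryption direction of the block cipher is needed, both
   for encrypting and for decrypting, which is why a single hook
   suffices in this sketch.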

13. Security Considerations

   As with any technology involving cryptography, you should check the
   current literature to determine if any algorithms used here have been
   found to be vulnerable to attack.

   This specification uses Public Key Cryptography technologies.
   Possession of the private key portion of a public-private key pair is
   assumed to be controlled by the proper party or parties.

   Certain operations in this specification involve the use of random
   numbers.  An appropriate entropy source should be used to generate
   these numbers.  See RFC 1750.

   The MD5 hash algorithm has been found to have weaknesses (pseudo-
   collisions in the compress function) that make some people deprecate
   its use.  They consider the SHA-1 algorithm better.

   Many security protocol designers think that it is a bad idea to use a
   single key for both privacy (encryption) and integrity (signatures).
   In fact, this was one of the motivating forces behind the V4 key
   format with separate signature and encryption keys. If you as an
   implementor promote dual-use keys, you should at least be aware of
   this controversy.

   The DSA algorithm will work with any 160-bit hash, but it is
   sensitive to the quality of the hash algorithm: if the hash algorithm
   is broken, it can leak the secret key. The Digital Signature Standard
   (DSS) specifies that DSA be used with SHA-1.  RIPEMD-160 is
   considered by many cryptographers to be as strong. An implementation
   should take care which hash algorithms are used with DSA, as a weak
   hash can not only allow a signature to be forged, but could leak the
   secret key. These same considerations about the quality of the hash
   algorithm apply to Elgamal signatures.

   If you are building an authentication system, the recipient may
   specify a preferred signing algorithm. However, the signer would be
   foolish to use a weak algorithm simply because the recipient requests
   it.

   Some of the encryption algorithms mentioned in this document have
   been analyzed less than others.  For example, although CAST5 is
   presently considered strong, it has been analyzed less than Triple-
   DES. Other algorithms may have other controversies surrounding them.

   Some technologies mentioned here may be subject to government control
   in some countries.

14. Implementation Nits

   This section is a collection of comments to help an implementor,
   particularly with an eye to backward compatibility. Previous
   implementations of PGP are not OpenPGP-compliant. Often the
   differences are small, but small differences are frequently more
   vexing than large differences. Thus, this is a list of potential
   problems and gotchas for a developer who is trying to be backward
   compatible.

     * PGP 5.x does not accept V4 signatures for anything other than
       key material.

     * PGP 5.x does not recognize the "five-octet" lengths in new-format
       headers or in signature subpacket lengths.

     * PGP 5.0 rejects an encrypted session key if the keylength differs
       from the S2K symmetric algorithm. This is a bug in its validation
       function.

     * PGP 5.0 does not handle multiple one-pass signature headers and
       trailers. Signing one will compress the one-pass signed literal
       and prefix a V3 signature instead of doing a nested one-pass
       signature.

     * When exporting a private key, PGP 2.x generates the header "BEGIN
       PGP SECRET KEY BLOCK" instead of "BEGIN PGP PRIVATE KEY BLOCK".
       All previous versions ignore the implied data type, and look
       directly at the packet data type.

     * In a clear-signed signature, PGP 5.0 will figure out the correct
       hash algorithm if there is no "Hash:" header, but it will reject
       a mismatch between the header and the actual algorithm used. The
       "standard" (i.e. Zimmermann/Finney/et al.) version of PGP 2.x
       rejects the "Hash:" header and assumes MD5. There are a number of
       enhanced variants of PGP 2.6.x that have been modified for SHA-1
       signatures.

     * PGP 5.0 can read an RSA key in V4 format, but can only recognize
       it with a V3 keyid, and can properly use only a V3 format RSA
       key.

     * Neither PGP 5.x nor PGP 6.0 recognizes Elgamal Encrypt and Sign
       keys. They only handle Elgamal Encrypt-only keys.

     * There are many ways possible for two keys to have the same key
       material, but different fingerprints (and thus key ids). Perhaps
       the most interesting is an RSA key that has been "upgraded" to V4
       format. Moreover, since a V4 fingerprint is constructed by
       hashing the key creation time along with other things, two V4
       keys created at different times, yet with the same key material,
       will have different fingerprints.

     * If an implementation is using zlib to interoperate with PGP 2.x,
       then the "windowBits" parameter should be set to -13.

15. Authors and Working Group Chair

   The working group can be contacted via the current chair:

   John W. Noerenberg, II
   Qualcomm, Inc
   6455 Lusk Blvd
   San Diego, CA 92131 USA

   Phone: +1 619-658-3510
   EMail: jwn2@qualcomm.com


   The principal authors of this memo are:

   Jon Callas
   Network Associates, Inc.
   3965 Freedom Circle
   Santa Clara, CA 95054, USA

   Phone: +1 408-346-5860
   EMail: jon@pgp.com, jcallas@nai.com


   Lutz Donnerhacke
   IKS GmbH
   Wildenbruchstr. 15
   07745 Jena, Germany

   Phone: +49-3641-675642
   EMail: lutz@iks-jena.de


   Hal Finney
   Network Associates, Inc.
   3965 Freedom Circle
   Santa Clara, CA 95054, USA

   EMail: hal@pgp.com


   Rodney Thayer
   EIS Corporation
   Clearwater, FL 33767, USA

   EMail: rodney@unitran.com

   This memo also draws on much previous work from a number of other
   authors who include: Derek Atkins, Charles Breed, Dave Del Torto,
   Marc Dyksterhouse, Gail Haspert, Gene Hoffman, Paul Hoffman, Raph
   Levien, Colin Plumb, Will Price, William Stallings, Mark Weaver, and
   Philip R. Zimmermann.

16. References

   [BLEICHENBACHER] Bleichenbacher, Daniel, "Generating ElGamal
                    signatures without knowing the secret key,"
                    Eurocrypt 96.  Note that the version in the
                    proceedings has an error.  A revised version is
                    available at the time of writing from
                    <ftp://ftp.inf.ethz.ch/pub/publications/papers/ti/isc/ElGamal.ps>

   [BLOWFISH]       Schneier, B. "Description of a New Variable-Length
                    Key, 64-Bit Block Cipher (Blowfish)" Fast Software
                    Encryption, Cambridge Security Workshop Proceedings
                    (December 1993), Springer-Verlag, 1994, pp. 191-204.
                    <http://www.counterpane.com/bfsverlag.html>

   [DONNERHACKE]    Donnerhacke, L., et al., "PGP263in - an improved
                    international version of PGP",
                    ftp://ftp.iks-jena.de/mitarb/lutz/crypt/software/pgp/

   [ELGAMAL]        T. ElGamal, "A Public-Key Cryptosystem and a
                    Signature Scheme Based on Discrete Logarithms," IEEE
                    Transactions on Information Theory, v. IT-31, n. 4,
                    1985, pp. 469-472.

   [IDEA]           Lai, X., "On the design and security of block
                    ciphers", ETH Series in Information Processing, J.L.
                    Massey (editor), Vol. 1, Hartung-Gorre Verlag
                    Konstanz, Technische Hochschule (Zurich), 1992.

   [ISO-10646]      ISO/IEC 10646-1:1993. International Standard --
                    Information technology -- Universal Multiple-Octet
                    Coded Character Set (UCS) -- Part 1: Architecture
                    and Basic Multilingual Plane.  UTF-8 is described in
                    Annex R, adopted but not yet published.  UTF-16 is
                    described in Annex Q, adopted but not yet published.

   [MENEZES]        Alfred Menezes, Paul van Oorschot, and Scott
                    Vanstone, "Handbook of Applied Cryptography," CRC
                    Press, 1996.

   [RFC822]         Crocker, D., "Standard for the format of ARPA
                    Internet text messages", STD 11, RFC 822, August
                    1982.

   [RFC1423]        Balenson, D., "Privacy Enhancement for Internet
                    Electronic Mail: Part III: Algorithms, Modes, and
                    Identifiers", RFC 1423, October 1993.

   [RFC1641]        Goldsmith, D. and M. Davis, "Using Unicode with
                    MIME", RFC 1641, July 1994.

   [RFC1750]        Eastlake, D., Crocker, S. and J. Schiller,
                    "Randomness Recommendations for Security", RFC 1750,
                    December 1994.

   [RFC1951]        Deutsch, P., "DEFLATE Compressed Data Format
                    Specification version 1.3.", RFC 1951, May 1996.

   [RFC1983]        Malkin, G., "Internet Users' Glossary", FYI 18, RFC
                    1983, August 1996.

   [RFC1991]        Atkins, D., Stallings, W. and P. Zimmermann, "PGP
                    Message Exchange Formats", RFC 1991, August 1996.

   [RFC2015]        Elkins, M., "MIME Security with Pretty Good Privacy
                    (PGP)", RFC 2015, October 1996.

   [RFC2045]        Borenstein, N. and N. Freed, "Multipurpose Internet
                    Mail Extensions (MIME) Part One: Format of Internet
                    Message Bodies.", RFC 2045, November 1996.

   [RFC2119]        Bradner, S., "Key words for use in RFCs to Indicate
                    Requirement Level", BCP 14, RFC 2119, March 1997.

   [RFC2144]        Adams, C., "The CAST-128 Encryption Algorithm", RFC
                    2144, May 1997.

   [RFC2279]        Yergeau, F., "UTF-8, a transformation format of
                    Unicode and ISO 10646", RFC 2279, January 1998.

   [RFC2313]        Kaliski, B., "PKCS #1: RSA Encryption Standard
                    version 1.5", RFC 2313, March 1998.

   [SAFER]          Massey, J.L., "SAFER K-64: One Year Later", B.
                    Preneel, editor, Fast Software Encryption, Second
                    International Workshop (LNCS 1008), pp. 212-241,
                    Springer-Verlag, 1995.

17.  Full Copyright Statement

   Copyright (C) The Internet Society (1998).  All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works.  However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.