RFC 1521

MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies

Pages: 81
Obsoletes: 1341
Obsoleted by: 2045 2046 2047 2048 2049
Updated by: 1590

Part 2 of 3 – Pages 23 to 55

6.    Additional Content-Header Fields

6.1.  Optional Content-ID Header Field

   In constructing a high-level user agent, it may be desirable to allow
   one body to make reference to another.  Accordingly, bodies may be
   labeled using the "Content-ID" header field, which is syntactically
   identical to the "Message-ID" header field:

   id :=  "Content-ID" ":" msg-id

   Like the Message-ID values, Content-ID values must be generated to be
   world-unique.
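
   As an illustrative sketch only (not part of this specification), a
   composing agent written in Python might generate such a value with
   the standard library's make_msgid(), which produces a string in
   msg-id syntax.  The function name new_content_id and the domain
   "example.com" are assumptions made for the example.

```python
from email.utils import make_msgid


def new_content_id(domain="example.com"):
    """Return a fresh value in msg-id syntax, e.g. <unique-part@example.com>.

    make_msgid() combines the time, the process id and random bits, which
    is one common way to approximate world-uniqueness.
    """
    return make_msgid(domain=domain)


# Build the header line for one body part.
header_line = "Content-ID: " + new_content_id()
```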

   The Content-ID value may be used for uniquely identifying MIME
   entities in several contexts, particularly for caching data
   referenced by the message/external-body mechanism.  Although the
   Content-ID header is generally optional, its use is mandatory in
   implementations which generate data of the optional MIME Content-type
   "message/external-body".  That is, each message/external-body entity
   must have a Content-ID field to permit caching of such data.

   It is also worth noting that the Content-ID value has special
   semantics in the case of the multipart/alternative content-type.
   This is explained in the section of this document dealing with
   multipart/alternative.

6.2.  Optional Content-Description Header Field

   The ability to associate some descriptive information with a given
   body is often desirable. For example, it may be useful to mark an
   "image" body as "a picture of the Space Shuttle Endeavour."  Such text
   may be placed in the Content-Description header field.

   description := "Content-Description" ":" *text

   The description is presumed to be given in the US-ASCII character
   set, although the mechanism specified in [RFC-1522] may be used for
   non-US-ASCII Content-Description values.
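
   The following Python sketch shows one way a composing agent might
   apply this rule.  It is an illustration under stated assumptions, not
   a normative procedure: the helper names are hypothetical, and the
   choice of ISO-8859-1 as the only fallback character set is made
   purely to keep the example short.

```python
from email.header import Header, decode_header


def content_description(text):
    """Build a Content-Description line, encoding non-US-ASCII text.

    Assumption: any non-ASCII text fits in ISO-8859-1.  Text outside that
    character set raises UnicodeEncodeError in this sketch.
    """
    try:
        text.encode("us-ascii")
        value = text  # plain US-ASCII needs no encoding
    except UnicodeEncodeError:
        value = Header(text, "iso-8859-1").encode()  # RFC 1522 encoded-word
    return "Content-Description: " + value


def decode_description(value):
    """Reverse the encoding above for display purposes."""
    return "".join(
        chunk.decode(charset or "us-ascii") if isinstance(chunk, bytes) else chunk
        for chunk, charset in decode_header(value)
    )
```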

7.    The Predefined Content-Type Values

   This document defines seven initial Content-Type values and an
   extension mechanism for private or experimental types.  Further
   standard types must be defined by new published specifications.  It
   is expected that most innovation in new types of mail will take place
   as subtypes of the seven types defined here.  The most essential
   characteristics of the seven content-types are summarized in Appendix
   F.

7.1.  The Text Content-Type

   The text Content-Type is intended for sending material which is
   principally textual in form.  It is the default Content-Type.  A
   "charset" parameter may be used to indicate the character set of the
   body text for some text subtypes, notably including the primary
   subtype, "text/plain", which indicates plain (unformatted) text.  The
   default Content-Type for Internet mail is "text/plain;
   charset=us-ascii".

   Beyond plain text, there are many formats for representing what might
   be known as "extended text" -- text with embedded formatting and
   presentation information.  An interesting characteristic of many such
   representations is that they are to some extent readable even without
   the software that interprets them.  It is useful, then, to
   distinguish them, at the highest level, from such unreadable data as
   images, audio, or text represented in an unreadable form.  In the
   absence of appropriate interpretation software, it is reasonable to
   show subtypes of text to the user, while it is not reasonable to do
   so with most nontextual data.

   Such formatted textual data should be represented using subtypes of
   text.  Plausible subtypes of text are typically given by the common
   name of the representation format, e.g., "text/richtext" [RFC-1341].

7.1.1.     The charset parameter

   A critical parameter that may be specified in the Content-Type field
   for text/plain data is the character set.  This is specified with a
   "charset" parameter, as in:

        Content-type: text/plain; charset=us-ascii

   Unlike some other parameter values, the values of the charset
   parameter are NOT case sensitive.  The default character set, which
   must be assumed in the absence of a charset parameter, is US-ASCII.
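
   A receiving agent could implement the two rules above (case-
   insensitive matching and the US-ASCII default) roughly as follows.
   This is a minimal sketch using Python's standard email package; the
   function name effective_charset is an assumption of the example.

```python
from email.message import Message


def effective_charset(content_type_value=None):
    """Return the lowercased charset for a Content-Type field value.

    Falls back to "us-ascii" when the field or the parameter is absent.
    """
    msg = Message()
    if content_type_value is not None:
        msg["Content-Type"] = content_type_value
    # get_content_charset() lowercases the value, so comparisons made on
    # the result are case-insensitive.
    return msg.get_content_charset(failobj="us-ascii")
```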

   The specification for any future subtypes of "text" must specify
   whether or not they will also utilize a "charset" parameter, and may
   possibly restrict its values as well.  When used with a particular
   body, the semantics of the "charset" parameter should be identical to
   those specified here for "text/plain", i.e., the body consists
   entirely of characters in the given charset.  In particular, definers
   of future text subtypes should pay close attention to the
   implications of multibyte character sets for their subtype
   definitions.

   This RFC specifies the definition of the charset parameter for the
   purposes of MIME to be a unique mapping of a byte stream to glyphs, a
   mapping which does not require external profiling information.

   An initial list of predefined character set names can be found at the
   end of this section.  Additional character sets may be registered
   with IANA, although the standardization of their use requires the
   usual IESG [RFC-1340] review and approval.  Note that if the
   specified character set includes 8-bit data, a Content-Transfer-
   Encoding header field and a corresponding encoding on the data are
   required in order to transmit the body via some mail transfer
   protocols, such as SMTP.

   The default character set, US-ASCII, has been the subject of some
   confusion and ambiguity in the past.  Not only were there some
   ambiguities in the definition, there have been wide variations in
   practice.  In order to eliminate such ambiguity and variations in the
   future, it is strongly recommended that new user agents explicitly
   specify a character set via the Content-Type header field.  "US-
   ASCII" does not indicate an arbitrary seven-bit character code, but
   specifies that the body uses character coding that uses the exact
   correspondence of codes to characters specified in ASCII.  National
   use variations of ISO 646 [ISO-646] are NOT ASCII and their use in
   Internet mail is explicitly discouraged. The omission of the ISO 646
   character set is deliberate in this regard.  The character set name
   of "US-ASCII" explicitly refers to ANSI X3.4-1986 [US-ASCII] only.
   The character set name "ASCII" is reserved and must not be used for
   any purpose.

      NOTE: RFC 821 explicitly specifies "ASCII", and references an
      earlier version of the American Standard.  Insofar as one of the
      purposes of specifying a Content-Type and character set is to
      permit the receiver to unambiguously determine how the sender
      intended the coded message to be interpreted, assuming anything
      other than "strict ASCII" as the default would risk unintentional
      and incompatible changes to the semantics of messages now being
      transmitted.  This also implies that messages containing
      characters coded according to national variations on ISO 646, or
      using code-switching procedures (e.g., those of ISO 2022), as well
      as 8-bit or multiple octet character encodings MUST use an
      appropriate character set specification to be consistent with this
      specification.

   The complete US-ASCII character set is listed in [US-ASCII].  Note
   that the control characters including DEL (0-31, 127) have no defined
   meaning apart from the combination CRLF (ASCII values 13 and 10)
   indicating a new line.  Two of the characters have de facto meanings
   in wide use: FF (12) often means "start subsequent text on the
   beginning of a new page"; and TAB or HT (9) often (though not always)
   means "move the cursor to the next available column after the current
   position where the column number is a multiple of 8 (counting the
   first column as column 0)." Apart from this, any use of the control
   characters or DEL in a body must be part of a private agreement
   between the sender and recipient.  Such private agreements are
   discouraged and should be replaced by the other capabilities of this
   document.
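
   A composing agent that wishes to warn about such characters might
   scan a body as sketched below.  The sketch is illustrative only: the
   function name is hypothetical, and treating TAB and FF as acceptable
   reflects the de facto meanings described above rather than any
   requirement of this document.

```python
def unsanctioned_controls(body: bytes) -> list:
    """Return offsets of control characters (0-31, 127) that are neither
    part of a CRLF pair nor one of TAB (9) or FF (12)."""
    offsets = []
    i = 0
    while i < len(body):
        c = body[i]
        # A CR immediately followed by LF is the one defined combination.
        if c == 13 and body[i + 1:i + 2] == b"\n":
            i += 2
            continue
        if (c < 32 or c == 127) and c not in (9, 12):
            offsets.append(i)
        i += 1
    return offsets
```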

      NOTE: Beyond US-ASCII, an enormous proliferation of character sets
      is possible. It is the opinion of the IETF working group that a
      large number of character sets is NOT a good thing.  We would
      prefer to specify a single character set that can be used
      universally for representing all of the world's languages in
      electronic mail.  Unfortunately, existing practice in several
      communities seems to point to the continued use of multiple
      character sets in the near future.  For this reason, we define
      names for a small number of character sets for which a strong
      constituent base exists.

   The defined charset values are:

        US-ASCII -- as defined in [US-ASCII].

        ISO-8859-X -- where "X" is to be replaced, as necessary, for the
             parts of ISO-8859 [ISO-8859].  Note that the ISO 646
             character sets have deliberately been omitted in favor of
             their 8859 replacements, which are the designated character
             sets for Internet mail.  As of the publication of this
             document, the legitimate values for "X" are the digits 1
             through 9.

   The character sets specified above are the ones that were relatively
   uncontroversial during the drafting of MIME.  This document does not
   endorse the use of any particular character set other than US-ASCII,
   and recognizes that the future evolution of world character sets
   remains unclear.  It is expected that in the future, additional
   character sets will be registered for use in MIME.

   Note that the character set used, if anything other than US-ASCII,
   must always be explicitly specified in the Content-Type field.

   No other character set name may be used in Internet mail without the
   publication of a formal specification and its registration with IANA,
   or by private agreement, in which case the character set name must
   begin with "X-".

   Implementors are discouraged from defining new character sets for
   mail use unless absolutely necessary.

   The "charset" parameter has been defined primarily for the purpose of
   textual data, and is described in this section for that reason.
   However, it is conceivable that non-textual data might also wish to
   specify a charset value for some purpose, in which case the same
   syntax and values should be used.

   In general, mail-sending software must always use the "lowest common
   denominator" character set possible.  For example, if a body contains
   only US-ASCII characters, it must be marked as being in the US-ASCII
   character set, not ISO-8859-1, which, like all the ISO-8859 family of
   character sets, is a superset of US-ASCII.  More generally, if a
   widely-used character set is a subset of another character set, and a
   body contains only characters in the widely-used subset, it must be
   labeled as being in that subset.  This will increase the chances that
   the recipient will be able to view the mail correctly.
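
   One possible way to apply the "lowest common denominator" rule is
   sketched below in Python.  It tries US-ASCII first and then the
   ISO-8859 parts defined in this section.  The order in which the
   ISO-8859 parts are tried is an assumption of the example (this
   document does not rank them), as is the function name.

```python
# US-ASCII first, then ISO-8859-1 through ISO-8859-9.
CANDIDATES = ["us-ascii"] + ["iso-8859-%d" % n for n in range(1, 10)]


def lowest_common_charset(text):
    """Return the first candidate charset that can represent text,
    or None if none of the predefined charsets can."""
    for name in CANDIDATES:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue
        return name
    return None
```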

7.1.2.     The Text/plain subtype

   The primary subtype of text is "plain".  This indicates plain
   (unformatted) text.  The default Content-Type for Internet mail,
   "text/plain; charset=us-ascii", describes existing Internet practice.
   That is, it is the type of body defined by RFC 822.

   No other text subtype is defined by this document.

   The formal grammar for the content-type header field for text is as
   follows:

   text-type := "text" "/" text-subtype [";" "charset" "=" charset]

   text-subtype := "plain" / extension-token

   charset := "us-ascii"/ "iso-8859-1"/ "iso-8859-2"/ "iso-8859-3"
          / "iso-8859-4"/ "iso-8859-5"/ "iso-8859-6"/ "iso-8859-7"
          / "iso-8859-8" / "iso-8859-9" / extension-token
                    ; case insensitive

7.2.  The Multipart Content-Type

   In the case of multiple part entities, in which one or more different
   sets of data are combined in a single body, a "multipart" Content-
   Type field must appear in the entity's header. The body must then
   contain one or more "body parts," each preceded by an encapsulation
   boundary, and the last one followed by a closing boundary.  Each part
   starts with an encapsulation boundary, and then contains a body part
   consisting of header area, a blank line, and a body area.  Thus a
   body part is similar to an RFC 822 message in syntax, but different
   in meaning.

   A body part is NOT to be interpreted as actually being an RFC 822
   message.  To begin with, NO header fields are actually required in
   body parts.  A body part that starts with a blank line, therefore, is
   allowed and is a body part for which all default values are to be
   assumed.  In such a case, the absence of a Content-Type header field
   implies that the corresponding body is plain US-ASCII text.  The only
   header fields that have defined meaning for body parts are those the
   names of which begin with "Content-".  All other header fields are
   generally to be ignored in body parts.  Although they should
   generally be retained in mail processing, they may be discarded by
   gateways if necessary.  Such other fields are permitted to appear in
   body parts but must not be depended on.  "X-" fields may be created
   for experimental or private purposes, with the recognition that the
   information they contain may be lost at some gateways.

      NOTE: The distinction between an RFC 822 message and a body part
      is subtle, but important. A gateway between Internet and X.400
      mail, for example, must be able to tell the difference between a
      body part that contains an image and a body part that contains an
      encapsulated message, the body of which is an image.  In order to
      represent the latter, the body part must have "Content-Type:
      message", and its body (after the blank line) must be the
      encapsulated message, with its own "Content-Type: image" header
      field.  The use of similar syntax facilitates the conversion of
      messages to body parts, and vice versa, but the distinction
      between the two must be understood by implementors.  (For the
      special case in which all parts actually are messages, a "digest"
      subtype is also defined.)

   As stated previously, each body part is preceded by an encapsulation
   boundary.  The encapsulation boundary MUST NOT appear inside any of
   the encapsulated parts.  Thus, it is crucial that the composing agent
   be able to choose and specify the unique boundary that will separate
   the parts.

   All present and future subtypes of the "multipart" type must use an
   identical syntax.  Subtypes may differ in their semantics, and may
   impose additional restrictions on syntax, but must conform to the
   required syntax for the multipart type.  This requirement ensures
   that all conformant user agents will at least be able to recognize
   and separate the parts of any multipart entity, even of an
   unrecognized subtype.

   As stated in the definition of the Content-Transfer-Encoding field,
   no encoding other than "7bit", "8bit", or "binary" is permitted for
   entities of type "multipart".  The multipart delimiters and header
   fields are always represented as 7-bit ASCII in any case (though the
   header fields may encode non-ASCII header text as per [RFC-1522]),
   and data within the body parts can be encoded on a part-by-part
   basis, with Content-Transfer-Encoding fields for each appropriate
   body part.

   Mail gateways, relays, and other mail handling agents are commonly
   known to alter the top-level header of an RFC 822 message.  In
   particular, they frequently add, remove, or reorder header fields.
   Such alterations are explicitly forbidden for the body part headers
   embedded in the bodies of messages of type "multipart."

7.2.1.     Multipart:  The common syntax

   All subtypes of "multipart" share a common syntax, defined in this
   section.  A simple example of a multipart message also appears in
   this section.  An example of a more complex multipart message is
   given in Appendix C.

   The Content-Type field for multipart entities requires one parameter,
   "boundary", which is used to specify the encapsulation boundary.  The
   encapsulation boundary is defined as a line consisting entirely of
   two hyphen characters ("-", decimal code 45) followed by the boundary
   parameter value from the Content-Type header field.

      NOTE: The hyphens are for rough compatibility with the earlier RFC
      934 method of message encapsulation, and for ease of searching for
      the boundaries in some implementations. However, it should be
      noted that multipart messages are NOT completely compatible with
      RFC 934 encapsulations; in particular, they do not obey RFC 934
      quoting conventions for embedded lines that begin with hyphens.
      This mechanism was chosen over the RFC 934 mechanism because the
      latter causes lines to grow with each level of quoting.  The
      combination of this growth with the fact that SMTP implementations
      sometimes wrap long lines made the RFC 934 mechanism unsuitable
      for use in the event that deeply-nested multipart structuring is
      ever desired.

   WARNING TO IMPLEMENTORS: The grammar for parameters on the Content-
   type field is such that it is often necessary to enclose the
   boundaries in quotes on the Content-type line.  This is not always
   necessary, but never hurts.  Implementors should be sure to study the
   grammar carefully in order to avoid producing illegal Content-type
   fields. Thus, a typical multipart Content-Type header field might
   look like this:

                 Content-Type: multipart/mixed;
                      boundary=gc0p4Jq0M2Yt08jU534c0p

   But the following is illegal:

                 Content-Type: multipart/mixed;
                      boundary=gc0p4Jq0M:2Yt08jU534c0p

   (because of the colon) and must instead be represented as

                 Content-Type: multipart/mixed;
                      boundary="gc0p4Jq0M:2Yt08jU534c0p"

   This indicates that the entity consists of several parts, each itself
   with a structure that is syntactically identical to an RFC 822
   message, except that the header area might be completely empty, and
   that the parts are each preceded by the line

                 --gc0p4Jq0M:2Yt08jU534c0p
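
   The quoting decision illustrated by the examples above can be
   sketched as follows.  The set of characters that force quoting is
   taken to be the "tspecials" of the Content-Type parameter grammar
   defined elsewhere in this document, plus the space character; the
   function name is an assumption of the example.  Because a valid
   boundary can never contain a quote or a backslash, no escaping is
   needed inside the quoted form.

```python
# Characters that may not appear in an unquoted parameter value (token).
TSPECIALS = set('()<>@,;:\\"/[]?=')


def boundary_parameter(boundary):
    """Format the boundary parameter, quoting it only when required."""
    needs_quotes = any(ch in TSPECIALS or ch == " " for ch in boundary)
    if needs_quotes:
        return 'boundary="%s"' % boundary
    return "boundary=%s" % boundary
```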

   Note that the encapsulation boundary must occur at the beginning of a
   line, i.e., following a CRLF, and that the initial CRLF is considered
   to be attached to the encapsulation boundary rather than part of the
   preceding part.  The boundary must be followed immediately either by
   another CRLF and the header fields for the next part, or by two
   CRLFs, in which case there are no header fields for the next part
   (and it is therefore assumed to be of Content-Type text/plain).

      NOTE: The CRLF preceding the encapsulation line is conceptually
      attached to the boundary so that it is possible to have a part
      that does not end with a CRLF (line break). Body parts that must
      be considered to end with line breaks, therefore, must have two
      CRLFs preceding the encapsulation line, the first of which is part
      of the preceding body part, and the second of which is part of the
      encapsulation boundary.

   Encapsulation boundaries must not appear within the encapsulations,
   and must be no longer than 70 characters, not counting the two
   leading hyphens.

   The encapsulation boundary following the last body part is a
   distinguished delimiter that indicates that no further body parts
   will follow.  Such a delimiter is identical to the previous
   delimiters, with the addition of two more hyphens at the end of the
   line:

                 --gc0p4Jq0M:2Yt08jU534c0p--

   There appears to be room for additional information prior to the
   first encapsulation boundary and following the final boundary.  These
   areas should generally be left blank, and implementations must ignore
   anything that appears before the first boundary or after the last
   one.

      NOTE: These "preamble" and "epilogue" areas are generally not used
      because of the lack of proper typing of these parts and the lack
      of clear semantics for handling these areas at gateways,
      particularly X.400 gateways.  However, rather than leaving the
      preamble area blank, many MIME implementations have found this to
      be a convenient place to insert an explanatory note for recipients
      who read the message with pre-MIME software, since such notes will
      be ignored by MIME-compliant software.

      NOTE: Because encapsulation boundaries must not appear in the body
      parts being encapsulated, a user agent must exercise care to
      choose a unique boundary.  The boundary in the example above could
      have been the result of an algorithm designed to produce
      boundaries with a very low probability of already existing in the
      data to be encapsulated without having to prescan the data.
      Alternate algorithms might result in more 'readable' boundaries
      for a recipient with an old user agent, but would require more
      attention to the possibility that the boundary might appear in the
      encapsulated part.  The simplest boundary possible is something
      like "---", with a closing boundary of "-----".
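
   A boundary-selection algorithm of the kind described in this note
   might look like the following sketch.  It combines a random value
   with an explicit scan of the data, which is more cautious than the
   note requires.  The function name and the "=_" prefix are
   assumptions of the example; note that a boundary containing "="
   must be quoted in the Content-Type field.

```python
import secrets


def make_boundary(bodies, prefix="=_"):
    """Return a boundary whose delimiter form ("--" + boundary) does not
    occur in any of the given body parts (an iterable of bytes)."""
    bodies = list(bodies)
    while True:
        # 2 + 32 characters, well under the 70-character limit.
        candidate = prefix + secrets.token_hex(16)
        needle = b"--" + candidate.encode("ascii")
        if not any(needle in body for body in bodies):
            return candidate
```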

   As a very simple example, the following multipart message has two
   parts, both of them plain text, one of them explicitly typed and one
   of them implicitly typed:

      From: Nathaniel Borenstein <nsb@bellcore.com>
      To:  Ned Freed <ned@innosoft.com>
      Subject: Sample message
      MIME-Version: 1.0
      Content-type: multipart/mixed; boundary="simple boundary"

      This is the preamble.  It is to be ignored, though it
      is a handy place for mail composers to include an
      explanatory note to non-MIME conformant readers.
      --simple boundary

      This is implicitly typed plain ASCII text.
      It does NOT end with a linebreak.
      --simple boundary
      Content-type: text/plain; charset=us-ascii

      This is explicitly typed plain ASCII text.
      It DOES end with a linebreak.

      --simple boundary--
      This is the epilogue.  It is also to be ignored.
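
   To make the handling of the preamble, the epilogue, and the CRLF
   attached to each boundary concrete, the following Python sketch
   splits a multipart body into its parts.  It is a simplified
   illustration, not a complete MIME parser: the function names are
   hypothetical, lines are assumed to be CRLF-terminated, and a missing
   closing boundary is treated as an error.

```python
def split_multipart(body: bytes, boundary: str) -> list:
    """Return the raw body parts; preamble and epilogue are discarded."""
    delimiter = b"--" + boundary.encode("ascii")
    closing = delimiter + b"--"
    parts = []
    current = None  # None while still inside the preamble
    for line in body.split(b"\r\n"):
        # Trailing white space is presumed to have been added by a gateway.
        stripped = line.rstrip(b" \t")
        if stripped == closing or stripped == delimiter:
            if current is not None:
                # Joining with CRLF drops the CRLF that precedes the
                # boundary, since it belongs to the boundary itself.
                parts.append(b"\r\n".join(current))
            if stripped == closing:
                return parts  # everything after this is epilogue
            current = []
        elif current is not None:
            current.append(line)
    raise ValueError("closing boundary not found")


def split_part(part: bytes):
    """Split one body part into (header area, body area)."""
    if part.startswith(b"\r\n"):
        return b"", part[2:]  # blank first line: no header fields
    head, _, body = part.partition(b"\r\n\r\n")
    return head, body
```

   Applied to the sample message above, the first part comes back with
   an empty header area and a body that does not end in CRLF, while the
   second part's body does end in CRLF.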

   The use of a Content-Type of multipart in a body part within another
   multipart entity is explicitly allowed.  In such cases, for obvious
   reasons, care must be taken to ensure that each nested multipart
   entity uses a different boundary delimiter. See Appendix C for an
   example of nested multipart entities.

   The use of the multipart Content-Type with only a single body part
   may be useful in certain contexts, and is explicitly permitted.

   The only mandatory parameter for the multipart Content-Type is the
   boundary parameter, which consists of 1 to 70 characters from a set
   of characters known to be very robust through email gateways, and NOT
   ending with white space.  (If a boundary appears to end with white
   space, the white space must be presumed to have been added by a
   gateway, and must be deleted.)  It is formally specified by the
   following BNF:

   boundary := 0*69<bchars> bcharsnospace

   bchars := bcharsnospace / " "

   bcharsnospace :=    DIGIT / ALPHA / "'" / "(" / ")" / "+" /"_"
                 / "," / "-" / "." / "/" / ":" / "=" / "?"
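
   The BNF above translates directly into a regular expression.  The
   Python sketch below is one such translation, offered as an
   illustration; the names used are assumptions of the example.

```python
import re

# bcharsnospace: DIGIT / ALPHA / ' ( ) + _ , - . / : = ?
_BCHARSNOSPACE = r"0-9A-Za-z'()+_,\-./:=?"

# boundary := 0*69<bchars> bcharsnospace
_BOUNDARY_RE = re.compile(
    r"[%s ]{0,69}[%s]" % (_BCHARSNOSPACE, _BCHARSNOSPACE)
)


def is_valid_boundary(value):
    """True if value is 1 to 70 bchars and does not end with a space."""
    return _BOUNDARY_RE.fullmatch(value) is not None
```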

   Overall, the body of a multipart entity may be specified as
   follows:

   multipart-body := preamble 1*encapsulation
                  close-delimiter epilogue

   encapsulation := delimiter body-part CRLF

   delimiter := "--" boundary CRLF ; taken from Content-Type field.
                                   ; There must be no space
                                   ; between "--" and boundary.

   close-delimiter := "--" boundary "--" CRLF ; Again, no space
                                              ; between boundary
                                              ; and either "--".

   preamble := discard-text   ; to be ignored upon receipt.

   epilogue := discard-text   ; to be ignored upon receipt.

   discard-text := *(*text CRLF)

   body-part := <"message" as defined in RFC 822,
             with all header fields optional, and with the
             specified delimiter not occurring anywhere in
             the message body, either on a line by itself
             or as a substring anywhere.  Note that the
             semantics of a part differ from the semantics
             of a message, as described in the text.>

      NOTE: In certain transport enclaves, RFC 822 restrictions such as
      the one that limits bodies to printable ASCII characters may not
      be in force.  (That is, the transport domains may resemble
      standard Internet mail transport as specified in RFC 821 and
      assumed by RFC 822, but without certain restrictions.)  The
      relaxation of these restrictions should be construed as locally
      extending the definition of bodies, for example to include octets
      outside of the ASCII range, as long as these extensions are
      supported by the transport and adequately documented in the
      Content-Transfer-Encoding header field. However, in no event are
      headers (either message headers or body-part headers) allowed to
      contain anything other than ASCII characters.

      NOTE: Conspicuously missing from the multipart type is a notion of
      structured, related body parts.  In general, it seems premature to
      try to standardize interpart structure yet.  It is recommended
      that those wishing to provide a more structured or integrated
      multipart messaging facility should define a subtype of multipart
      that is syntactically identical, but that always expects the
      inclusion of a distinguished part that can be used to specify the
      structure and integration of the other parts, probably referring
      to them by their Content-ID field.  If this approach is used,
      other implementations will not recognize the new subtype, but will
      treat it as the primary subtype (multipart/mixed) and will thus be
      able to show the user the parts that are recognized.

7.2.2.     The Multipart/mixed (primary) subtype

   The primary subtype for multipart, "mixed", is intended for use when
   the body parts are independent and need to be bundled in a particular
   order.  Any multipart subtypes that an implementation does not
   recognize must be treated as being of subtype "mixed".

7.2.3.     The Multipart/alternative subtype

   The multipart/alternative type is syntactically identical to
   multipart/mixed, but the semantics are different.  In particular,
   each of the parts is an "alternative" version of the same
   information.

   Systems should recognize that the content of the various parts is
   interchangeable.  Systems should choose the "best" type based on the
   local environment and preferences, in some cases even through user
   interaction.  As with multipart/mixed, the order of body parts is
   significant.  In this case, the alternatives appear in an order of
   increasing faithfulness to the original content. In general, the best
   choice is the LAST part of a type supported by the recipient system's
   local environment.

   Multipart/alternative may be used, for example, to send mail in a
   fancy text format in such a way that it can easily be displayed
   anywhere:

   From:  Nathaniel Borenstein <nsb@bellcore.com>
   To: Ned Freed <ned@innosoft.com>
   Subject: Formatted text mail
   MIME-Version: 1.0
   Content-Type: multipart/alternative; boundary=boundary42

   --boundary42
   Content-Type: text/plain; charset=us-ascii

      ...plain text version of message goes here....
   --boundary42
   Content-Type: text/richtext

      .... RFC 1341 richtext version of same message goes here ...
   --boundary42
   Content-Type: text/x-whatever

      .... fanciest formatted version of same message goes here ...
   --boundary42--

   In this example, users whose mail system understood the
   "text/x-whatever" format would see only the fancy version, while
   other users would see only the richtext or plain text version,
   depending on the capabilities of their system.

   In general, user agents that compose multipart/alternative entities
   must place the body parts in increasing order of preference, that is,
   with the preferred format last.  For fancy text, the sending user
   agent should put the plainest format first and the richest format
   last.  Receiving user agents should pick and display the last format
   they are capable of displaying.  In the case where one of the
   alternatives is itself of type "multipart" and contains unrecognized
   sub-parts, the user agent may choose either to show that alternative,
   an earlier alternative, or both.
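
   The selection rule for receiving user agents can be sketched in a
   few lines of Python.  The function name and the representation of
   the parts as a list of content-type strings are assumptions of the
   example; a real agent would also consider user preferences.

```python
def choose_alternative(part_types, supported):
    """Return the index of the LAST part whose content-type is in the
    supported set, or None if no part can be displayed."""
    chosen = None
    for index, content_type in enumerate(part_types):
        # Content-type names are matched case-insensitively here.
        if content_type.lower() in supported:
            chosen = index  # later parts are more faithful, so keep going
    return chosen


parts = ["text/plain", "text/richtext", "text/x-whatever"]
best = choose_alternative(parts, {"text/plain", "text/richtext"})
```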

      NOTE: From an implementor's perspective, it might seem more
      sensible to reverse this ordering, and have the plainest
      alternative last.  However, placing the plainest alternative first
      is the friendliest possible option when multipart/alternative
      entities are viewed using a non-MIME-conformant mail reader.
      While this approach does impose some burden on conformant mail
      readers, interoperability with older mail readers was deemed to be
      more important in this case.

   It may be the case that some user agents, if they can recognize more
   than one of the formats, will prefer to offer the user the choice of
   which format to view.  This makes sense, for example, if mail
   includes both a nicely-formatted image version and an easily-edited
   text version.  What is most critical, however, is that the user not
   automatically be shown multiple versions of the same data.  Either
   the user should be shown the last recognized version or should be
   given the choice.

   NOTE ON THE SEMANTICS OF CONTENT-ID IN MULTIPART/ALTERNATIVE: Each
   part of a multipart/alternative entity represents the same data, but
   the mappings between the two are not necessarily without information
   loss.  For example, information is lost when translating ODA to
   PostScript or plain text.  It is recommended that each part should
   have a different Content-ID value in the case where the information
   content of the two parts is not identical.  However, where the
   information content is identical -- for example, where several parts
   of type "message/external-body" specify alternate ways to access
   the identical data -- the same Content-ID field value should be used,
   to optimize any caching mechanisms that might be present on the
   recipient's end.  However, it is recommended that the Content-ID
   values used by the parts should not be the same Content-ID value that
   describes the multipart/alternative as a whole, if there is any such
   Content-ID field.  That is, one Content-ID value will refer to the
   multipart/alternative entity, while one or more other Content-ID
   values will refer to the parts inside it.

7.2.4.     The Multipart/digest subtype

   This document defines a "digest" subtype of the multipart Content-
   Type.  This type is syntactically identical to multipart/mixed, but
   the semantics are different.  In particular, in a digest, the default
   Content-Type value for a body part is changed from "text/plain" to
   "message/rfc822".  This is done to allow a more readable digest
   format that is largely compatible (except for the quoting convention)
   with RFC 934.

   A digest in this format might, then, look something like this:

   From: Moderator-Address
   To: Recipient-List
   MIME-Version: 1.0
   Subject:  Internet Digest, volume 42
   Content-Type: multipart/digest;
        boundary="---- next message ----"

   ------ next message ----

   From: someone-else
   Subject: my opinion

      ...body goes here ...

   ------ next message ----

   From: someone-else-again
   Subject: my different opinion

      ... another body goes here...

   ------ next message ------
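
   As an illustration only (not part of the specification), the
   following Python sketch parses a digest like the one above with the
   standard library "email" package and collects the Subject of each
   enclosed message.  It relies on that package applying the
   "message/rfc822" default to digest parts that carry no Content-Type
   header.  The names DIGEST and digest_subjects are hypothetical.

```python
from email import message_from_string

# Sample digest modeled on the example above; addresses are placeholders.
DIGEST = (
    "From: Moderator-Address\n"
    "To: Recipient-List\n"
    "MIME-Version: 1.0\n"
    "Subject: Internet Digest, volume 42\n"
    "Content-Type: multipart/digest;\n"
    '     boundary="---- next message ----"\n'
    "\n"
    "------ next message ----\n"
    "\n"
    "From: someone-else\n"
    "Subject: my opinion\n"
    "\n"
    "...body goes here ...\n"
    "\n"
    "------ next message ----\n"
    "\n"
    "From: someone-else-again\n"
    "Subject: my different opinion\n"
    "\n"
    "... another body goes here...\n"
    "\n"
    "------ next message ------\n"
)


def digest_subjects(raw):
    """Return the Subject of each message enclosed in a multipart/digest."""
    digest = message_from_string(raw)
    subjects = []
    for part in digest.get_payload():
        # A part without a Content-Type header defaults to message/rfc822
        # inside a digest (it would be text/plain in multipart/mixed).
        if part.get_content_type() == "message/rfc822":
            enclosed = part.get_payload(0)
            subjects.append(enclosed["Subject"])
    return subjects
```

   Each body part begins with an empty header area, so the "From" and
   "Subject" lines that follow belong to the enclosed message rather
   than to the body part itself.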

7.2.5.     The Multipart/parallel subtype

   This document defines a "parallel" subtype of the multipart Content-
   Type.  This type is syntactically identical to multipart/mixed, but
   the semantics are different.  In particular, in a parallel entity,
   the order of body parts is not significant.

   A common presentation of this type is to display all of the parts
   simultaneously on hardware and software that are capable of doing so.
   However, composing agents should be aware that many mail readers will
   lack this capability and will show the parts serially in any event.

7.2.6.     Other Multipart subtypes

   Other multipart subtypes are expected in the future.  MIME
   implementations must in general treat unrecognized subtypes of
   multipart as being equivalent to "multipart/mixed".

   The formal grammar for content-type header fields for multipart data
   is given by:

   multipart-type := "multipart" "/" multipart-subtype
                  ";" "boundary" "=" boundary

   multipart-subtype := "mixed" / "parallel" / "digest"
                  / "alternative" / extension-token
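
   The fallback rule for unrecognized multipart subtypes can be
   sketched as follows.  This is a minimal illustration in Python; the
   function name and the set of "known" subtypes are assumptions that
   an implementation would adjust to what it actually supports.

```python
# Subtypes this hypothetical implementation knows how to present.
KNOWN_MULTIPART_SUBTYPES = {"mixed", "alternative", "digest", "parallel"}


def effective_multipart_subtype(content_type):
    """Return the subtype to use when handling a multipart Content-Type value."""
    maintype, _, rest = content_type.partition("/")
    if maintype.strip().lower() != "multipart":
        raise ValueError("not a multipart content type: %r" % content_type)
    # Drop the parameters (boundary and so on) and normalize case.
    subtype = rest.split(";", 1)[0].strip().lower()
    # Unrecognized subtypes are treated as multipart/mixed.
    return subtype if subtype in KNOWN_MULTIPART_SUBTYPES else "mixed"
```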

7.3.  The Message Content-Type

   It is frequently desirable, in sending mail, to encapsulate another
   mail message. For this common operation, a special Content-Type,
   "message", is defined.  The primary subtype, message/rfc822, has no
   required parameters in the Content-Type field.  Additional subtypes,
   "partial" and "external-body", do have required parameters.  These
   subtypes are explained below.

      NOTE: It has been suggested that subtypes of message might be
      defined for forwarded or rejected messages.  However, forwarded
      and rejected messages can be handled as multipart messages in
      which the first part contains any control or descriptive
      information, and a second part, of type message/rfc822, is the
      forwarded or rejected message.  Composing rejection and forwarding
      messages in this manner will preserve the type information on the
      original message and allow it to be correctly presented to the
      recipient, and hence is strongly encouraged.

   As stated in the definition of the Content-Transfer-Encoding field,
   no encoding other than "7bit", "8bit", or "binary" is permitted for
   messages or parts of type "message".  Even stronger restrictions
   apply to the subtypes "message/partial" and "message/external-body",
   as specified below.  The message header fields are always US-ASCII in
   any case, and data within the body can still be encoded, in which
   case the Content-Transfer-Encoding header field in the encapsulated
   message will reflect this.  Non-ASCII text in the headers of an
   encapsulated message can be specified using the mechanisms described
   in [RFC-1522].

   Mail gateways, relays, and other mail handling agents are commonly
   known to alter the top-level header of an RFC 822 message.  In
   particular, they frequently add, remove, or reorder header fields.
   Such alterations are explicitly forbidden for the encapsulated
   headers embedded in the bodies of messages of type "message."

7.3.1.     The Message/rfc822 (primary) subtype

   A Content-Type of "message/rfc822" indicates that the body contains
   an encapsulated message, with the syntax of an RFC 822 message.
   However, unlike top-level RFC 822 messages, a message/rfc822 body is
   not required to include a "From", a "Subject", and at least one
   destination header.

   It should be noted that, despite the use of the numbers "822", a
   message/rfc822 entity can include enhanced information as defined in
   this document.  In other words, a message/rfc822 message may be a
   MIME message.

7.3.2.     The Message/Partial subtype

   A subtype of message, "partial", is defined in order to allow large
   objects to be delivered as several separate pieces of mail and
   automatically reassembled by the receiving user agent.  (The concept
   is similar to IP fragmentation/reassembly in the basic Internet
   Protocols.)  This mechanism can be used when intermediate transport
   agents limit the size of individual messages that can be sent.
   Content-Type "message/partial" thus indicates that the body contains
   a fragment of a larger message.

   Three parameters must be specified in the Content-Type field of type
   message/partial: The first, "id", is a unique identifier, as close to
   a world-unique identifier as possible, to be used to match the parts
   together.  (In general, the identifier is essentially a message-id;
   if placed in double quotes, it can be any message-id, in accordance
   with the BNF for "parameter" given earlier in this specification.)
   The second, "number", an integer, is the part number, which indicates
   where this part fits into the sequence of fragments.  The third,
   "total", another integer, is the total number of parts.  This third
   parameter is required on the final part, and is optional (though
   encouraged) on the earlier parts.  Note also that these parameters
   may be given in any order.

   Thus, part 2 of a 3-part message may have either of the following
   header fields:

                Content-Type: Message/Partial;
                     number=2; total=3;
                     id="oc=jpbe0M2Yt4s@thumper.bellcore.com"

                Content-Type: Message/Partial;
                     id="oc=jpbe0M2Yt4s@thumper.bellcore.com";
                     number=2

   But part 3 MUST specify the total number of parts:

                Content-Type: Message/Partial;
                     number=3; total=3;
                     id="oc=jpbe0M2Yt4s@thumper.bellcore.com"

   Note that part numbering begins with 1, not 0.
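
   The parameter rules above can be checked mechanically.  The
   following Python sketch builds a Content-Type field value for one
   fragment; the function name, its arguments, and the choice to
   always quote the id are illustrative assumptions, not requirements
   of this document.

```python
def partial_content_type(msg_id, number, total=None, is_last=False):
    """Build the Content-Type value for one message/partial fragment."""
    if number < 1:
        raise ValueError("part numbering begins with 1")
    if is_last and total is None:
        raise ValueError("the final part must carry the total parameter")
    if total is not None and number > total:
        raise ValueError("part number exceeds total")
    if is_last and number != total:
        raise ValueError("the final part's number must equal total")
    # Quote the id so that any message-id can be carried as a parameter.
    quoted_id = msg_id.replace("\\", "\\\\").replace('"', '\\"')
    params = ['id="%s"' % quoted_id, "number=%d" % number]
    if total is not None:
        params.append("total=%d" % total)  # optional on earlier parts
    return "message/partial; " + "; ".join(params)
```

   Because the parameters may be given in any order, the order chosen
   here (id, number, total) is arbitrary.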

   When the parts of a message broken up in this manner are put
   together, the result is a complete MIME entity, which may have its
   own Content-Type header field, and thus may contain any other data
   type.

   Message fragmentation and reassembly: The semantics of a reassembled
   partial message must be those of the "inner" message, rather than of
   a message containing the inner message.  This makes it possible, for
   example, to send a large audio message as several partial messages,
   and still have it appear to the recipient as a simple audio message
   rather than as an encapsulated message containing an audio message.
   That is, the encapsulation of the message is considered to be
   "transparent".

   When generating and reassembling the parts of a message/partial
   message, the headers of the encapsulated message must be merged with
   the headers of the enclosing entities.  In this process the following
   rules must be observed:

      (1) All of the header fields from the initial enclosing entity
      (part one), except those that start with "Content-" and the
      specific header fields "Message-ID", "Encrypted", and "MIME-
      Version", must be copied, in order, to the new message.

      (2) Only those header fields in the enclosed message which start
      with "Content-", together with the specific header fields
      "Message-ID", "Encrypted", and "MIME-Version", must be appended,
      in order, to the header fields of the new message.  All other
      header fields in the enclosed message will be ignored.

      (3) All of the header fields from the second and any subsequent
      messages will be ignored.

   For example, if an audio message is broken into two parts, the first
   part might look something like this:

      X-Weird-Header-1: Foo
      From: Bill@host.com
      To: joe@otherhost.com
      Subject: Audio mail
      Message-ID: <id1@host.com>
      MIME-Version: 1.0
      Content-type: message/partial;
           id="ABC@host.com";
           number=1; total=2

      X-Weird-Header-1: Bar
      X-Weird-Header-2: Hello
      Message-ID: <anotherid@foo.com>
      MIME-Version: 1.0
      Content-type: audio/basic
      Content-transfer-encoding: base64

         ... first half of encoded audio data goes here...

   and the second half might look something like this:

      From: Bill@host.com
      To: joe@otherhost.com
      Subject: Audio mail
      MIME-Version: 1.0
      Message-ID: <id2@host.com>
      Content-type: message/partial;
           id="ABC@host.com"; number=2; total=2

         ... second half of encoded audio data goes here...

   Then, when the fragmented message is reassembled, the resulting
   message to be displayed to the user should look something like this:

      X-Weird-Header-1: Foo
      From: Bill@host.com
      To: joe@otherhost.com
      Subject: Audio mail
      Message-ID: <anotherid@foo.com>
      MIME-Version: 1.0
      Content-type: audio/basic
      Content-transfer-encoding: base64

         ... first half of encoded audio data goes here...
         ... second half of encoded audio data goes here...
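
   Rules (1) through (3) above can be sketched in Python using the
   standard library "email" package.  This is a minimal illustration
   under stated assumptions: fragments arrive as complete strings, the
   function and helper names are hypothetical, and no check is made
   that part numbers are contiguous.

```python
from email.message import Message
from email.parser import HeaderParser

# Fields taken from the enclosed message rather than the enclosing one.
_INNER_FIELDS = {"message-id", "encrypted", "mime-version"}


def _from_enclosed(name):
    lowered = name.lower()
    return lowered.startswith("content-") or lowered in _INNER_FIELDS


def reassemble(fragments):
    """Merge raw message/partial fragments (strings) into one Message."""
    parser = HeaderParser()  # parses the header only; the body stays a string
    parts = [parser.parsestr(raw) for raw in fragments]
    for part in parts:
        if part.get_content_type() != "message/partial":
            raise ValueError("not a message/partial fragment")
    parts.sort(key=lambda part: int(part.get_param("number")))
    total = parts[-1].get_param("total")
    if total is None or int(total) != len(parts):
        raise ValueError("fragment set is incomplete")
    if len({part.get_param("id") for part in parts}) != 1:
        raise ValueError("fragments belong to different messages")

    first = parts[0]
    enclosed = parser.parsestr(first.get_payload())
    merged = Message()
    for name, value in first.items():      # rule (1)
        if not _from_enclosed(name):
            merged[name] = value
    for name, value in enclosed.items():   # rule (2)
        if _from_enclosed(name):
            merged[name] = value
    # Rule (3): later fragments contribute their bodies only.
    body = enclosed.get_payload() + "".join(p.get_payload() for p in parts[1:])
    merged.set_payload(body)
    return merged
```

   Applied to the two fragments shown above, this sketch keeps
   "X-Weird-Header-1: Foo" from the enclosing header and takes the
   Message-ID, MIME-Version, and Content- fields from the enclosed
   header.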

   Note on encoding of MIME entities encapsulated inside message/partial
   entities: Because data of type "message" may never be encoded in
   base64 or quoted-printable, a problem might arise if message/partial
   entities are constructed in an environment that supports binary or
   8-bit transport.  The problem is that the binary data would be split
   into multiple message/partial objects, each of them requiring binary
   transport.  If such objects were encountered at a gateway into a 7-
   bit transport environment, there would be no way to properly encode
   them for the 7-bit world, aside from waiting for all of the parts,
   reassembling the message, and then encoding the reassembled data in
   base64 or quoted-printable.  Since it is possible that different
   parts might go through different gateways, even this is not an
   acceptable solution.  For this reason, it is specified that MIME
   entities of type message/partial must always have a content-
   transfer-encoding of "7bit" (the default).  In particular, even in
   environments that support binary or 8-bit transport, the use of a
   content-transfer-encoding of "8bit" or "binary" is explicitly
   prohibited for entities of type message/partial.

   It should be noted that, because some message transfer agents may
   choose to automatically fragment large messages, and because such
   agents may use different fragmentation thresholds, it is possible
   that the pieces of a partial message, upon reassembly, may prove
   themselves to comprise a partial message.  This is explicitly
   permitted.

   It should also be noted that the inclusion of a "References" field in
   the headers of the second and subsequent pieces of a fragmented
   message that references the Message-Id on the previous piece may be
   of benefit to mail readers that understand and track references.
   However, the generation of such "References" fields is entirely
   optional.

   Finally, it should be noted that the "Encrypted" header field has
   been made obsolete by Privacy Enhanced Messaging (PEM), but the rules
   above are believed to describe the correct way to treat it if it is
   encountered in the context of conversion to and from message/partial
   fragments.

7.3.3.     The Message/External-Body subtype

   The external-body subtype indicates that the actual body data are not
   included, but merely referenced.  In this case, the parameters
   describe a mechanism for accessing the external data.

   When an entity is of type "message/external-body", it consists of a
   header, two consecutive CRLFs, and the message header for the
   encapsulated message.  If another pair of consecutive CRLFs appears,
   this of course ends the message header for the encapsulated message.
   However, since the encapsulated message's body is itself external, it
   does NOT appear in the area that follows.  For example, consider the
   following message:

      Content-type: message/external-body;
           access-type=local-file;
           name="/u/nsb/Me.gif"

      Content-type:  image/gif
      Content-ID: <id42@guppylake.bellcore.com>
      Content-Transfer-Encoding: binary

      THIS IS NOT REALLY THE BODY!

   The area at the end, which might be called the "phantom body", is
   ignored for most external-body messages.  However, it may be used to
   contain auxiliary information for some such messages.  Of the
   access-types defined by this document, only "mail-server" uses the
   phantom body; for all other access-types it is ignored.
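
   To make the three-layer structure concrete, the following Python
   sketch splits a message/external-body entity into its own header,
   the encapsulated header, and the phantom body.  It is illustrative
   only; EXAMPLE and split_external_body are hypothetical names, and
   the standard library HeaderParser is used so that each body is kept
   as an unparsed string.

```python
from email.parser import HeaderParser

# The example entity from above, with the header field written on
# properly folded continuation lines.
EXAMPLE = (
    "Content-type: message/external-body;\n"
    "     access-type=local-file;\n"
    '     name="/u/nsb/Me.gif"\n'
    "\n"
    "Content-type:  image/gif\n"
    "Content-ID: <id42@guppylake.bellcore.com>\n"
    "Content-Transfer-Encoding: binary\n"
    "\n"
    "THIS IS NOT REALLY THE BODY!\n"
)


def split_external_body(raw):
    """Return (outer header, encapsulated header, phantom body text)."""
    parser = HeaderParser()
    outer = parser.parsestr(raw)
    if outer.get_content_type() != "message/external-body":
        raise ValueError("not a message/external-body entity")
    if outer.get_param("access-type") is None:
        raise ValueError("access-type is mandatory")
    # Everything after the first blank line is the encapsulated header,
    # optionally followed by a second blank line and the phantom body.
    inner = parser.parsestr(outer.get_payload())
    if inner["Content-ID"] is None:
        raise ValueError("encapsulated header lacks a Content-ID field")
    return outer, inner, inner.get_payload()
```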

   The only always-mandatory parameter for message/external-body is
   "access-type"; all of the other parameters may be mandatory or
   optional depending on the value of access-type.

      ACCESS-TYPE -- A case-insensitive word, indicating the supported
      access mechanism by which the file or data may be obtained.
      Values include, but are not limited to, "FTP", "ANON-FTP", "TFTP",
      "AFS", "LOCAL-FILE", and "MAIL-SERVER".  Future values, except for
      experimental values beginning with "X-", must be registered with
      IANA, as described in Appendix E.

   In addition, the following three parameters are optional for ALL
   access-types:

      EXPIRATION -- The date (in the RFC 822 "date-time" syntax, as
      extended by RFC 1123 to permit 4 digits in the year field) after
      which the existence of the external data is not guaranteed.

      SIZE -- The size (in octets) of the data.  The intent of this
      parameter is to help the recipient decide whether or not to expend
      the necessary resources to retrieve the external data.  Note that
      this describes the size of the data in its canonical form, that
      is, before any Content-Transfer-Encoding has been applied or
      after the data have been decoded.

      PERMISSION -- A case-insensitive field that indicates whether or
      not it is expected that clients might also attempt to overwrite
      the data.  By default, or if permission is "read", the assumption
      is that they will not, and that if the data are retrieved once,
      they are never needed again.  If PERMISSION is "read-write", this
      assumption is invalid, and any local copy must be considered no
      more than a cache.  "Read" and "Read-write" are the only defined
      values of permission.
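
   As a hedged sketch of how a recipient might interpret the three
   optional parameters, the Python function below takes a dictionary
   of already-unquoted parameter values with lower-case names.  The
   function name, the dictionary layout, and the "cache_only" key are
   assumptions made for illustration.

```python
from email.utils import parsedate_to_datetime


def common_external_params(params):
    """Interpret EXPIRATION, SIZE, and PERMISSION from a parameter dict."""
    expiration = params.get("expiration")
    size = params.get("size")
    # PERMISSION is case-insensitive and defaults to "read".
    permission = params.get("permission", "read").lower()
    if permission not in ("read", "read-write"):
        raise ValueError("undefined permission value: %r" % permission)
    return {
        # RFC 822 date-time, parsed into a datetime object.
        "expiration": parsedate_to_datetime(expiration) if expiration else None,
        # Size in octets of the data in canonical (decoded) form.
        "size": int(size) if size is not None else None,
        # With "read-write", a local copy is no more than a cache.
        "cache_only": permission == "read-write",
    }
```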

   The precise semantics of the access-types defined here are described
   in the sections that follow.

   The encapsulated headers in ALL message/external-body entities MUST
   include a Content-ID header field to give a unique identifier by
   which to reference the data.  This identifier may be used for
   caching mechanisms, and for recognizing the receipt of the data when
   the access-type is "mail-server".

   Note that, as specified here, the tokens that describe external-body
   data, such as file names and mail server commands, are required to be
   in the US-ASCII character set.  If this proves problematic in
   practice, a new mechanism may be required as a future extension to
   MIME, either as newly defined access-types for message/external-body
   or by some other mechanism.

   As with message/partial, it is specified that MIME entities of type
   message/external-body must always have a content-transfer-encoding of
   "7bit" (the default).  In particular, even in environments that
   support binary or 8-bit transport, the use of a content-transfer-
   encoding of "8bit" or "binary" is explicitly prohibited for entities
   of type message/external-body.

7.3.3.1.  The "ftp" and "tftp" access-types

   An access-type of FTP or TFTP indicates that the message body is
   accessible as a file using the FTP [RFC-959] or TFTP [RFC-783]
   protocols, respectively.  For these access-types, the following
   additional parameters are mandatory:

      NAME -- The name of the file that contains the actual body data.

      SITE -- A machine from which the file may be obtained, using the
      given protocol. This must be a fully qualified domain name, not a
      nickname.

   Before any data are retrieved, using FTP, the user will generally
   need to be asked to provide a login id and a password for the machine
   named by the site parameter.  For security reasons, such an id and
   password are not specified as content-type parameters, but must be
   obtained from the user.

   In addition, the following parameters are optional:

      DIRECTORY -- A directory from which the data named by NAME should
      be retrieved.

      MODE -- A case-insensitive string indicating the mode to be used
      when retrieving the information.  The legal values for access-type
      "TFTP" are "NETASCII", "OCTET", and "MAIL", as specified by the
      TFTP protocol [RFC-783].  The legal values for access-type "FTP"
      are "ASCII", "EBCDIC", "IMAGE", and "LOCALn" where "n" is a
      decimal integer, typically 8.  These correspond to the
      representation types "A" "E" "I" and "L n" as specified by the FTP
      protocol [RFC-959].  Note that "BINARY" and "TENEX" are not valid
      values for MODE, but that "OCTET" or "IMAGE" or "LOCAL8" should be
      used instead.  If MODE is not specified, the default value is
      "NETASCII" for TFTP and "ASCII" otherwise.

7.3.3.2.  The "anon-ftp" access-type

   The "anon-ftp" access-type is identical to the "ftp" access-type,
   except that the user need not be asked to provide a name and password
   for the specified site.  Instead, the ftp protocol will be used with
   login "anonymous" and a password that corresponds to the user's email
   address.
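
   The MODE values defined for the "ftp", "anon-ftp", and "tftp"
   access-types, together with their defaults, can be mapped to
   protocol-level settings as in the Python sketch below.  The
   function name and its return conventions (an FTP TYPE command
   string, or a lower-case TFTP mode name) are assumptions made for
   illustration.

```python
import re

_FTP_TYPES = {"ascii": "TYPE A", "ebcdic": "TYPE E", "image": "TYPE I"}
_TFTP_MODES = {"netascii", "octet", "mail"}


def transfer_mode(access_type, mode=None):
    """Map a MODE parameter to an FTP TYPE command or a TFTP mode name."""
    if access_type.lower() == "tftp":
        chosen = (mode or "NETASCII").lower()   # default for TFTP
        if chosen not in _TFTP_MODES:
            raise ValueError("not a legal TFTP mode: %r" % mode)
        return chosen
    chosen = (mode or "ASCII").lower()          # default otherwise
    if chosen in _FTP_TYPES:
        return _FTP_TYPES[chosen]
    local = re.fullmatch(r"local(\d+)", chosen)
    if local:
        # "LOCALn" corresponds to representation type "L n".
        return "TYPE L %s" % local.group(1)
    # "BINARY" and "TENEX" are deliberately rejected here.
    raise ValueError("not a legal FTP mode: %r" % mode)
```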

7.3.3.3.  The "local-file" and "afs" access-types

   An access-type of "local-file" indicates that the actual body is
   accessible as a file on the local machine.  An access-type of "afs"
   indicates that the file is accessible via the global AFS file system.
   In both cases, only a single parameter is required:

      NAME -- The name of the file that contains the actual body data.

   The following optional parameter may be used to describe the locality
   of reference for the data, that is, the site or sites at which the
   file is expected to be visible:

      SITE -- A domain specifier for a machine or set of machines that
      are known to have access to the data file.  Asterisks may be used
      for wildcard matching to a part of a domain name, such as
      "*.bellcore.com", to indicate a set of machines on which the data
      should be directly visible, while a single asterisk may be used to
      indicate a file that is expected to be universally available,
      e.g., via a global file system.
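
   A recipient could test whether its own host falls within a SITE
   value along the following lines.  This Python sketch assumes that
   an asterisk matches any run of characters (including dots) and that
   comparison is case-insensitive; this document does not spell out
   either detail, and the function name is hypothetical.

```python
import re


def site_matches(site_pattern, hostname):
    """Return True if hostname falls within the SITE wildcard pattern."""
    # Escape everything except "*", which becomes "match anything".
    pieces = (re.escape(piece) for piece in site_pattern.lower().split("*"))
    return re.fullmatch(".*".join(pieces), hostname.lower()) is not None
```

   Under these assumptions, "*.bellcore.com" matches
   "thumper.bellcore.com" but not "bellcore.com" itself, and a single
   asterisk matches every host name.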

7.3.3.4.  The "mail-server" access-type

   The "mail-server" access-type indicates that the actual body is
   available from a mail server.  The mandatory parameter for this
   access-type is:

      SERVER -- The email address of the mail server from which the
      actual body data can be obtained.

   Because mail servers accept a variety of syntaxes, some of which are
   multiline, the full command to be sent to a mail server is not
   included as a parameter on the content-type line.  Instead, it is
   provided as the "phantom body" when the content-type is
   message/external-body and the access-type is mail-server.

   An optional parameter for this access-type is:

      SUBJECT -- The subject that is to be used in the mail that is sent
      to obtain the data. Note that keying mail servers on Subject lines
      is NOT recommended, but such mail servers are known to exist.

   Note that MIME does not define a mail server syntax.  Rather, it
   allows the inclusion of arbitrary mail server commands in the phantom
   body.  Implementations must include the phantom body in the body of
   the message they send to the mail server address to retrieve the
   relevant data.

   It is worth noting that, unlike other access-types, mail-server
   access is asynchronous and will happen at an unpredictable time in
   the future.  For this reason, it is important that there be a
   mechanism by which the returned data can be matched up with the
   original message/external-body entity.  MIME mail servers must use the
   same Content-ID field on the returned message that was used in the
   original message/external-body entity, to facilitate such matching.
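
   A minimal Python sketch of the retrieval request and of the later
   matching step follows.  The function names and the "requester"
   argument are hypothetical; actually sending the request and
   receiving the reply are outside the scope of the sketch.

```python
from email.message import Message


def mail_server_request(server, phantom_body, requester, subject=None):
    """Build the message sent to a mail server to fetch an external body."""
    request = Message()
    request["From"] = requester
    request["To"] = server               # the SERVER parameter
    if subject is not None:
        request["Subject"] = subject     # the optional SUBJECT parameter
    # The phantom body carries the mail server commands verbatim.
    request.set_payload(phantom_body)
    return request


def matches_request(returned, content_id):
    """True if a returned message carries the Content-ID that was requested."""
    return (returned["Content-ID"] or "").strip() == content_id.strip()
```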

7.3.3.5.  Examples and Further Explanations

   With the emerging possibility of very wide-area file systems, it
   becomes very hard to know in advance the set of machines where a file
   will and will not be accessible directly from the file system.
   Therefore it may make sense to provide both a file name, to be tried
   directly, and the name of one or more sites from which the file is
   known to be accessible.  An implementation can try to retrieve remote
   files using FTP or any other protocol, using anonymous file retrieval
   or prompting the user for the necessary name and password.  If an
   external body is accessible via multiple mechanisms, the sender may
   include multiple parts of type message/external-body within an entity
   of type multipart/alternative.

   However, the external-body mechanism is not intended to be limited to
   file retrieval, as shown by the mail-server access-type.  Beyond
   this, one can imagine, for example, using a video server for external
   references to video clips.

   If an entity is of type "message/external-body", then the body of the
   entity will contain the header fields of the encapsulated message.
   The body itself is to be found in the external location.  This means
   that if the body of the "message/external-body" message contains two
   consecutive CRLFs, everything after that pair is NOT part of the
   message itself.  For most message/external-body messages, this
   trailing area must simply be ignored.  However, it is a convenient
   place for additional data that cannot be included in the content-type
   header field.  In particular, if the "access-type" value is "mail-
   server", then the trailing area must contain commands to be sent to
   the mail server at the address given by the value of the SERVER
   parameter.

   The embedded message header fields which appear in the body of the
   message/external-body data must be used to declare the Content-type
   of the external body if it is anything other than plain ASCII text,
   since the external body does not have a header section to declare its
   type.  Similarly, any Content-transfer-encoding other than "7bit"
   must also be declared here.  Thus a complete message/external-body
   message, referring to a document in PostScript format, might look
   like this:

      From: Whomever
      To: Someone
      Subject: whatever
      MIME-Version: 1.0
      Message-ID: <id1@host.com>
      Content-Type: multipart/alternative; boundary=42
      Content-ID: <id001@guppylake.bellcore.com>

      --42
      Content-Type: message/external-body;
           name="BodyFormats.ps";
           site="thumper.bellcore.com";
           access-type=ANON-FTP;
           directory="pub";
           mode="image";
           expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"

      Content-type: application/postscript
      Content-ID: <id42@guppylake.bellcore.com>

      --42
      Content-Type: message/external-body;
           name="/u/nsb/writing/rfcs/RFC-MIME.ps";
           site="thumper.bellcore.com";
           access-type=AFS;
           expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"

      Content-type: application/postscript
      Content-ID: <id42@guppylake.bellcore.com>

      --42
      Content-Type: message/external-body;
           access-type=mail-server;
           server="listserv@bogus.bitnet";
           expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"

      Content-type: application/postscript
      Content-ID: <id42@guppylake.bellcore.com>

      get RFC-MIME.DOC

      --42--

   Note that in the above examples, the default Content-transfer-
   encoding of "7bit" is assumed for the external PostScript data.

   Like the message/partial type, the message/external-body type is
   intended to be transparent, that is, to convey the data type in the
   external body rather than to convey a message with a body of that
   type.  Thus the headers on the outer and inner parts must be merged
   using the same rules as for message/partial.  In particular, this
   means that the Content-type header is overridden, but the From and
   Subject headers are preserved.

   Note that since the external bodies are not transported as mail, they
   need not conform to the 7-bit and line length requirements, but might
   in fact be binary files.  Thus a Content-Transfer-Encoding is not
   generally necessary, though it is permitted.

   Note that the body of a message of type "message/external-body" is
   governed by the basic syntax for an RFC 822 message.  In particular,
   anything before the first consecutive pair of CRLFs is header
   information, while anything after it is body information, which is
   ignored for most access-types.

   The formal grammar for content-type header fields for data of type
   message is given by:

   message-type := "message" "/" message-subtype

   message-subtype := "rfc822"
                   / "partial" 2*3partial-param
                   / "external-body" 1*external-param
                   / extension-token

   partial-param :=     (";" "id" "=" value)
              /  (";" "number" "=" 1*DIGIT)
              /  (";" "total" "=" 1*DIGIT)
         ; id & number required; total required for last part

   external-param :=   (";" "access-type" "=" atype)
              / (";" "expiration" "=" date-time)
                   ; Note that date-time is quoted
              / (";" "size" "=" 1*DIGIT)
              / (";" "permission" "=" ("read" / "read-write"))
                   ; Permission is case-insensitive
              / (";" "name" "="  value)
              / (";" "site" "=" value)
              / (";" "directory" "=" value)
              / (";" "mode" "=" value)
              / (";" "server" "=" value)
              / (";" "subject" "=" value)
          ; access-type required; others required based on access-type

   atype := "ftp" / "anon-ftp" / "tftp" / "local-file"
                  / "afs" / "mail-server" / extension-token
                  ; Case-insensitive

7.4.  The Application Content-Type

   The "application" Content-Type is to be used for data which do not
   fit in any of the other categories, and particularly for data to be
   processed by mail-based uses of application programs.  This is
   information which must be processed by an application before it is
   viewable or usable to a user.  Expected uses for Content-Type
   application include mail-based file transfer, spreadsheets, data for
   mail-based scheduling systems, and languages for "active"
   (computational) email.  (The latter, in particular, can pose security
   problems which must be understood by implementors, and are considered
   in detail in the discussion of the application/PostScript content-
   type.)

   For example, a meeting scheduler might define a standard
   representation for information about proposed meeting dates.  An
   intelligent user agent would use this information to conduct a dialog
   with the user, and might then send further mail based on that dialog.
   More generally, there have been several "active" messaging languages
   developed in which programs in a suitably specialized language are
   sent through the mail and automatically run in the recipient's
   environment.

   Such applications may be defined as subtypes of the "application"
   Content-Type.  This document defines two subtypes: octet-stream and
   PostScript.

   In general, the subtype of application will often be the name of the
   application for which the data are intended.  This does not mean,
   however, that any application program name may be used freely as a
   subtype of application.  Such usages (other than subtypes beginning
   with "x-") must be registered with IANA, as described in Appendix E.

7.4.1.     The Application/Octet-Stream (primary) subtype

   The primary subtype of application, "octet-stream", may be used to
   indicate that a body contains binary data.  The set of possible
   parameters includes, but is not limited to:

      TYPE -- the general type or category of binary data.  This is
      intended as information for the human recipient rather than for
      any automatic processing.

      PADDING -- the number of bits of padding that were appended to the
      bit-stream comprising the actual contents to produce the enclosed
      byte-oriented data.  This is useful for enclosing a bit-stream in
      a body when the total number of bits is not a multiple of the byte
      size.
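
   The effect of the PADDING parameter can be shown with a short
   Python sketch that recovers the original bit-stream from the
   byte-oriented data.  It assumes the bits of each octet are ordered
   most-significant first, which this document does not specify; the
   function name is hypothetical.

```python
def unpad_bits(data, padding=0):
    """Return the bit-stream, as a string of '0' and '1', without padding."""
    if padding < 0 or padding > 8 * len(data):
        raise ValueError("padding out of range for this data")
    # Assumption: most-significant bit of each octet comes first.
    bits = "".join(format(octet, "08b") for octet in data)
    # The padding bits were appended, so they are removed from the end.
    return bits[:len(bits) - padding]
```

   For instance, a 3-bit stream carried in one octet would be labeled
   with a PADDING value of 5.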

   An additional parameter, "conversions", was defined in [RFC-1341] but
   has been removed.

   RFC 1341 also defined the use of a "NAME" parameter which gave a
   suggested file name to be used if the data were to be written to a
   file.  This has been deprecated in anticipation of a separate
   Content-Disposition header field, to be defined in a subsequent RFC.

   The recommended action for an implementation that receives
   application/octet-stream mail is to simply offer to put the data in a
   file, with any Content-Transfer-Encoding undone, or perhaps to use it
   as input to a user-specified process.

   To reduce the danger of transmitting rogue programs through the mail,
   it is strongly recommended that implementations NOT implement a
   path-search mechanism whereby an arbitrary program named in the
   Content-Type parameter (e.g., an "interpreter=" parameter) is found
   and executed using the mail body as input.

7.4.2.     The Application/PostScript subtype

   A Content-Type of "application/postscript" indicates a PostScript
   program.  Currently two variants of the PostScript language are
   allowed; the original level 1 variant is described in [POSTSCRIPT]
   and the more recent level 2 variant is described in [POSTSCRIPT2].

   PostScript is a registered trademark of Adobe Systems, Inc.  Use of
   the MIME content-type "application/postscript" implies recognition of
   that trademark and all the rights it entails.

   The PostScript language definition provides facilities for internal
   labeling of the specific language features a given program uses. This
   labeling, called the PostScript document structuring conventions, is
   very general and provides substantially more information than just
   the language level.

   The use of document structuring conventions, while not required, is
   strongly recommended as an aid to interoperability.  Documents which
   lack proper structuring conventions cannot be tested to see whether
   or not they will work in a given environment.  As such, some systems
   may assume the worst and refuse to process unstructured documents.

   The execution of general-purpose PostScript interpreters entails
   serious security risks, and implementors are discouraged from simply
   sending PostScript email bodies to "off-the-shelf" interpreters.
   While it is usually safe to send PostScript to a printer, where the
   potential for harm is greatly constrained, implementors should
   consider all of the following before they add interactive display of
   PostScript bodies to their mail readers.

   The remainder of this section outlines some, though probably not all,
   of the possible problems with sending PostScript through the mail.

   Dangerous operations in the PostScript language include, but may not
   be limited to, the PostScript operators deletefile, renamefile,
   filenameforall, and file.  File is only dangerous when applied to
   something other than standard input or output. Implementations may
   also define additional nonstandard file operators; these may also
   pose a threat to security.  Filenameforall, the wildcard file search
   operator, may appear at first glance to be harmless. Note, however,
   that this operator has the potential to reveal information about what
   files the recipient has access to, and this information may itself be
   sensitive.  Message senders should avoid the use of potentially
   dangerous file operators, since these operators are quite likely to
   be unavailable in secure PostScript implementations.  Message-
   receiving and -displaying software should either completely disable
   all potentially dangerous file operators or take special care not to
   delegate any special authority to their operation. When a PostScript
   document is being interpreted, any action these operators take should
   be treated as performed by an outside agency.  Such disabling and/or
   checking should be done completely outside of the reach of the
   PostScript language itself; care should be taken to ensure that no
   method exists for re-enabling full-function versions of these
   operators.

   The PostScript language provides facilities for exiting the normal
   interpreter, or server, loop. Changes made in this "outer"
   environment are customarily retained across documents, and may in
   some cases be retained semipermanently in nonvolatile memory. The
   operators associated with exiting the interpreter loop have the
   potential to interfere with subsequent document processing. As such,
   their unrestrained use constitutes a threat of service denial.
   PostScript operators that exit the interpreter loop include, but may
   not be limited to, the exitserver and startjob operators.  Message-
   sending software should not generate PostScript that depends on
   exiting the interpreter loop to operate. The ability to exit will
   probably be unavailable in secure PostScript implementations.
   Message-receiving and -displaying software should, if possible,
   disable the ability to make retained changes to the PostScript
   environment, and eliminate the startjob and exitserver commands.  If
   these commands cannot be eliminated, the password associated with
   them should at least be set to a hard-to-guess value.

   PostScript provides operators for setting system-wide and device-
   specific parameters. These parameter settings may be retained across
   jobs and may potentially pose a threat to the correct operation of
   the interpreter.  The PostScript operators that set system and device
   parameters include, but may not be limited to, the setsystemparams
   and setdevparams operators.  Message-sending software should not
   generate PostScript that depends on the setting of system or device
   parameters to operate correctly. The ability to set these parameters
   will probably be unavailable in secure PostScript implementations.
   Message-receiving and -displaying software should, if possible,
   disable the ability to change system and device parameters.  If these
   operators cannot be disabled, the password associated with them
   should at least be set to a hard-to-guess value.
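
   Message-sending software could check its own output for the
   operators named above before mailing it.  The following is a rough
   sketch under stated assumptions: the tokenizer is simplified (it
   strips "%" comments and splits on PostScript delimiters, so an
   operator name inside a string literal is reported too), and the name
   find_risky_operators is hypothetical.  A scan of this kind is only a
   courtesy check for senders; it is not a substitute for disabling the
   operators inside the receiving interpreter.

```python
import re

# Operators named in this section as potentially dangerous.
RISKY_OPERATORS = {
    "deletefile", "renamefile", "filenameforall", "file",
    "exitserver", "startjob",
    "setsystemparams", "setdevparams",
}

_COMMENT = re.compile(r"%[^\n]*")
# Whitespace and the PostScript delimiter characters separate tokens.
_DELIMITERS = re.compile(r"[\s()<>\[\]{}/]+")


def find_risky_operators(ps_text):
    """Return the sorted list of risky operator names found in ps_text."""
    code = _COMMENT.sub("", ps_text)
    tokens = set(_DELIMITERS.split(code))
    return sorted(tokens & RISKY_OPERATORS)


if __name__ == "__main__":
    sample = (
        "%!PS\n"
        "% file in a comment is ignored\n"
        "(/tmp/x) deletefile\n"
        "currentfile closefile\n"
        "showpage\n"
    )
    print(find_risky_operators(sample))
```

   Whole tokens are compared, so names such as currentfile are not
   mistaken for the file operator.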

   Some PostScript implementations provide nonstandard facilities for
   the direct loading and execution of machine code.  Such facilities
   are quite obviously open to substantial abuse.  Message-sending
   software should not make use of such features. Besides being totally
   hardware-specific, they are also likely to be unavailable in secure
   implementations of PostScript.  Message-receiving and -displaying
   software should not allow such operators to be used if they exist.

   PostScript is an extensible language, and many, if not most,
   implementations of it provide a number of their own extensions. This
   document does not deal with such extensions explicitly since they
   constitute an unknown factor.  Message-sending software should not
   make use of nonstandard extensions; they are likely to be missing
   from some implementations. Message-receiving and -displaying software
   should make sure that any nonstandard PostScript operators are secure
   and don't present any kind of threat.

   It is possible to write PostScript that consumes huge amounts of
   various system resources. It is also possible to write PostScript
   programs that loop infinitely.  Both types of programs have the
   potential to cause damage if sent to unsuspecting recipients.
   Message-sending software should avoid the construction and
   dissemination of such programs, which is antisocial.  Message-
   receiving and -displaying software should provide appropriate
   mechanisms to abort processing of a document after a reasonable
   amount of time has elapsed. In addition, PostScript interpreters
   should be limited to the consumption of only a reasonable amount of
   any given system resource.
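
   One way to abort processing after a reasonable amount of time is to
   run the interpreter as a child process with a deadline.  The sketch
   below is only an illustration: the function name run_with_time_limit
   is hypothetical, the command to run is supplied by the caller, and
   limits on memory or other resources would need operating system
   facilities that are not shown here.

```python
import subprocess
import sys


def run_with_time_limit(command, document, timeout_s):
    """Feed document (bytes) to command on stdin.

    Returns True if the command finished within timeout_s seconds and
    False if it had to be stopped.
    """
    try:
        subprocess.run(
            command,
            input=document,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return False
    return True


if __name__ == "__main__":
    # A stand-in for an interpreter stuck in an infinite loop.
    looping = [sys.executable, "-c", "while True: pass"]
    print(run_with_time_limit(looping, b"", 0.5))
```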

   Finally, bugs may exist in some PostScript interpreters which could
   possibly be exploited to gain unauthorized access to a recipient's
   system.  Beyond noting this possibility, there is no specific action
   to take to prevent it, other than the timely correction of such bugs
   if any are found.

7.4.3.     Other Application subtypes

   It is expected that many other subtypes of application will be
   defined in the future.  MIME implementations must generally treat any
   unrecognized subtypes as being equivalent to application/octet-
   stream.

   The formal grammar for content-type header fields for application
   data is given by:

   application-type :=  "application" "/" application-subtype

   application-subtype := ("octet-stream" *stream-param)
                       / "postscript" / extension-token

   stream-param :=  (";" "type" "=" value)
                       / (";" "padding" "=" padding)

   padding := "0" / "1" /  "2" /  "3" / "4" / "5" / "6" / "7"
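
   The grammar and the fallback rule above can be illustrated with a
   short sketch.  It relies on Python's standard email.message.Message
   for parameter parsing; the function name classify_application and
   its return shape are assumptions made for the example.

```python
from email.message import Message

# Subtypes of application defined in this document.
KNOWN_SUBTYPES = {"octet-stream", "postscript"}


def classify_application(content_type_value):
    """Return (effective_subtype, padding) for an application type.

    Unrecognized subtypes are treated as octet-stream.  padding is an
    int from 0 to 7, or None when the parameter is absent.
    """
    msg = Message()
    msg["Content-Type"] = content_type_value
    if msg.get_content_maintype() != "application":
        raise ValueError("not an application content-type")
    subtype = msg.get_content_subtype()  # already lowercased
    if subtype not in KNOWN_SUBTYPES:
        subtype = "octet-stream"
    padding = msg.get_param("padding")
    if padding is None:
        return subtype, None
    if padding not in set("01234567"):
        raise ValueError("padding must be a single digit from 0 to 7")
    return subtype, int(padding)


if __name__ == "__main__":
    print(classify_application("application/octet-stream; type=tar; padding=3"))
    print(classify_application("application/x-unknown"))
```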

7.5.  The Image Content-Type

   A Content-Type of "image" indicates that the body contains an image.
   The subtype names the specific image format.  These names are case
   insensitive.  Two initial subtypes are "jpeg" for the JPEG format,
   JFIF encoding, and "gif" for GIF format [GIF].

   The list of image subtypes given here is neither exclusive nor
   exhaustive, and is expected to grow as more types are registered with
   IANA, as described in Appendix E.

   The formal grammar for the content-type header field for data of type
   image is given by:

   image-type := "image" "/" ("gif" / "jpeg" / extension-token)

7.6.  The Audio Content-Type

   A Content-Type of "audio" indicates that the body contains audio
   data.  Although there is not yet a consensus on an "ideal" audio
   format for use with computers, there is a pressing need for a format
   capable of providing interoperable behavior.

   The initial subtype of "basic" is specified to meet this requirement
   by providing an absolutely minimal lowest common denominator audio
   format.  It is expected that richer formats for higher quality and/or
   lower bandwidth audio will be defined by a later document.

   The content of the "audio/basic" subtype is audio encoded using 8-bit
   ISDN mu-law [PCM].  When this subtype is present, a sample rate of
   8000 Hz and a single channel are assumed.
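
   To make these parameters concrete, the sketch below expands a body
   of this subtype into linear samples and computes its duration.  The
   expansion arithmetic is the commonly used form of mu-law decoding
   and should be checked against [PCM] before being relied upon; the
   function names are hypothetical.

```python
SAMPLE_RATE_HZ = 8000  # assumed for audio/basic
CHANNELS = 1           # single channel


def mulaw_to_linear(code):
    """Expand one 8-bit mu-law code into a signed linear sample."""
    inverted = ~code & 0xFF            # codes are transmitted bit-inverted
    sign = inverted & 0x80
    exponent = (inverted >> 4) & 0x07
    mantissa = inverted & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def decode_audio_basic(body):
    """Return (samples, duration_in_seconds) for a decoded body."""
    samples = [mulaw_to_linear(code) for code in body]
    duration_s = len(body) / (SAMPLE_RATE_HZ * CHANNELS)
    return samples, duration_s


if __name__ == "__main__":
    one_second_of_silence = bytes([0xFF]) * 8000
    samples, duration = decode_audio_basic(one_second_of_silence)
    print(len(samples), duration)
```

   Because each sample occupies one octet, one second of audio in this
   format takes 8000 octets before any Content-Transfer-Encoding is
   applied.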

   The formal grammar for the content-type header field for data of type
   audio is given by:

   audio-type := "audio" "/" ("basic" / extension-token)

7.7.  The Video Content-Type

   A Content-Type of "video" indicates that the body contains a time-
   varying-picture image, possibly with color and coordinated sound.
   The term "video" is used extremely generically, rather than with
   reference to any particular technology or format, and is not meant to
   preclude subtypes such as animated drawings encoded compactly.  The
   subtype "mpeg" refers to video coded according to the MPEG standard
   [MPEG].

   Note that although in general this document strongly discourages the
   mixing of multiple media in a single body, it is recognized that many
   so-called "video" formats include a representation for synchronized
   audio, and this is explicitly permitted for subtypes of "video".

   The formal grammar for the content-type header field for data of type
   video is given by:

   video-type := "video" "/" ("mpeg" / extension-token)

7.8.  Experimental Content-Type Values

   A Content-Type value beginning with the characters "X-" is a private
   value, to be used by consenting mail systems by mutual agreement.
   Any format without a rigorous and public definition must be named
   with an "X-" prefix, and publicly specified values shall never begin
   with "X-".  (Older versions of the widely-used Andrew system use the
   "X-BE2" name, so new systems should probably choose a different
   name.)

   In general, the use of "X-" top-level types is strongly discouraged.
   Implementors should invent subtypes of the existing types whenever
   possible.  The invention of new types is intended to be restricted
   primarily to the development of new media types for email, such as
   digital odors or holography, and not for new data formats in general.
   In many cases, a subtype of application will be more appropriate than
   a new top-level type.
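
   An implementation may want to recognize private values when deciding
   how to treat a body.  The following sketch assumes that the "X-"
   prefix is matched without regard to case, as with other Content-Type
   tokens; the function name uses_private_name is hypothetical.

```python
def uses_private_name(content_type):
    """Return True if the type or the subtype begins with "X-".

    content_type is a "type/subtype" string with parameters removed.
    """
    main_type, _, subtype = content_type.strip().partition("/")
    return any(
        token[:2].lower() == "x-"
        for token in (main_type, subtype)
    )


if __name__ == "__main__":
    for value in ("image/gif", "application/x-foo", "X-BE2/document"):
        print(value, uses_private_name(value))
```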
