8. Summary

Using the MIME-Version, Content-Type, and Content-Transfer-Encoding header fields, it is possible to include, in a standardized way, arbitrary types of data objects with RFC 822 conformant mail messages. No restrictions imposed by either RFC 821 or RFC 822 are violated, and care has been taken to avoid problems caused by additional restrictions imposed by the characteristics of some Internet mail transport mechanisms (see Appendix B). The "multipart" and "message" Content-Types allow mixing and hierarchical structuring of objects of different types in a single message. Further Content-Types provide a standardized mechanism for tagging messages or body parts as audio, image, or several other kinds of data. A distinguished parameter syntax allows further specification of data format details, particularly the specification of alternate character sets. Additional optional header fields provide mechanisms for certain extensions deemed desirable by many implementors. Finally, a number of useful Content-Types are defined for general use by consenting user agents, notably message/partial, and message/external-body.

9. Security Considerations

Security issues are discussed in Section 7.4.2 and in Appendix F. Implementors should pay special attention to the security implications of any mail content-types that can cause the remote execution of any actions in the recipient's environment. In such cases, the discussion of the application/postscript content-type in Section 7.4.2 may serve as a model for considering other content-types with remote execution capabilities.
10. Authors' Addresses For more information, the authors of this document may be contacted via Internet mail: Nathaniel S. Borenstein MRE 2D-296, Bellcore 445 South St. Morristown, NJ 07962-1910 Phone: +1 201 829 4270 Fax: +1 201 829 7019 Email: nsb@bellcore.com Ned Freed Innosoft International, Inc. 250 West First Street Suite 240 Claremont, CA 91711 Phone: +1 909 624 7907 Fax: +1 909 621 5319 Email: ned@innosoft.com MIME is a result of the work of the Internet Engineering Task Force Working Group on Email Extensions. The chairman of that group, Greg Vaudreuil, may be reached at: Gregory M. Vaudreuil Tigon Corporation 17060 Dallas Parkway Dallas Texas, 75248 Phone: +1 214-733-2722 EMail: gvaudre@cnri.reston.va.us
11. Acknowledgements This document is the result of the collective effort of a large number of people, at several IETF meetings, on the IETF-SMTP and IETF-822 mailing lists, and elsewhere. Although any enumeration seems doomed to suffer from egregious omissions, the following are among the many contributors to this effort: Harald Tveit Alvestrand Timo Lehtinen Randall Atkinson John R. MacMillan Philippe Brandon Rick McGowan Kevin Carosso Leo Mclaughlin Uhhyung Choi Goli Montaser-Kohsari Cristian Constantinof Keith Moore Mark Crispin Tom Moore Dave Crocker Erik Naggum Terry Crowley Mark Needleman Walt Daniels John Noerenberg Frank Dawson Mats Ohrman Hitoshi Doi Julian Onions Kevin Donnelly Michael Patton Keith Edwards David J. Pepper Chris Eich Blake C. Ramsdell Johnny Eriksson Luc Rooijakkers Craig Everhart Marshall T. Rose Patrik Faeltstroem Jonathan Rosenberg Erik E. Fair Jan Rynning Roger Fajman Harri Salminen Alain Fontaine Michael Sanderson James M. Galvin Masahiro Sekiguchi Philip Gladstone Mark Sherman Thomas Gordon Keld Simonsen Phill Gross Bob Smart James Hamilton Peter Speck Steve Hardcastle-Kille Henry Spencer David Herron Einar Stefferud Bruce Howard Michael Stein Bill Janssen Klaus Steinberger Olle Jaernefors Peter Svanberg Risto Kankkunen James Thompson Phil Karn Steve Uhler Alan Katz Stuart Vance Tim Kehres Erik van der Poel Neil Katin Guido van Rossum Kyuho Kim Peter Vanderbilt Anders Klemets Greg Vaudreuil John Klensin Ed Vielmetti Valdis Kletniek Ryan Waldron
Jim Knowles Wally Wedel Stev Knowles Sven-Ove Westberg Bob Kummerfeld Brian Wideen Pekka Kytolaakso John Wobus Stellan Lagerstrom Glenn Wright Vincent Lau Rayan Zachariassen Donald Lindsay David Zimmerman Marc Andreessen Bob Braden Brian Capouch Peter Clitherow Dave Collier-Brown John Coonrod Stephen Crocker Jim Davis Axel Deininger Dana S Emery Martin Forssen Stephen Gildea Terry Gray Mark Horton Warner Losh Carlyn Lowery Laurence Lundblade Charles Lynn Larry Masinter Michael J. McInerny Jon Postel Christer Romson Yutaka Sato Markku Savela Richard Alan Schafer Larry W. Virden Rhys Weatherly Jay Weber Dave Wecker The authors apologize for any omissions from this list, which are certainly unintentional.
Appendix A -- Minimal MIME-Conformance

The mechanisms described in this document are open-ended. It is definitely not expected that all implementations will support all of the Content-Types described, nor that they will all share the same extensions. In order to promote interoperability, however, it is useful to define the concept of "MIME-conformance" to define a certain level of implementation that allows the useful interworking of messages with content that differs from US ASCII text. In this section, we specify the requirements for such conformance.

A mail user agent that is MIME-conformant MUST:

   1. Always generate a "MIME-Version: 1.0" header field.

   2. Recognize the Content-Transfer-Encoding header field, and decode all received data encoded with either the quoted-printable or base64 encodings. Encode any data sent that is not in seven-bit mail-ready representation using one of these transformations and include the appropriate Content-Transfer-Encoding header field, unless the underlying transport mechanism supports non-seven-bit data, as SMTP does not.

   3. Recognize and interpret the Content-Type header field, and avoid showing users raw data with a Content-Type field other than text. Be able to send at least text/plain messages, with the character set specified as a parameter if it is not US-ASCII.

   4. Explicitly handle the following Content-Type values, to at least the following extents:

      Text:
         -- Recognize and display "text" mail with the character set "US-ASCII".
         -- Recognize other character sets at least to the extent of being able to inform the user about what character set the message uses.
         -- Recognize the "ISO-8859-*" character sets to the extent of being able to display those characters that are common to ISO-8859-* and US-ASCII, namely all characters represented by octet values 0-127.
         -- For unrecognized subtypes, show or offer to show the user the "raw" version of the data after conversion of the content from canonical form to local form.

      Message:
         -- Recognize and display at least the primary (822) encapsulation.

      Multipart:
         -- Recognize the primary (mixed) subtype. Display all relevant information on the message level and the body part header level and then display or offer to display each of the body parts individually.
         -- Recognize the "alternative" subtype, and avoid showing the user redundant parts of multipart/alternative mail.
         -- Treat any unrecognized subtypes as if they were "mixed".

      Application:
         -- Offer the ability to remove either of the two types of Content-Transfer-Encoding defined in this document and put the resulting information in a user file.

   5. Upon encountering any unrecognized Content-Type, an implementation must treat it as if it had a Content-Type of "application/octet-stream" with no parameter sub-arguments. How such data are handled is up to an implementation, but likely options for handling such unrecognized data include offering to let the user write it into a file (decoded from its mail transport format) or offering to let the user name a program to which the decoded data should be passed as input. Unrecognized predefined types, which in a MIME-conformant mailer might still include audio, image, or video, should also be treated in this way.

A user agent that meets the above conditions is said to be MIME-conformant. The meaning of this phrase is that it is assumed to be "safe" to send virtually any kind of properly-marked data to users of such mail systems, because such systems will at least be able to treat the data as undifferentiated binary, and will not simply splash it onto the screen of unsuspecting users. There is another sense in which it is always "safe" to send data in a format that is MIME-conformant, which is that such data will not break or be broken by any known systems that are conformant with RFC 821 and RFC 822. User agents that are MIME-conformant have the additional guarantee that the user will not be shown data that were never intended to be viewed as text.
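
The fallback rules in items 4 and 5 above can be sketched in a few lines of Python. This is an illustration only, not part of the specification: the function name `effective_type` and the `supported` set are hypothetical, and mapping an unrecognized text subtype to text/plain is an assumed approximation of "show the raw version".

```python
def effective_type(content_type, supported):
    """Return the (type, subtype) a reader should act on, per Appendix A.

    `supported` is a hypothetical set of (type, subtype) pairs that the
    user agent knows how to handle natively.
    """
    # Drop parameters; type and subtype are matched case-insensitively.
    main = content_type.split(";", 1)[0].strip().lower()
    maintype, _, subtype = main.partition("/")
    if (maintype, subtype) in supported:
        return maintype, subtype
    if maintype == "multipart":
        # Unrecognized multipart subtypes are treated as "mixed".
        return "multipart", "mixed"
    if maintype == "text":
        # Assumption: "show the raw version" is approximated as text/plain.
        return "text", "plain"
    # Anything else is handled as undifferentiated binary data.
    return "application", "octet-stream"


SUPPORTED = {
    ("text", "plain"),
    ("multipart", "mixed"),
    ("multipart", "alternative"),
    ("message", "rfc822"),
    ("application", "octet-stream"),
}
print(effective_type("Multipart/Parallel; boundary=x", SUPPORTED))
print(effective_type("image/gif", SUPPORTED))
```

A reader that does not implement image/gif would, under this sketch, offer to save the decoded data rather than display it.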
Appendix B -- General Guidelines For Sending Email Data

Internet email is not a perfect, homogeneous system. Mail may become corrupted at several stages in its travel to a final destination. Specifically, email sent throughout the Internet may travel across many networking technologies. Many networking and mail technologies do not support the full functionality possible in the SMTP transport environment. Mail traversing these systems is likely to be modified in such a way that it can be transported.

There exist many widely-deployed non-conformant MTAs in the Internet. These MTAs, speaking the SMTP protocol, alter messages on the fly to take advantage of the internal data structure of the hosts they are implemented on, or are just plain broken.

The following guidelines may be useful to anyone devising a data format (Content-Type) that will survive the widest range of networking technologies and known broken MTAs unscathed. Note that anything encoded in the base64 encoding will satisfy these rules, but that some well-known mechanisms, notably the UNIX uuencode facility, will not. Note also that anything encoded in the Quoted-Printable encoding will survive most gateways intact, but possibly not some gateways to systems that use the EBCDIC character set.

   (1) Under some circumstances the encoding used for data may change as part of normal gateway or user agent operation. In particular, conversion from base64 to quoted-printable and vice versa may be necessary. This may result in the confusion of CRLF sequences with line breaks in text bodies. As such, the persistence of CRLF as something other than a line break must not be relied on.

   (2) Many systems may elect to represent and store text data using local newline conventions. Local newline conventions may not match the RFC822 CRLF convention -- systems are known that use plain CR, plain LF, CRLF, or counted records. The result is that isolated CR and LF characters are not well tolerated in general; they may be lost or converted to delimiters on some systems, and hence must not be relied on.

   (3) TAB (HT) characters may be misinterpreted or may be automatically converted to variable numbers of spaces. This is unavoidable in some environments, notably those not based on the ASCII character set. Such conversion is STRONGLY DISCOURAGED, but it may occur, and mail formats must not rely on the persistence of TAB (HT) characters.

   (4) Lines longer than 76 characters may be wrapped or truncated in some environments. Line wrapping and line truncation are STRONGLY DISCOURAGED, but unavoidable in some cases. Applications which require long lines must somehow differentiate between soft and hard line breaks. (A simple way to do this is to use the quoted-printable encoding.)

   (5) Trailing "white space" characters (SPACE, TAB (HT)) on a line may be discarded by some transport agents, while other transport agents may pad lines with these characters so that all lines in a mail file are of equal length. The persistence of trailing white space, therefore, must not be relied on.

   (6) Many mail domains use variations on the ASCII character set, or use character sets such as EBCDIC which contain most but not all of the US-ASCII characters. The correct translation of characters not in the "invariant" set cannot be depended on across character converting gateways. For example, this situation is a problem when sending uuencoded information across BITNET, an EBCDIC system. Similar problems can occur without crossing a gateway, since many Internet hosts use character sets other than ASCII internally. The definition of Printable Strings in X.400 adds further restrictions in certain special cases. In particular, the only characters that are known to be consistent across all gateways are the 73 characters that correspond to the upper and lower case letters A-Z and a-z, the 10 digits 0-9, and the following eleven special characters:

         "'" (ASCII code 39)
         "(" (ASCII code 40)
         ")" (ASCII code 41)
         "+" (ASCII code 43)
         "," (ASCII code 44)
         "-" (ASCII code 45)
         "." (ASCII code 46)
         "/" (ASCII code 47)
         ":" (ASCII code 58)
         "=" (ASCII code 61)
         "?" (ASCII code 63)

       A maximally portable mail representation, such as the base64 encoding, will confine itself to relatively short lines of text in which the only meaningful characters are taken from this set of 73 characters.

   (7) Some mail transport agents will corrupt data that includes certain literal strings. In particular, a period (".") alone on a line is known to be corrupted by some (incorrect) SMTP implementations, and a line that starts with the five characters "From " (the fifth character is a SPACE) is commonly corrupted as well. A careful composition agent can prevent these corruptions by encoding the data (e.g., in the quoted-printable encoding, "=46rom " in place of "From " at the start of a line, and "=2E" in place of "." alone on a line).

Please note that the above list is NOT a list of recommended practices for MTAs. RFC 821 MTAs are prohibited from altering the character of white space or wrapping long lines. These BAD and illegal practices are known to occur on established networks, and implementations should be robust in dealing with the bad effects they can cause.
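
Guidelines (4), (6), and (7) above lend themselves to a small check. The following Python sketch is illustrative only: the names `MAIL_SAFE`, `is_mail_safe`, and `protect_line` are hypothetical, and `protect_line` assumes its input is a single line (without its line terminator) that is otherwise already valid quoted-printable.

```python
import string

# The 73 characters guideline (6) lists as consistent across all gateways:
# 52 letters, 10 digits, and eleven special characters.
MAIL_SAFE = frozenset(string.ascii_letters + string.digits + "'()+,-./:=?")


def is_mail_safe(line):
    """True if a line uses only the 73 portable characters and is short enough."""
    return len(line) <= 76 and all(ch in MAIL_SAFE for ch in line)


def protect_line(line):
    """Escape the two literal patterns named in guideline (7)."""
    if line.startswith("From "):
        return "=46" + line[1:]   # "F" is hex 46 in ASCII
    if line == ".":
        return "=2E"              # "." is hex 2E in ASCII
    return line


print(is_mail_safe("SGVsbG8sIHdvcmxkIQ=="))  # a line of base64 output
print(protect_line("From the desk of..."))
```

Base64 output passes `is_mail_safe` because its alphabet (letters, digits, "+", "/", and "=") is a subset of the 73 characters, which is consistent with the statement above that base64 satisfies these rules.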
Appendix C -- A Complex Multipart Example

What follows is the outline of a complex multipart message. This message has five parts to be displayed serially: two introductory plain text parts, an embedded multipart message, a richtext part, and a closing encapsulated text message in a non-ASCII character set. The embedded multipart message has two parts to be displayed in parallel, a picture and an audio fragment.

   MIME-Version: 1.0
   From: Nathaniel Borenstein <nsb@bellcore.com>
   To: Ned Freed <ned@innosoft.com>
   Subject: A multipart example
   Content-Type: multipart/mixed; boundary=unique-boundary-1

   This is the preamble area of a multipart message.
   Mail readers that understand multipart format
   should ignore this preamble.
   If you are reading this text, you might want to
   consider changing to a mail reader that understands
   how to properly display multipart messages.
   --unique-boundary-1

      ...Some text appears here...
   [Note that the preceding blank line means
   no header fields were given and this is text,
   with charset US ASCII. It could have been
   done with explicit typing as in the next part.]

   --unique-boundary-1
   Content-type: text/plain; charset=US-ASCII

   This could have been part of the previous part,
   but illustrates explicit versus implicit
   typing of body parts.

   --unique-boundary-1
   Content-Type: multipart/parallel; boundary=unique-boundary-2

   --unique-boundary-2
   Content-Type: audio/basic
   Content-Transfer-Encoding: base64

      ... base64-encoded 8000 Hz single-channel
          mu-law-format audio data goes here....

   --unique-boundary-2
   Content-Type: image/gif
   Content-Transfer-Encoding: base64

      ... base64-encoded image data goes here....

   --unique-boundary-2--

   --unique-boundary-1
   Content-type: text/richtext

   This is <bold><italic>richtext.</italic></bold>
   <smaller>as defined in RFC 1341</smaller>
   <nl><nl>Isn't it
   <bigger><bigger>cool?</bigger></bigger>

   --unique-boundary-1
   Content-Type: message/rfc822

   From: (mailbox in US-ASCII)
   To: (address in US-ASCII)
   Subject: (subject in US-ASCII)
   Content-Type: Text/plain; charset=ISO-8859-1
   Content-Transfer-Encoding: Quoted-printable

      ... Additional text in ISO-8859-1 goes here ...

   --unique-boundary-1--
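
To see how a reader might recover the nesting of such a message, here is a minimal sketch using Python's standard `email` package. It is one possible approach, not a reference implementation: the message in `RAW` is a shortened stand-in for the example above with placeholder base64 payloads, and the helper name `outline` is hypothetical.

```python
from email import message_from_string

# Shortened stand-in for the Appendix C message; payloads are placeholders.
RAW = """\
MIME-Version: 1.0
Subject: A multipart example
Content-Type: multipart/mixed; boundary=unique-boundary-1

This is the preamble area of a multipart message.
--unique-boundary-1

...Some text appears here...
--unique-boundary-1
Content-Type: multipart/parallel; boundary=unique-boundary-2

--unique-boundary-2
Content-Type: audio/basic
Content-Transfer-Encoding: base64

AAAA
--unique-boundary-2
Content-Type: image/gif
Content-Transfer-Encoding: base64

R0lGODlh
--unique-boundary-2--

--unique-boundary-1--
"""


def outline(msg, depth=0):
    """Return one indented line per entity, depth-first."""
    lines = ["  " * depth + msg.get_content_type()]
    if msg.is_multipart():
        for part in msg.get_payload():
            lines.extend(outline(part, depth + 1))
    return lines


msg = message_from_string(RAW)
print("\n".join(outline(msg)))
```

The first body part has no header fields, so the parser reports it with the default type text/plain, matching the bracketed note in the example.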
Appendix D -- Collected Grammar

This appendix contains the complete BNF grammar for all the syntax specified by this document.

By itself, however, this grammar is incomplete. It refers to several entities that are defined by RFC 822. Rather than reproduce those definitions here, and risk unintentional differences between the two, this document simply refers the reader to RFC 822 for the remaining definitions. Wherever a term is undefined, it refers to the RFC 822 definition.

   application-subtype := ("octet-stream" *stream-param)
                       / "postscript" / extension-token

   application-type := "application" "/" application-subtype

   attribute := token    ; case-insensitive

   atype := "ftp" / "anon-ftp" / "tftp" / "local-file"
         / "afs" / "mail-server" / extension-token
         ; Case-insensitive

   audio-type := "audio" "/" ("basic" / extension-token)

   body-part := <"message" as defined in RFC 822,
             with all header fields optional, and with the
             specified delimiter not occurring anywhere in
             the message body, either on a line by itself
             or as a substring anywhere.>

      NOTE: In certain transport enclaves, RFC 822 restrictions such as the one that limits bodies to printable ASCII characters may not be in force. (That is, the transport domains may resemble standard Internet mail transport as specified in RFC821 and assumed by RFC822, but without certain restrictions.) The relaxation of these restrictions should be construed as locally extending the definition of bodies, for example to include octets outside of the ASCII range, as long as these extensions are supported by the transport and adequately documented in the Content-Transfer-Encoding header field. However, in no event are headers (either message headers or body-part headers) allowed to contain anything other than ASCII characters.

   boundary := 0*69<bchars> bcharsnospace

   bchars := bcharsnospace / " "

   bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_"
                 / "," / "-" / "." / "/" / ":" / "=" / "?"

   charset := "us-ascii" / "iso-8859-1" / "iso-8859-2" / "iso-8859-3"
           / "iso-8859-4" / "iso-8859-5" / "iso-8859-6" / "iso-8859-7"
           / "iso-8859-8" / "iso-8859-9" / extension-token
           ; case insensitive

   close-delimiter := "--" boundary "--" CRLF
                   ; Again, no space by "--"

   content := "Content-Type" ":" type "/" subtype *(";" parameter)
           ; case-insensitive matching of type and subtype

   delimiter := "--" boundary CRLF    ; taken from Content-Type field.
             ; There must be no space
             ; between "--" and boundary.

   description := "Content-Description" ":" *text

   discard-text := *(*text CRLF)

   encapsulation := delimiter body-part CRLF

   encoding := "Content-Transfer-Encoding" ":" mechanism

   epilogue := discard-text    ; to be ignored upon receipt.

   extension-token := x-token / iana-token

   external-param := (";" "access-type" "=" atype)
                  / (";" "expiration" "=" date-time)
                       ; Note that date-time is quoted
                  / (";" "size" "=" 1*DIGIT)
                  / (";" "permission" "=" ("read" / "read-write"))
                       ; Permission is case-insensitive
                  / (";" "name" "=" value)
                  / (";" "site" "=" value)
                  / (";" "dir" "=" value)
                  / (";" "mode" "=" value)
                  / (";" "server" "=" value)
                  / (";" "subject" "=" value)
                  ; access-type required; others required based on access-type

   iana-token := <a publicly-defined extension token,
              registered with IANA, as specified in
              appendix E>

   id := "Content-ID" ":" msg-id

   image-type := "image" "/" ("gif" / "jpeg" / extension-token)

   mechanism := "7bit"    ; case-insensitive
             / "quoted-printable"
             / "base64"
             / "8bit"
             / "binary"
             / x-token

   message-subtype := "rfc822"
                   / "partial" 2#3partial-param
                   / "external-body" 1*external-param
                   / extension-token

   message-type := "message" "/" message-subtype

   multipart-body := preamble 1*encapsulation close-delimiter epilogue

   multipart-subtype := "mixed" / "parallel" / "digest"
                     / "alternative" / extension-token

   multipart-type := "multipart" "/" multipart-subtype
                  ";" "boundary" "=" boundary

   octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F")
         ; octet must be used for characters > 127, =, SPACE, or TAB,
         ; and is recommended for any characters not listed in
         ; Appendix B as "mail-safe".

   padding := "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7"

   parameter := attribute "=" value

   partial-param := (";" "id" "=" value)
                 / (";" "number" "=" 1*DIGIT)
                 / (";" "total" "=" 1*DIGIT)
                 ; id & number required; total required for last part

   preamble := discard-text    ; to be ignored upon receipt.

   ptext := octet / <any ASCII character except "=", SPACE, or TAB>
         ; characters not listed as "mail-safe" in Appendix B
         ; are also not recommended.

   quoted-printable := ([*(ptext / SPACE / TAB) ptext] ["="] CRLF)
                    ; Maximum line length of 76 characters excluding CRLF

   stream-param := (";" "type" "=" value)
                / (";" "padding" "=" padding)

   subtype := token    ; case-insensitive

   text-subtype := "plain" / extension-token

   text-type := "text" "/" text-subtype [";" "charset" "=" charset]

   token := 1*<any (ASCII) CHAR except SPACE, CTLs, or tspecials>

   tspecials := "(" / ")" / "<" / ">" / "@"
             / "," / ";" / ":" / "\" / <">
             / "/" / "[" / "]" / "?" / "="
             ; Must be in quoted-string,
             ; to use within parameter values

   type := "application" / "audio"    ; case-insensitive
        / "image" / "message"
        / "multipart" / "text"
        / "video" / extension-token
        ; All values case-insensitive

   value := token / quoted-string

   version := "MIME-Version" ":" 1*DIGIT "." 1*DIGIT

   video-type := "video" "/" ("mpeg" / extension-token)

   x-token := <The two characters "X-" or "x-" followed, with
           no intervening white space, by any token>
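
As an illustration of how one grammar rule translates into a check, the "boundary" production (0 to 69 "bchars" followed by one "bcharsnospace") can be written as a regular expression. This Python sketch is not part of the grammar; `BOUNDARY_RE`, `is_valid_boundary`, and `delimiter_lines` are hypothetical names.

```python
import re

# bcharsnospace from the grammar: DIGIT / ALPHA and thirteen specials.
_BCHARS_NOSPACE = r"0-9A-Za-z'()+_,\-./:=?"

# boundary := 0*69<bchars> bcharsnospace
# bchars additionally allows SPACE, but a boundary may not end with one.
BOUNDARY_RE = re.compile(rf"[{_BCHARS_NOSPACE} ]{{0,69}}[{_BCHARS_NOSPACE}]")


def is_valid_boundary(value):
    """True if `value` matches the boundary production (1 to 70 characters)."""
    return BOUNDARY_RE.fullmatch(value) is not None


def delimiter_lines(boundary):
    """Return the delimiter and close-delimiter lines, without CRLF."""
    if not is_valid_boundary(boundary):
        raise ValueError("not a valid MIME boundary")
    # No space is allowed between "--" and the boundary.
    return "--" + boundary, "--" + boundary + "--"


print(is_valid_boundary("unique-boundary-1"))
print(delimiter_lines("unique-boundary-1"))
```

Note that this checks syntax only. The body-part rule's requirement that the delimiter not occur anywhere in the enclosed data has to be verified against the actual content.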
Appendix E -- IANA Registration Procedures

MIME has been carefully designed to have extensible mechanisms, and it is expected that the set of content-type/subtype pairs and their associated parameters will grow significantly with time. Several other MIME fields, notably character set names, access-type parameters for the message/external-body type, and possibly even Content-Transfer-Encoding values, are likely to have new values defined over time. In order to ensure that the set of such values is developed in an orderly, well-specified, and public manner, MIME defines a registration process which uses the Internet Assigned Numbers Authority (IANA) as a central registry for such values.

In general, parameters in the content-type header field are used to convey supplemental information for various content types, and their use is defined when the content-type and subtype are defined. New parameters should not be defined as a way to introduce new functionality.

In order to simplify and standardize the registration process, this appendix gives templates for the registration of new values with IANA. Each of these is given in the form of an email message template, to be filled in by the registering party.

E.1 Registration of New Content-type/subtype Values

Note that MIME is generally expected to be extended by subtypes. If a new fundamental top-level type is needed, its specification must be published as an RFC or submitted in a form suitable to become an RFC, and be subject to the Internet standards process.

   To: IANA@isi.edu
   Subject: Registration of new MIME content-type/subtype

   MIME type name:

   (If the above is not an existing top-level MIME type, please explain why an existing type cannot be used.)

   MIME subtype name:

   Required parameters:

   Optional parameters:

   Encoding considerations:

   Security considerations:

   Published specification:

   (The published specification must be an Internet RFC or RFC-to-be if a new top-level type is being defined, and must be a publicly available specification in any case.)

   Person & email address to contact for further information:

E.2 Registration of New Access-type Values for Message/external-body

   To: IANA@isi.edu
   Subject: Registration of new MIME Access-type for Message/external-body content-type

   MIME access-type name:

   Required parameters:

   Optional parameters:

   Published specification:

   (The published specification must be an Internet RFC or RFC-to-be.)

   Person & email address to contact for further information:
Appendix F -- Summary of the Seven Content-types

Content-type: text

Subtypes defined by this document: plain

Important Parameters: charset

Encoding notes: quoted-printable generally preferred if an encoding is needed and the character set is mostly an ASCII superset.

Security considerations: Rich text formats such as TeX and Troff often contain mechanisms for executing arbitrary commands or file system operations, and should not be used automatically unless these security problems have been addressed. Even plain text may contain control characters that can be used to exploit the capabilities of "intelligent" terminals and cause security violations. User interfaces designed to run on such terminals should be aware of and try to prevent such problems.

________________________________________________________

Content-type: multipart

Subtypes defined by this document: mixed, alternative, digest, parallel.

Important Parameters: boundary

Encoding notes: No content-transfer-encoding is permitted.

________________________________________________________

Content-type: message

Subtypes defined by this document: rfc822, partial, external-body

Important Parameters: id, number, total, access-type, expiration, size, permission, name, site, directory, mode, server, subject

Encoding notes: No content-transfer-encoding is permitted. Specifically, only "7bit" is permitted for "message/partial" or "message/external-body", and only "7bit", "8bit", or "binary" are permitted for other subtypes of "message".

________________________________________________________

Content-type: application

Subtypes defined by this document: octet-stream, postscript

Important Parameters: type, padding

Deprecated Parameters: name and conversions were defined in RFC 1341.

Encoding notes: base64 preferred for unreadable subtypes.

Security considerations: This type is intended for the transmission of data to be interpreted by locally-installed programs. If used, for example, to transmit executable binary programs or programs in general-purpose interpreted languages, such as LISP programs or shell scripts, severe security problems could result. Authors of mail-reading agents are cautioned against giving their systems the power to execute mail-based application data without carefully considering the security implications. While it is certainly possible to define safe application formats and even safe interpreters for unsafe formats, each interpreter should be evaluated separately for possible security problems.

________________________________________________________

Content-type: image

Subtypes defined by this document: jpeg, gif

Important Parameters: none

Encoding notes: base64 generally preferred

________________________________________________________

Content-type: audio

Subtypes defined by this document: basic

Important Parameters: none

Encoding notes: base64 generally preferred

________________________________________________________

Content-type: video

Subtypes defined by this document: mpeg

Important Parameters: none

Encoding notes: base64 generally preferred
Appendix G -- Canonical Encoding Model There was some confusion, in earlier drafts of this memo, regarding the model for when email data was to be converted to canonical form and encoded, and in particular how this process would affect the treatment of CRLFs, given that the representation of newlines varies greatly from system to system. For this reason, a canonical model for encoding is presented below. The process of composing a MIME entity can be modeled as being done in a number of steps. Note that these steps are roughly similar to those steps used in RFC 1421 and are performed for each 'innermost level' body: Step 1. Creation of local form. The body to be transmitted is created in the system's native format. The native character set is used, and where appropriate local end of line conventions are used as well. The body may be a UNIX-style text file, or a Sun raster image, or a VMS indexed file, or audio data in a system-dependent format stored only in memory, or anything else that corresponds to the local model for the representation of some form of information. Fundamentally, the data is created in the "native" form specified by the type/subtype information. Step 2. Conversion to canonical form. The entire body, including "out-of-band" information such as record lengths and possibly file attribute information, is converted to a universal canonical form. The specific content type of the body as well as its associated attributes dictate the nature of the canonical form that is used. Conversion to the proper canonical form may involve character set conversion, transformation of audio data, compression, or various other operations specific to the various content types. If character set conversion is involved, however, care must be taken to understand the semantics of the content-type, which may have strong implications for any character set conversion, e.g. with regard to syntactically meaningful characters in a text subtype other than "plain". 
For example, in the case of text/plain data, the text must be converted to a supported character set and lines must be delimited with CRLF delimiters in accordance with RFC822. Note that the restriction on line lengths implied by RFC822 is eliminated if the next step employs either quoted-printable or base64 encoding.
Step 3. Apply transfer encoding. A Content-Transfer-Encoding appropriate for this body is applied. Note that there is no fixed relationship between the content type and the transfer encoding. In particular, it may be appropriate to base the choice of base64 or quoted-printable on character frequency counts which are specific to a given instance of a body. Step 4. Insertion into entity. The encoded object is inserted into a MIME entity with appropriate headers. The entity is then inserted into the body of a higher-level entity (message or multipart) if needed. It is vital to note that these steps are only a model; they are specifically NOT a blueprint for how an actual system would be built. In particular, the model fails to account for two common designs: 1. In many cases the conversion to a canonical form prior to encoding will be subsumed into the encoder itself, which understands local formats directly. For example, the local newline convention for text bodies might be carried through to the encoder itself along with knowledge of what that format is. 2. The output of the encoders may have to pass through one or more additional steps prior to being transmitted as a message. As such, the output of the encoder may not be conformant with the formats specified by RFC822. In particular, once again it may be appropriate for the converter's output to be expressed using local newline conventions rather than using the standard RFC822 CRLF delimiters. Other implementation variations are conceivable as well. The vital aspect of this discussion is that, in spite of any optimizations, collapsings of required steps, or insertion of additional processing, the resulting messages must be consistent with those produced by the model described here. 
For example, a message with the following header fields: Content-type: text/foo; charset=bar Content-Transfer-Encoding: base64 must be first represented in the text/foo form, then (if necessary) represented in the "bar" character set, and finally transformed via the base64 algorithm into a mail-safe form.
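
The four-step model can be walked through for a small text body. The Python sketch below is only one way to realize the model and, as the text stresses, not a blueprint: the function names `to_canonical` and `build_entity` are hypothetical, the "local form" is assumed to be a Python string using any of the CR, LF, or CRLF newline conventions, and base64 is chosen unconditionally for simplicity.

```python
import base64


def to_canonical(local_text, charset):
    """Step 2: normalize newlines to CRLF, then encode in the declared charset."""
    unified = local_text.replace("\r\n", "\n").replace("\r", "\n")
    return unified.replace("\n", "\r\n").encode(charset)


def build_entity(local_text, subtype="plain", charset="iso-8859-1"):
    """Steps 2 through 4 for a text body; step 1 is the caller's local_text."""
    canonical = to_canonical(local_text, charset)
    # Step 3: apply the transfer encoding. encodebytes() emits lines of at
    # most 76 characters ending in LF; these are rewritten to CRLF below.
    encoded = base64.encodebytes(canonical).decode("ascii")
    # Step 4: insert the encoded object into an entity with its headers.
    headers = (
        f"Content-Type: text/{subtype}; charset={charset}\r\n"
        "Content-Transfer-Encoding: base64\r\n"
    )
    return headers + "\r\n" + encoded.replace("\n", "\r\n")


entity = build_entity("caf\u00e9\nbar\n")
print(entity)
```

Decoding the body of the resulting entity yields the canonical form, with CRLF line breaks, regardless of which newline convention the local form used. That property is the consistency requirement the model is meant to capture.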
Appendix H -- Changes from RFC 1341

This document is a relatively minor revision of RFC 1341. For the convenience of those familiar with RFC 1341, the technical changes from that document are summarized in this appendix.

 1. The definition of "tspecials" has been changed to no longer include ".".
 2. The Content-ID field is now mandatory for message/external-body parts.
 3. The text/richtext type (including the old Section 7.1.3 and Appendix D) has been moved to a separate document.
 4. The rules on header merging for message/partial data have been changed to treat the Encrypted and MIME-Version headers as special cases.
 5. The definition of the external-body access-type parameter has been changed so that it can only indicate a single access method (which was all that made sense).
 6. There is a new "Subject" parameter for message/external-body, access-type mail-server, to permit MIME-based use of mail servers that rely on Subject field information.
 7. The "conversions" parameter for application/octet-stream has been removed.
 8. Section 7.4.1 now deprecates the use of the "name" parameter for application/octet-stream, as this will be superseded in the future by a Content-Disposition header.
 9. The formal grammar for multipart bodies has been changed so that a CRLF is no longer required before the first boundary line.
10. MIME entities of type "message/partial" and "message/external-body" are now required to use only the "7bit" transfer-encoding. (Specifically, "binary" and "8bit" are not permitted.)
11. The "application/oda" content-type has been removed.
12. A note has been added to the end of section 7.2.3, explaining the semantics of Content-ID in a multipart/alternative MIME entity.
13. The formal syntax for the "MIME-Version" field has been tightened, but in a way that is completely compatible with the only version number defined in RFC 1341.
14. In Section 7.3.1, the definition of message/rfc822 has been relaxed regarding mandatory fields.

All other changes from RFC 1341 were editorial changes and do not affect the technical content of MIME. Considerable formal grammar has been added, but this reflects the prose specification that was already in place.
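As an illustration of change 10, a receiving or validating agent might check the transfer-encoding restriction roughly as follows. This is a minimal sketch under stated assumptions, not a normative procedure: the function name transfer_encoding_allowed and the constant RESTRICTED_TYPES are hypothetical, header parsing is delegated to Python's standard email package, and an absent Content-Transfer-Encoding field is treated as "7bit".

```python
from email.parser import HeaderParser

# The two content-types that change 10 restricts to "7bit".
RESTRICTED_TYPES = {"message/partial", "message/external-body"}


def transfer_encoding_allowed(raw_entity: str) -> bool:
    """Return False if a restricted message type uses a non-7bit encoding."""
    # HeaderParser reads only the header block and leaves the body unparsed.
    msg = HeaderParser().parsestr(raw_entity)
    content_type = msg.get_content_type()
    # Assumption of this sketch: a missing field means "7bit".
    encoding = msg.get("Content-Transfer-Encoding", "7bit").strip().lower()
    if content_type in RESTRICTED_TYPES:
        return encoding == "7bit"
    return True


if __name__ == "__main__":
    entity = (
        "Content-Type: message/external-body; access-type=anon-ftp\n"
        "Content-Transfer-Encoding: 8bit\n"
        "\n"
    )
    print(transfer_encoding_allowed(entity))
```

The check applies only to the two listed types; entities of any other type pass through unexamined in this sketch.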
References

[US-ASCII] Coded Character Set--7-Bit American Standard Code for Information Interchange, ANSI X3.4-1986.
[ATK] Borenstein, Nathaniel S., Multimedia Applications Development with the Andrew Toolkit, Prentice-Hall, 1990.
[GIF] Graphics Interchange Format (Version 89a), Compuserve, Inc., Columbus, Ohio, 1990.
[ISO-2022] International Standard--Information Processing--ISO 7-bit and 8-bit coded character sets--Code extension techniques, ISO 2022:1986.
[ISO-8859] Information Processing -- 8-bit Single-Byte Coded Graphic Character Sets --
     Part 1: Latin Alphabet No. 1, ISO 8859-1:1987.
     Part 2: Latin alphabet No. 2, ISO 8859-2, 1987.
     Part 3: Latin alphabet No. 3, ISO 8859-3, 1988.
     Part 4: Latin alphabet No. 4, ISO 8859-4, 1988.
     Part 5: Latin/Cyrillic alphabet, ISO 8859-5, 1988.
     Part 6: Latin/Arabic alphabet, ISO 8859-6, 1987.
     Part 7: Latin/Greek alphabet, ISO 8859-7, 1987.
     Part 8: Latin/Hebrew alphabet, ISO 8859-8, 1988.
     Part 9: Latin alphabet No. 5, ISO 8859-9, 1990.
[ISO-646] International Standard--Information Processing--ISO 7-bit coded character set for information interchange, ISO 646:1983.
[MPEG] Video Coding Draft Standard ISO 11172 CD, ISO IEC/TJC1/SC2/WG11 (Motion Picture Experts Group), May, 1991.
[PCM] CCITT, Fascicle III.4 - Recommendation G.711, Geneva, 1972, "Pulse Code Modulation (PCM) of Voice Frequencies".
[POSTSCRIPT] Adobe Systems, Inc., PostScript Language Reference Manual, Addison-Wesley, 1985.
[POSTSCRIPT2] Adobe Systems, Inc., PostScript Language Reference Manual, Addison-Wesley, Second Edition, 1990.
[X400] Schicker, Pietro, "Message Handling Systems, X.400", Message Handling Systems and Distributed Applications, E. Stefferud, O-j. Jacobsen, and P. Schicker, eds., North-Holland, 1989, pp. 3-41.
[RFC-783] Sollins, K., "TFTP Protocol (revision 2)", RFC 783, MIT, June 1981.
[RFC-821] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC 821, USC/Information Sciences Institute, August 1982.
[RFC-822] Crocker, D., "Standard for the Format of ARPA Internet Text Messages", STD 11, RFC 822, UDEL, August 1982.
[RFC-934] Rose, M., and E. Stefferud, "Proposed Standard for Message Encapsulation", RFC 934, Delaware and NMA, January 1985.
[RFC-959] Postel, J. and J. Reynolds, "File Transfer Protocol", STD 9, RFC 959, USC/Information Sciences Institute, October 1985.
[RFC-1049] Sirbu, M., "Content-Type Header Field for Internet Messages", STD 11, RFC 1049, CMU, March 1988.
[RFC-1421] Linn, J., "Privacy Enhancement for Internet Electronic Mail: Part I - Message Encryption and Authentication Procedures", RFC 1421, IAB IRTF PSRG, IETF PEM WG, February 1993.
[RFC-1154] Robinson, D. and R. Ullmann, "Encoding Header Field for Internet Messages", RFC 1154, Prime Computer, Inc., April 1990.
[RFC-1341] Borenstein, N., and N. Freed, "MIME (Multipurpose Internet Mail Extensions): Mechanisms for Specifying and Describing the Format of Internet Message Bodies", RFC 1341, Bellcore, Innosoft, June 1992.
[RFC-1342] Moore, K., "Representation of Non-Ascii Text in Internet Message Headers", RFC 1342, University of Tennessee, June 1992.
[RFC-1343] Borenstein, N., "A User Agent Configuration Mechanism for Multimedia Mail Format Information", RFC 1343, Bellcore, June 1992.
[RFC-1344] Borenstein, N., "Implications of MIME for Internet Mail Gateways", RFC 1344, Bellcore, June 1992.
[RFC-1345] Simonsen, K., "Character Mnemonics & Character Sets", RFC 1345, Rationel Almen Planlaegning, June 1992.
[RFC-1426] Klensin, J., (WG Chair), Freed, N., (Editor), Rose, M., Stefferud, E., and D. Crocker, "SMTP Service Extension for 8bit-MIME transport", RFC 1426, United Nations University, Innosoft, Dover Beach Consulting, Inc., Network Management Associates, Inc., The Branch Office, February 1993.
[RFC-1522] Moore, K., "Representation of Non-Ascii Text in Internet Message Headers", RFC 1522, University of Tennessee, September 1993.
[RFC-1340] Reynolds, J., and J. Postel, "Assigned Numbers", STD 2, RFC 1340, USC/Information Sciences Institute, July 1992.