Network Working Group                                         D. Crocker
Request for Comments: 5598                   Brandenburg InternetWorking
Category: Informational                                        July 2009


                       Internet Mail Architecture

Abstract

Over its thirty-five-year history, Internet Mail has changed significantly in scale and complexity, as it has become a global infrastructure service. These changes have been evolutionary, rather than revolutionary, reflecting a strong desire to preserve both its installed base and its usefulness. To collaborate productively on this large and complex system, all participants need to work from a common view of it and use a common language to describe its components and the interactions among them. But the many differences in perspective currently make it difficult to know exactly what another participant means. To serve as the necessary common frame of reference, this document describes the enhanced Internet Mail architecture, reflecting the current service.

Status of This Memo

This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.

Copyright Notice

Copyright (c) 2009 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents in effect on the date of publication of this document (http://trustee.ietf.org/license-info). Please review these documents carefully, as they describe your rights and restrictions with respect to this document.

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Table of Contents

1. Introduction
   1.1. History
   1.2. The Role of This Architecture
   1.3. Document Conventions
2. Responsible Actor Roles
   2.1. User Actors
   2.2. Message Handling Service (MHS) Actors
   2.3. Administrative Actors
3. Identities
   3.1. Mailbox
   3.2. Scope of Email Address Use
   3.3. Domain Names
   3.4. Message Identifier
4. Services and Standards
   4.1. Message Data
        4.1.4. Identity References in a Message
   4.2. User-Level Services
   4.3. MHS-Level Services
   4.4. Transition Modes
   4.5. Implementation and Operation
5. Mediators
   5.1. Alias
   5.2. ReSender
   5.3. Mailing Lists
   5.4. Gateways
   5.5. Boundary Filter
6. Considerations
   6.1. Security Considerations
   6.2. Internationalization
7. References
   7.1. Normative References
   7.2. Informative References
Appendix A. Acknowledgments
Index

1. Introduction

Over its thirty-five-year history, Internet Mail has changed significantly in scale and complexity, as it has become a global infrastructure service. These changes have been evolutionary, rather than revolutionary, reflecting a strong desire to preserve both its installed base and its usefulness. Today, Internet Mail is distinguished by many independent operators, many different components for providing service to Users, as well as many different components that transfer messages.

The underlying technical standards for Internet Mail comprise a rich array of functional capabilities. These specifications form the core:

* Simple Mail Transfer Protocol (SMTP) ([RFC0821], [RFC2821], [RFC5321]) moves a message through the Internet.
* Internet Mail Format (IMF) ([RFC0733], [RFC0822], [RFC2822], [RFC5322]) defines a message object.
* Multipurpose Internet Mail Extensions (MIME) [RFC2045] defines an enhancement to the message object that permits using multimedia attachments.

Public collaboration on technical, operations, and policy activities of email, including those that respond to the challenges of email abuse, has brought a much wider range of participants into the technical community. To collaborate productively on this large and complex system, all participants need to work from a common view of it and use a common language to describe its components and the interactions among them. But the many differences in perspective currently make it difficult to know exactly what another participant means.

It is the need to resolve these differences that motivates this document, which describes the realities of the current system. Internet Mail is the subject of ongoing technical, operations, and policy work, and the discussions often are hindered by different models of email-service design and different meanings for the same terms.

To serve as the necessary common frame of reference, this document describes the enhanced Internet Mail architecture, reflecting the current service. The document focuses on:

* Capturing refinements to the email model
* Clarifying functional roles for the architectural components
* Clarifying identity-related issues, across the email service
* Defining terminology for architectural components and their interactions

1.1. History

The first standardized architecture for networked email specified a simple split between the user world, in the form of Message User Agents (MUAs), and the transfer world, in the form of the Message Handling Service (MHS), which is composed of Message Transfer Agents (MTAs) [RFC1506]. The MHS accepts a message from one User and delivers it to one or more other Users, creating a virtual MUA-to-MUA exchange environment.

As shown in Figure 1, this architecture defines two logical layers of interoperability. One is directly between Users. The other is among the components along the transfer path. In addition, there is interoperability between the layers, first when a message is posted from the User to the MHS and later when it is delivered from the MHS to the User.

The operational service has evolved, although core aspects of the service, such as mailbox addressing and message format style, remain remarkably constant. The original distinction between the user level and transfer level remains, but with elaborations in each. The term "Internet Mail" is used to refer to the entire collection of user and transfer components and services.

For Internet Mail, the term "end-to-end" usually refers to a single posting and the set of deliveries that result from a single transit of the MHS. A common exception is group dialogue that is mediated through a Mailing List; in this case, two postings occur before intended Recipients receive an Author's message, as discussed in Section 2.1.4.

In fact, some uses of email consider the entire email service, including Author and Recipient, as a subordinate component. For these services, "end-to-end" refers to points outside the email service. Examples are voicemail over email [RFC3801], EDI (Electronic Data Interchange) over email [RFC1767], and facsimile over email [RFC4142].

                                   +--------+
                ++================>|  User  |
                ||                 +--------+
                ||                      ^
    +--------+  ||          +--------+  .
    |  User  +==++=========>|  User  |  .
    +---+----+  ||          +--------+  .
        .       ||               ^      .
        .       ||   +--------+  .      .
        .       ++==>|  User  |  .      .
        .            +--------+  .      .
        .                 ^      .      .
        .                 .      .      .
        V                 .      .      .
    +---+-----------------+------+------+---+
    |   .                 .      .      .   |
    |   .................>.      .      .   |
    |   .                        .      .   |
    |   ........................>.      .   |
    |   .                               .   |
    |   ...............................>.   |
    |                                       |
    |    Message Handling Service (MHS)     |
    +---------------------------------------+

    Legend: === lines indicate primary (possibly indirect) transfers or
                roles
            ... lines indicate supporting transfers or roles

            Figure 1: Basic Internet Mail Service Model

End-to-end Internet Mail exchange is accomplished by using a standardized infrastructure with these components and characteristics:

* An email object
* Global addressing
* An asynchronous sequence of point-to-point transfer mechanisms
* No requirement for prior arrangement between MTAs or between Authors and Recipients
* No requirement for prior arrangement between point-to-point transfer services over the open Internet
* No requirement for Author, Originator, or Recipients to be online at the same time

The end-to-end portion of the service is the email object, called a "message". Broadly, the message itself distinguishes control information, for handling, from the Author's content.

A precept to the design of mail over the open Internet is permitting User-to-User and MTA-to-MTA interoperability without prior, direct arrangement between the independent administrative authorities responsible for handling a message. All participants rely on having the core services universally supported and accessible, either directly or through Gateways that act as translators between Internet Mail and email environments conforming to other standards. Given the importance of spontaneity and serendipity in interpersonal communications, not requiring such prearrangement between participants is a core benefit of Internet Mail and remains a core requirement for it.

Within localized networks at the edge of the public Internet, prior administrative arrangement often is required and can include access control, routing constraints, and configuration of the information query service. Although Recipient authentication has usually been required for message access since the beginning of Internet Mail, in recent years it also has been required for message submission. In these cases, a server validates the client's identity, whether by explicit security protocols or by implicit infrastructure queries to identify "local" participants.

1.2. The Role of This Architecture

An Internet service is an integration of related capabilities among two or more participating nodes. The capabilities are accomplished across the Internet by one or more protocols. What connects a protocol to a service is an architecture. An architecture specifies how the protocols implement the service by defining the logical components of a service and the relationships among them. From that logical view, a service defines what is being done, an architecture defines where the pieces are (in relation to each other), and a protocol defines how particular capabilities are performed.

As such, an architecture will more formally describe a service at a relatively high level. A protocol that implements some portion of a service will conform to the architecture to a greater or lesser extent, depending on the pragmatic tradeoffs its designers make when trying to implement the architecture in the context of real-world constraints. Failure to precisely follow an architecture is not a failure of the protocol, nor is failure to precisely cast a protocol a failure of the architecture. Where a protocol varies from the architecture, it will of course be appropriate for it to explain the reason for the variance. However, such variance is not a mark against a protocol: Happily, the IETF prefers running code to architectural purity.

In this particular case, this architecture attempts to define the logical components of Internet email and does so post hoc, trying to capture the architectural principles that the current email protocols embody. Email protocols conform to this architecture to differing extents. Insofar as this architecture differs from those protocols, the reasons are generally well understood and are required for interoperation. The differences are not a sign that protocols need to be fixed. However, this architecture is a best attempt at a logical model of Internet email, and insofar as new protocol development varies from this architecture, it is necessary for designers to understand those differences and explain them carefully.

1.3. Document Conventions

References to structured fields of a message use a two-part dotted notation. The first part cites the document that contains the specification for the field, and the second part is the name of the field. Hence <RFC5322.From> is the IMF From: header field in an email content header, and <RFC5321.MailFrom> is the address in the SMTP "Mail From" command.

When occurring without the IMF (RFC 5322) qualifier, header field names are shown with a colon suffix. For example, From:.

References to labels for actors, functions or components have the first letter capitalized.

2. Responsible Actor Roles

Internet Mail is a highly distributed service, with a variety of Actors playing different roles. These Actors fall into three basic types:

* User
* Message Handling Service (MHS)
* ADministrative Management Domain (ADMD)

Although related to a technical architecture, the focus on Actors concerns participant responsibilities, rather than functionality of modules. For that reason, the labels used are different from those used in classic diagrams of email architecture.

2.1. User Actors

Users are the sources and sinks of messages. Users can be people, organizations, or processes. They can have an exchange that iterates, and they can expand or contract the set of Users that participate in a set of exchanges. In Internet Mail, there are four types of Users:

* Authors
* Recipients
* Return Handlers
* Mediators

Figure 2 shows the primary and secondary flows of messages among them. As a pragmatic heuristic: User Actors can generate, modify, or look at the whole message.

    ++==========++
    ||  Author  ||<..................................<..
    ++=++=++=++=++                                     .
       || || ||     ++===========++                    .
       || || ++====>|| Recipient ||                    .
       || ||        ++=====+=====++                    .
       || ||               .                           .
       || ||               ..........................>.+
       || ||                                           .
       || ||               ...................         .
       || ||               .                 .         .
       || ||               V                 .         .
       || ||         +-----------+    ++=====+=====++  .
       || ++========>| Mediator  +===>|| Recipient ||  .
       ||            +-----+-----+    ++=====+=====++  .
       ||                  .                 .         .
       ||                  ..................+.......>.+
       ||                                              .
       ||    ..............+..................         .
       ||    .             .                 .         .
       \/    V             V                 '         .
    +-----------+    +-----------+    ++=====+=====++  .
    | Mediator  +===>| Mediator  +===>|| Recipient ||  .
    +-----+-----+    +-----+-----+    ++=====+=====++  .
          .                .                 .         .
          .................+.................+.......>..

    Legend: === lines indicate primary (possibly indirect) transfers or
                roles
            ... lines indicate supporting transfers or roles

               Figure 2: Relationships among User Actors

From a User's perspective, all message-transfer activities are performed by a monolithic Message Handling Service (MHS), even though the actual service can be provided by many independent organizations. Users are customers of this unified service.

Whenever any MHS Actor sends information back to an Author or Originator in the sequence of handling a message, that Actor is a User.

2.1.1. Author

The Author is responsible for creating the message, its contents, and its list of Recipient addresses. The MHS transfers the message from the Author and delivers it to the Recipients. The MHS has an Originator role (Section 2.2.1) that correlates with the Author role.
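
As a minimal sketch of the Author's responsibilities described above, the following Python fragment builds a message with content and a list of Recipient addresses, using the standard-library email.message.EmailMessage class. The helper name compose and all addresses are hypothetical and chosen only for illustration; nothing here is prescribed by this architecture.

```python
from email.message import EmailMessage
from typing import List


def compose(author: str, recipients: List[str], subject: str, body: str) -> EmailMessage:
    """Build a message object as an Author would (hypothetical helper)."""
    msg = EmailMessage()
    msg["From"] = author                 # RFC5322.From names the Author
    msg["To"] = ", ".join(recipients)    # the Author's list of Recipient addresses
    msg["Subject"] = subject
    msg.set_content(body)                # the Author's content
    return msg


# Example addresses are placeholders under reserved example domains.
msg = compose(
    "author@example.org",
    ["r1@example.net", "r2@example.net"],
    "Hello",
    "First draft attached.",
)
```

Transferring and delivering the resulting object is then the job of the MHS, not of the Author.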
2.1.2. Recipient
The Recipient is a consumer of the delivered message. The MHS has a Receiver role (Section 2.2.4) that correlates with the Recipient role. This is labeled Recv in Figure 3.

Any Recipient can close the user-communication loop by creating and submitting a new message that replies to the Author. An example of an automated form of reply is the Message Disposition Notification (MDN), which informs the Author about the Recipient's handling of the message. (See Section 4.1.)

2.1.3. Return Handler

Also called "Bounce Handler", the Return Handler is a special form of Recipient tasked with servicing notifications generated by the MHS as it transfers or delivers the message. (See Figure 3.) These notices can be about failures or completions and are sent to an address that is specified by the Originator. This Return Handling address (also known as a Return Address) might have no visible characteristics in common with the address of the Author or Originator.

2.1.4. Mediator

A Mediator receives, aggregates, reformulates, and redistributes messages among Authors and Recipients who are the principals in (potentially) protracted exchanges. This activity is easily confused with the underlying MHS transfer exchanges. However, each serves very different purposes and operates in very different ways.

When mail is delivered to the Mediator specified in the RFC5321.RcptTo command for the original message, the MHS handles it the same way as for any other Recipient. In particular, the MHS sees each posting and delivery activity between sources and sinks as independent; it does not see subsequent re-posting as a continuation of a process. Because the Mediator originates messages, it can receive replies. Hence, when submitting a reformulated message, the Mediator is an Author, albeit an Author actually serving as an agent of one or more other Authors. So a Mediator really is a full-fledged User. Mediators are considered extensively in Section 5.

A Mediator attempts to preserve the original Author's information in the message it reformulates but is permitted to make meaningful changes to the message content or envelope. The MHS sees a new message, but Users receive a message that they interpret as being from, or at least initiated by, the Author of the original message. The role of a Mediator is not limited to merely connecting other participants; the Mediator is responsible for the new message.
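
The distinction between what the MHS sees (a new, independent posting) and what Users see (a message from the original Author) can be sketched as follows. This is an illustrative model only: the Posting record, the mediate function, and the addresses are assumptions made for this example, and the choice to replace the return address and Recipient list while copying the From: field is one possible Mediator behavior, not a rule taken from this document.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Posting:
    """One posting into the MHS: envelope plus content (illustrative model)."""
    mail_from: str                                  # RFC5321.MailFrom
    rcpt_to: List[str]                              # RFC5321.RcptTo addresses
    headers: Dict[str, str] = field(default_factory=dict)
    body: str = ""


def mediate(original: Posting, mediator_addr: str, members: List[str]) -> Posting:
    """Re-post a delivered message as a new, independent posting."""
    return Posting(
        mail_from=mediator_addr,          # the Mediator answers for the new message
        rcpt_to=list(members),            # new Recipient list chosen by the Mediator
        headers=dict(original.headers),   # the Author's From: field is preserved
        body=original.body,
    )


first = Posting(
    "author@example.org",
    ["list@example.com"],
    {"From": "author@example.org"},
    "Hi all",
)
second = mediate(first, "list-bounces@example.com", ["a@example.net", "b@example.net"])
```

The two Posting values share no envelope data, which mirrors the statement that the MHS does not see re-posting as a continuation of a process.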

A Mediator's role is complex and contingent, for example, modifying and adding content or regulating which Users are allowed to participate and when. The common example of this role is a group Mailing List. In a more complex use, a sequence of Mediators could perform a sequence of formal steps, such as reviewing, modifying, and approving a purchase request.

A Gateway is a particularly interesting form of Mediator. It is a hybrid of User and Relay that connects heterogeneous mail services. Its purpose is to emulate a Relay. For a detailed discussion, see Section 2.2.3.

2.2. Message Handling Service (MHS) Actors

The Message Handling Service (MHS) performs a single end-to-end transfer on behalf of the Author to reach the Recipient addresses specified in the original RFC5321.RcptTo commands. Exchanges that are either mediated or iterative and protracted, such as those used for collaboration over time, are handled by the User Actors, not by the MHS Actors. As a pragmatic heuristic, MHS Actors generate, modify, or look at only transfer data, rather than the entire message.

Figure 3 shows the relationships among transfer participants in Internet Mail. Although it shows the Originator (labeled Origin) as distinct from the Author, and Receiver (labeled Recv) as distinct from Recipient, each pair of roles usually has the same Actor. Transfers typically entail one or more Relays. However, direct delivery from the Originator to Receiver is possible. Intra-organization mail services usually have only one Relay.
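
The heuristic that MHS Actors touch only transfer data can be illustrated with a small model. In this sketch, which is an assumption made for illustration, a Transfer record separates transfer data (envelope addresses and a trace list) from opaque content, and each hypothetical relay hop adds one trace entry and changes nothing else. The record layout and the "via ..." strings are invented here and are not SMTP syntax.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Transfer:
    """A message in transit, as seen by MHS Actors (illustrative model)."""
    mail_from: str               # envelope return address
    rcpt_to: Tuple[str, ...]     # envelope Recipient addresses
    trace: Tuple[str, ...]       # transfer data, newest entry first
    content: bytes               # treated as opaque in this sketch


def relay(transfer: Transfer, relay_name: str) -> Transfer:
    """One Relay hop: record a trace entry and leave everything else alone."""
    return replace(transfer, trace=(f"via {relay_name}",) + transfer.trace)


posted = Transfer(
    "author@example.org",
    ("recipient@example.net",),
    (),
    b"Subject: hi\r\n\r\nHello",
)
delivered = relay(relay(posted, "relay1.example.org"), "relay2.example.net")
```

Because Transfer is frozen, a hop cannot alter the envelope or content in place; it can only produce a copy with a longer trace.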

     ++==========++                         ++===========++
     ||  Author  ||                         || Recipient ||
     ++====++====++   +--------+            ++===========++
           ||         | Return |                  /\
           ||         +-+------+                  ||
           \/           .    ^                    ||
       +---------+      .    .                +---++---+
       |         |      .    .                |        |
    /--+---------+----------------------------+--------+----\
    |  |         |      .    .     MHS        |        |    |
    |  | Origin  +<......    .................+  Recv  |    |
    |  |         |           ^                |        |    |
    |  +---++----+           .                +--------+    |
    |      ||                .                    /\        |
    |      ||  ..............+..................  ||        |
    |      \/  .             .                 .  ||        |
    |  +-------+-+        +--+------+        +-+--++---+    |
    |  |  Relay  +=======>|  Relay  +=======>|  Relay  |    |
    |  +---------+        +----++---+        +---------+    |
    |                          ||                           |
    |                          ||                           |
    |                          \/                           |
    |                     +---------+                       |
    |                     | Gateway +-->...                 |
    |                     +---------+                       |
    \-------------------------------------------------------/

    Legend: === and || lines indicate primary (possibly indirect)
                transfers or roles
            ... lines indicate supporting transfers or roles

               Figure 3: Relationships among MHS Actors

2.2.1. Originator

The Originator ensures that a message is valid for posting and then submits it to a Relay. A message is valid if it conforms to both Internet Mail standards and local operational policies. The Originator can simply review the message for conformance and reject it if it finds errors, or it can create some or all of the necessary information. In effect, the Originator is responsible for the functions of the Mail Submission Agent.

The Originator operates with dual allegiance. It serves the Author and can be the same entity. But its role in assuring validity means that it also represents the local operator of the MHS, that is, the local ADministrative Management Domain (ADMD).
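
The validity check described at the start of this section can be sketched as a function that either reports problems or supplies missing information. The example below is a minimal sketch under stated assumptions: the function name validate_for_posting and the MAX_BYTES limit are hypothetical, the From: and Date: checks reflect the two header fields that RFC 5322 requires, and a real Originator would apply many more standards and policy checks.

```python
from email.message import EmailMessage
from email.utils import formatdate
from typing import List

MAX_BYTES = 10_000_000  # hypothetical local operational policy limit


def validate_for_posting(msg: EmailMessage, fix: bool = True) -> List[str]:
    """Return a list of problems; an empty list means the message may be posted."""
    problems = []
    if "From" not in msg:
        # The Originator cannot invent the Author, so this is always an error.
        problems.append("missing From:")
    if "Date" not in msg:
        if fix:
            # "Create some or all of the necessary information".
            msg["Date"] = formatdate(localtime=False)
        else:
            problems.append("missing Date:")
    if len(msg.as_bytes()) > MAX_BYTES:
        problems.append("exceeds local size policy")
    return problems


draft = EmailMessage()
draft["From"] = "author@example.org"
draft.set_content("hello")
result = validate_for_posting(draft)
```

The fix flag models the two behaviors the text allows: reviewing and rejecting, or filling in what is missing.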

The Originator also performs any post-submission, Author-related administrative tasks associated with message transfer and delivery. Notably, these tasks pertain to sending error and delivery notices, enforcing local policies, and dealing with messages from the Author that prove to be problematic for the Internet. The Originator is accountable for the message content, even when it is not responsible for it. The Author creates the message, but the Originator handles any transmission issues with it.

2.2.2. Relay

The Relay performs MHS-level transfer-service routing and store-and-forward by transmitting or retransmitting the message to its Recipients. The Relay adds trace information [RFC2505] but does not modify the envelope information or the message content semantics. It can modify message content representation, such as changing the form of transfer encoding from binary to text, but only as required to meet the capabilities of the next hop in the MHS.

A Message Handling Service (MHS) network consists of a set of Relays. This network is above any underlying packet-switching network that might be used and below any Gateways or other Mediators.

In other words, email scenarios can involve three distinct architectural layers, each providing its own type of store-and-forward service:

* User Mediators
* MHS Relays
* Packet Switches

The bottom layer is the Internet's IP service. The most basic email scenarios involve Relays and Switches.

When a Relay stops attempting to transfer a message, it becomes an Author because it sends an error message to the Return Address. The potential for looping is avoided by omitting a Return Address from this message.

2.2.3. Gateway

A Gateway is a hybrid of User and Relay that connects heterogeneous mail services. Its purpose is to emulate a Relay, and the closer it comes to this, the better. A Gateway operates as a User when it needs the ability to modify message content.

Differences between mail services can be as small as minor syntax variations, but they usually encompass significant, semantic distinctions. One difference could be email addresses that are hierarchical and machine-specific rather than a flat, global namespace. Another difference could be support for text-only content or multimedia. Hence the Relay function in a Gateway presents a significant design challenge if the resulting performance is to be seen as nearly seamless. The challenge is to ensure User-to-User functionality between the services, despite differences in their syntax and semantics.

The basic test of Gateway design is whether an Author on one side of a Gateway can send a useful message to a Recipient on the other side, without requiring changes to any components in the Author's or Recipient's mail services other than adding the Gateway. To each of these otherwise independent services, the Gateway appears to be a native participant. But the ultimate test of Gateway design is whether the Author and Recipient can sustain a dialogue. In particular, can a Recipient's MUA automatically formulate a valid Reply that will reach the Author?

2.2.4. Receiver

The Receiver performs final delivery or sends the message to an alternate address. It can also perform filtering and other policy enforcement immediately before or after delivery.

2.3. Administrative Actors

Administrative Actors can be associated with different organizations, each with its own administrative authority. This operational independence, coupled with the need for interaction between groups, provides the motivation to distinguish among ADministrative Management Domains (ADMDs). Each ADMD can have vastly different operating policies and trust-based decision-making. One obvious example is the distinction between mail that is exchanged within an organization and mail that is exchanged between independent organizations. The rules for handling both types of traffic tend to be quite different. That difference requires defining the boundaries of each, and this requires the ADMD construct.

Operation of Internet Mail services is carried out by different providers (or operators). Each can be an independent ADMD. This independence of administrative decision-making defines boundaries that distinguish different portions of the Internet Mail service. A department that operates a local Relay, an IT department that operates an enterprise Relay, and an ISP that operates a public shared email service can be configured into many combinations of administrative and operational relationships. Each is a distinct ADMD, potentially having a complex arrangement of functional components. Figure 4 depicts relationships among ADMDs. The benefit of the ADMD construct is that it facilitates discussion about designs, policies, and operations that need to distinguish between internal issues and external ones.

The architectural impact of the need for boundaries between ADMDs is discussed in [Tussle]. Most significant is that the entities communicating across ADMD boundaries typically have the added burden of enforcing organizational policies concerning external communications. At a more mundane level, routing mail between ADMDs can be an issue, such as needing to route mail between organizational partners over specially trusted paths.

These are three basic types of ADMDs:

   Edge: Independent transfer services in networks at the edge of the open Internet Mail service.

   Consumer: Might be a type of Edge service, as is common for web-based email access.

   Transit: Mail Service Providers (MSPs) that offer value-added capabilities for Edge ADMDs, such as aggregation and filtering.

The mail-level transit service is different from packet-level switching. End-to-end packet transfers usually go through intermediate routers; email exchange across the open Internet can be directly between the Boundary MTAs of Edge ADMDs. This distinction between direct and indirect interaction highlights the differences discussed in Section 2.2.2.

    +--------+     +---------+     +-------+     +-----------+
    | ADMD1  |<===>|  ADMD2  |<===>| ADMD3 |<===>|   ADMD4   |
    | -----  |     |  -----  |     | ----- |     |   -----   |
    |        |     |         |     |       |     |           |
    | Author |     |         |     |       |     | Recipient |
    |   .    |     |         |     |       |     |     ^     |
    |   V    |     |         |     |       |     |     .     |
    |  Edge..+....>|.Transit.+....>|-Edge..+....>|..Consumer |
    |        |     |         |     |       |     |           |
    +--------+     +---------+     +-------+     +-----------+

    Legend: === lines indicate primary (possibly indirect) transfers or
                roles
            ... lines indicate supporting transfers or roles

            Figure 4: Administrative Domain (ADMD) Example

Edge networks can use proprietary email standards internally. However, the distinction between Transit network and Edge network transfer services is significant because it highlights the need for concern over interaction and protection between independent administrations. In particular, this distinction calls for additional care in assessing the transitions of responsibility and the accountability and authorization relationships among participants in message transfer.

The interactions of ADMD components are subject to the policies of that domain, which cover concerns such as these:

* Reliability
* Access control
* Accountability
* Content evaluation and modification

These policies can be implemented in different functional components, according to the needs of the ADMD. For example, see [RFC5068].

Consumer, Edge, and Transit services can be offered by providers that operate component services or sets of services. Further, it is possible for one ADMD to host services for other ADMDs.
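
The chain of ADMDs in Figure 4 can be modeled to show where responsibility changes hands. The following is a minimal sketch under stated assumptions: the AdmdType enumeration, the Admd record, and the boundaries function are hypothetical names introduced for this example, and the sketch only lists handoffs; it does not model the policies enforced at each boundary.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class AdmdType(Enum):
    EDGE = "Edge"
    CONSUMER = "Consumer"
    TRANSIT = "Transit"


@dataclass(frozen=True)
class Admd:
    name: str
    kind: AdmdType


def boundaries(path: List[Admd]) -> List[Tuple[str, str]]:
    """Return each ADMD-to-ADMD handoff along a transfer path.

    Consecutive hops inside the same ADMD are not boundary crossings.
    """
    return [(a.name, b.name) for a, b in zip(path, path[1:]) if a != b]


# The path shown in Figure 4: Edge, Transit, Edge, Consumer.
figure4_path = [
    Admd("ADMD1", AdmdType.EDGE),
    Admd("ADMD2", AdmdType.TRANSIT),
    Admd("ADMD3", AdmdType.EDGE),
    Admd("ADMD4", AdmdType.CONSUMER),
]
crossings = boundaries(figure4_path)
```

Each pair in crossings marks a point where the policies listed above (reliability, access control, accountability, content evaluation) may differ on the two sides.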

These are common examples of ADMDs:

   Enterprise Service Providers: These ADMDs operate the internal data and/or the mail services within an organization.

   Internet Service Providers (ISP): These ADMDs operate the underlying data communication services, which are used by one or more Relay and User. ISPs are not responsible for performing email functions, but they can provide an environment in which those functions can be performed.

   Mail Service Providers: These ADMDs operate email services, such as for consumers or client companies.

Practical operational concerns demand that providers be involved in administration and enforcement issues. This involvement can extend to operators of lower-level packet services.