6. Optional Headers

Many MAIL headers, and many of those specified in present and future MAIL extensions, are potentially applicable to news. Headers specific to MAIL's point-to-point transmission paradigm, e.g., To and Cc, SHOULD NOT appear in news articles. (Gateways wishing to preserve such information for debugging probably SHOULD hide it under different names; prefixing "X-" to the original headers, resulting in forms like "X-To", is suggested.)

The following optional headers are either specific to news or of particular note in news articles; an article MAY contain some or all of them. (Note that there are some circumstances in which some of them are mandatory; these are explained under the individual headers.) An article MUST NOT contain two or more headers with any one of these header names.

NOTE: The ban on duplicate header names does not apply to headers not specified in this Draft, such as "X-" headers. Software should not assume that all header names in a given article are unique.

6.1. Followup-To

The Followup-To header contents specify to which newsgroup(s) followups should be posted:

    Followup-To-content = Newsgroups-content / "poster"

The syntax is the same as that of the Newsgroups content, with the exception that the magic word "poster" means that followups should be mailed to the article's reply address rather than posted. In the absence of Followup-To, the default newsgroup(s) for a followup are those in the Newsgroups header.

NOTE: The way to request that followups be mailed to a specific address other than that in the From line is to supply "Followup-To: poster" and a Reply-To header. Putting a mailing address in the Followup-To line is incorrect; posting agents should reject or rewrite such headers.

NOTE: There is no syntax for "no followups allowed" because "Followup-To: poster" accomplishes this effect without extra machinery.

Although it is generally desirable to limit followups to the smallest reasonable set of newsgroups, especially when the precursor was cross-posted widely, posting agents SHOULD NOT supply a Followup-To header except at the poster's explicit request.

NOTE: In particular, it is incorrect for the posting agent to assume that followups to a cross-posted article should be directed to the first newsgroup only. Trimming the list of newsgroups should be the poster's decision, not the posting agent's. However, when an article is to be cross-posted to a considerable number of newsgroups, a posting agent might wish to SUGGEST to the poster that followups go to a shorter list.

6.2. Expires

The Expires header content specifies a date and time when the article is deemed to be no longer useful and should be removed ("expired"):

    Expires-content = Date-content

The content syntax is the same as that of the Date content. In the absence of Expires, the default is decided by the administrators of each host the article reaches, who MAY also restrict the extent to which the Expires header is honored.

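As an illustration of how a host might combine its local policy with a poster's Expires header, the following Python sketch computes the time at which an article would be expired. It is only one possible policy, not something this Draft prescribes: the function name and the `default_days` and `max_days` parameters are hypothetical, and the standard library's MAIL date parser is assumed to be adequate for Date-content.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def effective_expiry(expires: Optional[str], arrival: datetime,
                     default_days: int = 14, max_days: int = 90) -> datetime:
    """Return when this host would expire the article (hypothetical policy).

    expires      -- content of the Expires header, or None if absent
    arrival      -- timezone-aware time at which the article arrived here
    default_days -- local retention when no Expires header is present
    max_days     -- local limit on how far an Expires header is honored
    """
    if expires is None:
        # No Expires header: the local administrator's default applies.
        return arrival + timedelta(days=default_days)
    requested = parsedate_to_datetime(expires)
    if requested.tzinfo is None:
        # Assumption: treat a date without usable zone information as UTC.
        requested = requested.replace(tzinfo=timezone.utc)
    # The administrator MAY restrict how far the header is honored.
    return min(requested, arrival + timedelta(days=max_days))
```

A host that prefers to ignore Expires entirely could simply always return the default branch; the clamp shown here is a middle ground.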
The Expires header has two main applications: removing articles whose utility ends on a specific date (e.g., event announcements that can be removed once the day of the event has passed) and preserving articles expected to be of prolonged usefulness (e.g., information aimed at new readers of a newsgroup). The latter application is sometimes abused. Since individual hosts have local policies for expiration of news (depending on available disk space, for instance),
posters SHOULD NOT provide Expires headers for articles unless there is a natural expiration date associated with the topic. Posting agents MUST NOT provide a default Expires header. Leave it out and allow local policies to be used unless there is a good reason not to. Expiry dates are properly the decision of individual host administrators; posters and moderators SHOULD set only expiry dates with which most administrators would agree.

NOTE: A poster preparing an Expires header for an article whose utility ends on a specific day should typically specify the NEXT day as the expiry date. A meeting on July 7th remains of interest on the 7th.

6.3. Reply-To

The Reply-To header content specifies a reply address different from the author's address given in the From header:

    Reply-To-content = From-content

In the absence of Reply-To, the reply address is the address in the From header.

Use of a Reply-To header is preferable to including a similar request in the article body, because reply-preparation software can take account of Reply-To automatically.

6.4. Sender

The Sender header identifies the poster, in the event that this differs from the author identified in the From header:

    Sender-content = From-content

In the absence of Sender, the default poster is the author (named in the From header).

NOTE: The intent is that the Sender header have a fairly high probability of identifying the person who really posted the article. The ability to specify a From header naming someone other than the poster is useful but can be abused.

If the poster supplies a From header, the posting agent MUST ensure that a Sender header is present, unless it can verify that the mailing address in the From header is a valid mailing address for the poster. A poster-supplied Sender header MAY be used, if its mailing address is verifiably a valid mailing address for the poster;
otherwise, the posting agent MUST supply a Sender header and delete (or rename, for example, to X-Unverifiable-Sender) any poster-supplied Sender header.

NOTE: It might be useful to preserve a poster-supplied Sender header so that the poster can supply the full-name part of the content. The mailing address, however, must be right, hence, the posting agent must generate the Sender header if it is unable to verify the mailing address of a poster-supplied one.

NOTE: NNTP implementors, in particular, are urged to note this requirement (which would eliminate the need for ad hoc headers like NNTP-Posting-Host), although there are admittedly some implementation difficulties. A user name from an [RFC1413] server and a host name from an inverse mapping of the address, perhaps with a "full name" comment noting the origin of the information, would be at least a first approximation:

    Sender: fred@zoo.toronto.edu (RFC-1413@reverse-lookup; not verified)

While this does not completely meet the specs, it comes a lot closer than not having a Sender header at all. Even just supplying a placeholder for the user name:

    Sender: somebody@zoo.toronto.edu (user name unknown)

would be better than nothing.

6.5. References

The References header content lists message IDs of precursors:

    References-content = message-id *( space message-id )

A followup MUST have a References header, and an article that is not a followup MUST NOT have a References header. The References-content of a followup MUST be the precursor's References-content (if any) followed by the precursor's message ID.

NOTE: Use the See-Also header (Section 6.16) for interconnection of articles that are not in a followup relationship to each other.

NOTE: In retrospect, RFCs 850 and 1036, and the implementations whose practice they represented, erred here. The proper MAIL header to use for references to precursors is In-Reply-To, and the References header is meant to be used for the purposes here ascribed to See-Also.
This incompatibility is far too solidly
established to be fixed, unfortunately. The best that can be done is to provide a clear mapping between the two and urge gateways to do the transformation. The news usage is (now) a deliberate violation of the MAIL specifications; articles containing news References headers are technically not valid MAIL messages, although it is unlikely that much MAIL software will notice because the incompatibility is at a subtle semantic level that does not affect the syntax.

UNRESOLVED ISSUE: Would it be better to just give up and admit that news uses References for both purposes?

UNRESOLVED ISSUE: Should the syntax be generalized to include URLs as alternatives to message IDs? Perhaps not; too many things know about References already. And non-articles can't be precursors of articles, not really.

Followup agents SHOULD NOT shorten References headers. If it is absolutely necessary to shorten the header, as a desperate last resort, a followup agent MAY do this by deleting some of the message IDs. However, it MUST NOT delete the first message ID, the last three message IDs (including that of the immediate precursor), or any message ID mentioned in the body of the followup. If it is possible for the followup agent to determine the Subject content of the articles identified in the References header, it MUST NOT delete the message ID of any article where the Subject content changed (other than by prepending of a back reference). The followup agent MUST NOT delete any message ID whose local part ends with "_-_" (underscore (ASCII 95), hyphen (ASCII 45), underscore); followup agents are urged to use this form to mark subject changes and to avoid using it otherwise.

NOTE: As software capable of exploiting References chains has grown more common, the random shortening permitted by [RFC1036] has become increasingly troublesome. ANY shortening is undesirable, and software should do it only in cases of dire necessity. In such cases, these rules attempt to limit the damage.

NOTE: The first message ID is very important as the starting point of the "thread" of discussion and absolutely should not be deleted. Keeping the last three message IDs gives thread-following software a fighting chance to reconstruct a full thread even if an article or two is missing. Keeping message IDs mentioned in the body is obviously desirable.

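The shortening rules of this section can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function names, the `max_ids` limit, and the `keep` parameter (message IDs the caller already knows to mark subject changes) are all hypothetical, and the sketch deletes the oldest deletable IDs first, which is only one possible choice. It takes an already-parsed list of message IDs, so it does not preserve multiple blanks present in older References content.

```python
from typing import Iterable, List


def is_protected(index: int, ids: List[str], body: str, keep: Iterable[str]) -> bool:
    """True if the message ID at `index` must not be deleted."""
    mid = ids[index]
    local_part = mid.strip("<>").rsplit("@", 1)[0]
    return (
        index == 0                      # first ID: start of the thread
        or index >= len(ids) - 3        # last three IDs, incl. immediate precursor
        or mid in body                  # mentioned in the body of the followup
        or local_part.endswith("_-_")   # subject-change marker
        or mid in keep                  # subject changes known to the caller
    )


def shorten_references(ids: List[str], body: str = "",
                       keep: Iterable[str] = (), max_ids: int = 8) -> str:
    """Return References-content with at most `max_ids` IDs where the rules allow."""
    keep = set(keep)
    excess = len(ids) - max_ids
    pieces = []
    gap = False  # True when one or more IDs were just deleted
    for i, mid in enumerate(ids):
        if excess > 0 and not is_protected(i, ids, body, keep):
            excess -= 1
            gap = True
            continue
        # At least three blanks mark each point where deletions were made.
        pieces.append(("   " if gap else " ") + mid)
        gap = False
    return "".join(pieces).lstrip()
```

If too many IDs are protected, the result may still exceed `max_ids`; the rules take priority over the length target.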
NOTE: Subject changes are difficult to determine, but they are significant as possible beginnings of new threads. The "_-_" convention is provided so that posting agents (which have more information about subjects) can flag articles containing a subject change in a way that followup agents can detect without access to the articles themselves. The sequence is chosen as one that is fairly unlikely to occur by accident.

UNRESOLVED ISSUE: Is "_-_" really worth having?

When a References header is shortened, at least three blanks SHOULD be left between adjacent message IDs at each point where deletions were made. Software preparing new References headers SHOULD preserve multiple blanks in older References content.

NOTE: It's desirable to have some marker of where deletions occurred, but the restricted syntax of the header makes this difficult. Extra white space is not a very good marker, since it may be deleted by software that ill-advisedly rewrites headers, but at least it doesn't break existing software.

To repeat: followup agents SHOULD NOT shorten References headers.

NOTE: Unfortunately, reading agents and other software analyzing References patterns have to be prepared for the worst anyway. The worst includes random deletions and the possibility of circular References chains (when References is misused in place of See-Also (Section 6.16)).

6.6. Control

The Control header content marks the article as a control message and specifies the desired actions (other than the usual ones of filing and passing on the article):

    Control-content = verb *( space argument )
    verb            = 1*( letter / digit )
    argument        = 1*<ASCII printable character>

The verb indicates what action should be taken, and the argument(s) (if any) supply details. In some cases, the body of the article may also contain details. Section 7 describes the standard verbs. See also the Also-Control header (Section 6.15).

NOTE: Control messages are often processed and filed rather differently than normal articles.

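The Control-content grammar is simple enough to check mechanically. The Python sketch below splits the content into a verb and its arguments; the function name is hypothetical, and it assumes that "space" means a run of blanks or tabs and that "ASCII printable character" means codes 33 through 126.

```python
import re
from typing import List, Tuple

VERB = re.compile(r"[A-Za-z0-9]+")        # verb = 1*( letter / digit )
ARGUMENT = re.compile(r"[\x21-\x7e]+")    # assumed: printable ASCII, no blanks


def parse_control(content: str) -> Tuple[str, List[str]]:
    """Split Control-content into (verb, arguments); raise ValueError if malformed."""
    verb, *args = re.split(r"[ \t]+", content.strip(" \t"))
    if not VERB.fullmatch(verb):
        raise ValueError("bad control verb: %r" % verb)
    for arg in args:
        if not ARGUMENT.fullmatch(arg):
            raise ValueError("bad control argument: %r" % arg)
    return verb, args
```

Because arguments may contain characters that are significant to command interpreters, a caller would be wise to pass them on as data rather than interpolating them into a command line.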
NOTE: The restriction of verbs to letters and digits is new but is consistent with existing practice and potentially simplifies implementation by avoiding characters significant to command interpreters. Beware that the arguments are under no such restriction in general.

NOTE: Two other conventions for distinguishing control messages from normal articles were formerly in use: a three-component newsgroup name ending in ".ctl" or a subject beginning with "cmsg " was considered to imply that the article was a control message. These conventions are obsolete. Do not use them.

An article with a Control header MUST NOT have an Also-Control or Supersedes header.

6.7. Distribution

The Distribution header content specifies geographic or organizational limits on an article's propagation:

    Distribution-content = distribution *( dist-delim distribution )
    dist-delim           = ","
    distribution         = plain-component

A distribution is syntactically identical to a one-component newsgroup name and must satisfy the same rules and restrictions. In the absence of Distribution, the default distribution is "world".

NOTE: This syntax has the disadvantage of containing no white space, making it impossible to continue a Distribution header across several lines. Implementors of relayers and reading agents are warned that it is intended that the successor to this Draft will change the definition of dist delimiter to:

    dist-delim = "," [ space ]

and are urged to fix their software to handle (i.e., ignore) white space following the commas.

A relayer MUST NOT pass an article to another relayer unless configuration information specifies transmission to that other relayer of BOTH (a) at least one of the article's newsgroup(s), and (b) at least one of the article's distribution(s). In effect, the only role of distributions is to limit propagation, by preventing transmission of articles that would have been transmitted had the decision been based solely on newsgroups.

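The propagation rule above amounts to two set intersections. The following Python sketch shows the decision under simplifying assumptions: the function names are hypothetical, and a neighbor's configuration is modeled as plain sets of newsgroup and distribution names, whereas real relayers typically configure patterns rather than exact names.

```python
from typing import Iterable, List, Optional


def parse_distribution(content: Optional[str]) -> List[str]:
    """Return the article's distributions; "world" is the default if absent."""
    if content is None:
        return ["world"]
    # Tolerate (ignore) white space after commas, as implementors are urged to.
    return [d.strip() for d in content.split(",") if d.strip()]


def may_relay(newsgroups: Iterable[str], distributions: Iterable[str],
              peer_newsgroups: Iterable[str], peer_distributions: Iterable[str]) -> bool:
    """True only if BOTH a newsgroup and a distribution are sent to the peer."""
    group_ok = bool(set(newsgroups) & set(peer_newsgroups))
    dist_ok = bool(set(distributions) & set(peer_distributions))
    return group_ok and dist_ok
```

For instance, an article in "sci.space" with "Distribution: ont" would not be passed to a neighbor configured for "sci.space" but only for the distributions "world" and "can".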
A posting agent might wish to present a menu of possible distributions, or suggest a default, but normally SHOULD NOT supply a default without giving the poster a chance to override it. A followup agent SHOULD initially supply the same Distribution header as found in the precursor, although the poster MAY alter this if appropriate.

Despite the syntactic similarity and some historical confusion, distributions are NOT newsgroup names. The whole point of putting a distribution on an article is that it is DIFFERENT from the newsgroup(s). In general, a meaningful distribution corresponds to some sort of region of propagation: a geographical area, an organization, or a cooperating subnet.

NOTE: Distributions have historically suffered from the completely uncontrolled nature of their name space, the lack of feedback to posters on incomplete propagation resulting from use of random trash in Distribution headers, and confusion with newsgroups (arising partly because many regions and organizations DO have internal newsgroups with names resembling their internal distributions). This has resulted in much garbage in Distribution headers, notably the pointless practice of automatically supplying the first component of the newsgroup name as a distribution (which is MOST unlikely to restrict propagation!). Many sites have opted to maximize propagation of such ill-formed articles by essentially ignoring distributions. This unfortunately interferes with legitimate uses. The situation is bad enough that distributions must be considered largely useless except within cooperating subnets that make an organized effort to restrain propagation of their internal distributions.

NOTE: The distributions "world" and "local" have no standard magic meaning (except that the former is the default distribution if none is given). Some pieces of software do assign such meanings to them.

6.8. Keywords

The Keywords header content is one or more phrases intended to describe some aspect of the content of the article:

    Keywords-content = plain-phrase *( "," [ space ] plain-phrase )

Keywords, separated by commas, each follow the <plain-phrase> syntax defined in Section 5.2. Encoded words in keywords MUST NOT contain characters other than letters (of either case), digits, and the characters "!", "*", "+", "-", "/", "=", and "_".

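A reading agent or archiver splitting Keywords content only needs to break at commas. The Python sketch below does so; the function name is hypothetical, and it assumes that a plain phrase never itself contains a comma.

```python
from typing import List


def split_keywords(content: str) -> List[str]:
    """Split Keywords-content at commas, trimming the optional space after each."""
    return [phrase.strip() for phrase in content.split(",") if phrase.strip()]


# White space does NOT separate keywords: this is a single four-word keyword.
one = split_keywords("Thompson Ritchie Multics Linux")
# Commas do: this yields four keywords.
four = split_keywords("Thompson, Ritchie, Multics, Linux")
```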
NOTE: Posters and posting agents are asked to take note that keywords are separated by commas, not by white space. The following Keywords header contains only one keyword (a rather unlikely and improbable one):

    Keywords: Thompson Ritchie Multics Linux

and should probably have been written:

    Keywords: Thompson, Ritchie, Multics, Linux

This particular error is unfortunately rather widespread.

NOTE: Reading agents and archivers preparing indexes of articles should bear in mind that user-chosen keywords are notoriously poor for indexing purposes unless the keywords are picked from a predefined set (which they are not in this case). Also, some followup agents unwisely propagate the Keywords header from the precursor into the followup by default. At least one news-based experiment has found the contents of Keywords headers to be completely valueless for indexing.

6.9. Summary

The Summary header content is a short phrase summarizing the article's content:

    Summary-content = nonblank-text

As with the subject, no restriction is placed on the content since it is intended solely for display to humans.

NOTE: Reading agents should be aware that the Summary header is often used as a sort of secondary Subject header, and (if present) its contents should perhaps be displayed when the subject is displayed.

The summary SHOULD be terse. Posters SHOULD avoid trying to cram their entire article into the headers; even the simplest query usually benefits from a sentence or two of elaboration and context, and not all reading agents display all headers.

6.10. Approved

The Approved header content indicates the mailing addresses (and possibly the full names) of the persons or entities approving the article for posting:

    Approved-content = From-content *( "," [ space ] From-content )

An Approved header is required in all postings to moderated newsgroups; the presence or absence of this header allows a posting agent to distinguish between articles posted by the moderator (which are normal articles to be posted normally) and attempted contributions by others (which should be mailed to the moderator for approval). An Approved header is also required in certain control messages, to reduce the probability of accidental posting of same; see the relevant parts of Section 7.

NOTE: There is, at present, no way to authenticate Approved headers to ensure that the claimed approval really was bestowed. Nor is there an established mechanism for even maintaining a list of legitimate approvers (such a list would quickly become out of date if it had to be maintained by hand). Such mechanisms, presumably relying on cryptographic authentication, would be a worthwhile extension to this Draft, and experimental work in this area is encouraged. (The problem is harder than it sounds because news is used on many systems that do not have real-time access to key servers.)

NOTE: Relayer implementors, please note well: it is the POSTING AGENT that is authorized to distinguish between moderator postings and attempted contributions, and to mail the latter to the moderator. As discussed in Section 9.1, relayers MUST NOT, repeat MUST NOT, send such mail; on receipt of an unApproved article in a moderated newsgroup, they should discard the article, NOT transform it into a mail message (except perhaps to a local administrator).

NOTE: [RFC1036] restricted Approved to a single From-content. However, multiple moderation is no longer rare, and multi-moderator Approved headers are already in use.

6.11. Lines

The Lines header content indicates the number of lines in the body of the article:

    Lines-content = 1*digit

The line count includes all body lines, including the signature (if any) and including empty lines (if any) at the beginning or end of the body. (The single empty separator line between the headers and the body is not part of the body.) The "body" here is the body as found in the posted article, AFTER all transformations such as MIME encodings.

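The counting rule can be illustrated with a few lines of Python. This is only a sketch: the function name is hypothetical, and the article is assumed to be held as a single string with "\n" line endings, with the first empty line separating headers from body.

```python
def count_body_lines(article: str) -> int:
    """Count body lines as the Lines header does (separator line excluded)."""
    headers, separator, body = article.partition("\n\n")
    if not separator:
        return 0  # no separator line, hence no body
    # Empty lines at the start or end of the body, and the signature, all count.
    return len(body.splitlines())
```

Applied to an article whose body is "hello", an empty line, "-- ", and "sig", this yields 4.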
Reading agents SHOULD NOT rely on the presence of this header, since it is optional (and some posting agents do not supply it). They MUST NOT rely on it being precise, since it frequently is not.

NOTE: The average line length in article bodies is surprisingly consistent at about 40 characters, and since the line count typically is used only for approximate judgements ("is this too long to read quickly?"), dividing the byte count of the body by 40 gives an estimate of the body line count that is adequate for normal use. This estimate is NOT adequate if the body has been MIME encoded, but neither is the Lines header: at least one major relayer will add a Lines header to an article that lacks one, without considering the possibility of MIME encodings when computing the line count.

NOTE: It would be better to have a Content-Size header as part of MIME, so that body parts could have their own sizes, and so that the units used could be appropriate to the data type (line count is not a useful measure of the size of an encoded image, for example). Doing this is preferable to trying to fix Lines.

UNRESOLVED ISSUE: Update on Content-Size?

Relayers SHOULD discard this header if they find it necessary to re-encode the article in such a way that the original Lines header would be rendered incorrect.

6.12. Xref

The Xref header content indicates where an article was filed by the last relayer to process it:

    Xref-content    = relayer 1*( space location )
    relayer         = relayer-name
    location        = newsgroup-name ":" article-locator
    article-locator = 1*<ASCII printable character>

The relayer's name is included so that software can determine which relayer generated the header (and specifically, whether it really was the one that filed the copy being examined). The locations specify what newsgroups the article was filed under (which may differ from those in the Newsgroups header) and where it was filed under them. The exact form of an article locator is implementation-specific.

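A reading agent might take Xref content apart along the lines of the following Python sketch. The function name is hypothetical, "space" is assumed to be any run of white space, and article locators are kept as opaque strings because their form is implementation-specific.

```python
from typing import Dict, Tuple


def parse_xref(content: str) -> Tuple[str, Dict[str, str]]:
    """Return (relayer name, {newsgroup: article locator}) from Xref-content."""
    fields = content.split()
    if len(fields) < 2:
        raise ValueError("Xref needs a relayer name and at least one location")
    relayer, locations = fields[0], {}
    for field in fields[1:]:
        # Newsgroup names contain no ":", so split at the first one only;
        # the locator itself may contain further colons.
        newsgroup, colon, locator = field.partition(":")
        if not (newsgroup and colon and locator):
            raise ValueError("bad Xref location: %r" % field)
        locations[newsgroup] = locator
    return relayer, locations
```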
NOTE: Reading agents can exploit this information to avoid presenting the same article to a reader several times. The information is sometimes available in system databases, but having it in the article is convenient. Relayers traditionally generate
an Xref header only if the article is cross-posted, but this is not mandatory, and there is at least one new application ("mirroring": keeping news databases on two hosts identical) where the header is useful in all articles.

NOTE: The traditional form of an article locator is a decimal number, with articles in each newsgroup numbered consecutively starting from 1. NNTP [RFC977] demands that such a model be provided, and there may be other software that expects it, but it seems desirable to permit flexibility for unorthodox implementations.

A relayer inserting an Xref header into an article MUST delete any previous Xref header. A relayer that is not inserting its own Xref header SHOULD delete any previous Xref header. A relayer MAY delete the Xref header when passing an article on to another relayer.

NOTE: [RFC1036] specified that the Xref header was not transmitted when an article was passed to another relayer, but the major news implementations have never obeyed this rule, and applications like mirroring depend on this disobedience.

A relayer MUST use the same name in Xref headers as it uses in Path headers. Reading agents MUST ignore an Xref header containing a relayer name that differs from the one that begins the path list.

6.13. Organization

The Organization header content is a short phrase identifying the poster's organization:

    Organization-content = nonblank-text

This header is typically supplied by the posting agent. The Organization content SHOULD mention geographical location (e.g., city and country) when it is not obvious from the organization's name.

NOTE: The motive here is that the organization is often difficult to guess from the mailing address, is not always supplied in a signature, and can help identify the poster to the reader.

NOTE: There is no "s" in "Organization".

The Organization content is provided for identification only and does not imply that the poster speaks for the organization or that the article represents organization policy.
Posting agents SHOULD permit the poster to override a local default Organization header.
6.14. Supersedes

The Supersedes header content specifies articles to be cancelled on arrival of this one:

    Supersedes-content = message-id *( space message-id )

Supersedes is equivalent to Also-Control (Section 6.15) with an implicit verb of "cancel" (Section 7.1).

NOTE: Supersedes is normally used where the article is an updated version of the one(s) being cancelled.

NOTE: Although the ability to use multiple message IDs in Supersedes is highly desirable (see Section 7.1), posters are warned that existing implementations often do not correctly handle more than one.

NOTE: There is no "c" in "Supersedes".

An article with a Supersedes header MUST NOT have an Also-Control or Control header.

6.15. Also-Control

The Also-Control header content marks the article as being a control message IN ADDITION to being a normal news article and specifies the desired actions:

    Also-Control-content = Control-content

An article with an Also-Control header is filed and passed on normally, but the content of the Also-Control header is processed as if it were found in a Control header.

NOTE: It is sometimes desirable to piggyback control actions on a normal article, so that the article will be filed normally but will also be acted on as a control message. This header is essentially a generalization of Supersedes.

NOTE: Be warned that some old relayers do not implement Also-Control.

An article with an Also-Control header MUST NOT have a Control or Supersedes header.

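Sections 6.6, 6.14, and 6.15 together say that Control, Also-Control, and Supersedes are mutually exclusive. A posting agent could check this as in the Python sketch below; the function name is hypothetical, and header names are assumed to be compared without regard to case.

```python
from typing import Iterable, List

# At most one of these headers may appear in an article.
EXCLUSIVE_HEADERS = {"control", "also-control", "supersedes"}


def conflicting_control_headers(header_names: Iterable[str]) -> List[str]:
    """Return the mutually exclusive headers present, if more than one is."""
    present = sorted({name.lower() for name in header_names} & EXCLUSIVE_HEADERS)
    return present if len(present) > 1 else []
```

An empty result means the article satisfies the restriction; a non-empty one lists the headers that conflict.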
6.16. See-Also

The See-Also header content lists message IDs of articles that are related to this one but are not its precursors:

    See-Also-content = message-id *( space message-id )

See-Also resembles References, but without the restrictions imposed on References by the followup rules.

NOTE: See-Also provides a way to group related articles, such as the parts of a single document that had to be split across multiple articles due to its size, or to cross-reference between parallel threads.

NOTE: See the discussion (in Section 6.5) on MAIL compatibility issues of References and See-Also.

NOTE: In the specific case where it is desired to essentially make another article PART of the current one, e.g., for annotation of the other article, MIME's "message/external-body" convention can be used to do so without actual inclusion. "news-message-ID" was registered as a standard external-body access method, with a mandatory NAME parameter giving the message ID and an optional SITE parameter suggesting an NNTP site that might have the article available (if it is not available locally), by IANA 22 June 1993.

UNRESOLVED ISSUE: Could the syntax be generalized to include URLs as alternatives to message IDs? Here it makes much more sense than in References.

6.17. Article-Names

The Article-Names header content indicates any special significance the article may have in particular newsgroups:

    Article-Names-content = 1*( name-clause space )
    name-clause           = newsgroup-name ":" article-name
    article-name          = letter 1*( letter / digit / "-" )

Each name clause specifies a newsgroup (which SHOULD be among those in the Newsgroups header) and an article name local to that newsgroup. Article names MAY be used by relayers to file the article in special ways, or they MAY just be noted for possible special attention by reading agents. Article names are case-sensitive.

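The Article-Names grammar can be checked as in the following Python sketch. The function name is hypothetical, "space" is assumed to be any run of white space, and the length check reflects the 14-character limit on article names stated later in this section. Note that the grammar as written requires an article name to be at least two characters long.

```python
import re
from typing import Dict, List

# article-name = letter 1*( letter / digit / "-" )
ARTICLE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9-]+")
MAX_NAME_LENGTH = 14


def parse_article_names(content: str) -> Dict[str, List[str]]:
    """Return {newsgroup: [article names]} from Article-Names-content."""
    names: Dict[str, List[str]] = {}
    for clause in content.split():
        newsgroup, colon, name = clause.partition(":")
        if not (newsgroup and colon and ARTICLE_NAME.fullmatch(name)):
            raise ValueError("bad name clause: %r" % clause)
        if len(name) > MAX_NAME_LENGTH:
            raise ValueError("article name too long: %r" % name)
        # Names are case-sensitive, so they are stored exactly as given.
        names.setdefault(newsgroup, []).append(name)
    return names
```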
NOTE: This header provides a way to mark special postings, such as introductions, frequently-asked-question lists, etc., so that reading agents have a way of finding them automatically. The newsgroup name is specified for each article name because the names may be newsgroup-specific; for example, many frequently-asked-question lists are posted to "news.answers" in addition to their "home" newsgroup, and they would not be known by the same name(s) in both newsgroups.

The Article-Names header SHOULD be ignored unless the article also contains an Approved header.

NOTE: This stipulation is made in anticipation of the possibility that Approved headers will be involved in cryptographic authentication.

The presence of an Article-Names header does not necessarily imply that the article will be retained unusually long before expiration, or that previous article(s) with similar Article-Names headers will be cancelled by its arrival. Posters preparing special postings SHOULD include appropriate other headers, such as Expires and Supersedes, to request such actions.

Different networks MAY establish different sets of article names for the special postings they deem significant; it is preferable for usage to be standardized within networks, although it might be desirable for individual newsgroups to have different naming conventions in some situations. Article names MUST be 14 characters or less. The following names are suggested but are not mandatory:

    intro       Introduction to the newsgroup for newcomers.

    charter     Charter, rules, organization, moderation policies, etc.

    background  Biographies of special participants, history of the newsgroup, notes on related newsgroups, etc.

    subgroups   Descriptions of sub-newsgroups under this newsgroup, e.g., "sci.space.news" under "sci.space".

    facts       Information relating to the purpose of the newsgroup, e.g., an acronym glossary in "sci.space".

    references  Where to get more information: books, journals, FTP repositories, etc.

    faq         Answers to frequently asked questions.

    menu        If present, a list of all of the other article names local to this newsgroup, with brief descriptions of their contents.

Such articles may be divided into subsections using the MIME "multipart/mixed" conventions. If size considerations make it necessary to split such articles, names ending in a hyphen and a part number are suggested; for example, a three-part frequently-asked-questions list could have article names "faq-1", "faq-2", and "faq-3".

NOTE: It is somewhat premature to attempt to standardize article names, since this is essentially a new feature with no experience behind it. However, if reading agents are to attach special significance to these names, some attempt at standard conventions is imperative. This is a first attempt at providing some.

6.18. Article-Updates

The Article-Updates header content indicates what previous articles this one is deemed (by the poster) to update (i.e., replace):

    Article-Updates-content = message-id *( space message-id )

Each message ID identifies a previous article that this one is deemed to update. This MUST NOT cause the previous article(s) to be cancelled or otherwise altered, unless this is implied by other headers (e.g., Supersedes); Article-Updates is merely an advisory that MAY be noted for special attention by reading agents.

NOTE: This header provides a way to mark articles that are only minor updates of previous ones, containing no significant new information and not worth reading if the previous ones have been read.

NOTE: If suitable conventions using MIME multipart bodies and the "message/external-body" body-part type can be developed, a replacing article might contain only differences between the old text and the new text, rather than a complete new copy. This is the motivation for not making Article-Updates also function as Supersedes does: the replacing article might depend on the continued presence of the replaced article.

7. Control Messages

The following sections document the currently defined control messages. "Message" is used herein as a synonym for "article" unless context indicates otherwise.

Posting agents are warned that since certain control messages require article bodies in quite specific formats, signatures SHOULD NOT be appended to such articles, and it may be wise to take greater care than usual to avoid unintended (although perhaps well-meaning) alterations to text supplied by the poster. Relayers MUST assume that control messages mean what they say; they MAY be obeyed as is or rejected, but MUST NOT be reinterpreted.

The execution of the actions requested by control messages is subject to local administrative restrictions, which MAY deny requests or refer them to an administrator for approval. The descriptions below are generally phrased in terms suggesting mandatory actions, but any or all of these MAY be subject to local administrative approval (either as a class or case-by-case). Analogously, where the description below specifies that a message or portion thereof is to be ignored, this action MAY include reporting it to an administrator.

NOTE: The exact choice of local action might depend on what action the control message requests, who it claims to come from, etc.

Relayers MUST propagate even control messages they do not understand.

In the following sections, each type of control message is defined syntactically by defining its arguments and its body. For example, "cancel" is defined by defining cancel-arguments and cancel-body.

7.1. cancel

The cancel message requests that one or more previous articles be "cancelled":

    cancel-arguments = message-id *( space message-id )
    cancel-body      = body

The argument(s) identify the articles to be cancelled, by message ID. The body is a comment, which software MUST ignore, and SHOULD contain an indication of why the cancellation was requested. The cancel message SHOULD be posted to the same newsgroup(s), with the same distribution(s), as the article(s) it is attempting to cancel.

NOTE: Using the same newsgroups and distributions maximizes the chances of the cancel message propagating everywhere the target articles went. NOTE: [RFC1036] permitted only a single message-id in a cancel message. Support for cancelling multiple articles is highly desirable, especially for use with Supersedes (see Section 6.14). If several revisions of an article appear in fast succession, each
using Supersedes to cancel the previous one, it is possible for a middle revision to be destroyed by cancellation before it is propagated onward to cancel its predecessor. Allowing each article to cancel several predecessors greatly alleviates this problem. (Posting agents preparing a cancel of an article that itself cancels other articles might wish to add those articles to the cancel-arguments.) However, posters should be aware that much old software does not implement multiple cancellation properly and should avoid using it when reliable cancellation is vitally important. When an article (the "target article") is to be cancelled, there are four cases of interest: the article hasn't arrived yet, it has arrived and been filed and is available for reading, it has expired and been archived on some less-accessible storage medium, or it has expired and been deleted. The next few paragraphs discuss each case in turn (in reverse order, which is convenient for the explanation). EXPIRED AND DELETED. Take no action. EXPIRED AND ARCHIVED. If the article is readily accessible and can be deleted or made unreadable easily, treat as under AVAILABLE below. Otherwise, treat as under EXPIRED AND DELETED. NOTE: While it is desirable for archived articles to be cancellable, this can easily involve rewriting an entire archive volume just to get rid of one article, perhaps with manual actions required to arrange it. It is difficult to envision a situation so dire as to require such measures from hundreds or thousands of administrators, or for that matter one in which widespread compliance with such a request is likely. AVAILABLE. Compare the mailing addresses from the From lines of the cancel message and the target article, bearing in mind that local parts (except for "postmaster") are case-sensitive and domains are case-insensitive. If they do not match, either refer the issue to an administrator for a case-by-case decision, or treat as if they matched. 
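The address comparison described under AVAILABLE can be sketched as follows. This is only an illustration, not a prescribed implementation: the function names are hypothetical, and the sketch assumes that the bare "local@domain" address has already been extracted from each From line.

```python
from typing import Tuple


def split_address(address: str) -> Tuple[str, str]:
    """Split a bare 'local@domain' address at its last '@'."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a local@domain address: {address!r}")
    return local, domain


def same_mailbox(first: str, second: str) -> bool:
    """Compare two addresses: domains case-insensitively, local parts
    case-sensitively, except that "postmaster" matches in any case."""
    local_a, domain_a = split_address(first)
    local_b, domain_b = split_address(second)
    if domain_a.lower() != domain_b.lower():
        return False
    if local_a.lower() == "postmaster" and local_b.lower() == "postmaster":
        return True
    return local_a == local_b
```

A False result here does not license dropping the cancel: the choices remain referring it to an administrator or treating the addresses as if they matched.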
NOTE: It is generally trivial to forge articles, so nothing short of cryptographic authentication is really adequate to ensure that a cancel came from the original article's author. Moreover, it is highly desirable to permit authorities other than the author to cancel articles, to allow for cases in which the author is unavailable, uncooperative, or malicious, and in which damage and/or legal problems may be minimized by prompt cancellation.
Reliable authentication that would permit such administrative cancels would be a worthwhile extension to this Draft, and experimental work in this area is encouraged. NOTE: Meanwhile, a simple check of addresses is useful accident prevention and catches at least the most simple-minded forgers. Since the intent is accident prevention rather than ironclad security, use of the From address is appropriate, all the more so because in the presence of gateways (especially redundant multiple gateways), the author may not have full control over Sender headers. NOTE: The "refer... or treat as if they matched" rule is intended to specifically forbid quietly ignoring cancels with mismatched addresses. If the addresses match, then if technically possible, the relayer MUST delete the target article completely and immediately. Failing that, it MUST make the target article unreadable (preferably to everyone, minimally to everyone but the administrator) and either arrange for it to be deleted as soon as possible or notify an administrator at once. NOTE: To allow for events such as criminal actions, malicious forgeries, and copyright infringements, where damage and/or legal problems may be minimized by prompt cancellation, complete removal is strongly preferred over merely making the target article unreadable. The potential for malice is outweighed by the importance of really getting rid of the target article in some legitimate cases. (In cases of inadvertent copyright violation in particular, the ability to quickly remedy the violation is of considerable legal importance.) Failing that, making it unreadable is better than nothing. NOTE: Merely annotating the article so that readers see an indication that the author wanted it cancelled is not acceptable. Making the article unreadable is the minimum action. NOTE: There have been experiments with making cancelled articles unreadable, so that local news administrators could reverse cancellations. 
In practice, administrators almost never find cause to do so. Removal appears to be clearly preferable where technically feasible.
NOT ARRIVED YET. If practical, retain the cancel message until the target article does arrive, or until there is no further possibility of it arriving and being accepted (see Section 9.2), and then treat as under AVAILABLE. Failing that, arrange for the target article to be rejected and discarded if it does arrive. NOTE: It may well be impractical to retain the control message, given uncertainty about whether the target article will ever arrive. Existing practice in such cases is to assume that addresses would match and arrange the equivalent of deletion. This is often done by making a spurious entry in a database of already-seen message IDs (see Section 9.3), so that if the article does arrive, it will be rejected as a duplicate. The cancel message MUST be propagated onward in the usual fashion, regardless of which of the four cases applied, so that the target article will be cancelled everywhere even if cancellation and target article follow different routes. NOTE: [RFC1036] appeared to require stopping cancel propagation in the NOT ARRIVED YET case, although the wording was somewhat unclear. This appears to have been an unwise decision; there are known cases of important cancellations (in situations of inadvertent copyright violation, for example) achieving rather poorer propagation than the target article. News propagation is often a much less orderly process than the authors of [RFC1036] apparently envisioned. Modern implementations generally propagate the cancellation regardless. Posting agents meant for use by ordinary posters SHOULD reject an attempt to post a cancel message if the target article is available and the mailing address in its From header does not match the one in the cancel message's From header. NOTE: This, again, is primarily accident prevention. 7.2. ihave, sendme The ihave and sendme control messages implement a crude batched predecessor of the NNTP [RFC977] protocol. 
They are largely obsolete in the Internet but still see use in the UUCP environment, especially for backup feeds that normally are active only when a primary feed path has failed. NOTE: The ihave and sendme messages defined here have ABSOLUTELY NOTHING TO DO WITH NNTP, despite similarities of terminology.
The two messages share the same syntax:

    ihave-arguments  = *( message-id space ) relayer-name
    sendme-arguments = ihave-arguments
    ihave-body       = *( message-id eol )
    sendme-body      = ihave-body

Message IDs MUST appear in either the arguments or the body, but not both. Relayers SHOULD generate the form putting message IDs in the body, but the other form MUST be supported for backward compatibility.

NOTE: [RFC1036] made the relayer name optional, but difficulties could easily ensue in determining the origin of the message, and this option is believed to be unused nowadays. Putting the message IDs in the body is strongly preferred over putting them in the arguments because it lends itself much better to large numbers of message IDs and avoids the empty-body problem mentioned in Section 4.3.1.

The ihave message states that the named relayer has filed articles with the specified message IDs, which may be of interest to the relayer(s) receiving the ihave message. The sendme message requests that the relayer receiving it send the articles having the specified message IDs to the named relayer.

These control messages are normally sent essentially as point-to-point messages, by using "to." newsgroups (see Section 5.5) that are sent only to the relayer for which the messages are intended. The two relayers MUST be neighbors, exchanging news directly with each other. Each relayer advertises its new arrivals to the other using ihave messages, and each uses sendme messages to request the articles it lacks.

NOTE: Arguably these point-to-point control messages should flow by some other protocol, e.g., mail, but administrative and interfacing issues are simplified if the news system doesn't need to talk to the mail system.

To reduce overhead, ihave and sendme messages SHOULD be sent relatively infrequently and SHOULD contain substantial numbers of message IDs.
If ihave and sendme are being used to implement a backup feed, it may be desirable to insert a delay between reception of an ihave and generation of a sendme, so that a slightly slow primary feed will not cause large numbers of articles to be requested unnecessarily via sendme.
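The shared ihave/sendme syntax might be handled along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, the arguments and body are assumed to be already separated from the article, and message IDs are recognized only by their enclosing angle brackets. Treating a message with no IDs at all as an error is this sketch's reading of the "MUST appear" rule.

```python
from typing import List, Tuple


def parse_ihave(arguments: str, body: str) -> Tuple[str, List[str]]:
    """Return (relayer_name, message_ids) for an ihave or sendme message."""
    words = arguments.split()
    if not words or words[-1].startswith("<"):
        raise ValueError("relayer name is required as the last argument")
    *arg_ids, relayer = words
    body_ids = [line.strip() for line in body.splitlines() if line.strip()]
    # IDs may be in the arguments (old form) or the body, but not both.
    if arg_ids and body_ids:
        raise ValueError("message IDs in both arguments and body")
    ids = arg_ids or body_ids
    if not ids:
        raise ValueError("no message IDs supplied")
    for message_id in ids:
        if not (message_id.startswith("<") and message_id.endswith(">")):
            raise ValueError(f"not a message ID: {message_id!r}")
    return relayer, ids
```

Both forms yield the same result, so the rest of a relayer need not care which one a neighbor generated.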
7.3. newgroup

The newgroup control message requests that a new newsgroup be created:

    newgroup-arguments = newsgroup-name [ space moderation ]
    moderation         = "moderated" / "unmoderated"
    newgroup-body      = body / [ body ] descriptor [ body ]
    descriptor         = descriptor-tag eol description-line eol
    descriptor-tag     = "For your newsgroups file:"
    description-line   = newsgroup-name space description
    description        = nonblank-text [ " (Moderated)" ]

The first argument names the newsgroup to be created, and the second one (if present) indicates whether it is moderated. If there is no second argument, the default is "unmoderated".

NOTE: Implementors are warned that there is occasional use of other forms in the second argument. It is suggested that such violations of this Draft, which are also violations of [RFC1036], cause the newgroup message to be ignored. [RFC1036] was slightly vague about how second arguments other than "moderated" were to be treated (specifically, whether they were illegal or just ignored), but it is thought that all existing major implementations will handle "unmoderated" correctly, and it appears desirable to tighten up the specs to make it possible for other forms to be used in future.

The body is a comment, which software MUST ignore, except that if it contains a descriptor, the description line is intended to be suitable for addition to a list of newsgroup descriptions. The description cannot be continued onto later lines but is not constrained to any particular length. Moderated newsgroups have descriptions that end with the string " (Moderated)" (note that this string begins with a blank).

NOTE: It is unfortunate that the description line is part of the body, rather than being supplied in a header, but this is established practice. Newsgroup creators are cautioned that the descriptor tag must be reproduced exactly as given above, must be alone on a line, and that it is case-sensitive.
(To reduce errors in this regard, posting agents might wish to question or reject newgroup messages that do not contain a descriptor.) Given the desire for short lines, description writers should avoid content-free phrases like "discussion of" and "news about", and stick to defining what the newsgroup is about.
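The argument and descriptor rules above can be illustrated with a short sketch. The function names are hypothetical, and the sketch assumes that "space" in the grammar means a run of blanks and/or tabs. Raising an error for an unrecognized second argument stands in for "cause the newgroup message to be ignored".

```python
from typing import Optional, Tuple

DESCRIPTOR_TAG = "For your newsgroups file:"


def parse_newgroup_arguments(arguments: str) -> Tuple[str, bool]:
    """Return (newsgroup_name, is_moderated); the default is unmoderated."""
    words = arguments.split()
    if len(words) == 1:
        return words[0], False
    if len(words) == 2 and words[1] in ("moderated", "unmoderated"):
        return words[0], words[1] == "moderated"
    raise ValueError(f"unrecognized newgroup arguments: {arguments!r}")


def find_description(body: str, newsgroup: str) -> Optional[str]:
    """Return the description for newsgroup, or None if the body has no
    usable descriptor. The tag must match exactly and be alone on a line."""
    lines = body.splitlines()
    for index, line in enumerate(lines[:-1]):
        if line == DESCRIPTOR_TAG:
            parts = lines[index + 1].split(None, 1)
            if len(parts) == 2 and parts[0] == newsgroup:
                return parts[1]
            return None
    return None
```

Everything in the body other than the description line is left untouched, in keeping with the rule that the rest is a comment for human readers.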
The remainder of the body SHOULD contain an explanation of the purpose of the newsgroup and the decision to create it. NOTE: Criteria for newsgroup creation vary widely and are outside the scope of this Draft, but if formal procedures of one kind or another were followed in the decision, the body should mention this. Administrators often look for such information when deciding whether to comply with creation/deletion requests. A newgroup message that lacks an Approved header MUST be ignored. NOTE: It would also be desirable to ignore a newgroup message unless its Approved header names a person who is authorized (in some sense) to create such a newsgroup. A cooperating subnet with sufficiently strong coordination to maintain a correct and current list of authorized creators might wish to do so for its internal newsgroups. It also (or alternatively) might wish to ignore a newgroup message for an internal newsgroup that was posted (or cross-posted) to a non-internal newsgroup. NOTE: As mentioned in Section 6.10, some form of (cryptographic?) authentication of Approved headers would be highly desirable, especially for control messages. It would be desirable to provide some way of supplying a moderator's address in a newgroup message for a moderated newsgroup, but this will cause problems unless effective authentication is available, so it is left for future work. NOTE: This leaves news administrators stuck with the annoying chore of arranging proper mailing of moderated-newsgroup submissions. On Usenet, this can be simplified by exploiting a forwarding facility that some major sites provide: they maintain forwarding addresses, each the name of a moderated newsgroup with all periods (".", ASCII 46) replaced by hyphens ("-", ASCII 45), which forward mail to the current newsgroup moderators. 
More advice on the subject of forwarding to moderators can be found in the document titled "How to Construct the Mailpaths File", posted regularly to the Usenet newsgroups news.lists, news.admin.misc, and news.answers. A newgroup message naming a newsgroup that already exists is requesting a change in the moderation status or description of the newsgroup. The same rules apply.
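The forwarding-address convention mentioned in the NOTE above amounts to a simple string substitution. In this sketch the function name and the forwarding host "forwarder.example" are placeholders, not real facilities:

```python
def submission_address(newsgroup: str, forwarder_domain: str) -> str:
    """Build a moderator-forwarding address by replacing each period in
    the newsgroup name with a hyphen."""
    return f"{newsgroup.replace('.', '-')}@{forwarder_domain}"


# Example: submissions for "comp.lang.example" sent via a forwarding site.
address = submission_address("comp.lang.example", "forwarder.example")
```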
7.4. rmgroup

The rmgroup message requests that a newsgroup be deleted:

    rmgroup-arguments = newsgroup-name
    rmgroup-body      = body

The sole argument is the newsgroup name. The body is a comment, which software MUST ignore; it SHOULD contain an explanation of the decision to delete the newsgroup.

NOTE: Criteria for newsgroup deletion vary widely and are outside the scope of this Draft, but if formal procedures of one kind or another were followed in the decision, the body should mention this. Administrators often look for such information when deciding whether to comply with creation/deletion requests.

A rmgroup message that lacks an Approved header MUST be ignored.

NOTE: It would also be desirable to ignore a rmgroup message unless its Approved header names a person who is authorized (in some sense) to delete such a newsgroup. A cooperating subnet with sufficiently strong coordination to maintain a correct and current list of authorized deleters might wish to do so for its internal newsgroups. It also (or alternatively) might wish to ignore a rmgroup message for an internal newsgroup that was posted (or cross-posted) to a non-internal newsgroup.

Unexpected deletion of a newsgroup being a disruptive action, implementations are strongly advised to refer rmgroup messages to an administrator by default, unless perhaps the message can be determined to have originated within a cooperating subnet whose members are considered trustworthy. Abuses have occurred.

7.5. sendsys, version, whogets

The sendsys message requests that a description of the relayer's news feeds to other relayers be mailed to the article's reply address:

    sendsys-arguments = [ relayer-name ]
    sendsys-body      = body

If there is an argument, relayers other than the one named by the argument MUST NOT respond. The body is a comment, which software MUST ignore; it SHOULD contain an explanation of the reason for the request.
The version message requests that the name and version of the relayer software be mailed to the reply address:

    version-arguments =
    version-body      = body

There are no arguments. The body is a comment, which software MUST ignore; it SHOULD contain an explanation of the reason for the request.

The whogets message requests that a description of the relayer and its news feeds to other relayers be mailed to the article's reply address:

    whogets-arguments = newsgroup-name [ space relayer-name ]
    whogets-body      = body

The first argument is the name of the "target newsgroup", specifying the newsgroup for which propagation information is desired. This MUST be a complete newsgroup name, not the name of a hierarchy or a portion of a newsgroup name that is not itself the name of a newsgroup. If there is a second argument, only the relayer named by that argument should respond. The body is a comment, which software MUST ignore; it SHOULD contain an explanation of the reason for the request.

NOTE: Whogets is intended as a replacement for sendsys (and version) with a precisely specified reply format. Since the syntax for specifying what newsgroups get sent to what other relayers varies widely between different forms of relayer software, the only practical way to standardize the reply format is to indicate a specific newsgroup and ask where THAT newsgroup propagates. The requirement that it be a complete newsgroup name is intended to (largely) avoid the problem of having to answer "yes and no" in cases where not all newsgroups in a hierarchy are sent.

Any of these messages lacking an Approved header MUST be ignored. Response to any of these messages SHOULD be delayed for at least 24 hours, and no response should be attempted if the message has been cancelled in that time. Also, no response SHOULD be attempted unless the local part of the destination address is "newsmap".
News administrators SHOULD arrange for mail to "newsmap" on their systems to be discarded (without reply) unless legitimate use is in progress. NOTE: Because these messages can cause many, many relayers to send mail to one person, such messages, specifying mailing to an innocent person's mailbox, have been forged as a half-witted
practical joke. A delay gives administrators time to notice a fraudulent message and act (by cancelling the message, preparing to divert the flood of mail into the bit bucket, or both). Restriction of the destination address to "newsmap" reduces the appeal of fraud by making it impossible to use it to harass a normal user. (A site that does NOT discard mail to "newsmap", but rather bounces it back, may incur higher communications costs than if the mail had been accepted into a user's mailbox, but a malicious forger could accomplish this anyway, by using an address whose local part is very unlikely to be a legitimate mailbox name.)

NOTE: [RFC1036] did not require the Approved header for these control messages. This has been added because of the possibility that cryptographic authentication of Approved headers will become available.

The body of the reply to a sendsys message SHOULD be of the form:

    sendsys-reply      = responder 1*sys-line
    responder          = "Responding-System:" space domain eol
    sys-line           = relayer-name ":" newsgroup-patterns [ ":" text ] eol
    newsgroup-patterns = newsgroup-name *( "," newsgroup-name )

The first line identifies the responding system, using a syntax resembling a header (but note that it is part of the BODY). Remaining lines indicate what newsgroups are sent to what other systems. The syntax of newsgroup patterns is not well standardized; the form described is common (often with newsgroup names only partially given, denoting all names starting with a particular set of components) but not universal. The whogets message provides a better-defined alternative.

The reply to a version message is of somewhat ill-defined form, with a body normally consisting of a single line of text that somehow describes the version of the relayer software. The whogets message provides a better-defined alternative.
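The preconditions for answering a sendsys, version, or whogets message, as given earlier in this section, can be gathered into a single check. The following is a sketch only: the function name, its parameters, and the representation of headers as a mapping keyed by canonical header name are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Mapping, Optional

RESPONSE_DELAY = timedelta(hours=24)  # "at least 24 hours"


def may_respond(
    headers: Mapping[str, str],
    named_relayer: Optional[str],
    local_relayer: str,
    reply_address: str,
    received_at: datetime,
    now: datetime,
    cancelled: bool,
) -> bool:
    """Decide whether a reply may be mailed for this control message."""
    if "Approved" not in headers:
        return False  # messages lacking Approved are ignored
    if named_relayer is not None and named_relayer != local_relayer:
        return False  # a different relayer was asked to respond
    if reply_address.rpartition("@")[0] != "newsmap":
        return False  # only the "newsmap" mailbox may be a destination
    if now - received_at < RESPONSE_DELAY:
        return False  # too early; check again later
    return not cancelled  # cancelled during the delay: stay silent
```

A False result because of the delay is temporary, so a caller would re-evaluate the check once the 24 hours have elapsed; local administrative policy can still veto a reply that this check would allow.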
The body of the reply to a whogets message MUST be of the form:

    whogets-reply     = responder-domain responder-relayer response-date responding-to arrived-via responder-version whogets-delimiter *pass-line
    responder-domain  = "Responding-System:" space domain eol
    responder-relayer = "Responding-Relayer:" space relayer-name eol
    response-date     = "Response-Date:" space date eol
    responding-to     = "Responding-To:" space message-id eol
    arrived-via       = "Arrived-Via:" path-list eol
    responder-version = "Responding-Version:" space nonblank-text eol
    whogets-delimiter = eol
    pass-line         = relayer-name [ space domain ] eol

The first six lines identify the responding relayer by its Internet domain name (use of the ".uucp" and ".bitnet" pseudo-domains is permissible, for registered hosts in them, but discouraged) and its relayer name; specify the date when the reply was generated and the message ID of the whogets message being replied to; give the path list (from the Path header) of the whogets message (which MAY, if absolutely necessary, be truncated to a convenient length, but MUST contain at least the leading three relayer names); and indicate the version of relayer software responding. Note that these lines are part of the BODY even though their format resembles that of headers. Despite the apparently fixed order specified by the syntax above, they can appear in any order, but there must be exactly one of each.

After those preliminaries, and an empty line to unambiguously define their end, the remaining lines are the relayer names (which MAY be accompanied by the corresponding domain names, if known) of systems to which the responding system passes the target newsgroup. Only the names of news relayers are to be included.

NOTE: It is desirable for a reply to identify its source by both domain name and relayer name because news propagation is governed by the latter but location in a broader context is best determined by the former.
The date and whogets message ID should, in principle, be present in the MAIL headers but are included in the body for robustness in the presence of uncooperative mail systems. The reason for the path list is discussed below. Adding version information eliminates the need for a separate message to gather it.
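A program collecting whogets replies might read the reply body roughly as follows. This is a sketch, not a reference parser: the function name is hypothetical, whitespace after each tag's colon is simply stripped, and the values are returned as uninterpreted strings. It accepts the six preliminary lines in any order and insists on exactly one of each.

```python
from typing import Dict, List, Tuple

REQUIRED_TAGS = (
    "Responding-System", "Responding-Relayer", "Response-Date",
    "Responding-To", "Arrived-Via", "Responding-Version",
)


def parse_whogets_reply(body: str) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Return (fields, pass_lines); each pass line is (relayer, domain),
    with an empty string where no domain was supplied."""
    lines = body.splitlines()
    if "" not in lines:
        raise ValueError("missing empty delimiter line")
    split_at = lines.index("")
    fields: Dict[str, str] = {}
    for line in lines[:split_at]:
        tag, sep, value = line.partition(":")  # split at the first colon only
        if not sep or tag not in REQUIRED_TAGS:
            raise ValueError(f"unexpected preliminary line: {line!r}")
        if tag in fields:
            raise ValueError(f"duplicate {tag} line")
        fields[tag] = value.strip()
    missing = [tag for tag in REQUIRED_TAGS if tag not in fields]
    if missing:
        raise ValueError(f"missing lines: {', '.join(missing)}")
    passes: List[Tuple[str, str]] = []
    for line in lines[split_at + 1:]:
        words = line.split()
        if not words:
            continue  # tolerate stray empty lines after the delimiter
        if len(words) > 2:
            raise ValueError(f"malformed pass line: {line!r}")
        passes.append((words[0], words[1] if len(words) == 2 else ""))
    return fields, passes
```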
NOTE: The limitation of pass lines to contain only names of news relayers is meant to exclude names used within a single host (as identifiers for mail gateways, portions of ihave/sendme implementations, etc.), which do not actually refer to other hosts. A relayer that is unaware of the existence of the target newsgroup MUST NOT reply to a whogets message at all, although this MUST NOT influence decisions on whether to pass the article on to other relayers. NOTE: While this may result in discontinuous maps in cases where some hosts have not honored requests for creation of a newsgroup, it will also prevent a flood of useless responses in the event that a whogets message intended to map a small region "leaks" out to a larger one. The possibility of discontinuous recognition of a newsgroup does make it important that the whogets message itself continue to propagate (if other criteria permit). This is also the reason for the inclusion of the whogets message's path list, or at least the leading portion of it, in the reply: to permit reconstruction of at least small gaps in maps. Different networks set different rules for the legitimacy of these messages, given that they may reveal details of organization-internal topology that are sometimes considered proprietary. NOTE: On Usenet, in particular, willingness to respond to these messages is held to be a condition of network membership: the topology of Usenet is public information. Organizations wishing to belong to such networks while keeping their internal topology confidential might wish to organize their internal news software so that all articles reaching outsiders appear to be from a single "gatekeeper" system, with the details of internal topology hidden behind that system. UNRESOLVED ISSUE: It might be useful to have a way to set some sort of hop limit for these.
7.6. checkgroups

The checkgroups control message contains a supposedly authoritative list of the valid newsgroups within some subset of the newsgroup name space:

    checkgroups-arguments =
    checkgroups-body      = [ invalidation ] valid-groups / invalidation
    invalidation          = "!" plain-component *( "," plain-component ) eol
    valid-groups          = 1*( description-line eol )

There are no arguments. The body lines (except possibly for an initial invalidation) each contain a description line for a newsgroup, as defined under the newgroup message (Section 7.3).

NOTE: Some other, ill-defined, forms of the checkgroups body were formerly used. See Appendix A.

The checkgroups message applies to all hierarchies containing any of the newsgroups listed in the body. The checkgroups message asserts that the newsgroups it lists are the only newsgroups in those hierarchies. If there is an invalidation, it asserts that the hierarchies it names no longer contain any newsgroups.

Processing a checkgroups message MAY cause a local list of newsgroup descriptions to be updated. It SHOULD also cause the local lists of newsgroups (and their moderation statuses) in the mentioned hierarchies to be checked against the message. The results of the check MAY be used for automatic corrective action or MAY be reported to the news administrator in some way.

NOTE: Automatically updating descriptions of existing newsgroups is relatively safe. In the case of newsgroup additions or deletions, simply notifying the administrator is generally the wisest action, unless perhaps the message can be determined to have originated within a cooperating subnet whose members are considered trustworthy.

NOTE: There is a problem with the checkgroups concept: not all newsgroups in a hierarchy necessarily propagate to the same set of machines. (Notably, there is a set of newsgroups known as the "inet" newsgroups, which have relatively limited distribution but coexist in several hierarchies with more widely distributed newsgroups.)
The advice of checkgroups should always be taken with a grain of salt and should never be followed blindly.
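The check of a local newsgroup list against a checkgroups body might be sketched as follows. The function names are hypothetical, and the sketch assumes that a "hierarchy" is identified by the first component of a newsgroup name and that "space" means a run of blanks and/or tabs. It only reports differences for an administrator to review; it takes no corrective action and does not compare moderation statuses.

```python
from typing import Dict, List, Set, Tuple


def parse_checkgroups(body: str) -> Tuple[List[str], Dict[str, str]]:
    """Return (invalidated_hierarchies, {newsgroup: description})."""
    lines = [line for line in body.splitlines() if line.strip()]
    invalidated: List[str] = []
    if lines and lines[0].startswith("!"):
        invalidated = lines[0][1:].split(",")
        lines = lines[1:]
    groups: Dict[str, str] = {}
    for line in lines:
        parts = line.split(None, 1)
        if len(parts) != 2:
            raise ValueError(f"malformed description line: {line!r}")
        groups[parts[0]] = parts[1]
    return invalidated, groups


def compare_with_local(body: str, local_groups: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (listed_but_missing_locally, local_but_not_listed),
    restricted to the hierarchies the message covers."""
    invalidated, groups = parse_checkgroups(body)
    covered = {name.split(".", 1)[0] for name in groups} | set(invalidated)
    in_scope = {g for g in local_groups if g.split(".", 1)[0] in covered}
    listed = set(groups)
    return listed - in_scope, in_scope - listed
```

Newsgroups outside the covered hierarchies never appear in either result, which keeps one hierarchy's checkgroups from casting doubt on another's newsgroups.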