Internet Architecture Board (IAB)                             J. Reschke
Request for Comments: 7749                                    greenbytes
Obsoletes: 2629                                            February 2016
Category: Informational
ISSN: 2070-1721

                   The "xml2rfc" Version 2 Vocabulary

Abstract

This document defines the "xml2rfc" version 2 vocabulary: an XML-based language used for writing RFCs and Internet-Drafts.

Version 2 represents the state of the vocabulary (as implemented by several tools and as used by the RFC Editor) around 2014.

This document obsoletes RFC 2629.

Status of This Memo

This document is not an Internet Standards Track specification; it is published for informational purposes.

This document is a product of the Internet Architecture Board (IAB) and represents information that the IAB has deemed valuable to provide for permanent record. It represents the consensus of the Internet Architecture Board (IAB). Documents approved for publication by the IAB are not a candidate for any level of Internet Standard; see Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7749.

Copyright Notice

Copyright (c) 2016 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
Table of Contents
1. Introduction
   1.1. Syntax Notation
2. Elements
   2.1. <abstract>
   2.2. <address>
   2.3. <annotation>
   2.4. <area>
   2.5. <artwork>
   2.6. <author>
   2.7. <back>
   2.8. <c>
   2.9. <city>
   2.10. <code>
   2.11. <country>
   2.12. <cref>
   2.13. <date>
   2.14. <email>
   2.15. <eref>
   2.16. <facsimile>
   2.17. <figure>
   2.18. <format>
   2.19. <front>
   2.20. <iref>
   2.21. <keyword>
   2.22. <list>
   2.23. <middle>
   2.24. <note>
   2.25. <organization>
   2.26. <phone>
   2.27. <postal>
   2.28. <postamble>
   2.29. <preamble>
   2.30. <reference>
   2.31. <references>
   2.32. <region>
   2.33. <rfc>
   2.34. <section>
   2.35. <seriesInfo>
   2.36. <spanx>
   2.37. <street>
   2.38. <t>
   2.39. <texttable>
   2.40. <title>
   2.41. <ttcol>
   2.42. <uri>
   2.43. <vspace>
   2.44. <workgroup>
   2.45. <xref>
3. Escaping for Use in XML
4. Special Unicode Code Points
5. Including Files
6. Internationalization Considerations
7. Security Considerations
8. IANA Considerations
   8.1. Internet Media Type Registration
9. References
   9.1. Normative References
   9.2. Informative References
Appendix A. Front-Page ("Boilerplate") Generation
   A.1. The "category" Attribute
   A.2. The "ipr" Attribute
      A.2.1. Current Values: "*trust200902"
      A.2.2. Historic Values
   A.3. The "submissionType" Attribute
   A.4. The "consensus" Attribute
Appendix B. Changes from RFC 2629 ("v1")
   B.1. Removed Elements
   B.2. Changed Defaults
   B.3. Changed Elements
   B.4. New Elements
Appendix C. RELAX NG Schema
   C.1. Checking Validity
IAB Members at the Time of Approval
Acknowledgments
Index
Author's Address
1. Introduction
This document describes version 2 ("v2") of the "xml2rfc" vocabulary: an XML-based language ("Extensible Markup Language" [XML]) used for writing RFCs [RFC7322] and Internet-Drafts [IDGUIDE].

Version 2 represents the state of the vocabulary (as implemented by several tools and as used by the RFC Editor) around 2014.

It obsoletes the original version ("v1") [RFC2629], which contained the original language definition and which was subsequently extended. Many of the changes leading to version 2 have been described in "Writing I-Ds and RFCs using XML (revised)" [V1rev], but that document has not been updated since 2008.

Processing Instructions (Section 2.6 of [XML]) generally are specific to a given processor and thus are not considered to be part of the vocabulary. See Section 4.1 of [TCLReadme] for a list of the Processing Instructions supported by the first implementation of an xml2rfc processor.

Note that the vocabulary contains certain constructs that might not be used when generating the final text; however, they can provide useful data for other uses (such as index generation, populating a keyword database, or syntax checks).

1.1. Syntax Notation
The XML vocabulary here is defined in prose, based on the RELAX NG schema [RNC] contained in Appendix C (specified in RELAX NG Compact Notation (RNC)). Note that the schema can be used for automated validity checks, but certain constraints are only described in prose (example: the conditionally required presence of the "abbrev" attribute).
2. Elements
The sections below describe all elements and their attributes. Note that attributes not labeled "mandatory" are optional.

Except inside <artwork>, horizontal whitespace and line breaks are collapsed into a single whitespace, and leading and trailing whitespace is trimmed off.

2.1. <abstract>
Contains the Abstract of the document. The Abstract ought to be self-contained and thus should not contain references or unexpanded abbreviations. See Section 4.3 of [RFC7322] for more information.

This element appears as a child element of <front> (Section 2.19).

Content model: One or more <t> elements (Section 2.38)

2.2. <address>
Provides address information for the author.

This element appears as a child element of <author> (Section 2.6).

Content model: In this order:

   1. One optional <postal> element (Section 2.27)
   2. One optional <phone> element (Section 2.26)
   3. One optional <facsimile> element (Section 2.16)
   4. One optional <email> element (Section 2.14)
   5. One optional <uri> element (Section 2.42)
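As an illustration of the ordering above, the following is a minimal sketch of an <address> element using all five children. Every name, number, and URI in it is a hypothetical placeholder; the children of <postal> are defined in Sections 2.9, 2.10, 2.11, 2.32, and 2.37.

```xml
<address>
  <!-- Children must appear in this order; each one is optional. -->
  <postal>
    <street>1 Example Street</street>
    <city>Exampleville</city>
    <region>EX</region>
    <code>12345</code>
    <country>Exampleland</country>
  </postal>
  <!-- "tel" URI scheme-specific part, without the "tel:" prefix -->
  <phone>+1-555-0100</phone>
  <facsimile>+1-555-0101</facsimile>
  <!-- addr-spec only, without the "mailto:" prefix -->
  <email>jane@example.com</email>
  <uri>http://www.example.com/</uri>
</address>
```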
2.3. <annotation>
Provides additional prose augmenting a bibliographical reference.

This element appears as a child element of <reference> (Section 2.30).

Content model: In any order:

   o  Text
   o  <xref> elements (Section 2.45)
   o  <eref> elements (Section 2.15)
   o  <iref> elements (Section 2.20)
   o  <cref> elements (Section 2.12)
   o  <spanx> elements (Section 2.36)

2.4. <area>
Provides information about the IETF area to which this document relates (currently not used when generating documents). The value ought to be either the full name or the abbreviation of one of the IETF areas as listed on <https://www.ietf.org/iesg/area.html>. The list at the time that this document is being published is "Applications and Real-Time" ("art"), "General" ("gen"), "Internet" ("int"), "Operations and Management" ("ops"), "Routing" ("rtg"), "Security" ("sec"), and "Transport" ("tsv"). Note that the set of IETF areas can change over time; for instance, "Applications and Real-Time" ("art") replaced "Applications" ("app") and "Real-time Applications and Infrastructure" ("rai") in 2015. This element appears as a child element of <front> (Section 2.19). Content model: only text content.
2.5. <artwork>
This element allows the inclusion of "artwork" in the document. <artwork> is the only element in the vocabulary that provides full control of horizontal whitespace and line breaks; thus, it is used for a variety of things, such as:

   o  diagrams ("line art"),
   o  source code,
   o  formal languages (such as ABNF [RFC5234] or the RNC notation used in this document),
   o  message flow diagrams,
   o  complex tables, or
   o  protocol unit diagrams.

Note that processors differ in the handling of horizontal TAB characters (some expand them, some treat them as single spaces), and thus these ought to be avoided.

Alternatively, the "src" attribute allows referencing an external graphics file, such as a bitmap or a vector drawing, using a URI ("Uniform Resource Identifier") [RFC3986]. In this case, the textual content acts as a fallback for output formats that do not support graphics; thus, it ought to contain either (1) a "line art" variant of the graphics or (2) prose that describes the included image in sufficient detail. Note that RFCs occasionally are published with enhanced diagrams; [RFC5598] is a recent example of an RFC that was published along with a PDF with images.

This element appears as a child element of <figure> (Section 2.17).

Content model: Text
2.5.1. "align" Attribute
Controls whether the artwork appears left justified (default), centered, or right justified.

Allowed values:

   o  "left" (default)
   o  "center"
   o  "right"

2.5.2. "alt" Attribute
Alternative text description of the artwork (not just the caption).

2.5.3. "height" Attribute
The suggested height of the graphics (when it was included using the "src" attribute).

This attribute is format dependent and ought to be avoided.

When generating HTML output [HTML], current implementations copy the attribute "as is", thus effectively treating it as CSS (Cascading Style Sheets) pixels (see Section 4.3.2 of [CSS]). For other output formats, it is usually ignored.

2.5.4. "name" Attribute
A filename suitable for the contents (such as for extraction to a local file).

This attribute generally isn't used for document generation, but it can be helpful for other kinds of tools (such as automated syntax checkers, which work by extracting the source code).

2.5.5. "src" Attribute
The URI reference of a graphics file (Section 4.1 of [RFC3986]). Note that this can be a "data" URI [RFC2397] as well, in which case the graphics file is wholly part of the XML file.
2.5.6. "type" Attribute
Specifies the type of the artwork.

The value is either an Internet Media Type (see [RFC2046]) or a keyword (such as "abnf"). The set of recognized keywords varies across implementations.

How it is used depends on context and application. For instance, a formatter can attempt to syntax-highlight code in certain known languages.

2.5.7. "width" Attribute
The suggested width of the graphics (when it was included using the "src" attribute).

This attribute is format dependent and ought to be avoided.

When generating HTML output [HTML], current implementations copy the attribute "as is", thus effectively treating it as CSS pixels (see Section 4.3.2 of [CSS]). For other output formats, it is usually ignored.

2.5.8. "xml:space" Attribute
Determines whitespace handling.

"preserve" is both the default value and the only meaningful setting (because that's what the <artwork> element is for).

See also Section 2.10 of [XML].

Allowed values:

   o  "default"
   o  "preserve" (default)
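To illustrate the "src" fallback behavior described in Section 2.5, here is a minimal sketch of an <artwork> element that references an external image and carries a "line art" variant as its text content. The file name "topology.svg" and the diagram itself are hypothetical; the element would appear inside a <figure>.

```xml
<!-- "src" points to the external graphics file; the text content is
     the fallback for output formats that do not support graphics. -->
<artwork src="topology.svg"
         alt="Two hosts connected through a router">
+--------+     +--------+     +--------+
| Host A |-----| Router |-----| Host B |
+--------+     +--------+     +--------+
</artwork>
```

The diagram uses only spaces for alignment, in line with the advice to avoid horizontal TAB characters.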
2.6. <author>
Provides information about a document's author. This is used both for the document itself (at the beginning of the document) and for referenced documents (inside of <reference>).

The <author> elements contained within the document's <front> element are used to fill the boilerplate, and also to generate the "Author's Address" section (see Section 4.12 of [RFC7322]).

Note that an "author" can also be just an organization (by not specifying any of the name attributes, but adding the <organization> child element).

Furthermore, the "role" attribute can be used to mark an author as "editor". This is reflected on the front page and in the "Author's Address" section, as well as in bibliographical references. Note that this specification does not define a precise meaning for the term "editor".

See Sections 4.10 and 4.11 of [RFC7322] for more information.

This element appears as a child element of <front> (Section 2.19).

Content model: In this order:

   1. One optional <organization> element (Section 2.25)
   2. One optional <address> element (Section 2.2)

2.6.1. "fullname" Attribute
The full name (used in the automatically generated "Author's Address" section).

2.6.2. "initials" Attribute
An abbreviated variant of the given name(s), to be used in conjunction with the separately specified surname. It usually appears on the front page, in footers, and in references. Some processors will post-process the value -- for instance, when it only contains a single letter (in which case they might add a trailing dot). Relying on this kind of post-processing can lead to results varying across formatters and thus ought to be avoided.
2.6.3. "role" Attribute
Specifies the role the author had in creating the document.

Allowed values:

   o  "editor"

2.6.4. "surname" Attribute
The author's surname, to be used in conjunction with the separately specified initials. It usually appears on the front page, in footers, and in references.

2.7. <back>
Contains the "back" part of the document: the references and appendices. In <back>, <section> elements indicate appendices.

This element appears as a child element of <rfc> (Section 2.33).

Content model: In this order:

   1. Optional <references> elements (Section 2.31)
   2. Optional <section> elements (Section 2.34)
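The ordering above can be sketched as follows: a <back> element with one <references> block followed by one appendix. The reference entry, anchors, and titles are hypothetical placeholders chosen for illustration only.

```xml
<back>
  <!-- References come first ... -->
  <references title="Informative References">
    <reference anchor="EXAMPLE" target="http://www.example.com/doc">
      <front>
        <title>An Example Document</title>
        <author>
          <organization>Example Organization</organization>
        </author>
        <date year="2016"/>
      </front>
    </reference>
  </references>
  <!-- ... followed by <section> elements, which become appendices. -->
  <section anchor="more-examples" title="Additional Examples">
    <t>This section would be rendered as an appendix.</t>
  </section>
</back>
```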
2.8. <c>
Provides the content of a cell in a table.

This element appears as a child element of <texttable> (Section 2.39).

Content model: In any order:

   o  Text
   o  <xref> elements (Section 2.45)
   o  <eref> elements (Section 2.15)
   o  <iref> elements (Section 2.20)
   o  <cref> elements (Section 2.12)
   o  <spanx> elements (Section 2.36)

2.9. <city>
Gives the city name in a postal address.

This element appears as a child element of <postal> (Section 2.27).

Content model: only text content.

2.10. <code>
Gives the postal region code.

This element appears as a child element of <postal> (Section 2.27).

Content model: only text content.

2.11. <country>
Gives the country in a postal address. This element appears as a child element of <postal> (Section 2.27). Content model: only text content.
2.12. <cref>
Represents a comment.

Comments can be used in a document while it is a work in progress. They usually appear (1) inline and visually highlighted, (2) at the end of the document (depending on file format and settings of the formatter), or (3) not at all (when generating an RFC).

This element appears as a child element of <annotation> (Section 2.3), <c> (Section 2.8), <postamble> (Section 2.28), <preamble> (Section 2.29), and <t> (Section 2.38).

Content model: only text content.

2.12.1. "anchor" Attribute
Document-wide unique identifier for this comment. The processor will autogenerate an identifier when none is given.

The value needs to be a valid XML "Name" (Section 2.3 of [XML]), additionally constrained to US-ASCII characters [USASCII].

2.12.2. "source" Attribute
Holds the "source" of a comment, such as the name or the initials of the person who made the comment.
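A minimal sketch of a comment placed inside a paragraph is shown below. The anchor value, the source label, and the surrounding prose are hypothetical.

```xml
<t>
  The timeout defaults to 30 seconds.
  <!-- "anchor" is optional; a processor autogenerates one if omitted. -->
  <cref anchor="comment-timeout" source="reviewer">Should this
  value be configurable?</cref>
</t>
```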
2.13. <date>
Provides information about the publication date.

Note that this element is used for the boilerplate of the document being produced, and also inside bibliographic references.

In the "boilerplate" case, it defines the publication date, which, when producing Internet-Drafts, will be used for computing the expiration date (see Section 8 of [IDGUIDE]). When one or more of "year", "month", or "day" are left out, the processor will attempt to use the current system date if the attributes that are present are consistent with that date.

Note that in this case, month names need to match the full (English) month name ("January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", or "December") in order for expiration calculations to work (some implementations might support additional formats, though).

In the case of bibliographic references, the date information can have prose text for the month or year. For example, vague dates (year="ca. 2000"), date ranges (year="2012-2013"), non-specific months (month="Second quarter") and so on are allowed.

This element appears as a child element of <front> (Section 2.19).

Content model: this element does not have any contents.

2.13.1. "day" Attribute
In the "boilerplate" case, the day of publication; this is a number. Otherwise, an indication of the publication day, with the format not being restricted.

2.13.2. "month" Attribute
In the "boilerplate" case, the month of publication; this is the English name of the month. Otherwise, an indication of the publication month, with the format not being restricted.

2.13.3. "year" Attribute
In the "boilerplate" case, the year of publication; this is a number (usually four-digit). Otherwise, an indication of the publication year, with the format not being restricted.
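The two uses of <date> can be sketched as below. This is a fragment rather than a complete document: each element would appear inside its own <front>. The free-form values are the ones given as examples in Section 2.13; the complete date is an arbitrary illustration.

```xml
<!-- Boilerplate case: numeric day and year, full English month name -->
<date day="1" month="February" year="2016"/>

<!-- Boilerplate case with "day" left out; a processor may fill it in
     from the system date if month and year are consistent with it -->
<date month="February" year="2016"/>

<!-- Bibliographic references: free-form prose values are allowed -->
<date year="ca. 2000"/>
<date month="Second quarter" year="2012-2013"/>
```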
2.14. <email>
Provides an email address.

The value is expected to be an email address conforming to the addr-spec definition in Section 2 of [RFC6068] (so does not include the prefix "mailto:").

This element appears as a child element of <address> (Section 2.2).

Content model: only text content.

2.15. <eref>
Represents an "external" link (as specified in the "target" attribute).

If the element has no text content, the value of the "target" attribute will be inserted in angle brackets (as described in Appendix C of [RFC3986]) and, depending on the capabilities of the output format, hyperlinked.

Otherwise, the text content will be used (and potentially hyperlinked). Depending on output format and formatter, additional text might be inserted (such as a "URI" counter, and a "URIs" section in the back of the document). Avoid this variant when consistent rendering across formats and formatters is desired.

This element appears as a child element of <annotation> (Section 2.3), <c> (Section 2.8), <postamble> (Section 2.28), <preamble> (Section 2.29), and <t> (Section 2.38).

Content model: only text content.

2.15.1. "target" Attribute (Mandatory)
URI of the link target (see Section 3 of [RFC3986]).
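The two <eref> variants described in Section 2.15 can be sketched as follows; the target URI and the link text are hypothetical.

```xml
<t>
  <!-- No text content: the target is rendered in angle brackets. -->
  The registry is available at
  <eref target="http://www.example.com/registry"/>.
  <!-- With text content: rendering can vary across formatters. -->
  See also the
  <eref target="http://www.example.com/registry">example
  registry</eref>.
</t>
```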
2.16. <facsimile>
Represents the phone number of a fax machine.

The value is expected to be the scheme-specific part of a "tel" URI (so does not include the prefix "tel:"), using the "global numbers" syntax. See Section 3 of [RFC3966] for details.

This element appears as a child element of <address> (Section 2.2).

Content model: only text content.

2.17. <figure>
This element is used to represent a figure, consisting of an optional preamble, the actual figure, an optional postamble, and an optional title.

This element appears as a child element of <section> (Section 2.34) and <t> (Section 2.38).

Content model: In this order:

   1. Optional <iref> elements (Section 2.20)
   2. One optional <preamble> element (Section 2.29)
   3. One <artwork> element (Section 2.5)
   4. One optional <postamble> element (Section 2.28)

2.17.1. "align" Attribute
Used to change the alignment of <preamble> and <postamble>.

Note: does not affect title or <artwork> alignment.

Allowed values:

   o  "left" (default)
   o  "center"
   o  "right"
2.17.2. "alt" Attribute
Duplicates functionality available on <artwork>; avoid it.

2.17.3. "anchor" Attribute
Document-wide unique identifier for this figure.

Furthermore, the presence of this attribute causes the figure to be numbered.

The value needs to be a valid XML "Name" (Section 2.3 of [XML]).

2.17.4. "height" Attribute
Duplicates functionality available on <artwork>; avoid it.

2.17.5. "src" Attribute
Duplicates functionality available on <artwork>; avoid it.

2.17.6. "suppress-title" Attribute
Figures that have an "anchor" attribute will automatically get an autogenerated title (such as "Figure 1"), even if the "title" attribute is absent. Setting this attribute to "true" will prevent this.

Allowed values:

   o  "true"
   o  "false" (default)

2.17.7. "title" Attribute
The title for the figure; this usually appears on a line after the figure.

2.17.8. "width" Attribute
Duplicates functionality available on <artwork>; avoid it.
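Putting the pieces of Section 2.17 together, a figure with a preamble, artwork, and a postamble might look like the sketch below. The anchor, title, and grammar are hypothetical; SP and ALPHA are core rules from [RFC5234].

```xml
<!-- The "anchor" attribute causes the figure to be numbered. -->
<figure anchor="fig-greeting" title="Greeting Syntax">
  <preamble>The greeting is defined as follows:</preamble>
  <artwork type="abnf">
greeting = "hello" SP name
name     = 1*ALPHA
</artwork>
  <postamble>The rules SP and ALPHA are core ABNF rules.</postamble>
</figure>
```

Attributes such as "alt", "src", "height", and "width" are deliberately left off the <figure> here, since the text recommends setting them on <artwork> instead.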
2.18. <format>
Provides a link to an additional format variant for a reference.

Note that these additional links are neither used in published RFCs nor supported by all tools. If the goal is to provide a single URI for a reference, the "target" attribute on <reference> can be used instead.

This element appears as a child element of <reference> (Section 2.30).

Content model: this element does not have any contents.

2.18.1. "octets" Attribute
Octet length of linked-to document.

2.18.2. "target" Attribute
URI of document.

2.18.3. "type" Attribute (Mandatory)
The type of the linked-to document, such as "TXT", "HTML", or "PDF".
2.19. <front>
Represents the "front matter": metadata (such as author information), the Abstract, and additional notes.

This element appears as a child element of <reference> (Section 2.30) and <rfc> (Section 2.33).

Content model: In this order:

   1. One <title> element (Section 2.40)
   2. One or more <author> elements (Section 2.6)
   3. One <date> element (Section 2.13)
   4. Optional <area> elements (Section 2.4)
   5. Optional <workgroup> elements (Section 2.44)
   6. Optional <keyword> elements (Section 2.21)
   7. One optional <abstract> element (Section 2.1)
   8. Optional <note> elements (Section 2.24)
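The following sketch shows a <front> element for a document (as opposed to a reference) with its children in the required order. The title, author, working group, and all other values are hypothetical placeholders.

```xml
<front>
  <title abbrev="Example Protocol">The Example Protocol</title>
  <!-- role="editor" marks this author as editor (Section 2.6.3) -->
  <author fullname="Jane Example" initials="J." surname="Example"
          role="editor">
    <organization abbrev="Example">Example Corporation</organization>
    <address>
      <email>jane@example.com</email>
    </address>
  </author>
  <date month="February" year="2016"/>
  <area>General</area>
  <workgroup>Example Working Group</workgroup>
  <!-- One keyword per element; repeat the element as needed. -->
  <keyword>example</keyword>
  <keyword>protocol</keyword>
  <abstract>
    <t>This document describes a hypothetical example protocol.</t>
  </abstract>
  <note title="Editorial Note">
    <t>Discussion takes place on a hypothetical mailing list.</t>
  </note>
</front>
```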
2.20. <iref>
Provides terms for the document's index.

Index entries can be either regular entries (when just the "item" attribute is given) or nested entries (by specifying "subitem" as well), grouped under a regular entry.

In this document, for instance, every element definition appears as a regular index entry ("iref element 2.20"). In addition, for each use of that element inside another parent element, a nested entry was added ("iref element 2.20, ... inside annotation 2.3").

Index entries generally refer to the exact place where the <iref> element occurred. An exception is the occurrence as a child element of <section>, in which case the whole section is considered to be relevant for that index entry. In some formats, index entries of this type might be displayed as ranges.

This element appears as a child element of <annotation> (Section 2.3), <c> (Section 2.8), <figure> (Section 2.17), <postamble> (Section 2.28), <preamble> (Section 2.29), <section> (Section 2.34), and <t> (Section 2.38).

Content model: this element does not have any contents.

2.20.1. "item" Attribute (Mandatory)
The item to include.

2.20.2. "primary" Attribute
Setting this to "true" declares the occurrence as "primary", which might cause it to be highlighted in the index.

Allowed values:

   o  "true"
   o  "false" (default)

2.20.3. "subitem" Attribute
The subitem to include.
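As a sketch of the difference between regular and nested index entries, consider the fragment below. The section, index terms, and prose are hypothetical.

```xml
<section anchor="caching" title="Caching">
  <!-- Child of <section>: the whole section is indexed under
       "cache", and the entry is marked as primary. -->
  <iref item="cache" primary="true"/>
  <t>
    A stale response
    <!-- Nested entry "stale response", grouped under "cache" -->
    <iref item="cache" subitem="stale response"/>
    is one whose freshness lifetime has expired.
  </t>
</section>
```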
2.21. <keyword>
Specifies a keyword applicable to the document.

Note that each element should only contain a single keyword; for multiple keywords, the element can simply be repeated.

Keywords are used both in the RFC Index and in the metadata of generated documents.

This element appears as a child element of <front> (Section 2.19).

Content model: only text content.

2.22. <list>
Delineates a text list.

Each list item is represented by a <t> element. The vocabulary currently does not directly support list items consisting of multiple paragraphs; if this is needed, <vspace> (Section 2.43) can be used as a workaround.

This element appears as a child element of <t> (Section 2.38).

Content model: One or more <t> elements (Section 2.38)

2.22.1. "counter" Attribute
This attribute holds a token that serves as an identifier for a counter. The intended use is continuation of lists, where the counter will be incremented for every list item, and there is no way to reset the counter. Note that this attribute functions only when the "style" attribute is using the "format..." syntax (Section 2.22.3); otherwise, it is ignored.
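A minimal sketch of list continuation follows, assuming a hypothetical counter name "reqs" and the "format ..." style syntax from Section 2.22.3. Because both lists share the counter, a processor would be expected to label the items "[REQ1]", "[REQ2]", and "[REQ3]".

```xml
<t>
  <list style="format [REQ%d]" counter="reqs">
    <t>The client sends a request.</t>
    <t>The server sends a response.</t>
  </list>
</t>
<t>Intervening prose that is not part of either list.</t>
<t>
  <!-- Same counter name: numbering continues instead of restarting. -->
  <list style="format [REQ%d]" counter="reqs">
    <t>The client closes the connection.</t>
  </list>
</t>
```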
2.22.2. "hangIndent" Attribute
For list styles with potentially wide labels, this attribute can override the default indentation level, measured in number of characters.

Note that it only affects styles with variable-width labels ("format..." and "hanging"; see below), and it may not affect formats in which the list item text appears _below_ the label.

2.22.3. "style" Attribute
This attribute is used to control the display of a list.

The value of this attribute is inherited by any nested lists that do not have this attribute set. It may be set to:

"empty"
   For unlabeled list items; it can also be used for indentation purposes (this is the default value when there is an enclosing list where the style is specified).

"hanging"
   For lists where the items are labeled with a piece of text. The label text is specified in the "hangText" attribute of the <t> element (Section 2.38.2).

"letters"
   For ordered lists using letters as labels (lowercase letters followed by a period; after "z", it rolls over to a two-letter format). For nested lists, processors usually flip between uppercase and lowercase.

"numbers"
   For ordered lists using numbers as labels.

"symbols"
   For unordered (bulleted) lists. The style of the bullets is chosen automatically by the processor (some implementations allow overriding the default using a Processing Instruction).

And finally:

"format ..."
   For lists with customized labels, consisting of fixed text and an item counter in various formats. The value is a free-form text that allows counter values to be inserted using a "percent-letter" format. For instance, "[REQ%d]" generates labels of the form "[REQ1]", where "%d" inserts the item number as a decimal number.

   The following formats are supported:

   %c  lowercase letters (a, b, c, etc.)
   %C  uppercase letters (A, B, C, etc.)
   %d  decimal numbers (1, 2, 3, etc.)
   %i  lowercase Roman numerals (i, ii, iii, etc.)
   %I  uppercase Roman numerals (I, II, III, etc.)
   %%  represents a percent sign

   Other formats are reserved for future use.

2.23. <middle>
Represents the main content of the document. This element appears as a child element of <rfc> (Section 2.33). Content model: One or more <section> elements (Section 2.34)
2.24. <note>
Creates an unnumbered section that appears after the Abstract.

It is usually used for additional information to reviewers (working group information, mailing list, ...), or for additional publication information such as "IESG Notes".

This element appears as a child element of <front> (Section 2.19).

Content model: One or more <t> elements (Section 2.38)

2.24.1. "title" Attribute (Mandatory)
The title of the note.

2.25. <organization>
Specifies the affiliation (Section 4.1.2 of [RFC7322]) of an author.

This information appears both in the "Author's Address" section and on the front page (see Section 4.1.1 of [RFC7322] for more information). If the value is long, an abbreviated variant can be specified in the "abbrev" attribute.

This element appears as a child element of <author> (Section 2.6).

Content model: only text content.

2.25.1. "abbrev" Attribute
Abbreviated variant.

2.26. <phone>
Represents a phone number. The value is expected to be the scheme-specific part of a "tel" URI (so does not include the prefix "tel:"), using the "global numbers" syntax. See Section 3 of [RFC3966] for details. This element appears as a child element of <address> (Section 2.2). Content model: only text content.