Appendix D. Standard Prelude
This appendix is normative.

The following prelude is automatically added to each CDDL file. (Note that technically, it is a postlude, as it does not disturb the selection of the first rule as the root of the definition.)

   any = #

   uint = #0
   nint = #1
   int = uint / nint

   bstr = #2
   bytes = bstr
   tstr = #3
   text = tstr

   tdate = #6.0(tstr)
   time = #6.1(number)
   number = int / float
   biguint = #6.2(bstr)
   bignint = #6.3(bstr)
   bigint = biguint / bignint
   integer = int / bigint
   unsigned = uint / biguint
   decfrac = #6.4([e10: int, m: integer])
   bigfloat = #6.5([e2: int, m: integer])
   eb64url = #6.21(any)
   eb64legacy = #6.22(any)
   eb16 = #6.23(any)
   encoded-cbor = #6.24(bstr)
   uri = #6.32(tstr)
   b64url = #6.33(tstr)
   b64legacy = #6.34(tstr)
   regexp = #6.35(tstr)
   mime-message = #6.36(tstr)
   cbor-any = #6.55799(any)

   float16 = #7.25
   float32 = #7.26
   float64 = #7.27
   float16-32 = float16 / float32
   float32-64 = float32 / float64
   float = float16-32 / float64

   false = #7.20
   true = #7.21
   bool = false / true
   nil = #7.22
   null = nil
   undefined = #7.23

                        Figure 14: CDDL Prelude

Note that the prelude is deemed to be fixed. This means, for instance, that additional tags beyond those defined in [RFC7049], as registered, need to be defined in each CDDL file that is using them.

A common stumbling point is that the prelude does not define a type "string". CBOR has byte strings ("bytes" in the prelude) and text strings ("text"), so a type that is simply called "string" would be ambiguous.

Appendix E. Use with JSON
This appendix is normative. The JSON generic data model (implicit in [RFC8259]) is a subset of the generic data model of CBOR. So, one can use CDDL with JSON by limiting oneself to what can be represented in JSON. Roughly speaking, this means leaving out byte strings, tags, and simple values other than "false", "true", and "null", leading to the following limited prelude:

   any = #

   uint = #0
   nint = #1
   int = uint / nint

   tstr = #3
   text = tstr

   number = int / float

   float16 = #7.25
   float32 = #7.26
   float64 = #7.27
   float16-32 = float16 / float32
   float32-64 = float32 / float64
   float = float16-32 / float64

   false = #7.20
   true = #7.21
   bool = false / true
   nil = #7.22
   null = nil

          Figure 15: JSON-Compatible Subset of CDDL Prelude

(The major types given here do not have a direct meaning in JSON, but they can be interpreted as CBOR major types translated through Section 4 of [RFC7049].)

There are a few fine points in using CDDL with JSON. First, JSON does not distinguish between integers and floating-point numbers; there is only one kind of number (which may happen to be integral). In this context, specifying a type as "uint", "nint", or "int" then becomes a predicate that the number be integral. As an example, this means that the following JSON numbers are all matching "uint":

   10 10.0 1e1 1.0e1 100e-1

(The fact that these are all integers may be surprising to users accustomed to the long tradition in programming languages of using decimal points or exponents in a number to indicate a floating-point literal.)

CDDL distinguishes the various CBOR number types, but there is only one number type in JSON. The effect of specifying a floating-point precision (float16/float32/float64) is only to restrict the set of permissible values to those expressible with binary16/binary32/binary64; this is unlikely to be very useful when using CDDL for specifying JSON data structures.

Fundamentally, the number system of JSON itself is based on decimal numbers and decimal fractions and does not have limits to its precision or range. In practice, JSON numbers are often parsed into a number type that is called "float64" here, creating a number of limitations to the generic data model [RFC7493]. In particular, this means that integers can only be expressed with interoperable exactness when they lie in the range [-(2**53)+1, (2**53)-1] -- a smaller range than that covered by CDDL "int".

JSON applications that want to stay compatible with I-JSON ("Internet JSON"; see [RFC7493]) may therefore want to define integer types with more limited ranges, such as in Figure 16. Note that the types given here are not part of the prelude; they need to be copied into the CDDL specification if needed.

   ij-uint = 0..9007199254740991
   ij-nint = -9007199254740991..-1
   ij-int = -9007199254740991..9007199254740991

        Figure 16: I-JSON Types for CDDL (Not Part of Prelude)

JSON applications that do not need to stay compatible with I-JSON and that actually may need to go beyond the 64-bit unsigned and negative integers supported by "int" (= "uint"/"nint") may want to use the following additional types from the standard prelude, which are expressed in terms of tags but can straightforwardly be mapped into JSON (but not I-JSON) numbers:

   biguint = #6.2(bstr)
   bignint = #6.3(bstr)
   bigint = biguint / bignint
   integer = int / bigint
   unsigned = uint / biguint

CDDL at this point does not have a way to express the unlimited floating-point precision that is theoretically possible with JSON; at the time of writing, this is rarely used in protocols in practice.

Note that a data model described in CDDL is always restricted by what can be expressed in the serialization; e.g., floating-point values such as NaN (not a number) and the infinities cannot be represented in JSON even if they are allowed in the CDDL generic data model.
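
The "integral number" predicate discussed in this appendix can be illustrated with a minimal Python sketch. It is not part of CDDL or of any tool; the function names matches_uint and matches_ij_uint are hypothetical, and the input is assumed to be the text of a syntactically valid JSON number. Using an exact decimal type avoids the float64 rounding described above.

```python
from decimal import Decimal

# Largest magnitude I-JSON treats as exactly interoperable: (2**53)-1.
IJ_MAX = 2**53 - 1
# Largest value of the CDDL "uint" type (64-bit unsigned).
UINT_MAX = 2**64 - 1


def _is_integral(value: Decimal) -> bool:
    """True if the exact decimal value has no fractional part."""
    return value == value.to_integral_value()


def matches_uint(json_number: str) -> bool:
    """Hypothetical check: does this JSON number text match CDDL "uint"?"""
    value = Decimal(json_number)  # exact; no float64 rounding involved
    return _is_integral(value) and 0 <= value <= UINT_MAX


def matches_ij_uint(json_number: str) -> bool:
    """Hypothetical check against the ij-uint range of Figure 16."""
    value = Decimal(json_number)
    return _is_integral(value) and 0 <= value <= IJ_MAX


# The five spellings from the text all denote the same integer.
for spelling in ("10", "10.0", "1e1", "1.0e1", "100e-1"):
    assert matches_uint(spelling)
```

Under these assumptions, "10.5" and "-1" do not match "uint", and "9007199254740992" matches "uint" but not ij-uint.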
Appendix F. A CDDL Tool
This appendix is for information only.

A rough CDDL tool is available. For CDDL specifications, it can check the syntax, generate one or more instances (expressed in CBOR diagnostic notation or in pretty-printed JSON), and validate an existing instance against the specification:

   Usage:
   cddl spec.cddl generate [n]
   cddl spec.cddl json-generate [n]
   cddl spec.cddl validate instance.cbor
   cddl spec.cddl validate instance.json

                     Figure 17: CDDL Tool Usage

Install on a system with a modern Ruby via:

   gem install cddl

                  Figure 18: CDDL Tool Installation

The accompanying CBOR diagnostic tools (which are automatically installed by the above) are described in <https://github.com/cabo/cbor-diag>; they can be used to convert between binary CBOR, a pretty-printed hexadecimal form of binary CBOR, CBOR diagnostic notation, JSON, and YAML [YAML].

Appendix G. Extended Diagnostic Notation
This appendix is normative. Section 6 of [RFC7049] defines a "diagnostic notation" in order to be able to converse about CBOR data items without having to resort to binary data. Diagnostic notation is based on JSON, with extensions for representing CBOR constructs such as binary data and tags. (Standardizing this together with the actual interchange format does not serve to create another interchange format but enables the use of a shared diagnostic notation in tools for and documents about CBOR.) This appendix discusses a few extensions to the diagnostic notation that have turned out to be useful since RFC 7049 was written. We refer to the result as Extended Diagnostic Notation (EDN).
G.1. Whitespace in Byte String Notation
Examples often benefit from some whitespace (spaces, line breaks) in byte strings. In EDN, whitespace is ignored in prefixed byte strings; for instance, the following are equivalent:

   h'48656c6c6f20776f726c64'
   h'48 65 6c 6c 6f 20 77 6f 72 6c 64'
   h'4 86 56c 6c6f
     20776 f726c64'

G.2. Text in Byte String Notation
Diagnostic notation notates byte strings in one of the base encodings per [RFC4648], enclosed in single quotes, prefixed by >h< for base16, >b32< for base32, >h32< for base32hex, or >b64< for base64 or base64url. Quite often, byte strings carry bytes that are meaningfully interpreted as UTF-8 text. EDN allows the use of single quotes without a prefix to express byte strings with UTF-8 text; for instance, the following are equivalent:

   'hello world'
   h'68656c6c6f20776f726c64'

The escaping rules of JSON strings are applied equivalently for text-based byte strings, e.g., "\\" stands for a single backslash and "\'" stands for a single quote. Whitespace is included literally, i.e., the previous section does not apply to text-based byte strings.

G.3. Embedded CBOR and CBOR Sequences in Byte Strings
Where a byte string is to carry an embedded CBOR-encoded item, or more generally a sequence of zero or more such items, the diagnostic notation for these zero or more CBOR data items, separated by commas, can be enclosed in << and >> to notate the byte string resulting from encoding the data items and concatenating the result. For instance, each pair of columns in the following are equivalent:

   <<1>>              h'01'
   <<1, 2>>           h'0102'
   <<"foo", null>>    h'63666F6FF6'
   <<>>               h''
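
The correspondence between the << >> notation and the resulting bytes can be reproduced with a small Python sketch. This is an illustration only: the helper names encode_head, encode_item, and embedded are hypothetical, and the encoder covers just the item kinds used in the examples above (unsigned integers, text strings, and null), not CBOR in general.

```python
def encode_head(major: int, arg: int) -> bytes:
    """Encode a CBOR initial byte plus its argument (arg < 2**64)."""
    if arg < 24:
        return bytes([(major << 5) | arg])
    # Additional information 24..27 selects a 1-, 2-, 4-, or 8-byte argument.
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if arg < 1 << (8 * size):
            return bytes([(major << 5) | info]) + arg.to_bytes(size, "big")
    raise ValueError("argument too large for CBOR")


def encode_item(item) -> bytes:
    """Encode None, non-negative ints, and str only (illustrative subset)."""
    if item is None:
        return b"\xf6"  # major type 7, simple value 22 (null)
    if isinstance(item, int) and not isinstance(item, bool) and item >= 0:
        return encode_head(0, item)  # major type 0: unsigned integer
    if isinstance(item, str):
        data = item.encode("utf-8")
        return encode_head(3, len(data)) + data  # major type 3: text string
    raise TypeError(f"unsupported item: {item!r}")


def embedded(*items) -> bytes:
    """Content of the byte string notated as <<item, item, ...>>."""
    return b"".join(encode_item(item) for item in items)


assert embedded(1, 2) == bytes.fromhex("0102")
assert embedded("foo", None) == bytes.fromhex("63666F6FF6")
```

Because the items are simply encoded and concatenated, an empty << >> yields the empty byte string.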
G.4. Concatenated Strings
While the ability to include whitespace enables line-breaking of encoded byte strings, a mechanism is needed to be able to include text strings as well as byte strings in direct UTF-8 representation into line-based documents (such as RFCs and source code).

We extend the diagnostic notation by allowing multiple text strings or multiple byte strings to be notated separated by whitespace; these are then concatenated into a single text or byte string, respectively. Text strings and byte strings do not mix within such a concatenation, except that byte string notation can be used inside a sequence of concatenated text string notation to encode characters that may be better represented in an encoded way. The following four values are equivalent:

   "Hello world"
   "Hello " "world"
   "Hello" h'20' "world"
   "" h'48656c6c6f20776f726c64' ""

Similarly, the following byte string values are equivalent:

   'Hello world'
   'Hello ' 'world'
   'Hello ' h'776f726c64'
   'Hello' h'20' 'world'
   '' h'48656c6c6f20776f726c64' '' b64''
   h'4 86 56c 6c6f' h' 20776 f726c64'

(Note that the approach of separating by whitespace, while familiar from the C language, requires some attention -- a single comma makes a big difference here.)
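
The equivalences above can be checked with a short Python sketch. The helpers hex_bytes, utf8_bytes, and concat are hypothetical names that model only the content of the literals; they are not an EDN parser, and escape sequences in text-based byte strings are not handled.

```python
def hex_bytes(content: str) -> bytes:
    """Decode the content of an h'...' literal, ignoring all whitespace."""
    # Whitespace may fall anywhere, even inside a byte, so remove it first.
    return bytes.fromhex("".join(content.split()))


def utf8_bytes(content: str) -> bytes:
    """Decode the content of a '...' literal (escapes not handled here)."""
    return content.encode("utf-8")


def concat(*parts: bytes) -> bytes:
    """Model whitespace-separated literals: the parts are concatenated."""
    return b"".join(parts)


target = utf8_bytes("Hello world")
assert concat(utf8_bytes("Hello "), hex_bytes("776f726c64")) == target
assert concat(hex_bytes("4 86 56c 6c6f"), hex_bytes(" 20776 f726c64")) == target

# Byte string notation inside a text string concatenation yields text again.
text = concat(b"Hello", hex_bytes("20"), b"world").decode("utf-8")
assert text == "Hello world"
```

The whitespace is removed before decoding because it can split a byte in the middle, as in the last byte string example.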
G.5. Hexadecimal, Octal, and Binary Numbers
In addition to JSON's decimal numbers, EDN provides hexadecimal, octal, and binary numbers in the usual C-language notation (for octal, only the form with an explicit 0o prefix is provided). The following are equivalent:

   4711
   0x1267
   0o11147
   0b1001001100111

As are:

   1.5
   0x1.8p0
   0x18p-4

G.6. Comments
Longer pieces of diagnostic notation may benefit from comments. JSON famously does not provide for comments, and basic diagnostic notation per RFC 7049 inherits this property.

In EDN, comments can be included, delimited by slashes ("/"). Any text within and including a pair of slashes is considered a comment.

Comments are considered whitespace. Hence, they are allowed in prefixed byte strings; for instance, the following are equivalent:

   h'68656c6c6f20776f726c64'
   h'68 65 6c /doubled l!/ 6c 6f /hello/
     20 /space/
     77 6f 72 6c 64' /world/

This can be used to annotate a CBOR structure as in:

   /grasp-message/ [/M_DISCOVERY/ 1, /session-id/ 10584416,
                    /objective/ [/objective-name/ "opsonize",
                                 /D, N, S/ 7, /loop-count/ 105]]

(There are currently no end-of-line comments. If we want to add them, "//" sounds like a reasonable delimiter given that we already use slashes for comments, but we could also go, for example, for "#".)
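
Since comments count as whitespace, the content of a prefixed byte string can be decoded by removing comments first and whitespace second. The following Python sketch illustrates this for the content between the quotes of an h'...' literal; the function name hex_with_comments is hypothetical, and the sketch does not attempt to parse complete EDN.

```python
import re


def hex_with_comments(content: str) -> bytes:
    """Decode h'...' content, ignoring /.../ comments and whitespace."""
    # A comment is any text within, and including, a pair of slashes.
    without_comments = re.sub(r"/[^/]*/", "", content)
    # What remains is hex digits and whitespace; drop the whitespace.
    return bytes.fromhex("".join(without_comments.split()))


annotated = "68 65 6c /doubled l!/ 6c 6f /hello/ 20 /space/ 77 6f 72 6c 64"
assert hex_with_comments(annotated) == bytes.fromhex("68656c6c6f20776f726c64")
```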
Appendix H. Examples
This appendix is for information only.

This appendix contains a few examples of structures defined using CDDL.

The theme for the examples is taken from [RFC7071], which defines certain JSON structures in English. For a similar example, it may also be of interest to examine Appendix A of [RFC8007], which contains a CDDL definition for a JSON structure defined in the main body of that RFC.

These examples all happen to describe data that is interchanged in JSON. Examples for CDDL definitions of data that is interchanged in CBOR can be found in [RFC8152], [GRASP], and [RFC8428].

[RFC7071] defines the "reputon" structure for JSON using somewhat formalized English text. Here is a (somewhat verbose) equivalent definition using the same terms, but notated in CDDL:

   reputation-object = {
     reputation-context,
     reputon-list
   }

   reputation-context = (
     application: text
   )

   reputon-list = (
     reputons: reputon-array
   )

   reputon-array = [* reputon]

   reputon = {
     rater-value,
     assertion-value,
     rated-value,
     rating-value,
     ? conf-value,
     ? normal-value,
     ? sample-value,
     ? gen-value,
     ? expire-value,
     * ext-value,
   }

   rater-value = ( rater: text )
   assertion-value = ( assertion: text )
   rated-value = ( rated: text )
   rating-value = ( rating: float16 )
   conf-value = ( confidence: float16 )
   normal-value = ( normal-rating: float16 )
   sample-value = ( sample-size: uint )
   gen-value = ( generated: uint )
   expire-value = ( expires: uint )
   ext-value = ( text => any )

An equivalent, more compact form of this example would be:

   reputation-object = {
     application: text
     reputons: [* reputon]
   }

   reputon = {
     rater: text
     assertion: text
     rated: text
     rating: float16
     ? confidence: float16
     ? normal-rating: float16
     ? sample-size: uint
     ? generated: uint
     ? expires: uint
     * text => any
   }

Note how this rather clearly delineates the structure somewhat shrouded by so many words in Section 6.2.2 of [RFC7071]. Also, this definition makes it clear that several ext-values are allowed (by definition with different member names); RFC 7071 could be read to forbid the repetition of ext-value ("A specific reputon-element MUST NOT appear more than once" is ambiguous).
The CDDL tool described in Appendix F generates as one example:

   {
     "application": "conchometry",
     "reputons": [
       {
         "rater": "Ephthianura",
         "assertion": "codding",
         "rated": "sphaerolitic",
         "rating": 0.34133473256800795,
         "confidence": 0.9481983064298332,
         "expires": 1568,
         "unplaster": "grassy"
       },
       {
         "rater": "nonchargeable",
         "assertion": "raglan",
         "rated": "alienage",
         "rating": 0.5724646875815566,
         "sample-size": 3514,
         "Aldebaran": "unchurched",
         "puruloid": "impersonable",
         "uninfracted": "pericarpoidal",
         "schorl": "Caro"
       },
       {
         "rater": "precollectable",
         "assertion": "Merat",
         "rated": "thermonatrite",
         "rating": 0.19164006323936977,
         "confidence": 0.6065252103391268,
         "normal-rating": 0.5187773690879303,
         "generated": 899,
         "speedy": "solidungular",
         "noviceship": "medicine",
         "checkrow": "epidictic"
       }
     ]
   }
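
To make the structure expressed by the compact definition concrete, the following Python sketch hand-codes an equivalent check for already-parsed JSON data. It is an illustration under stated assumptions, not how the CDDL tool works: the function names are hypothetical, "float16" is simplified to "any JSON number" (no binary16 representability check), and "uint" is taken as a non-negative integral number below 2**64.

```python
REQUIRED_TEXT = ("rater", "assertion", "rated")
OPTIONAL_NUMBER = ("confidence", "normal-rating")
OPTIONAL_UINT = ("sample-size", "generated", "expires")


def is_number(value) -> bool:
    """JSON number as parsed by Python: int or float, but not bool."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def is_uint(value) -> bool:
    """Non-negative integral number below 2**64 (simplified "uint")."""
    if isinstance(value, bool):
        return False
    if isinstance(value, int):
        return 0 <= value < 2**64
    return isinstance(value, float) and value.is_integer() and 0 <= value < 2**64


def is_reputon(obj) -> bool:
    """Hand-written check mirroring the compact "reputon" rule."""
    if not isinstance(obj, dict):
        return False
    if not all(isinstance(obj.get(key), str) for key in REQUIRED_TEXT):
        return False
    if not is_number(obj.get("rating")):
        return False
    if any(key in obj and not is_number(obj[key]) for key in OPTIONAL_NUMBER):
        return False
    if any(key in obj and not is_uint(obj[key]) for key in OPTIONAL_UINT):
        return False
    return True  # all remaining members match "* text => any"


def is_reputation_object(obj) -> bool:
    """Hand-written check mirroring the "reputation-object" rule."""
    return (isinstance(obj, dict)
            and set(obj) == {"application", "reputons"}
            and isinstance(obj["application"], str)
            and isinstance(obj["reputons"], list)
            and all(is_reputon(item) for item in obj["reputons"]))
```

Applied to data shaped like the generated instance, such a check accepts arbitrary additional members such as "unplaster" while still rejecting, for example, a reputon without a "rating".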
Acknowledgements
Inspiration was taken from the C and Pascal languages, MPEG's conventions for describing structures in the ISO base media file format, RELAX NG and its compact syntax [RELAXNG], and, in particular, Andrew Lee Newton's early proposals on JSON Content Rules (JCR) as found in draft version four (-04) of [JCR].

Lots of highly useful feedback came from members of the IETF CBOR WG -- in particular, Ari Keranen, Brian Carpenter, Burt Harris, Jeffrey Yasskin, Jim Hague, Jim Schaad, Joe Hildebrand, Max Pritikin, Michael Richardson, Pete Cordell, Sean Leonard, and Yaron Sheffer. Also, Francesca Palombini and Joe volunteered to chair the WG when it was created, providing the framework for generating and processing this feedback, with Barry Leiba having taken over from Joe since then. Chris Lonvick and Ines Robles provided additional reviews during IESG processing, and Alexey Melnikov steered the process as the responsible Area Director.

The CDDL tool described in Appendix F was written by Carsten Bormann, building on previous work by Troy Heninger and Tom Lord.

Contributors
CDDL was originally conceived by Bert Greevenbosch, who also wrote the original five draft versions of this document.
Authors' Addresses
   Henk Birkholz
   Fraunhofer SIT
   Rheinstrasse 75
   Darmstadt 64295
   Germany

   Email: henk.birkholz@sit.fraunhofer.de


   Christoph Vigano
   Universitaet Bremen

   Email: christoph.vigano@uni-bremen.de


   Carsten Bormann
   Universitaet Bremen TZI
   Bibliothekstr. 1
   Bremen D-28359
   Germany

   Phone: +49-421-218-63921
   Email: cabo@tzi.org