RFC 8610

Concise Data Definition Language (CDDL): A Notational Convention to Express Concise Binary Object Representation (CBOR) and JSON Data Structures

Pages: 64
Proposed Standard
Updated by: 9682

RFC8610 - Page 52

Appendix D.  Standard Prelude

   This appendix is normative.

   The following prelude is automatically added to each CDDL file.
   (Note that technically, it is a postlude, as it does not disturb the
   selection of the first rule as the root of the definition.)

                  any = #

                  uint = #0
                  nint = #1
                  int = uint / nint

                  bstr = #2
                  bytes = bstr
                  tstr = #3
                  text = tstr

                  tdate = #6.0(tstr)
                  time = #6.1(number)
                  number = int / float
                  biguint = #6.2(bstr)
                  bignint = #6.3(bstr)
                  bigint = biguint / bignint
                  integer = int / bigint
                  unsigned = uint / biguint
                  decfrac = #6.4([e10: int, m: integer])
                  bigfloat = #6.5([e2: int, m: integer])
                  eb64url = #6.21(any)
                  eb64legacy = #6.22(any)
                  eb16 = #6.23(any)
                  encoded-cbor = #6.24(bstr)
                  uri = #6.32(tstr)
                  b64url = #6.33(tstr)
                  b64legacy = #6.34(tstr)
                  regexp = #6.35(tstr)
                  mime-message = #6.36(tstr)
                  cbor-any = #6.55799(any)

                  float16 = #7.25
                  float32 = #7.26
                  float64 = #7.27
                  float16-32 = float16 / float32
                  float32-64 = float32 / float64
                  float = float16-32 / float64

                  false = #7.20
                  true = #7.21
                  bool = false / true
                  nil = #7.22
                  null = nil
                  undefined = #7.23

                          Figure 14: CDDL Prelude

   Note that the prelude is deemed to be fixed.  This means, for
   instance, that additional tags beyond those defined in [RFC7049], as
   registered, need to be defined in each CDDL file that is using them.

   A common stumbling point is that the prelude does not define a type
   "string".  CBOR has byte strings ("bytes" in the prelude) and text
   strings ("text"), so a type that is simply called "string" would be
   ambiguous.
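
   As a non-normative illustration of the "#major.additional" notation
   used in Figure 14, the following Python sketch maps the initial byte
   of an encoded CBOR data item to a prelude name.  The helper name
   "prelude_name" and the two lookup tables are assumptions made for
   this example only; tags, arrays, and maps are not resolved further.

```python
# Illustrative sketch only: maps the initial byte of an encoded CBOR
# data item to a prelude name (hypothetical helper, not part of CDDL).
MAJOR_TYPE_NAMES = {
    0: "uint",  # uint = #0
    1: "nint",  # nint = #1
    2: "bstr",  # bstr = #2 (alias: bytes)
    3: "tstr",  # tstr = #3 (alias: text)
}

# Major type 7: the additional information selects the simple value
# or the floating-point width.
SIMPLE_AND_FLOAT_NAMES = {
    20: "false", 21: "true", 22: "nil", 23: "undefined",
    25: "float16", 26: "float32", 27: "float64",
}


def prelude_name(initial_byte: int) -> str:
    major = initial_byte >> 5          # upper 3 bits
    additional = initial_byte & 0x1F   # lower 5 bits
    if major in MAJOR_TYPE_NAMES:
        return MAJOR_TYPE_NAMES[major]
    if major == 7 and additional in SIMPLE_AND_FLOAT_NAMES:
        return SIMPLE_AND_FLOAT_NAMES[additional]
    return "any"  # arrays, maps, tags, etc. are not resolved by this sketch
```

   For instance, the initial byte 0x63 (a three-byte text string) yields
   "tstr", and 0xf6 yields "nil".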

Appendix E.  Use with JSON

   This appendix is normative.

   The JSON generic data model (implicit in [RFC8259]) is a subset of
   the generic data model of CBOR.  So, one can use CDDL with JSON by
   limiting oneself to what can be represented in JSON.  Roughly
   speaking, this means leaving out byte strings, tags, and simple
   values other than "false", "true", and "null", leading to the
   following limited prelude:

                      any = #

                      uint = #0
                      nint = #1
                      int = uint / nint

                      tstr = #3
                      text = tstr

                      number = int / float

                      float16 = #7.25
                      float32 = #7.26
                      float64 = #7.27
                      float16-32 = float16 / float32
                      float32-64 = float32 / float64
                      float = float16-32 / float64

                      false = #7.20
                      true = #7.21
                      bool = false / true
                      nil = #7.22
                      null = nil

             Figure 15: JSON-Compatible Subset of CDDL Prelude

   (The major types given here do not have a direct meaning in JSON, but
   they can be interpreted as CBOR major types translated through
   Section 4 of [RFC7049].)

   There are a few fine points in using CDDL with JSON.  First, JSON
   does not distinguish between integers and floating-point numbers;
   there is only one kind of number (which may happen to be integral).
   In this context, specifying a type as "uint", "nint", or "int" then
   becomes a predicate that the number be integral.  As an example, this
   means that the following JSON numbers are all matching "uint":

      10 10.0 1e1 1.0e1 100e-1

   (The fact that these are all integers may be surprising to users
   accustomed to the long tradition in programming languages of using
   decimal points or exponents in a number to indicate a floating-point
   literal.)
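
   The integrality predicate can be sketched in Python as follows.  This
   is a minimal illustration, not a CDDL implementation: the function
   name "json_matches_uint" is hypothetical, and the upper bound assumes
   the 64-bit range of CBOR major type 0.  Numbers are parsed as
   "Decimal" so that no precision is lost before the check.

```python
import json
from decimal import Decimal

UINT_MAX = 2**64 - 1  # largest value of CBOR major type 0


def json_matches_uint(text: str) -> bool:
    """Return True if the JSON text is a number that is integral and in uint range."""
    value = json.loads(text, parse_int=Decimal, parse_float=Decimal)
    if not isinstance(value, Decimal):
        return False  # strings, booleans, null, arrays, objects
    return value == value.to_integral_value() and 0 <= value <= UINT_MAX


examples = ["10", "10.0", "1e1", "1.0e1", "100e-1"]
results = [json_matches_uint(t) for t in examples]
```

   All five spellings above produce True, while "10.5" and "-1" do not
   match.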

   CDDL distinguishes the various CBOR number types, but there is only
   one number type in JSON.  The effect of specifying a floating-point
   precision (float16/float32/float64) is only to restrict the set of
   permissible values to those expressible with
   binary16/binary32/binary64; this is unlikely to be very useful when
   using CDDL for specifying JSON data structures.
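
   One way to picture this restriction is a round-trip test: a value is
   expressible in a given binary format if packing and unpacking it
   returns the same value.  The sketch below uses Python's "struct"
   module; the helper name "expressible_as" is an assumption of this
   example, and NaN is reported as not expressible because it never
   compares equal to itself.

```python
import struct


def expressible_as(value: float, fmt: str) -> bool:
    """fmt is a struct code: 'e' (binary16), 'f' (binary32), or 'd' (binary64)."""
    try:
        packed = struct.pack("<" + fmt, value)
    except OverflowError:
        return False  # finite value too large for the target format
    return struct.unpack("<" + fmt, packed)[0] == value
```

   Under this check, 1.5 is expressible in all three formats, whereas
   0.1 is expressible only as binary64.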

   Fundamentally, the number system of JSON itself is based on decimal
   numbers and decimal fractions and does not have limits to its
   precision or range.  In practice, JSON numbers are often parsed into
   a number type that is called "float64" here, creating a number of
   limitations to the generic data model [RFC7493].  In particular, this
   means that integers can only be expressed with interoperable
   exactness when they lie in the range [-(2**53)+1, (2**53)-1] -- a
   smaller range than that covered by CDDL "int".

   JSON applications that want to stay compatible with I-JSON ("Internet
   JSON"; see [RFC7493]) may therefore want to define integer types with
   more limited ranges, such as in Figure 16.  Note that the types given
   here are not part of the prelude; they need to be copied into the
   CDDL specification if needed.

               ij-uint = 0..9007199254740991
               ij-nint = -9007199254740991..-1
               ij-int = -9007199254740991..9007199254740991

          Figure 16: I-JSON Types for CDDL (Not Part of Prelude)
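
   The bounds in Figure 16 are (2**53)-1 and its negation.  The
   following Python sketch restates the three ranges as predicates and
   shows why the limit exists: beyond it, distinct integers collapse to
   the same float64 value.  The function names are hypothetical.

```python
IJ_MAX = 2**53 - 1  # 9007199254740991


def is_ij_uint(n: int) -> bool:
    return 0 <= n <= IJ_MAX


def is_ij_nint(n: int) -> bool:
    return -IJ_MAX <= n <= -1


def is_ij_int(n: int) -> bool:
    return -IJ_MAX <= n <= IJ_MAX


# 2**53 and 2**53 + 1 convert to the same float64 value.
collides = float(2**53) == float(2**53 + 1)
```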

   JSON applications that do not need to stay compatible with I-JSON and
   that actually may need to go beyond the 64-bit unsigned and negative
   integers supported by "int" (= "uint"/"nint") may want to use the
   following additional types from the standard prelude, which are
   expressed in terms of tags but can straightforwardly be mapped into
   JSON (but not I-JSON) numbers:

      biguint = #6.2(bstr)
      bignint = #6.3(bstr)
      bigint = biguint / bignint
      integer = int / bigint
      unsigned = uint / biguint
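
   A possible mapping from the bignum tags to JSON numbers is sketched
   below in Python.  It assumes the tag semantics of [RFC7049], where
   the content of tag 2 is an unsigned big-endian integer n and tag 3
   denotes -1 - n; the helper name "bigint_to_int" is hypothetical.

```python
import json


def bigint_to_int(tag: int, content: bytes) -> int:
    n = int.from_bytes(content, "big")
    if tag == 2:   # biguint = #6.2(bstr)
        return n
    if tag == 3:   # bignint = #6.3(bstr): the value is -1 - n
        return -1 - n
    raise ValueError("not a bignum tag")


two_to_64 = bytes.fromhex("010000000000000000")  # 2**64, one past uint range
json_text = json.dumps([bigint_to_int(2, two_to_64),
                        bigint_to_int(3, two_to_64)])
```

   Python integers are unbounded, so the resulting JSON text carries the
   full digits; a receiver limited to float64 would not read them back
   exactly.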

   CDDL at this point does not have a way to express the unlimited
   floating-point precision that is theoretically possible with JSON; at
   the time of writing, this is rarely used in protocols in practice.

   Note that a data model described in CDDL is always restricted by what
   can be expressed in the serialization; e.g., floating-point values
   such as NaN (not a number) and the infinities cannot be represented
   in JSON even if they are allowed in the CDDL generic data model.
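
   The serialization limit can be observed with a strict JSON encoder.
   In the Python sketch below, "allow_nan=False" makes "json.dumps"
   reject NaN and the infinities instead of emitting non-standard
   tokens; the wrapper name "to_strict_json" is an assumption of this
   example.

```python
import json


def to_strict_json(value) -> str:
    # allow_nan=False raises ValueError for NaN, Infinity, and -Infinity
    return json.dumps(value, allow_nan=False)
```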

Appendix F.  A CDDL Tool

   This appendix is for information only.

   A rough CDDL tool is available.  For CDDL specifications, it can
   check the syntax, generate one or more instances (expressed in CBOR
   diagnostic notation or in pretty-printed JSON), and validate an
   existing instance against the specification:

                   Usage:
                   cddl spec.cddl generate [n]
                   cddl spec.cddl json-generate [n]
                   cddl spec.cddl validate instance.cbor
                   cddl spec.cddl validate instance.json

                        Figure 17: CDDL Tool Usage

   Install on a system with a modern Ruby via:

                             gem install cddl

                     Figure 18: CDDL Tool Installation

   The accompanying CBOR diagnostic tools (which are automatically
   installed by the above) are described in
   <https://github.com/cabo/cbor-diag>; they can be used to convert
   between binary CBOR, a pretty-printed hexadecimal form of binary
   CBOR, CBOR diagnostic notation, JSON, and YAML [YAML].

Appendix G.  Extended Diagnostic Notation

   This appendix is normative.

   Section 6 of [RFC7049] defines a "diagnostic notation" in order to be
   able to converse about CBOR data items without having to resort to
   binary data.  Diagnostic notation is based on JSON, with extensions
   for representing CBOR constructs such as binary data and tags.

   (Standardizing this together with the actual interchange format does
   not serve to create another interchange format but enables the use of
   a shared diagnostic notation in tools for and documents about CBOR.)

   This appendix discusses a few extensions to the diagnostic notation
   that have turned out to be useful since RFC 7049 was written.  We
   refer to the result as Extended Diagnostic Notation (EDN).

G.1.  Whitespace in Byte String Notation

   Examples often benefit from some whitespace (spaces, line breaks) in
   byte strings.  In EDN, whitespace is ignored in prefixed byte
   strings; for instance, the following are equivalent:

      h'48656c6c6f20776f726c64'
      h'48 65 6c 6c 6f 20 77 6f 72 6c 64'
      h'4 86 56c 6c6f
        20776 f726c64'
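
   A decoder can obtain this behavior by discarding all whitespace
   before interpreting the hexadecimal digits.  The Python sketch below
   assumes that its input is the text between h' and the closing quote;
   the helper name "decode_hex_bytes" is hypothetical.

```python
def decode_hex_bytes(body: str) -> bytes:
    """Decode the content of an h'...' literal, ignoring all whitespace."""
    return bytes.fromhex("".join(body.split()))


a = decode_hex_bytes("48656c6c6f20776f726c64")
b = decode_hex_bytes("48 65 6c 6c 6f 20 77 6f 72 6c 64")
c = decode_hex_bytes("4 86 56c 6c6f\n  20776 f726c64")
```

   Whitespace is removed explicitly because it may fall between the two
   digits of a byte, as in the third form.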

G.2.  Text in Byte String Notation

   Diagnostic notation notates byte strings in one of the base encodings
   per [RFC4648], enclosed in single quotes, prefixed by >h< for base16,
   >b32< for base32, >h32< for base32hex, or >b64< for base64 or
   base64url.  Quite often, byte strings carry bytes that are
   meaningfully interpreted as UTF-8 text.  EDN allows the use of single
   quotes without a prefix to express byte strings with UTF-8 text; for
   instance, the following are equivalent:

      'hello world'
      h'68656c6c6f20776f726c64'

   The escaping rules of JSON strings are applied equivalently for
   text-based byte strings, e.g., "\\" stands for a single backslash and
   "\'" stands for a single quote.  Whitespace is included literally,
   i.e., the previous section does not apply to text-based byte strings.
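
   The following Python sketch illustrates the equivalence for the
   content between the single quotes.  It is deliberately incomplete: it
   handles only an escaped backslash and an escaped single quote, not
   the full set of JSON escapes, and the helper name "text_bytes" is an
   assumption of this example.

```python
def text_bytes(body: str) -> bytes:
    """Bytes denoted by a text-based byte string with the given quoted content."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        # A backslash followed by a backslash or a single quote denotes
        # the second character; other escapes are not handled here.
        if ch == "\\" and i + 1 < len(body) and body[i + 1] in "\\'":
            out.append(body[i + 1])
            i += 2
        else:
            out.append(ch)  # includes whitespace, which is kept literally
            i += 1
    return "".join(out).encode("utf-8")
```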

G.3.  Embedded CBOR and CBOR Sequences in Byte Strings

   Where a byte string is to carry an embedded CBOR-encoded item, or
   more generally a sequence of zero or more such items, the diagnostic
   notation for these zero or more CBOR data items, separated by commas,
   can be enclosed in << and >> to notate the byte string resulting from
   encoding the data items and concatenating the result.  For instance,
   each pair of columns in the following are equivalent:

      <<1>>              h'01'
      <<1, 2>>           h'0102'
      <<"foo", null>>    h'63666F6FF6'
      <<>>               h''
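
   The table can be reproduced with a very small encoder.  The Python
   sketch below covers only the CBOR subset needed here (null, unsigned
   integers 0 to 23, and text strings of at most 23 bytes); the names
   "encode_item" and "embedded" are hypothetical.

```python
def encode_item(item) -> bytes:
    """Encode a tiny CBOR subset: None, ints 0..23, and short text strings."""
    if item is None:
        return b"\xf6"  # null (major type 7, additional information 22)
    if isinstance(item, int) and not isinstance(item, bool) and 0 <= item <= 23:
        return bytes([item])  # major type 0, value held in the initial byte
    if isinstance(item, str):
        data = item.encode("utf-8")
        if len(data) <= 23:
            return bytes([0x60 | len(data)]) + data  # major type 3
    raise ValueError("outside the subset handled by this sketch")


def embedded(*items) -> bytes:
    """Content of the byte string notated as <<item, item, ...>>."""
    return b"".join(encode_item(i) for i in items)
```

   With these definitions, embedded("foo", None) yields the five bytes
   shown in the third row, and embedded() yields the empty byte string.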

G.4.  Concatenated Strings

   While the ability to include whitespace enables line-breaking of
   encoded byte strings, a mechanism is needed to be able to include
   text strings as well as byte strings in direct UTF-8 representation
   into line-based documents (such as RFCs and source code).

   We extend the diagnostic notation by allowing multiple text strings
   or multiple byte strings to be notated separated by whitespace; these
   are then concatenated into a single text or byte string,
   respectively.  Text strings and byte strings do not mix within such a
   concatenation, except that byte string notation can be used inside a
   sequence of concatenated text string notation to encode characters
   that may be better represented in an encoded way.  The following four
   values are equivalent:

      "Hello world"
      "Hello " "world"
      "Hello" h'20' "world"
      "" h'48656c6c6f20776f726c64' ""

   Similarly, the following byte string values are equivalent:

      'Hello world'
      'Hello ' 'world'
      'Hello ' h'776f726c64'
      'Hello' h'20' 'world'
      '' h'48656c6c6f20776f726c64' '' b64''
      h'4 86 56c 6c6f' h' 20776 f726c64'

   (Note that the approach of separating by whitespace, while familiar
   from the C language, requires some attention -- a single comma makes
   a big difference here.)
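
   Python happens to share the C convention of concatenating adjacent
   string literals, so it can serve as an analogy for the effect of a
   stray comma.  The snippet below is Python source, not EDN, and is
   offered only to make the difference visible.

```python
joined = ("Hello " "world")   # adjacent literals: one string
pair = ("Hello ", "world")    # with a comma: two separate items

joined_bytes = b"Hello " b"world"  # the same rule applies to byte literals
```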

G.5.  Hexadecimal, Octal, and Binary Numbers

   In addition to JSON's decimal numbers, EDN provides hexadecimal,
   octal, and binary numbers in the usual C-language notation (octal
   is supported only in the form with the "0o" prefix).

   The following are equivalent:

      4711
      0x1267
      0o11147
      0b1001001100111

   As are:

      1.5
      0x1.8p0
      0x18p-4
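
   These equivalences can be checked in Python, whose integer literals
   use the same prefixes.  Python has no hexadecimal floating-point
   literals, so the sketch relies on "float.fromhex" for the second
   group.

```python
integers = [4711, 0x1267, 0o11147, 0b1001001100111]

floats = [1.5, float.fromhex("0x1.8p0"), float.fromhex("0x18p-4")]
```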

G.6.  Comments

   Longer pieces of diagnostic notation may benefit from comments.  JSON
   famously does not provide for comments, and basic diagnostic notation
   per RFC 7049 inherits this property.

   In EDN, comments can be included, delimited by slashes ("/").  Any
   text within and including a pair of slashes is considered a comment.

   Comments are considered whitespace.  Hence, they are allowed in
   prefixed byte strings; for instance, the following are equivalent:

      h'68656c6c6f20776f726c64'
      h'68 65 6c /doubled l!/ 6c 6f /hello/
        20 /space/
        77 6f 72 6c 64' /world/

   This can be used to annotate a CBOR structure as in:

      /grasp-message/ [/M_DISCOVERY/ 1, /session-id/ 10584416,
                       /objective/ [/objective-name/ "opsonize",
                                    /D, N, S/ 7, /loop-count/ 105]]
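
   For prefixed byte strings, treating comments as whitespace can be
   sketched in Python as shown below.  The helper name
   "decode_commented_hex" is hypothetical, and its input is assumed to
   be the text between h' and the closing quote.  Removing comments this
   naively is not suitable for complete EDN documents, where a slash may
   also occur inside a text string.

```python
import re


def decode_commented_hex(body: str) -> bytes:
    """Decode h'...' content, treating /.../ comments as whitespace."""
    without_comments = re.sub(r"/[^/]*/", " ", body)
    return bytes.fromhex("".join(without_comments.split()))


body = """68 65 6c /doubled l!/ 6c 6f /hello/
  20 /space/
  77 6f 72 6c 64"""
decoded = decode_commented_hex(body)
```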

   (There are currently no end-of-line comments.  If we want to add
   them, "//" sounds like a reasonable delimiter given that we already
   use slashes for comments, but we could also go, for example,
   for "#".)

Appendix H.  Examples

   This appendix is for information only.

   This appendix contains a few examples of structures defined
   using CDDL.  The theme for the examples is taken from [RFC7071],
   which defines certain JSON structures in English.  For a similar
   example, it may also be of interest to examine Appendix A of
   [RFC8007], which contains a CDDL definition for a JSON structure
   defined in the main body of that RFC.

   These examples all happen to describe data that is interchanged in
   JSON.  Examples for CDDL definitions of data that is interchanged in
   CBOR can be found in [RFC8152], [GRASP], and [RFC8428].

   [RFC7071] defines the "reputon" structure for JSON using somewhat
   formalized English text.  Here is a (somewhat verbose) equivalent
   definition using the same terms, but notated in CDDL:

                 reputation-object = {
                   reputation-context,
                   reputon-list
                 }

                 reputation-context = (
                   application: text
                 )

                 reputon-list = (
                   reputons: reputon-array
                 )

                 reputon-array = [* reputon]

                 reputon = {
                   rater-value,
                   assertion-value,
                   rated-value,
                   rating-value,
                   ? conf-value,
                   ? normal-value,
                   ? sample-value,
                   ? gen-value,
                   ? expire-value,
                   * ext-value,
                 }

                 rater-value = ( rater: text )
                 assertion-value = ( assertion: text )
                 rated-value = ( rated: text )
                 rating-value = ( rating: float16 )
                 conf-value = ( confidence: float16 )
                 normal-value = ( normal-rating: float16 )
                 sample-value = ( sample-size: uint )
                 gen-value = ( generated: uint )
                 expire-value = ( expires: uint )
                 ext-value = ( text => any )

   An equivalent, more compact form of this example would be:

                        reputation-object = {
                          application: text
                          reputons: [* reputon]
                        }

                        reputon = {
                          rater: text
                          assertion: text
                          rated: text
                          rating: float16
                          ? confidence: float16
                          ? normal-rating: float16
                          ? sample-size: uint
                          ? generated: uint
                          ? expires: uint
                          * text => any
                        }

   Note how this rather clearly delineates the structure somewhat
   shrouded by so many words in Section 6.2.2 of [RFC7071].  Also, this
   definition makes it clear that several ext-values are allowed (by
   definition with different member names); RFC 7071 could be read to
   forbid the repetition of ext-value ("A specific reputon-element
   MUST NOT appear more than once" is ambiguous).

   The CDDL tool described in Appendix F generates as one example:

                  {
                    "application": "conchometry",
                    "reputons": [
                      {
                        "rater": "Ephthianura",
                        "assertion": "codding",
                        "rated": "sphaerolitic",
                        "rating": 0.34133473256800795,
                        "confidence": 0.9481983064298332,
                        "expires": 1568,
                        "unplaster": "grassy"
                      },
                      {
                        "rater": "nonchargeable",
                        "assertion": "raglan",
                        "rated": "alienage",
                        "rating": 0.5724646875815566,
                        "sample-size": 3514,
                        "Aldebaran": "unchurched",
                        "puruloid": "impersonable",
                        "uninfracted": "pericarpoidal",
                        "schorl": "Caro"
                      },
                      {
                        "rater": "precollectable",
                        "assertion": "Merat",
                        "rated": "thermonatrite",
                        "rating": 0.19164006323936977,
                        "confidence": 0.6065252103391268,
                        "normal-rating": 0.5187773690879303,
                        "generated": 899,
                        "speedy": "solidungular",
                        "noviceship": "medicine",
                        "checkrow": "epidictic"
                      }
                    ]
                  }
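
   To show how the compact definition reads as a set of checks, the
   following Python sketch hand-codes a validator for it.  This is not
   how the CDDL tool works; all function names are assumptions of this
   example.  Per Appendix E, "uint" is treated as a predicate that the
   JSON number be integral and non-negative, and the float16 precision
   restriction is not checked.

```python
import json

UINT_MAX = 2**64 - 1
REQUIRED_TEXT = ("rater", "assertion", "rated")
OPTIONAL_NUMBER = ("confidence", "normal-rating")
OPTIONAL_UINT = ("sample-size", "generated", "expires")


def is_number(v) -> bool:
    return isinstance(v, (int, float)) and not isinstance(v, bool)


def is_uint(v) -> bool:
    if not is_number(v) or not 0 <= v <= UINT_MAX:
        return False
    return isinstance(v, int) or v.is_integer()


def is_reputon(obj) -> bool:
    if not isinstance(obj, dict):
        return False
    if not all(isinstance(obj.get(k), str) for k in REQUIRED_TEXT):
        return False
    if not is_number(obj.get("rating")):
        return False
    if not all(is_number(obj[k]) for k in OPTIONAL_NUMBER if k in obj):
        return False
    if not all(is_uint(obj[k]) for k in OPTIONAL_UINT if k in obj):
        return False
    return True  # any other member is accepted, as in "* text => any"


def is_reputation_object(obj) -> bool:
    return (isinstance(obj, dict)
            and set(obj) == {"application", "reputons"}
            and isinstance(obj["application"], str)
            and isinstance(obj["reputons"], list)
            and all(is_reputon(r) for r in obj["reputons"]))


# First reputon of the generated example above
sample = json.loads("""
{ "application": "conchometry",
  "reputons": [ { "rater": "Ephthianura", "assertion": "codding",
                  "rated": "sphaerolitic", "rating": 0.34133473256800795,
                  "confidence": 0.9481983064298332, "expires": 1568,
                  "unplaster": "grassy" } ] }
""")
```

   The sample is accepted by is_reputation_object(), while a reputon
   lacking "rating" or carrying a negative "expires" is rejected.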

Acknowledgements

   Inspiration was taken from the C and Pascal languages, MPEG's
   conventions for describing structures in the ISO base media file
   format, RELAX NG and its compact syntax [RELAXNG], and, in
   particular, Andrew Lee Newton's early proposals on JSON Content Rules
   (JCR) as found in draft version four (-04) of [JCR].

   Lots of highly useful feedback came from members of the IETF CBOR WG
   -- in particular, Ari Keranen, Brian Carpenter, Burt Harris, Jeffrey
   Yasskin, Jim Hague, Jim Schaad, Joe Hildebrand, Max Pritikin, Michael
   Richardson, Pete Cordell, Sean Leonard, and Yaron Sheffer.  Also,
   Francesca Palombini and Joe Hildebrand volunteered to chair the WG
   when it was created, providing the framework for generating and
   processing this feedback, with Barry Leiba having taken over from
   Joe Hildebrand since then.
   Chris Lonvick and Ines Robles provided additional reviews during IESG
   processing, and Alexey Melnikov steered the process as the
   responsible Area Director.

   The CDDL tool described in Appendix F was written by Carsten Bormann,
   building on previous work by Troy Heninger and Tom Lord.

Contributors

   CDDL was originally conceived by Bert Greevenbosch, who also wrote
   the original five draft versions of this document.

Authors' Addresses

   Henk Birkholz
   Fraunhofer SIT
   Rheinstrasse 75
   Darmstadt  64295
   Germany

   Email: henk.birkholz@sit.fraunhofer.de


   Christoph Vigano
   Universitaet Bremen

   Email: christoph.vigano@uni-bremen.de


   Carsten Bormann
   Universitaet Bremen TZI
   Bibliothekstr. 1
   Bremen  D-28359
   Germany

   Phone: +49-421-218-63921
   Email: cabo@tzi.org