RFC 8610

Concise Data Definition Language (CDDL): A Notational Convention to Express Concise Binary Object Representation (CBOR) and JSON Data Structures

Pages: 64
Proposed Standard
Updated by: 9682

3.  Syntax

   This section presents the overall syntax of CDDL, together with some
   examples that illustrate it.  (The presentation does not attempt to
   be overly formal; refer to Appendix B for details.)

3.1.  General Conventions

   The basic syntax is inspired by ABNF [RFC5234], with the following
   conventions:

   o  Rules, whether they define groups or types, are defined with a
      name, followed by an equals sign "=" and the actual definition
      according to the respective syntactic rules of that definition.

   o  A name can consist of any of the characters from the set {"A" to
      "Z", "a" to "z", "0" to "9", "_", "-", "@", ".", "$"}, starting
      with an alphabetic character (including "@", "_", "$") and ending
      in such a character or a digit.

      *  Names are case sensitive.

      *  It is preferred style to start a name with a lowercase letter.

      *  The hyphen is preferred over the underscore (except in a
         "bareword" (Section 3.5.1), where the semantics may actually
         require an underscore).

      *  The period may be useful for larger specifications, to express
         some module structure (as in "tcp.throughput" vs.
         "udp.throughput").

      *  A number of names are predefined in the CDDL prelude, as listed
         in Appendix D.

      *  Rule names (types or groups) do not appear in the actual CBOR
         encoding, but names used as "barewords" in member keys do.

   o  Comments are started by a ";" (semicolon) character and finish at
      the end of a line (LF or CRLF).

   o  Except within strings, whitespace (spaces, newlines, and comments)
      is used to separate syntactic elements for readability (and to
      separate identifiers, range operators, or numbers that follow each
      other); it is otherwise completely optional.

   o  Hexadecimal numbers are preceded by "0x" (without quotes) and are
      case insensitive.  Similarly, binary numbers are preceded by "0b".

   o  Text strings are enclosed by double quotation '"' characters.
      They follow the conventions for strings as defined in Section 7 of
      [RFC8259].  (ABNF users may want to note that there is no support
      in CDDL for the concept of case insensitivity in text strings; if
      necessary, regular expressions can be used (Section 3.8.3).)

   o  Byte strings are enclosed by single quotation "'" characters and
      may be prefixed by "h" or "b64".  If unprefixed, the string is
      interpreted as with a text string, except that single quotes must
      be escaped and that the resulting UTF-8 bytes are marked as a byte
      string (major type 2).  If prefixed as "h" or "b64", the string is
      interpreted as a sequence of pairs of hex digits (base16; see
      Section 8 of [RFC4648]) or a base64(url) string (Section 4 or
      Section 5 of [RFC4648]), respectively (as with the diagnostic
      notation in Section 6 of [RFC7049]; cf. Appendix G.2); any
      whitespace present within the string (including comments) is
      ignored in the prefixed case.

   o  CDDL uses UTF-8 [RFC3629] for its encoding.  Processing of CDDL
      does not involve Unicode normalization processes.

   Example:

                    ; This is a comment
                    person = { g }

                    g = (
                      "name": tstr,
                      age: int,  ; "age" is a bareword
                    )

3.2.  Occurrence

   An optional _occurrence_ indicator can be given in front of a group
   entry.  It is either (1) one of the characters "?" (optional), "*"
   (zero or more), or "+" (one or more) or (2) of the form n*m, where n
   and m are optional unsigned integers and n is the lower limit
   (default 0) and m is the upper limit (default no limit) of
   occurrences.

   If no occurrence indicator is specified, the group entry is to occur
   exactly once (as if 1*1 were specified).  A group entry with an
   occurrence indicator matches sequences of name/value pairs that are
   composed by concatenating a number of sequences that the basic group
   entry matches, where the number needs to be allowed by the occurrence
   indicator.
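
   As an illustration only, the mapping from an occurrence indicator to
   its lower and upper limits can be sketched in Python.  The function
   names "occurrence_bounds" and "count_allowed" are hypothetical and
   not part of CDDL or of any CDDL tool; an upper limit of None stands
   for "no limit".

```python
from typing import Optional, Tuple


def occurrence_bounds(indicator: str) -> Tuple[int, Optional[int]]:
    """Return (lower, upper) for an occurrence indicator (hypothetical helper).

    An upper value of None means "no limit".
    """
    if indicator == "":
        return (1, 1)      # no indicator: exactly once, as if 1*1
    if indicator == "?":
        return (0, 1)      # optional
    if indicator == "+":
        return (1, None)   # one or more
    # Remaining forms are n*m, where n and m are both optional;
    # a bare "*" therefore means lower 0 and no upper limit.
    lower_text, star, upper_text = indicator.partition("*")
    if star != "*":
        raise ValueError(f"not an occurrence indicator: {indicator!r}")
    lower = int(lower_text) if lower_text else 0
    upper = int(upper_text) if upper_text else None
    return (lower, upper)


def count_allowed(indicator: str, count: int) -> bool:
    """Check whether a number of repetitions is allowed by the indicator."""
    lower, upper = occurrence_bounds(indicator)
    return count >= lower and (upper is None or count <= upper)
```

   For instance, count_allowed("1*2", 3) is False, while
   count_allowed("2*", 3) is True.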

   Note that CDDL, outside any directives/annotations that could
   possibly be defined, does not make any prescription as to whether
   arrays or maps use definite-length or indefinite-length encoding.
   That is, there is no correlation between leaving the size of an array
   "open" in the spec and the fact that it is then interchanged with
   definite or indefinite length.

   Please also note that CDDL can describe flexibility that the data
   model of the target representation does not have.  This is rather
   obvious for JSON but is also relevant for CBOR:

                           apartment = {
                             kitchen: size,
                             * bedroom: size,
                           }
                           size = float ; in m2

   The previous specification does not mean that CBOR is changed to
   allow using the key "bedroom" more than once.  In other words, due to
   the restrictions imposed by the data model, the third line
   effectively reduces to:

                             ? bedroom: size,

   (Occurrence indicators beyond one are still useful in maps for groups
   that allow a variety of keys.)

3.3.  Predefined Names for Types

   CDDL predefines a number of names.  This subsection summarizes these
   names, but please see Appendix D for the exact definitions.

   The following keywords for primitive datatypes are defined:

   "bool"  Boolean value (major type 7, additional information 20
      or 21).

   "uint"  An unsigned integer (major type 0).

   "nint"  A negative integer (major type 1).

   "int"  An unsigned integer or a negative integer.

   "float16"  A number representable as a half-precision float [IEEE754]
      (major type 7, additional information 25).

   "float32"  A number representable as a single-precision float
      [IEEE754] (major type 7, additional information 26).

   "float64"  A number representable as a double-precision float
      [IEEE754] (major type 7, additional information 27).

   "float"  One of float16, float32, or float64.

   "bstr" or "bytes"  A byte string (major type 2).

   "tstr" or "text"  Text string (major type 3).

   (Note that there are no predefined names for arrays or maps; these
   are defined with the syntax given below.)

   In addition, a number of types are defined in the prelude that are
   associated with CBOR tags, such as "tdate", "bigint", "regexp", etc.

3.4.  Arrays

   Array definitions surround a group with square brackets.

   For each entry, an occurrence indicator as specified in Section 3.2
   is permitted.

   For example:

                     unlimited-people = [* person]
                     one-or-two-people = [1*2 person]
                     at-least-two-people = [2* person]
                     person = (
                         name: tstr,
                         age: uint,
                     )

   The group "person" is defined in such a way that repeating it in the
   array each time generates alternating names and ages, so these are
   four valid values for a data item of type "unlimited-people":

      ["roundlet", 1047, "psychurgy", 2204, "extrarhythmical", 2231]
      []
      ["aluminize", 212, "climograph", 4124]
      ["penintime", 1513, "endocarditis", 4084, "impermeator", 1669,
       "coextension", 865]
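
   The alternating name/age structure can be checked mechanically.  The
   following Python sketch is illustrative only: the function name
   "matches_people" and its parameters are hypothetical, and it assumes
   the array has already been decoded into a Python list.

```python
from typing import Optional


def matches_people(items: list, lower: int = 0,
                   upper: Optional[int] = None) -> bool:
    """Check a decoded array against [lower*upper person] (hypothetical).

    person = ( name: tstr, age: uint ), so the array must consist of
    alternating text strings and unsigned integers.
    """
    if len(items) % 2 != 0:
        return False                      # an incomplete person group
    count = len(items) // 2
    if count < lower or (upper is not None and count > upper):
        return False                      # occurrence limits violated
    for i in range(0, len(items), 2):
        name, age = items[i], items[i + 1]
        if not isinstance(name, str):
            return False
        # uint: a non-negative integer.  bool is excluded explicitly
        # because Python treats it as a subclass of int.
        if isinstance(age, bool) or not isinstance(age, int) or age < 0:
            return False
    return True
```

   With the defaults, the function corresponds to "unlimited-people";
   matches_people(items, 1, 2) corresponds to "one-or-two-people".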

3.5.  Maps

   The syntax for specifying maps merits special attention, and it
   provides a number of optimizations and conveniences, as maps are
   likely to be the focal point of many specifications employing CDDL.
   While the syntax does not strictly distinguish struct and table
   usage of maps, it caters specifically to each of them.

   But first, let's reiterate a feature of CBOR that it has inherited
   from JSON: the key/value pairs in CBOR maps have no fixed ordering.
   (One could imagine situations where fixing the ordering may be of
   use.  For example, a decoder could look for values associated with
   integer keys 1, 3, and 7.  If the order were fixed and the decoder
   encounters the key 4 without having encountered key 3, it could
   conclude that key 3 is not available without doing more complicated
   bookkeeping.  Unfortunately, neither JSON nor CBOR supports this, so
   no attempt was made to support this in CDDL either.)

3.5.1.  Structs

   The "struct" usage of maps is similar to the way JSON objects are
   used in many JSON applications.

   A map is defined in the same way as an array (see Section 3.4),
   except for using curly braces "{}" instead of square brackets "[]".

   An occurrence indicator as specified in Section 3.2 is permitted for
   each group entry.

   The following is an example of a record with a structure embedded:

       Geography = [
         city           : tstr,
         gpsCoordinates : GpsCoordinates,
       ]

       GpsCoordinates = {
         longitude      : uint,            ; degrees, scaled by 10^7
         latitude       : uint,            ; degrees, scaled by 10^7
       }

   When encoding, the Geography record is encoded using a CBOR array
   with two members (the keys for the group entries are ignored),
   whereas the GpsCoordinates structure is encoded as a CBOR map with
   two key/value pairs.

   Types used in a structure can be defined in separate rules or just in
   place (potentially placed inside parentheses, such as for choices).
   For example:

                           located-samples = {
                             sample-point: int,
                             samples: [+ float],
                           }

   where "located-samples" is the datatype to be used when referring to
   the struct, and "sample-point" and "samples" are the keys to be used.
   This is actually a complete example: an identifier that is followed
   by a colon can be directly used as the text string for a member key
   (we speak of a "bareword" member key), as can a double-quoted string
   or a number.  (When other types -- in particular, types that contain
   more than one value -- are used as the types of keys, they are
   followed by a double arrow; see below.)

   If a text string key does not match the syntax for an identifier (or
   if the specifier just happens to prefer using double quotes), the
   text string syntax can also be used in the member key position,
   followed by a colon.  The above example could therefore have been
   written with quoted strings in the member key positions.

   More generally, types specified in ways other than those listed for
   the cases described above can be used in a key-type position by
   following them with a double arrow -- in particular, the double arrow
   is necessary if a type is named by an identifier (which, when
   followed by a colon, would be interpreted as a "bareword" and turned
   into a text string).  A literal text string also gives rise to a type
   (which contains a single value only -- the given string), so another
   form for this example is:

                         located-samples = {
                           "sample-point" => int,
                           "samples" => [+ float],
                         }

   See Section 3.5.4 below for how the colon (":") shortcut described
   here also adds some implied semantics.

   A better way to demonstrate the use of the double arrow may be:

             located-samples = {
               sample-point: int,
               samples: [+ float],
               * equipment-type => equipment-tolerances,
             }
             equipment-type = [name: tstr, manufacturer: tstr]
             equipment-tolerances = [+ [float, float]]

   The example below defines a struct with optional entries: display
   name (as a text string), the name components first name and family
   name (as text strings), and age information (as an unsigned integer).

                          PersonalData = {
                            ? displayName: tstr,
                            NameComponents,
                            ? age: uint,
                          }

                          NameComponents = (
                            ? firstName: tstr,
                            ? familyName: tstr,
                          )

   Note that the group definition for NameComponents does not generate
   another map; instead, all four keys are directly in the struct built
   by PersonalData.

   In this example, all key/value pairs are optional from the
   perspective of CDDL.  With no occurrence indicator, an entry is
   mandatory.

   If the addition of more entries not specified by the current
   specification is desired, one can add this possibility explicitly:

                          PersonalData = {
                            ? displayName: tstr,
                            NameComponents,
                            ? age: uint,
                            * tstr => any
                          }

                          NameComponents = (
                            ? firstName: tstr,
                            ? familyName: tstr,
                          )

            Figure 7: Personal Data: Example for Extensibility

   The CDDL tool described in Appendix F generated the following as one
   acceptable instance for this specification:

         {"familyName": "agust", "antiforeignism": "pretzel",
          "springbuck": "illuminatingly", "exuviae": "ephemeris",
          "kilometrage": "frogfish"}

   (See Section 3.9 for one way to explicitly identify an extension
   point.)

3.5.2.  Tables

   A table can be specified by defining a map with entries where the
   key type allows more than just a single value; for example:

                         square-roots = {* x => y}
                         x = int
                         y = float

   Here, the key in each key/value pair has datatype x (defined as int),
   and the value has datatype y (defined as float).

   If the specification does not need to restrict one of x or y (i.e.,
   the application is free to choose per entry), it can be replaced by
   the predefined name "any".

   As another example, the following could be used as a conversion table
   converting from an integer or float to a string:

                      tostring = {* mynumber => tstr}
                      mynumber = int / float

3.5.3.  Non-deterministic Order

   While the way arrays are matched is fully determined by the PEG
   formalism (see Appendix A), matching is more complicated for maps, as
   maps do not have an inherent order.  For each candidate name/value
   pair that the PEG algorithm would try, a matching member is picked
   out of the entire map.  For certain group expressions, more than one
   member in the map may match.  Most often, this is inconsequential, as
   the group expression tends to consume all matches:

                            labeled-values = {
                              ? fritz: number,
                              * label => value
                            }
                            label = text
                            value = number

   Here, if any member with the key "fritz" is present, this will be
   picked by the first entry of the group; all remaining text/number
   members will be picked by the second entry (and if anything remains
   unpicked, the map does not match).

   However, it is possible to construct group expressions where what is
   actually picked is indeterminate, but does matter:

                            do-not-do-this = {
                              int => int,
                              int => 6,
                            }

   When this expression is matched against "{3: 5, 4: 6}", the first
   group entry might pick off the "3: 5", leaving "4: 6" for matching
   the second one.  Or it might pick off "4: 6", leaving nothing for the
   second entry.  This pathological non-determinism is caused by
   specifying "more general" before "more specific" and by having a
   general rule that only consumes a subset of the map key/value pairs
   that it is able to match -- both tend not to occur in real-world
   specifications of maps.  At the time of writing, CDDL tools cannot
   detect such cases automatically, and for the present version of the
   CDDL specification, the specification writer is simply urged to not
   write pathologically non-deterministic specifications.
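
   The two possible outcomes for "do-not-do-this" can be made explicit
   by enumerating every member that the first group entry could pick.
   The Python sketch below is illustrative only; the function names are
   hypothetical, and the matching is hard-coded for this one
   specification rather than being a general CDDL matcher.

```python
def is_int(value) -> bool:
    """True for integers, excluding bool (a subclass of int in Python)."""
    return isinstance(value, int) and not isinstance(value, bool)


def outcomes_do_not_do_this(data: dict) -> set:
    """Collect the match results reachable for
    do-not-do-this = { int => int, int => 6 },
    one result per member that the first entry might pick.
    An empty set means the first entry cannot pick any member.
    """
    results = set()
    members = list(data.items())
    for i, (key, value) in enumerate(members):
        if not (is_int(key) and is_int(value)):
            continue                      # first entry cannot pick this
        rest = members[:i] + members[i + 1:]
        # The second entry must consume exactly the one remaining member.
        ok = (len(rest) == 1 and is_int(rest[0][0])
              and is_int(rest[0][1]) and rest[0][1] == 6)
        results.add(ok)
    return results
```

   For the map {3: 5, 4: 6}, the result is {True, False}: whether the
   map matches depends on which member happens to be picked first.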

   (The astute reader will be reminded of what was called "ambiguous
   content models" in the Standard Generalized Markup Language (SGML)
   and "non-deterministic content models" in XML.  That problem is
   related to the one described here, but the problem here is
   specifically caused by the lack of order in maps, something that the
   XML schema languages do not have to contend with.  Note that
   RELAX NG's "interleave" pattern handles lack of order explicitly on
   the specification side, while the instances in XML always have
   determinate order.)

3.5.4.  Cuts in Maps

   The extensibility idiom discussed above for structs has one problem:

                        extensible-map-example = {
                          ? "optional-key" => int,
                          * tstr => any
                        }

   In this example, there is one optional key "optional-key", which,
   when present, maps to an integer.  There is also a wildcard for any
   future additions.

   Unfortunately, the data item

                      { "optional-key": "nonsense" }

   does match this specification: while the first entry of the group
   does not match, the second one (the wildcard) does.  This may very
   well be desirable (e.g., if a future extension is to be allowed to
   extend the type of "optional-key"), but in many cases it isn't.

   In anticipation of a more general potential feature called "cuts",
   CDDL allows inserting a cut "^" into the definition of the map entry:

                       extensible-map-example = {
                         ? "optional-key" ^ => int,
                         * tstr => any
                       }

   A cut in this position means that once the member key matches the
   name part of an entry that carries a cut, other potential matches for
   the key of the member that occur in later entries in the group of the
   map are no longer allowed.  In other words, when a group entry would
   pick a key/value pair based on just a matching key, it "locks in" the
   pick -- this rule applies, independently of whether the value matches
   as well, so when it does not, the entire map fails to match.  In
   summary, the example above no longer matches the specification as
   modified with the cut.

   Since the desire for this kind of exclusive matching is so frequent,
   the ":" shortcut is actually defined to include the cut semantics.
   So, the preceding example (including the cut) can be written more
   simply as:

                        extensible-map-example = {
                          ? "optional-key": int,
                          * tstr => any
                        }

   or even shorter, using a bareword for the key:

                        extensible-map-example = {
                          ? optional-key: int,
                          * tstr => any
                        }
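
   The effect of the cut can be sketched in Python as follows.  This is
   an illustration under simplifying assumptions, not a general
   matcher: the function name "matches_extensible_map" and its "cut"
   parameter are hypothetical, and the logic is hard-coded for
   "extensible-map-example".

```python
def is_int(value) -> bool:
    """True for integers, excluding bool (a subclass of int in Python)."""
    return isinstance(value, int) and not isinstance(value, bool)


def matches_extensible_map(data: dict, cut: bool) -> bool:
    """Match against { ? "optional-key" => int, * tstr => any },
    with (cut=True) or without (cut=False) a cut on the first entry.
    cut=True also models the ":" shortcut, which implies the cut.
    """
    remaining = dict(data)
    if "optional-key" in remaining:
        if is_int(remaining["optional-key"]):
            del remaining["optional-key"]   # first entry picks the member
        elif cut:
            return False    # key matched, value did not: pick is locked in
        # Without a cut, the member is left over for the wildcard entry.
    # The wildcard entry accepts any member whose key is a text string.
    return all(isinstance(key, str) for key in remaining)
```

   With cut=False, { "optional-key": "nonsense" } is accepted through
   the wildcard; with cut=True, it is rejected.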

3.6.  Tags

   A type can make use of a CBOR tag (major type 6) by using the
   representation type notation, giving #6.nnn(type) where nnn is an
   unsigned integer giving the tag number and "type" is the type of the
   data item being tagged.

   For example, the following line from the CDDL prelude (Appendix D)
   defines "biguint" as a type name for an unsigned bignum N:

                           biguint = #6.2(bstr)

   The tags defined by [RFC7049] are included in the prelude.
   Additional tags registered since [RFC7049] was written need to be
   added to a CDDL specification as needed; e.g., a binary Universally
   Unique Identifier (UUID) tag could be referenced as "buuid" in a
   specification after defining

                            buuid = #6.37(bstr)

   In the following example, usage of tag 32 for URIs is optional:

                        my_uri = #6.32(tstr) / tstr
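
   To relate the "#6.nnn(type)" notation to the encoded form, the
   following Python sketch encodes the initial bytes of a tag and, as
   one worked case, a "biguint".  It is illustrative only; the helper
   names "encode_head" and "encode_biguint" are hypothetical and are
   not taken from any CBOR library.

```python
def encode_head(major_type: int, argument: int) -> bytes:
    """Encode a CBOR initial byte plus its argument (hypothetical helper)."""
    if argument < 0:
        raise ValueError("argument must be non-negative")
    if argument < 24:
        # Small arguments are carried in the additional information.
        return bytes([(major_type << 5) | argument])
    for additional, width in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * width):
            head = bytes([(major_type << 5) | additional])
            return head + argument.to_bytes(width, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_biguint(n: int) -> bytes:
    """Encode n as biguint = #6.2(bstr): tag 2 around a byte string
    holding the value in network byte order."""
    if n < 0:
        raise ValueError("biguint is unsigned")
    magnitude = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return encode_head(6, 2) + encode_head(2, len(magnitude)) + magnitude
```

   Here, encode_head(6, 37) yields the two bytes 0xd8 0x25 that would
   precede the byte string of a "buuid".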

3.7.  Unwrapping

   The group that is used to define a map or an array can often be
   reused in the definition of another map or array.  Similarly, a type
   defined as a tag carries an internal data item that one would like to
   refer to.  In these cases, it is expedient to simply use the name of
   the map, array, or tag type as a handle for the group or type defined
   inside it.

   The "unwrap" operator (written by preceding a name by a tilde
   character "~") can be used to strip the type defined for a name by
   one layer, exposing the underlying group (for maps and arrays) or
   type (for tags).

   For example, an application might want to define a basic header and
   an advanced header.  Without unwrapping, this might be done as
   follows:

             basic-header-group = (
               field1: int,
               field2: text,
             )

             basic-header = [ basic-header-group ]

             advanced-header = [
               basic-header-group,
               field3: bytes,
               field4: number, ; as in the tagged type "time"
             ]

   Unwrapping simplifies this to:

                            basic-header = [
                              field1: int,
                              field2: text,
                            ]

                            advanced-header = [
                              ~basic-header,
                              field3: bytes,
                              field4: ~time,
                            ]

   (Note that leaving out the first unwrap operator in the latter
   example would lead to nesting the basic-header in its own array
   inside the advanced-header, while, with the unwrapped basic-header,
   the definition of the group inside basic-header is essentially
   repeated inside advanced-header, leading to a single array.  This can
   be used for various applications often solved by inheritance in
   programming languages.  The effect of unwrapping can also be
   described as "threading in" the group or type inside the referenced
   type, which suggested the thread-like "~" character.)
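
   As a loose analogy only, the difference between using and omitting
   the unwrap operator resembles the difference between unpacking a
   list into another list and nesting it.  In the Python sketch below,
   the variable names and the (name, type) tuples are hypothetical
   stand-ins for the group entries.

```python
# Stand-in for: basic-header = [ field1: int, field2: text ]
basic_header = [("field1", "int"), ("field2", "text")]

# With "~basic-header": the entries are threaded into the enclosing
# array, giving a single array with four entries.  "~time" exposes the
# type inside the tag, written here simply as "number".
advanced_header_unwrapped = [
    *basic_header,
    ("field3", "bytes"),
    ("field4", "number"),
]

# Without "~": the whole basic-header array becomes one nested element,
# giving an outer array with three entries.
advanced_header_nested = [
    basic_header,
    ("field3", "bytes"),
    ("field4", "number"),
]
```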

3.8.  Controls

   A _control_ allows relating a _target_ type with a _controller_ type
   via a _control operator_.

   The syntax for a control type is "target .control-operator
   controller", where control operators are special identifiers prefixed
   by a dot.  (Note that _target_ or _controller_ might need to be
   parenthesized.)

   A number of control operators are defined at this point.  Further
   control operators may be defined by new versions of this
   specification or by registering them according to the procedures in
   Section 6.1.

3.8.1.  Control Operator .size

   A ".size" control constrains the size of the target, in bytes, to
   the value(s) given by the control type.  The control is defined for
   text and byte strings, where it directly controls the number of bytes
   in the string.  It is also defined for unsigned integers (see below).
   Figure 8 shows example usage for byte strings.

                   full-address = [[+ label], ip4, ip6]
                   ip4 = bstr .size 4
                   ip6 = bstr .size 16
                   label = bstr .size (1..63)

                    Figure 8: Control for Size in Bytes

   When applied to an unsigned integer, the ".size" control restricts
   the range of that integer by giving a maximum number of bytes that
   should be needed in a computer representation of that unsigned
   integer.  In other words, "uint .size N" is equivalent to
   "0...BYTES_N", where BYTES_N == 256**N.

     audio_sample = uint .size 3 ; 24-bit, equivalent to 0...16777216

                Figure 9: Control for Integer Size in Bytes
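
   The two uses of ".size" can be sketched as simple range checks.  The
   Python below is illustrative only; the function names are
   hypothetical.  Note that the "..." range excludes its upper end, so
   the largest value allowed by "uint .size 3" is 16777215.

```python
def uint_fits_size(value: int, size: int) -> bool:
    """uint .size N: equivalent to 0...256**N (upper end excluded)."""
    return 0 <= value < 256 ** size


def bstr_fits_size(data: bytes, lower: int, upper: int) -> bool:
    """bstr .size (lower..upper): length in bytes, both ends included.
    For a single size N, as in "bstr .size 4", pass lower == upper == N.
    """
    return lower <= len(data) <= upper
```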

   Note that, as with value restrictions in CDDL, this control is not a
   representation constraint; a number that fits into fewer bytes can
   still be represented in that form, and an inefficient implementation
   could use a longer form (unless that is restricted by some format
   constraints outside of CDDL, such as the rules in Section 3.9 of
   [RFC7049]).

3.8.2.  Control Operator .bits

   A ".bits" control on a byte string indicates that, in the target,
   only the bits numbered by a number in the control type are allowed to
   be set.  (Bits are counted the usual way, bit number "n" being set in
   "str" meaning that "(str[n >> 3] & (1 << (n & 7))) != 0".)
   Similarly, a ".bits" control on an unsigned integer "i" indicates
   that for all unsigned integers "n" where "(i & (1 << n)) != 0", "n"
   must be in the control type.

                      tcpflagbytes = bstr .bits flags
                      flags = &(
                        fin: 8,
                        syn: 9,
                        rst: 10,
                        psh: 11,
                        ack: 12,
                        urg: 13,
                        ece: 14,
                        cwr: 15,
                        ns: 0,
                      ) / (4..7) ; data offset bits

                      rwxbits = uint .bits rwx
                      rwx = &(r: 2, w: 1, x: 0)

                Figure 10: Control for What Bits Can Be Set

   The CDDL tool described in Appendix F generates the following ten
   example instances for "tcpflagbytes":

      h'906d' h'01fc' h'8145' h'01b7' h'013d' h'409f' h'018e' h'c05f'
      h'01fa' h'01fe'

   Note that the above CDDL specification does not constrain the size
   to two bytes, which these examples do not show: a valid all-clear
   instance of flag bytes could be "h''" or "h'00'" or even "h'000000'"
   as well.
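
   The bit-numbering rule given above can be applied directly.  The
   following Python sketch is illustrative only; the names
   "bstr_bits_ok", "uint_bits_ok", "TCP_FLAG_BITS", and "RWX_BITS" are
   hypothetical, and the allowed bit numbers are copied by hand from
   Figure 10.

```python
# Allowed bit numbers from Figure 10: the named flags plus (4..7).
TCP_FLAG_BITS = {8, 9, 10, 11, 12, 13, 14, 15, 0} | set(range(4, 8))
RWX_BITS = {2, 1, 0}


def bstr_bits_ok(data: bytes, allowed: set) -> bool:
    """bstr .bits: every bit that is set must be numbered in `allowed`.
    Bit n is set when (data[n >> 3] & (1 << (n & 7))) != 0.
    """
    for n in range(8 * len(data)):
        if data[n >> 3] & (1 << (n & 7)) and n not in allowed:
            return False
    return True


def uint_bits_ok(value: int, allowed: set) -> bool:
    """uint .bits: for every n with (value & (1 << n)) != 0,
    n must be in `allowed`."""
    n = 0
    while value >> n:
        if value & (1 << n) and n not in allowed:
            return False
        n += 1
    return True
```

   All ten generated instances above pass bstr_bits_ok with
   TCP_FLAG_BITS, whereas h'02' (bit 1 set) does not.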

3.8.3.  Control Operator .regexp

   A ".regexp" control indicates that the text string given as a target
   needs to match the XML Schema Definition (XSD) regular expression
   given as a value in the control type.  XSD regular expressions are
   defined in Appendix F of [W3C.REC-xmlschema-2-20041028].

     nai = tstr .regexp "[A-Za-z0-9]+@[A-Za-z0-9]+(\\.[A-Za-z0-9]+)+"

                   Figure 11: Control with an XSD regexp

   An example matching this regular expression:

                       "N1@CH57HF.4Znqe0.dYJRN.igjf"
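
   For a quick check of this particular expression, a sketch using
   Python's "re" module is shown below.  This is an approximation under
   a stated assumption: Python's regular expressions are not XSD
   regular expressions, but the expression in Figure 11 only uses
   constructs that behave the same in both.  The names "NAI_PATTERN"
   and "is_nai" are hypothetical.

```python
import re

# The CDDL text string "...(\\.[A-Za-z0-9]+)+" is first unescaped as a
# string, so the regular expression itself contains a single backslash.
NAI_PATTERN = re.compile(r"[A-Za-z0-9]+@[A-Za-z0-9]+(\.[A-Za-z0-9]+)+")


def is_nai(text: str) -> bool:
    """Approximate check of: nai = tstr .regexp "..." (see Figure 11)."""
    # An XSD regular expression has to match the entire string, which
    # corresponds to fullmatch() rather than search() or match().
    return NAI_PATTERN.fullmatch(text) is not None
```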

3.8.3.1.  Usage Considerations

   Note that XSD regular expressions do not support the usual \x or \u
   escapes for hexadecimal expression of bytes or Unicode code points.
   However, in CDDL the XSD regular expressions are contained in text
   strings, the literal notation for which provides \u escapes; this
   should suffice for most applications that use regular expressions for
   text strings.  (Note that this also means that there is one level of
   string escaping before the XSD escaping rules are applied.)

   XSD regular expressions support character class subtraction, a
   feature often not found in regular expression libraries;
   specification writers may want to use this feature sparingly.
   Similar considerations apply to Unicode character classes; where
   these are used, the specification that employs CDDL SHOULD identify
   which Unicode versions are addressed.

   Other surprises for infrequent users of XSD regular expressions may
   include the following:

   o  No direct support for case insensitivity.  While case
      insensitivity has gone mostly out of fashion in protocol design,
      it is sometimes needed and then needs to be expressed manually as
      in "[Cc][Aa][Ss][Ee]".

   o  The support for popular character classes such as \w and \d is
      based on Unicode character properties; this is often not what is
      desired in an ASCII-based protocol and thus might lead to
      surprises.  (\s and \S do have their more conventional meanings,
      and "." matches any character but the line-ending characters \r
      or \n.)

3.8.3.2.  Discussion

   There are many flavors of regular expression in use in the
   programming community.  For instance, Perl-Compatible Regular
   Expressions (PCREs) are widely used and probably are more useful than
   XSD regular expressions.  However, there is no normative reference
   for PCREs that could be used in the present document.  Instead, we
   opt for XSD regular expressions for now.  There is precedent for that
   choice in the IETF, e.g., in YANG [RFC7950].

   Note that CDDL uses controls as its main extension point.  This
   creates the opportunity to add further regular expression formats in
   addition to the one referenced here, if desired.  As an example, a
   proposal for a ".pcre" control is defined in [CDDL-Freezer].

3.8.4.  Control Operators .cbor and .cborseq

   A ".cbor" control on a byte string indicates that the byte string
   carries a CBOR-encoded data item.  Decoded, the data item matches the
   type given as the right-hand-side argument (type1 in the following
   example).

      "bytes .cbor type1"

   Similarly, a ".cborseq" control on a byte string indicates that the
   byte string carries a sequence of CBOR-encoded data items.  When the
   data items are taken as an array, the array matches the type given as
   the right-hand-side argument (type2 in the following example).

      "bytes .cborseq type2"

   (The conversion of the encoded sequence to an array can be effected,
   for instance, by wrapping the byte string between the two bytes 0x9f
   and 0xff and decoding the wrapped byte string as a CBOR-encoded
   data item.)
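
   The wrapping step can be sketched in Python as follows.  The sketch
   is illustrative only: the function names are hypothetical, and, to
   stay self-contained, the decoder deliberately handles only
   sequences whose items are unsigned integers in the range 0 to 23
   (each encoded in a single byte).  A real application would hand the
   wrapped bytes to a full CBOR decoder instead.

```python
def wrap_cbor_sequence(sequence: bytes) -> bytes:
    """Turn an encoded CBOR sequence into an indefinite-length array
    by wrapping it between 0x9f (array start) and 0xff (break)."""
    return b"\x9f" + sequence + b"\xff"


def decode_small_uint_array(data: bytes) -> list:
    """Decode a wrapped sequence of one-byte unsigned integers (0..23).

    Deliberately minimal: any other kind of data item is rejected.
    """
    if len(data) < 2 or data[0] != 0x9F or data[-1] != 0xFF:
        raise ValueError("not an indefinite-length array")
    items = []
    for byte in data[1:-1]:
        if byte > 0x17:
            raise ValueError("item outside the supported subset")
        items.append(byte)   # major type 0, value in the low five bits
    return items
```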

3.8.5.  Control Operators .within and .and

   A ".and" control on a type indicates that the data item matches both
   the left-hand-side type and the type given as the right-hand side.
   (Formally, the resulting type is the intersection of the two types
   given.)

      "type1 .and type2"

   A variant of the ".and" control is the ".within" control, which
   expresses an additional intent: the left-hand-side type is meant to
   be a subset of the right-hand-side type.

      "type1 .within type2"

   While both forms have the identical formal semantics (intersection),
   the intention of the ".within" form is that the right-hand side gives
   guidance to the types allowed on the left-hand side, which typically
   is a socket (Section 3.9):

        message = $message .within message-structure
        message-structure = [message_type, *message_option]
        message_type = 0..255
        message_option = any

        $message /= [3, dough: text, topping: [* text]]
        $message /= [4, noodles: text, sauce: text, parmesan: bool]

   For ".within", a tool might flag an error if type1 allows data items
   that are not allowed by type2.  In contrast, for ".and", there is no
   expectation that type1 is already a subset of type2.
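
   The shared semantics and the differing intent can be sketched by
   modeling types as finite sets of values.  This is a strong
   simplification made for illustration only (CDDL types are generally
   not finite), and the function names are hypothetical.

```python
def and_type(type1: frozenset, type2: frozenset) -> frozenset:
    """type1 .and type2: the intersection of the two types."""
    return type1 & type2


def within_type(type1: frozenset, type2: frozenset):
    """type1 .within type2: the same intersection, plus the values of
    type1 that fall outside type2, which a tool might flag as an error.
    """
    outside = type1 - type2
    return type1 & type2, outside
```

   For example, with type1 = frozenset({3, 4, 300}) and
   type2 = frozenset(range(256)), both controls allow {3, 4}, and
   within_type additionally reports {300} as falling outside type2.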

3.8.6.  Control Operators .lt, .le, .gt, .ge, .eq, .ne, and .default

   The controls .lt, .le, .gt, .ge, .eq, and .ne specify a constraint
   on the left-hand-side type to be a value less than, less than or
   equal to, greater than, greater than or equal to, equal to, or not
   equal to a value given as a right-hand-side type (containing just
   that single value).  In the present specification, the first four
   controls (.lt, .le, .gt, and .ge) are defined only for numeric types,
   as these have a natural ordering relationship.

                     speed = number .ge 0  ; unit: m/s

   .ne and .eq are defined for both numeric values and values of other
   types.  If one of the values is not of a numeric type, equality is
   determined as follows: text strings are equal (satisfy .eq / do not
   satisfy .ne) if they are bytewise identical; the same applies for
   byte strings.  Arrays are equal if they have the same number of
   elements, all of which are equal pairwise in order between the
   arrays.  Maps are equal if they have the same number of key/value
   pairs, and there is pairwise equality between the key/value pairs
   between the two maps.  Tagged values are equal if they both have the
   same tag and the values are equal.  Values of simple types match if
   they are the same values.  Numeric types that occur within arrays,
   maps, or tagged values are equal if their numeric value is equal and
   they are both integers or both floating-point values.  All other
   cases are not equal (e.g., comparing a text string with a byte
   string).
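
   The following sketch applies these rules to non-numeric values; the
   rule names are hypothetical:

      login = tstr .ne "root"          ; any text string but "root"
      moved = [number, number] .ne [0, 0]

   Under the rules above, the array [0, 1] satisfies "moved", while
   [0, 0] does not.  The array [0.0, 0.0] also satisfies "moved",
   because inside an array a floating-point value is never equal to an
   integer.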

   A variant of the ".ne" control is the ".default" control, which
   expresses an additional intent: the value specified by the
   right-hand-side type is intended as a default value for the
   left-hand-side type given, and the implied .ne control is there to
   prevent this value from being sent over the wire.  This control is
   only meaningful when the control type is used in an optional context;
   otherwise, there would be no way to make use of the default value.

               timer = {
                 time: uint,
                 ? displayed-step: (number .gt 0) .default 1
               }
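
   As a further sketch (the rule and member names are hypothetical), a
   port number that defaults to 443 could be written as:

               connection = {
                 host: tstr,
                 ? port: (1..65535) .default 443
               }

   Here, a map containing only the "host" member implies port 443,
   while a map that carries the value 443 for "port" explicitly does
   not match, because of the implied .ne control.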

3.9.  Socket/Plug

   For both type choices and group choices, a mechanism is defined that
   facilitates starting out with empty choices and assembling them
   later, potentially in separate files that are concatenated to build
   the full specification.

   Per convention, CDDL extension points are marked with a leading
   dollar sign (types) or two leading dollar signs (groups).  Tools
   honor that convention by not raising an error if such a type or group
   is not defined at all; the symbol is then taken to be an empty type
   choice (group choice), i.e., no choice is available.

            tcp-header = {seq: uint, ack: uint, * $$tcp-option}

            ; later, in a different file

            $$tcp-option //= (
            sack: [+(left: uint, right: uint)]
            )

            ; and, maybe in another file

            $$tcp-option //= (
            sack-permitted: true
            )

   Names that start with a single "$" are "type sockets", starting out
   as an empty type, and intended to be extended via "/=".  Names that
   start with a double "$$" are "group sockets", starting out as an
   empty group choice, and intended to be extended via "//=".  In either
   case, it is not an error if there is no definition for a socket at
   all; this then means there is no way to satisfy the rule (i.e., the
   choice is empty).

   As a convention, all definitions (plugs) for socket names must be
   augmentations, i.e., they must use "/=" for type sockets and "//="
   for group sockets.
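
   The same pattern applies to type sockets.  In the following sketch
   (the names are hypothetical), "shape" accepts no data item until at
   least one plug for $shape has been supplied:

            shape = $shape

            ; later, possibly in a different file

            $shape /= "circle"
            $shape /= "square"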

   To pick up the example illustrated in Figure 7, the socket/plug
   mechanism could be used as shown in Figure 12:

                     PersonalData = {
                       ? displayName: tstr,
                       NameComponents,
                       ? age: uint,
                       * $$personaldata-extensions
                     }

                     NameComponents = (
                       ? firstName: tstr,
                       ? familyName: tstr,
                     )

                     ; The above already works as is.
                     ; But then, we can add later:

                     $$personaldata-extensions //= (
                       favorite-salsa: tstr,
                     )

                     ; and again, somewhere else:

                     $$personaldata-extensions //= (
                       shoesize: uint,
                     )

     Figure 12: Personal Data Example: Using Socket/Plug Extensibility

3.10.  Generics

   Using angle brackets, the left-hand side of a rule can add formal
   parameters after the name being defined, as in:

      messages = message<"reboot", "now"> / message<"sleep", 1..100>
      message<t, v> = {type: t, value: v}

   When using a generic rule, the formal parameters are bound to the
   actual arguments supplied (also using angle brackets), within the
   scope of the generic rule (as if there were a rule of the form
   parameter = argument).
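
   With this binding, the "messages" rule above is equivalent to the
   following expansion, written out here only for illustration (the
   name "messages-expanded" is hypothetical):

      messages-expanded = {type: "reboot", value: "now"} /
                          {type: "sleep", value: 1..100}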

   Generic rules can be used for establishing names for both types and
   groups.
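
   As a sketch of both uses (all names are hypothetical), a generic
   type and a generic group might look like this:

      nullable<t> = t / null             ; generic type
      pair<k, v> = (key: k, value: v)    ; generic group

      entry = {pair<tstr, int>, ? note: nullable<tstr>}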

   (At this time, there are some limitations to the nesting of generics
   in the CDDL tool described in Appendix F.)

3.11.  Operator Precedence

   As with any language that has multiple syntactic features such as
   prefix and infix operators, CDDL has operators that bind more tightly
   than others.  This is more complicated than in, say, ABNF,
   as CDDL has both types and groups, with operators that are specific
   to these concepts.  Type operators (such as "/" for type choice)
   operate on types, while group operators (such as "//" for group
   choice) operate on groups.  Types can simply be used in groups, but
   groups need to be bracketed (as arrays or maps) to become types.  So,
   type operators naturally bind closer than group operators.

   For instance, in

      t = [group1]
      group1 = (a / b // c / d)
      a = 1 b = 2 c = 3 d = 4

   group1 is a group choice between the type choice of a and b and the
   type choice of c and d.  This becomes more relevant once member keys
   and/or occurrences are added in:

      t = {group2}
      group2 = (? ab: a / b // cd: c / d)
      a = 1 b = 2 c = 3 d = 4

   Here, group2 is a group choice between the optional member "ab" of
   type a or b and the member "cd" of type c or d.  Note that the
   optionality is attached to the first choice ("ab"), not to the second
   choice.
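
   Written with explicit parentheses, group2 reads as follows (the
   name "group2a" is hypothetical and used only for illustration):

      group2a = ((? ab: (a / b)) // (cd: (c / d)))

   Under this reading, {group2a} matches the maps {}, {"ab": 1},
   {"ab": 2}, {"cd": 3}, and {"cd": 4}.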

   Similarly, in

      t = [group3]
      group3 = (+ a / b / c)
      a = 1 b = 2 c = 3

   group3 is a repetition of a type choice between a, b, and c; if just
   a is to be repeatable, a group choice is needed to focus the
   occurrence:

      t = [group4]
      group4 = (+ a // b / c)
      a = 1 b = 2 c = 3

   group4 is a group choice between a repeatable a and a single b or c.
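
   To make the difference concrete, group3 can be written with
   explicit parentheses as follows (the name "group3a" is hypothetical):

      group3a = (+ (a / b / c))

   With the values given above, [1], [2, 3], and [3, 1, 1] all match
   [group3a], whereas [group4] matches [1], [1, 1, 1], [2], and [3],
   but not [1, 2].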

   It has been noted that the semantics of group3 could be
   counterintuitive.  In general, as with many other languages with
   operator precedence rules, the specification writer is encouraged not
   to rely on them, but to insert parentheses liberally to guide readers
   that are not familiar with CDDL precedence rules:

      t = [group4a]
      group4a = ((+ a) // (b / c))
      a = 1 b = 2 c = 3

   The operator precedences, in sequence of loose to tight binding, are
   defined in Appendix B and summarized in Table 1.  (Arities given are
   1 for unary prefix operators and 2 for binary infix operators.)

       +----------+-------+---------------------------+------------+
       | Operator | Arity | Operates on               | Precedence |
       +----------+-------+---------------------------+------------+
       |    =     |   2   | name = type, name = group |     1      |
       |    /=    |   2   | name /= type              |     1      |
       |   //=    |   2   | name //= group            |     1      |
       |    //    |   2   | group // group            |     2      |
       |    ,     |   2   | group, group              |     3      |
       |    *     |   1   | * group                   |     4      |
       |   n*m    |   1   | n*m group                 |     4      |
       |    +     |   1   | + group                   |     4      |
       |    ?     |   1   | ? group                   |     4      |
       |    =>    |   2   | type => type              |     5      |
       |    :     |   2   | name: type                |     5      |
       |    /     |   2   | type / type               |     6      |
       |    ..    |   2   | type..type                |     7      |
       |   ...    |   2   | type...type               |     7      |
       |  .ctrl   |   2   | type .ctrl type           |     7      |
       |    &     |   1   | &group                    |     8      |
       |    ~     |   1   | ~type                     |     8      |
       +----------+-------+---------------------------+------------+

                 Table 1: Summary of Operator Precedences
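
   As a sketch of how Table 1 applies (the rule names are
   hypothetical), each pair of rules below describes the same type;
   the second rule of each pair just makes the grouping explicit:

      id  = uint / tstr .size 4
      id2 = uint / (tstr .size 4)

      counters  = {* tstr => uint / bool}
      counters2 = {* (tstr => (uint / bool))}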
