The International Object Identifier tree [X.660] is a hierarchically managed space of identifiers, each of which is uniquely represented as a sequence of unsigned integer values [X.680]. (These integer values are called "primary integer values" in [X.660] because they can be accompanied by (not necessarily unambiguous) secondary identifiers. We ignore the latter and simply use the term "integer values" here, occasionally calling out their unsignedness. We also use the term "arc" when the focus is on the edge of the tree labeled by such an integer value, as well as in the sense of a "long arc", i.e., a (sub)sequence of such integer values.)
While these sequences can easily be represented in CBOR arrays of unsigned integers, a more compact representation can often be achieved by adopting the widely used representation of object identifiers defined in BER; this representation may also be more amenable to processing by other software that makes use of object identifiers.
BER represents the sequence of unsigned integers by concatenating self-delimiting representations [RFC 6256] of each of the integer values in sequence.
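As an illustration of the self-delimiting representation, the following minimal Python sketch encodes one unsigned integer value in base 128, most significant group first, with the most significant bit set on every byte except the last. The function name encode_base128 is hypothetical and not part of this specification.

```python
def encode_base128(value: int) -> bytes:
    """Encode one unsigned integer value as a self-delimiting byte sequence.

    Hypothetical helper for illustration only.
    """
    if value < 0:
        raise ValueError("integer values in an object identifier are unsigned")
    # The least significant 7-bit group comes last and has its top bit clear.
    groups = [value & 0x7F]
    value >>= 7
    while value:
        # Every earlier group has its most significant bit set.
        groups.append(0x80 | (value & 0x7F))
        value >>= 7
    return bytes(reversed(groups))


def encode_sequence(values) -> bytes:
    """Concatenate the encodings of each integer value in sequence."""
    return b"".join(encode_base128(v) for v in values)
```

For example, under these assumptions the integer value 840 is encoded as the two bytes 0x86 0x48, and 113549 as 0x86 0xf7 0x0d.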
ASN.1 distinguishes absolute object identifiers (ASN.1 type OBJECT IDENTIFIER), which begin at a root arc ([X.660], Clause 3.5.21), from relative object identifiers (ASN.1 type RELATIVE-OID), which begin relative to some object identifier known from context ([X.680], Clause 3.8.63). As a special optimization, BER combines the first two integers in an absolute object identifier into one numeric identifier by making use of the property of the hierarchy that the first arc has only three integer values (0, 1, and 2) and the second arcs under 0 and 1 are limited to the integer values between 0 and 39. (The root arc joint-iso-itu-t(2) has no such limitations on its second arc.) If X and Y are the first two integer values, the single integer value actually encoded is computed as:
X * 40 + Y
The inverse transformation (again making use of the known ranges of X and Y) is applied when decoding the object identifier.
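The combination and its inverse can be sketched in Python as follows. The function names combine_first_arcs and split_first_arcs are hypothetical; the range checks follow the limits on X and Y stated above.

```python
from typing import Tuple


def combine_first_arcs(x: int, y: int) -> int:
    """Combine the first two integer values of an absolute OID (sketch)."""
    if x not in (0, 1, 2):
        raise ValueError("the first arc must be 0, 1, or 2")
    if y < 0:
        raise ValueError("integer values are unsigned")
    if x < 2 and y > 39:
        raise ValueError("second arcs under 0 and 1 are limited to 0..39")
    return x * 40 + y


def split_first_arcs(n: int) -> Tuple[int, int]:
    """Recover (X, Y) from the combined value using the known ranges."""
    if n < 0:
        raise ValueError("integer values are unsigned")
    if n < 80:
        # X is 0 or 1, so Y is in the range 0..39.
        return n // 40, n % 40
    # Everything from 80 upward belongs to root arc 2, whose second arc
    # is not limited.
    return 2, n - 80
```

For instance, the first two arcs of 1.3.6.1.4.1 combine to 43 (0x2b), and a combined value of 1079 decodes to the arcs 2 and 999.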
Since the semantics of absolute and relative object identifiers differ and since it is very common for companies to use self-assigned numbers under the arc 1.3.6.1.4.1 (IANA Private Enterprise Number OID [IANA.enterprise-numbers]), a prefix that adds 5 fixed bytes to an encoded OID value, this specification defines three tags, collectively called the "OID tags" here:
- Tag number 111: Used to tag a byte string as the BER encoding [X.690] of an absolute object identifier (simply "object identifier" or "OID").
- Tag number 110: Used to tag a byte string as the BER encoding [X.690] of a relative object identifier (also called "relative OID"). Since the encoding of each number is the same as for Self-Delimiting Numeric Values (SDNVs) [RFC 6256], this tag can also be used for tagging a byte string that contains a sequence of zero or more SDNVs (or a more application-specific tag can be created for such an application).
- Tag number 112: Structurally like tag 110 but understood to be relative to 1.3.6.1.4.1 (IANA Private Enterprise Number OID [IANA.enterprise-numbers]). Hence, the semantics of the result are that of an absolute object identifier.
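To show how such a tagged item looks on the wire, the following Python sketch wraps already-encoded BER contents in a CBOR byte string (major type 2) and prefixes it with one of the OID tags (major type 6), using the head encoding of RFC 8949. The helper names cbor_head and tag_oid are hypothetical, and the sketch performs no validation of the contents.

```python
OID_TAGS = (110, 111, 112)


def cbor_head(major_type: int, argument: int) -> bytes:
    """Encode a CBOR head with the shortest argument form (sketch)."""
    initial = major_type << 5
    if argument < 24:
        return bytes([initial | argument])
    for additional_info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if argument < 1 << (8 * size):
            return bytes([initial | additional_info]) + argument.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def tag_oid(tag: int, content: bytes) -> bytes:
    """Wrap BER contents in a definite-length byte string under an OID tag."""
    if tag not in OID_TAGS:
        raise ValueError("not one of the OID tags")
    return cbor_head(6, tag) + cbor_head(2, len(content)) + content
```

With these assumptions, the absolute OID 2.16.840.1.101.3.4.2.1, whose BER contents are h'608648016503040201', is serialized under tag 111 as the 12 bytes d8 6f 49 60 86 48 01 65 03 04 02 01.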
To form a valid tag, a byte string tagged with 111, 110, or 112 MUST be syntactically valid contents (the value part) for a BER representation of an object identifier (see Table 1):
| Tag number | Section of [X.690] |
| 111        | 8.19               |
| 110        | 8.20               |
| 112        | 8.20               |
Table 1: Tag Number and Section of X.690 Governing Tag Content
This is a concatenation of zero or more SDNV values, where each SDNV value is a sequence of one or more bytes that all have their most significant bit set, except for the last byte, where it is unset. Also, the first byte of each SDNV cannot be a leading zero in SDNV's base-128 arithmetic, so it cannot take the value 0x80 (bullet (c) in Section 8.1.2.4.2 of [X.690]).
In other words:
- The byte string's first byte, and any byte that follows a byte that has the most significant bit unset, MUST NOT be 0x80 (this requires the integer values to be expressed in their shortest form, with no leading zeroes).
- The byte string's last byte MUST NOT have the most significant bit set (this requirement excludes an incomplete final integer value).
If either of these invalid conditions is encountered, the tag is invalid.
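The two conditions above can be checked in a single pass over the byte string, as in the following Python sketch. The function name has_valid_sdnv_structure is hypothetical. The sketch checks only these two conditions; it does not by itself reject an empty byte string, which the regular expression given for tag 111 excludes.

```python
def has_valid_sdnv_structure(content: bytes) -> bool:
    """Check the two structural conditions on OID tag content (sketch)."""
    # True whenever the next byte would begin a new integer value, i.e., at
    # the start of the string and after any byte with the top bit unset.
    at_value_start = True
    for byte in content:
        if at_value_start and byte == 0x80:
            # A leading zero group: the value is not in its shortest form.
            return False
        at_value_start = byte < 0x80
    # If the last byte had its top bit set, the final value is incomplete.
    return at_value_start
```

Under these assumptions, h'2b06010401' passes the check, while h'8001' (leading zero group) and h'2b86' (incomplete final integer value) do not.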
[X.680] restricts RELATIVE-OID values to having at least one arc, i.e., their encoding would have at least one SDNV. This specification permits empty relative object identifiers; they may still be excluded by application semantics.
To facilitate the search for specific object ID values, it is RECOMMENDED that definite length encoding (see Section 3.2.3 of RFC 8949) be used for the byte strings that are used as tag content for these tags.
The valid set of byte strings can also be expressed using regular expressions on bytes, using no specific notation but resembling Perl Compatible Regular Expressions [PCRE]. Unlike typical regular expressions that operate on character sequences, the following regular expressions take bytes as their domain, so they can be applied directly to CBOR byte strings.
For byte strings with tag 111:
/^(([\x81-\xFF][\x80-\xFF]*)?[\x00-\x7F])+$/
For byte strings with tags 110 or 112:
/^(([\x81-\xFF][\x80-\xFF]*)?[\x00-\x7F])*$/
A tag with tagged content that does not conform to the applicable regular expression is invalid.
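As one possible way to apply these expressions, Python's re module accepts bytes patterns, so the two expressions can be transcribed almost literally. In the sketch below, the names ABSOLUTE_OID, RELATIVE_OID, and matches_oid_regex are hypothetical; the ^ and $ anchors are dropped in favor of fullmatch, which is an implementation choice made here to require that the whole byte string match.

```python
import re

# One integer value: an optional run of continuation bytes that does not
# start with 0x80, followed by a final byte with the top bit unset.
_VALUE = rb"(?:[\x81-\xFF][\x80-\xFF]*)?[\x00-\x7F]"

ABSOLUTE_OID = re.compile(rb"(?:" + _VALUE + rb")+")  # tag 111
RELATIVE_OID = re.compile(rb"(?:" + _VALUE + rb")*")  # tags 110 and 112


def matches_oid_regex(tag: int, content: bytes) -> bool:
    """Return True if content conforms to the expression for the given tag."""
    if tag == 111:
        pattern = ABSOLUTE_OID
    elif tag in (110, 112):
        pattern = RELATIVE_OID
    else:
        raise ValueError("not one of the OID tags")
    return pattern.fullmatch(content) is not None
```

Note that the only difference between the two patterns is the final quantifier: an empty byte string conforms for tags 110 and 112 but not for tag 111.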
For an absolute OID with a prefix of 1.3.6.1.4.1, representations with both the 111 and 112 tags are applicable, where the representation with 112 will be five bytes shorter (by leaving out the prefix h'2b06010401' from the enclosed byte string). This specification makes that shorter representation the preferred serialization (see Sections 3.4 and 4.1 of [RFC 8949]). Note that this also implies that the Core Deterministic Encoding Requirements (Section 4.2.1 of RFC 8949) require the use of 112 tags instead of 111 tags wherever that is possible.
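An encoder might select the preferred tag for an absolute OID along the following lines. This is a minimal Python sketch; the names PEN_PREFIX and preferred_oid_tag are hypothetical, and the input is assumed to be already valid BER contents of an absolute object identifier.

```python
from typing import Tuple

# BER contents of the arc 1.3.6.1.4.1 (five single-byte integer values).
PEN_PREFIX = bytes.fromhex("2b06010401")


def preferred_oid_tag(absolute_content: bytes) -> Tuple[int, bytes]:
    """Choose between tags 111 and 112 for an absolute OID (sketch).

    Returns the tag number and the byte string to enclose in it.
    """
    if absolute_content.startswith(PEN_PREFIX):
        # Every prefix byte has its top bit unset, so a byte-level match
        # is also a match on whole integer values.
        return 112, absolute_content[len(PEN_PREFIX):]
    return 111, absolute_content
```

Because each byte of the prefix is a complete single-byte integer value, comparing bytes is sufficient here; no decoding into arcs is needed.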
Staying close to the way object identifiers are encoded in ASN.1 BER makes back-and-forth translation easy; otherwise, we would choose a more efficient encoding. Object identifiers in IETF protocols are serialized in dotted decimal form or BER form, so there is an advantage in not inventing a third form. Also, expectations of the cost of encoding object identifiers are based on BER; using a different encoding might not be aligned with these expectations. If additional information about an OID is desired, lookup services such as [X.672] and [OID-INFO] are available.