Network Working Group R. Finking Request for Comments: 4997 Siemens/Roke Manor Research Category: Standards Track G. Pelletier Ericsson July 2007 Formal Notation for RObust Header Compression (ROHC-FN) Status of This Memo This document specifies an Internet standards track protocol for the Internet community, and requests discussion and suggestions for improvements. Please refer to the current edition of the "Internet Official Protocol Standards" (STD 1) for the standardization state and status of this protocol. Distribution of this memo is unlimited. Copyright Notice Copyright (C) The IETF Trust (2007).Abstract
This document defines Robust Header Compression - Formal Notation (ROHC-FN), a formal notation to specify field encodings for compressed formats when defining new profiles within the ROHC framework. ROHC-FN offers a library of encoding methods that are often used in ROHC profiles and can thereby help to simplify future profile development work.Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 3. Overview of ROHC-FN . . . . . . . . . . . . . . . . . . . . . 5 3.1. Scope of the Formal Notation . . . . . . . . . . . . . . . 6 3.2. Fundamentals of the Formal Notation . . . . . . . . . . . 7 3.2.1. Fields and Encodings . . . . . . . . . . . . . . . . . 7 3.2.2. Formats and Encoding Methods . . . . . . . . . . . . . 9 3.3. Example Using IPv4 . . . . . . . . . . . . . . . . . . . . 11 4. Normative Definition of ROHC-FN . . . . . . . . . . . . . . . 13 4.1. Structure of a Specification . . . . . . . . . . . . . . . 13 4.2. Identifiers . . . . . . . . . . . . . . . . . . . . . . . 14 4.3. Constant Definitions . . . . . . . . . . . . . . . . . . . 15 4.4. Fields . . . . . . . . . . . . . . . . . . . . . . . . . . 16 4.4.1. Attribute References . . . . . . . . . . . . . . . . . 17 4.4.2. Representation of Field Values . . . . . . . . . . . . 17
4.5. Grouping of Fields . . . . . . . . . . . . . . . . . . . . 17 4.6. "THIS" . . . . . . . . . . . . . . . . . . . . . . . . . . 18 4.7. Expressions . . . . . . . . . . . . . . . . . . . . . . . 19 4.7.1. Integer Literals . . . . . . . . . . . . . . . . . . . 20 4.7.2. Integer Operators . . . . . . . . . . . . . . . . . . 20 4.7.3. Boolean Literals . . . . . . . . . . . . . . . . . . . 20 4.7.4. Boolean Operators . . . . . . . . . . . . . . . . . . 20 4.7.5. Comparison Operators . . . . . . . . . . . . . . . . . 21 4.8. Comments . . . . . . . . . . . . . . . . . . . . . . . . . 21 4.9. "ENFORCE" Statements . . . . . . . . . . . . . . . . . . . 22 4.10. Formal Specification of Field Lengths . . . . . . . . . . 23 4.11. Library of Encoding Methods . . . . . . . . . . . . . . . 24 4.11.1. uncompressed_value . . . . . . . . . . . . . . . . . . 24 4.11.2. compressed_value . . . . . . . . . . . . . . . . . . . 25 4.11.3. irregular . . . . . . . . . . . . . . . . . . . . . . 26 4.11.4. static . . . . . . . . . . . . . . . . . . . . . . . . 27 4.11.5. lsb . . . . . . . . . . . . . . . . . . . . . . . . . 27 4.11.6. crc . . . . . . . . . . . . . . . . . . . . . . . . . 29 4.12. Definition of Encoding Methods . . . . . . . . . . . . . . 29 4.12.1. Structure . . . . . . . . . . . . . . . . . . . . . . 30 4.12.2. Arguments . . . . . . . . . . . . . . . . . . . . . . 37 4.12.3. Multiple Formats . . . . . . . . . . . . . . . . . . . 38 4.13. Profile-Specific Encoding Methods . . . . . . . . . . . . 40 5. Security Considerations . . . . . . . . . . . . . . . . . . . 41 6. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 41 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 41 8. References . . . . . . . . . . . . . . . . . . . . . . . . . . 42 8.1. Normative References . . . . . . . . . . . . . . . . . . . 42 8.2. Informative References . . . . . . . . . . . . . . . . . . 42 Appendix A. Formal Syntax of ROHC-FN . . . . . . . . . . . . . . 43 Appendix B. Bit-level Worked Example . . . . . . . . . . . . . . 45 B.1. Example Packet Format . . . . . . . . . . . . . . . . . . 45 B.2. Initial Encoding . . . . . . . . . . . . . . . . . . . . . 46 B.3. Basic Compression . . . . . . . . . . . . . . . . . . . . 47 B.4. Inter-Packet Compression . . . . . . . . . . . . . . . . . 48 B.5. Specifying Initial Values . . . . . . . . . . . . . . . . 50 B.6. Multiple Packet Formats . . . . . . . . . . . . . . . . . 51 B.7. Variable Length Discriminators . . . . . . . . . . . . . . 53 B.8. Default Encoding . . . . . . . . . . . . . . . . . . . . . 55 B.9. Control Fields . . . . . . . . . . . . . . . . . . . . . . 56 B.10. Use of "ENFORCE" Statements as Conditionals . . . . . . . 59
1. Introduction
Robust Header Compression - Formal Notation (ROHC-FN) is a formal notation designed to help with the definition of ROHC [RFC4995] header compression profiles. Previous header compression profiles have been so far specified using a combination of English text together with ASCII Box notation. Unfortunately, this was sometimes unclear and ambiguous, revealing the limitations of defining complex structures and encodings for compressed formats this way. The primary objective of the Formal Notation is to provide a more rigorous means to define header formats -- compressed and uncompressed -- as well as the relationships between them. No other formal notation exists that meets these requirements, so ROHC-FN aims to meet them. In addition, ROHC-FN offers a library of encoding methods that are often used in ROHC profiles, so that the specification of new profiles using the formal notation can be achieved without having to redefine this library from scratch. Informally, an encoding method defines a two-way mapping between uncompressed data and compressed data.2. Terminology
o Compressed format A compressed format consists of a list of fields that provides bindings between encodings and the fields it compresses. One or more compressed formats can be combined to represent an entire compressed header format. o Context Context is information about the current (de)compression state of the flow. Specifically, a context for a specific field can be either uninitialised, or it can include a set of one or more values for the field's attributes defined by the compression algorithm, where a value may come from the field's attributes corresponding to a previous packet. See also a more generalized definition in Section 2.2 of [RFC4995]. o Control field Control fields are transmitted from a ROHC compressor to a ROHC decompressor, but are not part of the uncompressed header itself.
o Encoding method, encodings Encoding methods are two-way relations that can be applied to compress and decompress fields of a protocol header. o Field The protocol header is divided into a set of contiguous bit patterns known as fields. Each field is defined by a collection of attributes that indicate its value and length in bits for both the compressed and uncompressed headers. The way the header is divided into fields is specific to the definition of a profile, and it is not necessary for the field divisions to be identical to the ones given by the specification(s) for the protocol header being compressed. o Library of encoding methods The library of encoding methods contains a number of commonly used encoding methods for compressing header fields. o Profile A ROHC [RFC4995] profile is a description of how to compress a certain protocol stack. Each profile consists of a set of formats (for example, uncompressed and compressed formats) along with a set of rules that control compressor and decompressor behaviour. o ROHC-FN specification The specification of the set of formats of a ROHC profile using ROHC-FN. o Uncompressed format An uncompressed format consists of a list of fields that provides the order of the fields to be compressed for a contiguous set of bits whose bit layout corresponds to the protocol header being compressed.3. Overview of ROHC-FN
This section gives an overview of ROHC-FN. It also explains how ROHC-FN can be used to specify the compression of header fields as part of a ROHC profile.
3.1. Scope of the Formal Notation
This section explains how the formal notation relates to the ROHC framework and to specifications of ROHC profiles. The ROHC framework [RFC4995] provides the general principles for performing robust header compression. It defines the concept of a profile, which makes ROHC a general platform for different compression schemes. It sets link layer requirements, and in particular negotiation requirements, for all ROHC profiles. It defines a set of common functions such as Context Identifiers (CIDs), padding, and segmentation. It also defines common formats (IR, IR- DYN, Feedback, Add-CID, etc.), and finally it defines a generic, profile independent, feedback mechanism. A ROHC profile is a description of how to compress a certain protocol stack. For example, ROHC profiles are available for RTP/UDP/IP and many other protocol stacks. At a high level, each ROHC profile consists of a set of formats (defining the bits to be transmitted) along with a set of rules that control compressor and decompressor behaviour. The purpose of the formats is to define how to compress and decompress headers. The formats define one or more compressed versions of each uncompressed header, and simultaneously define the inverse: how to relate a compressed header back to the original uncompressed header. The set of formats will typically define compression of headers relative to a context of field values from previous headers in a flow, improving the overall compression by taking into account redundancies between headers of successive packets. Therefore, in addition to defining the formats, a profile has to: o specify how to manage the context for both the compressor and the decompressor, o define when and what to send in feedback messages, if any, from decompressor to compressor, o outline compression principles to make the profile robust against bit errors and dropped packets. All this is needed to ensure that the compressor and decompressor contexts are kept consistent with each other, while still facilitating the best possible compression performance. The ROHC-FN is designed to help in the specification of compressed formats that, when put together based on the profile definition, make
up the formats used in a ROHC profile. It offers a library of encoding methods for compressing fields, and a mechanism for combining these encoding methods to create compressed formats tailored to a specific protocol stack. The scope of ROHC-FN is limited to specifying the relationship between the compressed and uncompressed formats. To form a complete profile specification, the control logic for the profile behaviour needs to be defined by other means.3.2. Fundamentals of the Formal Notation
There are two fundamental elements to the formal notation: 1. Fields and their encodings, which define the mapping between a header's uncompressed and compressed forms. 2. Encoding methods, which define the way headers are broken down into fields. Encoding methods define lists of uncompressed fields and the lists of compressed fields they map onto. These two fundamental elements are at the core of the notation and are outlined below.3.2.1. Fields and Encodings
Headers are made up of fields. For example, version number, header length, and sequence number are all fields used in real protocols. Fields have attributes. Attributes describe various things about the field. For example: field.ULENGTH The above indicates the uncompressed length of the field. A field is said to have a value attribute, i.e., a compressed value or an uncompressed value, if the corresponding length attribute is greater than zero. See Section 4.4 for more details on field attributes. The relationship between the compressed and uncompressed attributes of a field are specified with encoding methods, using the following notation: field =:= encoding_method; In the field definition above, the symbol "=:=" means "is encoded by". This field definition does not represent an assignment operation from the right hand side to the left side. Instead, it is
a two-way mapping between the compressed and uncompressed attributes of the field. It both represents the compression and the decompression operation in a single field definition, through a process of two-way matching. Two-way matching is a binary operation that attempts to make the operands (i.e., the compressed and uncompressed attributes) match. This is similar to the unification process in logic. The operands represent one unspecified data object and one specified object. Values can be matched from either operand. During compression, the uncompressed attributes of the field are already defined. The given encoding matches the compressed attributes against them. During decompression, the compressed attributes of the field are already defined, so the uncompressed attributes are matched to the compressed attributes using the given encoding method. Thus, both compression and decompression are defined by a single field definition. Therefore, an encoding method (including any parameters specified) creates a reversible binding between the attributes of a field. At the compressor, a format can be used if a set of bindings that is successful for all the attributes in all its fields can be found. At the decompressor, the operation is reversed using the same bindings and the attributes in each field are filled according to the specified bindings; decoding fails if the binding for an attribute fails. For example, the "static" encoding method creates a binding between the attribute corresponding to the uncompressed value of the field and the corresponding value of the field in the context. o For the compressor, the "static" binding is successful when both the context value and the uncompressed value are the same. If the two values differ then the binding fails. o For the decompressor, the "static" binding succeeds only if a valid context entry containing the value of the uncompressed field exists. Otherwise, the binding will fail. Both the compressed and uncompressed forms of each field are represented as a string of bits; the most significant bit first, of the length specified by the length attribute. The bit string is the binary representation of the value attribute of the field, modulo "2^length", where "length" is the length attribute of the field. However, this is only the representation of the bits exchanged between the compressor and the decompressor, designed to allow
maximum compression efficiency. The FN itself uses the full range of integers. See Section 4.4.2 for further details.3.2.2. Formats and Encoding Methods
The ROHC-FN provides a library of commonly used encoding methods. Encoding methods can be defined using plain English, or using a formal definition consisting of, for example, a collection of expressions (Section 4.7) and "ENFORCE" statements (Section 4.9). ROHC-FN also provides mechanisms for combining fields and their encoding methods into higher level encoding methods following a well- defined structure. This is similar to the definition of functions and procedures in an ordinary programming language. It allows complexity to be handled by being broken down into manageable parts. New encoding methods are defined at the top level of a profile. These can then be used in the definition of other higher level encoding methods, and so on. new_encoding_method // This block is an encoding method { UNCOMPRESSED { // This block is an uncompressed format field_1 [ 16 ]; field_2 [ 32 ]; field_3 [ 48 ]; } CONTROL { // This block defines control fields ctrl_field_1; ctrl_field_2; } DEFAULT { // This block defines default encodings // for specified fields ctrl_field_2 =:= encoding_method_2; field_1 =:= encoding_method_1; } COMPRESSED format_0 { // This block is a compressed format field_1; field_2 =:= encoding_method_2; field_3 =:= encoding_method_3; ctrl_field_1 =:= encoding_method_4; ctrl_field_2; }
COMPRESSED format_1 { // This block is a compressed format field_1; field_2 =:= encoding_method_3; field_3 =:= encoding_method_4; ctrl_field_2 =:= encoding_method_5; ctrl_field_3 =:= encoding_method_6; // This is a control field // with no uncompressed value } } In the example above, the encoding method being defined is called "new_encoding_method". The section headed "UNCOMPRESSED" indicates the order of fields in the uncompressed header, i.e., the uncompressed header format. The number of bits in each of the fields is indicated in square brackets. After this is another section, "CONTROL", which defines two control fields. Following this is the "DEFAULT" section which defines default encoding methods for two of the fields (see below). Finally, two alternative compressed formats follow, each defined in sections headed "COMPRESSED". The fields that occur in the compressed formats are either: o fields that occur in the uncompressed format; or o control fields that have an uncompressed value and that occur in the CONTROL section; or o control fields that do not have an uncompressed value and thus are defined as part of the compressed format. Central to each of these formats is a "field list", which defines the fields contained in the format and also the order that those fields appear in that format. For the "DEFAULT" and "CONTROL" sections, the field order is not significant. In addition to specifying field order, the field list may also specify bindings for any or all of the fields it contains. Fields that have no bindings defined for them are bound using the default bindings specified in the "DEFAULT" section (see Section 4.12.1.5). Fields from the compressed format have the same name as they do in the uncompressed format. If there are any fields that are present exclusively in the compressed format, but that do have an uncompressed value, they must be declared in the "CONTROL" section of the definition of the encoding method (see Section 4.12.1.3 for more details on defining control fields). Fields that have no uncompressed value do not appear in an "UNCOMPRESSED" field list and do not have to appear in the "CONTROL"
field list either. Instead, they are only declared in the compressed field lists where they are used. In the example above, all the fields that appear in the compressed format are also found in the uncompressed format, or the control field list, except for ctrl_field_3; this is possible because ctrl_field_3 has no "uncompressed" value at all. Fields such as a checksum on the compressed information fall into this category.3.3. Example Using IPv4
This section gives an overview of how the notation is used by means of an example. The example will develop the formal notation for an encoding method capable of compressing a single, well-known header: the IPv4 header [RFC791]. The first step is to specify the overall structure of the IPv4 header. To do this, we use an encoding method that we will call "ipv4_header". More details on definitions of encoding methods can be found in Section 4.12. This is notated as follows: ipv4_header { The fragment of notation above declares the encoding method "ipv4_header", the definition follows the opening brace (see Section 4.12). Definitions within the pair of braces are local to "ipv4_header". This scoping mechanism helps to clarify which fields belong to which formats; it is also useful when compressing complex protocol stacks with several headers, often with the same field names occurring in multiple headers (see Section 4.2). The next step is to specify the fields contained in the uncompressed IPv4 header to represent the uncompressed format for which the encoding method will define one or more compressed formats. This is accomplished using ROHC-FN as follows:
UNCOMPRESSED { version [ 4 ]; header_length [ 4 ]; dscp [ 6 ]; ecn [ 2 ]; length [ 16 ]; id [ 16 ]; reserved [ 1 ]; dont_frag [ 1 ]; more_fragments [ 1 ]; offset [ 13 ]; ttl [ 8 ]; protocol [ 8 ]; checksum [ 16 ]; src_addr [ 32 ]; dest_addr [ 32 ]; } The width of each field is indicated in square brackets. This part of the notation is used in the example for illustration to help the reader's understanding. However, indicating the field lengths in this way is optional since the width of each field can also normally be derived from the encoding that is used to compress/decompress it for a specific format. This part of the notation is formally defined in Section 4.10. The next step is to specify the compressed format. This includes the encodings for each field that map between the compressed and uncompressed forms of the field. In the example, these encoding methods are mainly taken from the ROHC-FN library (see Section 4.11). Since the intention here is to illustrate the use of the notation, rather than to describe the optimum method of compressing IPv4 headers, this example uses only three encoding methods. The "uncompressed_value" encoding method (defined in Section 4.11.1) can compress any field whose uncompressed length and value are fixed, or can be calculated using an expression. No compressed bits need to be sent because the uncompressed field can be reconstructed using its known size and value. The "uncompressed_value" encoding method is used to compress five fields in the IPv4 header, as described below: COMPRESSED { header_length =:= uncompressed_value(4, 5); version =:= uncompressed_value(4, 4); reserved =:= uncompressed_value(1, 0); offset =:= uncompressed_value(13, 0); more_fragments =:= uncompressed_value(1, 0);
The first parameter indicates the length of the uncompressed field in bits, and the second parameter gives its integer value. Note that the order of the fields in the compressed format is independent of the order of the fields in the uncompressed format. The "irregular" encoding method (defined in Section 4.11.3) can be used to encode any field for which both uncompressed attributes (ULENGTH and UVALUE) are defined, and whose ULENGTH attribute is either fixed or can be calculated using an expression. It is a fail- safe encoding method that can be used for such fields in the case where no other encoding method applies. All of the bits in the uncompressed form of the field are present in the compressed form as well; hence this encoding does not achieve any compression. src_addr =:= irregular(32); dest_addr =:= irregular(32); length =:= irregular(16); id =:= irregular(16); ttl =:= irregular(8); protocol =:= irregular(8); dscp =:= irregular(6); ecn =:= irregular(2); dont_frag =:= irregular(1); Finally, the third encoding method is specific only to the uncompressed format defined above for the IPv4 header, "inferred_ip_v4_header_checksum": checksum =:= inferred_ip_v4_header_checksum [ 0 ]; } } The "inferred_ip_v4_header_checksum" encoding method is different from the other two encoding methods in that it is not defined in the ROHC-FN library of encoding methods. Its definition could be given either by using the formal notation as part of the profile definition itself (see Section 4.12) or by using plain English text (see Section 4.13). In our example, the "inferred_ip_v4_header_checksum" is a specific encoding method that calculates the IP checksum from the rest of the header values. Like the "uncompressed_value" encoding method, no compressed bits need to be sent, since the field value can be reconstructed at the decompressor. This is notated explicitly by specifying, in square brackets, a length of 0 for the checksum field in the compressed format. Again, this notation is optional since the encoding method itself would be defined as sending zero compressed
bits, however it is useful to the reader to include such notation (see Section 4.10 for details on this part of the notation). Finally the definition of the format is terminated with a closing brace. At this point, the above example has defined a compressed format that can be used to represent the entire compressed IPv4 header, and provides enough information to allow an implementation to construct the compressed format from an uncompressed format (compression) and vice versa (decompression).