RFC 4997

Formal Notation for RObust Header Compression (ROHC-FN)

Pages: 62
Proposed Standard

Part 1 of 3 – Pages 1 to 13

RFC4997 - Page 1

Network Working Group                                         R. Finking
Request for Comments: 4997                   Siemens/Roke Manor Research
Category: Standards Track                                   G. Pelletier
                                                                Ericsson
                                                               July 2007


        Formal Notation for RObust Header Compression (ROHC-FN)

Status of This Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This document defines Robust Header Compression - Formal Notation
   (ROHC-FN), a formal notation to specify field encodings for
   compressed formats when defining new profiles within the ROHC
   framework.  ROHC-FN offers a library of encoding methods that are
   often used in ROHC profiles and can thereby help to simplify future
   profile development work.

Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  4
   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  4
   3.  Overview of ROHC-FN  . . . . . . . . . . . . . . . . . . . . .  5
     3.1.  Scope of the Formal Notation . . . . . . . . . . . . . . .  6
     3.2.  Fundamentals of the Formal Notation  . . . . . . . . . . .  7
       3.2.1.  Fields and Encodings . . . . . . . . . . . . . . . . .  7
       3.2.2.  Formats and Encoding Methods . . . . . . . . . . . . .  9
     3.3.  Example Using IPv4 . . . . . . . . . . . . . . . . . . . . 11
   4.  Normative Definition of ROHC-FN  . . . . . . . . . . . . . . . 13
     4.1.  Structure of a Specification . . . . . . . . . . . . . . . 13
     4.2.  Identifiers  . . . . . . . . . . . . . . . . . . . . . . . 14
     4.3.  Constant Definitions . . . . . . . . . . . . . . . . . . . 15
     4.4.  Fields . . . . . . . . . . . . . . . . . . . . . . . . . . 16
       4.4.1.  Attribute References . . . . . . . . . . . . . . . . . 17
       4.4.2.  Representation of Field Values . . . . . . . . . . . . 17

RFC4997 - Page 2

     4.5.  Grouping of Fields . . . . . . . . . . . . . . . . . . . . 17
     4.6.  "THIS" . . . . . . . . . . . . . . . . . . . . . . . . . . 18
     4.7.  Expressions  . . . . . . . . . . . . . . . . . . . . . . . 19
       4.7.1.  Integer Literals . . . . . . . . . . . . . . . . . . . 20
       4.7.2.  Integer Operators  . . . . . . . . . . . . . . . . . . 20
       4.7.3.  Boolean Literals . . . . . . . . . . . . . . . . . . . 20
       4.7.4.  Boolean Operators  . . . . . . . . . . . . . . . . . . 20
       4.7.5.  Comparison Operators . . . . . . . . . . . . . . . . . 21
     4.8.  Comments . . . . . . . . . . . . . . . . . . . . . . . . . 21
     4.9.  "ENFORCE" Statements . . . . . . . . . . . . . . . . . . . 22
     4.10. Formal Specification of Field Lengths  . . . . . . . . . . 23
     4.11. Library of Encoding Methods  . . . . . . . . . . . . . . . 24
       4.11.1. uncompressed_value . . . . . . . . . . . . . . . . . . 24
       4.11.2. compressed_value . . . . . . . . . . . . . . . . . . . 25
       4.11.3. irregular  . . . . . . . . . . . . . . . . . . . . . . 26
       4.11.4. static . . . . . . . . . . . . . . . . . . . . . . . . 27
       4.11.5. lsb  . . . . . . . . . . . . . . . . . . . . . . . . . 27
       4.11.6. crc  . . . . . . . . . . . . . . . . . . . . . . . . . 29
     4.12. Definition of Encoding Methods . . . . . . . . . . . . . . 29
       4.12.1. Structure  . . . . . . . . . . . . . . . . . . . . . . 30
       4.12.2. Arguments  . . . . . . . . . . . . . . . . . . . . . . 37
       4.12.3. Multiple Formats . . . . . . . . . . . . . . . . . . . 38
     4.13. Profile-Specific Encoding Methods  . . . . . . . . . . . . 40
   5.  Security Considerations  . . . . . . . . . . . . . . . . . . . 41
   6.  Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 41
   7.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 41
   8.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 42
     8.1.  Normative References . . . . . . . . . . . . . . . . . . . 42
     8.2.  Informative References . . . . . . . . . . . . . . . . . . 42
   Appendix A.  Formal Syntax of ROHC-FN  . . . . . . . . . . . . . . 43
   Appendix B.  Bit-level Worked Example  . . . . . . . . . . . . . . 45
     B.1.  Example Packet Format  . . . . . . . . . . . . . . . . . . 45
     B.2.  Initial Encoding . . . . . . . . . . . . . . . . . . . . . 46
     B.3.  Basic Compression  . . . . . . . . . . . . . . . . . . . . 47
     B.4.  Inter-Packet Compression . . . . . . . . . . . . . . . . . 48
     B.5.  Specifying Initial Values  . . . . . . . . . . . . . . . . 50
     B.6.  Multiple Packet Formats  . . . . . . . . . . . . . . . . . 51
     B.7.  Variable Length Discriminators . . . . . . . . . . . . . . 53
     B.8.  Default Encoding . . . . . . . . . . . . . . . . . . . . . 55
     B.9.  Control Fields . . . . . . . . . . . . . . . . . . . . . . 56
     B.10. Use of "ENFORCE" Statements as Conditionals  . . . . . . . 59

RFC4997 - Page 3

1.  Introduction

   Robust Header Compression - Formal Notation (ROHC-FN) is a formal
   notation designed to help with the definition of ROHC [RFC4995]
   header compression profiles.  Previous header compression profiles
   have been so far specified using a combination of English text
   together with ASCII Box notation.  Unfortunately, this was sometimes
   unclear and ambiguous, revealing the limitations of defining complex
   structures and encodings for compressed formats this way.  The
   primary objective of the Formal Notation is to provide a more
   rigorous means to define header formats -- compressed and
   uncompressed -- as well as the relationships between them.  No other
   formal notation exists that meets these requirements, so ROHC-FN aims
   to meet them.

   In addition, ROHC-FN offers a library of encoding methods that are
   often used in ROHC profiles, so that the specification of new
   profiles using the formal notation can be achieved without having to
   redefine this library from scratch.  Informally, an encoding method
   defines a two-way mapping between uncompressed data and compressed
   data.

2.  Terminology

   o  Compressed format

      A compressed format consists of a list of fields that provides
      bindings between encodings and the fields it compresses.  One or
      more compressed formats can be combined to represent an entire
      compressed header format.

   o  Context

      Context is information about the current (de)compression state of
      the flow.  Specifically, a context for a specific field can be
      either uninitialised, or it can include a set of one or more
      values for the field's attributes defined by the compression
      algorithm, where a value may come from the field's attributes
      corresponding to a previous packet.  See also a more generalized
      definition in Section 2.2 of [RFC4995].

   o  Control field

      Control fields are transmitted from a ROHC compressor to a ROHC
      decompressor, but are not part of the uncompressed header itself.

RFC4997 - Page 4

   o  Encoding method, encodings

      Encoding methods are two-way relations that can be applied to
      compress and decompress fields of a protocol header.

   o  Field

      The protocol header is divided into a set of contiguous bit
      patterns known as fields.  Each field is defined by a collection
      of attributes that indicate its value and length in bits for both
      the compressed and uncompressed headers.  The way the header is
      divided into fields is specific to the definition of a profile,
      and it is not necessary for the field divisions to be identical to
      the ones given by the specification(s) for the protocol header
      being compressed.

   o  Library of encoding methods

      The library of encoding methods contains a number of commonly used
      encoding methods for compressing header fields.

   o  Profile

      A ROHC [RFC4995] profile is a description of how to compress a
      certain protocol stack.  Each profile consists of a set of formats
      (for example, uncompressed and compressed formats) along with a
      set of rules that control compressor and decompressor behaviour.

   o  ROHC-FN specification

      The specification of the set of formats of a ROHC profile using
      ROHC-FN.

   o  Uncompressed format

      An uncompressed format consists of a list of fields that provides
      the order of the fields to be compressed for a contiguous set of
      bits whose bit layout corresponds to the protocol header being
      compressed.

3.  Overview of ROHC-FN

   This section gives an overview of ROHC-FN.  It also explains how
   ROHC-FN can be used to specify the compression of header fields as
   part of a ROHC profile.

RFC4997 - Page 5

3.1.  Scope of the Formal Notation

   This section explains how the formal notation relates to the ROHC
   framework and to specifications of ROHC profiles.

   The ROHC framework [RFC4995] provides the general principles for
   performing robust header compression.  It defines the concept of a
   profile, which makes ROHC a general platform for different
   compression schemes.  It sets link layer requirements, and in
   particular negotiation requirements, for all ROHC profiles.  It
   defines a set of common functions such as Context Identifiers (CIDs),
   padding, and segmentation.  It also defines common formats (IR, IR-
   DYN, Feedback, Add-CID, etc.), and finally it defines a generic,
   profile independent, feedback mechanism.

   A ROHC profile is a description of how to compress a certain protocol
   stack.  For example, ROHC profiles are available for RTP/UDP/IP and
   many other protocol stacks.

   At a high level, each ROHC profile consists of a set of formats
   (defining the bits to be transmitted) along with a set of rules that
   control compressor and decompressor behaviour.  The purpose of the
   formats is to define how to compress and decompress headers.  The
   formats define one or more compressed versions of each uncompressed
   header, and simultaneously define the inverse: how to relate a
   compressed header back to the original uncompressed header.

   The set of formats will typically define compression of headers
   relative to a context of field values from previous headers in a
   flow, improving the overall compression by taking into account
   redundancies between headers of successive packets.  Therefore, in
   addition to defining the formats, a profile has to:

   o  specify how to manage the context for both the compressor and the
      decompressor,

   o  define when and what to send in feedback messages, if any, from
      decompressor to compressor,

   o  outline compression principles to make the profile robust against
      bit errors and dropped packets.

   All this is needed to ensure that the compressor and decompressor
   contexts are kept consistent with each other, while still
   facilitating the best possible compression performance.

   The ROHC-FN is designed to help in the specification of compressed
   formats that, when put together based on the profile definition, make

RFC4997 - Page 6

   up the formats used in a ROHC profile.  It offers a library of
   encoding methods for compressing fields, and a mechanism for
   combining these encoding methods to create compressed formats
   tailored to a specific protocol stack.

   The scope of ROHC-FN is limited to specifying the relationship
   between the compressed and uncompressed formats.  To form a complete
   profile specification, the control logic for the profile behaviour
   needs to be defined by other means.

3.2.  Fundamentals of the Formal Notation

   There are two fundamental elements to the formal notation:

   1.  Fields and their encodings, which define the mapping between a
       header's uncompressed and compressed forms.

   2.  Encoding methods, which define the way headers are broken down
       into fields.  Encoding methods define lists of uncompressed
       fields and the lists of compressed fields they map onto.

   These two fundamental elements are at the core of the notation and
   are outlined below.

3.2.1.  Fields and Encodings

   Headers are made up of fields.  For example, version number, header
   length, and sequence number are all fields used in real protocols.

   Fields have attributes.  Attributes describe various things about the
   field.  For example:

     field.ULENGTH

   The above indicates the uncompressed length of the field.  A field is
   said to have a value attribute, i.e., a compressed value or an
   uncompressed value, if the corresponding length attribute is greater
   than zero.  See Section 4.4 for more details on field attributes.

   The relationship between the compressed and uncompressed attributes
   of a field are specified with encoding methods, using the following
   notation:

     field   =:=   encoding_method;

   In the field definition above, the symbol "=:=" means "is encoded
   by".  This field definition does not represent an assignment
   operation from the right hand side to the left side.  Instead, it is

RFC4997 - Page 7

   a two-way mapping between the compressed and uncompressed attributes
   of the field.  It both represents the compression and the
   decompression operation in a single field definition, through a
   process of two-way matching.

   Two-way matching is a binary operation that attempts to make the
   operands (i.e., the compressed and uncompressed attributes) match.
   This is similar to the unification process in logic.  The operands
   represent one unspecified data object and one specified object.
   Values can be matched from either operand.

   During compression, the uncompressed attributes of the field are
   already defined.  The given encoding matches the compressed
   attributes against them.  During decompression, the compressed
   attributes of the field are already defined, so the uncompressed
   attributes are matched to the compressed attributes using the given
   encoding method.  Thus, both compression and decompression are
   defined by a single field definition.

   Therefore, an encoding method (including any parameters specified)
   creates a reversible binding between the attributes of a field.  At
   the compressor, a format can be used if a set of bindings that is
   successful for all the attributes in all its fields can be found.  At
   the decompressor, the operation is reversed using the same bindings
   and the attributes in each field are filled according to the
   specified bindings; decoding fails if the binding for an attribute
   fails.

   For example, the "static" encoding method creates a binding between
   the attribute corresponding to the uncompressed value of the field
   and the corresponding value of the field in the context.

   o  For the compressor, the "static" binding is successful when both
      the context value and the uncompressed value are the same.  If the
      two values differ then the binding fails.

   o  For the decompressor, the "static" binding succeeds only if a
      valid context entry containing the value of the uncompressed field
      exists.  Otherwise, the binding will fail.

   Both the compressed and uncompressed forms of each field are
   represented as a string of bits; the most significant bit first, of
   the length specified by the length attribute.  The bit string is the
   binary representation of the value attribute of the field, modulo
   "2^length", where "length" is the length attribute of the field.
   However, this is only the representation of the bits exchanged
   between the compressor and the decompressor, designed to allow

RFC4997 - Page 8

   maximum compression efficiency.  The FN itself uses the full range of
   integers.  See Section 4.4.2 for further details.

3.2.2.  Formats and Encoding Methods

   The ROHC-FN provides a library of commonly used encoding methods.
   Encoding methods can be defined using plain English, or using a
   formal definition consisting of, for example, a collection of
   expressions (Section 4.7) and "ENFORCE" statements (Section 4.9).

   ROHC-FN also provides mechanisms for combining fields and their
   encoding methods into higher level encoding methods following a well-
   defined structure.  This is similar to the definition of functions
   and procedures in an ordinary programming language.  It allows
   complexity to be handled by being broken down into manageable parts.
   New encoding methods are defined at the top level of a profile.
   These can then be used in the definition of other higher level
   encoding methods, and so on.

   new_encoding_method         // This block is an encoding method
   {
     UNCOMPRESSED {            // This block is an uncompressed format
       field_1   [ 16 ];
       field_2   [ 32 ];
       field_3   [ 48 ];
     }

     CONTROL {                 // This block defines control fields
       ctrl_field_1;
       ctrl_field_2;
     }

     DEFAULT {                 // This block defines default encodings
                               // for specified fields
       ctrl_field_2 =:= encoding_method_2;
       field_1      =:= encoding_method_1;
     }

     COMPRESSED format_0 {     // This block is a compressed format
       field_1;
       field_2      =:= encoding_method_2;
       field_3      =:= encoding_method_3;
       ctrl_field_1 =:= encoding_method_4;
       ctrl_field_2;
     }

RFC4997 - Page 9

     COMPRESSED format_1 {     // This block is a compressed format
       field_1;
       field_2      =:= encoding_method_3;
       field_3      =:= encoding_method_4;
       ctrl_field_2 =:= encoding_method_5;
       ctrl_field_3 =:= encoding_method_6; // This is a control field
                                           // with no uncompressed value
     }
   }

   In the example above, the encoding method being defined is called
   "new_encoding_method".  The section headed "UNCOMPRESSED" indicates
   the order of fields in the uncompressed header, i.e., the
   uncompressed header format.  The number of bits in each of the fields
   is indicated in square brackets.  After this is another section,
   "CONTROL", which defines two control fields.  Following this is the
   "DEFAULT" section which defines default encoding methods for two of
   the fields (see below).  Finally, two alternative compressed formats
   follow, each defined in sections headed "COMPRESSED".  The fields
   that occur in the compressed formats are either:

   o  fields that occur in the uncompressed format; or

   o  control fields that have an uncompressed value and that occur in
      the CONTROL section; or

   o  control fields that do not have an uncompressed value and thus are
      defined as part of the compressed format.

   Central to each of these formats is a "field list", which defines the
   fields contained in the format and also the order that those fields
   appear in that format.  For the "DEFAULT" and "CONTROL" sections, the
   field order is not significant.

   In addition to specifying field order, the field list may also
   specify bindings for any or all of the fields it contains.  Fields
   that have no bindings defined for them are bound using the default
   bindings specified in the "DEFAULT" section (see Section 4.12.1.5).

   Fields from the compressed format have the same name as they do in
   the uncompressed format.  If there are any fields that are present
   exclusively in the compressed format, but that do have an
   uncompressed value, they must be declared in the "CONTROL" section of
   the definition of the encoding method (see Section 4.12.1.3 for more
   details on defining control fields).

   Fields that have no uncompressed value do not appear in an
   "UNCOMPRESSED" field list and do not have to appear in the "CONTROL"

RFC4997 - Page 10

   field list either.  Instead, they are only declared in the compressed
   field lists where they are used.

   In the example above, all the fields that appear in the compressed
   format are also found in the uncompressed format, or the control
   field list, except for ctrl_field_3; this is possible because
   ctrl_field_3 has no "uncompressed" value at all.  Fields such as a
   checksum on the compressed information fall into this category.

3.3.  Example Using IPv4

   This section gives an overview of how the notation is used by means
   of an example.  The example will develop the formal notation for an
   encoding method capable of compressing a single, well-known header:
   the IPv4 header [RFC791].

   The first step is to specify the overall structure of the IPv4
   header.  To do this, we use an encoding method that we will call
   "ipv4_header".  More details on definitions of encoding methods can
   be found in Section 4.12.  This is notated as follows:

     ipv4_header
     {

   The fragment of notation above declares the encoding method
   "ipv4_header", the definition follows the opening brace (see
   Section 4.12).

   Definitions within the pair of braces are local to "ipv4_header".
   This scoping mechanism helps to clarify which fields belong to which
   formats; it is also useful when compressing complex protocol stacks
   with several headers, often with the same field names occurring in
   multiple headers (see Section 4.2).

   The next step is to specify the fields contained in the uncompressed
   IPv4 header to represent the uncompressed format for which the
   encoding method will define one or more compressed formats.  This is
   accomplished using ROHC-FN as follows:

RFC4997 - Page 11

       UNCOMPRESSED {
         version         [  4 ];
         header_length   [  4 ];
         dscp            [  6 ];
         ecn             [  2 ];
         length          [ 16 ];
         id              [ 16 ];
         reserved        [  1 ];
         dont_frag       [  1 ];
         more_fragments  [  1 ];
         offset          [ 13 ];
         ttl             [  8 ];
         protocol        [  8 ];
         checksum        [ 16 ];
         src_addr        [ 32 ];
         dest_addr       [ 32 ];
       }

   The width of each field is indicated in square brackets.  This part
   of the notation is used in the example for illustration to help the
   reader's understanding.  However, indicating the field lengths in
   this way is optional since the width of each field can also normally
   be derived from the encoding that is used to compress/decompress it
   for a specific format.  This part of the notation is formally defined
   in Section 4.10.

   The next step is to specify the compressed format.  This includes the
   encodings for each field that map between the compressed and
   uncompressed forms of the field.  In the example, these encoding
   methods are mainly taken from the ROHC-FN library (see Section 4.11).
   Since the intention here is to illustrate the use of the notation,
   rather than to describe the optimum method of compressing IPv4
   headers, this example uses only three encoding methods.

   The "uncompressed_value" encoding method (defined in Section 4.11.1)
   can compress any field whose uncompressed length and value are fixed,
   or can be calculated using an expression.  No compressed bits need to
   be sent because the uncompressed field can be reconstructed using its
   known size and value.  The "uncompressed_value" encoding method is
   used to compress five fields in the IPv4 header, as described below:

       COMPRESSED {
         header_length  =:= uncompressed_value(4, 5);
         version        =:= uncompressed_value(4, 4);
         reserved       =:= uncompressed_value(1, 0);
         offset         =:= uncompressed_value(13, 0);
         more_fragments =:= uncompressed_value(1, 0);

RFC4997 - Page 12

   The first parameter indicates the length of the uncompressed field in
   bits, and the second parameter gives its integer value.

   Note that the order of the fields in the compressed format is
   independent of the order of the fields in the uncompressed format.

   The "irregular" encoding method (defined in Section 4.11.3) can be
   used to encode any field for which both uncompressed attributes
   (ULENGTH and UVALUE) are defined, and whose ULENGTH attribute is
   either fixed or can be calculated using an expression.  It is a fail-
   safe encoding method that can be used for such fields in the case
   where no other encoding method applies.  All of the bits in the
   uncompressed form of the field are present in the compressed form as
   well; hence this encoding does not achieve any compression.

         src_addr       =:= irregular(32);
         dest_addr      =:= irregular(32);
         length         =:= irregular(16);
         id             =:= irregular(16);
         ttl            =:= irregular(8);
         protocol       =:= irregular(8);
         dscp           =:= irregular(6);
         ecn            =:= irregular(2);
         dont_frag      =:= irregular(1);

   Finally, the third encoding method is specific only to the
   uncompressed format defined above for the IPv4 header,
   "inferred_ip_v4_header_checksum":

         checksum       =:= inferred_ip_v4_header_checksum [ 0 ];
       }
     }

   The "inferred_ip_v4_header_checksum" encoding method is different
   from the other two encoding methods in that it is not defined in the
   ROHC-FN library of encoding methods.  Its definition could be given
   either by using the formal notation as part of the profile definition
   itself (see Section 4.12) or by using plain English text (see
   Section 4.13).

   In our example, the "inferred_ip_v4_header_checksum" is a specific
   encoding method that calculates the IP checksum from the rest of the
   header values.  Like the "uncompressed_value" encoding method, no
   compressed bits need to be sent, since the field value can be
   reconstructed at the decompressor.  This is notated explicitly by
   specifying, in square brackets, a length of 0 for the checksum field
   in the compressed format.  Again, this notation is optional since the
   encoding method itself would be defined as sending zero compressed

RFC4997 - Page 13

   bits, however it is useful to the reader to include such notation
   (see Section 4.10 for details on this part of the notation).

   Finally the definition of the format is terminated with a closing
   brace.  At this point, the above example has defined a compressed
   format that can be used to represent the entire compressed IPv4
   header, and provides enough information to allow an implementation to
   construct the compressed format from an uncompressed format
   (compression) and vice versa (decompression).

(page 13 continued on part 2)