Tech-invite3GPPspaceIETFspace
96959493929190898887868584838281807978777675747372717069686766656463626160595857565554535251504948474645444342414039383736353433323130292827262524232221201918171615141312111009080706050403020100
in Index   Prev   Next

RFC 4997

Formal Notation for RObust Header Compression (ROHC-FN)

Pages: 62
Proposed Standard
Part 3 of 3 – Pages 41 to 62
First   Prev   None

Top   ToC   RFC4997 - Page 41   prevText

5. Security Considerations

This document describes a formal notation similar to ABNF [RFC4234], and hence is not believed to raise any security issues (note that ABNF has a completely separate purpose to the ROHC formal notation).

6. Contributors

Richard Price did much of the foundational work on the formal notation. He authored the initial document describing a formal notation on which this document is based. Kristofer Sandlund contributed to this work by applying new ideas to the ROHC-TCP profile, by providing feedback, and by helping resolve different issues during the entire development of the notation. Carsten Bormann provided the translation of the formal notation syntax using ABNF in Appendix A, and also contributed with feedback and reviews to validate the completeness and correctness of the notation.

7. Acknowledgements

A number of important concepts and ideas have been borrowed from ROHC [RFC3095]. Thanks to Mark West, Eilert Brinkmann, Alan Ford, and Lars-Erik Jonsson for their contributions, reviews, and feedback that led to significant improvements to the readability, completeness, and overall quality of the notation. Thanks to Stewart Sadler, Caroline Daniels, Alan Finney, and David Findlay for their reviews and comments. Thanks to Rob Hancock and Stephen McCann for their early work on the formal notation. The
Top   ToC   RFC4997 - Page 42
   authors would also like to thank Christian Schmidt, Qian Zhang,
   Hongbin Liao, and Max Riegel for their comments and valuable input.

   Additional thanks: this document was reviewed during working group
   last-call by committed reviewers Mark West, Carsten Bormann, and Joe
   Touch, as well as by Sally Floyd who provided a review at the request
   of the Transport Area Directors.  Thanks also to Magnus Westerlund
   for his feedback in preparation for the IESG review.

8. References

8.1. Normative References

[C90] ISO/IEC, "ISO/IEC 9899:1990 Information technology -- Programming Language C", ISO 9899:1990, April 1990. [RFC2822] Resnick, P., Ed., "STANDARD FOR THE FORMAT OF ARPA INTERNET TEXT MESSAGES", RFC 2822, April 2001. [RFC4234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", RFC 4234, October 2005. [RFC4995] Jonsson, L-E., Pelletier, G., and K. Sandlund, "The RObust Header Compression (ROHC) Framework", RFC 4995, July 2007.

8.2. Informative References

[RFC3095] Bormann, C., Burmeister, C., Degermark, M., Fukushima, H., Hannu, H., Jonsson, L-E., Hakenberg, R., Koren, T., Le, K., Liu, Z., Martensson, A., Miyazaki, A., Svanbro, K., Wiebke, T., Yoshimura, T., and H. Zheng, "RObust Header Compression (ROHC): Framework and four profiles: RTP, UDP, ESP, and uncompressed", RFC 3095, July 2001. [RFC791] University of Southern California, "DARPA INTERNET PROGRAM PROTOCOL SPECIFICATION", RFC 791, September 1981.
Top   ToC   RFC4997 - Page 43

Appendix A. Formal Syntax of ROHC-FN

This section gives a definition of the syntax of ROHC-FN in ABNF [RFC4234], using "fnspec" as the start rule. ; overall structure fnspec = S *(constdef S) [globctl S] 1*(methdef S) constdef = constname S "=" S expn S ";" globctl = CONTROL S formbody methdef = id S [parmlist S] "{" S 1*(formatdef S) "}" / id S [parmlist S] STRQ *STRCHAR STRQ S ";" parmlist = "(" S id S *( "," S id S ) ")" formatdef = formhead S formbody formhead = UNCOMPRESSED [ 1*WS id ] / COMPRESSED [ 1*WS id ] / CONTROL / INITIAL / DEFAULT formbody = "{" S *((fielddef/enforcer) S) "}" fielddef = fieldgroup S ["=:=" S encspec S] [lenspec S] ";" fieldgroup = fieldname *( S ":" S fieldname ) fieldname = id encspec = "'" *("0"/"1") "'" / id [ S "(" S expn S *( "," S expn S ) ")"] lenspec = "[" S expn S *("," S expn S) "]" enforcer = ENFORCE S "(" S expn S ")" S ";" ; expressions expn = *(expnb S "||" S) expnb expnb = *(expna S "&&" S) expna expna = *(expn7 S ("=="/"!=") S) expn7 expn7 = *(expn6 S ("<"/"<="/">"/">=") S) expn6 expn6 = *(expn4 S ("+"/"-") S) expn4 expn4 = *(expn3 S ("*"/"/"/"%") S) expn3 expn3 = expn2 [S "^" S expn3] expn2 = ["!" S] expn1 expn1 = expn0 / attref / constname / litval / id expn0 = "(" S expn S ")" / VARIABLE attref = fieldnameref "." attname fieldnameref = fieldname / THIS attname = ( U / C ) ( LENGTH / VALUE ) litval = ["-"] "0b" 1*("0"/"1") / ["-"] "0x" 1*(DIGIT/"a"/"b"/"c"/"d"/"e"/"f") / ["-"] 1*DIGIT / false / true
Top   ToC   RFC4997 - Page 44
   ; lexical categories
   constname = UPCASE *(UPCASE / DIGIT / "_")
   id        = ALPHA *(ALPHA / DIGIT / "_")
   ALPHA     = %x41-5A / %x61-7A
   UPCASE    = %x41-5A
   DIGIT     = %x30-39
   COMMENT   = "//" *(SP / HTAB / VCHAR) CRLF
   SP        = %x20
   HTAB      = %x09
   VCHAR     = %x21-7E
   CRLF      = %x0A / %x0D.0A
   NL        = COMMENT / CRLF
   WS        = SP / HTAB / NL
   S         = *WS
   STRCHAR   = SP / HTAB / %x21 / %x23-7E
   STRQ      = %x22


   ; case-sensitive literals
   C            = %d67
   COMPRESSED   = %d67.79.77.80.82.69.83.83.69.68
   CONTROL      = %d67.79.78.84.82.79.76
   DEFAULT      = %d68.69.70.65.85.76.84
   ENFORCE      = %d69.78.70.79.82.67.69
   INITIAL      = %d73.78.73.84.73.65.76
   LENGTH       = %d76.69.78.71.84.72
   THIS         = %d84.72.73.83
   U            = %d85
   UNCOMPRESSED = %d85.78.67.79.77.80.82.69.83.83.69.68
   VALUE        = %d86.65.76.85.69
   VARIABLE     = %d86.65.82.73.65.66.76.69
   false        = %d102.97.108.115.101
   true         = %d116.114.117.101
Top   ToC   RFC4997 - Page 45

Appendix B. Bit-level Worked Example

This section gives a worked example at the bit level, showing how a simple ROHC-FN specification describes the compression of real data from an imaginary protocol header. The example used has been kept fairly simple, whilst still aiming to illustrate some of the intricacies that arise in use of the notation. In particular, fields have been kept short to make it possible to read the binary representation of the headers without too much difficulty.

B.1. Example Packet Format

Our imaginary header is just 16 bits long, and consists of the following fields: 1. version number -- 2 bits 2. type -- 2 bits 3. flow id -- 4 bits 4. sequence number -- 4 bits 5. flag bits -- 4 bits So for example 0101000100010000 indicates a header with a version number of one, a type of one, a flow id of one, a sequence number of one, and all flag bits set to zero. Here is an ASCII box notation diagram of the imaginary header: 0 1 2 3 4 5 6 7 +---+---+---+---+---+---+---+---+ |version| type | flow_id | +---+---+---+---+---+---+---+---+ | sequence_no | flag_bits | +---+---+---+---+---+---+---+---+
Top   ToC   RFC4997 - Page 46

B.2. Initial Encoding

An initial definition based solely on the above information is as follows: eg_header { UNCOMPRESSED { version_no [ 2 ]; type [ 2 ]; flow_id [ 4 ]; sequence_no [ 4 ]; flag_bits [ 4 ]; } COMPRESSED initial_definition { version_no =:= irregular(2); type =:= irregular(2); flow_id =:= irregular(4); sequence_no =:= irregular(4); flag_bits =:= irregular(4); } } This defines the format nicely, but doesn't actually offer any compression. If we use it to encode the above header, we get: Uncompressed header: 0101000100010000 Compressed header: 0101000100010000 This is because we have stated that all fields are "irregular" -- i.e., we haven't specified anything about their behaviour. Note that since we have only one compressed format and one uncompressed format, it makes no difference whether the encoding methods for each field are specified in the compressed or uncompressed format. It would make no difference at all if we wrote the following instead: eg_header { UNCOMPRESSED { version_no =:= irregular(2); type =:= irregular(2); flow_id =:= irregular(4); sequence_no =:= irregular(4); flag_bits =:= irregular(4); }
Top   ToC   RFC4997 - Page 47
       COMPRESSED initial_definition {
         version_no   [ 2 ];
         type         [ 2 ];
         flow_id      [ 4 ];
         sequence_no  [ 4 ];
         flag_bits    [ 4 ];
       }
     }

B.3. Basic Compression

In order to achieve any compression we need to notate more knowledge about the header and its behaviour in a flow. For example, we may know the following facts about the header: 1. version number -- indicates which version of the protocol this is: always one for this version of the protocol. 2. type -- may take any value. 3. flow id -- may take any value. 4. sequence number -- make take any value. 5. flag bits -- contains three flags, a, b, and c, each of which may be set or clear, and a reserved flag bit, which is always clear (i.e., zero). We could notate this knowledge as follows: eg_header { UNCOMPRESSED { version_no [ 2 ]; type [ 2 ]; flow_id [ 4 ]; sequence_no [ 4 ]; abc_flag_bits [ 3 ]; reserved_flag [ 1 ]; } COMPRESSED basic { version_no =:= uncompressed_value(2, 1) [ 0 ]; type =:= irregular(2) [ 2 ]; flow_id =:= irregular(4) [ 4 ]; sequence_no =:= irregular(4) [ 4 ]; abc_flag_bits =:= irregular(3) [ 3 ]; reserved_flag =:= uncompressed_value(1, 0) [ 0 ];
Top   ToC   RFC4997 - Page 48
       }
     }

   Using this simple scheme, we have successfully encoded the fact that
   one of the fields has a permanently fixed value of one, and therefore
   contains no useful information.  We have also encoded the fact that
   the final flag bit is always zero, which again contains no useful
   information.  Both of these facts have been notated using the
   "uncompressed_value" encoding method (see Section 4.11.1).

   Using this new encoding on the above header, we get:

     Uncompressed header: 0101000100010000
     Compressed header:   0100010001000

   This reduces the amount of data we need to transmit by roughly 20%.
   However, this encoding fails to take advantage of relationships
   between values of a field in one packet and its value in subsequent
   packets.  For example, every header in the following sequence is
   compressed by the same amount despite the similarities between them:

     Uncompressed header: 0101000100010000
     Compressed header:   0100010001000


     Uncompressed header: 0101000101000000
     Compressed header:   0100010100000


     Uncompressed header: 0110000101110000
     Compressed header:   1000010111000

B.4. Inter-Packet Compression

The profile we have defined so far has not compressed the sequence number or flow ID fields at all, since they can take any value. However the value of each of these fields in one header has a very simple relationship to their values in previous headers: o the sequence number is unusual -- it increases by three each time, o the flow_id stays the same -- it always has the same value that it did in the previous header in the flow, o the abc_flag_bits stay the same most of the time -- they usually have the same value that they did in the previous header in the flow.
Top   ToC   RFC4997 - Page 49
   An obvious way of notating this is as follows:

     // This obvious encoding will not work (correct encoding below)
     eg_header
     {
       UNCOMPRESSED {
         version_no     [ 2 ];
         type           [ 2 ];
         flow_id        [ 4 ];
         sequence_no    [ 4 ];
         abc_flag_bits  [ 3 ];
         reserved_flag  [ 1 ];
       }

       COMPRESSED obvious {
         version_no    =:= uncompressed_value(2, 1);
         type          =:= irregular(2);
         flow_id       =:= static;
         sequence_no   =:= lsb(0, -3);
         abc_flag_bits =:= irregular(3);
         reserved_flag =:= uncompressed_value(1, 0);
       }
     }

   The dependency on previous packets is notated using the "static" and
   "lsb" encoding methods (see Section 4.11.4 and Section 4.11.5
   respectively).  However there are a few problems with the above
   notation.

   Firstly, and most importantly, the "flow_id" field is notated as
   "static", which means that it doesn't change from packet to packet.
   However, the notation does not indicate how to communicate the value
   of the field initially.  There is no point saying "it's the same
   value as last time" if there has not been a first time where we
   define what that value is, so that it can be referred back to.  The
   above notation provides no way of communicating that.  Similarly with
   the sequence number -- there needs to be a way of communicating its
   initial value.  In fact, except for the explicit notation indicating
   their lengths, even the lengths of these two fields would be left
   undefined.  This problem will be solved below, in Appendix B.5.

   Secondly, the sequence number field is communicated very efficiently
   in zero bits, but it is not at all robust against packet loss.  If a
   packet is lost then there is no way to handle the missing sequence
   number.  When communicating sequence numbers, or any other field
   encoded with "lsb" encoding, a very important consideration for the
   notator is how robust against packet loss the compressed protocol
   should be.  This will vary a lot from protocol stack to protocol
Top   ToC   RFC4997 - Page 50
   stack.  For the example protocol we'll assume short, low overhead
   flows and say we need to be robust to the loss of just one packet,
   which we can achieve with two bits of "lsb" encoding (one bit isn't
   enough since the sequence number increases by three each time -- see
   Section 4.11.5).  This will be addressed below in Appendix B.5.

   Finally, although the flag bits are usually the same as in the
   previous header in the flow, the profile doesn't make any use of this
   fact; since they are sometimes not the same as those in the previous
   header, it is not safe to say that they are always the same, so
   "static" encoding can't be used exclusively.  This problem will be
   solved later through the use of multiple formats in Appendix B.6.

B.5. Specifying Initial Values

To communicate initial values for fields compressed with a context dependent encoding such as "static" or "lsb" we use an "INITIAL" field list. This can help with fields whose start value is fixed and known. For example, if we knew that at the start of the flow that "flow_id" would always be 1 and "sequence_no" would always be 0, we could notate that like this: // This encoding will not work either (correct encoding below) eg_header { UNCOMPRESSED { version_no [ 2 ]; type [ 2 ]; flow_id [ 4 ]; sequence_no [ 4 ]; abc_flag_bits [ 3 ]; reserved_flag [ 1 ]; } INITIAL { // set initial values of fields before flow starts flow_id =:= uncompressed_value(4, 1); sequence_no =:= uncompressed_value(4, 0); } COMPRESSED obvious { version_no =:= uncompressed_value(2, 1); type =:= irregular(2); flow_id =:= static; sequence_no =:= lsb(2, -3); abc_flag_bits =:= irregular(3); reserved_flag =:= uncompressed_value(1, 0); }
Top   ToC   RFC4997 - Page 51
     }

   However, this use of "INITIAL" is no good since the initial values of
   both "flow_id" and "sequence_no" vary from flow to flow.  "INITIAL"
   is only applicable where the initial value of a field is fixed, as is
   often the case with control fields.

B.6. Multiple Packet Formats

To communicate initial values for the sequence number and flow ID fields correctly, and to take advantage of the fact that the flag bits are usually the same as in the previous header, we need to depart from the single format encoding we are currently using and instead use multiple formats. Here, we have expressed the encodings for two of the fields in the uncompressed format, since they will always be true for uncompressed headers of that format. The remaining fields, whose encoding method may depend on exactly how the header is being compressed, have their encodings specified in the compressed formats. eg_header { UNCOMPRESSED { version_no =:= uncompressed_value(2, 1) [ 2 ]; type [ 2 ]; flow_id [ 4 ]; sequence_no [ 4 ]; abc_flag_bits [ 3 ]; reserved_flag =:= uncompressed_value(1, 0) [ 1 ]; } COMPRESSED irregular_format { discriminator =:= '0' [ 1 ]; version_no [ 0 ]; type =:= irregular(2) [ 2 ]; flow_id =:= irregular(4) [ 4 ]; sequence_no =:= irregular(4) [ 4 ]; abc_flag_bits =:= irregular(3) [ 3 ]; reserved_flag [ 0 ]; } COMPRESSED compressed_format { discriminator =:= '1' [ 1 ]; version_no [ 0 ]; type =:= irregular(2) [ 2 ]; flow_id =:= static [ 0 ]; sequence_no =:= lsb(2, -3) [ 2 ];
Top   ToC   RFC4997 - Page 52
         abc_flag_bits =:= static       [ 0 ];
         reserved_flag                  [ 0 ];
       }
     }

   Note that we have added a discriminator field, so that the
   decompressor can tell which format has been used by the compressor.
   The format with a "static" flow ID and "lsb" encoded sequence number
   is now 5 bits long.  Note that despite having to add the
   discriminator field, this format is still the same size as the
   original incorrect "obvious" format because it takes advantage of the
   fact that the abc flag bits rarely change.

   However, the original "basic" format has also grown by one bit due to
   the addition of the discriminator ("irregular_format").  An important
   consideration when creating multiple formats is whether each format
   occurs frequently enough that the average compressed header length is
   shorter as a result of its usage.  For example, if in fact the flag
   bits always changed between packets, the "compressed_format" encoding
   could never be used; all we would have achieved is lengthening the
   "basic" format by one bit.

   Using the above notation, we now get:

     Uncompressed header: 0101000100010000
     Compressed header:   00100010001000


     Uncompressed header: 0101000101000000
     Compressed header:   10100 ; 00100010100000


     Uncompressed header: 0110000101110000
     Compressed header:   11011 ; 01000010111000

   The first header in the stream is compressed the same way as before,
   except that it now has the extra 1-bit discriminator at the start
   (0).  When a second header arrives with the same flow ID as the first
   and its sequence number three higher, it can be compressed in two
   possible ways: either by using "compressed_format" or, in the same
   way as previously, by using "irregular_format".

   Note that we show all theoretically possible encodings of a header as
   defined by the ROHC-FN specification, separated by semi-colons.
   Either of the above encodings for each header could be produced by a
   valid implementation, although a good implementation would always aim
   to pick the encoding that leads to the best compression.  A good
   implementation would also take robustness into account and therefore
Top   ToC   RFC4997 - Page 53
   probably wouldn't assume on the second packet that the decompressor
   had available the context necessary to decompress the shorter
   "compressed_format" form.

   Finally, note that the fields whose encoding methods are specified in
   the uncompressed format have zero length when compressed.  This means
   their position in the compressed format is not significant.  In this
   case, there is no need to notate them when defining the compressed
   formats.  In the next part of the example we will see that they have
   been removed from the compressed formats altogether.

B.7. Variable Length Discriminators

Suppose we do some analysis on flows of our example protocol and discover that whilst it is usual for successive packets to have the same flags, on the occasions when they don't, the packet is almost always a "flags set" packet in which all three of the abc flags are set. To encode the flow more efficiently a format needs to be written to reflect this. This now gives a total of three formats, which means we need three discriminators to differentiate between them. The obvious solution here is to increase the number of bits in the discriminator from one to two and use discriminators 00, 01, and 10 for example. However we can do slightly better than this. Any uniquely identifiable discriminator will suffice, so we can use 00, 01, and 1. If the discriminator starts with 1, that's the whole thing. If it starts with 0, the decompressor knows it has to check one more bit to determine the kind of format. Note that care must be taken when using variable length discriminators. For example, it would be erroneous to use 0, 01, and 10 as discriminators since after reading an initial 0, the decompressor would have no way of knowing if the next bit was a second bit of discriminator, or the first bit of the next field in the format. However, 0, 10, and 11 would be correct, as the first bit again indicates whether or not there are further discriminator bits to follow.
Top   ToC   RFC4997 - Page 54
   This gives us the following:

     eg_header
     {
       UNCOMPRESSED {
         version_no    =:= uncompressed_value(2, 1) [ 2 ];
         type                                       [ 2 ];
         flow_id                                    [ 4 ];
         sequence_no                                [ 4 ];
         abc_flag_bits                              [ 3 ];
         reserved_flag =:= uncompressed_value(1, 0) [ 1 ];
       }

       COMPRESSED irregular_format {
         discriminator =:= '00'         [ 2 ];
         type          =:= irregular(2) [ 2 ];
         flow_id       =:= irregular(4) [ 4 ];
         sequence_no   =:= irregular(4) [ 4 ];
         abc_flag_bits =:= irregular(3) [ 3 ];
       }

       COMPRESSED flags_set {
         discriminator =:= '01'                     [ 2 ];
         type          =:= irregular(2)             [ 2 ];
         flow_id       =:= static                   [ 0 ];
         sequence_no   =:= lsb(2, -3)               [ 2 ];
         abc_flag_bits =:= uncompressed_value(3, 7) [ 0 ];
       }

       COMPRESSED flags_static {
         discriminator =:= '1'          [ 1 ];
         type          =:= irregular(2) [ 2 ];
         flow_id       =:= static       [ 0 ];
         sequence_no   =:= lsb(2, -3)   [ 2 ];
         abc_flag_bits =:= static       [ 0 ];
       }
     }

   Here is some example output:

     Uncompressed header: 0101000100010000
     Compressed header:   000100010001000


     Uncompressed header: 0101000101000000
     Compressed header:   10100 ; 000100010100000
Top   ToC   RFC4997 - Page 55
     Uncompressed header: 0110000101110000
     Compressed header:   11011 ; 001000010111000


     Uncompressed header: 0111000110101110
     Compressed header:   011110 ; 001100011010111

   Here we have a very similar sequence to last time, except that there
   is now an extra message on the end that has the flag bits set.  The
   encoding for the first message in the stream is now one bit larger,
   the encoding for the next two messages is the same as before, since
   that format has not grown; thanks to the use of variable length
   discriminators.  Finally, the packet that comes through with all the
   flag bits set can be encoded in just six bits, only one bit more than
   the most common format.  Without the extra format, this last packet
   would have to be encoded using the longest format and would have
   taken up 14 bits.

B.8. Default Encoding

Some of the common encoding methods used so far have been "factored out" into the definition of the uncompressed format, meaning that they don't need to be defined for every compressed format. However, there is still some redundancy in the notation. For a number of fields, the same encoding method is used several times in different formats (though not necessarily in all of them), but the field encoding is redefined explicitly each time. If the encoding for any of these fields changed in the future, then every format that uses that encoding would have to be modified to reflect this change. This problem can be avoided by specifying default encoding methods for these fields. Doing so can also lead to a more concisely notated profile: eg_header { UNCOMPRESSED { version_no =:= uncompressed_value(2, 1) [ 2 ]; type [ 2 ]; flow_id [ 4 ]; sequence_no [ 4 ]; abc_flag_bits [ 3 ]; reserved_flag =:= uncompressed_value(1, 0) [ 1 ]; } DEFAULT { type =:= irregular(2); flow_id =:= static;
Top   ToC   RFC4997 - Page 56
         sequence_no   =:= lsb(2, -3);
       }

       COMPRESSED irregular_format {
         discriminator =:= '00'         [ 2 ];
         type                           [ 2 ]; // Uses default
         flow_id       =:= irregular(4) [ 4 ]; // Overrides default
         sequence_no   =:= irregular(4) [ 4 ]; // Overrides default
         abc_flag_bits =:= irregular(3) [ 3 ];
       }

       COMPRESSED flags_set {
         discriminator =:= '01' [ 2 ];
         type                   [ 2 ]; // Uses default
         sequence_no            [ 2 ]; // Uses default
         abc_flag_bits =:= uncompressed_value(3, 7);
       }

       COMPRESSED flags_static {
         discriminator =:= '1' [ 1 ];
         type                  [ 2 ]; // Uses default
         sequence_no           [ 2 ]; // Uses default
         abc_flag_bits =:= static;
       }
     }

   The above profile behaves in exactly the same way as the one notated
   previously, since it has the same meaning.  Note that the purpose
   behind the different formats becomes clearer with the default
   encoding methods factored out: all that remains are the encodings
   that are specific to each format.  Note also that default encoding
   methods that compress down to zero bits have become completely
   implicit.  For example the compressed formats using the default
   encoding for "flow_id" don't mention it (the default is "static"
   encoding that compresses to zero bits).

B.9. Control Fields

One inefficiency in the compression scheme we have produced thus far is that it uses two bits to provide the "lsb" encoded sequence number with robustness for the loss of just one packet. In theory, only one bit should be needed. The root of the problem is the unusual sequence number that the protocol uses -- it counts up in increments of three. In order to encode it at maximum efficiency we need to translate this into a field that increments by one each time. We do this using a control field.
Top   ToC   RFC4997 - Page 57
   A control field is extra data that is communicated in the compressed
   format, but which is not a direct encoding of part of the
   uncompressed header.  Control fields can be used to communicate extra
   information in the compressed format, that allows other fields to be
   compressed more efficiently.

   The control field that we introduce scales the sequence number down
   by a factor of three.  Instead of encoding the original sequence
   number in the compressed packet, we encode the scaled sequence
   number, allowing us to have robustness to the loss of one packet by
   using just one bit of "lsb" encoding:

     eg_header
     {
       UNCOMPRESSED {
         version_no    =:= uncompressed_value(2, 1) [ 2 ];
         type                                       [ 2 ];
         flow_id                                    [ 4 ];
         sequence_no                                [ 4 ];
         abc_flag_bits                              [ 3 ];
         reserved_flag =:= uncompressed_value(1, 0) [ 1 ];
       }

       CONTROL {
         // need modulo maths to calculate scaling correctly,
         // due to 4 bit wrap around
         scaled_seq_no   [ 4 ];
         ENFORCE(sequence_no.UVALUE
                   == (scaled_seq_no.UVALUE * 3) % 16);
       }

       DEFAULT {
         type          =:= irregular(2);
         flow_id       =:= static;
         scaled_seq_no =:= lsb(1, -1);
       }

       COMPRESSED irregular_format {
         discriminator =:= '00'         [ 2 ];
         type                           [ 2 ];
         flow_id       =:= irregular(4) [ 4 ];
         scaled_seq_no =:= irregular(4) [ 4 ]; // Overrides default
         abc_flag_bits =:= irregular(3) [ 3 ];
       }

       COMPRESSED flags_set {
         discriminator =:= '01' [ 2 ];
         type                   [ 2 ];
Top   ToC   RFC4997 - Page 58
         scaled_seq_no          [ 1 ]; // Uses default
         abc_flag_bits =:= uncompressed_value(3, 7);
       }

       COMPRESSED flags_static {
         discriminator =:= '1' [ 1 ];
         type                  [ 2 ];
         scaled_seq_no         [ 1 ]; // Uses default
         abc_flag_bits =:= static;
       }
     }

   Normally, the encoding method(s) used to encode a field specifies the
   length of the field.  In the above notation, since there is no
   encoding method using "sequence_no" directly, its length needs to be
   defined explicitly using an "ENFORCE" statement.  This is done using
   the abbreviated syntax, both for consistency and also for ease of
   readability.  Note that this is unusual: whereas the majority of
   field length indications are redundant (and thus optional), this one
   isn't.  If it was removed from the above notation, the length of the
   "sequence_no" field would be undefined.

   Here is some example output:

     Uncompressed header: 0101000100010000
     Compressed header:   000100011011000


     Uncompressed header: 0101000101000000
     Compressed header:   1010 ; 000100011100000


     Uncompressed header: 0110000101110000
     Compressed header:   1101 ; 001000011101000


     Uncompressed header: 0111000110101110
     Compressed header:   01110 ; 001100011110111

   In this form, we see that this gives us a saving of a further bit in
   most packets.  Assuming the bulk of a flow is made up of
   "flags_static" headers, the mean size of the headers in a compressed
   flow is now just over a quarter of their size in an uncompressed
   flow.
Top   ToC   RFC4997 - Page 59

B.10. Use of "ENFORCE" Statements as Conditionals

Earlier, we created a new format "flags_set" to handle packets with all three of the flag bits set. As it happens, these three flags are always all set for "type 3" packets, and are never all set for other packet types (a "type 3" packet is one where the type field is set to three). This allows extra efficiency in encoding such packets. We know the type is three, so we don't need to encode the type field in the compressed header. The type field was previously encoded as "irregular(2)", which is two bits long. Removing this reduces the size of the "flags_set" format from five bits to three, making it the smallest format in the encoding method definition. In order to notate that the "flags_set" format should only be used for "type 3" headers, and the "flags_static" format only when the type isn't three, it is necessary to state these conditions inside each format. This can be done with an "ENFORCE" statement: eg_header { UNCOMPRESSED { version_no =:= uncompressed_value(2, 1) [ 2 ]; type [ 2 ]; flow_id [ 4 ]; sequence_no [ 4 ]; abc_flag_bits [ 3 ]; reserved_flag =:= uncompressed_value(1, 0) [ 1 ]; } CONTROL { // need modulo maths to calculate scaling correctly, // due to 4 bit wrap around scaled_seq_no [ 4 ]; ENFORCE(sequence_no.UVALUE == (scaled_seq_no.UVALUE * 3) % 16); } DEFAULT { type =:= irregular(2); scaled_seq_no =:= lsb(1, -1); flow_id =:= static; } COMPRESSED irregular_format { discriminator =:= '00' [ 2 ]; type [ 2 ];
Top   ToC   RFC4997 - Page 60
         flow_id       =:= irregular(4) [ 4 ];
         scaled_seq_no =:= irregular(4) [ 4 ];
         abc_flag_bits =:= irregular(3) [ 3 ];
       }

       COMPRESSED flags_set {
         ENFORCE(type.UVALUE == 3); // redundant condition
         discriminator =:= '01'                      [ 2 ];
         type          =:= uncompressed_value(2, 3)  [ 0 ];
         scaled_seq_no                               [ 1 ];
         abc_flag_bits =:= uncompressed_value(3, 7)  [ 0 ];
       }

       COMPRESSED flags_static {
         ENFORCE(type.UVALUE != 3);
         discriminator =:= '1'    [ 1 ];
         type                     [ 2 ];
         scaled_seq_no            [ 1 ];
         abc_flag_bits =:= static [ 0 ];
       }
     }

   The two "ENFORCE" statements in the last two formats act as "guards".
   Guards prevent formats from being used under the wrong circumstances.
   In fact, the "ENFORCE" statement in "flags_set" is redundant.  The
   condition it guards for is already enforced by the new encoding
   method used for the "type" field.  The encoding method
   "uncompressed_value(2,3)" binds the "UVALUE" attribute to three.
   This is exactly what the "ENFORCE" statement does, so it can be
   removed without any change in meaning.  The "uncompressed_value"
   encoding method on the other hand is not redundant.  It specifies
   other bindings on the type field in addition to the one that the
   "ENFORCE" statement specifies.  Therefore it would not be possible to
   remove the encoding method and leave just the "ENFORCE" statement.

   Note that a guard is solely preventative.  A guard can never force a
   format to be chosen by the compressor.  A format can only be
   guaranteed to be chosen in a given situation if there are no other
   formats that can be used instead.  This is demonstrated in the
   example output below.  The compressor can still choose the
   "irregular" format if it wishes:

     Uncompressed header: 0101000100010000
     Compressed header:   000100011011000


     Uncompressed header: 0101000101000000
     Compressed header:   1010 ; 000100011100000
Top   ToC   RFC4997 - Page 61
     Uncompressed header: 0110000101110000
     Compressed header:   1101 ; 001000011101000


     Uncompressed header: 0111000110101110
     Compressed header:   010 ; 001100011110111

   This saves just two extra bits (a 7% saving) in the example flow.

Authors' Addresses

Robert Finking Siemens/Roke Manor Research Old Salisbury Lane Romsey, Hampshire SO51 0ZN UK Phone: +44 (0)1794 833189 EMail: robert.finking@roke.co.uk URI: http://www.roke.co.uk Ghyslain Pelletier Ericsson Box 920 Lulea SE-971 28 Sweden Phone: +46 (0) 8 404 29 43 EMail: ghyslain.pelletier@ericsson.com
Top   ToC   RFC4997 - Page 62
Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.