5. Security Considerations
This document describes a formal notation similar to ABNF [RFC4234], and hence is not believed to raise any security issues (note that ABNF has a completely separate purpose to the ROHC formal notation).

6. Contributors
Richard Price did much of the foundational work on the formal notation. He authored the initial document describing a formal notation on which this document is based.

Kristofer Sandlund contributed to this work by applying new ideas to the ROHC-TCP profile, by providing feedback, and by helping resolve different issues during the entire development of the notation.

Carsten Bormann provided the translation of the formal notation syntax using ABNF in Appendix A, and also contributed with feedback and reviews to validate the completeness and correctness of the notation.

7. Acknowledgements
A number of important concepts and ideas have been borrowed from ROHC [RFC3095].

Thanks to Mark West, Eilert Brinkmann, Alan Ford, and Lars-Erik Jonsson for their contributions, reviews, and feedback that led to significant improvements to the readability, completeness, and overall quality of the notation.

Thanks to Stewart Sadler, Caroline Daniels, Alan Finney, and David Findlay for their reviews and comments. Thanks to Rob Hancock and Stephen McCann for their early work on the formal notation. The authors would also like to thank Christian Schmidt, Qian Zhang, Hongbin Liao, and Max Riegel for their comments and valuable input.

Additional thanks: this document was reviewed during working group last-call by committed reviewers Mark West, Carsten Bormann, and Joe Touch, as well as by Sally Floyd who provided a review at the request of the Transport Area Directors. Thanks also to Magnus Westerlund for his feedback in preparation for the IESG review.

8. References
8.1. Normative References
[C90]      ISO/IEC, "ISO/IEC 9899:1990 Information technology -- Programming Language C", ISO 9899:1990, April 1990.

[RFC2822]  Resnick, P., Ed., "Internet Message Format", RFC 2822, April 2001.

[RFC4234]  Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", RFC 4234, October 2005.

[RFC4995]  Jonsson, L-E., Pelletier, G., and K. Sandlund, "The RObust Header Compression (ROHC) Framework", RFC 4995, July 2007.

8.2. Informative References
[RFC3095]  Bormann, C., Burmeister, C., Degermark, M., Fukushima, H., Hannu, H., Jonsson, L-E., Hakenberg, R., Koren, T., Le, K., Liu, Z., Martensson, A., Miyazaki, A., Svanbro, K., Wiebke, T., Yoshimura, T., and H. Zheng, "RObust Header Compression (ROHC): Framework and four profiles: RTP, UDP, ESP, and uncompressed", RFC 3095, July 2001.

[RFC791]   University of Southern California, "DARPA INTERNET PROGRAM PROTOCOL SPECIFICATION", RFC 791, September 1981.
Appendix A. Formal Syntax of ROHC-FN
This section gives a definition of the syntax of ROHC-FN in ABNF [RFC4234], using "fnspec" as the start rule.

  ; overall structure
  fnspec     = S *(constdef S) [globctl S] 1*(methdef S)
  constdef   = constname S "=" S expn S ";"
  globctl    = CONTROL S formbody
  methdef    = id S [parmlist S] "{" S 1*(formatdef S) "}"
             / id S [parmlist S] STRQ *STRCHAR STRQ S ";"
  parmlist   = "(" S id S *( "," S id S ) ")"
  formatdef  = formhead S formbody
  formhead   = UNCOMPRESSED [ 1*WS id ]
             / COMPRESSED [ 1*WS id ]
             / CONTROL / INITIAL / DEFAULT
  formbody   = "{" S *((fielddef/enforcer) S) "}"
  fielddef   = fieldgroup S ["=:=" S encspec S] [lenspec S] ";"
  fieldgroup = fieldname *( S ":" S fieldname )
  fieldname  = id
  encspec    = "'" *("0"/"1") "'"
             / id [ S "(" S expn S *( "," S expn S ) ")"]
  lenspec    = "[" S expn S *("," S expn S) "]"
  enforcer   = ENFORCE S "(" S expn S ")" S ";"

  ; expressions
  expn         = *(expnb S "||" S) expnb
  expnb        = *(expna S "&&" S) expna
  expna        = *(expn7 S ("=="/"!=") S) expn7
  expn7        = *(expn6 S ("<"/"<="/">"/">=") S) expn6
  expn6        = *(expn4 S ("+"/"-") S) expn4
  expn4        = *(expn3 S ("*"/"/"/"%") S) expn3
  expn3        = expn2 [S "^" S expn3]
  expn2        = ["!" S] expn1
  expn1        = expn0 / attref / constname / litval / id
  expn0        = "(" S expn S ")" / VARIABLE
  attref       = fieldnameref "." attname
  fieldnameref = fieldname / THIS
  attname      = ( U / C ) ( LENGTH / VALUE )
  litval       = ["-"] "0b" 1*("0"/"1")
               / ["-"] "0x" 1*(DIGIT/"a"/"b"/"c"/"d"/"e"/"f")
               / ["-"] 1*DIGIT
               / false / true

  ; lexical categories
  constname = UPCASE *(UPCASE / DIGIT / "_")
  id        = ALPHA *(ALPHA / DIGIT / "_")
  ALPHA     = %x41-5A / %x61-7A
  UPCASE    = %x41-5A
  DIGIT     = %x30-39
  COMMENT   = "//" *(SP / HTAB / VCHAR) CRLF
  SP        = %x20
  HTAB      = %x09
  VCHAR     = %x21-7E
  CRLF      = %x0A / %x0D.0A
  NL        = COMMENT / CRLF
  WS        = SP / HTAB / NL
  S         = *WS
  STRCHAR   = SP / HTAB / %x21 / %x23-7E
  STRQ      = %x22

  ; case-sensitive literals
  C            = %d67
  COMPRESSED   = %d67.79.77.80.82.69.83.83.69.68
  CONTROL      = %d67.79.78.84.82.79.76
  DEFAULT      = %d68.69.70.65.85.76.84
  ENFORCE      = %d69.78.70.79.82.67.69
  INITIAL      = %d73.78.73.84.73.65.76
  LENGTH       = %d76.69.78.71.84.72
  THIS         = %d84.72.73.83
  U            = %d85
  UNCOMPRESSED = %d85.78.67.79.77.80.82.69.83.83.69.68
  VALUE        = %d86.65.76.85.69
  VARIABLE     = %d86.65.82.73.65.66.76.69
  false        = %d102.97.108.115.101
  true         = %d116.114.117.101
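As an illustration of how a single rule of this grammar might be checked in practice, the following Python sketch transliterates the "litval" rule into a regular expression. It is not part of the specification; the names LITVAL and is_litval are hypothetical. It relies on the fact that quoted strings in ABNF [RFC4234] are case-insensitive, whereas "false" and "true" are spelled out with decimal character codes and are therefore case-sensitive.

```python
import re

# Hypothetical transliteration of the `litval` rule (not part of the spec).
# Quoted ABNF strings such as "0b", "0x" and "a".."f" are case-insensitive;
# `false` and `true` are given as %d codes and so are case-sensitive.
LITVAL = re.compile(
    r"-?0[bB][01]+"          # binary literal
    r"|-?0[xX][0-9a-fA-F]+"  # hexadecimal literal
    r"|-?[0-9]+"             # decimal literal
    r"|false|true"           # boolean literals
)


def is_litval(text: str) -> bool:
    """Return True if `text` is exactly one ROHC-FN literal value."""
    return LITVAL.fullmatch(text) is not None


for sample in ["0b101", "-0x1f", "42", "true", "TRUE", "0b", "1.5"]:
    print(f"{sample!r}: {is_litval(sample)}")
```

Only the first four samples are accepted: "TRUE" fails because the boolean literals are case-sensitive, "0b" has no digits, and "1.5" is not a form the rule allows.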
Appendix B. Bit-level Worked Example
This section gives a worked example at the bit level, showing how a simple ROHC-FN specification describes the compression of real data from an imaginary protocol header. The example used has been kept fairly simple, whilst still aiming to illustrate some of the intricacies that arise in use of the notation. In particular, fields have been kept short to make it possible to read the binary representation of the headers without too much difficulty.

B.1. Example Packet Format
Our imaginary header is just 16 bits long, and consists of the following fields:

  1. version number -- 2 bits
  2. type -- 2 bits
  3. flow id -- 4 bits
  4. sequence number -- 4 bits
  5. flag bits -- 4 bits

So for example 0101000100010000 indicates a header with a version number of one, a type of one, a flow id of one, a sequence number of one, and all flag bits set to zero.

Here is an ASCII box notation diagram of the imaginary header:

    0   1   2   3   4   5   6   7
  +---+---+---+---+---+---+---+---+
  |version| type  |    flow_id    |
  +---+---+---+---+---+---+---+---+
  |  sequence_no  |   flag_bits   |
  +---+---+---+---+---+---+---+---+
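To make the layout concrete, the following Python sketch splits a 16-bit header, written as a string of binary digits, into its five fields. It is only an illustration: the names FIELDS and parse_header are hypothetical, and the field identifiers are borrowed from the notation used later in this appendix.

```python
# Field layout from the list above: (name, width in bits), in wire order.
# FIELDS and parse_header are illustrative names, not part of ROHC-FN.
FIELDS = [
    ("version_no", 2),
    ("type", 2),
    ("flow_id", 4),
    ("sequence_no", 4),
    ("flag_bits", 4),
]


def parse_header(bits: str) -> dict:
    """Split a 16-character string of '0'/'1' into integer field values."""
    if len(bits) != 16 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 16 binary digits")
    fields, pos = {}, 0
    for name, width in FIELDS:
        fields[name] = int(bits[pos:pos + width], 2)
        pos += width
    return fields


print(parse_header("0101000100010000"))
```

For the example header this yields one for the version number, type, flow id, and sequence number, and zero for the flag bits, matching the description above.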
B.2. Initial Encoding
An initial definition based solely on the above information is as follows:

  eg_header
  {
    UNCOMPRESSED {
      version_no   [ 2 ];
      type         [ 2 ];
      flow_id      [ 4 ];
      sequence_no  [ 4 ];
      flag_bits    [ 4 ];
    }

    COMPRESSED initial_definition {
      version_no   =:= irregular(2);
      type         =:= irregular(2);
      flow_id      =:= irregular(4);
      sequence_no  =:= irregular(4);
      flag_bits    =:= irregular(4);
    }
  }

This defines the format nicely, but doesn't actually offer any compression. If we use it to encode the above header, we get:

  Uncompressed header: 0101000100010000
  Compressed header:   0101000100010000

This is because we have stated that all fields are "irregular" -- i.e., we haven't specified anything about their behaviour.

Note that since we have only one compressed format and one uncompressed format, it makes no difference whether the encoding methods for each field are specified in the compressed or uncompressed format. It would make no difference at all if we wrote the following instead:

  eg_header
  {
    UNCOMPRESSED {
      version_no   =:= irregular(2);
      type         =:= irregular(2);
      flow_id      =:= irregular(4);
      sequence_no  =:= irregular(4);
      flag_bits    =:= irregular(4);
    }

    COMPRESSED initial_definition {
      version_no   [ 2 ];
      type         [ 2 ];
      flow_id      [ 4 ];
      sequence_no  [ 4 ];
      flag_bits    [ 4 ];
    }
  }

B.3. Basic Compression
In order to achieve any compression we need to notate more knowledge about the header and its behaviour in a flow. For example, we may know the following facts about the header:

  1. version number -- indicates which version of the protocol this is: always one for this version of the protocol.
  2. type -- may take any value.
  3. flow id -- may take any value.
  4. sequence number -- may take any value.
  5. flag bits -- contains three flags, a, b, and c, each of which may be set or clear, and a reserved flag bit, which is always clear (i.e., zero).

We could notate this knowledge as follows:

  eg_header
  {
    UNCOMPRESSED {
      version_no     [ 2 ];
      type           [ 2 ];
      flow_id        [ 4 ];
      sequence_no    [ 4 ];
      abc_flag_bits  [ 3 ];
      reserved_flag  [ 1 ];
    }

    COMPRESSED basic {
      version_no     =:= uncompressed_value(2, 1) [ 0 ];
      type           =:= irregular(2)             [ 2 ];
      flow_id        =:= irregular(4)             [ 4 ];
      sequence_no    =:= irregular(4)             [ 4 ];
      abc_flag_bits  =:= irregular(3)             [ 3 ];
      reserved_flag  =:= uncompressed_value(1, 0) [ 0 ];
    }
  }

Using this simple scheme, we have successfully encoded the fact that one of the fields has a permanently fixed value of one, and therefore contains no useful information. We have also encoded the fact that the final flag bit is always zero, which again contains no useful information. Both of these facts have been notated using the "uncompressed_value" encoding method (see Section 4.11.1).

Using this new encoding on the above header, we get:

  Uncompressed header: 0101000100010000
  Compressed header:   0100010001000

This reduces the amount of data we need to transmit from 16 bits to 13, a saving of roughly 20%. However, this encoding fails to take advantage of relationships between values of a field in one packet and its value in subsequent packets. For example, every header in the following sequence is compressed by the same amount despite the similarities between them:

  Uncompressed header: 0101000100010000
  Compressed header:   0100010001000

  Uncompressed header: 0101000101000000
  Compressed header:   0100010100000

  Uncompressed header: 0110000101110000
  Compressed header:   1000010111000

B.4. Inter-Packet Compression
The profile we have defined so far has not compressed the sequence number or flow ID fields at all, since they can take any value. However the value of each of these fields in one header has a very simple relationship to their values in previous headers:

  o the sequence number is unusual -- it increases by three each time,

  o the flow_id stays the same -- it always has the same value that it did in the previous header in the flow,

  o the abc_flag_bits stay the same most of the time -- they usually have the same value that they did in the previous header in the flow.
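These relationships can be checked against the three example headers used in this appendix. The following Python sketch is only an illustration; the helper names and the HEADERS list are hypothetical, and the bit offsets follow the header layout of Appendix B.1 with the flag bits split into three abc flags and one reserved flag.

```python
# The three example headers used in this appendix, in flow order.
HEADERS = ["0101000100010000", "0101000101000000", "0110000101110000"]


def flow_id(bits: str) -> int:
    return int(bits[4:8], 2)      # bits 4-7 of the 16-bit header


def sequence_no(bits: str) -> int:
    return int(bits[8:12], 2)     # bits 8-11


def abc_flag_bits(bits: str) -> int:
    return int(bits[12:15], 2)    # bits 12-14 (bit 15 is the reserved flag)


for prev, cur in zip(HEADERS, HEADERS[1:]):
    # The sequence number is a 4-bit field, so the step is taken modulo 16.
    step = (sequence_no(cur) - sequence_no(prev)) % 16
    print(step,
          flow_id(cur) == flow_id(prev),
          abc_flag_bits(cur) == abc_flag_bits(prev))
```

For both consecutive pairs the step is three, and the flow id and abc flag bits are unchanged.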
An obvious way of notating this is as follows:

  // This obvious encoding will not work (correct encoding below)
  eg_header
  {
    UNCOMPRESSED {
      version_no     [ 2 ];
      type           [ 2 ];
      flow_id        [ 4 ];
      sequence_no    [ 4 ];
      abc_flag_bits  [ 3 ];
      reserved_flag  [ 1 ];
    }

    COMPRESSED obvious {
      version_no     =:= uncompressed_value(2, 1);
      type           =:= irregular(2);
      flow_id        =:= static;
      sequence_no    =:= lsb(0, -3);
      abc_flag_bits  =:= irregular(3);
      reserved_flag  =:= uncompressed_value(1, 0);
    }
  }

The dependency on previous packets is notated using the "static" and "lsb" encoding methods (see Section 4.11.4 and Section 4.11.5 respectively). However there are a few problems with the above notation.

Firstly, and most importantly, the "flow_id" field is notated as "static", which means that it doesn't change from packet to packet. However, the notation does not indicate how to communicate the value of the field initially. There is no point saying "it's the same value as last time" if there has not been a first time where we define what that value is, so that it can be referred back to. The above notation provides no way of communicating that. Similarly with the sequence number -- there needs to be a way of communicating its initial value. In fact, except for the explicit notation indicating their lengths, even the lengths of these two fields would be left undefined. This problem will be solved below, in Appendix B.5.

Secondly, the sequence number field is communicated very efficiently in zero bits, but it is not at all robust against packet loss. If a packet is lost then there is no way to handle the missing sequence number. When communicating sequence numbers, or any other field encoded with "lsb" encoding, a very important consideration for the notator is how robust against packet loss the compressed protocol should be. This will vary a lot from protocol stack to protocol stack. For the example protocol we'll assume short, low overhead flows and say we need to be robust to the loss of just one packet, which we can achieve with two bits of "lsb" encoding (one bit isn't enough since the sequence number increases by three each time -- see Section 4.11.5). This will be addressed below in Appendix B.5.

Finally, although the flag bits are usually the same as in the previous header in the flow, the profile doesn't make any use of this fact; since they are sometimes not the same as those in the previous header, it is not safe to say that they are always the same, so "static" encoding can't be used exclusively. This problem will be solved later through the use of multiple formats in Appendix B.6.

B.5. Specifying Initial Values
To communicate initial values for fields compressed with a context dependent encoding such as "static" or "lsb" we use an "INITIAL" field list. This can help with fields whose start value is fixed and known. For example, if we knew that at the start of the flow "flow_id" would always be 1 and "sequence_no" would always be 0, we could notate that like this:

  // This encoding will not work either (correct encoding below)
  eg_header
  {
    UNCOMPRESSED {
      version_no     [ 2 ];
      type           [ 2 ];
      flow_id        [ 4 ];
      sequence_no    [ 4 ];
      abc_flag_bits  [ 3 ];
      reserved_flag  [ 1 ];
    }

    INITIAL {
      // set initial values of fields before flow starts
      flow_id        =:= uncompressed_value(4, 1);
      sequence_no    =:= uncompressed_value(4, 0);
    }

    COMPRESSED obvious {
      version_no     =:= uncompressed_value(2, 1);
      type           =:= irregular(2);
      flow_id        =:= static;
      sequence_no    =:= lsb(2, -3);
      abc_flag_bits  =:= irregular(3);
      reserved_flag  =:= uncompressed_value(1, 0);
    }
  }

However, this use of "INITIAL" is no good since the initial values of both "flow_id" and "sequence_no" vary from flow to flow. "INITIAL" is only applicable where the initial value of a field is fixed, as is often the case with control fields.

B.6. Multiple Packet Formats
To communicate initial values for the sequence number and flow ID fields correctly, and to take advantage of the fact that the flag bits are usually the same as in the previous header, we need to depart from the single format encoding we are currently using and instead use multiple formats. Here, we have expressed the encodings for two of the fields in the uncompressed format, since they will always be true for uncompressed headers of that format. The remaining fields, whose encoding method may depend on exactly how the header is being compressed, have their encodings specified in the compressed formats.

  eg_header
  {
    UNCOMPRESSED {
      version_no     =:= uncompressed_value(2, 1) [ 2 ];
      type                                        [ 2 ];
      flow_id                                     [ 4 ];
      sequence_no                                 [ 4 ];
      abc_flag_bits                               [ 3 ];
      reserved_flag  =:= uncompressed_value(1, 0) [ 1 ];
    }

    COMPRESSED irregular_format {
      discriminator  =:= '0'          [ 1 ];
      version_no                      [ 0 ];
      type           =:= irregular(2) [ 2 ];
      flow_id        =:= irregular(4) [ 4 ];
      sequence_no    =:= irregular(4) [ 4 ];
      abc_flag_bits  =:= irregular(3) [ 3 ];
      reserved_flag                   [ 0 ];
    }

    COMPRESSED compressed_format {
      discriminator  =:= '1'          [ 1 ];
      version_no                      [ 0 ];
      type           =:= irregular(2) [ 2 ];
      flow_id        =:= static       [ 0 ];
      sequence_no    =:= lsb(2, -3)   [ 2 ];
      abc_flag_bits  =:= static       [ 0 ];
      reserved_flag                   [ 0 ];
    }
  }

Note that we have added a discriminator field, so that the decompressor can tell which format has been used by the compressor. The format with a "static" flow ID and "lsb" encoded sequence number is now 5 bits long. Note that despite having to add the discriminator field, this format is still the same size as the original incorrect "obvious" format because it takes advantage of the fact that the abc flag bits rarely change. However, the original "basic" format has also grown by one bit due to the addition of the discriminator ("irregular_format").

An important consideration when creating multiple formats is whether each format occurs frequently enough that the average compressed header length is shorter as a result of its usage. For example, if in fact the flag bits always changed between packets, the "compressed_format" encoding could never be used; all we would have achieved is lengthening the "basic" format by one bit.

Using the above notation, we now get:

  Uncompressed header: 0101000100010000
  Compressed header:   00100010001000

  Uncompressed header: 0101000101000000
  Compressed header:   10100 ; 00100010100000

  Uncompressed header: 0110000101110000
  Compressed header:   11011 ; 01000010111000

The first header in the stream is compressed the same way as before, except that it now has the extra 1-bit discriminator at the start (0). When a second header arrives with the same flow ID as the first and its sequence number three higher, it can be compressed in two possible ways: either by using "compressed_format" or, in the same way as previously, by using "irregular_format".

Note that we show all theoretically possible encodings of a header as defined by the ROHC-FN specification, separated by semi-colons. Either of the above encodings for each header could be produced by a valid implementation, although a good implementation would always aim to pick the encoding that leads to the best compression.
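The example output above can be reproduced with a small Python sketch that lists the candidate encodings for each header in turn. This is a simplified illustration rather than a ROHC implementation: all function names are hypothetical, "lsb(2, -3)" is modelled as simply sending the two least significant bits, and the check that the value falls inside the decompressor's interpretation interval is omitted.

```python
from typing import Optional


def fields(bits: str) -> dict:
    """Split a 16-bit header string into the fields that are not fixed."""
    return {
        "type": bits[2:4],
        "flow_id": bits[4:8],
        "sequence_no": bits[8:12],
        "abc_flag_bits": bits[12:15],
    }


def irregular_format(cur: dict) -> str:
    # Discriminator '0', then every variable field sent in full (14 bits).
    return ("0" + cur["type"] + cur["flow_id"]
            + cur["sequence_no"] + cur["abc_flag_bits"])


def compressed_format(cur: dict, prev: dict) -> Optional[str]:
    # "static" fields must equal their value in the previous header.
    if (cur["flow_id"] != prev["flow_id"]
            or cur["abc_flag_bits"] != prev["abc_flag_bits"]):
        return None
    # Simplification: send the two LSBs, with no interval check.
    return "1" + cur["type"] + cur["sequence_no"][-2:]


def encodings(headers):
    """Yield the list of possible encodings for each header, shortest first."""
    prev = None
    for bits in headers:
        cur = fields(bits)
        options = []
        if prev is not None:
            short = compressed_format(cur, prev)
            if short is not None:
                options.append(short)
        options.append(irregular_format(cur))
        yield options
        prev = cur


HEADERS = ["0101000100010000", "0101000101000000", "0110000101110000"]
for bits, options in zip(HEADERS, encodings(HEADERS)):
    print(bits, "->", " ; ".join(options))
```

The first header has no previous header to refer to, so only "irregular_format" is listed for it; the other two list both formats, as in the output above.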
A good implementation would also take robustness into account and therefore probably wouldn't assume on the second packet that the decompressor had available the context necessary to decompress the shorter "compressed_format" form.

Finally, note that the fields whose encoding methods are specified in the uncompressed format have zero length when compressed. This means their position in the compressed format is not significant. In this case, there is no need to notate them when defining the compressed formats. In the next part of the example we will see that they have been removed from the compressed formats altogether.

B.7. Variable Length Discriminators
Suppose we do some analysis on flows of our example protocol and discover that whilst it is usual for successive packets to have the same flags, on the occasions when they don't, the packet is almost always a "flags set" packet in which all three of the abc flags are set. To encode the flow more efficiently a format needs to be written to reflect this.

This now gives a total of three formats, which means we need three discriminators to differentiate between them. The obvious solution here is to increase the number of bits in the discriminator from one to two and use discriminators 00, 01, and 10 for example. However we can do slightly better than this.

Any uniquely identifiable discriminator will suffice, so we can use 00, 01, and 1. If the discriminator starts with 1, that's the whole thing. If it starts with 0, the decompressor knows it has to check one more bit to determine the kind of format.

Note that care must be taken when using variable length discriminators. For example, it would be erroneous to use 0, 01, and 10 as discriminators since after reading an initial 0, the decompressor would have no way of knowing if the next bit was a second bit of discriminator, or the first bit of the next field in the format. However, 0, 10, and 11 would be correct, as the first bit again indicates whether or not there are further discriminator bits to follow.
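One way to express the requirement above is that no discriminator may be a prefix of another (a "prefix-free" set; this term is ours and is not used by the notation). The following Python sketch checks the three candidate sets discussed above; the function name is hypothetical.

```python
from itertools import permutations


def is_prefix_free(discriminators) -> bool:
    """True if no discriminator is a prefix of a different one."""
    return not any(b.startswith(a)
                   for a, b in permutations(discriminators, 2))


for candidate in (["00", "01", "1"], ["0", "01", "10"], ["0", "10", "11"]):
    print(candidate, is_prefix_free(candidate))
```

The set 0, 01, 10 is rejected because 0 is a prefix of 01, which is exactly the ambiguity described above; the other two sets are accepted.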
This gives us the following:

  eg_header
  {
    UNCOMPRESSED {
      version_no     =:= uncompressed_value(2, 1) [ 2 ];
      type                                        [ 2 ];
      flow_id                                     [ 4 ];
      sequence_no                                 [ 4 ];
      abc_flag_bits                               [ 3 ];
      reserved_flag  =:= uncompressed_value(1, 0) [ 1 ];
    }

    COMPRESSED irregular_format {
      discriminator  =:= '00'         [ 2 ];
      type           =:= irregular(2) [ 2 ];
      flow_id        =:= irregular(4) [ 4 ];
      sequence_no    =:= irregular(4) [ 4 ];
      abc_flag_bits  =:= irregular(3) [ 3 ];
    }

    COMPRESSED flags_set {
      discriminator  =:= '01'                     [ 2 ];
      type           =:= irregular(2)             [ 2 ];
      flow_id        =:= static                   [ 0 ];
      sequence_no    =:= lsb(2, -3)               [ 2 ];
      abc_flag_bits  =:= uncompressed_value(3, 7) [ 0 ];
    }

    COMPRESSED flags_static {
      discriminator  =:= '1'          [ 1 ];
      type           =:= irregular(2) [ 2 ];
      flow_id        =:= static       [ 0 ];
      sequence_no    =:= lsb(2, -3)   [ 2 ];
      abc_flag_bits  =:= static       [ 0 ];
    }
  }

Here is some example output:

  Uncompressed header: 0101000100010000
  Compressed header:   000100010001000

  Uncompressed header: 0101000101000000
  Compressed header:   10100 ; 000100010100000

  Uncompressed header: 0110000101110000
  Compressed header:   11011 ; 001000010111000

  Uncompressed header: 0111000110101110
  Compressed header:   011110 ; 001100011010111

Here we have a very similar sequence to last time, except that there is now an extra message on the end that has the flag bits set. The encoding for the first message in the stream is now one bit larger. The encoding for the next two messages is the same as before: thanks to the use of variable length discriminators, that format has not grown. Finally, the packet that comes through with all the flag bits set can be encoded in just six bits, only one bit more than the most common format. Without the extra format, this last packet would have to be encoded using the longest format and would have taken up 14 bits.

B.8. Default Encoding
Some of the common encoding methods used so far have been "factored out" into the definition of the uncompressed format, meaning that they don't need to be defined for every compressed format. However, there is still some redundancy in the notation. For a number of fields, the same encoding method is used several times in different formats (though not necessarily in all of them), but the field encoding is redefined explicitly each time. If the encoding for any of these fields changed in the future, then every format that uses that encoding would have to be modified to reflect this change.

This problem can be avoided by specifying default encoding methods for these fields. Doing so can also lead to a more concisely notated profile:

  eg_header
  {
    UNCOMPRESSED {
      version_no     =:= uncompressed_value(2, 1) [ 2 ];
      type                                        [ 2 ];
      flow_id                                     [ 4 ];
      sequence_no                                 [ 4 ];
      abc_flag_bits                               [ 3 ];
      reserved_flag  =:= uncompressed_value(1, 0) [ 1 ];
    }

    DEFAULT {
      type           =:= irregular(2);
      flow_id        =:= static;
      sequence_no    =:= lsb(2, -3);
    }

    COMPRESSED irregular_format {
      discriminator  =:= '00'         [ 2 ];
      type                            [ 2 ]; // Uses default
      flow_id        =:= irregular(4) [ 4 ]; // Overrides default
      sequence_no    =:= irregular(4) [ 4 ]; // Overrides default
      abc_flag_bits  =:= irregular(3) [ 3 ];
    }

    COMPRESSED flags_set {
      discriminator  =:= '01'         [ 2 ];
      type                            [ 2 ]; // Uses default
      sequence_no                     [ 2 ]; // Uses default
      abc_flag_bits  =:= uncompressed_value(3, 7);
    }

    COMPRESSED flags_static {
      discriminator  =:= '1'          [ 1 ];
      type                            [ 2 ]; // Uses default
      sequence_no                     [ 2 ]; // Uses default
      abc_flag_bits  =:= static;
    }
  }

The above profile behaves in exactly the same way as the one notated previously, since it has the same meaning. Note that the purpose behind the different formats becomes clearer with the default encoding methods factored out: all that remains are the encodings that are specific to each format. Note also that default encoding methods that compress down to zero bits have become completely implicit. For example the compressed formats using the default encoding for "flow_id" don't mention it (the default is "static" encoding that compresses to zero bits).

B.9. Control Fields
One inefficiency in the compression scheme we have produced thus far is that it uses two bits to provide the "lsb" encoded sequence number with robustness for the loss of just one packet. In theory, only one bit should be needed. The root of the problem is the unusual sequence number that the protocol uses -- it counts up in increments of three. In order to encode it at maximum efficiency we need to translate this into a field that increments by one each time. We do this using a control field.
A control field is extra data that is communicated in the compressed format, but which is not a direct encoding of part of the uncompressed header. Control fields can be used to communicate extra information in the compressed format that allows other fields to be compressed more efficiently.

The control field that we introduce scales the sequence number down by a factor of three. Instead of encoding the original sequence number in the compressed packet, we encode the scaled sequence number, allowing us to have robustness to the loss of one packet by using just one bit of "lsb" encoding:

  eg_header
  {
    UNCOMPRESSED {
      version_no     =:= uncompressed_value(2, 1) [ 2 ];
      type                                        [ 2 ];
      flow_id                                     [ 4 ];
      sequence_no                                 [ 4 ];
      abc_flag_bits                               [ 3 ];
      reserved_flag  =:= uncompressed_value(1, 0) [ 1 ];
    }

    CONTROL {
      // need modulo maths to calculate scaling correctly,
      // due to 4 bit wrap around
      scaled_seq_no  [ 4 ];
      ENFORCE(sequence_no.UVALUE
                == (scaled_seq_no.UVALUE * 3) % 16);
    }

    DEFAULT {
      type           =:= irregular(2);
      flow_id        =:= static;
      scaled_seq_no  =:= lsb(1, -1);
    }

    COMPRESSED irregular_format {
      discriminator  =:= '00'         [ 2 ];
      type                            [ 2 ];
      flow_id        =:= irregular(4) [ 4 ];
      scaled_seq_no  =:= irregular(4) [ 4 ]; // Overrides default
      abc_flag_bits  =:= irregular(3) [ 3 ];
    }

    COMPRESSED flags_set {
      discriminator  =:= '01'         [ 2 ];
      type                            [ 2 ];
      scaled_seq_no                   [ 1 ]; // Uses default
      abc_flag_bits  =:= uncompressed_value(3, 7);
    }

    COMPRESSED flags_static {
      discriminator  =:= '1'          [ 1 ];
      type                            [ 2 ];
      scaled_seq_no                   [ 1 ]; // Uses default
      abc_flag_bits  =:= static;
    }
  }

Normally, the encoding method(s) used to encode a field specifies the length of the field. In the above notation, since there is no encoding method using "sequence_no" directly, its length needs to be defined explicitly using an "ENFORCE" statement. This is done using the abbreviated syntax, both for consistency and also for ease of readability. Note that this is unusual: whereas the majority of field length indications are redundant (and thus optional), this one isn't. If it was removed from the above notation, the length of the "sequence_no" field would be undefined.

Here is some example output:

  Uncompressed header: 0101000100010000
  Compressed header:   000100011011000

  Uncompressed header: 0101000101000000
  Compressed header:   1010 ; 000100011100000

  Uncompressed header: 0110000101110000
  Compressed header:   1101 ; 001000011101000

  Uncompressed header: 0111000110101110
  Compressed header:   01110 ; 001100011110111

In this form, we see that this gives us a saving of a further bit in most packets. Assuming the bulk of a flow is made up of "flags_static" headers, the mean size of the headers in a compressed flow is now just over a quarter of their size in an uncompressed flow.
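The values of "scaled_seq_no" in the output above follow directly from the "ENFORCE" statement in the "CONTROL" block. The following Python sketch, which is only an illustration and uses a hypothetical function name, finds the 4-bit value that satisfies the constraint for each sequence number in the example flow, and shows both its 4-bit form (as sent in "irregular_format") and its least significant bit (as sent in the other two formats).

```python
def scaled_seq_no(sequence_no: int) -> int:
    """Find the 4-bit value s with sequence_no == (s * 3) % 16."""
    # Brute force over the 16 candidates; 3 is coprime to 16, so exactly
    # one candidate satisfies the constraint for each sequence number.
    for s in range(16):
        if (s * 3) % 16 == sequence_no:
            return s
    raise ValueError("sequence_no must be in the range 0..15")


for seq in (1, 4, 7, 10):
    s = scaled_seq_no(seq)
    # 4-bit value for irregular_format, then its single least significant bit
    print(seq, format(s, "04b"), s & 1)
```

The sequence numbers 1, 4, 7, and 10 map to the scaled values 11, 12, 13, and 14, so the scaled field increments by one per packet even though the original field increments by three.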
B.10. Use of "ENFORCE" Statements as Conditionals
Earlier, we created a new format "flags_set" to handle packets with all three of the flag bits set. As it happens, these three flags are always all set for "type 3" packets, and are never all set for other packet types (a "type 3" packet is one where the type field is set to three).

This allows extra efficiency in encoding such packets. We know the type is three, so we don't need to encode the type field in the compressed header. The type field was previously encoded as "irregular(2)", which is two bits long. Removing this reduces the size of the "flags_set" format from five bits to three, making it the smallest format in the encoding method definition.

In order to notate that the "flags_set" format should only be used for "type 3" headers, and the "flags_static" format only when the type isn't three, it is necessary to state these conditions inside each format. This can be done with an "ENFORCE" statement:

  eg_header
  {
    UNCOMPRESSED {
      version_no     =:= uncompressed_value(2, 1) [ 2 ];
      type                                        [ 2 ];
      flow_id                                     [ 4 ];
      sequence_no                                 [ 4 ];
      abc_flag_bits                               [ 3 ];
      reserved_flag  =:= uncompressed_value(1, 0) [ 1 ];
    }

    CONTROL {
      // need modulo maths to calculate scaling correctly,
      // due to 4 bit wrap around
      scaled_seq_no  [ 4 ];
      ENFORCE(sequence_no.UVALUE
                == (scaled_seq_no.UVALUE * 3) % 16);
    }

    DEFAULT {
      type           =:= irregular(2);
      scaled_seq_no  =:= lsb(1, -1);
      flow_id        =:= static;
    }

    COMPRESSED irregular_format {
      discriminator  =:= '00'         [ 2 ];
      type                            [ 2 ];
      flow_id        =:= irregular(4) [ 4 ];
      scaled_seq_no  =:= irregular(4) [ 4 ];
      abc_flag_bits  =:= irregular(3) [ 3 ];
    }

    COMPRESSED flags_set {
      ENFORCE(type.UVALUE == 3); // redundant condition
      discriminator  =:= '01'                     [ 2 ];
      type           =:= uncompressed_value(2, 3) [ 0 ];
      scaled_seq_no                               [ 1 ];
      abc_flag_bits  =:= uncompressed_value(3, 7) [ 0 ];
    }

    COMPRESSED flags_static {
      ENFORCE(type.UVALUE != 3);
      discriminator  =:= '1'          [ 1 ];
      type                            [ 2 ];
      scaled_seq_no                   [ 1 ];
      abc_flag_bits  =:= static       [ 0 ];
    }
  }

The two "ENFORCE" statements in the last two formats act as "guards". Guards prevent formats from being used under the wrong circumstances. In fact, the "ENFORCE" statement in "flags_set" is redundant. The condition it guards for is already enforced by the new encoding method used for the "type" field. The encoding method "uncompressed_value(2,3)" binds the "UVALUE" attribute to three. This is exactly what the "ENFORCE" statement does, so it can be removed without any change in meaning. The "uncompressed_value" encoding method on the other hand is not redundant. It specifies other bindings on the type field in addition to the one that the "ENFORCE" statement specifies. Therefore it would not be possible to remove the encoding method and leave just the "ENFORCE" statement.

Note that a guard is solely preventative. A guard can never force a format to be chosen by the compressor. A format can only be guaranteed to be chosen in a given situation if there are no other formats that can be used instead. This is demonstrated in the example output below. The compressor can still choose the "irregular" format if it wishes:

  Uncompressed header: 0101000100010000
  Compressed header:   000100011011000

  Uncompressed header: 0101000101000000
  Compressed header:   1010 ; 000100011100000

  Uncompressed header: 0110000101110000
  Compressed header:   1101 ; 001000011101000

  Uncompressed header: 0111000110101110
  Compressed header:   010 ; 001100011110111

This saves just two extra bits (a 7% saving) in the example flow.

Authors' Addresses
Robert Finking
Siemens/Roke Manor Research
Old Salisbury Lane
Romsey, Hampshire  SO51 0ZN
UK

Phone: +44 (0)1794 833189
EMail: robert.finking@roke.co.uk
URI:   http://www.roke.co.uk

Ghyslain Pelletier
Ericsson
Box 920
Lulea  SE-971 28
Sweden

Phone: +46 (0) 8 404 29 43
EMail: ghyslain.pelletier@ericsson.com
Full Copyright Statement

Copyright (C) The IETF Trust (2007).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.

Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.