
RFC 3095

RObust Header Compression (ROHC): Framework and four profiles: RTP, UDP, ESP, and uncompressed

Pages: 168
Proposed Standard
Updated by:  3759, 4815
Part 2 of 7 – Pages 18 to 39

Top   ToC   RFC3095 - Page 18   prevText

4. Header compression framework

4.1. Operating assumptions

Cellular links, which are a primary target for ROHC, have a number of characteristics that are described briefly here. ROHC requires functionality from lower layers that is outlined here and more thoroughly described in the lower layer guidelines document [LLG].

   Channels

      ROHC header-compressed packets flow on channels.  Unlike many
      fixed links, some cellular radio links can have several channels
      connecting the same pair of nodes.  Each channel can have
      different characteristics in terms of error rate, bandwidth, etc.

   Context identifiers

      On some channels, the ability to transport multiple packet streams
      is required.  It can also be feasible to have channels dedicated
      to individual packet streams.  Therefore, ROHC uses a distinct
      context identifier space per channel and can eliminate context
      identifiers completely for one of the streams when few streams
      share a channel.

   Packet type indication

      Packet type indication is done in the header compression scheme
      itself.  Unless the link already has a way of indicating packet
      types which can be used, such as PPP, this provides smaller
      compressed headers overall.  It may also be less difficult to
      allocate a single packet type, rather than many, in order to run
      ROHC over links such as PPP.

   Reordering

      The channel between compressor and decompressor is required to
      maintain packet ordering, i.e., the decompressor must receive
      packets in the same order as the compressor sent them.
      (Reordering before the compression point, however, is dealt with,
      i.e., there is no assumption that the compressor will only receive
      packets in sequence.)

   Duplication

      The channel between compressor and decompressor is required to not
      duplicate packets.  (Duplication before the compression point,
      however, is dealt with, i.e., there is no assumption that the
      compressor will receive only one copy of each packet.)

   Packet length

      ROHC is designed under the assumption that lower layers indicate
      the length of a compressed packet.  ROHC packets do not contain
      length information for the payload.

   Framing

      The link layer must provide framing that makes it possible to
      distinguish frame boundaries and individual frames.

   Error detection/protection

      The ROHC scheme has been designed to cope with residual errors in
      the headers delivered to the decompressor.  CRCs and sanity checks
      are used to prevent or reduce damage propagation.  However, it is
      RECOMMENDED that lower layers deploy error detection for ROHC
      headers and do not deliver ROHC headers with high residual error
      rates.

      Without giving a hard limit on the residual error rate acceptable
      to ROHC, it is noted that for a residual bit error rate of at most
      1E-5, the ROHC scheme has been designed not to increase the number
      of damaged headers, i.e., the number of damaged headers due to
      damage propagation is designed to be less than the number of
      damaged headers caught by the ROHC error detection scheme.

   Negotiation

      In addition to the packet handling mechanisms above, the link
      layer MUST provide a way to negotiate header compression
      parameters, see also section 5.1.1.  (For unidirectional links,
      this negotiation may be performed out-of-band or even a priori.)

4.2. Dynamicity

The ROHC protocol achieves its compression gain by establishing state information at both ends of the link, i.e., at the compressor and at the decompressor. Different parts of the state are established at different times and with different frequency; hence, it can be said that some of the state information is more dynamic than the rest.

   Some state information is established at the time a channel is
   established; ROHC assumes the existence of an out-of-band negotiation
   protocol (such as PPP), or predefined channel state (most useful for
   unidirectional links).  In both cases, we speak of "negotiated
   channel state".  ROHC does not assume that this state can change
   dynamically during the channel lifetime (and does not explicitly
   support such changes, although some changes may be innocuous from a
   protocol point of view).  An example of negotiated channel state is
   the highest context ID number to be used by the compressor (MAX_CID).

   Other state information is associated with the individual packet
   streams in the channel; this state is said to be part of the context.
   Using context identifiers (CIDs), multiple packet streams with
   different contexts can share a channel.  The negotiated channel state
   indicates the highest context identifier to be used, as well as the
   selection of one of two ways to indicate the CID in the compressed
   header.

   It is up to the compressor to decide which packets to associate with
   a context (or, equivalently, which packets constitute a single
   stream); however, ROHC is efficient only when all packets of a stream
   share certain properties, such as having the same values for fields
   that are described as "static" in this document (e.g., the IP
   addresses, port numbers, and RTP parameters such as the payload
   type).  The efficiency of ROHC RTP also depends on the compressor
   seeing most RTP Sequence Numbers.

   Streams need not share all characteristics important for compression.
   ROHC has a notion of compression profiles: a compression profile
   denotes a predefined set of such characteristics.  To provide
   extensibility, the negotiated channel state includes the set of
   profiles acceptable to the decompressor.  The context state includes
   the profile currently in use for the context.

   Other elements of the context state may include the current values of
   all header fields (from these one can deduce whether an IPv4 header
   is present in the header chain, and whether UDP Checksums are
   enabled), as well as additional compression context that is not part
   of an uncompressed header, e.g., TS_STRIDE, IP-ID characteristics
   (incrementing as a 16-bit value in network byte order? random?), a
   number of old reference headers, and the compressor/decompressor
   state machines (see next section).
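
   The split between negotiated channel state and per-stream context
   state described above can be pictured as two small records.  The
   following is a minimal sketch only: the class names, the field
   names (large_cids, ip_id_behavior, reference_headers) and the use
   of strings as profile labels are illustrative assumptions, not
   structures defined by this document.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ChannelState:
    """Negotiated channel state: fixed for the lifetime of the channel."""
    max_cid: int           # highest context ID the compressor may use (MAX_CID)
    large_cids: bool       # hypothetical flag for "one of two ways to indicate the CID"
    profiles: frozenset    # profiles acceptable to the decompressor


@dataclass
class Context:
    """Per-stream context state; one instance per CID."""
    cid: int
    profile: str                                        # profile currently in use
    header_fields: dict = field(default_factory=dict)   # current header field values
    ts_stride: Optional[int] = None                     # not part of any header
    ip_id_behavior: str = "unknown"                     # e.g. "sequential" or "random"
    reference_headers: list = field(default_factory=list)


def new_context(channel: ChannelState, cid: int, profile: str) -> Context:
    """Create a context, enforcing the limits set by the channel state."""
    if not 0 <= cid <= channel.max_cid:
        raise ValueError("CID exceeds negotiated MAX_CID")
    if profile not in channel.profiles:
        raise ValueError("profile not acceptable to the decompressor")
    return Context(cid=cid, profile=profile)
```

   The channel record is immutable in this sketch because ROHC does not
   assume that negotiated channel state changes during the channel
   lifetime, whereas the context record is updated as packets flow.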

   This document actually defines four ROHC profiles: One uncompressed
   profile, the main ROHC RTP compression profile, and two variants of
   this profile for compression of packets with header chains that end
   in UDP and ESP, respectively, but where RTP compression is not
   applicable.  The descriptive text in the rest of this section is
   referring to the main ROHC RTP compression profile.

4.3. Compression and decompression states

Header compression with ROHC can be characterized as an interaction between two state machines, one compressor machine and one decompressor machine, each instantiated once per context. The compressor and the decompressor have three states each, which in many ways are related to each other even if the meanings of the states are slightly different for the two parties. Both machines start in the lowest compression state and transit gradually to higher states. Transitions need not be synchronized between the two machines. In normal operation it is only the compressor that temporarily transits back to lower states. The decompressor will transit back only when context damage is detected. Subsequent sections present an overview of the state machines and their corresponding states, respectively, starting with the compressor.

4.3.1. Compressor states

For ROHC compression, the three compressor states are the Initialization and Refresh (IR), First Order (FO), and Second Order (SO) states. The compressor starts in the lowest compression state (IR) and transits gradually to higher compression states. The compressor will always operate in the highest possible compression state, under the constraint that the compressor is sufficiently confident that the decompressor has the information necessary to decompress a header compressed according to that state.

   +----------+                +----------+                +----------+
   | IR State |   <-------->   | FO State |   <-------->   | SO State |
   +----------+                +----------+                +----------+

   Decisions about transitions between the various compression states
   are taken by the compressor on the basis of:

      - variations in packet headers
      - positive feedback from decompressor (Acknowledgments -- ACKs)
      - negative feedback from decompressor (Negative ACKs -- NACKs)
      - periodic timeouts (when operating in unidirectional mode, i.e.,
        over simplex channels or when feedback is not enabled)

   How transitions are performed is explained in detail in chapter 5 for
   each mode of operation.

4.3.1.1. Initialization and Refresh (IR) State

The purpose of the IR state is to initialize the static parts of the context at the decompressor or to recover after failure. In this state, the compressor sends complete header information. This includes all static and nonstatic fields in uncompressed form plus some additional information. The compressor stays in the IR state until it is fairly confident that the decompressor has received the static information correctly.

4.3.1.2. First Order (FO) State

The purpose of the FO state is to efficiently communicate irregularities in the packet stream. When operating in this state, the compressor rarely sends information about all dynamic fields, and the information sent is usually compressed at least partially. Only a few static fields can be updated. The difference between IR and FO should therefore be clear. The compressor enters this state from the IR state, and from the SO state whenever the headers of the packet stream do not conform to their previous pattern. It stays in the FO state until it is confident that the decompressor has acquired all the parameters of the new pattern. Changes in fields that are always irregular are communicated in all packets and are therefore part of what is a uniform pattern. Some or all packets sent in the FO state carry context updating information. It is very important to detect corruption of such packets to avoid erroneous updates and context inconsistencies.

4.3.1.3. Second Order (SO) State

This is the state where compression is optimal. The compressor enters the SO state when the header to be compressed is completely predictable given the SN (RTP Sequence Number) and the compressor is sufficiently confident that the decompressor has acquired all parameters of the functions from SN to other fields. Correct decompression of packets sent in the SO state only hinges on correct decompression of the SN. However, successful decompression also requires that the information sent in the preceding FO state packets has been successfully received by the decompressor.

   The compressor leaves this state and goes back to the FO state when
   the header no longer conforms to the uniform pattern and cannot be
   independently compressed on the basis of previous context
   information.

4.3.2. Decompressor states

The decompressor starts in its lowest compression state, "No Context" and gradually transits to higher states. The decompressor state machine normally never leaves the "Full Context" state once it has entered this state.

   +--------------+         +----------------+         +--------------+
   |  No Context  |  <--->  | Static Context |  <--->  | Full Context |
   +--------------+         +----------------+         +--------------+

   Initially, while working in the "No Context" state, the decompressor
   has not yet successfully decompressed a packet.  Once a packet has
   been decompressed correctly (for example, upon reception of an
   initialization packet with static and dynamic information), the
   decompressor can transit all the way to the "Full Context" state, and
   only upon repeated failures will it transit back to lower states.
   However, when that happens it first transits back to the "Static
   Context" state.  There, reception of any packet sent in the FO state
   is normally sufficient to enable transition to the "Full Context"
   state again.  Only when decompression of several packets sent in the
   FO state fails in the "Static Context" state will the decompressor go
   all the way back to the "No Context" state.

   When state transitions are performed is explained in detail in
   chapter 5.
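
   The decompressor transitions described in this section can be
   sketched as a small state machine.  This is an illustration under
   simplifying assumptions, not the logic of chapter 5: the class and
   method names are hypothetical, "repeated failures" is modelled as a
   run of max_failures consecutive failures, and a single boolean
   stands in for "a packet usable in the current state was
   decompressed correctly".

```python
from enum import Enum


class State(Enum):
    NO_CONTEXT = 0
    STATIC_CONTEXT = 1
    FULL_CONTEXT = 2


class Decompressor:
    def __init__(self, max_failures=3):
        # max_failures is a hypothetical threshold for "repeated failures".
        self.state = State.NO_CONTEXT
        self.failures = 0
        self.max_failures = max_failures

    def on_packet(self, ok):
        """Update the state after one decompression attempt and return it."""
        if ok:
            # A correctly decompressed packet leads to Full Context.
            self.failures = 0
            self.state = State.FULL_CONTEXT
            return self.state
        self.failures += 1
        if (self.failures >= self.max_failures
                and self.state is not State.NO_CONTEXT):
            # Step down one state at a time: Full -> Static -> No Context.
            self.state = State(self.state.value - 1)
            self.failures = 0
        return self.state
```

   With max_failures = 2, one success followed by two failures leaves
   the decompressor in Static Context; two further failures there take
   it back to No Context.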

4.4. Modes of operation

The ROHC scheme has three modes of operation, called Unidirectional, Bidirectional Optimistic, and Bidirectional Reliable mode. It is important to understand the difference between states, as described in the previous chapter, and modes. These abstractions are orthogonal to each other. The state abstraction is the same for all modes of operation, while the mode controls the logic of state transitions and what actions to perform in each state.

                         +----------------------+
                         |  Unidirectional Mode |
                         |   +--+  +--+  +--+   |
                         |   |IR|  |FO|  |SO|   |
                         |   +--+  +--+  +--+   |
                         +----------------------+
                           ^                  ^
                          /                    \
                         /                      \
                        v                        v
    +----------------------+                  +----------------------+
    |   Optimistic Mode    |                  |    Reliable Mode     |
    |   +--+  +--+  +--+   |                  |   +--+  +--+  +--+   |
    |   |IR|  |FO|  |SO|   | <--------------> |   |IR|  |FO|  |SO|   |
    |   +--+  +--+  +--+   |                  |   +--+  +--+  +--+   |
    +----------------------+                  +----------------------+

   The optimal mode to operate in depends on the characteristics of the
   environment of the compression protocol, such as feedback abilities,
   error probabilities and distributions, effects of header size
   variation, etc.  All ROHC implementations MUST implement and support
   all three modes of operation.  The three modes are briefly described
   in the following subsections.

   Detailed descriptions of the three modes of operation regarding
   compression and decompression logic are given in chapter 5.  The mode
   transition mechanisms, too, are described in chapter 5.

4.4.1. Unidirectional mode -- U-mode

When in the Unidirectional mode of operation, packets are sent in one direction only: from compressor to decompressor. This mode therefore makes ROHC usable over links where a return path from decompressor to compressor is unavailable or undesirable. In U-mode, transitions between compressor states are performed only on account of periodic timeouts and irregularities in the header field change patterns in the compressed packet stream. Due to the periodic refreshes and the lack of feedback for initiation of error recovery, compression in the Unidirectional mode will be less efficient and have a slightly higher probability of loss propagation compared to any of the Bidirectional modes. Compression with ROHC MUST start in the Unidirectional mode. Transition to any of the Bidirectional modes can be performed as soon as a packet has reached the decompressor and it has replied with a feedback packet indicating that a mode transition is desired (see chapter 5).

4.4.2. Bidirectional Optimistic mode -- O-mode

The Bidirectional Optimistic mode is similar to the Unidirectional mode. The difference is that a feedback channel is used to send error recovery requests and (optionally) acknowledgments of significant context updates from decompressor to compressor (not, however, for pure sequence number updates). Periodic refreshes are not used in the Bidirectional Optimistic mode. O-mode aims to maximize compression efficiency and sparse usage of the feedback channel. It reduces the number of damaged headers delivered to the upper layers due to residual errors or context invalidation. The frequency of context invalidation may be higher than for R-mode, in particular when long loss/error bursts occur. Refer to section 4.7 for more details.

4.4.3. Bidirectional Reliable mode -- R-mode

The Bidirectional Reliable mode differs in many ways from the previous two. The most important differences are a more intensive usage of the feedback channel and a stricter logic at both the compressor and the decompressor that prevents loss of context synchronization between compressor and decompressor except for very high residual bit error rates. Feedback is sent to acknowledge all context updates, including updates of the sequence number field. However, not every packet updates the context in Reliable mode. R-mode aims to maximize robustness against loss propagation and damage propagation, i.e., minimize the probability of context invalidation, even under header loss/error burst conditions. It may have a lower probability of context invalidation than O-mode, but a larger number of damaged headers may be delivered when the context actually is invalidated. Refer to section 4.7 for more details.

4.5. Encoding methods

This chapter describes the encoding methods used for header fields. How the methods are applied to each field (e.g., values of associated parameters) is specified in section 5.7.

4.5.1. Least Significant Bits (LSB) encoding

Least Significant Bits (LSB) encoding is used for header fields whose values are usually subject to small changes. With LSB encoding, the k least significant bits of the field value are transmitted instead of the original field value, where k is a positive integer. After receiving k bits, the decompressor derives the original value using a previously received value as reference (v_ref).

   The scheme is guaranteed to be correct if the compressor and the
   decompressor each use interpretation intervals

       1) in which the original value resides, and

       2) in which the original value is the only value that has the
          exact same k least significant bits as those transmitted.

   The interpretation interval can be described as a function f(v_ref,
   k).  Let

   f(v_ref, k) = [v_ref - p, v_ref + (2^k - 1) - p]

   where p is an integer.

         <------- interpretation interval (size is 2^k) ------->
         |-------------+---------------------------------------|
      v_ref - p        v_ref                        v_ref + (2^k-1) - p


   The function f has the following property: for any value k, the k
   least significant bits will uniquely identify a value in f(v_ref, k).

   The parameter p is introduced so that the interpretation interval can
   be shifted with respect to v_ref.  Choosing a good value for p will
   yield a more efficient encoding for fields with certain
   characteristics.  Below are some examples:

   a) For field values that are expected always to increase, p can be
      set to -1.  The interpretation interval becomes
      [v_ref + 1, v_ref + 2^k].

   b) For field values that stay the same or increase, p can be set to
      0.  The interpretation interval becomes [v_ref, v_ref + 2^k - 1].

   c) For field values that are expected to deviate only slightly from a
      constant value, p can be set to 2^(k-1) - 1.  The interpretation
      interval becomes [v_ref - 2^(k-1) + 1, v_ref + 2^(k-1)].

   d) For field values that are expected to undergo small negative
      changes and larger positive changes, such as the RTP TS for video,
      or RTP SN when there is misordering, p can be set to 2^(k-2) - 1.
      The interval becomes [v_ref - 2^(k-2) + 1, v_ref + 3 * 2^(k-2)],
      i.e., 3/4 of the interval is used for positive changes.

   The following is a simplified procedure for LSB compression and
   decompression; it is modified for robustness and damage propagation
   protection in the next subsection:

   1) The compressor (decompressor) always uses v_ref_c (v_ref_d), the
      last value that has been compressed (decompressed), as v_ref;

   2) When compressing a value v, the compressor finds the minimum value
      of k such that v falls into the interval f(v_ref_c, k).  Call this
      function k = g(v_ref_c, v).  When only a few distinct values of k
      are possible, for example due to limitations imposed by packet
      formats (see section 5.7), the compressor will instead pick the
      smallest permitted k that puts v in the interval f(v_ref_c, k).

   3) When receiving m LSBs, the decompressor uses the interpretation
      interval f(v_ref_d, m), called interval_d.  It picks as the
      decompressed value the one in interval_d whose LSBs match the
      received m bits.

   Note that the values to be encoded have a finite range; for example,
   the RTP SN ranges from 0 to 0xFFFF.  When the SN value is close to 0
   or 0xFFFF, the interpretation interval can straddle the wraparound
   boundary between 0 and 0xFFFF.
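
   The simplified procedure, including an interpretation interval that
   straddles the wraparound boundary, can be sketched as follows.  This
   is a minimal illustration, not a normative implementation: it
   assumes a 16-bit field such as the RTP SN and a fixed integer p (as
   in examples a) and b) above); the function names other than g are
   hypothetical.

```python
MOD = 1 << 16  # assumed field width: 16 bits, as for the RTP SN


def interval_low(v_ref, p):
    """Lower edge of f(v_ref, k) = [v_ref - p, v_ref + (2^k - 1) - p]."""
    return (v_ref - p) % MOD


def in_interval(v, v_ref, k, p):
    """True if v lies in f(v_ref, k), allowing for field wraparound."""
    return (v - interval_low(v_ref, p)) % MOD < (1 << k)


def g(v_ref, v, p, allowed_k=range(1, 17)):
    """Smallest permitted k such that v falls into f(v_ref, k)."""
    for k in allowed_k:
        if in_interval(v, v_ref, k, p):
            return k
    raise ValueError("no permitted k covers v")


def lsb_encode(v, k):
    """Keep only the k least significant bits of v."""
    return v & ((1 << k) - 1)


def lsb_decode(bits, m, v_ref, p):
    """Return the one value in f(v_ref, m) whose m LSBs equal bits."""
    low = interval_low(v_ref, p)
    # Distance from the lower edge to the first value with matching LSBs.
    offset = (bits - low) % (1 << m)
    return (low + offset) % MOD


# Reference 0xFFFE, new value 1: the interval wraps past 0xFFFF.
k = g(0xFFFE, 1, p=0)
assert lsb_decode(lsb_encode(1, k), k, 0xFFFE, p=0) == 1
```

   Decoding works across the wraparound because 2^m divides 2^16, so
   reducing modulo the field size leaves the m least significant bits
   unchanged.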

   The scheme is complicated by two factors: packet loss between the
   compressor and decompressor, and transmission errors undetected by
   the lower layer.  In the former case, the compressor and decompressor
   will lose the synchronization of v_ref, and thus also of the
   interpretation interval.  If v is still covered by the
   intersection(interval_c, interval_d), the decompression will be
   correct.  Otherwise, incorrect decompression will result.  The next
   section will address this issue further.

   In the case of undetected transmission errors, the corrupted LSBs
   will give an incorrectly decompressed value that will later be used
   as v_ref_d, which in turn is likely to lead to damage propagation.
   This problem is addressed by using a secure reference, i.e., a
   reference value whose correctness is verified by a protecting CRC.
   Consequently, the procedure 1) above is modified as follows:

   1) a) the compressor always uses as v_ref_c the last value that has
         been compressed and sent with a protecting CRC.
      b) the decompressor always uses as v_ref_d the last correct
         value, as verified by a successful CRC.

   Note that in U/O-mode, 1) b) is modified so that if decompression of
   the SN fails using the last verified SN reference, another
   decompression attempt is made using the last but one verified SN
   reference.  This procedure mitigates damage propagation when a small
   CRC fails to detect a damaged value.  See section 5.3.2.2.3 for
   further details.

4.5.2. Window-based LSB encoding (W-LSB encoding)

This section describes how to modify the simplified algorithm in 4.5.1 to achieve robustness.

   The compressor may not be able to determine the exact value of
   v_ref_d that will be used by the decompressor for a particular value
   v, since some candidates for v_ref_d may have been lost or damaged.
   However, by using feedback or by making reasonable assumptions, the
   compressor can limit the candidate set.  The compressor then
   calculates k such that no matter which v_ref_d in the candidate set
   the decompressor uses, v is covered by the resulting interval_d.

   Since the decompressor always uses as the reference the last received
   value where the CRC succeeded, the compressor maintains a sliding
   window containing the candidates for v_ref_d.  The sliding window is
   initially empty.  The following operations are performed on the
   sliding window by the compressor:

   1) After sending a value v (compressed or uncompressed) protected by
      a CRC, the compressor adds v to the sliding window.

   2) For each value v being compressed, the compressor chooses k =
      max(g(v_min, v), g(v_max, v)), where v_min and v_max are the
      minimum and maximum values in the sliding window, and g is the
      function defined in the previous section.

   3) When the compressor is sufficiently confident that a certain value
      v and all values older than v will not be used as reference by the
      decompressor, the window is advanced by removing those values
      (including v).  The confidence may be obtained by various means.
      In R-mode, an ACK from the decompressor implies that values older
      than the ACKed one can be removed from the sliding window.  In
      U/O-mode there is always a CRC to verify correct decompression,
      and a sliding window with a limited maximum width is used.  The
      window width is an implementation dependent optimization
      parameter.

   Note that the decompressor follows the procedure described in the
   previous section, except that in R-mode it MUST ACK each header
   received with a succeeding CRC (see also section 5.5).
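
   The compressor-side window operations can be sketched as below.
   This is a sketch under stated assumptions rather than a reference
   implementation: the class name WLsbWindow, the default width of 4,
   the choice to send all 16 bits when the window is empty, and the
   restriction to a fixed p are illustrative, and the min/max step
   assumes the window does not straddle the wraparound boundary.

```python
from collections import deque

MOD = 1 << 16  # assumed 16-bit field, as for the RTP SN


def g(v_ref, v, p=0):
    """Smallest k in 1..16 such that v lies in f(v_ref, k)."""
    low = (v_ref - p) % MOD
    for k in range(1, 17):
        if (v - low) % MOD < (1 << k):
            return k
    raise ValueError("unreachable for a 16-bit field")


class WLsbWindow:
    def __init__(self, max_width=4, p=0):
        # Bounded deque: in U/O-mode the oldest candidate is dropped
        # automatically once the maximum width is reached.
        self.window = deque(maxlen=max_width)
        self.p = p

    def choose_k(self, v):
        """Operation 2: k = max(g(v_min, v), g(v_max, v))."""
        if not self.window:
            return 16  # assumption: no reference yet, send the full field
        v_min, v_max = min(self.window), max(self.window)
        return max(g(v_min, v, self.p), g(v_max, v, self.p))

    def add(self, v):
        """Operation 1: v was sent protected by a CRC."""
        self.window.append(v)

    def ack(self, v):
        """R-mode: values older than the ACKed value are removed."""
        if v in self.window:
            while self.window[0] != v:
                self.window.popleft()
```

   For example, with 100, 101 and 102 in the window, value 103 needs
   k = 2 because the decompressor might still be using 100 as its
   reference; once 102 has been acknowledged, k = 1 suffices.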

4.5.3. Scaled RTP Timestamp encoding

The RTP Timestamp (TS) will usually not increase by an arbitrary number from packet to packet. Instead, the increase is normally an integral multiple of some unit (TS_STRIDE). For example, in the case of audio, the sample rate is normally 8 kHz and one voice frame may cover 20 ms. Furthermore, each voice frame is often carried in one RTP packet. In this case, the RTP increment is always n * 160 (= 8000 * 0.02), for some integer n. Note that silence periods have no impact on this, as the sample clock at the source normally keeps running without changing either frame rate or frame boundaries.

   In the case of video, there is usually a TS_STRIDE as well when the
   video frame level is considered.  The sample rate for most video
   codecs is 90 kHz.  If the video frame rate is fixed, say, to 30
   frames/second, the TS will increase by n * 3000 (= n * 90000 / 30)
   between video frames.  Note that a video frame is often divided into
   several RTP packets to increase robustness against packet loss.  In
   this case several RTP packets will carry the same TS.

   When using scaled RTP Timestamp encoding, the TS is downscaled by a
   factor of TS_STRIDE before compression.  This saves

      floor(log2(TS_STRIDE))

   bits for each compressed TS.  TS and TS_SCALED satisfy the following
   equality:

      TS = TS_SCALED * TS_STRIDE + TS_OFFSET

   TS_STRIDE is explicitly, and TS_OFFSET implicitly, communicated to
   the decompressor.  The following algorithm is used:

   1. Initialization: The compressor sends to the decompressor the value
      of TS_STRIDE and the absolute value of one or several TS fields.
      The latter are used by the decompressor to initialize TS_OFFSET to
      (absolute value) modulo TS_STRIDE.  Note that TS_OFFSET is the
      same regardless of which absolute value is used, as long as the
      unscaled TS value does not wrap around; see 4) below.

   2. Compression: After initialization, the compressor no longer
      compresses the original TS values.  Instead, it compresses the
      downscaled values: TS_SCALED = TS / TS_STRIDE.  The compression
      method could be either W-LSB encoding or the timer-based encoding
      described in the next section.

   3. Decompression: When receiving the compressed value of TS_SCALED,
      the decompressor first derives the value of the original
      TS_SCALED.  The original RTP TS is then calculated as TS =
      TS_SCALED * TS_STRIDE + TS_OFFSET.

   4. Offset at wraparound: Wraparound of the unscaled 32-bit TS will
      invalidate the current value of TS_OFFSET used in the equation
      above.  For example, let us assume TS_STRIDE = 160 = 0xA0 and the
      current TS = 0xFFFFFFF0.  TS_OFFSET is then 0x50 = 80.  Then if
      the next RTP TS = 0x00000130 (i.e., the increment is 160 * 2 =
      320), the new TS_OFFSET should be 0x00000130 modulo 0xA0 = 0x90 =
      144.  The compressor is not required to re-initialize TS_OFFSET at
      wraparound.  Instead, the decompressor MUST detect wraparound of
      the unscaled TS (which is trivial) and update TS_OFFSET to

         TS_OFFSET = (Wrapped around unscaled TS) modulo TS_STRIDE

   5. Interpretation interval at wraparound: Special rules are needed
      for the interpretation interval of the scaled TS at wraparound,
      since the maximum scaled TS, TSS_MAX, (0xFFFFFFFF / TS_STRIDE) may
      not have the form 2^m - 1.  For example, when TS_STRIDE is 160,
      the scaled TS is at most 26843545 which has LSBs 10011001.  The
      wraparound boundary between TSS_MAX and zero may thus not
      correspond to a natural boundary between LSBs.

               interpretation interval
          |<------------------------------>|

                       unused                       scaled TS
      ------------|--------------|---------------------->
                          TSS_MAX         zero

      When TSS_MAX is part of the interpretation interval, a number of
      unused values are inserted into it after TSS_MAX such that their
      LSBs follow naturally upon each other.  For example, for TS_STRIDE
      = 160 and k = 4, values corresponding to the LSBs 1010 through
      1111 are inserted.  The number of inserted values depends on k and
      the LSBs of the maximum scaled TS.  The number of valid values in
      the interpretation interval should be high enough to maintain
      robustness.  This can be ensured by the following rule:

            Let a be the number of LSBs needed if there was no
            wraparound, and let b be the number of LSBs needed to
            disambiguate between TSS_MAX and zero where the a LSBs of
            TSS_MAX are set to zero.  The number of LSB bits to send
            while TSS_MAX or zero is part of the interpretation interval
            is b.

   This scaling method can be applied to many frame-based codecs.
   However, the value of TS_STRIDE might change during a session, for
   example as a result of adaptation strategies.  If that happens, the
   unscaled TS is compressed until re-initialization of the new
   TS_STRIDE and TS_OFFSET is completed.

4.5.4. Timer-based compression of RTP Timestamp

The RTP Timestamp [RFC 1889] is defined to identify the number of the first sample used to generate the payload. When 1) RTP packets carry payloads corresponding to a fixed sampling interval, 2) the sampling is done at a constant rate, and 3) packets are generated in lock-step with sampling, then the timestamp value will closely approximate a linear function of the time of day. This is the case for conversational media, such as interactive speech. The linear ratio is determined by the source sample rate. The linear pattern can be complicated by packetization (e.g., in the case of video where a video frame usually corresponds to several RTP packets) or frame rearrangement (e.g., B-frames are sent out-of-order by some video codecs).

   With a fixed sample rate of 8 kHz, 20 ms in the time domain is
   equivalent to an increment of 160 in the unscaled TS domain, and to
   an increment of 1 in the scaled TS domain with TS_STRIDE = 160.

   As a consequence, the (scaled) TS of headers arriving at the
   decompressor will be a linear function of time of day, with some
   deviation due to the delay jitter (and the clock inaccuracies)
   between the source and the decompressor.  In normal operation, i.e.,
   no crashes or failures, the delay jitter will be bounded to meet the
   requirements of conversational real-time traffic.  Hence, by using a
   local clock the decompressor can obtain an approximation of the
   (scaled) TS in the header to be decompressed by considering its
   arrival time.  The approximation can then be refined with the k LSBs
   of the (scaled) TS carried in the header.  The value of k required to
   ensure correct decompression is a function of the jitter between the
   source and the decompressor.

   If the compressor knows the potential jitter introduced between
   compressor and decompressor, it can determine k by using a local
   clock to estimate jitter in packet arrival times, or alternatively it
   can use a fixed k and discard packets arriving too much out of time.

   The advantages of this scheme include:

   a) The size of the compressed TS is constant and small.  In
      particular, it does NOT depend on the length of silence intervals.
      This is in contrast to other TS compression techniques, which at
      the beginning of a talkspurt require sending a number of bits
      dependent on the duration of the preceding silence interval.

   b) No synchronization is required between the clock local to the
      compressor and the clock local to the decompressor.

   Note that although this scheme can be made to work using both scaled
   and unscaled TS, in practice it is always combined with scaled TS
   encoding because of the less demanding requirement on the clock
   resolution, e.g., 20 ms instead of 1/8 ms.  Therefore, the algorithm
   described below assumes that the clock-based encoding scheme operates
   on the scaled TS.  The case of unscaled TS would be similar, with
   changes to scale factors.

   The major task of the compressor is to determine the value of k.  Its
   sliding window now contains not only potential reference values for
   the TS but also their times of arrival at the compressor.

   1) The compressor maintains a sliding window

      {(T_j, a_j), for each header j that can be used as a reference},

      where T_j is the scaled TS for header j, and a_j is the arrival
      time of header j.  The sliding window serves the same purpose as
      the W-LSB sliding window of section 4.5.2.

   2) When a new header n arrives with T_n as the scaled TS, the
      compressor notes the arrival time a_n.  It then calculates

         Max_Jitter_BC =

            max {|(T_n - T_j) - ((a_n - a_j) / TIME_STRIDE)|,
               for all headers j in the sliding window},

      where TIME_STRIDE is the time interval equivalent to one
      TS_STRIDE, e.g., 20 ms.  Max_Jitter_BC is the maximum observed
      jitter before the compressor, in units of TS_STRIDE, for the
      headers in the sliding window.

   3) k is calculated as

            k = ceiling(log2(2 * J + 1)),

         where J = Max_Jitter_BC + Max_Jitter_CD + 2.

      Max_Jitter_CD is the upper bound of jitter expected on the
      communication channel between compressor and decompressor (CD-CC).
      It depends only on the characteristics of CD-CC.

      The constant 2 accounts for the quantization error introduced by
      the clocks at the compressor and decompressor, which can be +/-1.

      Note that the calculation of k follows the compression algorithm
      described in section 4.5.1, with p = 2^(k-1) - 1.

   4) The sliding window is subject to the same window operations as in
      section 4.5.2, 1) and 3), except that the values added and removed
      are paired with their arrival times.
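
   The compressor steps above can be illustrated with a short Python
   sketch.  It is not part of the specification: the function names
   max_jitter_bc and lsb_count are hypothetical, arrival times are
   assumed to be integer clock ticks, and Python floor division stands
   in for the integer arithmetic the text calls for.

```python
def max_jitter_bc(window, t_n, a_n, time_stride):
    """Step 2: largest observed jitter before the compressor.

    window      -- list of (T_j, a_j) pairs for the candidate references
    t_n, a_n    -- scaled TS and arrival time (clock ticks) of header n
    time_stride -- TIME_STRIDE expressed in the same clock ticks
    The result is in units of TS_STRIDE.
    """
    # Floor division is an assumption; the text only asks for integers.
    return max(abs((t_n - t_j) - (a_n - a_j) // time_stride)
               for t_j, a_j in window)


def lsb_count(jitter_bc, jitter_cd):
    """Step 3: k = ceiling(log2(2 * J + 1)), J = BC + CD + 2."""
    j = jitter_bc + jitter_cd + 2
    # For an integer x >= 1, ceiling(log2(x)) equals (x - 1).bit_length(),
    # which avoids floating-point rounding.  Here x = 2 * J + 1.
    return (2 * j).bit_length()


# Two reference headers 20 ticks apart, then a header that arrives late.
window = [(100, 2000), (101, 2020)]
bc = max_jitter_bc(window, t_n=102, a_n=2085, time_stride=20)
k = lsb_count(bc, jitter_cd=1)
```

   In this run the new header is two strides later than its timestamp
   suggests, so Max_Jitter_BC is 2; with an assumed Max_Jitter_CD of 1,
   J = 5 and k = 4.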

   Decompressor:

   1) The decompressor uses as its reference header the last correctly
      (as verified by CRC) decompressed header.  It maintains the pair
      (T_ref, a_ref), where T_ref is the scaled TS of the reference
      header, and a_ref is the arrival time of the reference header.

   2) When receiving a compressed header n at time a_n, the
      approximation of the original scaled TS is calculated as:

         T_approx = T_ref + (a_n - a_ref) / TIME_STRIDE.

   3) The approximation is then refined by the k least significant bits
      carried in header n, following the decompression algorithm of
      section 4.5.1, with p = 2^(k-1) - 1.

      Note: The algorithm does not assume any particular pattern in the
      packets arriving at the compressor, i.e., it tolerates reordering
      before the compressor and nonincreasing RTP Timestamp behavior.

      Note: Integer arithmetic is used in all equations above.  If
      TIME_STRIDE is not equal to an integral number of clock ticks,
      time must be normalized such that TIME_STRIDE is an integral
      number of clock ticks.  For example, if a clock tick is 20 ms and
      TIME_STRIDE is 30 ms, (a_n - a_ref) in 2) can be multiplied by 2
      and TIME_STRIDE can have the value 3.

      Note: The clock resolution of the compressor or decompressor can
      be worse than TIME_STRIDE, in which case the difference, i.e.,
      actual resolution - TIME_STRIDE, is treated as additional jitter
      in the calculation of k.

      Note: The clock resolution of the decompressor may be communicated
      to the compressor using the CLOCK feedback option.

      Note: The decompressor may observe the jitter and report this to
      the compressor using the JITTER feedback option.  The compressor
      may use this information to refine its estimate of Max_Jitter_CD.
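
   Decompressor steps 1) to 3) can be sketched as follows.  This is an
   illustration only: the function name timer_based_decode is
   hypothetical, k is assumed to be at least 1, arrival times are
   assumed to be integer clock ticks, and TS wraparound is ignored.
   The refinement picks the single value in the interpretation interval
   [T_approx - p, T_approx + (2^k - 1) - p] whose k LSBs match the
   received bits, as in the LSB decoding of section 4.5.1.

```python
def timer_based_decode(t_ref, a_ref, a_n, lsbs, k, time_stride):
    """Recover the scaled TS of header n from its arrival time and k LSBs.

    t_ref, a_ref -- scaled TS and arrival time of the reference header
    a_n          -- arrival time of the header being decompressed
    lsbs         -- the k least significant bits carried in header n
    time_stride  -- TIME_STRIDE in clock ticks (same unit as a_ref, a_n)
    """
    # Step 2: approximate the scaled TS from the elapsed time.
    t_approx = t_ref + (a_n - a_ref) // time_stride

    # Step 3: refine with the received LSBs, using p = 2^(k-1) - 1.
    p = (1 << (k - 1)) - 1
    low = t_approx - p                 # lower end of the interval
    mask = (1 << k) - 1
    # Distance from 'low' to the only value in the 2^k-wide interval
    # whose k LSBs equal 'lsbs'.
    return low + ((lsbs - low) & mask)


# The true scaled TS is 102 but the header arrives two strides late.
ts = timer_based_decode(t_ref=100, a_ref=2000, a_n=2085,
                        lsbs=102 & 0xF, k=4, time_stride=20)
```

   Here T_approx is 104, the interval is [97, 112], and the decoded
   value is 102.  A real decompressor would additionally verify the
   result with the header CRC before updating (T_ref, a_ref).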

4.5.5. Offset IP-ID encoding

As all IPv4 packets have an IP Identifier to allow for fragmentation, ROHC provides for transparent compression of this ID. There is no explicit support in ROHC for the IPv6 fragmentation header, so there is never a need to discuss IP IDs outside the context of IPv4.

This section assumes (initially) that the IPv4 stack at the source host assigns IP-ID according to the value of a 2-byte counter which is increased by one after each assignment to an outgoing packet. Therefore, the IP-ID field of a particular IPv4 packet flow will increment by 1 from packet to packet except when the source has emitted intermediate packets not belonging to that flow.

For such IPv4 stacks, the RTP SN will increase by 1 for each packet emitted and the IP-ID will increase by at least the same amount. Thus, it is more efficient to compress the offset, i.e., (IP-ID - RTP SN), instead of IP-ID itself.

The remainder of section 4.5.5 describes how to compress/decompress the sequence of offsets using W-LSB encoding/decoding, with p = 0 (see section 4.5.1). All IP-ID arithmetic is done using unsigned 16-bit quantities, i.e., modulo 2^16.

   Compressor:

      The compressor uses W-LSB encoding (section 4.5.2) to compress a
      sequence of offsets

         Offset_i = ID_i - SN_i,

      where ID_i and SN_i are the values of the IP-ID and RTP SN of
      header i.  The sliding window contains such offsets and not the
      values of header fields, but the rules for adding and deleting
      offsets from the window otherwise follow section 4.5.2.

   Decompressor:

      The reference header is the last correctly (as verified by CRC)
      decompressed header.

      When receiving a compressed packet m, the decompressor calculates
      Offset_ref = ID_ref - SN_ref, where ID_ref and SN_ref are the
      values of IP-ID and RTP SN in the reference header, respectively.

      Then W-LSB decoding is used to decompress Offset_m, using the
      received LSBs in packet m and Offset_ref.  Note that m may contain
      zero LSBs for Offset_m, in which case Offset_m = Offset_ref.

         Finally, the IP-ID for packet m is regenerated as

         IP-ID for m = decompressed SN of packet m + Offset_m

   Network byte order:

      Some IPv4 stacks do use a counter to generate IP ID values as
      described, but do not transmit the contents of this counter in
      network byte order, but instead send the two octets reversed.  In
      this case, the compressor can compress the IP-ID field after
      swapping the bytes.  Consequently, the decompressor also swaps the
      bytes of the IP-ID after decompression to regenerate the original
      IP-ID.  This requires that the compressor and the decompressor
      synchronize on the byte order of the IP-ID field using the NBO or
      NBO2 flag (see section 5.7).

   Random IP Identifier:

      Some IPv4 stacks generate the IP Identifier values using a
      pseudo-random number generator.  While this may provide some
      security benefits, it makes it pointless to attempt compressing
      the field.  Therefore, the compressor should detect such random
      behavior of the field.  After detection and synchronization with
      the decompressor using the RND or RND2 flag, the field is sent
      as-is in its entirety as additional octets after the compressed
      header.
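
   The offset arithmetic of this section can be sketched in Python as
   shown below.  This is a simplified illustration, not the normative
   procedure: all function names and the boolean nbo parameter are
   hypothetical, and the compressor side uses a single reference value
   rather than a full W-LSB sliding window.

```python
MOD16 = 1 << 16  # IP-ID arithmetic is unsigned 16-bit, i.e., modulo 2^16


def swap16(x):
    """Exchange the two octets of a 16-bit value."""
    return ((x & 0xFF) << 8) | (x >> 8)


def ip_id_offset(ip_id, sn, nbo=True):
    """Offset_i = ID_i - SN_i; swap first if the ID is not in network order."""
    if not nbo:
        ip_id = swap16(ip_id)
    return (ip_id - sn) % MOD16


def offset_lsbs(offset, k):
    """The k least significant bits of an offset (k may be 0)."""
    return offset & ((1 << k) - 1)


def decode_offset(offset_ref, lsbs, k):
    """LSB decoding with p = 0: interval [ref, ref + 2^k - 1] mod 2^16."""
    mask = (1 << k) - 1
    # With k = 0 the mask is 0, so the result is offset_ref itself.
    return (offset_ref + ((lsbs - offset_ref) & mask)) % MOD16


def regenerate_ip_id(sn, offset, nbo=True):
    """IP-ID for m = decompressed SN of packet m + Offset_m."""
    ip_id = (sn + offset) % MOD16
    return ip_id if nbo else swap16(ip_id)


# Reference header: IP-ID 1000, SN 50.  Next header of the flow: SN 51,
# IP-ID 1003 because the host sent two unrelated packets in between.
offset_ref = ip_id_offset(1000, 50)
bits = offset_lsbs(ip_id_offset(1003, 51), k=2)
offset_m = decode_offset(offset_ref, bits, k=2)
ip_id_m = regenerate_ip_id(51, offset_m)
```

   The offset moves only from 950 to 952 while the IP-ID itself moves
   from 1000 to 1003, and two LSBs are enough to recover it.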

4.5.6. Self-describing variable-length values

The values of TS_STRIDE and a few other compression parameters can vary widely. TS_STRIDE can be 160 for voice and 90 000 for 1 f/s video. To optimize the transfer of such values, a variable number of octets is used to encode them. The number of octets used is determined by the first few bits of the first octet:

   First bit is 0: 1 octet.
            7 bits transferred.
            Up to 127 decimal.
            Encoded octets in hexadecimal: 00 to 7F

   First bits are 10: 2 octets.
            14 bits transferred.
            Up to 16 383 decimal.
            Encoded octets in hexadecimal: 80 00 to BF FF

   First bits are 110: 3 octets.
            21 bits transferred.
            Up to 2 097 151 decimal.
            Encoded octets in hexadecimal: C0 00 00 to DF FF FF

   First bits are 111: 4 octets.
            29 bits transferred.
            Up to 536 870 911 decimal.
            Encoded octets in hexadecimal: E0 00 00 00 to FF FF FF FF
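
   As an illustration of the four formats above, the following Python
   sketch encodes and decodes such values.  The names sdvl_encode and
   sdvl_decode are hypothetical, and choosing the shortest format that
   fits the value is an assumption of this sketch.

```python
def sdvl_encode(value):
    """Encode a non-negative integer in 1 to 4 octets."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value < (1 << 7):
        return value.to_bytes(1, "big")                     # prefix 0
    if value < (1 << 14):
        return ((0b10 << 14) | value).to_bytes(2, "big")    # prefix 10
    if value < (1 << 21):
        return ((0b110 << 21) | value).to_bytes(3, "big")   # prefix 110
    if value < (1 << 29):
        return ((0b111 << 29) | value).to_bytes(4, "big")   # prefix 111
    raise ValueError("value does not fit in 29 bits")


def sdvl_decode(data):
    """Return (value, octets consumed) for the value at the start of data."""
    first = data[0]
    if (first & 0x80) == 0x00:
        length, bits = 1, 7
    elif (first & 0xC0) == 0x80:
        length, bits = 2, 14
    elif (first & 0xE0) == 0xC0:
        length, bits = 3, 21
    else:
        length, bits = 4, 29
    if len(data) < length:
        raise ValueError("truncated value")
    raw = int.from_bytes(data[:length], "big")
    return raw & ((1 << bits) - 1), length   # strip the prefix bits


voice = sdvl_encode(160)      # TS_STRIDE for voice, from the text above
video = sdvl_encode(90000)    # TS_STRIDE for 1 f/s video
```

   With these inputs, 160 takes two octets (80 A0) and 90 000 takes
   three (C1 5F 90).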

4.5.7. Encoded values across several fields in compressed headers

When a compressed header has an extension, pieces of an encoded value can be present in more than one field. When an encoded value is split over several fields in this manner, the more significant bits of the value are closer to the beginning of the header. If the number of bits available in compressed header fields exceeds the number of bits in the value, the most significant field is padded with zeroes in its most significant bits.

For example, an unscaled TS value can be transferred using a UOR-2 header (see section 5.7) with an extension of type 3. The Tsc bit of the extension is then unset (zero) and the variable length TS field of the extension is 4 octets, with 29 bits available for the TS (see section 4.5.6). The UOR-2 TS field will contain the three most significant bits of the unscaled TS, and the 4-octet TS field in the extension will contain the remaining 29 bits.
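
The splitting rule can be sketched in Python as follows. The helper names split_across_fields and join_fields are hypothetical, and the 6-bit width used for the first field in the second call is only an assumed width chosen to show the zero padding.

```python
def split_across_fields(value, field_widths):
    """Split value over fields listed in header order.

    The most significant bits go into the first field.  If the fields
    offer more bits than the value needs, the shift below leaves the
    surplus as zeroes in the most significant bits of the first field.
    """
    total = sum(field_widths)
    if value < 0 or value >> total:
        raise ValueError("value does not fit in the available bits")
    fields = []
    remaining = total
    for width in field_widths:
        remaining -= width
        fields.append((value >> remaining) & ((1 << width) - 1))
    return fields


def join_fields(fields, field_widths):
    """Reassemble the value from its fields, first field most significant."""
    value = 0
    for field, width in zip(fields, field_widths):
        value = (value << width) | field
    return value


ts = 0xDEADBEEF                               # a 32-bit unscaled TS
exact = split_across_fields(ts, [3, 29])      # 3 MSBs + 29 remaining bits
padded = split_across_fields(ts, [6, 29])     # same bits, zero-padded field
```

Both calls yield the same two field values, since the three extra bits of the wider first field are zero; join_fields restores the original TS in either case.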

4.6. Errors caused by residual errors

ROHC is designed under the assumption that packets can be damaged between the compressor and decompressor, and that such damaged packets can be delivered to the decompressor ("residual errors").

Residual errors may damage the SN in compressed headers. Such damage will cause generation of a header which upper layers may not be able to distinguish from a correct header. When the compressed header contains a CRC, the CRC will catch the bad header with a probability dependent on the size of the CRC. When ROHC does not detect the bad header, it will be delivered to upper layers.

Damage is not confined to the SN:

   a) Damage to packet type indication bits can cause a header to be
      interpreted as having a different packet type.

   b) Damage to CID information may cause a packet to be interpreted
      according to another context and possibly also according to
      another profile.  Damage to CIDs will be more harmful when a large
      part of the CID space is being used, so that it is likely that the
      damaged CID corresponds to an active context.

   c) Feedback information can also be subject to residual errors, both
      when feedback is piggybacked and when it is sent in separate ROHC
      packets.  ROHC uses sanity checks and adds CRCs to vital feedback
      information to allow detection of some damaged feedback.

      Note that context damage can also result in generation of
      incorrect headers; section 4.7 elaborates further on this.

4.7. Impairment considerations

Impairments to headers can be classified into the following types:

     (1) the lower layer was not able to decode the packet and did not
         deliver it to ROHC,

     (2) the lower layer was able to decode the packet, but discarded
         it because of a detected error,

     (3) ROHC detected an error in the generated header and discarded
         the packet, or

     (4) ROHC did not detect that the regenerated header was damaged
         and delivered it to upper layers.

Impairments cause loss or damage of individual headers. Some impairment scenarios also cause context invalidation, which in turn results in loss propagation and damage propagation. Damage propagation and undetected residual errors both contribute to the number of damaged headers delivered to upper layers. Loss propagation and impairments resulting in loss or discarding of single packets both contribute to the packet loss seen by upper layers.

Examples of context invalidating scenarios are:

     (a) Impairment of type (4) on the forward channel, causing the
         decompressor to update its context with incorrect information;

     (b) Loss/error burst of pattern update headers: Impairments of
         types (1),(2) and (3) on consecutive pattern update headers; a
         pattern update header is a header carrying new pattern
         information, e.g., at the beginning of a new talk spurt; this
         causes the decompressor to lose the pattern update
         information;

     (c) Loss/error burst of headers: Impairments of types (1),(2) and
         (3) on a number of consecutive headers that is large enough to
         cause the decompressor to lose the SN synchronization;

     (d) Impairment of type (4) on the feedback channel which mimics a
         valid ACK and makes the compressor update its context;

     (e) a burst of damaged headers (3) erroneously triggers the "k-
         out-of-n" rule for detecting context invalidation, which
         results in a NACK/update sequence during which headers are
         discarded.

   Scenario (a) is mitigated by the CRC carried in all context updating
   headers.  The larger the CRC, the lower the chance of context
   invalidation caused by (a).  In R-mode, the CRC of context updating
   headers is always 7 bits or more.  In U/O-mode, it is usually 3 bits
   and sometimes 7 or 8 bits.

   Scenario (b) is almost completely eliminated when the compressor
   ensures through ACKs that no context updating headers are lost, as in
   R-mode.

   Scenario (c) is almost completely eliminated when the compressor
   ensures through ACKs that the decompressor will always detect the SN
   wraparound, as in R-mode.  It is also mitigated by the SN repair
   mechanisms in U/O-mode.

   Scenario (d) happens only when the compressor receives a damaged
   header that mimics an ACK of some header present in the W-LSB window,
   say ACK of header 2, while in reality header 2 was never received or
   accepted by the decompressor, i.e., header 2 was subject to
   impairment (1), (2) or (3).  The damaged header must mimic the
   feedback packet type, the ACK feedback type, and the SN LSBs of some
   header in the W-LSB window.

   Scenario (e) happens when a burst of residual errors causes the CRC
   check to fail in k out of the last n headers carrying CRCs.  Large k
   and n reduce the probability of scenario (e), but also increase the
   number of headers lost or damaged as a consequence of any context
   invalidation.
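
   The trade-off in scenario (e) can be made concrete with a minimal
   Python sketch of a "k-out-of-n" detector.  The class name KOutOfN
   and its interface are hypothetical, and the sketch only counts CRC
   failures over the last n CRC-carrying headers; it does not reproduce
   the complete decompressor logic.

```python
from collections import deque


class KOutOfN:
    """Flag context invalidation when k of the last n CRC checks failed."""

    def __init__(self, k, n):
        if not 1 <= k <= n:
            raise ValueError("need 1 <= k <= n")
        self.k = k
        # Keeps only the n most recent outcomes; True means CRC failure.
        self.results = deque(maxlen=n)

    def record(self, crc_failed):
        """Record one CRC outcome and report whether the rule triggers."""
        self.results.append(bool(crc_failed))
        return sum(self.results) >= self.k


detector = KOutOfN(k=3, n=5)
outcomes = [False, True, False, True, True, False, False]
triggered = [detector.record(failed) for failed in outcomes]
```

   The rule first triggers at the fifth header, when three of the last
   five checks have failed, and clears again once enough successful
   checks push the failures out of the window.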
   ROHC detects damaged headers using CRCs over the original headers.
   The smallest headers in this document either include a 3-bit CRC
   (U/O-mode) or do not include a CRC (R-mode).  For the smallest
   headers, damage is thus detected with a probability of roughly 7/8
   for U/O-mode.  For R-mode, damage to the smallest headers is not
   detected.
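
   The 7/8 figure follows from a simple model, sketched below in
   Python.  The model is an assumption: a damaged header is taken to
   pass an n-bit CRC with probability 2^-n, which real error patterns
   only approximate.  The function name is hypothetical.

```python
from fractions import Fraction


def crc_detection_probability(crc_bits):
    """Chance that an n-bit CRC rejects a damaged header (idealized)."""
    # A header without a CRC (crc_bits = 0) is never rejected.
    return 1 - Fraction(1, 2 ** crc_bits)


p3 = crc_detection_probability(3)   # smallest U/O-mode headers
p0 = crc_detection_probability(0)   # smallest R-mode headers, no CRC
p7 = crc_detection_probability(7)   # 7-bit CRC
```

   Under this model a 3-bit CRC gives 7/8, no CRC gives 0, and a 7-bit
   CRC gives 127/128.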

   All other things (coding scheme at lower layers, etc.) being equal,
   the rate of headers damaged by residual errors will be lower when
   headers are compressed compared to when they are not, since fewer bits
   are transmitted.  Consequently, for a given ROHC CRC setup the rate
   of incorrect headers delivered to applications will also be reduced.

   The above analysis suggests that U/O-mode may be more prone than R-
   mode to context invalidation.  On the other hand, the CRC present in
   all U/O-mode headers continuously screens out residual errors coming
   from lower layers, reduces the number of damaged headers delivered to
   upper layers when context is invalidated, and permits quick detection
   of context invalidation.

   R-mode always uses a stronger CRC on context updating headers, but no
   CRC in other headers.  A residual error on a header which carries no
   CRC will result in a damaged header being delivered to upper layers
   (4).  The number of damaged headers delivered to the upper layers
   depends on the ratio of headers with CRC vs. headers without CRC,
   which is a compressor parameter.


