RFC 6386

VP8 Data Format and Decoding Guide

Pages: 304
Informational

17.  Motion Vector Decoding

   As discussed above, motion vectors appear in two places in the VP8
   datastream: applied to whole macroblocks in NEWMV mode and applied to
   subsets of macroblocks in NEW4x4 mode.  The format of the vectors is
   identical in both cases.

   Each vector has two pieces: a vertical component (row) followed by a
   horizontal component (column).  The row and column use separate
   coding probabilities but are otherwise represented identically.

17.1.  Coding of Each Component

   Each component is a signed integer V representing a vertical or
   horizontal luma displacement of V quarter-pixels (and a chroma
   displacement of V eighth-pixels).  The absolute value of V, if
   non-zero, is followed by a boolean sign.  V may take any value
   between -1023 and +1023, inclusive.

   The absolute value A is coded in one of two different ways according
   to its size.  For 0 <= A <= 7, A is tree-coded, and for 8 <= A <=
   1023, the bits in the binary expansion of A are coded using
   independent boolean probabilities.  The coding of A begins with a
   bool specifying which range is in effect.

   Decoding a motion vector component then requires a 19-position
   probability table, whose offsets, along with the procedure used to
   decode components, are as follows:

   ---- Begin code block --------------------------------------

   typedef enum
   {
       mvpis_short,         /* short (<= 7) vs long (>= 8) */
       MVPsign,             /* sign for non-zero */
       MVPshort,            /* 8 short values = 7-position tree */

       MVPbits = MVPshort + 7,      /* 10 long value bits
                                       w/independent probs */

       MVPcount = MVPbits + 10      /* 19 probabilities in total */
   }
   MVPindices;

   typedef Prob MV_CONTEXT [MVPcount];    /* Decoding spec for
                                             a single component */

   /* Tree used for small absolute values (has expected
      correspondence). */

   const tree_index small_mvtree [2 * (8 - 1)] =
   {
    2, 8,          /* "0" subtree, "1" subtree */
     4, 6,         /* "00" subtree, "01" subtree */
      -0, -1,      /* 0 = "000", 1 = "001" */
      -2, -3,      /* 2 = "010", 3 = "011" */
     10, 12,       /* "10" subtree, "11" subtree */
      -4, -5,      /* 4 = "100", 5 = "101" */
      -6, -7       /* 6 = "110", 7 = "111" */
   };

   /* Read MV component at current decoder position, using
      supplied probs. */

   int read_mvcomponent(bool_decoder *d, const MV_CONTEXT *mvc)
   {
       const Prob * const p = (const Prob *) mvc;

       int A = 0;

       if (read_bool(d, p [mvpis_short]))    /* 8 <= A <= 1023 */
       {
           /* Read bits 0, 1, 2 */

           int i = 0;
           do { A += read_bool(d, p [MVPbits + i]) << i;}
             while (++i < 3);

           /* Read bits 9, 8, 7, 6, 5, 4 */

           i = 9;
           do { A += read_bool(d, p [MVPbits + i]) << i;}
             while (--i > 3);

           /* We know that A >= 8 because it is coded long,
              so if A <= 15, bit 3 is one and is not
              explicitly coded. */

           if (!(A & 0xfff0)  ||  read_bool(d, p [MVPbits + 3]))
               A += 8;
       }
       else    /* 0 <= A <= 7 */
           A = treed_read(d, small_mvtree, p + MVPshort);

       return A && read_bool(d, p [MVPsign]) ?  -A : A;
   }

   ---- End code block ----------------------------------------
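
   The bit order used for long magnitudes is easy to misread, so the
   following minimal sketch isolates it.  It is not part of the
   reference decoder: the function name decode_long_magnitude and the
   array "bits" are hypothetical, with "bits" standing in for the
   successive read_bool() results in stream order.

```c
/* Illustrative sketch only; not part of the reference decoder.
   "bits" is a hypothetical stand-in for successive read_bool()
   results, listed in the order they appear in the stream. */

int decode_long_magnitude(const int *bits, int *consumed)
{
    int A = 0;
    int n = 0;
    int i;

    /* Bits 0, 1, 2 come first, least significant first. */
    for (i = 0; i < 3; i++)
        A += bits[n++] << i;

    /* Then bits 9 down to 4, most significant first. */
    for (i = 9; i > 3; i--)
        A += bits[n++] << i;

    /* Bit 3 is read last, and only if some higher bit is set.
       Otherwise A < 16 and, because long values are >= 8,
       bit 3 must be one and is not present in the stream. */
    if (!(A & 0xfff0) || bits[n++])
        A += 8;

    *consumed = n;      /* 9 or 10 stream bits were used */
    return A;
}
```

   With the input { 1, 0, 1, 0, 0, 0, 0, 0, 0 } the sketch uses nine
   bits and yields 13: bits 0 and 2 give 5, no higher bit is set, and
   the implicit bit 3 adds 8.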

17.2.  Probability Updates

   The decoder should maintain an array of two MV_CONTEXTs for decoding
   row and column components, respectively.  These MV_CONTEXTs should be
   set to their defaults every key frame.  Each individual probability
   may be updated every interframe (by field J of the frame header)
   using a constant table of update probabilities.  Each optional update
   is of the form B?  P(7), that is, a bool followed by a 7-bit
   probability specification if true.

   As with other dynamic probabilities used by VP8, the updates remain
   in effect until the next key frame or until replaced via another
   update.

   In detail, the probabilities should then be managed as follows.

   ---- Begin code block --------------------------------------

   /* Never-changing table of update probabilities for each
      individual probability used in decoding motion vectors. */

   const MV_CONTEXT vp8_mv_update_probs[2] =
   {
     {
       237,
       246,
       253, 253, 254, 254, 254, 254, 254,
       254, 254, 254, 254, 254, 250, 250, 252, 254, 254
     },
     {
       231,
       243,
       245, 253, 254, 254, 254, 254, 254,
       254, 254, 254, 254, 254, 251, 251, 254, 254, 254
     }
   };

   /* Default MV decoding probabilities. */

   const MV_CONTEXT default_mv_context[2] =
   {
     {                       // row
       162,                    // is short
       128,                    // sign
         225, 146, 172, 147, 214,  39, 156,      // short tree
       128, 129, 132,  75, 145, 178, 206, 239, 254, 254 // long bits
     },

     {                       // column
       164,                    // is short
       128,                    // sign
         204, 170, 119, 235, 140, 230, 228,      // short tree
       128, 130, 130,  74, 148, 180, 203, 236, 254, 254 // long bits
     }
   };

   /* Current MV decoding probabilities, set to above defaults
      every key frame. */

   MV_CONTEXT mvc [2];     /* always row, then column */

   /* Procedure for decoding a complete motion vector. */

   typedef struct { int16 row, col;}  MV;  /* as in previous section */

   MV read_mv(bool_decoder *d)
   {
       MV v;
       v.row = (int16) read_mvcomponent(d, mvc);
       v.col = (int16) read_mvcomponent(d, mvc + 1);
       return v;
   }

   /* Procedure for updating MV decoding probabilities, called
      every interframe with "d" at the appropriate position in
      the frame header. */

   void update_mvcontexts(bool_decoder *d)
   {
       int i = 0;
       do {                      /* component = row, then column */
           const Prob *up = vp8_mv_update_probs[i];
                                       /* update probs for component */
           Prob *p = mvc[i];                  /* start decode tbl "" */
           Prob * const pstop = p + MVPcount; /* end decode tbl "" */
           do {
               if (read_bool(d, *up++))     /* update this position */
               {
                   const Prob x = read_literal(d, 7);

                   *p = x? x<<1 : 1;
               }
           } while (++p < pstop);              /* next position */
       } while (++i < 2);                      /* next component */
   }

   ---- End code block ----------------------------------------
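
   The mapping from the 7-bit literal carried by an update to the
   stored 8-bit probability can be looked at on its own.  The sketch
   below is illustrative only: the function name mv_prob_from_literal
   is hypothetical, and Prob is assumed to be an unsigned 8-bit type
   as used elsewhere in this document.

```c
/* Illustrative sketch only.  Maps the 7-bit literal carried by
   an update to the 8-bit probability that is stored. */

typedef unsigned char Prob;   /* assumption: matches the Prob
                                 type used by the decoder */

Prob mv_prob_from_literal(unsigned int x)   /* 0 <= x <= 127 */
{
    /* Shifting left gives an even value in 2..254; a literal of
       zero is remapped to 1 so that the stored probability is
       never zero. */
    return (Prob) (x ? x << 1 : 1);
}
```

   Under this mapping, updated probabilities are always either 1 or
   an even number no larger than 254.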

   This completes the description of the motion-vector decoding
   procedure and, with it, the procedure for decoding interframe
   macroblock prediction records.

18.  Interframe Prediction

   Given an inter-prediction specification for the current macroblock,
   that is, a reference frame together with a motion vector for each of
   the sixteen Y subblocks, we describe the calculation of the
   prediction buffer for the macroblock.  Frame reconstruction is then
   completed via the previously described processes of residue summation
   (Section 14) and loop filtering (Section 15).

   The management of inter-predicted subblocks and sub-pixel
   interpolation may be found in the reference decoder file predict.c
   (Section 20.14).

18.1.  Bounds on, and Adjustment of, Motion Vectors

   Since each motion vector is differentially encoded from a neighboring
   block or macroblock and the only clamp is to ensure that the
   referenced motion vector represents a valid location inside a
   reference frame buffer, it is technically possible within the VP8
   format for a block or macroblock to have arbitrarily large motion
   vectors, up to the size of the input image plus the extended border
   areas.  For practical reasons, VP8 imposes a motion vector size range
   limit of -4096 to 4095 full pixels, regardless of image size (VP8
   defines 14 raw bits for width and height; 16383x16383 is the maximum
   possible image size).  Bitstream-compliant encoders and decoders
   shall enforce this limit.

   Because the motion vectors applied to the chroma subblocks have
   1/8-pixel resolution, the synthetic pixel calculation, outlined in
   Section 5 and detailed below, uses this resolution for the luma
   subblocks as well.  Accordingly, the stored luma motion vectors are
   all doubled, each component of each luma vector becoming an even
   integer in the range -2046 to +2046, inclusive.

   The vector applied to each chroma subblock is calculated by averaging
   the vectors for the 4 luma subblocks occupying the same visible area
   as the chroma subblock in the usual correspondence; that is, the
   vector for U and V block 0 is the average of the vectors for the Y
   subblocks { 0, 1, 4, 5}, chroma block 1 corresponds to Y blocks { 2,
   3, 6, 7}, chroma block 2 to Y blocks { 8, 9, 12, 13}, and chroma
   block 3 to Y blocks { 10, 11, 14, 15}.

   In detail, each of the two components of the vectors for each of the
   chroma subblocks is calculated from the corresponding luma vector
   components as follows:

   ---- Begin code block --------------------------------------

   int avg(int c1, int c2, int c3, int c4)
   {
       int s = c1 + c2 + c3 + c4;

       /* The shift divides by 8 (not 4) because chroma pixels
          have twice the diameter of luma pixels.  The handling
          of negative motion vector components is slightly
          cumbersome because, strictly speaking, right shifts
          of negative numbers are not well-defined in C. */

       return s >= 0 ?  (s + 4) >> 3 : -((-s + 4) >> 3);
   }

   ---- End code block ----------------------------------------
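
   As a sketch of how avg() might be applied to a whole macroblock,
   the following derives all four chroma vectors from the sixteen luma
   vectors using the subblock correspondence given above.  The MV
   layout and the function name chroma_mvs are assumptions made for
   this example; avg() is repeated so that the block is
   self-contained.

```c
/* Illustrative sketch only.  The MV type and chroma_mvs() are
   assumptions made for this example. */

typedef struct { int row, col; } MV;

static int avg(int c1, int c2, int c3, int c4)
{
    int s = c1 + c2 + c3 + c4;
    return s >= 0 ? (s + 4) >> 3 : -((-s + 4) >> 3);
}

/* luma[16] holds the sixteen Y subblock vectors in raster order;
   chroma[4] receives the vectors shared by the U and V
   subblocks 0..3. */
void chroma_mvs(const MV luma[16], MV chroma[4])
{
    int b;
    for (b = 0; b < 4; b++)
    {
        /* Upper left Y subblock of each 2x2 group: 0, 2, 8, 10 */
        int i = (b >> 1) * 8 + (b & 1) * 2;

        chroma[b].row = avg(luma[i].row,     luma[i + 1].row,
                            luma[i + 4].row, luma[i + 5].row);
        chroma[b].col = avg(luma[i].col,     luma[i + 1].col,
                            luma[i + 4].col, luma[i + 5].col);
    }
}
```

   Because avg() rounds the magnitude and then restores the sign,
   components of equal magnitude and opposite sign produce chroma
   components of equal magnitude and opposite sign.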

   Furthermore, if the version number in the frame tag specifies only
   full-pel chroma motion vectors, then the fractional parts of both
   components of the vector are truncated to zero, as illustrated in the
   following pseudocode (assuming 3 bits of fraction for both luma and
   chroma vectors):

   ---- Begin code block --------------------------------------

       x = x & (~7);
       y = y & (~7);

   ---- End code block ----------------------------------------
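
   The effect of this mask on negative components is worth spelling
   out.  The following sketch wraps the pseudocode in a function; the
   name truncate_to_full_pel is hypothetical, and two's complement
   integers are assumed, as the masking idiom itself requires.

```c
/* Illustrative sketch only.  Assumes two's complement integers,
   which the x & (~7) idiom itself relies on. */

int truncate_to_full_pel(int v)     /* v in 1/8-pixel units */
{
    return v & ~7;                  /* clear the 3 fraction bits */
}
```

   Under that assumption the result is the largest multiple of 8 not
   exceeding v, so 13 becomes 8 and -3 becomes -8; negative values are
   not rounded toward zero.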

   Earlier in this document we described the vp8_clamp_mv() function to
   limit "nearest" and "near" motion vector predictors inside specified
   margins within the frame boundaries.  Additional clamping is
   performed for NEWMV macroblocks, for which the final motion vector is
   clamped again after combining the "best" predictor and the
   differential vector decoded from the stream.

   However, the secondary clamping is not performed for SPLITMV
   macroblocks, meaning that any subblock's motion vector within the
   SPLITMV macroblock may point outside the clamping zone.  These
   non-clamped vectors are also used when determining the decoding tree
   context for subsequent subblocks' modes in the vp8_mvCont() function.

18.2.  Prediction Subblocks

   The prediction calculation for each subblock is then as follows.
   Temporarily disregarding the fractional part of the motion vector
   (that is, rounding "up" or "left" by right-shifting each component
   3 bits with sign propagation) and adding the origin (upper left
   position) of the (16x16 luma or 8x8 chroma) current macroblock gives
   us an origin in the Y, U, or V plane of the predictor frame (either
   the golden frame or previous frame).

   Considering that origin to be the upper left corner of a (luma or
   chroma) macroblock, we need to specify the relative positions of the
   pixels associated with that subblock, that is, any pixels that might be
   involved in the sub-pixel interpolation processes for the subblock.
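
   The separation of a vector component into the whole-pixel part used
   for the origin and the 3-bit remainder can be sketched as follows.
   The type MVSplit and the function split_mv_component are
   hypothetical names; the sketch assumes two's complement integers
   and uses an exact division in place of a right shift of a possibly
   negative value.

```c
/* Illustrative sketch only; the names are hypothetical. */

typedef struct
{
    int whole;      /* full-pixel displacement, rounded "up"/"left" */
    int frac;       /* remaining eighths of a pixel, 0..7 */
} MVSplit;

MVSplit split_mv_component(int v)       /* v in 1/8-pixel units */
{
    MVSplit s;

    s.frac = v & 7;             /* two's complement assumed */

    /* v - frac is an exact multiple of 8, so this division
       equals a sign-propagating right shift by 3. */
    s.whole = (v - s.frac) / 8;

    return s;
}
```

   For v = 13 this gives whole = 1 and frac = 5; for v = -3 it gives
   whole = -1 and frac = 5, so that 8 * whole + frac always
   reproduces v.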

18.3.  Sub-Pixel Interpolation

   The sub-pixel interpolation is effected via two one-dimensional
   convolutions.  These convolutions may be thought of as operating on a
   two-dimensional array of pixels whose origin is the subblock origin,
   that is, the origin of the prediction macroblock described above plus
   the offset to the subblock.  Because motion vectors are arbitrary, so
   are these "prediction subblock origins".

   The integer part of the motion vector is subsumed in the origin of
   the prediction subblock; the 16 (synthetic) pixels we need to
   construct are given by 16 offsets from the origin.  The integer part
   of each of these offsets is the offset of the corresponding pixel
   from the subblock origin (using the vertical stride).  To these
   integer parts is added a constant fractional part, which is simply
   the difference between the actual motion vector and its integer
   truncation used to calculate the origins of the prediction macroblock
   and subblock.  Each component of this fractional part is an integer
   between 0 and 7, representing a forward displacement in eighths of a
   pixel.

   It is these fractional displacements that determine the filtering
   process.  If they both happen to be zero (that is, we had a "whole
   pixel" motion vector), the prediction subblock is simply copied into
   the corresponding piece of the current macroblock's prediction
   buffer.  As discussed in Section 14, the layout of the macroblock's
   prediction buffer can depend on the specifics of the reconstruction
   implementation chosen.  Of course, the vertical displacement between
   lines of the prediction subblock is given by the stride, as are all
   vertical displacements used here.

   Otherwise, at least one of the fractional displacements is non-zero.
   We then synthesize the missing pixels via a horizontal, followed by a
   vertical, one-dimensional interpolation.

   The two interpolations are essentially identical.  Each uses a (at
   most) six-tap filter (the choice of which of course depends on the
   one-dimensional offset).  Thus, every calculated pixel references at
   most three pixels before (above or to the left of) it and at most
   three pixels after (below or to the right of) it.  The horizontal
   interpolation must calculate two extra rows above and three extra
   rows below the 4x4 block, to provide enough samples for the vertical
   interpolation to proceed.

   Depending on the reconstruction filter type given in the version
   number field in the frame tag, either a bicubic or a bilinear tap set
   is used.

   The exact implementation of subsampling is as follows.

   ---- Begin code block --------------------------------------

   /* Filter taps taken to 7-bit precision.
      Because DC is always passed, taps always sum to 128. */

   const int BilinearFilters[8][6] =
   {
       { 0, 0, 128,   0, 0, 0 },
       { 0, 0, 112,  16, 0, 0 },
       { 0, 0,  96,  32, 0, 0 },
       { 0, 0,  80,  48, 0, 0 },
       { 0, 0,  64,  64, 0, 0 },
       { 0, 0,  48,  80, 0, 0 },
       { 0, 0,  32,  96, 0, 0 },
       { 0, 0,  16, 112, 0, 0 }
   };

   const int filters [8] [6] = {        /* indexed by displacement */
       { 0,  0,  128,    0,   0,  0 },  /* degenerate whole-pixel */
       { 0, -6,  123,   12,  -1,  0 },  /* 1/8 */
       { 2, -11, 108,   36,  -8,  1 },  /* 1/4 */
       { 0, -9,   93,   50,  -6,  0 },  /* 3/8 */
       { 3, -16,  77,   77, -16,  3 },  /* 1/2 is symmetric */
       { 0, -6,   50,   93,  -9,  0 },  /* 5/8 = reverse of 3/8 */
       { 1, -8,   36,  108, -11,  2 },  /* 3/4 = reverse of 1/4 */
       { 0, -1,   12,  123,  -6,  0 }   /* 7/8 = reverse of 1/8 */
   };

   /* One-dimensional synthesis of a single sample.
      Filter is determined by fractional displacement */

   Pixel interp(
       const int fil[6],   /* filter to apply */
       const Pixel *p,     /* origin (rounded "before") in
                              prediction area */
       const int s         /* size of one forward step "" */
   ) {
       int32 a = 0;
       int i = 0;
       p -= s + s;         /* move back two positions */

       do {
           a += *p * fil[i];
           p += s;
       }  while (++i < 6);

       return clamp255((a + 64) >> 7);    /* round to nearest
                                              8-bit value */
   }


   /* First do horizontal interpolation, producing intermediate
      buffer. */

   void Hinterp(
       Pixel temp[9][4],   /* 9 rows of 4 (intermediate)
                              destination values */
       const Pixel *p,     /* subblock origin in prediction
                              frame */
       int s,              /* vertical stride to be used in
                              prediction frame */
       uint hfrac,         /* 0 <= horizontal displacement <= 7 */
       uint bicubic        /* 1=bicubic filter, 0=bilinear */
   ) {
       const int * const fil = bicubic ? filters [hfrac] :
         BilinearFilters[hfrac];

       int r = 0;  do              /* for each row */
       {
           int c = 0;  do          /* for each destination sample */
           {
               /* Pixel separation = one horizontal step = 1 */

               temp[r][c] = interp(fil, p + c, 1);
               }
           while (++c < 4);
       }
       while (p += s, ++r < 9);    /* advance p to next row */
   }

   /* Finish with vertical interpolation, producing final results.
      Input array "temp" is of course that computed above. */

   void Vinterp(
       Pixel final[4][4],  /* 4 rows of 4 (final) destination values */
       const Pixel temp[9][4],
       uint vfrac,         /* 0 <= vertical displacement <= 7 */
       uint bicubic        /* 1=bicubic filter, 0=bilinear */
   ) {
       const int * const fil = bicubic ? filters [vfrac] :
         BilinearFilters[vfrac];

       int r = 0;  do              /* for each row */
       {
           int c = 0;  do          /* for each destination sample */
           {
               /* Pixel separation = one vertical step = width
                  of array = 4 */

               final[r][c] = interp(fil, temp[r] + c, 4);
           }
           while (++c < 4);
       }
       while (++r < 4);
   }

   ---- End code block ----------------------------------------
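
   For experimentation, the two passes can be combined into a single
   self-contained routine.  The following is a minimal sketch for the
   six-tap case only, not the reference decoder: the names sixtap,
   interp6, and predict4x4 are hypothetical, and Pixel is assumed to
   be an unsigned 8-bit type.  The sketch makes the row bookkeeping
   explicit: the horizontal pass begins two rows above the subblock
   origin, and the vertical pass centers output row r on row r + 2 of
   the 9-row intermediate buffer.

```c
/* Illustrative sketch only; not the reference decoder.  All
   names below are assumptions made for this example. */

typedef unsigned char Pixel;

static const int sixtap[8][6] = {
    { 0,   0, 128,   0,   0, 0 },
    { 0,  -6, 123,  12,  -1, 0 },
    { 2, -11, 108,  36,  -8, 1 },
    { 0,  -9,  93,  50,  -6, 0 },
    { 3, -16,  77,  77, -16, 3 },
    { 0,  -6,  50,  93,  -9, 0 },
    { 1,  -8,  36, 108, -11, 2 },
    { 0,  -1,  12, 123,  -6, 0 }
};

/* Weighted sum of the samples p[-2*s] .. p[3*s], rounded and
   clamped to 0..255. */
static Pixel interp6(const int fil[6], const Pixel *p, int s)
{
    int a = 64;                         /* rounding term */
    int i;
    for (i = 0; i < 6; i++)
        a += p[(i - 2) * s] * fil[i];
    if (a < 0)
        return 0;                       /* avoid shifting a negative */
    a >>= 7;
    return (Pixel) (a > 255 ? 255 : a);
}

/* Predict one 4x4 subblock.  "origin" is the upper left pixel of
   the prediction subblock; rows are "stride" pixels apart.  The
   caller must ensure that 2 rows and columns before the block
   and 3 rows and columns after it are readable. */
void predict4x4(Pixel out[4][4], const Pixel *origin, int stride,
                unsigned hfrac, unsigned vfrac)
{
    Pixel temp[9 * 4];                  /* rows -2 .. +6 */
    int r, c;

    for (r = 0; r < 9; r++)             /* horizontal pass */
        for (c = 0; c < 4; c++)
            temp[r * 4 + c] = interp6(sixtap[hfrac],
                origin + (r - 2) * stride + c, 1);

    for (r = 0; r < 4; r++)             /* vertical pass */
        for (c = 0; c < 4; c++)
            out[r][c] = interp6(sixtap[vfrac],
                temp + (r + 2) * 4 + c, 4);
}
```

   With both fractions zero, the degenerate whole-pixel filter makes
   this routine a plain copy of the prediction subblock; a decoder
   would normally take the copy shortcut described above instead.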

18.4.  Filter Properties

   We discuss briefly the rationale behind the choice of filters.  Our
   approach is necessarily cursory; a genuinely accurate discussion
   would require a couple of books.  Readers unfamiliar with signal
   processing may wish to skip this section.

   All digital signals are of course sampled in some fashion.  The case
   where the inter-sample spacing (say in time for audio samples, or
   space for pixels) is uniform, that is, the same at all positions, is
   particularly common and amenable to analysis.  Many aspects of the
   treatment of such signals are best understood in the frequency domain
   via Fourier Analysis, particularly those aspects of the signal that
   are not changed by shifts in position, especially when those
   positional shifts are not given by a whole number of samples.

   Non-integral translates of a sampled signal are a textbook example of
   the foregoing.  In our case of non-integral motion vectors, we wish
   to say what the underlying image "really is" at these pixels;
   although we don't have values for them, we feel that it makes sense
   to talk about them.  The correctness of this feeling is predicated on
   the underlying signal being band-limited, that is, not containing any
   energy in spatial frequencies that cannot be faithfully rendered at
   the pixel resolution at our disposal.  In one dimension, this range
   of "OK" frequencies is called the Nyquist band; in our two-
   dimensional case of integer-grid samples, this range might be termed
   a Nyquist rectangle.  The finer the grid, the more we know about the
   image, and the wider the Nyquist rectangle.

   It turns out that, for such band-limited signals, there is indeed an
   exact mathematical formula to produce the correct sample value at an
   arbitrary point.  Unfortunately, this calculation requires the
   consideration of every single sample in the image, as well as needing
   to operate at infinite precision.  Also, strictly speaking, all band-
   limited signals have infinite spatial (or temporal) extent, so
   everything we are discussing is really some sort of approximation.

   It is true that the theoretically correct subsampling procedure, as
   well as any approximation thereof, is always given by a translation-
   invariant weighted sum (or filter) similar to that used by VP8.  It
   is also true that the reconstruction error made by such a filter can
   be simply represented as a multiplier in the frequency domain; that
   is, such filters simply multiply the Fourier transform of any signal
   to which they are applied by a fixed function associated with the
   filter.  This fixed function is usually called the frequency response
   (or transfer function); the ideal subsampling filter has a frequency
   response equal to one in the Nyquist rectangle and zero everywhere
   else.

   Another basic fact about approximations to "truly correct"
   subsampling is that the wider the subrectangle (within the Nyquist
   rectangle) of spatial frequencies one wishes to "pass" (that is,
   correctly render) or, put more accurately, the closer one wishes to
   approximate the ideal transfer function, the more samples of the
   original signal must be considered by the subsampling, and the wider
   the calculation precision necessitated.

   The filters used by VP8 were chosen, within the constraints of 4 or
   6 taps and 7-bit precision, to do the best possible job of handling
   the low spatial frequencies near the 0th DC frequency along with
   introducing no resonances (places where the absolute value of the
   frequency response exceeds one).
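
   Two points of a filter's frequency response can be checked with
   integer arithmetic alone, which gives a quick sanity test of a tap
   set.  At DC all samples are equal, so the response is the plain sum
   of the taps; at the highest frequency the sampling grid can carry,
   successive samples alternate in sign, so the response magnitude is
   that of the alternating sum.  The sketch below is illustrative
   only, the function names are hypothetical, and checking two
   frequencies does not by itself establish that a filter is
   resonance-free.

```c
/* Illustrative sketch only.  Both results are scaled by 128,
   the fixed-point unit used by the filter tables. */

/* Response at DC: the sum of the taps. */
int response_dc_x128(const int fil[6])
{
    int i, sum = 0;
    for (i = 0; i < 6; i++)
        sum += fil[i];
    return sum;
}

/* Response at the highest grid frequency, up to sign: the sum
   of the taps with alternating signs. */
int response_top_x128(const int fil[6])
{
    int i, sum = 0;
    for (i = 0; i < 6; i++)
        sum += (i & 1) ? -fil[i] : fil[i];
    return sum;
}
```

   For the half-pixel taps { 3, -16, 77, 77, -16, 3 } these return 128
   and 0; for the 1/8-pixel taps { 0, -6, 123, 12, -1, 0 } they return
   128 and 116.  Neither magnitude exceeds 128, which corresponds to a
   response of one.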

   The justification for the foregoing has two parts.  First, resonances
   can produce extremely objectionable visible artifacts when, as often
   happens in actual compressed video streams, filters are applied
   repeatedly.  Second, the vast majority of energy in real-world images
   lies near DC and not at the high end.

   To get slightly more specific, the filters chosen by VP8 are the best
   resonance-free 4- or 6-tap filters possible, where "best" describes
   the frequency response near the origin: The response at 0 is required
   to be 1, and the graph of the response at 0 is as flat as possible.

   To provide an intuitively more obvious point of reference, the "best"
   2-tap filter is given by simple linear interpolation between the
   surrounding actual pixels.

   Finally, it should be noted that, because of the way motion vectors
   are calculated, the (shorter) 4-tap filters (used for odd fractional
   displacements) are applied in the chroma plane only.  Human color
   perception is notoriously poor, especially where higher spatial
   frequencies are involved.  The shorter filters are easier to
   understand mathematically, and the difference between them and a
   theoretically slightly better 6-tap filter is negligible where chroma
   is concerned.

19.  Annex A: Bitstream Syntax

   This annex presents the bitstream syntax in a tabular form.  All the
   information elements have been introduced and explained in the
   previous sections but are collected here for a quick reference.  Each
   syntax element is briefly described after the tabular representation
   along with a reference to the corresponding paragraph in the main
   document.  The meaning of each syntax element value is not repeated
   here.

   The top-level hierarchy of the bitstream is introduced in Section 4.

   Definition of syntax element coding types can be found in Section 8.
   The types used in the representation in this annex are:

   o  f(n), n-bit value from stream (n successive bits, not boolean
      encoded)

   o  L(n), n-bit number encoded as n booleans (with equal probability
      of being 0 or 1)

   o  B(p), bool with probability p of being 0

   o  T, tree-encoded value
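
   As an illustration of the L(n) type, the following sketch assembles
   an n-bit number from n flags.  It assumes, on the understanding
   that Section 8 specifies literals as read from high to low order,
   that the first flag is the most significant bit.  The function
   name literal_from_flags and the array "flags" are hypothetical;
   "flags" stands in for n successive results of reading a bool with
   probability 128.

```c
/* Illustrative sketch only.  "flags" is a hypothetical stand-in
   for n successive equal-probability bool reads, in stream
   order; the first flag is taken as the most significant bit. */

unsigned int literal_from_flags(const int *flags, int n)
{
    unsigned int v = 0;
    while (n--)
        v = (v << 1) | (unsigned int) (*flags++ != 0);
    return v;
}
```

   Under that assumption the flags { 1, 0, 1 } read as L(3) give the
   value 5.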

19.1.  Uncompressed Data Chunk

   | Frame Tag                                         | Type  |
   | ------------------------------------------------- | ----- |
   | frame_tag                                         | f(24) |
   | if (key_frame) {                                  |       |
   |     start_code                                    | f(24) |
   |     horizontal_size_code                          | f(16) |
   |     vertical_size_code                            | f(16) |
   | }                                                 |       |

   The 3-byte frame tag can be parsed as follows:

   ---- Begin code block --------------------------------------

   unsigned char *c = pbi->source;
   unsigned int tmp;

   tmp = (c[2] << 16) | (c[1] << 8) | c[0];

   key_frame = !(tmp & 0x1);   /* frame type bit: 0 = key frame */
   version = (tmp >> 1) & 0x7;
   show_frame = (tmp >> 4) & 0x1;
   first_part_size = (tmp >> 5) & 0x7FFFF;

   ---- End code block ----------------------------------------

   Where:

   o  key_frame indicates whether the current frame is a key frame
      or not.

   o  version determines the bitstream version.

   o  show_frame indicates whether the current frame is meant to be
      displayed or not.

   o  first_part_size determines the size of the first partition
      (control partition), excluding the uncompressed data chunk.
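
   The same field extraction can be packaged as a function that does
   not depend on decoder state.  This is a minimal sketch with
   hypothetical names (FrameTag, parse_frame_tag).  It reports the
   frame type as is_key_frame by negating the low bit, on the
   understanding that Section 9.1 assigns the value 0 to key frames
   and 1 to interframes.

```c
/* Illustrative sketch only; the struct and function names are
   assumptions, not part of the reference decoder. */

typedef struct
{
    int is_key_frame;               /* 1 if frame type bit is 0 */
    unsigned int version;           /* 3-bit version number */
    int show_frame;                 /* 1 if meant for display */
    unsigned long first_part_size;  /* 19-bit size in bytes */
} FrameTag;

FrameTag parse_frame_tag(const unsigned char c[3])
{
    FrameTag t;

    /* The tag is stored least significant byte first. */
    unsigned long tmp = ((unsigned long) c[2] << 16)
                      | ((unsigned long) c[1] << 8)
                      |  (unsigned long) c[0];

    /* Assumption from Section 9.1: bit value 0 denotes a key
       frame, hence the negation. */
    t.is_key_frame    = !(tmp & 0x1);
    t.version         = (unsigned int) ((tmp >> 1) & 0x7);
    t.show_frame      = (int) ((tmp >> 4) & 0x1);
    t.first_part_size = (tmp >> 5) & 0x7FFFF;
    return t;
}
```

   For example, the bytes 0x10, 0x7d, 0x00 describe a displayed key
   frame of version 0 whose first partition is 1000 bytes long.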

   The start_code is a constant 3-byte pattern having value 0x9d012a.
   The latter part of the uncompressed chunk (after the start_code) can
   be parsed as follows:

   ---- Begin code block --------------------------------------

   unsigned char *c = pbi->source + 6;
   unsigned int tmp;

   tmp = (c[1] << 8) | c[0];

   width = tmp & 0x3FFF;
   horizontal_scale = tmp >> 14;

   tmp = (c[3] << 8) | c[2];

   height = tmp & 0x3FFF;
   vertical_scale = tmp >> 14;

   ---- End code block ----------------------------------------
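
   A decoder would normally verify the start code before trusting the
   size fields.  The sketch below combines both steps.  The names
   FrameSize and parse_key_frame_size are hypothetical, and the sketch
   assumes that the start code bytes appear in the stream in the order
   0x9d, 0x01, 0x2a.

```c
/* Illustrative sketch only; the names are assumptions. */

typedef struct
{
    unsigned int width, height;     /* 14 bits each */
    unsigned int horizontal_scale;  /* 2 bits */
    unsigned int vertical_scale;    /* 2 bits */
} FrameSize;

/* "c" points at the 7 bytes that follow the 3-byte frame tag of
   a key frame.  Returns 1 on success, or 0 if the start code
   does not match. */
int parse_key_frame_size(const unsigned char c[7], FrameSize *out)
{
    unsigned int tmp;

    /* Assumed stream order of the start code bytes. */
    if (c[0] != 0x9d || c[1] != 0x01 || c[2] != 0x2a)
        return 0;

    tmp = ((unsigned int) c[4] << 8) | c[3];
    out->width = tmp & 0x3FFF;
    out->horizontal_scale = tmp >> 14;

    tmp = ((unsigned int) c[6] << 8) | c[5];
    out->height = tmp & 0x3FFF;
    out->vertical_scale = tmp >> 14;

    return 1;
}
```

   For example, the bytes 0x9d 0x01 0x2a 0x80 0x02 0xe0 0x41 yield a
   width of 640 with horizontal scale 0 and a height of 480 with
   vertical scale 1.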

19.2.  Frame Header

   | Frame Header                                      | Type  |
   | ------------------------------------------------- | ----- |
   | if (key_frame) {                                  |       |
   |   color_space                                     | L(1)  |
   |   clamping_type                                   | L(1)  |
   | }                                                 |       |
   | segmentation_enabled                              | L(1)  |
   | if (segmentation_enabled)                         |       |
   |   update_segmentation()                           |       |
   | filter_type                                       | L(1)  |
   | loop_filter_level                                 | L(6)  |
   | sharpness_level                                   | L(3)  |
   | mb_lf_adjustments()                               |       |
   | log2_nbr_of_dct_partitions                        | L(2)  |
   | quant_indices()                                   |       |
   | if (key_frame)                                    |       |
   |   refresh_entropy_probs                           | L(1)  |
   | else {                                            |       |
   |   refresh_golden_frame                            | L(1)  |
   |   refresh_alternate_frame                         | L(1)  |
   |   if (!refresh_golden_frame)                      |       |
   |     copy_buffer_to_golden                         | L(2)  |
   |   if (!refresh_alternate_frame)                   |       |
   |     copy_buffer_to_alternate                      | L(2)  |
   |   sign_bias_golden                                | L(1)  |
   |   sign_bias_alternate                             | L(1)  |
   |   refresh_entropy_probs                           | L(1)  |
   |   refresh_last                                    | L(1)  |
   | }                                                 |       |
   | token_prob_update()                               |       |
   | mb_no_skip_coeff                                  | L(1)  |
   | if (mb_no_skip_coeff)                             |       |
   |   prob_skip_false                                 | L(8)  |
   | if (!key_frame) {                                 |       |
   |   prob_intra                                      | L(8)  |
   |   prob_last                                       | L(8)  |
   |   prob_gf                                         | L(8)  |
   |   intra_16x16_prob_update_flag                    | L(1)  |
   |   if (intra_16x16_prob_update_flag) {             |       |
   |     for (i = 0; i < 4; i++)                       |       |
   |       intra_16x16_prob                            | L(8)  |
   |   }                                               |       |
   |   intra_chroma_prob_update_flag                   | L(1)  |
   |   if (intra_chroma_prob_update_flag) {            |       |
   |     for (i = 0; i < 3; i++)                       |       |
   |       intra_chroma_prob                           | L(8)  |
   |   }                                               |       |
   |   mv_prob_update()                                |       |
   | }                                                 |       |

   o  color_space defines the YUV color space of the sequence
      (Section 9.2)

   o  clamping_type specifies if the decoder is required to clamp the
      reconstructed pixel values (Section 9.2)

   o  segmentation_enabled enables the segmentation feature for the
      current frame (Section 9.3)

   o  filter_type determines whether the normal or the simple loop
      filter is used (Sections 9.4, 15)

   o  loop_filter_level controls the deblocking filter
      (Sections 9.4, 15)

   o  sharpness_level controls the deblocking filter (Sections 9.4, 15)

   o  log2_nbr_of_dct_partitions determines the number of separate
      partitions containing the DCT coefficients of the macroblocks
      (Section 9.5)

   o  refresh_entropy_probs determines whether updated token
      probabilities are used only for this frame or until further update

   o  refresh_golden_frame determines if the current decoded frame
      refreshes the golden frame (Section 9.7)

   o  refresh_alternate_frame determines if the current decoded frame
      refreshes the alternate reference frame (Section 9.7)

   o  copy_buffer_to_golden determines if the golden reference is
      replaced by another reference (Section 9.7)

   o  copy_buffer_to_alternate determines if the alternate reference is
      replaced by another reference (Section 9.7)

   o  sign_bias_golden controls the sign of motion vectors when the
      golden frame is referenced (Section 9.7)

   o  sign_bias_alternate controls the sign of motion vectors when the
      alternate frame is referenced (Section 9.7)

   o  refresh_last determines if the current decoded frame refreshes the
      last frame reference buffer (Section 9.8)

   o  mb_no_skip_coeff enables or disables the skipping of macroblocks
      containing no non-zero coefficients (Section 9.10)

   o  prob_skip_false indicates the probability that the macroblock is
      not skipped (flag indicating skipped macroblock is false)
      (Section 9.10)

   o  prob_intra indicates the probability of an intra macroblock
      (Section 9.10)

   o  prob_last indicates the probability that the last reference frame
      is used for inter-prediction (Section 9.10)

   o  prob_gf indicates the probability that the golden reference frame
      is used for inter-prediction (Section 9.10)

   o  intra_16x16_prob_update_flag indicates if the branch probabilities
      used in the decoding of the luma intra-prediction mode are updated
      (Section 9.10)

   o  intra_16x16_prob indicates the branch probabilities of the luma
      intra-prediction mode decoding tree

   o  intra_chroma_prob_update_flag indicates if the branch
      probabilities used in the decoding of the chroma intra-prediction
      mode are updated (Section 9.10)

   o  intra_chroma_prob indicates the branch probabilities of the chroma
      intra-prediction mode decoding tree

   | update_segmentation()                             | Type  |
   | ------------------------------------------------- | ----- |
   | update_mb_segmentation_map                        | L(1)  |
   | update_segment_feature_data                       | L(1)  |
   | if (update_segment_feature_data) {                |       |
   |   segment_feature_mode                            | L(1)  |
   |   for (i = 0; i < 4; i++) {                       |       |
   |     quantizer_update                              | L(1)  |
   |     if (quantizer_update) {                       |       |
   |       quantizer_update_value                      | L(7)  |
   |       quantizer_update_sign                       | L(1)  |
   |     }                                             |       |
   |   }                                               |       |
   |   for (i = 0; i < 4; i++) {                       |       |
   |     loop_filter_update                            | L(1)  |
   |     if (loop_filter_update) {                     |       |
   |       lf_update_value                             | L(6)  |
   |       lf_update_sign                              | L(1)  |
   |     }                                             |       |
   |   }                                               |       |
   | }                                                 |       |
   | if (update_mb_segmentation_map) {                 |       |
   |   for (i = 0; i < 3; i++) {                       |       |
   |     segment_prob_update                           | L(1)  |
   |     if (segment_prob_update)                      |       |
   |       segment_prob                                | L(8)  |
   |   }                                               |       |
   | }                                                 |       |

   o  update_mb_segmentation_map determines if the MB segmentation map
      is updated in the current frame (Section 9.3)

   o  update_segment_feature_data indicates if the segment feature data
      is updated in the current frame (Section 9.3)

   o  segment_feature_mode indicates the feature data update mode, 0 for
      delta and 1 for the absolute value (Section 9.3)

   o  quantizer_update indicates if the quantizer value is updated for
      the i-th segment (Section 9.3)

   o  quantizer_update_value indicates the update value for the segment
      quantizer (Section 9.3)

   o  quantizer_update_sign indicates the update sign for the segment
      quantizer (Section 9.3)

   o  loop_filter_update indicates if the loop filter level value is
      updated for the i-th segment (Section 9.3)

   o  lf_update_value indicates the update value for the loop filter
      level (Section 9.3)

   o  lf_update_sign indicates the update sign for the loop filter level
      (Section 9.3)

   o  segment_prob_update indicates whether the branch probabilities
      used to decode the segment_id in the MB header are decoded from
      the stream or use the default value of 255 (Section 9.3)

   o  segment_prob indicates the branch probabilities of the segment_id
      decoding tree (Section 9.3)
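
   The way segment_feature_mode, quantizer_update_value, and
   quantizer_update_sign combine can be sketched as follows.  This is
   a minimal illustration, not normative text: it assumes that a set
   sign flag means a negative value and that the resulting quantizer
   index is clamped to the range 0..127 (Sections 9.3 and 9.6 give the
   authoritative rules).  The helper names are hypothetical and do not
   appear in this document.

```c
/* Hypothetical helpers; only the field names come from the syntax table. */

/* Fold a magnitude and its sign flag into one signed value.
   Assumption: a set sign flag means the value is negative. */
static int signed_value(int magnitude, int sign_flag)
{
    return sign_flag ? -magnitude : magnitude;
}

/* Assumption: quantizer indices are limited to 0..127. */
static int clamp_qi(int q)
{
    return q < 0 ? 0 : (q > 127 ? 127 : q);
}

/* Resolve the quantizer index of one segment.
   segment_feature_mode: 0 = delta relative to frame_qi, 1 = absolute. */
static int segment_quantizer(int frame_qi, int segment_feature_mode,
                             int quantizer_update_value,
                             int quantizer_update_sign)
{
    int v = signed_value(quantizer_update_value, quantizer_update_sign);

    return clamp_qi(segment_feature_mode ? v : frame_qi + v);
}
```

   In delta mode, the update is added to the frame-level index; in
   absolute mode, it replaces that index outright.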

   | mb_lf_adjustments()                               | Type  |
   | ------------------------------------------------- | ----- |
   | loop_filter_adj_enable                            | L(1)  |
   | if (loop_filter_adj_enable) {                     |       |
   |   mode_ref_lf_delta_update                        | L(1)  |
   |   if (mode_ref_lf_delta_update) {                 |       |
   |     for (i = 0; i < 4; i++) {                     |       |
   |       ref_frame_delta_update_flag                 | L(1)  |
   |       if (ref_frame_delta_update_flag) {          |       |
   |         delta_magnitude                           | L(6)  |
   |         delta_sign                                | L(1)  |
   |       }                                           |       |
   |     }                                             |       |
   |     for (i = 0; i < 4; i++) {                     |       |
   |       mb_mode_delta_update_flag                   | L(1)  |
   |       if (mb_mode_delta_update_flag) {            |       |
   |         delta_magnitude                           | L(6)  |
   |         delta_sign                                | L(1)  |
   |       }                                           |       |
   |     }                                             |       |
   |   }                                               |       |
   | }                                                 |       |

   o  loop_filter_adj_enable indicates if the MB-level loop filter
      adjustment (based on the used reference frame and coding mode) is
      on for the current frame (Section 9.4)

   o  mode_ref_lf_delta_update indicates if the delta values used in an
      adjustment are updated in the current frame (Section 9.4)

   o  ref_frame_delta_update_flag indicates if the adjustment delta
      value corresponding to a certain used reference frame is updated
      (Section 9.4)

   o  delta_magnitude is the absolute value of the delta value

   o  delta_sign is the sign of the delta value

   o  mb_mode_delta_update_flag indicates if the adjustment delta value
      corresponding to a certain MB prediction mode is updated
      (Section 9.4)
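
   The per-entry update flags above can be illustrated with a short
   sketch.  It assumes that an entry whose update flag is 0 keeps its
   previous delta and that a set delta_sign means a negative delta;
   the LfDeltaField container and the function name are hypothetical
   stand-ins for values already parsed from the stream.

```c
/* One parsed (flag, magnitude, sign) triple; a hypothetical container. */
typedef struct {
    int update_flag;      /* ref_frame_delta_update_flag or
                             mb_mode_delta_update_flag, L(1) */
    int delta_magnitude;  /* L(6) */
    int delta_sign;       /* L(1); assumed 1 = negative */
} LfDeltaField;

/* Apply one group of four updates to a persistent delta table.
   Entries whose flag is 0 are left unchanged (assumption). */
static void apply_lf_delta_updates(int deltas[4],
                                   const LfDeltaField fields[4])
{
    int i;

    for (i = 0; i < 4; i++) {
        if (fields[i].update_flag)
            deltas[i] = fields[i].delta_sign
                            ? -fields[i].delta_magnitude
                            : fields[i].delta_magnitude;
    }
}
```

   The same routine would serve both loops in mb_lf_adjustments(),
   once for the reference-frame deltas and once for the mode deltas.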

   | quant_indices()                                   | Type  |
   | ------------------------------------------------- | ----- |
   | y_ac_qi                                           | L(7)  |
   | y_dc_delta_present                                | L(1)  |
   | if (y_dc_delta_present) {                         |       |
   |   y_dc_delta_magnitude                            | L(4)  |
   |   y_dc_delta_sign                                 | L(1)  |
   | }                                                 |       |
   | y2_dc_delta_present                               | L(1)  |
   | if (y2_dc_delta_present) {                        |       |
   |   y2_dc_delta_magnitude                           | L(4)  |
   |   y2_dc_delta_sign                                | L(1)  |
   | }                                                 |       |
   | y2_ac_delta_present                               | L(1)  |
   | if (y2_ac_delta_present) {                        |       |
   |   y2_ac_delta_magnitude                           | L(4)  |
   |   y2_ac_delta_sign                                | L(1)  |
   | }                                                 |       |
   | uv_dc_delta_present                               | L(1)  |
   | if (uv_dc_delta_present) {                        |       |
   |   uv_dc_delta_magnitude                           | L(4)  |
   |   uv_dc_delta_sign                                | L(1)  |
   | }                                                 |       |
   | uv_ac_delta_present                               | L(1)  |
   | if (uv_ac_delta_present) {                        |       |
   |   uv_ac_delta_magnitude                           | L(4)  |
   |   uv_ac_delta_sign                                | L(1)  |
   | }                                                 |       |

   o  y_ac_qi is the dequantization table index used for the luma AC
      coefficients (and other coefficient groups if no delta value is
      present) (Section 9.6)

   o  y_dc_delta_present indicates if the stream contains a delta value
      that is added to the baseline index to obtain the luma DC
      coefficient dequantization index (Section 9.6)

   o  y_dc_delta_magnitude is the magnitude of the delta value
      (Section 9.6)

   o  y_dc_delta_sign is the sign of the delta value (Section 9.6)

   o  y2_dc_delta_present indicates if the stream contains a delta value
      that is added to the baseline index to obtain the Y2 block DC
      coefficient dequantization index (Section 9.6)

   o  y2_ac_delta_present indicates if the stream contains a delta value
      that is added to the baseline index to obtain the Y2 block AC
      coefficient dequantization index (Section 9.6)

   o  uv_dc_delta_present indicates if the stream contains a delta value
      that is added to the baseline index to obtain the chroma DC
      coefficient dequantization index (Section 9.6)

   o  uv_ac_delta_present indicates if the stream contains a delta value
      that is added to the baseline index to obtain the chroma AC
      coefficient dequantization index (Section 9.6)
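
   As an illustration of how the baseline index and the optional
   deltas relate, the sketch below derives the six dequantization
   indices from already-decoded header values.  It assumes each delta
   has been converted to a signed integer (0 when the corresponding
   *_present flag is 0) and that every index is clamped to 0..127;
   the struct and function names are hypothetical.

```c
/* Decoded quant_indices() fields; deltas are signed, 0 if absent. */
typedef struct {
    int y_ac_qi;       /* baseline index, L(7) */
    int y_dc_delta;
    int y2_dc_delta;
    int y2_ac_delta;
    int uv_dc_delta;
    int uv_ac_delta;
} QuantHeader;

/* Resulting table indices, one per coefficient group. */
typedef struct {
    int y_dc, y_ac, y2_dc, y2_ac, uv_dc, uv_ac;
} DequantIndices;

/* Assumption: indices are limited to the range 0..127. */
static int clamp_index(int q)
{
    return q < 0 ? 0 : (q > 127 ? 127 : q);
}

static DequantIndices resolve_indices(const QuantHeader *h)
{
    DequantIndices r;

    r.y_ac  = clamp_index(h->y_ac_qi);
    r.y_dc  = clamp_index(h->y_ac_qi + h->y_dc_delta);
    r.y2_dc = clamp_index(h->y_ac_qi + h->y2_dc_delta);
    r.y2_ac = clamp_index(h->y_ac_qi + h->y2_ac_delta);
    r.uv_dc = clamp_index(h->y_ac_qi + h->uv_dc_delta);
    r.uv_ac = clamp_index(h->y_ac_qi + h->uv_ac_delta);
    return r;
}
```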

   | token_prob_update()                               | Type  |
   | ------------------------------------------------- | ----- |
   | for (i = 0; i < 4; i++) {                         |       |
   |   for (j = 0; j < 8; j++) {                       |       |
   |     for (k = 0; k < 3; k++) {                     |       |
   |       for (l = 0; l < 11; l++) {                  |       |
   |         coeff_prob_update_flag                    | L(1)  |
   |         if (coeff_prob_update_flag)               |       |
   |           coeff_prob                              | L(8)  |
   |       }                                           |       |
   |     }                                             |       |
   |   }                                               |       |
   | }                                                 |       |

   o  coeff_prob_update_flag indicates if the corresponding branch
      probability is updated in the current frame (Section 13.4)

   o  coeff_prob is the new branch probability (Section 13.4)
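
   The four nested loops visit 4 * 8 * 3 * 11 = 1056 probability
   positions in a fixed order.  The sketch below shows that traversal
   under simplifying assumptions: the flags and new values are taken
   from two flat, pre-decoded arrays (hypothetical stand-ins for the
   bitstream reader), and values[n] is consulted only when flags[n] is
   set.  The dimension names are illustrative.

```c
typedef unsigned char Prob;

/* Illustrative names for the four loop bounds in the table. */
enum {
    BLOCK_TYPES = 4,
    COEF_BANDS = 8,
    PREV_CONTEXTS = 3,
    ENTROPY_NODES = 11
};

/* Walk the table in i, j, k, l order and overwrite flagged entries.
   Returns the number of probabilities that were updated. */
static int apply_coeff_prob_updates(
    Prob probs[BLOCK_TYPES][COEF_BANDS][PREV_CONTEXTS][ENTROPY_NODES],
    const unsigned char *flags, const Prob *values)
{
    int i, j, k, l;
    int n = 0;        /* position in the flat input arrays */
    int updated = 0;

    for (i = 0; i < BLOCK_TYPES; i++)
        for (j = 0; j < COEF_BANDS; j++)
            for (k = 0; k < PREV_CONTEXTS; k++)
                for (l = 0; l < ENTROPY_NODES; l++, n++) {
                    if (flags[n]) {
                        probs[i][j][k][l] = values[n];
                        updated++;
                    }
                }
    return updated;
}
```

   Positions whose flag is 0 keep whatever probability was in effect
   before the frame.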

   | mv_prob_update()                                  | Type  |
   | ------------------------------------------------- | ----- |
   | for (i = 0; i < 2; i++) {                         |       |
   |   for (j = 0; j < 19; j++) {                      |       |
   |     mv_prob_update_flag                           | L(1)  |
   |     if (mv_prob_update_flag)                      |       |
   |       prob                                        | L(7)  |
   |   }                                               |       |
   | }                                                 |       |

   o  mv_prob_update_flag indicates if the corresponding MV decoding
      probability is updated in the current frame (Section 17.2)

   o  prob is the updated probability (Section 17.2)
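
   Because prob is only 7 bits wide, it has to be expanded before it
   can serve as an 8-bit probability.  The sketch below follows our
   reading of Section 17.2, which doubles the value and maps 0 to 1 so
   that the probability is never zero; the function name is
   hypothetical.

```c
typedef unsigned char Prob;

/* Expand the 7-bit prob field (0..127) to an 8-bit probability.
   A value of 0 is mapped to 1; all other values are doubled. */
static Prob mv_prob_from_field(unsigned int x)
{
    return (Prob)(x ? x << 1 : 1);
}
```

   Under this rule, updated MV probabilities are either 1 or an even
   number between 2 and 254.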

19.3.  Macroblock Data

   | Macroblock Data                                   | Type  |
   | ------------------------------------------------- | ----- |
   | macroblock_header()                               |       |
   | residual_data()                                   |       |


   | macroblock_header()                               | Type  |
   | ------------------------------------------------- | ----- |
   | if (update_mb_segmentation_map)                   |       |
   |   segment_id                                      | T     |
   | if (mb_no_skip_coeff)                             |       |
   |   mb_skip_coeff                                   | B(p)  |
   | if (!key_frame)                                   |       |
   |   is_inter_mb                                     | B(p)  |
   | if (is_inter_mb) {                                |       |
   |   mb_ref_frame_sel1                               | B(p)  |
   |   if (mb_ref_frame_sel1)                          |       |
   |     mb_ref_frame_sel2                             | B(p)  |
   |   mv_mode                                         | T     |
   |   if (mv_mode == SPLITMV) {                       |       |
   |     mv_split_mode                                 | T     |
   |     for (i = 0; i < numMvs; i++) {                |       |
   |       sub_mv_mode                                 | T     |
   |       if (sub_mv_mode == NEWMV4x4) {              |       |
   |         read_mvcomponent()                        |       |
   |         read_mvcomponent()                        |       |
   |       }                                           |       |
   |     }                                             |       |
   |   } else if (mv_mode == NEWMV) {                  |       |
   |     read_mvcomponent()                            |       |
   |     read_mvcomponent()                            |       |
   |   }                                               |       |
   | } else { /* intra mb */                           |       |
   |   intra_y_mode                                    | T     |
   |   if (intra_y_mode == B_PRED) {                   |       |
   |     for (i = 0; i < 16; i++)                      |       |
   |       intra_b_mode                                | T     |
   |   }                                               |       |
   |   intra_uv_mode                                   | T     |
   | }                                                 |       |

   o  segment_id indicates to which segment the macroblock belongs
      (Section 10)

   o  mb_skip_coeff indicates whether the macroblock contains any coded
      coefficients (Section 11.1)

   o  is_inter_mb indicates whether the macroblock is intra- or inter-
      coded (Section 16)

   o  mb_ref_frame_sel1 selects the reference frame to be used: the
      last frame (0) or the golden/alternate frame (1) (Section 16.2)

   o  mb_ref_frame_sel2 selects whether the golden (0) or alternate
      reference frame (1) is used (Section 16.2)

   o  mv_mode determines the macroblock motion vector mode
      (Section 16.2)

   o  mv_split_mode gives the macroblock partitioning specification and
      determines the number of motion vectors used (numMvs)
      (Section 16.2)

   o  sub_mv_mode determines the sub-macroblock motion vector mode for
      macroblocks coded using the SPLITMV motion vector mode
      (Section 16.2)

   o  intra_y_mode selects the luminance intra-prediction mode
      (Section 16.1)

   o  intra_b_mode selects the sub-macroblock luminance prediction mode
      for macroblocks coded using B_PRED mode (Section 16.1)

   o  intra_uv_mode selects the chrominance intra-prediction mode
      (Section 16.1)
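
   The two reference-frame selector bits form a small decision tree,
   which can be sketched as follows.  The enum and function names are
   hypothetical; only the bit meanings are taken from the field
   descriptions above.

```c
/* Hypothetical names for the three inter-prediction references. */
typedef enum { REF_LAST, REF_GOLDEN, REF_ALTREF } RefFrame;

/* mb_ref_frame_sel2 is only present in the stream (and only
   meaningful here) when mb_ref_frame_sel1 is 1. */
static RefFrame select_ref_frame(int mb_ref_frame_sel1,
                                 int mb_ref_frame_sel2)
{
    if (!mb_ref_frame_sel1)
        return REF_LAST;
    return mb_ref_frame_sel2 ? REF_ALTREF : REF_GOLDEN;
}
```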

   | residual_data()                                   | Type  |
   | ------------------------------------------------- | ----- |
   | if (!mb_skip_coeff) {                             |       |
   |   if ( (is_inter_mb && mv_mode != SPLITMV) ||     |       |
   |        (!is_inter_mb && intra_y_mode != B_PRED) ) |       |
   |     residual_block() /* Y2 */                     |       |
   |   for (i = 0; i < 24; i++)                        |       |
   |     residual_block() /* 16 Y, 4 U, 4 V */         |       |
   | }                                                 |       |
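
   The condition guarding the Y2 block can be restated as a pair of
   small predicates.  This is a sketch only; the function names and
   the boolean parameters (which stand in for the comparisons against
   SPLITMV and B_PRED) are hypothetical.

```c
/* Nonzero when the macroblock carries a Y2 block: every prediction
   mode except SPLITMV (inter) and B_PRED (intra). */
static int mb_has_y2(int is_inter_mb, int mv_mode_is_splitmv,
                     int intra_y_mode_is_b_pred)
{
    return is_inter_mb ? !mv_mode_is_splitmv : !intra_y_mode_is_b_pred;
}

/* Number of residual_block() invocations for one macroblock:
   0 if skipped, otherwise 24 (16 Y, 4 U, 4 V) plus 1 for Y2. */
static int residual_block_count(int mb_skip_coeff, int has_y2)
{
    if (mb_skip_coeff)
        return 0;
    return 24 + (has_y2 ? 1 : 0);
}
```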

   | residual_block()                                  | Type  |
   | ------------------------------------------------- | ----- |
   | for (i = firstCoeff; i < 16; i++) {               |       |
   |   token                                           | T     |
   |   if (token == EOB) break;                        |       |
   |   if (token_has_extra_bits)                       |       |
   |     extra_bits                                    | L(n)  |
   |   if (coefficient != 0)                           |       |
   |     sign                                          | L(1)  |
   | }                                                 |       |

   o  firstCoeff is 1 for luma blocks of macroblocks that contain a Y2
      subblock; otherwise, it is 0

   o  token defines the value of the coefficient, the value range of the
      coefficient, or the end of block (Section 13.2)

   o  extra_bits determines the value of the coefficient within the
      value range defined by the token (Section 13.2)

   o  sign indicates the sign of the coefficient (Section 13.2)
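
   The firstCoeff rule can be sketched as a helper that takes the kind
   of block being decoded.  The BlockKind enum and the function names
   are hypothetical and serve only to illustrate the rule stated
   above.

```c
/* Hypothetical labels for the blocks read by residual_data(). */
typedef enum { BLOCK_Y2, BLOCK_Y, BLOCK_U, BLOCK_V } BlockKind;

/* Starting coefficient position for residual_block(): luma blocks
   begin at position 1 when the macroblock has a Y2 block; every
   other block begins at position 0. */
static int first_coeff(BlockKind kind, int has_y2)
{
    return (kind == BLOCK_Y && has_y2) ? 1 : 0;
}

/* Upper bound on the coefficient positions the loop can visit. */
static int max_coeff_positions(BlockKind kind, int has_y2)
{
    return 16 - first_coeff(kind, has_y2);
}
```

   A luma block in a macroblock with Y2 therefore codes at most 15
   positions; all other blocks code at most 16.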
