RFC 6330

RaptorQ Forward Error Correction Scheme for Object Delivery

Pages: 69
Proposed Standard

5.  RaptorQ FEC Code Specification

5.1.  Background

   For the purpose of the RaptorQ FEC code specification in this
   section, the following definitions, symbols, and abbreviations apply.
   A basic understanding of linear algebra, matrix operations, and
   finite fields is assumed in this section.  In particular, matrix
   multiplication and matrix inversion operations over a mixture of the
   finite fields GF[2] and GF[256] are used.  A basic familiarity with
   sparse linear equations, and efficient implementations of algorithms
   that take advantage of sparse linear equations, is also quite
   beneficial to an implementer of this specification.

5.1.1.  Definitions

   o  Source block: a block of K source symbols that are considered
      together for RaptorQ encoding and decoding purposes.

   o  Extended Source Block: a block of K' source symbols, where K' >=
      K, constructed from a source block and zero or more padding
      symbols.

   o  Symbol: a unit of data.  The size, in octets, of a symbol is known
      as the symbol size.  The symbol size is always a positive integer.

   o  Source symbol: the smallest unit of data used during the encoding
      process.  All source symbols within a source block have the same
      size.

   o  Padding symbol: a symbol with all zero bits that is added to the
      source block to form the extended source block.

   o  Encoding symbol: a symbol that can be sent as part of the encoding
      of a source block.  The encoding symbols of a source block consist
      of the source symbols of the source block and the repair symbols
      generated from the source block.  Repair symbols generated from a
      source block have the same size as the source symbols of that
      source block.

   o  Repair symbol: the encoding symbols of a source block that are not
      source symbols.  The repair symbols are generated based on the
      source symbols of a source block.

   o  Intermediate symbols: symbols generated from the source symbols
      using an inverse encoding process based on pre-coding
      relationships.  The repair symbols are then generated directly
      from the intermediate symbols.  The encoding symbols do not
      include the intermediate symbols, i.e., intermediate symbols are
      not sent as part of the encoding of a source block.  The
      intermediate symbols are partitioned into LT symbols and PI
      symbols for the purposes of the encoding process.

   o  LT symbols: a process similar to that described in [LTCodes] is
      used to generate part of the contribution to each generated
      encoding symbol from the portion of the intermediate symbols
      designated as LT symbols.

   o  PI symbols: a process even simpler than that described in
      [LTCodes] is used to generate the other part of the contribution
      to each generated encoding symbol from the portion of the
      intermediate symbols designated as PI symbols.  In the decoding
      algorithm suggested in Section 5.4, the PI symbols are inactivated
      at the start, i.e., are placed into the matrix U at the beginning
      of the first phase of the decoding algorithm.  Because the symbols
      corresponding to the columns of U are sometimes called the
      "inactivated" symbols, and since the PI symbols are inactivated at
      the beginning, they are considered "permanently inactivated".

   o  HDPC symbols: there is a small subset of the intermediate symbols
      that are HDPC symbols.  Each HDPC symbol has a pre-coding
      relationship with a large fraction of the other intermediate
      symbols.  HDPC means "High Density Parity Check".

   o  LDPC symbols: there is a moderate-sized subset of the intermediate
      symbols that are LDPC symbols.  Each LDPC symbol has a pre-coding
      relationship with a small fraction of the other intermediate
      symbols.  LDPC means "Low Density Parity Check".

   o  Systematic code: a code in which all source symbols are included
      as part of the encoding symbols of a source block.  The RaptorQ
      code as described herein is a systematic code.

   o  Encoding Symbol ID (ESI): information that uniquely identifies
      each encoding symbol associated with a source block for sending
      and receiving purposes.

   o  Internal Symbol ID (ISI): information that uniquely identifies
      each symbol associated with an extended source block for encoding
      and decoding purposes.

   o  Arithmetic operations on octets and symbols and matrices: the
      operations that are used to produce encoding symbols from source
      symbols and vice versa.  See Section 5.7.

5.1.2.  Symbols

   i, j, u, v, h, d, a, b, d1, a1, b1, m, x, y   represent values or
        variables of one type or another, depending on the context.

   X    denotes a non-negative integer value that is either an ISI value
        or an ESI value, depending on the context.

   ceil(x)  denotes the smallest integer that is greater than or equal
        to x, where x is a real value.

   floor(x)  denotes the largest integer that is less than or equal to
        x, where x is a real value.

   min(x,y)  denotes the minimum value of the values x and y, and in
        general the minimum value of all the argument values.

   max(x,y)  denotes the maximum value of the values x and y, and in
        general the maximum value of all the argument values.

   i % j  denotes i modulo j.

   i + j  denotes the sum of i and j.  If i and j are octets or symbols,
        this designates the arithmetic on octets or symbols,
        respectively, as defined in Section 5.7.  If i and j are
        integers, then it denotes the usual integer addition.

   i * j  denotes the product of i and j.  If i and j are octets, this
        designates the arithmetic on octets, as defined in Section 5.7.
        If i is an octet and j is a symbol, this denotes the
        multiplication of a symbol by an octet, as also defined in
        Section 5.7.  Finally, if i and j are integers, i * j denotes
        the usual product of integers.

   a ^^ b  denotes the operation a raised to the power b.  If a is an
        octet and b is a non-negative integer, this is understood to
        mean a*a*...*a (b terms), with '*' being the octet product as
        defined in Section 5.7.

   u ^ v  denotes, for equal-length bit strings u and v, the bitwise
        exclusive-or of u and v.

   Transpose[A]  denotes the transposed matrix of matrix A.  In this
        specification, all matrices have entries that are octets.

   A^^-1  denotes the inverse matrix of matrix A.  In this
        specification, all the matrices have octets as entries, so it is
        understood that the operations of the matrix entries are to be
        done as stated in Section 5.7 and A^^-1 is the matrix inverse of
        A with respect to octet arithmetic.

   K    denotes the number of symbols in a single source block.

   K'   denotes the number of source plus padding symbols in an extended
        source block.  For the majority of this specification, the
        padding symbols are considered to be additional source symbols.

   K'_max  denotes the maximum number of source symbols that can be in a
        single source block.  Set to 56403.

   L    denotes the number of intermediate symbols for a single extended
        source block.

   S    denotes the number of LDPC symbols for a single extended source
        block.  These are LT symbols.  For each value of K' shown in
        Table 2 in Section 5.6, the corresponding value of S is a prime
        number.

   H    denotes the number of HDPC symbols for a single extended source
        block.  These are PI symbols.

   B    denotes the number of intermediate symbols that are LT symbols
        excluding the LDPC symbols.

   W    denotes the number of intermediate symbols that are LT symbols.
        For each value of K' in Table 2 shown in Section 5.6, the
        corresponding value of W is a prime number.

   P    denotes the number of intermediate symbols that are PI symbols.
        These contain all HDPC symbols.

   P1   denotes the smallest prime number greater than or equal to P.

   U    denotes the number of non-HDPC intermediate symbols that are PI
        symbols.

   C    denotes an array of intermediate symbols, C[0], C[1], C[2], ...,
        C[L-1].

   C'   denotes an array of the symbols of the extended source block,
        where C'[0], C'[1], C'[2], ..., C'[K-1] are the source symbols
        of the source block and C'[K], C'[K+1], ..., C'[K'-1] are
        padding symbols.

   V0, V1, V2, V3  denote four arrays of 32-bit unsigned integers,
        V0[0], V0[1], ..., V0[255]; V1[0], V1[1], ..., V1[255]; V2[0],
        V2[1], ..., V2[255]; and V3[0], V3[1], ..., V3[255] as shown in
        Section 5.5.

   Rand[y, i, m]  denotes a pseudo-random number generator.

   Deg[v]  denotes a degree generator.

   Enc[K', C, (d, a, b, d1, a1, b1)]  denotes an encoding symbol
        generator.

   Tuple[K', X]  denotes a tuple generator function.

   T    denotes the symbol size in octets.

   J(K')  denotes the systematic index associated with K'.

   G    denotes any generator matrix.

   I_S  denotes the S x S identity matrix.

5.2.  Overview

   This section defines the systematic RaptorQ FEC code.

   Symbols are the fundamental data units of the encoding and decoding
   process.  For each source block, all symbols are the same size,
   referred to as the symbol size T.  The atomic operations performed on
   symbols for both encoding and decoding are the arithmetic operations
   defined in Section 5.7.

   The basic encoder is described in Section 5.3.  The encoder first
   derives a block of intermediate symbols from the source symbols of a
   source block.  This intermediate block has the property that both
   source and repair symbols can be generated from it using the same
   process.  The encoder produces repair symbols from the intermediate
   block using an efficient process, where each such repair symbol is
   the exclusive-or of a small number of intermediate symbols from the
   block.  Source symbols can also be reproduced from the intermediate
   block using the same process.  The encoding symbols are the
   combination of the source and repair symbols.

   An example of a decoder is described in Section 5.4.  The process for
   producing source and repair symbols from the intermediate block is
   designed so that the intermediate block can be recovered from any
   sufficiently large set of encoding symbols, independent of the mix of
   source and repair symbols in the set.  Once the intermediate block is
   recovered, missing source symbols of the source block can be
   recovered using the encoding process.

   Requirements for a RaptorQ-compliant decoder are provided in
   Section 5.8.  A number of decoding algorithms are possible to achieve
   these requirements.  An efficient decoding algorithm to achieve these
   requirements is provided in Section 5.4.

   The construction of the intermediate and repair symbols is based in
   part on a pseudo-random number generator described in Section 5.3.
   This generator is based on a fixed set of 1024 random numbers that
   must be available to both sender and receiver.  These numbers are
   provided in Section 5.5.  Encoding and decoding operations for
   RaptorQ use operations on octets.  Section 5.7 describes how to
   perform these operations.

   Finally, the construction of the intermediate symbols from the source
   symbols is governed by "systematic indices", values of which are
   provided in Section 5.6 for specific extended source block sizes
   between 6 and K'_max = 56403 source symbols.  Thus, the RaptorQ code
   supports source blocks with between 1 and 56403 source symbols.

5.3.  Systematic RaptorQ Encoder

5.3.1.  Introduction

   For a given source block of K source symbols, for encoding and
   decoding purposes, the source block is augmented with K'-K additional
   padding symbols, where K' is the smallest value that is at least K in
   the systematic index Table 2 of Section 5.6.  The reason for padding
   out a source block to K' symbols is to enable faster encoding and
   decoding and to minimize the amount of table information that needs
   to be stored in the encoder and decoder.

   For purposes of transmitting and receiving data, the value of K is
   used to determine the number of source symbols in a source block, and
   thus K needs to be known at the sender and the receiver.  In this
   case, the sender and receiver can compute K' from K and the K'-K
   padding symbols can be automatically added to the source block
   without any additional communication.  The encoding symbol ID (ESI)
   is used by a sender and receiver to identify the encoding symbols of
   a source block, where the encoding symbols of a source block consist
   of the source symbols and the repair symbols associated with the
   source block.  For a source block with K source symbols, the ESIs for
   the source symbols are 0, 1, 2, ..., K-1, and the ESIs for the repair
   symbols are K, K+1, K+2, ....  Using the ESI for identifying encoding
   symbols in transport ensures that the ESI values continue
   consecutively between the source and repair symbols.

   For purposes of encoding and decoding data, the value of K' derived
   from K is used as the number of source symbols of the extended source
   block upon which encoding and decoding operations are performed,
   where the K' source symbols consist of the original K source symbols
   and an additional K'-K padding symbols.  The Internal Symbol ID (ISI)
   is used by the encoder and decoder to identify the symbols associated
   with the extended source block, i.e., for generating encoding symbols
   and for decoding.  For a source block with K original source symbols,
   the ISIs for the original source symbols are 0, 1, 2, ..., K-1, the
   ISIs for the K'-K padding symbols are K, K+1, K+2, ..., K'-1, and the
   ISIs for the repair symbols are K', K'+1, K'+2, ....  Using the ISI
   for encoding and decoding allows the padding symbols of the extended
   source block to be treated the same way as other source symbols of
   the extended source block.  Also, it ensures that a given prefix of
   repair symbols are generated in a consistent way for a given number
   K' of source symbols in the extended source block, independent of K.

   The relationship between the ESIs and the ISIs is simple: the ESIs
   and the ISIs for the original K source symbols are the same, the K'-K
   padding symbols have an ISI but do not have a corresponding ESI
   (since they are symbols that are neither sent nor received), and a
   repair symbol ISI is simply the repair symbol ESI plus K'-K.  The
   translation between ESIs (used to identify encoding symbols sent and
   received) and the corresponding ISIs (used for encoding and
   decoding), as well as determining the proper padding of the extended
   source block with padding symbols (used for encoding and decoding),
   is the internal responsibility of the RaptorQ encoder/decoder.
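
   The ESI/ISI translation described above can be sketched as follows.
   This is a minimal illustration, not part of the specification; the
   function names esi_to_isi and isi_to_esi are hypothetical, and K and
   K' are assumed to have been determined already.

```python
from typing import Optional


def esi_to_isi(esi: int, k: int, k_prime: int) -> int:
    """Map an Encoding Symbol ID to the Internal Symbol ID used by the codec."""
    if esi < 0:
        raise ValueError("ESI must be non-negative")
    # Source symbols keep their index; repair symbols are shifted past
    # the K'-K padding symbols, which have an ISI but no ESI.
    return esi if esi < k else esi + (k_prime - k)


def isi_to_esi(isi: int, k: int, k_prime: int) -> Optional[int]:
    """Map an ISI back to an ESI; padding symbols yield None (never sent)."""
    if isi < 0:
        raise ValueError("ISI must be non-negative")
    if isi < k:
        return isi            # original source symbol
    if isi < k_prime:
        return None           # padding symbol
    return isi - (k_prime - k)  # repair symbol
```

   For example, with K = 8 and K' = 10, the first repair symbol has
   ESI 8 and ISI 10, and ISIs 8 and 9 identify padding symbols.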

5.3.2.  Encoding Overview

   The systematic RaptorQ encoder is used to generate any number of
   repair symbols from a source block that consists of K source symbols
   placed into an extended source block C'.  Figure 4 shows the encoding
   overview.

   The first step of encoding is to construct an extended source block
   by adding zero or more padding symbols such that the total number of
   symbols, K', is one of the values listed in Section 5.6.  Each
   padding symbol consists of T octets where the value of each octet is
   zero.  K' MUST be selected as the smallest value of K' from the table
   of Section 5.6 that is greater than or equal to K.
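
   The selection of K' and the padding step can be sketched as follows.
   This is an illustrative sketch only: the function names and the
   k_prime_table argument are hypothetical, and the table passed in is
   assumed to be the ascending list of K' values taken from Section 5.6.
   The short list used in the usage note below is a placeholder for
   illustration and is not a substitute for that table.

```python
from bisect import bisect_left


def select_k_prime(k, k_prime_table):
    """Return the smallest entry of the ascending table that is >= k."""
    if k < 1:
        raise ValueError("K must be at least 1")
    idx = bisect_left(k_prime_table, k)
    if idx == len(k_prime_table):
        raise ValueError("K exceeds the largest supported K'")
    return k_prime_table[idx]


def extend_source_block(source_symbols, symbol_size, k_prime_table):
    """Append K'-K all-zero padding symbols of T octets each."""
    if any(len(s) != symbol_size for s in source_symbols):
        raise ValueError("all source symbols must be exactly T octets")
    k = len(source_symbols)
    k_prime = select_k_prime(k, k_prime_table)
    padding = [bytes(symbol_size)] * (k_prime - k)  # bytes(n) is n zero octets
    return list(source_symbols) + padding
```

   With the placeholder table [10, 12, 18, 20], a source block of 11
   symbols would be extended to 12 symbols by one zero-filled symbol.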

         +----------------------------------------------------------+
         |                                                          |
         |    +-----------+    +--------------+    +-------------+  |
      C' |    |           | C' | Intermediate | C  |             |  |
     ----+--->|  Padding  |--->|    Symbol    |--->|   Encoding  |--+-->
      K  |    |           | K' |  Generation  | L  |             |  |
         |    +-----------+    +--------------+    +-------------+  |
         |           |                             (d,a,b, ^        |
         |           |                            d1,a1,b1)|        |
         |           |                              +------------+  |
         |           |              K'              |   Tuple    |  |
         |           +----------------------------->|            |  |
         |                                          | Generation |  |
         |                                          +------------+  |
         |                                                 ^        |
         +-------------------------------------------------+--------+
                                                           |
                                                         ISI X

                        Figure 4: Encoding Overview

   Let C'[0], ..., C'[K-1] denote the K source symbols.

   Let C'[K], ..., C'[K'-1] denote the K'-K padding symbols, which are
   all set to zero bits.  Then, C'[0], ..., C'[K'-1] are the symbols of
   the extended source block upon which encoding and decoding are
   performed.

   In the remainder of this description, these padding symbols will be
   considered as additional source symbols and referred to as such.
   However, these padding symbols are not part of the encoding symbols,
   i.e., they are not sent as part of the encoding.  At a receiver, the
   value of K' can be computed based on K, then the receiver can insert
   K'-K padding symbols at the end of a source block of K' source
   symbols and recover the remaining K source symbols of the source
   block from received encoding symbols.

   The second step of encoding is to generate a number, L > K', of
   intermediate symbols from the K' source symbols.  In this step, K'
   source tuples (d[0], a[0], b[0], d1[0], a1[0], b1[0]), ..., (d[K'-1],
   a[K'-1], b[K'-1], d1[K'-1], a1[K'-1], b1[K'-1]) are generated using
   the Tuple[] generator as described in Section 5.3.5.4.  The K' source
   tuples and the ISIs associated with the K' source symbols are used to
   determine L intermediate symbols C[0], ..., C[L-1] from the source
   symbols using an inverse encoding process.  This process can be
   realized by a RaptorQ decoding process.

   Certain "pre-coding relationships" must hold within the L
   intermediate symbols.  Section 5.3.3.3 describes these relationships.
   Section 5.3.3.4 describes how the intermediate symbols are generated
   from the source symbols.

   Once the intermediate symbols have been generated, repair symbols can
   be produced.  For a repair symbol with ISI X >= K', the tuple of
   non-negative integers (d, a, b, d1, a1, b1) can be generated, using
   the Tuple[] generator as described in Section 5.3.5.4.  Then, the
   (d, a, b, d1, a1, b1) tuple and the ISI X are used to generate the
   corresponding repair symbol from the intermediate symbols using the
   Enc[] generator described in Section 5.3.5.3.  The corresponding ESI
   for this repair symbol is then X-(K'-K).  Note that source symbols of
   the extended source block can also be generated using the same
   process, i.e., for any X < K', the symbol generated using this
   process has the same value as C'[X].

5.3.3.  First Encoding Step: Intermediate Symbol Generation

5.3.3.1.  General

   This encoding step is a pre-coding step to generate the L
   intermediate symbols C[0], ..., C[L-1] from the source symbols C'[0],
   ..., C'[K'-1], where L > K' is defined in Section 5.3.3.3.  The
   intermediate symbols are uniquely defined by two sets of constraints:

   1.  The intermediate symbols are related to the source symbols by a
       set of source symbol tuples and by the ISIs of the source
       symbols.  The generation of the source symbol tuples is defined
       in Section 5.3.3.2 using the Tuple[] generator as described in
       Section 5.3.5.4.

   2.  A number of pre-coding relationships hold within the intermediate
       symbols themselves.  These are defined in Section 5.3.3.3.

   The generation of the L intermediate symbols is then defined in
   Section 5.3.3.4.

5.3.3.2.  Source Symbol Tuples

   Each of the K' source symbols is associated with a source symbol
   tuple (d[X], a[X], b[X], d1[X], a1[X], b1[X]) for 0 <= X < K'.  The
   source symbol tuples are determined using the Tuple[] generator
   defined in Section 5.3.5.4 as:

      For each X, 0 <= X < K'

         (d[X], a[X], b[X], d1[X], a1[X], b1[X]) = Tuple[K', X]

5.3.3.3.  Pre-Coding Relationships

   The pre-coding relationships amongst the L intermediate symbols are
   defined by requiring that a set of S+H linear combinations of the
   intermediate symbols evaluate to zero.  There are S LDPC and H HDPC
   symbols, and thus L = K'+S+H.  Another partition of the L
   intermediate symbols is into two sets, one set of W LT symbols and
   another set of P PI symbols, and thus it is also the case that L =
   W+P.  The P PI symbols are treated differently than the W LT symbols
   in the encoding process.  The P PI symbols consist of the H HDPC
   symbols together with a set of U = P-H of the other K' intermediate
   symbols.  The W LT symbols consist of the S LDPC symbols together
   with W-S of the other K' intermediate symbols.  The values of these
   parameters are determined from K' as described below, where H(K'),
   S(K'), and W(K') are derived from Table 2 in Section 5.6.

   Let

   o  S = S(K')

   o  H = H(K')

   o  W = W(K')

   o  L = K' + S + H

   o  P = L - W

   o  P1 denote the smallest prime number greater than or equal to P.

   o  U = P - H

   o  B = W - S

   o  C[0], ..., C[B-1] denote the intermediate symbols that are LT
      symbols but not LDPC symbols.

   o  C[B], ..., C[B+S-1] denote the S LDPC symbols that are also LT
      symbols.

   o  C[W], ..., C[W+U-1] denote the intermediate symbols that are PI
      symbols but not HDPC symbols.

   o  C[L-H], ..., C[L-1] denote the H HDPC symbols that are also PI
      symbols.
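
   The derived parameters listed above can be computed as in the
   following sketch.  The function names are hypothetical, and S, H, and
   W are assumed to have been looked up for the given K' in Table 2 of
   Section 5.6; the values in the usage note are illustrative inputs
   only.

```python
def smallest_prime_at_least(n):
    """Return the smallest prime p with p >= n (trial division)."""
    def is_prime(x):
        if x < 2:
            return False
        i = 2
        while i * i <= x:
            if x % i == 0:
                return False
            i += 1
        return True

    while not is_prime(n):
        n += 1
    return n


def derive_parameters(k_prime, s, h, w):
    """Derive L, P, P1, U, and B from K' and the looked-up S, H, W."""
    l = k_prime + s + h          # total number of intermediate symbols
    p = l - w                    # number of PI symbols
    params = {
        "L": l,
        "P": p,
        "P1": smallest_prime_at_least(p),
        "U": p - h,              # PI symbols that are not HDPC symbols
        "B": w - s,              # LT symbols that are not LDPC symbols
    }
    # The four index ranges C[0..B-1], C[B..B+S-1], C[W..W+U-1], and
    # C[L-H..L-1] must tile all L intermediate symbols.
    if params["B"] + s + params["U"] + h != l:
        raise ValueError("inconsistent parameters")
    return params
```

   For instance, derive_parameters(10, 7, 10, 17) yields L = 27, P = 10,
   P1 = 11, U = 0, and B = 10.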

   The first set of pre-coding relations, called LDPC relations, is
   described below and requires that at the end of this process the set
   of symbols D[0] , ..., D[S-1] are all zero:

   o  Initialize the symbols D[0] = C[B], ..., D[S-1] = C[B+S-1].

   o  For i = 0, ..., B-1 do

      *  a = 1 + floor(i/S)

      *  b = i % S

      *  D[b] = D[b] + C[i]

      *  b = (b + a) % S

      *  D[b] = D[b] + C[i]

      *  b = (b + a) % S

      *  D[b] = D[b] + C[i]

   o  For i = 0, ..., S-1 do

      *  a = i % P

      *  b = (i+1) % P

      *  D[i] = D[i] + C[W+a] + C[W+b]

   Recall that the addition of symbols is to be carried out as specified
   in Section 5.7.

   Note that the LDPC relations as defined in the algorithm above are
   linear, so there exists an S x B matrix G_LDPC,1 and an S x P matrix
   G_LDPC,2 such that

      G_LDPC,1 * Transpose[(C[0], ..., C[B-1])]
      + G_LDPC,2 * Transpose[(C[W], ..., C[W+P-1])]
      + Transpose[(C[B], ..., C[B+S-1])] = 0

   (The matrix G_LDPC,1 is defined by the first loop in the above
   algorithm, and G_LDPC,2 can be deduced from the second loop.)
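
   As an illustration, the two loops of the LDPC algorithm above can be
   turned into the matrices G_LDPC,1 and G_LDPC,2 as sketched below.
   The function name and argument names are hypothetical.  Entries are
   toggled with exclusive-or, on the assumption that symbol addition is
   its own inverse, so that adding the same C[i] twice to one D[b] would
   cancel.

```python
def ldpc_matrices(s, b_count, p):
    """Build G_LDPC,1 (S x B) and G_LDPC,2 (S x P) as 0/1 matrices."""
    g1 = [[0] * b_count for _ in range(s)]
    for i in range(b_count):
        a = 1 + i // s
        b = i % s
        g1[b][i] ^= 1          # D[b] = D[b] + C[i]
        b = (b + a) % s
        g1[b][i] ^= 1
        b = (b + a) % s
        g1[b][i] ^= 1

    g2 = [[0] * p for _ in range(s)]
    for i in range(s):
        g2[i][i % p] ^= 1        # C[W+a] with a = i % P
        g2[i][(i + 1) % p] ^= 1  # C[W+b] with b = (i+1) % P
    return g1, g2
```

   With the illustrative values S = 7, B = 10, and P = 10, column 9 of
   G_LDPC,1 has ones in rows 2, 4, and 6, and row 6 of G_LDPC,2 has
   ones in columns 6 and 7.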

   The second set of relations among the intermediate symbols C[0], ...,
   C[L-1] are the HDPC relations and they are defined as follows:

   Let

   o  alpha denote the octet represented by integer 2 as defined in
      Section 5.7.

   o  MT denote an H x (K' + S) matrix of octets, where for j=0, ...,
      K'+S-2, the entry MT[i,j] is the octet represented by the integer
      1 if i= Rand[j+1,6,H] or i = (Rand[j+1,6,H] + Rand[j+1,7,H-1] + 1)
      % H, and MT[i,j] is the zero element for all other values of i,
      and for j=K'+S-1, MT[i,j] = alpha^^i for i=0, ..., H-1.

   o  GAMMA denote a (K'+S) x (K'+S) matrix of octets, where

         GAMMA[i,j] =

            alpha ^^ (i-j) for i >= j,

            0 otherwise.

   Then, the relationship between the first K'+S intermediate symbols
   C[0], ..., C[K'+S-1] and the H HDPC symbols C[K'+S], ..., C[K'+S+H-1]
   is given by:

      Transpose[(C[K'+S], ..., C[K'+S+H-1])] + MT * GAMMA *
      Transpose[(C[0], ..., C[K'+S-1])] = 0,

   where '*' represents standard matrix multiplication utilizing the
   octet multiplication to define the multiplication between a matrix of
   octets and a matrix of symbols (in particular, the column vector of
   symbols), and '+' denotes addition over octet vectors.
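
   The matrices MT and GAMMA can be constructed as in the following
   sketch.  Two assumptions are made that are not established in this
   section: the octet product of Section 5.7 is taken to be
   multiplication in GF(256) reduced by x^8 + x^4 + x^3 + x^2 + 1
   (0x11D), which should be checked against that section, and the Rand[]
   generator of Section 5.3.5.1 is supplied by the caller as a function
   rand(y, i, m).  All function names are hypothetical.

```python
ALPHA = 2  # the octet represented by the integer 2


def gf256_mul(a, b, poly=0x11D):
    """Multiply two octets in GF(256); the polynomial is an assumption."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return result


def build_gamma(n):
    """GAMMA[i][j] = alpha^^(i-j) for i >= j, and 0 otherwise."""
    powers = [1]
    for _ in range(n - 1):
        powers.append(gf256_mul(powers[-1], ALPHA))
    return [[powers[i - j] if i >= j else 0 for j in range(n)]
            for i in range(n)]


def build_mt(k_prime, s, h, rand):
    """Build the H x (K'+S) matrix MT using a caller-supplied Rand[]."""
    n = k_prime + s
    mt = [[0] * n for _ in range(h)]
    for j in range(n - 1):
        r1 = rand(j + 1, 6, h)
        r2 = (r1 + rand(j + 1, 7, h - 1) + 1) % h  # always differs from r1
        mt[r1][j] = 1
        mt[r2][j] = 1
    power = 1
    for i in range(h):           # last column holds alpha^^i
        mt[i][n - 1] = power
        power = gf256_mul(power, ALPHA)
    return mt
```

   Because the second row index is offset from the first by a value
   between 1 and H-1 modulo H, each of the first K'+S-1 columns of MT
   contains exactly two non-zero entries.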

5.3.3.4.  Intermediate Symbols

5.3.3.4.1.  Definition

   Given the K' source symbols C'[0], C'[1], ..., C'[K'-1] the L
   intermediate symbols C[0], C[1], ..., C[L-1] are the uniquely defined
   symbol values that satisfy the following conditions:

   1.  The K' source symbols C'[0], C'[1], ..., C'[K'-1] satisfy the K'
       constraints

          C'[X] = Enc[K', (C[0], ..., C[L-1]), (d[X], a[X], b[X], d1[X],
          a1[X], b1[X])], for all X, 0 <= X < K',

       where (d[X], a[X], b[X], d1[X], a1[X], b1[X]) = Tuple[K',X],
       Tuple[] is defined in Section 5.3.5.4, and Enc[] is described in
       Section 5.3.5.3.

   2.  The L intermediate symbols C[0], C[1], ..., C[L-1] satisfy the
       pre-coding relationships defined in Section 5.3.3.3.

5.3.3.4.2.  Example Method for Calculation of Intermediate Symbols

   This section describes a possible method for calculation of the L
   intermediate symbols C[0], C[1], ..., C[L-1] satisfying the
   constraints in Section 5.3.3.4.1.

   The L intermediate symbols can be calculated as follows:

   Let

   o  C denote the column vector of the L intermediate symbols, C[0],
      C[1], ..., C[L-1].

   o  D denote the column vector consisting of S+H zero symbols followed
      by the K' source symbols C'[0], C'[1], ..., C'[K'-1].

   Then, the above constraints define an L x L matrix A of octets such
   that:

      A*C = D

   The matrix A can be constructed as follows:

   Let

   o  G_LDPC,1 and G_LDPC,2 be S x B and S x P matrices as defined in
      Section 5.3.3.3.

   o  G_HDPC be the H x (K'+S) matrix such that

         G_HDPC * Transpose[(C[0], ..., C[K'+S-1])] =
         Transpose[(C[K'+S], ..., C[L-1])],

         i.e., G_HDPC = MT*GAMMA

   o  I_S be the S x S identity matrix

   o  I_H be the H x H identity matrix

   o  G_ENC be the K' x L matrix such that

         G_ENC * Transpose[(C[0], ..., C[L-1])] =
         Transpose[(C'[0],C'[1], ...,C'[K'-1])],

         i.e., G_ENC[i,j] = 1 if and only if C[j] is included in the
         symbols that are summed to produce Enc[K', (C[0], ..., C[L-1]),
         (d[i], a[i], b[i], d1[i], a1[i], b1[i])] and G_ENC[i,j] = 0
         otherwise.

   Then

   o  The first S rows of A are equal to G_LDPC,1 | I_S | G_LDPC,2.

   o  The next H rows of A are equal to G_HDPC | I_H.

   o  The remaining K' rows of A are equal to G_ENC.

   The matrix A is depicted in Figure 5 below:

                       B               S         U         H
            +-----------------------+-------+------------------+
            |                       |       |                  |
          S |        G_LDPC,1       |  I_S  |      G_LDPC,2    |
            |                       |       |                  |
            +-----------------------+-------+----------+-------+
            |                                          |       |
          H |                G_HDPC                    |  I_H  |
            |                                          |       |
            +------------------------------------------+-------+
            |                                                  |
            |                                                  |
         K' |                      G_ENC                       |
            |                                                  |
            |                                                  |
            +--------------------------------------------------+

                             Figure 5: The Matrix A
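
   The block layout of Figure 5 can be assembled as in the following
   sketch.  The function names are hypothetical, and the four block
   matrices are assumed to be given as lists of rows of octets with the
   dimensions stated above.

```python
def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def assemble_a(g_ldpc1, g_ldpc2, g_hdpc, g_enc):
    """Stack the blocks of Figure 5 into the L x L matrix A."""
    i_s = identity(len(g_ldpc1))
    i_h = identity(len(g_hdpc))
    # First S rows: G_LDPC,1 | I_S | G_LDPC,2
    rows = [list(r1) + ri + list(r2)
            for r1, ri, r2 in zip(g_ldpc1, i_s, g_ldpc2)]
    # Next H rows: G_HDPC | I_H
    rows += [list(rh) + ri for rh, ri in zip(g_hdpc, i_h)]
    # Remaining K' rows: G_ENC
    rows += [list(r) for r in g_enc]
    width = len(rows[0])
    if len(rows) != width or any(len(r) != width for r in rows):
        raise ValueError("blocks do not form a square L x L matrix")
    return rows
```

   The final check rejects block dimensions that are inconsistent with
   L = K'+S+H = W+P.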

   The intermediate symbols can then be calculated as:

      C = (A^^-1)*D

   The source tuples are generated such that, for any K', the matrix A
   has full rank and is therefore invertible.  This calculation can be
   realized by applying a RaptorQ decoding process to the K' source
   symbols C'[0], C'[1], ..., C'[K'-1] to produce the L intermediate
   symbols C[0], C[1], ..., C[L-1].

   To efficiently generate the intermediate symbols from the source
   symbols, it is recommended that an efficient decoder implementation
   such as that described in Section 5.4 be used.

5.3.4.  Second Encoding Step: Encoding

   In the second encoding step, the repair symbol with ISI X (X >= K')
   is generated by applying the generator Enc[K', (C[0], C[1], ...,
   C[L-1]), (d, a, b, d1, a1, b1)] defined in Section 5.3.5.3 to the L
   intermediate symbols C[0], C[1], ..., C[L-1] using the tuple (d, a,
   b, d1, a1, b1)=Tuple[K',X].

5.3.5.  Generators

5.3.5.1.  Random Number Generator

   The random number generator Rand[y, i, m] is defined as follows,
   where y is a non-negative integer, i is a non-negative integer less
   than 256, and m is a positive integer, and the value produced is an
   integer between 0 and m-1.  Let V0, V1, V2, and V3 be the arrays
   provided in Section 5.5.

   Let

   o  x0 = (y + i) mod 2^^8

   o  x1 = (floor(y / 2^^8) + i) mod 2^^8

   o  x2 = (floor(y / 2^^16) + i) mod 2^^8

   o  x3 = (floor(y / 2^^24) + i) mod 2^^8

   Then

      Rand[y, i, m] = (V0[x0] ^ V1[x1] ^ V2[x2] ^ V3[x3]) % m
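
   The definition above translates directly into code, as in the
   following sketch.  The tables V0 to V3 of Section 5.5 are not
   reproduced here, so they are passed in as arguments; the function
   name and parameter names are hypothetical.

```python
def rand(y, i, m, v0, v1, v2, v3):
    """Rand[y, i, m], with the four 256-entry tables supplied by the caller."""
    if y < 0 or not 0 <= i < 256 or m <= 0:
        raise ValueError("argument out of range")
    # For non-negative y, floor(y / 2^^n) is the same as y >> n.
    x0 = (y + i) % 256
    x1 = ((y >> 8) + i) % 256
    x2 = ((y >> 16) + i) % 256
    x3 = ((y >> 24) + i) % 256
    return (v0[x0] ^ v1[x1] ^ v2[x2] ^ v3[x3]) % m
```

   Each of the four table indices is taken from one octet of y, offset
   by i, so a change in any octet of y selects a different table entry.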

5.3.5.2.  Degree Generator

   The degree generator Deg[v] is defined as follows, where v is a non-
   negative integer that is less than 2^^20 = 1048576.  Given v, find
   index d in Table 1 such that f[d-1] <= v < f[d], and set Deg[v] =
   min(d, W-2).  Recall that W is derived from K' as described in
   Section 5.3.3.3.

                 +---------+---------+---------+---------+
                 | Index d | f[d]    | Index d | f[d]    |
                 +---------+---------+---------+---------+
                 | 0       | 0       | 1       | 5243    |
                 +---------+---------+---------+---------+
                 | 2       | 529531  | 3       | 704294  |
                 +---------+---------+---------+---------+
                 | 4       | 791675  | 5       | 844104  |
                 +---------+---------+---------+---------+
                 | 6       | 879057  | 7       | 904023  |
                 +---------+---------+---------+---------+
                 | 8       | 922747  | 9       | 937311  |
                 +---------+---------+---------+---------+
                 | 10      | 948962  | 11      | 958494  |
                 +---------+---------+---------+---------+
                 | 12      | 966438  | 13      | 973160  |
                 +---------+---------+---------+---------+
                 | 14      | 978921  | 15      | 983914  |
                 +---------+---------+---------+---------+
                 | 16      | 988283  | 17      | 992138  |
                 +---------+---------+---------+---------+
                 | 18      | 995565  | 19      | 998631  |
                 +---------+---------+---------+---------+
                 | 20      | 1001391 | 21      | 1003887 |
                 +---------+---------+---------+---------+
                 | 22      | 1006157 | 23      | 1008229 |
                 +---------+---------+---------+---------+
                 | 24      | 1010129 | 25      | 1011876 |
                 +---------+---------+---------+---------+
                 | 26      | 1013490 | 27      | 1014983 |
                 +---------+---------+---------+---------+
                 | 28      | 1016370 | 29      | 1017662 |
                 +---------+---------+---------+---------+
                 | 30      | 1048576 |         |         |
                 +---------+---------+---------+---------+

       Table 1: Defines the Degree Distribution for Encoding Symbols
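
   The lookup described above can be sketched in Python as follows.
   This is a minimal sketch: the names F and deg are chosen here for
   illustration, the list F holds the f[d] values of Table 1, and W is
   assumed to have been derived from K' by the caller.

```python
from bisect import bisect_right

# f[0], f[1], ..., f[30] copied from Table 1.
F = [0, 5243, 529531, 704294, 791675, 844104, 879057, 904023, 922747,
     937311, 948962, 958494, 966438, 973160, 978921, 983914, 988283,
     992138, 995565, 998631, 1001391, 1003887, 1006157, 1008229,
     1010129, 1011876, 1013490, 1014983, 1016370, 1017662, 1048576]


def deg(v, W):
    """Sketch of Deg[v]: find d with f[d-1] <= v < f[d], cap at W-2."""
    if not 0 <= v < 1 << 20:
        raise ValueError("v must satisfy 0 <= v < 2^^20")
    # bisect_right returns the smallest index d with v < F[d].
    d = bisect_right(F, v)
    return min(d, W - 2)
```

   Because f[] is strictly increasing, a binary search is sufficient;
   a linear scan over the 31 entries would work equally well.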

5.3.5.3.  Encoding Symbol Generator

   The encoding symbol generator Enc[K', (C[0], C[1], ..., C[L-1]), (d,
   a, b, d1, a1, b1)] takes the following inputs:

   o  K' is the number of source symbols for the extended source block.
      Let L, W, B, S, P, and P1 be derived from K' as described in
      Section 5.3.3.3.

   o  (C[0], C[1], ..., C[L-1]) is the array of L intermediate symbols
      (sub-symbols) generated as described in Section 5.3.3.4.

   o  (d, a, b, d1, a1, b1) is a source tuple determined from ISI X
      using the Tuple[] generator defined in Section 5.3.5.4, whereby

      *  d is a positive integer denoting an encoding symbol LT degree

      *  a is a positive integer between 1 and W-1 inclusive

      *  b is a non-negative integer between 0 and W-1 inclusive

      *  d1 is a positive integer that has value either 2 or 3 denoting
         an encoding symbol PI degree

      *  a1 is a positive integer between 1 and P1-1 inclusive

      *  b1 is a non-negative integer between 0 and P1-1 inclusive

   The encoding symbol generator produces a single encoding symbol as
   output (referred to as result), according to the following algorithm:

   o  result = C[b]

   o  For j = 1, ..., d-1 do

      *  b = (b + a) % W

      *  result = result + C[b]

   o  While (b1 >= P) do b1 = (b1+a1) % P1

   o  result = result + C[W+b1]

   o  For j = 1, ..., d1-1 do

      *  b1 = (b1 + a1) % P1

      *  While (b1 >= P) do b1 = (b1+a1) % P1

      *  result = result + C[W+b1]

   o  Return result
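
   The algorithm above can be sketched in Python as follows.  This is
   an illustrative sketch, not a reference implementation: symbols are
   modelled as equal-length bytes objects, symbol addition is taken to
   be octet-wise exclusive-or (consistent with the arithmetic of
   Section 5.7), and the names enc and xor_symbols as well as the
   argument order are assumptions made here.  W, P, and P1 are assumed
   to have been derived from K' by the caller.

```python
def xor_symbols(x, y):
    """Add two symbols of equal size (octet-wise XOR)."""
    return bytes(p ^ q for p, q in zip(x, y))


def enc(C, W, P, P1, tup):
    """Sketch of Enc[]: C is the list of L intermediate symbols and
    tup is the tuple (d, a, b, d1, a1, b1)."""
    d, a, b, d1, a1, b1 = tup
    # LT part: d symbols taken from C[0..W-1].
    result = C[b]
    for _ in range(1, d):
        b = (b + a) % W
        result = xor_symbols(result, C[b])
    # PI part: d1 symbols taken from C[W..W+P-1]; values of b1 that
    # are >= P are skipped by stepping again modulo P1.
    while b1 >= P:
        b1 = (b1 + a1) % P1
    result = xor_symbols(result, C[W + b1])
    for _ in range(1, d1):
        b1 = (b1 + a1) % P1
        while b1 >= P:
            b1 = (b1 + a1) % P1
        result = xor_symbols(result, C[W + b1])
    return result
```

   For example, with toy parameters W = 5, P = 4, P1 = 5 (not values
   produced by Section 5.3.3.3) and the tuple (3, 2, 1, 2, 3, 4), the
   sketch sums C[1], C[3], C[0], C[7], and C[5].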

5.3.5.4.  Tuple Generator

   The tuple generator Tuple[K',X] takes the following inputs:

   o  K': the number of source symbols in the extended source block

   o  X: an ISI

   Let

   o  L, W, and P1 be determined from K' as described in Section 5.3.3.3

   o  J = J(K') be the systematic index associated with K', as defined
      in Table 2 in Section 5.6

   The output of the tuple generator is a tuple, (d, a, b, d1, a1, b1),
   determined as follows:

   o  A = 53591 + J*997

   o  if (A % 2 == 0) { A = A + 1 }

   o  B = 10267*(J+1)

   o  y = (B + X*A) % 2^^32

   o  v = Rand[y, 0, 2^^20]

   o  d = Deg[v]

   o  a = 1 + Rand[y, 1, W-1]

   o  b = Rand[y, 2, W]

   o  If (d < 4) { d1 = 2 + Rand[X, 3, 2] } else { d1 = 2 }

   o  a1 = 1 + Rand[X, 4, P1-1]

   o  b1 = Rand[X, 5, P1]
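
   The steps above can be sketched in Python as follows.  This is a
   minimal sketch under stated assumptions: the function name
   tuple_gen is chosen here, and the Rand[] and Deg[] generators are
   passed in as callables (rand taking y, i, m and deg taking v)
   rather than being implemented, so that the sketch does not depend
   on the lookup tables.  J, W, and P1 are assumed to have been
   obtained from K' by the caller.

```python
def tuple_gen(X, J, W, P1, rand, deg):
    """Sketch of Tuple[K', X].

    rand(y, i, m) and deg(v) are caller-supplied callables standing
    in for Rand[] and Deg[]; J, W and P1 are derived from K'.
    """
    A = 53591 + J * 997
    if A % 2 == 0:
        A = A + 1
    B = 10267 * (J + 1)
    y = (B + X * A) % 2 ** 32
    v = rand(y, 0, 2 ** 20)
    d = deg(v)
    a = 1 + rand(y, 1, W - 1)
    b = rand(y, 2, W)
    if d < 4:
        d1 = 2 + rand(X, 3, 2)
    else:
        d1 = 2
    a1 = 1 + rand(X, 4, P1 - 1)
    b1 = rand(X, 5, P1)
    return (d, a, b, d1, a1, b1)
```

   Note that d, a, and b are driven by y, whereas d1, a1, and b1 are
   driven directly by the ISI X.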

5.4.  Example FEC Decoder

5.4.1.  General

   This section describes an efficient decoding algorithm for the
   RaptorQ code introduced in this specification.  Note that each
   received encoding symbol is a known linear combination of the
   intermediate symbols.  So, each received encoding symbol provides a
   linear equation among the intermediate symbols, which, together with
   the known linear pre-coding relationships amongst the intermediate
   symbols, gives a system of linear equations.  Thus, any algorithm for
   solving systems of linear equations can successfully decode the
   intermediate symbols and hence the source symbols.  However, the
   algorithm chosen has a major effect on the computational efficiency
   of the decoding.

5.4.2.  Decoding an Extended Source Block

5.4.2.1.  General

   It is assumed that the decoder knows the structure of the source
   block it is to decode, including the symbol size, T, and the number K
   of symbols in the source block and the number K' of source symbols in
   the extended source block.

   From the algorithms described in Section 5.3, the RaptorQ decoder can
   calculate the total number L = K'+S+H of intermediate symbols and
   determine how they were generated from the extended source block to
   be decoded.  In this description, it is assumed that the received
   encoding symbols for the extended source block to be decoded are
   passed to the decoder.  Furthermore, for each such encoding symbol,
   it is assumed that the number and set of intermediate symbols whose
   sum is equal to the encoding symbol are passed to the decoder.  In
   the case of source symbols, including padding symbols, the source
   symbol tuples described in Section 5.3.3.2 indicate the number and
   set of intermediate symbols that sum to give each source symbol.

   Let N >= K' be the number of received encoding symbols to be used for
   decoding, including padding symbols for an extended source block, and
   let M = S+H+N.  Then, with the notation of Section 5.3.3.4.2, we have
   A*C = D.

   Decoding an extended source block is equivalent to decoding C from
   known A and D.  It is clear that C can be decoded if and only if the
   rank of A is L.  Once C has been decoded, missing source symbols can
   be obtained by using the source symbol tuples to determine the number
   and set of intermediate symbols that must be summed to obtain each
   missing source symbol.
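
   The rank condition can be illustrated with the following Python
   sketch.  It is deliberately simplified: it assumes all entries of A
   lie in GF[2], whereas the real matrix A also contains GF[256]
   entries in its HDPC rows.  Each row is modelled as an integer
   bitmask, and the names gf2_rank and decodable are chosen here for
   illustration.

```python
def gf2_rank(rows):
    """Rank over GF[2] of a matrix given as a list of row bitmasks."""
    pivot_rows = {}  # leading bit position -> reduced row
    for r in rows:
        while r:
            lead = r.bit_length() - 1
            if lead not in pivot_rows:
                pivot_rows[lead] = r  # r is independent of earlier rows
                break
            r ^= pivot_rows[lead]     # clear the leading bit and retry
    return len(pivot_rows)


def decodable(rows, L):
    """C can be recovered if and only if the rank of A equals L."""
    return gf2_rank(rows) == L
```

   For instance, the three rows 110, 011, and 101 have rank 2 (the
   third is the sum of the first two), so three unknowns cannot be
   recovered from them; adding the row 001 raises the rank to 3.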

   The first step in decoding C is to form a decoding schedule.  In this
   step, A is converted using Gaussian elimination (using row operations
   and row and column reorderings) and after discarding M - L rows, into
   the L x L identity matrix.  The decoding schedule consists of the
   sequence of row operations and row and column reorderings during the
   Gaussian elimination process, and it only depends on A and not on D.

   The decoding of C from D can take place concurrently with the forming
   of the decoding schedule, or the decoding can take place afterwards
   based on the decoding schedule.

   The correspondence between the decoding schedule and the decoding of
   C is as follows.  Let c[0] = 0, c[1] = 1, ..., c[L-1] = L-1 and d[0]
   = 0, d[1] = 1, ..., d[M-1] = M-1 initially.

   o  Each time a multiple, beta, of row i of A is added to row i' in
      the decoding schedule, then in the decoding process the symbol
      beta*D[d[i]] is added to symbol D[d[i']].

   o  Each time a row i of A is multiplied by an octet beta, then in the
      decoding process the symbol D[d[i]] is also multiplied by beta.

   o  Each time row i is exchanged with row i' in the decoding schedule,
      then in the decoding process the value of d[i] is exchanged with
      the value of d[i'].

   o  Each time column j is exchanged with column j' in the decoding
      schedule, then in the decoding process the value of c[j] is
      exchanged with the value of c[j'].

   From this correspondence, it is clear that the total number of
   operations on symbols in the decoding of the extended source block is
   the number of row operations (not exchanges) in the Gaussian
   elimination.  Since A is the L x L identity matrix after the Gaussian
   elimination and after discarding the last M - L rows, it is clear at
   the end of successful decoding that the L symbols D[d[0]], D[d[1]],
   ..., D[d[L-1]] are the values of the L symbols C[c[0]], C[c[1]], ...,
   C[c[L-1]].
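
   The correspondence above can be sketched in Python as follows.
   This is an illustrative sketch only: the representation of the
   schedule as a list of tuples ("add", i, i', beta), ("mul", i,
   beta), ("swap_rows", i, i'), and ("swap_cols", j, j') is an
   assumption made here, as are all function names.  Symbols are
   modelled as bytes objects, and octet multiplication uses the
   GF[256] arithmetic of Section 5.7 (reduction polynomial
   x^^8 + x^^4 + x^^3 + x^^2 + 1).

```python
def gf256_mul(a, b):
    """Multiply two octets in GF[256] (shift-and-add, modulus 0x11D)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return r


def scale(beta, sym):
    """Multiply every octet of a symbol by beta."""
    return bytes(gf256_mul(beta, s) for s in sym)


def apply_schedule(schedule, D, L):
    """Replay a decoding schedule on the symbols D and return C."""
    D = list(D)
    c = list(range(L))       # column permutation
    d = list(range(len(D)))  # row permutation
    for op in schedule:
        if op[0] == "add":          # beta * row i is added to row i2
            _, i, i2, beta = op
            term = scale(beta, D[d[i]])
            D[d[i2]] = bytes(x ^ y for x, y in zip(D[d[i2]], term))
        elif op[0] == "mul":        # row i is multiplied by beta
            _, i, beta = op
            D[d[i]] = scale(beta, D[d[i]])
        elif op[0] == "swap_rows":
            _, i, i2 = op
            d[i], d[i2] = d[i2], d[i]
        elif op[0] == "swap_cols":
            _, j, j2 = op
            c[j], c[j2] = c[j2], c[j]
    # After elimination, D[d[k]] is the value of C[c[k]] for k < L.
    C = [None] * L
    for k in range(L):
        C[c[k]] = D[d[k]]
    return C
```

   Only the "add" and "mul" operations touch symbol data; the two
   exchange operations update the index arrays c and d and therefore
   cost nothing per symbol.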

   The order in which Gaussian elimination is performed to form the
   decoding schedule has no bearing on whether or not the decoding is
   successful.  However, the speed of the decoding depends heavily on
   the order in which Gaussian elimination is performed.  (Furthermore,
   maintaining a sparse representation of A is crucial, although this is
   not described here.)  The remainder of this section describes an
   order in which Gaussian elimination could be performed that is
   relatively efficient.

5.4.2.2.  First Phase

   In the first phase of the Gaussian elimination, the matrix A is
   conceptually partitioned into submatrices and, additionally, a matrix
   X is created.  This matrix has as many rows and columns as A, and it
   will be a lower triangular matrix throughout the first phase.  At the
   beginning of this phase, the matrix A is copied into the matrix X.

   The submatrix sizes are parameterized by non-negative integers i and
   u, which are initialized to 0 and P, the number of PI symbols,
   respectively.  The submatrices of A are:

   1.  The submatrix I defined by the intersection of the first i rows
       and first i columns.  This is the identity matrix at the end of
       each step in the phase.

   2.  The submatrix defined by the intersection of the first i rows and
       all but the first i columns and last u columns.  All entries of
       this submatrix are zero.

   3.  The submatrix defined by the intersection of the first i columns
       and all but the first i rows.  All entries of this submatrix are
       zero.

   4.  The submatrix U defined by the intersection of all the rows and
       the last u columns.

   5.  The submatrix V formed by the intersection of all but the first i
       columns and the last u columns and all but the first i rows.

   Figure 6 illustrates the submatrices of A.  At the beginning of the
   first phase, V consists of the first L-P columns of A, and U consists
   of the last P columns corresponding to the PI symbols.  In each step,
   a row of A is chosen.

               +-----------+-----------------+---------+
               |           |                 |         |
               |     I     |    All Zeros    |         |
               |           |                 |         |
               +-----------+-----------------+    U    |
               |           |                 |         |
               |           |                 |         |
               | All Zeros |       V         |         |
               |           |                 |         |
               |           |                 |         |
               +-----------+-----------------+---------+

               Figure 6: Submatrices of A in the First Phase

   The following graph defined by the structure of V is used in
   determining which row of A is chosen.  The columns that intersect V
   are the nodes in the graph, and the rows that have exactly 2 nonzero
   entries in V and are not HDPC rows are the edges of the graph that
   connect the two columns (nodes) in the positions of the two ones.  A
   component in this graph is a maximal set of nodes (columns) and edges
   (rows) such that there is a path between each pair of nodes/edges in
   the graph.  The size of a component is the number of nodes (columns)
   in the component.
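
   The graph described above can be sketched in Python with a
   union-find structure.  This is a minimal sketch: the input format
   (a dictionary mapping each qualifying row index to the pair of
   columns holding its two ones) and the function name are
   assumptions made here, and a practical decoder would maintain the
   components incrementally rather than rebuild them at every step.

```python
def row_in_largest_component(edges):
    """edges maps row index -> (col_a, col_b) for each non-HDPC row
    with exactly two ones in V.  Returns (row, component_size) for a
    row in a maximum size component, or None if there are no edges."""
    if not edges:
        return None
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Join the two columns (nodes) connected by each row (edge).
    for a, b in edges.values():
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    # Component size = number of nodes (columns) sharing a root.
    size = {}
    for node in list(parent):
        root = find(node)
        size[root] = size.get(root, 0) + 1

    best = max(edges, key=lambda r: size[find(edges[r][0])])
    return best, size[find(edges[best][0])]
```

   With the edges {10: (0, 1), 11: (2, 3), 12: (3, 4), 13: (5, 6)}, the
   columns 2, 3, and 4 form the largest component (size 3), so row 11
   or row 12 may be chosen.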

   There are at most L steps in the first phase.  The phase ends
   successfully when i + u = L, i.e., when V and the all zeros submatrix
   above V have disappeared, and A consists of I, the all zeros
   submatrix below I, and U.  The phase ends unsuccessfully in decoding
   failure if at some step before V disappears there is no nonzero row
   in V to choose in that step.  In each step, a row of A is chosen as
   follows:

   o  If all entries of V are zero, then no row is chosen and decoding
      fails.

   o  Let r be the minimum integer such that at least one row of A has
      exactly r nonzeros in V.

      *  If r != 2, then choose a row with exactly r nonzeros in V with
         minimum original degree among all such rows, except that HDPC
         rows should not be chosen until all non-HDPC rows have been
         processed.

      *  If r = 2 and there is a row with exactly 2 ones in V, then
         choose any row with exactly 2 ones in V that is part of a
         maximum size component in the graph described above that is
         defined by V.

      *  If r = 2 and there is no row with exactly 2 ones in V, then
         choose any row with exactly 2 nonzeros in V.
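
   The selection rule above can be sketched in Python as follows.
   This is an illustrative sketch with several assumptions: each row
   is given as the list of its entries within V, the HDPC exception is
   applied as a preference for non-HDPC rows among the rows with
   exactly r nonzeros (one possible reading of the rule), and the
   choice within a maximum size component is delegated to an optional
   caller-supplied function, component_pick, which is hypothetical.

```python
def choose_row(v_rows, original_degree, hdpc_rows, component_pick=None):
    """v_rows: row index -> list of octets (the row's entries in V).
    Returns the chosen row index, or None on decoding failure."""
    nonzeros = {r: sum(1 for e in row if e) for r, row in v_rows.items()}
    candidates = {r: n for r, n in nonzeros.items() if n > 0}
    if not candidates:
        return None  # all entries of V are zero: decoding fails
    r_min = min(candidates.values())
    rows = [r for r, n in candidates.items() if n == r_min]
    if r_min != 2:
        # Non-HDPC rows first, then minimum original degree.
        return min(rows, key=lambda r: (r in hdpc_rows, original_degree[r]))
    # r = 2: rows whose two nonzeros are both ones are graph edges
    # (HDPC rows are excluded from the graph).
    two_ones = [r for r in rows
                if r not in hdpc_rows and all(e in (0, 1) for e in v_rows[r])]
    if two_ones:
        return component_pick(two_ones) if component_pick else two_ones[0]
    return rows[0]  # any row with exactly 2 nonzeros
```

   When component_pick is omitted, the sketch simply takes the first
   qualifying row, which is valid but forgoes the efficiency benefit
   of choosing from a maximum size component.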

   After the row is chosen in this step, the first row of A that
   intersects V is exchanged with the chosen row so that the chosen row
   is the first row that intersects V.  The columns of A among those
   that intersect V are reordered so that one of the r nonzeros in the
   chosen row appears in the first column of V and so that the remaining
   r-1 nonzeros appear in the last columns of V.  The same row and
   column operations are also performed on the matrix X.  Then, an
   appropriate multiple of the chosen row is added to all the other rows
   of A below the chosen row that have a nonzero entry in the first
   column of V.  Specifically, if a row below the chosen row has entry
   beta in the first column of V, and the chosen row has entry alpha in
   the first column of V, then beta/alpha multiplied by the chosen row
   is added to this row to leave a zero value in the first column of V.
   Finally, i is incremented by 1 and u is incremented by r-1, which
   completes the step.

   Note that efficiency can be improved if the row operations identified
   above are not actually performed until the affected row is itself
   chosen during the decoding process.  This avoids processing of row
   operations for rows that are not eventually used in the decoding
   process, and in particular this avoids those rows for which beta != 1
   until they are actually required.  Furthermore, the row operations
   required for the HDPC rows may be performed for all such rows in one
   process, by using the algorithm described in Section 5.3.3.3.

5.4.2.3.  Second Phase

   At this point, all the entries of X outside the first i rows and
   first i columns are discarded, so that X is a lower triangular matrix
   with i rows and i columns.  The submatrix U is further partitioned
   into the first i rows, U_upper, and the remaining M - i rows,
   U_lower.  Gaussian elimination is performed in the second phase on
   U_lower either to determine that its rank is less than u (decoding
   failure) or to convert it into a matrix whose first u rows form the
   identity matrix (success of the second phase).  Call this u x u
   identity matrix I_u.  The M - L rows of A that intersect U_lower -
   I_u are discarded.  After this phase, A has L rows and L columns.
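
   The elimination on U_lower can be sketched in Python as follows.
   This sketch is simplified and makes assumptions not in the
   specification: entries are restricted to GF[2] (the real U_lower
   may contain GF[256] entries), the matrix is a list of rows of 0/1
   values, and the function name is chosen here.  A real decoder would
   also record each row operation in the decoding schedule.

```python
def second_phase_gf2(U_lower, u):
    """Gauss-Jordan elimination on U_lower over GF[2].

    Returns the first u rows after reduction (the identity I_u) on
    success, or None if the rank of U_lower is less than u."""
    rows = [list(r) for r in U_lower]
    for col in range(u):
        # Find a row at or below position col with a one in this column.
        pivot = next((k for k in range(col, len(rows)) if rows[k][col]), None)
        if pivot is None:
            return None  # rank < u: decoding failure
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Clear this column in every other row.
        for k in range(len(rows)):
            if k != col and rows[k][col]:
                rows[k] = [x ^ y for x, y in zip(rows[k], rows[col])]
    return rows[:u]
```

   On success, the rows beyond the first u have been reduced to zero
   and correspond to the M - L rows that are discarded.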

5.4.2.4.  Third Phase

   After the second phase, the only portion of A that needs to be zeroed
   out to finish converting A into the L x L identity matrix is U_upper.
   The number of rows i of the submatrix U_upper is generally much
   larger than the number of columns u of U_upper.  Moreover, at this
   time, the matrix U_upper is typically dense, i.e., the number of
   nonzero entries of this matrix is large.  To reduce this matrix to a
   sparse form, the sequence of operations performed to obtain the
   matrix U_lower needs to be inverted.  To this end, the matrix X is
   multiplied with the submatrix of A consisting of the first i rows of
   A.  After this operation, the submatrix of A consisting of the
   intersection of the first i rows and columns equals X, whereas the
   matrix U_upper is transformed to a sparse form.

5.4.2.5.  Fourth Phase

   For each of the i rows of U_upper, do the following: for each
   position j at which the row has a nonzero entry, with value b, add to
   this row b times row j of I_u.  After this step, the submatrix of A
   consisting of the intersection of the first
   i rows and columns is equal to X, the submatrix U_upper consists of
   zeros, the submatrix consisting of the intersection of the last u
   rows and the first i columns consists of zeros, and the submatrix
   consisting of the last u rows and columns is the matrix I_u.

5.4.2.6.  Fifth Phase

   For j from 1 to i, perform the following operations:

   1.  If A[j,j] is not one, then divide row j of A by A[j,j].

   2.  For l from 1 to j-1, if A[j,l] is nonzero, then add A[j,l]
       multiplied with row l of A to row j of A.

   After this phase, A is the L x L identity matrix and a complete
   decoding schedule has been successfully formed.  Then, the
   corresponding decoding consisting of summing known encoding symbols
   can be executed to recover the intermediate symbols based on the
   decoding schedule.  The tuples associated with all source symbols are
   computed according to Section 5.3.3.2.  The tuples for received
   source symbols are used in the decoding.  The tuples for missing
   source symbols are used to determine which intermediate symbols need
   to be summed to recover the missing source symbols.
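
   The two steps of the fifth phase can be sketched in Python as
   follows.  This is a minimal sketch restricted to GF[2], which is an
   assumption made here: over GF[2] every nonzero diagonal entry is
   already one, so step 1 is a no-op and only step 2 is shown.
   Indices are 0-based, A is a list of rows of 0/1 values, and the
   returned list of (l, j) pairs (meaning "row l was added to row j")
   is a hypothetical representation of the schedule entries.

```python
def fifth_phase_gf2(A, i):
    """Clear the entries below the diagonal in the first i rows of A.

    A is modified in place; returns the list of row additions as
    (source_row, target_row) pairs."""
    ops = []
    for j in range(i):
        for l in range(j):
            if A[j][l]:
                # Row l is already a unit row, so adding it to row j
                # clears A[j][l] without disturbing other columns.
                A[j] = [x ^ y for x, y in zip(A[j], A[l])]
                ops.append((l, j))
    return ops
```

   For the 3 x 3 lower triangular matrix with all ones on and below
   the diagonal, the sketch performs three row additions and leaves
   the identity matrix.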
