The BATS coding scheme described in this document can be used, for example, by a Data Delivery Protocol (DDP). Though this document is not about a DDP, in this section, we briefly illustrate how a BATS coding scheme is employed by a DDP to make the role of the coding scheme clear. Some terms that will be used in this section are summarized here:
-
DDP:
-
Data Delivery Protocol
-
DDP packet:
-
the packet formed by a DDP employing a BATS coding scheme
-
source packet:
-
the packet formed by the data for delivery
-
outer encoder:
-
the outer code encoder of a BATS code
-
recoder:
-
the inner code encoder of a BATS code
-
outer decoder:
-
the outer code decoder of a BATS code
-
coded packet:
-
the packet generated by the outer code encoder or a recoder
-
batch:
-
a set of coded packets generated by a BATS coding scheme from the same subset of the source packets
-
recoded packet:
-
a coded packet generated by a recoder
-
degree:
-
the number of source packets used to generate a batch by the outer encoder. (The degree can be different for a different batch.)
Other common terms can be found in [
RFC 8406].
We describe a DDP that involves one source node, one destination node, and multiple intermediate nodes in between. A BATS coding scheme includes an outer code encoder (also called outer encoder), an inner code encoder (also called recoder), and an outer decoder that decodes the outer code and the inner code jointly, as illustrated in
Figure 1. The functions of the outer encoder, recoder, and outer decoder are described below:
|
| {set of source packets}
v
+-+-+-+-+-+-+-+-+
| outer encoder |
| v | source node
| recoder |
+-+-+-+-+-+-+-+-+
|
| {set of DDP packets}
v
+-+-+-+-+-+-+-+-+
| |
| recoder | intermediate node
| |
+-+-+-+-+-+-+-+-+
|
| {set of DDP packets}
v
...
|
| {set of DDP packets}
v
+-+-+-+-+-+-+-+-+
| |
| outer decoder | destination node
| |
+-+-+-+-+-+-+-+-+
|
| {set of source packets}
v
At the source node, the DDP first processes the data to be delivered into a number of source packets, each of the same number of bits (see details in
Section 2.2.1), and then provides all the source packets to the outer encoder. The outer encoder will further generate a sequence of batches, each consisting of a fixed number of coded packets (see the description in
Section 2.2.2).
Each batch generated at the source node is further processed by the recoder separately. The recoder may generate a number of new coded packets using the existing coded packets of the batch (see the description in
Section 2.2.3). After it is processed by the recoder, The DDP forms and transmits the DDP packets using the coded packets, together with the corresponding batch information.
We assume that a DDP packet is either correctly received or completely erased during the communication. The DDP extracts the coded packets and the corresponding batch information from its received DDP packets. A recoder is employed at an intermediate node that does not need the data and generates recoded packets for each batch (see the description in
Section 2.2.3). The DDP forms and transmits DDP packets using the recoded packets and the corresponding batch information.
The outer decoder is employed at the destination node that needs the data. The DDP extracts the coded packets and the corresponding batch information from its received DDP packets. The outer decoder tries to recover the transmitted data using the received batches (see the description in
Section 2.2.4). The DDP sends the decoded data to the application that needs the data.
Suppose that the DDP has F octets of data for transmission. We describe the procedures of one BATS session for transmitting the F octets. There is a limit on F of a single BATS session. If the total data has more than the limit, the data needs to be transmitted using multiple BATS sessions. The limit on F of a single BATS session depends on the coding parameters that are discussed in this section and the calculations described at the end of
Section 2.4.2.
The DDP first determines the following parameters:
-
batch size (M): the number of coded packets in a batch generated by the outer encoder
-
recoding field size (q): the number of elements in the finite field for recoding, i.e., q is 2 or 28
-
BATS payload size (TO): the number of payload octets in a BATS packet, including the coded data and the coefficient vector
Based on the above parameters, the parameters T, CO, and K are calculated as follows:
-
CO: the number of octets of a coefficient vector, calculated as CO = ceil(M * log2(q) / 8), which is also called the coefficient vector overhead
-
T: the number of data octets of a coded packet, calculated as T = TO - CO
-
K: number of source packets, calculated as K = floor(F / T) + 1
The data
MUST be padded to have T*K octets, which will be partitioned into K source packets b[0], ..., b[K - 1], each of T octets. In our padding scheme, b[0], ..., b[K - 2] are filled with data octets, and b[K - 1] is filled with the remaining data octets and padding octets. Let P = K * T - F denote the number of padding octets. We use b[K - 1, 0], ..., b[K - 1, T - P - 1] to denote the T - P source octets and b[K - 1, T - P], ..., b[K - 1, T - 1] to denote the P padding octets in b[K - 1], respectively. The padding insertion process is shown in
Figure 2.
Z = T - P
j = 1
v = 1
Let bl be the last source packet b[K - 1]
for i = Z, Z + 1, ..., T - 1 do
bl[i] = j
if i + 1 >= v + Z do
j += 1
v += j
The DDP provides the BATS encoder with the following information:
-
batch size (M): the number of coded packets in a batch
-
recoding field size (q): the number of elements in the finite field for recoding
-
maximum degree (MAX_DEG): a positive integer that specifies the largest degree can be used
-
degree distribution (DD): an unsigned integer array of size MAX_DEG+1. The i-th entry DD[i] is the probability that i is chosen as the degree, where i is between 0 and MAX_DEG.
-
a sequence of batch IDs (BIDs) (j, j = 0, 1, ...)
-
number of source packets (K)
-
packet size (T): the number of octets in a source packet
-
source packets (b[i], i = 0, 1, ..., K - 1)
Using this information, the outer encoder generates M coded packets for each BID using the following steps that are described in detail in
Section 3.2:
-
Obtain a degree d by sampling DD. Roughly, the value d is chosen with probability DD[d].
-
Choose d source packets uniformly at random from all the K source packets. It is allowed that a source packet is used by multiple batches.
-
Generate M coded packets using the d source packets.
From the outer encoder, the DDP receives a sequence of batches, where the batch with ID j has M coded packets (x[j,i], i =0, 1, ..., M - 1), each containing TO octets.
The DDP will use the batches to form DDP packets to be transmitted to other network nodes towards the destination nodes. The DDP
MUST deliver each coded packet with its batch ID, which will be further used by both the recoder and decoder.
The DDP
MUST deliver some of the information used by the encoder to each of the recoders and the decoder, either embedded in the DDP packets or transmitted using separated protocol messages. For recoders, the DDP
MUST deliver the following information to each recoder:
-
M: batch size
-
q: recoding field size
For the decoder, the DDP
MUST deliver the following information to the decoder:
-
M: batch size
-
q: recoding field size
-
K: the number of source packets
-
T: the number of octets in a source packet
-
DD: the degree distribution
The BID is used by both recoders and decoders. In
Section 2.4, we illustrate how to embed BID, M, q, and K into DDP packets. The degree distribution DD does not need to be changed frequently. See Section 6 of [
Yang17] about how to design a good degree distribution. Once designed, the degree distribution can be shared between the source node and the destination node by the DDP, which is not further discussed here.
Both the source node and the intermediate nodes perform recoding on the batches before transmission. At the source node, the recoder receives the batches from the outer code encoding procedure. At an intermediate node, the DDP receives the DDP packets from the other network nodes. If the DDP chooses not to recode, it just forwards the DDP packets it received. Otherwise, the DDP should be able to extract coded packets and the corresponding batch information from these packets.
For a received batch, the DDP determines a positive integer (Mr) and the number of recoded packets to be transmitted for the batch, and DDP provides the recoder with the following information:
-
M: batch size
-
q: recoding field size
-
a number of received coded packets of the same batch, each containing TO octets
-
Mr: the number of recoded packets to be generated
The recoder uses the information provided by the DDP to generate Mr recoded packets, each containing TO octets, which are described in
Section 3.3. The DDP uses the Mr recoded packets to form the DDP packets for transmitting.
At the destination node, the DDP receives DDP packets from an intermediate network node and should be able to extract coded packets and the corresponding batch information from these packets. The DDP then employs an outer decoder to recover the data transmitted by the source node.
The DDP provides the outer decoder (to be described in
Section 3.4) with the following information:
-
M: batch size
-
q: recoding field size
-
K: the number of source packets
-
T: the number of octets of a source packet
-
a sequence of batches, each of which is formed by a number of coded packets belonging to the same batch, with their corresponding BIDs
The decoder uses this information to decode the outer code and the inner code jointly and recover the K source packets (see details in
Section 3.4). If successful, the decoder returns the recovered K source packets to the DDP, which will use the K source packets to form the F octets data. The recommended padding deletion process is shown as follows:
// this procedure returns the number P of padding octets
// at the end of b[K - 1]
Let bl be the last decoded source packet b[K - 1]
PL = bl[T - 1]
if PL == 1 do
return P = 1
WI = T - 1
while bl[WI] == PL do
WI = WI - 1
return P = (1 + bl[WI]) * bl[WI] + T - WI - 1
The recommendation for the parameters M and q is shown as follows:
-
when q = 2, M = 16, 32, 64, 128
-
when q = 256, M = 4, 8, 16, 32
It is
RECOMMENDED that K is at least 128. The encoder/decoder
SHALL support an arbitrary positive integer value less than 2
16. However, the BATS coding scheme to be described is not optimized for small K.
Here, we provide an example of embedding the aforementioned BATS coding parameters into the DDP packets that will be used for recoding and decoding. A DDP can form a DDP packet using a coded packet by adding necessary information that can help to deliver the DDP packet to the next node (e.g., the version of the DDP, addresses, and session identifiers). We will not go into the details of formatting these fields in a DDP packet, but we focus on how to format the coding parameters and the coded packet in a DDP packet.
Here, we provide an example of using 32 bits (4 octets) to embed the parameters K, M, q, and BID. The 32 bits are separated into three subfields, denoted as K, Mq, and BID, respectively, as illustrated in
Figure 4.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| K | Mq | BID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-
K:
-
16-bit unsigned integer, specifying the number of source packets of the BATS session
-
Mq:
-
3-bit unsigned integer, specifying the value of M and q, as shown in Table 1
-
BID:
-
13-bit unsigned integer, specifying the batch ID of the batch the packet belongs to
Mq |
M |
q |
000 |
16 |
2 |
010 |
32 |
2 |
100 |
64 |
2 |
110 |
128 |
2 |
001 |
4 |
256 |
011 |
8 |
256 |
101 |
16 |
256 |
111 |
32 |
256 |
Table 1: Values of the Mq Field
The choice of the coding parameters depends on the computation cost, the network conditions, and the expected end-to-end coding performance. Usually, a larger batch size M will have a better coding performance but higher computation cost for encoding, recoding, and decoding. The field size q affects the coefficient vector overhead and also the computation cost for recoding. Within a BATS session, the BID field should be different for all batches, and hence, the maximum number of batches that can be generated for the outer encoder is 2
13. For different BATS sessions, batches can use the same BID.
CO T
+-----------------------+-------------------------------+
| coefficient vector | coded data |
+-----------------------+-------------------------------+
A coded packet has TO=T+CO octets, where the first CO octets contain the coefficient vector and the remaining T octets contain the coded data.
-
coefficient vector:
-
CO = M * log2(q) / 8 octets. For the values of M and q in Table 1, CO is at most 32 octets when q is 256 and 6 octets when q is 2.
-
coded data:
-
T octets. T should be chosen so that the whole DDP packet is at most Path MTU (PMTU).
Using the above formation, we can calculate the largest file size (F) for different parameters. For example, when q = 2 and M = 128, we have CO = 6 octets. Counting the 4 octets for embedding the coding parameters, we can choose T = PMTU - H - 10, where H is the header length of a DDP packet. As K can be at most 2
16 - 1, F can be at most (PMTU - H - 10)(2
16 - 1) octets. In this case, K / M is about 2
9 and the BID field allows transmitting 2
4 * K / M batches.