
RFC 7574

Peer-to-Peer Streaming Peer Protocol (PPSPP)

Pages: 85
Proposed Standard
Part 2 of 4 – Pages 21 to 44

Top   ToC   RFC7574 - Page 21   prevText

4. Chunk Addressing Schemes

PPSPP can use different methods of chunk addressing, that is, support different ways of identifying chunks and different ways of expressing the chunk availability map of a peer in a compact fashion. All peers in a swarm MUST use the same chunk addressing method.

4.1. Start-End Ranges

A chunk specification consists of a single (start specification,end specification) pair that identifies a range of chunks (end inclusive). The start and end specifications can use one of multiple addressing schemes. Two schemes are currently defined: chunk ranges and byte ranges.

4.1.1. Chunk Ranges

The start and end specification are both chunk identifiers. Chunk identifiers are 32-bit or 64-bit unsigned integers. A PPSPP peer MUST support this scheme.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                  Start chunk (32 or 64)                       ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                    End chunk (32 or 64)                       ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4.1.2. Byte Ranges

The start and end specification are 64-bit byte offsets in the content. The support for this scheme is OPTIONAL.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Start byte offset (64)                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    End byte offset (64)                       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
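
As an illustration only, the two start-end layouts above could be serialized as in the following Python sketch. The function names are hypothetical, and the use of network byte order (big-endian) is an assumption of this example that this section does not itself state.

```python
import struct


def pack_chunk_range(start, end, bits=32):
    """Serialize a (start chunk, end chunk) pair; the end is inclusive."""
    if bits not in (32, 64):
        raise ValueError("chunk identifiers are 32-bit or 64-bit")
    if not 0 <= start <= end < (1 << bits):
        raise ValueError("invalid chunk range")
    fmt = ">II" if bits == 32 else ">QQ"  # assumption: big-endian on the wire
    return struct.pack(fmt, start, end)


def unpack_chunk_range(data, bits=32):
    """Inverse of pack_chunk_range; returns (start, end)."""
    fmt = ">II" if bits == 32 else ">QQ"
    return struct.unpack(fmt, data)


def pack_byte_range(start, end):
    """Serialize a (start byte offset, end byte offset) pair of 64-bit values."""
    if not 0 <= start <= end < (1 << 64):
        raise ValueError("invalid byte range")
    return struct.pack(">QQ", start, end)
```

A 32-bit chunk range occupies 8 bytes, while a 64-bit chunk range or a byte range occupies 16 bytes.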

4.2. Bin Numbers

PPSPP introduces a novel method of addressing chunks of content called "bin numbers" (or "bins" for short). Bin numbers allow the addressing of a binary interval of data using a single integer. This reduces the amount of state that needs to be recorded per peer and the space needed to denote intervals on the wire, making the protocol lightweight. In general, this numbering system allows PPSPP to work with simpler data structures, e.g., to use arrays instead of binary trees, thus reducing complexity. The support for this scheme is OPTIONAL.

In bin addressing, the smallest binary interval is a single chunk (e.g., a block of bytes that may be of variable size), the largest interval is a complete range of 2**63 chunks. In a novel addition to the classical scheme, these intervals are numbered in a way that lays them out into a vector nicely, which is called bin numbering, as follows. Consider a chunk interval of width W. To derive the bin numbers of the complete interval and the subintervals, a minimal balanced binary tree is built that is at least W chunks wide at the base. The leaves from left-to-right correspond to the chunks 0..W-1 in the interval, and have bin number I*2 where I is the index of the chunk (counting beyond W-1 to balance the tree). The bin number of higher-level node P in the tree is calculated as follows:

   binP = (binL + binR) / 2

where binL is the bin of node P's left-hand child and binR is the bin of node P's right-hand child. Given that each node in the tree represents a subinterval of the original interval, each such subinterval now is addressable by a bin number, a single integer. The bin number tree of an interval of width W=8 looks like this:
                                   7
                                  / \
                                /     \
                              /         \
                            /             \
                           3                11
                          / \              / \
                         /   \            /   \
                        /     \          /     \
                       1       5        9       13
                      / \     / \      / \      / \
                     0   2   4   6    8   10  12   14

                     C0  C1  C2  C3   C4  C5  C6   C7

              The bin number tree of an interval of width W=8

                                 Figure 2

   So bin 7 represents the complete interval, bin 3 represents the
   interval of chunks C0..C3, bin 1 represents the interval of chunks C0
   and C1, and bin 2 represents chunk C1.  The special numbers
   0xFFFFFFFF (32-bit) or 0xFFFFFFFFFFFFFFFF (64-bit) stand for an
   empty interval, and 0x7FFF...FFF stands for "everything".

   When bin numbering is used, the ID of a chunk is its corresponding
   (leaf) bin number in the tree, and the chunk specification in HAVE
   and ACK messages is equal to a single bin number (32-bit or 64-bit),
   as follows.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                    Bin number (32 or 64)                      ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
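
The bin arithmetic described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the helper names are hypothetical, chunk indices are the I of the rule "bin number I*2", and the special "empty" and "everything" values are not handled.

```python
def leaf_bin(chunk_index):
    """Bin number of the leaf for chunk index I, i.e. I*2."""
    return 2 * chunk_index


def layer(b):
    """Height of bin b in the tree: its number of trailing 1 bits."""
    height = 0
    while (b >> height) & 1:
        height += 1
    return height


def chunk_range(b):
    """(first, last) chunk index covered by bin b, both inclusive."""
    half = (1 << layer(b)) - 1  # distance from b to its outermost leaves
    return ((b - half) // 2, (b + half) // 2)


def sibling(b):
    """Bin that shares a parent with b."""
    return b ^ (1 << (layer(b) + 1))


def parent(b):
    """binP = (binL + binR) / 2, applied to b and its sibling."""
    return (b + sibling(b)) // 2


def children(b):
    """(left, right) child bins of a non-leaf bin."""
    k = layer(b)
    if k == 0:
        raise ValueError("a leaf bin has no children")
    step = 1 << (k - 1)
    return (b - step, b + step)
```

For the W=8 tree of Figure 2, `chunk_range(3)` covers chunk indices 0 to 3 and `parent(9)` is bin 11.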

4.3. In Messages

4.3.1. In HAVE Messages

When a receiving peer has successfully checked the integrity of a chunk or interval of chunks, it MUST send a HAVE message to all peers it wants to allow download of those chunk(s) from. The ability to withhold HAVE messages allows them to be used as a method of choking. The HAVE message MUST contain the chunk specification of the biggest complete interval of all chunks the receiver has received and checked so far that fully includes the interval of chunks just received. So
   the chunk specification MUST denote at least the interval received,
   but the receiver is supposed to aggregate and acknowledge bigger
   intervals, when possible.

   As a result, every single chunk is acknowledged a logarithmic number
   of times.  That provides some necessary redundancy of
   acknowledgements and sufficiently compensates for unreliable
   transport protocols.

   Implementation note:

       To record which chunks a peer has in the state that an
       implementation keeps for each peer, an implementation MAY use the
       efficient "binmap" data structure, which is a hybrid of a bitmap
       and a binary tree, discussed in detail in [BINMAP].
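
The aggregation rule for HAVE messages can be illustrated with a small Python sketch. It assumes bin numbering is in use and models the set of verified chunks as a plain set of chunk indices, not as the binmap structure mentioned above; all names are hypothetical.

```python
def layer(b):
    """Height of bin b: its number of trailing 1 bits."""
    height = 0
    while (b >> height) & 1:
        height += 1
    return height


def sibling(b):
    return b ^ (1 << (layer(b) + 1))


def is_complete(b, have):
    """True if every chunk index covered by bin b is in `have`."""
    half = (1 << layer(b)) - 1
    first, last = (b - half) // 2, (b + half) // 2
    return all(i in have for i in range(first, last + 1))


def biggest_complete_bin(have, chunk_index):
    """Largest fully verified bin that includes the chunk just received."""
    if chunk_index not in have:
        raise ValueError("chunk has not been received and checked")
    b = 2 * chunk_index
    # Climb while the sibling interval is complete too, so that the
    # parent interval is complete as well.
    while is_complete(sibling(b), have):
        b = (b + sibling(b)) // 2
    return b


have = set()
announced = []
for index in range(4):  # chunks C0..C3 arrive in order
    have.add(index)
    announced.append(biggest_complete_bin(have, index))
```

After this loop, `announced` holds bins 0, 1, 4, and 3, so chunk C0 has been covered by three announcements (bins 0, 1, and 3), which matches the logarithmic redundancy described above.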

4.3.2. In ACK Messages

PPSPP peers MUST use ACK messages to acknowledge received chunks if an unreliable transport protocol is used. When a receiving peer has successfully checked the integrity of a chunk or interval of chunks C, it MUST send an ACK message containing the chunk specification of its biggest, complete interval covering C to the sending peer (see HAVE).

5. Content Integrity Protection

PPSPP can use different methods for protecting the integrity of the content while it is being distributed via the peer-to-peer network. More specifically, PPSPP can use different methods for receiving peers to detect whether a requested chunk has been maliciously modified by the sending peer. In benign environments, content integrity protection can be disabled. For static content, PPSPP currently defines one method for protecting integrity, called the Merkle Hash Tree scheme. If PPSPP operates over the Internet, this scheme MUST be used. If PPSPP operates in a benign environment, this scheme MAY be used. So the scheme is mandatory to implement, to satisfy the requirement of strong security for an IETF protocol [RFC3365]. An extended version of the scheme is used to efficiently protect dynamically generated content (live streams), as explained below and in Section 6.1. The Merkle Hash Tree scheme can work with different chunk addressing schemes. All it requires is the ability to address a range of chunks. In the following description abstract node IDs are used to identify nodes in the tree. On the wire, these are translated to the corresponding range of chunks in the chosen chunk addressing scheme.

5.1. Merkle Hash Tree Scheme

PPSPP uses a method of naming content based on self-certification. In particular, content in PPSPP is identified by a single cryptographic hash that is the root hash in a Merkle hash tree calculated recursively from the content [ABMRKL]. This self-certifying hash tree allows every peer to directly detect when a malicious peer tries to distribute fake content. It also ensures only a small amount of information is needed to start a download (the root hash and some peer addresses). For live streaming, a dynamic tree and a public key are used, see below.

The Merkle hash tree of a content file that is divided into N chunks is constructed as follows. Note the construction does not assume chunks of content to be of a fixed size. Given a cryptographic hash function, more specifically an MDC [HAC01], such as SHA-256, the hashes of all the chunks of the content are calculated. Next, a binary tree of sufficient height is created. Sufficient height means that the lowest level in the tree has enough nodes to hold all chunk hashes in the set, as with bin numbering. The figure below shows the tree for a content file consisting of 7 chunks. As with the content addressing scheme, each leaf of the tree corresponds to a chunk and, in this case, is assigned the hash of that chunk, starting at the leftmost leaf. As the base of the tree may be wider than the number of chunks, any remaining leaves in the tree are assigned an empty hash value of all zeros. Finally, the hash values of the higher levels in the tree are calculated, by concatenating the hash values of the two children (again left to right) and computing the hash of that aggregate. If the two children are empty hashes, the parent is an empty all-zeros hash as well (to save computation). This process ends in a hash value for the root node, which is called the "root hash". Note the root hash only depends on the content and any modification of the content will result in a different root hash.
                               7 = root hash
                              / \
                            /     \
                          /         \
                        /             \
                      3*               11
                     / \              / \
                    /   \            /   \
                   /     \          /     \
                  1       5        9       13* = uncle hash
                 / \     / \      / \      / \
                0   2   4   6    8   10* 12   14

                C0  C1  C2  C3   C4  C5  C6   E
                =chunk index     ^^           = empty hash

            Merkle hash tree of a content file with N=7 chunks

                                 Figure 3
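
The construction of Section 5.1 can be sketched in Python as follows. The sketch assumes SHA-256 as the hash function (the text names it only as an example) and a 32-byte all-zeros value as the empty hash; the function names are hypothetical.

```python
import hashlib

HASH_SIZE = 32             # SHA-256 digest length in bytes
EMPTY = bytes(HASH_SIZE)   # empty hash value of all zeros


def combine(left, right):
    """Hash of a parent node computed from its two children."""
    if left == EMPTY and right == EMPTY:
        return EMPTY  # the parent of two empty hashes is empty as well
    return hashlib.sha256(left + right).digest()


def root_hash(chunks):
    """Root hash of the Merkle hash tree over a list of chunk byte strings."""
    level = [hashlib.sha256(chunk).digest() for chunk in chunks]
    width = 1
    while width < len(level):  # smallest power of two that holds all leaves
        width *= 2
    level += [EMPTY] * (width - len(level))  # pad the base with empty hashes
    while len(level) > 1:
        level = [combine(level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

For the 7-chunk file of Figure 3, the base is padded to 8 leaves, and changing any chunk changes the value returned by `root_hash`.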

5.2. Content Integrity Verification

Assuming a peer receives the root hash of the content it wants to download from a trusted source, it can check the integrity of any chunk of that content it receives as follows. It first calculates the hash of the chunk it received, for example, chunk C4 in the previous figure. Along with this chunk, it MUST receive the hashes required to check the integrity of that chunk. In principle, these are the hash of the chunk's sibling (C5) and that of its "uncles". A chunk's uncles are the sibling Y of its parent X, and the uncle of that Y, recursively until the root is reached. For chunk C4, the uncles are nodes 13 and 3 and the sibling is 10; all marked with a * in the figure. Using this information, the peer recalculates the root hash of the tree and compares it to the root hash it received from the trusted source. If they match, the chunk of content has been positively verified to be the requested part of the content. Otherwise, the sending peer sent either the wrong content or the wrong sibling or uncle hashes. For simplicity, the set of sibling and uncle hashes is collectively referred to as the "uncle hashes". In the case of live streaming, the tree of chunks grows dynamically and the root hash is undefined or, more precisely, transient, as long as new data is generated by the live source. Section 6.1.2 defines a method for content integrity verification for live streams that works with such a dynamic tree. Although the tree is dynamic, content verification works the same for both live and predefined content, resulting in a unified method for both types of streaming.
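
As a minimal sketch of the check described above, the following Python function recomputes the root hash from a received chunk and its sibling and uncle hashes. It assumes SHA-256, an all-zeros empty hash, and bin numbers as node IDs; `verify_chunk` and its parameters are hypothetical names.

```python
import hashlib

EMPTY = bytes(32)  # empty hash value of all zeros


def combine(left, right):
    if left == EMPTY and right == EMPTY:
        return EMPTY
    return hashlib.sha256(left + right).digest()


def sibling(b):
    height = 0
    while (b >> height) & 1:  # count trailing 1 bits
        height += 1
    return b ^ (1 << (height + 1))


def verify_chunk(chunk_index, data, hashes, root_bin, root):
    """Check `data` against the trusted `root` hash.

    `hashes` maps node IDs (bin numbers) to the sibling and uncle
    hashes that accompanied the chunk.
    """
    b = 2 * chunk_index
    h = hashlib.sha256(data).digest()
    while b != root_bin:
        s = sibling(b)
        if s not in hashes:
            return False  # a required hash is missing
        # Concatenate left to right, whichever side the sibling is on.
        h = combine(h, hashes[s]) if b < s else combine(hashes[s], h)
        b = (b + s) // 2
    return h == root
```

For chunk C4 in Figure 3 (leaf node 8), `hashes` needs entries for nodes 10, 13, and 3, and `root_bin` is 7.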

5.3. The Atomic Datagram Principle

As explained above, a datagram consists of a sequence of messages. Ideally, every datagram sent must be independent of other datagrams: each datagram SHOULD be processed separately, and a loss of one datagram must not disrupt the flow of datagrams between two peers. Thus, as a datagram carries zero or more messages, both messages and message interdependencies SHOULD NOT span over multiple datagrams.

This principle implies that as any chunk is verified using its uncle hashes, the necessary hashes SHOULD be put into the same datagram as the chunk's data. If this is not possible because of a limitation on datagram size, the necessary hashes MUST be sent first in one or more datagrams. As a general rule, if some additional data is still missing to process a message within a datagram, the message SHOULD be dropped.

The hashes necessary to verify a chunk are, in principle, its sibling's hash and all its uncle hashes, but the set of hashes to send can be optimized. Before sending a packet of data to the receiver, the sender inspects the receiver's previous acknowledgements (HAVE or ACK) to derive which hashes the receiver already has for sure. Suppose the receiver had acknowledged chunks C0 and C1 (the first two chunks of the file), then it must already have uncle hashes 5, 11, and so on. That is because those hashes are necessary to check C0 and C1 against the root hash. Then, hashes 3, 7, and so on must also be known as they are calculated in the process of checking the uncle hash chain. Hence, to send chunk C7 (leaf node 14), the sender needs to include just the hashes for nodes 12 and 9, which let the data be checked against hash 11, which is already known to the receiver.

The sender MAY optimistically skip hashes that were sent out in previous, still-unacknowledged datagrams. It is an optimization trade-off between redundant hash transmission and the possibility of collateral data loss in the case in which some necessary hashes were lost in the network so some delivered data cannot be verified and thus has to be dropped. In either case, the receiver builds the Merkle hash tree on-demand, incrementally, starting from the root hash, and uses it for data validation.

In short, the sender MUST put into the datagram the hashes it believes are necessary for the receiver to verify the chunk. The receiver MUST remember all the hashes it needs to verify missing chunks that it still wants to download. Note that the latter implies that a hardware-limited receiver MAY forget some hashes if it does not plan to announce possession of these chunks to others (i.e., does not plan to send HAVE messages).

5.4. INTEGRITY Messages

Concretely, a peer that wants to send a chunk of content creates a datagram that MUST consist of a list of INTEGRITY messages followed by a DATA message. If the INTEGRITY messages and DATA message cannot be put into a single datagram because of a limitation on datagram size, the INTEGRITY messages MUST be sent first in one or more datagrams. The list of INTEGRITY messages sent MUST contain an INTEGRITY message for each hash the receiver misses for integrity checking. An INTEGRITY message for a hash MUST contain the chunk specification corresponding to the node ID of the hash and the hash data itself. The chunk specification corresponding to a node ID is defined as the range of chunks formed by the leaves of the subtree rooted at the node. For example, node 3 in Figure 3 denotes chunks 0, 2, 4, and 6, so the chunk specification should denote that interval. The list of INTEGRITY messages MUST be sorted in order of the tree height of the nodes, descending (the leaves are at height 0). The DATA message MUST contain the chunk specification of the chunk and the chunk itself. A peer MAY send the required messages for multiple chunks in the same datagram, depending on the encapsulation.

5.5. Discussion and Overhead

The current method for protecting content integrity in BitTorrent [BITTORRENT] is not suited for streaming. It involves providing clients with the hashes of the content's chunks before the download commences by means of metadata files (called .torrent files in BitTorrent). However, when chunks are small, as in the current UDP encapsulation of PPSPP, this implies having to download a large number of hashes before content download can begin. This, in turn, increases time-till-playback for end users, making this method unsuited for streaming.

The overhead of using Merkle hash trees is limited. The size of the hash tree expressed as the total number of nodes depends on the number of chunks the content is divided into (and hence the size of chunks) following this formula:

   nnodes = math.pow(2, math.log(nchunks, 2) + 1)

In principle, the hash values of all these nodes will have to be sent to a peer once for it to verify all of the chunks. Hence, the maximum on-the-wire overhead is hashsize * nnodes. However, the actual number of hashes transmitted can be optimized as described in Section 5.3.
   To see that a peer can verify all chunks without receiving all
   hashes, consider the example tree in Section 5.1.  In the case of a
   simple progressive download of chunks 0, 2, 4, 6, etc., the sending
   peer will send the following hashes:

          +-------+---------------------------------------------+
          | Chunk | Node IDs of hashes sent                     |
          +-------+---------------------------------------------+
          |   0   | 2,5,11                                      |
          |   2   | - (receiver already knows all)              |
          |   4   | 6                                           |
          |   6   | -                                           |
          |   8   | 10,13 (hash 3 can be calculated from 0,2,5) |
          |   10  | -                                           |
          |   12  | 14                                          |
          |   14  | -                                           |
          | Total | # hashes        7                           |
          +-------+---------------------------------------------+

                  Table 1: Overhead for the Example Tree

   So the number of hashes sent in total (7) is less than the total
   number of hashes in the tree (16), as a peer does not need to send
   hashes that are calculated and verified as part of earlier chunks.
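
The rows of Table 1 can be reproduced with a short Python sketch. It models the receiver's state as a set of node IDs whose hashes are already known and trusted, starting from the root (node 7), and assumes that every chunk sent is received and verified. The function `hashes_to_send` is a hypothetical helper, not part of the protocol.

```python
def layer(b):
    """Height of bin b: its number of trailing 1 bits."""
    height = 0
    while (b >> height) & 1:
        height += 1
    return height


def sibling(b):
    return b ^ (1 << (layer(b) + 1))


def hashes_to_send(leaf, known, root_bin):
    """Node IDs whose hashes must accompany the chunk at `leaf`.

    `known` is updated as if the receiver verified the chunk.
    """
    half = (1 << layer(root_bin)) - 1
    if root_bin not in known:
        raise ValueError("the trusted root hash must be known")
    if not root_bin - half <= leaf <= root_bin + half:
        raise ValueError("leaf lies outside the tree")
    sent, path = [], []
    b = leaf
    while b not in known:  # climb until a hash the receiver can compare to
        s = sibling(b)
        path.append(b)
        if s not in known:
            sent.append(s)
        b = (b + s) // 2
    known.update(path)  # hashes the receiver computes along the way
    known.update(sent)  # hashes the receiver was sent
    return sent


known = {7}
table = {leaf: hashes_to_send(leaf, known, 7) for leaf in range(0, 16, 2)}
total = sum(len(nodes) for nodes in table.values())
```

Under these assumptions, `table` matches the rows of Table 1 and `total` is 7.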

5.6. Automatic Detection of Content Size

In PPSPP, the size of a static content file, such as a video file, can be reliably and automatically derived from information received from the network when fixed-size chunks are used. As a result, it is not necessary to include the size of the content file as the metadata of the content for such files. Implementations of PPSPP MAY use this automatic detection feature. Note this feature is the only feature of PPSPP that requires that a fixed-size chunk is used. This feature builds on the Merkle hash tree and the trusted root hash as swarm ID as follows.

5.6.1. Peak Hashes

The ability for a newcomer peer to detect the size of the content depends heavily on the concept of peak hashes. The concept of peak hashes depends on the concepts of filled and incomplete nodes. Recall that when constructing the binary trees for content verification and addressing the base of the tree may have more leaves than the number of chunks in the content. In the Merkle hash tree, these leaves were assigned empty all-zero hashes to be able to calculate the higher-level hashes. A filled node is now defined as a node that corresponds to an interval of leaves that consists only of
   hashes of content chunks, not empty hashes.  Conversely, an incomplete
   (not filled) node corresponds to an interval that also contains empty
   hashes, typically, an interval that extends past the end of the file.
   In the following figure, nodes 7, 11, 13, and 14 are incomplete; the
   rest are filled.

   Formally, a peak hash is the hash of a filled node in the Merkle hash
   tree, whose sibling is an incomplete node.  Practically, suppose a
   file is 7162 bytes long and a chunk is 1 kilobyte.  That file fits
   into 7 chunks, the tail chunk being 1018 bytes long.  The Merkle hash
   tree for that file is shown in Figure 4.  Following the definition,
   the peak hashes of this file are in nodes 3, 9, and 12, denoted with
   an *. E denotes an empty hash.

                                  7
                                 / \
                               /     \
                             /         \
                           /             \
                         3*               11
                        / \              / \
                       /   \            /   \
                      /     \          /     \
                     1       5        9*      13
                    / \     / \      / \      / \
                   0   2   4   6    8   10  12*  14

                   C0  C1  C2  C3   C4  C5  C6   E
                                            = 1018 bytes

                     Peak hashes in a Merkle hash tree

                                 Figure 4

   Peak hashes can be explained by the binary representation of the
   number of chunks the file occupies.  The binary representation for 7
   is 111.  Every "1" in the binary representation of the file's length
   in chunks corresponds to a peak hash.  For this particular file,
   there are indeed three peaks: nodes 3, 9, and 12.  Therefore, the
   number of peak hashes for a file is, at most, logarithmic in its
   size.
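
The correspondence between the binary representation and the peaks can be made concrete with a small Python sketch. The function names are hypothetical; node IDs are bin numbers as in Figure 4.

```python
def peak_bins(nchunks):
    """Node IDs of the peak hashes for a file of `nchunks` chunks."""
    peaks = []
    offset = 0  # index of the first chunk not yet covered by a peak
    for k in range(nchunks.bit_length() - 1, -1, -1):
        width = 1 << k
        if nchunks & width:  # each 1 bit contributes one peak
            peaks.append(2 * offset + width - 1)
            offset += width
    return peaks


def chunks_from_peaks(peaks):
    """Number of chunks implied by a list of peak node IDs."""
    total = 0
    for b in peaks:
        height = 0
        while (b >> height) & 1:  # trailing 1 bits give the node height
            height += 1
        total += 1 << height
    return total
```

For the 7-chunk example, `peak_bins(7)` yields nodes 3, 9, and 12, and `chunks_from_peaks` maps those three nodes back to 7 chunks.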

   A peer knowing which nodes contain the peak hashes for the file can
   therefore calculate the number of chunks it consists of; thus, it
   gets an estimate of the file size (given all chunks but the last are
   of a fixed size).  Which nodes are the peaks can be securely
   communicated from one (untrusted) peer, Peer A, to another peer, Peer
   B, by letting Peer A send the peak hashes and their node IDs to Peer
   B.  It can be shown that the root hash that Peer B obtained from a
   trusted source is sufficient to verify that these are indeed the
   right peak hashes, as follows.

   Lemma: Peak hashes can be checked against the root hash.

   Proof: (a) Any peak hash is always the left sibling.  Otherwise, if
   it is the right sibling, its left neighbor/sibling must also be a
   filled node, because of the way chunks are laid out in the leaves,
   which contradicts the definition of a peak hash. (b) For the
   rightmost peak hash, its right sibling is zero. (c) For any peak
   hash, the right sibling can be calculated using the peak hashes to
   its right and zeros for empty nodes. (d) Once the right sibling of
   the leftmost peak hash is calculated, its parent can be calculated.
   (e) Once that parent is calculated, we can trivially get to the root
   hash by concatenating the hash with zeros and hashing it repeatedly.

   Informally, the Lemma can be expressed as follows: peak hashes
   cover all data, so the remaining hashes are either trivial (zeros) or
   can be calculated from peak hashes and zero hashes.
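
The Lemma can be illustrated with a Python sketch that folds a list of peak hashes into a candidate root hash. It assumes SHA-256, an all-zeros empty hash, and a tree of minimal height as constructed in Section 5.1, so that no further zero-padding above the leftmost peak's parent is needed. `root_from_peaks` is a hypothetical name.

```python
import hashlib

EMPTY = bytes(32)  # empty hash value of all zeros


def combine(left, right):
    if left == EMPTY and right == EMPTY:
        return EMPTY
    return hashlib.sha256(left + right).digest()


def layer(b):
    height = 0
    while (b >> height) & 1:
        height += 1
    return height


def root_from_peaks(peaks):
    """Fold (node ID, hash) peaks, ordered left to right, into a root.

    Returns (number of chunks, candidate root hash), or None when the
    node IDs do not form a valid peak layout.
    """
    if not peaks:
        return None
    offset, previous_width = 0, None
    for b, _ in peaks:
        width = 1 << layer(b)
        if previous_width is not None and width >= previous_width:
            return None  # peak widths must strictly decrease
        if b != 2 * offset + width - 1:
            return None  # peaks must tile the chunks starting at chunk 0
        offset, previous_width = offset + width, width
    remaining = list(peaks)
    b, h = remaining.pop()  # start at the rightmost peak
    while remaining:
        s = b ^ (1 << (layer(b) + 1))
        if s > b:
            h = combine(h, EMPTY)  # right sibling lies past the end
        else:
            left_bin, left_hash = remaining.pop()
            if left_bin != s:
                return None
            h = combine(left_hash, h)  # left sibling is the next peak
        b = (b + s) // 2
    return offset, h
```

Peer B would accept the peaks only if the returned hash equals the root hash obtained from the trusted source; the returned chunk count is then the approximate content size in chunks.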

   Finally, once Peer B has obtained the number of chunks in the
   content, it can determine the exact file size as follows.  Given that
   all chunks except the last are of a fixed size, Peer B just needs to
   know the size of the last chunk.  Knowing the number of chunks, Peer
   B can calculate the node ID of the last chunk and download it.  As
   always, Peer B verifies the integrity of this chunk against the
   trusted root hash.  As there is only one chunk of data that leads to
   a successful verification, the size of this chunk must be correct.
   Peer B can then determine the exact file size as:

       (number of chunks -1) * fixed chunk size + size of last chunk

5.6.2. Procedure

A PPSPP implementation that wants to use automatic size detection MUST operate as follows. When Peer A sends a DATA message for the first time to Peer B, Peer A MUST first send all the peak hashes for the content, in INTEGRITY messages, unless Peer B has already signaled that it knows the peak hashes by having acknowledged any chunk. If they are needed, the peak hashes MUST be sent as an extra list of uncle hashes for the chunk, before the list of actual uncle hashes of the chunk as described in Section 5.3. The receiver, Peer B, MUST check the peak hashes against the root hash to determine the approximate content size. To obtain the definite content size, Peer B MUST download the last chunk of the content from any peer that offers it.
   As an example, let's consider a 7162-byte file, which fits in 7
   chunks of 1 kilobyte, distributed by Peer A.  Figure 4 shows the
   relevant Merkle hash tree.  Peer B, which only knows the root hash of
   the file after successfully connecting to Peer A, requests the first
   chunk of data, C0 in Figure 4.  Peer A replies to Peer B by including
   in the datagram the following messages in this specific order: first,
   the three peak hashes of this particular file, the hashes of nodes 3,
   9, and 12; second, the uncle hashes of C0, followed by the DATA
   message containing the actual content of C0.  Upon receiving the peak
   hashes, Peer B checks them against the root hash determining that the
   file is 7 chunks long.  To establish the exact size of the file, Peer
   B needs to request and retrieve the last chunk containing data, C6 in
   Figure 4.  Once the last chunk has been retrieved and verified, Peer
   B concludes that it is 1018 bytes long, hence determining that the
   file is exactly 7162 bytes long.

6. Live Streaming

The set of messages defined above can be used for live streaming as well. In a pull-based model, a live streaming injector can announce the chunks it generates via HAVE messages, and peers can retrieve them via REQUEST messages. Areas that need special attention are content authentication and chunk addressing (to achieve an infinite stream of chunks).

6.1. Content Authentication

For live streaming, PPSPP supports two methods for a peer to authenticate the content it receives from another peer, called "Sign All" and "Unified Merkle Tree".

In the "Sign All" method, the live injector signs each chunk of content using a private key. Upon receiving the chunk, peers check the signature using the corresponding public key obtained from a trusted source. Support for this method is OPTIONAL.

In the "Unified Merkle Tree" method, PPSPP combines the Merkle Hash Tree scheme for static content with signatures to unify the video-on-demand and live streaming scenarios. The use of Merkle hash trees reduces the number of signing and verification operations, hence providing a similar signature amortization to the approach described in [SIGMCAST]. If PPSPP operates over the Internet, the "Unified Merkle Tree" method MUST be used. If the protocol operates in a benign environment, the "Unified Merkle Tree" method MAY be used. So this method is mandatory to implement.

In both methods, the swarm ID consists of a public key encoded as in a DNSSEC DNSKEY resource record without Base64 encoding [RFC4034].
   In particular, the swarm ID consists of a 1-byte Algorithm field that
   identifies the public key's cryptographic algorithm and determines
   the format of the Public Key field that follows.  The value of this
   Algorithm field is one of the values in the "Domain Name System
   Security (DNSSEC) Algorithm Numbers" registry [IANADNSSECALGNUM].
   The RSASHA1 [RFC4034], RSASHA256 [RFC5702], ECDSAP256SHA256 and
   ECDSAP384SHA384 [RFC6605] algorithms are mandatory to implement.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Algo Number(8)|                                               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                DNSSEC Public Key (variable)                   ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
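
A live swarm ID laid out as above can be split into its two fields with a few lines of Python. This is a sketch only: the function name is hypothetical, the public key is returned as opaque bytes without decoding, and the table lists the DNSSEC algorithm numbers of the four mandatory-to-implement algorithms as assigned in the IANA registry.

```python
# DNSSEC algorithm numbers of the mandatory-to-implement algorithms.
MANDATORY_ALGORITHMS = {
    5: "RSASHA1",
    8: "RSASHA256",
    13: "ECDSAP256SHA256",
    14: "ECDSAP384SHA384",
}


def parse_live_swarm_id(swarm_id):
    """Split a live swarm ID into (algorithm number, public key bytes)."""
    if len(swarm_id) < 2:
        raise ValueError("swarm ID needs an algorithm byte and a public key")
    algorithm = swarm_id[0]  # 1-byte Algorithm field
    public_key = swarm_id[1:]  # format depends on the algorithm
    return algorithm, public_key
```

A caller could look up the returned algorithm number in `MANDATORY_ALGORITHMS` to decide how to interpret the key bytes.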

6.1.1. Sign All

In the "Sign All" method, the live injector signs each chunk of content using a private key and peers, upon receiving the chunk, check the signature using the corresponding public key obtained from a trusted source. In particular, in PPSPP, the swarm ID of the live stream is that public key.

A peer that wants to send a chunk of content creates a datagram that MUST contain a SIGNED_INTEGRITY message with the chunk's signature, followed by a DATA message with the actual chunk. If the SIGNED_INTEGRITY message and DATA message cannot be contained in a single datagram, because of a limitation on datagram size, the SIGNED_INTEGRITY message MUST be sent first in a separate datagram. The SIGNED_INTEGRITY message consists of the chunk specification, the timestamp, and the digital signature.

The digital signature algorithm that is used is determined by the Live Signature Algorithm protocol option, see Section 7.7. The signature is computed over a concatenation of the on-the-wire representation of the chunk specification, a 64-bit timestamp in NTP Timestamp format [RFC5905], and the chunk, in that order. The timestamp is the time at which the signature was made at the injector, in UTC.
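
The byte string that is signed in the "Sign All" method can be assembled as in the following Python sketch. It only builds the input to the signature and does not perform any signing. The function names are hypothetical, and the conversion assumes a non-negative Unix time and the 64-bit NTP timestamp layout of 32 bits of seconds since 1900 followed by 32 bits of fraction.

```python
import struct

NTP_UNIX_OFFSET = 2208988800  # seconds from 1900-01-01 to 1970-01-01


def ntp_timestamp(unix_time):
    """Encode a non-negative Unix time as a 64-bit NTP timestamp."""
    whole = int(unix_time)
    seconds = (whole + NTP_UNIX_OFFSET) & 0xFFFFFFFF
    fraction = int((unix_time - whole) * (1 << 32)) & 0xFFFFFFFF
    return struct.pack(">II", seconds, fraction)


def sign_all_input(chunk_spec, unix_time, chunk):
    """Concatenate chunk specification, timestamp, and chunk, in that order.

    `chunk_spec` is the on-the-wire representation of the chunk
    specification, passed in as bytes.
    """
    return chunk_spec + ntp_timestamp(unix_time) + chunk
```

The result would then be handed to whichever signature algorithm the swarm has selected.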

6.1.2. Unified Merkle Tree

In this method, the chunks of content are used as the basis for a Merkle hash tree as for static content. However, because chunks are continuously generated, this tree is not static, but dynamic. As a result, the tree does not have a root hash, or, more precisely, it has a transient root hash. Therefore, a public key serves as swarm
   ID of the content.  It is used to digitally sign updates to the tree
   allowing peers to expand it based on trusted information using the
   following process.

6.1.2.1. Signed Munro Hashes

The live injector generates a number of chunks, denoted NCHUNKS_PER_SIG, corresponding to a fixed power of 2 (NCHUNKS_PER_SIG>=2), which are added as new leaves to the existing hash tree. As a result of this expansion, the hash tree contains a new subtree that is NCHUNKS_PER_SIG chunks wide at the base. The root of this new subtree is referred to as the munro of that subtree, and its hash as the munro hash of the subtree, illustrated in Figure 5. In this figure, node 5 is the new munro, labeled with a $ sign.

                         3
                        / \
                       /   \
                      /     \
                     1       5$
                    / \     / \
                   0   2   4   6

   Expanded live tree.  With NCHUNKS_PER_SIG=2, node 5 is the munro for
   the new subtree spanning 4 and 6.  Node 1 is the munro for the
   subtree spanning chunks 0 and 2, created in the previous iteration.

                                 Figure 5

Informally, the process now proceeds as follows. The injector signs only the munro hash of the new subtree using its private key. Next, the injector announces the existence of the new subtree to its peers using HAVE messages. When a peer, in response to the HAVE messages, requests a chunk from the new subtree, the injector first sends the signed munro hash corresponding to the requested chunk. Afterwards, similar to static content, the injector sends the uncle hashes necessary to verify that chunk, as in Section 5.1. In particular, the injector sends the uncle hashes necessary to verify the requested chunk against the munro hash. This differs from static content, where the verification takes place against the root hash. Finally, the injector sends the actual chunk.
   The receiving peer verifies the signature on the signed munro using
   the swarm ID (a public key) and updates its hash tree.  As the peer
   now knows the munro hash is trusted, it can verify all chunks in the
   subtree against this munro hash, using the accompanying uncle hashes
   as in Section 5.1.

   To illustrate this procedure, let's consider the next iteration in
   the process.  The injector has generated the current tree shown in
   Figure 5, and it is connected to several peers that currently have
   the same tree and all possess chunks 0, 2, 4, and 6.  When the
   injector generates two new chunks, NCHUNKS_PER_SIG=2, the hash tree
   expands as shown in Figure 6.  The two new chunks, 8 and 10, extend
   the tree on the right side, and to accommodate them, a new root is
   created: node 7.  As this tree is wider at the base than the actual
   number of chunks, there are currently two empty leaves.  The munro
   node for the new subtree is 9, labeled with a $ sign.

                                     7
                                    / \
                                  /     \
                                /         \
                              /             \
                            3               11
                           / \              / \
                          /   \            /   \
                         /     \          /     \
                        1       5        9$      13
                       / \     / \      / \      / \
                      0   2   4   6    8   10   E   E

    Expanded live tree.  With NCHUNKS_PER_SIG=2, node 9 is the munro of
             the newly added subtree spanning chunks 8 and 10.

                                 Figure 6
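
The node labels in Figures 5 and 6 appear to follow an in-order numbering in which leaves carry even labels. Under that assumption, the label of the munro covering a given leaf can be computed directly. The following Python sketch illustrates this; the function name munro_label is hypothetical and not part of the protocol.

```python
def munro_label(leaf_label: int, nchunks_per_sig: int) -> int:
    """Return the label of the munro node whose subtree contains the leaf.

    Assumes the in-order labeling of Figures 5 and 6 (leaves are even
    numbers) and that nchunks_per_sig is a power of 2 that is >= 2.
    """
    if leaf_label < 0 or leaf_label % 2 != 0:
        raise ValueError("leaf labels are non-negative even numbers")
    if nchunks_per_sig < 2 or nchunks_per_sig & (nchunks_per_sig - 1):
        raise ValueError("NCHUNKS_PER_SIG must be a power of 2 and >= 2")
    # Position of the chunk among the leaves (0, 1, 2, ...).
    chunk_index = leaf_label // 2
    # First chunk of the NCHUNKS_PER_SIG-wide subtree that holds it.
    first_chunk = (chunk_index // nchunks_per_sig) * nchunks_per_sig
    # In-order label of the root of that subtree.
    return 2 * first_chunk + nchunks_per_sig - 1
```

With NCHUNKS_PER_SIG=2, munro_label(8, 2) and munro_label(10, 2) both give 9, and munro_label(4, 2) gives 5, matching the figures.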

   The injector now needs to inform its peers of the updated tree,
   communicating the addition of the new munro hash 9.  Hence, it sends
   a HAVE message with a chunk specification for nodes 8 + 10 to its
   peers.  As a response, Peer P requests the newly created chunk, e.g.,
   chunk 8, from the injector by sending a REQUEST message.  In reply,
   the injector sends the signed munro hash of node 9 as an INTEGRITY
   message with the hash of node 9, and a SIGNED_INTEGRITY message with
   the signature of the hash of node 9.  These messages are followed by
   an INTEGRITY message with the hash of node 10 and a DATA message with
   chunk 8.
   Upon receipt, Peer P verifies the signature of the munro and expands
   its view of the tree.  Next, the peer computes the hash of chunk 8
   and combines it with the received hash of node 10, computing the
   expected hash of node 9.  The peer can then verify the content of
   chunk 8 by comparing the computed hash of node 9 with the munro hash
   of the same node that it just received; if they match, Peer P has
   successfully verified the integrity of chunk 8.
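
The hash check that Peer P performs in this example can be sketched in Python as follows. This is a minimal illustration, assuming SHA-256 as the tree hash function and assuming the signature on the munro hash has already been verified separately; the helper names are hypothetical.

```python
import hashlib


def leaf_hash(chunk: bytes) -> bytes:
    """Hash of a leaf node: the hash of the chunk's content."""
    return hashlib.sha256(chunk).digest()


def parent_hash(left: bytes, right: bytes) -> bytes:
    """Hash of an interior node: hash of its children's hashes, concatenated."""
    return hashlib.sha256(left + right).digest()


def verify_chunk(chunk: bytes, uncles, munro_hash: bytes) -> bool:
    """Recompute the path from the chunk up to the munro and compare.

    uncles is a list of (sibling_hash, sibling_is_right) pairs, ordered
    from the leaf level upwards.
    """
    h = leaf_hash(chunk)
    for sibling, sibling_is_right in uncles:
        h = parent_hash(h, sibling) if sibling_is_right else parent_hash(sibling, h)
    return h == munro_hash


# Example mirroring Figure 6: chunk 8 is checked against munro 9 using
# the hash of node 10 as the only uncle.  The payloads are made up.
chunk8 = b"payload of chunk 8"
chunk10 = b"payload of chunk 10"
hash10 = leaf_hash(chunk10)
munro9 = parent_hash(leaf_hash(chunk8), hash10)
ok = verify_chunk(chunk8, [(hash10, True)], munro9)
```

With larger values of NCHUNKS_PER_SIG, the uncles list simply grows by one entry per tree level between the chunk and its munro.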

   This procedure requires just one signing operation for every
   NCHUNKS_PER_SIG chunks created, and one verification operation for
   every NCHUNKS_PER_SIG received, making it much cheaper than "Sign
   All".  A receiving peer does additionally need to check one or more
   hashes per chunk via the Merkle Hash Tree scheme, but this has lower
   hardware requirements than a signature verification for every chunk.
   This approach is similar to signature amortization via Merkle Tree
   Chaining [SIGMCAST].  The downside of this scheme is increased
   latency.  A peer cannot download the new chunks until the injector
   has computed the signature and announced the subtree.  A peer MUST
   check the signature before forwarding the chunks to other peers
   [POLLIVE].

   The number of chunks per signature NCHUNKS_PER_SIG MUST be a fixed
   power of 2 for simplicity.  NCHUNKS_PER_SIG MUST be larger than 1 for
   performance reasons.  There are two related factors to consider when
   choosing a value for NCHUNKS_PER_SIG.  First, the allowed CPU load on
   clients due to signature verifications, given the expected bitrate of
   the stream.  To achieve a low CPU load in a high bitrate stream,
   NCHUNKS_PER_SIG should be high.  Second, the effect on latency, which
   increases when NCHUNKS_PER_SIG gets higher, as just discussed.  Note
   that the procedure does not preclude the use of variable-size chunks.
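
The trade-off between the two factors can be made concrete with a small calculation. The Python sketch below assumes fixed-size chunks and a constant bitrate, and it approximates the added latency as the time the injector needs to generate one full group of NCHUNKS_PER_SIG chunks; the function name and parameters are illustrative only.

```python
def signature_tradeoff(bitrate_bps: float, chunk_size_bytes: int,
                       nchunks_per_sig: int):
    """Return (signature verifications per second, added latency in seconds).

    Assumes fixed-size chunks and a constant stream bitrate.
    """
    chunks_per_second = bitrate_bps / (8 * chunk_size_bytes)
    # One signature is verified per group of NCHUNKS_PER_SIG chunks.
    verifications_per_second = chunks_per_second / nchunks_per_sig
    # The first chunk of a group waits until the whole group exists.
    added_latency_seconds = nchunks_per_sig / chunks_per_second
    return verifications_per_second, added_latency_seconds
```

For a 1 Mbit/s stream with 1024-byte chunks and NCHUNKS_PER_SIG=32, this gives roughly 3.8 verifications per second and about 0.26 seconds of added latency. Doubling NCHUNKS_PER_SIG halves the first number and doubles the second.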

   This method of integrity verification provides an additional benefit.
   If the system includes some peers that saved the complete broadcast,
   as soon as the broadcast ends, the content is available as a video-
   on-demand download using the now stabilized tree and the final root
   hash as swarm identifier.  Peers that saved all the chunks can now
   announce the root hash to the tracking infrastructure and instantly
   seed the content.

6.1.2.2. Munro Signature Calculation
The digital signature algorithm used is determined by the Live Signature Algorithm protocol option (see Section 7.7). The signature is computed over a concatenation of the on-the-wire representation of the chunk specification of the munro node (see Section 6.1.2.1), a timestamp in 64-bit NTP Timestamp format [RFC5905], and the hash associated with the munro node, in that order. The timestamp is the time at which the signature was made at the injector, in UTC.
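
The byte string that is signed can be assembled as in the following Python sketch. It assumes the 32-bit chunk ranges addressing method for the chunk specification and leaves the actual signing to whatever cryptographic library implements the selected Live Signature Algorithm; the function names are hypothetical.

```python
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_OFFSET = 2208988800


def ntp_timestamp(unix_time: float) -> bytes:
    """Encode a non-negative Unix time as a 64-bit NTP timestamp.

    The result is 32 bits of seconds followed by 32 bits of fraction,
    both big-endian.
    """
    whole = int(unix_time)
    seconds = (whole + NTP_UNIX_OFFSET) & 0xFFFFFFFF
    fraction = int((unix_time - whole) * 2**32) & 0xFFFFFFFF
    return struct.pack(">II", seconds, fraction)


def munro_signing_input(start_chunk: int, end_chunk: int,
                        unix_time: float, munro_hash: bytes) -> bytes:
    """Concatenate chunk specification, timestamp, and munro hash, in that order.

    Assumption: 32-bit chunk ranges, so the chunk specification is a
    (start, end) pair of 32-bit big-endian integers.
    """
    chunk_spec = struct.pack(">II", start_chunk, end_chunk)
    return chunk_spec + ntp_timestamp(unix_time) + munro_hash
```

The injector would pass the returned bytes to its signing routine, and a receiving peer would rebuild the same bytes from the received messages before verifying the signature against the swarm ID.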
6.1.2.3. Procedure
Formally, the injector MUST NOT send a HAVE message for chunks in the new subtree until it has computed the signed munro hash for that subtree. When Peer B requests a chunk C from Peer A (either the injector or another peer), and Peer A decides to reply, it must do so as follows. First, Peer A MUST send an INTEGRITY message with the chunk specification for the munro of chunk C and the munro's hash, followed by a SIGNED_INTEGRITY message with the chunk specification for the munro, timestamp, and its signature in a single datagram, unless Peer B indicated earlier in the exchange that it already possesses a chunk with the same corresponding munro (by means of HAVE or ACK messages). Following these two messages (if any), Peer A MUST send the missing uncle hashes needed for verifying the chunk against its munro hash, and the chunk itself, as described in Section 5.4, sharing datagrams if possible.
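
The required message order can be summarized in a short Python sketch. It only produces a list of message descriptions; the function and its arguments are hypothetical, and real messages would carry chunk specifications, hashes, timestamps, and signatures.

```python
def plan_reply(chunk: int, munro: int, uncle_nodes: list,
               requester_has_munro: bool) -> list:
    """Return the messages Peer A sends for one requested chunk, in order."""
    messages = []
    if not requester_has_munro:
        # Munro hash first, then its signature, in a single datagram.
        messages.append(f"INTEGRITY({munro})")
        messages.append(f"SIGNED_INTEGRITY({munro})")
    # Missing uncle hashes needed to verify the chunk against the munro.
    messages.extend(f"INTEGRITY({node})" for node in uncle_nodes)
    # The chunk itself comes last.
    messages.append(f"DATA({chunk})")
    return messages
```

For the example of Section 6.1.2.1, plan_reply(8, 9, [10], False) yields INTEGRITY(9), SIGNED_INTEGRITY(9), INTEGRITY(10), and DATA(8), in that order.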
6.1.2.4. Secure Tune In
When a peer tunes in to a live stream, it has to determine the last chunk the injector has generated. To facilitate this process in the Unified Merkle Tree scheme, each peer shares its knowledge about the injector's chunks with the others by exchanging their latest signed munro hashes, as follows.

Recall that, in PPSPP, when Peer A initiates a channel with Peer B, Peer A sends a first datagram with a HANDSHAKE message, and Peer B responds with a second datagram also containing a HANDSHAKE message (see Section 3.1). When Peer A sends a third datagram to Peer B, and it is received by Peer B, both peers know that the other is listening on its stated transport address. Peer B is then allowed to send heavy payload like DATA messages in the fourth datagram. Peer A can already safely do that in the third datagram.

In the Unified Merkle Tree scheme, Peer A MUST send its rightmost signed munro hash to Peer B in the third datagram, and in any subsequent datagrams to Peer B, until Peer B indicates that it possesses a chunk with the same corresponding munro or a more recent munro (by means of a HAVE or ACK message). Peer B may already have indicated this fact by means of HAVE messages in the second datagram. Conversely, when Peer B sends the fourth datagram or any subsequent datagram to Peer A, Peer B MUST send its rightmost signed munro hash, unless Peer A indicated knowledge of it or more recent munros.

The rightmost signed munro hash of a peer is defined as the munro hash signed by the injector of the rightmost subtree of width NCHUNKS_PER_SIG chunks in the peer's Merkle hash tree. Peer A MUST
   NOT send the signed munro hash in the first datagram of the HANDSHAKE
   procedure and Peer B MUST NOT send it in the second datagram as it is
   considered heavy payload.

   When a peer receives a SIGNED_INTEGRITY message with a signed munro
   hash but the timestamp is too old, the peer MUST discard the message.
   Otherwise, it SHOULD use the signed munro to update its hash tree and
   pick a tune-in point in the live stream.  A peer may use the information
   from multiple peers to pick the tune-in point.
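
One possible way to combine signed munros from several peers is sketched below in Python. The text does not define when a timestamp is "too old" or how to choose among candidates, so the max_age_seconds threshold and the choice of the most advanced munro are assumptions made purely for illustration. Signatures are assumed to have been verified already.

```python
def pick_tune_in(signed_munros, now: float, max_age_seconds: float):
    """Pick a tune-in chunk from verified signed munros.

    signed_munros is an iterable of (end_chunk, timestamp) pairs, where
    end_chunk is the last chunk covered by the munro and timestamp is
    the signing time in seconds on the same clock as now.
    Returns None if every munro is too old.
    """
    # Munros that are too old are discarded, as required by the text.
    fresh = [end for end, ts in signed_munros if now - ts <= max_age_seconds]
    # Hypothetical policy: tune in at the most advanced fresh munro.
    return max(fresh) if fresh else None
```

A real client would likely tune in somewhat before the returned chunk to fill its playback buffer.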

6.2. Forgetting Chunks

As a live broadcast progresses, a peer may want to discard the chunks that it has already played out. Ideally, other peers should be aware of this fact so that they will not try to request these chunks from this peer. This could happen in scenarios where live streams may be paused by viewers, or viewers are allowed to start late in a live broadcast (e.g., start watching a broadcast at 20:35 when it actually began at 20:30).

PPSPP provides a simple solution for peers to stay up to date with the chunk availability of a discarding peer. A discarding peer in a live stream MUST enable the Live Discard Window protocol option, specifying how many chunks/bytes it caches before the last chunk/byte it advertised as being available (see Section 7.9). Its peers SHOULD apply this number as a sliding window filter over the peer's chunk availability as conveyed via its HAVE messages.

Three factors are important when deciding on an appropriate value for this option: the desired amount of playback buffer for peers, the bitrate of the stream, and the available resources of the peer. Consider the case of a fresh peer joining the stream. The size of the discard window of the peers it connects to influences how much data it can directly download to establish its prebuffer. If the window is smaller than the desired buffer, the fresh peer has to wait until the peers have downloaded more of the stream before it can start playback. As media buffers are generally specified in terms of a number of seconds, the size of the discard window is also related to the (average) bitrate of the stream. Finally, if a peer has few resources to store chunks and metadata, it should choose a small discard window.
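
The sliding window filter and the relation between window size and buffer duration can be sketched in Python as follows. The sketch assumes chunk-based addressing with 32-bit values, fixed-size chunks, and one reading of "before the last chunk", namely that the window counts chunks preceding the last advertised one. All names are hypothetical.

```python
NO_DISCARD_32 = 0xFFFFFFFF  # special value for peers that keep everything


def chunks_still_cached(last_advertised: int, discard_window: int) -> range:
    """Chunks a remote peer can be assumed to still hold.

    Assumption: the window covers the discard_window chunks before the
    last advertised chunk, plus that chunk itself.
    """
    if discard_window == NO_DISCARD_32:
        return range(0, last_advertised + 1)
    first = max(0, last_advertised - discard_window)
    return range(first, last_advertised + 1)


def window_seconds(discard_window_chunks: int, chunk_size_bytes: int,
                   bitrate_bps: float) -> float:
    """Approximate playback time covered by a discard window."""
    return discard_window_chunks * chunk_size_bytes * 8 / bitrate_bps
```

For example, a window of 1000 chunks of 1024 bytes covers about 8.2 seconds of a 1 Mbit/s stream, which a fresh peer can compare against its desired prebuffer.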

7. Protocol Options

The HANDSHAKE message in PPSPP can contain the following protocol options. Unless stated otherwise, a protocol option consists of an 8-bit code followed by an 8-bit value. Larger values are all encoded
   big-endian.  Each protocol option is explained in the following
   subsections.  The list of protocol options MUST be sorted on code
   value (ascending) in a HANDSHAKE message.

             +--------+-------------------------------------+
             | Code   | Description                         |
             +--------+-------------------------------------+
             | 0      | Version                             |
             | 1      | Minimum Version                     |
             | 2      | Swarm Identifier                    |
             | 3      | Content Integrity Protection Method |
             | 4      | Merkle Hash Tree Function           |
             | 5      | Live Signature Algorithm            |
             | 6      | Chunk Addressing Method             |
             | 7      | Live Discard Window                 |
             | 8      | Supported Messages                  |
             | 9      | Chunk Size                          |
             | 10-254 | Unassigned                          |
             | 255    | End Option                          |
             +--------+-------------------------------------+

                          Table 2: PPSPP Options
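
The ordering and termination rules for the option list can be illustrated with a short Python sketch. It assumes each option value has already been encoded to bytes according to its own subsection; the function name is hypothetical.

```python
END_OPTION = 255


def encode_options(options: dict) -> bytes:
    """Serialize protocol options for a HANDSHAKE message.

    options maps an option code to its already-encoded value bytes.
    Options are emitted in ascending code order and followed by the
    end option.
    """
    out = bytearray()
    for code in sorted(options):
        if not 0 <= code < END_OPTION:
            raise ValueError(f"invalid option code {code}")
        out.append(code)
        out += options[code]
    out.append(END_OPTION)
    return bytes(out)
```

Because Version has code 0, sorting on code also places it first in the list, as Section 7.2 requires.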

7.1. End Option

A peer MUST conclude the list of protocol options with the end option. Subsequent octets should be considered protocol messages. The code for the end option is 255, and unlike others, it has no value octet, so the option's length is 1 octet.

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |1 1 1 1 1 1 1 1|
   +-+-+-+-+-+-+-+-+

7.2. Version

A peer MUST include the maximum version of the PPSPP it supports as the first protocol option in the list. The code for this option is 0. Defined values are listed in Table 3.
           +---------+----------------------------------------+
           | Version | Description                            |
           +---------+----------------------------------------+
           | 0       | Reserved                               |
           | 1       | Protocol as described in this document |
           | 2-255   | Unassigned                             |
           +---------+----------------------------------------+

                      Table 3: PPSPP Version Numbers

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 0 0 0|  Version (8)  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

7.3. Minimum Version

When a peer initiates the handshake, it MUST include the minimum version of the PPSPP it supports in the list of protocol options, following the min/max versioning scheme defined in [RFC6709], Section 4.1, strategy 5. The code for this option is 1. Defined values are listed in Table 3.

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 0 0 1| Min. Ver. (8) |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

7.4. Swarm Identifier

When a peer initiates the handshake, it MUST include a single swarm identifier option. If the peer is not the initiator, it MAY include a swarm identifier option, as an end-to-end check. This option has the following structure:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 0 1 0|     Swarm ID Length (16)      |               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                  Swarm Identifier (variable)                  ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The Swarm ID Length field contains the length of the single Swarm Identifier that follows in bytes. The Length field is 16 bits wide to allow for large public keys as identifiers in live streaming.
   Each PPSPP peer knows the IDs of the swarms it joins, so this
   information can be immediately verified upon receipt.
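
A minimal Python sketch of encoding this option is shown below; the function name is hypothetical.

```python
import struct

SWARM_ID_CODE = 2


def encode_swarm_id_option(swarm_id: bytes) -> bytes:
    """Encode the Swarm Identifier option.

    Layout: 8-bit code, 16-bit big-endian length, then the identifier.
    """
    if len(swarm_id) > 0xFFFF:
        raise ValueError("swarm identifier does not fit a 16-bit length")
    return struct.pack(">BH", SWARM_ID_CODE, len(swarm_id)) + swarm_id
```

A 20-byte identifier, for instance, is encoded as the octets 0x02 0x00 0x14 followed by the 20 identifier bytes.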

7.5. Content Integrity Protection Method

A peer MUST include the content integrity method used by a swarm. The code for this option is 3. Defined values are listed in Table 4.

                   +--------+-------------------------+
                   | Method | Description             |
                   +--------+-------------------------+
                   | 0      | No integrity protection |
                   | 1      | Merkle Hash Tree        |
                   | 2      | Sign All                |
                   | 3      | Unified Merkle Tree     |
                   | 4-255  | Unassigned              |
                   +--------+-------------------------+

            Table 4: PPSPP Content Integrity Protection Methods

The "Merkle Hash Tree" method is the default for static content, see Section 5.1. "Sign All", and "Unified Merkle Tree" are for live content, see Section 6.1, with "Unified Merkle Tree" being the default.

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 0 1 1|   CIPM (8)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

7.6. Merkle Tree Hash Function

When the content integrity protection method is "Merkle Hash Tree", this option defining which hash function is used for the tree MUST be included. The code for this option is 4. Defined values are listed in Table 5 (see [FIPS180-4] for the function semantics).
                        +----------+-------------+
                        | Function | Description |
                        +----------+-------------+
                        | 0        | SHA-1       |
                        | 1        | SHA-224     |
                        | 2        | SHA-256     |
                        | 3        | SHA-384     |
                        | 4        | SHA-512     |
                        | 5-255    | Unassigned  |
                        +----------+-------------+

                   Table 5: PPSPP Merkle Hash Functions

   Implementations MUST support SHA-1 (see Section 12.5) and SHA-256.
   SHA-256 is the default.

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 1 0 0|    MHF (8)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

7.7. Live Signature Algorithm

When the content integrity protection method is "Sign All" or "Unified Merkle Tree", this option MUST be defined. The code for this option is 5. The 8-bit value of this option is one of the values listed in the "Domain Name System Security (DNSSEC) Algorithm Numbers" registry [IANADNSSECALGNUM]. The RSASHA1 [RFC4034], RSASHA256 [RFC5702], ECDSAP256SHA256 and ECDSAP384SHA384 [RFC6605] algorithms are mandatory to implement. Default is ECDSAP256SHA256.

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 1 0 1|    LSA (8)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

7.8. Chunk Addressing Method

A peer MUST include the chunk addressing method it uses. The code for this option is 6. Defined values are listed in Table 6.
                     +--------+---------------------+
                     | Method | Description         |
                     +--------+---------------------+
                     | 0      | 32-bit bins         |
                     | 1      | 64-bit byte ranges  |
                     | 2      | 32-bit chunk ranges |
                     | 3      | 64-bit bins         |
                     | 4      | 64-bit chunk ranges |
                     | 5-255  | Unassigned          |
                     +--------+---------------------+

                  Table 6: PPSPP Chunk Addressing Methods

   Implementations MUST support "32-bit chunk ranges" and "64-bit chunk
   ranges".  Default is "32-bit chunk ranges".

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 1 1 0|    CAM (8)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

7.9. Live Discard Window

A peer in a live swarm MUST include the discard window it uses. The code for this option is 7. The unit of the discard window depends on the chunk addressing method used, see Table 6. For bins and chunk ranges, it is a number of chunks; for byte ranges, it is a number of bytes. Its data type is the same as for a bin, or one value in a range specification. In other words, its value is a 32-bit or 64-bit integer in big-endian format. If this option is used, the Chunk Addressing Method MUST appear before it in the list. This option has the following structure:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 1 1 1|         Live Discard Window (32 or 64)        ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~                                                               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

A peer that does not, under normal circumstances, discard chunks MUST set this option to the special value 0xFFFFFFFF (32-bit) or 0xFFFFFFFFFFFFFFFF (64-bit). For example, peers that record a complete broadcast to offer it directly as a static file after the broadcast ends use these values (see Section 6.1.2). Section 6.2 explains how to determine a value for this option.
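
Because the width of the value depends on the chunk addressing method, an encoder has to consult Table 6. The following Python sketch shows one way to do this; the function name is hypothetical, and passing None is merely a convention of this sketch for a peer that does not discard chunks.

```python
import struct

LIVE_DISCARD_WINDOW_CODE = 7
METHODS_32_BIT = (0, 2)     # 32-bit bins, 32-bit chunk ranges
METHODS_64_BIT = (1, 3, 4)  # 64-bit byte ranges, bins, chunk ranges


def encode_live_discard_window(addressing_method: int, window=None) -> bytes:
    """Encode the Live Discard Window option for a Table 6 method.

    window=None selects the special all-ones value for peers that do
    not discard chunks.
    """
    if addressing_method in METHODS_32_BIT:
        fmt, all_ones = ">BI", 0xFFFFFFFF
    elif addressing_method in METHODS_64_BIT:
        fmt, all_ones = ">BQ", 0xFFFFFFFFFFFFFFFF
    else:
        raise ValueError("unassigned chunk addressing method")
    value = all_ones if window is None else window
    if not 0 <= value <= all_ones:
        raise ValueError("window does not fit the field width")
    return struct.pack(fmt, LIVE_DISCARD_WINDOW_CODE, value)
```

With 32-bit chunk ranges (method 2), a window of 1000 chunks is encoded as 0x07 followed by 0x000003E8.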

7.10. Supported Messages

Peers may support just a subset of the PPSPP messages. For example, peers running over TCP may not accept ACK messages or peers used with a centralized tracking infrastructure may not accept PEX messages. For these reasons, peers who support only a proper subset of the PPSPP messages MUST signal which subset they support by means of this protocol option. The code for this option is 8.

The value of this option is a length octet (SupMsgLen) indicating the length, in bytes, of the compressed bitmap that follows. The set of messages supported can be derived from the compressed bitmap by padding it with bytes of value 0 until it is 256 bits in length. Then, a 1 bit in the resulting bitmap at position X (numbering left to right) corresponds to support for message type X, see Table 7. In other words, to construct the compressed bitmap, create a bitmap with a 1 for each message type supported and a 0 for a message type that is not, store it as an array of bytes, and truncate it to the last non-zero byte. An example of the first 16 bits of the compressed bitmap for a peer supporting every message except ACKs and PEXs is 11011001 11110000.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 1 0 0 0| SupMsgLen (8) |                               ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~         Supported Messages Bitmap (variable, max 256)         ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
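
The construction and expansion of the compressed bitmap can be sketched in Python as follows. Message types are handled purely as numbers here, and the function names are hypothetical.

```python
SUPPORTED_MESSAGES_CODE = 8


def encode_supported_messages(supported) -> bytes:
    """Encode the Supported Messages option from a set of message type numbers."""
    bitmap = bytearray(32)  # 256 bits, one per message type
    for msg_type in supported:
        if not 0 <= msg_type <= 255:
            raise ValueError(f"invalid message type {msg_type}")
        # Bit positions are numbered left to right within each byte.
        bitmap[msg_type // 8] |= 0x80 >> (msg_type % 8)
    # Truncate to the last non-zero byte.
    compressed = bytes(bitmap).rstrip(b"\x00")
    return bytes([SUPPORTED_MESSAGES_CODE, len(compressed)]) + compressed


def decode_supported_messages(compressed: bytes) -> set:
    """Expand a compressed bitmap into the set of supported message types."""
    padded = compressed.ljust(32, b"\x00")  # pad with zero bytes to 256 bits
    return {t for t in range(256) if padded[t // 8] & (0x80 >> (t % 8))}
```

The example bitmap 11011001 11110000 has 1 bits at positions 0, 1, 3, 4, 7, 8, 9, 10, and 11, so encoding that set yields the option octets 0x08 0x02 0xD9 0xF0.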

7.11. Chunk Size

A peer in a swarm MUST include the chunk size the swarm uses. The code for this option is 9. Its value is a 32-bit integer denoting the size of the chunks in bytes in big-endian format. When variable chunk sizes are used, this option MUST be set to the special value 0xFFFFFFFF. Section 8.1 explains how content publishers can determine a value for this option.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 1 0 0 1|                Chunk Size (32)                ~
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   ~               |
   +-+-+-+-+-+-+-+-+
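
Taken together, the option layouts of Sections 7.1 to 7.11 allow a receiver to walk the option list octet by octet. The Python sketch below is one possible parser under those layouts; it returns raw value bytes without interpreting them, and the function and constant names are hypothetical.

```python
import struct

ONE_OCTET_OPTIONS = {0, 1, 3, 4, 5, 6}         # options with an 8-bit value
CAM_WIDTH = {0: 4, 1: 8, 2: 4, 3: 8, 4: 8}     # Table 6 method -> octets


def parse_options(data: bytes):
    """Parse protocol options up to the end option.

    Returns (options, rest), where options maps option code to raw
    value bytes and rest holds the octets after the end option.
    """
    options = {}
    pos = 0
    last_code = -1

    def take(n: int) -> bytes:
        nonlocal pos
        if pos + n > len(data):
            raise ValueError("truncated option list")
        piece = data[pos:pos + n]
        pos += n
        return piece

    while True:
        code = take(1)[0]
        if code == 255:  # end option: no value octet
            return options, data[pos:]
        if code <= last_code:
            raise ValueError("options must be sorted on ascending code")
        last_code = code
        if code in ONE_OCTET_OPTIONS:
            value = take(1)
        elif code == 2:  # Swarm Identifier: 16-bit length prefix
            (length,) = struct.unpack(">H", take(2))
            value = take(length)
        elif code == 7:  # Live Discard Window: width depends on option 6
            if 6 not in options or options[6][0] not in CAM_WIDTH:
                raise ValueError("Chunk Addressing Method must precede option 7")
            value = take(CAM_WIDTH[options[6][0]])
        elif code == 8:  # Supported Messages: length octet, then bitmap
            value = take(take(1)[0])
        elif code == 9:  # Chunk Size: 32-bit integer
            value = take(4)
        else:
            raise ValueError(f"unassigned option code {code}")
        options[code] = value
```

The octets returned as rest are the protocol messages that follow the option list in the same HANDSHAKE datagram.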

