This section explains how a TCP stack can deal with typical constraints in CNNs. The guidance in this section relates to the TCP implementation and its configuration.
Assuming that IPv6 is used, and for the sake of lightweight implementation and operation, unless applications require handling large data units (i.e., leading to an IPv6 datagram size greater than 1280 bytes), it may be desirable to limit the IP datagram size to 1280 bytes in order to avoid the need to support Path MTU Discovery [RFC 8201]. In addition, an IP datagram size of 1280 bytes avoids incurring IPv6-layer fragmentation [RFC 8900].
An IPv6 datagram size exceeding 1280 bytes can be avoided by setting the TCP MSS to 1220 bytes or less, i.e., 1280 bytes minus the 40-byte IPv6 header and the 20-byte TCP header without options. Note that it is already a requirement for TCP implementations to consume payload space instead of increasing datagram size when including IP or TCP options in an IP packet to be sent [RFC 6691]. Therefore, it is not required to advertise an MSS smaller than 1220 bytes in order to accommodate TCP options.
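The MSS arithmetic above can be sketched as follows. This is a minimal illustration, assuming an IPv6 header without extension headers; the function names are hypothetical and not part of any TCP stack.

```python
IPV6_DATAGRAM_LIMIT = 1280   # bytes; datagram size limit discussed above
IPV6_HEADER = 40             # fixed IPv6 header (assumes no extension headers)
TCP_BASE_HEADER = 20         # TCP header without options


def mss_for_datagram_limit(limit=IPV6_DATAGRAM_LIMIT):
    """Largest MSS that keeps an option-free segment within `limit` bytes."""
    return limit - IPV6_HEADER - TCP_BASE_HEADER


def payload_per_segment(mss, tcp_option_bytes=0):
    """Payload a sender may place in one segment.

    TCP options consume payload space instead of enlarging the datagram,
    so the advertised MSS does not need to be lowered for them.
    """
    if tcp_option_bytes % 4 != 0 or not 0 <= tcp_option_bytes <= 40:
        raise ValueError("TCP options occupy 0..40 bytes in 4-byte multiples")
    return mss - tcp_option_bytes


mss = mss_for_datagram_limit()
print(mss)                            # 1220
print(payload_per_segment(mss, 12))   # 1208 (e.g., padded Timestamps option)
```

In both cases the resulting IPv6 datagram stays at or below 1280 bytes.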
Note that setting the MTU to 1280 bytes is possible for link-layer technologies in the CNN space, even if some of them are characterized by a short data unit payload size, e.g., up to a few tens or hundreds of bytes. For example, the maximum frame size in IEEE 802.15.4 is 127 bytes. 6LoWPAN defined an adaptation layer to support IPv6 over IEEE 802.15.4 networks. The adaptation layer includes a fragmentation mechanism, since IPv6 requires the layer below to support an MTU of 1280 bytes [RFC 8200], while IEEE 802.15.4 lacks fragmentation mechanisms. 6LoWPAN defines an IEEE 802.15.4 link MTU of 1280 bytes [RFC 4944]. Other technologies, such as Bluetooth low energy [RFC 7668], ITU-T G.9959 [RFC 7428], or Digital Enhanced Cordless Telecommunications (DECT) Ultra Low Energy (ULE) [RFC 8105], also use 6LoWPAN-based adaptation layers in order to enable IPv6 support. These technologies do support link-layer fragmentation. By exploiting this functionality, the adaptation layers that enable IPv6 over such technologies also define an MTU of 1280 bytes.
On the other hand, there exist technologies also used in the CNN space, such as Master Slave (MS) / Token Passing (TP) [RFC 8163], Narrowband IoT (NB-IoT) [RFC 8376], or IEEE 802.11ah [6LO-WLANAH], that do not suffer the same degree of frame size limitations as the technologies mentioned above. It is recommended that the MTU for MS/TP be 1500 bytes [RFC 8163]; the MTU in NB-IoT is 1600 bytes, and the maximum frame payload size for IEEE 802.11ah is 7991 bytes.
Using a larger MSS (to a suitable extent) may be beneficial in some scenarios, especially when transferring large payloads, as it reduces the number of packets (and packet headers) required for a given payload. However, the characteristics of the constrained network need to be considered. In particular, in a lossy network where unreliable fragment delivery is used, the amount of data that TCP unnecessarily retransmits due to fragment loss increases (and throughput decreases) quickly with the MSS. This happens because the loss of a fragment leads to the loss of the whole fragmented packet being transmitted. Unnecessary data retransmission is particularly harmful in CNNs due to the resource constraints of such environments. Note that, while the original 6LoWPAN fragmentation mechanism [RFC 4944] does not offer reliable fragment delivery, fragment recovery functionality for 6LoWPAN or 6Lo environments has been standardized [RFC 8931].
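The effect of fragment loss on a fragmented packet can be illustrated with a short calculation. The sketch below assumes independent fragment losses and no fragment recovery; the per-fragment payload size and loss rate are hypothetical values chosen only for illustration.

```python
import math


def fragments_needed(datagram_bytes, fragment_payload_bytes):
    """Number of link-layer fragments needed to carry one datagram."""
    return math.ceil(datagram_bytes / fragment_payload_bytes)


def packet_loss_probability(fragment_loss_prob, n_fragments):
    """Without fragment recovery, losing any fragment loses the whole packet."""
    return 1.0 - (1.0 - fragment_loss_prob) ** n_fragments


FRAGMENT_PAYLOAD = 96   # hypothetical datagram bytes carried per frame
FRAGMENT_LOSS = 0.02    # hypothetical, independent per-fragment loss rate

for datagram in (256, 640, 1280):
    n = fragments_needed(datagram, FRAGMENT_PAYLOAD)
    p = packet_loss_probability(FRAGMENT_LOSS, n)
    print(f"{datagram:5d} B -> {n:2d} fragments, packet loss {p:.3f}")
```

Under these assumptions, the packet loss probability grows with the number of fragments, and each lost packet forces TCP to retransmit all of its fragments.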
Explicit Congestion Notification (ECN) [RFC 3168] allows a router to signal in the IP header of a packet that congestion is rising, for example, when a queue size reaches a certain threshold. An ECN-enabled TCP receiver will echo back the congestion signal to the TCP sender by setting a flag in its next TCP Acknowledgment (ACK). The sender triggers congestion control measures as if a packet loss had happened.
RFC 8087 [RFC 8087] outlines the principal gains in terms of increased throughput, reduced delay, and other benefits when ECN is used over a network path that includes equipment that supports Congestion Experienced (CE) marking. In the context of CNNs, a remarkable feature of ECN is that congestion can be signaled without incurring packet drops (which will lead to retransmissions and consumption of limited resources such as energy and bandwidth).
ECN can further reduce packet losses since congestion control measures can be applied earlier [RFC 2884]. Fewer lost packets mean fewer retransmitted segments, which is particularly beneficial in CNNs, where energy and bandwidth resources are typically limited. Also, it makes sense to try to avoid packet drops for transactional workloads with small data sizes, which are typical for CNNs. In such traffic patterns, it is more difficult and often impossible to detect packet loss without retransmission timeouts (e.g., as there may not be three duplicate ACKs). Any retransmission timeout slows down the data transfer significantly. In addition, if the constrained device uses power-saving techniques, a retransmission timeout will incur a wake-up action, in contrast to ACK clock-triggered sending. When the congestion window of a TCP sender has a size of one segment and a TCP ACK with an ECN signal (ECN-Echo (ECE) flag) arrives at the TCP sender, the TCP sender resets the retransmit timer, and the sender will only be able to send a new packet when the retransmit timer expires. Effectively, at that moment, the TCP sender reduces its sending rate from 1 segment per Round-Trip Time (RTT) to 1 segment per Retransmission Timeout (RTO) and reduces the sending rate further on each ECN signal received in subsequent TCP ACKs. Otherwise, if an ECN signal is not present in a subsequent TCP ACK, the TCP sender resumes the normal ACK-clocked transmission of segments [RFC 3168].
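The sender-side reaction to the ECE flag described above can be sketched as a small state model. This is a deliberately simplified illustration, not a complete congestion controller: the class name is hypothetical, the window is counted in segments, congestion window growth is not modeled, and the rule that limits window reductions to one per RTT is omitted.

```python
from dataclasses import dataclass


@dataclass
class EcnSenderSketch:
    cwnd: int = 4                 # congestion window, in segments
    timer_clocked: bool = False   # True: next segment waits for the retransmit timer

    def on_ack(self, ece: bool) -> None:
        """Update sender state for one incoming ACK."""
        if not ece:
            # No congestion signal: resume normal ACK-clocked sending.
            self.timer_clocked = False
        elif self.cwnd > 1:
            # React as for a packet loss: halve the window (floor of 1).
            self.cwnd = max(1, self.cwnd // 2)
        else:
            # cwnd is already one segment: the retransmit timer is reset and
            # the next segment is sent only when that timer expires.
            self.timer_clocked = True


sender = EcnSenderSketch(cwnd=2)
for ece in (True, True, False):
    sender.on_ack(ece)
    print(ece, sender.cwnd, sender.timer_clocked)
```

The second ECE-marked ACK moves the sketch from one segment per RTT to one segment per RTO, and the unmarked ACK returns it to ACK-clocked sending.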
ECN can be incrementally deployed in the Internet. Guidance on configuration and usage of ECN is provided in RFC 7567 [RFC 7567]. Given the benefits, more and more TCP stacks in the Internet support ECN, and it makes sense to specifically leverage ECN in controlled environments such as CNNs. As of this writing, there is ongoing work to extend the types of TCP packets that are ECN capable, including pure ACKs [TCPM-ECN]. Such a feature may further increase the benefits of ECN in CNN environments. Note, however, that supporting ECN increases implementation complexity.
There has been a significant body of research on solutions capable of explicitly indicating whether a TCP segment loss is due to corruption, in order to avoid activation of congestion control mechanisms [ETEN] [RFC 2757]. While such solutions may provide significant improvement, they have not been widely deployed and remain as experimental work. In fact, as of today, the IETF has not standardized any such solution.
This section discusses TCP stacks that allow transferring a single MSS. More general guidance is provided in Section 3.3.
A TCP stack can reduce the memory requirements by advertising a TCP window size of 1 MSS and by transmitting, at most, 1 MSS of unacknowledged data. In that case, both congestion and flow control implementation are quite simple. Such a small receive and send window may be sufficient for simple message exchanges in the CNN space. However, only using a window of 1 MSS can significantly affect performance. A stop-and-wait operation results in low throughput for transfers that exceed the length of 1 MSS, e.g., a firmware download. Furthermore, a single-MSS solution relies solely on timer-based loss recovery, therefore missing the performance gain of Fast Retransmit and Fast Recovery (which requires a larger window size; see Section 3.3.1).
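The throughput limit of stop-and-wait operation can be made concrete with a short calculation. The sketch below assumes a loss-free path and one full round trip per segment; the firmware size and RTT are hypothetical values.

```python
import math


def stop_and_wait_throughput_bps(mss, rtt_s):
    """Upper bound on throughput with at most 1 MSS in flight."""
    return 8 * mss / rtt_s


def stop_and_wait_transfer_time(total_bytes, mss, rtt_s):
    """Lower bound on transfer time: one round trip per segment, no losses."""
    return math.ceil(total_bytes / mss) * rtt_s


FIRMWARE_BYTES = 128 * 1024   # hypothetical firmware image size
RTT_S = 0.5                   # hypothetical round-trip time in seconds

print(stop_and_wait_throughput_bps(1220, RTT_S))                  # 19520.0 (bit/s)
print(stop_and_wait_transfer_time(FIRMWARE_BYTES, 1220, RTT_S))   # 54.0 (s)
```

Any delay added to the round trip, such as a Delayed ACK timer at the receiver, lowers this bound further.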
If CoAP is used over TCP with the default setting for NSTART in RFC 7252 [RFC 7252], a CoAP endpoint is not allowed to send a new message to a destination until a response for the previous message sent to that destination has been received. This is equivalent to an application-layer window size of 1 data unit. For this use of CoAP, a maximum TCP window of 1 MSS may be sufficient, as long as the CoAP message size does not exceed 1 MSS. An exception in CoAP over TCP, though, is the Capabilities and Settings Message (CSM) that must be sent at the start of the TCP connection. The first application message carrying user data is allowed to be sent immediately after the CSM. If the sum of the CSM size plus the application message size exceeds the MSS, a sender using a single-MSS stack will need to wait for the ACK confirming the CSM before sending the application message.
A TCP implementation needs to support, at a minimum, TCP options 2, 1, and 0. These are, respectively, the MSS option, the No-Operation option, and the End Of Option List marker [RFC 0793]. None of these are a substantial burden to support. These options are sufficient for interoperability with a standard-compliant TCP endpoint, although many TCP stacks support additional options and can negotiate their use. A TCP implementation is permitted to silently ignore all other TCP options.
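A minimal option parser along these lines could look as follows. The sketch interprets only kinds 0, 1, and 2 and skips every other option by its length byte; the function name and the sample option bytes are illustrative assumptions.

```python
import struct

KIND_EOL, KIND_NOP, KIND_MSS = 0, 1, 2


def parse_mss(options: bytes):
    """Return the MSS carried in a TCP options field, or None if absent.

    Any option other than kinds 0, 1, and 2 is skipped using its length
    byte, i.e., it is silently ignored.
    """
    i = 0
    mss = None
    while i < len(options):
        kind = options[i]
        if kind == KIND_EOL:
            break                       # End Of Option List: stop parsing
        if kind == KIND_NOP:
            i += 1                      # No-Operation: single padding byte
            continue
        if i + 1 >= len(options):
            break                       # truncated option: stop parsing
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break                       # malformed length: stop parsing
        if kind == KIND_MSS and length == 4:
            (mss,) = struct.unpack_from("!H", options, i + 2)
        i += length
    return mss


# MSS = 1220, NOP, Window Scale (kind 3, ignored here), End Of Option List padding
sample = bytes([2, 4]) + struct.pack("!H", 1220) + bytes([1, 3, 3, 7, 0, 0, 0, 0])
print(parse_mss(sample))   # 1220
```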
A TCP implementation for a constrained device that uses a single-MSS TCP receive or transmit window size may not benefit from supporting the following TCP options: Window Scale [RFC 7323], TCP Timestamps [RFC 7323], Selective Acknowledgment (SACK) [RFC 2018], and SACK-Permitted [RFC 2018]. Also, other TCP options may not be required on a constrained device with a very lightweight implementation. With regard to the Window Scale option, note that it is only useful if a window size greater than 64 kB is needed.
Note that a TCP sender can benefit from the TCP Timestamps option [RFC 7323] in detecting spurious RTOs. The latter are quite likely to occur in CNN scenarios for a number of reasons (e.g., route changes in a multihop scenario or link-layer retries). The header overhead incurred by the Timestamps option (of up to 12 bytes) needs to be taken into account.
TCP Delayed Acknowledgments are meant to reduce the number of ACKs sent within a TCP connection, thus reducing network overhead, but they may increase the time until a sender may receive an ACK. In general, usefulness of Delayed ACKs depends heavily on the usage scenario (see Section 3.3.2). There can be interactions with single-MSS stacks.
When traffic is unidirectional, if the sender can send at most 1 MSS of data or the receiver advertises a receive window not greater than the MSS, Delayed ACKs may unnecessarily contribute delay (up to 500 ms) to the RTT [RFC 5681], which limits the throughput and can increase data delivery time. Note that, in some cases, it may not be possible to disable Delayed ACKs. One known workaround is to split the data to be sent into two segments of smaller size. A standard-compliant TCP receiver may immediately acknowledge the second segment, which can improve throughput. However, this "split hack" may not always work since a TCP receiver is required to acknowledge every second full-sized segment, but not two consecutive small segments. The overhead of sending two IP packets instead of one is another downside of the "split hack".
Similar issues may happen when the sender uses the Nagle algorithm, since the sender may need to wait for an unnecessarily Delayed ACK to send a new segment. Disabling the algorithm will have no impact if the sender can only handle stop-and-wait operation at the TCP level.
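Where the sender runs on a system that exposes the Berkeley sockets API (an assumption; many constrained TCP stacks offer a different interface), the Nagle algorithm is commonly disabled per connection with the TCP_NODELAY socket option. The helper name below is hypothetical, and the socket is never connected, so the sketch only shows the configuration step.

```python
import socket


def disable_nagle(sock: socket.socket) -> bool:
    """Turn off the Nagle algorithm on a TCP socket and report the new state."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    # Read the option back; a nonzero value means Nagle is disabled.
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


# Unconnected socket, used only to demonstrate setting the option.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    print(disable_nagle(s))
```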
For request-response traffic, when the receiver uses Delayed ACKs, a response to a data message can piggyback an ACK, as long as the latter is sent before the Delayed ACK timer expires, thus avoiding unnecessary ACKs without payload. Disabling Delayed ACKs at the request sender allows an immediate ACK for the data segment carrying the response.
The RTO calculation is one of the fundamental TCP algorithms [RFC 6298]. There is a fundamental trade-off: a short, aggressive RTO behavior reduces wait time before retransmissions, but it also increases the probability of spurious timeouts. Spurious timeouts lead to unnecessary waste of potentially scarce resources in CNNs such as energy and bandwidth. In contrast, a conservative timeout can result in long error recovery times and, thus, needlessly delay data delivery.
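For reference, the standard RTO computation of RFC 6298 can be sketched as follows. The class name is hypothetical, and the clock granularity and the 60-second upper bound are assumed example values; the smoothing constants, the 1-second initial and minimum RTO, and the doubling on timeout follow RFC 6298.

```python
class RtoEstimator:
    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variation
    K = 4           # variation multiplier

    def __init__(self, clock_granularity_s=0.1, min_rto_s=1.0, max_rto_s=60.0):
        self.g = clock_granularity_s
        self.min_rto = min_rto_s
        self.max_rto = max_rto_s
        self.srtt = None
        self.rttvar = None
        self.rto = 1.0          # initial RTO before any RTT sample

    def on_rtt_sample(self, r):
        """Update SRTT, RTTVAR, and RTO with a new RTT measurement (seconds)."""
        if self.srtt is None:
            self.srtt = r
            self.rttvar = r / 2
        else:
            # RTTVAR is updated first, using the previous SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - r)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        rto = self.srtt + max(self.g, self.K * self.rttvar)
        self.rto = min(self.max_rto, max(self.min_rto, rto))
        return self.rto

    def on_timeout(self):
        """Exponential backoff after a retransmission timeout."""
        self.rto = min(self.max_rto, self.rto * 2)
        return self.rto


est = RtoEstimator()
for sample in (2.0, 2.0, 3.5):   # hypothetical RTT samples in seconds
    print(est.on_rtt_sample(sample))
```

The variation term makes the timeout more conservative when RTT samples fluctuate, which is one side of the trade-off described above.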
If a TCP sender uses a very small window size, and it cannot benefit from Fast Retransmit and Fast Recovery or SACK, the RTO algorithm has a large impact on performance. In that case, RTO algorithm tuning may be considered, although careful assessment of possible drawbacks is recommended [RFC 8961].
As an example, adaptive RTO algorithms defined for CoAP over UDP have been found to perform well in CNN scenarios [Commag] [CORE-FASOR].
This section summarizes some widely used techniques to improve TCP, with a focus on their use in CNNs. The TCP extensions discussed here are useful in a wide range of network scenarios, including CNNs. This section is not comprehensive. A comprehensive survey of TCP extensions is published in RFC 7414 [RFC 7414].
Devices that have enough memory to allow a larger (i.e., more than 1 MSS of data) TCP window size can leverage a more efficient loss recovery than the timer-based approach used for a smaller TCP window size (see Section 3.2.1) by using Fast Retransmit and Fast Recovery [RFC 5681], at the expense of slightly greater complexity and Transmission Control Block (TCB) size. Assuming that Delayed ACKs are used by the receiver, a window size of up to 5 MSS is required for Fast Retransmit and Fast Recovery to work efficiently: in a given TCP transmission of full-sized segments 1 through 6, if segment 2 gets lost, and the ACK for segment 1 is held by the Delayed ACK timer, then the sender should get an ACK for segment 1 when 3 arrives and duplicate ACKs when segments 4, 5, and 6 arrive. It will retransmit segment 2 when the third duplicate ACK arrives. In order to have segments 2, 3, 4, 5, and 6 sent, the window has to be of at least 5 MSS. With an MSS of 1220 bytes, a buffer of a size of 5 MSS would require 6100 bytes.
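The ACK sequence in this example can be reproduced with a small simulation. The sketch below uses segment numbers instead of byte sequence numbers, does not model the Delayed ACK timer, and assumes that an out-of-order segment triggers an immediate ACK; all function names are hypothetical.

```python
def receiver_acks(arrivals):
    """Yield (arriving_segment, cumulative_ack) for each ACK the receiver sends.

    Simplified Delayed ACK receiver: the ACK for an in-order segment is held
    until a second in-order segment arrives; an out-of-order or gap-filling
    segment is acknowledged immediately.
    """
    highest = 0            # highest segment received in order
    buffered = set()       # out-of-order segments held for reassembly
    ack_pending = False    # an in-order segment is waiting for its ACK
    for seg in arrivals:
        if seg == highest + 1 and not buffered:
            highest = seg
            if ack_pending:
                ack_pending = False
                yield seg, highest
            else:
                ack_pending = True
            continue
        if seg > highest:
            buffered.add(seg)
        while highest + 1 in buffered:
            highest += 1
            buffered.remove(highest)
        ack_pending = False
        yield seg, highest


def fast_retransmit_trigger(acks, dupthresh=3):
    """Return (arriving_segment, segment_to_retransmit), or None if not triggered."""
    last_ack, dups = 0, 0
    for seg, ack in acks:
        if ack > last_ack:
            last_ack, dups = ack, 0
        else:
            dups += 1
            if dups == dupthresh:
                return seg, last_ack + 1
    return None


# Segment 2 is lost; segments 1, 3, 4, 5, and 6 reach the receiver.
print(fast_retransmit_trigger(receiver_acks([1, 3, 4, 5, 6])))   # (6, 2)
# If segment 6 is never sent, the third duplicate ACK is never generated.
print(fast_retransmit_trigger(receiver_acks([1, 3, 4, 5])))      # None
print(5 * 1220)                                                  # 6100
```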
The example in the previous paragraph did not use a further TCP improvement such as Limited Transmit [RFC 3042]. The latter may also be useful for any transfer that has more than one segment in flight. Small transfers tend to benefit more from Limited Transmit, because they are more likely to not receive enough duplicate ACKs. Assuming the example in the previous paragraph, Limited Transmit allows sending 5 MSS with a congestion window (cwnd) of three segments, plus two additional segments for the first two duplicate ACKs. With Limited Transmit, even a cwnd of two segments allows sending 5 MSS, at the expense of additional delay contributed by the Delayed ACK timer for the ACK that confirms segment 1.
When a multiple-segment window is used, the receiver will need to manage the reception of possible out-of-order received segments, requiring sufficient buffer space. Note that even when a window of 1 MSS is used, out-of-order arrival should also be managed, as the sender may send multiple sub-MSS packets that fit in the window. (On the other hand, the receiver is free to simply drop out-of-order segments, thus forcing retransmissions.)
If a device with less severe memory and processing constraints can afford advertising a TCP window size of several MSSs, it makes sense to support the SACK option to improve performance. SACK allows a data receiver to inform the data sender of non-contiguous data blocks received, so that a sender (having previously sent the SACK-Permitted option) can avoid performing unnecessary retransmissions, saving energy and bandwidth, as well as reducing latency. In addition, SACK often allows for faster loss recovery when there is more than one lost segment in a window of data, since SACK recovery may complete with fewer RTTs. SACK is particularly useful for bulk data transfers. A receiver supporting SACK will need to keep track of the data blocks that need to be received. The sender will also need to keep track of which data segments need to be resent after learning which data blocks are missing at the receiver. SACK adds 8*n+2 bytes to the TCP header, where n denotes the number of data blocks received, up to four blocks. For a low number of out-of-order segments, the header overhead penalty of SACK is compensated by avoiding unnecessary retransmissions. When the sender discovers the data blocks that have already been received, it needs to also store the necessary state to avoid unnecessary retransmission of data segments that have already been received.
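The 8*n+2 byte overhead follows directly from the option layout of RFC 2018: one kind byte, one length byte, and two 32-bit sequence numbers per block. The sketch below encodes such an option; the function names and the sequence numbers in the example are hypothetical.

```python
import struct

KIND_SACK = 5          # SACK option kind
MAX_SACK_BLOCKS = 4    # at most four blocks fit in the TCP option space


def sack_option_length(n_blocks):
    """Header bytes consumed by a SACK option carrying n_blocks blocks."""
    if not 1 <= n_blocks <= MAX_SACK_BLOCKS:
        raise ValueError("a SACK option carries between 1 and 4 blocks")
    return 8 * n_blocks + 2


def build_sack_option(blocks):
    """Encode (left_edge, right_edge) sequence number pairs as a SACK option."""
    option = struct.pack("!BB", KIND_SACK, sack_option_length(len(blocks)))
    for left, right in blocks:
        option += struct.pack("!II", left, right)
    return option


# One received block covering bytes 2440..6099; the right edge is the
# sequence number following the last received byte.
opt = build_sack_option([(2440, 6100)])
print(len(opt))                 # 10
print(sack_option_length(4))    # 34
```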
For certain traffic patterns, Delayed ACKs may have a detrimental effect, as already noted in Section 3.2.3. Advanced TCP stacks may use heuristics to determine the maximum delay for an ACK. For CNNs, the recommendation depends on the expected communication patterns.
When traffic over a CNN is expected mostly to be unidirectional messages with a size typically up to 1 MSS, and the time between two consecutive message transmissions is greater than the Delayed ACK timeout, it may make sense to use a smaller timeout or disable Delayed ACKs at the receiver. This avoids incurring additional delay, as well as the energy consumption of the sender (which might, e.g., keep its radio interface in receive mode) during that time. Note that disabling Delayed ACKs may only be possible if the peer device is administered by the same entity managing the constrained device. For request-response traffic, enabling Delayed ACKs is recommended at the server end, in order to allow combining a response with the ACK into a single segment, thus increasing efficiency. In addition, if a client issues requests infrequently, disabling Delayed ACKs at the client allows an immediate ACK for the data segment carrying the response.
In contrast, Delayed ACKs allow for a reduced number of ACKs in bulk transfer types of traffic, e.g., for firmware/software updates or for transferring larger data units containing a batch of sensor readings.
Note that, in many scenarios, the peer that a constrained device communicates with will be a general purpose system that communicates with both constrained and unconstrained devices. Since Delayed ACKs are often configured through system-wide parameters, the behavior of Delayed ACKs at the peer will be the same regardless of the nature of the endpoints it talks to. Such a peer will typically have Delayed ACKs enabled.
[RFC 5681] specifies a TCP Initial Window (IW) of roughly 4 kB. Subsequently, RFC 6928 [RFC 6928] defines an experimental new value for the IW, which in practice will result in an IW of 10 MSS. Nowadays, the latter is used in many TCP implementations.
Note that a 10-MSS IW was recommended for resource-rich environments (e.g., broadband environments), which are significantly different from CNNs. In CNNs, many application-layer data units are relatively small (e.g., below 1 MSS). However, larger objects (e.g., large files containing sensor readings or firmware updates) may also need to be transferred in CNNs. If such a large object is transferred in CNNs, with an IW setting of 10 MSS, there is significant buffer overflow risk, since many CNN devices support network or radio buffers of a size smaller than 10 MSS. In order to avoid such a problem, the IW needs to be carefully set in CNNs, based on device and network resource constraints. In many cases, a safe IW setting will be smaller than 10 MSS.
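The IW values mentioned above, and one possible way of bounding them, can be sketched as follows. The two standard formulas follow RFC 5681 and RFC 6928; the `constrained_iw` policy and the 4096-byte buffer are hypothetical and only illustrate capping the IW by a known buffer size.

```python
def iw_rfc5681(smss):
    """Initial window in bytes according to RFC 5681."""
    if smss > 2190:
        return 2 * smss
    if smss > 1095:
        return 3 * smss
    return 4 * smss


def iw_rfc6928(mss):
    """Experimental initial window in bytes according to RFC 6928."""
    return min(10 * mss, max(2 * mss, 14600))


def constrained_iw(mss, buffer_bytes, upper_bound_bytes):
    """Hypothetical policy: whole segments that fit in the smallest known
    buffer, never more than the standard IW and never less than 1 MSS."""
    segments = max(1, min(upper_bound_bytes, buffer_bytes) // mss)
    return segments * mss


MSS = 1220
print(iw_rfc5681(MSS))                              # 3660
print(iw_rfc6928(MSS))                              # 12200
print(constrained_iw(MSS, 4096, iw_rfc6928(MSS)))   # 3660
```

With an MSS of 1220 bytes, a device whose smallest buffer holds 4096 bytes would, under this policy, start with 3 MSS instead of 10 MSS.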