RFC 8095

Services Provided by IETF Transport Protocols and Congestion Control Mechanisms

Pages: 54
Informational

Internet Engineering Task Force (IETF)                 G. Fairhurst, Ed.
Request for Comments: 8095                        University of Aberdeen
Category: Informational                                 B. Trammell, Ed.
ISSN: 2070-1721                                       M. Kuehlewind, Ed.
                                                              ETH Zurich
                                                              March 2017


                          Services Provided by
       IETF Transport Protocols and Congestion Control Mechanisms

Abstract

   This document describes, surveys, and classifies the protocol
   mechanisms provided by existing IETF protocols, as background for
   determining a common set of transport services.  It examines the
   Transmission Control Protocol (TCP), Multipath TCP, the Stream
   Control Transmission Protocol (SCTP), the User Datagram Protocol
   (UDP), UDP-Lite, the Datagram Congestion Control Protocol (DCCP), the
   Internet Control Message Protocol (ICMP), the Real-Time Transport
   Protocol (RTP), File Delivery over Unidirectional Transport /
   Asynchronous Layered Coding (FLUTE/ALC) for Reliable Multicast, NACK-
   Oriented Reliable Multicast (NORM), Transport Layer Security (TLS),
   Datagram TLS (DTLS), and the Hypertext Transfer Protocol (HTTP),
   when HTTP is used as a pseudotransport.  This survey provides
   background for the definition of transport services within the TAPS
   working group.

Status of This Memo

   This document is not an Internet Standards Track specification; it is
   published for informational purposes.

   This document is a product of the Internet Engineering Task Force
   (IETF).  It represents the consensus of the IETF community.  It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG).  Not all documents
   approved by the IESG are a candidate for any level of Internet
   Standard; see Section 2 of RFC 7841.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc8095.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
      1.1. Overview of Transport Features
   2. Terminology
   3. Existing Transport Protocols
      3.1. Transmission Control Protocol (TCP)
           3.1.1. Protocol Description
           3.1.2. Interface Description
           3.1.3. Transport Features
      3.2. Multipath TCP (MPTCP)
           3.2.1. Protocol Description
           3.2.2. Interface Description
           3.2.3. Transport Features
      3.3. User Datagram Protocol (UDP)
           3.3.1. Protocol Description
           3.3.2. Interface Description
           3.3.3. Transport Features
      3.4. Lightweight User Datagram Protocol (UDP-Lite)
           3.4.1. Protocol Description
           3.4.2. Interface Description
           3.4.3. Transport Features
      3.5. Stream Control Transmission Protocol (SCTP)
           3.5.1. Protocol Description
           3.5.2. Interface Description
           3.5.3. Transport Features
      3.6. Datagram Congestion Control Protocol (DCCP)
           3.6.1. Protocol Description
           3.6.2. Interface Description
           3.6.3. Transport Features
      3.7. Transport Layer Security (TLS) and Datagram TLS
           (DTLS) as a Pseudotransport
           3.7.1. Protocol Description
           3.7.2. Interface Description
           3.7.3. Transport Features
      3.8. Real-Time Transport Protocol (RTP)
           3.8.1. Protocol Description
           3.8.2. Interface Description
           3.8.3. Transport Features
      3.9. Hypertext Transfer Protocol (HTTP) over TCP as a
           Pseudotransport
           3.9.1. Protocol Description
           3.9.2. Interface Description
           3.9.3. Transport Features
      3.10. File Delivery over Unidirectional Transport /
            Asynchronous Layered Coding (FLUTE/ALC) for
            Reliable Multicast
           3.10.1. Protocol Description
           3.10.2. Interface Description
           3.10.3. Transport Features
      3.11. NACK-Oriented Reliable Multicast (NORM)
           3.11.1. Protocol Description
           3.11.2. Interface Description
           3.11.3. Transport Features
      3.12. Internet Control Message Protocol (ICMP)
           3.12.1. Protocol Description
           3.12.2. Interface Description
           3.12.3. Transport Features
   4. Congestion Control
   5. Transport Features
   6. IANA Considerations
   7. Security Considerations
   8. Informative References
   Acknowledgments
   Contributors
   Authors' Addresses

1.  Introduction

   Internet applications make use of the services provided by a
   transport protocol, such as TCP (a reliable, in-order stream
   protocol) or UDP (an unreliable datagram protocol).  We use the term
   "transport service" to mean the end-to-end service provided to an
   application by the transport layer.  That service can only be
   provided correctly if information about the intended usage is
   supplied from the application.  The application may determine this
   information at design time, compile time, or run time, and may
   include guidance on whether a feature is required, a preference by
   the application, or something in between.  Examples of features of
   transport services are reliable delivery, ordered delivery, content
   privacy to in-path devices, and integrity protection.

   The IETF has defined a wide variety of transport protocols beyond TCP
   and UDP, including SCTP, DCCP, MPTCP, and UDP-Lite.  Transport
   services may be provided directly by these transport protocols or
   layered on top of them using protocols such as WebSockets (which runs
   over TCP), RTP (over TCP or UDP) or WebRTC data channels (which run
   over SCTP over DTLS over UDP or TCP).  Services built on top of UDP
   or UDP-Lite typically also need to specify additional mechanisms,
   including a congestion control mechanism (such as NewReno [RFC6582],
   TCP-Friendly Rate Control (TFRC) [RFC5348], or Low Extra Delay
   Background Transport (LEDBAT) [RFC6817]).  This extends the set of
   available transport services beyond those provided to applications by
   TCP and UDP.

   The transport protocols described in this document provide a basis
   for the definition of transport services provided by common
   protocols, as background for the TAPS working group.  The protocols
   listed here were chosen to help expose as many potential transport
   services as possible and are not meant to be a comprehensive survey
   or classification of all transport protocols.

1.1.  Overview of Transport Features

   Transport protocols can be differentiated by the features of the
   services they provide.

   Some of these features are closely related to the basic control
   functions that a protocol needs to work over a network path, such as
   addressing.  The number of participants in a given association also
   determines its applicability: a connection can be between endpoints
   (unicast), to one of multiple endpoints (anycast), or simultaneously
   to multiple endpoints (multicast).  Unicast protocols usually support
   bidirectional communication, while multicast is generally
   unidirectional.  Another feature is whether a transport requires a
   control exchange across the network at setup (e.g., TCP) or whether
   it is connectionless (e.g., UDP).

   For packet delivery itself, reliability and integrity protection,
   ordering, and framing are basic features.  However, these features
   are implemented with different levels of assurance in different
   protocols.  As an example, a transport service may provide full
   reliability, with detection of loss and retransmission (e.g., TCP).
   SCTP offers a message-based service that can provide full or partial
   reliability and allows the protocol to minimize the head-of-line
   blocking due to the support of ordered and unordered message delivery
   within multiple streams.  UDP-Lite and DCCP can provide partial
   integrity protection to enable corruption tolerance.

   Usually, a protocol has been designed to support one specific type of
   delivery/framing: either data needs to be divided into transmission
   units based on network packets (datagram service) or a data stream is
   segmented and re-combined across multiple packets (stream service).
   Whole objects such as files are handled accordingly.  This decision
   strongly influences the interface that is provided to the upper
   layer.
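
   As an illustration of this difference, the following Python sketch
   shows how an application might recover message boundaries on top of
   a stream service by adding its own length-prefix framing.  The
   4-byte big-endian length header and the names "frame" and "deframe"
   are assumptions made for this example; they are not defined by this
   document or by any of the protocols it surveys.

```python
import struct


def frame(message: bytes) -> bytes:
    """Prefix a message with its length as a 4-byte big-endian integer."""
    return struct.pack("!I", len(message)) + message


def deframe(stream: bytes):
    """Split a received byte stream into complete messages.

    Returns (messages, remainder), where remainder holds trailing bytes
    that do not yet form a whole message and must stay buffered.
    """
    messages = []
    offset = 0
    while len(stream) - offset >= 4:
        (length,) = struct.unpack_from("!I", stream, offset)
        if len(stream) - offset - 4 < length:
            break  # message body not fully received yet
        start = offset + 4
        messages.append(stream[start:start + length])
        offset = start + length
    return messages, stream[offset:]
```

   A datagram service needs no such layer, because each datagram
   already carries exactly one message.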

   In addition, transport protocols offer some support for
   transmission control.  For example, a transport service can provide
   flow control to allow a receiver to regulate the transmission rate of
   a sender.  Further, a transport service can provide congestion
   control (see Section 4).  As an example, TCP and SCTP provide
   congestion control for use in the Internet, whereas UDP leaves this
   function to the upper-layer protocol that uses UDP.

   Security features are often provided independently of the transport
   protocol, via Transport Layer Security (TLS) (see Section 3.7) or by
   the application-layer protocol itself.  The security properties TLS
   provides to the application (such as confidentiality, integrity, and
   authenticity) are also features of the transport layer, even though
   they are often presently implemented in a separate protocol.

2.  Terminology

   The following terms are used throughout this document and in
   subsequent documents produced by the TAPS working group that describe
   the composition and decomposition of transport services.

   Transport Feature:  a specific end-to-end feature that the transport
      layer provides to an application.  Examples include
      confidentiality, reliable delivery, ordered delivery, message-
      versus-stream orientation, etc.

   Transport Service:  a set of transport features, without an
      association to any given framing protocol, that provides a
      complete service to an application.

   Transport Protocol:  an implementation that provides one or more
      different transport services using a specific framing and header
      format on the wire.

   Application:  an entity that uses the transport layer for end-to-end
      delivery of data across the network (this may also be an
      upper-layer protocol or tunnel encapsulation).

3.  Existing Transport Protocols

   This section provides a list of known IETF transport protocols and
   transport protocol frameworks.  It does not assess whether specific
   implementations of protocols are fully compliant with current IETF
   specifications.

3.1.  Transmission Control Protocol (TCP)

   TCP is an IETF Standards Track transport protocol.  [RFC793]
   introduces TCP as follows:

      The Transmission Control Protocol (TCP) is intended for use as a
      highly reliable host-to-host protocol between hosts in packet-
      switched computer communication networks, and in interconnected
      systems of such networks.

   Since its introduction, TCP has become the default connection-
   oriented, stream-based transport protocol in the Internet.  It is
   widely implemented by endpoints and widely used by common
   applications.

3.1.1.  Protocol Description

   TCP is a connection-oriented protocol that provides a three-way
   handshake to allow a client and server to set up a connection and
   negotiate features and provides mechanisms for orderly completion and
   immediate teardown of a connection [RFC793] [TCP-SPEC].  TCP is
   defined by a family of RFCs (see [RFC7414]).

   TCP provides multiplexing to multiple sockets on each host using port
   numbers.  A similar approach is adopted by other IETF-defined
   transports.  An active TCP session is identified by its four-tuple of
   local and remote IP addresses and local and remote port numbers.  The
   destination port during connection setup is often used to indicate
   the requested service.
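
   The demultiplexing described above can be sketched as a pair of
   lookup tables.  This is a simplified model, not a description of
   any real stack: the class and method names are hypothetical, and
   connection and listener state is reduced to a string label.

```python
from typing import Dict, NamedTuple, Optional


class FourTuple(NamedTuple):
    local_ip: str
    local_port: int
    remote_ip: str
    remote_port: int


class Demultiplexer:
    """Maps arriving segments to connections (hypothetical helper)."""

    def __init__(self) -> None:
        self.connections: Dict[FourTuple, str] = {}
        self.listeners: Dict[int, str] = {}

    def listen(self, port: int, service: str) -> None:
        self.listeners[port] = service

    def establish(self, key: FourTuple, connection: str) -> None:
        self.connections[key] = connection

    def lookup(self, key: FourTuple) -> Optional[str]:
        # An established connection must match all four values.
        if key in self.connections:
            return self.connections[key]
        # Otherwise only the destination port selects the listening service.
        return self.listeners.get(key.local_port)
```

   Two clients that reach the same server port are kept apart because
   their remote address or remote port differs.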

   TCP partitions a continuous stream of bytes into segments, sized to
   fit in IP packets based on a negotiated maximum segment size and
   further constrained by the effective Maximum Transmission Unit (MTU)
   from Path MTU Discovery (PMTUD).  ICMP-based PMTUD [RFC1191]
   [RFC1981] as well as Packetization Layer PMTUD (PLPMTUD) [RFC4821]
   have been defined by the IETF.

   Each byte in the stream is identified by a sequence number.  The
   sequence number is used to order segments on receipt, to identify
   segments in acknowledgments, and to detect unacknowledged segments
   for retransmission.  This is the basis of the reliable, ordered
   delivery of data in a TCP stream.  TCP Selective Acknowledgment
   (SACK) [RFC2018] extends this mechanism by making it possible to
   provide earlier identification of which segments are missing,
   allowing faster retransmission.  SACK-based methods (e.g., Duplicate
   Selective ACK) can also result in less spurious retransmission.
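
   The following Python sketch illustrates how a receiver could derive
   a cumulative acknowledgment and a set of SACK-style blocks from the
   segments it has seen.  It is a simplified model under stated
   assumptions: sequence-number wraparound is ignored, blocks are
   reported in sequence order rather than most recent first, and the
   limit on the number of blocks that fit in a TCP option is not
   applied.  The function name is hypothetical.

```python
def receiver_state(initial_seq, segments):
    """Return (cumulative_ack, sack_blocks) after receiving segments.

    segments is a list of (sequence_number, length) pairs in arrival
    order.  Blocks are half-open byte ranges (left_edge, right_edge).
    """
    # Merge received byte ranges [start, end) into disjoint sorted blocks.
    blocks = []
    for start, end in sorted((seq, seq + length) for seq, length in segments):
        if blocks and start <= blocks[-1][1]:
            blocks[-1][1] = max(blocks[-1][1], end)
        else:
            blocks.append([start, end])

    ack = initial_seq
    sack = []
    for start, end in blocks:
        if start <= ack:
            ack = max(ack, end)        # contiguous with data already in order
        else:
            sack.append((start, end))  # data held beyond a gap
    return ack, sack
```

   With bytes 1100 to 1199 missing, the cumulative acknowledgment
   stays at 1100 while the blocks tell the sender which later data has
   already arrived, so only the gap needs to be retransmitted.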

   Receiver flow control is provided by a sliding window, which limits
   the amount of unacknowledged data that can be outstanding at a given
   time.  The window scale option [RFC7323] allows a receiver to use
   windows greater than 64 KB.
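
   As a small worked example, the effective receive window is the
   16-bit window field shifted left by the negotiated scale factor.
   The sketch below assumes the maximum shift count of 14 given in
   [RFC7323]; the function name is hypothetical.

```python
MAX_SHIFT = 14  # largest shift count permitted by RFC 7323


def effective_window(window_field: int, shift: int) -> int:
    """Return the receive window in bytes for a scaled window field."""
    if not 0 <= window_field <= 0xFFFF:
        raise ValueError("window field is a 16-bit value")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return window_field << shift
```

   With a shift of 0, the largest window is 65,535 bytes; with a shift
   of 14, it is 1,073,725,440 bytes, just under 1 GiB.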

   All TCP senders provide congestion control, such as that described in
   [RFC5681].  TCP uses a sequence number with a sliding receiver window
   for flow control.  The TCP congestion control mechanism also utilizes
   this TCP sequence number to manage a separate congestion window
   [RFC5681].  The sending window at a given point in time is the
   minimum of the receiver window and the congestion window.  The
   congestion window is increased in the absence of congestion and
   decreased if congestion is detected.  Often, loss is implicitly
   handled as a congestion indication, which is detected in TCP (also as
   input for retransmission handling) based on two mechanisms: a
   retransmission timer with exponential back-off or the reception of
   three acknowledgments for the same segment, so-called "duplicate
   ACKs" (fast retransmit).  In addition, Explicit Congestion
   Notification (ECN) [RFC3168] can be used in TCP and, if supported by
   both endpoints, allows a network node to signal congestion without
   inducing loss.  Alternatively, a delay-based congestion control
   scheme that reacts to changes in delay as an early indication of
   congestion can be used in TCP.  Section 4 describes this further and
   gives examples of different kinds of congestion control schemes.
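
   The interaction of the two windows can be sketched with a toy
   simulation.  This is not the algorithm of [RFC5681]: it counts in
   whole segments per round trip rather than bytes per acknowledgment,
   treats every loss as if it were detected by fast retransmit, and
   omits retransmission timeouts.  The function name and all default
   values are assumptions chosen for illustration.

```python
def simulate(rounds, loss_rounds, rwnd=64, initial_cwnd=1, ssthresh=16):
    """Return the sending window (in segments) used in each round trip."""
    cwnd = initial_cwnd
    history = []
    for rtt in range(rounds):
        window = min(cwnd, rwnd)  # sender is limited by both windows
        history.append(window)
        if rtt in loss_rounds:
            # Loss is taken as a congestion signal: halve the window.
            ssthresh = max(window // 2, 2)
            cwnd = ssthresh
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow start: exponential growth
        else:
            cwnd += 1                       # congestion avoidance: linear growth
    return history
```

   For example, simulate(8, {5}) returns [1, 2, 4, 8, 16, 17, 8, 9]:
   the window doubles up to the threshold, grows by one segment per
   round trip, and is halved after the loss in round 5.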

   TCP protocol instances can be extended (see [RFC7414]).  Some
   protocol features may also be tuned to optimize for a specific
   deployment scenario.  Some features are sender-side only, requiring
   no negotiation with the receiver; some are receiver-side only; and
   some are explicitly negotiated during connection setup.

   TCP may buffer data, e.g., to optimize processing or capacity usage.
   TCP therefore provides mechanisms to control this, including an
   optional "PUSH" function [RFC793] that explicitly requests the
   transport service not to delay data.  By default, TCP segment
   partitioning uses Nagle's algorithm [TCP-SPEC] to buffer data at the
   sender into large segments, potentially incurring sender-side
   buffering delay; this algorithm can be disabled by the sender to
   transmit more immediately, e.g., to reduce latency for interactive
   sessions.
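
   In implementations derived from the BSD Sockets API, Nagle's
   algorithm is commonly disabled with the TCP_NODELAY socket option.
   That option is not described in this document; the Python sketch
   below only shows one common way to set it, and the function name is
   hypothetical.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create an unconnected TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # A non-zero value asks the stack to send small writes without
    # waiting to coalesce them into larger segments.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

   The option trades capacity efficiency for latency, so it is usually
   set only by applications that send small, time-critical writes.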

   TCP provides an "urgent data" function for limited out-of-order
   delivery of the data.  This function is deprecated [RFC6093].

   A TCP Reset (RST) control message may be used to force a TCP endpoint
   to close a session [RFC793], aborting the connection.

   A mandatory checksum provides a basic integrity check against
   misdelivery and data corruption over the entire packet.  Applications
   that require end-to-end integrity of data are recommended to include
   a stronger integrity check of their payload data.  The TCP checksum
   [RFC1071] [RFC2460] does not support partial payload protection (as
   in DCCP/UDP-Lite).
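
   The checksum arithmetic described in [RFC1071] can be sketched in a
   few lines of Python.  The sketch computes only the 16-bit
   one's-complement sum over a byte string; building the pseudo-header
   and TCP header that the real checksum also covers is left out, and
   the function name is hypothetical.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF
```

   Because the sum is commutative, swapping two 16-bit words leaves
   the checksum unchanged.  This is one reason why applications that
   need end-to-end integrity are advised to add a stronger check.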

   TCP supports only unicast connections.

3.1.2.  Interface Description

   The User/TCP Interface defined in [RFC793] provides six user
   commands: Open, Send, Receive, Close, Status, and Abort.  This
   interface does not describe configuration of TCP options or
   parameters aside from the use of the PUSH and URGENT flags.

   [RFC1122] describes extensions of the TCP/application-layer interface
   for:

   o  reporting soft errors such as reception of ICMP error messages,
      extensive retransmission, or urgent pointer advance,

   o  providing a possibility to specify the Differentiated Services
      Code Point (DSCP) [RFC3260] (formerly, the Type-of-Service (TOS))
      for segments,

   o  providing a flush call to empty the TCP send queue, and

   o  multihoming support.

   In API implementations derived from the BSD Sockets API, TCP sockets
   are created using the "SOCK_STREAM" socket type as described in the
   IEEE Portable Operating System Interface (POSIX) Base Specifications
   [POSIX].  The features used by a protocol instance may be set and
   tuned via this API.  There are currently no documents in the RFC
   Series that describe this interface.

3.1.3.  Transport Features

   The transport features provided by TCP are:

   o  connection-oriented transport with feature negotiation and
      application-to-port mapping (implemented using SYN segments and
      the TCP Option field to negotiate features),

   o  unicast transport (though anycast TCP is implemented, at risk of
      instability due to rerouting),

   o  port multiplexing,

   o  unidirectional or bidirectional communication,

   o  stream-oriented delivery in a single stream,

   o  fully reliable delivery (implemented using ACKs sent from the
      receiver to confirm delivery),

   o  error detection (implemented using a segment checksum to verify
      delivery to the correct endpoint and integrity of the data and
      options),

   o  segmentation,

   o  data bundling (optional; uses Nagle's algorithm to coalesce data
      sent within the same RTT into full-sized segments),

   o  flow control (implemented using a window-based mechanism where the
      receiver advertises the window that it is willing to buffer), and

   o  congestion control (usually implemented using a window-based
      mechanism and four algorithms for different phases of the
      transmission: slow start, congestion avoidance, fast retransmit,
      and fast recovery [RFC5681]).

3.2.  Multipath TCP (MPTCP)

   Multipath TCP [RFC6824] is an extension for TCP to support
   multihoming for resilience, mobility, and load balancing.  It is
   designed to be as indistinguishable to middleboxes from non-multipath
   TCP as possible.  It does so by establishing regular TCP flows
   between a pair of source/destination endpoints and multiplexing the
   application's stream over these flows.  Sub-flows can be started over
   IPv4 or IPv6 for the same session.

3.2.1.  Protocol Description

   MPTCP uses TCP options for its control plane.  They are used to
   signal multipath capabilities, as well as to negotiate data sequence
   numbers, advertise other available IP addresses, and establish new
   sessions between pairs of endpoints.

   By multiplexing one byte stream over separate paths, MPTCP can
   achieve a higher throughput than TCP in certain situations.  However,
   if coupled congestion control [RFC6356] is used, it might limit this
   benefit to maintain fairness to other flows at the bottleneck.  When
   aggregating capacity over multiple paths, and depending on the way
   packets are scheduled on each TCP subflow, additional delay and
   higher jitter might be observed before in-order delivery of data to
   the applications.

3.2.2.  Interface Description

   By default, MPTCP exposes the same interface as TCP to the
   application.  [RFC6897], however, describes a richer API for MPTCP-
   aware applications.

   This Basic API describes how an application can:

   o  enable or disable MPTCP.

   o  bind a socket to one or more selected local endpoints.

   o  query local and remote endpoint addresses.

   o  get a unique connection identifier (similar to an address-port
      pair for TCP).

   The document also recommends the use of extensions defined for SCTP
   [RFC6458] (see Section 3.5) to support multihoming for resilience and
   mobility.

3.2.3.  Transport Features

   As an extension to TCP, MPTCP provides mostly the same features.  By
   establishing multiple sessions between available endpoints, it can
   additionally provide soft failover solutions in the case that one of
   the paths becomes unusable.

   Therefore, the transport features provided by MPTCP in addition to
   TCP are:

   o  multihoming for load balancing, with endpoint multiplexing of a
      single byte stream, using either coupled congestion control or
      throughput maximization,

   o  address family multiplexing (using IPv4 and IPv6 for the same
      session), and

   o  resilience to network failure and/or handover.

3.3.  User Datagram Protocol (UDP)

   The User Datagram Protocol (UDP) [RFC768] [RFC2460] is an IETF
   Standards Track transport protocol.  It provides a unidirectional
   datagram protocol that preserves message boundaries.  It provides no
   error correction, congestion control, or flow control.  It can be
   used to send broadcast datagrams (IPv4) or multicast datagrams (IPv4
   and IPv6), in addition to unicast and anycast datagrams.  IETF
   guidance on the use of UDP is provided in [RFC8085].  UDP is widely
   implemented and widely used by common applications, including DNS.

3.3.1.  Protocol Description

   UDP is a connectionless protocol that maintains message boundaries,
   with no connection setup or feature negotiation.  The protocol uses
   independent messages, ordinarily called "datagrams".  It provides
   detection of payload errors and misdelivery of packets to an
   unintended endpoint, both of which result in discard of received
   datagrams, with no indication to the user of the service.

   It is possible to create IPv4 UDP datagrams with no checksum, and
   while this is generally discouraged [RFC1122] [RFC8085], certain
   special cases permit this use.  These datagrams rely on the IPv4
   header checksum to protect from misdelivery to an unintended
   endpoint.  IPv6 does not permit UDP datagrams with no checksum,
   although in certain cases [RFC6936], this rule may be relaxed
   [RFC6935].

   UDP does not provide reliability and does not provide retransmission.
   Messages may be reordered, lost, or duplicated in transit.  Note that
   due to the relatively weak form of checksum used by UDP, applications
   that require end-to-end integrity of data are recommended to include
   a stronger integrity check of their payload data.

   Because UDP provides no flow control, a receiving application that
   cannot run sufficiently fast or frequently may miss messages.
   The lack of congestion handling implies UDP traffic may experience
   loss when using an overloaded path and may cause the loss of messages
   from other protocols (e.g., TCP) when sharing the same network path.

   On transmission, UDP encapsulates each datagram into a single IP
   packet or several IP packet fragments.  This allows a datagram to be
   larger than the effective path MTU.  Fragments are reassembled before
   delivery to the UDP receiver, making this transparent to the user of
   the transport service.  When jumbograms are supported, larger
   messages may be sent without performing fragmentation.
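
   As a worked example of the fragmentation described above, the
   following Python sketch estimates how many IPv4 packets carry one
   UDP datagram.  It assumes a 20-byte IPv4 header without options and
   an 8-byte UDP header, and it uses the IPv4 rule that every fragment
   except the last carries a multiple of 8 bytes.  The function name
   and the default MTU of 1500 bytes are assumptions.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes


def ipv4_fragment_count(udp_payload_len: int, mtu: int = 1500) -> int:
    """Return the number of IPv4 packets needed for one UDP datagram."""
    ip_payload = UDP_HEADER + udp_payload_len
    if ip_payload <= mtu - IPV4_HEADER:
        return 1  # fits in a single unfragmented packet
    # Fragment offsets are expressed in units of 8 bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil(ip_payload / per_fragment)
```

   With a 1500-byte MTU, a payload of 1472 bytes fits in one packet,
   1473 bytes needs two, and 8192 bytes needs six.  Loss of any one
   fragment causes the whole datagram to be lost.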

   UDP on its own does not provide support for segmentation, receiver
   flow control, congestion control, PMTUD/PLPMTUD, or ECN.
   Applications that require these features need to provide them on
   their own or use a protocol over UDP that provides them [RFC8085].

3.3.2.  Interface Description

   [RFC768] describes basic requirements for an API for UDP.  Guidance
   on the use of common APIs is provided in [RFC8085].

   A UDP endpoint consists of a tuple of (IP address, port number).
   De-multiplexing using multiple abstract endpoints (sockets) on the
   same IP address is supported.  The same socket may be used by a
   single server to interact with multiple clients.  (Note: This
   behavior differs from TCP, which uses a pair of tuples to identify a
   connection).  Multiple server instances (processes) that bind to the
   same socket can cooperate to service multiple clients.  The socket
   implementation arranges to not duplicate the same received unicast
   message to multiple server processes.

   Many operating systems also allow a UDP socket to be "connected",
   i.e., to bind a UDP socket to a specific (remote) UDP endpoint.
   Unlike TCP's connect primitive, for UDP, this is only a local
   operation that serves to simplify the local send/receive functions
   and to filter the traffic for the specified addresses and ports
   [RFC8085].
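
   The local nature of a "connected" UDP socket can be seen in the
   following Python sketch, which exchanges one datagram over the
   loopback interface.  It assumes that the loopback address 127.0.0.1
   is available; the function name, buffer size, and timeout are
   chosen only for this example.

```python
import socket


def connected_udp_demo() -> bytes:
    """Exchange one datagram over loopback using a connected UDP socket."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.settimeout(2.0)
        # connect() sends nothing on the wire; it only records the peer.
        client.connect(server.getsockname())
        client.send(b"ping")           # no destination needed per call
        data, _peer = server.recvfrom(2048)
        return data
    finally:
        client.close()
        server.close()
```

   The server socket stays unconnected and could receive datagrams
   from any number of clients, whereas the client socket accepts
   traffic only from the endpoint it was connected to.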

3.3.3.  Transport Features

   The transport features provided by UDP are:

   o  unicast, multicast, anycast, or IPv4 broadcast transport,

   o  port multiplexing (where a receiving port can be configured to
      receive datagrams from multiple senders),

   o  message-oriented delivery,

   o  unidirectional or bidirectional communication where the
      transmissions in each direction are independent,

   o  non-reliable delivery,

   o  unordered delivery, and

   o  error detection (implemented using a segment checksum to verify
      delivery to the correct endpoint and integrity of the data;
      optional for IPv4 and optional under specific conditions for IPv6
      where all or none of the payload data is protected).

3.4.  Lightweight User Datagram Protocol (UDP-Lite)

   The Lightweight User Datagram Protocol (UDP-Lite) [RFC3828] is an
   IETF Standards Track transport protocol.  It provides a
   unidirectional datagram protocol that preserves message boundaries.
   IETF guidance on the use of UDP-Lite is provided in [RFC8085].  A
   UDP-Lite service may support IPv4 broadcast, multicast, anycast, and
   unicast, as well as IPv6 multicast, anycast, and unicast.

   Examples of use include a class of applications that can derive
   benefit from having partially damaged payloads delivered rather than
   discarded.  One use is to provide header integrity checks but allow
   delivery of corrupted payloads to error-tolerant applications or to
   applications that use some other mechanism to provide payload
   integrity (see [RFC6936]).

3.4.1.  Protocol Description

   Like UDP, UDP-Lite is a connectionless datagram protocol, with no
   connection setup or feature negotiation.  It changes the semantics of
   the UDP Payload Length field to that of a Checksum Coverage Length
   field and is identified by a different IP protocol/next-header value.
   The Checksum Coverage Length field specifies the intended checksum
   coverage, with the remaining unprotected part of the payload called
   the "error-insensitive part".  Therefore, applications using UDP-Lite
   cannot make assumptions regarding the correctness of the data
   received in the insensitive part of the UDP-Lite payload.

   Otherwise, UDP-Lite is semantically identical to UDP.  In the same
   way as for UDP, mechanisms for receiver flow control, congestion
   control, PMTU or PLPMTU discovery, support for ECN, etc., need to be
   provided by upper-layer protocols [RFC8085].
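
   The effect of partial checksum coverage can be sketched as follows.
   This is a simplified model rather than an implementation of
   [RFC3828]: the IP pseudo-header that the real checksum also covers
   is omitted, and the function names are hypothetical.  Coverage is
   counted from the first byte of the 8-byte header, with 0 meaning
   the whole datagram.

```python
import struct


def _checksum(data: bytes) -> int:
    """16-bit one's-complement checksum, zero-padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_udplite(src_port, dst_port, payload, coverage):
    """Build a UDP-Lite-style datagram protecting only `coverage` bytes."""
    total = 8 + len(payload)
    if coverage != 0 and not 8 <= coverage <= total:
        raise ValueError("coverage must be 0 or between 8 and the length")
    header = struct.pack("!HHHH", src_port, dst_port, coverage, 0)
    covered = (header + payload)[: coverage or total]
    checksum = _checksum(covered)
    return struct.pack("!HHHH", src_port, dst_port, coverage, checksum) + payload


def verify_udplite(datagram: bytes) -> bool:
    """Check only the covered part; later bytes are not examined."""
    coverage = struct.unpack_from("!H", datagram, 4)[0] or len(datagram)
    return _checksum(datagram[:coverage]) == 0
```

   A bit error beyond the covered bytes leaves verification
   unaffected, so the damaged payload is still delivered; an error
   within the covered bytes causes the datagram to be rejected.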

3.4.2.  Interface Description

   There is no API currently specified in the RFC Series, but guidance
   on use of common APIs is provided in [RFC8085].

   The interface of UDP-Lite differs from that of UDP by the addition of
   a single (socket) option that communicates a checksum coverage length
   value.  The checksum coverage may also be made visible to the
   application via the UDP-Lite MIB module [RFC5097].

3.4.3.  Transport Features

   The transport features provided by UDP-Lite are:

   o  unicast, multicast, anycast, or IPv4 broadcast transport (same as
      for UDP),

   o  port multiplexing (same as for UDP),

   o  message-oriented delivery (same as for UDP),

   o  unidirectional or bidirectional communication where the
      transmissions in each direction are independent (same as for UDP),

   o  non-reliable delivery (same as for UDP),

   o  non-ordered delivery (same as for UDP), and

   o  partial or full payload error detection (where the Checksum
      Coverage field indicates the size of the payload data covered by
      the checksum).

3.5.  Stream Control Transmission Protocol (SCTP)

   SCTP is a message-oriented IETF Standards Track transport protocol.
   The base protocol is specified in [RFC4960].  It supports multihoming
   and path failover to provide resilience to path failures.  An SCTP
   association has multiple streams in each direction, providing
   in-sequence delivery of user messages within each stream.  This
   allows it to minimize head-of-line blocking.  SCTP supports multiple
   stream-scheduling schemes controlling stream multiplexing, including
   priority and fair weighting schemes.

   SCTP was originally developed for transporting telephony signaling
   messages and is deployed in telephony signaling networks, especially
   in mobile telephony networks.  It can also be used for other
   services, for example, in the WebRTC framework for data channels.

3.5.1.  Protocol Description

   SCTP is a connection-oriented protocol using a four-way handshake to
   establish an SCTP association and a three-way message exchange to
   gracefully shut it down.  It uses the same port number concept as
   DCCP, TCP, UDP, and UDP-Lite.  SCTP only supports unicast.

   SCTP uses the 32-bit CRC32c for protecting SCTP packets against bit
   errors and misdelivery of packets to an unintended endpoint.  This is
   stronger than the 16-bit checksums used by TCP or UDP.  However,
   partial payload checksum coverage as provided by DCCP or UDP-Lite is
   not supported.
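
   For illustration, CRC32c (the Castagnoli CRC) can be computed with
   the bitwise sketch below.  It shows the checksum algorithm only; how
   an SCTP stack places the result in the common header is outside the
   scope of this sketch.  The function name is hypothetical, and a real
   implementation would normally use a table-driven or hardware-
   assisted variant.

```python
CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected


def crc32c(data: bytes) -> int:
    """Compute CRC32c over `data`, one bit at a time."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; apply the polynomial if that bit was set.
            if crc & 1:
                crc = (crc >> 1) ^ CRC32C_POLY_REFLECTED
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF


print(hex(crc32c(b"123456789")))  # standard check value: 0xe3069283
```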

   SCTP has been designed with extensibility in mind.  A common header
   is followed by a sequence of chunks.  [RFC4960] defines how a
   receiver processes chunks with an unknown chunk type.  The support of
   extensions can be negotiated during the SCTP handshake.  Currently
   defined extensions include mechanisms for dynamic reconfiguration of
   streams [RFC6525] and IP addresses [RFC5061].  Furthermore, the
   extension specified in [RFC3758] introduces the concept of partial
   reliability for user messages.

   SCTP provides a message-oriented service.  Multiple small user
   messages can be bundled into a single SCTP packet to improve
   efficiency.  For example, this bundling may be done by delaying user
   messages at the sender, similar to Nagle's algorithm used by TCP.
   User messages that would result in IP packets larger than the MTU
   will be fragmented at the sender and reassembled at the receiver.
   There is no protocol limit on the user message size.  For MTU
   discovery, the same mechanism as for TCP can be used [RFC1981]
   [RFC4821], as well as utilization of probe packets with padding
   chunks, as defined in [RFC4820].
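
   Sender-side fragmentation and receiver-side reassembly can be
   sketched as below.  The sketch assumes that each fragment carries a
   "beginning" (B) and an "ending" (E) flag, as the DATA chunk of
   [RFC4960] does, and that an unfragmented message has both flags set.
   The function names and the tuple layout are hypothetical, and
   sequence numbering is omitted.

```python
from typing import List, Tuple

Fragment = Tuple[bool, bool, bytes]  # (B flag, E flag, fragment data)


def fragment(message: bytes, max_chunk_payload: int) -> List[Fragment]:
    """Split a user message into fragments of at most max_chunk_payload bytes."""
    if not message:
        raise ValueError("user message must not be empty")
    if max_chunk_payload <= 0:
        raise ValueError("max_chunk_payload must be positive")
    pieces = [message[i:i + max_chunk_payload]
              for i in range(0, len(message), max_chunk_payload)]
    last = len(pieces) - 1
    return [(i == 0, i == last, piece) for i, piece in enumerate(pieces)]


def reassemble(fragments: List[Fragment]) -> bytes:
    """Rebuild the user message from an in-order, complete fragment list."""
    if not fragments or not fragments[0][0] or not fragments[-1][1]:
        raise ValueError("fragment list is incomplete")
    return b"".join(data for _, _, data in fragments)
```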

   [RFC4960] specifies TCP-friendly congestion control to protect the
   network against overload.  SCTP also uses sliding window flow control
   to protect receivers against overflow.  Similar to TCP, SCTP also
   supports delaying acknowledgments.  [RFC7053] provides a way for the
   sender of user messages to request immediate sending of the
   corresponding acknowledgments.

   Each SCTP association has between 1 and 65536 unidirectional streams
   in each direction.  The number of streams can be different in each
   direction.  Every user message is sent on a particular stream.  User
   messages can be sent unordered or ordered upon request by the upper
   layer.  Unordered messages can be delivered as soon as they are
   completely received.  For user messages not requiring fragmentation,
   this minimizes head-of-line blocking.  On the other hand, ordered
   messages sent on the same stream are delivered at the receiver in the
   same order as sent by the sender.
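
   The delivery rules above can be illustrated with a small receiver
   model.  This is a sketch under simplifying assumptions: every message
   fits in one chunk, per-stream sequence numbers start at 0 and never
   wrap, and the class and method names are hypothetical.

```python
from collections import defaultdict


class StreamReceiver:
    """Toy model of per-stream ordered delivery with unordered bypass."""

    def __init__(self):
        self.next_ssn = defaultdict(int)   # next expected number per stream
        self.pending = defaultdict(dict)   # stream id -> {ssn: message}

    def receive(self, stream_id, ssn, message, unordered=False):
        """Return the messages that can be delivered to the upper layer now."""
        if unordered:
            return [message]  # delivered as soon as completely received
        self.pending[stream_id][ssn] = message
        delivered = []
        # Release the in-sequence run for this stream only.
        while self.next_ssn[stream_id] in self.pending[stream_id]:
            ssn_now = self.next_ssn[stream_id]
            delivered.append(self.pending[stream_id].pop(ssn_now))
            self.next_ssn[stream_id] += 1
        return delivered
```

   A gap on stream 0 holds back later ordered messages on stream 0 only;
   messages on stream 1 and unordered messages are delivered
   immediately.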

   The base protocol defined in [RFC4960] does not allow interleaving of
   user messages.  Large messages on one stream can therefore block the
   sending of user messages on other streams.  [SCTP-NDATA] describes a
   method to overcome this limitation.  That document also specifies
   multiple algorithms for the sender-side selection of which streams to
   send data from, supporting a variety of scheduling algorithms
   including priority-based methods.  The stream reconfiguration
   extension defined in [RFC6525] allows streams to be reset during the
   lifetime of an association and to increase the number of streams, if
   the number of streams negotiated in the SCTP handshake becomes
   insufficient.
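
   Two sender-side stream selection strategies can be sketched as
   follows.  These are simplified illustrations of a priority-based and
   a round-robin scheduler, not the algorithms as specified in
   [SCTP-NDATA]; the function names, the queue representation, and the
   convention that a lower number means higher priority are assumptions
   of this sketch.

```python
def next_stream_priority(queues, priority):
    """Pick the ready stream with the lowest priority number.

    queues: dict of stream id -> list of pending messages
    priority: dict of stream id -> int (lower value is served first)
    """
    ready = [s for s, q in queues.items() if q]
    if not ready:
        return None
    return min(ready, key=lambda s: (priority.get(s, 0), s))


def next_stream_round_robin(queues, last_served):
    """Pick the next ready stream after last_served, wrapping around."""
    ready = sorted(s for s, q in queues.items() if q)
    if not ready:
        return None
    for s in ready:
        if last_served is None or s > last_served:
            return s
    return ready[0]  # wrap around to the lowest ready stream
```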

   Each user message sent is delivered to the receiver or, in case of
   excessive retransmissions, the association is terminated in a
   non-graceful way [RFC4960], similar to TCP behavior.  In addition to
   this reliable transfer, the partial reliability extension [RFC3758]
   allows a sender to abandon user messages.  The application can
   specify the policy for abandoning user messages.
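
   An abandonment decision could look like the sketch below, which
   models a fully reliable policy, a lifetime limit, and a
   retransmission limit.  The policy names, the data structure, and the
   choice to evaluate the check at the moment a retransmission would
   otherwise be due are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PendingMessage:
    age_ms: int            # time since the upper layer handed over the message
    retransmissions: int   # retransmissions already performed


def should_abandon(msg: PendingMessage, policy: str, value: int = 0) -> bool:
    """Decide whether to abandon `msg` instead of retransmitting it."""
    if policy == "reliable":
        return False                         # never abandoned
    if policy == "ttl":
        return msg.age_ms > value            # lifetime in ms exceeded
    if policy == "rtx":
        return msg.retransmissions >= value  # retransmission budget used up
    raise ValueError("unknown policy: " + policy)
```

   With the "rtx" policy and a limit of 0, a message is sent once and
   never retransmitted.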

   SCTP supports multihoming.  Each SCTP endpoint uses a list of IP
   addresses and a single port number.  These addresses can be any
   mixture of IPv4 and IPv6 addresses.  These addresses are negotiated
   during the handshake, and the address reconfiguration extension
   specified in [RFC5061] in combination with [RFC4895] can be used to
   change these addresses in an authenticated way during the lifetime of
   an SCTP association.  This allows for transport-layer mobility.
   Multiple addresses are used for improved resilience.  If a remote
   address becomes unreachable, the traffic is switched over to a
   reachable one, if one exists.
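
   The failover behavior can be sketched as a simple selection
   function.  This is only an illustration: how a stack determines
   reachability is not modeled, the function name is hypothetical, and
   the addresses used in the example are documentation addresses.

```python
def select_destination(primary, addresses, reachable):
    """Return the address to send to, or None if no peer address is reachable.

    primary:   preferred remote address
    addresses: all remote addresses of the association, in preference order
    reachable: set of addresses currently considered reachable
    """
    if primary in reachable:
        return primary
    for addr in addresses:
        if addr in reachable:
            return addr  # switch over to a reachable alternative
    return None


peer = ["192.0.2.1", "2001:db8::1"]
print(select_destination("192.0.2.1", peer, {"2001:db8::1"}))  # 2001:db8::1
```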

   For securing user messages, the use of TLS over SCTP has been
   specified in [RFC3436].  However, this solution does not support all
   services provided by SCTP, such as unordered delivery or partial
   reliability.  Therefore, the use of DTLS over SCTP has been specified
   in [RFC6083] to overcome these limitations.  When using DTLS over
   SCTP, the application can use almost all services provided by SCTP.

   [NAT-SUPP] defines methods for endpoints and middleboxes to provide
   NAT traversal for SCTP over IPv4.  For legacy NAT traversal,
   [RFC6951] defines the UDP encapsulation of SCTP packets.
   Alternatively, SCTP packets can be encapsulated in DTLS packets as
   specified in [SCTP-DTLS-ENCAPS].  The latter encapsulation is used
   within the WebRTC [WEBRTC-TRANS] context.

   An SCTP ABORT chunk may be used to force an SCTP endpoint to close a
   session [RFC4960], aborting the connection.

   SCTP has a well-defined API, described in the next subsection.

3.5.2.  Interface Description

   [RFC4960] defines an abstract API for the base protocol.  This API
   describes the following functions callable by the upper layer of
   SCTP: Initialize, Associate, Send, Receive, Receive Unsent Message,
   Receive Unacknowledged Message, Shutdown, Abort, SetPrimary, Status,
   Change Heartbeat, Request Heartbeat, Get SRTT Report, Set Failure
   Threshold, Set Protocol Parameters, and Destroy.  The following
   notifications are provided by the SCTP stack to the upper layer:
   COMMUNICATION UP, DATA ARRIVE, SHUTDOWN COMPLETE, COMMUNICATION LOST,
   COMMUNICATION ERROR, RESTART, SEND FAILURE, and NETWORK STATUS
   CHANGE.

   An extension to the BSD Sockets API is defined in [RFC6458] and
   covers:

   o  the base protocol defined in [RFC4960].  The API allows control
      over local addresses and port numbers and the primary path.
      Furthermore, the application has fine control of parameters like
      retransmission thresholds, the path supervision, the delayed
      acknowledgment timeout, and the fragmentation point.  The API
      provides a mechanism to allow the SCTP stack to notify the
      application about events if the application has requested them.
      These notifications provide information about status changes of
      the association and each of the peer addresses.  In case of send
      failures, including drop of messages sent unreliably, the
      application can also be notified, and user messages can be
      returned to the application.  When sending user messages, the
      application can indicate a stream id, a payload protocol
      identifier, and an indication of whether ordered delivery is
      requested.  These parameters can also be provided on message
      reception.  Additionally, a context can be provided when sending,
      which can be used in case of send failures.  The sending of
      arbitrarily large user messages is supported.

   o  the SCTP Partial Reliability extension defined in [RFC3758] to
      specify for a user message the Partially Reliable SCTP (PR-SCTP)
      policy and the policy-specific parameter.  Examples of these
      policies defined in [RFC3758] and [RFC7496] are:

      *  limiting the time a user message is dealt with by the sender.

      *  limiting the number of retransmissions for each fragment of a
         user message.  If the number of retransmissions is limited to
         0, one gets a service similar to UDP.

      *  abandoning messages of lower priority in case of a send buffer
         shortage.

   o  the SCTP Authentication extension defined in [RFC4895].  The API
      allows management of the shared keys, selection of the HMAC to
      use, setting of the chunk types that are only accepted in an
      authenticated way, and retrieval of the list of chunks that are
      accepted by the local and remote endpoints in an authenticated
      way.

   o  the SCTP Dynamic Address Reconfiguration extension defined in
      [RFC5061].  It allows the manual addition and deletion of local
      addresses for SCTP associations, as well as the enabling of
      automatic address addition and deletion.  Furthermore, the peer
      can be given a hint for choosing its primary path.

   A BSD Sockets API extension has been defined in the documents that
   specify the following SCTP extensions:

   o  the SCTP Stream Reconfiguration extension defined in [RFC6525].
      The API allows triggering of the reset operation for incoming and
      outgoing streams and the whole association.  It also provides a
      way to notify the application about the corresponding events.
      Furthermore, the application can increase the number of streams.

   o  the UDP Encapsulation of SCTP packets extension defined in
      [RFC6951].  The API allows the management of the remote UDP
      encapsulation port.

   o  the SCTP SACK-IMMEDIATELY extension defined in [RFC7053].  The API
      allows the sender of a user message to request the receiver to
      send the corresponding acknowledgment immediately.

   o  the additional PR-SCTP policies defined in [RFC7496].  The API
      allows enabling/disabling the PR-SCTP extension, choosing the
      PR-SCTP policies defined in the document, and providing
      statistical information about abandoned messages.

   Future documents describing SCTP extensions are expected to describe
   the corresponding BSD Sockets API extension in a "Socket API
   Considerations" section.

   The SCTP Socket API supports two kinds of sockets:

   o  one-to-one style sockets (by using the socket type "SOCK_STREAM").

   o  one-to-many style sockets (by using the socket type
      "SOCK_SEQPACKET").

   One-to-one style sockets are similar to TCP sockets; there is a 1:1
   relationship between the sockets and the SCTP associations (except
   for listening sockets).  One-to-many style SCTP sockets are similar
   to unconnected UDP sockets, where there is a 1:n relationship between
   the sockets and the SCTP associations.
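
   The two socket styles differ only in the socket type passed when the
   socket is created.  The sketch below illustrates this using Python's
   standard socket module; it is not taken from [RFC6458].  The helper
   name and the style labels are hypothetical, the fallback value 132
   is the IANA-assigned protocol number for SCTP, and the function
   returns None when the local system has no SCTP support.

```python
import socket

IPPROTO_SCTP = getattr(socket, "IPPROTO_SCTP", 132)

# Socket type used for each style of SCTP socket.
SOCKET_STYLES = {
    "one-to-one": socket.SOCK_STREAM,      # one association per socket
    "one-to-many": socket.SOCK_SEQPACKET,  # many associations per socket
}


def open_sctp_socket(style: str, family: int = socket.AF_INET):
    """Open an SCTP socket of the given style, or return None if unsupported."""
    try:
        return socket.socket(family, SOCKET_STYLES[style], IPPROTO_SCTP)
    except OSError:
        return None  # no SCTP support in this kernel or build
```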

   The SCTP stack can provide information to the applications about
   state changes of the individual paths and the association whenever
   they occur.  These events are delivered similarly to user messages
   but are specifically marked as notifications.

   New functions have been introduced to support the use of multiple
   local and remote addresses.  Additional SCTP-specific send and
   receive calls have been defined to permit SCTP-specific information
   to be sent without using ancillary data in the form of additional
   Control Message (cmsg) calls.  These functions provide support for
   detecting partial delivery of user messages and notifications.

   The SCTP Socket API allows fine-grained control of the protocol
   behavior through an extensive set of socket options.

   The SCTP kernel implementations of FreeBSD, Linux, and Solaris mostly
   follow the specified extension to the BSD Sockets API for the base
   protocol and the corresponding supported protocol extensions.

3.5.3.  Transport Features

   The transport features provided by SCTP are:

   o  connection-oriented transport with feature negotiation and
      application-to-port mapping,

   o  unicast transport,

   o  port multiplexing,

   o  unidirectional or bidirectional communication,

   o  message-oriented delivery with durable message framing supporting
      multiple concurrent streams,

   o  fully reliable, partially reliable, or unreliable delivery (based
      on user-specified policy to handle abandoned user messages) with
      drop notification,

   o  ordered and unordered delivery within a stream,

   o  support for stream scheduling prioritization,

   o  segmentation,

   o  user message bundling,

   o  flow control using a window-based mechanism,

   o  congestion control using methods similar to TCP,

   o  strong error detection (CRC32c), and

   o  transport-layer multihoming for resilience and mobility.
