RFC 6458

Sockets API Extensions for the Stream Control Transmission Protocol (SCTP)

Pages: 115
Informational
RFC6458 - Page 1

Internet Engineering Task Force (IETF)                        R. Stewart
Request for Comments: 6458                                Adara Networks
Category: Informational                                        M. Tuexen
ISSN: 2070-1721                         Muenster Univ. of Appl. Sciences
                                                                 K. Poon
                                                      Oracle Corporation
                                                                  P. Lei
                                                     Cisco Systems, Inc.
                                                             V. Yasevich
                                                                      HP
                                                           December 2011


                         Sockets API Extensions
          for the Stream Control Transmission Protocol (SCTP)

Abstract

   This document describes a mapping of the Stream Control Transmission
   Protocol (SCTP) into a sockets API.  The benefits of this mapping
   include compatibility for TCP applications, access to new SCTP
   features, and a consolidated error and event notification scheme.

Status of This Memo

   This document is not an Internet Standards Track specification; it is
   published for informational purposes.

   This document is a product of the Internet Engineering Task Force
   (IETF).  It represents the consensus of the IETF community.  It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG).  Not all documents
   approved by the IESG are a candidate for any level of Internet
   Standard; see Section 2 of RFC 5741.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc6458.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1. Introduction ....................................................6
   2. Data Types ......................................................8
   3. One-to-Many Style Interface .....................................8
      3.1. Basic Operation ............................................8
           3.1.1. socket() ............................................9
           3.1.2. bind() .............................................10
           3.1.3. listen() ...........................................11
           3.1.4. sendmsg() and recvmsg() ............................12
           3.1.5. close() ............................................14
           3.1.6. connect() ..........................................14
      3.2. Non-Blocking Mode .........................................15
      3.3. Special Considerations ....................................16
   4. One-to-One Style Interface .....................................18
      4.1. Basic Operation ...........................................18
           4.1.1. socket() ...........................................19
           4.1.2. bind() .............................................19
           4.1.3. listen() ...........................................21
           4.1.4. accept() ...........................................21
           4.1.5. connect() ..........................................22
           4.1.6. close() ............................................23
           4.1.7. shutdown() .........................................23
           4.1.8. sendmsg() and recvmsg() ............................24
           4.1.9. getpeername() ......................................24
   5. Data Structures ................................................25
      5.1. The msghdr and cmsghdr Structures .........................25
      5.2. Ancillary Data Considerations and Semantics ...............26
           5.2.1. Multiple Items and Ordering ........................27
           5.2.2. Accessing and Manipulating Ancillary Data ..........27
           5.2.3. Control Message Buffer Sizing ......................28
      5.3. SCTP msg_control Structures ...............................28
           5.3.1. SCTP Initiation Structure (SCTP_INIT) ..............29
           5.3.2. SCTP Header Information Structure
                  (SCTP_SNDRCV) - DEPRECATED .........................30
           5.3.3. Extended SCTP Header Information Structure
                  (SCTP_EXTRCV) - DEPRECATED .........................33
           5.3.4. SCTP Send Information Structure (SCTP_SNDINFO) .....35
           5.3.5. SCTP Receive Information Structure (SCTP_RCVINFO) ..37
           5.3.6. SCTP Next Receive Information Structure
                  (SCTP_NXTINFO) .....................................38
           5.3.7. SCTP PR-SCTP Information Structure (SCTP_PRINFO) ...39
           5.3.8. SCTP AUTH Information Structure (SCTP_AUTHINFO) ....40
           5.3.9. SCTP Destination IPv4 Address Structure
                  (SCTP_DSTADDRV4) ...................................41
           5.3.10. SCTP Destination IPv6 Address Structure
                   (SCTP_DSTADDRV6) ..................................41

   6. SCTP Events and Notifications ..................................41
      6.1. SCTP Notification Structure ...............................42
           6.1.1. SCTP_ASSOC_CHANGE ..................................43
           6.1.2. SCTP_PEER_ADDR_CHANGE ..............................45
           6.1.3. SCTP_REMOTE_ERROR ..................................46
           6.1.4. SCTP_SEND_FAILED - DEPRECATED ......................47
           6.1.5. SCTP_SHUTDOWN_EVENT ................................48
           6.1.6. SCTP_ADAPTATION_INDICATION .........................49
           6.1.7. SCTP_PARTIAL_DELIVERY_EVENT ........................49
           6.1.8. SCTP_AUTHENTICATION_EVENT ..........................50
           6.1.9. SCTP_SENDER_DRY_EVENT ..............................51
           6.1.10. SCTP_NOTIFICATIONS_STOPPED_EVENT ..................52
           6.1.11. SCTP_SEND_FAILED_EVENT ............................52
      6.2. Notification Interest Options .............................54
           6.2.1. SCTP_EVENTS Option - DEPRECATED ....................54
           6.2.2. SCTP_EVENT Option ..................................56
   7. Common Operations for Both Styles ..............................57
      7.1. send(), recv(), sendto(), and recvfrom() ..................57
      7.2. setsockopt() and getsockopt() .............................59
      7.3. read() and write() ........................................60
      7.4. getsockname() .............................................60
      7.5. Implicit Association Setup ................................61
   8. Socket Options .................................................61
      8.1. Read/Write Options ........................................63
           8.1.1. Retransmission Timeout Parameters (SCTP_RTOINFO) ...63
           8.1.2. Association Parameters (SCTP_ASSOCINFO) ............64
           8.1.3. Initialization Parameters (SCTP_INITMSG) ...........66
           8.1.4. SO_LINGER ..........................................66
           8.1.5. SCTP_NODELAY .......................................66
           8.1.6. SO_RCVBUF ..........................................67
           8.1.7. SO_SNDBUF ..........................................67
           8.1.8. Automatic Close of Associations (SCTP_AUTOCLOSE) ...67
           8.1.9. Set Primary Address (SCTP_PRIMARY_ADDR) ............68
           8.1.10. Set Adaptation Layer Indicator
                   (SCTP_ADAPTATION_LAYER) ...........................68
           8.1.11. Enable/Disable Message Fragmentation
                   (SCTP_DISABLE_FRAGMENTS) ..........................68
           8.1.12. Peer Address Parameters (SCTP_PEER_ADDR_PARAMS) ...69
           8.1.13. Set Default Send Parameters
                   (SCTP_DEFAULT_SEND_PARAM) - DEPRECATED ............71
           8.1.14. Set Notification and Ancillary Events
                   (SCTP_EVENTS) - DEPRECATED ........................72
           8.1.15. Set/Clear IPv4 Mapped Addresses
                   (SCTP_I_WANT_MAPPED_V4_ADDR) ......................72
           8.1.16. Get or Set the Maximum Fragmentation Size
                   (SCTP_MAXSEG) .....................................72
           8.1.17. Get or Set the List of Supported HMAC
                   Identifiers (SCTP_HMAC_IDENT) .....................73

           8.1.18. Get or Set the Active Shared Key
                   (SCTP_AUTH_ACTIVE_KEY) ............................74
           8.1.19. Get or Set Delayed SACK Timer
                   (SCTP_DELAYED_SACK) ...............................74
           8.1.20. Get or Set Fragmented Interleave
                   (SCTP_FRAGMENT_INTERLEAVE) ........................75
           8.1.21. Set or Get the SCTP Partial Delivery Point
                   (SCTP_PARTIAL_DELIVERY_POINT) .....................77
           8.1.22. Set or Get the Use of Extended Receive Info
                   (SCTP_USE_EXT_RCVINFO) - DEPRECATED ...............77
           8.1.23. Set or Get the Auto ASCONF Flag
                   (SCTP_AUTO_ASCONF) ................................77
           8.1.24. Set or Get the Maximum Burst (SCTP_MAX_BURST) .....78
           8.1.25. Set or Get the Default Context (SCTP_CONTEXT) .....78
           8.1.26. Enable or Disable Explicit EOR Marking
                   (SCTP_EXPLICIT_EOR) ...............................79
           8.1.27. Enable SCTP Port Reusage (SCTP_REUSE_PORT) ........79
           8.1.28. Set Notification Event (SCTP_EVENT) ...............79
           8.1.29. Enable or Disable the Delivery of SCTP_RCVINFO
                   as Ancillary Data (SCTP_RECVRCVINFO) ..............79
           8.1.30. Enable or Disable the Delivery of SCTP_NXTINFO
                   as Ancillary Data (SCTP_RECVNXTINFO) ..............80
           8.1.31. Set Default Send Parameters
                   (SCTP_DEFAULT_SNDINFO) ............................80
           8.1.32. Set Default PR-SCTP Parameters
                   (SCTP_DEFAULT_PRINFO) .............................80
      8.2. Read-Only Options .........................................81
           8.2.1. Association Status (SCTP_STATUS) ...................81
           8.2.2. Peer Address Information
                  (SCTP_GET_PEER_ADDR_INFO) ..........................82
           8.2.3. Get the List of Chunks the Peer Requires to
                  Be Authenticated (SCTP_PEER_AUTH_CHUNKS) ...........84
           8.2.4. Get the List of Chunks the Local Endpoint Requires
                  to Be Authenticated (SCTP_LOCAL_AUTH_CHUNKS) .......84
           8.2.5. Get the Current Number of Associations
                  (SCTP_GET_ASSOC_NUMBER) ............................85
           8.2.6. Get the Current Identifiers of Associations
                  (SCTP_GET_ASSOC_ID_LIST) ...........................85
      8.3. Write-Only Options ........................................85
           8.3.1. Set Peer Primary Address
                  (SCTP_SET_PEER_PRIMARY_ADDR) .......................86
           8.3.2. Add a Chunk That Must Be Authenticated
                  (SCTP_AUTH_CHUNK) ..................................86
           8.3.3. Set a Shared Key (SCTP_AUTH_KEY) ...................86
           8.3.4. Deactivate a Shared Key
                  (SCTP_AUTH_DEACTIVATE_KEY) .........................87
           8.3.5. Delete a Shared Key (SCTP_AUTH_DELETE_KEY) .........88

   9. New Functions ..................................................88
      9.1. sctp_bindx() ..............................................88
      9.2. sctp_peeloff() ............................................90
      9.3. sctp_getpaddrs() ..........................................91
      9.4. sctp_freepaddrs() .........................................92
      9.5. sctp_getladdrs() ..........................................92
      9.6. sctp_freeladdrs() .........................................93
      9.7. sctp_sendmsg() - DEPRECATED ...............................93
      9.8. sctp_recvmsg() - DEPRECATED ...............................94
      9.9. sctp_connectx() ...........................................95
      9.10. sctp_send() - DEPRECATED .................................96
      9.11. sctp_sendx() - DEPRECATED ................................97
      9.12. sctp_sendv() .............................................98
      9.13. sctp_recvv() ............................................101
   10. Security Considerations ......................................103
   11. Acknowledgments ..............................................103
   12. References ...................................................104
      12.1. Normative References ....................................104
      12.2. Informative References ..................................104
   Appendix A. Example Using One-to-One Style Sockets ...............106
   Appendix B. Example Using One-to-Many Style Sockets ..............109

1.  Introduction

   The sockets API has provided a standard mapping of the Internet
   Protocol suite to many operating systems.  Both TCP [RFC0793] and UDP
   [RFC0768] have benefited from this standard representation and access
   method across many diverse platforms.  SCTP is a new protocol that
   provides many of the characteristics of TCP but also incorporates
   semantics more akin to UDP.  This document defines a method to map
   the existing sockets API for use with SCTP, providing both a base for
   access to new features and compatibility so that most existing TCP
   applications can be migrated to SCTP with few (if any) changes.

   There are three basic design objectives:

   1.  Maintain consistency with existing sockets APIs: We define a
       sockets mapping for SCTP that is consistent with other sockets
       API protocol mappings (for instance UDP, TCP, IPv4, and IPv6).

   2.  Support a one-to-many style interface: This set of semantics is
       similar to that defined for connectionless protocols, such as
       UDP.  A one-to-many style SCTP socket should be able to control
       multiple SCTP associations.  This is similar to a UDP socket,
       which can communicate with many peer endpoints.  Each of these
       associations is assigned an association identifier so that an
       application can use the ID to differentiate them.  Note that SCTP
       is connection-oriented in nature, and it does not support
       broadcast or multicast communications, as UDP does.

   3.  Support a one-to-one style interface: This interface supports
       semantics similar to those of sockets for connection-oriented
       protocols, such as TCP.  A one-to-one style SCTP socket should
       control only one SCTP association.  One purpose of defining this
       interface is to allow existing applications built on other
       connection-oriented protocols to be ported to SCTP with very
       little effort.  Developers familiar with these semantics can
       easily adapt to SCTP.  Another purpose is to make sure that
       existing mechanisms in most operating systems that support
       sockets, such as select(), continue to work with this style of
       socket.  Extensions are added to this mapping to provide
       mechanisms to exploit new features of SCTP.

   Goals 2 and 3 are not compatible, so this document defines two modes
   of mapping, namely the one-to-many style mapping and the one-to-one
   style mapping.  These two modes share some common data structures and
   operations, but will require the use of two different application
   programming styles.  Note that all new SCTP features can be used with
   both styles of socket.  The decision on which one to use depends
   mainly on the nature of the applications.

   A mechanism is defined to extract an SCTP association from a one-to-
   many style socket into a one-to-one style socket.

   Some of the SCTP mechanisms cannot be adequately mapped to an
   existing socket interface.  In some cases, it is more desirable to
   have a new interface instead of using existing socket calls.
   Section 9 of this document describes these new interfaces.

   Please note that some elements of the SCTP sockets API are declared
   as deprecated.  During the evolution of this document, elements of
   the API were introduced, implemented, and later replaced by other
   elements.  The replaced elements are declared as deprecated rather
   than removed, because some implementations still provide them but do
   not yet provide their replacements.  This applies especially to
   older versions of operating systems supporting SCTP.  New SCTP socket
   implementations must implement at least the non-deprecated elements.
   Implementations intending interoperability with older versions of
   the API should also include the deprecated functions.

2.  Data Types

   Whenever possible, Portable Operating System Interface (POSIX) data
   types defined in [IEEE-1003.1-2008] are used: uintN_t means an
   unsigned integer of exactly N bits (e.g., uint16_t).  This document
   also assumes the argument data types from POSIX when possible (e.g.,
   the final argument to setsockopt() is a socklen_t value).  Whenever
   buffer sizes are specified, the POSIX size_t data type is used.

3.  One-to-Many Style Interface

   In the one-to-many style interface, there is a one-to-many
   relationship between sockets and associations.

3.1.  Basic Operation

   A typical server in this style uses the following socket calls in
   sequence to prepare an endpoint for servicing requests:

   o  socket()

   o  bind()

   o  listen()

   o  recvmsg()

   o  sendmsg()

   o  close()

   A typical client uses the following calls in sequence to set up an
   association with a server to request services:

   o  socket()

   o  sendmsg()

   o  recvmsg()

   o  close()

   In this style, by default, all of the associations connected to the
   endpoint are represented with a single socket.  Each association is
   assigned an association identifier (the type is sctp_assoc_t) so that
   an application can use it to differentiate among them.  In some
   implementations, the peer endpoints' addresses can also be used for
   this purpose.  For performance reasons, however, implementations are
   not required to support this.  If an implementation does not support
   using addresses to differentiate between different associations, the
   sendto() call can only be used to set up an association implicitly.
   It cannot be used to send data to an established association, as the
   association identifier cannot be specified.

   Once an association identifier is assigned to an SCTP association,
   that identifier will not be reused until the application explicitly
   terminates the use of the association.  The resources belonging to
   that association will not be freed until that happens.  This is
   similar to the close() operation on a normal socket.  The only
   exception is when the SCTP_AUTOCLOSE option (Section 8.1.8) is set.
   In this case, after the association is terminated gracefully and
   automatically, the association identifier assigned to it can be
   reused.  All applications using this option should be aware of this
   to avoid the possible problem of sending data to an incorrect peer
   endpoint.

   If the server or client wishes to branch an existing association off
   to a separate socket, it is required to call sctp_peeloff() and to
   specify the association identifier.  The sctp_peeloff() call will
   return a new one-to-one style socket that can then be used with
   recv() and send() functions for message passing.  See Section 9.2 for
   more on branched-off associations.

   Once an association is branched off to a separate socket, it becomes
   completely separated from the original socket.  All subsequent
   control and data operations to that association must be done through
   the new socket.  For example, the close() operation on the original
   socket will not terminate any associations that have been branched
   off to a different socket.

   One-to-many style socket calls are discussed in more detail in the
   following subsections.

3.1.1.  socket()

   Applications use socket() to create a socket descriptor to represent
   an SCTP endpoint.

   The function prototype is

   int socket(int domain,
              int type,
              int protocol);

   and one uses PF_INET or PF_INET6 as the domain, SOCK_SEQPACKET as the
   type, and IPPROTO_SCTP as the protocol.

   Here, SOCK_SEQPACKET indicates the creation of a one-to-many style
   socket.

   The function returns a socket descriptor, or -1 in case of an error.

   Using the PF_INET domain indicates the creation of an endpoint that
   can use only IPv4 addresses, while PF_INET6 creates an endpoint that
   can use both IPv6 and IPv4 addresses.
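
   As an illustration only, the call described above could be wrapped
   as follows.  This is a minimal sketch: the helper name
   open_one_to_many() is hypothetical and not part of the API, and it
   assumes a platform whose <netinet/in.h> defines IPPROTO_SCTP.

```c
#include <errno.h>
#include <netinet/in.h>   /* IPPROTO_SCTP */
#include <sys/socket.h>   /* socket(), PF_INET, PF_INET6, SOCK_SEQPACKET */

/*
 * Hypothetical helper (not part of the SCTP sockets API).
 * Returns a one-to-many style SCTP socket descriptor, or -1 with errno
 * set.  Domains other than PF_INET and PF_INET6 are rejected locally
 * with EAFNOSUPPORT before socket() is called.
 */
int open_one_to_many(int domain)
{
    if (domain != PF_INET && domain != PF_INET6) {
        errno = EAFNOSUPPORT;
        return -1;
    }
    /* SOCK_SEQPACKET selects the one-to-many style. */
    return socket(domain, SOCK_SEQPACKET, IPPROTO_SCTP);
}
```

   A caller would typically pass PF_INET6 to obtain an endpoint that
   can use both IPv6 and IPv4 addresses, and check for -1 because the
   host may not provide SCTP at all.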

3.1.2.  bind()

   Applications use bind() to specify with which local address and port
   the SCTP endpoint should associate itself.

   An SCTP endpoint can be associated with multiple addresses.  The
   sctp_bindx() function, introduced in Section 9.1, lets applications
   associate multiple addresses with an endpoint.  Note, however, that
   an endpoint can only be associated with one local port.

   These addresses associated with a socket are the eligible transport
   addresses for the endpoint to send and receive data.  The endpoint
   will also present these addresses to its peers during the association
   initialization process; see [RFC4960].

   After calling bind(), if the endpoint wishes to accept new
   associations on the socket, it must call listen() (see
   Section 3.1.3).

   The function prototype of bind() is

   int bind(int sd,
            struct sockaddr *addr,
            socklen_t addrlen);

   and the arguments are

   sd:  The socket descriptor returned by socket().

   addr:  The address structure (struct sockaddr_in for an IPv4 address
      or struct sockaddr_in6 for an IPv6 address; see [RFC3493]).

   addrlen:  The size of the address structure.

   bind() returns 0 on success and -1 in case of an error.

   If sd is an IPv4 socket, the address passed must be an IPv4 address.
   If sd is an IPv6 socket, the address passed can be either an IPv4 or
   an IPv6 address.

   Applications cannot call bind() multiple times to associate multiple
   addresses with an endpoint.  After the first call to bind(), all
   subsequent calls will return an error.

   If the IP address part of addr is specified as a wildcard (INADDR_ANY
   for an IPv4 address, or as IN6ADDR_ANY_INIT or in6addr_any for an
   IPv6 address), the operating system will associate the endpoint with
   an optimal address set of the available interfaces.  If the IPv4
   sin_port or IPv6 sin6_port is set to 0, the operating system will
   choose an ephemeral port for the endpoint.
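
   The wildcard and ephemeral-port cases above can be sketched as
   follows for IPv4.  The helper names fill_wildcard_v4() and
   bind_wildcard_v4() are hypothetical and exist only for this example.

```c
#include <arpa/inet.h>    /* htonl(), htons() */
#include <netinet/in.h>   /* struct sockaddr_in, INADDR_ANY */
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: fill *sin with the IPv4 wildcard address and
 * the given port (host byte order).  A port of 0 asks the operating
 * system to choose an ephemeral port.
 */
void fill_wildcard_v4(struct sockaddr_in *sin, uint16_t port)
{
    memset(sin, 0, sizeof(*sin));
    sin->sin_family = AF_INET;
    sin->sin_addr.s_addr = htonl(INADDR_ANY);
    sin->sin_port = htons(port);
}

/*
 * Hypothetical helper: bind sd to the wildcard address.
 * Returns the result of bind(): 0 on success, -1 on error.
 */
int bind_wildcard_v4(int sd, uint16_t port)
{
    struct sockaddr_in sin;

    fill_wildcard_v4(&sin, port);
    return bind(sd, (struct sockaddr *)&sin, sizeof(sin));
}
```

   Because bind() may be called only once per endpoint, an application
   that needs a specific subset of local addresses would use
   sctp_bindx() (Section 9.1) rather than repeating this call.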

   If bind() is not called prior to a sendmsg() call that initiates a
   new association, the system picks an ephemeral port and chooses an
   address set equivalent to binding with a wildcard address.  One of
   those addresses will be the primary address for the association.
   This automatically enables the multi-homing capability of SCTP.

   Completing bind() does not by itself allow the SCTP endpoint to
   accept inbound SCTP association requests.  Until a listen() system
   call, described below, is performed on the socket, the SCTP endpoint
   will promptly reject an inbound SCTP INIT request with an SCTP
   ABORT.

3.1.3.  listen()

   By default, a one-to-many style socket does not accept new
   association requests.  An application uses listen() to mark a socket
   as being able to accept new associations.

   The function prototype is

   int listen(int sd,
              int backlog);

   and the arguments are

   sd:  The socket descriptor of the endpoint.

   backlog:  If backlog is non-zero, enable listening, else disable
      listening.

   listen() returns 0 on success and -1 in case of an error.
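
   Since only the distinction between a zero and a non-zero backlog is
   described above, a wrapper can make that intent explicit.  This is a
   small sketch; the helper name set_accepting() is hypothetical.

```c
#include <sys/socket.h>

/*
 * Hypothetical helper: enable (enable != 0) or disable (enable == 0)
 * acceptance of new associations on a one-to-many style socket.
 * Returns the result of listen(): 0 on success, -1 on error.
 */
int set_accepting(int sd, int enable)
{
    /* For this socket style, a non-zero backlog enables listening. */
    return listen(sd, enable ? 1 : 0);
}
```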

   Note that one-to-many style socket consumers do not need to call
   accept() to retrieve new associations.  Calling accept() on a one-to-
   many style socket should return EOPNOTSUPP.  Rather, new associations
   are accepted automatically, and notifications of the new associations
   are delivered via recvmsg() with the SCTP_ASSOC_CHANGE event (if
   these notifications are enabled).  Clients will typically not call
   listen(), so that they can be assured that only actively initiated
   associations are possible on the socket.  Server or peer-to-peer
   sockets, on the other hand, will always accept new associations, so a
   well-written application using server one-to-many style sockets must
   be prepared to handle new associations from unwanted peers.

   Also note that the SCTP_ASSOC_CHANGE event provides the association
   identifier for a new association, so if applications wish to use the
   association identifier as a parameter to other socket calls, they
   should ensure that the SCTP_ASSOC_CHANGE event is enabled.

3.1.4.  sendmsg() and recvmsg()

   An application uses the sendmsg() and recvmsg() calls to transmit
   data to and receive data from its peer.

   The function prototypes are

   ssize_t sendmsg(int sd,
                   const struct msghdr *message,
                   int flags);

   and

   ssize_t recvmsg(int sd,
                   struct msghdr *message,
                   int flags);

   using the following arguments:

   sd:  The socket descriptor of the endpoint.

   message:  Pointer to the msghdr structure that contains a single user
      message and possibly some ancillary data.  See Section 5 for a
      complete description of the data structures.

   flags:  No new flags are defined for SCTP at this level.  See
      Section 5 for SCTP-specific flags used in the msghdr structure.

   sendmsg() returns the number of bytes accepted by the kernel or -1 in
   case of an error.  recvmsg() returns the number of bytes received or
   -1 in case of an error.

   As described in Section 5, different types of ancillary data can be
   sent and received along with user data.  When sending, the ancillary
   data is used to specify the send behavior, such as the SCTP stream
   number to use.  When receiving, the ancillary data is used to
   describe the received data, such as the SCTP stream sequence number
   of the message.

   When sending user data with sendmsg(), the application fills the
   msg_name field in the msghdr structure with one of the transport
   addresses of the intended receiver.  If there is no existing
   association between the sender and the intended receiver, the
   sender's SCTP stack will set up a new association and then send the
   user data (see Section 7.5 for more on implicit association setup).
   If sendmsg() is called with no data and there is no existing
   association, a new one will be established.  The SCTP_INIT type
   ancillary data can be used to change some of the parameters used to
   set up a new association.  If sendmsg() is called with NULL data, and
   there is no existing association but the SCTP_ABORT or SCTP_EOF flags
   are set as described in Section 5.3.4, then -1 is returned and errno
   is set to EINVAL.  Sending a message using sendmsg() is atomic unless
   explicit end of record (EOR) marking is enabled on the socket
   specified by sd (see Section 8.1.26).
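
   A send that names its receiver through msg_name might look like the
   following sketch.  It carries no ancillary data, so stream number
   and other send parameters take their defaults.  The helper names
   prepare_msghdr() and send_user_message() are hypothetical.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>      /* struct iovec */

/*
 * Hypothetical helper: set up *msg and *iov so that sendmsg() sends
 * one user message of len bytes to the transport address in *peer.
 */
void prepare_msghdr(struct msghdr *msg, struct iovec *iov,
                    const struct sockaddr *peer, socklen_t peerlen,
                    const void *data, size_t len)
{
    memset(msg, 0, sizeof(*msg));      /* also leaves msg_control NULL */
    iov->iov_base = (void *)data;      /* sendmsg() only reads this buffer */
    iov->iov_len = len;
    msg->msg_name = (void *)peer;      /* transport address of the receiver */
    msg->msg_namelen = peerlen;
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;
}

/*
 * Hypothetical helper: send one user message.  Returns the number of
 * bytes accepted by the kernel, or -1 with errno set.
 */
ssize_t send_user_message(int sd, const struct sockaddr *peer,
                          socklen_t peerlen, const void *data, size_t len)
{
    struct msghdr msg;
    struct iovec iov;

    prepare_msghdr(&msg, &iov, peer, peerlen, data, len);
    return sendmsg(sd, &msg, 0);
}
```

   On a one-to-many style socket with no association to the named
   peer, this one call would both set up the association and queue the
   data, as described above.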

   If a peer sends a SHUTDOWN, an SCTP_SHUTDOWN_EVENT notification will
   be delivered if that notification has been enabled, and no more data
   can be sent to that association.  Any attempt to send more data will
   cause sendmsg() to return with an ESHUTDOWN error.  Note that the
   socket is still open for reading at this point, so it is possible to
   retrieve notifications.
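
   An application may want to tell this case apart from other send
   failures, since the socket can still be read.  The following sketch
   shows one way to do so; the enum and the function classify_send()
   are hypothetical, and ESHUTDOWN is assumed to be defined by the
   platform's <errno.h>.

```c
#include <errno.h>
#include <sys/types.h>    /* ssize_t */

/* Hypothetical classification of a sendmsg() result. */
enum send_outcome { SEND_OK, SEND_PEER_SHUTDOWN, SEND_FAILED };

/*
 * rc is the value returned by sendmsg(); err is the value of errno
 * captured immediately after the call.
 */
enum send_outcome classify_send(ssize_t rc, int err)
{
    if (rc >= 0)
        return SEND_OK;
    if (err == ESHUTDOWN)
        return SEND_PEER_SHUTDOWN;   /* keep reading, stop sending */
    return SEND_FAILED;
}
```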

   When receiving a user message with recvmsg(), the msg_name field in
   the msghdr structure will be populated with the source transport
   address of the user data.  The caller of recvmsg() can use this
   address information to determine to which association the received
   user message belongs.  Note that if SCTP_ASSOC_CHANGE events are
   disabled, applications must use the peer transport address provided
   in the msg_name field by recvmsg() to perform correlation to an
   association, since they will not have the association identifier.
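
   Correlation by address could be sketched as below.  The function
   same_transport_address() is hypothetical.  It compares a single
   address and port; since a multi-homed peer can send from any of its
   addresses, an application would apply it to each address it has
   recorded for a peer.  IPv6 scope identifiers are ignored here.

```c
#include <netinet/in.h>   /* struct sockaddr_in, struct sockaddr_in6 */
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: return 1 if a and b hold the same address
 * family, port, and IP address, and 0 otherwise.
 */
int same_transport_address(const struct sockaddr *a,
                           const struct sockaddr *b)
{
    if (a->sa_family != b->sa_family)
        return 0;

    if (a->sa_family == AF_INET) {
        const struct sockaddr_in *x = (const struct sockaddr_in *)a;
        const struct sockaddr_in *y = (const struct sockaddr_in *)b;

        return x->sin_port == y->sin_port &&
               x->sin_addr.s_addr == y->sin_addr.s_addr;
    }
    if (a->sa_family == AF_INET6) {
        const struct sockaddr_in6 *x = (const struct sockaddr_in6 *)a;
        const struct sockaddr_in6 *y = (const struct sockaddr_in6 *)b;

        return x->sin6_port == y->sin6_port &&
               memcmp(&x->sin6_addr, &y->sin6_addr,
                      sizeof(x->sin6_addr)) == 0;
    }
    return 0;   /* other address families are not compared */
}
```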

   If all data in a single message has been delivered, MSG_EOR will be
   set in the msg_flags field of the msghdr structure (see Section 5.1).

   If the application does not provide enough buffer space to completely
   receive a data message, MSG_EOR will not be set in msg_flags.
   Successive reads will consume more of the same message until the
   entire message has been delivered, at which point MSG_EOR will be
   set.
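
   The read-until-MSG_EOR behavior can be sketched as a loop.  This
   example assumes that notifications are disabled, so every read
   returns user data, and that consecutive reads belong to the same
   message.  The helper name recv_whole_message() and its choice of
   EMSGSIZE for a full buffer are hypothetical.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>      /* struct iovec */

/*
 * Hypothetical helper: call recvmsg() repeatedly, appending to buf,
 * until MSG_EOR is reported in msg_flags.  Returns the total number of
 * bytes stored, or -1 with errno set.  If buf fills up before MSG_EOR
 * is seen, -1 is returned with errno set to EMSGSIZE; the bytes read
 * so far remain in buf.
 */
ssize_t recv_whole_message(int sd, char *buf, size_t cap)
{
    size_t used = 0;

    for (;;) {
        struct iovec iov;
        struct msghdr msg;
        ssize_t n;

        if (used == cap) {              /* no room for the next piece */
            errno = EMSGSIZE;
            return -1;
        }
        iov.iov_base = buf + used;
        iov.iov_len = cap - used;
        memset(&msg, 0, sizeof(msg));
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;

        n = recvmsg(sd, &msg, 0);
        if (n < 0)
            return -1;                  /* errno set by recvmsg() */
        used += (size_t)n;
        /* Stop at end of record; also stop on a zero-length read. */
        if ((msg.msg_flags & MSG_EOR) || n == 0)
            return (ssize_t)used;
    }
}
```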

   If the SCTP stack is running low on buffers, it may partially deliver
   a message.  In this case, MSG_EOR will not be set, and more calls to
   recvmsg() will be necessary to completely consume the message.  Only
   one message at a time can be partially delivered in any stream.  The
   socket option SCTP_FRAGMENT_INTERLEAVE controls various aspects of
   what interlacing of messages occurs for both the one-to-one and the
   one-to-many style sockets.  Please consult Section 8.1.20 for further
   details on message delivery options.

3.1.5.  close()

   Applications use close() to perform graceful shutdown (as described
   in Section 10.1 of [RFC4960]) on all of the associations currently
   represented by a one-to-many style socket.

   The function prototype is

   int close(int sd);

   and the argument is

   sd:  The socket descriptor of the associations to be closed.

   0 is returned on success and -1 in case of an error.

   To gracefully shut down a specific association represented by the
   one-to-many style socket, an application should use the sendmsg()
   call and include the SCTP_EOF flag.  A user may optionally terminate
   an association non-gracefully by using sendmsg() with the SCTP_ABORT
   flag set and possibly passing a user-specified abort code in the data
   field.  Both flags SCTP_EOF and SCTP_ABORT are passed with ancillary
   data (see Section 5.3.4) in the sendmsg() call.

   If sd in the close() call is a branched-off socket representing only
   one association, the shutdown is performed on that association only.

3.1.6.  connect()

   An application may use the connect() call in the one-to-many style to
   initiate an association without sending data.

   The function prototype is

   int connect(int sd,
               const struct sockaddr *nam,
               socklen_t len);

   and the arguments are

   sd:  The socket descriptor to which a new association is added.

   nam:  The address structure (struct sockaddr_in for an IPv4 address
      or struct sockaddr_in6 for an IPv6 address; see [RFC3493]).

   len:  The size of the address.

   0 is returned on success and -1 in case of an error.

   Multiple connect() calls can be made on the same socket to create
   multiple associations.  This is different from the semantics of
   connect() on a UDP socket.
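
   A minimal sketch of this usage in C follows.  The helper name
   add_association() is hypothetical, the sketch handles IPv4 peers
   only, and the addresses in the usage comment are documentation
   examples, not values taken from this document.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: start an association from the one-to-many
 * style socket sd to the IPv4 peer ip:port without sending data.
 * Returns 0 on success and -1 on error (including a malformed ip).
 *
 * Possible usage, assuming sd was created with
 * socket(PF_INET, SOCK_SEQPACKET, IPPROTO_SCTP):
 *
 *     add_association(sd, "192.0.2.1", 5000);
 *     add_association(sd, "192.0.2.2", 5000);
 */
int add_association(int sd, const char *ip, uint16_t port)
{
    struct sockaddr_in peer;

    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &peer.sin_addr) != 1)
        return -1;              /* ip is not a valid IPv4 literal */
    return connect(sd, (const struct sockaddr *)&peer, sizeof(peer));
}
```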

   Note that SCTP allows data exchange, similar to T/TCP [RFC1644] (made
   Historic by [RFC6247]), during the association setup phase.  If an
   application wants to do this, it cannot use the connect() call.
   Instead, it should use sendto() or sendmsg() to initiate an
   association.  If it uses sendto() and it wants to change the
   initialization behavior, it needs to use the SCTP_INITMSG socket
   option before calling sendto().  Or it can use sendmsg() with
   SCTP_INIT type ancillary data to initiate an association without
   calling setsockopt().  Note that the implicit setup is supported for
   the one-to-many style sockets.

   SCTP does not support half close semantics.  This means that unlike
   T/TCP, MSG_EOF should not be set in the flags parameter when calling
   sendto() or sendmsg() when the call is used to initiate a connection.
   MSG_EOF is not an acceptable flag with an SCTP socket.

3.2.  Non-Blocking Mode

   Some SCTP applications may wish to avoid being blocked when calling a
   socket interface function.

   Once a bind() call and/or subsequent sctp_bindx() calls are complete
   on a one-to-many style socket, an application may put the socket into
   non-blocking mode using fcntl() (for example, by setting O_NONBLOCK).
   After setting the socket to non-blocking mode, the sendmsg() function
   returns immediately.  The success or failure of sending the data
   message (with possible SCTP_INIT ancillary data) will be signaled
   by the SCTP_ASSOC_CHANGE event with SCTP_COMM_UP or
   SCTP_CANT_START_ASSOC.  If user data could not be sent (due to an
   SCTP_CANT_START_ASSOC), the sender will also receive an
   SCTP_SEND_FAILED_EVENT event.  Events can be received by the user
   calling recvmsg().  A server (having called listen()) is also
   notified of an association-up event by receiving an
   SCTP_ASSOC_CHANGE with SCTP_COMM_UP from recvmsg(), and possibly by
   receiving the first data message.
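
   For illustration, setting the non-blocking option with fcntl() could
   look like the following C sketch.  The helper name set_nonblocking()
   is hypothetical; the fcntl() commands and the O_NONBLOCK flag are
   the standard POSIX ones.

```c
#include <fcntl.h>

/*
 * Hypothetical helper: add O_NONBLOCK to the file status flags of sd
 * while preserving the flags that are already set.
 * Returns 0 on success and -1 on error.
 */
int set_nonblocking(int sd)
{
    int flags = fcntl(sd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    if (fcntl(sd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    return 0;
}
```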

   To shut down the association gracefully, the user must call sendmsg()
   with no data and with the SCTP_EOF flag set as described in
   Section 5.3.4.  The function returns immediately, and completion of
   the graceful shutdown is indicated by an SCTP_ASSOC_CHANGE
   notification of type SCTP_SHUTDOWN_COMP (see Section 6.1.1).  Note
   that this can also be done using the sctp_sendv() call described in
   Section 9.12.

   It is recommended that an application use caution when using select()
   (or poll()) for writing on a one-to-many style socket, because the
   interpretation of select() on write is implementation specific.
   Generally, a positive return on a select() on write would only
   indicate that one of the associations represented by the one-to-many
   style socket is writable.  An application that writes after the
   select() returns may still block, since the association that was
   writable may not be the destination association of the write call.
   Likewise, select() (or poll()) for reading from a one-to-many style
   socket will only return an indication that one of the associations
   represented by the socket has data to be read.

   An application that wishes to know that a particular association is
   ready for reading or writing should either use the one-to-one style
   or use the sctp_peeloff() function (see Section 9.2) to separate the
   association of interest from the one-to-many style socket.

   Note that some implementations may have an extended select call, such
   as epoll or kqueue, that may escape this limitation and allow a
   select on a specific association of a one-to-many style socket, but
   this is an implementation-specific detail that a portable application
   cannot depend on.

3.3.  Special Considerations

   The fact that a one-to-many style socket can provide access to many
   SCTP associations through a single socket descriptor has important
   implications for both application programmers and system programmers
   implementing this API.  A key issue is how buffer space inside the
   sockets layer is managed.  Because this implementation detail
   directly affects how application programmers must write their code to
   ensure correct operation and portability, this section provides some
   guidance to both implementers and application programmers.

   An important feature that SCTP shares with TCP is flow control.
   Specifically, a sender may not send data faster than the receiver can
   consume it.

   For TCP, flow control is typically provided for in the sockets API as
   follows.  If the reader stops reading, the sender queues messages in
   the socket layer until the send socket buffer is completely filled.
   This results in a "stalled connection".  Further attempts to write to
   the socket will block or return the error EAGAIN or EWOULDBLOCK for a
   non-blocking socket.  At some point, either the connection is closed,
   or the receiver begins to read, again freeing space in the output
   queue.
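
   The stalled condition can be observed from the sending side with a
   sketch such as the following.  The helper name fill_until_stalled()
   and the STALL_DEMO_LIMIT safety cap are hypothetical.  The sketch
   assumes a connected stream socket whose peer is not reading; it
   works on any such socket and is not specific to TCP or SCTP.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#define STALL_DEMO_LIMIT (64L * 1024L * 1024L)  /* safety cap, arbitrary */

/*
 * Hypothetical demonstration helper: put sd into non-blocking mode and
 * keep sending until the send buffer is full.  Returns the number of
 * bytes queued before send() failed with EAGAIN or EWOULDBLOCK (or
 * before the safety cap was reached), or -1 on any other error.
 */
long fill_until_stalled(int sd)
{
    char chunk[4096];
    long queued = 0;
    int flags = fcntl(sd, F_GETFL, 0);

    if (flags == -1 || fcntl(sd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    memset(chunk, 'x', sizeof(chunk));

    while (queued < STALL_DEMO_LIMIT) {
        ssize_t n = send(sd, chunk, sizeof(chunk), 0);

        if (n >= 0) {
            queued += (long)n;  /* accepted into the send buffer */
            continue;
        }
        if (errno == EINTR)
            continue;           /* interrupted; retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return queued;      /* buffer full: the connection stalled */
        return -1;
    }
    return queued;              /* cap reached without stalling */
}
```

   Once the helper returns, further send() calls keep failing in the
   same way until the reader consumes data or the connection is closed.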

   For one-to-one style SCTP sockets (this includes socket descriptors
   that were separated from a one-to-many style socket with
   sctp_peeloff()), the behavior is identical.  For one-to-many style
   SCTP sockets, there are multiple associations for a single socket,
   which makes the situation more complicated.  If the implementation
   uses a single buffer space allocation shared by all associations, a
   single stalled association can prevent the further sending of data on
   all associations active on a particular one-to-many style socket.

   For a blocking socket, it should be clear that a single stalled
   association can block the entire socket.  For this reason,
   application programmers may want to use non-blocking one-to-many
   style sockets.  The application should at least be able to send
   messages to the non-stalled associations.

   But a non-blocking socket is not sufficient if the API implementer
   has chosen a single shared buffer allocation for the socket.  A
   single stalled association would eventually cause the shared
   allocation to fill, and it would become impossible to send even to
   non-stalled associations.

   The API implementer can solve this problem by providing each
   association with its own allocation of outbound buffer space.  Each
   association should conceptually have as much buffer space as it would
   have if it had its own socket.  As a bonus, this simplifies the
   implementation of sctp_peeloff().

   To ensure that a given stalled association will not prevent other
   non-stalled associations from being writable, application programmers
   should either

   o  demand that the underlying implementation dedicate independent
      buffer space reservation to each association (as suggested
      above), or

   o  verify that their application-layer protocol does not permit large
      amounts of unread data at the receiver (this is true of some
      request-response protocols, for example), or

   o  use one-to-one style sockets for associations that may
      potentially stall (either from the beginning, or by using
      sctp_peeloff() before sending large amounts of data that may cause
      a stalled condition).

4.  One-to-One Style Interface

   The goal of this style is to follow as closely as possible the
   current practice of using the sockets interface for a connection-
   oriented protocol such as TCP.  This style enables existing
   applications using connection-oriented protocols to be ported to SCTP
   with very little effort.

   One-to-one style sockets can be connected (explicitly or implicitly)
   at most once, similar to TCP sockets.

   Note that some new SCTP features and some new SCTP socket options can
   only be utilized through the use of sendmsg() and recvmsg() calls;
   see Section 4.1.8.

4.1.  Basic Operation

   A typical one-to-one style server uses the following system call
   sequence to prepare an SCTP endpoint for servicing requests:

   o  socket()

   o  bind()

   o  listen()

   o  accept()

   The accept() call blocks until a new association is set up.  It
   returns with a new socket descriptor.  The server then uses the new
   socket descriptor to communicate with the client, using recv() and
   send() calls to get requests and send back responses.

   Then it calls

   o  close()

   to terminate the association.

   A typical client uses the following system call sequence to set up an
   association with a server to request services:

   o  socket()

   o  connect()

   After returning from the connect() call, the client uses send()/
   sendmsg() and recv()/recvmsg() calls to send out requests and receive
   responses from the server.

   The client calls

   o  close()

   to terminate this association when done.
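
   The two call sequences above might be sketched in C as follows.  The
   helper names sctp_server_socket() and sctp_client_socket() are
   hypothetical, the sketch is limited to IPv4, and error handling is
   kept minimal.  Note that socket() fails on systems without SCTP
   support, in which case both helpers return -1.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical server-side helper: socket(), bind() to the wildcard
 * address, and listen().  Returns the listening descriptor, ready for
 * accept(), or -1 on error.
 */
int sctp_server_socket(uint16_t port, int backlog)
{
    struct sockaddr_in addr;
    int sd = socket(PF_INET, SOCK_STREAM, IPPROTO_SCTP);

    if (sd == -1)
        return -1;              /* for example, no SCTP support */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(sd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(sd, backlog) == -1) {
        close(sd);
        return -1;
    }
    return sd;
}

/*
 * Hypothetical client-side helper: socket() and connect().  Returns a
 * descriptor usable with send()/recv(), or -1 on error.
 */
int sctp_client_socket(const char *ip, uint16_t port)
{
    struct sockaddr_in peer;
    int sd;

    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &peer.sin_addr) != 1)
        return -1;              /* ip is not a valid IPv4 literal */
    sd = socket(PF_INET, SOCK_STREAM, IPPROTO_SCTP);
    if (sd == -1)
        return -1;
    if (connect(sd, (struct sockaddr *)&peer, sizeof(peer)) == -1) {
        close(sd);
        return -1;
    }
    return sd;
}
```

   In both cases, the application calls close() on the returned
   descriptor (and, for the server, on each descriptor returned by
   accept()) when it is done.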

4.1.1.  socket()

   Applications call socket() to create a socket descriptor to represent
   an SCTP endpoint.

   The function prototype is

   int socket(int domain,
              int type,
              int protocol);

   and one uses PF_INET or PF_INET6 as the domain, SOCK_STREAM as the
   type, and IPPROTO_SCTP as the protocol.

   Here, SOCK_STREAM indicates the creation of a one-to-one style
   socket.

   Using the PF_INET domain indicates the creation of an endpoint that
   can use only IPv4 addresses, while PF_INET6 creates an endpoint that
   can use both IPv6 and IPv4 addresses.

4.1.2.  bind()

   Applications use bind() to specify with which local address and port
   the SCTP endpoint should associate itself.

   An SCTP endpoint can be associated with multiple addresses.  To do
   this, sctp_bindx() is introduced in Section 9.1 to help applications
   do the job of associating multiple addresses.  But note that an
   endpoint can only be associated with one local port.

   These addresses associated with a socket are the eligible transport
   addresses for the endpoint to send and receive data.  The endpoint
   will also present these addresses to its peers during the association
   initialization process; see [RFC4960].

   The function prototype of bind() is

   int bind(int sd,
            struct sockaddr *addr,
            socklen_t addrlen);

   and the arguments are

   sd:  The socket descriptor returned by socket().

   addr:  The address structure (struct sockaddr_in for an IPv4 address
      or struct sockaddr_in6 for an IPv6 address; see [RFC3493]).

   addrlen:  The size of the address structure.

   If sd is an IPv4 socket, the address passed must be an IPv4 address.
   If sd is an IPv6 socket, the address passed can either be an IPv4 or
   an IPv6 address.

   Applications cannot call bind() multiple times to associate multiple
   addresses to the endpoint.  After the first call to bind(), all
   subsequent calls will return an error.

   If the IP address part of addr is specified as a wildcard (INADDR_ANY
   for an IPv4 address, or as IN6ADDR_ANY_INIT or in6addr_any for an
   IPv6 address), the operating system will associate the endpoint with
   an optimal address set of the available interfaces.  If the IPv4
   sin_port or IPv6 sin6_port is set to 0, the operating system will
   choose an ephemeral port for the endpoint.
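
   As a sketch of the wildcard case, the following C fragment binds an
   IPv4 socket to INADDR_ANY with port 0 and then asks the system which
   port it chose.  The helper name bind_wildcard_ephemeral() is
   hypothetical; the use of getsockname() to learn the port is ordinary
   sockets practice and is not described in this section.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: bind the IPv4 socket sd to the wildcard address
 * and let the system choose the port.  On success, stores the chosen
 * port (host byte order) in *port and returns 0; returns -1 on error.
 */
int bind_wildcard_ephemeral(int sd, uint16_t *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);   /* wildcard address */
    addr.sin_port = htons(0);                   /* ephemeral port */
    if (bind(sd, (struct sockaddr *)&addr, sizeof(addr)) == -1)
        return -1;
    if (getsockname(sd, (struct sockaddr *)&addr, &len) == -1)
        return -1;
    *port = ntohs(addr.sin_port);
    return 0;
}
```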

   If bind() is not called prior to the connect() call, the system picks
   an ephemeral port and will choose an address set equivalent to
   binding with a wildcard address.  One of these addresses will be the
   primary address for the association.  This automatically enables the
   multi-homing capability of SCTP.

   The completion of this bind() process does not allow the SCTP
   endpoint to accept inbound SCTP association requests.  Until a
   listen() system call, described below, is performed on the socket,
   the SCTP endpoint will promptly reject an inbound SCTP INIT request
   with an SCTP ABORT.

4.1.3.  listen()

   Applications use listen() to allow the SCTP endpoint to accept
   inbound associations.

   The function prototype is

   int listen(int sd,
              int backlog);

   and the arguments are

   sd:  The socket descriptor of the SCTP endpoint.

   backlog:  Specifies the max number of outstanding associations
      allowed in the socket's accept queue.  These are the associations
      that have finished the four-way initiation handshake (see
      Section 5 of [RFC4960]) and are in the ESTABLISHED state.  Note
      that a backlog of '0' indicates that the caller no longer wishes
      to receive new associations.

   listen() returns 0 on success and -1 in case of an error.

4.1.4.  accept()

   Applications use the accept() call to remove an established SCTP
   association from the accept queue of the endpoint.  A new socket
   descriptor will be returned from accept() to represent the newly
   formed association.

   The function prototype is

   int accept(int sd,
              struct sockaddr *addr,
              socklen_t *addrlen);

   and the arguments are

   sd:  The listening socket descriptor.

   addr:  On return, addr (struct sockaddr_in for an IPv4 address or
      struct sockaddr_in6 for an IPv6 address; see [RFC3493]) will
      contain the primary address of the peer endpoint.

   addrlen:  On entry, the size of the buffer that addr points to.  On
      return, the size of the address stored in addr.

   The function returns the socket descriptor for the newly formed
   association on success and -1 in case of an error.

4.1.5.  connect()

   Applications use connect() to initiate an association to a peer.

   The function prototype is

   int connect(int sd,
               const struct sockaddr *addr,
               socklen_t addrlen);

   and the arguments are

   sd:  The socket descriptor of the endpoint.

   addr:  The peer's address (struct sockaddr_in for an IPv4 address or
      struct sockaddr_in6 for an IPv6 address; see [RFC3493]).

   addrlen:  The size of the address.

   connect() returns 0 on success and -1 on error.

   This operation corresponds to the ASSOCIATE primitive described in
   Section 10.1 of [RFC4960].

   The number of outbound streams the new association has is stack
   dependent.  Before connecting, applications can use the SCTP_INITMSG
   option described in Section 8.1.3 to change the number of outbound
   streams.

   If bind() is not called prior to the connect() call, the system picks
   an ephemeral port and will choose an address set equivalent to
   binding with INADDR_ANY and IN6ADDR_ANY_INIT for IPv4 and IPv6
   sockets, respectively.  One of the addresses will be the primary
   address for the association.  This automatically enables the
   multi-homing capability of SCTP.

   Note that SCTP allows data exchange, similar to T/TCP [RFC1644] (made
   Historic by [RFC6247]), during the association setup phase.  If an
   application wants to do this, it cannot use the connect() call.
   Instead, it should use sendto() or sendmsg() to initiate an
   association.  If it uses sendto() and it wants to change the
   initialization behavior, it needs to use the SCTP_INITMSG socket
   option before calling sendto().  Or it can use sendmsg() with
   SCTP_INIT type ancillary data to initiate an association without
   calling setsockopt().  Note that the implicit setup is supported for
   the one-to-one style sockets.

   SCTP does not support half close semantics.  This means that unlike
   T/TCP, MSG_EOF should not be set in the flags parameter when calling
   sendto() or sendmsg() when the call is used to initiate a connection.
   MSG_EOF is not an acceptable flag with an SCTP socket.

4.1.6.  close()

   Applications use close() to gracefully close down an association.

   The function prototype is

   int close(int sd);

   and the argument is

   sd:  The socket descriptor of the association to be closed.

   close() returns 0 on success and -1 in case of an error.

   After an application calls close() on a socket descriptor, no further
   socket operations will succeed on that descriptor.

4.1.7.  shutdown()

   SCTP differs from TCP in that it does not have half close semantics.
   Hence, the shutdown() call for SCTP is an approximation of the TCP
   shutdown() call, and solves some different problems.  Full TCP
   compatibility is not provided, so developers porting TCP applications
   to SCTP may need to recode sections that use shutdown().  (Note that
   it is possible to achieve the same results as half close in SCTP
   using SCTP streams.)

   The function prototype is

   int shutdown(int sd,
                int how);

   and the arguments are

   sd:  The socket descriptor of the association to be closed.

   how:  Specifies the type of shutdown.  The values are as follows:

      SHUT_RD:  Disables further receive operations.  No SCTP protocol
         action is taken.

      SHUT_WR:  Disables further send operations, and initiates the SCTP
         shutdown sequence.

      SHUT_RDWR:  Disables further send and receive operations, and
         initiates the SCTP shutdown sequence.

   shutdown() returns 0 on success and -1 in case of an error.

   The major difference between SCTP and TCP shutdown() is that SCTP
   SHUT_WR initiates immediate and full protocol shutdown, whereas TCP
   SHUT_WR causes TCP to go into the half close state.  SHUT_RD behaves
   the same for SCTP as for TCP.  The purpose of SCTP SHUT_WR is to
   close the SCTP association while still leaving the socket descriptor
   open.  This allows the caller to receive back any data that SCTP is
   unable to deliver (see Section 6.1.4 for more information) and
   receive event notifications.
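
   The pattern of shutting down the sending side while continuing to
   read might look like the following C sketch.  The helper name
   shutdown_and_drain() is hypothetical.  For brevity the sketch uses
   recv() and only counts bytes; an application interested in
   notifications would instead use recvmsg() and inspect msg_flags.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: stop sending on sd with SHUT_WR, then keep
 * reading until nothing more can be received.  Returns the number of
 * bytes received after the shutdown() call, or -1 on error.  The
 * descriptor stays open; the caller is expected to close() it.
 */
long shutdown_and_drain(int sd)
{
    char buf[2048];
    long total = 0;

    if (shutdown(sd, SHUT_WR) == -1)
        return -1;
    for (;;) {
        ssize_t n = recv(sd, buf, sizeof(buf), 0);

        if (n > 0) {
            total += (long)n;   /* data that was still in flight */
            continue;
        }
        if (n == 0)
            return total;       /* nothing more to receive */
        if (errno == EINTR)
            continue;           /* interrupted; retry */
        return -1;
    }
}
```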

   To perform the ABORT operation described in Section 10.1 of
   [RFC4960], an application can use the socket option SO_LINGER.
   SO_LINGER is described in Section 8.1.4.
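
   A minimal sketch of this use of SO_LINGER follows.  It relies on the
   behavior described in Section 8.1.4, where a linger time of 0 makes
   a subsequent close() act like the ABORT primitive.  The helper name
   set_abort_on_close() is hypothetical.

```c
#include <sys/socket.h>

/*
 * Hypothetical helper: enable SO_LINGER with a linger time of zero on
 * sd, so that a later close(sd) aborts the association instead of
 * shutting it down gracefully.  Returns 0 on success, -1 on error.
 */
int set_abort_on_close(int sd)
{
    struct linger lin;

    lin.l_onoff = 1;    /* turn lingering on */
    lin.l_linger = 0;   /* zero linger time */
    return setsockopt(sd, SOL_SOCKET, SO_LINGER, &lin, sizeof(lin));
}
```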

4.1.8.  sendmsg() and recvmsg()

   With a one-to-one style socket, the application can also use
   sendmsg() and recvmsg() to transmit data to and receive data from its
   peer.  The semantics are similar to those used in the one-to-many
   style (see Section 3.1.4), with the following differences:

   1.  When sending, the msg_name field in the msghdr is not used to
       specify the intended receiver; rather, it is used to indicate a
       preferred peer address if the sender wishes to discourage the
       stack from sending the message to the primary address of the
       receiver.  If the socket is connected and the transport address
       given is not part of the current association, the data will not
       be sent, and an SCTP_SEND_FAILED_EVENT event will be delivered to
       the application if send failure events are enabled.

   2.  Using sendmsg() on a non-connected one-to-one style socket for
       implicit connection setup may or may not work, depending on the
       SCTP implementation.

4.1.9.  getpeername()

   Applications use getpeername() to retrieve the primary socket address
   of the peer.  This call is for TCP compatibility and is not
   multi-homing aware: it returns only one address.  It may not work
   with one-to-many style sockets, depending on the implementation.  See
   Section 9.3 for a multi-homing aware version of the call.

   The function prototype is

   int getpeername(int sd,
                   struct sockaddr *address,
                   socklen_t *len);

   and the arguments are

   sd:  The socket descriptor to be queried.

   address:  On return, the peer primary address is stored in this
      buffer.  If the socket is an IPv4 socket, the address will be
      IPv4.  If the socket is an IPv6 socket, the address will be either
      an IPv6 or IPv4 address.

   len:  The caller should set the length of address here.  On return,
      this is set to the length of the returned address.

   getpeername() returns 0 on success and -1 in case of an error.

   If the actual length of the address is greater than the length of the
   supplied sockaddr structure, the stored address will be truncated.
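
   One way to avoid truncation is to pass a struct sockaddr_storage,
   which is large enough for both IPv4 and IPv6 addresses.  The
   following C sketch takes that approach; the helper name
   peer_primary_address() and its parameters are hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: store the textual form of the peer's primary
 * address in out (of size outlen) and its port, in host byte order,
 * in *port.  Returns 0 on success and -1 on error.
 */
int peer_primary_address(int sd, char *out, socklen_t outlen,
                         uint16_t *port)
{
    struct sockaddr_storage ss;     /* fits IPv4 and IPv6 addresses */
    socklen_t len = sizeof(ss);

    memset(&ss, 0, sizeof(ss));
    if (getpeername(sd, (struct sockaddr *)&ss, &len) == -1)
        return -1;
    if (ss.ss_family == AF_INET) {
        const struct sockaddr_in *a = (const struct sockaddr_in *)&ss;

        *port = ntohs(a->sin_port);
        return inet_ntop(AF_INET, &a->sin_addr, out, outlen) ? 0 : -1;
    }
    if (ss.ss_family == AF_INET6) {
        const struct sockaddr_in6 *a = (const struct sockaddr_in6 *)&ss;

        *port = ntohs(a->sin6_port);
        return inet_ntop(AF_INET6, &a->sin6_addr, out, outlen) ? 0 : -1;
    }
    return -1;                      /* unexpected address family */
}
```

   A buffer of INET6_ADDRSTRLEN bytes is sufficient for out in either
   address family.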
