RFC 6458

Sockets API Extensions for the Stream Control Transmission Protocol (SCTP)

Internet Engineering Task Force (IETF)                        R. Stewart
Request for Comments: 6458                                Adara Networks
Category: Informational                                        M. Tuexen
ISSN: 2070-1721                         Muenster Univ. of Appl. Sciences
                                                                 K. Poon
                                                      Oracle Corporation
                                                                  P. Lei
                                                     Cisco Systems, Inc.
                                                             V. Yasevich
                                                                      HP
                                                           December 2011


                         Sockets API Extensions
          for the Stream Control Transmission Protocol (SCTP)

Abstract

   This document describes a mapping of the Stream Control Transmission
   Protocol (SCTP) into a sockets API.  The benefits of this mapping
   include compatibility for TCP applications, access to new SCTP
   features, and a consolidated error and event notification scheme.

Status of This Memo

   This document is not an Internet Standards Track specification; it is
   published for informational purposes.

   This document is a product of the Internet Engineering Task Force
   (IETF).  It represents the consensus of the IETF community.  It has
   received public review and has been approved for publication by the
   Internet Engineering Steering Group (IESG).  Not all documents
   approved by the IESG are a candidate for any level of Internet
   Standard; see Section 2 of RFC 5741.

   Information about the current status of this document, any errata,
   and how to provide feedback on it may be obtained at
   http://www.rfc-editor.org/info/rfc6458.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.  The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1. Introduction
   2. Data Types
   3. One-to-Many Style Interface
      3.1. Basic Operation
           3.1.1. socket()
           3.1.2. bind()
           3.1.3. listen()
           3.1.4. sendmsg() and recvmsg()
           3.1.5. close()
           3.1.6. connect()
      3.2. Non-Blocking Mode
      3.3. Special Considerations
   4. One-to-One Style Interface
      4.1. Basic Operation
           4.1.1. socket()
           4.1.2. bind()
           4.1.3. listen()
           4.1.4. accept()
           4.1.5. connect()
           4.1.6. close()
           4.1.7. shutdown()
           4.1.8. sendmsg() and recvmsg()
           4.1.9. getpeername()
   5. Data Structures
      5.1. The msghdr and cmsghdr Structures
      5.2. Ancillary Data Considerations and Semantics
           5.2.1. Multiple Items and Ordering
           5.2.2. Accessing and Manipulating Ancillary Data
           5.2.3. Control Message Buffer Sizing
      5.3. SCTP msg_control Structures
           5.3.1. SCTP Initiation Structure (SCTP_INIT)
           5.3.2. SCTP Header Information Structure (SCTP_SNDRCV) -
                  DEPRECATED
           5.3.3. Extended SCTP Header Information Structure
                  (SCTP_EXTRCV) - DEPRECATED
           5.3.4. SCTP Send Information Structure (SCTP_SNDINFO)
           5.3.5. SCTP Receive Information Structure (SCTP_RCVINFO)
           5.3.6. SCTP Next Receive Information Structure
                  (SCTP_NXTINFO)
           5.3.7. SCTP PR-SCTP Information Structure (SCTP_PRINFO)
           5.3.8. SCTP AUTH Information Structure (SCTP_AUTHINFO)
           5.3.9. SCTP Destination IPv4 Address Structure
                  (SCTP_DSTADDRV4)
           5.3.10. SCTP Destination IPv6 Address Structure
                   (SCTP_DSTADDRV6)
   6. SCTP Events and Notifications
      6.1. SCTP Notification Structure
           6.1.1. SCTP_ASSOC_CHANGE
           6.1.2. SCTP_PEER_ADDR_CHANGE
           6.1.3. SCTP_REMOTE_ERROR
           6.1.4. SCTP_SEND_FAILED - DEPRECATED
           6.1.5. SCTP_SHUTDOWN_EVENT
           6.1.6. SCTP_ADAPTATION_INDICATION
           6.1.7. SCTP_PARTIAL_DELIVERY_EVENT
           6.1.8. SCTP_AUTHENTICATION_EVENT
           6.1.9. SCTP_SENDER_DRY_EVENT
           6.1.10. SCTP_NOTIFICATIONS_STOPPED_EVENT
           6.1.11. SCTP_SEND_FAILED_EVENT
      6.2. Notification Interest Options
           6.2.1. SCTP_EVENTS Option - DEPRECATED
           6.2.2. SCTP_EVENT Option
   7. Common Operations for Both Styles
      7.1. send(), recv(), sendto(), and recvfrom()
      7.2. setsockopt() and getsockopt()
      7.3. read() and write()
      7.4. getsockname()
      7.5. Implicit Association Setup
   8. Socket Options
      8.1. Read/Write Options
           8.1.1. Retransmission Timeout Parameters (SCTP_RTOINFO)
           8.1.2. Association Parameters (SCTP_ASSOCINFO)
           8.1.3. Initialization Parameters (SCTP_INITMSG)
           8.1.4. SO_LINGER
           8.1.5. SCTP_NODELAY
           8.1.6. SO_RCVBUF
           8.1.7. SO_SNDBUF
           8.1.8. Automatic Close of Associations (SCTP_AUTOCLOSE)
           8.1.9. Set Primary Address (SCTP_PRIMARY_ADDR)
           8.1.10. Set Adaptation Layer Indicator
                   (SCTP_ADAPTATION_LAYER)
           8.1.11. Enable/Disable Message Fragmentation
                   (SCTP_DISABLE_FRAGMENTS)
           8.1.12. Peer Address Parameters (SCTP_PEER_ADDR_PARAMS)
           8.1.13. Set Default Send Parameters
                   (SCTP_DEFAULT_SEND_PARAM) - DEPRECATED
           8.1.14. Set Notification and Ancillary Events
                   (SCTP_EVENTS) - DEPRECATED
           8.1.15. Set/Clear IPv4 Mapped Addresses
                   (SCTP_I_WANT_MAPPED_V4_ADDR)
           8.1.16. Get or Set the Maximum Fragmentation Size
                   (SCTP_MAXSEG)
           8.1.17. Get or Set the List of Supported HMAC
                   Identifiers (SCTP_HMAC_IDENT)
           8.1.18. Get or Set the Active Shared Key
                   (SCTP_AUTH_ACTIVE_KEY)
           8.1.19. Get or Set Delayed SACK Timer
                   (SCTP_DELAYED_SACK)
           8.1.20. Get or Set Fragmented Interleave
                   (SCTP_FRAGMENT_INTERLEAVE)
           8.1.21. Set or Get the SCTP Partial Delivery Point
                   (SCTP_PARTIAL_DELIVERY_POINT)
           8.1.22. Set or Get the Use of Extended Receive Info
                   (SCTP_USE_EXT_RCVINFO) - DEPRECATED
           8.1.23. Set or Get the Auto ASCONF Flag
                   (SCTP_AUTO_ASCONF)
           8.1.24. Set or Get the Maximum Burst (SCTP_MAX_BURST)
           8.1.25. Set or Get the Default Context (SCTP_CONTEXT)
           8.1.26. Enable or Disable Explicit EOR Marking
                   (SCTP_EXPLICIT_EOR)
           8.1.27. Enable SCTP Port Reusage (SCTP_REUSE_PORT)
           8.1.28. Set Notification Event (SCTP_EVENT)
           8.1.29. Enable or Disable the Delivery of SCTP_RCVINFO
                   as Ancillary Data (SCTP_RECVRCVINFO)
           8.1.30. Enable or Disable the Delivery of SCTP_NXTINFO
                   as Ancillary Data (SCTP_RECVNXTINFO)
           8.1.31. Set Default Send Parameters
                   (SCTP_DEFAULT_SNDINFO)
           8.1.32. Set Default PR-SCTP Parameters
                   (SCTP_DEFAULT_PRINFO)
      8.2. Read-Only Options
           8.2.1. Association Status (SCTP_STATUS)
           8.2.2. Peer Address Information
                  (SCTP_GET_PEER_ADDR_INFO)
           8.2.3. Get the List of Chunks the Peer Requires to
                  Be Authenticated (SCTP_PEER_AUTH_CHUNKS)
           8.2.4. Get the List of Chunks the Local Endpoint Requires
                  to Be Authenticated (SCTP_LOCAL_AUTH_CHUNKS)
           8.2.5. Get the Current Number of Associations
                  (SCTP_GET_ASSOC_NUMBER)
           8.2.6. Get the Current Identifiers of Associations
                  (SCTP_GET_ASSOC_ID_LIST)
      8.3. Write-Only Options
           8.3.1. Set Peer Primary Address
                  (SCTP_SET_PEER_PRIMARY_ADDR)
           8.3.2. Add a Chunk That Must Be Authenticated
                  (SCTP_AUTH_CHUNK)
           8.3.3. Set a Shared Key (SCTP_AUTH_KEY)
           8.3.4. Deactivate a Shared Key
                  (SCTP_AUTH_DEACTIVATE_KEY)
           8.3.5. Delete a Shared Key (SCTP_AUTH_DELETE_KEY)
   9. New Functions
      9.1. sctp_bindx()
      9.2. sctp_peeloff()
      9.3. sctp_getpaddrs()
      9.4. sctp_freepaddrs()
      9.5. sctp_getladdrs()
      9.6. sctp_freeladdrs()
      9.7. sctp_sendmsg() - DEPRECATED
      9.8. sctp_recvmsg() - DEPRECATED
      9.9. sctp_connectx()
      9.10. sctp_send() - DEPRECATED
      9.11. sctp_sendx() - DEPRECATED
      9.12. sctp_sendv()
      9.13. sctp_recvv()
   10. Security Considerations
   11. Acknowledgments
   12. References
      12.1. Normative References
      12.2. Informative References
   Appendix A. Example Using One-to-One Style Sockets
   Appendix B. Example Using One-to-Many Style Sockets

1. Introduction

   The sockets API has provided a standard mapping of the Internet
   Protocol suite to many operating systems.  Both TCP [RFC0793] and UDP
   [RFC0768] have benefited from this standard representation and access
   method across many diverse platforms.  SCTP is a new protocol that
   provides many of the characteristics of TCP but also incorporates
   semantics more akin to UDP.  This document defines a method to map
   the existing sockets API for use with SCTP, providing both a base for
   access to new features and compatibility so that most existing TCP
   applications can be migrated to SCTP with few (if any) changes.

   There are three basic design objectives:

   1.  Maintain consistency with existing sockets APIs: We define a
       sockets mapping for SCTP that is consistent with other sockets
       API protocol mappings (for instance UDP, TCP, IPv4, and IPv6).

   2.  Support a one-to-many style interface: This set of semantics is
       similar to that defined for connectionless protocols, such as
       UDP.  A one-to-many style SCTP socket should be able to control
       multiple SCTP associations.  This is similar to a UDP socket,
       which can communicate with many peer endpoints.  Each of these
       associations is assigned an association identifier so that an
       application can use the ID to differentiate them.  Note that SCTP
       is connection-oriented in nature, and it does not support
       broadcast or multicast communications, as UDP does.

   3.  Support a one-to-one style interface: This interface supports
       semantics similar to those of sockets for connection-oriented
       protocols, such as TCP.  A one-to-one style SCTP socket should
       control only one SCTP association.  One purpose of defining this
       interface is to allow existing applications built on other
       connection-oriented protocols to be ported to SCTP with very
       little effort.  Developers familiar with these semantics can
       easily adapt to SCTP.  Another purpose is to make sure that
       existing mechanisms in most operating systems that support
       sockets, such as select(), continue to work with this style of
       socket.  Extensions are added to this mapping to provide
       mechanisms to exploit new features of SCTP.

   Goals 2 and 3 are not compatible, so this document defines two modes
   of mapping, namely the one-to-many style mapping and the one-to-one
   style mapping.  These two modes share some common data structures and
   operations, but will require the use of two different application
   programming styles.  Note that all new SCTP features can be used with
   both styles of socket.  The decision on which one to use depends
   mainly on the nature of the applications.

   A mechanism is defined to extract an SCTP association from a one-to-
   many style socket into a one-to-one style socket.

   Some of the SCTP mechanisms cannot be adequately mapped to an
   existing socket interface.  In some cases, it is more desirable to
   have a new interface instead of using existing socket calls.
   Section 9 of this document describes these new interfaces.

   Please note that some elements of the SCTP sockets API are declared
   as deprecated.  During the evolution of this document, elements of
   the API were introduced, implemented, and later on replaced by other
   elements.  These replaced elements are declared as deprecated, since
   they are still available in some implementations and the replacement
   functions are not.  This applies especially to older versions of
   operating systems supporting SCTP.  New SCTP socket implementations
   must implement at least the non-deprecated elements.  Implementations
   intending interoperability with older versions of the API should also
   include the deprecated functions.

2. Data Types

Whenever possible, Portable Operating System Interface (POSIX) data types defined in [IEEE-1003.1-2008] are used: uintN_t means an unsigned integer of exactly N bits (e.g., uint16_t). This document also assumes the argument data types from POSIX when possible (e.g., the final argument to setsockopt() is a socklen_t value). Whenever buffer sizes are specified, the POSIX size_t data type is used.

3. One-to-Many Style Interface

In the one-to-many style interface, there is a one-to-many relationship between sockets and associations.

3.1. Basic Operation

   A typical server in this style uses the following socket calls in
   sequence to prepare an endpoint for servicing requests:

   o  socket()
   o  bind()
   o  listen()
   o  recvmsg()
   o  sendmsg()
   o  close()

   A typical client uses the following calls in sequence to set up an
   association with a server to request services:

   o  socket()
   o  sendmsg()
   o  recvmsg()
   o  close()

   In this style, by default, all of the associations connected to the
   endpoint are represented with a single socket.  Each association is
   assigned an association identifier (the type is sctp_assoc_t) so that
   an application can use it to differentiate among them.  In some
   implementations, the peer endpoints' addresses can also be used for
   this purpose.  But this is not required for performance reasons.  If
   an implementation does not support using addresses to differentiate
   between different associations, the sendto() call can only be used to
   set up an association implicitly.  It cannot be used to send data to
   an established association, as the association identifier cannot be
   specified.

   Once an association identifier is assigned to an SCTP association,
   that identifier will not be reused until the application explicitly
   terminates the use of the association.  The resources belonging to
   that association will not be freed until that happens.  This is
   similar to the close() operation on a normal socket.  The only
   exception is when the SCTP_AUTOCLOSE option (Section 8.1.8) is set.
   In this case, after the association is terminated gracefully and
   automatically, the association identifier assigned to it can be
   reused.  All applications using this option should be aware of this
   to avoid the possible problem of sending data to an incorrect peer
   endpoint.

   If the server or client wishes to branch an existing association off
   to a separate socket, it is required to call sctp_peeloff() and to
   specify the association identifier.  The sctp_peeloff() call will
   return a new one-to-one style socket that can then be used with
   recv() and send() functions for message passing.  See Section 9.2 for
   more on branched-off associations.

   Once an association is branched off to a separate socket, it becomes
   completely separated from the original socket.  All subsequent
   control and data operations to that association must be done through
   the new socket.  For example, the close() operation on the original
   socket will not terminate any associations that have been branched
   off to a different socket.

   One-to-many style socket calls are discussed in more detail in the
   following subsections.

3.1.1. socket()

   Applications use socket() to create a socket descriptor to represent
   an SCTP endpoint.

   The function prototype is

   int socket(int domain,
              int type,
              int protocol);

   and one uses PF_INET or PF_INET6 as the domain, SOCK_SEQPACKET as the
   type, and IPPROTO_SCTP as the protocol.

   Here, SOCK_SEQPACKET indicates the creation of a one-to-many style
   socket.

   The function returns a socket descriptor, or -1 in case of an error.

   Using the PF_INET domain indicates the creation of an endpoint that
   can use only IPv4 addresses, while PF_INET6 creates an endpoint that
   can use both IPv6 and IPv4 addresses.
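
   As an illustration of the call described above, the following is a
   minimal sketch, not a definitive implementation.  The helper name
   open_one_to_many() is hypothetical and is not part of the API
   described in this document; the sketch also assumes that the
   platform's <netinet/in.h> defines IPPROTO_SCTP.

```c
#include <assert.h>
#include <sys/socket.h>
#include <netinet/in.h>   /* assumed to provide IPPROTO_SCTP */

/* Hypothetical helper (not defined by this document): create a
 * one-to-many style SCTP socket.  domain is PF_INET for an endpoint
 * that can use only IPv4 addresses, or PF_INET6 for an endpoint that
 * can use both IPv6 and IPv4 addresses.
 * Returns the socket descriptor, or -1 with errno set by socket(). */
int open_one_to_many(int domain)
{
    assert(domain == PF_INET || domain == PF_INET6);

    /* SOCK_SEQPACKET selects the one-to-many style. */
    return socket(domain, SOCK_SEQPACKET, IPPROTO_SCTP);
}
```

   On a system without SCTP support, socket() is expected to fail, so
   callers of such a helper should check for -1 before using the
   descriptor.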

3.1.2. bind()

   Applications use bind() to specify with which local address and port
   the SCTP endpoint should associate itself.

   An SCTP endpoint can be associated with multiple addresses.  To do
   this, sctp_bindx() is introduced in Section 9.1 to help applications
   do the job of associating multiple addresses.  But note that an
   endpoint can only be associated with one local port.

   These addresses associated with a socket are the eligible transport
   addresses for the endpoint to send and receive data.  The endpoint
   will also present these addresses to its peers during the association
   initialization process; see [RFC4960].

   After calling bind(), if the endpoint wishes to accept new
   associations on the socket, it must call listen() (see
   Section 3.1.3).

   The function prototype of bind() is

   int bind(int sd,
            struct sockaddr *addr,
            socklen_t addrlen);

   and the arguments are

   sd:  The socket descriptor returned by socket().

   addr:  The address structure (struct sockaddr_in for an IPv4 address
      or struct sockaddr_in6 for an IPv6 address; see [RFC3493]).

   addrlen:  The size of the address structure.

   bind() returns 0 on success and -1 in case of an error.

   If sd is an IPv4 socket, the address passed must be an IPv4 address.
   If the sd is an IPv6 socket, the address passed can either be an IPv4
   or an IPv6 address.

   Applications cannot call bind() multiple times to associate multiple
   addresses to an endpoint.  After the first call to bind(), all
   subsequent calls will return an error.

   If the IP address part of addr is specified as a wildcard (INADDR_ANY
   for an IPv4 address, or as IN6ADDR_ANY_INIT or in6addr_any for an
   IPv6 address), the operating system will associate the endpoint with
   an optimal address set of the available interfaces.  If the IPv4
   sin_port or IPv6 sin6_port is set to 0, the operating system will
   choose an ephemeral port for the endpoint.

   If bind() is not called prior to a sendmsg() call that initiates a
   new association, the system picks an ephemeral port and will choose
   an address set equivalent to binding with a wildcard address.  One of
   those addresses will be the primary address for the association.
   This automatically enables the multi-homing capability of SCTP.

   The completion of this bind() process does not allow the SCTP
   endpoint to accept inbound SCTP association requests.  Until a
   listen() system call, described below, is performed on the socket,
   the SCTP endpoint will promptly reject an inbound SCTP INIT request
   with an SCTP ABORT.
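
   The wildcard and ephemeral-port behavior described above can be
   sketched as follows.  This is an illustrative sketch only; the helper
   name bind_ipv4_wildcard() is hypothetical, and the example assumes an
   IPv4 (PF_INET) socket.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper (not defined by this document): bind an IPv4
 * SCTP socket to the wildcard address.  Passing port 0 asks the
 * operating system to choose an ephemeral port.
 * Returns 0 on success and -1 in case of an error, like bind(). */
int bind_ipv4_wildcard(int sd, uint16_t port)
{
    struct sockaddr_in addr;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* wildcard address */
    addr.sin_port = htons(port);               /* 0: ephemeral port */

    return bind(sd, (struct sockaddr *)&addr, sizeof(addr));
}
```

   Because bind() may be called only once per endpoint, an application
   that needs a specific set of addresses would use sctp_bindx()
   (Section 9.1) rather than calling a helper like this repeatedly.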

3.1.3. listen()

   By default, a one-to-many style socket does not accept new
   association requests.  An application uses listen() to mark a socket
   as being able to accept new associations.

   The function prototype is

   int listen(int sd,
              int backlog);

   and the arguments are

   sd:  The socket descriptor of the endpoint.

   backlog:  If backlog is non-zero, enable listening, else disable
      listening.

   listen() returns 0 on success and -1 in case of an error.

   Note that one-to-many style socket consumers do not need to call
   accept() to retrieve new associations.  Calling accept() on a
   one-to-many style socket should return EOPNOTSUPP.  Rather, new
   associations are accepted automatically, and notifications of the new
   associations are delivered via recvmsg() with the SCTP_ASSOC_CHANGE
   event (if these notifications are enabled).  Clients will typically
   not call listen(), so that they can be assured that only actively
   initiated associations are possible on the socket.  Server or
   peer-to-peer sockets, on the other hand, will always accept new
   associations, so a well-written application using server one-to-many
   style sockets must be prepared to handle new associations from
   unwanted peers.

   Also note that the SCTP_ASSOC_CHANGE event provides the association
   identifier for a new association, so if applications wish to use the
   association identifier as a parameter to other socket calls, they
   should ensure that the SCTP_ASSOC_CHANGE event is enabled.
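
   Since the backlog argument acts as an on/off switch for a one-to-many
   style socket, enabling and disabling listening can be sketched as
   below.  The helper name set_accepting() is hypothetical; the backlog
   value 1 is an arbitrary non-zero choice made for this example.

```c
#include <sys/socket.h>

/* Hypothetical helper (not defined by this document): enable or
 * disable the acceptance of new associations on a one-to-many style
 * socket.  A non-zero backlog enables listening; zero disables it.
 * Returns 0 on success and -1 in case of an error, like listen(). */
int set_accepting(int sd, int enable)
{
    return listen(sd, enable ? 1 : 0);
}
```

   No accept() call follows: new associations are accepted
   automatically and, if enabled, announced through SCTP_ASSOC_CHANGE
   notifications read with recvmsg().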

3.1.4. sendmsg() and recvmsg()

   An application uses the sendmsg() and recvmsg() calls to transmit
   data to and receive data from its peer.

   The function prototypes are

   ssize_t sendmsg(int sd,
                   const struct msghdr *message,
                   int flags);

   and

   ssize_t recvmsg(int sd,
                   struct msghdr *message,
                   int flags);

   using the following arguments:

   sd:  The socket descriptor of the endpoint.

   message:  Pointer to the msghdr structure that contains a single user
      message and possibly some ancillary data.  See Section 5 for a
      complete description of the data structures.

   flags:  No new flags are defined for SCTP at this level.  See
      Section 5 for SCTP-specific flags used in the msghdr structure.

   sendmsg() returns the number of bytes accepted by the kernel or -1 in
   case of an error.  recvmsg() returns the number of bytes received or
   -1 in case of an error.

   As described in Section 5, different types of ancillary data can be
   sent and received along with user data.  When sending, the ancillary
   data is used to specify the send behavior, such as the SCTP stream
   number to use.  When receiving, the ancillary data is used to
   describe the received data, such as the SCTP stream sequence number
   of the message.

   When sending user data with sendmsg(), the application fills the
   msg_name field in the msghdr structure with one of the transport
   addresses of the intended receiver.  If there is no existing association
   between the sender and the intended receiver, the sender's SCTP stack
   will set up a new association and then send the user data (see
   Section 7.5 for more on implicit association setup).  If sendmsg() is
   called with no data and there is no existing association, a new one
   will be established.  The SCTP_INIT type ancillary data can be used
   to change some of the parameters used to set up a new association.
   If sendmsg() is called with NULL data, and there is no existing
   association but the SCTP_ABORT or SCTP_EOF flags are set as described
   in Section 5.3.4, then -1 is returned and errno is set to EINVAL.
   Sending a message using sendmsg() is atomic unless explicit end of
   record (EOR) marking is enabled on the socket specified by sd (see
   Section 8.1.26).
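
   The use of msg_name when sending can be sketched as follows.  This is
   a minimal sketch under stated assumptions: the helper name
   send_to_peer() is hypothetical, no ancillary data is attached (so
   default send parameters apply), and explicit EOR marking is assumed
   to be disabled so that each call sends one complete message.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical helper (not defined by this document): send one user
 * message to the peer identified by a transport address.  If no
 * association with that peer exists yet, the SCTP stack is expected
 * to set one up implicitly before sending the data.
 * Returns the number of bytes accepted or -1, like sendmsg(). */
ssize_t send_to_peer(int sd, const struct sockaddr *peer,
                     socklen_t peerlen, const void *data, size_t len)
{
    struct iovec iov;
    struct msghdr msg;

    iov.iov_base = (void *)data;  /* sendmsg() only reads this buffer */
    iov.iov_len = len;

    memset(&msg, 0, sizeof(msg));
    msg.msg_name = (void *)peer;  /* destination transport address */
    msg.msg_namelen = peerlen;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    return sendmsg(sd, &msg, 0);
}
```

   To change association setup parameters for the implicit case, an
   application would add SCTP_INIT ancillary data to the msghdr, which
   this sketch omits.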

   If a peer sends a SHUTDOWN, an SCTP_SHUTDOWN_EVENT notification will
   be delivered if that notification has been enabled, and no more data
   can be sent to that association.  Any attempt to send more data will
   cause sendmsg() to return with an ESHUTDOWN error.  Note that the
   socket is still open for reading at this point, so it is possible to
   retrieve notifications.

   When receiving a user message with recvmsg(), the msg_name field in
   the msghdr structure will be populated with the source transport
   address of the user data.  The caller of recvmsg() can use this
   address information to determine to which association the received
   user message belongs.  Note that if SCTP_ASSOC_CHANGE events are
   disabled, applications must use the peer transport address provided
   in the msg_name field by recvmsg() to perform correlation to an
   association, since they will not have the association identifier.

   If all data in a single message has been delivered, MSG_EOR will be
   set in the msg_flags field of the msghdr structure (see Section 5.1).

   If the application does not provide enough buffer space to completely
   receive a data message, MSG_EOR will not be set in msg_flags.
   Successive reads will consume more of the same message until the
   entire message has been delivered, and MSG_EOR will be set.

   If the SCTP stack is running low on buffers, it may partially deliver
   a message.  In this case, MSG_EOR will not be set, and more calls to
   recvmsg() will be necessary to completely consume the message.  Only
   one message at a time can be partially delivered in any stream.  The
   socket option SCTP_FRAGMENT_INTERLEAVE controls various aspects of
   what interlacing of messages occurs for both the one-to-one and the
   one-to-many style sockets.  Please consult Section 8.1.20 for further
   details on message delivery options.
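
   The MSG_EOR handling described in this section can be sketched as a
   receive loop.  This is an illustration rather than a definitive
   implementation: the helper name recv_whole_message() is hypothetical,
   and the sketch assumes that notifications are not enabled on the
   socket and that successive reads return parts of the same message
   (that is, no interleaving of different messages).

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical helper (not defined by this document): call recvmsg()
 * repeatedly until MSG_EOR is set in msg_flags, collecting the parts
 * of one user message in buf.  If from is not NULL, it receives the
 * source transport address reported in msg_name.
 * Returns the total number of bytes received, or -1 on error; errno
 * is set to EMSGSIZE if buf fills up before MSG_EOR is seen. */
ssize_t recv_whole_message(int sd, void *buf, size_t cap,
                           struct sockaddr_storage *from)
{
    size_t used = 0;

    for (;;) {
        struct iovec iov;
        struct msghdr msg;
        ssize_t n;

        if (used == cap) {        /* no room left and no MSG_EOR yet */
            errno = EMSGSIZE;
            return -1;
        }

        iov.iov_base = (char *)buf + used;
        iov.iov_len = cap - used;

        memset(&msg, 0, sizeof(msg));
        msg.msg_name = from;
        msg.msg_namelen = from ? sizeof(*from) : 0;
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;

        n = recvmsg(sd, &msg, 0);
        if (n == -1)
            return -1;

        used += (size_t)n;

        /* MSG_EOR marks the end of the message; a zero-length read
         * also ends the loop so that it cannot spin forever. */
        if ((msg.msg_flags & MSG_EOR) || n == 0)
            return (ssize_t)used;
    }
}
```

   An application that enables notifications would additionally have to
   inspect msg_flags to tell notifications apart from user data, which
   is outside the scope of this sketch.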

3.1.5. close()

   Applications use close() to perform graceful shutdown (as described
   in Section 10.1 of [RFC4960]) on all of the associations currently
   represented by a one-to-many style socket.

   The function prototype is

   int close(int sd);

   and the argument is

   sd:  The socket descriptor of the associations to be closed.

   0 is returned on success and -1 in case of an error.

   To gracefully shut down a specific association represented by the
   one-to-many style socket, an application should use the sendmsg()
   call and include the SCTP_EOF flag.  A user may optionally terminate
   an association non-gracefully by using sendmsg() with the SCTP_ABORT
   flag set and possibly passing a user-specified abort code in the data
   field.  Both flags SCTP_EOF and SCTP_ABORT are passed with ancillary
   data (see Section 5.3.4) in the sendmsg() call.

   If sd in the close() call is a branched-off socket representing only
   one association, the shutdown is performed on that association only.

3.1.6. connect()

   An application may use the connect() call in the one-to-many style to
   initiate an association without sending data.

   The function prototype is

   int connect(int sd,
               const struct sockaddr *nam,
               socklen_t len);

   and the arguments are

   sd:  The socket descriptor to which a new association is added.

   nam:  The address structure (struct sockaddr_in for an IPv4 address
      or struct sockaddr_in6 for an IPv6 address; see [RFC3493]).

   len:  The size of the address.

   0 is returned on success and -1 in case of an error.

   Multiple connect() calls can be made on the same socket to create
   multiple associations.  This is different from the semantics of
   connect() on a UDP socket.

   Note that SCTP allows data exchange, similar to T/TCP [RFC1644] (made
   Historic by [RFC6247]), during the association setup phase.  If an
   application wants to do this, it cannot use the connect() call.
   Instead, it should use sendto() or sendmsg() to initiate an
   association.  If it uses sendto() and it wants to change the
   initialization behavior, it needs to use the SCTP_INITMSG socket
   option before calling sendto().  Or it can use sendmsg() with
   SCTP_INIT type ancillary data to initiate an association without
   calling setsockopt().  Note that the implicit setup is supported for
   the one-to-many style sockets.

   SCTP does not support half close semantics.  This means that unlike
   T/TCP, MSG_EOF should not be set in the flags parameter when calling
   sendto() or sendmsg() when the call is used to initiate a connection.
   MSG_EOF is not an acceptable flag with an SCTP socket.
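
   As an illustration of repeated connect() calls on one socket, the
   following sketch adds one association per peer address.  The helper
   names make_peer_v4() and associate_with_peers() are hypothetical, and
   the sketch assumes IPv4 peers and a blocking socket that was created
   in the one-to-many style (SOCK_SEQPACKET with IPPROTO_SCTP).

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: fill *dst from a dotted-quad string and a port.
 * Returns 0 on success, -1 if the string is not a valid IPv4 address. */
int make_peer_v4(struct sockaddr_in *dst, const char *ip,
                 unsigned short port)
{
    memset(dst, 0, sizeof(*dst));
    dst->sin_family = AF_INET;
    dst->sin_port = htons(port);
    return inet_pton(AF_INET, ip, &dst->sin_addr) == 1 ? 0 : -1;
}

/* Hypothetical helper: call connect() once per peer on the same
 * one-to-many style socket.  Returns how many calls succeeded. */
int associate_with_peers(int sd, const struct sockaddr_in *peers,
                         int count)
{
    int i;
    int ok = 0;

    for (i = 0; i < count; i++) {
        if (connect(sd, (const struct sockaddr *)&peers[i],
                    sizeof(peers[i])) == 0)
            ok++;
    }
    return ok;
}
```

   A caller would typically obtain sd from socket(PF_INET,
   SOCK_SEQPACKET, IPPROTO_SCTP) and compare the return value with
   count to detect peers that could not be reached.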

3.2. Non-Blocking Mode

   Some SCTP applications may wish to avoid being blocked when calling a
   socket interface function.

   Once a bind() call and/or subsequent sctp_bindx() calls are complete
   on a one-to-many style socket, an application may set the
   non-blocking option via a fcntl() (such as O_NONBLOCK).  After
   setting the socket to non-blocking mode, the sendmsg() function
   returns immediately.  The success or failure of sending the data
   message (with possible SCTP_INITMSG ancillary data) will be signaled
   by the SCTP_ASSOC_CHANGE event with SCTP_COMM_UP or
   SCTP_CANT_START_ASSOC.  If user data could not be sent (due to an
   SCTP_CANT_START_ASSOC), the sender will also receive an
   SCTP_SEND_FAILED_EVENT event.  Events can be received by the user
   calling recvmsg().  A server (having called listen()) is also
   notified of an association-up event via the reception of an
   SCTP_ASSOC_CHANGE with SCTP_COMM_UP via the calling of recvmsg() and
   possibly the reception of the first data message.
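
   Setting the non-blocking option with fcntl() might look like the
   following minimal sketch.  The helper name set_nonblocking() is
   hypothetical; the calls themselves are the standard POSIX ones and
   are not specific to SCTP.

```c
#include <fcntl.h>

/* Hypothetical helper: add O_NONBLOCK to the descriptor's status flags
 * while preserving the flags that are already set.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int sd)
{
    int flags = fcntl(sd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    if (fcntl(sd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    return 0;
}
```

   Reading the current flags first avoids clearing other status flags
   that the application may have set earlier.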

   To shut down the association gracefully, the user must call sendmsg()
   with no data and with the SCTP_EOF flag set as described in
   Section 5.3.4.  The function returns immediately, and completion of
   the graceful shutdown is indicated by an SCTP_ASSOC_CHANGE
   notification of type SCTP_SHUTDOWN_COMP (see Section 6.1.1).  Note
   that this can also be done using the sctp_sendv() call described in
   Section 9.12.

   It is recommended that an application use caution when using select()
   (or poll()) for writing on a one-to-many style socket, because the
   interpretation of select() on write is implementation specific.
   Generally, a positive return on a select() on write would only
   indicate that one of the associations represented by the one-to-many
   style socket is writable.  An application that writes after the
   select() returns may still block, since the association that was
   writable may not be the destination association of the write call.
   Likewise, select() (or poll()) for reading from a one-to-many style
   socket will only return an indication that one of the associations
   represented by the socket has data to be read.

   An application that wishes to know that a particular association is
   ready for reading or writing should either use the one-to-one style
   or use the sctp_peeloff() function (see Section 9.2) to separate the
   association of interest from the one-to-many style socket.

   Note that some implementations may have an extended select call, such
   as epoll or kqueue, that may escape this limitation and allow a
   select on a specific association of a one-to-many style socket, but
   this is an implementation-specific detail that a portable application
   cannot depend on.

3.3. Special Considerations

   The fact that a one-to-many style socket can provide access to many
   SCTP associations through a single socket descriptor has important
   implications for both application programmers and system programmers
   implementing this API.  A key issue is how buffer space inside the
   sockets layer is managed.  Because this implementation detail
   directly affects how application programmers must write their code to
   ensure correct operation and portability, this section provides some
   guidance to both implementers and application programmers.

   An important feature that SCTP shares with TCP is flow control.
   Specifically, a sender may not send data faster than the receiver can
   consume it.

   For TCP, flow control is typically provided for in the sockets API as
   follows.  If the reader stops reading, the sender queues messages in
   the socket layer until the send socket buffer is completely filled.
   This results in a "stalled connection".  Further attempts to write to
   the socket will block or return the error EAGAIN or EWOULDBLOCK for a
   non-blocking socket.  At some point, either the connection is closed,
   or the receiver begins to read, again freeing space in the output
   queue.

   For one-to-one style SCTP sockets (this includes socket descriptors
   that were separated from a one-to-many style socket with
   sctp_peeloff()), the behavior is identical.  For one-to-many style
   SCTP sockets, there are multiple associations for a single socket,
   which makes the situation more complicated.  If the implementation
   uses a single buffer space allocation shared by all associations, a
   single stalled association can prevent the further sending of data on
   all associations active on a particular one-to-many style socket.

   For a blocking socket, it should be clear that a single stalled
   association can block the entire socket.  For this reason,
   application programmers may want to use non-blocking one-to-many
   style sockets.  The application should at least be able to send
   messages to the non-stalled associations.

   But a non-blocking socket is not sufficient if the API implementer
   has chosen a single shared buffer allocation for the socket.  A
   single stalled association would eventually cause the shared
   allocation to fill, and it would become impossible to send even to
   non-stalled associations.

   The API implementer can solve this problem by providing each
   association with its own allocation of outbound buffer space.  Each
   association should conceptually have as much buffer space as it would
   have if it had its own socket.  As a bonus, this simplifies the
   implementation of sctp_peeloff().

   To ensure that a given stalled association will not prevent other
   non-stalled associations from being writable, application programmers
   should either

   o  demand that the underlying implementation dedicates independent
      buffer space reservation to each association (as suggested
      above), or

   o  verify that their application-layer protocol does not permit large
      amounts of unread data at the receiver (this is true of some
      request-response protocols, for example), or

   o  use one-to-one style sockets for associations that may
      potentially stall (either from the beginning, or by using
      sctp_peeloff() before sending large amounts of data that may cause
      a stalled condition).

4. One-to-One Style Interface

   The goal of this style is to follow as closely as possible the
   current practice of using the sockets interface for a
   connection-oriented protocol such as TCP.  This style enables
   existing applications using connection-oriented protocols to be
   ported to SCTP with very little effort.

   One-to-one style sockets can be connected (explicitly or implicitly)
   at most once, similar to TCP sockets.

   Note that some new SCTP features and some new SCTP socket options can
   only be utilized through the use of sendmsg() and recvmsg() calls;
   see Section 4.1.8.

4.1. Basic Operation

   A typical one-to-one style server uses the following system call
   sequence to prepare an SCTP endpoint for servicing requests:

   o  socket()

   o  bind()

   o  listen()

   o  accept()

   The accept() call blocks until a new association is set up.  It
   returns with a new socket descriptor.  The server then uses the new
   socket descriptor to communicate with the client, using recv() and
   send() calls to get requests and send back responses.

   Then it calls

   o  close()

   to terminate the association.

   A typical client uses the following system call sequence to set up an
   association with a server to request services:

   o  socket()

   o  connect()

   After returning from the connect() call, the client uses send()/
   sendmsg() and recv()/recvmsg() calls to send out requests and receive
   responses from the server.

   The client calls

   o  close()

   to terminate this association when done.

4.1.1. socket()

   Applications call socket() to create a socket descriptor to represent
   an SCTP endpoint.

   The function prototype is

   int socket(int domain,
              int type,
              int protocol);

   and one uses PF_INET or PF_INET6 as the domain, SOCK_STREAM as the
   type, and IPPROTO_SCTP as the protocol.

   Here, SOCK_STREAM indicates the creation of a one-to-one style
   socket.

   Using the PF_INET domain indicates the creation of an endpoint that
   can use only IPv4 addresses, while PF_INET6 creates an endpoint that
   can use both IPv6 and IPv4 addresses.
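
   Creating a one-to-one style socket can be sketched as follows.  The
   wrapper name open_one_to_one() is hypothetical.  On a host without
   SCTP support, socket() is expected to fail and the wrapper returns
   -1 with errno set.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical wrapper: domain is PF_INET (IPv4 only) or PF_INET6
 * (IPv6 and IPv4).  Returns a socket descriptor, or -1 on error. */
int open_one_to_one(int domain)
{
    return socket(domain, SOCK_STREAM, IPPROTO_SCTP);
}
```

   The only difference from creating a TCP socket is the protocol
   argument, which is what makes porting existing TCP code simple.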

4.1.2. bind()

   Applications use bind() to specify with which local address and port
   the SCTP endpoint should associate itself.

   An SCTP endpoint can be associated with multiple addresses.  To do
   this, sctp_bindx() is introduced in Section 9.1 to help applications
   do the job of associating multiple addresses.  But note that an
   endpoint can only be associated with one local port.

   These addresses associated with a socket are the eligible transport
   addresses for the endpoint to send and receive data.  The endpoint
   will also present these addresses to its peers during the association
   initialization process; see [RFC4960].

   The function prototype of bind() is

   int bind(int sd,
            struct sockaddr *addr,
            socklen_t addrlen);

   and the arguments are

   sd:  The socket descriptor returned by socket().

   addr:  The address structure (struct sockaddr_in for an IPv4 address
      or struct sockaddr_in6 for an IPv6 address; see [RFC3493]).

   addrlen:  The size of the address structure.

   If sd is an IPv4 socket, the address passed must be an IPv4 address.
   If sd is an IPv6 socket, the address passed can either be an IPv4 or
   an IPv6 address.

   Applications cannot call bind() multiple times to associate multiple
   addresses to the endpoint.  After the first call to bind(), all
   subsequent calls will return an error.

   If the IP address part of addr is specified as a wildcard (INADDR_ANY
   for an IPv4 address, or as IN6ADDR_ANY_INIT or in6addr_any for an
   IPv6 address), the operating system will associate the endpoint with
   an optimal address set of the available interfaces.  If the IPv4
   sin_port or IPv6 sin6_port is set to 0, the operating system will
   choose an ephemeral port for the endpoint.

   If bind() is not called prior to the connect() call, the system picks
   an ephemeral port and will choose an address set equivalent to
   binding with a wildcard address.  One of these addresses will be the
   primary address for the association.  This automatically enables the
   multi-homing capability of SCTP.

   The completion of this bind() process does not allow the SCTP
   endpoint to accept inbound SCTP association requests.  Until a
   listen() system call, described below, is performed on the socket,
   the SCTP endpoint will promptly reject an inbound SCTP INIT request
   with an SCTP ABORT.
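
   Binding to the IPv4 wildcard address with port 0 and then reading
   back the chosen port can be sketched as follows.  The helper names
   make_wildcard_v4() and bind_any_v4() are hypothetical, and the use of
   getsockname() to learn the ephemeral port is an assumption of this
   sketch rather than something this section prescribes.

```c
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: wildcard IPv4 address, port chosen by the OS. */
void make_wildcard_v4(struct sockaddr_in *addr)
{
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_ANY);
    addr->sin_port = htons(0);
}

/* Hypothetical helper: bind sd to the wildcard address.
 * Returns the chosen port in host byte order, or -1 on error. */
int bind_any_v4(int sd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    make_wildcard_v4(&addr);
    if (bind(sd, (struct sockaddr *)&addr, sizeof(addr)) == -1)
        return -1;
    if (getsockname(sd, (struct sockaddr *)&addr, &len) == -1)
        return -1;
    return ntohs(addr.sin_port);
}
```

   Calling bind_any_v4() a second time on the same descriptor is
   expected to return -1, matching the rule that bind() can be called
   only once per socket.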

4.1.3. listen()

   Applications use listen() to allow the SCTP endpoint to accept
   inbound associations.

   The function prototype is

   int listen(int sd,
              int backlog);

   and the arguments are

   sd:  The socket descriptor of the SCTP endpoint.

   backlog:  Specifies the max number of outstanding associations
      allowed in the socket's accept queue.  These are the associations
      that have finished the four-way initiation handshake (see
      Section 5 of [RFC4960]) and are in the ESTABLISHED state.  Note
      that a backlog of '0' indicates that the caller no longer wishes
      to receive new associations.

   listen() returns 0 on success and -1 in case of an error.

4.1.4. accept()

   Applications use the accept() call to remove an established SCTP
   association from the accept queue of the endpoint.  A new socket
   descriptor will be returned from accept() to represent the newly
   formed association.

   The function prototype is

   int accept(int sd,
              struct sockaddr *addr,
              socklen_t *addrlen);

   and the arguments are

   sd:  The listening socket descriptor.

   addr:  On return, addr (struct sockaddr_in for an IPv4 address or
      struct sockaddr_in6 for an IPv6 address; see [RFC3493]) will
      contain the primary address of the peer endpoint.

   addrlen:  On return, addrlen will contain the size of addr.

   The function returns the socket descriptor for the newly formed
   association on success and -1 in case of an error.
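
   The server-side sequence socket(), bind(), listen(), accept() can be
   sketched as follows.  The helper names make_listener() and
   accept_one() are hypothetical.  The protocol parameter is an addition
   made for this sketch: IPPROTO_SCTP yields the one-to-one style socket
   discussed here, while IPPROTO_TCP allows the same sequence to be
   exercised on a host without SCTP support.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: create an IPv4 socket, bind it to the wildcard
 * address and the given port (0 lets the OS choose), and listen.
 * Returns the listening descriptor, or -1 on error. */
int make_listener(int protocol, unsigned short port, int backlog)
{
    struct sockaddr_in addr;
    int sd = socket(PF_INET, SOCK_STREAM, protocol);

    if (sd == -1)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(sd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(sd, backlog) == -1) {
        close(sd);
        return -1;
    }
    return sd;
}

/* Hypothetical helper: take one established association from the
 * accept queue; *peer receives the peer's primary address. */
int accept_one(int listener, struct sockaddr_in *peer)
{
    socklen_t len = sizeof(*peer);

    return accept(listener, (struct sockaddr *)peer, &len);
}
```

   A server would call make_listener(IPPROTO_SCTP, port, backlog) once
   and then call accept_one() in a loop, closing each returned
   descriptor when the exchange with that client is finished.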

4.1.5. connect()

   Applications use connect() to initiate an association to a peer.

   The function prototype is

   int connect(int sd,
               const struct sockaddr *addr,
               socklen_t addrlen);

   and the arguments are

   sd:  The socket descriptor of the endpoint.

   addr:  The peer's (struct sockaddr_in for an IPv4 address or struct
      sockaddr_in6 for an IPv6 address; see [RFC3493]) address.

   addrlen:  The size of the address.

   connect() returns 0 on success and -1 on error.

   This operation corresponds to the ASSOCIATE primitive described in
   Section 10.1 of [RFC4960].

   The number of outbound streams the new association has is stack
   dependent.  Before connecting, applications can use the SCTP_INITMSG
   option described in Section 8.1.3 to change the number of outbound
   streams.

   If bind() is not called prior to the connect() call, the system picks
   an ephemeral port and will choose an address set equivalent to
   binding with INADDR_ANY and IN6ADDR_ANY_INIT for IPv4 and IPv6
   sockets, respectively.  One of the addresses will be the primary
   address for the association.  This automatically enables the
   multi-homing capability of SCTP.

   Note that SCTP allows data exchange, similar to T/TCP [RFC1644] (made
   Historic by [RFC6247]), during the association setup phase.  If an
   application wants to do this, it cannot use the connect() call.
   Instead, it should use sendto() or sendmsg() to initiate an
   association.  If it uses sendto() and it wants to change the
   initialization behavior, it needs to use the SCTP_INITMSG socket
   option before calling sendto().  Or it can use sendmsg() with
   SCTP_INIT type ancillary data to initiate an association without
   calling setsockopt().  Note that the implicit setup is supported for
   the one-to-one style sockets.

   SCTP does not support half close semantics.  This means that unlike
   T/TCP, MSG_EOF should not be set in the flags parameter when calling
   sendto() or sendmsg() when the call is used to initiate a connection.
   MSG_EOF is not an acceptable flag with an SCTP socket.
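
   The client-side sequence socket(), connect() can be sketched as
   follows.  The helper name connect_one_to_one() is hypothetical, the
   sketch handles IPv4 literals only, and the protocol parameter is an
   addition made for this sketch (IPPROTO_SCTP for the socket described
   here).  No bind() is issued, so the system chooses the local port and
   address set.

```c
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: create a socket and initiate an association
 * (or connection) to ip:port.  Returns the connected descriptor, or
 * -1 on error. */
int connect_one_to_one(int protocol, const char *ip,
                       unsigned short port)
{
    struct sockaddr_in peer;
    int sd;

    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &peer.sin_addr) != 1)
        return -1;              /* not a valid IPv4 literal */

    sd = socket(PF_INET, SOCK_STREAM, protocol);
    if (sd == -1)
        return -1;

    /* No bind(): the system picks the port and local address set. */
    if (connect(sd, (struct sockaddr *)&peer, sizeof(peer)) == -1) {
        close(sd);
        return -1;
    }
    return sd;
}
```

   An application that wants a specific number of outbound streams
   would set the SCTP_INITMSG option between the socket() and connect()
   calls; that step is omitted here.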

4.1.6. close()

   Applications use close() to gracefully close down an association.

   The function prototype is

   int close(int sd);

   and the argument is

   sd:  The socket descriptor of the association to be closed.

   close() returns 0 on success and -1 in case of an error.

   After an application calls close() on a socket descriptor, no further
   socket operations will succeed on that descriptor.

4.1.7. shutdown()

   SCTP differs from TCP in that it does not have half close semantics.
   Hence, the shutdown() call for SCTP is an approximation of the TCP
   shutdown() call, and solves some different problems.  Full TCP
   compatibility is not provided, so developers porting TCP applications
   to SCTP may need to recode sections that use shutdown().  (Note that
   it is possible to achieve the same results as half close in SCTP
   using SCTP streams.)

   The function prototype is

   int shutdown(int sd,
                int how);

   and the arguments are

   sd:  The socket descriptor of the association to be closed.

   how:  Specifies the type of shutdown.  The values are as follows:

      SHUT_RD:  Disables further receive operations.  No SCTP protocol
         action is taken.

      SHUT_WR:  Disables further send operations, and initiates the SCTP
         shutdown sequence.

      SHUT_RDWR:  Disables further send and receive operations, and
         initiates the SCTP shutdown sequence.

   shutdown() returns 0 on success and -1 in case of an error.

   The major difference between SCTP and TCP shutdown() is that SCTP
   SHUT_WR initiates immediate and full protocol shutdown, whereas TCP
   SHUT_WR causes TCP to go into the half close state.  SHUT_RD behaves
   the same for SCTP as for TCP.  The purpose of SCTP SHUT_WR is to
   close the SCTP association while still leaving the socket descriptor
   open.  This allows the caller to receive back any data that SCTP is
   unable to deliver (see Section 6.1.4 for more information) and
   receive event notifications.
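
   Because the descriptor stays open after SHUT_WR, an application can
   keep reading until nothing more arrives.  The following minimal
   sketch shows that pattern.  The helper name shutdown_and_drain() is
   hypothetical.  The sketch assumes that recv() eventually returns 0
   once the association is gone, and it does not parse event
   notifications, which a real SCTP application would read with
   recvmsg().

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: stop sending, then read until recv() reports
 * that no more data will arrive.  Returns the number of bytes read
 * after the shutdown() call, or -1 on error. */
ssize_t shutdown_and_drain(int sd, char *buf, size_t cap)
{
    ssize_t total = 0;

    if (shutdown(sd, SHUT_WR) == -1)
        return -1;

    for (;;) {
        ssize_t n = recv(sd, buf, cap, 0);

        if (n < 0)
            return -1;
        if (n == 0)
            break;          /* nothing more will arrive */
        total += n;         /* a real caller would process buf here */
    }
    return total;
}
```

   The caller still has to call close() afterwards to release the
   descriptor.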

   To perform the ABORT operation described in Section 10.1 of
   [RFC4960], an application can use the socket option SO_LINGER.
   SO_LINGER is described in Section 8.1.4.
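
   Requesting an abortive close through SO_LINGER can be sketched as
   follows.  The helper name enable_abortive_close() is hypothetical.
   The sketch assumes, as is conventional for this option, that enabling
   it with a linger time of 0 makes a later close() abort the
   association instead of shutting it down gracefully; Section 8.1.4
   gives the authoritative description.

```c
#include <sys/socket.h>

/* Hypothetical helper: enable SO_LINGER with a linger time of 0.
 * Returns 0 on success, -1 on error.  The caller still has to call
 * close() to trigger the abort. */
int enable_abortive_close(int sd)
{
    struct linger lg;

    lg.l_onoff = 1;     /* enable the option */
    lg.l_linger = 0;    /* linger time of zero */
    return setsockopt(sd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```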

4.1.8. sendmsg() and recvmsg()

   With a one-to-one style socket, the application can also use
   sendmsg() and recvmsg() to transmit data to and receive data from its
   peer.  The semantics are similar to those used in the one-to-many
   style (see Section 3.1.4), with the following differences:

   1.  When sending, the msg_name field in the msghdr is not used to
       specify the intended receiver; rather, it is used to indicate a
       preferred peer address if the sender wishes to discourage the
       stack from sending the message to the primary address of the
       receiver.  If the socket is connected and the transport address
       given is not part of the current association, the data will not
       be sent, and an SCTP_SEND_FAILED_EVENT event will be delivered to
       the application if send failure events are enabled.

   2.  Using sendmsg() on a non-connected one-to-one style socket for
       implicit connection setup may or may not work, depending on the
       SCTP implementation.
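
   Filling in msg_name to express a preferred peer address can be
   sketched as follows.  The helper name send_preferring() is
   hypothetical.  Passing NULL for pref leaves the choice of destination
   address to the stack.  No ancillary data is attached in this sketch.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Hypothetical helper: send one message on a connected one-to-one
 * style socket, optionally naming a preferred peer address.
 * Returns the number of bytes sent, or -1 on error. */
ssize_t send_preferring(int sd, const void *data, size_t len,
                        const struct sockaddr *pref, socklen_t preflen)
{
    struct iovec iov;
    struct msghdr msg;

    iov.iov_base = (void *)data;    /* sendmsg() only reads the data */
    iov.iov_len = len;

    memset(&msg, 0, sizeof(msg));
    msg.msg_name = (void *)pref;    /* NULL: no preference */
    msg.msg_namelen = pref ? preflen : 0;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    return sendmsg(sd, &msg, 0);
}
```

   If pref names an address that is not part of the association, the
   first difference listed above applies and the data is not sent.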

4.1.9. getpeername()

   Applications use getpeername() to retrieve the primary socket address
   of the peer.  This call is for TCP compatibility and is not
   multi-homed.  It may not work with one-to-many style sockets,
   depending on the implementation.  See Section 9.3 for a multi-homed
   style version of the call.

   The function prototype is

   int getpeername(int sd,
                   struct sockaddr *address,
                   socklen_t *len);

   and the arguments are

   sd:  The socket descriptor to be queried.

   address:  On return, the peer primary address is stored in this
      buffer.  If the socket is an IPv4 socket, the address will be
      IPv4.  If the socket is an IPv6 socket, the address will be either
      an IPv6 or IPv4 address.

   len:  The caller should set the length of address here.  On return,
      this is set to the length of the returned address.

   getpeername() returns 0 on success and -1 in case of an error.

   If the actual length of the address is greater than the length of the
   supplied sockaddr structure, the stored address will be truncated.
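
   One way to avoid truncation is to pass a buffer large enough for any
   address family, as in the following minimal sketch.  The helper name
   peer_family() is hypothetical; struct sockaddr_storage is the
   standard type sized for this purpose.

```c
#include <sys/socket.h>

/* Hypothetical helper: return the address family of the peer's
 * primary address, or -1 on error. */
int peer_family(int sd)
{
    struct sockaddr_storage ss;
    socklen_t len = sizeof(ss);

    if (getpeername(sd, (struct sockaddr *)&ss, &len) == -1)
        return -1;
    return ss.ss_family;
}
```

   The caller can then cast the buffer to struct sockaddr_in or struct
   sockaddr_in6 according to the returned family.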


