Internet Engineering Task Force (IETF)                  J. Korhonen, Ed.
Request for Comments: 7683                          Broadcom Corporation
Category: Standards Track                                S. Donovan, Ed.
ISSN: 2070-1721                                              B. Campbell
                                                                  Oracle
                                                               L. Morand
                                                             Orange Labs
                                                            October 2015

                Diameter Overload Indication Conveyance

Abstract
This specification defines a base solution for Diameter overload control, referred to as Diameter Overload Indication Conveyance (DOIC).

Status of This Memo

This is an Internet Standards Track document.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7683.

Copyright Notice

Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Terminology and Abbreviations
3. Conventions Used in This Document
4. Solution Overview
   4.1. Piggybacking
   4.2. DOIC Capability Announcement
   4.3. DOIC Overload Condition Reporting
   4.4. DOIC Extensibility
   4.5. Simplified Example Architecture
5. Solution Procedures
   5.1. Capability Announcement
      5.1.1. Reacting Node Behavior
      5.1.2. Reporting Node Behavior
      5.1.3. Agent Behavior
   5.2. Overload Report Processing
      5.2.1. Overload Control State
      5.2.2. Reacting Node Behavior
      5.2.3. Reporting Node Behavior
   5.3. Protocol Extensibility
6. Loss Algorithm
   6.1. Overview
   6.2. Reporting Node Behavior
   6.3. Reacting Node Behavior
7. Attribute Value Pairs
   7.1. OC-Supported-Features AVP
   7.2. OC-Feature-Vector AVP
   7.3. OC-OLR AVP
   7.4. OC-Sequence-Number AVP
   7.5. OC-Validity-Duration AVP
   7.6. OC-Report-Type AVP
   7.7. OC-Reduction-Percentage AVP
   7.8. AVP Flag Rules
8. Error Response Codes
9. IANA Considerations
   9.1. AVP Codes
   9.2. New Registries
10. Security Considerations
   10.1. Potential Threat Modes
   10.2. Denial-of-Service Attacks
   10.3. Noncompliant Nodes
   10.4. End-to-End Security Issues
11. References
   11.1. Normative References
   11.2. Informative References
Appendix A. Issues Left for Future Specifications
   A.1. Additional Traffic Abatement Algorithms
   A.2. Agent Overload
   A.3. New Error Diagnostic AVP
Appendix B. Deployment Considerations
Appendix C. Considerations for Applications Integrating the DOIC Solution
   C.1. Application Classification
   C.2. Implications of Application Type Overload
   C.3. Request Transaction Classification
   C.4. Request Type Overload Implications
Contributors
Authors' Addresses

1. Introduction
This specification defines a base solution for Diameter overload control, referred to as Diameter Overload Indication Conveyance (DOIC), based on the requirements identified in [RFC7068]. This specification addresses Diameter overload control between Diameter nodes that support the DOIC solution. The solution, which is designed to apply to existing and future Diameter applications, requires no changes to the Diameter base protocol [RFC6733] and is deployable in environments where some Diameter nodes do not implement the Diameter overload control solution defined in this specification.

A new application specification can incorporate the overload control mechanism specified in this document by making it mandatory to implement for the application and referencing this specification normatively. It is the responsibility of the Diameter application designers to define how overload control mechanisms work on that application.

Note that the overload control solution defined in this specification does not address all the requirements listed in [RFC7068]. A number of features related to overload control are left for future specifications. See Appendix A for a list of extensions that are currently being considered.

2. Terminology and Abbreviations
Abatement
   Reaction to receipt of an overload report resulting in a reduction in traffic sent to the reporting node. Abatement actions include diversion and throttling.

Abatement Algorithm
   An extensible method requested by reporting nodes and used by reacting nodes to reduce the amount of traffic sent during an occurrence of overload control.

Diversion
   An overload abatement treatment where the reacting node selects alternate destinations or paths for requests.

Host-Routed Requests
   Requests that a reacting node knows will be served by a particular host, either due to the presence of a Destination-Host Attribute Value Pair (AVP) or by some other local knowledge on the part of the reacting node.

Overload Control State (OCS)
   Internal state maintained by a reporting or reacting node describing occurrences of overload control.

Overload Report (OLR)
   Overload control information for a particular overload occurrence sent by a reporting node.

Reacting Node
   A Diameter node that acts upon an overload report.

Realm-Routed Requests
   Requests sent by a reacting node where the reacting node does not know to which host the request will be routed.

Reporting Node
   A Diameter node that generates an overload report. (This may or may not be the overloaded node.)

Throttling
   An abatement treatment that limits the number of requests sent by the reacting node. Throttling can include a Diameter Client choosing to not send requests, or a Diameter Agent or Server rejecting requests with appropriate error responses. In both cases, the result of the throttling is a permanent rejection of the transaction.

3. Conventions Used in This Document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119].

The interpretation from RFC 2119 [RFC2119] does not apply for the above listed words when they are not used in all caps.

4. Solution Overview
The Diameter Overload Indication Conveyance (DOIC) solution allows Diameter nodes to request that other Diameter nodes perform overload abatement actions, that is, actions to reduce the load offered to the overloaded node or realm.

A Diameter node that supports DOIC is known as a "DOIC node". Any Diameter node can act as a DOIC node, including Diameter Clients, Diameter Servers, and Diameter Agents. DOIC nodes are further divided into "Reporting Nodes" and "Reacting Nodes". A reporting node requests overload abatement by sending Overload Reports (OLRs).

A reacting node acts upon OLRs and performs whatever actions are needed to fulfill the abatement requests included in the OLRs. A reporting node may report overload on its own behalf or on behalf of other nodes. Likewise, a reacting node may perform overload abatement on its own behalf or on behalf of other nodes.

A Diameter node's role as a DOIC node is independent of its Diameter role. For example, Diameter Agents may act as DOIC nodes, even though they are not endpoints in the Diameter sense. Since Diameter enables bidirectional applications, where Diameter Servers can send requests towards Diameter Clients, a given Diameter node can simultaneously act as both a reporting node and a reacting node.

Likewise, a Diameter Agent may act as a reacting node from the perspective of upstream nodes, and a reporting node from the perspective of downstream nodes.
DOIC nodes do not generate new messages to carry DOIC-related information. Rather, they "piggyback" DOIC information over existing Diameter messages by inserting new AVPs into existing Diameter requests and responses. Nodes indicate support for DOIC, and any needed DOIC parameters, by inserting an OC-Supported-Features AVP (Section 7.1) into existing requests and responses. Reporting nodes send OLRs by inserting OC-OLR AVPs (Section 7.3).

A given OLR applies to the Diameter realm and application of the Diameter message that carries it. If a reporting node supports more than one realm and/or application, it reports independently for each combination of realm and application. Similarly, the OC-Supported-Features AVP applies to the realm and application of the enclosing message. This implies that a node may support DOIC for one application and/or realm, but not another, and may indicate different DOIC parameters for each application and realm for which it supports DOIC.

Reacting nodes perform overload abatement according to an agreed-upon abatement algorithm. An abatement algorithm defines the meaning of some of the parameters of an OLR and the procedures required for overload abatement. An overload abatement algorithm separates Diameter requests into two sets. The first set contains the requests that are to undergo overload abatement treatment of either throttling or diversion. The second set contains the requests that are to be given normal routing treatment. This document specifies a single "must-support" algorithm, namely, the "loss" algorithm (Section 6). Future specifications may introduce new algorithms.

Overload conditions may vary in scope. For example, a single Diameter node may be overloaded, in which case, reacting nodes may attempt to send requests to other destinations. On the other hand, an entire Diameter realm may be overloaded, in which case, such attempts would do harm. DOIC OLRs have a concept of "report type" (Section 7.6), where the type defines such behaviors. Report types are extensible. This document defines report types for overload of a specific host and for overload of an entire realm.

DOIC works through non-supporting Diameter Agents that properly pass unknown AVPs unchanged.

4.1. Piggybacking
There is no new Diameter application defined to carry overload-related AVPs. The overload control AVPs defined in this specification have been designed to be piggybacked on top of existing application messages. This is made possible by adding the optional overload control AVPs OC-OLR and OC-Supported-Features into existing commands.

Reacting nodes indicate support for DOIC by including the OC-Supported-Features AVP in all request messages originated or relayed by the reacting node.

Reporting nodes indicate support for DOIC by including the OC-Supported-Features AVP in all answer messages that are originated or relayed by the reporting node and that are in response to a request that contained the OC-Supported-Features AVP. Reporting nodes may include overload reports using the OC-OLR AVP in answer messages.

Note that the overload control solution does not have fixed server and client roles. The DOIC node role is determined based on the message type: whether the message is a request (i.e., sent by a "reacting node") or an answer (i.e., sent by a "reporting node"). Therefore, in a typical client-server deployment, the Diameter Client may report its overload condition to the Diameter Server for any Diameter-Server-initiated message exchange. An example of such is the Diameter Server requesting a re-authentication from a Diameter Client.

4.2. DOIC Capability Announcement
The DOIC solution supports the ability for Diameter nodes to determine if other nodes in the path of a request support the solution. This capability is referred to as DOIC Capability Announcement (DCA) and is separate from the Diameter Capability Exchange. The DCA mechanism uses the OC-Supported-Features AVPs to indicate the Diameter overload features supported. The first node in the path of a Diameter request that supports the DOIC solution inserts the OC-Supported-Features AVP in the request message. The individual features supported by the DOIC nodes are indicated in the OC-Feature-Vector AVP. Any semantics associated with the features will be defined in extension specifications that introduce the features. Note: As discussed elsewhere in the document, agents in the path of the request can modify the OC-Supported-Features AVP.
Note: The DOIC solution must support deployments where Diameter Clients and/or Diameter Servers do not support the DOIC solution. In this scenario, Diameter Agents that support the DOIC solution may handle overload abatement for the non-supporting Diameter nodes. In this case, the DOIC agent will insert the OC-Supported-Features AVP in requests that do not already contain one, telling the reporting node that there is a DOIC node that will handle overload abatement. For transactions where there was an OC-Supported-Features AVP in the request, the agent will insert the OC-Supported-Features AVP in answers, telling the reacting node that there is a reporting node.

The OC-Feature-Vector AVP will always contain an indication of support for the loss overload abatement algorithm defined in this specification (see Section 6). This ensures that a reporting node always supports at least one of the advertised abatement algorithms received in a request message.

The reporting node inserts the OC-Supported-Features AVP in all answer messages to requests that contained the OC-Supported-Features AVP. The contents of the reporting node's OC-Supported-Features AVP indicate the set of Diameter overload features supported by the reporting node. This specification defines one exception -- the reporting node only includes an indication of support for one overload abatement algorithm, independent of the number of overload abatement algorithms actually supported by the reacting node. The overload abatement algorithm indicated is the algorithm that the reporting node intends to use should it enter an overload condition. Reacting nodes can use the indicated overload abatement algorithm to prepare for possible overload reports and must use the indicated overload abatement algorithm if traffic reduction is actually requested.

Note that the loss algorithm defined in this document is a stateless abatement algorithm. As a result, it does not require any actions by reacting nodes prior to the receipt of an overload report. Stateful abatement algorithms that base the abatement logic on a history of request messages sent might require reacting nodes to maintain state in advance of receiving an overload report to ensure that the overload reports can be properly handled.

While it should only be done in exceptional circumstances and not during an active occurrence of overload, a reacting node that wishes to transition to a different abatement algorithm can stop advertising support for the algorithm indicated by the reporting node, as long as support for the loss algorithm is always advertised.
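The selection rule described above (the reporting node answers with exactly one abatement algorithm, chosen from those the request advertised, with loss always available) can be sketched as follows. This is a minimal illustration, not normative text. It assumes the feature vector is handled as an integer bit mask, uses the OLR_DEFAULT_ALGO bit from Section 7.2 for the loss algorithm, and invents a second bit (HYPOTHETICAL_RATE_ALGO) and the function name select_algorithm purely for the example.

```python
from typing import Optional, Sequence

# Loss algorithm feature bit, as named in Section 7.2 (OC-Feature-Vector AVP).
OLR_DEFAULT_ALGO = 0x0000000000000001
# Hypothetical extension algorithm bit, used only for this illustration.
HYPOTHETICAL_RATE_ALGO = 0x0000000000000002


def select_algorithm(advertised: Optional[int], preferred: Sequence[int]) -> int:
    """Return the single algorithm bit a reporting node places in its answer.

    advertised: feature vector from the request, or None if the
                OC-Feature-Vector AVP was absent (treated as loss only).
    preferred:  the reporting node's algorithms, most preferred first.
    """
    if advertised is None:
        advertised = OLR_DEFAULT_ALGO
    for algo in preferred:
        # Only an algorithm that the reacting node advertised may be selected.
        if advertised & algo:
            return algo
    # Loss is always supported by a compliant reacting node, so it is the fallback.
    return OLR_DEFAULT_ALGO
```

For example, a reporting node that prefers the hypothetical algorithm would still select loss for a request whose feature vector advertises only OLR_DEFAULT_ALGO.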
The DCA mechanism must also allow the scenario where the set of features supported by the sender of a request and by agents in the path of a request differ. In this case, the agent can update the OC-Supported-Features AVP to reflect the mixture of the two sets of supported features.

Note: The logic to determine if the content of the OC-Supported-Features AVP should be changed is out of scope for this document, as is the logic to determine the content of a modified OC-Supported-Features AVP. These are left to implementation decisions. Care must be taken not to introduce interoperability issues for downstream or upstream DOIC nodes. As such, the agent must act as a fully compliant reporting node to the downstream reacting node and as a fully compliant reacting node to the upstream reporting node.

4.3. DOIC Overload Condition Reporting
As with DOIC capability announcement, overload condition reporting uses new AVPs (Section 7.3) to indicate an overload condition. The OC-OLR AVP is referred to as an overload report. The OC-OLR AVP includes the type of report, a sequence number, the length of time that the report is valid, and AVPs specific to the abatement algorithm. Two types of overload reports are defined in this document: host reports and realm reports. A report of type "HOST_REPORT" is sent to indicate the overload of a specific host, identified by the Origin-Host AVP of the message containing the OLR, for the Application-ID indicated in the transaction. When receiving an OLR of type "HOST_REPORT", a reacting node applies overload abatement treatment to the host-routed requests identified by the overload abatement algorithm (as defined in Section 2) sent for this application to the overloaded host. A report of type "REALM_REPORT" is sent to indicate the overload of a realm for the Application-ID indicated in the transaction. The overloaded realm is identified by the Destination-Realm AVP of the message containing the OLR. When receiving an OLR of type "REALM_REPORT", a reacting node applies overload abatement treatment to realm-routed requests identified by the overload abatement algorithm (as defined in Section 2) sent for this application to the overloaded realm.
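The scoping rules for the two report types can be sketched as a simple predicate. This is an illustrative sketch only: the Request and Report classes and the function report_applies are hypothetical names, the numeric report type values are those given in Section 7.6, and a request is treated as host-routed only when it carries a Destination-Host value (the "other local knowledge" case from Section 2 is not modeled).

```python
from dataclasses import dataclass
from typing import Optional

# Report type values as listed in Section 7.6 (OC-Report-Type AVP).
HOST_REPORT = 0
REALM_REPORT = 1


@dataclass
class Request:
    application_id: int
    destination_realm: str
    destination_host: Optional[str] = None  # None means realm-routed here


@dataclass
class Report:
    report_type: int
    application_id: int
    origin_host: str  # host that the HOST_REPORT refers to
    realm: str        # realm that the REALM_REPORT refers to


def report_applies(report: Report, request: Request) -> bool:
    """True if the request is in the scope of the overload report."""
    if report.application_id != request.application_id:
        return False
    if report.report_type == HOST_REPORT:
        # Host reports cover host-routed requests to the overloaded host.
        return request.destination_host == report.origin_host
    if report.report_type == REALM_REPORT:
        # Realm reports cover realm-routed requests to the overloaded realm.
        return (request.destination_host is None
                and request.destination_realm == report.realm)
    return False  # unknown report types are ignored in this sketch
```

Whether an in-scope request actually receives abatement treatment is then decided by the abatement algorithm, not by this predicate.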
This document assumes that there is a single source for realm reports for a given realm, or that if multiple nodes can send realm reports, that each such node has full knowledge of the overload state of the entire realm. A reacting node cannot distinguish between receiving realm reports from a single node or from multiple nodes. Note: Known issues exist if there are multiple sources for overload reports that apply to the same Diameter entity. Reacting nodes have no way of determining the source and, as such, will treat them as coming from a single source. Variance in sequence numbers between the two sources can then cause incorrect overload abatement treatment to be applied for indeterminate periods of time. Reporting nodes are responsible for determining the need for a reduction of traffic. The method for making this determination is implementation specific and depends on the type of overload report being generated. A host report might be generated by tracking use of resources required by the host to handle transactions for the Diameter application. A realm report generally impacts the traffic sent to multiple hosts and, as such, requires tracking the capacity of all servers able to handle realm-routed requests for the application and realm. Once a reporting node determines the need for a reduction in traffic, it uses the DOIC-defined AVPs to report on the condition. These AVPs are included in answer messages sent or relayed by the reporting node. The reporting node indicates the overload abatement algorithm that is to be used to handle the traffic reduction in the OC-Supported-Features AVP. The OC-OLR AVP is used to communicate information about the requested reduction. Reacting nodes, upon receipt of an overload report, apply the overload abatement algorithm to traffic impacted by the overload report. The method used to determine the requests that are to receive overload abatement treatment is dependent on the abatement algorithm. 
The loss abatement algorithm is defined in this document (Section 6). Other abatement algorithms can be defined in extensions to the DOIC solution. Two types of overload abatement treatment are defined, diversion and throttling. Reacting nodes are responsible for determining which treatment is appropriate for individual requests. As the conditions that lead to the generation of the overload report change, the reporting node can send new overload reports requesting greater reduction if the condition gets worse or less reduction if the condition improves. The reporting node sends an overload report
with a duration of zero to indicate that the overload condition has ended and abatement is no longer needed.

The reacting node also determines when the overload report expires based on the OC-Validity-Duration AVP in the overload report and stops applying the abatement algorithm when the report expires.

Note that erroneous overload reports can be used for DoS attacks. This includes the ability to indicate that a significant reduction in traffic, up to and including a request for no traffic, should be sent to a reporting node. As such, care should be taken to verify the sender of overload reports.

4.4. DOIC Extensibility
The DOIC solution is designed to be extensible. This extensibility is based on existing Diameter-based extensibility mechanisms, along with the DOIC capability announcement mechanism.

There are multiple categories of extensions that are expected. This includes the definition of new overload abatement algorithms, the definition of new report types, and the definition of new scopes of messages impacted by an overload report.

A DOIC node communicates supported features by including them in the OC-Feature-Vector AVP, as a sub-AVP of OC-Supported-Features. Any non-backwards-compatible DOIC extensions define new values for the OC-Feature-Vector AVP. DOIC extensions also have the ability to add new AVPs to the OC-Supported-Features AVP, if additional information about the new feature is required.

Overload reports can also be extended by adding new sub-AVPs to the OC-OLR AVP, allowing reporting nodes to communicate additional information about handling an overload condition.

If necessary, new extensions can also define new AVPs that are not part of the OC-Supported-Features and OC-OLR group AVPs. It is, however, recommended that DOIC extensions use the OC-Supported-Features AVP and OC-OLR AVP to carry all DOIC-related AVPs.
4.5. Simplified Example Architecture
Figure 1 illustrates the simplified architecture for Diameter overload information conveyance.

    Realm X                                  Same or other Realms
   <--------------------------------------> <---------------------->

      +--------+                 : (optional) :
      |Diameter|                 :            :
      |Server A|--+     .--.     : +--------+ :     .--.
      +--------+  |   _(    `.   : |Diameter| :   _(    `.   +--------+
                  +--(        )--:-|  Agent |-:--(        )--|Diameter|
      +--------+  | ( `  .  )  ) : +--------+ : ( `  .  )  ) | Client |
      |Diameter|--+  `--(___.-'  :            :  `--(___.-'  +--------+
      |Server B|                 :            :
      +--------+                 :            :

                          End-to-end Overload Indication
             1)  <----------------------------------------------->
                             Diameter Application Y

                  Overload Indication A    Overload Indication A'
             2)  <----------------------> <---------------------->
                  Diameter Application Y   Diameter Application Y

     Figure 1: Simplified Architecture Choices for Overload Indication
                                 Delivery

In Figure 1, the Diameter overload indication can be conveyed (1) end-to-end between servers and clients or (2) between servers and the Diameter Agent inside the realm and then between the Diameter Agent and the clients.

5. Solution Procedures
This section outlines the normative behavior for the DOIC solution.

5.1. Capability Announcement
This section defines DOIC Capability Announcement (DCA) behavior.

Note: This specification assumes that changes in DOIC node capabilities are relatively rare events that occur as a result of administrative action. Reacting nodes ought to minimize changes that force the reporting node to change the features being used, especially during active overload conditions. But even if reacting nodes avoid such changes, reporting nodes still have to be prepared for them to occur. For example, differing capabilities between multiple reacting nodes may still force a reporting node to select different features on a per-transaction basis.

5.1.1. Reacting Node Behavior
A reacting node MUST include the OC-Supported-Features AVP in all requests. It MAY include the OC-Feature-Vector AVP, as a sub-AVP of OC-Supported-Features. If it does so, it MUST indicate support for the "loss" algorithm. If the reacting node is configured to support features (including other algorithms) in addition to the loss algorithm, it MUST indicate such support in an OC-Feature-Vector AVP.

An OC-Supported-Features AVP in answer messages indicates there is a reporting node for the transaction. The reacting node MAY take action, for example, creating state for some stateful abatement algorithm, based on the features indicated in the OC-Feature-Vector AVP.

Note: The loss abatement algorithm does not require stateful behavior when there is no active overload report.

Reacting nodes need to be prepared for the reporting node to change selected algorithms. This can happen at any time, including when the reporting node has sent an active overload report. The reacting node can minimize the potential for changes by modifying the advertised abatement algorithms sent to an overloaded reporting node to the currently selected algorithm and loss (or just loss if it is the currently selected algorithm). This has the effect of limiting the potential change in abatement algorithm from the currently selected algorithm to loss, avoiding changes to more complex abatement algorithms that require state to operate properly.

5.1.2. Reporting Node Behavior
Upon receipt of a request message, a reporting node determines if there is a reacting node for the transaction based on the presence of the OC-Supported-Features AVP in the request message. If the request message contains an OC-Supported-Features AVP, then a reporting node MUST include the OC-Supported-Features AVP in the answer message for that transaction. Note: Capability announcement is done on a per-transaction basis. The reporting node cannot assume that the capabilities announced by a reacting node will be the same between transactions.
A reporting node MUST NOT include the OC-Supported-Features AVP, OC-OLR AVP, or any other overload control AVPs defined in extension documents in response messages for transactions where the request message does not include the OC-Supported-Features AVP. Lack of the OC-Supported-Features AVP in the request message indicates that there is no reacting node for the transaction.

A reporting node knows what overload control functionality is supported by the reacting node based on the content or absence of the OC-Feature-Vector AVP within the OC-Supported-Features AVP in the request message.

A reporting node MUST select a single abatement algorithm in the OC-Feature-Vector AVP. The abatement algorithm selected MUST indicate the abatement algorithm the reporting node wants the reacting node to use when the reporting node enters an overload condition.

The abatement algorithm selected MUST be from the set of abatement algorithms contained in the request message's OC-Feature-Vector AVP.

A reporting node that selects the loss algorithm may do so by including the OC-Feature-Vector AVP with an explicit indication of the loss algorithm, or it MAY omit the OC-Feature-Vector AVP. If it selects a different algorithm, it MUST include the OC-Feature-Vector AVP with an explicit indication of the selected algorithm.

The reporting node SHOULD indicate support for other DOIC features defined in extension documents that it supports and that apply to the transaction. It does so using the OC-Feature-Vector AVP.

Note: Not all DOIC features will apply to all Diameter applications or deployment scenarios. The features included in the OC-Feature-Vector AVP are based on local policy of the reporting node.

5.1.3. Agent Behavior
Diameter Agents that support DOIC can ensure that all messages relayed by the agent contain the OC-Supported-Features AVP. A Diameter Agent MAY take on reacting node behavior for Diameter endpoints that do not support the DOIC solution. A Diameter Agent detects that a Diameter endpoint does not support DOIC reacting node behavior when there is no OC-Supported-Features AVP in a request message.
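The detection rule above can be sketched as follows, assuming for illustration that a request's AVPs are available as a dictionary keyed by AVP name. The function name relay_request and the dictionary layout are hypothetical and do not correspond to any particular Diameter stack; the OLR_DEFAULT_ALGO bit is the loss algorithm value from Section 7.2.

```python
from typing import Any, Dict, Tuple

# Loss algorithm feature bit, as named in Section 7.2.
OLR_DEFAULT_ALGO = 0x0000000000000001


def relay_request(avps: Dict[str, Any],
                  agent_features: int = OLR_DEFAULT_ALGO
                  ) -> Tuple[Dict[str, Any], bool]:
    """Return (AVPs to forward, whether the agent became the reacting node)."""
    forwarded = dict(avps)  # never mutate the caller's copy of the request
    if "OC-Supported-Features" in forwarded:
        # The sender (or an earlier agent) already acts as the reacting node.
        return forwarded, False
    # No DOIC support was announced upstream: the agent announces its own
    # features and takes on reacting node behavior for this transaction.
    forwarded["OC-Supported-Features"] = {"OC-Feature-Vector": agent_features}
    return forwarded, True
```

An agent using such a helper would also need to remember, per transaction, that it inserted the AVP, so that it can act on any OC-OLR AVP in the answer.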
For a Diameter Agent to be a reacting node for a non-supporting Diameter endpoint, the Diameter Agent MUST include the OC-Supported-Features AVP in request messages it relays that do not contain the OC-Supported-Features AVP.

A Diameter Agent MAY take on reporting node behavior for Diameter endpoints that do not support the DOIC solution. The Diameter Agent MUST have visibility to all traffic destined for the non-supporting host in order to become the reporting node for the Diameter endpoint. A Diameter Agent detects that a Diameter endpoint does not support DOIC reporting node behavior when there is no OC-Supported-Features AVP in an answer message for a transaction that contained the OC-Supported-Features AVP in the request message.

If a request already has the OC-Supported-Features AVP, a Diameter Agent MAY modify it to reflect the features appropriate for the transaction. Otherwise, the agent relays the OC-Supported-Features AVP without change.

Example: If the agent supports a superset of the features reported by the reacting node, then the agent might choose, based on local policy, to advertise that superset of features to the reporting node.

If the Diameter Agent changes the OC-Supported-Features AVP in a request message, then it is likely it will also need to modify the OC-Supported-Features AVP in the answer message for the transaction. A Diameter Agent MAY modify the OC-Supported-Features AVP carried in answer messages.

When making changes to the OC-Supported-Features or OC-OLR AVPs, the Diameter Agent needs to ensure consistency in its behavior with both upstream and downstream DOIC nodes.

5.2. Overload Report Processing
5.2.1. Overload Control State
Both reacting and reporting nodes maintain Overload Control State (OCS) for active overload conditions. The following sections define behavior associated with that OCS. The contents of the OCS in the reporting node and in the reacting node represent logical constructs. The actual internal physical structure of the state included in the OCS is an implementation decision.
5.2.1.1. Overload Control State for Reacting Nodes
A reacting node maintains the following OCS per supported Diameter application:

o  a host-type OCS entry for each Destination-Host to which it sends host-type requests and

o  a realm-type OCS entry for each Destination-Realm to which it sends realm-type requests.

A host-type OCS entry is identified by the pair of Application-ID and the node's DiameterIdentity.

A realm-type OCS entry is identified by the pair of Application-ID and realm.

The host-type and realm-type OCS entries include the following information (the actual information stored is an implementation decision):

o  Sequence number (as received in OC-OLR; see Section 7.3)

o  Time of expiry (derived from OC-Validity-Duration AVP received in the OC-OLR AVP and time of reception of the message carrying OC-OLR AVP)

o  Selected abatement algorithm (as received in the OC-Supported-Features AVP)

o  Input data that is abatement algorithm specific (as received in the OC-OLR AVP -- for example, OC-Reduction-Percentage for the loss abatement algorithm)

5.2.1.2. Overload Control State for Reporting Nodes
A reporting node maintains OCS entries per supported Diameter application, per supported (and eventually selected) abatement algorithm, and per report type.

An OCS entry is identified by the tuple of Application-ID, report type, and abatement algorithm, and it includes the following information (the actual information stored is an implementation decision):

o  Sequence number

o  Validity duration

o  Expiration time

o  Input data that is algorithm specific (for example, the reduction percentage for the loss abatement algorithm)

5.2.1.3. Reacting Node's Maintenance of Overload Control State
When a reacting node receives an OC-OLR AVP, it MUST determine if it is for an existing or new overload condition.

Note: For the remainder of this section, the term "OLR" refers to the combination of the contents of the received OC-OLR AVP and the abatement algorithm indicated in the received OC-Supported-Features AVP.

When receiving an answer message with multiple OLRs of different supported report types, a reacting node MUST process each received OLR.

The OLR is for an existing overload condition if a reacting node has an OCS that matches the received OLR.

For a host report, this means it matches the Application-ID and the host's DiameterIdentity in an existing host OCS entry.

For a realm report, this means it matches the Application-ID and the realm in an existing realm OCS entry.

If the OLR is for an existing overload condition, then a reacting node MUST determine if the OLR is a retransmission or an update to the existing OLR.

If the sequence number for the received OLR is greater than the sequence number stored in the matching OCS entry, then a reacting node MUST update the matching OCS entry.

If the sequence number for the received OLR is less than or equal to the sequence number in the matching OCS entry, then a reacting node MUST silently ignore the received OLR. The matching OCS MUST NOT be updated in this case.

If the reacting node determines that the sequence number has rolled over, then the reacting node MUST update the matching OCS entry. This can be determined by recognizing that the number has changed from a value within 1% of the maximum value in the OC-Sequence-Number AVP to a value within 1% of the minimum value in the OC-Sequence-Number AVP.
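The sequence-number rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not a normative algorithm: the `OcsEntry` class, the `process_olr` and `has_rolled_over` helpers, and the string results are hypothetical names. `MAX_SEQ` assumes an unsigned 64-bit sequence number (check the data type given in Section 7.4), and "within 1%" is read here as 1% of that range at either end.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Assumption: OC-Sequence-Number is an unsigned 64-bit value.
MAX_SEQ = 2**64 - 1
# Assumption: "within 1%" means within 1% of the full range.
ROLLOVER_BAND = MAX_SEQ // 100

# (Application-ID, DiameterIdentity or realm) identifies an OCS entry.
OcsKey = Tuple[int, str]


@dataclass
class OcsEntry:
    """Hypothetical reacting-node OCS entry; only the field used here."""
    sequence_number: int


def has_rolled_over(stored: int, received: int) -> bool:
    """True if stored is near the maximum and received is near the minimum."""
    return stored >= MAX_SEQ - ROLLOVER_BAND and received <= ROLLOVER_BAND


def process_olr(table: Dict[OcsKey, OcsEntry], key: OcsKey,
                received_seq: int) -> str:
    """Classify a received OLR and update the table accordingly."""
    entry = table.get(key)
    if entry is None:
        # No matching OCS: this is a new overload condition.
        table[key] = OcsEntry(received_seq)
        return "created"
    if (received_seq > entry.sequence_number
            or has_rolled_over(entry.sequence_number, received_seq)):
        entry.sequence_number = received_seq
        return "updated"
    # Retransmission or stale report: silently ignore, leave OCS untouched.
    return "ignored"
```

A real entry would also carry the expiry time, the selected algorithm, and the algorithm-specific input data listed in Section 5.2.1.1; they are omitted to keep the comparison logic visible.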
If the received OLR is for a new overload condition, then a reacting node MUST generate a new OCS entry for the overload condition.

For a host report, this means a reacting node creates an OCS entry with the Application-ID in the received message and DiameterIdentity of the Origin-Host in the received message.

Note: This solution assumes that the Origin-Host AVP in the answer message included by the reporting node is not changed along the path to the reacting node.

For a realm report, this means a reacting node creates an OCS entry with the Application-ID in the received message and realm of the Origin-Realm in the received message.

If the received OLR contains a validity duration of zero ("0"), then a reacting node MUST update the OCS entry as being expired.

Note: It is not necessarily appropriate to delete the OCS entry, as the recommended behavior is that the reacting node slowly returns to full traffic when ending an overload abatement period.

The reacting node does not delete an OCS when receiving an answer message that does not contain an OC-OLR AVP (i.e., absence of OLR means "no change").

5.2.1.4. Reporting Node's Maintenance of Overload Control State
A reporting node SHOULD create a new OCS entry when entering an overload condition.

Note: If a reporting node knows through absence of the OC-Supported-Features AVP in received messages that there are no reacting nodes supporting DOIC, then the reporting node can choose to not create OCS entries.

When generating a new OCS entry, the sequence number SHOULD be set to zero ("0").

When generating sequence numbers for new overload conditions, the new sequence number MUST be greater than any sequence number in an active (unexpired) overload report for the same application and report type previously sent by the reporting node. This property MUST hold over a reboot of the reporting node.
Note: One way of addressing this over a reboot of a reporting node is to use a timestamp for the first overload condition that occurs after the reboot and to start using sequences beginning with zero for subsequent overload conditions.

A reporting node MUST update an OCS entry when it needs to adjust the validity duration of the overload condition at reacting nodes.

Example: If a reporting node wishes to instruct reacting nodes to continue overload abatement for a longer period of time than originally communicated. This also applies if the reporting node wishes to shorten the period of time that overload abatement is to continue.

A reporting node MUST update an OCS entry when it wishes to adjust any parameters specific to the abatement algorithm, including, for example, the reduction percentage used for the loss abatement algorithm.

Example: If a reporting node wishes to change the reduction percentage either higher (if the overload condition has worsened) or lower (if the overload condition has improved), then the reporting node would update the appropriate OCS entry.

A reporting node MUST increment the sequence number associated with the OCS entry anytime the contents of the OCS entry are changed. This will result in a new sequence number being sent to reacting nodes, instructing them to process the OC-OLR AVP.

A reporting node SHOULD update an OCS entry with a validity duration of zero ("0") when the overload condition ends.

Note: If a reporting node knows that the OCS entries in the reacting nodes are near expiration, then the reporting node might decide not to send an OLR with a validity duration of zero.

A reporting node MUST keep an OCS entry with a validity duration of zero ("0") for a period of time long enough to ensure that any unexpired reacting node's OCS entry created as a result of the overload condition in the reporting node is deleted.

5.2.2. Reacting Node Behavior
When a reacting node sends a request, it MUST determine if that request matches an active OCS.
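One way to picture this matching step is the Python sketch below. It is an illustration only: `OcsEntry`, `find_active_ocs`, and the two lookup tables are hypothetical names, and the sketch simplifies by treating any request that carries a Destination-Host as host-type and any other request as realm-type. Entries are keyed as described in Section 5.2.1.1, and an entry counts as active while its time of expiry lies in the future.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# (Application-ID, DiameterIdentity or realm) identifies an OCS entry.
OcsKey = Tuple[int, str]


@dataclass
class OcsEntry:
    """Hypothetical reacting-node OCS entry; only the expiry is shown."""
    expiry: float  # absolute time of expiry, in seconds


def find_active_ocs(host_ocs: Dict[OcsKey, OcsEntry],
                    realm_ocs: Dict[OcsKey, OcsEntry],
                    app_id: int,
                    dest_realm: str,
                    dest_host: Optional[str],
                    now: float) -> Optional[OcsEntry]:
    """Return the active OCS entry matching an outgoing request, if any."""
    if dest_host is not None:
        # Simplification: a request with Destination-Host is host-type.
        entry = host_ocs.get((app_id, dest_host))
    else:
        # Otherwise the request is treated as realm-type.
        entry = realm_ocs.get((app_id, dest_realm))
    if entry is not None and now < entry.expiry:
        return entry
    return None
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the lookup deterministic and easy to test.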
If the request matches an active OCS, then the reacting node MUST use the overload abatement algorithm indicated in the OCS to determine if the request is to receive overload abatement treatment.

For the loss abatement algorithm defined in this specification, see Section 6 for the overload abatement algorithm logic applied.

If the overload abatement algorithm selects the request for overload abatement treatment, then the reacting node MUST apply overload abatement treatment on the request. The abatement treatment applied depends on the context of the request.

If diversion abatement treatment is possible (i.e., a different path for the request can be selected where the overloaded node is not part of the different path), then the reacting node SHOULD apply diversion abatement treatment to the request. The reacting node MUST apply throttling abatement treatment to requests identified for abatement treatment when diversion treatment is not possible or was not applied.

Note: This only addresses the case where there are two defined abatement treatments, diversion and throttling. Any extension that defines a new abatement treatment must also define its interaction with existing treatments.

If the overload abatement treatment results in throttling of the request and if the reacting node is an agent, then the agent MUST send an appropriate error as defined in Section 8.

Diameter endpoints that throttle requests need to do so according to the rules of the client application. Those rules will vary by application and are beyond the scope of this document.

In the case that the OCS entry indicated no traffic was to be sent to the overloaded entity and the validity duration expires, then overload abatement associated with the overload report MUST be ended in a controlled fashion.

5.2.3. Reporting Node Behavior
If there is an active OCS entry, then a reporting node SHOULD include the OC-OLR AVP in all answers to requests that contain the OC-Supported-Features AVP and that match the active OCS entry. Note: A request matches 1) if the Application-ID in the request matches the Application-ID in any active OCS entry and 2) if the report type in the OCS entry matches a report type supported by the reporting node as indicated in the OC-Supported-Features AVP.
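The matching rule in the note above might be sketched in Python as follows. The names `ReportingOcsEntry` and `matching_entries` are hypothetical, and the numeric values for the host and realm report types are taken to be those listed in Section 7.6. The default set of supported report types is an assumption of this sketch, reflecting that a request carrying the OC-Supported-Features AVP is treated as supporting host and realm reports.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

# Report type values, assumed to match Section 7.6.
HOST_REPORT = 0
REALM_REPORT = 1


@dataclass(frozen=True)
class ReportingOcsEntry:
    """Hypothetical reporting-node OCS entry; only the matching fields."""
    app_id: int
    report_type: int


def matching_entries(
    active_entries: List[ReportingOcsEntry],
    request_app_id: int,
    request_has_supported_features: bool,
    supported_report_types: FrozenSet[int] = frozenset(
        {HOST_REPORT, REALM_REPORT}),
) -> List[ReportingOcsEntry]:
    """Return the active OCS entries whose OLR belongs in the answer."""
    if not request_has_supported_features:
        # The sender did not announce DOIC support: no OC-OLR is added.
        return []
    return [
        entry for entry in active_entries
        if entry.app_id == request_app_id
        and entry.report_type in supported_report_types
    ]
```

Returning a list rather than a single entry allows an answer to carry OLRs of more than one report type when several active entries match.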
The contents of the OC-OLR AVP depend on the selected algorithm.

A reporting node MAY choose to not resend an overload report to a reacting node if it can guarantee that this overload report is already active in the reacting node.

Note: In some cases (e.g., when there are one or more agents in the path between reporting and reacting nodes, or when overload reports are discarded by reacting nodes), a reporting node may not be able to guarantee that the reacting node has received the report.

A reporting node MUST NOT send overload reports of a type that has not been advertised as supported by the reacting node.

Note: A reacting node implicitly advertises support for the host and realm report types by including the OC-Supported-Features AVP in the request. Support for other report types will be explicitly indicated by new feature bits in the OC-Feature-Vector AVP.

A reporting node SHOULD explicitly indicate the end of an overload occurrence by sending a new OLR with OC-Validity-Duration set to a value of zero ("0"). The reporting node SHOULD ensure that all reacting nodes receive the updated overload report.

A reporting node MAY rely on the OC-Validity-Duration AVP values for the implicit cleanup of overload control state on the reacting node.

Note: All OLRs sent have an expiration time calculated by adding the validity duration contained in the OLR to the time the message was sent. Transit time for the OLR can be safely ignored. The reporting node can ensure that all reacting nodes have received the OLR by continuing to send it in answer messages until the expiration time for all OLRs sent for that overload condition have expired.

When a reporting node sends an OLR, it effectively delegates any necessary throttling to downstream nodes. If the reporting node also locally throttles the same set of messages, the overall number of throttled requests may be higher than intended. Therefore, before applying local message throttling, a reporting node needs to check if these messages match existing OCS entries, indicating that these messages have survived throttling applied by downstream nodes that have received the related OLR.

However, even if the set of messages match existing OCS entries, the reporting node can still apply other abatement methods such as diversion. The reporting node might also need to throttle requests for reasons other than overload. For example, an agent or server might have a configured rate limit for each client and might throttle requests that exceed that limit, even if such requests had already been candidates for throttling by downstream nodes. The reporting node also has the option to send new OLRs requesting greater reductions in traffic, reducing the need for local throttling.

A reporting node SHOULD decrease requested overload abatement treatment in a controlled fashion to avoid oscillations in traffic.

Example: A reporting node might wait some period of time after overload ends before terminating the OLR, or it might send a series of OLRs indicating progressively less overload severity.

5.3. Protocol Extensibility
The DOIC solution can be extended. Types of potential extensions include new traffic abatement algorithms, new report types, or other new functionality.

When defining a new extension that requires new normative behavior, the specification must define a new feature for the OC-Feature-Vector AVP. This feature bit is used to communicate support for the new feature.

The extension may define new AVPs for use in the DOIC Capability Announcement and for use in DOIC overload reporting. These new AVPs SHOULD be defined to be extensions to the OC-Supported-Features or OC-OLR AVPs defined in this document.

The Grouped AVP extension mechanisms defined in [RFC6733] apply. This allows, for example, defining a new feature that is mandatory to be understood even when piggybacked on an existing application.

When defining new report type values, the corresponding specification must define the semantics of the new report types and how they affect the OC-OLR AVP handling.

The OC-Supported-Features and OC-OLR AVPs can be expanded with optional sub-AVPs only if a legacy DOIC implementation can safely ignore them without breaking backward compatibility for the given OC-Report-Type AVP value. Any new sub-AVPs must not require that the M-bit be set.

Documents that introduce new report types must describe any limitations on their use across non-supporting agents.
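As an illustration of how a feature bit communicates support, the Python sketch below tests bits in an OC-Feature-Vector value. `OLR_DEFAULT_ALGO` is the loss algorithm bit defined in Section 7.2. `HYPOTHETICAL_EXTENSION` and both helper functions are made up for this sketch; its bit position is arbitrary and does not correspond to any assigned value.

```python
# Loss abatement algorithm feature bit (Section 7.2).
OLR_DEFAULT_ALGO = 0x0000000000000001

# Hypothetical extension bit, chosen arbitrarily for illustration only.
HYPOTHETICAL_EXTENSION = 1 << 32


def supports(feature_vector: int, feature_bit: int) -> bool:
    """True if the given feature bit is set in the feature vector."""
    return (feature_vector & feature_bit) != 0


def common_features(local_vector: int, peer_vector: int) -> int:
    """Features that both nodes have announced."""
    return local_vector & peer_vector
```

A node that does not know an extension simply never sets or tests its bit, so the bit is absent from the intersection and only commonly supported features remain in use.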