5. Ring Protection Coordination Protocol
5.1. RPS and PSC Comparison on Ring Topology
This section provides a comparison between RPS and Protection State Coordination (PSC) [RFC6378] [RFC6974] on ring topologies. This helps explain why a new protocol is defined for ring protection switching.

The PSC protocol [RFC6378] is designed for point-to-point LSPs, on which protection switching can only be performed at one or both of the endpoints of the LSP. The RPS protocol is designed for ring tunnels, which consist of multiple ring nodes, and a failure can happen on any segment of the ring; thus, RPS is capable of identifying and handling the different failures on the ring and coordinating the protection-switching behavior of all the nodes on the ring. As will be specified in the following sections, this is achieved with the introduction of the "pass-through" state for the ring nodes, and the location of the protection request is identified via the node IDs in the RPS request message.

Taking a ring topology with N nodes as an example:

With the mechanism specified in [RFC6974], on every ring node, a linear protection configuration has to be provisioned with every other node in the ring, i.e., with (N-1) other nodes. This means that on every ring node there will be (N-1) instances of the PSC protocol. In order to detect faults and to transport the PSC message, each instance shall have a MEP on the working path and a MEP on the protection path. This means that every node on the ring needs to be configured with (N-1) * 2 MEPs.
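The per-node counts in this comparison can be checked with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, and the RPS figures (one protocol instance and two MEPs per node) are those stated in the paragraph that follows.

```python
from typing import Tuple


def psc_per_node(n: int) -> Tuple[int, int]:
    """Return (PSC instances, MEPs) per node for linear protection on an n-node ring."""
    if n < 2:
        raise ValueError("a ring needs at least two nodes")
    instances = n - 1          # one PSC instance towards every other node
    meps = instances * 2       # one MEP on the working path, one on the protection path
    return instances, meps


def rps_per_node(n: int) -> Tuple[int, int]:
    """Return (RPS instances, MEPs) per node; independent of the ring size."""
    if n < 2:
        raise ValueError("a ring needs at least two nodes")
    return 1, 2                # one RPS instance, one MEP per adjacent section


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, psc_per_node(n), rps_per_node(n))
```

For a ring of 8 nodes, the PSC approach yields 7 instances and 14 MEPs per node, while the RPS approach stays at 1 instance and 2 MEPs.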
With the mechanism defined in this document, on every ring node there will only be a single instance of the RPS protocol. In order to detect faults and to transport the RPS message, each node only needs to have a MEP on the section to each of its adjacent nodes. In this way, every ring node only needs to be configured with 2 MEPs.

As shown in the above example, RPS is designed for ring topologies and can achieve ring protection efficiently with minimum protection instances and OAM entities, which meets the requirements on topology-specific recovery mechanisms as specified in [RFC5654].

5.2. RPS Protocol
The RPS protocol defined in this section is used to coordinate the protection-switching action of all the ring nodes in the same ring.

The protection operation of the ring tunnels is controlled with the help of the RPS protocol. The RPS processes in each of the individual ring nodes that form the ring MUST communicate using the Generic Associated Channel (G-ACh). The RPS protocol is applicable to all the three ring protection modes. This section takes the short-wrapping mechanism described in Section 4.3.2 as an example.

The RPS protocol is used to distribute the ring status information and RPS requests to all the ring nodes. Changes in the ring status information and RPS requests can be initiated automatically based on link status or caused by external commands.

Each node on the ring is uniquely identified by assigning it a node ID. The node ID MUST be unique on each ring. The maximum number of nodes on the ring supported by the RPS protocol is 127. The node ID SHOULD be independent of the order in which the nodes appear on the ring. The node ID is used to identify the source and destination nodes of each RPS request.

Every node obtains the ring topology either by configuration or via some topology discovery mechanism. The ring map consists of the ring topology information, and connectivity status (Intact or Severed) between the adjacent ring nodes, which is determined via the OAM message exchanged between the adjacent nodes. The ring map is used by every ring node to determine the switchover behavior of the ring tunnels.
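One possible in-memory representation of the ring map described above is sketched below in Python. This is not part of the protocol definition: the class and method names are hypothetical, and the sketch assumes at least three nodes so that each pair of adjacent nodes identifies exactly one span.

```python
from enum import Enum

MAX_RING_NODES = 127  # maximum ring size stated in the text


class LinkStatus(Enum):
    INTACT = "Intact"
    SEVERED = "Severed"


class RingMap:
    """Ring topology (node IDs in ring order) plus per-span connectivity status."""

    def __init__(self, node_ids):
        ids = list(node_ids)
        if not 3 <= len(ids) <= MAX_RING_NODES:
            raise ValueError("this sketch supports rings of 3..127 nodes")
        if len(set(ids)) != len(ids):
            raise ValueError("node IDs must be unique on the ring")
        self.nodes = ids
        n = len(ids)
        # Every span between adjacent nodes starts out Intact.
        self.status = {
            frozenset((ids[i], ids[(i + 1) % n])): LinkStatus.INTACT
            for i in range(n)
        }

    def neighbors(self, node_id):
        """Return the two nodes adjacent to node_id, in ring order."""
        i = self.nodes.index(node_id)
        n = len(self.nodes)
        return self.nodes[(i - 1) % n], self.nodes[(i + 1) % n]

    def set_status(self, a, b, status):
        """Record the OAM-derived status of the span between adjacent nodes a and b."""
        key = frozenset((a, b))
        if key not in self.status:
            raise KeyError("nodes are not adjacent on this ring")
        self.status[key] = status

    def severed_spans(self):
        return [tuple(sorted(k)) for k, v in self.status.items()
                if v is LinkStatus.SEVERED]
```

The node IDs in a ring such as `RingMap([10, 3, 7, 42])` deliberately do not follow the ring order, reflecting the recommendation that node IDs be independent of node position.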
As shown in Figure 14, when no protection switching is active on the ring, each node MUST send RPS requests with No Request (NR) to its two adjacent nodes periodically. The transmission interval of RPS requests is specified in Section 5.2.1.

         +---+ A->B(NR)    +---+ B->C(NR)    +---+ C->D(NR)
  -------| A |-------------| B |-------------| C |-------
(NR)F<-A +---+    (NR)A<-B +---+    (NR)B<-C +---+

Figure 14: RPS Communication between the Ring Nodes in Case of No Failure in the Ring

As shown in Figure 15, when a node detects a failure and determines that protection switching is required, it MUST send the appropriate RPS request in both directions to the destination node. The destination node is the other node that is adjacent to the identified failure. When a node that is not the destination node receives an RPS request and it has no higher-priority local request, it MUST transfer in the same direction the RPS request as received. In this way, the switching nodes can maintain RPS protocol communication in the ring. The RPS request MUST be terminated by the destination node of the message. If an RPS request with the node itself set as the source node is received, this message MUST be dropped and not be forwarded to the next node.

         +---+ C->B(SF)    +---+ B->C(SF)    +---+ C->B(SF)
  -------| A |-------------| B |----- X -----| C |-------
(SF)C<-B +---+    (SF)C<-B +---+    (SF)B<-C +---+

Figure 15: RPS Communication between the Ring Nodes in Case of Failure between Nodes B and C

Note that in the case of a bidirectional failure such as a cable cut, the two adjacent nodes detect the failure and send each other an RPS request in opposite directions.

o In rings utilizing the wrapping protection, each node that detects the failure or receives the RPS request as the destination node MUST perform the switch from/to the working ring tunnels to/from the protection ring tunnels if it has no higher-priority active RPS request.
o In rings utilizing the short-wrapping protection, each node that detects the failure or receives the RPS request as the destination node MUST perform the switch only from the working ring tunnels to the protection ring tunnels.
o In rings utilizing the steering protection, when a ring switch is required, any node MUST perform the switches if its added/dropped traffic is affected by the failure. Determination of the affected traffic MUST be performed by examining the RPS requests (indicating the nodes adjacent to the failure or failures) and the stored ring map (indicating the relative position of the failure and the added traffic destined towards that failure).

When the failure has cleared and the Wait-to-Restore (WTR) timer has expired, the nodes that generate the RPS requests MUST drop their respective switches and MUST generate an RPS request carrying the NR code. The node receiving such an RPS request from both directions MUST drop its protection switches.

A protection switch MUST be initiated by one of the criteria specified in Section 5.3. A failure of the RPS protocol or controller MUST NOT trigger a protection switch.

Ring switches MUST be preempted by higher-priority RPS requests. For example, consider a protection switch that is active due to a manual switch request on the given link, and another protection switch is required due to a failure on another link. Then an RPS request MUST be generated, the former protection switch MUST be dropped, and the latter protection switch established.

The MPLS-TP Shared-Ring Protection mechanism supports multiple protection switches in the ring, resulting in the ring being segmented into two or more separate segments. This may happen when several RPS requests of the same priority exist in the ring due to multiple failures or external switch commands.

Proper operation of the MSRP mechanism relies on all nodes using their ring map to determine the state of the ring (nodes and links). In order to accommodate ring state knowledge, the RPS requests MUST be sent in both directions during a protection switch.

5.2.1. Transmission and Acceptance of RPS Requests
A new RPS request MUST be transmitted immediately when a change in the transmitted status occurs. The first three RPS protocol messages carrying a new RPS request MUST be transmitted as fast as possible. For fast protection switching within 50 ms, the interval of the first three RPS protocol messages SHOULD be 3.3 ms. The successive RPS requests SHOULD be transmitted with the interval of 5 seconds. A ring node that is not the destination of the received RPS message MUST forward it to the next node along the ring immediately.
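The transmission timing described above can be illustrated with a small generator of send offsets. This is a sketch under stated assumptions: the constant and function names are hypothetical, and it assumes that the 5-second period starts counting from the third fast message, which the text does not state explicitly.

```python
from itertools import islice

FAST_INTERVAL_S = 0.0033   # 3.3 ms between the first three messages
SLOW_INTERVAL_S = 5.0      # interval of the successive messages
FAST_COUNT = 3             # number of messages sent at the fast rate


def transmit_offsets():
    """Yield send times, in seconds, relative to the change in transmitted status."""
    t = 0.0
    yield t                          # the new request is sent immediately
    for _ in range(FAST_COUNT - 1):  # two more messages at the fast interval
        t += FAST_INTERVAL_S
        yield t
    while True:                      # then periodic refresh (assumed to follow on)
        t += SLOW_INTERVAL_S
        yield t


if __name__ == "__main__":
    for offset in islice(transmit_offsets(), 5):
        print(f"{offset * 1000:.1f} ms")
```

With these values, the three fast messages are all sent within 6.6 ms of the status change, well inside the 50 ms switching budget mentioned above.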
5.2.2. RPS Protocol Data Unit (PDU) Format
Figure 16 depicts the format of an RPS packet that is sent on the G-ACh. The Channel Type field is set to indicate that the message is an RPS message.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0 0 0 1|Version|   Reserved    |   RPS Channel Type (0x002A)   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Dest Node ID  |  Src Node ID  |    Request    | M |  Reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Figure 16: G-ACh RPS Packet Format

The following fields MUST be provided:

o Destination Node ID: The destination node ID MUST always be set to the value of the node ID of the adjacent node. The node ID MUST be unique on each ring. Valid destination node ID values are 1-127.

o Source Node ID: The source node ID MUST always be set to the ID value of the node generating the RPS request. The node ID MUST be unique on each ring. Valid source node ID values are 1-127.

o Protection-Switching Mode (M): This 2-bit field indicates the protection-switching mode used by the sending node of the RPS message. This can be used to check that the ring nodes on the same ring use the same protection-switching mechanism. The defined values of the M field are listed as below:

+------------------+-----------------------------+
| Bits (MSB - LSB) | Protection-Switching Mode   |
+------------------+-----------------------------+
| 0 0              | Reserved                    |
| 0 1              | Wrapping                    |
| 1 0              | Short-Wrapping              |
| 1 1              | Steering                    |
+------------------+-----------------------------+

Note: MSB = most significant bit
      LSB = least significant bit
o RPS Request Code: A code consisting of 8 bits as specified below:

+------------------+-----------------------------+----------+
| Bits             | Condition, State,           | Priority |
| (MSB - LSB)      | or External Request         |          |
+------------------+-----------------------------+----------+
| 0 0 0 0 1 1 1 1  | Lockout of Protection (LP)  | highest  |
| 0 0 0 0 1 1 0 1  | Forced Switch (FS)          |          |
| 0 0 0 0 1 0 1 1  | Signal Fail (SF)            |          |
| 0 0 0 0 0 1 1 0  | Manual Switch (MS)          |          |
| 0 0 0 0 0 1 0 1  | Wait-to-Restore (WTR)       |          |
| 0 0 0 0 0 0 1 1  | Exercise (EXER)             |          |
| 0 0 0 0 0 0 0 1  | Reverse Request (RR)        |          |
| 0 0 0 0 0 0 0 0  | No Request (NR)             | lowest   |
+------------------+-----------------------------+----------+

5.2.3. Ring Node RPS States
Idle state: A node is in the idle state when it has no RPS request and is sending and receiving an NR code to/from both directions.

Switching state: A node not in the idle or pass-through states is in the switching state.

Pass-through state: A node is in the pass-through state when its highest priority RPS request is a request not destined to it or generated by it. The pass-through is bidirectional.

5.2.3.1. Idle State
A node in the idle state MUST generate the NR request in both directions.

A node in the idle state MUST terminate RPS requests that flow in both directions.

A node in the idle state MUST block the traffic flow on protection ring tunnels in both directions.

5.2.3.2. Switching State
A node in the switching state MUST generate an RPS request to its adjacent node with its highest RPS request code in both directions when it detects a failure or receives an external command.
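As an illustration, the request that a switching node generates could be built with the PDU layout of Figure 16 roughly as follows. This Python sketch rests on several assumptions: the field widths (8-bit node IDs and request code, 2-bit M field, 6 reserved bits) are read from the layout of Figure 16, the Version and Reserved fields are set to zero, and all function and class names are hypothetical.

```python
import struct
from enum import IntEnum

RPS_CHANNEL_TYPE = 0x002A  # channel type shown in Figure 16


class Request(IntEnum):
    """RPS request codes from Section 5.2.2."""
    NR = 0b00000000
    RR = 0b00000001
    EXER = 0b00000011
    WTR = 0b00000101
    MS = 0b00000110
    SF = 0b00001011
    FS = 0b00001101
    LP = 0b00001111


class Mode(IntEnum):
    """Protection-switching mode (M field); 0b00 is reserved."""
    WRAPPING = 0b01
    SHORT_WRAPPING = 0b10
    STEERING = 0b11


def encode_rps(dest, src, request, mode, version=0):
    """Pack the G-ACh header word and the RPS word (version 0 is an assumption)."""
    for node_id in (dest, src):
        if not 1 <= node_id <= 127:
            raise ValueError("node IDs must be in the range 1-127")
    word0 = (0b0001 << 28) | ((version & 0xF) << 24) | RPS_CHANNEL_TYPE
    word1 = (dest << 24) | (src << 16) | (int(request) << 8) | (int(mode) << 6)
    return struct.pack("!II", word0, word1)  # network byte order


def decode_rps(data):
    """Return (dest, src, request, mode); raise ValueError if not an RPS message."""
    word0, word1 = struct.unpack("!II", data[:8])
    if word0 >> 28 != 0b0001 or word0 & 0xFFFF != RPS_CHANNEL_TYPE:
        raise ValueError("not an RPS G-ACh message")
    return ((word1 >> 24) & 0xFF, (word1 >> 16) & 0xFF,
            Request((word1 >> 8) & 0xFF), Mode((word1 >> 6) & 0x3))


if __name__ == "__main__":
    # Node 2 detects a failure towards node 3 on a short-wrapping ring.
    pdu = encode_rps(dest=3, src=2, request=Request.SF, mode=Mode.SHORT_WRAPPING)
    print(pdu.hex())
```

The same PDU would be handed to both ring directions, since the source and destination node IDs do not depend on the direction of transmission.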
In a bidirectional failure condition, both of the nodes adjacent to the failure detect the failure and send the RPS request in both directions with the destination set to each other. Each node can only receive the RPS request via the long path, because the message sent via the short path is lost due to the bidirectional failure. Here, the short path refers to the shorter path on the ring between the source and destination node of the RPS request, and the long path refers to the longer path on the ring between the source and destination node of the RPS request. Upon receipt of the RPS request on the long path, the destination node of the RPS request MUST send an RPS request with its highest request code periodically along the long path to the other node adjacent to the failure.

In a unidirectional failure condition, the node that detects the failure MUST send the RPS request in both directions with the destination node set to the other node adjacent to the failure. The destination node of the RPS request cannot detect the failure itself but will receive an RPS request from both the short path and the long path. The destination node MUST acknowledge the received RPS requests by replying with an RPS request with the RR code on the short path and an RPS request with the received RPS request code on the long path. Accordingly, when the node that detects the failure receives the RPS request with RR code on the short path, then the RPS request received from the same node along the long path SHOULD be ignored.

A node in the switching state MUST terminate the received RPS requests in both directions and not forward them further along the ring.

The following switches as defined in Section 5.3.1 MUST be allowed to coexist:

o LP and LP

o FS and FS

o SF and SF

o FS and SF

When multiple MS RPS requests exist at the same time addressing different links and there is no higher-priority request on the ring, no switch SHOULD be executed and existing switches MUST be dropped. The nodes MUST still signal an RPS request with the MS code.

Multiple EXER requests MUST be allowed to coexist in the ring.
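The coexistence rules above lend themselves to a simple lookup. The following Python sketch is one way to express them; the function names are hypothetical, and the handling of multiple MS requests models only the case described in the text (different links, no higher-priority request on the ring).

```python
# Pairs of requests that are allowed to coexist on the ring.
COEXISTING = {
    frozenset(pair)
    for pair in (("LP", "LP"), ("FS", "FS"), ("SF", "SF"), ("FS", "SF"))
}


def may_coexist(a: str, b: str) -> bool:
    """Return True if requests a and b are allowed to coexist."""
    if a == b == "EXER":
        return True  # multiple EXER requests are allowed to coexist
    return frozenset((a, b)) in COEXISTING


def execute_ms_switch(ms_links) -> bool:
    """Decide whether an MS switch is executed, given the links addressed by MS.

    Assumes there is no higher-priority request on the ring. With MS requests
    on more than one link, no switch is executed, although the MS code is
    still signaled.
    """
    return len(set(ms_links)) == 1


if __name__ == "__main__":
    print(may_coexist("FS", "SF"), may_coexist("MS", "MS"))
    print(execute_ms_switch([("B", "C")]),
          execute_ms_switch([("B", "C"), ("E", "F")]))
```

Using `frozenset` keys makes the lookup symmetric, so `may_coexist("SF", "FS")` and `may_coexist("FS", "SF")` give the same answer.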
A node in a ring-switching state that receives the external command LP for the affected link MUST drop its switch and MUST signal NR for the locked link if there is no other RPS request on another link. The node SHOULD still signal a relevant RPS request for another link.

5.2.3.3. Pass-Through State
When a node is in a pass-through state, it MUST transfer the received RPS request unchanged in the same direction.

When a node is in a pass-through state, it MUST enable the traffic flow on protection ring tunnels in both directions.

5.2.4. RPS State Transitions
All state transitions are triggered by an incoming RPS request change, a WTR expiration, an externally initiated command, or locally detected MPLS-TP section failure conditions.

RPS requests due to a locally detected failure, an externally initiated command, or a received RPS request shall preempt existing RPS requests in the prioritized order given in Section 5.2.2, unless the requests are allowed to coexist.

5.2.4.1. Transitions between Idle and Pass-Through States
The transition from the idle state to the pass-through state MUST be triggered by a valid RPS request change, in any direction, from the NR code to any other code, as long as the new request is not destined to the node itself. Both directions then move into the pass-through state, so that traffic entering the node through the protection ring tunnels is transferred transparently through the node.

A node MUST revert from the pass-through state to the idle state when an RPS request with an NR code is received in both directions. Then both directions revert simultaneously from the pass-through state to the idle state.

5.2.4.2. Transitions between Idle and Switching States
Transition of a node from the idle state to the switching state MUST be triggered by one of the following conditions:

o A valid RPS request change from the NR code to any code that is received on either the long or the short path and is destined to this node

o An externally initiated command for this node

o The detection of an MPLS-TP section-layer failure at this node

Actions taken at a node in the idle state upon transition to the switching state are:

o For all protection-switch requests, except EXER and LP, the node MUST execute the switch

o For EXER and LP, the node MUST signal the appropriate request but not execute the switch

In one of the following conditions, transition from the switching state to the idle state MUST be triggered:

o On the node that triggers the protection switching, when the WTR time expires or an externally initiated command is cleared, the node MUST transition from the switching state to the idle state and signal the NR code using RPS messages in both directions.

o On the node that enters the switching state due to the received RPS request: upon reception of the NR code from both directions, the head-end node MUST drop its switch, transition to the idle state, and signal the NR code in both directions.

5.2.4.3. Transitions between Switching States
When a node that is currently executing any protection switch receives a higher-priority RPS request (due to a locally detected failure, an externally initiated command, or a ring protection switch request destined to it) for the same link, it MUST update the priority of the switch it is executing to the priority of the received RPS request.

When a failure condition clears at a node, the node MUST enter WTR condition and remain in it for the appropriate time-out interval, unless:

o A different RPS request with a higher priority than WTR is received

o Another failure is detected

o An externally initiated command becomes active

The node MUST send out a WTR code on both the long and short paths.
When a node that is executing a switch in response to an incoming SF RPS request (not due to a locally detected failure) receives a WTR code (unidirectional failure case), it MUST send out the RR code on the short path and the WTR on the long path.

5.2.4.4. Transitions between Switching and Pass-Through States
When a node that is currently executing a switch receives an RPS request for a non-adjacent link of higher priority than the switch it is executing, it MUST drop its switch immediately and enter the pass-through state.

The transition of a node from pass-through to switching state MUST be triggered by:

o An equal priority, a higher priority, or an allowed coexisting externally initiated command

o The detection of an equal priority, a higher priority, or an allowed coexisting automatic initiated command

o The receipt of an equal, a higher priority, or an allowed coexisting RPS request destined to this node
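The two transition rules above can be sketched as predicates over the request priorities of Section 5.2.2. This is an interpretation rather than a normative definition: the function names are hypothetical, the "current" request of a pass-through node is assumed to be the request that placed it in that state, and the coexistence exception of Section 5.2.4 is assumed to apply before a switching node gives way to a higher-priority request.

```python
# Request codes ordered from lowest to highest priority (Section 5.2.2).
PRIORITY = ["NR", "RR", "EXER", "WTR", "MS", "SF", "FS", "LP"]

# Requests allowed to coexist (Section 5.2.3.2).
COEXISTING = {
    frozenset(pair)
    for pair in (("LP", "LP"), ("FS", "FS"), ("SF", "SF"), ("FS", "SF"),
                 ("EXER", "EXER"))
}


def rank(code: str) -> int:
    return PRIORITY.index(code)


def may_coexist(a: str, b: str) -> bool:
    return frozenset((a, b)) in COEXISTING


def switching_to_pass_through(current: str, received: str,
                              for_adjacent_link: bool) -> bool:
    """True if a switching node drops its switch and enters pass-through."""
    if for_adjacent_link or may_coexist(current, received):
        return False
    return rank(received) > rank(current)


def pass_through_to_switching(current: str, new: str) -> bool:
    """True if a pass-through node enters the switching state.

    `new` is a local command, a locally detected condition, or a received
    request destined to this node.
    """
    return rank(new) >= rank(current) or may_coexist(current, new)


if __name__ == "__main__":
    print(switching_to_pass_through("MS", "SF", for_adjacent_link=False))
    print(pass_through_to_switching("FS", "SF"))
```

In the second call, SF has lower priority than FS, yet the node still enters the switching state because FS and SF are allowed to coexist.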