Appendix A. Examples
A.1. Example of a Re-Marking Operation during Severe Congestion in the Interior Nodes
This appendix describes an example of a re-marking operation during severe congestion in the Interior nodes.  Per supported PHB, the Interior node can support the operation states depicted in Figure 26 when the per-flow congestion notification based on probing signaling scheme is used in combination with this severe congestion type.  Figure 27 depicts the same functionality when the per-flow congestion notification based on probing scheme is not used in combination with the severe congestion scheme.

The description given in this and the following appendices focuses on the situation where: (1) the "notified DSCP" marking is used in the congestion notification state, and (2) the "encoded DSCP" and "affected DSCP" markings are used in the severe congestion state.  In this case, the "notified DSCP" marking is used during the congestion notification state to mark all packets passing through an Interior node that operates in the congestion notification state.  In this way, and in combination with probing, a flow-based ECMP solution can be provided for the congestion notification state.  The "encoded DSCP" marking is used to encode and signal the excess rate, measured at Interior nodes, to the Egress nodes.  The "affected DSCP" marking is used to mark all packets that pass through a severely congested node and are not "encoded DSCP" marked.

Another possible situation could be derived in which both the congestion notification and severe congestion states use the "encoded DSCP" marking, without using the "notified DSCP" marking.  The "affected DSCP" marking is then used to mark all packets that pass through an Interior node that is in the severe congestion state and are not "encoded DSCP" marked.  In addition, the probe packet that is carried by an intra-domain RESERVE message and passes through Interior nodes SHOULD be "encoded DSCP" marked if the Interior node is in the congestion notification or severe congestion state.  Otherwise, the probe packet will remain unmarked.  In this way, an ECMP solution can be provided for both the congestion notification and severe congestion states.  The "encoded DSCP" packets signal an excess rate that is associated not only with Interior nodes that are in the severe congestion state, but also with Interior nodes that are in the congestion notification state.  The algorithm at the Interior node is similar to the algorithm described in the following appendix sections.  However, this method is not described in detail in this example.
    -------------------------------------------------
    |                    event B                    |
    |                                               V
  ----------          ---------------          ------------
  | Normal | event A  | Congestion  | event B  | Severe   |
  | state  |--------->| notification|--------->|congestion|
  |        |          | state       |          | state    |
  ----------          ---------------          ------------
     ^  ^                    |                      |
     |  |      event C       |                      |
     |  ----------------------                      |
     |               event D                        |
     ------------------------------------------------

      Figure 26: States of operation, severe congestion combined with
                 congestion notification based on probing

  ----------                --------------
  | Normal |    event B     | Severe     |
  | state  |--------------->| congestion |
  |        |                | state      |
  ----------                --------------
      ^                          |
      |          event E         |
      ------------------------------

      Figure 27: States of operation, severe congestion without
                 congestion notification based on probing

The terms used in Figures 26 and 27 are:

Normal state: represents the normal operation conditions of the node, i.e., no congestion.

Severe congestion state: represents the state in which the Interior node is severely congested with respect to a certain PHB.  It is important to emphasize that one of the targets of the severe congestion solution is to change from the severe congestion state directly back to the normal state.

Congestion notification state: the state in which the load is relatively high, close to the level at which congestion can occur.

event A: this event occurs when the incoming PHB rate is higher than the "congestion notification detection" threshold and lower than the "severe congestion detection" threshold.  The former threshold is used by the congestion notification based on probing scheme; see Sections 4.6.1.7 and 4.6.2.6.
event B: this event occurs when the incoming PHB rate is higher than the "severe congestion detection" threshold.

event C: this event occurs when the incoming PHB rate is lower than or equal to the "congestion notification detection" threshold.

event D: this event occurs when the incoming PHB rate is lower than or equal to the "severe congestion restoration" threshold.  It is important to emphasize that this event supports one of the targets of the severe congestion solution, namely to change from the severe congestion state directly back to the normal state.

event E: this event occurs when the incoming PHB rate is lower than or equal to the "severe congestion restoration" threshold.

Note that the "severe congestion detection", "severe congestion restoration", and admission thresholds SHOULD be higher than the "congestion notification detection" threshold, i.e.:

   "severe congestion detection" > "congestion notification detection"
   "severe congestion restoration" > "congestion notification detection"

Furthermore, the "severe congestion detection" threshold SHOULD be higher than or equal to the admission threshold that is used by the reservation-based and NSIS measurement-based signaling schemes:

   "severe congestion detection" >= admission threshold

Moreover, the "severe congestion restoration" threshold SHOULD be lower than or equal to the "severe congestion detection" threshold that is used by the reservation-based and NSIS measurement-based signaling schemes, that is:

   "severe congestion restoration" <= "severe congestion detection"

During severe congestion, the Interior node calculates, per traffic class (PHB), the incoming rate that is above the "severe congestion restoration" threshold, denoted as signaled_overload_rate, in the following way:

* A severely congested Interior node SHOULD take into account that packets might be dropped.  Therefore, before queuing and eventually dropping packets, the Interior node SHOULD count the total number of unmarked and re-marked bytes received by the severely congested node; denote this number as total_received_bytes.  Note that there are situations in which more than one Interior node in the same path becomes severely congested.  Therefore, any Interior node located behind a severely congested node MAY receive marked bytes.
When the "severe congestion detection" threshold per PHB is set equal to the maximum capacity allocated to one PHB used by the RMD-QOSM, it means that if the maximum capacity associated to a PHB is fully utilized and a packet belonging to this PHB arrives, then it is assumed that the Interior node will not forward this packet downstream. In other words, this packet will either be dropped or set to another PHB. Furthermore, this also means that after the severe congestion situation is solved, then the ongoing flows will be able to send their associated packets up to a total rate equal to the maximum capacity associated with the PHB. Therefore, when more than one Interior node located on the same path will be severely congested and when the Interior node receives "encoded DSCP" marked packets, it means that an Interior node located upstream is also severely congested. When the "severe congestion detection" threshold per PHB is set equal to the maximum capacity allocated to one PHB, then this Interior node MUST forward the "encoded DSCP" marked packets and it SHOULD NOT consider these packets during its local re-marking process. In other words, the Egress should see the excess rates encoded by the different severely congested Interior nodes as independent, and therefore, these independent excess rates will be added. When the "severe congestion detection" threshold per PHB is not set equal to the maximum capacity allocated to one PHB, this means that after the severe congestion situation is solved, the ongoing flows will not be able to send their associated packets up to a total rate equal to the maximum capacity associated with the PHB, but only up to the "severe_congestion_threshold". When more than one Interior node located on the same communication path is severely congested and when one of these Interior node receives "encoded_DSCP" marked packets, this Interior node SHOULD NOT mark unmarked, i.e., either "original DSCP" or "affected DSCP" or "notified DSCP" encoded packets, up to a rate equal to the difference between the maximum PHB capacity and the "severe congestion threshold", when the incoming "encoded DSCP" marked packets are already able to signal this difference. In this case, the "severe congestion threshold" SHOULD be configured in all Interior nodes, which are located in the RMD domain, and equal to: "severe_congestion_threshold" = Maximum PHB capacity - threshold_offset_rate The threshold_offset_rate represents rate and SHOULD have the same value in all Interior nodes.
* Before queuing and eventually dropping packets, at the end of each measurement interval of T seconds, calculate the current estimated overload rate, say measured_overload_rate, by using the following equation:

   measured_overload_rate =
      (total_received_bytes / T) - severe_congestion_restoration

To provide a reliable estimation of the encoded information, several techniques can be used; see [AtLi01], [AdCa03], [ThCo04], and [AnHa06].

Note that since marking is done in Interior nodes, the decisions are made at Egress nodes, and the termination of flows is performed by Ingress nodes, there is a significant delay until the overload information is learned by the Ingress nodes (see Section 6 of [CsTa05]).  The delay consists of the trip time of data packets from the severely congested Interior node to the Egress, the measurement interval, i.e., T, and the trip time of the notification signaling messages from Egress to Ingress.  Moreover, until the overload decreases at the severely congested Interior node, an additional trip time from the Ingress node to the severely congested Interior node MUST expire.  This is because, immediately before receiving the congestion notification, the Ingress MAY have sent out packets in the flows that were selected for termination.  That is, a terminated flow MAY contribute to congestion for a time longer than the time it takes packets to travel from the Ingress to the Interior node.

Without considering the above, Interior nodes would continue marking the packets until the measured utilization falls below the severe congestion restoration threshold.  In this way, in the end, more flows will be terminated than necessary, i.e., an overreaction takes place.  [CsTa05] provides a solution to this problem, where the Interior nodes use a sliding window memory to keep track of the signaled overload in a number of previous measurement intervals.  At the end of a measurement interval, T, before encoding and signaling the overload rate as "encoded DSCP" packets, the actual overload is decreased by the sum of the already signaled overload stored in the sliding window memory, since that overload is already being handled in the severe congestion handling control loop.  The sliding window memory consists of an integer number of cells, i.e., n = maximum number of cells.  Guidelines for configuring the sliding window parameters are given in [CsTa05].  At the end of each measurement interval, the newest calculated overload is pushed into the memory, and the oldest cell is dropped.  If Mi is the overload rate stored in the ith memory cell (i = 1..n), then at the end of every measurement interval, the overload rate that is signaled to the Egress node, i.e., signaled_overload_rate, is calculated as follows:
   Sum_Mi = 0
   For i = 1 to n
   {
      Sum_Mi = Sum_Mi + Mi
   }

   signaled_overload_rate = measured_overload_rate - Sum_Mi,

where Sum_Mi is calculated as above.  Next, the sliding memory is updated as follows:

   for i = 1..(n-1): Mi <- Mi+1
   Mn <- signaled_overload_rate

The number of bytes that have to be re-marked to satisfy the signaled overload rate, signaled_remarked_bytes, is calculated using the following pseudocode:

   IF severe_congestion_threshold <> Maximum PHB capacity THEN
   {
      IF (incoming_encoded-DSCP_rate <> 0) AND
         (incoming_encoded-DSCP_rate =< termination_offset_rate) THEN
      {
         signaled_remarked_bytes =
            ((signaled_overload_rate - incoming_encoded-DSCP_rate)*T)/N
      }
      ELSE IF (incoming_encoded-DSCP_rate > termination_offset_rate) THEN
         signaled_remarked_bytes =
            ((signaled_overload_rate - termination_offset_rate)*T)/N
      ELSE IF (incoming_encoded-DSCP_rate = 0) THEN
         signaled_remarked_bytes = signaled_overload_rate*T/N
   }
   ELSE
      signaled_remarked_bytes = signaled_overload_rate*T/N

where the incoming "encoded DSCP" rate is calculated as follows:

   incoming_encoded-DSCP_rate =
      ((number of "encoded DSCP" marked bytes received during T) * N)/T

The signaled_remarked_bytes value also represents the number of outgoing bytes (after the dropping stage) that MUST be re-marked, during each measurement interval T, by a node when it operates in severe congestion mode.
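The interaction of these formulas can be illustrated with a short sketch (illustrative Python, not normative; the class and parameter names are chosen for this example, and a real implementation would feed it the per-PHB byte counters collected before the dropping stage):

   class SevereCongestionMarker:
       # Illustrative sketch of the per-PHB calculation above:
       # sliding-window overload signaling and the number of bytes to
       # re-mark per measurement interval.  Names follow the pseudocode
       # in this appendix.

       def __init__(self, n_cells, scaling_n, interval_t, restoration_rate,
                    severe_congestion_threshold, max_phb_capacity,
                    termination_offset_rate):
           self.memory = [0.0] * n_cells      # sliding window cells M1..Mn
           self.N = scaling_n                 # domain-wide scaling parameter
           self.T = interval_t                # measurement interval (seconds)
           self.restoration = restoration_rate
           self.threshold = severe_congestion_threshold
           self.capacity = max_phb_capacity
           self.offset = termination_offset_rate

       def end_of_interval(self, total_received_bytes, encoded_dscp_bytes):
           # Overload above the restoration threshold in this interval.
           measured_overload_rate = (total_received_bytes / self.T
                                     - self.restoration)
           # Subtract overload already signaled in earlier intervals.
           signaled_overload_rate = measured_overload_rate - sum(self.memory)
           # Slide the window: drop the oldest cell, push the newest value.
           self.memory = self.memory[1:] + [signaled_overload_rate]
           # Rate already signaled by upstream severely congested nodes.
           incoming_encoded_rate = encoded_dscp_bytes * self.N / self.T
           # Bytes to re-mark in the coming interval (pseudocode above).
           if self.threshold != self.capacity:
               if 0 < incoming_encoded_rate <= self.offset:
                   remark = ((signaled_overload_rate - incoming_encoded_rate)
                             * self.T / self.N)
               elif incoming_encoded_rate > self.offset:
                   remark = ((signaled_overload_rate - self.offset)
                             * self.T / self.N)
               else:
                   remark = signaled_overload_rate * self.T / self.N
           else:
               remark = signaled_overload_rate * self.T / self.N
           # A non-positive result means no re-marking is needed.
           return max(remark, 0.0)

In this sketch, end_of_interval() would be invoked once per measurement interval T, per PHB.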
Note that, in order to process an overload situation higher than 100% of the maintained severe congestion threshold, all the nodes within the domain MUST be configured with, and maintain, a scaling parameter, e.g., the N used in the above equations; in combination with the marked bytes, e.g., signaled_remarked_bytes, such a high overload situation can then be calculated and represented.  N can be equal to or higher than 1.

Note that when incoming re-marked bytes are dropped, the operation of the severe congestion algorithm MAY be affected, e.g., the algorithm MAY become, in certain situations, slower.  An implementation of the algorithm MAY ensure, as much as possible, that the incoming marked bytes are not dropped.  This could, for example, be accomplished by using different dropping rate thresholds for marked and unmarked bytes.

Note that when the "affected DSCP" marking is used by a node that is congested due to a severe congestion situation, then all the outgoing packets that are not marked (i.e., by using the "encoded DSCP") have to be re-marked using the "affected DSCP" marking.  The "encoded DSCP" and the "affected DSCP" marked packets (when applied in the whole RMD domain) are propagated to the QNE Edge nodes.

Furthermore, note that when the congestion notification based on probing is used in combination with severe congestion, then in addition to the possible "encoded DSCP" and "affected DSCP", another DSCP for the re-marking of the same PHB is used (see Section 4.6.1.7).  This additional DSCP is denoted in this document as "notified DSCP".  When an Interior node operates in the severe congestion state (see Figure 27) and receives "notified DSCP" packets, these packets are considered to be unmarked packets (but not "affected DSCP" packets).  This means that during severe congestion, the "notified DSCP" packets can also be re-marked and encoded as either "encoded DSCP" or "affected DSCP" packets.

A.2. Example of a Detailed Severe Congestion Operation in the Egress Nodes
This appendix describes an example of a detailed severe congestion operation in the Egress nodes.  The states of operation in Egress nodes are similar to the ones described in Appendix A.1.  The definition of the events (see below) is, however, different from the definition of the events given in Figures 26 and 27:
* event A: when the Egress receives a predefined rate of "notified DSCP" marked bytes/packets, event A is activated (see Sections 4.6.1.7 and A.4).  The predefined rate of "notified DSCP" marked bytes is denoted as the congestion notification detection threshold.  Note that this congestion notification detection threshold can also be zero, meaning that event A is activated when the Egress node, during an interval T, receives at least one "notified DSCP" packet.

* event B: this event occurs when the Egress receives packets marked as either "encoded DSCP" or "affected DSCP" (when "affected DSCP" is applied in the whole RMD domain).

* event C: this event occurs when the rate of incoming "notified DSCP" packets decreases below the congestion notification detection threshold.  In the situation that the congestion notification detection threshold is zero, this means that event C is activated when the Egress node, during an interval T, does not receive any "notified DSCP" marked packets.

* event D: this event occurs when the Egress, during an interval T, does not receive packets marked as either "encoded DSCP" or "affected DSCP" (when "affected DSCP" is applied in the whole RMD domain).  Note that when "notified DSCP" is applied in the whole RMD domain for the support of congestion notification, this event could cause the following change in operation state.  When the Egress, during an interval T, does not receive (1) packets marked as either "encoded DSCP" or "affected DSCP" (when "affected DSCP" is applied in the whole RMD domain) and (2) "notified DSCP" marked packets, the operation state changes from the severe congestion state to the normal state.  When the Egress, during an interval T, does not receive (1) packets marked as either "encoded DSCP" or "affected DSCP" (when "affected DSCP" is applied in the whole RMD domain) but (2) does receive "notified DSCP" marked packets, the operation state changes from the severe congestion state to the congestion notification state.

* event E: this event occurs when the Egress, during an interval T, does not receive packets marked as either "encoded DSCP" or "affected DSCP" (when "affected DSCP" is applied in the whole RMD domain).
An example of the algorithm for calculating the number of flows, associated with each priority class, that have to be terminated is explained by the pseudocode below.

The Edge nodes are able to support severe congestion handling by: (1) identifying which flows were affected by the severe congestion and (2) selecting and terminating some of these flows such that the quality of service of the remaining flows is recovered.  The "encoded DSCP" and the "affected DSCP" marked packets (when applied in the whole RMD domain) are received by the QNE Edge node.  The QNE Edge nodes keep per-flow state, and therefore they can translate the calculated bandwidth to be terminated into a number of flows.  The QNE Egress node records the excess rate and the identity of all the flows arriving at the QNE Egress node with "encoded DSCP" and with "affected DSCP" (when applied in the whole RMD domain); only these flows, which are the ones passing through the severely congested Interior node(s), are candidates for termination.  The excess rate is calculated by measuring the rate of all the "encoded DSCP" data packets that arrive at the QNE Egress node.  The measured excess rate is converted by the Egress node by multiplying it by the factor N, which was used by the QNE Interior node(s) to encode the overload level.

When different priority flows are supported, all the low priority flows that arrived at the Egress node are terminated first.  Next, all the medium priority flows are stopped and finally, if necessary, even high priority flows are chosen.  Within a priority class, both "encoded DSCP" and "affected DSCP" flows are considered before the mechanism moves to a higher priority class.  Finally, for each flow that has to be terminated, the Egress node sends a NOTIFY message to the Ingress node, which stops the flow.  Below, this algorithm is described in detail.

First, when the Egress operates in the severe congestion state, the total amount of re-marked bandwidth associated with the PHB traffic class, say total_congested_bandwidth, is calculated.  Note that when the node maintains information about each Ingress/Egress pair aggregate, then the total_congested_bandwidth MUST be calculated per Ingress/Egress pair reservation aggregate.  This bandwidth represents the severely congested bandwidth that SHOULD be terminated.  The total_congested_bandwidth can be calculated as follows:

   total_congested_bandwidth = N*input_remarked_bytes/T
where input_remarked_bytes represents the number of "encoded DSCP" marked bytes that arrive at the Egress during one measurement interval T, and N is defined as in Sections 4.6.1.6.2.1 and A.1.

The term denoted as terminated_bandwidth is a temporary variable representing the total bandwidth, belonging to the same PHB traffic class, that has been selected for termination so far.  The terminate_bandwidth(priority_class) is the total bandwidth associated with flows of priority class equal to priority_class.  The parameter priority_class is an integer fulfilling: 0 =< priority_class =< Maximum_priority.

The QNE Egress node records the identity of the QNE Ingress node that forwarded each flow, the total_congested_bandwidth, and the identity of all the flows arriving at the QNE Egress node with "encoded DSCP" and "affected DSCP" (when applied in the whole RMD domain).  This ensures that only these flows, which are the ones passing through the severely overloaded QNE Interior node(s), are candidates for termination.

The selection of the flows to be terminated is described in the pseudocode given below, which is realized by the function denoted as calculate_terminate_flows().  The calculate_terminate_flows() function uses the terminate_bandwidth(priority_class) value and translates this bandwidth value into the number of flows that have to be terminated.  Only the "encoded DSCP" flows and "affected DSCP" flows (when applied in the whole RMD domain), which are the ones passing through the severely overloaded Interior node(s), are candidates for termination.  After the flows to be terminated are selected, the sum_bandwidth_terminate(priority_class) value is calculated, which is the sum of the bandwidth associated with the flows, belonging to a certain priority class, that will certainly be terminated.  The constraint on finding the total number of flows that have to be terminated is that sum_bandwidth_terminate(priority_class) SHOULD be smaller than or approximately equal to terminate_bandwidth(priority_class).
   terminated_bandwidth = 0;
   priority_class = 0;
   while terminated_bandwidth < total_congested_bandwidth
   {
      terminate_bandwidth(priority_class) =
         total_congested_bandwidth - terminated_bandwidth;
      calculate_terminate_flows(priority_class);
      terminated_bandwidth =
         sum_bandwidth_terminate(priority_class) + terminated_bandwidth;
      priority_class = priority_class + 1;
   }

If the Egress node maintains Ingress/Egress pair reservation aggregates, then the above algorithm is performed for each Ingress/Egress pair reservation aggregate.  Finally, for each flow that has to be terminated, the QNE Egress node sends a NOTIFY message to the QNE Ingress node to terminate the flow.
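The pseudocode above leaves calculate_terminate_flows() abstract.  The following sketch (illustrative Python; the flow-table layout and all names are assumptions made for this example) shows one possible greedy realization of the complete loop: it converts the marked bytes into total_congested_bandwidth and then selects candidate flows per priority class until the congested bandwidth is approximately covered, possibly overshooting by at most one flow:

   def select_flows_to_terminate(input_remarked_bytes, T, N,
                                 candidate_flows, max_priority):
       # candidate_flows: list of (flow_id, priority_class, bandwidth) for
       # flows seen with "encoded DSCP" or "affected DSCP" marking.
       total_congested_bandwidth = N * input_remarked_bytes / T
       selected = []
       terminated_bandwidth = 0.0
       priority_class = 0
       while (terminated_bandwidth < total_congested_bandwidth
              and priority_class <= max_priority):
           remaining = total_congested_bandwidth - terminated_bandwidth
           # calculate_terminate_flows(): greedily pick flows of this
           # priority class until 'remaining' is (approximately) covered.
           for flow_id, prio, bw in candidate_flows:
               if prio != priority_class or remaining <= 0:
                   continue
               selected.append(flow_id)
               terminated_bandwidth += bw
               remaining -= bw
           priority_class += 1
       return selected

For each selected flow, the QNE Egress would then send a NOTIFY message to the corresponding QNE Ingress, as described above.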
A.3. Example of a Detailed Re-Marking Admission Control (Congestion Notification) Operation in Interior Nodes

This appendix describes an example of a detailed re-marking admission control (congestion notification) operation in Interior nodes.  The predefined congestion notification threshold (see Appendix A.1) is set according to, and usually less than, an engineered bandwidth limitation, i.e., the admission threshold, e.g., based on a Service Level Agreement or a capacity limitation of specific links.  The difference between the congestion notification threshold and the engineered bandwidth limitation, i.e., the admission threshold, provides an interval where the signaling information on resource limitation is already sent by a node but the actual resource limitation is not reached.  This is due to the fact that data packets associated with an admitted session have not yet arrived, which allows the admission control process available at the Egress to interpret the signaling information and reject new calls before congestion is reached.  Note that in the situation when the data rate is higher than the preconfigured congestion notification rate, data packets are also re-marked (see Section 4.6.1.6.2.1).

To distinguish between congestion notification and severe congestion, two methods MAY be used (see Appendix A.1):

* Using different <DSCP> values (re-marked <DSCP> values).  The re-marked DSCP that is used for this purpose is denoted as "notified DSCP" in this document.  When this method is used and the Interior node is in the "congestion notification" state (see Appendix A.1), then the node SHOULD re-mark all the unmarked bytes passing through the node using the "notified DSCP".  Note that this method can only be applied if all nodes in the RMD domain use the "notified DSCP" marking.  In this way, probe packets that pass through an Interior node that operates in the congestion notification state are also encoded using the "notified DSCP" marking.  (A short sketch of this method is given after this list.)

* Using the "encoded DSCP" marking for both congestion notification and severe congestion.  This method is not described in detail in this example appendix.
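As a summary of the first method, the per-PHB state decision at an Interior node could be sketched as follows (illustrative Python; the function and threshold names are chosen for this example and follow Appendix A.1):

   def interior_marking_state(incoming_phb_rate,
                              congestion_notification_detection,
                              severe_congestion_detection):
       # Thresholds follow Appendix A.1:
       #   congestion_notification_detection < severe_congestion_detection
       if incoming_phb_rate > severe_congestion_detection:
           # Severe congestion: excess traffic is "encoded DSCP" marked and
           # the remaining traffic is "affected DSCP" marked (Appendix A.1).
           return "severe_congestion"
       if incoming_phb_rate > congestion_notification_detection:
           # Congestion notification: all unmarked bytes passing through
           # the node are re-marked with the "notified DSCP".
           return "congestion_notification"
       return "normal"

A.4. Example of a Detailed Admission Control (Congestion Notification) Operation in Egress Nodes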
This appendix describes an example of a detailed admission control (congestion notification) operation in Egress nodes.  The admission control congestion notification procedure can be applied only if the Egress maintains the Ingress/Egress pair aggregate.  When the operation state of the Ingress/Egress pair aggregate is "congestion notification" (see Appendix A.2), then the implementation of the algorithm depends on how the congestion notification situation is notified to the Egress.  As mentioned in Appendix A.3, two methods are used:

* Using the "notified DSCP".  During a measurement interval T, the Egress counts the number of "notified DSCP" marked bytes that belong to the same PHB and are associated with the same Ingress/Egress pair aggregate, say input_notified_bytes.  We denote the corresponding rate as incoming_notified_rate.

* Using the "encoded DSCP".  In this case, during a measurement interval T, the Egress measures the input_notified_bytes by counting the "encoded DSCP" bytes.

Below, only a detailed description of the first method is given.  The incoming_congestion_rate can then be calculated as follows:

   incoming_congestion_rate = input_notified_bytes/T

If the incoming_congestion_rate is higher than a preconfigured congestion notification threshold, then the communication path between Ingress and Egress is considered to be congested.  Note that the preconfigured congestion notification threshold can be set to "0".  In this case, the Egress node will operate in the congestion notification state from the moment that it receives at least one "notified DSCP" encoded packet.

When the Egress node operates in the "congestion notification" state and an end-to-end RESERVE (probe) arrives at the Egress, then this request SHOULD be rejected.  Note that this happens only when the probe packet is either "notified DSCP" or "encoded DSCP" marked.  In this way, it is ensured that the end-to-end RESERVE (probe) packet passed through the node that is congested.  This feature is very useful when ECMP-based routing is used, because only flows that actually pass through the congested router are detected.

If such an Ingress/Egress pair aggregated state is not available when the (probe) RESERVE message arrives at the Egress, then this request is accepted if the DSCP of the packet carrying the RESERVE message is unmarked.  Otherwise (if the packet is either "notified DSCP" or "encoded DSCP" marked), it is rejected.
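The Egress-side behavior described in this appendix could be sketched as follows (illustrative Python; the function name, parameters, and return convention are assumptions made for this example).  It computes the incoming congestion rate for one Ingress/Egress aggregate and applies the probe-rejection rule, so that only probes that actually passed through the congested router are rejected:

   def admit_probe(input_notified_bytes, T, notification_threshold,
                   probe_marked, aggregate_state_exists=True):
       # probe_marked: True if the packet carrying the end-to-end RESERVE
       # (probe) arrived "notified DSCP" or "encoded DSCP" marked.
       if not aggregate_state_exists:
           # Without an Ingress/Egress aggregate state, admit only
           # unmarked probes.
           return not probe_marked
       incoming_congestion_rate = input_notified_bytes / T
       congestion_notification = (incoming_congestion_rate
                                  > notification_threshold)
       # With ECMP, an unmarked probe took a non-congested path and can
       # still be admitted.
       if congestion_notification and probe_marked:
           return False
       return True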
A.5. Example of Selecting Bidirectional Flows for Termination during Severe Congestion

This appendix describes an example of selecting bidirectional flows for termination during severe congestion.  When severe congestion occurs, e.g., in the forward path, and the algorithm terminates flows to solve the severe congestion in the forward path, then the reserved bandwidth associated with the terminated bidirectional flows is also released.  Therefore, a careful selection of the flows that have to be terminated SHOULD take place.  A possible method of selecting the flows belonging to the same priority type passing through the severe congestion point on a unidirectional path can be the following:

* The Egress node SHOULD select, if possible, unidirectional flows first, instead of bidirectional flows.

* The Egress node SHOULD select, if possible, bidirectional flows that reserved a relatively small amount of resources on the path reversed to the path of congestion.

A.6. Example of a Severe Congestion Solution for Bidirectional Flows Congested Simultaneously on Forward and Reverse Paths
This appendix describes an example of a severe congestion solution for bidirectional flows congested simultaneously on forward and reverse paths.
This scenario describes a solution using the combination of the severe congestion solutions described in Section 4.6.2.5.2.  It is considered that the severe congestion occurs simultaneously in the forward and reverse directions, which MAY affect the same bidirectional flows.

When the QNE Edges maintain per-flow intra-domain QoS-NSLP operational states, the steps can be the following (see Figure 28).  Consider that the Egress node selects a number of bidirectional flows to be terminated.  In this case, the Egress will send, for each bidirectional flow, a NOTIFY message to the Ingress.  If the Ingress receives these NOTIFY messages and its operational state (associated with the reverse path) is the severe congestion state (see Figures 26 and 27), then the Ingress operates in the following way:

* For each NOTIFY message, the Ingress SHOULD identify the bidirectional flows that have to be terminated.

* The Ingress then calculates the total bandwidth that SHOULD be released in the reverse direction (thus not in the forward direction) if the bidirectional flows are terminated (preempted), say "notify_reverse_bandwidth".  This bandwidth can be calculated as the sum of the bandwidth values associated with all the end-to-end sessions that received a (severe congestion) NOTIFY message.

* Furthermore, using the received marked packets (from the reverse path), the Ingress will calculate, using the algorithm used by an Egress and described in Appendix A.2, the total bandwidth that has to be terminated in order to solve the congestion in the reverse path direction, say "marked_reverse_bandwidth".

* The Ingress then calculates the bandwidth of the additional flows that have to be terminated, say "additional_reverse_bandwidth", in order to solve the severe congestion in the reverse direction, by taking into account:

   ** the bandwidth in the reverse direction of the bidirectional flows that were appointed by the Egress (the ones that received a NOTIFY message) to be preempted, i.e., "notify_reverse_bandwidth";

   ** the total amount of bandwidth in the reverse direction that has been calculated by using the received marked packets, i.e., "marked_reverse_bandwidth".
   Figure 28 (message sequence chart between QNE(Ingress), the NTLP
   stateless/stateful Interior NEs, and QNE(Egress)): Intra-domain RMD
   severe congestion handling for bidirectional reservation (congestion
   in both forward and reverse directions).  The chart shows unmarked
   and marked user data bytes arriving at the QNE Egress and QNE
   Ingress, the (severe congestion) NOTIFY message sent from the Egress
   to the Ingress, and the "forward - T tear" and "reverse - T tear"
   RESERVE (RMD-QSPEC) messages used to tear down the selected flows.

This additional bandwidth can be calculated using the following algorithm:

   IF ("marked_reverse_bandwidth" > "notify_reverse_bandwidth") THEN
      "additional_reverse_bandwidth" =
         "marked_reverse_bandwidth" - "notify_reverse_bandwidth";
   ELSE
      "additional_reverse_bandwidth" = 0

* The Ingress terminates the flows that experienced severe congestion in the forward path and received a (severe congestion) NOTIFY message.
* If possible, the Ingress SHOULD terminate unidirectional flows that use the same Egress-Ingress reverse-direction communication path, to satisfy the release of a total bandwidth up to the "additional_reverse_bandwidth"; see Appendix A.5.

* If the required number of unidirectional flows (to satisfy the above) is not available, then a number of bidirectional flows that use the same Egress-Ingress reverse-direction communication path MAY be selected for preemption, in order to satisfy the release of a total bandwidth up to the "additional_reverse_bandwidth".  Note that, following the guidelines given in Appendix A.5, the bidirectional flows that reserved a relatively small amount of resources on the path reversed to the path of congestion SHOULD be selected for termination first.

(A sketch of this Ingress-side calculation and selection is given at the end of this appendix.)

When the QNE Edges maintain aggregated intra-domain QoS-NSLP operational states, the steps can be the following:

* The Egress calculates the bandwidth to be terminated using the same method as described in Section 4.6.1.6.2.2.  The Egress includes this bandwidth value in a <PDR Bandwidth> parameter within a "PDR_Congestion_Report" container that is carried by the end-to-end NOTIFY message.

* The Ingress receives the NOTIFY message and reads the <PDR Bandwidth> value included in the "PDR_Congestion_Report" container.  Note that this value is denoted as "notify_reverse_bandwidth" in the situation that the QNE Edges maintain per-flow intra-domain QoS-NSLP operational states, but it is calculated differently.  The variables "marked_reverse_bandwidth" and "additional_reverse_bandwidth" are calculated using the same steps as explained for the situation that the QNE Edges maintain per-flow intra-domain QoS-NSLP states.

* Regarding the termination of flows that use the same Egress-Ingress reverse-direction communication path, the Ingress can follow the same procedures as in the situation that the QNE Edges maintain per-flow intra-domain QoS-NSLP operational states.

The RMD-aggregated (reduced-state) reservations maintained by the Interior nodes can be reduced in the "forward" and "reverse" directions by using the procedure described in Section 4.6.2.3: the value equal to <notify_reverse_bandwidth> is included in the <Peak Data Rate-1 (p)> value of the local RMD-QSPEC <TMOD-1> parameter of the RMD-QOSM <QoS Desired> field carried by the forward intra-domain RESERVE message, and the <additional_reverse_bandwidth> value is included in the <PDR Bandwidth> parameter within the "PDR_Release_Request" container that is carried by the same intra-domain RESERVE message.
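The Ingress-side procedure for the per-flow case could be sketched as follows (illustrative Python; the flow-table layout and all names are assumptions made for this example).  It derives the "additional_reverse_bandwidth" and then applies the Appendix A.5 preference for unidirectional flows and small reverse-direction reservations:

   def select_additional_reverse_flows(notified_reverse_bandwidths,
                                       marked_reverse_bandwidth,
                                       reverse_path_flows):
       # notified_reverse_bandwidths: reverse-direction bandwidths of the
       #   bidirectional flows for which a severe congestion NOTIFY arrived.
       # reverse_path_flows: list of (flow_id, is_bidirectional,
       #   reverse_bandwidth) candidates on the same Egress-Ingress reverse
       #   path, excluding the flows already being terminated.
       notify_reverse_bandwidth = sum(notified_reverse_bandwidths)
       additional = max(marked_reverse_bandwidth - notify_reverse_bandwidth,
                        0.0)
       # Prefer unidirectional flows, then bidirectional flows with small
       # reverse-direction reservations (Appendix A.5 guidelines).
       ordering = sorted(reverse_path_flows, key=lambda f: (f[1], f[2]))
       selected, released = [], 0.0
       for flow_id, _is_bidir, bw in ordering:
           if released >= additional:
               break
           selected.append(flow_id)
           released += bw
       return selected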
A.7. Example of Preemption Handling during Admission Control

This appendix describes an example of how preemption handling can be supported during admission control.  It describes the mechanism that can be supported by the QNE Ingress, QNE Interior, and QNE Egress nodes to satisfy preemption during the admission control process.  This mechanism uses the preemption building blocks specified in [RFC5974].

A.7.1. Preemption Handling in QNE Ingress Nodes
If a QNE Ingress receives a RESERVE for a session that causes other session(s) to be preempted, then, for each of these to-be-preempted sessions, the QNE Ingress follows the steps below:

Step_1: The QNE Ingress MUST send a tearing RESERVE downstream and add a BOUND-SESSION-ID, with a <Binding_Code> value equal to "Indicated session caused preemption", that indicates the SESSION-ID of the session that caused the preemption.  Furthermore, an <INFO-SPEC> object with an error code value equal to "Reservation preempted" has to be included in each of these tearing RESERVE messages.

The selection of which flows have to be preempted can be based on predefined policies.  For example, this selection process can be based on the MRI associated with the high and low priority sessions.  In particular, the QNE Ingress can select low(er) priority session(s) whose MRI is "close" (especially the target IP) to the one associated with the higher priority session.  This means that typically the high priority session and the to-be-preempted lower priority sessions follow the same communication path and pass through the same QNE Egress node.  Furthermore, the number of lower priority sessions that have to be preempted per each high priority session has to be such that the resources requested by the higher priority session SHOULD be lower than or equal to the sum of the reserved resources associated with the lower priority sessions that have to be preempted.  (A selection sketch is given after Step_3 below.)
Step_2: For each of the sent tearing RESERVE(s), the QNE Ingress will send a NOTIFY message, with an <INFO-SPEC> object with an error code value equal to "Reservation preempted", towards the QNI.

Step_3: After sending the (tearing) RESERVE(s) for the preempted sessions, the QNE Ingress will send the (reserving) RESERVE that caused the preemption downstream towards the QNE Egress.
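The Step_1 selection policy could be sketched as follows (illustrative Python; this is only one possible policy, and all names are chosen for this example).  It picks lower priority sessions that share the path with the new high priority session until their reserved bandwidth covers the requested bandwidth:

   def select_sessions_to_preempt(requested_bandwidth, candidate_sessions):
       # candidate_sessions: list of (session_id, reserved_bandwidth),
       # already filtered to lower-priority sessions whose MRI is "close"
       # to the new session's MRI (same path, same QNE Egress).
       selected, covered = [], 0.0
       # Largest reservations first, to preempt as few sessions as possible.
       for session_id, bw in sorted(candidate_sessions,
                                    key=lambda s: s[1], reverse=True):
           if covered >= requested_bandwidth:
               break
           selected.append(session_id)
           covered += bw
       # The request SHOULD NOT exceed the sum of the preempted
       # reservations; otherwise, no suitable set was found.
       return selected if covered >= requested_bandwidth else []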
A.7.2. Preemption Handling in QNE Interior Nodes

Upon receiving the first (tearing) RESERVE that carries the <BOUND-SESSION-ID> object with a <Binding_Code> value equal to "Indicated session caused preemption" and an <INFO-SPEC> object with an error code value equal to "Reservation preempted", the QNE Interior considers that this session has to be preempted.

In this case, the QNE Interior creates a so-called "preemption state", which is identified by the SESSION-ID carried in the preemption-related <BOUND-SESSION-ID> object.  Furthermore, this "preemption state" will include the SESSION-ID of the session associated with the (tearing) RESERVE.  Subsequently, if additional tearing RESERVE(s) arrive that include the same values of the BOUND-SESSION-ID and <INFO-SPEC> objects, then the associated SESSION-IDs of these (tearing) RESERVE messages will be included in the already created "preemption state".  The QNE will then set a timer, with a value that is high enough to ensure that it will not expire before the (reserving) RESERVE arrives.  Note that when the "preemption state" timer expires, the bandwidth associated with the preempted session(s) will have to be released, following the normal RMD-QOSM bandwidth release procedure.

If the QNE Interior node does not receive all the to-be-preempted (tearing) RESERVE messages sent by the QNE Ingress before their associated (reserving) RESERVE message arrives, then the (reserving) RESERVE message will not reserve any resources and this message will be "M" marked (see Section 4.6.1.2).  Note that this is not a typical situation.  Typically, this situation can only occur when at least one of the (tearing) RESERVE messages is dropped due to an error condition.
Otherwise, if the QNE Interior receives all the to-be-preempted (tearing) RESERVE messages sent by the QNE Ingress, then the QNE Interior will remove the pending resources and make the new reservation using the normal RMD-QOSM bandwidth release and reservation procedures.
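The "preemption state" kept by QNE Interior nodes (and, as described next, by QNE Egress nodes) could be represented as follows (illustrative Python; the structure and names are assumptions made for this example):

   import time

   class PreemptionState:
       # Keyed by the SESSION-ID carried in the preemption-related
       # <BOUND-SESSION-ID> object of the tearing RESERVE messages.
       def __init__(self, bound_session_id, timeout_seconds):
           self.bound_session_id = bound_session_id  # session causing preemption
           self.preempted_session_ids = set()        # sessions to be preempted
           # The timer must be large enough to outlive the arrival of the
           # (reserving) RESERVE that caused the preemption.
           self.expiry = time.monotonic() + timeout_seconds

       def add_tearing_reserve(self, session_id):
           self.preempted_session_ids.add(session_id)

       def expired(self):
           # On expiry, the bandwidth of the preempted sessions is released
           # using the normal RMD-QOSM release procedure.
           return time.monotonic() >= self.expiry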
A.7.3. Preemption Handling in QNE Egress Nodes

Similar to the QNE Interior operation, the QNE Egress, upon receiving the first (tearing) RESERVE that carries the <BOUND-SESSION-ID> object with a <Binding_Code> value equal to "Indicated session caused preemption" and an <INFO-SPEC> object with an error code value equal to "Reservation preempted", considers that this session has to be preempted.

Similar to the QNE Interior operation, the QNE Egress creates a so-called "preemption state", which is identified by the SESSION-ID carried in the preemption-related <BOUND-SESSION-ID> object.  This "preemption state" will store the same type of information and use the same timer value as specified in Appendix A.7.2.  Subsequently, if additional tearing RESERVE(s) arrive that include the same values of the BOUND-SESSION-ID and <INFO-SPEC> objects, then the associated SESSION-IDs of these (tearing) RESERVE messages will be included in the already created "preemption state".

If the (reserving) RESERVE message sent by the QNE Ingress node has arrived and is not "M" marked, and if all the to-be-preempted (tearing) RESERVE messages have arrived, then the QNE Egress will remove the pending resources and make the new reservation using the normal RMD-QOSM procedures.  If the QNE Egress receives an "M" marked RESERVE message, then the QNE Egress will use the normal partial RMD-QOSM procedure to release the partially reserved resources associated with the "M" marked RESERVE (see Section 4.6.1.2).

If the QNE Egress does not receive all the to-be-preempted (tearing) RESERVE messages sent by the QNE Ingress before their associated and not "M" marked (reserving) RESERVE message arrives, then the following steps can be followed:

* If the QNE Egress uses an end-to-end QOSM that supports preemption handling, then the QNE Egress has to calculate and select new lower priority sessions that have to be terminated.  How the preempted sessions are selected and signaled to the downstream QNEs is similar to the operation specified in Appendix A.7.1.
* If the QNE Egress does not use an end-to-end QOSM that supports preemption handling, then the QNE Egress has to reject the requesting (reserving) RESERVE message associated with the high priority session (see Section 4.6.1.2).

Note that, typically, the situation in which the QNE Egress does not receive all the to-be-preempted (tearing) RESERVE messages sent by the QNE Ingress can only occur when at least one of the (tearing) RESERVE messages is dropped due to an error condition.

A.8. Example of a Retransmission Procedure within the RMD Domain
This appendix describes an example of a retransmission procedure that can be used in the RMD domain.  If the retransmission of intra-domain RESERVE messages within the RMD domain is not disallowed, then all the QNE Interior nodes SHOULD use the functionality described in this section.

In this situation, QNE Interior nodes maintain a replay cache in which each entry contains the <RSN>, the <SESSION-ID> (available via GIST), the <REFRESH-PERIOD> (available via the QoS-NSLP [RFC5974]), and the last received "PHR Container" <Parameter ID> carried by the RMD-QSPEC for each session [RFC5975].  Thus, this solution uses information carried by QoS-NSLP objects [RFC5974] and parameters carried by the RMD-QSPEC "PHR Container".  The following phases can be distinguished:

Phase 1: Create Replay Cache Entry

When an Interior node receives an intra-domain RESERVE message and its cache is empty or there is no matching entry, it reads the <Parameter ID> field of the "PHR Container" of the received message.  If the <Parameter ID> is a PHR_RESOURCE_REQUEST, which indicates that the intra-domain RESERVE message is a reservation request, then the QNE Interior node creates a new entry in the cache, copies the <RSN>, <SESSION-ID>, and <Parameter ID> to the entry, and sets the <REFRESH-PERIOD>.

By using the information stored in the list, the Interior node verifies whether or not the received intra-domain RESERVE message was sent by an adversary.  For example, if the <SESSION-ID> and <RSN> of a received intra-domain RESERVE message match the values stored in the list, then the Interior node checks the <Parameter ID> part.
If the <Parameter ID> is different, then:

   Situation D1: the <Parameter ID> in its own list is
   PHR_RESOURCE_REQUEST, and the <Parameter ID> in the message is
   PHR_REFRESH_UPDATE;

   Situation D2: the <Parameter ID> in its own list is
   PHR_RESOURCE_REQUEST or PHR_REFRESH_UPDATE, and the <Parameter ID>
   in the message is PHR_RELEASE_REQUEST;

   Situation D3: the <Parameter ID> in its own list is
   PHR_REFRESH_UPDATE, and the <Parameter ID> in the message is
   PHR_RESOURCE_REQUEST.

For Situation D1, the QNE Interior node processes this message by RMD-QOSM default operation, reserves bandwidth, updates the entry, and passes the message to downstream nodes.  For Situation D2, the QNE Interior node processes this message by RMD-QOSM default operation, releases bandwidth, deletes all entries associated with the session, and passes the message to downstream nodes.  For Situation D3, the QNE Interior node does not use/process the local RMD-QSPEC <TMOD-1> parameter carried by the received intra-domain RESERVE message.  Furthermore, the <K> flag in the "PHR Container" has to be set such that the local RMD-QSPEC <TMOD-1> parameter carried by the intra-domain RESERVE message is not processed/used by a QNE Interior node.

If the <Parameter ID> is the same, then:

   Situation S1: the <Parameter ID> is equal to PHR_RESOURCE_REQUEST;

   Situation S2: the <Parameter ID> is equal to PHR_REFRESH_UPDATE.

For Situation S1, the QNE Interior node does not process the intra-domain RESERVE message, but just passes it to downstream nodes, because it might have been retransmitted by the QNE Ingress node.  For Situation S2, the QNE Interior node processes the first incoming intra-domain (refresh) RESERVE message within a refresh period, updates the entry, and forwards it to the downstream nodes.

If only the <SESSION-ID> matches an entry in the list, then the QNE Interior node checks the <RSN>.  Here also, two situations can be distinguished.  If a rerouting takes place (see Section 5.2.5.2 in [RFC5974]), the <RSN> in the message will be equal to either <RSN + 2> of the stored list, if it is not a tearing RESERVE, or <RSN - 1> of the stored list, if it is a tearing RESERVE:
The QNE Interior node will check the <Parameter ID> part:

   If the <RSN> in the message is equal to <RSN + 2> in the stored
   list and the <Parameter ID> is a PHR_RESOURCE_REQUEST or
   PHR_REFRESH_UPDATE, then the received intra-domain RESERVE message
   has to be interpreted and processed as a typical (non-tearing)
   RESERVE message, which is caused by rerouting; see Section 5.2.5.2
   in [RFC5974].

   If the <RSN> in the message is equal to <RSN - 1> in the stored
   list and the <Parameter ID> is a PHR_RELEASE_REQUEST, then the
   received intra-domain RESERVE message has to be interpreted and
   processed as a typical (tearing) RESERVE message, which is caused
   by rerouting (see Section 5.2.5.2 in [RFC5974]).

If situations other than the ones described above occur, then the QNE Interior node does not use/process the local RMD-QSPEC <TMOD-1> parameter carried by the received intra-domain RESERVE message.  Furthermore, the <K> flag has to be set; see above.

Phase 2: Update Replay Cache Entry

When a QNE Interior node receives an intra-domain RESERVE message, it retrieves the corresponding entry from the cache and compares the values.  If the message is valid, the Interior node will update the <Parameter ID> and <REFRESH-PERIOD> in the list entry.

Phase 3: Delete Replay Cache Entry

When a QNE Interior node receives an intra-domain (tear) RESERVE message and an entry in the replay cache can be found, then the QNE Interior node will delete this entry after processing the message.  Furthermore, the Interior node will delete cache entries if it did not receive, during the <REFRESH-PERIOD> period, an intra-domain (refresh) RESERVE message with a <Parameter ID> value equal to PHR_REFRESH_UPDATE.
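Phases 1-3 could be sketched as follows (illustrative Python; the cache layout and return values are assumptions made for this example, only the decision logic for situations D1-D3, S1, S2, and the rerouting cases is shown, and the <REFRESH-PERIOD> timer handling is omitted):

   PHR_RESOURCE_REQUEST = "PHR_RESOURCE_REQUEST"
   PHR_REFRESH_UPDATE = "PHR_REFRESH_UPDATE"
   PHR_RELEASE_REQUEST = "PHR_RELEASE_REQUEST"

   replay_cache = {}   # SESSION-ID -> {"rsn": ..., "param_id": ...}

   def check_intra_domain_reserve(session_id, rsn, param_id):
       entry = replay_cache.get(session_id)
       if entry is None:
           if param_id == PHR_RESOURCE_REQUEST:       # Phase 1: new entry
               replay_cache[session_id] = {"rsn": rsn, "param_id": param_id}
               return "process"
           return "ignore_tmod"   # set <K>; do not use the local <TMOD-1>
       if rsn == entry["rsn"]:
           if param_id != entry["param_id"]:
               if param_id == PHR_RELEASE_REQUEST:    # situation D2
                   del replay_cache[session_id]       # Phase 3
                   return "process"
               if (entry["param_id"] == PHR_RESOURCE_REQUEST
                       and param_id == PHR_REFRESH_UPDATE):  # situation D1
                   entry["param_id"] = param_id       # Phase 2
                   return "process"
               return "ignore_tmod"                   # situation D3
           if param_id == PHR_RESOURCE_REQUEST:       # situation S1
               return "forward_only"                  # likely a retransmission
           return "process_first_refresh_only"        # situation S2
       # Rerouting cases (Section 5.2.5.2 of [RFC5974]).
       if rsn == entry["rsn"] + 2 and param_id in (PHR_RESOURCE_REQUEST,
                                                   PHR_REFRESH_UPDATE):
           return "process"                           # non-tearing RESERVE
       if rsn == entry["rsn"] - 1 and param_id == PHR_RELEASE_REQUEST:
           return "process"                           # tearing RESERVE
       return "ignore_tmod"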
A.9. Example on Matching the Initiator QSPEC to the Local RMD-QSPEC

Section 3.4 of [RFC5975] describes an example of how the QSPEC can be used within the QoS-NSLP.  Figure 29 illustrates a situation where a QNI and a QNR are using an end-to-end QOSM, denoted in this context as Z-e2e.  It is considered that the QNI access network side is a wireless access network built on a generation "X" technology with QoS support as defined by generation "X", while the QNR access network is a wired/fixed access network with its own defined QoS support.
Furthermore, it is considered that the shown QNE Edges are located at the boundary of an RMD domain and that the shown QNE Interior nodes are located inside the RMD domain.  The QNE Edges are able to run both the Z-e2e QOSM and the RMD-QOSM, while the QNE Interior nodes can only run the RMD-QOSM.  The QNI is considered to be a wireless laptop, for example, while the QNR is considered to be a PC.

  |------|   |------|                             |------|   |------|
  |Z-e2e |<->|Z-e2e |<--------------------------->|Z-e2e |<->|Z-e2e |
  | QOSM |   | QOSM |                             | QOSM |   | QOSM |
  |      |   |------|   |-------|   |-------|     |------|   |      |
  | NSLP |   | NSLP |<->| NSLP  |<->| NSLP  |<--->| NSLP |   | NSLP |
  |Z-e2e |   | RMD  |   | RMD   |   | RMD   |     | RMD  |   | Z-e2e|
  | QOSM |   | QOSM |   | QOSM  |   | QOSM  |     | QOSM |   | QOSM |
  |------|   |------|   |-------|   |-------|     |------|   |------|
  -----------------------------------------------------------------
  |------|   |------|   |-------|   |-------|     |------|   |------|
  | NTLP |<->| NTLP |<->| NTLP  |<->| NTLP  |<--->| NTLP |<->| NTLP |
  |------|   |------|   |-------|   |-------|     |------|   |------|
    QNI        QNE         QNE         QNE          QNE        QNR
   (End)  (Ingress Edge)(Interior) (Interior)  (Egress Edge)  (End)

      Figure 29: Example of initiator and local domain QOSM operation

The QNI sets the <QoS Desired> and <QoS Available> QSPEC objects in the initiator QSPEC, and initializes <QoS Available> to <QoS Desired>.  In this example, the <Minimum QoS> object is not populated.  The QNI populates QSPEC parameters to ensure correct treatment of its traffic in domains down the path.  Additionally, to ensure correct treatment further down the path, the QNI includes <PHB Class> in <QoS Desired>.  The QNI therefore includes the following in the QSPEC:

   <QoS Desired> = <TMOD-1> <PHB Class>
   <QoS Available> = <TMOD-1> <Path Latency>

In this example, it is assumed that the <TMOD-1> parameter is used to encode the traffic parameters of a VoIP application that uses RTP and the G.711 Codec; see Appendix B in [RFC5975].  The text below is copied from [RFC5975].

In the simplest case, the Minimum Policed Unit m is the sum of the IP, UDP, and RTP headers plus the payload.  The IP header in the IPv4 case has a size of 20 octets (40 octets if IPv6 is used).  The UDP header has a size of 8 octets, and RTP uses a 12-octet header.  The G.711 Codec specifies a bandwidth of 64 kbit/s (8000 octets/s).  Assuming RTP transmits voice datagrams every 20 ms, the payload for one datagram is 8000 octets/s * 0.02 s = 160 octets.

   IPv4+UDP+RTP+payload: m = 20+8+12+160 octets = 200 octets
   IPv6+UDP+RTP+payload: m = 40+8+12+160 octets = 220 octets

The Rate r specifies the number of octets per second.  50 datagrams are sent per second:

   IPv4: r = 50 1/s * m = 10,000 octets/s
   IPv6: r = 50 1/s * m = 11,000 octets/s

The bucket size b specifies the maximum burst.  In this example, a burst of 10 packets is used:

   IPv4: b = 10 * m = 2000 octets
   IPv6: b = 10 * m = 2200 octets

In our example, we will assume that IPv4 is used, and therefore the <TMOD-1> values will be set as follows:

   m = 200 octets
   r = 10000 octets/s
   b = 2000 octets

The <Peak Data Rate-1 (p)> and MPS are not specified above, but in our example we will assume:

   p = r = 10000 octets/s
   MPS = 220 octets

The <PHB Class> is set in such a way that the Expedited Forwarding (EF) PHB is used.  Since <Path Latency> and <QoS Class> are not vital parameters from the QNI's perspective, it does not raise their <M> flags.  Each QNE that supports the Z-e2e QOSM on the path reads and interprets those parameters in the initiator QSPEC.

When an end-to-end RESERVE message is received at a QNE Ingress node at the RMD domain border, the QNE Ingress can "hide" the initiator end-to-end RESERVE message so that only the QNE Edges process the initiator (end-to-end) RESERVE message, which then bypasses intermediate nodes between the Edges of the domain, and issues its own local RESERVE message (see Section 6).  For this new local RESERVE message, the QNE Ingress node generates the local RMD-QSPEC.
The RMD-QSPEC corresponding to the RMD-QOSM is generated based on the original initiator QSPEC according to the procedures described in Section 4.5 of [RFC5974] and in Section 6 of this document.  The RMD QNE Ingress maps the <TMOD-1> parameters contained in the original initiator QSPEC into the equivalent <TMOD-1> parameter representing only the peak bandwidth in the local RMD-QSPEC.

In this example, the initial <TMOD-1> parameters are mapped into the RMD-QSPEC <TMOD-1> parameters as follows.  As specified, the RMD-QOSM bandwidth-equivalent <TMOD-1> parameter of the RMD-QSPEC should have:

   r = p of the initial e2e <TMOD-1> parameter
   m = large
   b = large

For the RMD-QSPEC <TMOD-1> parameter, the following values are calculated:

   r = p of the initial e2e <TMOD-1> parameter = 10000 octets/s

m is set in this example to a large value as follows:

   m = MPS of the initial e2e <TMOD-1> parameter = 220 octets

The maximum allowed value of b is 250 gigabytes, which is much larger than needed in our example.  The b parameter specifies the extent to which the data rate can exceed the sustainable level for short periods of time.  In order to get a large b, in this example we consider that, for a certain period of time, the data rate can exceed the sustainable level, which in our example is the peak rate (p).  Thus, in our example, we calculate b as:

   b = p * "period of time"

For this VoIP example, we can assume that this period of time is 1.5 seconds:

   b = 10000 octets/s * 1.5 seconds = 15000 octets

Thus, the local RMD-QSPEC <TMOD-1> values are:

   r = 10000 octets/s
   p = 10000 octets/s
   m = 220 octets
   b = 15000 octets
   MPS = 220 octets
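The arithmetic above can be reproduced with a few lines (illustrative Python; it only recomputes the example numbers and does not specify any RMD-QOSM behavior):

   # G.711 over IPv4 example and its mapping to the local RMD-QSPEC
   # <TMOD-1>; the 1.5-second burst period is the assumption made in
   # the text above.

   ip_hdr, udp_hdr, rtp_hdr = 20, 8, 12       # octets (IPv4)
   payload = int(8000 * 0.020)                # 64 kbit/s, 20 ms datagrams -> 160 octets

   m = ip_hdr + udp_hdr + rtp_hdr + payload   # 200 octets
   r = 50 * m                                 # 50 datagrams/s -> 10000 octets/s
   b = 10 * m                                 # 10-packet burst -> 2000 octets
   p = r                                      # peak rate assumed equal to r
   MPS = 220                                  # maximum packet size used in the example

   # Local RMD-QSPEC <TMOD-1> (peak-bandwidth-only equivalent).
   rmd_r = p                                  # 10000 octets/s
   rmd_m = MPS                                # "large" m -> 220 octets
   rmd_b = p * 1.5                            # "large" b -> 15000 octets

   print(m, r, b, rmd_r, rmd_m, rmd_b)        # 200 10000 2000 10000 220 15000.0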
The bit-level format of the RMD-QSPEC is given in Section 4.1.  In particular, the Initiator/Local QSPEC bit, i.e., <I>, is set to "Local" (i.e., "1") and the <Qspec Proc> is set as follows:

* Message Sequence = 0: Sender initiated

* Object combination = 0: <QoS Desired> for RESERVE and <QoS Reserved> for RESPONSE

The <QSPEC Version> used by the RMD-QOSM is the default version, i.e., "0"; see [RFC5975].  The <QSPEC Type> value used by the RMD-QOSM is specified in [RFC5975] and is equal to "2".

The <Traffic Handling Directives> contains the following fields:

   <Traffic Handling Directives> = <PHR container> <PDR container>

The Per-Hop Reservation container (PHR container) and the Per-Domain Reservation container (PDR container) are specified in Sections 4.1.2 and 4.1.3, respectively.  The <PHR container> contains the traffic handling directives for intra-domain communication and reservation.  The <PDR container> contains additional traffic handling directives that are needed for edge-to-edge communication.

The RMD-QOSM <QoS Desired> and <QoS Reserved> objects are specified in Section 4.1.1.  In the RMD-QOSM, the <QoS Desired> and <QoS Reserved> objects contain the following parameters:

   <QoS Desired> = <TMOD-1> <PHB Class> <Admission Priority>
   <QoS Reserved> = <TMOD-1> <PHB Class> <Admission Priority>

The bit format of the <PHB Class> (see [RFC5975] and Figures 4 and 5) and <Admission Priority> complies with the bit format specified in [RFC5975].  In this example, the RMD-QSPEC <TMOD-1> values are the ones that were calculated and given above.  Furthermore, the <PHB Class> represents the EF PHB class.  Moreover, in this example the RMD reservation is established without an <Admission Priority> parameter, which is equivalent to a reservation established with an <Admission Priority> whose value is 1.

The RMD QNE Egress node updates <QoS Available> on behalf of the entire RMD domain if it can.  If it cannot (since the <M> flag is not set for <Path Latency>), it raises the parameter-specific "not-supported" flag, warning the QNR that the final latency value in <QoS Available> is imprecise.
In the "Y" access domain, the initiator QSPEC is processed by the QNR in the similar was as it was processed in the "X" wireless access domain, by the QNI. If the reservation was successful, eventually the RESERVE request arrives at the QNR (otherwise, the QNE at which the reservation failed would have aborted the RESERVE and sent an error RESPONSE back to the QNI). If the <RII> was included in the QoS-NSLP message, the QNR generates a positive RESPONSE with QSPEC objects <QoS Reserved> and <QoS Available>. The parameters appearing in <QoS Reserved> are the same as in <QoS Desired>, with values copied from <QoS Available>. Hence, the QNR includes the following QSPEC objects in the RESPONSE message: <QoS Reserved> = <TMOD-1> <PHB Class> <QoS Available> = <TMOD-1> <Path Latency>Contributors
Attila Takacs Ericsson Research Ericsson Hungary Ltd. Laborc 1, Budapest, Hungary, H-1037 EMail: Attila.Takacs@ericsson.com Andras Csaszar Ericsson Research Ericsson Hungary Ltd. Laborc 1, Budapest, Hungary, H-1037 EMail: Andras.Csaszar@ericsson.com
Authors' Addresses
Attila Bader Ericsson Research Ericsson Hungary Ltd. Laborc 1, Budapest, Hungary, H-1037 EMail: Attila.Bader@ericsson.com Lars Westberg Ericsson Research Torshamnsgatan 23 SE-164 80 Stockholm, Sweden EMail: Lars.Westberg@ericsson.com Georgios Karagiannis University of Twente P.O. Box 217 7500 AE Enschede, The Netherlands EMail: g.karagiannis@ewi.utwente.nl Cornelia Kappler ck technology concepts Berlin, Germany EMail: cornelia.kappler@cktecc.de Hannes Tschofenig Nokia Siemens Networks Linnoitustie 6 Espoo 02600 Finland EMail: Hannes.Tschofenig@nsn.com URI: http://www.tschofenig.priv.at Tom Phelan Sonus Networks 250 Apollo Dr. Chelmsford, MA 01824 USA EMail: tphelan@sonusnet.com