
RFC 3670

Information Model for Describing Network Device QoS Datapath Mechanisms

Pages: 97
Proposed Standard
Part 2 of 4 – Pages 10 to 38

Top   ToC   RFC3670 - Page 10   prevText

3. Methodology

There is a clear need to define attributes and behavior that together define how traffic should be conditioned. This document defines a set of classes and relationships that represent the QoS mechanisms used to condition traffic; [QPIM] is used to define policies to control the QoS mechanisms defined in this document. However, some very basic issues need to be considered when combining these documents. Considering these issues should help in constructing a schema for managing the operation and configuration of network QoS mechanisms through the use of QoS policies.

3.1. Level of Abstraction for Expressing QoS Policies

The first issue requiring consideration is the level of abstraction at which QoS policies should be expressed. If we consider policies as a set of rules used to react to events and manipulate attributes or generate new events, we realize that policy represents a continuum of specifications that relate business goals and rules to the conditioning of traffic done by a device or a set of devices. An example of a business level policy might be: from 1:00 pm PST to 7:00 am EST, sell off 40% of the network capacity on the open market. In contrast, a device-specific policy might be: if the queue depth grows at a geometric rate over a specified duration, trigger a potential link failure event.

A general model for this continuum is shown in Figure 1 below.

   +---------------------+
   | High-Level Business |  Not directly related to device
   |      Policies       |  operation and configuration details
   +---------------------+
              |
              |
   +---------V-----------+
   | Device-Independent  |  Translate high-level policies to
   |      Policies       |  generic device operational and
   +---------------------+  configuration information
              |
              |
   +---------V-----------+
   | Device-Dependent    |  Translate generic device information
   |      Policies       |  to specify how particular devices
   +---------------------+  should operate and be configured

   Figure 1.  The Policy Continuum

   High-level business policies are used to express the requirements of
   the different applications, and prioritize which applications get
   "better" treatment when the network is congested.  The goal, then, is
   to use policies to relate the operational and configuration needs of
   a device directly to the business rules that the network
   administrator is trying to implement in the network that the device
   belongs to.

   Device-independent policies translate business policies into a set of
   generalized operational and configuration policies that are
   independent of any specific device, but dependent on a particular set
   of QoS mechanisms, such as random early detection (RED) dropping or
   weighted round robin scheduling.  Not only does this enable different
   types of devices (routers, switches, hosts, etc.) to be controlled by
   QoS policies, it also enables devices made by different vendors that
   use the same types of QoS mechanisms to be controlled.  This enables
   these different devices to each supply the correct relative
   conditioning to the same type of traffic.

   In contrast, device-dependent policies translate device-independent
   policies into ones that are specific for a given device.  The reason
   that a distinction is made between device-independent and device-
   dependent policies is that in a given network, many different devices
   having many different capabilities need to be controlled together.
   Device-independent policies provide a common layer of abstraction for
   managing multiple devices of different capabilities, while device-
   dependent policies implement the specific conditioning that is
   required.  This document provides a common set of abstractions for
   representing QoS mechanisms in a device-independent way.

   This document is focused on the device-independent representation of
   QoS mechanisms.  QoS mechanisms are modeled in sufficient detail to
   provide a common device-independent representation of QoS policies.
   They can also be used to provide a basis for specialization, enabling
   each vendor to derive a set of vendor-specific classes that represent
   how traffic conditioning is done for that vendor's set of devices.

3.2. Specifying Policy Parameters

Policies are a function of parameters (attributes) and operators (boolean, arithmetic, relational, etc.). Therefore, both need to be defined as part of the same policy in order to correctly condition the traffic.

If the parameters of the policy are specified too narrowly, they will reflect the individual implementations of QoS in each device. As there is currently little consensus in the industry on what the correct implementation model for QoS is, most defined attributes would only be applicable to the unique characteristics of a few individual devices. Moreover, standardizing all of these potential implementation alternatives would be a never-ending task as new implementations continue to appear on the market.

   On the other hand, if the parameters of the policy are specified too
   broadly, it is impossible to develop meaningful policies. For
   example, if we concentrate on the so-called Olympic set of policies,
   a business policy like "Bob gets Gold Service," is clearly
   meaningless to the large majority of existing devices. This is
   because the device has no way of determining who Bob is, or what QoS
   mechanisms should be configured in what way to provide Gold service.

   Furthermore, Gold service may represent a single service, or it may
   identify a set of services that are related to each other. In the
   latter case, these services may have different conditioning
   characteristics.

   This document defines a set of parameters that fit into a canonical
   model for modeling the elements in the forwarding path of a device
   implementing QoS traffic conditioning.  By defining this model in a
   device-independent way, the needed parameters can be appropriately
   abstracted.

3.3. Specifying Policy Services

Administrators want the flexibility to be able to define traffic conditioning without having to have a low-level understanding of the different QoS mechanisms that implement that conditioning. Furthermore, administrators want the flexibility to group different services together, describing a higher-level concept such as "Gold Service". This higher-level service could be viewed as providing the processing to deliver "Gold" quality of service.

These two goals dictate the need for the following set of abstractions:

   o  a flexible way to describe a service

   o  must be able to group different services that may use different
      technologies (e.g., DiffServ and IEEE 802.1Q) together

   o  must be able to define a set of sub-services that together make up
      a higher-level service

   o  must be able to associate a service and the set of QoS mechanisms
      that are used to condition traffic for that service

   o  must be able to define policies that manage the QoS mechanisms
      used to implement a service.

   This document addresses this set of problems by defining a set of
   classes and associations that can represent abstract concepts like
   "Gold Service," and bind each of these abstract services to a
   specific set of QoS mechanisms that implement the conditioning that
   they require.  Furthermore, this document defines the concept of
   "sub-services," to enable Gold Service to be defined either as a
   single service or as a set of services that together should be
   treated as an atomic entity.

   Given these abstractions, policies (as defined in [QPIM]) can be
   written to control the QoS mechanisms and services defined in this
   document.

3.4. Level of Abstraction for Defining QoS Attributes and Classes

This document defines a set of classes and properties to support policies that configure device QoS mechanisms. This document concentrates on the representation of services in the datapath that support both DiffServ (for aggregate traffic conditioning) and IntServ (for flow-based traffic conditioning). Classes and properties for modeling IntServ admission control services may be defined in a future document.

The classes and properties in this document are designed to be used in conjunction with the QoS policy classes and properties defined in [QPIM]. For example, to preserve the delay characteristics committed to an end-user, a network administrator may wish to create policies that monitor the queue depths in a device, and adjust resource allocations when delay budgets are at risk (perhaps as a result of a network topology change). The classes and properties in this document define the specific services and mechanisms required to implement those services. The classes and properties defined in [QPIM] provide the overall structure of the policy that manages and configures this service.

This combination of low-level specification (using this document) and high-level structuring (using [QPIM]) of network services enables network administrators to define new services required of the network, that are directly related to business goals, while ensuring that such services can be managed.

However, this goal (of creating and managing service-oriented policies) can only be realized if policies can be constructed that are capable of supporting diverse implementations of QoS. The solution is to model the QoS capabilities of devices at the behavioral level. This means that for traffic conditioning services realized in the datapath, the model must support the following characteristics:

   o  modeling of a generic network service that has QoS capabilities

   o  modeling of how the traffic conditioning itself is defined

   o  modeling of how statistics are gathered to monitor QoS traffic
      conditioning services - this facet of the model will be added in a
      future document.

   This document models a network service, and associates it with one or
   more QoS mechanisms that are used to implement that service.  It also
   models in a canonical form the various components that are used to
   condition traffic, such that standard as well as custom traffic
   conditioning services may be described.

3.5. Characterization of QoS Properties

The QoS properties and classes will be described in more detail in Section 4. However, we should consider the basic characteristics of these properties, to understand the methodology for representing them.

There are essentially two types of properties, state and configuration. Configuration properties describe the desired state of a device, and include properties and classes for representing desired or proposed thresholds, bandwidth allocations, and how to classify traffic. State properties describe the actual state of the device. These include properties to represent the current operational values of the attributes in devices configured via the configuration properties, as well as properties that represent state (queue depths, excess capacity consumption, loss rates, and so forth).

In order to be correlated and used together, these two types of properties must be modeled using a common information model. State properties and their corresponding configuration settings can be modeled using the same classes in this model, although individual instances of the classes would have to be appropriately named or placed in different containers to distinguish current state values from desired configuration settings.

State information is addressed in a very limited fashion by QDDIM. Currently, only CurrentQueueDepth is proposed as an attribute on QueuingService. The majority of the model is related to configuration. Given this fact, it is assumed that this model is a direct memory map into a device. All manipulation of model classes and properties directly affects the state of the device. If it is desired to also use these classes to represent desired configuration, that is left to the discretion of the implementor.

   It is acknowledged that additional properties are needed to
   completely model current state.  However, many of the properties
   defined in this document represent exactly the state variables that
   will be configured by the configuration properties.  Thus, the
   definition of the configuration properties has an exact
   correspondence with the state properties, and can be used in modeling
   both actual (state) and desired/proposed configuration.
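
   The use of one class for both desired configuration and current
   state, distinguished only by the container in which an instance is
   placed, can be sketched as follows.  This is an illustration under
   stated assumptions, not part of the model: the Python class, the
   threshold property, the two dictionaries used as containers, and the
   drift() helper are hypothetical.  Only the names QueuingService and
   CurrentQueueDepth come from this document.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class QueuingService:
    """Hypothetical stand-in for the QueuingService class."""
    name: str
    threshold: Optional[int] = None            # assumed configuration property
    current_queue_depth: Optional[int] = None  # CurrentQueueDepth (state)


# Two containers hold instances of the same class: one for desired
# configuration, one for values read back from the device.
desired = {"q0": QueuingService("q0", threshold=64)}
current = {"q0": QueuingService("q0", threshold=32, current_queue_depth=17)}


def drift(name: str) -> dict:
    """Return {property: (desired, actual)} where a configured value differs."""
    want, have = desired[name], current[name]
    out = {}
    for f in fields(QueuingService):
        w, h = getattr(want, f.name), getattr(have, f.name)
        # Properties left unset in the desired instance are not compared.
        if w is not None and w != h:
            out[f.name] = (w, h)
    return out


print(drift("q0"))
```

   Because both containers use the same class, a desired value and the
   corresponding operational value can be compared property by
   property without any translation step.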

3.6. QoS Information Model Derivation

The question of context also leads to another question: how does the information specified in the core and QoS policy models ([PCIM], [PCIME], and [QPIM], respectively) integrate with the information defined in this document? To put it another way, where should device-independent concepts that lead to device-specific QoS attributes be derived from?

Past thinking was that QoS was part of the policy model. This view is not completely accurate, and it leads to confusion. QoS is a set of services that can be controlled using policy. These services are represented as device mechanisms. An important point here is that QoS services, as well as other types of services (e.g., security), are provided by the mechanisms inherent in a given device. This means that not all devices are indeed created equal. For example, although two devices may have the same type of mechanism (e.g., a queue), one may be a simple implementation (e.g., a FIFO queue) whereas the other may be much more complex and robust (e.g., class-based weighted fair queuing (CBWFQ)). However, both of these devices can be used to deliver QoS services, and both need to be controlled by policy. Thus, a device-independent policy can instruct the devices to queue certain traffic, and a device-specific policy can be used to control the queuing in each device.

Furthermore, policy is used to control these mechanisms, not to represent them. For example, QoS services are implemented with classifiers, meters, markers, droppers, queues, and schedulers. Similarly, security is also a characteristic of devices, as authentication and encryption capabilities represent services that networked devices perform (irrespective of interactions with policy servers). These security services may use some of the same mechanisms that are used by QoS services, such as the concepts of filters. However, they will mostly require different mechanisms than the ones used by QoS, even though both sets of services are implemented in the same devices.

Thus, the similarity between the QoS model and models for other services is not so much that they contain a few common mechanisms. Rather, they model how a device implements their respective services.
   As such, the modeling of QoS should be part of a networking device
   schema rather than a policy schema.  This allows the networking
   device schema to concentrate on modeling device mechanisms, and the
   policy schema to focus on the semantics of representing the policy
   itself (conditions, actions, operators, etc.).  While this document
   concentrates on defining an information model to represent QoS
   services in a device datapath, the ultimate goal is to be able to
   apply policies that control these services in network devices.
   Furthermore, these two schemata (device and policy) must be tightly
   integrated in order to enable policy to control QoS services.

3.7. Attribute Representation

The last issue to be considered is the question of how attributes are represented. If QoS attributes are represented as absolute numbers (e.g., Class AF2 gets 2 Mb/s of bandwidth), it is more difficult to make them uniform across multiple ports in a device or across multiple devices, because of the broad variation in link capacities. However, expressing attributes in relative or proportional terms (e.g., Class AF2 gets 5% of the total link bandwidth) makes it more difficult to express certain types of conditions and actions, such as:

      (If ConsumedBandwidth = AssignedBandwidth Then ...)

There are really three approaches to addressing this problem:

   o  Multiple properties can be defined to express the same value in
      various forms.  This idea has been rejected because of the
      difficulty in keeping these different properties synchronized
      (e.g., when one property changes, the others all have to be
      updated).

   o  Multi-modal properties can be defined to express the same value,
      in different terms, based on the access or assignment mode.  This
      option was rejected because it significantly complicates the
      model and is impossible to express in current directory access
      protocols (e.g., (L)DAP).

   o  Properties can be expressed as "absolutes", but the operators in
      the policy schema would need to be more sophisticated.  Thus, to
      represent a percentage, division and multiplication operators are
      required (e.g., Class AF2 gets .05 * the total link bandwidth).
      This is the approach that has been taken in this document.
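
The third approach, in which properties hold absolute values and the policy side supplies the arithmetic, can be illustrated with a short sketch. The function names, the link names, and the link speeds below are assumptions made for the example; they are not defined by this document.

```python
def assigned_bandwidth(link_bandwidth_bps: int, share: float) -> int:
    """Absolute allocation derived by multiplication, e.g. 0.05 * link."""
    return round(link_bandwidth_bps * share)


def at_limit(consumed_bps: int, assigned_bps: int) -> bool:
    """Condition of the form (If ConsumedBandwidth = AssignedBandwidth ...)."""
    return consumed_bps == assigned_bps


# The same 5% rule yields a different absolute value on each link, while
# the stored property stays a plain number in bits per second.
links = {"fast-ethernet": 100_000_000, "gigabit": 1_000_000_000}
af2 = {name: assigned_bandwidth(bps, 0.05) for name, bps in links.items()}

print(af2)
print(at_limit(5_000_000, af2["fast-ethernet"]))
```

Because the stored value is absolute, the comparison in at_limit() needs no knowledge of the link capacity; only the rule that computes the allocation does.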

3.8. Mental Model

The mental model for constructing this schema is based on the work done in the Differentiated Services working group. This schema is based on information provided in the current versions of the DiffServ Informal Management Model [DSMODEL], the DiffServ MIB [DSMIB], the PIB [PIB], as well as on information in the set of RFCs that constitute the basic definition of DiffServ itself ([R2475], [R2474], [R2597], and [R3246]). In addition, a common set of terminology is available in [POLTERM].

This model is built around two fundamental class hierarchies that are bound together using a set of associations. The two class hierarchies derive from the QoSService and ConditioningService base classes. A set of associations relate lower-level QoSService subclasses to higher-level QoS services, relate different types of conditioning services together in processing a traffic class, and relate a set of conditioning services to a specific QoS service. This combination of associations enables us to view the device as providing a set of services that can be configured, in a modular building block fashion, to construct application-specific services. Thus, this document can be used to model existing and future standard as well as application-specific network QoS services.

3.8.1. The QoSService Class

The first of the classes defined here, QoSService, is used to represent higher-level network services that require special conditioning of their traffic. An instance of QoSService (or one of its subclasses) is used to bring together a group of conditioning services that, from the perspective of the system manager, are all used to deliver a common service. Thus, the set of classifiers, markers, and related conditioning services that provide premium service to the "selected" set of user traffic may be grouped together into a premium QoS service.

QoSService has a set of subclasses that represent different approaches to delivering IP services. The currently defined subclasses are FlowService, for flow-oriented QoS delivery, and DiffServService, for DiffServ aggregate-oriented QoS service delivery.

The QoS services can be related to each other as peers, or they can be implemented as subservient services to each other. The QoSSubService aggregation indicates that one or more QoSService objects are subservient to a particular QoSService object. For example, this enables us to define Gold Service as a combination of two DiffServ services, one for high quality traffic treatment, and one for servicing the rest of the traffic. Each of these DiffServService objects would be associated with a set of classifiers, markers, etc., such that the high quality traffic would get EF marking and appropriate queuing.

   The DiffServService class itself has an AFService subclass.  This
   subclass is used to represent the specific notion that several
   related markings within the AF PHB Group work together to provide a
   single service.  When other DiffServ PHB Groups are defined that use
   more than one code point, these will be likely candidates for
   additional DiffServService subclasses.

   Technology-specific mappings of these services, representing the
   specific use of PHB marking or 802.1Q marking, are captured within
   the ConditioningService hierarchy, rather than in the subclasses of
   QoSService.

   These concepts are depicted in Figure 2.  Note that both of the
   associations are aggregations: a QoSService object aggregates both
   the set of QoSService objects subservient to it, and the set of
   ConditioningService objects that realize it.  See Section 4 for class
   and association definitions.

                /\______
           0..1 \/      |
   +--------------+     | QoSSubService     +---------------+
   |              |0..n |                   |               |
   |  QoSService  |-----                    | Conditioning  |
   |              |                         |   Service     |
   |              |                         |               |
   |              |0..n                 0..n|               |
   |              | /\______________________|               |
   |              | \/  QoSConditioning     |               |
   +--------------+       SubService        +---------------+

   Figure 2.  QoSService and its Aggregations
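
   The two aggregations in Figure 2 can be sketched in a few lines of
   code.  The sketch below is illustrative only: the Python attribute
   names, the all_conditioning() helper, and the example instance names
   are assumptions.  It does not enforce the cardinalities shown in the
   figure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConditioningService:
    name: str


@dataclass
class QoSService:
    name: str
    # QoSSubService: services subservient to this one.
    sub_services: List["QoSService"] = field(default_factory=list)
    # QoSConditioningSubService: conditioning services that realize it.
    conditioning: List[ConditioningService] = field(default_factory=list)

    def all_conditioning(self) -> List[ConditioningService]:
        """Collect conditioning services of this service and its sub-services."""
        found = list(self.conditioning)
        for sub in self.sub_services:
            found.extend(sub.all_conditioning())
        return found


class DiffServService(QoSService):
    """DiffServ aggregate-oriented service; adds nothing in this sketch."""


premium = DiffServService(
    "premium",
    conditioning=[ConditioningService("ef-marker"),
                  ConditioningService("priority-queue")],
)
best_effort = DiffServService(
    "best-effort", conditioning=[ConditioningService("default-queue")]
)
# Gold Service defined as a combination of two DiffServ services.
gold = QoSService("Gold Service", sub_services=[premium, best_effort])

print([c.name for c in gold.all_conditioning()])
```

   Treating Gold Service as one object with two sub-services lets a
   manager refer to the whole service while each DiffServService keeps
   its own set of conditioning mechanisms.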

3.8.2. The ConditioningService Class

The goal of the ConditioningService classes is to describe the sequence of traffic conditioning that is applied to a given traffic stream on the ingress interface through which it enters a device, and then on the egress interface through which it leaves the device. This is done using a set of classes and relationships. The routing decision in the device core, which selects which egress interface a particular packet will use, is not represented in this model.

A single base class, ConditioningService, is the superclass for a set of subclasses representing the mechanisms that condition traffic. These subclasses define device-independent conditioning primitives (including classifiers, meters, markers, droppers, queues, and schedulers) that together implement the conditioning of traffic on an interface. This model abstracts these services into a common set of modular building blocks that can be used, regardless of device implementation, to model the traffic conditioning internal to a device.

   The different conditioning mechanisms need to be related to each
   other to describe how traffic is conditioned.  Several important
   variations of how these services are related together exist:

   o  A particular ingress or egress interface may not require all the
      types of ConditioningServices.

   o  Multiple instances of the same mechanism may be required on an
      ingress or egress interface.

   o  There is no set order of application for the ConditioningServices
      on an ingress or egress interface.

   Therefore, this model does not dictate a fixed ordering among the
   subclasses of ConditioningService, or identify a subclass of
   ConditioningService that must appear first or last among the
   ConditioningServices on an ingress or egress interface.  Instead,
   this model ties together the various ConditioningService instances on
   an ingress or egress interface using the NextService,
   NextServiceAfterMeter, and NextServiceAfterClassifierElement
   associations.  There are also separate associations, called
   IngressConditioningServiceOnEndpoint and
   EgressConditioningServiceOnEndpoint, which, respectively, tie an
   ingress interface to its first ConditioningService, and tie an egress
   interface to its last ConditioningService(s).
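
   As a rough illustration of how the absence of a fixed ordering works
   out, the sketch below links ConditioningService instances with a
   NextService reference and walks the resulting datapath.  It is a
   simplification made for the example: each service has at most one
   successor here, whereas meters and classifiers in the model have
   several, and the datapath() helper and instance names are
   hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ConditioningService:
    name: str
    next_service: Optional["ConditioningService"] = None  # NextService


def datapath(first: ConditioningService) -> List[str]:
    """Follow NextService links from the first service on an interface."""
    names, seen, node = [], set(), first
    # The seen set guards against an accidental loop in the links.
    while node is not None and id(node) not in seen:
        seen.add(id(node))
        names.append(node.name)
        node = node.next_service
    return names


# Ingress interface: classifier -> marker -> queue.
queue = ConditioningService("queue")
marker = ConditioningService("marker", queue)
classifier = ConditioningService("classifier", marker)

# Egress interface: a different selection and order, with no classifier.
scheduler = ConditioningService("scheduler")
dropper = ConditioningService("dropper", ConditioningService("egress-queue", scheduler))

print(datapath(classifier))
print(datapath(dropper))
```

   The order of conditioning is carried entirely by the links between
   instances, so each interface can use a different subset and sequence
   of mechanisms.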

3.8.3. Preserving QoS Information from Ingress to Egress

There is one important way in which the QDDIM model diverges from the [DSMODEL]. In [DSMODEL], traffic passes through a network device in three stages:

   o  It comes in on an ingress interface, where it may receive QoS
      conditioning.

   o  It traverses the routing core, where logic outside the scope of
      QoS determines which egress interface it will use to leave the
      device.

   o  It may receive further QoS conditioning on the selected egress
      interface, and then it leaves the device.

   In the [DSMODEL] model, no information about the QoS conditioning
   that a packet receives on the ingress interface is communicated with
   the packet across the routing core to the egress interface.

   The QDDIM model relaxes this restriction, to allow information about
   the treatment that a packet received on an ingress interface to be
   communicated along with the packet to the egress interface.  (This
   relaxation adds a capability that is present in many network
   devices.)  QDDIM represents this information transfer in terms of a
   packet preamble, which is how many devices implement it.  But
   implementations are free to use other mechanisms to achieve the same
   result.

       +---------+
       | Meter-A |
    a  |         | b      d
   --->|      In-|---PM-1--->
       |         | c      e
       |     Out-|---PM-2--->
       +---------+

   Figure 3:  Meter Followed by Two Preamble Markers

   Figure 3 shows an example in which meter results are captured in a
   packet preamble.  The arrows labeled with single letters represent
   instances of either the NextService association (a, d, and e), or of
   its peer association NextServiceAfterMeter (b and c).  PreambleMarker
   PM-1 adds to the packet preamble an indication that the packet exited
   Meter A as conforming traffic. Similarly, PreambleMarker PM-2 adds to
   the preambles of packets that come through it indications that they
   exited Meter A as nonconforming traffic.  A PreambleMarker appends
   its information to whatever is already present in a packet preamble,
   as opposed to overwriting what is already there.

   To foster interoperability, the basic format of the information
   captured by a PreambleMarker is specified.  (Implementations, of
   course, are free to represent this information in a different way
   internally - this is just how it is represented in the model.) The
   information is represented by an ordered, multi-valued string
   property FilterItemList, where each individual value of the property
   is of the form "<type>,<value>".  When a PreambleMarker "appends" its
   information to the information that was already present in a packet
   preamble, it does so by adding additional items of the indicated
   format to the end of the list.
   QDDIM provides a limited set of <type>'s that a PreambleMarker may
   use:

   o  ConformingFromMeter: the value is the name of the meter.

   o  PartConformingFromMeter: the value is the name of the meter.

   o  NonConformingFromMeter: the value is the name of the meter.

   o  VlanId: the value is the virtual LAN identifier (VLAN ID).

   Implementations may recognize other <type>'s in addition to these.
   If collisions of implementation-specific <type>'s become a problem,
   it is possible that <type>'s may become an IANA-administered range in
   a future revision of this document.

   To make use of the information that a PreambleMarker stores in a
   packet preamble, a specific subclass PreambleFilter of
   FilterEntryBase is defined, to match on the "<type>,<value>" strings.
   To simplify the case where there's just a single level of metering in
   a device, but different individual meters on each ingress interface,
   PreambleFilter allows a wildcard "any" for the <value> part of the
   three meter-related filters.  With this wildcard, an administrator
   can specify a Classifier to select all packets that were found to be
   conforming (or partially conforming, or non-conforming) by their
   respective meters, without having to name each meter individually in
   a separate ClassifierElement.
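
   The append-only behavior of a PreambleMarker and the wildcard
   matching of a PreambleFilter can be sketched as follows.  This is a
   minimal illustration, not a definition: the function names and the
   use of a Python list for FilterItemList are assumptions, and only
   the "<type>,<value>" item format, the four <type> names, and the
   "any" wildcard are taken from the text above.

```python
from typing import List

METER_TYPES = {
    "ConformingFromMeter",
    "PartConformingFromMeter",
    "NonConformingFromMeter",
}


def preamble_mark(filter_item_list: List[str], item_type: str, value: str) -> None:
    """Append a "<type>,<value>" item; existing items are never overwritten."""
    filter_item_list.append(f"{item_type},{value}")


def preamble_filter_matches(filter_item_list: List[str],
                            item_type: str, value: str) -> bool:
    """Match one "<type>,<value>" pattern against a packet preamble.

    The value "any" acts as a wildcard for the three meter-related types only.
    """
    for item in filter_item_list:
        t, _, v = item.partition(",")
        if t != item_type:
            continue
        if v == value or (value == "any" and t in METER_TYPES):
            return True
    return False


preamble: List[str] = []
preamble_mark(preamble, "VlanId", "42")
preamble_mark(preamble, "ConformingFromMeter", "Meter-A")

print(preamble)
print(preamble_filter_matches(preamble, "ConformingFromMeter", "any"))
print(preamble_filter_matches(preamble, "NonConformingFromMeter", "any"))
```

   With the wildcard, a single filter selects conforming packets from
   every ingress meter, so the meters do not have to be named one by
   one.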

   Once a meter result has been stored in a packet preamble, it is
   available for any subsequent Classifier to use.  So while the
   motivation for this capability has been described in terms of
   preserving QoS conditioning information from an ingress interface to
   an egress interface, a prior meter result may also be used for
   classifying packets later in the datapath on the same interface where
   the meter resides.

3.9. Classifiers, FilterLists, and Filter Entries

This document uses a number of classes to model the classifiers defined in [DSMODEL]: ClassifierService, ClassifierElement, FilterList, FilterEntryBase, and various subclasses of FilterEntryBase. There are also two associations involved: ClassifierElementUsesFilterList and EntriesInFilterList. The QDDIM model makes no use of CIM's FilterEntry class.

In [DSMODEL], a single traffic stream coming into a classifier is split into multiple traffic streams leaving it, based on which of an ordered set of filters each packet in the incoming stream matches. A filter matches either a field in the packet itself, or possibly other attributes associated with the packet. In the case of a multi-field (MF) classifier, packets are assigned to output streams based on the contents of multiple fields in the packet header. For example, an MF classifier might assign packets to an output stream based on their complete IP-addressing 5-tuple.

   To optimize the representation of MF classifiers, subclasses of
   FilterEntryBase are introduced, which allow multiple related packet
   header fields to be represented in a single object.  These subclasses
   are IPHeaderFilter and 8021Filter.  With IPHeaderFilter, for example,
   criteria for selecting packets based on all five of the IP 5-tuple
   header fields and the DiffServ DSCP can be represented by a
   FilterList containing one IPHeaderFilter object.  Because these two
   classes have applications beyond those considered in this document,
   they, as well as the abstract class FilterEntryBase, are defined in
   the more general document [PCIME] rather than here.

   The FilterList object is always needed, even if it contains only one
   filter entry (that is, one FilterEntryBase subclass) object.  This is
   because a ClassifierElement can only be associated with a FilterList,
   as opposed to an individual filter entry.  FilterList is also defined
   in [PCIME].

   The EntriesInFilterList aggregation (also defined in [PCIME]) has a
   property EntrySequence, which in the past (in CIM) could be used to
   specify an evaluation order on the filter entries in a FilterList.
   Now, however, the EntrySequence property supports only a single
   value: '0'.  This value indicates that the FilterEntries are ANDed
   together to determine whether a packet matches the MF selector that
   the FilterList represents.
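
   The ANDing of filter entries within a FilterList can be sketched as
   follows.  The sketch is a simplification under stated assumptions:
   the IPHeaderFilter shown here carries only exact-match fields with
   hypothetical Python names, a packet is represented as a dictionary,
   and the address and port values are examples.  The actual class
   definitions are those in [PCIME].

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

Packet = Dict[str, object]


@dataclass
class IPHeaderFilter:
    """Hypothetical subset of IPHeaderFilter; None means "do not test"."""
    src_addr: Optional[str] = None
    dst_addr: Optional[str] = None
    protocol: Optional[int] = None
    src_port: Optional[int] = None
    dst_port: Optional[int] = None
    dscp: Optional[int] = None

    def matches(self, packet: Packet) -> bool:
        # Every field that is set must equal the packet's value.
        return all(want is None or packet.get(name) == want
                   for name, want in vars(self).items())


@dataclass
class FilterList:
    entries: List[IPHeaderFilter] = field(default_factory=list)

    def matches(self, packet: Packet) -> bool:
        # EntrySequence is always 0: entries are ANDed, order is irrelevant.
        return all(entry.matches(packet) for entry in self.entries)


web_ef = FilterList([IPHeaderFilter(protocol=6, dst_port=80),
                     IPHeaderFilter(dscp=46)])
packet = {"src_addr": "192.0.2.1", "dst_addr": "198.51.100.7",
          "protocol": 6, "src_port": 40000, "dst_port": 80, "dscp": 46}

print(web_ef.matches(packet))
print(web_ef.matches({**packet, "dscp": 0}))
```

   The two entries could equally be folded into a single IPHeaderFilter;
   they are kept separate here only to show that a packet must satisfy
   every entry in the list.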

   A ClassifierElement specifies the starting point for a specific
   policy or data path.  Each ClassifierElement uses the
   NextServiceAfterClassifierElement association to determine the next
   conditioning service to apply to packets.

   A ClassifierService defines a grouping of ClassifierElements. There
   are certain instances where a ClassifierService actually specifies an
   aggregation of ClassifierServices.  One practical case would be where
   each ClassifierService specifies a group of policies associated with
   a particular application and another ClassifierService groups the
   application-specific ClassifierService instances.  In this particular
   case, the application-specific ClassifierService instances are
   specified once, but unique combinations of these ClassifierServices
   are specified, as needed, using other ClassifierService instances.
   ClassifierService instances grouping other ClassifierService
   instances may not specify a FilterList using the
   ClassifierElementUsesFilterList association.  This special use of
   ClassifierService serves just as a Classifier collecting function.

3.10. Modeling of Droppers

   In [DSMODEL], a distinction is made between absolute droppers and
   algorithmic droppers.  In QDDIM, both of these types of droppers are
   modeled with the DropperService class, or with one of its subclasses.
   In both cases, the queue from which the dropper drops packets is tied
   to the dropper by an instance of the NextService association.  The
   dropper always plays the PrecedingService role in these associations,
   and the queue always plays the FollowingService role.  There is
   always exactly one queue from which a dropper drops packets.

   Since an absolute dropper drops all packets in its queue, it needs no
   configuration beyond a NextService tie to that queue.  For an
   algorithmic dropper, however, further configuration is needed:

      o    a specific drop algorithm;
      o    parameters for the algorithm (for example, token bucket size);
      o    the source(s) of input(s) to the algorithm;
      o    possibly per-input parameters for the algorithm.

   The first two of these items are represented by properties of the
   DropperService class, or properties of one of its subclasses.  The
   last two, however, involve additional classes and associations.

3.10.1. Configuring Head and Tail Droppers

   The HeadTailDropQueueBinding is the association that identifies the
   inputs for the algorithm executed by a tail dropper.  This
   association is not used for a head dropper, because a head dropper
   always has exactly one input to its drop algorithm, and this input is
   always the queue from which it drops packets.  For a tail dropper,
   this association is defined to have a many-to-many cardinality.
   There are, however, two distinct cases:

   One dropper bound to many queues: This represents the case where the
   drop algorithm for the dropper involves inputs from more than one
   queue.  The dropper still drops from only one queue, the one to which
   it is tied by a NextService association.  But the drop decision may
   be influenced by the state of several queues.  For the classes
   HeadTailDropper and HeadTailDropQueueBinding, the rule for combining
   the multiple inputs is simple addition: if the sum of the lengths of
   the monitored queues exceeds the dropper's QueueThreshold value, then
   packets are dropped.  This rule for combining inputs may, however, be
   overridden by a different rule in subclasses of one or both of these
   classes.
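
   The simple-addition rule can be written as a one-line predicate.
   This is only a sketch: the function name and the assumption that
   queue lengths and the threshold share the same units are
   illustrative and not part of the model.

```python
def tail_drop_decision(monitored_queue_lengths, queue_threshold):
    """Return True when the summed queue lengths exceed the threshold."""
    # Simple addition of all inputs bound via HeadTailDropQueueBinding.
    return sum(monitored_queue_lengths) > queue_threshold
```

   For example, three monitored queues of lengths 40, 30, and 35
   exceed a threshold of 100, so packets are dropped; lengths of 40,
   30, and 30 only reach the threshold, so they are not.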

   One queue bound to many droppers: This represents the case where the
   state of one queue (which is typically also the queue from which
   packets are dropped) provides an input to multiple droppers' drop
   algorithms.  A use case here is a classifier that splits a traffic
   stream into, say, four parts, representing four classes of traffic.
   Each of the parts goes through a separate HeadTailDropper, then
   they are re-merged onto the same queue.  The net result is a single
   queue containing packets of four traffic types, with, say, the
   following drop thresholds:

      o    Class 1 - 90% full
      o    Class 2 - 80% full
      o    Class 3 - 70% full
      o    Class 4 - 50% full

   Here the percentages represent the overall state of the queue.  With
   this configuration, when the queue in question becomes 50% full,
   Class 4 packets will be dropped rather than joining the queue; when
   it becomes 70% full, Class 3 and 4 packets will be dropped; and so
   on.
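
   The effect of the four thresholds on a shared queue can be sketched
   as follows.  The table and function names are hypothetical, and
   treating "becomes N% full" as "fill level at or above N%" is an
   assumption of this sketch.

```python
# Drop threshold per traffic class, as a percentage of queue capacity.
DROP_THRESHOLD_PERCENT = {1: 90, 2: 80, 3: 70, 4: 50}


def dropped_classes(queue_fill_percent):
    """List the classes whose arriving packets are dropped at this fill level."""
    return sorted(
        traffic_class
        for traffic_class, threshold in DROP_THRESHOLD_PERCENT.items()
        if queue_fill_percent >= threshold
    )
```

   At 50% fill only Class 4 is dropped; at 70% fill Classes 3 and 4
   are dropped; at 95% fill all four classes are dropped.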

   The two cases described here can also occur together, if a dropper
   receives inputs from multiple queues, one or more of which are also
   providing inputs to other droppers.

3.10.2. Configuring RED Droppers

   Like a tail dropper, a RED dropper, represented by an instance of the
   REDDropperService class, may take as its inputs the states of
   multiple queues.  In this case, however, there is an additional step:
   each of these inputs may be smoothed before the RED dropper uses it,
   and the smoothing process itself must be parameterized.
   Consequently, in addition to REDDropperService and QueuingService, a
   third class, DropThresholdCalculationService, is introduced, to
   represent the per-queue parameterization of this smoothing process.

   The following instance diagram illustrates how these classes work
   with each other:

           RDSvc-A
           |  |  |
     +-----+  |  +-----+
     |        |        |
   DTCS-1   DTCS-2   DTCS-3
     |        |        |
    Q-1      Q-2      Q-3

   Figure 4. Inputs for a RED Dropper

   So REDDropperService-A (RDSvc-A) is using inputs from three queues to
   make its drop decision.  (As always, RDSvc-A is linked to the queue
   from which it drops packets via the NextService association.)  For
   each of these three queues, there is a
   DropThresholdCalculationService (DTCS) instance that represents the
   smoothing weight and time interval to use when looking at that queue.
   Thus each DTCS instance is tied to exactly one queue, although a
   single queue may be examined (with different weight and time values)
   by multiple DTCS instances.  Also, a DTCS instance and the queue
   behind it can be thought of as a "unit of reusability".  So a single
   DTCS can be referred to by multiple RDSvc's.

   Unless it is overridden by a different rule in a subclass of
   REDDropperService, the rule that a RED dropper uses to combine the
   smoothed inputs from the DTCS's to create a value to use in making
   its drop decision is simple addition.
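
   A minimal sketch of this combination step follows.  The text above
   states only that each DTCS carries a smoothing weight and a time
   interval; the exponentially weighted moving average used here is an
   assumed smoothing formula, the time interval is omitted, and all
   class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DropThresholdCalculation:
    """Stand-in for a DTCS instance, tied to exactly one queue."""
    smoothing_weight: float       # assumed EWMA weight in (0, 1]
    smoothed_depth: float = 0.0

    def observe(self, queue_depth: float) -> float:
        # Assumed smoothing: exponentially weighted moving average.
        w = self.smoothing_weight
        self.smoothed_depth = (1 - w) * self.smoothed_depth + w * queue_depth
        return self.smoothed_depth


def red_input(calculations):
    """Combine the smoothed inputs by simple addition."""
    return sum(c.smoothed_depth for c in calculations)


dtcs = [DropThresholdCalculation(0.5),
        DropThresholdCalculation(0.5),
        DropThresholdCalculation(1.0)]
for calc, depth in zip(dtcs, [10, 20, 30]):
    calc.observe(depth)
```

   The RED dropper would then compare the value returned by
   red_input(dtcs) against its own thresholds to make the drop
   decision.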

3.11. Modeling of Queues and Schedulers

   In order to appreciate the rationale behind this rather complex model
   for scheduling, we must consider the rather complex nature of
   schedulers, as well as the extreme variations in algorithms and
   implementations.  Although these variations are broad, we have
   identified four examples that serve to test the model and justify its
   complexity.

3.11.1. Simple Hierarchical Scheduler

   A simple, hierarchical scheduler has the following properties.
   First, when a scheduling opportunity is given to a set of queues, a
   single, viable queue is determined based on some scheduling criteria,
   such as bandwidth or priority.  The output of the scheduler is the
   input to another scheduler that treats the first scheduler (and its
   queues) as a single logical queue.  Hence, if the first scheduler
   determined the appropriate packet to release based on a priority
   assigned to each queue, the second scheduler might specify a
   bandwidth limit/allocation for the entire set of queues aggregated by
   the first scheduler.

   +----------+                              NextService
   |QueuingSvc+----------------------------------------------+
   | Name=EF1 |                                              |
   |          | QueueTo    +--------------+ ElementSched     |
   |          +------------+PrioritySched +---------------+  |
   +----------+ Schedule   |Element       | Service       |  |
                           | Name=EF1-Pri |               |  v
                           | Priority=1   |    +-----------+-+-+
                           +--------------+    |SchedulingSvc  +
                                               | Name=PriSched1+
                           +--------------+    +----------+--+-+
                           |PrioritySched | ElementSched  |  ^
   +----------+            |Element       +---------------+  |
   |QueuingSvc| QueueTo    | Name=AF1x-Pri| Service          |
   | Name=AF1x+------------+ Priority=2   |                  |
   |          | Schedule   +--------------+                  |
   |          |                              NextService     |
   |          +----------------------------------------------+
   +----------+
   :
   +---------------+            NextScheduler
   |SchedulingSvc  +--------------------------------------------+
   | Name=PriSched1|                                            |
   +-------+-------+       +--------------------+ElementSchedSvc|
           | SchedToSched  |AllocationScheduling+--------+      |
           +---------------+Element             |        |      |
                           | Name=PriSched1-Band|        |      |
                           | Units=Bytes        |        |      v
                           | Bandwidth=100      | +------+------+--+
                           +--------------------+ |SchedulingSvc   |
                                                  | Name=BandSched1|
                           +--------------------+ +------+------+--+
                           |AllocationScheduling|        |      ^
   +---------------+       |Element             +--------+      |
   |QueuingService |       | Name=BE-Band       |ElementSchedSvc|
   | Name=BE       |QueueTo+ Units=Bytes        |               |
   |               |-------+ Bandwidth=50       |               |
   |               |Sched  +--------------------+               |
   |               |                             NextService    |
   |               +--------------------------------------------+
   +---------------+

   Figure 5. Example 1: Simple Hierarchical Scheduler

   Figure 5 illustrates the example and how it would be instantiated
   using the model.  In the figure, NextService determines the first
   scheduler after the queue.  NextScheduler determines the subsequent
   ordering of schedulers.  In addition, the ElementInSchedulingService
   aggregation (ElementSchedSvc in the figure) determines the set of
   scheduling parameters used by a specific scheduler.  Scheduling
   parameters can be bound either to queues or to schedulers.  In the
   case of the SchedulingElement EF1-Pri, the binding is to a queue, so
   the QueueToSchedule association is used.  In the case of the
   SchedulingElement PriSched1-Band, the binding is to another
   scheduler, so the SchedulingServiceToSchedule association
   (SchedToSched in the figure) is used.  Note that due to space
   constraints of the document, the SchedulingService PriSched1 is
   represented twice, to show how it is connected to all the other
   objects.
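
   The ordering expressed by NextService and NextScheduler in Figure 5
   can be traced with a small sketch.  The two dictionaries below are a
   hypothetical encoding of the association instances in the figure,
   keyed by the Name values shown there; the function name is likewise
   illustrative.

```python
# Association instances from Figure 5, keyed by the antecedent's Name.
NEXT_SERVICE = {"EF1": "PriSched1", "AF1x": "PriSched1", "BE": "BandSched1"}
NEXT_SCHEDULER = {"PriSched1": "BandSched1"}


def scheduler_chain(queue_name):
    """Return the schedulers a queue's packets pass through, in order."""
    chain = []
    scheduler = NEXT_SERVICE.get(queue_name)      # first scheduler after queue
    while scheduler is not None:
        chain.append(scheduler)
        scheduler = NEXT_SCHEDULER.get(scheduler)  # subsequent ordering
    return chain
```

   Under this encoding, the EF1 and AF1x queues are served by PriSched1
   and then BandSched1, while the BE queue is served by BandSched1
   directly.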

3.11.2. Complex Hierarchical Scheduler

   A complex, hierarchical scheduler has the same characteristics as a
   simple scheduler, except that the criteria for the second scheduler
   are determined on a per queue basis rather than on an aggregate
   basis.  One scenario might be a set of bounded priority schedulers.
   In this case, each queue is assigned a relative priority.  However,
   each queue is also not allowed to exceed a bandwidth allocation that
   is unique to that queue.  In order to support this scenario, the
   queue must be bound to two separate schedulers.

   Figure 6 illustrates this situation, by describing an EF queue and a
   best effort (BE) queue both pointing to a priority scheduler via the
   NextService association.  The NextScheduler association between the
   priority scheduler and the bandwidth scheduler in turn defines the
   ordering of the scheduling hierarchy.  Also note that each scheduler
   has a distinct set of scheduling parameters that are bound back to
   each queue.  This demonstrates the need to support two or more
   parameter sets on a per queue basis.

   +----------------+
   |QueuingService  |
   | Name=EF        |
   |                |QueueTo   +----------------+ElementSchedSvc
   |                +----------+AllocationSched +--------+
   ++---+-----------+Schedule  |Element         |        |
    |   |                      | Name=BandEF    |        |
    |   |QueueTo               | Units=Bytes    |        |
    |   |Schedule              | Bandwidth=100  |        |
    |   |                      +----------------+ +------+---------+
    |   |                                         |SchedulingSvc   |
    |   |      +------------------+               | Name=BandSched |
    |   +------+PriorityScheduling|               +------------+--++
    |          |Element           |                            ^  |
    |          | Name=PriEF       |ElementSchedSvc             |  |
    |          | Priority=1       +---------------------+      |  |
    |          +------------------+                     |      |  |
    |NextService                                        |      |  |
    +-------------------------------------------------+ |      |  |
                                                      | |      |  |
     NextService                                      | |      |  |
    +-----------------------------------------------+ | |      |  |
    |                                               | | |      |  |
    |          +------------------+ElementSchedSvc  | | |      |  |
    |          |PriorityScheduling+--------+        | | |      |  |
    |          |Element           |        |        | | |      |  |
    |          | Name=PriBE       |        |        v v |      |  |
    |   +------+ Priority=2       |    +---+--------+-+-+-+Next|  |
    |   |      +------------------+    |SchedulingService +----+  |
    |   |                              | Name=PriSched    |Sched  |
    |   |                              +------------------+       |
    |   |QueueTo                                                  |
    |   |Schedule              +----------------+                 |
    |   |                      |AllocationSched |ElementSchedSvc  |
   +----+---------+            |Element         +-----------------+
   |QueuingService|QueueTo     | Name=BandBE    |
   | Name=BE      +------------+ Units=Bytes    |
   |              |Schedule    | Bandwidth=50   |
   |              |            +----------------+
   +--------------+

   Figure 6. Example 2: Complex Hierarchical Scheduler
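
   The bounded priority behavior, in which each queue carries both a
   priority and its own bandwidth limit, can be sketched as follows.
   This is an illustration only: the class and field names are
   hypothetical, and the convention that a smaller Priority value is
   served first is an assumption chosen to match the EF=1, BE=2
   labels in the figure.

```python
from dataclasses import dataclass


@dataclass
class QueueState:
    name: str
    priority: int         # assumption: smaller value is served first
    bandwidth_limit: int  # per-queue limit, same units as `used`
    used: int             # bandwidth consumed in the current interval
    backlog: int          # packets waiting in the queue


def pick_queue(queues):
    """Pick the best-priority backlogged queue still under its own limit."""
    eligible = [q for q in queues
                if q.backlog > 0 and q.used < q.bandwidth_limit]
    if not eligible:
        return None
    return min(eligible, key=lambda q: q.priority).name
```

   With EF limited to 100 and BE limited to 50, EF is chosen while it
   remains under its limit; once EF has used its full allocation, a
   backlogged BE queue is chosen instead.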

3.11.3. Excess Capacity Scheduler

   An excess capacity scheduler presents a similar requirement to
   support two scheduling parameter sets per queue.  However, in this
   scenario the reasons are a little different.  Suppose a set of queues
   have each been assigned bandwidth limits to ensure that no traffic
   class starves out another traffic class.  The result may be that one
   or more queues have exceeded their allocation while the queues that
   deserve scheduling opportunities are empty.  The question then is how
   the excess (idle) bandwidth is allocated.

   Conceivably, the scheduling criteria for excess capacity are
   completely different from the criteria that determine allocations
   under uniform load.  This could be supported with a scheduling
   hierarchy.  However, the problem is that the criteria for using the
   subsequent scheduler are different from those in the last two cases.
   Specifically, the next scheduler should only be used if a scheduling
   opportunity exists that was passed over by the prior scheduler.  When
   a scheduler chooses to forgo a scheduling decision, it is behaving as
   a non-work conserving scheduler.  Work conserving schedulers, by
   definition, will always take advantage of a scheduling opportunity,
   irrespective of which queue is being serviced and how much bandwidth
   it has consumed in the past.

   This point leads to an interesting insight.  The semantics of a
   non-work conserving scheduler are equivalent to those of a meter, in
   that if a packet is in profile it is given the scheduling
   opportunity, and if it is out of profile it does not get a scheduling
   opportunity.  However, with meters there are semantics that determine
   the next action behavior when the packet is in profile and when the
   packet is out of profile.  Similarly, with the non-work conserving
   scheduler, there needs to be a means for determining the next
   scheduler when a scheduler chooses not to utilize a scheduling
   opportunity.

   Figure 7 illustrates this last scenario.  It appears very similar to
   Figure 6, except that the binding between the allocation scheduler
   and the WRR scheduler uses a FailNextScheduler association.  This
   association explicitly indicates that the WRR scheduler is used only
   when there are non-empty queues that the allocation scheduler
   rejected for scheduling consideration.  Note that Figure 7 is
   incomplete, in that typically there would be several more queues that
   are bound to an allocation scheduler and a WRR scheduler.

   +------------+
   |QueuingSvc  |
   | Name=EF    |
   |            |
   |            |
   ++-+---------+
    | |
    | |QueueTo
    | |Schedule                                     +--------------+
    | |                                             |SchedulingSvc |
    | |      +------------------+                   | Name=WRRSched|
    | +------+AllocationSched   |                   +----------+-+-+
    |        |Element           |                              ^ |
    |        | Name=BandEF      |ElementSchedSvc               | |
    |        | Units=Bytes      +--------------------+         | |
    |        | Bandwidth=100    |                    |         | |
    |        +------------------+                    |         | |
    |NextService                                     |         | |
    +----------------------------------------------+ |         | |
                                                   | |         | |
     NextService                                   | |         | |
    +--------------------------------------------+ | |         | |
    |                                            | | |         | |
    |        +------------------+ElementSchedSvc | | |         | |
    |        |AllocationSched   +--------+       | | |         | |
    |        |Element           |        |       | | |         | |
    |        | Name=BandwidthAF1|        |       | | |         | |
    |        | Units=Bytes      |        |       v v |         | |
    | +------+ Bandwidth=50     |  +--+----------+-+-++FailNext| |
    | |      +------------------+  |SchedulingService +--------+ |
    | |QueueTo                     | Name=BandSched   |Scheduler |
    | |Schedule                    +------------------+          |
    | |                                                          |
    | |                       +---------------------+            |
   ++-+-----------+           | WRRSchedulingElement|            |
   |QueuingService|QueueTo    | Name=WRRBE          +------------+
   | Name=BE      +-----------+ Weight=30           |ElementSchedSvc
   +--------------+Schedule   +---------------------+

   Figure 7.  Example 3: Excess Capacity Scheduler
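
   The FailNextScheduler semantics can be sketched as a fallback that
   is consulted only when the allocation scheduler passes over a
   scheduling opportunity.  The dictionary layout and function names
   are hypothetical, and excess_scheduler is a deliberately simplified
   placeholder for WRRSched, not a weighted round robin
   implementation.

```python
def allocation_scheduler(queues):
    """Non-work-conserving: offers only backlogged queues within allocation."""
    for q in queues:
        if q["backlog"] > 0 and q["used"] < q["allocation"]:
            return q["name"]
    return None  # the scheduling opportunity is passed over


def excess_scheduler(queues):
    """Hypothetical stand-in for WRRSched: first backlogged queue wins."""
    for q in queues:
        if q["backlog"] > 0:
            return q["name"]
    return None


def schedule(queues):
    """Return (scheduler name, queue name), or None if nothing is backlogged."""
    chosen = allocation_scheduler(queues)
    if chosen is not None:
        return ("BandSched", chosen)
    # FailNextScheduler: consulted only when BandSched declined to schedule.
    chosen = excess_scheduler(queues)
    return ("WRRSched", chosen) if chosen is not None else None
```

   A backlogged queue that has exhausted its allocation is therefore
   served by the excess-capacity scheduler, and only when no queue
   qualifies under the allocation scheduler.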

3.11.4. Hierarchical CBQ Scheduler

   A hierarchical class-based queuing (CBQ) scheduler is the fourth
   scenario to be considered.  In hierarchical CBQ, each queue is
   assigned a specific bandwidth allocation.  Queues are grouped
   together into a logical scheduler.  This logical scheduler in turn
   has an aggregate bandwidth allocation that equals the sum of the
   allocations of the queues it is scheduling.  In turn, logical
   schedulers can be aggregated into higher-level logical schedulers.

   Changing perspectives and looking top down, the top-most logical
   scheduler has 100% of the link capacity.  This allocation is parceled
   out to logical schedulers below it such that the sum of the
   allocations is equal to 100%.  These second tier schedulers may in
   turn parcel out their allocation across a third tier of schedulers,
   and so forth, down to the lowest tier, which parcels out its
   allocations to specific queues representing relatively fine-grained
   classes of traffic.

   The unique aspect of hierarchical CBQ is that when there is
   insufficient bandwidth for a specific allocation, schedulers higher
   in the tree are tested to see if another portion of the tree has
   capacity to spare.  Figure 8 demonstrates this example with two
   tiers.  The example is split in half because of space constraints,
   resulting in the CBQTier1 scheduling service instance being
   represented twice.  Note that the total allocation at the top tier is
   45 Mb.  The voice allocation is 22 Mb.  The remaining 23 Mb is split
   between FTP and Web, with 15 Mb allocated to Web and 8 Mb to FTP.
   Suppose that Web traffic is actually consuming 20 Mb (5 Mb in excess
   of its allocation).  If FTP is consuming 5 Mb, then the CBQTier1
   scheduler can offer 3 Mb of its allocation to Web traffic.  However,
   this is not enough, so the FailNextScheduler association needs to be
   traversed to determine if there is any excess capacity available from
   the voice class.  If the voice class is only consuming 15 Mb of its
   22 Mb allocation, there are sufficient resources to allow the Web
   traffic through.

   Note that FailNextScheduler is used as the association.  The reason
   is that the CBQTier1 scheduler in fact failed to schedule a packet
   because of insufficient resources.  It is conceivable that a variant
   of hierarchical CBQ allows a hierarchy for successful scheduling as
   well.  Hence, both associations (NextScheduler and FailNextScheduler)
   are necessary.

   +-----------+                        NextService
   |QueuingSvc +-------------------------------------------+
   | Name=Web  |                                           |
   |           |QueueTo+----------------+ ElementSchedSvc  |
   |           +-------+AllocationSched +----------------+ |
   +-----------+Sched  |Element         |                | |
                       | Name=Web-Alloc |                | v
                       | Bandwidth=15   |    +-----------+-+-+
                       +----------------+    |SchedulingSvc  +
                                             | Name=CBQTier1 +
                       +----------------+    +-----------+-+-+
                       |AllocationSched | ElementSchedSvc| ^
   +-----------+       |Element         +----------------+ |
   |QueuingSvc |QueueTo| Name=FTP-Alloc |                  |
   | Name=FTP  +-------+ Bandwidth=8    |                  |
   |           |Sched  +----------------+                  |
   |           |                        NextService        |
   |           +-------------------------------------------+
   +-----------+
   :

   +---------------+                    FailNextScheduler
   |SchedulingSvc  +---------------------------------------------+
   | Name=CBQTier1 |                                             |
   +-------+-------+       +---------------------+ElementSchedSvc|
           | SchedToSched  |AllocationScheduling +--------+      |
           +---------------+Element              |        |      |
                           | Name=LowPri-Alloc   |        |      |
                           | Bandwidth=23        |        |      v
                           +---------------------+  +-----+------+-+
                                                    |SchedulingSvc |
                                                    | Name=CBQTop  |
                        +---------------------+     +----------+-+-+
                        |AllocationScheduling |ElementSchedSvc | ^
   +------------+       |Element              +----------------+ |
   |QueuingSvc  |QueueTo| Name=BE-Band        |                  |
   | Name=Voice +-------+ Bandwidth=22        |                  |
   |            |Sched  +---------------------+                  |
   |            |                       NextService              |
   |            +------------------------------------------------+
   +------------+

   Figure 8.  Example 4: Hierarchical CBQ Scheduler
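
   The borrowing arithmetic in this example can be worked through
   step by step.  The allocations are the Bandwidth values shown in
   Figure 8 and the usage figures are those given in the text, all in
   Mb; the variable names are illustrative.

```python
# Allocations from Figure 8 and the usage assumed in the text, in Mb.
ALLOCATION = {"Web": 15, "FTP": 8, "Voice": 22}
usage = {"Web": 20, "FTP": 5, "Voice": 15}

# Web is over its allocation by this amount.
web_excess = usage["Web"] - ALLOCATION["Web"]

# CBQTier1 can lend whatever FTP is not using.
tier1_spare = ALLOCATION["FTP"] - usage["FTP"]

# Anything still missing must come from higher in the tree,
# reached by traversing FailNextScheduler to CBQTop.
still_needed = max(0, web_excess - tier1_spare)
top_spare = ALLOCATION["Voice"] - usage["Voice"]

web_admitted = still_needed <= top_spare
```

   Web exceeds its allocation by 5 Mb, CBQTier1 can cover 3 Mb of
   that from FTP's unused share, and the remaining 2 Mb fits within
   the 7 Mb that the voice class is not using.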

4. The Class Hierarchy

   The following sections present the class and association hierarchies
   that together comprise the information model for modeling QoS
   capabilities at the device level.

4.1. Associations and Aggregations

   Associations and aggregations are a means of representing
   relationships between two (or theoretically more) objects.
   Dependency, aggregation, and other relationships are modeled as
   classes containing two (or more) object references.  It should be
   noted that aggregations represent either "whole-part" or "collection"
   relationships.  For example, aggregation can be used to represent the
   containment relationship between a system and the components that
   constitute the system.

   Since associations and aggregations are classes, they can benefit
   from all of the object-oriented features that other non-relationship
   classes have.  For example, they can contain properties and methods,
   and inheritance can be used to refine their semantics such that they
   represent more specialized types of their superclasses.

   Note that an association (or an aggregation) object is treated as an
   atomic unit (individual instance), even though it relates, collects,
   or comprises multiple objects.  This is a defining feature of an
   association (or an aggregation): although the individual elements
   that are related to other objects have their own identities, the
   association (or aggregation) object that is constructed using these
   objects has its own identity and name as well.

   It is important to note that associations and aggregations form an
   inheritance hierarchy that is separate from the class inheritance
   hierarchy.  Although associations and aggregations are typically
   binary, there is nothing that prevents higher order associations or
   aggregations from being defined.  However, such associations and
   aggregations are inherently more complex to define, understand, and
   use.  In practice, associations and aggregations of orders higher
   than binary are rarely used, because of their greatly increased
   complexity and lack of generality.  All of the associations and
   aggregations defined in this model are binary.  Note also that by
   definition, associations and aggregations cannot be unary.

   Finally, note that associations and aggregations that are defined
   between two classes do not affect the classes themselves.  That is,
   the addition or deletion of an association or an aggregation does not
   affect the interfaces of the classes that it is connecting.
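
   The idea that an association is itself a class, holding two object
   references and refinable through inheritance, can be sketched as
   follows.  The Python classes are illustrative stand-ins only; the
   role names follow the PrecedingService and FollowingService roles
   of NextService, and the Service class is reduced to a name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Minimal stand-in for a related object; it knows nothing of its links."""
    name: str


@dataclass(frozen=True)
class NextService:
    """Binary association: an object in its own right with two references."""
    preceding_service: Service
    following_service: Service


@dataclass(frozen=True)
class NextScheduler(NextService):
    """Refines NextService through inheritance, as in the association hierarchy."""


link = NextScheduler(Service("PriSched1"), Service("BandSched1"))
```

   Creating or deleting the link object leaves the Service class
   untouched, which mirrors the point that associations do not affect
   the interfaces of the classes they connect.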

4.2. The Structure of the Class Hierarchies

   The structure of the class, association, and aggregation class
   inheritance hierarchies for managing the datapaths of QoS devices is
   shown, respectively, in Figure 9, Figure 10, and Figure 11.  The
   notation (CIMCORE) identifies a class defined in the CIM Core model.
   Please refer to [CIM] for the definitions of these classes.
   Similarly, the notation [PCIME] identifies a class defined in the
   Policy Core Information Model Extensions document.  This model has
   been influenced by [CIM], and is compatible with the Directory
   Enabled Networks (DEN) effort.

   +--ManagedElement (CIMCORE)
      |
      +--ManagedSystemElement (CIMCORE)
      |  |
      |  +--LogicalElement (CIMCORE)
      |     |
      |     +--Service (CIMCORE)
      |     |  |
      |     |  +--ConditioningService
      |     |  |  |
      |     |  |  +--ClassifierService
      |     |  |  |  |
      |     |  |  |  +--ClassifierElement
      |     |  |  |
      |     |  |  +--MeterService
      |     |  |  |  |
      |     |  |  |  +--AverageRateMeterService
      |     |  |  |  |
      |     |  |  |  +--EWMAMeterService
      |     |  |  |  |
      |     |  |  |  +--TokenBucketMeterService
      |     |  |  |
      |     |  |  +--MarkerService
      |     |  |  |  |
      |     |  |  |  +--PreambleMarkerService
      |     |  |  |  |
      |     |  |  |  +--TOSMarkerService
      |     |  |  |  |
      |     |  |  |  +--DSCPMarkerService
      |     |  |  |  |
      |     |  |  |  +--8021QMarkerService
      |     |  |  |
      |     |  |  +--DropperService
      |     |  |  |  |
      |     |  |  |  +--HeadTailDropperService
      |     |  |  |  |
      |     |  |  |  +--REDDropperService
      |     |  |  |
      |     |  |  +--QueuingService
      |     |  |  |
      |     |  |  +--PacketSchedulingService
      |     |  |     |
      |     |  |     +--NonWorkConservingSchedulingService
      |     |  |
      |     |  +--QoSService
      |     |  |  |
      |     |  |  +--DiffServService
      |     |  |  |   |
      |     |  |  |   +--AFService
      |     |  |  |
      |     |  |  +--FlowService
      |     |  |
      |     |  +--DropThresholdCalculationService
      |     |
      |     +--FilterEntryBase [PCIME]
      |     |  |
      |     |  +--IPHeaderFilter [PCIME]
      |     |  |
      |     |  +--8021Filter [PCIME]
      |     |  |
      |     |  +--PreambleFilter
      |     |
      |     +--FilterList [PCIME]
      |     |
      |     +--ServiceAccessPoint (CIMCORE)
      |        |
      |        +--ProtocolEndpoint
      |
      +--Collection (CIMCORE)
      |  |
      |  +--CollectionOfMSEs (CIMCORE)
      |     |
      |     +--BufferPool
      |
      +--SchedulingElement
         |
         +--AllocationSchedulingElement
         |
         +--WRRSchedulingElement
         |
         +--PrioritySchedulingElement
            |
            +--BoundedPrioritySchedulingElement

   Figure 9.  Class Inheritance Hierarchy

   The inheritance hierarchy for the associations defined in this
   document is shown in Figure 10.

   +--Dependency (CIMCORE)
   |  |
   |  +--ServiceSAPDependency (CIMCORE)
   |  |  |
   |  |  +--IngressConditioningServiceOnEndpoint
   |  |  |
   |  |  +--EgressConditioningServiceOnEndpoint
   |  |
   |  +--HeadTailDropQueueBinding
   |  |
   |  +--CalculationBasedOnQueue
   |  |
   |  +--ProvidesServiceToElement (CIMCORE)
   |  |  |
   |  |  +--ServiceServiceDependency (CIMCORE)
   |  |     |
   |  |     +--CalculationServiceForDropper
   |  |
   |  +--QueueAllocation
   |  |
   |  +--ClassifierElementUsesFilterList
   |
   +--AFRelatedServices
   |
   +--NextService
   |  |
   |  +--NextServiceAfterClassifierElement
   |  |
   |  +--NextScheduler
   |    |
   |    +--FailNextScheduler
   |
   +--NextServiceAfterMeter
   |
   +--QueueToSchedule
   |
   +--SchedulingServiceToSchedule

   Figure 10.  Association Class Inheritance Hierarchy

   The inheritance hierarchy for the aggregations defined in this
   document is shown in Figure 11.

   +--MemberOfCollection (CIMCORE)
   |  |
   |  +--CollectedBufferPool
   |
   +--Component (CIMCORE)
   |  |
   |  +--ServiceComponent (CIMCORE)
   |  |  |
   |  |  +--QoSSubService
   |  |  |
   |  |  +--QoSConditioningSubService
   |  |  |
   |  |  +--ClassifierElementInClassifierService
   |  |
   |  +--EntriesInFilterList [PCIME]
   |
   +--ElementInSchedulingService

   Figure 11.  Aggregation Class Inheritance Hierarchy


