3. Methodology
There is a clear need to define attributes and behavior that together define how traffic should be conditioned. This document defines a set of classes and relationships that represent the QoS mechanisms used to condition traffic; [QPIM] is used to define policies to control the QoS mechanisms defined in this document. However, some very basic issues need to be considered when combining these documents. Considering these issues should help in constructing a schema for managing the operation and configuration of network QoS mechanisms through the use of QoS policies.

3.1. Level of Abstraction for Expressing QoS Policies
The first issue requiring consideration is the level of abstraction at which QoS policies should be expressed. If we consider policies as a set of rules used to react to events and manipulate attributes or generate new events, we realize that policy represents a continuum of specifications that relate business goals and rules to the conditioning of traffic done by a device or a set of devices. An example of a business level policy might be: from 1:00 pm PST to 7:00 am EST, sell off 40% of the network capacity on the open market. In contrast, a device-specific policy might be: if the queue depth grows at a geometric rate over a specified duration, trigger a potential link failure event.

A general model for this continuum is shown in Figure 1 below.

   +---------------------+
   | High-Level Business |  Not directly related to device
   |      Policies       |  operation and configuration details
   +---------------------+
              |
              |
   +---------V-----------+
   | Device-Independent  |  Translate high-level policies to
   |      Policies       |  generic device operational and
   +---------------------+  configuration information
              |
              |
   +---------V-----------+
   |  Device-Dependent   |  Translate generic device information
   |      Policies       |  to specify how particular devices
   +---------------------+  should operate and be configured

               Figure 1.  The Policy Continuum

High-level business policies are used to express the requirements of the different applications, and prioritize which applications get "better" treatment when the network is congested. The goal, then, is to use policies to relate the operational and configuration needs of a device directly to the business rules that the network administrator is trying to implement in the network that the device belongs to.

Device-independent policies translate business policies into a set of generalized operational and configuration policies that are independent of any specific device, but dependent on a particular set of QoS mechanisms, such as random early detection (RED) dropping or weighted round robin scheduling. Not only does this enable different types of devices (routers, switches, hosts, etc.) to be controlled by QoS policies, it also enables devices made by different vendors that use the same types of QoS mechanisms to be controlled. This enables these different devices to each supply the correct relative conditioning to the same type of traffic.

In contrast, device-dependent policies translate device-independent policies into ones that are specific for a given device. The reason that a distinction is made between device-independent and device-dependent policies is that in a given network, many different devices having many different capabilities need to be controlled together. Device-independent policies provide a common layer of abstraction for managing multiple devices of different capabilities, while device-dependent policies implement the specific conditioning that is required.

This document focuses on the device-independent layer: it provides a common set of abstractions for representing QoS mechanisms in a device-independent way. QoS mechanisms are modeled in sufficient detail to provide a common device-independent representation of QoS policies.
They can also be used to provide a basis for specialization, enabling each vendor to derive a set of vendor-specific classes that represent how traffic conditioning is done for that vendor's set of devices.

3.2. Specifying Policy Parameters
Policies are a function of parameters (attributes) and operators (boolean, arithmetic, relational, etc.). Therefore, both need to be defined as part of the same policy in order to correctly condition the traffic. If the parameters of the policy are specified too narrowly, they will reflect the individual implementations of QoS in each device. As there is currently little consensus in the industry on what the correct implementation model for QoS is, most defined attributes would only be applicable to the unique characteristics of a few individual devices. Moreover, standardizing all of these
potential implementation alternatives would be a never-ending task as new implementations continue to appear on the market.

On the other hand, if the parameters of the policy are specified too broadly, it is impossible to develop meaningful policies. For example, if we concentrate on the so-called Olympic set of policies, a business policy like "Bob gets Gold Service" is clearly meaningless to the large majority of existing devices. This is because the device has no way of determining who Bob is, or what QoS mechanisms should be configured in what way to provide Gold service. Furthermore, Gold service may represent a single service, or it may identify a set of services that are related to each other. In the latter case, these services may have different conditioning characteristics.

This document defines a set of parameters that fit into a canonical model for modeling the elements in the forwarding path of a device implementing QoS traffic conditioning. By defining this model in a device-independent way, the needed parameters can be appropriately abstracted.

3.3. Specifying Policy Services
Administrators want the flexibility to be able to define traffic conditioning without having to have a low-level understanding of the different QoS mechanisms that implement that conditioning. Furthermore, administrators want the flexibility to group different services together, describing a higher-level concept such as "Gold Service". This higher-level service could be viewed as providing the processing to deliver "Gold" quality of service.

These two goals dictate the need for the following set of abstractions:

   o a flexible way to describe a service
   o must be able to group different services that may use different technologies (e.g., DiffServ and IEEE 802.1Q) together
   o must be able to define a set of sub-services that together make up a higher-level service
   o must be able to associate a service and the set of QoS mechanisms that are used to condition traffic for that service
   o must be able to define policies that manage the QoS mechanisms used to implement a service.

This document addresses this set of problems by defining a set of classes and associations that can represent abstract concepts like "Gold Service," and bind each of these abstract services to a specific set of QoS mechanisms that implement the conditioning that they require. Furthermore, this document defines the concept of "sub-services," to enable Gold Service to be defined either as a single service or as a set of services that together should be treated as an atomic entity.

Given these abstractions, policies (as defined in [QPIM]) can be written to control the QoS mechanisms and services defined in this document.

3.4. Level of Abstraction for Defining QoS Attributes and Classes
This document defines a set of classes and properties to support policies that configure device QoS mechanisms. This document concentrates on the representation of services in the datapath that support both DiffServ (for aggregate traffic conditioning) and IntServ (for flow-based traffic conditioning). Classes and properties for modeling IntServ admission control services may be defined in a future document.

The classes and properties in this document are designed to be used in conjunction with the QoS policy classes and properties defined in [QPIM]. For example, to preserve the delay characteristics committed to an end-user, a network administrator may wish to create policies that monitor the queue depths in a device, and adjust resource allocations when delay budgets are at risk (perhaps as a result of a network topology change). The classes and properties in this document define the specific services and mechanisms required to implement those services. The classes and properties defined in [QPIM] provide the overall structure of the policy that manages and configures this service.

This combination of low-level specification (using this document) and high-level structuring (using [QPIM]) of network services enables network administrators to define new services required of the network, that are directly related to business goals, while ensuring that such services can be managed. However, this goal (of creating and managing service-oriented policies) can only be realized if policies can be constructed that are capable of supporting diverse implementations of QoS. The solution is to model the QoS capabilities of devices at the behavioral level. This means that for traffic conditioning services realized in the datapath, the model must support the following characteristics:

   o modeling of a generic network service that has QoS capabilities
   o modeling of how the traffic conditioning itself is defined
   o modeling of how statistics are gathered to monitor QoS traffic conditioning services - this facet of the model will be added in a future document.

This document models a network service, and associates it with one or more QoS mechanisms that are used to implement that service. It also models in a canonical form the various components that are used to condition traffic, such that standard as well as custom traffic conditioning services may be described.

3.5. Characterization of QoS Properties
The QoS properties and classes will be described in more detail in Section 4. However, we should consider the basic characteristics of these properties, to understand the methodology for representing them.

There are essentially two types of properties, state and configuration. Configuration properties describe the desired state of a device, and include properties and classes for representing desired or proposed thresholds, bandwidth allocations, and how to classify traffic. State properties describe the actual state of the device. These include properties to represent the current operational values of the attributes in devices configured via the configuration properties, as well as properties that represent state (queue depths, excess capacity consumption, loss rates, and so forth).

In order to be correlated and used together, these two types of properties must be modeled using a common information model. State properties and their corresponding configuration settings can be modeled using the same classes in this model, although individual instances of the classes would have to be appropriately named or placed in different containers to distinguish current state values from desired configuration settings.

State information is addressed in a very limited fashion by QDDIM. Currently, only CurrentQueueDepth is proposed as an attribute on QueuingService. The majority of the model is related to configuration. Given this fact, it is assumed that this model is a direct memory map into a device. All manipulation of model classes and properties directly affects the state of the device. If it is desired to also use these classes to represent desired configuration, that is left to the discretion of the implementor.

It is acknowledged that additional properties are needed to completely model current state. However, many of the properties defined in this document represent exactly the state variables that will be configured by the configuration properties. Thus, the definition of the configuration properties has an exact correspondence with the state properties, and can be used in modeling both actual (state) and desired/proposed configuration.

3.6. QoS Information Model Derivation
The question of context also leads to another question: how does the information specified in the core and QoS policy models ([PCIM], [PCIME], and [QPIM], respectively) integrate with the information defined in this document? To put it another way, where should device-independent concepts that lead to device-specific QoS attributes be derived from? Past thinking was that QoS was part of the policy model. This view is not completely accurate, and it leads to confusion. QoS is a set of services that can be controlled using policy. These services are represented as device mechanisms. An important point here is that QoS services, as well as other types of services (e.g., security), are provided by the mechanisms inherent in a given device. This means that not all devices are indeed created equal. For example, although two devices may have the same type of mechanism (e.g., a queue), one may be a simple implementation (i.e., a FIFO queue) whereas one may be much more complex and robust (e.g., class-based weighted fair queuing (CBWFQ)). However, both of these devices can be used to deliver QoS services, and both need to be controlled by policy. Thus, a device-independent policy can instruct the devices to queue certain traffic, and a device-specific policy can be used to control the queuing in each device. Furthermore, policy is used to control these mechanisms, not to represent them. For example, QoS services are implemented with classifiers, meters, markers, droppers, queues, and schedulers. Similarly, security is also a characteristic of devices, as authentication and encryption capabilities represent services that networked devices perform (irrespective of interactions with policy servers). These security services may use some of the same mechanisms that are used by QoS services, such as the concepts of filters. However, they will mostly require different mechanisms than the ones used by QoS, even though both sets of services are implemented in the same devices. 
Thus, the similarity between the QoS model and models for other services is not so much that they contain a few common mechanisms. Rather, they model how a device implements their respective services.
As such, the modeling of QoS should be part of a networking device schema rather than a policy schema. This allows the networking device schema to concentrate on modeling device mechanisms, and the policy schema to focus on the semantics of representing the policy itself (conditions, actions, operators, etc.). While this document concentrates on defining an information model to represent QoS services in a device datapath, the ultimate goal is to be able to apply policies that control these services in network devices. Furthermore, these two schemata (device and policy) must be tightly integrated in order to enable policy to control QoS services.

3.7. Attribute Representation
The last issue to be considered is the question of how attributes are represented. If QoS attributes are represented as absolute numbers (e.g., Class AF2 gets 2 Mb/s of bandwidth), it is more difficult to make them uniform across multiple ports in a device or across multiple devices, because of the broad variation in link capacities. However, expressing attributes in relative or proportional terms (e.g., Class AF2 gets 5% of the total link bandwidth) makes it more difficult to express certain types of conditions and actions, such as:

   (If ConsumedBandwidth = AssignedBandwidth Then ...)

There are three approaches to addressing this problem:

   o Multiple properties can be defined to express the same value in various forms. This idea has been rejected because of the difficulty in keeping these different properties synchronized (e.g., when one property changes, the others all have to be updated).
   o Multi-modal properties can be defined to express the same value, in different terms, based on the access or assignment mode. This option was rejected because it significantly complicates the model and is impossible to express in current directory access protocols (e.g., (L)DAP).
   o Properties can be expressed as "absolutes", but the operators in the policy schema would need to be more sophisticated. Thus, to represent a percentage, division and multiplication operators are required (e.g., Class AF2 gets .05 * the total link bandwidth). This is the approach that has been taken in this document.
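
The third approach can be illustrated with a minimal sketch. It assumes a hypothetical Link class and helper functions that are not part of this model; it only shows how a proportional allocation such as ".05 * the total link bandwidth" is turned into an absolute value by a multiplication operator, after which a condition like ConsumedBandwidth = AssignedBandwidth can be evaluated directly.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical stand-in for an interface; not a class from this model."""
    capacity_bps: int


def assigned_bandwidth(link: Link, fraction: float) -> int:
    """Derive an absolute allocation (bits/s) from a proportional one."""
    return round(fraction * link.capacity_bps)


def at_allocation(consumed_bps: int, assigned_bps: int) -> bool:
    """Evaluate 'ConsumedBandwidth = AssignedBandwidth' on absolute values."""
    return consumed_bps == assigned_bps


# The same 5% rule yields a different absolute value on each link.
fast = Link(capacity_bps=1_000_000_000)   # 1 Gb/s port
slow = Link(capacity_bps=100_000_000)     # 100 Mb/s port

af2_fast = assigned_bandwidth(fast, 0.05)
af2_slow = assigned_bandwidth(slow, 0.05)
```

The stored property stays absolute on each port, so only the policy operators need to know about the proportional rule.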
3.8. Mental Model
The mental model for constructing this schema is based on the work done in the Differentiated Services working group. This schema is based on information provided in the current versions of the DiffServ Informal Management Model [DSMODEL], the DiffServ MIB [DSMIB], the PIB [PIB], as well as on information in the set of RFCs that constitute the basic definition of DiffServ itself ([R2475], [R2474], [R2597], and [R3246]). In addition, a common set of terminology is available in [POLTERM].

This model is built around two fundamental class hierarchies that are bound together using a set of associations. The two class hierarchies derive from the QoSService and ConditioningService base classes. A set of associations relate lower-level QoSService subclasses to higher-level QoS services, relate different types of conditioning services together in processing a traffic class, and relate a set of conditioning services to a specific QoS service. This combination of associations enables us to view the device as providing a set of services that can be configured, in a modular building block fashion, to construct application-specific services. Thus, this document can be used to model existing and future standard as well as application-specific network QoS services.

3.8.1. The QoSService Class
The first of the classes defined here, QoSService, is used to represent higher-level network services that require special conditioning of their traffic. An instance of QoSService (or one of its subclasses) is used to bring together a group of conditioning services that, from the perspective of the system manager, are all used to deliver a common service. Thus, the set of classifiers, markers, and related conditioning services that provide premium service to the "selected" set of user traffic may be grouped together into a premium QoS service. QoSService has a set of subclasses that represent different approaches to delivering IP services. The currently defined set of subclasses are a FlowService for flow-oriented QoS delivery and a DiffServService for DiffServ aggregate-oriented QoS service delivery. The QoS services can be related to each other as peers, or they can be implemented as subservient services to each other. The QoSSubService aggregation indicates that one or more QoSService objects are subservient to a particular QoSService object. For example, this enables us to define Gold Service as a combination of two DiffServ services, one for high quality traffic treatment, and one for servicing the rest of the traffic. Each of these
DiffServService objects would be associated with a set of classifiers, markers, etc., such that the high quality traffic would get EF marking and appropriate queuing.

The DiffServService class itself has an AFService subclass. This subclass is used to represent the specific notion that several related markings within the AF PHB Group work together to provide a single service. When other DiffServ PHB Groups are defined that use more than one code point, these will be likely candidates for additional DiffServService subclasses.

Technology-specific mappings of these services, representing the specific use of PHB marking or 802.1Q marking, are captured within the ConditioningService hierarchy, rather than in the subclasses of QoSService.

These concepts are depicted in Figure 2. Note that both of the associations are aggregations: a QoSService object aggregates both the set of QoSService objects subservient to it, and the set of ConditioningService objects that realize it. See Section 4 for class and association definitions.

         /\_________
    0..1 \/         |
   +--------------+ | QoSSubService    +---------------+
   |              |0..n                |               |
   |  QoSService  |-----               |  Conditioning |
   |              |                    |    Service    |
   |              |                    |               |
   |              |0..n            0..n|               |
   |              | /\_________________|               |
   |              | \/ QoSConditioning |               |
   +--------------+      SubService    +---------------+

          Figure 2.  QoSService and its Aggregations

3.8.2. The ConditioningService Class
The goal of the ConditioningService classes is to describe the sequence of traffic conditioning that is applied to a given traffic stream on the ingress interface through which it enters a device, and then on the egress interface through which it leaves the device. This is done using a set of classes and relationships. The routing decision in the device core, which selects which egress interface a particular packet will use, is not represented in this model. A single base class, ConditioningService, is the superclass for a set of subclasses representing the mechanisms that condition traffic.
These subclasses define device-independent conditioning primitives (including classifiers, meters, markers, droppers, queues, and schedulers) that together implement the conditioning of traffic on an interface. This model abstracts these services into a common set of modular building blocks that can be used, regardless of device implementation, to model the traffic conditioning internal to a device.

The different conditioning mechanisms need to be related to each other to describe how traffic is conditioned. Several important variations of how these services are related together exist:

   o A particular ingress or egress interface may not require all the types of ConditioningServices.
   o Multiple instances of the same mechanism may be required on an ingress or egress interface.
   o There is no set order of application for the ConditioningServices on an ingress or egress interface.

Therefore, this model does not dictate a fixed ordering among the subclasses of ConditioningService, or identify a subclass of ConditioningService that must appear first or last among the ConditioningServices on an ingress or egress interface. Instead, this model ties together the various ConditioningService instances on an ingress or egress interface using the NextService, NextServiceAfterMeter, and NextServiceAfterClassifierElement associations. There are also separate associations, called IngressConditioningServiceOnEndpoint and EgressConditioningServiceOnEndpoint, which, respectively, tie an ingress interface to its first ConditioningService, and tie an egress interface to its last ConditioningService(s).

3.8.3. Preserving QoS Information from Ingress to Egress
There is one important way in which the QDDIM model diverges from the [DSMODEL]. In [DSMODEL], traffic passes through a network device in three stages:

   o It comes in on an ingress interface, where it may receive QoS conditioning.
   o It traverses the routing core, where logic outside the scope of QoS determines which egress interface it will use to leave the device.
   o It may receive further QoS conditioning on the selected egress interface, and then it leaves the device.

In the [DSMODEL] model, no information about the QoS conditioning that a packet receives on the ingress interface is communicated with the packet across the routing core to the egress interface.

The QDDIM model relaxes this restriction, to allow information about the treatment that a packet received on an ingress interface to be communicated along with the packet to the egress interface. (This relaxation adds a capability that is present in many network devices.) QDDIM represents this information transfer in terms of a packet preamble, which is how many devices implement it. But implementations are free to use other mechanisms to achieve the same result.

              +---------+
              | Meter-A |
            a |         | b             d
          --->|      In-|---PM-1--->
              |         | c             e
              |     Out-|---PM-2--->
              +---------+

   Figure 3: Meter Followed by Two Preamble Markers

Figure 3 shows an example in which meter results are captured in a packet preamble. The arrows labeled with single letters represent instances of either the NextService association (a, d, and e), or of its peer association NextServiceAfterMeter (b and c). PreambleMarker PM-1 adds to the packet preamble an indication that the packet exited Meter A as conforming traffic. Similarly, PreambleMarker PM-2 adds to the preambles of packets that come through it indications that they exited Meter A as nonconforming traffic. A PreambleMarker appends its information to whatever is already present in a packet preamble, as opposed to overwriting what is already there.

To foster interoperability, the basic format of the information captured by a PreambleMarker is specified. (Implementations, of course, are free to represent this information in a different way internally - this is just how it is represented in the model.) The information is represented by an ordered, multi-valued string property FilterItemList, where each individual value of the property is of the form "<type>,<value>".
When a PreambleMarker "appends" its information to the information that was already present in a packet preamble, it does so by adding additional items of the indicated format to the end of the list.
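
The append behavior can be sketched as follows. This is a minimal illustration, not a definition from this document: the Packet class and the mark() method are hypothetical, and only the "<type>,<value>" item format and the append-to-end rule come from the text. The type names used are ones QDDIM defines for preamble items.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Packet:
    """Hypothetical packet; only the preamble matters for this sketch."""
    filter_item_list: List[str] = field(default_factory=list)


@dataclass
class PreambleMarker:
    """Adds one '<type>,<value>' item to the end of a packet's preamble."""
    item_type: str
    value: str

    def mark(self, packet: Packet) -> None:
        # Append, never overwrite: earlier items keep their positions.
        packet.filter_item_list.append(f"{self.item_type},{self.value}")


# The two markers of Figure 3, one per meter output.
pm1 = PreambleMarker("ConformingFromMeter", "Meter-A")
pm2 = PreambleMarker("NonConformingFromMeter", "Meter-A")

# A packet that already carries an item and then leaves Meter-A as conforming.
pkt = Packet(filter_item_list=["VlanId,42"])
pm1.mark(pkt)
```

After the call, the preamble holds the earlier VlanId item followed by the meter result, in that order.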
QDDIM provides a limited set of <type>'s that a PreambleMarker may use:

   o ConformingFromMeter: the value is the name of the meter.
   o PartConformingFromMeter: the value is the name of the meter.
   o NonConformingFromMeter: the value is the name of the meter.
   o VlanId: the value is the virtual LAN identifier (VLAN ID).

Implementations may recognize other <type>'s in addition to these. If collisions of implementation-specific <type>'s become a problem, it is possible that <type>'s may become an IANA-administered range in a future revision of this document.

To make use of the information that a PreambleMarker stores in a packet preamble, a specific subclass PreambleFilter of FilterEntryBase is defined, to match on the "<type>,<value>" strings. To simplify the case where there's just a single level of metering in a device, but different individual meters on each ingress interface, PreambleFilter allows a wildcard "any" for the <value> part of the three meter-related filters. With this wildcard, an administrator can specify a Classifier to select all packets that were found to be conforming (or partially conforming, or non-conforming) by their respective meters, without having to name each meter individually in a separate ClassifierElement.

Once a meter result has been stored in a packet preamble, it is available for any subsequent Classifier to use. So while the motivation for this capability has been described in terms of preserving QoS conditioning information from an ingress interface to an egress interface, a prior meter result may also be used for classifying packets later in the datapath on the same interface where the meter resides.

3.9. Classifiers, FilterLists, and Filter Entries
This document uses a number of classes to model the classifiers defined in [DSMODEL]: ClassifierService, ClassifierElement, FilterList, FilterEntryBase, and various subclasses of FilterEntryBase. There are also two associations involved: ClassifierElementUsesFilterList and EntriesInFilterList. The QDDIM model makes no use of CIM's FilterEntry class. In [DSMODEL], a single traffic stream coming into a classifier is split into multiple traffic streams leaving it, based on which of an ordered set of filters each packet in the incoming stream matches. A
filter matches either a field in the packet itself, or possibly other attributes associated with the packet. In the case of a multi-field (MF) classifier, packets are assigned to output streams based on the contents of multiple fields in the packet header. For example, an MF classifier might assign packets to an output stream based on their complete IP-addressing 5-tuple.

To optimize the representation of MF classifiers, subclasses of FilterEntryBase are introduced, which allow multiple related packet header fields to be represented in a single object. These subclasses are IPHeaderFilter and 8021Filter. With IPHeaderFilter, for example, criteria for selecting packets based on all five of the IP 5-tuple header fields and the DiffServ DSCP can be represented by a FilterList containing one IPHeaderFilter object. Because these two classes have applications beyond those considered in this document, they, as well as the abstract class FilterEntryBase, are defined in the more general document [PCIME] rather than here.

The FilterList object is always needed, even if it contains only one filter entry (that is, one FilterEntryBase subclass) object. This is because a ClassifierElement can only be associated with a FilterList, as opposed to an individual filter entry. FilterList is also defined in [PCIME].

The EntriesInFilterList aggregation (also defined in [PCIME]) has a property EntrySequence, which in the past (in CIM) could be used to specify an evaluation order on the filter entries in a FilterList. Now, however, the EntrySequence property supports only a single value: '0'. This value indicates that the filter entries are ANDed together to determine whether a packet matches the MF selector that the FilterList represents.

A ClassifierElement specifies the starting point for a specific policy or data path. Each ClassifierElement uses the NextServiceAfterClassifierElement association to determine the next conditioning service to apply to the packets it selects.
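
The relationship between filter entries, a FilterList, and a ClassifierElement can be sketched as below. This is only an illustration under stated assumptions: packets are plain dictionaries, filter entries are simple predicates standing in for FilterEntryBase subclasses, services are referred to by name, and classify() is a hypothetical helper. The AND semantics and the use of an ordered set of filters follow the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Packet = Dict[str, object]               # hypothetical: field name -> value
FilterEntry = Callable[[Packet], bool]   # stands in for a FilterEntryBase subclass


@dataclass
class FilterList:
    entries: List[FilterEntry]

    def matches(self, packet: Packet) -> bool:
        # EntrySequence '0': the entries are ANDed together.
        return all(entry(packet) for entry in self.entries)


@dataclass
class ClassifierElement:
    filter_list: FilterList   # ClassifierElementUsesFilterList
    next_service: str         # NextServiceAfterClassifierElement, by name


def classify(elements: List[ClassifierElement], packet: Packet) -> Optional[str]:
    """Return the next service of the first element whose FilterList matches."""
    for element in elements:
        if element.filter_list.matches(packet):
            return element.next_service
    return None


# Two elements: UDP traffic marked with DSCP 46, then a catch-all.
voice = ClassifierElement(
    FilterList([lambda p: p.get("protocol") == 17,
                lambda p: p.get("dscp") == 46]),
    next_service="Meter-A",
)
default = ClassifierElement(FilterList([lambda p: True]),
                            next_service="Queue-BestEffort")
elements = [voice, default]

voice_result = classify(elements, {"protocol": 17, "dscp": 46})
other_result = classify(elements, {"protocol": 17, "dscp": 0})
```

Note that even the catch-all element goes through a FilterList with a single entry, mirroring the rule that a ClassifierElement is never tied directly to an individual filter entry.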
A ClassifierService defines a grouping of ClassifierElements. There are certain instances where a ClassifierService actually specifies an aggregation of ClassifierServices. One practical case would be where each ClassifierService specifies a group of policies associated with a particular application and another ClassifierService groups the application-specific ClassifierService instances. In this particular case, the application-specific ClassifierService instances are specified once, but unique combinations of these ClassifierServices are specified, as needed, using other ClassifierService instances. ClassifierService instances grouping other ClassifierService instances may not specify a FilterList using the
ClassifierElementUsesFilterList association. This special use of ClassifierService serves just as a Classifier collecting function.

3.10. Modeling of Droppers
In [DSMODEL], a distinction is made between absolute droppers and algorithmic droppers. In QDDIM, both of these types of droppers are modeled with the DropperService class, or with one of its subclasses. In both cases, the queue from which the dropper drops packets is tied to the dropper by an instance of the NextService association. The dropper always plays the PrecedingService role in these associations, and the queue always plays the FollowingService role. There is always exactly one queue from which a dropper drops packets.

Since an absolute dropper drops all packets in its queue, it needs no configuration beyond a NextService tie to that queue. For an algorithmic dropper, however, further configuration is needed:

   o a specific drop algorithm;
   o parameters for the algorithm (for example, token bucket size);
   o the source(s) of input(s) to the algorithm;
   o possibly per-input parameters for the algorithm.

The first two of these items are represented by properties of the DropperService class, or properties of one of its subclasses. The last two, however, involve additional classes and associations.

3.10.1. Configuring Head and Tail Droppers
The HeadTailDropQueueBinding is the association that identifies the inputs for the algorithm executed by a tail dropper. This association is not used for a head dropper, because a head dropper always has exactly one input to its drop algorithm, and this input is always the queue from which it drops packets. For a tail dropper, this association is defined to have a many-to-many cardinality. There are, however, two distinct cases:

One dropper bound to many queues: This represents the case where the drop algorithm for the dropper involves inputs from more than one queue. The dropper still drops from only one queue, the one to which it is tied by a NextService association, but the drop decision may be influenced by the state of several queues. For the classes HeadTailDropper and HeadTailDropQueueBinding, the rule for combining the multiple inputs is simple addition: if the sum of the lengths of the monitored queues exceeds the dropper's QueueThreshold value, then packets are dropped. This rule for combining inputs may, however, be overridden by a different rule in subclasses of one or both of these classes.

One queue bound to many droppers: This represents the case where the state of one queue (which is typically also the queue from which packets are dropped) provides an input to multiple droppers' drop algorithms. A use case here is a classifier that splits a traffic stream into, say, four parts, representing four classes of traffic. Each of the parts goes through a separate HeadTailDropper, and the parts are then re-merged onto the same queue. The net result is a single queue containing packets of four traffic types, with, say, the following drop thresholds:

   o  Class 1 - 90% full
   o  Class 2 - 80% full
   o  Class 3 - 70% full
   o  Class 4 - 50% full

Here the percentages represent the overall state of the queue. With this configuration, when the queue in question becomes 50% full, Class 4 packets will be dropped rather than joining the queue; when it becomes 70% full, Class 3 and 4 packets will be dropped; and so on.

The two cases described here can also occur together, if a dropper receives inputs from multiple queues, one or more of which are also providing inputs to other droppers.

3.10.2. Configuring RED Droppers
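Before turning to RED droppers, the two head and tail binding cases of Section 3.10.1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the classes Queue and TailDropper, the field names, and the queue capacity of 100 packets are hypothetical and are not part of the model. Only the combining rule (simple addition compared against QueueThreshold) and the four example thresholds come from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Queue:
    """Hypothetical stand-in for a QueuingService instance."""
    name: str
    depth: int = 0  # current length, in packets


@dataclass
class TailDropper:
    """Hypothetical stand-in for a tail dropper."""
    queue_threshold: int  # plays the role of QueueThreshold
    target: Queue         # the one queue tied to the dropper by NextService
    monitored: List[Queue] = field(default_factory=list)  # bound input queues

    def should_drop(self) -> bool:
        # Default combining rule: simple addition of the monitored lengths.
        return sum(q.depth for q in self.monitored) > self.queue_threshold

    def offer(self) -> bool:
        """Admit one packet to the target queue unless the drop rule fires."""
        if self.should_drop():
            return False
        self.target.depth += 1
        return True


# One queue bound to many droppers. The capacity of 100 packets is an
# assumption, chosen so that the percentages map directly onto depths.
shared = Queue("shared")
thresholds = {"Class 1": 90, "Class 2": 80, "Class 3": 70, "Class 4": 50}
droppers = {cls: TailDropper(limit, shared, [shared])
            for cls, limit in thresholds.items()}

# One dropper bound to many queues: the decision uses the summed depths.
q1, q2 = Queue("q1", depth=30), Queue("q2", depth=25)
multi = TailDropper(queue_threshold=50, target=q1, monitored=[q1, q2])


def dropped_classes(depth: int) -> List[str]:
    """Classes whose packets are dropped when the shared queue holds depth."""
    shared.depth = depth
    return [cls for cls, d in droppers.items() if d.should_drop()]
```

With the shared queue at a depth of 60, only the Class 4 dropper fires; at 75, the Class 3 and Class 4 droppers fire. The dropper named multi drops because 30 + 25 exceeds its threshold of 50, even though neither monitored queue exceeds it alone.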
Like a tail dropper, a RED dropper, represented by an instance of the REDDropperService class, may take as its inputs the states of multiple queues. In this case, however, there is an additional step: each of these inputs may be smoothed before the RED dropper uses it, and the smoothing process itself must be parameterized. Consequently, in addition to REDDropperService and QueuingService, a third class, DropThresholdCalculationService, is introduced, to represent the per-queue parameterization of this smoothing process.
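The smoothing step can be pictured with a small Python sketch. Everything named here is an assumption made for illustration: the class DropThresholdCalc, the function red_input, and in particular the choice of an exponentially weighted moving average as the smoothing function, which the model does not mandate. The time interval parameter is left out for brevity. The smoothed values are combined by simple addition, which is the default rule this model gives for RED droppers.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Queue:
    """Hypothetical stand-in for a QueuingService instance."""
    depth: float = 0.0


@dataclass
class DropThresholdCalc:
    """Hypothetical stand-in for one DropThresholdCalculationService."""
    queue: Queue          # the single queue this instance examines
    weight: float         # smoothing weight, 0 < weight <= 1
    average: float = 0.0  # smoothed queue depth

    def sample(self) -> float:
        # Assumed smoothing function: exponentially weighted moving average.
        self.average = ((1 - self.weight) * self.average
                        + self.weight * self.queue.depth)
        return self.average


def red_input(calcs: List[DropThresholdCalc]) -> float:
    """Combine the smoothed per-queue inputs by simple addition."""
    return sum(c.sample() for c in calcs)
```

Because each DropThresholdCalc carries its own weight, the same queue can be examined by several instances with different smoothing, and one instance can be shared by several droppers.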
The following instance diagram illustrates how these classes work with each other:

                          RDSvc-A
                          |  |  |
                    +-----+  |  +-----+
                    |        |        |
                 DTCS-1   DTCS-2   DTCS-3
                    |        |        |
                   Q-1      Q-2      Q-3

              Figure 4. Inputs for a RED Dropper

So REDDropperService-A (RDSvc-A) is using inputs from three queues to make its drop decision. (As always, RDSvc-A is linked to the queue from which it drops packets via the NextService association.) For each of these three queues, there is a DropThresholdCalculationService (DTCS) instance that represents the smoothing weight and time interval to use when looking at that queue. Thus each DTCS instance is tied to exactly one queue, although a single queue may be examined (with different weight and time values) by multiple DTCS instances. Also, a DTCS instance and the queue behind it can be thought of as a "unit of reusability". So a single DTCS can be referred to by multiple RDSvc's.

Unless it is overridden by a different rule in a subclass of REDDropperService, the rule that a RED dropper uses to combine the smoothed inputs from the DTCS's to create a value to use in making its drop decision is simple addition.

3.11. Modeling of Queues and Schedulers
In order to appreciate the rationale behind this rather complex model for scheduling, we must consider the rather complex nature of schedulers, as well as the extreme variations in algorithms and implementations. Although these variations are broad, we have identified four examples that serve to test the model and justify its complexity.

3.11.1. Simple Hierarchical Scheduler
A simple, hierarchical scheduler has the following properties. First, when a scheduling opportunity is given to a set of queues, a single, viable queue is determined based on some scheduling criteria, such as bandwidth or priority. The output of the scheduler is the input to another scheduler that treats the first scheduler (and its queues) as a single logical queue. Hence, if the first scheduler determined the appropriate packet to release based on a priority assigned to each queue, the second scheduler might specify a bandwidth limit/allocation for the entire set of queues aggregated by the first scheduler.

   +----------+           NextService
   |QueuingSvc+----------------------------------------------+
   | Name=EF1 |                                              |
   |          | QueueTo    +--------------+  ElementSched    |
   |          +------------+PrioritySched +---------------+  |
   +----------+ Schedule   |Element       |  Service      |  |
                           | Name=EF1-Pri |               |  v
                           | Priority=1   |    +----------+--+-+
                           +--------------+    |SchedulingSvc  |
                                               | Name=PriSched1|
                           +--------------+    +----------+--+-+
                           |PrioritySched | ElementSched  |  ^
   +----------+            |Element       +---------------+  |
   |QueuingSvc| QueueTo    | Name=AF1x-Pri|   Service        |
   | Name=AF1x+------------+ Priority=2   |                  |
   |          | Schedule   +--------------+                  |
   |          |           NextService                        |
   |          +----------------------------------------------+
   +----------+
                                  :
   +---------------+           NextScheduler
   |SchedulingSvc  +--------------------------------------------+
   | Name=PriSched1|                                            |
   +-------+-------+       +--------------------+ElementSchedSvc|
           | SchedToSched  |AllocationScheduling+--------+      |
           +---------------+Element             |        |      |
                           | Name=PriSched1-Band|        |      |
                           | Units=Bytes        |        |      v
                           | Bandwidth=100      | +------+------+--+
                           +--------------------+ |SchedulingSvc   |
                                                  | Name=BandSched1|
                           +--------------------+ +------+------+--+
                           |AllocationScheduling|        |      ^
   +---------------+       |Element             +--------+      |
   |QueuingService |       | Name=BE-Band       |ElementSchedSvc|
   | Name=BE       |QueueTo| Units=Bytes        |               |
   |               +-------+ Bandwidth=50       |               |
   |               |Sched  +--------------------+               |
   |               |           NextService                      |
   |               +--------------------------------------------+
   +---------------+

        Figure 5. Example 1: Simple Hierarchical Scheduler
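The behavior that Figure 5 instantiates can be sketched in Python as follows. This is a minimal sketch, not a definition of the model: the class names and methods are hypothetical, the rule that a lower Priority number is served first is an assumption, and the allocation scheduler is simplified to a byte budget per round. What it shows is the key property of the example: the second scheduler treats the first scheduler as a single logical queue.

```python
from collections import deque


class Queue:
    """Hypothetical stand-in for a QueuingService instance."""

    def __init__(self, name):
        self.name = name
        self.packets = deque()  # packet sizes, in bytes

    def push(self, size):
        self.packets.append(size)

    def head(self):
        """Size of the next packet, or None if the queue is empty."""
        return self.packets[0] if self.packets else None

    def pop(self):
        return (self.name, self.packets.popleft())


class PriorityScheduler:
    """Picks the viable queue with the best priority; looks like one queue."""

    def __init__(self, name, inputs):
        self.name = name
        # inputs: (priority, queue). Assumption: lower number is served first.
        self.inputs = sorted(inputs, key=lambda pair: pair[0])

    def _viable(self):
        return next((q for _, q in self.inputs if q.head() is not None), None)

    def head(self):
        queue = self._viable()
        return queue.head() if queue else None

    def pop(self):
        return self._viable().pop()


class AllocationScheduler:
    """Gives each input a byte budget per round; inputs may be schedulers."""

    def __init__(self, name, inputs):
        self.name = name
        self.inputs = inputs  # (bytes per round, queue or scheduler)

    def run_round(self):
        sent = []
        for budget, source in self.inputs:
            while source.head() is not None and source.head() <= budget:
                budget -= source.head()
                sent.append(source.pop())
        return sent


# Instances named after Figure 5.
ef1, af1x, be = Queue("EF1"), Queue("AF1x"), Queue("BE")
pri_sched1 = PriorityScheduler("PriSched1", [(1, ef1), (2, af1x)])
band_sched1 = AllocationScheduler("BandSched1", [(100, pri_sched1), (50, be)])
```

Because PriorityScheduler exposes the same head and pop operations as Queue, BandSched1 applies the 100-byte allocation to EF1 and AF1x as an aggregate, without knowing how PriSched1 chooses between them.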
Figure 5 illustrates the example and how it would be instantiated using the model. In the figure, NextService determines the first scheduler after the queue. NextScheduler determines the subsequent ordering of schedulers. In addition, the ElementSchedulingService association determines the set of scheduling parameters used by a specific scheduler. Scheduling parameters can be bound either to queues or to schedulers. In the case of the SchedulingElement EF1-Pri, the binding is to a queue, so the QueueToSchedule association is used. In the case of the SchedulingElement PriSched1-Band, the binding is to another scheduler, so the SchedulerToSchedule association is used. Note that due to space constraints of the document, the SchedulingService PriSched1 is represented twice, to show how it is connected to all the other objects.

3.11.2. Complex Hierarchical Scheduler
A complex, hierarchical scheduler has the same characteristics as a simple scheduler, except that the criteria for the second scheduler are determined on a per queue basis rather than on an aggregate basis. One scenario might be a set of bounded priority schedulers. In this case, each queue is assigned a relative priority. However, each queue is also not allowed to exceed a bandwidth allocation that is unique to that queue. In order to support this scenario, the queue must be bound to two separate schedulers. Figure 6 illustrates this situation, by describing an EF queue and a best effort (BE) queue both pointing to a priority scheduler via the NextService association. The NextScheduler association between the priority scheduler and the bandwidth scheduler in turn defines the ordering of the scheduling hierarchy. Also note that each scheduler has a distinct set of scheduling parameters that are bound back to each queue. This demonstrates the need to support two or more parameter sets on a per queue basis.
+----------------+ |QueuingService | | Name=EF | | |QueueTo +----------------+ElementSchedSvc | +----------+AllocationSched +--------+ ++---+-----------+Schedule |Element | | | | | Name=BandEF | | | |QueueTo | Units=Bytes | | | |Schedule | Bandwidth=100 | | | | +----------------+ +------+---------+ | | |SchedulingSvc | | | +------------------+ | Name=BandSched | | +------+PriorityScheduling| +------------+--++ | |Element | ^ | | | Name=PriEF |ElementSchedSvc | | | | Priority=1 +---------------------+ | | | +------------------+ | | | |NextService | | | +-------------------------------------------------+ | | | | | | | NextService | | | | +-----------------------------------------------+ | | | | | | | | | | | +------------------+ElementSchedSvc | | | | | | |PriorityScheduling+--------+ | | | | | | |Element | | | | | | | | | Name=PriBE | | v v | | | | +------+ Priority=2 | +---+--------+-+-+-+Next| | | | +------------------+ |SchedulingService +----+ | | | | Name=PriSched |Sched | | | +------------------+ | | |QueueTo | | |Schedule +----------------+ | | | |AllocationSched |ElementSchedSvc | +----+---------+ |Element +-----------------+ |QueuingService|QueueTo | Name=BandBE | | Name=BE +------------+ Units=Bytes | | |Schedule | Bandwidth=50 | | | +----------------+ +--------------+ Figure 6. Example 2: Complex Hierarchical Scheduler
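The bounded priority behavior described in this section can be sketched in Python. The sketch is illustrative only: BoundedQueue and schedule_interval are hypothetical names, bandwidth is simplified to bytes per scheduling interval, and a lower Priority number is assumed to be served first. The point it demonstrates is that each queue carries two parameter sets, one for each scheduler.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class BoundedQueue:
    """Hypothetical queue carrying two parameter sets, one per scheduler."""
    name: str
    priority: int   # as in a PrioritySchedulingElement bound to the queue
    bandwidth: int  # as in an AllocationSchedulingElement, bytes per interval
    packets: deque = field(default_factory=deque)  # packet sizes, in bytes
    used: int = 0   # bytes sent in the current interval


def schedule_interval(queues):
    """Serve by priority, but never let a queue exceed its own allocation."""
    for q in queues:
        q.used = 0
    sent = []
    while True:
        eligible = [q for q in queues
                    if q.packets and q.used + q.packets[0] <= q.bandwidth]
        if not eligible:
            return sent
        # Assumption: the lowest Priority number wins.
        best = min(eligible, key=lambda q: q.priority)
        size = best.packets.popleft()
        best.used += size
        sent.append((best.name, size))


# Values taken from Figure 6.
ef = BoundedQueue("EF", priority=1, bandwidth=100, packets=deque([60, 60]))
be = BoundedQueue("BE", priority=2, bandwidth=50, packets=deque([40, 40]))
```

In the first interval EF sends one 60-byte packet, its second packet would exceed the 100-byte bound, and BE then sends one 40-byte packet even though EF still has traffic waiting.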
3.11.3. Excess Capacity Scheduler
An excess capacity scheduler presents a similar requirement to support two scheduling parameter sets per queue. However, in this scenario the reasons are a little different. Suppose a set of queues have each been assigned bandwidth limits to ensure that no traffic class starves out another traffic class. The result may be that one or more queues have exceeded their allocation while the queues that deserve scheduling opportunities are empty. The question then is how the excess (idle) bandwidth is allocated. Conceivably, the scheduling criteria for excess capacity are completely different from the criteria that determine allocations under uniform load. This could be supported with a scheduling hierarchy. However, the problem is that the criteria for using the subsequent scheduler are different from those in the last two cases. Specifically, the next scheduler should only be used if a scheduling opportunity exists that was passed over by the prior scheduler. When a scheduler chooses to forgo a scheduling decision, it is behaving as a non-work conserving scheduler. Work conserving schedulers, by definition, will always take advantage of a scheduling opportunity, irrespective of which queue is being serviced and how much bandwidth it has consumed in the past.

This point leads to an interesting insight. The semantics of a non-work conserving scheduler are equivalent to those of a meter, in that if a packet is in profile it is given the scheduling opportunity, and if it is out of profile it does not get a scheduling opportunity. However, with meters there are semantics that determine the next action behavior when the packet is in profile and when the packet is out of profile. Similarly, with the non-work conserving scheduler, there needs to be a means for determining the next scheduler when a scheduler chooses not to utilize a scheduling opportunity.

Figure 7 illustrates this last scenario. It appears very similar to Figure 6, except that the binding between the allocation scheduler and the WRR scheduler uses a FailNextScheduler association. This association explicitly indicates that the only time the WRR scheduler would be used is when there are non-empty queues that the allocation scheduler rejected for scheduling consideration. Note that Figure 7 is incomplete, in that typically there would be several more queues that are bound to an allocation scheduler and a WRR scheduler.
+------------+ |QueuingSvc | | Name=EF | | | | | ++-+---------+ | | | |QueueTo | |Schedule +--------------+ | | |SchedulingSvc | | | +------------------+ | Name=WRRSched| | +------+AllocationSched | +----------+-+-+ | |Element | ^ | | | Name=BandEF |ElementSchedSvc | | | | Units=Bytes +--------------------+ | | | | Bandwidth=100 | | | | | +------------------+ | | | |NextService | | | +----------------------------------------------+ | | | | | | | NextService | | | | +--------------------------------------------+ | | | | | | | | | | | +------------------+ElementSchedSvc | | | | | | |AllocationSched +--------+ | | | | | | |Element | | | | | | | | | Name=BandwidthAF1| | | | | | | | | Units=Bytes | | v v | | | | +------+ Bandwidth=50 | +--+----------+-+-++FailNext| | | | +------------------+ |SchedulingService +--------+ | | |QueueTo | Name=BandSched |Scheduler | | |Schedule +------------------+ | | | | | | +---------------------+ | ++-+-----------+ | WRRSchedulingElement| | |QueuingService|QueueTo | Name=WRRBE +------------+ | Name=BE +-----------+ Weight=30 |ElementSchedSvc +--------------+Schedule +---------------------+ Figure 7. Example 3: Excess Capacity Scheduler
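The role of FailNextScheduler in this example can be sketched in Python. The sketch rests on several assumptions: the class and function names are hypothetical, allocations are simplified to bytes per interval, the WRR weight of 70 for EF is invented (Figure 7 gives only a weight of 30 for BE), and the weighted round robin shown is one simple variant among many. The WRR scheduler is consulted only when the allocation scheduler passes over a scheduling opportunity.

```python
from collections import deque


class AllocationScheduler:
    """Non-work conserving: offers only queues still within allocation."""

    def __init__(self, allocations):
        self.allocations = dict(allocations)  # queue name -> bytes/interval
        self.used = {name: 0 for name in self.allocations}

    def select(self, queues):
        for name, packets in queues.items():
            in_profile = (packets and
                          self.used[name] + packets[0] <= self.allocations[name])
            if in_profile:
                return name
        return None  # the scheduling opportunity is passed over

    def charge(self, name, size):
        self.used[name] += size


class WRRScheduler:
    """Work conserving: a simple credit-based weighted round robin."""

    def __init__(self, weights):
        self.weights = dict(weights)
        self.credit = {name: 0 for name in self.weights}

    def select(self, queues):
        ready = [name for name in self.weights if queues[name]]
        if not ready:
            return None
        for name in ready:
            self.credit[name] += self.weights[name]
        choice = max(ready, key=lambda name: self.credit[name])
        self.credit[choice] -= sum(self.weights[name] for name in ready)
        return choice


def dequeue(queues, primary, fallback):
    """Try the allocation scheduler, then follow FailNextScheduler."""
    name, via = primary.select(queues), "allocation"
    if name is None:
        name, via = fallback.select(queues), "excess"
        if name is None:
            return None
    size = queues[name].popleft()
    if via == "allocation":
        primary.charge(name, size)
    return (name, size, via)


queues = {"EF": deque(), "BE": deque([40, 40, 40])}
band_sched = AllocationScheduler({"EF": 100, "BE": 50})
wrr_sched = WRRScheduler({"EF": 70, "BE": 30})  # EF weight is hypothetical
```

With EF idle, the first BE packet is sent within the BE allocation of 50 bytes. The second and third would exceed it, so the allocation scheduler declines and the WRR scheduler hands the idle capacity to BE.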
3.11.4. Hierarchical CBQ Scheduler
A hierarchical class-based queuing (CBQ) scheduler is the fourth scenario to be considered. In hierarchical CBQ, each queue is given a specific bandwidth allocation. Queues are grouped together into a logical scheduler. This logical scheduler in turn has an aggregate bandwidth allocation that equals the sum of the allocations of the queues it is scheduling. In turn, logical schedulers can be aggregated into higher-level logical schedulers. Changing perspectives and looking top down, the top-most logical scheduler has 100% of the link capacity. This allocation is parceled out to the logical schedulers below it such that the sum of their allocations is equal to 100%. These second-tier schedulers may in turn parcel out their allocations across a third tier of schedulers, and so forth, down to the lowest tier, which parcels out its allocations to specific queues representing relatively fine-grained classes of traffic. The unique aspect of hierarchical CBQ is that when there is insufficient bandwidth for a specific allocation, schedulers higher in the tree are tested to see if another portion of the tree has capacity to spare.

Figure 8 demonstrates this example with two tiers. The example is split in half because of space constraints, resulting in the CBQTier1 scheduling service instance being represented twice. Note that the total allocation at the top tier is 45 Mb. The voice allocation is 22 Mb. The remaining 23 Mb is split between FTP (8 Mb) and Web (15 Mb). Suppose that Web traffic is actually consuming 20 Mb (5 Mb in excess of its allocation). If FTP is consuming 5 Mb, then the CBQTier1 scheduler can offer 3 Mb of its allocation to Web traffic. However, this is not enough, so the FailNextScheduler association needs to be traversed to determine whether there is any excess capacity available from the voice class. If the voice class is only consuming 15 Mb of its 22 Mb allocation, there are sufficient resources to allow the Web traffic through.
Note that FailNextScheduler is used as the association. The reason is that the CBQTier1 scheduler in fact failed to schedule a packet because of insufficient resources. It is conceivable that a variant of hierarchical CBQ allows a hierarchy for successful scheduling as well. Hence, both the NextScheduler and the FailNextScheduler associations are necessary.
   +-----------+           NextService
   |QueuingSvc +-------------------------------------------+
   | Name=Web  |                                           |
   |           |QueueTo+----------------+ ElementSchedSvc  |
   |           +-------+AllocationSched +----------------+ |
   +-----------+Sched  |Element         |                | |
                       | Name=Web-Alloc |                | v
                       | Bandwidth=15   |    +-----------+-+-+
                       +----------------+    |SchedulingSvc  |
                                             | Name=CBQTier1 |
                       +----------------+    +-----------+-+-+
                       |AllocationSched | ElementSchedSvc| ^
   +-----------+       |Element         +----------------+ |
   |QueuingSvc |QueueTo| Name=FTP-Alloc |                  |
   | Name=FTP  +-------+ Bandwidth=8    |                  |
   |           |Sched  +----------------+                  |
   |           |           NextService                     |
   |           +-------------------------------------------+
   +-----------+
                                  :
   +---------------+           FailNextScheduler
   |SchedulingSvc  +---------------------------------------------+
   | Name=CBQTier1 |                                             |
   +-------+-------+       +---------------------+ElementSchedSvc|
           | SchedToSched  |AllocationScheduling +--------+      |
           +---------------+Element              |        |      |
                           | Name=LowPri-Alloc   |        |      |
                           | Bandwidth=23        |        |      v
                           +---------------------+  +-----+------+-+
                                                    |SchedulingSvc |
                                                    | Name=CBQTop  |
                        +---------------------+     +----------+-+-+
                        |AllocationScheduling |ElementSchedSvc | ^
   +------------+       |Element              +----------------+ |
   |QueuingSvc  |QueueTo| Name=BE-Band        |                  |
   | Name=Voice +-------+ Bandwidth=22        |                  |
   |            |Sched  +---------------------+                  |
   |            |            NextService                         |
   |            +------------------------------------------------+
   +------------+

          Figure 8. Example 4: Hierarchical CBQ Scheduler
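The borrowing arithmetic of this example can be checked with a short Python sketch. The Node class and the find_excess function are hypothetical helpers written for this illustration, and the search order (spare capacity among siblings first, then one tier up per FailNextScheduler step) is an assumption about how the traversal would proceed. The allocations are those of Figure 8 and the consumption figures are those given in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """Hypothetical node: a queue (leaf) or a logical scheduler (inner)."""
    name: str
    allocation: int = 0  # Mb; for a scheduler, the sum of its children
    usage: int = 0       # Mb currently consumed (leaves only)
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        self.allocation += child.allocation
        return child

    def spare(self) -> int:
        """Unused allocation in this subtree, in Mb."""
        if not self.children:
            return max(self.allocation - self.usage, 0)
        return sum(c.spare() for c in self.children)


def find_excess(leaf: Node):
    """Return [(scheduler, Mb borrowed), ...], or None if demand is unmet."""
    need = leaf.usage - leaf.allocation
    plan = []
    came_from, tier = leaf, leaf.parent
    while need > 0 and tier is not None:
        available = sum(c.spare() for c in tier.children if c is not came_from)
        take = min(available, need)
        if take:
            plan.append((tier.name, take))
        need -= take
        came_from, tier = tier, tier.parent  # one FailNextScheduler step
    return plan if need <= 0 else None


tier1 = Node("CBQTier1")
web = tier1.add(Node("Web", allocation=15, usage=20))
ftp = tier1.add(Node("FTP", allocation=8, usage=5))
top = Node("CBQTop")
top.add(tier1)
voice = top.add(Node("Voice", allocation=22, usage=15))
```

For the Web queue, find_excess reports 3 Mb borrowed at CBQTier1 (from FTP) and the remaining 2 Mb borrowed at CBQTop (from voice), which matches the walk-through in the text.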
4. The Class Hierarchy
The following sections present the class and association hierarchies that together make up the information model for modeling QoS capabilities at the device level.

4.1. Associations and Aggregations
Associations and aggregations are a means of representing relationships between two (or theoretically more) objects. Dependency, aggregation, and other relationships are modeled as classes containing two (or more) object references. It should be noted that aggregations represent either "whole-part" or "collection" relationships. For example, aggregation can be used to represent the containment relationship between a system and the components that constitute the system.

Since associations and aggregations are classes, they can benefit from all of the object-oriented features that other non-relationship classes have. For example, they can contain properties and methods, and inheritance can be used to refine their semantics such that they represent more specialized types of their superclasses.

Note that an association (or an aggregation) object is treated as an atomic unit (individual instance), even though it relates, collects, or is composed of multiple objects. This is a defining feature of an association (or an aggregation): although the individual elements that are related to other objects have their own identities, the association (or aggregation) object that is constructed using these objects has its own identity and name as well.

It is important to note that associations and aggregations form an inheritance hierarchy that is separate from the class inheritance hierarchy. Although associations and aggregations are typically binary, there is nothing that prevents higher order associations or aggregations from being defined. However, such associations and aggregations are inherently more complex to define, understand, and use. In practice, associations and aggregations of orders higher than binary are rarely used, because of their greatly increased complexity and lack of generality. All of the associations and aggregations defined in this model are binary. Note also that by definition, associations and aggregations cannot be unary.
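The idea that an association is itself a class can be sketched in Python. The sketch is illustrative: the Service class is a hypothetical managed object, and the snake_case attribute names stand in for the PrecedingService and FollowingService roles. The subclass chain mirrors the NextService, NextScheduler, and FailNextScheduler associations of this model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    """Hypothetical managed object; it holds no references to associations."""
    name: str


@dataclass(frozen=True)
class NextService:
    """A binary association: an object in its own right, with two references."""
    preceding_service: Service
    following_service: Service


@dataclass(frozen=True)
class NextScheduler(NextService):
    """Refines the semantics of NextService through inheritance."""


@dataclass(frozen=True)
class FailNextScheduler(NextScheduler):
    """Refines NextScheduler further."""


pri_sched = Service("PriSched1")
band_sched = Service("BandSched1")
link = NextScheduler(pri_sched, band_sched)
```

Creating or deleting the link object leaves the Service class and its instances untouched, and code written against NextService also accepts the more specialized NextScheduler and FailNextScheduler instances.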
Finally, note that associations and aggregations that are defined between two classes do not affect the classes themselves. That is, the addition or deletion of an association or an aggregation does not affect the interfaces of the classes that it is connecting.

4.2. The Structure of the Class Hierarchies
The structure of the class, association, and aggregation class inheritance hierarchies for managing the datapaths of QoS devices is shown, respectively, in Figure 9, Figure 10, and Figure 11. The notation (CIMCORE) identifies a class defined in the CIM Core model. Please refer to [CIM] for the definitions of these classes. Similarly, the notation [PCIME] identifies a class defined in the Policy Core Information Model Extensions document. This model has been influenced by [CIM], and is compatible with the Directory Enabled Networks (DEN) effort.

   +--ManagedElement (CIMCORE)
      |
      +--ManagedSystemElement (CIMCORE)
      |  |
      |  +--LogicalElement (CIMCORE)
      |     |
      |     +--Service (CIMCORE)
      |     |  |
      |     |  +--ConditioningService
      |     |  |  |
      |     |  |  +--ClassifierService
      |     |  |  |  |
      |     |  |  |  +--ClassifierElement
      |     |  |  |
      |     |  |  +--MeterService
      |     |  |  |  |
      |     |  |  |  +--AverageRateMeterService
      |     |  |  |  |
      |     |  |  |  +--EWMAMeterService
      |     |  |  |  |
      |     |  |  |  +--TokenBucketMeterService
      |     |  |  |
      |     |  |  +--MarkerService
      |     |  |  |  |
      |     |  |  |  +--PreambleMarkerService
      |     |  |  |  |
      |     |  |  |  +--TOSMarkerService
      |     |  |  |  |
      |     |  |  |  +--DSCPMarkerService
      |     |  |  |  |
      |     |  |  |  +--8021QMarkerService
      |     |  |  |
      |     |  |  +--DropperService
      |     |  |  |  |
      |     |  |  |  +--HeadTailDropperService
      |     |  |  |  |
      |     |  |  |  +--REDDropperService
      |     |  |  |
      |     |  |  +--QueuingService
      |     |  |  |
      |     |  |  +--PacketSchedulingService
      |     |  |     |
      |     |  |     +--NonWorkConservingSchedulingService
      |     |  |
      |     |  +--QoSService
      |     |  |  |
      |     |  |  +--DiffServService
      |     |  |  |  |
      |     |  |  |  +--AFService
      |     |  |  |
      |     |  |  +--FlowService
      |     |  |
      |     |  +--DropThresholdCalculationService
      |     |
      |     +--FilterEntryBase [PCIME]
      |     |  |
      |     |  +--IPHeaderFilter [PCIME]
      |     |  |
      |     |  +--8021Filter [PCIME]
      |     |  |
      |     |  +--PreambleFilter
      |     |
      |     +--FilterList [PCIME]
      |     |
      |     +--ServiceAccessPoint (CIMCORE)
      |        |
      |        +--ProtocolEndpoint
      |
      +--Collection (CIMCORE)
      |  |
      |  +--CollectionOfMSEs (CIMCORE)
      |     |
      |     +--BufferPool
      |
      +--SchedulingElement
         |
         +--AllocationSchedulingElement
         |
         +--WRRSchedulingElement
         |
         +--PrioritySchedulingElement
            |
            +--BoundedPrioritySchedulingElement

              Figure 9. Class Inheritance Hierarchy
The inheritance hierarchy for the associations defined in this document is shown in Figure 10.

   +--Dependency (CIMCORE)
   |  |
   |  +--ServiceSAPDependency (CIMCORE)
   |  |  |
   |  |  +--IngressConditioningServiceOnEndpoint
   |  |  |
   |  |  +--EgressConditioningServiceOnEndpoint
   |  |
   |  +--HeadTailDropQueueBinding
   |  |
   |  +--CalculationBasedOnQueue
   |  |
   |  +--ProvidesServiceToElement (CIMCORE)
   |  |  |
   |  |  +--ServiceServiceDependency (CIMCORE)
   |  |     |
   |  |     +--CalculationServiceForDropper
   |  |
   |  +--QueueAllocation
   |  |
   |  +--ClassifierElementUsesFilterList
   |
   +--AFRelatedServices
   |
   +--NextService
   |  |
   |  +--NextServiceAfterClassifierElement
   |  |
   |  +--NextScheduler
   |     |
   |     +--FailNextScheduler
   |
   +--NextServiceAfterMeter
   |
   +--QueueToSchedule
   |
   +--SchedulingServiceToSchedule

         Figure 10. Association Class Inheritance Hierarchy
The inheritance hierarchy for the aggregations defined in this document is shown in Figure 11.

   +--MemberOfCollection (CIMCORE)
   |  |
   |  +--CollectedBufferPool
   |
   +--Component (CIMCORE)
   |  |
   |  +--ServiceComponent (CIMCORE)
   |  |  |
   |  |  +--QoSSubService
   |  |  |
   |  |  +--QoSConditioningSubService
   |  |  |
   |  |  +--ClassifierElementInClassifierService
   |  |
   |  +--EntriesInFilterList [PCIME]
   |
   +--ElementInSchedulingService

         Figure 11. Aggregation Class Inheritance Hierarchy