6. Traffic Benchmarking Methodology
The traffic benchmarking methodology uses the test setup from Section 1.2 and metrics defined in Section 4.

Each test SHOULD compare the network device's internal statistics (available via command-line management interface, SNMP, etc.) to the measured metrics defined in Section 4. This evaluates the accuracy of the internal traffic management counters under individual test conditions and capacity test conditions, as defined in Sections 4.1 and 4.2. This comparison is not intended to compare real-time statistics, but rather the cumulative statistics reported after the test has completed and the device counters have updated (it is common for device counters to update after an interval of 10 seconds or more).

From a device-configuration standpoint, scheduling and shaping functionality can be applied to logical ports (e.g., Link Aggregation (LAG)). This would result in the same scheduling and shaping configuration being applied to all of the member physical ports. The focus of this document is only on tests at a physical-port level.

The following sections provide the objective, procedure, metrics, and reporting format for each test. For all test steps, the following global parameters must be specified:

Test Runs (Tr): The number of times the test needs to be run to ensure accurate and repeatable results. The recommended value is a minimum of 10.

Test Duration (Td): The duration of a test iteration, expressed in seconds. The recommended minimum value is 60 seconds.

The variability in the test results MUST be measured between test runs, and if the variation is characterized as a significant portion of the measured values, the next step may be to revise the methods to achieve better consistency.
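As an illustration of the run-to-run variability check, the following minimal Python sketch compares repeated measurements of one metric across Tr runs. The function name and the use of the coefficient of variation as the variability statistic are assumptions for illustration; the methodology only requires that variability between runs be measured.

   # Minimal sketch of a run-to-run variability check (illustrative only).
   # Assumption: variability is summarized as the coefficient of variation
   # (stdev / mean); the methodology does not mandate a specific statistic.

   from statistics import mean, stdev

   def run_variability(measurements):
       """measurements: one value of a metric (e.g., PD in ms) per test run."""
       avg = mean(measurements)
       cov = stdev(measurements) / avg if avg else float("inf")
       return avg, cov

   # Example: Packet Delay (ms) from Tr = 10 runs (hypothetical values)
   pd_per_run = [10.1, 10.3, 9.9, 10.2, 10.0, 10.4, 9.8, 10.1, 10.2, 10.0]
   avg, cov = run_variability(pd_per_run)
   print(f"mean PD = {avg:.2f} ms, coefficient of variation = {cov:.1%}")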
6.1. Policing Tests

A policer is defined as the entity performing the policing function. The intent of the policing tests is to verify the policer performance (i.e., the CIR/CBS and EIR/EBS parameters). The tests will verify that the network device can handle the CIR with CBS and the EIR with EBS, and will use back-to-back packet-testing concepts as described in [RFC2544] (but adapted to burst size algorithms and terminology). Also, [MEF-14], [MEF-19], and [MEF-37] provide some bases for specific components of this test. The burst hunt algorithm defined in Section 5.1.1 can also be used to automate the measurement of the CBS value.

The tests are divided into two (2) sections: individual policer tests and then full-capacity policing tests. It is important to benchmark the basic functionality of the individual policer first and then proceed to the fully rated capacity of the device. This capacity may include the number of policing policies per device and the number of policers simultaneously active across all ports.

6.1.1. Policer Individual Tests
Objective: Test a policer as defined by [RFC4115] or [MEF-10.3], depending upon the equipment's specification. In addition to verifying that the policer allows the specified CBS and EBS bursts to pass, the policer test MUST verify that the policer will remark or drop excess packets while passing traffic at the specified CBS/EBS values.

Test Summary: Policing tests should use stateless traffic. Stateful TCP test traffic will generally be adversely affected by a policer in the absence of traffic shaping. So, while TCP traffic could be used, it is more accurate to benchmark a policer with stateless traffic.

As an example of a policer as defined by [RFC4115], consider a CBS/EBS of 64 KB and a CIR/EIR of 100 Mbps on a 1 GigE physical link (in color-blind mode). A stateless traffic burst of 64 KB would be sent into the policer at the GigE rate. This equates to an approximately 0.512-millisecond burst time (64 KB at 1 GigE). The traffic generator must space these bursts to ensure that the aggregate throughput does not exceed the CIR. The Ti between the bursts would equal CBS * 8 / CIR = 5.12 milliseconds in this example (a short computation sketch follows the reporting format below).

Test Metrics: The metrics defined in Section 4.1 (BSA, LP, OOS, PD, and PDV) SHALL be measured at the egress port and recorded.

Procedure:

1. Configure the DUT policing parameters for the desired CIR/EIR and CBS/EBS values to be tested.

2. Configure the tester to generate a stateless traffic burst equal to CBS and an interval equal to Ti (CBS in bits / CIR).
3. Compliant Traffic Test: Generate bursts of CBS + EBS traffic into the policer ingress port, and measure the metrics defined in Section 4.1 (BSA, LP, OOS, PD, and PDV) at the egress port and across the entire Td (default 60-second duration).

4. Excess Traffic Test: Generate bursts of greater than CBS + EBS bytes into the policer ingress port, and verify that the policer only allowed the BSA bytes to exit the egress. The excess burst MUST be recorded; the recommended value is 1000 bytes.

Additional tests beyond the simple color-blind example might include color-aware mode, configurations where EIR is greater than CIR, etc.

Reporting Format: The policer individual report MUST contain all results for each CIR/EIR/CBS/EBS test run. A recommended format is as follows:

***********************************************************

Test Configuration Summary: Tr, Td

DUT Configuration Summary: CIR, EIR, CBS, EBS

The results table should contain entries for each test run, as follows (Test #1 to Test #Tr):

- Compliant Traffic Test: BSA, LP, OOS, PD, and PDV

- Excess Traffic Test: BSA

***********************************************************
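As referenced in the Test Summary above, the burst time and transmission interval for the 64 KB / 100 Mbps / 1 GigE example can be computed directly from the definitions. The following minimal Python sketch is illustrative only; 64 KB is interpreted as 64,000 bytes, consistent with the 0.512 ms burst-time figure above.

   # Burst time and transmission interval (Ti) for the policer example
   # (illustrative sketch; 64 KB taken as 64,000 bytes to match the
   # 0.512 ms figure in the text).

   CBS_BYTES = 64_000          # CBS/EBS of "64 KB"
   LINK_BPS = 1_000_000_000    # 1 GigE ingress link
   CIR_BPS = 100_000_000       # CIR of 100 Mbps

   burst_time_s = CBS_BYTES * 8 / LINK_BPS   # time to send one CBS burst
   ti_s = CBS_BYTES * 8 / CIR_BPS            # Ti = CBS * 8 / CIR

   print(f"burst time = {burst_time_s * 1000:.3f} ms")  # 0.512 ms
   print(f"Ti         = {ti_s * 1000:.2f} ms")          # 5.12 ms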
6.1.2. Policer Capacity Tests

Objective: The intent of the capacity tests is to verify the policer performance in a scaled environment with multiple ingress customer policers on multiple physical ports. This test will benchmark the maximum number of active policers as specified by the device manufacturer.

Test Summary: The specified policing function capacity is generally expressed in terms of the number of policers active on each individual physical port as well as the number of unique policer rates that are utilized. For all of the capacity tests, the benchmarking test procedure and reporting format described in Section 6.1.1 for a single policer MUST be applied to each of the physical-port policers.

For example, a Layer 2 switching device may specify that each of the 32 physical ports can be policed using a pool of policing service policies. The device may carry a single customer's traffic on each physical port, and a single policer is instantiated per physical port. Another possibility is that a single physical port may carry multiple customers, in which case many customer flows would be policed concurrently on an individual physical port (separate policers per customer on an individual port).

Test Metrics: The metrics defined in Section 4.1 (BSA, LP, OOS, PD, and PDV) SHALL be measured at the egress port and recorded.

The following sections provide the specific test scenarios, procedures, and reporting formats for each policer capacity test.

6.1.2.1. Maximum Policers on Single Physical Port
Test Summary: The first policer capacity test will benchmark a single physical port, with maximum policers on that physical port.

Assume multiple categories of ingress policers at rates r1, r2, ..., rn. There are multiple customers on a single physical port. Each customer could be represented by a single-tagged VLAN, a double-tagged VLAN, a Virtual Private LAN Service (VPLS) instance, etc. Each customer is mapped to a different policer. Each of the policers can be of rates r1, r2, ..., rn.

An example configuration would be

- Y1 customers, policer rate r1
- Y2 customers, policer rate r2
- Y3 customers, policer rate r3
...
- Yn customers, policer rate rn
Some bandwidth on the physical port is dedicated for other traffic (i.e., other than customer traffic); this includes network control protocol traffic. There is a separate policer for the other traffic. Typical deployments have three categories of policers; there may be some deployments with more or fewer than three categories of ingress policers.

Procedure:

1. Configure the DUT policing parameters for the desired CIR/EIR and CBS/EBS values for each policer rate (r1-rn) to be tested.

2. Configure the tester to generate a stateless traffic burst equal to CBS and an interval equal to Ti (CBS in bits / CIR) for each customer stream (Y1-Yn). The encapsulation for each customer must also be configured according to the service tested (VLAN, VPLS, IP mapping, etc.).

3. Compliant Traffic Test: Generate bursts of CBS + EBS traffic into the policer ingress port for each customer traffic stream, and measure the metrics defined in Section 4.1 (BSA, LP, OOS, PD, and PDV) at the egress port for each stream and across the entire Td (default 60-second duration).

4. Excess Traffic Test: Generate bursts of greater than CBS + EBS bytes into the policer ingress port for each customer traffic stream, and verify that the policer only allowed the BSA bytes to exit the egress for each stream. The excess burst MUST be recorded; the recommended value is 1000 bytes.

Reporting Format: The policer individual report MUST contain all results for each CIR/EIR/CBS/EBS test run, per customer traffic stream. A recommended format is as follows:

*****************************************************************

Test Configuration Summary: Tr, Td

Customer Traffic Stream Encapsulation: Map each stream to VLAN, VPLS, IP address

DUT Configuration Summary per Customer Traffic Stream: CIR, EIR, CBS, EBS
The results table should contain entries for each test run, as follows (Test #1 to Test #Tr):

- Customer Stream Y1-Yn (see note) Compliant Traffic Test: BSA, LP, OOS, PD, and PDV

- Customer Stream Y1-Yn (see note) Excess Traffic Test: BSA

*****************************************************************

Note: For each test run, there will be two (2) rows for each customer stream: the Compliant Traffic Test result and the Excess Traffic Test result.
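To make the per-customer setup concrete, the following Python sketch builds the test matrix of customer streams, encapsulations, and per-stream Ti values that steps 1 and 2 above would be driven from. The class structure, names, and values are illustrative assumptions, not part of the methodology.

   # Illustrative construction of the per-customer policer test matrix
   # (names and example values are assumptions, not part of the methodology).

   from dataclasses import dataclass

   @dataclass
   class CustomerStream:
       name: str
       encapsulation: str   # e.g., "VLAN 100", "VPLS instance 7"
       cir_bps: int
       cbs_bytes: int

       @property
       def ti_s(self) -> float:
           # Transmission interval between CBS bursts: Ti = CBS * 8 / CIR
           return self.cbs_bytes * 8 / self.cir_bps

   # Example: two customers at rate r1, one customer at rate r2
   streams = [
       CustomerStream("cust-1", "VLAN 100", cir_bps=100_000_000, cbs_bytes=64_000),
       CustomerStream("cust-2", "VLAN 200", cir_bps=100_000_000, cbs_bytes=64_000),
       CustomerStream("cust-3", "VLAN 300", cir_bps=50_000_000, cbs_bytes=32_000),
   ]

   for s in streams:
       print(f"{s.name} ({s.encapsulation}): Ti = {s.ti_s * 1000:.2f} ms")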
6.1.2.2. Single Policer on All Physical Ports

Test Summary: The second policer capacity test involves a single policer function per physical port, with all physical ports active. In this test, there is a single policer per physical port. The policer can have one of the rates r1, r2, ..., rn. All of the physical ports in the networking device are active.

Procedure: The procedure for this test is identical to the procedure listed in Section 6.1.1. The configured parameters must be reported per port, and the test report must include results per measured egress port.

6.1.2.3. Maximum Policers on All Physical Ports
Test Summary: The third policer capacity test is a combination of the first and second capacity tests, i.e., maximum policers active per physical port and all physical ports active.

Procedure: The procedure for this test is identical to the procedure listed in Section 6.1.2.1. The configured parameters must be reported per port, and the test report must include per-stream results per measured egress port.
6.2. Queue/Scheduler Tests
Queues and traffic scheduling are closely related, in that a queue's priority dictates the manner in which the traffic scheduler transmits packets out of the egress port. Since device queues/buffers are generally an egress function, this test framework will discuss testing at the egress (although the technique can be applied to ingress-side queues).

Similar to the policing tests, these tests are divided into two sections: individual queue/scheduler function tests and then full-capacity tests.

6.2.1. Queue/Scheduler Individual Tests
The various types of scheduling techniques include FIFO, Strict Priority (SP) queuing, and Weighted Fair Queuing (WFQ), along with other variations. This test framework recommends testing with a minimum of these three techniques, although benchmarking other device-scheduling algorithms is left to the discretion of the tester.

6.2.1.1. Testing Queue/Scheduler with Stateless Traffic
Objective: Verify that the configured queue and scheduling technique can handle stateless traffic bursts up to the queue depth.

Test Summary: A network device queue is memory based, unlike a policing function, which is token or credit based. However, the same concepts from Section 6.1 can be applied to testing network device queues.

The device's network queue should be configured to the desired size in KB (i.e., Queue Length (QL)), and then stateless traffic should be transmitted to test this QL.

A queue should be able to handle repetitive bursts with the transmission gaps proportional to the Bottleneck Bandwidth (BB). The transmission gap is referred to here as the transmission interval (Ti). The Ti can be defined for the traffic bursts and is based on the QL and BB of the egress interface:

Ti = QL * 8 / BB
Note that this equation is similar to the Ti required for transmission into a policer (QL = CBS, BB = CIR). Note also that the burst hunt algorithm defined in Section 5.1.1 can be used to automate the measurement of the queue value.

The stateless traffic burst SHALL be transmitted at the link speed and spaced within the transmission interval (Ti). The metrics defined in Section 4.1 SHALL be measured at the egress port and recorded; the primary intent is to verify the BSA and verify that no packets are dropped.

The scheduling function must also be characterized to benchmark the device's ability to schedule the queues according to the priority. An example would be two levels of priority that include SP and FIFO queuing. Under a flow load greater than the egress port speed, the higher-priority packets should be transmitted without drops (and should also maintain low latency), while the lower-priority (or best-effort) queue may be dropped.

Test Metrics: The metrics defined in Section 4.1 (BSA, LP, OOS, PD, and PDV) SHALL be measured at the egress port and recorded.

Procedure:

1. Configure the DUT QL and scheduling technique parameters (FIFO, SP, etc.).

2. Configure the tester to generate a stateless traffic burst equal to QL and an interval equal to Ti (QL in bits / BB).

3. Generate bursts of QL traffic into the DUT, and measure the metrics defined in Section 4.1 (LP, OOS, PD, and PDV) at the egress port and across the entire Td (default 60-second duration).

Reporting Format: The Queue/Scheduler Stateless Traffic individual report MUST contain all results for each QL/BB test run. A recommended format is as follows:

****************************************************************

Test Configuration Summary: Tr, Td

DUT Configuration Summary: Scheduling technique (i.e., FIFO, SP, WFQ, etc.), BB, and QL
The results table should contain entries for each test run, as follows (Test #1 to Test #Tr):

- LP, OOS, PD, and PDV

****************************************************************
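As an illustration of step 2 of the procedure above, the following Python sketch computes the transmission interval from Ti = QL * 8 / BB. The example QL and BB values are assumptions chosen to match the stateful example in the next section.

   # Transmission interval for queue tests: Ti = QL * 8 / BB
   # (illustrative sketch; the QL and BB values are example assumptions).

   def queue_ti(ql_bytes: int, bb_bps: int) -> float:
       """Seconds between QL-sized bursts so the queue can fully drain."""
       return ql_bytes * 8 / bb_bps

   # Example: QL = 32 KB queue draining onto a 100 Mbps bottleneck
   ql_bytes = 32_000
   bb_bps = 100_000_000
   print(f"Ti = {queue_ti(ql_bytes, bb_bps) * 1000:.2f} ms")  # 2.56 ms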
6.2.1.2. Testing Queue/Scheduler with Stateful Traffic

Objective: Verify that the configured queue and scheduling technique can handle stateful traffic bursts up to the queue depth.

Test Background and Summary: To provide a more realistic benchmark and to test queues in Layer 4 devices such as firewalls, stateful traffic testing is recommended for the queue tests. Stateful traffic tests will also utilize the Network Delay Emulator (NDE) from the network setup configuration in Section 1.2.

The BDP of the TCP test traffic must be calibrated to the QL of the device queue. Referencing [RFC6349], the BDP is equal to:

BB * RTT / 8 (in bytes)

The NDE must be configured to an RTT value that is large enough to allow the BDP to be greater than QL. An example test scenario is defined below:

- Ingress link = GigE
- Egress link = 100 Mbps (BB)
- QL = 32 KB

RTT(min) = QL * 8 / BB and would equal 2.56 ms (and the BDP = 32 KB)

In this example, one (1) TCP connection with a window size / SSB of 32 KB would be required to test the QL of 32 KB. This Bulk Transfer Test can be accomplished using iperf, as described in Appendix A.
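The calibration above can be checked numerically. The following minimal Python sketch is illustrative; 32 KB is interpreted as 32,000 bytes, consistent with the 2.56 ms figure.

   # BDP calibration for the stateful queue test (illustrative sketch;
   # 32 KB taken as 32,000 bytes to match the 2.56 ms figure).

   BB_BPS = 100_000_000   # egress bottleneck bandwidth (100 Mbps)
   QL_BYTES = 32_000      # configured queue length ("32 KB")

   # Minimum RTT so that BDP >= QL: RTT(min) = QL * 8 / BB
   rtt_min_s = QL_BYTES * 8 / BB_BPS
   print(f"RTT(min) = {rtt_min_s * 1000:.2f} ms")   # 2.56 ms

   # BDP at that RTT: BDP = BB * RTT / 8 (bytes); equals QL here
   bdp_bytes = BB_BPS * rtt_min_s / 8
   print(f"BDP      = {bdp_bytes / 1000:.0f} KB")   # 32 KB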
Two types of TCP tests MUST be performed: the Bulk Transfer Test and the Micro Burst Test Pattern, as documented in Appendix B. The Bulk Transfer Test only bursts during the TCP Slow Start (or Congestion Avoidance) state, while the Micro Burst Test Pattern emulates application-layer bursting, which may occur at any time during the TCP connection.

Other types of tests SHOULD include the following: simple web sites, complex web sites, business applications, email, and SMB/CIFS (Common Internet File System) file copy (all of which are also documented in Appendix B).

Test Metrics: The test results will be recorded per the stateful metrics defined in Section 4.2 -- primarily the TCP Test Pattern Execution Time (TTPET), TCP Efficiency, and Buffer Delay.

Procedure:

1. Configure the DUT QL and scheduling technique parameters (FIFO, SP, etc.).

2. Configure the test generator* with a profile of an emulated application traffic mixture.

- The application mixture MUST be defined in terms of percentage of the total bandwidth to be tested.

- The rate of transmission for each application within the mixture MUST also be configurable.

* To ensure repeatable results, the test generator MUST be capable of generating precise TCP test patterns for each application specified.

3. Generate application traffic between the ingress (client side) and egress (server side) ports of the DUT, and measure the metrics (TTPET, TCP Efficiency, and Buffer Delay) per application stream and at the ingress and egress ports (across the entire Td, default 60-second duration).

A couple of items require clarification concerning application measurements: an application session may consist of a single TCP connection or multiple TCP connections. If an application session utilizes a single TCP connection, the application throughput/metrics have a 1-1 relationship to the TCP connection measurements.
If an application session (e.g., an HTTP-based application) utilizes multiple TCP connections, then all of the TCP connections are aggregated in the application throughput measurement/metrics for that application.

Then, there is the case of multiple instances of an application session (i.e., multiple FTP sessions emulating multiple clients). In this situation, the test should measure/record each FTP application session independently, tabulating the minimum, maximum, and average for all FTP sessions.

Finally, application throughput measurements are based on Layer 4 TCP throughput and do not include retransmitted bytes. The TCP Efficiency metric MUST be measured during the test, because it provides a measure of "goodput" during each test.

Reporting Format: The Queue/Scheduler Stateful Traffic individual report MUST contain all results for each traffic scheduler and QL/BB test run. A recommended format is as follows:

******************************************************************

Test Configuration Summary: Tr, Td

DUT Configuration Summary: Scheduling technique (i.e., FIFO, SP, WFQ, etc.), BB, and QL

Application Mixture and Intensities: These are the percentages configured for each application type.

The results table should contain entries for each test run, with minimum, maximum, and average per application session, as follows (Test #1 to Test #Tr):

- Throughput (bps) and TTPET for each application session

- Bytes In and Bytes Out for each application session

- TCP Efficiency and Buffer Delay for each application session

******************************************************************
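As an illustration of the per-session tabulation described above, the following small Python sketch aggregates multiple TCP connections into their application sessions and tabulates the minimum, maximum, and average across sessions. The data structure and the throughput values are assumptions for illustration only.

   # Illustrative aggregation of application-session measurements
   # (structure and example values are assumptions, not part of the
   # methodology).

   # Each application session may consist of several TCP connections;
   # per-session throughput aggregates all of its connections.
   sessions = {
       "ftp-1": [40_000_000, 35_000_000],          # bps per TCP connection
       "ftp-2": [30_000_000],
       "ftp-3": [25_000_000, 20_000_000, 5_000_000],
   }

   per_session_bps = {name: sum(conns) for name, conns in sessions.items()}

   values = list(per_session_bps.values())
   print(f"min = {min(values)} bps, max = {max(values)} bps, "
         f"avg = {sum(values) / len(values):.0f} bps")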
6.2.2. Queue/Scheduler Capacity Tests
Objective: The intent of these capacity tests is to benchmark queue/scheduler performance in a scaled environment with multiple queues/schedulers active on multiple egress physical ports. These tests will benchmark the maximum number of queues and schedulers as specified by the device manufacturer. Each priority in the system will map to a separate queue.

Test Metrics: The metrics defined in Section 4.1 (BSA, LP, OOS, PD, and PDV) SHALL be measured at the egress port and recorded.

The following sections provide the specific test scenarios, procedures, and reporting formats for each queue/scheduler capacity test.

6.2.2.1. Multiple Queues, Single Port Active
For the first queue/scheduler capacity test, multiple queues per port will be tested on a single physical port. In this case, all of the queues (typically eight) are active on a single physical port. Traffic from multiple ingress physical ports is directed to the same egress physical port. This will cause oversubscription on the egress physical port.

There are many types of priority schemes and combinations of priorities that are managed by the scheduler. The following sections specify the priority schemes that should be tested.

6.2.2.1.1. Strict Priority on Egress Port
Test Summary: For this test, SP scheduling on the egress physical port should be tested, and the benchmarking methodologies specified in Sections 6.2.1.1 (stateless) and 6.2.1.2 (stateful) (procedure, metrics, and reporting format) should be applied here. For a given priority, each ingress physical port should get a fair share of the egress physical-port bandwidth.
Since this is a capacity test, the configuration and report results format (see Sections 6.2.1.1 and 6.2.1.2) MUST also include:

Configuration:

- The number of physical ingress ports active during the test

- The classification marking (DSCP, VLAN, etc.) for each physical ingress port

- The traffic rate for stateless traffic and the traffic rate/mixture for stateful traffic for each physical ingress port

Report Results:

- For each ingress port traffic stream, the achieved throughput rate and metrics at the egress port

6.2.2.1.2. Strict Priority + WFQ on Egress Port
Test Summary: For this test, SP and WFQ should be enabled simultaneously in the scheduler, but on a single egress port. The benchmarking methodologies specified in Sections 6.2.1.1 (stateless) and 6.2.1.2 (stateful) (procedure, metrics, and reporting format) should be applied here. Additionally, the egress port bandwidth-sharing among weighted queues should be proportional to the assigned weights (the sketch following the report example below illustrates the expected proportional shares). For a given priority, each ingress physical port should get a fair share of the egress physical-port bandwidth.

Since this is a capacity test, the configuration and report results format (see Sections 6.2.1.1 and 6.2.1.2) MUST also include:

Configuration:

- The number of physical ingress ports active during the test

- The classification marking (DSCP, VLAN, etc.) for each physical ingress port

- The traffic rate for stateless traffic and the traffic rate/mixture for stateful traffic for each physical ingress port
Report Results:

- For each ingress port traffic stream, the achieved throughput rate and metrics at each queue of the egress port (both the SP queue and the WFQ)

Example:

- Egress Port SP Queue: throughput and metrics for ingress streams 1-n

- Egress Port WFQ: throughput and metrics for ingress streams 1-n
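Since the weighted queues should share bandwidth in proportion to their assigned weights, the expected per-queue throughput can be precomputed for comparison against the reported results. The following minimal Python sketch is illustrative; the port speed, SP traffic rate, and WFQ weights are assumptions.

   # Expected egress bandwidth split under SP + WFQ (illustrative sketch;
   # the port speed, SP traffic rate, and WFQ weights are assumptions).

   PORT_BPS = 1_000_000_000  # 1 GigE egress port
   sp_bps = 200_000_000      # offered strict-priority traffic (served first)
   wfq_weights = {"q1": 4, "q2": 2, "q3": 1}

   # SP traffic is serviced first; the WFQ queues share the remainder
   # in proportion to their weights.
   remainder = PORT_BPS - sp_bps
   total_weight = sum(wfq_weights.values())
   for name, w in wfq_weights.items():
       share = remainder * w / total_weight
       print(f"{name}: expected ~{share / 1e6:.0f} Mbps")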
6.2.2.2. Single Queue per Port, All Ports Active

Test Summary: Traffic from multiple ingress physical ports is directed to the same egress physical port. This will cause oversubscription on the egress physical port. Also, the same amount of traffic is directed to each egress physical port.

The benchmarking methodologies specified in Sections 6.2.1.1 (stateless) and 6.2.1.2 (stateful) (procedure, metrics, and reporting format) should be applied here. Each ingress physical port should get a fair share of the egress physical-port bandwidth. Additionally, each egress physical port should receive the same amount of traffic.

Since this is a capacity test, the configuration and report results format (see Sections 6.2.1.1 and 6.2.1.2) MUST also include:

Configuration:

- The number of ingress ports active during the test

- The number of egress ports active during the test

- The classification marking (DSCP, VLAN, etc.) for each physical ingress port

- The traffic rate for stateless traffic and the traffic rate/mixture for stateful traffic for each physical ingress port
Report Results:

- For each egress port, the achieved throughput rate and metrics at the egress port queue for each ingress port stream

Example:

- Egress Port 1: throughput and metrics for ingress streams 1-n

- Egress Port n: throughput and metrics for ingress streams 1-n
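One way to check the fair-share expectations stated above is Jain's fairness index over the per-ingress-stream throughputs. This index is not specified by the methodology; the sketch below is an illustrative assumption, and the throughput values are hypothetical.

   # Illustrative fairness check over per-stream throughputs using
   # Jain's fairness index (one common way to quantify "fair share";
   # not mandated by this methodology).

   def jain_index(throughputs):
       """Returns 1.0 for a perfectly even split; 1/n for maximally unfair."""
       n = len(throughputs)
       total = sum(throughputs)
       return total * total / (n * sum(x * x for x in throughputs))

   # Example: per-ingress-stream throughput (bps) at one egress port
   streams_bps = [240e6, 250e6, 255e6, 245e6]
   print(f"Jain's fairness index = {jain_index(streams_bps):.3f}")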
6.2.2.3. Multiple Queues per Port, All Ports Active

Test Summary: Traffic from multiple ingress physical ports is directed to all queues of each egress physical port. This will cause oversubscription on the egress physical ports. Also, the same amount of traffic is directed to each egress physical port.

The benchmarking methodologies specified in Sections 6.2.1.1 (stateless) and 6.2.1.2 (stateful) (procedure, metrics, and reporting format) should be applied here. For a given priority, each ingress physical port should get a fair share of the egress physical-port bandwidth. Additionally, each egress physical port should receive the same amount of traffic.

Since this is a capacity test, the configuration and report results format (see Sections 6.2.1.1 and 6.2.1.2) MUST also include:

Configuration:

- The number of physical ingress ports active during the test

- The classification marking (DSCP, VLAN, etc.) for each physical ingress port

- The traffic rate for stateless traffic and the traffic rate/mixture for stateful traffic for each physical ingress port

Report Results:

- For each egress port, the achieved throughput rate and metrics at each egress port queue for each ingress port stream
Example:

- Egress Port 1, SP Queue: throughput and metrics for ingress streams 1-n
- Egress Port 2, WFQ: throughput and metrics for ingress streams 1-n
...
- Egress Port n, SP Queue: throughput and metrics for ingress streams 1-n
- Egress Port n, WFQ: throughput and metrics for ingress streams 1-n