RFC 7015

Flow Aggregation for the IP Flow Information Export (IPFIX) Protocol

Pages: 49
Proposed Standard
Part 2 of 2 – Pages 24 to 49

6. Additional Considerations and Special Cases in Flow Aggregation

6.1. Exact versus Approximate Counting during Aggregation

In certain circumstances, particularly involving aggregation by devices with limited resources, and in situations where exact aggregated counts are less important than relative magnitudes (e.g., driving graphical displays), counter distribution during key aggregation may be performed by approximate counting means (e.g., Bloom filters). The choice to use approximate counting is implementation and application dependent.
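As an illustration only, approximate counting can be sketched with a Bloom filter whose fill ratio yields an estimate of the number of distinct items inserted. The class name `BloomCounter`, the sizes chosen, and the use of SHA-256 are assumptions made for this sketch; this document does not prescribe any particular structure or estimator.

```python
import hashlib
import math


class BloomCounter:
    """Bloom filter that also estimates how many distinct items it holds."""

    def __init__(self, num_bits=8192, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def estimate(self):
        # Fill-ratio estimate: n ~= -(m / k) * ln(1 - X / m),
        # where X is the number of set bits.
        set_bits = sum(self.bits)
        if set_bits == self.num_bits:
            return float("inf")
        ratio = set_bits / self.num_bits
        return -(self.num_bits / self.num_hashes) * math.log(1 - ratio)


# Insert 100 distinct addresses five times each; repeats do not
# change the filter, so the estimate stays near 100.
counter = BloomCounter()
for _ in range(5):
    for host in range(100):
        counter.add(f"192.0.2.{host}")
approx_hosts = counter.estimate()
```

The memory used is fixed by `num_bits` regardless of how many Original Flows are seen, which is the property that makes this kind of structure attractive on devices with limited resources; the price is an estimate rather than an exact count.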

6.2. Delay and Loss Introduced by the IAP

When accepting Original Flows in export order from traffic captured live, the Intermediate Aggregation Process waits for all Original Flows that may contribute to a given interval during interval distribution. This is generally dominated by the active timeout of the Metering Process measuring the Original Flows. For example, with Metering Processes configured with a five-minute active timeout, the Intermediate Aggregation Process introduces a delay of at least five minutes to all exported Aggregated Flows to ensure it has received all Original Flows. Note that when aggregating Flows from multiple Metering Processes with different active timeouts, the delay is determined by the maximum active timeout. In certain circumstances, additional delay at the original Exporter may cause an IAP to close an interval before the last Original Flow(s) accountable to the interval arrives. In this case, the IAP MAY drop the late Original Flow(s). Accounting of Flows lost at an Intermediate Process due to such issues is covered in [IPFIX-MED-PROTO].
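The timing rule above can be sketched as follows. This is a minimal sketch, assuming times are expressed in whole seconds and that intervals are aligned to multiples of the interval size; the function names are hypothetical and not defined by this document.

```python
def earliest_export_time(interval_end, active_timeouts):
    """Earliest time an interval can be closed without losing on-time Flows.

    The delay is governed by the largest active timeout among the
    contributing Metering Processes.
    """
    return interval_end + max(active_timeouts)


def accept_original_flow(flow_start, interval_size, now, active_timeouts):
    """Return True if the interval containing flow_start is still open at `now`.

    A False result corresponds to the case in which the IAP MAY drop
    the late Original Flow.
    """
    interval_start = flow_start - (flow_start % interval_size)
    interval_end = interval_start + interval_size
    return now < earliest_export_time(interval_end, active_timeouts)


# Two Metering Processes with 60 s and 300 s active timeouts,
# feeding an IAP that imposes 300 s intervals.
timeouts = [60, 300]
close_time = earliest_export_time(300, timeouts)
on_time = accept_original_flow(120, 300, 500, timeouts)
late = accept_original_flow(120, 300, 650, timeouts)
```

Here the interval ending at 300 s cannot be closed before 600 s, so a Flow arriving at 500 s is accepted while one arriving at 650 s is late.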

6.3. Considerations for Aggregation of Sampled Flows

The accuracy of Aggregated Flows may also be affected by sampling of the Original Flows, or sampling of packets making up the Original Flows. At the time of writing, the effect of sampling on Flow aggregation is still an open research question. However, to maximize the comparability of Aggregated Flows, aggregation of sampled Flows should only be applied to Original Flows sampled using the same sampling rate and sampling algorithm, Flows created from packets sampled using the same sampling rate and sampling algorithm, or Original Flows that have been normalized as if they had the same sampling rate and algorithm before aggregation. For more on packet sampling within IPFIX, see [RFC5476]. For more on Flow sampling within the IPFIX Mediator framework, see [RFC7014].

6.4. Considerations for Aggregation of Heterogeneous Flows

Aggregation may be applied to Original Flows from different sources and of different types (i.e., represented using different, perhaps wildly different, Templates). When the goal is to separate the heterogeneous Original Flows and aggregate them into heterogeneous Aggregated Flows, each aggregation should be done at its own Intermediate Aggregation Process. The Observation Domain ID on the Messages containing the output Aggregated Flows can be used to identify the different Processes and to segregate the output. However, when the goal is to aggregate these Flows into a single stream of Aggregated Flows representing one type of data, and if the Original Flows may represent the same original packet at two different Observation Points, the Original Flows should be correlated by the correlation and normalization operation within the IAP to ensure that each packet is only represented in a single Aggregated Flow or set of Aggregated Flows differing only by aggregation interval.

7. Export of Aggregated IP Flows Using IPFIX

In general, Aggregated Flows are exported in IPFIX like any other Flow. However, certain aspects of Aggregated Flow export benefit from additional guidelines or new Information Elements to represent aggregation metadata or information generated during aggregation. These are detailed in the following subsections.

7.1. Time Interval Export

Since an Aggregated Flow is simply a Flow, the existing timestamp Information Elements in the IPFIX Information Model (e.g., flowStartMilliseconds, flowEndNanoseconds) are sufficient to specify the time interval for aggregation. Therefore, no new aggregation-specific Information Elements for exporting time interval information are necessary. Each Aggregated Flow carrying timing information SHOULD contain both an interval start and interval end timestamp.

7.2. Flow Count Export

The following four Information Elements are defined to count Original Flows as discussed in Section 5.2.1.

7.2.1. originalFlowsPresent

Description:  The non-conservative count of Original Flows contributing to this Aggregated Flow. Non-conservative counts need not sum to the original count on re-aggregation.

Abstract Data Type:  unsigned64

Data Type Semantics:  deltaCounter

ElementID:  375

7.2.2. originalFlowsInitiated

Description:  The conservative count of Original Flows whose first packet is represented within this Aggregated Flow. Conservative counts must sum to the original count on re-aggregation.

Abstract Data Type:  unsigned64

Data Type Semantics:  deltaCounter

ElementID:  376

7.2.3. originalFlowsCompleted

Description:  The conservative count of Original Flows whose last packet is represented within this Aggregated Flow. Conservative counts must sum to the original count on re-aggregation.

Abstract Data Type:  unsigned64

Data Type Semantics:  deltaCounter

ElementID:  377

7.2.4. deltaFlowCount

Description:  The conservative count of Original Flows contributing to this Aggregated Flow; may be distributed via any of the methods expressed by the valueDistributionMethod Information Element.

Abstract Data Type:  unsigned64

Data Type Semantics:  deltaCounter

ElementID:  3
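The difference between conservative and non-conservative counts can be illustrated with a short sketch. It assumes, purely for illustration, that Flow start and end times are integer seconds, that the first and last packets of an Original Flow coincide with its start and end times, and that an Original Flow contributes to every interval it overlaps; the function name and the dictionary keys are hypothetical.

```python
from collections import defaultdict


def count_original_flows(flows, interval):
    """Count Original Flows per interval.

    flows    -- iterable of (start, end) pairs in integer seconds
    interval -- interval length in seconds
    Returns {interval_start: {"present": n, "initiated": n, "completed": n}}.
    """
    counts = defaultdict(lambda: {"present": 0, "initiated": 0, "completed": 0})
    for start, end in flows:
        first = start - start % interval  # interval holding the first packet
        last = end - end % interval       # interval holding the last packet
        counts[first]["initiated"] += 1   # conservative: counted exactly once
        counts[last]["completed"] += 1    # conservative: counted exactly once
        for bucket in range(first, last + interval, interval):
            counts[bucket]["present"] += 1  # non-conservative: once per interval
    return dict(counts)


flows = [(10, 20), (290, 310), (100, 650)]
result = count_original_flows(flows, 300)
total_initiated = sum(c["initiated"] for c in result.values())
total_completed = sum(c["completed"] for c in result.values())
total_present = sum(c["present"] for c in result.values())
```

For the three Flows above, the initiated and completed counts each sum to 3 across intervals, while the present counts sum to 6, since a Flow spanning several intervals is present in each of them.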

7.3. Distinct Host Export

The following six Information Elements carry distinct counts of source and destination network-layer addresses. They are used to export counts of the distinct hosts whose addresses were reduced away during key aggregation.

7.3.1. distinctCountOfSourceIPAddress

Description:  The count of distinct source IP address values for Original Flows contributing to this Aggregated Flow, without regard to IP version. This Information Element is preferred to the IP-version-specific counters, unless it is important to separate the counts by version.

Abstract Data Type:  unsigned64

Data Type Semantics:  totalCounter

ElementID:  378

7.3.2. distinctCountOfDestinationIPAddress

Description:  The count of distinct destination IP address values for Original Flows contributing to this Aggregated Flow, without regard to IP version. This Information Element is preferred to the version-specific counters below, unless it is important to separate the counts by version.

Abstract Data Type:  unsigned64

Data Type Semantics:  totalCounter

ElementID:  379

7.3.3. distinctCountOfSourceIPv4Address

Description:  The count of distinct source IPv4 address values for Original Flows contributing to this Aggregated Flow.

Abstract Data Type:  unsigned32

Data Type Semantics:  totalCounter

ElementID:  380

7.3.4. distinctCountOfDestinationIPv4Address

Description:  The count of distinct destination IPv4 address values for Original Flows contributing to this Aggregated Flow.

Abstract Data Type:  unsigned32

Data Type Semantics:  totalCounter

ElementID:  381

7.3.5. distinctCountOfSourceIPv6Address

Description:  The count of distinct source IPv6 address values for Original Flows contributing to this Aggregated Flow.

Abstract Data Type:  unsigned64

Data Type Semantics:  totalCounter

ElementID:  382

7.3.6. distinctCountOfDestinationIPv6Address

Description:  The count of distinct destination IPv6 address values for Original Flows contributing to this Aggregated Flow.

Abstract Data Type:  unsigned64

Data Type Semantics:  totalCounter

ElementID:  383
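The six distinct counts defined in this section can be derived from the Original Flows contributing to one Aggregated Flow as sketched below. The function name and the input shape (pairs of address strings) are assumptions made for illustration; addresses are parsed with the Python standard `ipaddress` module so that differently written forms of the same address are counted once.

```python
import ipaddress


def distinct_address_counts(flows):
    """Compute distinct host counts for one Aggregated Flow.

    flows -- iterable of (source, destination) address strings taken
             from the Original Flows contributing to the Aggregated Flow
    """
    sources, destinations = set(), set()
    for src, dst in flows:
        sources.add(ipaddress.ip_address(src))
        destinations.add(ipaddress.ip_address(dst))
    return {
        "distinctCountOfSourceIPAddress": len(sources),
        "distinctCountOfDestinationIPAddress": len(destinations),
        "distinctCountOfSourceIPv4Address": sum(a.version == 4 for a in sources),
        "distinctCountOfDestinationIPv4Address": sum(a.version == 4 for a in destinations),
        "distinctCountOfSourceIPv6Address": sum(a.version == 6 for a in sources),
        "distinctCountOfDestinationIPv6Address": sum(a.version == 6 for a in destinations),
    }


counts = distinct_address_counts([
    ("192.0.2.2", "192.0.2.131"),
    ("192.0.2.2", "198.51.100.2"),
    ("192.0.2.3", "192.0.2.131"),
    ("2001:db8::1", "2001:db8::2"),
    ("2001:0db8::1", "2001:db8::2"),  # same source as above, written differently
])
```

Note that the version-independent count is simply the sum of the two version-specific counts, which is why the version-independent element suffices unless the split by version matters.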

7.4. Aggregate Counter Distribution Export

When exporting counters distributed among Aggregated Flows, as described in Section 5.1.1, the Exporting Process MAY export an Aggregate Counter Distribution Option Record for each Template describing Aggregated Flow records; this Options Template is described below. It uses the valueDistributionMethod Information Element, also defined below. In many cases, distribution is simple: the counters from Contributing Flows are accounted to the first Interval to which they contribute. This is the default situation, for which no Aggregate Counter Distribution Record is necessary. Aggregate Counter Distribution Records are only applicable in more exotic situations, such as using an Aggregation Interval smaller than the durations of Original Flows.

7.4.1. Aggregate Counter Distribution Options Template

This Options Template defines the Aggregate Counter Distribution Record, which allows the binding of a value distribution method to a Template ID. The scope is the Template ID, whose uniqueness, per [RFC7011], is local to the Transport Session and Observation Domain that generated the Template ID. This is used to signal to the Collecting Process how the counters were distributed. The fields are as below:

   +-----------------------------+-------------------------------------+
   | IE                          | Description                         |
   +-----------------------------+-------------------------------------+
   | templateId [scope]          | The Template ID of the Template     |
   |                             | defining the Aggregated Flows to    |
   |                             | which this distribution option      |
   |                             | applies.  This Information Element  |
   |                             | MUST be defined as a Scope field.   |
   | valueDistributionMethod     | The method used to distribute the   |
   |                             | counters for the Aggregated Flows   |
   |                             | defined by the associated Template. |
   +-----------------------------+-------------------------------------+
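For illustration, the Options Template and one Aggregate Counter Distribution Record might be serialized as sketched below. This is a sketch under stated assumptions rather than a reference encoder: it follows the Set and Options Template Record layouts of [RFC7011] (Set ID 3 for Options Template Sets), uses the IANA-assigned templateId Information Element (145, two octets) as the scope field, and omits the Message Header and any padding. The function names and the Template ID values 256 and 257 are hypothetical.

```python
import struct

OPTIONS_TEMPLATE_SET_ID = 3          # Set ID of an Options Template Set
IE_TEMPLATE_ID = 145                 # templateId, unsigned16
IE_VALUE_DISTRIBUTION_METHOD = 384   # valueDistributionMethod, unsigned8


def encode_options_template(options_template_id):
    """Encode an Options Template Set holding the distribution template."""
    fields = [(IE_TEMPLATE_ID, 2), (IE_VALUE_DISTRIBUTION_METHOD, 1)]
    # Record header: Template ID, Field Count, Scope Field Count (1 scope field).
    record = struct.pack("!HHH", options_template_id, len(fields), 1)
    for ie_id, length in fields:
        record += struct.pack("!HH", ie_id, length)  # Field Specifier
    # Set header: Set ID and total Set length including the 4-octet header.
    return struct.pack("!HH", OPTIONS_TEMPLATE_SET_ID, 4 + len(record)) + record


def encode_distribution_record(options_template_id, flow_template_id, method):
    """Encode a Data Set with one Aggregate Counter Distribution Record."""
    record = struct.pack("!HB", flow_template_id, method)
    return struct.pack("!HH", options_template_id, 4 + len(record)) + record


# Bind method 4 (Simple Uniform Distribution) to the Aggregated Flow
# Template 256, using Options Template 257.
template_set = encode_options_template(257)
data_set = encode_distribution_record(257, 256, 4)
```

A Collecting Process that receives both Sets can then interpret the counters of every record described by Template 256 as having been distributed with the signaled method.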

7.4.2. valueDistributionMethod Information Element

Description: A description of the method used to distribute the counters from Contributing Flows into the Aggregated Flow records described by an associated scope, generally a Template. The method is deemed to apply to all the non-Key Information Elements in the referenced scope for which value distribution is a valid operation; if the originalFlowsInitiated and/or originalFlowsCompleted Information Elements appear in the Template, they are not subject to this distribution method, as they each imply their own distribution method. This is intended to be a complete set of possible value distribution methods; it is encoded as follows:

   +-------+-----------------------------------------------------------+
   | Value | Description                                               |
   +-------+-----------------------------------------------------------+
   | 0     | Unspecified: The counters for an Original Flow are        |
   |       | explicitly not distributed according to any other method  |
   |       | defined for this Information Element; use for arbitrary   |
   |       | distribution, or distribution algorithms not described by |
   |       | any other codepoint.                                      |
   |       | --------------------------------------------------------- |
   |       |                                                           |
   | 1     | Start Interval: The counters for an Original Flow are     |
   |       | added to the counters of the appropriate Aggregated Flow  |
   |       | containing the start time of the Original Flow.  This     |
   |       | should be assumed the default if value distribution       |
   |       | information is not available at a Collecting Process for  |
   |       | an Aggregated Flow.                                       |
   |       | --------------------------------------------------------- |
   |       |                                                           |
   | 2     | End Interval: The counters for an Original Flow are added |
   |       | to the counters of the appropriate Aggregated Flow        |
   |       | containing the end time of the Original Flow.             |
   |       | --------------------------------------------------------- |
   |       |                                                           |
   | 3     | Mid Interval: The counters for an Original Flow are added |
   |       | to the counters of a single appropriate Aggregated Flow   |
   |       | containing some timestamp between start and end time of   |
   |       | the Original Flow.                                        |
   |       | --------------------------------------------------------- |
   |       |                                                           |
   | 4     | Simple Uniform Distribution: Each counter for an Original |
   |       | Flow is divided by the number of time intervals the       |
   |       | Original Flow covers (i.e., of appropriate Aggregated     |
   |       | Flows sharing the same Flow Key), and this number is      |
   |       | added to each corresponding counter in each Aggregated    |
   |       | Flow.                                                     |
   |       | --------------------------------------------------------- |
   |       |                                                           |
   | 5     | Proportional Uniform Distribution: Each counter for an    |
   |       | Original Flow is divided by the number of time units the  |
   |       | Original Flow covers, to derive a mean count rate.  This  |
   |       | mean count rate is then multiplied by the number of time  |
   |       | units in the intersection of the duration of the Original |
   |       | Flow and the time interval of each Aggregated Flow.       |
   |       |  This is like simple uniform distribution, but accounts   |
   |       | for the fractional portions of a time interval covered by |
   |       | an Original Flow in the first and last time interval.     |
   |       | --------------------------------------------------------- |
   |       |                                                           |
   | 6     | Simulated Process: Each counter of the Original Flow is   |
   |       | distributed among the intervals of the Aggregated Flows   |
   |       | according to some function the Intermediate Aggregation   |
   |       | Process uses based upon properties of Flows presumed to   |
   |       | be like the Original Flow.  This is essentially an        |
   |       | assertion that the Intermediate Aggregation Process has   |
   |       | no direct packet timing information but is nevertheless   |
   |       | not using one of the other simpler distribution methods.  |
   |       | The Intermediate Aggregation Process specifically makes   |
   |       | no assertion as to the correctness of the simulation.     |
   |       | --------------------------------------------------------- |
   |       |                                                           |
   | 7     | Direct: The Intermediate Aggregation Process has access   |
   |       | to the original packet timings from the packets making up |
   |       | the Original Flow, and uses these to distribute or        |
   |       | recalculate the counters.                                 |
   +-------+-----------------------------------------------------------+

   Abstract Data Type:  unsigned8

   ElementID:  384
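Four of the methods in the table above are simple enough to sketch directly. The sketch assumes integer timestamps (e.g., seconds), half-open intervals aligned to multiples of the interval length, and a single counter per Original Flow; the function names are hypothetical. Methods 0, 3, 6, and 7 depend on implementation choices or packet timing data and are deliberately not covered.

```python
def interval_starts(start, end, interval):
    """Start times of all intervals touched by a Flow from start to end."""
    first = start - start % interval
    last = end - end % interval
    return list(range(first, last + interval, interval))


def distribute(start, end, count, interval, method):
    """Distribute one counter of an Original Flow over Aggregated Flows.

    Returns {interval_start: share_of_count} for valueDistributionMethod
    values 1, 2, 4, and 5.
    """
    buckets = interval_starts(start, end, interval)
    if method == 1:  # Start Interval
        return {buckets[0]: count}
    if method == 2:  # End Interval
        return {buckets[-1]: count}
    if method == 4:  # Simple Uniform Distribution
        share = count / len(buckets)
        return {b: share for b in buckets}
    if method == 5:  # Proportional Uniform Distribution
        duration = end - start
        if duration == 0:  # instantaneous Flow: nothing to apportion
            return {buckets[0]: count}
        rate = count / duration  # mean count rate per time unit
        return {b: rate * (min(end, b + interval) - max(start, b))
                for b in buckets}
    raise ValueError(f"method {method} is not covered by this sketch")


# A Flow of 320 octets lasting from t=290 to t=610, with 300-unit intervals.
start_only = distribute(290, 610, 320, 300, 1)
end_only = distribute(290, 610, 320, 300, 2)
simple = distribute(290, 610, 320, 300, 4)
proportional = distribute(290, 610, 320, 300, 5)
```

The example Flow touches three intervals but spends only 10 time units in the first and last of them, so the proportional method assigns 10, 300, and 10 octets, whereas the simple method assigns one third of the total to each interval.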

8. Examples

In these examples, the same data, described by the same Template, will be aggregated multiple different ways; this illustrates the various different functions that could be implemented by Intermediate Aggregation Processes. Templates are shown in IESpec format as introduced in [RFC7013]. The source data format is a simplified Flow: timestamps, traditional 5-tuple, and octet count; the Flow Key fields are the 5-tuple. The Template is shown in Figure 9.

   flowStartMilliseconds(152)[8]
   flowEndMilliseconds(153)[8]
   sourceIPv4Address(8)[4]{key}
   destinationIPv4Address(12)[4]{key}
   sourceTransportPort(7)[2]{key}
   destinationTransportPort(11)[2]{key}
   protocolIdentifier(4)[1]{key}
   octetDeltaCount(1)[8]

                  Figure 9: Input Template for Examples

   The data records given as input to the examples in this section are
   shown below; timestamps are given in H:MM:SS.sss format.  In this and
   subsequent figures, flowStartMilliseconds is shown in H:MM:SS.sss
   format as 'start time', flowEndMilliseconds is shown in H:MM:SS.sss
   format as 'end time', sourceIPv4Address is shown as 'source ip4' with
   the following 'port' representing sourceTransportPort,
   destinationIPv4Address is shown as 'dest ip4' with the following
   'port' representing destinationTransportPort, protocolIdentifier is
   shown as 'pt', and octetDeltaCount as 'oct'.

  start time |end time   |source ip4 |port |dest ip4      |port|pt|  oct
  9:00:00.138 9:00:00.138 192.0.2.2   47113 192.0.2.131    53   17   119
  9:00:03.246 9:00:03.246 192.0.2.2   22153 192.0.2.131    53   17    83
  9:00:00.478 9:00:03.486 192.0.2.2   52420 198.51.100.2   443  6   1637
  9:00:07.172 9:00:07.172 192.0.2.3   56047 192.0.2.131    53   17   111
  9:00:07.309 9:00:14.861 192.0.2.3   41183 198.51.100.67  80   6  16838
  9:00:03.556 9:00:19.876 192.0.2.2   17606 198.51.100.68  80   6  11538
  9:00:25.210 9:00:25.210 192.0.2.3   47113 192.0.2.131    53   17   119
  9:00:26.358 9:00:30.198 192.0.2.3   48458 198.51.100.133 80   6   2973
  9:00:29.213 9:01:00.061 192.0.2.4   61295 198.51.100.2   443  6   8350
  9:04:00.207 9:04:04.431 203.0.113.3 41256 198.51.100.133 80   6    778
  9:03:59.624 9:04:06.984 203.0.113.3 51662 198.51.100.3   80   6    883
  9:00:30.532 9:06:15.402 192.0.2.2   37581 198.51.100.2   80   6  15420
  9:06:56.813 9:06:59.821 203.0.113.3 52572 198.51.100.2   443  6   1637
  9:06:30.565 9:07:00.261 203.0.113.3 49914 198.51.100.133 80   6    561
  9:06:55.160 9:07:05.208 192.0.2.2   50824 198.51.100.2   443  6   1899
  9:06:49.322 9:07:05.322 192.0.2.3   34597 198.51.100.3   80   6   1284
  9:07:05.849 9:07:09.625 203.0.113.3 58907 198.51.100.4   80   6   2670
  9:10:45.161 9:10:45.161 192.0.2.4   22478 192.0.2.131    53   17    75
  9:10:45.209 9:11:01.465 192.0.2.4   49513 198.51.100.68  80   6   3374
  9:10:57.094 9:11:00.614 192.0.2.4   64832 198.51.100.67  80   6    138
  9:10:59.770 9:11:02.842 192.0.2.3   60833 198.51.100.69  443  6   2325
  9:02:18.390 9:13:46.598 203.0.113.3 39586 198.51.100.17  80   6  11200
  9:13:53.933 9:14:06.605 192.0.2.2   19638 198.51.100.3   80   6   2869
  9:13:02.864 9:14:08.720 192.0.2.3   40429 198.51.100.4   80   6  18289

                    Figure 10: Input Data for Examples

8.1. Traffic Time Series per Source

Aggregating Flows by source IP address in time series (i.e., with a regular interval) can be used in subsequent heavy-hitter analysis and as a source parameter for statistical anomaly detection techniques. Here, the Intermediate Aggregation Process imposes an interval, aggregates the key to remove all key fields other than the source IP address, then combines the result into a stream of Aggregated Flows. The imposed interval of five minutes is longer than the majority of Flows; for those Flows crossing interval boundaries, the entire Flow is accounted to the interval containing the start time of the Flow.

   In this example, the Partially Aggregated Flows after each conceptual
   operation in the Intermediate Aggregation Process are shown.  These
   are meant to be illustrative of the conceptual operations only, and
   not to suggest an implementation (indeed, the example shown here
   would not necessarily be the most efficient method for performing
   these operations).  Subsequent examples will omit the Partially
   Aggregated Flows for brevity.

   The input to this process could be any Flow Record containing a
   source IP address and octet counter; consider for this example the
   Template and data from the introduction to Section 8 (Figures 9 and
   10).  The Intermediate Aggregation Process would then output records
   containing just timestamps, source IP, and octetDeltaCount, as in
   Figure 11.

   flowStartMilliseconds(152)[8]
   flowEndMilliseconds(153)[8]
   sourceIPv4Address(8)[4]
   octetDeltaCount(1)[8]

           Figure 11: Output Template for Time Series per Source

   Assume the goal is to get 5-minute (300 s) time series of octet
   counts per source IP address.  The aggregation operations would then
   be arranged as in Figure 12.

                    Original Flows
                          |
                          V
              +-----------------------+
              | interval distribution |
              |  * impose uniform     |
              |    300s time interval |
              +-----------------------+
                  |
                  | Partially Aggregated Flows
                  V
   +------------------------+
   |  key aggregation       |
   |   * reduce key to only |
   |     sourceIPv4Address  |
   +------------------------+
                  |
                  | Partially Aggregated Flows
                  V
             +-------------------------+
             |  aggregate combination  |
             |   * sum octetDeltaCount |
             +-------------------------+
                          |
                          V
                  Aggregated Flows

       Figure 12: Aggregation Operations for Time Series per Source

   After applying the interval distribution step to the source data in
   Figure 10, only the time intervals have changed; the Partially
   Aggregated Flows are shown in Figure 13.  Note that interval
   distribution follows the default Start Interval policy; that is, the
   entire Flow is accounted to the interval containing the Flow's start
   time.

  start time |end time   |source ip4 |port |dest ip4      |port|pt|  oct
  9:00:00.000 9:05:00.000 192.0.2.2   47113 192.0.2.131    53   17   119
  9:00:00.000 9:05:00.000 192.0.2.2   22153 192.0.2.131    53   17    83
  9:00:00.000 9:05:00.000 192.0.2.2   52420 198.51.100.2   443  6   1637
  9:00:00.000 9:05:00.000 192.0.2.3   56047 192.0.2.131    53   17   111
  9:00:00.000 9:05:00.000 192.0.2.3   41183 198.51.100.67  80   6  16838
  9:00:00.000 9:05:00.000 192.0.2.2   17606 198.51.100.68  80   6  11538
  9:00:00.000 9:05:00.000 192.0.2.3   47113 192.0.2.131    53   17   119
  9:00:00.000 9:05:00.000 192.0.2.3   48458 198.51.100.133 80   6   2973
  9:00:00.000 9:05:00.000 192.0.2.4   61295 198.51.100.2   443  6   8350
  9:00:00.000 9:05:00.000 203.0.113.3 41256 198.51.100.133 80   6    778
  9:00:00.000 9:05:00.000 203.0.113.3 51662 198.51.100.3   80   6    883
  9:00:00.000 9:05:00.000 192.0.2.2   37581 198.51.100.2   80   6  15420
  9:00:00.000 9:05:00.000 203.0.113.3 39586 198.51.100.17  80   6  11200
  9:05:00.000 9:10:00.000 203.0.113.3 52572 198.51.100.2   443  6   1637
  9:05:00.000 9:10:00.000 203.0.113.3 49914 198.51.100.133 80   6    561
  9:05:00.000 9:10:00.000 192.0.2.2   50824 198.51.100.2   443  6   1899
  9:05:00.000 9:10:00.000 192.0.2.3   34597 198.51.100.3   80   6   1284
  9:05:00.000 9:10:00.000 203.0.113.3 58907 198.51.100.4   80   6   2670
  9:10:00.000 9:15:00.000 192.0.2.4   22478 192.0.2.131    53   17    75
  9:10:00.000 9:15:00.000 192.0.2.4   49513 198.51.100.68  80   6   3374
  9:10:00.000 9:15:00.000 192.0.2.4   64832 198.51.100.67  80   6    138
  9:10:00.000 9:15:00.000 192.0.2.3   60833 198.51.100.69  443  6   2325
  9:10:00.000 9:15:00.000 192.0.2.2   19638 198.51.100.3   80   6   2869
  9:10:00.000 9:15:00.000 192.0.2.3   40429 198.51.100.4   80   6  18289

         Figure 13: Interval Imposition for Time Series per Source
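The interval imposition shown in Figure 13 can be sketched as below. The helper names are hypothetical; the sketch assumes timestamps in the H:MM:SS.sss notation used in the figures and applies the Start Interval policy by truncating the Flow start time to a multiple of the interval length.

```python
def parse_time(text):
    """Convert H:MM:SS.sss into integer milliseconds since midnight."""
    hours, minutes, seconds = text.split(":")
    whole, millis = seconds.split(".")
    total_seconds = (int(hours) * 60 + int(minutes)) * 60 + int(whole)
    return total_seconds * 1000 + int(millis)


def format_time(ms):
    """Convert integer milliseconds since midnight back into H:MM:SS.sss."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def impose_interval(start_text, interval_ms=300_000):
    """Return the (start, end) of the interval containing the Flow start."""
    start = parse_time(start_text)
    interval_start = start - start % interval_ms
    return format_time(interval_start), format_time(interval_start + interval_ms)


# The long Flow starting at 9:02:18.390 and ending at 9:13:46.598 is
# accounted entirely to the first interval, as in Figure 13.
first = impose_interval("9:02:18.390")
second = impose_interval("9:06:56.813")
third = impose_interval("9:13:02.864")
```

Only the start time is consulted, which is why the end time of the Original Flow does not appear as a parameter.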

   After the key aggregation step, all Flow Keys except the source IP
   address have been discarded, as shown in Figure 14.  This leaves
   duplicate Partially Aggregated Flows to be combined in the final
   operation.

   start time |end time   |source ip4 |octets
   9:00:00.000 9:05:00.000 192.0.2.2      119
   9:00:00.000 9:05:00.000 192.0.2.2       83
   9:00:00.000 9:05:00.000 192.0.2.2     1637
   9:00:00.000 9:05:00.000 192.0.2.3      111
   9:00:00.000 9:05:00.000 192.0.2.3    16838
   9:00:00.000 9:05:00.000 192.0.2.2    11538
   9:00:00.000 9:05:00.000 192.0.2.3      119
   9:00:00.000 9:05:00.000 192.0.2.3     2973
   9:00:00.000 9:05:00.000 192.0.2.4     8350
   9:00:00.000 9:05:00.000 203.0.113.3    778
   9:00:00.000 9:05:00.000 203.0.113.3    883
   9:00:00.000 9:05:00.000 192.0.2.2    15420
   9:00:00.000 9:05:00.000 203.0.113.3  11200
   9:05:00.000 9:10:00.000 203.0.113.3   1637
   9:05:00.000 9:10:00.000 203.0.113.3    561
   9:05:00.000 9:10:00.000 192.0.2.2     1899
   9:05:00.000 9:10:00.000 192.0.2.3     1284
   9:05:00.000 9:10:00.000 203.0.113.3   2670
   9:10:00.000 9:15:00.000 192.0.2.4       75
   9:10:00.000 9:15:00.000 192.0.2.4     3374
   9:10:00.000 9:15:00.000 192.0.2.4      138
   9:10:00.000 9:15:00.000 192.0.2.3     2325
   9:10:00.000 9:15:00.000 192.0.2.2     2869
   9:10:00.000 9:15:00.000 192.0.2.3    18289

           Figure 14: Key Aggregation for Time Series per Source

   Aggregate combination sums the counters per key and interval; the
   summations of the first two keys and intervals are shown in detail in
   Figure 15.

     start time |end time   |source ip4 |octets
     9:00:00.000 9:05:00.000 192.0.2.2      119
     9:00:00.000 9:05:00.000 192.0.2.2       83
     9:00:00.000 9:05:00.000 192.0.2.2     1637
     9:00:00.000 9:05:00.000 192.0.2.2    11538
   + 9:00:00.000 9:05:00.000 192.0.2.2    15420
                                          -----
   = 9:00:00.000 9:05:00.000 192.0.2.2    28797

     9:00:00.000 9:05:00.000 192.0.2.3      111
     9:00:00.000 9:05:00.000 192.0.2.3    16838
     9:00:00.000 9:05:00.000 192.0.2.3      119
   + 9:00:00.000 9:05:00.000 192.0.2.3     2973
                                          -----
   = 9:00:00.000 9:05:00.000 192.0.2.3    20041

             Figure 15: Summation during Aggregate Combination

   This can be applied to each set of Partially Aggregated Flows to
   produce the final Aggregated Flows that are shown in Figure 16, as
   exported by the Template in Figure 11.

   start time |end time   |source ip4 |octets
   9:00:00.000 9:05:00.000 192.0.2.2    28797
   9:00:00.000 9:05:00.000 192.0.2.3    20041
   9:00:00.000 9:05:00.000 192.0.2.4     8350
   9:00:00.000 9:05:00.000 203.0.113.3  12861
   9:05:00.000 9:10:00.000 192.0.2.2     1899
   9:05:00.000 9:10:00.000 192.0.2.3     1284
   9:05:00.000 9:10:00.000 203.0.113.3   4868
   9:10:00.000 9:15:00.000 192.0.2.2     2869
   9:10:00.000 9:15:00.000 192.0.2.3    20614
   9:10:00.000 9:15:00.000 192.0.2.4     3587

          Figure 16: Aggregated Flows for Time Series per Source
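The complete example can be reproduced with a few lines of code. The sketch below folds the three conceptual operations into one pass, which is one possible arrangement and not a suggested implementation; the function names are hypothetical, and the input is the start time, source address, and octet count of each record in Figure 10. Interval start times are expressed in seconds since midnight (9:00:00 is 32400).

```python
from collections import defaultdict

# (start time, source IPv4 address, octets) for each record in Figure 10.
FLOWS = [
    ("9:00:00.138", "192.0.2.2", 119), ("9:00:03.246", "192.0.2.2", 83),
    ("9:00:00.478", "192.0.2.2", 1637), ("9:00:07.172", "192.0.2.3", 111),
    ("9:00:07.309", "192.0.2.3", 16838), ("9:00:03.556", "192.0.2.2", 11538),
    ("9:00:25.210", "192.0.2.3", 119), ("9:00:26.358", "192.0.2.3", 2973),
    ("9:00:29.213", "192.0.2.4", 8350), ("9:04:00.207", "203.0.113.3", 778),
    ("9:03:59.624", "203.0.113.3", 883), ("9:00:30.532", "192.0.2.2", 15420),
    ("9:06:56.813", "203.0.113.3", 1637), ("9:06:30.565", "203.0.113.3", 561),
    ("9:06:55.160", "192.0.2.2", 1899), ("9:06:49.322", "192.0.2.3", 1284),
    ("9:07:05.849", "203.0.113.3", 2670), ("9:10:45.161", "192.0.2.4", 75),
    ("9:10:45.209", "192.0.2.4", 3374), ("9:10:57.094", "192.0.2.4", 138),
    ("9:10:59.770", "192.0.2.3", 2325), ("9:02:18.390", "203.0.113.3", 11200),
    ("9:13:53.933", "192.0.2.2", 2869), ("9:13:02.864", "192.0.2.3", 18289),
]


def interval_start_seconds(start_text, interval=300):
    """Interval distribution: truncate the start time to the interval."""
    hours, minutes, seconds = start_text.split(":")
    total = int(hours) * 3600 + int(minutes) * 60 + int(float(seconds))
    return total - total % interval


def aggregate_by_source(flows, interval=300):
    """Key aggregation and aggregate combination in a single pass.

    Using (interval start, source address) as the dictionary key drops
    all other Flow Key fields; summing into it combines the counters.
    """
    totals = defaultdict(int)
    for start_text, source, octets in flows:
        key = (interval_start_seconds(start_text, interval), source)
        totals[key] += octets
    return dict(totals)


totals = aggregate_by_source(FLOWS)
```

The resulting dictionary holds ten entries, one per row of Figure 16; for example, the entry for 192.0.2.2 in the interval starting at 9:00:00 is 28797 octets.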

8.2. Core Traffic Matrix

Aggregating Flows by source and destination ASN in time series is used to generate core traffic matrices. The core traffic matrix provides a view of the state of the routes within a network, and it can be used for long-term planning of changes to network design based on traffic demand. Here, imposed time intervals are generally much longer than active Flow timeouts. The traffic matrix is reported in terms of octets, packets, and flows, as each of these values may have a subtly different effect on capacity planning.

   This example demonstrates key aggregation using derived keys and
   Original Flow counting.  While some Original Flows may be generated
   by Exporting Processes on forwarding devices, and therefore contain
   the bgpSourceAsNumber and bgpDestinationAsNumber Information
   Elements, Original Flows from Exporting Processes on dedicated
   measurement devices without routing data contain only a
   destinationIPv[46]Address.  For these Flows, the Mediator must look
   up a next-hop AS from an IP-to-AS table, replacing source and
   destination addresses with ASNs.  The table used in this example is
   shown in Figure 17.  (Note that due to limited example address space,
   in this example we ignore the common practice of routing only blocks
   of /24 or larger.)

   prefix           |ASN
   192.0.2.0/25      64496
   192.0.2.128/25    64497
   198.51.100.0/24   64498
   203.0.113.0/24    64499

                        Figure 17: Example ASN Map
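A lookup against the map in Figure 17 can be sketched with the Python standard `ipaddress` module. The names `ASN_MAP` and `lookup_asn` are hypothetical, and the linear scan with longest-prefix selection is chosen for brevity; a production Mediator would more likely use a trie or a similar structure.

```python
import ipaddress

# The IP-to-AS table from Figure 17.
ASN_MAP = [
    (ipaddress.ip_network("192.0.2.0/25"), 64496),
    (ipaddress.ip_network("192.0.2.128/25"), 64497),
    (ipaddress.ip_network("198.51.100.0/24"), 64498),
    (ipaddress.ip_network("203.0.113.0/24"), 64499),
]


def lookup_asn(address):
    """Return the ASN of the longest matching prefix, or None if unmapped."""
    addr = ipaddress.ip_address(address)
    matches = [(net.prefixlen, asn) for net, asn in ASN_MAP if addr in net]
    if not matches:
        return None
    return max(matches)[1]  # the longest prefix wins


# The first record of Figure 10: 192.0.2.2 -> 192.0.2.131.
source_asn = lookup_asn("192.0.2.2")
destination_asn = lookup_asn("192.0.2.131")
```

Applied to each Partially Aggregated Flow, this replaces the address pair with an ASN pair; the first record of Figure 10, for example, maps to source AS64496 and destination AS64497.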

   The Template for Aggregated Flows produced by this example is shown
   in Figure 18.

   flowStartMilliseconds(152)[8]
   flowEndMilliseconds(153)[8]
   bgpSourceAsNumber(16)[4]
   bgpDestinationAsNumber(17)[4]
   octetDeltaCount(1)[8]

               Figure 18: Output Template for Traffic Matrix

   Assume the goal is to get 60-minute time series of octet counts per
   source/destination ASN pair.  The aggregation operations would then
   be arranged as in Figure 19.

                    Original Flows
                          |
                          V
              +-----------------------+
              | interval distribution |
              |  * impose uniform     |
              |    3600s time interval|
              +-----------------------+
                  |
                  | Partially Aggregated Flows
                  V
   +------------------------+
   |  key aggregation       |
   |  * reduce key to only  |
   |    sourceIPv4Address + |
   |    destIPv4Address     |
   +------------------------+
                  |
                  V
   +------------------------+
   |  key aggregation       |
   |  * replace addresses   |
   |    with ASN from map   |
   +------------------------+
                  |
                  | Partially Aggregated Flows
                  V
             +-------------------------+
             |  aggregate combination  |
             |   * sum octetDeltaCount |
             +-------------------------+
                          |
                          V
                  Aggregated Flows

           Figure 19: Aggregation Operations for Traffic Matrix

   After applying the interval distribution step to the source data in
   Figure 10, the Partially Aggregated Flows are shown in Figure 20.
   Note that the Flows are identical to those in the interval
   distribution step in the previous example, except the chosen interval
   (1 hour, 3600 seconds) is different; therefore, all the Flows fit
   into a single interval.

   start time |end time |source ip4 |port |dest ip4      |port|pt|  oct
   9:00:00     10:00:00  192.0.2.2   47113 192.0.2.131    53   17   119
   9:00:00     10:00:00  192.0.2.2   22153 192.0.2.131    53   17    83
   9:00:00     10:00:00  192.0.2.2   52420 198.51.100.2   443  6   1637
   9:00:00     10:00:00  192.0.2.3   56047 192.0.2.131    53   17   111
   9:00:00     10:00:00  192.0.2.3   41183 198.51.100.67  80   6  16838
   9:00:00     10:00:00  192.0.2.2   17606 198.51.100.68  80   6  11538
   9:00:00     10:00:00  192.0.2.3   47113 192.0.2.131    53   17   119
   9:00:00     10:00:00  192.0.2.3   48458 198.51.100.133 80   6   2973
   9:00:00     10:00:00  192.0.2.4   61295 198.51.100.2   443  6   8350
   9:00:00     10:00:00  203.0.113.3 41256 198.51.100.133 80   6    778
   9:00:00     10:00:00  203.0.113.3 51662 198.51.100.3   80   6    883
   9:00:00     10:00:00  192.0.2.2   37581 198.51.100.2   80   6  15420
   9:00:00     10:00:00  203.0.113.3 52572 198.51.100.2   443  6   1637
   9:00:00     10:00:00  203.0.113.3 49914 198.51.100.133 80   6    561
   9:00:00     10:00:00  192.0.2.2   50824 198.51.100.2   443  6   1899
   9:00:00     10:00:00  192.0.2.3   34597 198.51.100.3   80   6   1284
   9:00:00     10:00:00  203.0.113.3 58907 198.51.100.4   80   6   2670
   9:00:00     10:00:00  192.0.2.4   22478 192.0.2.131    53   17    75
   9:00:00     10:00:00  192.0.2.4   49513 198.51.100.68  80   6   3374
   9:00:00     10:00:00  192.0.2.4   64832 198.51.100.67  80   6    138
   9:00:00     10:00:00  192.0.2.3   60833 198.51.100.69  443  6   2325
   9:00:00     10:00:00  203.0.113.3 39586 198.51.100.17  80   6  11200
   9:00:00     10:00:00  192.0.2.2   19638 198.51.100.3   80   6   2869
   9:00:00     10:00:00  192.0.2.3   40429 198.51.100.4   80   6  18289

             Figure 20: Interval Imposition for Traffic Matrix

   The next steps are to discard irrelevant key fields and to replace
   the source and destination addresses with source and destination ASNs
   in the map; the results of these key aggregation steps are shown in
   Figure 21.
   start time |end time |source ASN |dest ASN |octets
   9:00:00     10:00:00  AS64496     AS64497      119
   9:00:00     10:00:00  AS64496     AS64497       83
   9:00:00     10:00:00  AS64496     AS64498     1637
   9:00:00     10:00:00  AS64496     AS64497      111
   9:00:00     10:00:00  AS64496     AS64498    16838
   9:00:00     10:00:00  AS64496     AS64498    11538
   9:00:00     10:00:00  AS64496     AS64497      119
   9:00:00     10:00:00  AS64496     AS64498     2973
   9:00:00     10:00:00  AS64496     AS64498     8350
   9:00:00     10:00:00  AS64499     AS64498      778
   9:00:00     10:00:00  AS64499     AS64498      883
   9:00:00     10:00:00  AS64496     AS64498    15420
   9:00:00     10:00:00  AS64499     AS64498     1637
   9:00:00     10:00:00  AS64499     AS64498      561
   9:00:00     10:00:00  AS64496     AS64498     1899
   9:00:00     10:00:00  AS64496     AS64498     1284
   9:00:00     10:00:00  AS64499     AS64498     2670
   9:00:00     10:00:00  AS64496     AS64497       75
   9:00:00     10:00:00  AS64496     AS64498     3374
   9:00:00     10:00:00  AS64496     AS64498      138
   9:00:00     10:00:00  AS64496     AS64498     2325
   9:00:00     10:00:00  AS64499     AS64498    11200
   9:00:00     10:00:00  AS64496     AS64498     2869
   9:00:00     10:00:00  AS64496     AS64498    18289

              Figure 21: Key Aggregation for Traffic Matrix:
                         Reduction and Replacement

   Finally, aggregate combination sums the counters per key and
   interval.  The resulting Aggregated Flows containing the traffic
   matrix, shown in Figure 22, are then exported using the Template in
   Figure 18.  Note that these Aggregated Flows represent a sparse
   matrix: AS pairs for which no traffic was received have no
   corresponding record in the output.

   start time  end time  source ASN  dest ASN  octets
   9:00:00     10:00:00  AS64496     AS64497      507
   9:00:00     10:00:00  AS64496     AS64498    86934
   9:00:00     10:00:00  AS64499     AS64498    17729

              Figure 22: Aggregated Flows for Traffic Matrix

   The output of this operation is suitable for re-aggregation: that is,
   traffic matrices from single links or Observation Points can be
   aggregated through the same interval imposition and aggregate
   combination steps in order to build a traffic matrix for an entire
   network.
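
The key aggregation and aggregate combination steps of this example can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the prefix-to-ASN map is an assumption inferred from the address and ASN columns of Figures 20 and 21, and the names PREFIX_TO_ASN, asn_for, and traffic_matrix are hypothetical. Because all Flows fall into the single 9:00:00 to 10:00:00 interval, the timestamps are omitted from the input rows.

```python
import ipaddress
from collections import defaultdict

# Assumed prefix-to-ASN map, inferred from Figures 20 and 21.
PREFIX_TO_ASN = {
    ipaddress.ip_network("192.0.2.0/25"): 64496,
    ipaddress.ip_network("192.0.2.128/25"): 64497,
    ipaddress.ip_network("198.51.100.0/24"): 64498,
    ipaddress.ip_network("203.0.113.0/24"): 64499,
}

# (source address, destination address, octets), one row per Flow in
# Figure 20. Row 14 uses 198.51.100.133, consistent with the AS64498
# mapping shown for that row in Figure 21.
FLOWS = [
    ("192.0.2.2", "192.0.2.131", 119),
    ("192.0.2.2", "192.0.2.131", 83),
    ("192.0.2.2", "198.51.100.2", 1637),
    ("192.0.2.3", "192.0.2.131", 111),
    ("192.0.2.3", "198.51.100.67", 16838),
    ("192.0.2.2", "198.51.100.68", 11538),
    ("192.0.2.3", "192.0.2.131", 119),
    ("192.0.2.3", "198.51.100.133", 2973),
    ("192.0.2.4", "198.51.100.2", 8350),
    ("203.0.113.3", "198.51.100.133", 778),
    ("203.0.113.3", "198.51.100.3", 883),
    ("192.0.2.2", "198.51.100.2", 15420),
    ("203.0.113.3", "198.51.100.2", 1637),
    ("203.0.113.3", "198.51.100.133", 561),
    ("192.0.2.2", "198.51.100.2", 1899),
    ("192.0.2.3", "198.51.100.3", 1284),
    ("203.0.113.3", "198.51.100.4", 2670),
    ("192.0.2.4", "192.0.2.131", 75),
    ("192.0.2.4", "198.51.100.68", 3374),
    ("192.0.2.4", "198.51.100.67", 138),
    ("192.0.2.3", "198.51.100.69", 2325),
    ("203.0.113.3", "198.51.100.17", 11200),
    ("192.0.2.2", "198.51.100.3", 2869),
    ("192.0.2.3", "198.51.100.4", 18289),
]


def asn_for(address):
    """Key replacement: map an address to an ASN by longest-prefix match."""
    ip = ipaddress.ip_address(address)
    matches = [net for net in PREFIX_TO_ASN if ip in net]
    if not matches:
        raise KeyError(f"no ASN mapping for {address}")
    return PREFIX_TO_ASN[max(matches, key=lambda net: net.prefixlen)]


def traffic_matrix(flows):
    """Aggregate combination: sum octets per (source ASN, dest ASN) key."""
    matrix = defaultdict(int)
    for src, dst, octets in flows:
        matrix[(asn_for(src), asn_for(dst))] += octets
    return dict(matrix)


MATRIX = traffic_matrix(FLOWS)
for (src_asn, dst_asn), octets in sorted(MATRIX.items()):
    print(f"AS{src_asn}  AS{dst_asn}  {octets:>6}")
```

Only AS pairs that carried traffic appear as keys, which mirrors the sparse-matrix property noted for Figure 22.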

8.3. Distinct Source Count per Destination Endpoint

Aggregating Flows by destination address and port, and counting distinct sources aggregated away, can be used as part of passive service inventory and host characterization. This example shows aggregation as an analysis technique, performed on source data stored in an IPFIX File. As the Transport Session in this File is bounded, removal of all timestamp information allows summarization of the entire time interval contained within the File. Removal of timing information during interval imposition is equivalent to an infinitely long imposed time interval. This example demonstrates both how infinite intervals work and how distinct counters work. The aggregation operations are summarized in Figure 23.
                    Original Flows
                          |
                          V
              +-----------------------+
              | interval distribution |
              |  * discard timestamps |
              +-----------------------+
                  |
                  | Partially Aggregated Flows
                  V
   +----------------------------+
   |  value aggregation         |
   |  * discard octetDeltaCount |
   +----------------------------+
                  |
                  | Partially Aggregated Flows
                  V
   +----------------------------+
   |  key aggregation           |
   |   * reduce key to only     |
   |     destIPv4Address +      |
   |     destTransportPort,     |
   |   * count distinct sources |
   +----------------------------+
                  |
                  | Partially Aggregated Flows
                  V
       +----------------------------------------------+
       |  aggregate combination                       |
       |   * no-op (distinct sources already counted) |
       +----------------------------------------------+
                          |
                          V
                  Aggregated Flows

            Figure 23: Aggregation Operations for Source Count

   The Template for Aggregated Flows produced by this example is shown
   in Figure 24.

   destinationIPv4Address(12)[4]
   destinationTransportPort(11)[2]
   distinctCountOfSourceIPAddress(378)[8]

                Figure 24: Output Template for Source Count
   Interval distribution, in this case, merely discards the timestamp
   information from the Original Flows in Figure 10, and as such is not
   shown.  Likewise, the value aggregation step simply discards the
   octetDeltaCount value field.  The key aggregation step reduces the
   key to the destinationIPv4Address and destinationTransportPort,
   counting the distinct source addresses.  Since this count is already
   the final output of the aggregation, the aggregate combination
   operation is a no-op; the resulting Aggregated Flows are shown in
   Figure 25.

   dest ip4      |port |dist src
   192.0.2.131    53           3
   198.51.100.2   80           1
   198.51.100.2   443          3
   198.51.100.67  80           2
   198.51.100.68  80           2
   198.51.100.133 80           2
   198.51.100.3   80           3
   198.51.100.4   80           2
   198.51.100.17  80           1
   198.51.100.69  443          1

               Figure 25: Aggregated Flows for Source Count
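
The distinct-count key aggregation above can be sketched as follows. This is an illustrative sketch only: the function name distinct_sources is hypothetical, and the input rows are the source address, destination address, and destination port columns of Figure 20 (with row 14 taken as 198.51.100.133, matching the endpoints listed in Figure 25). Timestamps and octet counts are already discarded, as the earlier steps require.

```python
from collections import defaultdict

# (source address, destination address, destination port)
FLOWS = [
    ("192.0.2.2", "192.0.2.131", 53),
    ("192.0.2.2", "192.0.2.131", 53),
    ("192.0.2.2", "198.51.100.2", 443),
    ("192.0.2.3", "192.0.2.131", 53),
    ("192.0.2.3", "198.51.100.67", 80),
    ("192.0.2.2", "198.51.100.68", 80),
    ("192.0.2.3", "192.0.2.131", 53),
    ("192.0.2.3", "198.51.100.133", 80),
    ("192.0.2.4", "198.51.100.2", 443),
    ("203.0.113.3", "198.51.100.133", 80),
    ("203.0.113.3", "198.51.100.3", 80),
    ("192.0.2.2", "198.51.100.2", 80),
    ("203.0.113.3", "198.51.100.2", 443),
    ("203.0.113.3", "198.51.100.133", 80),
    ("192.0.2.2", "198.51.100.2", 443),
    ("192.0.2.3", "198.51.100.3", 80),
    ("203.0.113.3", "198.51.100.4", 80),
    ("192.0.2.4", "192.0.2.131", 53),
    ("192.0.2.4", "198.51.100.68", 80),
    ("192.0.2.4", "198.51.100.67", 80),
    ("192.0.2.3", "198.51.100.69", 443),
    ("203.0.113.3", "198.51.100.17", 80),
    ("192.0.2.2", "198.51.100.3", 80),
    ("192.0.2.3", "198.51.100.4", 80),
]


def distinct_sources(flows):
    """Reduce the key to (dest address, dest port) and count the
    distinct source addresses that were aggregated away."""
    sources = defaultdict(set)
    for src, dst, dport in flows:
        sources[(dst, dport)].add(src)
    return {key: len(srcs) for key, srcs in sources.items()}


COUNTS = distinct_sources(FLOWS)
for (dst, dport), count in COUNTS.items():
    print(f"{dst:<15} {dport:<4} {count}")
```

Keeping a set per key gives exact counts; a resource-constrained implementation might substitute an approximate structure, as discussed in Section 6.1.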

8.4. Traffic Time Series per Source with Counter Distribution

Returning to the example in Section 8.1, note that our source data contains some Flows with durations longer than the imposed interval of five minutes. The default method for dealing with such Flows is to account them to the interval containing the Flow's start time.

In this example, the same data is aggregated using the same arrangement of operations and the same output Template as in Section 8.1, but using a different counter distribution policy, Simple Uniform Distribution, as described in Section 5.1.1. In order to do this, the Exporting Process first exports the Aggregate Counter Distribution Options Template, as in Figure 26.

   templateId(145)[2]{scope}
   valueDistributionMethod(384)[1]

       Figure 26: Aggregate Counter Distribution Options Template

This Template is followed by an Aggregate Counter Distribution Record described by this Template; assuming the output Template in Figure 11 has ID 257, this record would appear as in Figure 27.

   template ID | value distribution method
           257   4 (simple uniform)

             Figure 27: Aggregate Counter Distribution Record
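
As a rough illustration of how this metadata might look on the wire, the sketch below packs the Options Template and the record of Figure 27 into IPFIX Sets, following the Set and Options Template Record layouts of [RFC7011]. Several details are assumptions made for the example: the Options Template ID 258 is arbitrary, the function names are hypothetical, the IPFIX Message Header is omitted, and no Set padding is added. The Information Element numbers are those of the IANA IPFIX registry (templateId is 145, valueDistributionMethod is 384).

```python
import struct

OPTIONS_TEMPLATE_SET_ID = 3          # Set ID of Options Template Sets
TEMPLATE_ID_IE = 145                 # templateId, 2 octets, used as scope
VALUE_DISTRIBUTION_METHOD_IE = 384   # valueDistributionMethod, 1 octet

OPTIONS_TEMPLATE_ID = 258            # hypothetical ID chosen for this sketch
OUTPUT_TEMPLATE_ID = 257             # output Template ID assumed in the text
SIMPLE_UNIFORM = 4                   # Simple Uniform Distribution


def options_template_set(template_id):
    """Build an Options Template Set holding one Options Template Record."""
    fields = [(TEMPLATE_ID_IE, 2), (VALUE_DISTRIBUTION_METHOD_IE, 1)]
    # Record header: Template ID, Field Count, Scope Field Count (1 scope field).
    record = struct.pack("!HHH", template_id, len(fields), 1)
    for ie_id, length in fields:
        record += struct.pack("!HH", ie_id, length)
    # Set header: Set ID and total Set length including the 4-octet header.
    return struct.pack("!HH", OPTIONS_TEMPLATE_SET_ID, 4 + len(record)) + record


def distribution_record_set(options_template_id, described_template_id, method):
    """Build a Data Set with one Aggregate Counter Distribution Record."""
    record = struct.pack("!HB", described_template_id, method)
    return struct.pack("!HH", options_template_id, 4 + len(record)) + record


TEMPLATE_SET = options_template_set(OPTIONS_TEMPLATE_ID)
RECORD_SET = distribution_record_set(
    OPTIONS_TEMPLATE_ID, OUTPUT_TEMPLATE_ID, SIMPLE_UNIFORM
)
print(TEMPLATE_SET.hex())
print(RECORD_SET.hex())
```

The Data Set carries the Options Template ID as its Set ID, which is how a Collecting Process would associate the record with the Template that describes it.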

   Following metadata export, the aggregation steps follow as before.
   However, two long Flows are distributed across multiple intervals in
   the interval imposition step, as indicated with "*" in Figure 28.
   Note the uneven distribution of the three-interval, 11200-octet Flow
   into three Partially Aggregated Flows of 3733, 3733, and 3734 octets;
   this ensures no cumulative error is injected by the interval
   distribution step.

 start time |end time   |source ip4 |port |dest ip4      |port|pt|  oct
 9:00:00.000 9:05:00.000 192.0.2.2   47113 192.0.2.131    53   17   119
 9:00:00.000 9:05:00.000 192.0.2.2   22153 192.0.2.131    53   17    83
 9:00:00.000 9:05:00.000 192.0.2.2   52420 198.51.100.2   443  6   1637
 9:00:00.000 9:05:00.000 192.0.2.3   56047 192.0.2.131    53   17   111
 9:00:00.000 9:05:00.000 192.0.2.3   41183 198.51.100.67  80   6  16838
 9:00:00.000 9:05:00.000 192.0.2.2   17606 198.51.100.68  80   6  11538
 9:00:00.000 9:05:00.000 192.0.2.3   47113 192.0.2.131    53   17   119
 9:00:00.000 9:05:00.000 192.0.2.3   48458 198.51.100.133 80   6   2973
 9:00:00.000 9:05:00.000 192.0.2.4   61295 198.51.100.2   443  6   8350
 9:00:00.000 9:05:00.000 203.0.113.3 41256 198.51.100.133 80   6    778
 9:00:00.000 9:05:00.000 203.0.113.3 51662 198.51.100.3   80   6    883
 9:00:00.000 9:05:00.000 192.0.2.2   37581 198.51.100.2   80   6   7710*
 9:00:00.000 9:05:00.000 203.0.113.3 39586 198.51.100.17  80   6   3733*
 9:05:00.000 9:10:00.000 203.0.113.3 52572 198.51.100.2   443  6   1637
 9:05:00.000 9:10:00.000 203.0.113.3 49914 198.51.100.133 80   6    561
 9:05:00.000 9:10:00.000 192.0.2.2   50824 198.51.100.2   443  6   1899
 9:05:00.000 9:10:00.000 192.0.2.3   34597 198.51.100.3   80   6   1284
 9:05:00.000 9:10:00.000 203.0.113.3 58907 198.51.100.4   80   6   2670
 9:05:00.000 9:10:00.000 192.0.2.2   37581 198.51.100.2   80   6   7710*
 9:05:00.000 9:10:00.000 203.0.113.3 39586 198.51.100.17  80   6   3733*
 9:10:00.000 9:15:00.000 192.0.2.4   22478 192.0.2.131    53   17    75
 9:10:00.000 9:15:00.000 192.0.2.4   49513 198.51.100.68  80   6   3374
 9:10:00.000 9:15:00.000 192.0.2.4   64832 198.51.100.67  80   6    138
 9:10:00.000 9:15:00.000 192.0.2.3   60833 198.51.100.69  443  6   2325
 9:10:00.000 9:15:00.000 192.0.2.2   19638 198.51.100.3   80   6   2869
 9:10:00.000 9:15:00.000 192.0.2.3   40429 198.51.100.4   80   6  18289
 9:10:00.000 9:15:00.000 203.0.113.3 39586 198.51.100.17  80   6   3734*

  Figure 28: Distributed Interval Imposition for Time Series per Source
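
The counter split marked with "*" in Figure 28 can be reproduced with a short sketch. It is an illustration under stated assumptions rather than a normative algorithm: the function names are hypothetical, times are whole seconds since midnight, the example timestamps are invented for the demonstration, and the choice to place the division remainder in the last interval is inferred from the 3733, 3733, 3734 split shown in the figure.

```python
def interval_starts(start_s, end_s, length_s=300):
    """Return the start times of all intervals a Flow covers with
    nonzero duration (times in whole seconds since midnight)."""
    first = start_s - start_s % length_s
    # A Flow ending exactly on a boundary does not reach the next interval.
    last_instant = end_s - 1 if end_s > start_s else end_s
    last = last_instant - last_instant % length_s
    return list(range(first, last + 1, length_s))


def distribute_uniform(total, n_intervals):
    """Split a counter evenly over n_intervals, adding the integer
    division remainder to the last share so the sum is preserved."""
    if n_intervals < 1:
        raise ValueError("a Flow must cover at least one interval")
    share = total // n_intervals
    shares = [share] * n_intervals
    shares[-1] += total - share * n_intervals
    return shares


# Hypothetical Flow from 9:02:00 to 9:12:00, spanning three intervals.
starts = interval_starts(9 * 3600 + 120, 9 * 3600 + 720)
print(list(zip(starts, distribute_uniform(11200, len(starts)))))
print(distribute_uniform(15420, 2))
```

Assigning the remainder to a single interval keeps the sum of the Partially Aggregated Flows equal to the original counter, which is the property the text describes as avoiding cumulative error.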

   Subsequent steps are as in Section 8.1; the results, to be exported
   using the Template shown in Figure 11, are shown in Figure 29, with
   Aggregated Flows differing from the example in Section 8.1 indicated
   by "*".
   start time |end time   |source ip4 |octets
   9:00:00.000 9:05:00.000 192.0.2.2    21087*
   9:00:00.000 9:05:00.000 192.0.2.3    20041
   9:00:00.000 9:05:00.000 192.0.2.4     8350
   9:00:00.000 9:05:00.000 203.0.113.3   5394*
   9:05:00.000 9:10:00.000 192.0.2.2     9609*
   9:05:00.000 9:10:00.000 192.0.2.3     1284
   9:05:00.000 9:10:00.000 203.0.113.3   8601*
   9:10:00.000 9:15:00.000 192.0.2.2     2869
   9:10:00.000 9:15:00.000 192.0.2.3    20614
   9:10:00.000 9:15:00.000 192.0.2.4     3587
   9:10:00.000 9:15:00.000 203.0.113.3   3734*

          Figure 29: Aggregated Flows for Time Series per Source
                         with Counter Distribution

9. Security Considerations

This document specifies the operation of an Intermediate Aggregation Process with the IPFIX protocol; the Security Considerations for the protocol itself in Section 11 of [RFC7011] therefore apply. In the common case that aggregation is performed on a Mediator, the Security Considerations for Mediators in Section 9 of [RFC6183] apply as well. As mentioned in Section 3, certain aggregation operations may tend to have an anonymizing effect on Flow data by obliterating sensitive identifiers. Aggregation may also be combined with anonymization within a Mediator, or as part of a chain of Mediators, to further leverage this effect. In any case in which an Intermediate Aggregation Process is applied as part of a data anonymization or protection scheme, or is used together with anonymization as described in [RFC6235], the Security Considerations in Section 9 of [RFC6235] apply.

10. IANA Considerations

This document specifies the creation of new IPFIX Information Elements in the IPFIX Information Element registry [IANA-IPFIX], as defined in Section 7 above. IANA has assigned Information Element numbers to these Information Elements, and entered them into the registry.

11. Acknowledgments

Special thanks to Elisa Boschi for early work on the concepts laid out in this document. Thanks to Lothar Braun, Christian Henke, and Rahul Patel for their reviews and valuable feedback, with special thanks to Paul Aitken for his multiple detailed reviews. This work is materially supported by the European Union Seventh Framework Programme under grant agreement 257315 (DEMONS).

12. References

12.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", BCP 26, RFC 5226,
              May 2008.

   [RFC7011]  Claise, B., Ed., Trammell, B., Ed., and P. Aitken,
              "Specification of the IP Flow Information Export (IPFIX)
              Protocol for the Exchange of Flow Information", STD 77,
              RFC 7011, September 2013.

12.2. Informative References

   [RFC3917]  Quittek, J., Zseby, T., Claise, B., and S. Zander,
              "Requirements for IP Flow Information Export (IPFIX)",
              RFC 3917, October 2004.

   [RFC5470]  Sadasivan, G., Brownlee, N., Claise, B., and J. Quittek,
              "Architecture for IP Flow Information Export", RFC 5470,
              March 2009.

   [RFC5472]  Zseby, T., Boschi, E., Brownlee, N., and B. Claise, "IP
              Flow Information Export (IPFIX) Applicability", RFC 5472,
              March 2009.

   [RFC5476]  Claise, B., Johnson, A., and J. Quittek, "Packet Sampling
              (PSAMP) Protocol Specifications", RFC 5476, March 2009.

   [RFC5655]  Trammell, B., Boschi, E., Mark, L., Zseby, T., and A.
              Wagner, "Specification of the IP Flow Information Export
              (IPFIX) File Format", RFC 5655, October 2009.

   [RFC5982]  Kobayashi, A. and B. Claise, "IP Flow Information Export
              (IPFIX) Mediation: Problem Statement", RFC 5982, August
              2010.

   [RFC6183]  Kobayashi, A., Claise, B., Muenz, G., and K. Ishibashi,
              "IP Flow Information Export (IPFIX) Mediation: Framework",
              RFC 6183, April 2011.

   [RFC6235]  Boschi, E. and B. Trammell, "IP Flow Anonymization
              Support", RFC 6235, May 2011.

   [RFC6728]  Muenz, G., Claise, B., and P. Aitken, "Configuration Data
              Model for the IP Flow Information Export (IPFIX) and
              Packet Sampling (PSAMP) Protocols", RFC 6728, October
              2012.

   [RFC7012]  Claise, B., Ed. and B. Trammell, Ed., "Information Model
              for IP Flow Information Export (IPFIX)", RFC 7012,
              September 2013.

   [RFC7013]  Trammell, B. and B. Claise, "Guidelines for Authors and
              Reviewers of IP Flow Information Export (IPFIX)
              Information Elements", BCP 184, RFC 7013, September 2013.

   [RFC7014]  D'Antonio, S., Zseby, T., Henke, C., and L. Peluso, "Flow
              Selection Techniques", RFC 7014, September 2013.

   [IANA-IPFIX]
              IANA, "IP Flow Information Export (IPFIX) Entities",
              <http://www.iana.org/assignments/ipfix>.

   [IPFIX-MED-PROTO]
              Claise, B., Kobayashi, A., and B. Trammell, "Operation of
              the IP Flow Information Export (IPFIX) Protocol on IPFIX
              Mediators", Work in Progress, July 2013.

Authors' Addresses

   Brian Trammell
   Swiss Federal Institute of Technology Zurich
   Gloriastrasse 35
   8092 Zurich
   Switzerland

   Phone: +41 44 632 70 13
   EMail: trammell@tik.ee.ethz.ch


   Arno Wagner
   Consecom AG
   Bleicherweg 64a
   8002 Zurich
   Switzerland

   EMail: arno@wagner.name


   Benoit Claise
   Cisco Systems, Inc.
   De Kleetlaan 6a b1
   1831 Diegem
   Belgium

   Phone: +32 2 704 5622
   EMail: bclaise@cisco.com