Internet Engineering Task Force (IETF)                        P. Eardley
Request for Comments: 7594                                            BT
Category: Informational                                        A. Morton
ISSN: 2070-1721                                                AT&T Labs
                                                              M. Bagnulo
                                                                    UC3M
                                                            T. Burbridge
                                                                      BT
                                                               P. Aitken
                                                                 Brocade
                                                               A. Akhter
                                                              Consultant
                                                          September 2015

A Framework for Large-Scale Measurement of Broadband Performance (LMAP)

Abstract
Measuring broadband service on a large scale requires a description of the logical architecture and standardisation of the key protocols that coordinate interactions between the components. This document presents an overall framework for large-scale measurements. It also defines terminology for LMAP (Large-Scale Measurement of Broadband Performance).
Status of This Memo
This document is not an Internet Standards Track specification; it is published for informational purposes.
This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are a candidate for any level of Internet Standard; see Section 2 of RFC 5741.
Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7594.
Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
Table of Contents
   1. Introduction
   2. Outline of an LMAP-Based Measurement System
   3. Terminology
   4. Constraints
     4.1. The Measurement System Is Under the Direction of a Single Organisation
     4.2. Each MA May Only Have a Single Controller at Any Point in Time
   5. Protocol Model
     5.1. Bootstrapping Process
     5.2. Control Protocol
       5.2.1. Configuration
       5.2.2. Instruction
       5.2.3. Capabilities, Failure, and Logging Information
     5.3. Operation of Measurement Tasks
       5.3.1. Starting and Stopping Measurement Tasks
       5.3.2. Overlapping Measurement Tasks
     5.4. Report Protocol
       5.4.1. Reporting of the Subscriber's Service Parameters
     5.5. Operation of LMAP over the Underlying Packet Transfer Mechanism
     5.6. Items beyond the Scope of the Initial LMAP Work
       5.6.1. End-User-Controlled Measurement System
   6. Deployment Considerations
     6.1. Controller and the Measurement System
     6.2. Measurement Agent
       6.2.1. Measurement Agent on a Networked Device
       6.2.2. Measurement Agent Embedded in a Site Gateway
       6.2.3. Measurement Agent Embedded behind a Site NAT or Firewall
       6.2.4. Multihomed Measurement Agent
       6.2.5. Measurement Agent Embedded in an ISP Network
     6.3. Measurement Peer
     6.4. Deployment Examples
   7. Security Considerations
   8. Privacy Considerations
     8.1. Categories of Entities with Information of Interest
     8.2. Examples of Sensitive Information
     8.3. Different Privacy Issues Raised by Different Sorts of Measurement Methods
     8.4. Privacy Analysis of the Communication Models
       8.4.1. MA Bootstrapping
       8.4.2. Controller <-> Measurement Agent
       8.4.3. Collector <-> Measurement Agent
       8.4.4. Measurement Peer <-> Measurement Agent
       8.4.5. Measurement Agent
       8.4.6. Storage and Reporting of Measurement Results
     8.5. Threats
       8.5.1. Surveillance
       8.5.2. Stored Data Compromise
       8.5.3. Correlation and Identification
       8.5.4. Secondary Use and Disclosure
     8.6. Mitigations
       8.6.1. Data Minimisation
       8.6.2. Anonymity
       8.6.3. Pseudonymity
       8.6.4. Other Mitigations
   9. Informative References
   Acknowledgments
   Authors' Addresses
1. Introduction
There is a desire to be able to coordinate the execution of broadband measurements and the collection of measurement results across a large-scale set of Measurement Agents (MAs). These MAs could be software-based agents on PCs, embedded agents in consumer devices (such as TVs or gaming consoles), embedded agents in service-provider-controlled devices (such as set-top boxes and home gateways), or simply dedicated probes. MAs may also be embedded on a device that is part of an ISP's network, such as a DSLAM (Digital Subscriber Line Access Multiplexer), router, Carrier Grade NAT (Network Address Translator), or ISP Gateway. It is expected that a measurement system could easily encompass a few hundred thousand or even millions of such MAs. Such a scale presents unique problems in coordination, execution, and measurement result collection.
Several use cases have been proposed for large-scale measurements, including:
o Operators: to help plan their network and identify faults
o Regulators: to benchmark several network operators and support public policy development
Further details of the use cases can be found in [RFC7536]. The LMAP framework should be useful for these, as well as other use cases, such as to help end users run diagnostic checks like a network speed test.
The LMAP framework has three basic elements: Measurement Agents, Controllers, and Collectors.
Measurement Agents (MAs) initiate the actual measurements, which are called Measurement Tasks in the LMAP terminology. In principle, there are no restrictions on the type of device in which the MA function resides.
The Controller instructs one or more MAs and communicates the set of Measurement Tasks an MA should perform and when. For example, it may instruct an MA at a home gateway: "Measure the 'UDP latency' with www.example.org; repeat every hour at xx.05". The Controller also manages an MA by instructing it on how to report the Measurement Results, for example: "Report results once a day in a batch at 4am". We refer to these as the Measurement Schedule and Report Schedule.
The Collector accepts Reports from the MAs with the Results from their Measurement Tasks.
Therefore, the MA is a device that gets Instructions from the Controller, initiates the Measurement Tasks, and reports to the Collector. The communications between these three LMAP functions are structured according to a Control Protocol and a Report Protocol.
The design goals are the following large-scale Measurement System features:
o Standardised - in terms of the Measurement Tasks that they perform, the components, the data models, and the protocols for transferring information between the components. Amongst other things, standardisation enables meaningful comparisons of measurements made of the same Metric at different times and places, and provides the operator of a Measurement System with criteria for evaluation of the different solutions that can be used for various purposes including buying decisions (such as buying the various components from different vendors). Today's systems are proprietary in some or all of these aspects.
o Large-scale - [RFC7536] envisages Measurement Agents in every home gateway and edge device such as set-top boxes and tablet computers, and located throughout the Internet as well [RFC7398]. It is expected that a Measurement System could easily encompass a few hundred thousand or even millions of Measurement Agents. Existing systems have up to a few thousand MAs (without judging how much further they could scale).
o Diversity - a Measurement System should handle Measurement Agents from different vendors that are in wired and wireless networks, can execute different sorts of Measurement Tasks, are on devices with IPv4 or IPv6 addresses, and so on.
o Privacy Respecting - the protocols and procedures should respect the sensitive information of all those involved in measurements.
2. Outline of an LMAP-Based Measurement System
In this section, we provide an overview of the whole Measurement System. New LMAP-specific terms are capitalised; Section 3 compiles all the LMAP terms and their definitions. Section 4 onwards considers the LMAP components in more detail.
Other LMAP specifications will define an Information Model, the associated Data Models, and select/extend one or more protocols for the secure communication: firstly, a Control Protocol, for a Controller to instruct Measurement Agents regarding which performance Metrics to measure, when to measure them, and how/when to report the measurement results to a Collector; secondly, a Report Protocol, for a Measurement Agent to report the results to the Collector.
Figure 1 shows the main components of a Measurement System, and the interactions of those components. Some of the components are outside the scope of initial LMAP work.
The MA performs Measurement Tasks. One possibility is that the MA observes existing traffic. Another possibility is for the MA to generate (or receive) traffic specially created for the purpose and measure some Metric associated with its transfer. Figure 1 includes both possibilities (in practice, it may be more usual for an MA to do only one) whilst Section 6.4 shows some examples of possible arrangements of the components.
The MAs are pieces of code that can be executed in specialised hardware (hardware probe) or on a general-purpose device (like a PC or mobile phone). A device with a Measurement Agent may have multiple physical interfaces (Wi-Fi, Ethernet, DSL (Digital Subscriber Line)) and non-physical interfaces (such as PPPoE (Point-to-Point Protocol over Ethernet) or IPsec), and the Measurement Tasks may specify any one of these.
The Controller manages an MA through use of the Control Protocol, which transfers the Instruction to the MA. This describes the Measurement Tasks the MA should perform and when. For example, the Controller may instruct an MA at a home gateway: "Count the number of TCP SYN packets observed in a 1 minute interval; repeat every hour at xx.05 + Unif[0,180] seconds". The Measurement Schedule determines when the Measurement Tasks are executed.
The Controller also manages an MA by instructing it on how to report the Measurement Results, for example: "Report results once a day in a batch at 4am + Unif[0,180] seconds; if the end user is active then delay the report 5 minutes." The Report Schedule determines when the Reports are uploaded to the Collector.
The Measurement Schedule and Report Schedule can define one-off (non-recurring) actions (for example, "Do measurement now", "Report as soon as possible"), as well as recurring ones.
The Collector accepts a Report from an MA with the Measurement Results from its Measurement Tasks. It then provides the Results to a repository.
A Measurement Method defines how to measure a Metric of interest. It is very useful to standardise Measurement Methods, so that it is meaningful to compare measurements of the same Metric made at different times and places. It is also useful to define a registry for commonly used Metrics [IPPM-REG] so that a Metric and its associated Measurement Method can be referred to simply by its identifier in the registry. The registry will hopefully be referenced by other standards organisations. The Measurement Methods may be defined by the IETF, locally, or by some other standards body.
Broadly speaking, there are two types of Measurement Methods. In both types, a Measurement Agent measures a particular Observed Traffic Flow.
A Measurement Method may involve a single MA simply observing existing traffic -- for example, the Measurement Agent could count bytes or calculate the average loss for a particular flow. On the other hand, a Measurement Method may observe traffic created specifically for the purpose of measurement. This requires multiple network entities, which perform different roles. For example, to measure the round trip delay one possible Measurement Method would consist of an MA sending an ICMP (Internet Control Message Protocol) ECHO request ("ping") to a responder in the Internet. In LMAP terms, the responder is termed a Measurement Peer (MP), meaning that it helps the MA but is not managed by the Controller. Other Measurement Methods involve a second MA, with the Controller instructing the MAs in a coordinated manner. Traffic generated specifically as part of the Measurement Method is termed Measurement Traffic; in the ping example, it is the ICMP ECHO Requests and Replies. The protocols used for the Measurement Traffic are out of the scope of initial LMAP work and fall within the scope of other IETF WGs such as IPPM (IP Performance Metrics).
A Measurement Task is the action performed by a particular MA at a particular time, as the specific instance of its role in a Measurement Method. LMAP is mainly concerned with Measurement Tasks, for instance in terms of its Information Model and Protocols.
For Measurement Results to be truly comparable, as might be required by a regulator, not only do the same Measurement Methods need to be used to assess Metrics, but also the set of Measurement Tasks should follow a similar Measurement Schedule and be of similar number. The details of such a characterisation plan are beyond the scope of IETF work, although it is certainly facilitated by the IETF's work.
Both control and report messages are transferred over a secure Channel. A Control Channel is between the Controller and an MA; the Control Protocol delivers Instruction Messages to the MA and Capabilities, Failure, and Logging Information in the reverse direction. A Report Channel is between an MA and a Collector, and the Report Protocol delivers Reports to the Collector.
Finally, we introduce several components that are outside the scope of initial LMAP work that will be provided through existing protocols or applications. They affect how the Measurement System uses the Measurement Results and how it decides what set of Measurement Tasks to perform. As shown in Figure 1, these components are: the bootstrapper, Subscriber parameter database, data analysis tools, and Results repository.
The MA needs to be bootstrapped with initial details about its Controller, including authentication credentials. The LMAP work considers the Bootstrap process, since it affects the Information Model.
However, LMAP does not define a Bootstrap protocol, since it is likely to be technology specific and could be defined by the Broadband Forum, CableLabs, or IEEE depending on the device. Possible protocols are SNMP (Simple Network Management Protocol), NETCONF (Network Configuration Protocol), or (for Home Gateways) CPE WAN Management Protocol (CWMP) from the Auto Configuration Server (ACS) (as specified in TR-069 [TR-069]).
A Subscriber parameter database contains information about the line, such as the customer's broadband contract (perhaps 2, 40, or 80 Mb/s), the line technology (DSL or fibre), the time zone in which the MA is located, and the type of home gateway and MA. These parameters are already gathered and stored by existing operations systems. They may affect the choice of what Measurement Tasks to run and how to interpret the Measurement Results. For example, a download test suitable for a line with an 80 Mb/s contract may overwhelm a 2 Mb/s line.
A Results repository records all Measurement Results in an equivalent form, for example an SQL (Structured Query Language) database, so that they can easily be accessed by the data analysis tools.
The data analysis tools receive the results from the Collector or via the Results repository. They might visualise the data or identify which component or link is likely to be the cause of a fault or degradation. This information could help the Controller decide what follow-up Measurement Task to perform in order to diagnose a fault. The data analysis tools also need to understand the Subscriber's service information, for example, the broadband contract.
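The schedule examples quoted earlier in this section ("repeat every hour at xx.05 + Unif[0,180] seconds" and "once a day in a batch at 4am + Unif[0,180] seconds; if the end user is active then delay the report 5 minutes") can be illustrated with a minimal sketch. It is not part of the LMAP framework or of any LMAP data model: the names RecurringSchedule, next_run, and defer_if_active are hypothetical, times are assumed to be naive local times at the MA, and the end-user activity check is assumed to be made once.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RecurringSchedule:
    """Hypothetical recurring schedule: nominal times plus a uniform random spread."""
    period: timedelta      # e.g. one hour, or one day
    offset: timedelta      # first nominal time of the day, counted from local midnight
    max_spread_s: float    # upper bound b of Unif[0, b], in seconds

    def next_run(self, now: datetime, rng: random.Random) -> datetime:
        """Return the next nominal time strictly after `now`, plus the random spread."""
        midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
        first = midnight + self.offset
        if now < first:
            nominal = first
        else:
            # Whole periods elapsed since today's first nominal time.
            elapsed_periods = (now - first) // self.period
            nominal = first + (elapsed_periods + 1) * self.period
        return nominal + timedelta(seconds=rng.uniform(0, self.max_spread_s))


def defer_if_active(when: datetime, end_user_active: bool,
                    delay: timedelta = timedelta(minutes=5)) -> datetime:
    """Delay a Report by `delay` when the end user is active (single check)."""
    return when + delay if end_user_active else when


# "repeat every hour at xx.05 + Unif[0,180] seconds"
MEASUREMENT_SCHEDULE = RecurringSchedule(timedelta(hours=1), timedelta(minutes=5), 180)
# "once a day in a batch at 4am + Unif[0,180] seconds"
REPORT_SCHEDULE = RecurringSchedule(timedelta(days=1), timedelta(hours=4), 180)

if __name__ == "__main__":
    rng = random.Random()
    now = datetime.now()
    print("next measurement:", MEASUREMENT_SCHEDULE.next_run(now, rng))
    report_at = REPORT_SCHEDULE.next_run(now, rng)
    print("next report:", defer_if_active(report_at, end_user_active=False))
```

Drawing the offset independently at each MA spreads the start times of Measurement Tasks and Report uploads over the 180-second window, rather than having every MA act at the same instant.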
+--------+      +-----------+              +-----------+      ^
|End user|      |           |   Observed   | End user  |      |
|        |<-----|-----------|---Traffic--->|           |      |
|        |      |           |     Flow     |           |      |
|        |      |           |              |           |  Non-LMAP
|        |      |           | Measurement  |           |    Scope
|        |      |           |<--Traffic--->|           |      |
+--------+      |           |              +-----------+      |
................|...........|.................................V
  <MP>          |Measurement|                  <MP>           ^
                |Agent:     |                                 |
                |LMAP       |                                 |
   +----------->|interface  |                                 |
   |            +-----------+                                 |
   |                ^   |                                   LMAP
   |   Instruction  |   |  Report                           Scope
   | (over Control  |   |  (over Report Channel)              |
   |   Channel)     |   +-----------------------+             |
   |                |                           |             |
   |                |                           |             |
   |                |                           v             |
   |          +------------+              +------------+      |
   |          | Controller |              | Collector  |      |
   |          +------------+              +------------+      v
   |               ^     ^                      |             ^
   |               |     |                      |             |
   |               |     +--------+             |             |
   |               |              |             |             |
   |               |              |             v             |
+------------+   +----------+    +--------+    +----------+   |
|Bootstrapper|   |Subscriber|--->|  data  |<---| Results  | Non-
+------------+   |parameter |    |analysis|    |repository| LMAP
                 |database  |    | tools  |    +----------+ Scope
                 +----------+    +--------+                   |
                                                              |
                                                              v

MP: Measurement Peer

Figure 1: Schematic of main elements of an LMAP-based Measurement System (showing the elements in and out of the scope of initial LMAP work)
3. Terminology
This section defines terminology for LMAP. Please note that defined terms are capitalised throughout.
Bootstrap: A process that integrates a Measurement Agent into a Measurement System.
Capabilities: Information about the performance measurement capabilities of the MA, in particular the Measurement Method roles and measurement protocol roles that it can perform, and the device hosting the MA, for example its interface type and speed, but not dynamic information.
Channel: A bidirectional logical connection that is defined by a specific Controller and MA, or Collector and MA, plus associated security.
Collector: A function that receives a Report from an MA.
Configuration: A process for informing the MA about its MA-ID, (optional) Group-ID, and Control Channel.
Controller: A function that provides a Measurement Agent with its Instruction.
Control Channel: A Channel between a Controller and an MA over which Instruction Messages and Capabilities, Failure, and Logging Information are sent.
Control Protocol: The protocol delivering Instruction(s) from a Controller to a Measurement Agent. It also delivers Capabilities, Failure, and Logging Information from the Measurement Agent to the Controller. It can also be used to update the MA's Configuration. It runs over the Control Channel.
Cycle-ID: A tag that is sent by the Controller in an Instruction and echoed by the MA in its Report. The same Cycle-ID is used by several MAs that use the same Measurement Method for a Metric with the same Input Parameters. Hence, the Cycle-ID allows the Collector to easily identify Measurement Results that should be comparable.
Data Model: The implementation of an Information Model in a particular data modelling language [RFC3444].
Environmental Constraint: A parameter that is measured as part of the Measurement Task, its value determining whether the rest of the Measurement Task proceeds.
Failure Information: Information about the MA's failure to take action or execute an Instruction, whether concerning Measurement Tasks or Reporting.
Group-ID: An identifier of a group of MAs.
Information Model: The protocol-neutral definition of the semantics of the Instructions, the Report, the status of the different elements of the Measurement System, as well as the events in the system [RFC3444].
Input Parameter: A parameter whose value is left open by the Metric and its Measurement Method and is set to a specific value in a Measurement Task. Altering the value of an Input Parameter does not change the fundamental nature of the Measurement Task.
Instruction: The description of Measurement Tasks for an MA to perform and the details of the Report for it to send. It is the collective description of the Measurement Task configurations, the configuration of the Measurement Schedules, the configuration of the Report Channel(s), the configuration of Report Schedule(s), and the details of any Suppression.
Instruction Message: The message that carries an Instruction from a Controller to a Measurement Agent.
Logging Information: Information about the operation of the Measurement Agent, which may be useful for debugging.
Measurement Agent (MA): The function that receives Instruction Messages from a Controller and operates the Instruction by executing Measurement Tasks (using protocols outside the scope of the initial LMAP work and perhaps in concert with one or more other Measurement Agents or Measurement Peers) and (if part of the Instruction) by reporting Measurement Results to a Collector or Collectors.
Measurement Agent Identifier (MA-ID): A Universally Unique IDentifier [RFC4122] that identifies a particular MA and is configured as part of the Bootstrapping process.
Measurement Method: The process for assessing the value of a Metric; the process of measuring some performance or reliability Metric associated with the transfer of traffic.
Measurement Peer (MP): The function that assists a Measurement Agent with Measurement Tasks and does not have an interface to the Controller or Collector.
Measurement Result: The output of a single Measurement Task (the value obtained for the Metric).
Measurement Schedule: The schedule for performing Measurement Tasks.
Measurement System: The set of LMAP-defined and related components that are operated by a single organisation, for the purpose of measuring performance aspects of the network.
Measurement Task: The action performed by a particular Measurement Agent that consists of the single assessment of a Metric through operation of a Measurement Method role at a particular time, with all of the role's Input Parameters set to specific values.
Measurement Traffic: The packet(s) generated by some types of Measurement Method that involve measuring some parameter associated with the transfer of the packet(s).
Metric: The quantity related to the performance and reliability of the network whose value we would like to know.
Observed Traffic Flow: In RFC 7011 [RFC7011], a Traffic Flow (or Flow) is defined as "a set of packets or frames passing an Observation Point in the network during a certain time interval. All packets belonging to a particular Flow have a set of common properties," such as packet header fields, characteristics, and treatments. A Flow measured by the LMAP system is termed an Observed Traffic Flow. Its properties are summarised and tabulated in Measurement Results (as opposed to raw capture and export).
Report: The set of Measurement Results and other associated information (as defined by the Instruction). The Report is sent by a Measurement Agent to a Collector.
Report Channel: A Channel between a Collector and an MA over which Report messages are sent.
Report Protocol: The protocol delivering Report(s) from a Measurement Agent to a Collector. It runs over the Report Channel.
Report Schedule: The schedule for sending Reports to a Collector.
Subscriber: An entity (associated with one or more users) that is engaged in a subscription with a service provider.
Suppression: The temporary cessation of Measurement Tasks.
4. Constraints
The LMAP framework makes some important assumptions, which constrain the scope of the initial LMAP work.
4.1. The Measurement System Is Under the Direction of a Single Organisation
In the LMAP framework, the Measurement System is under the direction of a single organisation that is responsible for any impact that its measurements have on a user's quality of experience and privacy. Clear responsibility is critical given that a misbehaving large-scale Measurement System could potentially harm user experience, user privacy, and network security.
However, the components of an LMAP Measurement System can be deployed in administrative domains that are not owned by the measuring organisation. Thus, the system of functions deployed by a single organisation constitutes a single LMAP domain, which may span ownership or other administrative boundaries.
4.2. Each MA May Only Have a Single Controller at Any Point in Time
An MA is instructed by one Controller and is in one Measurement System. The constraint avoids different Controllers giving an MA conflicting instructions and so means that the MA does not have to manage contention between multiple Measurement (or Report) Schedules. This simplifies the design of MAs (critical for a large-scale infrastructure) and allows a Measurement Schedule to be tested on specific types of MAs before deployment to ensure that the end-user experience is not impacted (due to CPU, memory, or broadband-product constraints). However, a Measurement System may have several Controllers.
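The single-Controller constraint can be illustrated with a minimal sketch of an MA that accepts an Instruction only from its one configured Controller. The sketch rests on several assumptions that are not part of LMAP: the class and method names are hypothetical, the string controller_id stands in for the authenticated peer identity of the Control Channel, an Instruction is modelled as a plain dictionary, and the hand-over to a different Controller is assumed to be requested by the current Controller.

```python
from dataclasses import dataclass
from typing import Optional


class InstructionRejected(Exception):
    """Raised when a message does not come from the MA's current Controller."""


@dataclass
class MeasurementAgent:
    ma_id: str
    controller_id: str                  # hypothetical identity set during Bootstrap
    instruction: Optional[dict] = None  # the single Instruction currently in force

    def _check_sender(self, sender_id: str) -> None:
        # Only the one configured Controller may direct this MA.
        if sender_id != self.controller_id:
            raise InstructionRejected(
                f"MA {self.ma_id} is controlled by {self.controller_id}, "
                f"not {sender_id}")

    def receive_instruction(self, sender_id: str, instruction: dict) -> None:
        """Accept an Instruction only from the configured Controller."""
        self._check_sender(sender_id)
        self.instruction = instruction

    def change_controller(self, sender_id: str, new_controller_id: str) -> None:
        """Hand the MA over to another Controller (assumed: current one requests it)."""
        self._check_sender(sender_id)
        self.controller_id = new_controller_id


if __name__ == "__main__":
    ma = MeasurementAgent(ma_id="ma-1", controller_id="controller-a")
    ma.receive_instruction("controller-a", {"tasks": ["udp-latency"]})
    try:
        ma.receive_instruction("controller-b", {"tasks": []})
    except InstructionRejected as err:
        print("rejected:", err)
```

Because there is never more than one source of Instructions, the MA holds a single Instruction and needs no logic to arbitrate between competing Measurement or Report Schedules.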