Internet Engineering Task Force (IETF)                  K. Moriarty, Ed.
Request for Comments: 8404                                      Dell EMC
Category: Informational                                   A. Morton, Ed.
ISSN: 2070-1721                                                AT&T Labs
                                                               July 2018

Effects of Pervasive Encryption on Operators

Abstract
Pervasive monitoring attacks on the privacy of Internet users are of serious concern to both user and operator communities. RFC 7258 discusses the critical need to protect users' privacy when developing IETF specifications and also recognizes that making networks unmanageable to mitigate pervasive monitoring is not an acceptable outcome: an appropriate balance is needed. This document discusses current security and network operations as well as management practices that may be impacted by the shift to increased use of encryption to help guide protocol development in support of manageable and secure networks.

Status of This Memo

This document is not an Internet Standards Track specification; it is published for informational purposes.

This document is a product of the Internet Engineering Task Force (IETF). It has been approved for publication by the Internet Engineering Steering Group (IESG). Not all documents approved by the IESG are candidates for any level of Internet Standard; see Section 2 of RFC 7841.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at https://www.rfc-editor.org/info/rfc8404.
Copyright Notice

Copyright (c) 2018 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents
1. Introduction
   1.1. Additional Background on Encryption Changes
   1.2. Examples of Attempts to Preserve Functions
2. Network Service Provider Monitoring Practices
   2.1. Passive Monitoring
      2.1.1. Traffic Surveys
      2.1.2. Troubleshooting
      2.1.3. Traffic-Analysis Fingerprinting
   2.2. Traffic Optimization and Management
      2.2.1. Load Balancers
      2.2.2. Differential Treatment Based on Deep Packet Inspection (DPI)
      2.2.3. Network-Congestion Management
      2.2.4. Performance-Enhancing Proxies
      2.2.5. Caching and Content Replication near the Network Edge
      2.2.6. Content Compression
      2.2.7. Service Function Chaining
   2.3. Content Filtering, Network Access, and Accounting
      2.3.1. Content Filtering
      2.3.2. Network Access and Data Usage
      2.3.3. Application Layer Gateways (ALGs)
      2.3.4. HTTP Header Insertion
3. Encryption in Hosting and Application SP Environments
   3.1. Management-Access Security
      3.1.1. Monitoring Customer Access
      3.1.2. SP Content Monitoring of Applications
   3.2. Hosted Applications
      3.2.1. Monitoring Managed Applications
      3.2.2. Mail Service Providers
   3.3. Data Storage
      3.3.1. Object-Level Encryption
      3.3.2. Disk Encryption, Data at Rest (DAR)
      3.3.3. Cross-Data-Center Replication Services
4. Encryption for Enterprises
   4.1. Monitoring Practices of the Enterprise
      4.1.1. Security Monitoring in the Enterprise
      4.1.2. Monitoring Application Performance in the Enterprise
      4.1.3. Diagnostics and Troubleshooting for Enterprise Networks
   4.2. Techniques for Monitoring Internet-Session Traffic
5. Security Monitoring for Specific Attack Types
   5.1. Mail Abuse and Spam
   5.2. Denial of Service
   5.3. Phishing
   5.4. Botnets
   5.5. Malware
   5.6. Spoofed-Source IP Address Protection
   5.7. Further Work
6. Application-Based Flow Information Visible to a Network
   6.1. IP Flow Information Export
   6.2. TLS Server Name Indication
   6.3. Application-Layer Protocol Negotiation (ALPN)
   6.4. Content Length, Bitrate, and Pacing
7. Effect of Encryption on the Evolution of Mobile Networks
8. Response to Increased Encryption and Looking Forward
9. Security Considerations
10. IANA Considerations
11. Informative References
Acknowledgements
Authors' Addresses

1. Introduction
In response to pervasive monitoring revelations and the IETF consensus that pervasive monitoring is an attack [RFC7258], efforts are underway to increase encryption of Internet traffic. Pervasive monitoring is of serious concern to users, operators, and application providers. RFC 7258 discusses the critical need to protect users' privacy when developing IETF specifications and also recognizes that making networks unmanageable to mitigate pervasive monitoring is not an acceptable outcome; rather, an appropriate balance would emerge over time. This document describes practices currently used by network operators to manage, operate, and secure their networks and how those practices may be impacted by a shift to increased use of encryption. It provides network operators' perspectives about the motivations and objectives of those practices as well as effects anticipated by operators as use of encryption increases. It is a summary of
concerns of the operational community as they transition to managing networks with less visibility. This document does not endorse the use of the practices described herein, nor does it aim to provide a comprehensive treatment of the effects of current practices, some of which have been considered controversial from technical or business perspectives or contradictory to previous IETF statements (e.g., [RFC1958], [RFC1984], and [RFC2804]). The following RFCs consider the end-to-end (e2e) architectural principle to be a guiding principle for the development of Internet protocols [RFC2775] [RFC3724] [RFC7754].

This document aims to help IETF participants understand network operators' perspectives about the impact of pervasive encryption, both opportunistic and strong end-to-end encryption, on operational practices. The goal is to help inform future protocol development to ensure that operational impact is part of the conversation. Perhaps new methods could be developed to accomplish some of the goals of current practices despite changes in the extent to which cleartext will be available to network operators (including methods that rely on network endpoints where applicable). Discussion of current practices and the potential future changes is provided as a prerequisite to potential future cross-industry and cross-layer work to support the ongoing evolution towards a functional Internet with pervasive encryption.

Traditional network management, planning, security operations, and performance optimization have been developed on the Internet where a large majority of data traffic flows without encryption. While unencrypted traffic has made information that aids operations and troubleshooting at all layers accessible, it has also made pervasive monitoring by unseen parties possible.
With broad support and increased awareness of the need to consider privacy in all aspects across the Internet, it is important to catalog existing management, operational, and security practices that have depended upon the availability of cleartext to function and to explore if critical operational practices can be met by less-invasive means.

This document refers to several different forms of Service Providers (SPs). For example, network service providers (or network operators) provide IP-packet transport primarily, though they may bundle other services with packet transport. Alternatively, application service providers primarily offer systems that participate as an endpoint in communications with the application user, and hosting service providers lease computing, storage, and communications systems in data centers. In practice, many companies perform two or more service provider roles but may be historically associated with one.
This document includes a sampling of current practices and does not attempt to describe every nuance. Some sections cover technologies used over a broad spectrum of devices and use cases.

1.1. Additional Background on Encryption Changes
Pervasive encryption in this document refers to all types of session encryption including Transport Layer Security (TLS), IP Security (IPsec), TCPcrypt [TCPcrypt], QUIC [QUIC] (IETF's specification of Google's QUIC), and others that are increasingly deployed. It is well understood that session encryption helps to prevent both passive and active attacks on transport protocols; more on pervasive monitoring can be found in "Confidentiality in the Face of Pervasive Surveillance: A Threat Model and Problem Statement" [RFC7624]. Active attacks have long been a motivation for increased encryption, and preventing pervasive monitoring became a focus just a few years ago. As such, the Internet Architecture Board (IAB) released a statement advocating for increased use of encryption in November 2014 (see <https://www.iab.org/2014/11/14/iab-statement-on-internet-confidentiality/>).

Perspectives on encryption paradigms have shifted over time to make ease of deployment a high priority and to balance that against providing the maximum possible level of security, regardless of deployment considerations. One such shift is documented in Opportunistic Security (OS) [RFC7435], which suggests that when use of authenticated encryption is not possible, cleartext sessions should be upgraded to unauthenticated session encryption, rather than no encryption. OS encourages upgrading from cleartext but cannot require or guarantee such upgrades. Once OS is used, it allows for an evolution to authenticated encryption. These efforts are necessary to improve an end user's expectation of privacy, making pervasive monitoring cost prohibitive. With OS in use, active attacks are still possible on unauthenticated sessions. OS has been implemented as NULL Authentication with IPsec [RFC7619], and there are a number of infrastructure use cases such as server-to-server encryption where this mode is deployed.
While OS is helpful in reducing pervasive monitoring by increasing the cost to monitor, it is recognized that risk profiles for some applications require authenticated and secure session encryption as well as prevention of active attacks. IPsec and other session encryption protocols with authentication have many useful applications, and usage has increased for infrastructure applications such as virtual private networks between data centers. OS and other protocol developments, such as the Automated Certificate Management Environment (ACME), have increased the usage of session encryption on the Internet.
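The preference order that Opportunistic Security describes can be sketched in a few lines. The following is only an illustration of the fallback logic summarized above, not an implementation of [RFC7435] or of any protocol; the names `Protection`, `select_protection`, and `resists_active_attack` are hypothetical.

```python
from enum import Enum


class Protection(Enum):
    """Protection levels discussed in the text, weakest to strongest."""
    CLEARTEXT = 0
    UNAUTHENTICATED_ENCRYPTION = 1
    AUTHENTICATED_ENCRYPTION = 2


def select_protection(peer_supports_encryption: bool,
                      peer_can_be_authenticated: bool) -> Protection:
    """Pick the strongest level both sides can support (hypothetical policy)."""
    if peer_supports_encryption and peer_can_be_authenticated:
        return Protection.AUTHENTICATED_ENCRYPTION
    if peer_supports_encryption:
        # Upgrade from cleartext even though the peer is not authenticated.
        return Protection.UNAUTHENTICATED_ENCRYPTION
    # OS encourages upgrades but cannot require them, so cleartext remains
    # the fallback.
    return Protection.CLEARTEXT


def resists_active_attack(level: Protection) -> bool:
    """Per the text, only authenticated sessions address active attacks."""
    return level is Protection.AUTHENTICATED_ENCRYPTION
```

An application whose risk profile requires protection from active attacks would presumably refuse any result for which `resists_active_attack` returns False, instead of falling back.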
Risk profiles vary and so do the types of session encryption deployed. To understand the scope of changes in visibility, a few examples are highlighted.

Work continues to improve the implementation, development, and configuration of TLS and DTLS sessions to prevent active attacks used to monitor or intercept session data. The changes from TLS 1.2 to 1.3 enhance the security of TLS, while hiding more of the session negotiation and providing less visibility on the wire. The Using TLS in Applications (UTA) Working Group has been publishing documentation to improve the security of TLS and DTLS sessions. They have documented the known attack vectors in [RFC7457], have documented best practices for TLS and DTLS in [RFC7525], and have other documents in development. The recommendations from these documents were built upon for TLS 1.3 to provide a more inherently secure end-to-end protocol.

In addition to encrypted website access (HTTP over TLS), there are other well-deployed application-level transport encryption efforts such as MTA-to-MTA (mail transfer agent) session encryption transport for email (SMTP over TLS) and gateway-to-gateway for instant messaging (the Extensible Messaging and Presence Protocol (XMPP) over TLS). Although this does provide protection from transport-layer attacks, the servers could be a point of vulnerability if user-to-user encryption is not provided for these messaging protocols. User-to-user content encryption schemes, such as S/MIME and Pretty Good Privacy (PGP) for email and Off-the-Record (OTR) encryption for XMPP are used by those interested in protecting their data as it crosses intermediary servers, preventing transport-layer attacks by providing an end-to-end solution. User-to-user schemes are under review, and additional options will emerge to ease the configuration requirements, making this type of option more accessible to non-technical users interested in protecting their privacy.
Increased use of encryption, either opportunistic or authenticated, at the transport, network, or application layer, impacts how networks are operated, managed, and secured. In some cases, new methods to operate, manage, and secure networks will evolve in response. In other cases, currently available capabilities for monitoring or troubleshooting networks could become unavailable. This document lists a collection of functions currently employed by network operators that may be impacted by the shift to increased use of encryption. This document does not attempt to specify responses or solutions to these impacts; it documents the current state.
1.2. Examples of Attempts to Preserve Functions
Following the Snowden [Snowden] revelations, application service providers (Yahoo, Google, etc.) responded by encrypting traffic between their data centers (IPsec) to prevent passive monitoring from taking place unbeknownst to them. Infrastructure traffic carried over the public Internet has been encrypted for some time; this change for universal encryption was specific to their private backbones. Large mail service providers also began to encrypt session transport (TLS) to hosted mail services. This and other increases in the use of encryption had the immediate effect of providing confidentiality and integrity for protected data, but it created a problem for some network-management functions. Operators could no longer gain access to some session streams, which led several of them to take actions to regain operational practices that previously depended on cleartext data sessions.

The Electronic Frontier Foundation (EFF) reported [EFF2014] several network service providers using a downgrade attack to prevent the use of SMTP over TLS by breaking STARTTLS (Section 3.2 of [RFC7525]), essentially preventing the negotiation process and forcing a fallback to cleartext. There have already been documented cases of service providers preventing STARTTLS to avoid session encryption negotiation on some sessions. Doing so allows them to inject a super cookie that enables advertisers to track users; these actions are also considered an attack. These serve as examples of undesirable behavior that could be prevented through upfront discussions in protocol work for operators and protocol designers to understand the implications of such actions.

In other cases, some service providers and enterprises have relied on middleboxes having access to cleartext for load-balancing, monitoring for attack traffic, meeting regulatory requirements, or other purposes.
The implications for enterprises that own the data on their networks or that have explicit agreements that permit the monitoring of user traffic are very different from those for service providers who may be accessing content in a way that violates privacy considerations. Additionally, service provider equipment is designed for accessing only the headers exposed for the data-link, network, and transport layers. Delving deeper into packets is possible, but there is typically a high degree of accuracy from the header information and packet sizes when limited to header information from these three layers. Service providers also have the option of adding routing overlay protocols to traffic. These middlebox implementations, performing functions either considered legitimate by the IETF or not, have been impacted by increases in encrypted traffic. Only methods keeping with the goal of balancing network management and pervasive monitoring mitigation as discussed in [RFC7258] should be considered in work toward a solution resulting from this document.
It is well known that national surveillance programs monitor traffic for criminal activities [JNSLP] [RFC2804] [RFC7258]. Governments vary on their balance between monitoring and the protection of user privacy, data, and assets. Those that favor unencrypted access to data ignore the real need to protect users' identities, financial transactions, and intellectual property (which require security and encryption to prevent crime). A clear understanding of technology, encryption, and monitoring goals will aid in the development of solutions as work continues towards finding an appropriate balance that allows for management while protecting user privacy with strong encryption solutions.

2. Network Service Provider Monitoring Practices
Service providers, for this definition, include the backbone ISPs as well as those providing infrastructure at scale for core Internet use (hosted infrastructure and services such as email).

Network service providers use various techniques to operate, manage, and secure their networks. The following subsections detail the purpose of several techniques as well as which protocol fields are used to accomplish each task. In response to increased encryption of these fields, some network service providers may be tempted to undertake undesirable security practices in order to gain access to the fields in unencrypted data flows. To avoid this situation, new methods could be developed to accomplish the same goals without service providers having the ability to see session data.

2.1. Passive Monitoring
2.1.1. Traffic Surveys
Internet traffic surveys are useful in many pursuits, such as input for studies of the Center for Applied Internet Data Analysis (CAIDA) [CAIDA], network planning, and optimization. Tracking the trends in Internet traffic growth, from earlier peer-to-peer communication to the extensive adoption of unicast video streaming applications, has relied on a view of traffic composition with a particular level of assumed accuracy, based on access to cleartext by those conducting the surveys. Passive monitoring makes inferences about observed traffic using the maximal information available and is subject to inaccuracies stemming from incomplete sampling (of packets in a stream) or loss due to monitoring-system overload. When encryption conceals more layers in each packet, reliance on pattern inferences and other heuristics grows and accuracy suffers. For example, the traffic patterns between server and browser are dependent on browser supplier and
version, even when the sessions use the same server application (e.g., web email access). It remains to be seen whether more complex inferences can be mastered to produce the same monitoring accuracy.

2.1.2. Troubleshooting
Network operators use protocol-dissecting analyzers when responding to customer problems, to identify the presence of attack traffic, and to identify root causes of the problem such as misconfiguration. In limited cases, packet captures may also be used when a customer approves of access to their packets or provides packet captures close to the endpoint. The protocol dissection is generally limited to supporting protocols (e.g., DNS and DHCP), network and transport (e.g., IP and TCP), and some higher-layer protocols (e.g., RTP and the RTP Control Protocol (RTCP)). Troubleshooting will move closer to the endpoint with increased encryption and adjustments in practices to effectively troubleshoot using a 5-tuple may require education. Packet-loss investigations, and those where access is limited to a 2-tuple (IPsec tunnel mode), rely on network and transport-layer headers taken at the endpoint. In this case, captures on intermediate nodes are not reliable as there are far too many cases of aggregate interfaces and alternate paths in service provider networks. Network operators are often the first ones called upon to investigate application problems (e.g., "my HD video is choppy"), to first rule out network and network services as a cause for the underlying issue. When diagnosing a customer problem, the starting point may be a particular application that isn't working. The ability to identify the problem application's traffic is important, and packet capture provided from the customer close to the edge may be used for this purpose; IP address filtering is not useful for applications using Content Delivery Networks (CDNs) or cloud providers. After identifying the traffic, an operator may analyze the traffic characteristics and routing of the traffic. This diagnostic step is important to help determine the root cause before exploring if the issue is directly with the application. 
For example, by investigating packet loss (from TCP sequence and acknowledgement numbers), Round-Trip Time (RTT) (from TCP timestamp options or application-layer transactions, e.g., DNS or HTTP response time), TCP receive-window size, packet corruption (from checksum verification), inefficient fragmentation, or application-layer problems, the operator can narrow the problem to a portion of the network, server overload, client or server misconfiguration, etc. Network operators may also be able to identify the presence of attack
traffic as not conforming to the application the user claims to be using. In many instances, the exposed packet header is sufficient for this type of troubleshooting. One way of quickly excluding the network as the bottleneck during troubleshooting is to check whether the speed is limited by the endpoints. For example, the connection speed might instead be limited by suboptimal TCP options, the sender's congestion window, the sender temporarily running out of data to send, the sender waiting for the receiver to send another request, or the receiver closing the receive window. All this information can be derived from the cleartext TCP header. Packet captures and protocol-dissecting analyzers have been important tools. Automated monitoring has also been used to proactively identify poor network conditions, leading to maintenance and network upgrades before user experience declines. For example, findings of loss and jitter in Voice over IP (VoIP) traffic can be a predictor of future customer dissatisfaction (supported by metadata from RTP/RTCP) [RFC3550], or increases in DNS response time can generally make interactive web browsing appear sluggish. But, to detect such problems, the application or service stream must first be distinguished from others. When increased encryption is used, operators lose a source of data that may be used to debug user issues. For example, IPsec obscures TCP and RTP header information, while TLS and the Secure Real-time Transport Protocol (SRTP) do not. Because of this, application- server operators using increased encryption might be called upon more frequently to assist with debugging and troubleshooting; thus, they may want to consider what tools can be put in the hands of their clients or network operators. Further, the performance of some services can be more efficiently managed and repaired when information on user transactions is available to the service provider. 
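As an illustration of the kind of metric that can be derived from RTP header metadata alone, the interarrival jitter estimator defined in [RFC3550] can be computed from packet arrival times and RTP timestamps. The sketch below assumes both values are already expressed in the same clock units; the function name and the input format are hypothetical, and a real monitor would also need to handle clock-rate conversion, timestamp wraparound, and reordering.

```python
def interarrival_jitter(packets):
    """Estimate RTP interarrival jitter as described in RFC 3550.

    `packets` is an iterable of (arrival_time, rtp_timestamp) pairs,
    both in the same clock units (an assumption of this sketch).
    """
    jitter = 0.0
    previous_transit = None
    for arrival_time, rtp_timestamp in packets:
        # Relative transit time; only its change between packets matters.
        transit = arrival_time - rtp_timestamp
        if previous_transit is not None:
            difference = abs(transit - previous_transit)
            # Running average with gain 1/16, per RFC 3550.
            jitter += (difference - jitter) / 16.0
        previous_transit = transit
    return jitter
```

A stream whose packets all experience the same transit delay yields a jitter of zero; each change in transit delay moves the estimate by one sixteenth of the difference between that change and the current estimate.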
It may be possible to continue transaction-monitoring activities without cleartext access to the application layers of interest; however, inaccuracy will increase and efficiency of repair activities will decrease. For example, an application-protocol error or failure would be opaque to network troubleshooters when transport encryption is applied, making root cause location more difficult and, therefore, increasing the time to repair. Repair time directly reduces the availability of the service, and most network operators have made availability a key metric in their Service Level Agreements (SLAs) and/or subscription rebates. Also, there may be more cases of user-communication failures when the additional encryption processes are introduced (e.g., key management at large scale), leading to more customer
service contacts and (at the same time) less information available to network-operation repair teams. In mobile networks, knowledge about TCP's stream transfer progress (by observing ACKs, retransmissions, packet drops, and the Sector Utilization Level, etc.) is further used to measure the performance of network segments (sector, eNodeB (eNB), etc.). This information is used as key performance indicators (KPIs) and for the estimation of user/service key quality indicators at network edges for circuit emulation (CEM) as well as input for mitigation methods. If the makeup of active services per user and per sector are not visible to a server that provides Internet Access Point Names (APNs), it cannot perform mitigation functions based on network segment view. It is important to note that the push for encryption by application providers has been motivated by the application of the described techniques. Although network operators have noted performance improvements with network-based optimization or enhancement of user traffic (otherwise, deployment would not have occurred), application providers have likewise noted some degraded performance and/or user experience, and such cases may result in additional operator troubleshooting. Further, encrypted application streams might avoid outdated optimization or enhancement techniques, where they exist. A gap exists for vendors where built-in diagnostics and serviceability are not adequate to provide detailed logging and debugging capabilities that, when possible, could be accessed with cleartext network parameters. In addition to traditional logging and debugging methods, packet tracing and inspection along the service path provides operators the visibility to continue to diagnose problems reported both internally and by their customers. Logging of service path upon exit for routing overlay protocols will assist with policy management and troubleshooting capabilities for traffic flows on encrypted networks. 
Protocol trace logging and protocol data unit (PDU) logging should also be considered to improve visibility to monitor and troubleshoot application-level traffic. Additional work on this gap would assist network operators to better troubleshoot and manage networks with increasing amounts of encrypted traffic.

2.1.3. Traffic-Analysis Fingerprinting
Fingerprinting is used in traffic analysis and monitoring to identify traffic streams that match certain patterns. This technique can be used with both cleartext and encrypted sessions. Some Distributed Denial-of-Service (DDoS) prevention techniques at the network-provider level rely on the ability to fingerprint traffic in order to mitigate the effect of this type of attack. Thus, fingerprinting may be an aspect of an attack or part of attack countermeasures.
A common, early trigger for DDoS mitigation includes observing uncharacteristic traffic volumes or sources, congestion, or degradation of a given network or service. One approach to mitigate such an attack involves distinguishing attacker traffic from legitimate user traffic. The ability to examine layers and payloads above transport provides an increased range of filtering opportunities at each layer in the clear. If fewer layers are in the clear, this means that there are reduced filtering opportunities available to mitigate attacks. However, fingerprinting is still possible.

Passive monitoring of network traffic can lead to invasion of privacy by external actors at the endpoints of the monitored traffic. Encryption of traffic end to end is one method to obfuscate some of the potentially identifying information. For example, browser fingerprints are comprised of many characteristics, including User Agents, HTTP Accept headers, browser plug-in details, screen size and color details, system fonts, and time zones. A monitoring system could easily identify a specific browser, and by correlating other information, identify a specific user.

2.2. Traffic Optimization and Management
2.2.1. Load Balancers
A standalone load balancer is a function one can take off the shelf, place in front of a pool of servers, and configure appropriately, and it will balance the traffic load among servers in the pool. This is a typical setup for load balancers. Standalone load balancers rely on the plainly observable information in the packets they are forwarding and industry-accepted standards in interpreting the plainly observable information. Typically, this is a 5-tuple of the connection. This type of configuration terminates TLS sessions at the load balancer, making it the endpoint instead of the server. Standalone load balancers are considered middleboxes, but they are an integral part of server infrastructure that scales. In contrast, an integrated load balancer is developed to be an integral part of the service provided by the server pool behind that load balancer. These load balancers can communicate state with their pool of servers to better route flows to the appropriate servers. They rely on non-standard, system-specific information and operational knowledge shared between the load balancer and its servers. Both standalone and integrated load balancers can be deployed in pools for redundancy and load sharing. For high availability, it is important that when packets belonging to a flow start to arrive at a
different load balancer in the load-balancer pool, the packets continue to be forwarded to the original server in the server pool. The importance of this requirement increases as the chance of such a load balancer change event increases. Mobile operators deploy integrated load balancers to assist with maintaining connection state as devices migrate. With the proliferation of mobile connected devices, there is an acute need for connection-oriented protocols that maintain connections after a network migration by an endpoint. This connection persistence provides an additional challenge for multihomed anycast-based services typically employed by large content owners and CDNs. The challenge is that a migration to a different network in the middle of the connection greatly increases the chances of the packets routed to a different anycast point of presence (POP) due to the new network's different connectivity and Internet peering arrangements. The load balancer in the new POP, potentially thousands of miles away, will not have information about the new flow and would not be able to route it back to the original POP. To help with the endpoint network migration challenges, anycast service operations are likely to employ integrated load balancers that, in cooperation with their pool servers, are able to ensure that client-to-server packets contain some additional identification in plainly observable parts of the packets (in addition to the 5-tuple). As noted in Section 2 of [RFC7258], careful consideration in protocol design to mitigate pervasive monitoring is important, while ensuring manageability of the network. An area for further research includes end-to-end solutions that would provide a simpler architecture and that may solve the issue with CDN anycast. In this case, connections would be migrated to a CDN unicast address. 
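The flow-affinity requirement described in this section can be illustrated with a deterministic hash of the 5-tuple: any load balancer in the pool that holds the same server list computes the same choice without sharing per-flow state. This is a minimal sketch under that assumption; `pick_server` and the example addresses are hypothetical, and simple modulo hashing is used only for brevity (it remaps many flows whenever the server pool changes).

```python
import hashlib


def pick_server(five_tuple, servers):
    """Map a flow's 5-tuple to a server with a deterministic hash.

    `five_tuple` is (src_ip, src_port, dst_ip, dst_port, protocol).
    """
    key = "|".join(str(field) for field in five_tuple).encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["server-a", "server-b", "server-c"]
flow = ("198.51.100.7", 49152, "203.0.113.10", 443, "tcp")

# Two independent load balancers with the same pool agree on the server.
choice_at_lb1 = pick_server(flow, servers)
choice_at_lb2 = pick_server(flow, list(servers))

# After a network migration the client's address changes, so the 5-tuple
# changes and the hash may select a different server.
migrated_flow = ("192.0.2.44", 49152, "203.0.113.10", 443, "tcp")
choice_after_migration = pick_server(migrated_flow, servers)
```

The last step shows why the 5-tuple alone does not preserve connections across endpoint migration, which is the motivation given above for carrying additional plainly observable identification in client-to-server packets.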
Current protocols, such as TCP, allow the development of stateless integrated load balancers by availing such load balancers of additional plaintext information in client-to-server packets. In the case of TCP, such information can be encoded by having server-generated sequence numbers (that are ACKed by the client), segment values, lengths of the packet sent, etc. The use of some of these mechanisms for load balancing negates some of the security assumptions associated with those primitives (e.g., that an off-path attacker guessing valid sequence numbers for a flow is hard). Another possibility is a dedicated mechanism for storing load-balancer state, such as QUIC's proposed connection ID to provide visibility to the load balancer. An identifier could be used for tracking purposes, but this may provide an option that is an improvement from bolting it on to an unrelated transport signal.
This method allows for tight control by one of the endpoints and can be rotated to avoid roving client linkability: in other words, being a specific, separate signal, it can be governed in a way that is finely targeted at that specific use case. Some integrated load balancers have the ability to use additional plainly observable information even for today's protocols that are not network-migration tolerant. This additional information allows for improved availability and scalability of the load-balancing operation. For example, BGP reconvergence can cause a flow to switch anycast POPs, even without a network change by any endpoint. Additionally, a system that is able to encode the identity of the pool server in plaintext information available in each incoming packet is able to provide stateless load balancing. This ability confers great reliability and scalability advantages, even if the flow remains in a single POP, because the load-balancing system is not required to keep state of each flow. Even more importantly, there's no requirement to continuously synchronize such state among the pool of load balancers. An integrated load balancer repurposing limited existing bits in transport-flow state must maintain and synchronize per-flow state occasionally: using the sequence number as a cookie only works for so long given that there aren't that many bits available to divide across a pool of machines. Mobile operators apply 3GPP Self-Organizing Networks (SONs) for intelligent workflows such as content-aware Mobility Load Balancing (MLB). Where network load balancers have been configured to route according to application-layer semantics, an encrypted payload is effectively invisible. This has resulted in practices of intercepting TLS in front of load balancers to regain that visibility, but at a cost to security and privacy. 
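The stateless approach described above, in which the identity of the pool server is encoded in plaintext information carried by each incoming packet, can be sketched as follows. This is an illustrative assumption rather than the encoding of any particular protocol: the layout (first byte holds the server index, remaining bytes are random) and the function names are hypothetical, and a real deployment would likely obfuscate or encrypt the server index to limit linkability.

```python
import os
from typing import List


def new_connection_id(server_index: int, length: int = 8) -> bytes:
    """Server side: mint an identifier that carries the server's index.

    Hypothetical layout: byte 0 is the server index, the rest is random.
    """
    if not 0 <= server_index < 256:
        raise ValueError("server index must fit in one byte")
    if length < 2:
        raise ValueError("identifier needs room for index and randomness")
    return bytes([server_index]) + os.urandom(length - 1)


def route(connection_id: bytes, servers: List[str]) -> str:
    """Load-balancer side: recover the server from the packet alone.

    No flow table is consulted, so nothing has to be synchronized
    among the load balancers in the pool.
    """
    index = connection_id[0]
    if index >= len(servers):
        raise ValueError("identifier names an unknown server")
    return servers[index]


servers = ["server-a", "server-b", "server-c"]
cid = new_connection_id(2)
chosen = route(cid, servers)
```

Since the server mints the identifier, it can issue a fresh one at any time, which corresponds to the rotation that avoids linking a roving client across networks.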
In future Network Function Virtualization (NFV) architectures, load-balancing functions are likely to be more prevalent (deployed at locations throughout operators' networks). NFV environments will require some type of identifier (IPv6 flow identifiers, the proposed QUIC connection ID, etc.) for managing traffic using encrypted tunnels. The shift to increased encryption will have an impact on visibility of flow information and will require adjustments to perform similar load-balancing functions within an NFV.
2.2.2. Differential Treatment Based on Deep Packet Inspection (DPI)
Data transfer capacity resources in cellular radio networks tend to be more constrained than in fixed networks. This is a result of variance in radio signal strength as a user moves around a cell, the rapid ingress and egress of connections as users hand off between adjacent cells, and temporary congestion at a cell. Mobile networks
alleviate this by queuing traffic according to its required bandwidth and acceptable latency: for example, a user is unlikely to notice a 20 ms delay when receiving a simple web page or email, or an instant message response, but will very likely notice a rebuffering pause in a video playback or a VoIP call de-jitter buffer. Ideally, the scheduler manages the queue so that each user has an acceptable experience as conditions vary, but inferences of the traffic type have been used to make bearer assignments and set scheduler priority. Deep Packet Inspection (DPI) allows identification of applications based on payload signatures, in contrast to trusting well-known port numbers. Application- and transport-layer encryption make the traffic type estimation more complex and less accurate; therefore, it may not be effectual to use this information as input for queue management. With the use of WebSockets [RFC6455], for example, many forms of communications (from isochronous/real-time to bulk/elastic file transfer) will take place over HTTP port 80 or port 443, so only the messages and higher-layer data will make application differentiation possible. If the monitoring system sees only "HTTP port 443", it cannot distinguish application streams that would benefit from priority queuing from others that would not. Mobile networks especially rely on content-/application-based prioritization of Over-the-Top (OTT) services -- each application type or service has different delay/loss/throughput expectations, and each type of stream will be unknown to an edge device if encrypted. This impedes dynamic QoS adaptation. An alternate way to achieve encrypted application separation is possible when the User Equipment (UE) requests a dedicated bearer for the specific application stream (known by the UE), using a mechanism such as the one described in Section 6.5 of 3GPP TS 24.301 [TS3GPP]. 
The UE's request includes the QoS Class Identifier (QCI) appropriate for each application, based on its delay/loss/throughput expectations. However, UE requests for dedicated bearers and QCI may not be supported at the subscriber's service level, or in all mobile networks. These effects and potential alternative solutions have been discussed at the accord BoF [ACCORD] at IETF 95. This section does not consider traffic discrimination by service providers related to Net Neutrality, where traffic may be favored according to the service provider's preference as opposed to the user's preference. These use cases are considered out of scope for this document as controversial practices.
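The classification problem discussed in this section can be illustrated with a small sketch. It assumes a scheduler that falls back to port-based classification when no other signal is available; the `Stream` type, the class labels, and the `ue_requested_class` field (a stand-in for a UE's dedicated-bearer request) are hypothetical and do not reproduce any 3GPP data structure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stream:
    """A traffic stream as seen by an edge device (hypothetical type)."""
    name: str
    dst_port: int
    # Stand-in for a class explicitly requested by the UE, if any.
    ue_requested_class: Optional[str] = None


# With encrypted payloads, only the port remains observable.
PORT_CLASSES = {80: "web", 443: "web"}


def queue_class(stream: Stream) -> str:
    """Pick a queue class from what the network can actually observe."""
    if stream.ue_requested_class is not None:
        return stream.ue_requested_class
    return PORT_CLASSES.get(stream.dst_port, "unknown")


voice = Stream("voice over WebSocket", 443)
bulk = Stream("bulk file transfer", 443)
declared_voice = Stream("voice over WebSocket", 443, "low-latency")
```

Both undeclared streams fall into the same class because they share port 443, even though their delay requirements differ. Only the stream whose class was requested by the endpoint is separated.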
2.2.3. Network-Congestion Management
For 3GPP User Plane Congestion Management (UPCON) [UPCON], the ability to understand content and manage networks during periods of congestion is the focus. Mitigating techniques such as deferred download, off-peak acceleration, and outbound roamers are a few examples of the areas explored in the associated 3GPP documents. The documents describe the issues, describe the data utilized in managing congestion, and make policy recommendations.
2.2.4. Performance-Enhancing Proxies
Performance-enhancing TCP proxies may perform local retransmission at the network edge; this also applies to mobile networks. In TCP, duplicated ACKs are detected and potentially concealed when the proxy retransmits a segment that was lost on the mobile link without involvement of the far end (see Section 2.1.1 of [RFC3135] and Section 3.5 of [MIDDLEBOXES]). Operators report that this optimization at network edges improves real-time transmission over long-delay Internet paths or networks with large capacity variation (such as mobile/cellular networks). However, such optimizations can also cause problems with performance, for example, if the characteristics of some packet streams begin to vary significantly from those considered in the proxy design. In general, some operators have stated that performance-enhancing proxies have a lower RTT to the client; therefore, they determine the responsiveness of flow control. A lower RTT makes the flow-control loop more responsive to changes in the mobile-network conditions and enables faster adaptation in a delay- and capacity-varying network due to user mobility. Further, some use service-provider-operated proxies to reduce the control delay between the sender and a receiver on a mobile network where resources are limited. The RTT determines how quickly a user's attempt to cancel a video is recognized and, therefore, how quickly the traffic is stopped, thus keeping unwanted video packets from entering the radio-scheduler queue. If impacted by encryption, performance-enhancing proxies could make use of routing overlay protocols to accomplish the same task, but this results in additional overhead. An application-type-aware network edge (middlebox) can further control pacing, limit simultaneous HD videos, or prioritize active videos against new videos, etc. Services at this more granular level are limited with the use of encryption.
Performance-enhancing proxies are primarily used on long-delay links (satellite) with access to the TCP header to provide an early ACK and make the long-delay link of the path seem shorter. With some specific forms of flow control, TCP can be more efficient than alternatives such as proxies. The editors cannot cite research on this point specific to the performance-enhancing proxies described, but they agree this area could be explored to determine if flow-control modifications could preserve the end-to-end performance on long-delay path sessions where the TCP header is exposed.
2.2.5. Caching and Content Replication near the Network Edge
The features and efficiency of some Internet services can be augmented through analysis of user flows and the applications they provide. For example, network caching of popular content at a location close to the requesting user can improve delivery efficiency (both in terms of lower request response times and reduced use of links on the international Internet when content is remotely located), and service providers through an authorized agreement acting on their behalf use DPI in combination with content-distribution networks to determine if they can intervene effectively. Encryption of packet contents at a given protocol layer usually makes DPI processing of that layer and higher layers impossible. That being said, it should be noted that some content providers prevent caching to control content delivery through the use of encrypted end-to-end sessions. CDNs vary in their deployment options of end-to-end encryption. The business risk of losing control of content is a motivation outside of privacy and pervasive monitoring that is driving end-to-end encryption for these content providers. It should be noted that caching was first supported in [RFC1945] and continued in the recent update of "Hypertext Transfer Protocol (HTTP/1.1): Caching" [RFC7234]. Some operators also operate transparent caches that neither the user nor the origin has opted in to. The use of these caches is controversial within the IETF and is generally precluded by the use of HTTPS. Content replication in caches (for example, live video and content protected by Digital Rights Management (DRM)) is used to most efficiently utilize the available limited bandwidth and thereby maximize the user's Quality of Experience (QoE). Especially in mobile networks, duplicating every stream through the transit network increases backhaul cost for live TV.
3GPP Enhanced Multimedia Broadcast/Multicast Services (eMBMS) utilize trusted edge proxies to facilitate delivering the same stream to different users, using either unicast or multicast depending on channel conditions to the user. There are ongoing efforts to support multicast inside carrier networks while preserving end-to-end security: Automatic Multicast
Tunneling (AMT), for instance, allows CDNs to deliver a single (potentially encrypted) copy of a live stream to a carrier network over the public Internet and for the carrier to then distribute that live stream as efficiently as possible within its own network using multicast. Alternate approaches are in the early phase of being explored to allow caching of encrypted content. These solutions require cooperation from content owners and fall outside the scope of what is covered in this document. Content delegation allows for replication with possible benefits, but any form of delegation has the potential to affect the expectation of client-server confidentiality.
2.2.6. Content Compression
In addition to caching, various applications exist to provide data compression in order to conserve the life of the user's mobile data plan or make delivery over the mobile link more efficient. The compression proxy access can be built into a specific user-level application, such as a browser, or it can be available to all applications using a system-level application. The primary method is for the mobile application to connect to a centralized server as a transparent proxy (the user does not opt in), with the data channel between the client application and the server using compression to minimize bandwidth utilization. The effectiveness of such systems depends on the server having access to unencrypted data flows. Aggregated data stream content compression that spans objects and data sources that can be treated as part of a unified compression scheme (e.g., through the use of a shared segment store) is often effective at providing data offload when there is a network element close to the receiver that has access to see all the content.
2.2.7. Service Function Chaining
Service Function Chaining (SFC) is defined in RFC 7665 [RFC7665] and RFC 8300 [RFC8300]. As discussed in RFC 7498 [RFC7498], common SFC deployments may use classifiers to direct traffic into VLANs instead of using a Network Service Header (NSH), as defined in RFC 8300 [RFC8300]. As described in RFC 7665 [RFC7665], the ordered steering of traffic to support specific optimizations depends upon the ability of a classifier to determine the microflows. RFC 2474 [RFC2474] defines the following: Microflow: a single instance of an application-to-application flow of packets which is identified by source address, destination address, protocol id, and source port, destination port (where applicable).
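The microflow definition quoted above can be expressed as a lookup key. The sketch below is illustrative only: the `Packet` type and function names are hypothetical, and the second key function assumes a situation in which only the two addresses remain visible to the classifier (for example, because the traffic is carried in an encrypted tunnel).

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List, NamedTuple


class Packet(NamedTuple):
    """Fields a classifier may observe (hypothetical type)."""
    src_addr: str
    dst_addr: str
    protocol: int
    src_port: int
    dst_port: int


def microflow_key(p: Packet) -> Hashable:
    """Full key: source/destination address, protocol id, and ports."""
    return (p.src_addr, p.dst_addr, p.protocol, p.src_port, p.dst_port)


def address_pair_key(p: Packet) -> Hashable:
    """Reduced key: only the two addresses are visible."""
    return (p.src_addr, p.dst_addr)


def group(packets: List[Packet],
          key_fn: Callable[[Packet], Hashable]) -> Dict[Hashable, List[Packet]]:
    """Group packets into flows according to the chosen key."""
    flows: Dict[Hashable, List[Packet]] = defaultdict(list)
    for p in packets:
        flows[key_fn(p)].append(p)
    return dict(flows)


packets = [
    Packet("192.0.2.10", "198.51.100.1", 6, 40000, 443),
    Packet("192.0.2.10", "198.51.100.1", 6, 40001, 443),
    Packet("192.0.2.10", "198.51.100.1", 6, 40000, 443),
]
by_microflow = group(packets, microflow_key)
by_addresses = group(packets, address_pair_key)
```

With the full key, the two application flows stay separate and can be steered independently. With the reduced key, they collapse into a single entry, and a classifier can no longer tell them apart.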
SFC currently depends upon a classifier to at least identify the microflow. As the classifier's visibility is reduced from a 5-tuple to a 2-tuple, or if information above the transport layer becomes inaccessible, then the SFC classifier is not able to perform its job, and the service functions of the path may be adversely affected. There are also mechanisms provided to protect security and privacy. In the SFC case, the layer below a network service header can be protected with session encryption. A goal is protecting end-user data, while retaining the intended functions of RFC 7665 [RFC7665] at the same time.
2.3. Content Filtering, Network Access, and Accounting
Mobile networks and many ISPs operate under the regulations of their licensing government authority. These regulations include Lawful Intercept, adherence to Codes of Practice on content filtering, and application of court order filters. Such regulations assume network access to provide content filtering and accounting, as discussed below. As previously stated, the intent of this document is to document existing practices; the development of IETF protocols follows the guiding principles of [RFC1984] and [RFC2804] and explicitly does not support tools and methods that could be used for wiretapping and censorship.
2.3.1. Content Filtering
There are numerous reasons why service providers might block content: to comply with requests from law enforcement or regulatory authorities, to effectuate parental controls, to enforce content-based billing, or for other reasons, possibly considered inappropriate by some. See RFC 7754 [RFC7754] for a survey of Internet filtering techniques and motivations and the IAB consensus on those mechanisms. This section is intended to document a selection of current content-blocking practices by operators and the effects of encryption on those practices. Content blocking may also happen at endpoints or at the edge of enterprise networks, but those scenarios are not addressed in this section. In a mobile network, content filtering usually occurs in the core network. With other networks, content filtering could occur in the core network or at the edge. A proxy is installed that analyzes the transport metadata of the content users are viewing and filters content based on either a blacklist of sites or the user's predefined profile (e.g., for age-sensitive content). Although filtering can be done by many methods, one commonly used method involves a trigger based on the proxy identifying a DNS lookup of a host name in a URL that appears on a blacklist being used by the operator. The
subsequent requests to that domain will be rerouted to a proxy that checks whether the full URL matches a blocked URL on the list, and it will return a 404 if a match is found. All other requests should complete. This technique does not work in situations where DNS traffic is encrypted (e.g., by employing [RFC7858]). This method is also used by other types of network providers enabling traffic inspection, but not modification. Content filtering via a proxy can also utilize an intercepting certificate where the client's session is terminated at the proxy, enabling cleartext inspection of the traffic. A new session is created from the intercepting device to the client's destination; this is an opt-in strategy for the client, where the endpoint is configured to trust the intercepting certificate. Changes in TLS 1.3 do not impact this more invasive method of interception, which has the potential to expose every HTTPS session to an active man in the middle (MITM). Another form of content filtering is called parental control, where some users are deliberately denied access to age-sensitive content as a feature to the service subscriber. Some sites involve a mixture of universal and age-sensitive content and filtering software. In these cases, more-granular (application-layer) metadata may be used to analyze and block traffic. Methods that accessed cleartext application-layer metadata no longer work when sessions are encrypted. This type of granular filtering could occur at the endpoint or as a proxy service. However, the lack of ability to efficiently manage endpoints as a service reduces network service providers' ability to offer parental control.
2.3.2. Network Access and Data Usage
Approved access to a network is a prerequisite to requests for Internet traffic. However, there are cases (beyond parental control) when a network service provider currently redirects customer requests for content (affecting content accessibility):
1. The network service provider is performing the accounting and billing for the content provider, and the customer has not (yet) purchased the requested content.
2. Further content may not be allowed as the customer has reached their usage limit and needs to purchase additional data service, which is the usual billing approach in mobile networks.
Currently, some network service providers redirect the customer using HTTP redirect to a captive portal page that explains to those customers the reason for the blockage and the steps to proceed. [RFC6108] describes one viable web notification system. When the HTTP headers and content are encrypted, this appropriately prevents mobile carriers from intercepting the traffic and performing an HTTP redirect. As a result, some mobile carriers block customers' encrypted requests, which impacts customer experience because the blocking reason must be conveyed by some other means. The customer may need to call customer care to find out the reason and/or resolve the issue, possibly extending the time needed to restore their network access. While there are well-deployed alternate SMS-based solutions that do not involve out-of-specification protocol interception, this is still an unsolved problem for non-SMS users. Further, when the requested service is about to consume the remainder of the user's plan limits, the transmission could be terminated and advance notifications may be sent to the user by their service provider to warn the user ahead of the exhausted plan. If web content is encrypted, the network provider cannot know the data transfer size at request time. Lacking this visibility of the application type and content size, the network would continue the transmission and stop the transfer when the limit was reached. A partial transfer may not be usable by the client, wasting both network and user resources and possibly leading to customer complaints. The content provider does not know a user's service plans or current usage and cannot warn the user of plan exhaustion. In addition, some mobile network operators sell tariffs that allow free-data access to certain sites, known as 'zero rating'. A session to visit such a site incurs no additional cost or data usage to the user.
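The plan-limit behavior described above can be made concrete with a short sketch. It is a simplified model under stated assumptions: the function name, the action labels, and the `size_visible` flag (whether the network can learn the transfer size at request time) are hypothetical and do not describe any operator's actual system.

```python
from typing import Tuple


def plan_transfer(remaining_bytes: int,
                  actual_size: int,
                  size_visible: bool) -> Tuple[str, int]:
    """Return (action, bytes_delivered) for one requested transfer.

    If the size is visible at request time and exceeds the remaining
    allowance, the user can be notified before any data is sent.
    If the size is hidden, the transfer starts and is cut off when
    the allowance runs out.
    """
    if size_visible and actual_size > remaining_bytes:
        return ("notify-user", 0)
    delivered = min(actual_size, remaining_bytes)
    action = "complete" if delivered == actual_size else "truncated"
    return (action, delivered)


# A 250-byte object requested with 100 bytes of allowance left.
with_visibility = plan_transfer(100, 250, size_visible=True)
without_visibility = plan_transfer(100, 250, size_visible=False)
```

In the second case, 100 bytes are delivered and then the transfer stops, which corresponds to the partial, possibly unusable, transfer mentioned in the text.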
For some implementations, zero rating is impacted if encryption hides the details of the content domain from the network.
2.3.3. Application Layer Gateways (ALGs)
Application Layer Gateways (ALGs) assist applications in setting up connectivity across Network Address Translators (NATs), firewalls, and/or load balancers for specific applications running across mobile networks. Section 2.9 of [RFC2663] describes the role of ALGs and their interaction with NAT and/or application payloads. ALGs are deployed with an aim to improve connectivity. However, it is an IETF best common practice recommendation that ALGs for UDP-based protocols be turned off [RFC4787].
One example of an ALG in current use is aimed at video applications that use the Real-Time Streaming Protocol (RTSP) [RFC7826] primary stream as a means to identify related RTP/RTCP [RFC3550] flows at setup. The ALG in this case relies on the 5-tuple flow information derived from RTSP to provision NAT or other middleboxes and provide connectivity. Implementations vary, and two examples follow:
1. Parse the content of the RTSP stream and identify the 5-tuple of the supporting streams as they are being negotiated.
2. Intercept and modify the 5-tuple information of the supporting media streams as they are being negotiated on the RTSP stream, which is more intrusive to the media streams.
When RTSP-stream content is encrypted, the 5-tuple information within the payload is not visible to these ALG implementations; therefore, they cannot provision their associated middleboxes with that information. The deployment of IPv6 may well reduce the need for NAT and the corresponding requirement for ALGs.
2.3.4. HTTP Header Insertion
Some mobile carriers use HTTP header insertion (see Section 3.2.1 of [RFC7230]) to provide information about their customers to third parties or to their own internal systems [Enrich]. Third parties use the inserted information for analytics, customization, advertising, cross-site tracking of users, customer billing, or selectively allowing or blocking content. HTTP header insertion is also used to pass information internally between a mobile service provider's sub-systems, thus keeping the internal systems loosely coupled. When HTTP connections are encrypted to protect user privacy, mobile network service providers cannot insert headers to accomplish the functions above, some of which are considered controversial. Guidance from the Internet Architecture Board has been provided in "Design Considerations for Metadata Insertion" [RFC8165]. The guidance asserts that designs that share metadata only by explicit actions at the host are preferable to designs in which middleboxes insert metadata. Alternate notification methods that follow this and other guidance would be helpful to mobile carriers.
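To show why header insertion depends on cleartext, the following sketch models the existing practice in its simplest form. It is an assumption-laden illustration, not a description of any carrier's system: the function name and the header name `X-Example-Subscriber` are hypothetical, and only an HTTP/1.1 request with a complete header section is handled.

```python
def insert_header(request: bytes, name: str, value: str) -> bytes:
    """Append one header field to a cleartext HTTP/1.1 request.

    Raises ValueError when the bytes cannot be parsed as such a
    request, which is what a middlebox faces with encrypted traffic.
    """
    head, separator, body = request.partition(b"\r\n\r\n")
    request_line = head.split(b"\r\n")[0]
    if not separator or not request_line.endswith(b" HTTP/1.1"):
        raise ValueError("not a parseable cleartext HTTP/1.1 request")
    field = f"{name}: {value}".encode("ascii")
    return head + b"\r\n" + field + separator + body


cleartext = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
enriched = insert_header(cleartext, "X-Example-Subscriber", "abc123")

# Bytes resembling the start of a TLS record offer nothing to parse.
encrypted = b"\x16\x03\x01\x02\x00\x01\x00\x01\xfc\x03\x03"
try:
    insert_header(encrypted, "X-Example-Subscriber", "abc123")
    insertion_possible = True
except ValueError:
    insertion_possible = False
```

The encrypted case fails at the parsing step, before any policy decision is reached, which is why the guidance cited above points toward metadata shared by explicit action at the host.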