
RFC 8280

Research into Human Rights Protocol Considerations


5. Methodology

   Mapping the relationship between human rights, protocols, and
   architectures is a new research challenge that requires a good
   amount of interdisciplinary and cross-organizational cooperation to
   develop a consistent methodology.  The methodological choices made
   in this document are based on the political-science-based method of
   discourse analysis and ethnographic research methods [Cath].  This
   work proceeds from the assumption that language reflects the
   understanding of concepts.  Or, as [Jabri] holds, policy documents
   are "social relations represented in texts where the language
   contained within these texts is used to construct meaning and
   representation."  This process happens in society [Denzin] and
   manifests itself in institutions and organizations [King]; it is
   exposed using the ethnographic methods of semi-structured interviews
   and participant observation.  Or, in non-academic language, the way
   the language in IETF/IRTF documents describes and approaches the
   issues they are trying to address is an indication of the underlying
   social assumptions and relationships of the engineers to their
   engineering.  By reading and analyzing these documents, as well as
   interviewing engineers and participating in the IETF/IRTF working
   groups, it is possible to distill the relationship between human
   rights, protocols, and the Internet's infrastructure as it pertains
   to the work of the IETF.

   The discourse analysis was operationalized using qualitative and
   quantitative means.  The first step taken by the authors and
   contributors was reading RFCs and other official IETF documents.  The
   second step was the use of a Python-based analyzer, using the
   "Bigbang" tool, adapted by Nick Doty [Doty], to scan for the concepts
   that were identified as important architectural principles
   (distilled from the initial reading and supplemented by the
   interviews and
   participant observation).  Such a quantitative method is very precise
   and speeds up the research process [Ritchie].  But this tool is
   unable to understand "latent meaning" [Denzin].  In order to mitigate
   these issues of automated word-frequency-based approaches and to get
   a sense of the "thick meaning" [Geertz] of the data, a second
   qualitative analysis of the data set was performed.  The initial
   rounds of quantitative discourse analysis informed the interviews
   and the subsequent rounds of qualitative analysis, and the results
   of the qualitative interviews in turn fed new concepts into the
   quantitative discourse analysis.  In this way, the two methods
   continued to support and enrich each other.
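
   To give a flavor of the quantitative step, a minimal sketch of such
   a word-frequency pass is shown below.  This is not the actual
   "Bigbang" tool; the file name and the concept list are purely
   illustrative:

      # Count occurrences of concept terms in a local plain-text RFC
      # (illustrative stand-in for a "Bigbang"-style frequency scan).
      import re
      from collections import Counter

      CONCEPTS = {"end-to-end", "reliability", "resilience",
                  "interoperability", "transparency", "connectivity",
                  "anonymity", "privacy"}

      def concept_counts(path: str) -> Counter:
          with open(path, encoding="utf-8", errors="ignore") as f:
              text = f.read().lower()
          tokens = re.findall(r"[a-z][a-z-]*[a-z]", text)
          return Counter(t for t in tokens if t in CONCEPTS)

      print(concept_counts("rfc791.txt"))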

   The ethnographic methods of the data collection and processing
   allowed the research group to acquire the data necessary to "provide
   a holistic understanding of research participants' views and actions"
   [Denzin] that highlighted ongoing issues and case studies where
   protocols impact human rights.  The interview participants were
   selected through purposive sampling [Babbie], as the research group
   was interested in getting a wide variety of opinions on the role of
   human rights in guiding protocol development.  This sampling method
   also ensured that individuals with extensive experience working at
   the IETF in various roles were targeted.  The interviewees included
   individuals in leadership positions (Working Group (WG) chairs, Area
   Directors (ADs)), "regular participants", and individuals working for
   specific entities (corporate, civil society, political, academic) and
   represented various backgrounds, nationalities, and genders.

5.1. Data Sources

In order to map the potential relationship between human rights and protocols, the HRPC Research Group gathered data from three specific sources:

5.1.1. Discourse Analysis of RFCs

To start addressing the issue, a mapping exercise analyzing Internet infrastructure and protocol features vis-a-vis their possible impact on human rights was undertaken. To that end, research was conducted on (1) the language used in current and historic RFCs and (2) information gathered from mailing-list discussions, in order to expose core architectural principles, language, and deliberations on the human rights of those affected by the network.

5.1.2. Interviews with Members of the IETF Community

Over 30 interviews with current and past members of the Internet Architecture Board (IAB), current and past members of the Internet Engineering Steering Group (IESG), chairs of selected working groups, and RFC authors were conducted at the IETF 92 meeting in Dallas in March 2015 to get an insider's understanding of how they view the relationship (if any) between human rights and protocols, and how this relationship plays out in their work. Several of the participants opted to remain anonymous. If you are interested in this data set, please contact the authors of this document.

5.1.3. Participant Observation in Working Groups

By participating in various working groups, in person at IETF meetings, and on mailing lists, information about the IETF's day-to-day workings was gathered, from which general themes, technical concepts, and use cases about human rights and protocols were extracted. This process started at the IETF 91 meeting in Honolulu and continues today.

5.2. Data Analysis Strategies

The data above was processed using three consecutive strategies: mapping protocols related to human rights, extracting concepts from these protocols, and creating a common glossary (detailed under Section 2). Before going over these strategies, some elaboration on the process of identifying technical concepts as they relate to human rights is needed:

5.2.1. Identifying Qualities of Technical Concepts That Relate to Human Rights

5.2.1.1. Mapping Protocols and Standards to Human Rights
By combining data from the three data sources named above, an extensive list of protocols and standards that potentially enable the Internet as a tool for freedom of expression and association was created. In order to determine the enabling (or inhibiting) features, we relied on direct references in the RFCs to such impacts, as well as input from the community. Based on this analysis, a list of RFCs that describe standards and protocols that are potentially closely related to human rights was compiled.
5.2.1.2. Extracting Concepts from Selected RFCs
The first step was to identify the protocols and standards that are related to human rights and that help create an environment that enables human rights. For that, we needed to focus on specific technical concepts that underlie these protocols and standards. Based on this list, a number of technical concepts that appeared frequently were extracted and used to create a second list of technical terms that, when combined and applied in different circumstances, create an enabling environment for exercising human rights on the Internet.
5.2.1.3. Building a Common Vocabulary of Technical Concepts That Impact Human Rights
While interviewing experts, investigating RFCs, and compiling technical definitions, several concepts of convergence and divergence were identified. To ensure that the discussion was based on a common understanding of terms and vocabulary, a list of definitions was created. The definitions are based on the wording found in various IETF documents; if the definitions were not available therein, definitions were taken from other SDOs or academic literature, as indicated in Section 2.
5.2.1.4. Translating Human Rights Concepts into Technical Definitions
The previous steps allowed for the clarification of relationships between human rights and technical concepts. The steps taken show how the research process "zoomed in", from compiling a broad list of protocols and standards that relate to human rights to extracting the precise technical concepts that make up these protocols and standards, in order to understand the relationship between the two. This subsection presents the next step: translating human rights to technical concepts by matching the individual components of the rights to the accompanying technical concepts, allowing for the creation of a list of technical concepts that, when partially combined, can create an enabling environment for human rights.
5.2.1.5. List of Technical Terms That, When Partially Combined, Can Create an Enabling Environment for Human Rights
   Based on the prior steps, the following list of technical terms was
   drafted.  When partially combined, the terms on this list can create
   an enabling environment for human rights, such as freedom of
   expression and freedom of association.

   Architectural principles:

      Access as human right
      Good enough principle
      Simplicity

   Enabling features and system properties for user rights:

      End-to-end
      Reliability
      Resilience
      Interoperability
      Transparency
      Data minimization
      Permissionless innovation
      Graceful degradation
      Connectivity
      Heterogeneity support

       Figure 1: Relationship between Architectural Principles and
                 Enabling Features for User Rights

5.2.2. Relating Human Rights to Technical Concepts

   The technical concepts listed in the steps above have been grouped
   according to their impact on specific rights, as mentioned in the
   interviews conducted at IETF 92 as well as in the study of the
   literature (see Section 4 ("Literature and Discussion Review")
   above).  This analysis aims to assist protocol developers in better
   understanding the role that specific technical concepts play in
   contributing to an enabling environment for people to exercise
   their human rights.  This analysis does not claim to be a complete
   or exhaustive mapping of all possible ways in which protocols could
   potentially impact human rights; rather, it presents a mapping of
   initial concepts based on interviews and on discussion and review
   of the literature.

   +-----------------------+-----------------------------------------+
   | Technical Concepts    | Rights Potentially Impacted             |
   +-----------------------+-----------------------------------------+
   | Connectivity          |                                         |
   | Privacy               |                                         |
   | Security              |                                         |
   | Content agnosticism   | Right to freedom of expression          |
   | Internationalization  |                                         |
   | Censorship resistance |                                         |
   | Open standards        |                                         |
   | Heterogeneity support |                                         |
   +-----------------------+-----------------------------------------+
   | Anonymity             |                                         |
   | Privacy               |                                         |
   | Pseudonymity          | Right to non-discrimination             |
   | Accessibility         |                                         |
   +-----------------------+-----------------------------------------+
   | Content agnosticism   |                                         |
   | Security              | Right to equal protection               |
   +-----------------------+-----------------------------------------+
   | Accessibility         |                                         |
   | Internationalization  | Right to political participation        |
   | Censorship resistance |                                         |
   | Connectivity          |                                         |
   +-----------------------+-----------------------------------------+
   | Open standards        |                                         |
   | Localization          | Right to participate in cultural life,  |
   | Internationalization  |    arts, and science, and               |
   | Censorship resistance | Right to education                      |
   | Accessibility         |                                         |
   +-----------------------+-----------------------------------------+
   | Connectivity          |                                         |
   | Decentralization      |                                         |
   | Censorship resistance | Right to freedom of assembly            |
   | Pseudonymity          |    and association                      |
   | Anonymity             |                                         |
   | Security              |                                         |
   +-----------------------+-----------------------------------------+
   | Reliability           |                                         |
   | Confidentiality       |                                         |
   | Integrity             | Right to security                       |
   | Authenticity          |                                         |
   | Anonymity             |                                         |
   |                       |                                         |
   +-----------------------+-----------------------------------------+

        Figure 2: Relationship between Specific Technical Concepts and
           Their Contribution to an Enabling Environment for People
                       to Exercise Their Human Rights

5.2.3. Mapping Cases of Protocols, Implementations, and Networking Paradigms That Adversely Impact Human Rights or Are Enablers Thereof

   Given the information above, the following list of cases of
   protocols, implementations, and networking paradigms that either
   adversely impact or enable human rights was formed.  It is important
   to note that the assessment here is not a general judgment on these
   protocols, nor is it an exhaustive listing of all the potential
   negative or positive impacts on human rights that these protocols
   might have.

   When these protocols were conceived, there were many criteria to
   take into account.  For instance, relying on a centralized service
   can be bad for freedom of speech (it creates one more control point,
   where censorship could be applied), but it may be a necessity if the
   endpoints are not connected and reachable permanently.  So, when we
   say "protocol X has feature Y, which may endanger freedom of
   speech," it does not mean that protocol X is bad, much less that its
   authors were evil.  The goal here is to show, with actual examples,
   that the design of protocols has practical consequences for some
   human rights and that these consequences have to be considered in
   the design phase.
5.2.3.1. IPv4
   The Internet Protocol version 4 (IPv4), also known as "Layer 3" of
   the Internet and specified with a common encapsulation and protocol
   header, is defined in [RFC791].  The evolution of Internet
   communications led to continued development in this area,
   "encapsulated" in the development of version 6 (IPv6) of the
   protocol [RFC8200].  In spite of this updated protocol, we find that
   23 years after the specification of IPv6 the older IPv4 standard
   continues to account for a sizable majority of Internet traffic.

   Most of the issues discussed here (Network Address Translators
   (NATs) are a major exception; see Section 5.2.3.1.2 ("Address
   Translation and Mobility")) are valid for IPv4 as well as IPv6.

   The Internet was designed as a platform for free and open
   communication, most notably encoded in the end-to-end principle, and
   that philosophy is also present in the technical implementation of
   IP [RFC3724].  While the protocol was designed to exist in an
   environment where intelligence is at the end hosts, it has proven to
   provide sufficient information that a more intelligent network core
   can make policy decisions and enforce policy-based traffic shaping,
   thereby restricting the communications of end hosts.  These
   capabilities for network control and for limitations on freedom of
   expression by end hosts can be traced back to the design of IPv4,
   helping us to understand which technical protocol decisions have led
   to harm to this human right.

   A feature that can harm freedom of expression as well as the right
   to privacy through misuse of IP is the exploitation of the public
   visibility of the host pairs for all communications and the
   corresponding ability to differentiate and block traffic as a result
   of that metadata.
5.2.3.1.1. Network Visibility of Source and Destination
   The IPv4 protocol header contains fixed location fields for both the
   source IP address and destination IP address [RFC791].  These
   addresses identify both the host sending and the host receiving each
   message; they also allow the core network to understand who is
   talking to whom and to practically limit communication selectively
   between pairs of hosts.  Blocking of communication based on the pair
   of source and destination is one of the most common limitations on
   the ability for people to communicate today [CAIDA] and can be seen
   as a restriction of the ability for people to assemble or to
   consensually express themselves.

   Inclusion of an Internet-wide identified source in the IP header is
   not the only possible design, especially since the protocol is most
   commonly implemented over Ethernet networks exposing only link-local
   identifiers [RFC894].  A variety of alternative designs do exist,
   such as the Accountable and Private Internet Protocol [APIP] and
   High-speed Onion Routing at the Network Layer (HORNET) [HORNET], as
   well as source routing, which would allow the sender to choose a
   predefined (safe) route, and spoofing of the source IP address.  The
   latter two are technically supported by IPv4, but neither is
   considered good practice on the Internet [Farrow].  While projects
   like [TorProject] provide an alternative implementation of anonymity
   in connections, they have been developed in spite of the IPv4
   protocol design.
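
   The role these fixed header fields play is easy to see in code.  The
   following sketch (purely illustrative; the header bytes are example
   values) extracts the source and destination addresses from a raw
   IPv4 header, exactly as any on-path device can:

      # The source and destination addresses sit at fixed offsets
      # (bytes 12-15 and 16-19) of every IPv4 header [RFC791]; this is
      # what lets any on-path device read "who talks to whom" at line
      # rate and filter on it.
      import socket
      import struct

      def addresses(ipv4_header: bytes):
          src, dst = struct.unpack_from("!4s4s", ipv4_header, 12)
          return socket.inet_ntoa(src), socket.inet_ntoa(dst)

      # A 20-byte header with illustrative values:
      hdr = (bytes.fromhex("45000054000040004001f7cb")
             + socket.inet_aton("192.0.2.1")
             + socket.inet_aton("198.51.100.7"))
      print(addresses(hdr))   # ('192.0.2.1', '198.51.100.7')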

5.2.3.1.2. Address Translation and Mobility
   A major structural shift in the Internet that undermined the
   protocol design of IPv4, and significantly reduced the freedom of
   end users to communicate and assemble, was the introduction of
   network address translation [RFC3022].  Network address translation
   is a process whereby organizations and autonomous systems connect
   two networks by translating the IPv4 source and destination
   addresses between them.  This process puts the router performing the
   translation in a privileged position, where it determines which
   subset of communications will be translated.  This process of
   translation has seen widespread adoption despite going against the
   end-to-end principle of the underlying protocol [NATusage].

   In contrast, the proposed mechanism to provide support for mobility
   and forwarding to clients that may move -- encoded instead as an
   option in IP [RFC5944] -- has failed to gain traction.  In this
   situation, the compromise made in the design of the protocol
   resulted in a technology that is not coherent with the end-to-end
   principle and thus creates an extra possible hurdle for freedom of
   expression in its design, even though a viable alternative exists.
   There is a particular problem surrounding NATs and Virtual Private
   Networks (VPNs) (as well as other connections used for privacy
   purposes), as NATs sometimes cause VPNs not to work.
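
   For illustration only (the addresses and port numbers below are made
   up), the core of the translation step can be sketched in a few
   lines; the table the translator keeps is precisely the privileged
   position described above:

      # Sketch of the rewrite step a NAT performs [RFC3022]: internal
      # (address, port) pairs are mapped onto a single public address,
      # so the translating router alone decides which flows exist.
      nat_table = {}      # (private_ip, private_port) -> public_port
      PUBLIC_IP = "203.0.113.5"
      next_port = 40000

      def translate_outbound(src_ip: str, src_port: int):
          global next_port
          key = (src_ip, src_port)
          if key not in nat_table:
              nat_table[key] = next_port   # allocate a public port
              next_port += 1
          return PUBLIC_IP, nat_table[key]

      print(translate_outbound("10.0.0.2", 5060))
      # ('203.0.113.5', 40000)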
5.2.3.2. DNS
   The Domain Name System (DNS) [RFC1035] provides service discovery
   capabilities and provides a mechanism to associate human-readable
   names with services.  The DNS is organized around a set of
   independently operated "root servers" run by organizations that
   function in line with ICANN's policy by answering queries for which
   organizations have been delegated to manage registration under each
   Top-Level Domain (TLD).

   The DNS is organized as a rooted tree, and this brings up political
   and social concerns over control.  TLDs are maintained and
   determined by ICANN.  These namespaces encompass several classes of
   services.  The initial namespaces, including ".com" and ".net",
   provide common spaces for expression of ideas,
   though their policies are enacted through US-based companies.  Other
   namespaces are delegated to specific nationalities and may impose
   limits designed to focus speech in those forums, to both (1) promote
   speech from that nationality and (2) comply with local limits on
   expression and social norms.  Finally, the system has recently been
   expanded with additional generic and sponsored namespaces -- for
   instance, ".travel" and ".ninja" -- that are operated by a range of
   organizations that may independently determine their registration
   policies.  This new development has both positive and negative
   implications in terms of enabling human rights.  Some individuals
   argue that it undermines the right to freedom of expression because
   some of these new generic TLDs have restricted policies on
   registration and particular rules on hate speech content.  Others
   argue that precisely these properties are positive because they
   enable certain (mostly minority) communities to build safer spaces
   for association, thereby enabling their right to freedom of
   association.  An often-mentioned example is an application like
   .gay [CoE].

   As discussed in [RFC7626], DNS has significant privacy issues.  Most
   notable is the lack of encryption to limit the visibility of requests
   for domain resolution from intermediary parties, and a limited
   deployment of DNSSEC to provide authentication, allowing the client
   to know that they received a correct, "authoritative" answer to a
   query.  In response to the privacy issues, the IETF DNS Private
   Exchange (DPRIVE) Working Group is developing mechanisms to provide
   confidentiality to DNS transactions, to address concerns surrounding
   pervasive monitoring [RFC7258].

   Authentication through DNSSEC creates a validation path for records.
   This authentication protects against forged or manipulated DNS data.
   As such, DNSSEC protects directory lookups and makes it harder to
   hijack a session.  This is important because interference with the
   operation of the DNS is currently becoming one of the central
   mechanisms used to block access to websites.  This interference
   limits both the freedom of expression of the publisher to offer their
   content and the freedom of assembly for clients to congregate in a
   shared virtual space.  Even though DNSSEC doesn't prevent censorship,
   it makes it clear that the returned information is not the
   information that was requested; this contributes to the right to
   security and increases trust in the network.  It is, however,
   important to note that DNSSEC is currently not widely supported or
   deployed by domain name registrars, making it difficult to
   authenticate and use correctly.
5.2.3.2.1. Removal of Records
   There have been a number of cases where the records for a domain are
   removed from the name system due to political events.  Examples of
   this removal include the "seizure" of wikileaks [BBC-wikileaks] and
   the names of illegally operating gambling operations by the United
   States Immigration and Customs Enforcement (ICE) unit.  In the first
   case, a US court ordered the registrar to take down the domain.  In
   the second, ICE compelled the US-based registry in charge of the
   .com TLD to hand ownership of those domains over to the US
   government.  The same technique has been used in Libya to remove
   sites in violation of "our Country's Law and Morality (which) do not
   allow any kind of pornography or its promotion." [techyum]

   At a protocol level, there is no technical auditing for name
   ownership, as in alternate systems like Namecoin [Namecoin].  As a
   result, there is no ability for users to differentiate seizure from
   the legitimate transfer of name ownership, which is purely a policy
   decision made by registrars.  While DNSSEC addresses the network
   distortion events described below, it does not tackle this problem.

   (Although we mention alternative techniques, this is not a
   comparison of DNS with Namecoin: the latter has its own problems and
   limitations.  The idea here is to show that there are several
   possible choices, and they have consequences for human rights.)
5.2.3.2.2. Distortion of Records
   The most common mechanism by which the DNS is abused to limit
   freedom of expression is through manipulation of protocol messages
   by the network.  One form occurs at an organizational level, where
   client computers are instructed to use a local DNS resolver
   controlled by the organization.  The DNS resolver will then
   selectively distort responses rather than request the authoritative
   lookup from the upstream system.  The second form occurs through the
   use of Deep Packet Inspection (DPI), where all DNS protocol messages
   are inspected by the network and objectionable content is distorted,
   as can be observed in Chinese networks.

   A notable instance of distortion occurred in Greece [Ververis],
   where a study found evidence of both (1) DPI to distort DNS replies
   and (2) more excessive blocking of content than was legally required
   or requested (also known as "overblocking").  Internet Service
   Providers (ISPs) there, obeying a governmental order, prevented
   clients from resolving the names of these domains, resulting in this
   particular instance of blocking.

   At a protocol level, the effectiveness of these attacks is made
   possible by a lack of authentication in the DNS protocol.  DNSSEC
   provides the ability to determine the authenticity of responses when
   used, but it is not regularly checked by resolvers.  DNSSEC is not
   effective when the local resolver for a network is complicit in the
   distortion -- for instance, when the resolver assigned for use by an
   ISP is the source of injection.  Selective distortion of records is
   also made possible by the predictable structure of DNS messages,
   which makes it computationally easy for a network device to watch all
   passing messages even at high speeds, and the lack of encryption,
   which allows the network to distort only an objectionable subset of
   protocol messages.  Specific distortion mechanisms are discussed
   further in [Hall].
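
   The predictability referred to above is easy to demonstrate.  The
   sketch below (illustrative only) builds a minimal DNS query for an A
   record; every field sits at a fixed, well-known offset [RFC1035],
   which is what makes line-rate matching and selective rewriting cheap
   for a network device when the message is sent unencrypted:

      # Build a minimal DNS query [RFC1035]: a fixed 12-byte header
      # followed by a length-prefixed name and the query type/class.
      import struct

      def build_query(name: str, qtype: int = 1, txid: int = 0x1234):
          # header: id, flags (RD set), qdcount=1, an/ns/ar counts=0
          header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
          qname = b"".join(bytes([len(label)]) + label.encode()
                           for label in name.split("."))
          return header + qname + b"\x00" + struct.pack("!HH", qtype, 1)

      print(build_query("example.com").hex())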

   Users can switch to another resolver -- for instance, a public
   resolver.  The distorter can then try to block or hijack the
   connection to this resolver.  This may start an arms race, with the
   user switching to secured connections to this alternative resolver
   [RFC7858] and the distorter then trying to find more sophisticated
   ways to block or hijack the connection.  In some cases, this search
   for an alternative, non-disrupting resolver may lead to more
   centralization because many people are switching to a few big
   commercial public resolvers.

5.2.3.2.3. Injection of Records
Responding incorrectly to requests for name lookups is the most common mechanism that in-network devices use to limit the ability of end users to discover services. A deviation that accomplishes a similar objective and may be seen as different from a "freedom of expression" perspective is the injection of incorrect responses to queries. The most prominent example of this behavior occurs in China, where requests for lookups of sites deemed inappropriate will trigger the network to return a false response, causing the client to ignore the real response when it subsequently arrives [greatfirewall]. Unlike the other network paradigms discussed above, injection does not stifle the ability of a server to announce its name; it instead provides another voice that answers sooner. This is effective because without DNSSEC, the protocol will respond to whichever answer is received first, without listening for subsequent answers.
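   The reason injection works can be seen in a few lines of code.  In
   the illustrative stub resolver below (the server address is a
   placeholder, and the query could be one like the sketch in the
   previous subsection), whichever datagram arrives first with a
   matching transaction ID is accepted; the real answer, arriving
   later, is never read.  That is exactly the race an injector wins and
   that DNSSEC validation would detect:

      # A plain-UDP stub resolver: the first matching answer wins.
      import socket

      def first_answer(query: bytes, server: str = "192.0.2.53",
                       timeout: float = 2.0) -> bytes:
          s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
          s.settimeout(timeout)
          s.sendto(query, (server, 53))
          while True:
              data, _ = s.recvfrom(4096)
              if data[:2] == query[:2]:   # transaction IDs match
                  return data             # later answers are ignored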
5.2.3.3. HTTP
   The Hypertext Transfer Protocol (HTTP) version 1.1 [RFC7230]
   [RFC7231] [RFC7232] [RFC7233] [RFC7234] [RFC7235] [RFC7236]
   [RFC7237] is a request-response application protocol developed
   throughout the 1990s.  HTTP contributed substantially to the
   exponential growth of the
   Internet and the interconnection of populations around the world.
   Its simple design strongly contributed to the fact that HTTP has
   become the foundation of most modern Internet platforms and
   communication systems, from websites to chat systems and computer-to-
   computer applications.  In its manifestation in the World Wide Web,
   HTTP radically revolutionized the course of technological development
   and the ways people interact with online content and with each other.

   However, HTTP is also a fundamentally insecure protocol that doesn't
   natively provide encryption properties.  While the definition of the
   Secure Sockets Layer (SSL) [RFC6101], and later of Transport Layer
   Security (TLS) [RFC5246], also happened during the 1990s, the fact
   that HTTP doesn't mandate the use of such encryption layers by
   developers and service providers was one of the reasons for a very
   late adoption of encryption.  Only in the middle of the 2000s did we
   observe big Internet companies, such as Google, starting to provide
   encrypted access to their web services.

   The lack of sensitivity and understanding of the critical importance
   of securing web traffic incentivized certain (offensive) actors to
   develop, deploy, and utilize interception systems at large and to
   later launch active injection attacks, in order to swipe large
   amounts of data and compromise Internet-enabled devices.  The
   commercial availability of systems and tools to perform these types
   of attacks also led to a number of human rights abuses that have been
   discovered and reported over the years.

   Generally, we can identify traffic interception (Section 5.2.3.3.1)
   and traffic manipulation (Section 5.2.3.3.2) as the two most
   problematic attacks that can be performed against applications
   employing a cleartext HTTP transport layer.  That being said, the
   IETF is taking steady steps to move to the encrypted version of HTTP,
   HTTP Secure (HTTPS).

   While this is commendable, we must not lose track of the fact that
   different protocols, implementations, configurations, and networking
   paradigms can intersect such that they (can be used to) adversely
   impact human rights.  For instance, to facilitate surveillance,
   certain countries will throttle HTTPS connections, forcing users to
   switch to (unthrottled) HTTP [Aryan-etal].

5.2.3.3.1. Traffic Interception
While we are seeing an increasing trend in the last couple of years to employ SSL/TLS as a secure traffic layer for HTTP-based applications, we are still far from seeing a ubiquitous use of encryption on the World Wide Web. It is important to consider that the adoption of SSL/TLS is also a relatively recent phenomenon.
   Email providers such as riseup.net were the first to enable SSL by
   default.  Google did not introduce an option for its Gmail users to
   navigate with SSL until 2008 [Rideout] and turned TLS on by default
   later, in 2010 [Schillace].  It took an increasing amount of security
   breaches and revelations on global surveillance from Edward Snowden
   before other mail service providers followed suit.  For example,
   Yahoo did not enable SSL/TLS by default on its webmail services until
   early 2014 [Peterson].

   TLS itself has been subject to many attacks and bugs; this situation
   can be attributed to some fundamental design weaknesses, such as lack
   of a state machine (which opens a vulnerability for triple handshake
   attacks) and flaws caused by early US government restrictions on
   cryptography, leading to cipher-suite downgrade attacks (Logjam
   attacks).  These vulnerabilities are being corrected in TLS 1.3
   [Bhargavan] [Adrian].

   HTTP upgrading to HTTPS is also vulnerable to having an attacker
   remove the "s" in any links to HTTPS URIs from a web page transferred
   in cleartext over HTTP -- an attack called "SSL Stripping"
   [sslstrip].  Thus, for high-security use of HTTPS, IETF standards
   such as HTTP Strict Transport Security (HSTS) [RFC6797], certificate
   pinning [RFC7469], and/or DNS-Based Authentication of Named Entities
   (DANE) [RFC6698] should be used.
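
   The mechanics of SSL stripping are almost trivial, which is part of
   why HSTS matters.  The sketch below is a toy illustration of the
   core rewrite an on-path attacker applies to a page served over
   cleartext HTTP; it is not the actual [sslstrip] tool:

      # Core of an "SSL stripping" rewrite: drop the "s" from links in
      # cleartext HTML so the victim never upgrades to HTTPS.  HSTS
      # [RFC6797] defeats this by making the browser refuse plain HTTP.
      import re

      def strip_https(html: str) -> str:
          return re.sub(r'href="https://', 'href="http://', html)

      page = '<a href="https://bank.example/login">Log in</a>'
      print(strip_https(page))
      # <a href="http://bank.example/login">Log in</a>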

   As we learned through Snowden's revelations, intelligence agencies
   have been intercepting and collecting unencrypted traffic at large
   for many years.  There are documented examples of such
   mass-surveillance programs with the Government Communications
   Headquarters's (GCHQ's) Tempora [WP-Tempora] and the National
   Security Agency's (NSA's) XKeyscore [Greenwald].  Through these
   programs, the NSA and the GCHQ have been able to swipe large amounts
   of data, including email and instant messaging communications that
   have been transported in the clear for years by providers
   unsuspecting of the pervasiveness and scale of governments' efforts
   and investment in global mass-surveillance capabilities.

   However, similar mass interception of unencrypted HTTP communications
   is also often employed at the national level by some democratic
   countries, by exercising control over state-owned ISPs and through
   the use of commercially available monitoring, collection, and
   censorship equipment.  Over the last few years, a lot of information
   has come to public attention on the role and scale of a surveillance
   industry dedicated to developing different types of interception
   gear, making use of known and unknown weaknesses in existing
   protocols [RFC7258].  We have several records of such equipment being
   sold and utilized by some regimes in order to monitor entire segments
   of a population, especially at times of social and political
   distress, uncovering massive human rights abuses.  For example, in
   2013, the group Telecomix revealed that the Syrian regime was making
   use of Blue Coat products in order to intercept cleartext traffic as
   well as to enforce censorship of unwanted content [RSF].  Similarly,
   in 2011, it was found that the French technology firm Amesys provided
   the Gadhafi government with equipment able to intercept emails,
   Facebook traffic, and chat messages at a country-wide level [WSJ].
   The use of such systems, especially in the context of the Arab Spring
   and of civil uprisings against the dictatorships, has caused serious
   concerns regarding significant human rights abuses in Libya.

5.2.3.3.2. Traffic Manipulation
   The lack of a secure transport layer under HTTP connections not only
   exposes users to interception of the content of their communications
   but is more and more commonly abused as a vehicle for actively
   compromising computers and mobile devices.  If an HTTP session
   travels in the clear over the network, any node positioned at any
   point in the network is able to perform man-in-the-middle attacks;
   the node can observe, manipulate, and hijack the session and can
   modify the content of the communication in order to trigger
   unexpected behavior by the application generating the traffic.  For
   example, in the case of a browser, the attacker would be able to
   inject malicious code in order to exploit vulnerabilities in the
   browser or any of its plugins.  Similarly, the attacker would be
   able to intercept, add malware to, and repackage binary software
   updates that are very commonly downloaded in the clear by
   applications such as word processors and media players.  If the HTTP
   session were encrypted, the tampering of the content would not be
   possible, and these network injection attacks would not be
   successful.

   While traffic manipulation attacks have long been known, documented,
   and prototyped, especially in the context of Wi-Fi and LAN networks,
   in the last few years we have observed an increasing investment in
   the production and sale of network injection equipment that is both
   commercially available and deployed at scale by intelligence
   agencies.

   For example, we learned from some of the documents provided by
   Edward Snowden to the press that the NSA has constructed a global
   network injection infrastructure, called "QUANTUM", able to leverage
   mass surveillance in order to identify targets of interest and
   subsequently task man-on-the-side attacks to ultimately compromise a
   selected device.  Among other attacks, the NSA makes use of an
   attack called "QUANTUMINSERT" [Haagsma], which intercepts and
   hijacks an unencrypted HTTP communication and forces the requesting
   browser to redirect to a host controlled by the NSA instead of the
   intended website.  Normally, the new destination would be an
   exploitation
   service, referred to in Snowden documents as "FOXACID", which would
   attempt to execute malicious code in the context of the target's
   browser.  The Guardian reported in 2013 that the NSA has, for
   example, been using these techniques to target users of the popular
   anonymity service Tor [Schneier].  The German Norddeutscher Rundfunk
   (NDR) reported in 2014 that the NSA has also been using its
   mass-surveillance capabilities to identify Tor users at large
   [Appelbaum].

   Recently, similar capabilities used by Chinese authorities have been
   reported as well in what has been informally called the "Great
   Cannon" [Marcak], which raised numerous concerns on the potential
   curb on human rights and freedom of speech due to the increasingly
   tighter control of Chinese Internet communications and access to
   information.

   Network injection attacks are also made widely available to state
   actors around the world through the commercialization of similar,
   smaller-scale equipment that can be easily acquired and deployed at a
   country-wide level.  Certain companies are known to have network
   injection gear within their products portfolio [Marquis-Boire].  The
   technology devised and produced by some of them to perform network
   traffic manipulation attacks on HTTP communications is even the
   subject of a patent application in the United States [Googlepatent].
   Access to offensive technologies available on the commercial lawful
   interception market has led to human rights abuses and illegitimate
   surveillance of journalists, human rights defenders, and political
   activists in many countries around the world [Collins].  While
   network injection attacks haven't been the subject of much attention,
   they do enable even unskilled attackers to perform silent and very
   resilient compromises, and unencrypted HTTP remains one of the main
   vehicles.

   There is a new version of HTTP, called "HTTP/2" [RFC7540], which aims
   to be largely backwards compatible while also offering new options
   such as data compression of HTTP headers, pipelining of requests, and
   multiplexing multiple requests over a single TCP connection.  In
   addition to decreasing latency to improve page-loading speeds, it
   also facilitates more efficient use of connectivity in low-bandwidth
   environments, which in turn enables freedom of expression; the right
   to assembly; the right to political participation; and the right to
   participate in cultural life, arts, and science.  [RFC7540] does not
   mandate TLS or any other form of encryption, nor does it itself
   support opportunistic encryption; opportunistic encryption for HTTP
   is, however, now addressed in [RFC8164].
5.2.3.4. XMPP
The Extensible Messaging and Presence Protocol (XMPP), specified in [RFC6120], provides a standard for interactive chat messaging and has evolved to encompass interoperable text, voice, and video chat. The protocol is structured as a federated network of servers, similar to email, where users register with a local server that acts on their behalf to cache and relay messages. This protocol design has many advantages, allowing servers to shield clients from denial of service and other forms of retribution for their expression; it is also designed to avoid central entities that could control the ability to communicate or assemble using the protocol. Nonetheless, there are plenty of aspects of the protocol design of XMPP that shape the ability for users to communicate freely and to assemble via the protocol.
5.2.3.4.1. User Identification
   The XMPP specification [RFC6120] dictates that clients are
   identified with a resource (<node@domain/home> / <node@domain/work>)
   so that conversations can be directed to specific devices.  While
   the protocol does not specify that the resource must be exposed by
   the client's server to remote users, in practice this has become the
   default behavior.  In doing so, users can be tracked by remote
   friends and their servers, who are able to monitor the presence not
   just of the user but of each individual device the user logs in
   with.  This has proven to be misleading to many users [Pidgin],
   since many clients only expose user-level rather than device-level
   presence.  Likewise, invisibility -- the ability to communicate
   without notifying all buddies and other servers of one's
   availability -- is not part of the formal protocol; it has only been
   added as an extension within the XML stream rather than enforced by
   the protocol.
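
   As a purely illustrative sketch (the address below is made up), the
   device-identifying resource part of an XMPP address is plainly
   visible to anything that sees the address:

      # An XMPP address ("JID") carries a per-device resource part
      # [RFC6120]; when exposed, it lets remote contacts distinguish
      # and track each device a user logs in from.
      def split_jid(jid: str):
          node, _, rest = jid.partition("@")
          domain, _, resource = rest.partition("/")
          return node, domain, resource or None

      print(split_jid("alice@example.com/work-laptop"))
      # ('alice', 'example.com', 'work-laptop')  <- device identity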
5.2.3.4.2. Surveillance of Communication
XMPP specifies the standard by which communications channels may be encrypted, but it does not provide visibility to clients regarding whether their communications are encrypted on each link. In particular, even when both clients ensure that they have an encrypted connection to their XMPP server to ensure that their local network is unable to read or disrupt the messages they send, the protocol does not provide visibility into the encryption status between the two servers. As such, clients may be subject to selective disruption of communications by an intermediate network that disrupts communications based on keywords found through DPI. While many operators have committed to only establishing encrypted links from
   their servers in recognition of this vulnerability, it remains
   impossible for users to audit this behavior, and encrypted
   connections are not required by the protocol itself [XMPP-Manifesto].

   In particular, Section 13.14 of the XMPP specification [RFC6120]
   explicitly acknowledges the existence of a downgrade attack where an
   adversary controlling an intermediate network can force the
   inter-domain federation between servers to revert to a non-encrypted
   protocol where selective messages can then be disrupted.

5.2.3.4.3. Group Chat Limitations
Group chat in XMPP is defined as an extension within the XML specification of XMPP (https://xmpp.org/extensions/xep-0045.html). However, it is not encoded or required at a protocol level and is not uniformly implemented by clients. The design of multi-user chat in XMPP suffers from extending a protocol that was not designed with assembly of many users in mind. In particular, in the federated protocol provided by XMPP, multi-user communities are implemented with a distinguished "owner" who is granted control over the participants and structure of the conversation. Multi-user chat rooms are identified by a name specified on a specific server, so that while the overall protocol may be federated, the ability for users to assemble in a given community is moderated by a single server. That server may block the room and prevent assembly unilaterally, even between two users, neither of whom trust or use that server directly.
5.2.3.5. Peer-to-Peer
Peer-to-Peer (P2P) is a distributed network architecture [RFC5694] in which all the participant nodes can be responsible for the storage and dissemination of information from any other node (see [RFC7574], an IETF standard that discusses a P2P architecture called the "Peer-to-Peer Streaming Peer Protocol" (PPSPP)). A P2P network is a logical overlay that lives on top of the physical network and allows nodes (or "peers") participating in it to establish contact and exchange information directly with each other. The implementation of a P2P network may vary widely: it may be structured or unstructured, and it may implement stronger or weaker cryptographic and anonymity properties. While its most common application has traditionally been file-sharing (and other types of content delivery systems), P2P is a popular architecture for networks and applications that require (or encourage) decentralization. Prime examples include Bitcoin, as well as some proprietary multimedia applications.
   In a time of heavily centralized online services, P2P is regularly
   described as an alternative, more democratic, and resistant option
   that displaces structures of control over data and communications and
   delegates all peers to be equally responsible for the functioning,
   integrity, and security of the data.  While in principle P2P remains
   important to the design and development of future content
   distribution, messaging, and publishing systems, it poses numerous
   security and privacy challenges that are mostly delegated to
   individual developers to recognize, analyze, and solve in each
   implementation of a given P2P network.

5.2.3.5.1. Network Poisoning
   Since content, and sometimes peer lists, are safeguarded and
   distributed by their members, P2P networks are prone to what are
   generally defined as "poisoning attacks".  Poisoning attacks might
   be aimed (1) directly at the data that is being distributed, for
   example by intentionally corrupting the data, (2) at the index
   tables used to instruct the peers where to fetch the data, or (3) at
   routing tables, in an attempt to provide connecting peers with lists
   of rogue or nonexistent peers, with the intention of effectively
   causing a denial of service on the network.
5.2.3.5.2. Throttling
   P2P traffic (and BitTorrent in particular) represents a significant
   percentage of global Internet traffic [Sandvine], and it has become
   increasingly popular for ISPs to throttle customers' lines in order
   to limit bandwidth usage [torrentfreak1], sometimes probably as an
   effect of the ongoing conflict between copyright holders and
   file-sharing communities [wikileaks].  Such throttling undermines
   the end-to-end principle.  Throttling makes some uses of P2P
   networks ineffective and might be coupled with stricter inspection
   of users' Internet traffic through DPI techniques, possibly posing
   additional security and privacy risks.
5.2.3.5.3. Tracking and Identification
   One of the fundamental and most problematic issues with traditional
   P2P networks is a complete lack of anonymization of their users.
   For example, in the case of BitTorrent, all peers' IP addresses are
   openly available to the other peers.  This has led to
   ever-increasing tracking of P2P and file-sharing users [ars].  As
   the geographical location of users is directly exposed, and possibly
   their identity as well, users might become targets of additional
   harassment and attacks of a physical or legal nature.  For example,
   it is known that
   in Germany law firms have made extensive use of P2P and file-sharing
   tracking systems in order to identify downloaders and initiate legal
   actions looking for compensations [torrentfreak2].

   It is worth noting that there are some varieties of P2P networks that
   implement cryptographic practices and that introduce anonymization of
   their users.  Such implementations may be proved to be successful in
   resisting censorship of content and tracking of network peers.  A
   prime example is Freenet [freenet1], a free software application that
   is (1) designed to make it significantly more difficult to identify
   users and content and (2) dedicated to fostering freedom of speech
   online [freenet2].

5.2.3.5.4. Sybil Attacks
   In open-membership P2P networks, a single attacker can pretend to be
   many participants, typically by creating multiple fake identities of
   whatever kind the P2P network uses [Douceur].  Attackers can use
   Sybil attacks to bias choices that the P2P network makes
   collectively to the attacker's advantage, e.g., by making it more
   likely that a particular data item (or some threshold of the
   replicas or shares of a data item) is assigned to
   attacker-controlled participants.  If the P2P network implements any
   voting, moderation, or peer-review-like functionality, Sybil attacks
   may be used to "stuff the ballots" to benefit the attacker.
   Companies and governments can use Sybil attacks on
   discussion-oriented P2P systems for "astroturfing" or creating the
   appearance of mass grassroots support for some position where in
   reality there is none.

   It is important to know that there are no known complete,
   environmentally sustainable, and fully distributed solutions to
   Sybil attacks, and routing via "friends" allows users to be
   de-anonymized via their social graph.  It is also important to note
   that Sybil attacks in this context (e.g., astroturfing) are relevant
   to more than P2P protocols; they are also common on web-based
   systems, and they are exploited by governments and commercial
   entities.

   Encrypted P2P and anonymous P2P networks have already emerged.  They
   provide viable platforms for sharing material [Tribler], publishing
   content anonymously, and communicating securely [Bitmessage].  These
   platforms are not perfect, and more research needs to be done.  If
   adopted at large, well-designed and resistant P2P networks might
   represent a critical component of a future secure and distributed
   Internet, enabling freedom of speech and freedom of information at
   scale.
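
   The ballot-stuffing effect is simple to simulate.  In the toy sketch
   below (peer names and counts are arbitrary), a single attacker
   minting fifteen identities outvotes ten honest peers:

      # With open membership, one attacker holding k fake identities
      # casts k votes [Douceur]; a simple majority vote is stuffed.
      import random

      honest = [f"peer{i}" for i in range(10)]    # 10 honest peers
      sybils = [f"sybil{i}" for i in range(15)]   # one attacker, 15 ids

      votes = {p: random.choice(["A", "B"]) for p in honest}
      votes.update({s: "B" for s in sybils})      # attacker's choice

      winner = max("AB",
                   key=lambda c: sum(v == c for v in votes.values()))
      print(winner)   # always "B": the attacker decides the outcome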
5.2.3.6. Virtual Private Networks
   The VPNs discussed here are point-to-point connections that enable
   two computers to communicate over an encrypted tunnel.  There are
   multiple implementations and protocols used in the deployment of
   VPNs, and they generally differ in encryption protocol or in
   particular requirements, most commonly in proprietary and enterprise
   solutions.  VPNs are commonly used to (1) enable some devices to
   communicate through peculiar network configurations, (2) obtain some
   privacy and security properties in order to protect the traffic
   generated by the end user, or both.

   VPNs have also become a very popular technology among human rights
   defenders, dissidents, and journalists worldwide to avoid local
   monitoring and possibly also to circumvent censorship.  VPNs are
   often debated among human rights defenders as a potential
   alternative to Tor or other anonymous networks.  Such comparisons
   are misleading, as some of the privacy and security properties of
   VPNs are often misunderstood by less tech-savvy users and could
   ultimately lead to unintended problems.

   As VPNs have increased in popularity, commercial VPN providers have
   started growing as businesses and are very commonly picked by human
   rights defenders and people at risk, as they are normally provided
   with an easy-to-use service and, sometimes, even custom applications
   to establish the VPN tunnel.  Since the user can control neither the
   configuration of the network nor the security of the application,
   assessing the general privacy and security state of common VPNs is
   very hard.  Such services have often been discovered to be leaking
   information, and their custom applications have been found to be
   flawed.  While Tor and similar networks receive a lot of scrutiny
   from the public and the academic community, commercial or
   non-commercial VPNs are far less analyzed and understood
   [Insinuator] [Alshalan-etal], and it might be valuable to establish
   some standards to guarantee a minimal level of privacy and security
   to those who need them the most.
5.2.3.6.1. No Anonymity against VPN Providers
One of the common misconceptions among users of VPNs is the level of anonymity that VPNs can provide. This sense of anonymity can be betrayed by a number of attacks or misconfigurations of the VPN provider. It is important to remember that, in contrast to Tor and similar systems, VPNs were not designed to provide anonymity properties. From a technical point of view, a VPN might leak identifiable information or might be the subject of correlation attacks that could expose the originating address of a connecting user. Most importantly, it is vital to understand that commercial and non-commercial VPN providers are bound by the law of the jurisdiction in which they reside or in which their infrastructure is
   located, and they might be legally forced to turn over data of
   specific users if legal investigations or intelligence requirements
   dictate so.  In such cases, if the VPN providers retain logs, it is
   possible that a user's information could be provided to the user's
   adversary and lead to his or her identification.

5.2.3.6.2. Logging
   Because VPNs are point-to-point connections, the service providers
   are in fact able to observe the original location of connecting
   users, and they are able to track at what time users started their
   sessions and, possibly, also which destinations they tried to
   connect to.  If the VPN providers retain logs for a long enough
   time, they might be forced to turn over the relevant data or they
   might be otherwise compromised, leading to the same data getting
   exposed.  A clear log-retention policy could be enforced, but
   considering that countries enforce different levels of
   data-retention policies, VPN providers should at least be
   transparent regarding what information they store and for how long
   it is being kept.
5.2.3.6.3. Third-Party Hosting
VPN providers very commonly rely on third parties to provision the infrastructure that is later going to be used to run VPN endpoints. For example, they might rely on external dedicated server providers or on uplink providers. In those cases, even if the VPN provider itself isn't retaining any significant logs, the information on connecting users might be retained by those third parties instead, introducing an additional collection point for the adversary.
5.2.3.6.4. IPv6 Leakage
   Some studies have shown that several commercial VPN providers and
   applications suffer from critical leakage of information through
   IPv6 due to improper support and configuration [PETS2015VPN].  This
   is generally caused by a lack of proper configuration of the
   client's IPv6 routing tables.  Considering that most popular
   browsers and similar applications support IPv6 by default, if the
   host is provided with a functional IPv6 configuration, the traffic
   that is generated might be leaked if the VPN application isn't
   designed to handle such traffic properly.
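
   A coarse, purely illustrative check for this class of leak (the
   destination name is a placeholder, and the VPN is assumed here to
   tunnel IPv4 only) is to ask whether the host still resolves IPv6
   destinations while the tunnel is up:

      # If an IPv4-only VPN is up but the host still has a working
      # IPv6 configuration, traffic to AAAA destinations may bypass
      # the tunnel [PETS2015VPN].
      import socket

      def has_ipv6_path(host: str = "example.com") -> bool:
          try:
              infos = socket.getaddrinfo(host, 443, socket.AF_INET6)
          except socket.gaierror:
              return False    # no AAAA records or no IPv6 support
          return len(infos) > 0

      print(has_ipv6_path())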
5.2.3.6.5. DNS Leakage
   Similarly, VPN services that don't handle DNS requests and don't run
   DNS servers of their own might be prone to DNS leaks, which not only
   might expose sensitive information on the activity of a user but
   could also potentially lead to DNS hijacking attacks and subsequent
   compromises.
5.2.3.6.6. Traffic Correlation
   Some VPN implementations appear to be particularly vulnerable to
   identification and collection of key exchanges, which, as some
   Snowden documents revealed, are systematically collected and stored
   for future reference.  The ability of an adversary to monitor
   network connections at many different points over the Internet
   allows it to perform traffic correlation attacks and identify the
   origin of certain VPN traffic by cross-referencing the connection
   time of the user to the endpoint and the connection time of the
   endpoint to the final destination.  These types of attacks, although
   very expensive and normally only performed by very resourceful
   adversaries, have been documented [SPIEGEL] to be already in
   practice, and they could completely nullify the use of a VPN and
   ultimately expose the activity and the identity of a user at risk.
5.2.3.7. HTTP Status Code 451
"Every Internet user has run into the '404 Not Found' Hypertext Transfer Protocol (HTTP) status code when trying, and failing, to access a particular website" [Cath]. It is a response status that the server sends to the browser when the server cannot locate the URL. "403 Forbidden" is another example of this class of code signals that gives users information about what is going on. In the "403" case, the server can be reached but is blocking the request because the user is trying to access content forbidden to them, typically because some content is only for identified users, based on a payment or on special status in the organization. Most of the time, 403 is sent by the origin server, not by an intermediary. If a firewall prevents a government employee from accessing pornography on a work computer, it does not use 403.
Top   ToC   RFC8280 - Page 38
   As surveillance and censorship of the Internet have become more
   commonplace, voices were raised at the IETF to introduce a new
   status code that indicates when something is not available for
   "legal reasons" (like censorship):

   The 451 status code would allow server operators to operate with
   greater transparency in circumstances where issues of law or public
   policy affect their operation.  This transparency may be beneficial
   to both (1) these operators and (2) end users [RFC7725].

   The status code is named "451" in reference to both Bradbury's famous
   novel "Fahrenheit 451" and to 451 degrees Fahrenheit (the temperature
   at which some claim book paper autoignites).

   During the IETF 92 meeting in Dallas, there was discussion about the
   usefulness of 451.  The main tension revolved around the lack of an
   apparent machine-readable technical use of the information.  The
   extent to which 451 is just "political theatre" or whether it has a
   concrete technical use was heatedly debated.  Some argued that "the
   451 status code is just a status code with a response body"; others
   said it was problematic because "it brings law into the picture."
   Still others argued that it would be useful for individuals or for
   organizations like the "Chilling Effects" project that are crawling
   the Web to get an indication of censorship (IETF discussion on 451 --
   author's field notes, March 2015).  There was no outright objection
   during the Dallas meeting against moving forward on status code 451,
   and on December 18, 2015, the IESG approved "An HTTP Status Code to
   Report Legal Obstacles" (now [RFC7725]) for publication.  HTTP status
   code 451 is now an IETF-approved HTTP status code that signals when
   resource access is denied as a consequence of legal demands.

   What is interesting about this particular case is that not only
   technical arguments but also the status code's outright potential
   political use for civil society played a substantial role in shaping
   the discussion and the decision to move forward with this technology.

   It is nonetheless important to note that HTTP status code 451 is not
   a solution to detect all occasions of censorship.  A large swath of
   Internet filtering occurs in the network, at a lower level than HTTP,
   rather than at the server itself.  For these forms of censorship,
   451 plays a limited role: typical censoring intermediaries won't
   generate it for technical reasons, and such filtering regimes are in
   any case unlikely to voluntarily inject a 451 status code.  The use
   of 451 is
   most likely to apply in the case of cooperative, legal versions of
   content removal resulting from requests to providers.  One can think
   of content that is removed or blocked for legal reasons, like
   copyright infringement, gambling laws, child abuse, etc.  Large
   Internet companies and search engines are constantly asked to censor
   content in various jurisdictions.  451 allows this to be easily
   discovered -- for instance, by initiatives like the Lumen Database.

   Overall, the strength of 451 lies in its ability to provide
   transparency by giving the reason for blocking and giving the
   end user the ability to file a complaint.  It allows organizations to
   easily measure censorship in an automated way and prompts the user to
   access the content via another path (e.g., Tor, VPNs) when (s)he
   encounters the 451 status code.

   Status code 451 impacts human rights by making censorship more
   transparent and measurable.  It increases transparency by signaling
   the existence of censorship (instead of a much broader HTTP error
   message such as HTTP status code 404) as well as providing details of
   the legal restriction, which legal authority is imposing it, and to
   what class of resources it applies.  This empowers the user to seek
   redress.
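
   For concreteness, a minimal sketch of a server answering as
   [RFC7725] describes is shown below (the host, port, and blocking
   authority are placeholders).  The optional "blocked-by" link
   relation lets crawlers discover which entity implements the
   blockage:

      # Minimal HTTP server returning status 451 [RFC7725].
      from http.server import BaseHTTPRequestHandler, HTTPServer

      class LegalBlockHandler(BaseHTTPRequestHandler):
          def do_GET(self):
              body = b"Unavailable for legal reasons.\n"
              self.send_response(451)
              self.send_header("Link",
                               '<https://authority.example>; '
                               'rel="blocked-by"')
              self.send_header("Content-Type", "text/plain")
              self.send_header("Content-Length", str(len(body)))
              self.end_headers()
              self.wfile.write(body)

      HTTPServer(("127.0.0.1", 8080),
                 LegalBlockHandler).serve_forever()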

5.2.3.8. DDoS Attacks
   Many individuals, including IETF engineers, have argued that DDoS
   attacks are fundamentally against freedom of expression.
   Technically, DDoS attacks occur when one host or multiple hosts
   overload the bandwidth or resources of another host by flooding it
   with traffic or making resource-intensive requests, causing it to
   temporarily stop being available to users.  One can roughly
   differentiate three types of DDoS attacks:

   1.  volume-based attacks (which aim to make the host unreachable by
       using up all its bandwidth; often-used techniques are UDP floods
       and ICMP floods)

   2.  protocol attacks (which aim to use up actual server resources;
       often-used techniques are SYN floods, fragmented packet attacks,
       and "ping of death" [RFC4949])

   3.  application-layer attacks (which aim to bring down a server,
       such as a web server)

   DDoS attacks can thus stifle freedom of expression and complicate
   the ability of independent media and human rights organizations to
   exercise their right to (online) freedom of association, while
   facilitating the ability of governments to censor dissent.  When it
   comes to comparing DDoS attacks to protests in offline life, it is
   important to remember that only a limited number of DDoS attacks
   solely involve willing participants.  In the overwhelming majority
   of cases, the clients are hacked hosts of unrelated parties that
   have not consented to being part of a DDoS (for exceptions, see
   Operation Ababil [Ababil] or the Iranian Green Movement's DDoS
   campaign at election time [GreenMovement]).  In addition,
   DDoS attacks are increasingly used as an extortion tactic.

   All of these issues seem to suggest that the IETF should try to
   ensure that their protocols cannot be used for DDoS attacks; this is
   consistent with the long-standing IETF consensus that DDoS is an
   attack that protocols should mitigate to the extent they can [BCP72].
   Decreasing the number of vulnerabilities in protocols and (outside of
   the IETF) the number of bugs in the network stacks of routers or
   computers could address this issue.  The IETF can clearly play a role
   in bringing about some of these changes, but the IETF cannot be
   expected to take a positive stance on (specific) DDoS attacks or to
   create protocols that enable some attacks and inhibit others.  What
   the IETF can do is critically reflect on its role in the development
   of the Internet and how this impacts the ability of people to
   exercise their human rights, such as freedom of expression.
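
   As a purely illustrative example of the kind of mitigation such work
   points to (this specific mechanism is not prescribed by any of the
   documents above), a per-source token bucket can absorb bursts while
   dropping sustained floods from any single source:

      # Per-source token-bucket rate limiter: a common building block
      # for mitigating volume-based floods (rates are illustrative).
      import time

      class TokenBucket:
          def __init__(self, rate: float = 100.0, burst: float = 200.0):
              self.rate, self.burst = rate, burst  # pkts/s, bucket size
              self.tokens, self.last = burst, time.monotonic()

          def allow(self) -> bool:
              now = time.monotonic()
              self.tokens = min(self.burst, self.tokens
                                + (now - self.last) * self.rate)
              self.last = now
              if self.tokens >= 1.0:
                  self.tokens -= 1.0
                  return True
              return False                         # over rate: drop

      buckets = {}
      def admit(src_ip: str) -> bool:
          return buckets.setdefault(src_ip, TokenBucket()).allow()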


