Using the terminology of [RFC 6973], the DNS servers (recursive resolvers and authoritative servers) are enablers: "they facilitate communication between an initiator and a recipient without being directly in the communications path". As a result, they are often forgotten in risk analysis. But, to quote [RFC 6973] again, "Although [...] enablers may not generally be considered as attackers, they may all pose privacy threats (depending on the context) because they are able to observe, collect, process, and transfer privacy-relevant data". In [RFC 6973] parlance, enablers become observers when they start collecting data.
Many programs exist to collect and analyze DNS data at the servers -- from the "query log" of some programs like BIND to tcpdump and more sophisticated programs like PacketQ [packetq] and DNSmezzo [dnsmezzo]. The organization managing the DNS server can use this data itself, or it can be part of a surveillance program like PRISM [prism] and pass data to an outside observer.
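To illustrate how little effort it takes for an enabler to become an observer, the following minimal sketch groups logged queries by client address. The (client address, query name) tuple format and the function name `profile_clients` are assumptions made for illustration; they do not reflect the log format of BIND, PacketQ, or DNSmezzo.

```python
from collections import defaultdict

def profile_clients(log_entries):
    """Group queried names by client address.

    log_entries: iterable of (client_ip, qname) tuples (hypothetical format).
    Returns a dict mapping each client address to the sorted list of
    distinct names it asked for.
    """
    names_by_client = defaultdict(set)
    for client_ip, qname in log_entries:
        # Normalize case and the trailing dot so equivalent names merge.
        names_by_client[client_ip].add(qname.lower().rstrip("."))
    return {ip: sorted(names) for ip, names in names_by_client.items()}

# Made-up sample entries using documentation address ranges.
sample_log = [
    ("192.0.2.10", "www.example.com."),
    ("192.0.2.10", "mail.example.org."),
    ("198.51.100.7", "WWW.EXAMPLE.COM."),
    ("192.0.2.10", "www.example.com."),
]
profiles = profile_clients(sample_log)
```

Even this trivial aggregation yields a per-client list of visited names, which is why retention and redistribution of such logs matter.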
Sometimes this data is kept for a long time and/or distributed to third parties for research purposes [ditl] [day-at-root], security analysis, or surveillance tasks. These uses are sometimes under some sort of contract, with various limitations, for instance, on redistribution, given the sensitive nature of the data. Also, there are observation points in the network that gather DNS data and then make it accessible to third parties for research or security purposes ("passive DNS" [passive-dns]).
Recursive resolvers see all the traffic since there is typically no caching before them. To summarize: your recursive resolver knows a lot about you. The resolver of a large IAP, or a large public resolver, can collect data from many users.
Given all the above considerations, the choice of recursive resolver has direct privacy considerations for end users. Historically, end user devices have used the DHCP-provided local network recursive resolver. The choice by a user to join a particular network (e.g., by physically plugging in a cable or selecting a network in an OS dialogue) typically updates a number of system resources -- these can include IP addresses, the availability of IPv4/IPv6, DHCP server, and DNS resolver. These individual changes, including the change in DNS resolver, are not normally communicated directly to the user by the OS when the network is joined. The choice of network has historically determined the default system DNS resolver selection; the two are directly coupled in this model.
The vast majority of users do not change their default system DNS settings and so implicitly accept the network settings for the DNS. The network resolvers have therefore historically been the sole destination for all of the DNS queries from a device. These resolvers may have varied privacy policies depending on the network. Privacy policies for these servers may or may not be available, and users need to be aware that privacy guarantees will vary with the network.
All major OSes expose the system DNS settings and allow users to manually override them if desired.
More recently, some networks and users have actively chosen to use a large public resolver, e.g., Google Public DNS, Cloudflare, or Quad9. There can be many reasons: cost considerations for network operators, better reliability, or anti-censorship considerations are just a few. Such services typically do provide a privacy policy, and the user can get an idea of the data collected by such operators by reading one, e.g., "Google Public DNS - Your Privacy".
In general, as with many other protocols, issues around centralization also arise with DNS. The picture is fluid with several competing factors contributing, where these factors can also vary by geographic region. These include:
- ISP outsourcing, including to third-party and public resolvers
- regional market domination by one or only a few ISPs
- applications directing DNS traffic by default to a limited subset of resolvers (see Section 6.1.1.2)
An increased proportion of the global DNS resolution traffic being served by only a few entities means that the privacy considerations for users are highly dependent on the privacy policies and practices of those entities. Many of the issues around centralization are discussed in [centralisation-and-data-sovereignty].
While support for opportunistic DoT can be determined by probing a resolver on port 853, there is currently no standardized discovery mechanism for DoH and Strict DoT servers.
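The probing approach mentioned above can be sketched as follows. This is a minimal illustration, not a complete DoT client: the function name `probe_dot` and the timeout value are assumptions, and the check only establishes that a TLS handshake succeeds on the port, not that the peer actually answers DNS queries.

```python
import socket
import ssl

def probe_dot(host, port=853, timeout=3.0):
    """Return True if a TLS handshake succeeds on the given port."""
    # Opportunistic mode: the server certificate is deliberately not
    # validated, so this detects encryption support, not server identity.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock) as tls:
                return tls.version() is not None
    except OSError:
        # Covers refused connections, timeouts, and TLS handshake failures.
        return False
```

A client might call `probe_dot` with the address of its configured resolver and fall back to cleartext DNS when the result is False, which is exactly the behavior that makes opportunistic mode vulnerable to an active attacker who blocks the port.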
This means that clients that might want to dynamically discover such encrypted services, and where users are willing to trust such services, are not able to do so. At the time of writing, efforts to provide standardized signaling mechanisms to discover the services offered by local resolvers are in progress [DNSOP-RESOLVER]. Note that an increasing number of ISPs are deploying encrypted DNS; for example, see the Encrypted DNS Deployment Initiative [EDDI].
An increasing number of applications are offering application-specific encrypted DNS resolution settings, rather than defaulting to using only the system resolver. A variety of heuristics and resolvers are available in different applications, including hard-coded lists of recognized DoH/DoT servers.
Generally, users are not aware of application-specific DNS settings and may not have control over those settings. Users can only become aware of, and gain the ability to control, such settings if applications provide the following functions:
- communicate the change clearly to users when the default application resolver changes away from the system resolver
- provide configuration options to change the default application resolver, including a choice to always use the system resolver
- provide mechanisms for users to locally inspect, selectively forward, and filter queries (either via the application itself or use of the system resolver)
Application-specific changes to default destinations for users' DNS queries might increase or decrease user privacy; it is highly dependent on the network context and the application-specific default. This is an area of active debate, and the IETF is working on a number of issues related to application-specific DNS settings.
The previous section discussed DNS privacy, assuming that all the traffic was directed to the intended servers (i.e., those that would be used in the absence of an active attack) and that the potential attacker was purely passive. But, in reality, there can be active attackers in the network.
The Internet Threat model, as described in [RFC 3552], assumes that the attacker controls the network. Such an attacker can completely control any insecure DNS resolution, both passively monitoring the queries and responses and substituting their own responses. Even if encrypted DNS such as DoH or DoT is used, unless the client has been configured in a secure way with the server identity, an active attacker can impersonate the server. This implies that opportunistic modes of DoH/DoT as well as modes where the client learns of the DoH/DoT server via in-network mechanisms such as DHCP are vulnerable to attack. In addition, if the client is compromised, the attacker can replace the DNS configuration with one of its own choosing.
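The difference between authenticated and opportunistic use of an encrypted transport can be sketched with two TLS client configurations. This is an illustrative sketch only; the function name `make_dot_context` is an assumption, and the strict configuration protects against impersonation only if the server name later passed to the TLS layer was obtained securely (for example, from local configuration rather than from DHCP).

```python
import ssl

def make_dot_context(strict):
    """Build a TLS context for a DoT client.

    strict=True  : certificate chain and server name must validate.
    strict=False : opportunistic; any certificate is accepted.
    """
    context = ssl.create_default_context()
    if not strict:
        # check_hostname must be disabled before verify_mode is relaxed.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context

strict_ctx = make_dot_context(strict=True)
opportunistic_ctx = make_dot_context(strict=False)
```

With the opportunistic context, an on-path attacker presenting any certificate completes the handshake; with the strict context, the handshake fails unless the certificate matches the configured name.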
User privacy can also be at risk if there is blocking of access to remote recursive servers that offer encrypted transports, e.g., when the local resolver does not offer encryption and/or has very poor privacy policies. For example, active blocking of port 853 for DoT or blocking of specific IP addresses could restrict the resolvers available to the user. The extent of the risk to user privacy is highly dependent on the specific network and user context; a user on a network that is known to perform surveillance would be compromised if they could not access such services, whereas a user on a trusted network might have no privacy motivation to do so.
As a matter of policy, some recursive resolvers use their position in the query path to selectively block access to certain DNS records. This is a form of rendezvous-based blocking as described in Section 4.3 of RFC 7754. Such blocklists often include servers known to be used for malware, bots, or other security risks. In order to prevent circumvention of their blocking policies, some networks also block access to resolvers with incompatible policies.
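As a rough illustration of resolver-side blocking, the sketch below checks a query name and each of its parent domains against a blocklist. The function name `is_blocked` and the blocklist entries are hypothetical; real resolvers use their own policy formats and matching rules.

```python
def is_blocked(qname, blocklist):
    """Return True if qname or any parent domain is on the blocklist."""
    labels = qname.lower().rstrip(".").split(".")
    # Check the full name, then each parent: a.b.example -> b.example -> example
    for i in range(len(labels)):
        if ".".join(labels[i:]) in blocklist:
            return True
    return False

# Hypothetical entries under reserved example names.
blocklist = {"malware.example", "botnet-c2.example.net"}
```

Matching is done on whole labels, so `cdn.malware.example` is blocked while `notmalware.example` is not.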
It is also noted that attacks on remote resolver services, e.g., DDoS, could force users to switch to other services that do not offer encrypted transports for DNS.
Use of encrypted transports does not reduce the data available in the recursive resolver and ironically can actually expose more information about users to operators. As described in Section 5.2, use of session-based encrypted transports (TCP/TLS) can expose correlation data about users.
DoH inherits the full privacy properties of the HTTPS stack and as a consequence introduces new privacy considerations when compared with DNS over UDP, TCP, or TLS [RFC 7858]. Section 8.2 of RFC 8484 describes the privacy considerations in the server of the DoH protocol.
A brief summary of some of the issues includes the following:
- HTTPS presents new considerations for correlation, such as explicit HTTP cookies and implicit fingerprinting of the unique set and ordering of HTTP request header fields.
- The User-Agent and Accept-Language request header fields often convey specific information about the client version or locale.
- Utilizing the full set of HTTP features enables DoH to be more than an HTTP tunnel, but it is at the cost of opening up implementations to the full set of privacy considerations of HTTP.
- Implementations are advised to expose the minimal set of data needed to achieve the desired feature set.
[RFC 8484] specifically makes selection of HTTPS functionality vs. privacy an implementation choice. At the extremes, there may be implementations that attempt to achieve parity with DoT from a privacy perspective at the cost of using no identifiable HTTP headers, and there might be others that provide feature-rich data flows where the low-level origin of the DNS query is easily identifiable. Some implementations have, in fact, chosen to restrict the use of the User-Agent header so that resolver operators cannot identify the specific application that is originating the DNS queries.
Privacy-focused users should be aware of the potential for additional client identifiers in DoH compared to DoT and may want to only use DoH client implementations that provide clear guidance on what identifiers they add.
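To make the notion of a minimal identifier set concrete, the following sketch builds (but does not send) a DoH GET request that explicitly sets only the Accept header. The helper names `build_query` and `build_doh_request` are assumptions, and the server URL is a placeholder, not a real service. Note that the HTTP library may still add its own headers, such as a default User-Agent, when the request is actually sent; a privacy-conscious client would need to control those as well.

```python
import base64
import struct
import urllib.request

def build_query(qname, qtype=1):
    """Encode a single-question DNS query in wire format."""
    # ID 0, flags 0x0100 (recursion desired), one question, no other records.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    question = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE as given (1 = A), QCLASS 1 = IN.
    return header + question + struct.pack("!HH", qtype, 1)

def build_doh_request(server_url, qname):
    """Build (but do not send) a DoH GET request with a minimal header set."""
    encoded = base64.urlsafe_b64encode(build_query(qname))
    # The dns parameter is base64url without padding.
    dns_param = encoded.rstrip(b"=").decode("ascii")
    request = urllib.request.Request(f"{server_url}?dns={dns_param}")
    # Only Accept is set explicitly; no cookies or Accept-Language.
    request.add_header("Accept", "application/dns-message")
    return request

# Placeholder server URL for illustration only.
req = build_doh_request("https://dnsserver.example.net/dns-query", "www.example.com")
```

Using a fixed DNS message ID of 0 keeps identical queries byte-for-byte identical, which avoids adding one more per-request value to the URL.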
Unlike what happens for recursive resolvers, the observation capabilities of authoritative name servers are limited by caching; they see only the requests for which the answer was not in the cache. For aggregated statistics ("What is the percentage of LOC queries?"), this is sufficient, but it prevents an observer from seeing everything. Similarly, the increasing deployment of QNAME minimization [ripe-qname-measurements] reduces the data visible at the authoritative name server. Still, the authoritative name servers see a part of the traffic, and this subset may be sufficient to violate some privacy expectations.
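The effect of QNAME minimization on what each authoritative server sees can be sketched as below. This is a deliberately simplified model: the function name `names_sent` is an assumption, and the sketch assumes a zone cut at every label, which is not true in general.

```python
def names_sent(qname, minimize):
    """List the query name shown to each server during an iterative lookup.

    Simplification: assumes a zone cut at every label, so one server is
    consulted per level (root, TLD, second level, ...).
    """
    full_name = qname.rstrip(".")
    labels = full_name.split(".")
    if not minimize:
        # Classic behaviour: every server sees the full name.
        return [full_name] * len(labels)
    # Minimized: each server sees one more label than the previous one.
    return [".".join(labels[-n:]) for n in range(1, len(labels) + 1)]
```

For `www.example.com`, the minimized list is `com`, `example.com`, `www.example.com`, so under this model the root and TLD servers no longer learn the full name.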
Also, the user often has some legal/contractual link with the recursive resolver (they have chosen the IAP, or they have chosen to use a given public resolver) while having no control and perhaps no awareness of the role of the authoritative name servers and their observation abilities.
As noted before, using a local resolver or a resolver close to the machine decreases the attack surface for an on-the-wire eavesdropper. But it may decrease privacy against an observer located on an authoritative name server. This authoritative name server will see the IP address of the end client instead of the address of a big recursive resolver shared by many users.
This "protection", when using a large resolver with many clients, is no longer present if ECS [RFC 7871] is used because, in this case, the authoritative name server sees the original IP address (or prefix, depending on the setup).
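How much of the client address reaches the authoritative server depends on the prefix length the resolver chooses. The sketch below shows the truncation step only; the function name `ecs_prefix` is an assumption, and the default lengths (24 bits for IPv4, 56 bits for IPv6) follow the maximum source prefix lengths recommended in RFC 7871.

```python
import ipaddress

def ecs_prefix(client_ip, v4_bits=24, v6_bits=56):
    """Return the truncated network an ECS option would carry."""
    address = ipaddress.ip_address(client_ip)
    bits = v4_bits if address.version == 4 else v6_bits
    # strict=False zeroes the host bits instead of raising an error.
    return ipaddress.ip_network(f"{client_ip}/{bits}", strict=False)
```

With the defaults, `192.0.2.57` is reported as `192.0.2.0/24`; a setup that sends a full-length prefix (`v4_bits=32`) reveals the exact client address.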
As of today, all the instances of one root name server, L-root, together receive around 50,000 queries per second. While most of this traffic is "junk" (errors in the Top-Level Domain (TLD) name), it gives an idea of the volume of data that pours into name servers. (And even "junk" can leak information; for instance, if there is a typing error in the TLD, the user will send data to a TLD that is not the usual one.)
Many domains, including TLDs, are partially hosted by third-party servers, sometimes in a different country. The contracts between the domain manager and these servers may or may not take privacy into account. Whatever the contract, the third-party hoster may or may not be honest; in any case, it will have to follow its local laws. For example, requests to a given ccTLD may go to servers managed by organizations outside of the ccTLD's country. Users may not anticipate that when doing a security analysis.
Also, it seems (see the survey described in [aeris-dns]) that there is a strong concentration of authoritative name servers among "popular" domains (such as the Alexa Top N list). For instance, among the Alexa Top 100K, one DNS provider hosts 10% of the domains today. The ten most important DNS providers together host one-third of all domains. With the control (or the ability to sniff the traffic) of a few name servers, you can gather a lot of information.