In the beginning, the network layer protocol (i.e., IP) had the following four "classic" invariants:
-
Non-mutable: The address sent is the address received.
-
Non-mobile: The address doesn't change during the course of an "association".
-
Reversible: A return header can always be formed by reversing the source and destination addresses.
-
Omniscient: Each host knows what address a partner host can use to send packets to it.
Actually, the fourth can be inferred from the first and the third, but it is worth mentioning explicitly for reasons that will be obvious soon if not already.
In the current "post-classic" world, we are intentionally trying to get rid of the second invariant (both for mobility and for multihoming), and we have been forced to give up the first and the fourth. [RFC 3102] is an attempt to reinstate the fourth invariant without the first invariant. IPv6 attempts to reinstate the first invariant.
Few client-side systems on the Internet have DNS names that are meaningful. That is, if they have a Fully Qualified Domain Name (FQDN), that name typically belongs to a NAT device or a dial-up server, and does not really identify the system itself but its current connectivity. FQDNs (and their extensions as email names) are application-layer names; more frequently naming services than particular systems. This is why many systems on the Internet are not registered in the DNS; they do not have services of interest to other Internet hosts.
DNS names are references to IP addresses. This only demonstrates the interrelationship of the networking and application layers. DNS, as the Internet's only deployed and distributed database, is also the repository of other namespaces, due in part to DNSSEC and application-specific key records. Although each namespace can be stretched (IP with v6, DNS with KEY records), neither can adequately provide for host authentication or act as a separation between internetworking and transport layers.
The Host Identity (HI) namespace fills an important gap between the IP and DNS namespaces. An interesting thing about the HI is that it actually allows a host to give up all but the 3rd network-layer invariant. That is to say, as long as the source and destination addresses in the network-layer protocol are reversible, HIP takes care of host identification, and reversibility allows a local host to receive a packet back from a remote host. The address changes occurring during NAT transit (non-mutable) or host movement (non-omniscient or non-mobile) can be managed by the HIP layer.
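To illustrate why reversibility alone is sufficient, the following minimal sketch models an identity layer that keys its peers by HIT rather than by address. The class and field names (`Packet`, `HipLayer`, `locators`) and the placeholder HIT strings are hypothetical and chosen only for this illustration; a real HIP implementation additionally authenticates the peer and verifies a new address before using it, which is omitted here.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    src_addr: str  # network-layer source address (may be rewritten in transit)
    dst_addr: str  # network-layer destination address
    src_hit: str   # sender's identity tag, carried above the network layer
    dst_hit: str   # receiver's identity tag


@dataclass
class HipLayer:
    """Toy identity layer: peers are identified by HIT, not by IP address."""
    local_hit: str
    locators: dict = field(default_factory=dict)  # peer HIT -> last seen address

    def receive(self, pkt: Packet) -> None:
        # The peer is recognized by its HIT; the address is merely recorded.
        # (Authentication and address verification are omitted in this sketch.)
        self.locators[pkt.src_hit] = pkt.src_addr

    def reply_to(self, pkt: Packet) -> Packet:
        # Only reversibility is relied upon: swap the addresses seen on the wire.
        return Packet(src_addr=pkt.dst_addr, dst_addr=pkt.src_addr,
                      src_hit=self.local_hit, dst_hit=pkt.src_hit)


server = HipLayer(local_hit="hit-server")
# The same peer is seen first behind a NAT and later from a new network.
via_nat = Packet("203.0.113.7", "198.51.100.1", "hit-client", "hit-server")
after_move = Packet("192.0.2.44", "198.51.100.1", "hit-client", "hit-server")
for pkt in (via_nat, after_move):
    server.receive(pkt)
```

In this sketch, the upper layers would keep referring to the peer by its HIT across both events, while the reply path is always derived from the most recently received header.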
With the exception of high-performance computing applications, the sockets API is the most common way to develop network applications. Applications use the sockets API either directly or indirectly through some libraries or frameworks. However, the sockets API is based on the assumption of static IP addresses, and DNS with its lifetime values was invented at later stages during the evolution of the Internet. Hence, the sockets API does not deal with the lifetime of addresses [RFC 6250]. As the majority of end-user equipment is mobile today, its addresses are effectively ephemeral, but the sockets API still gives a fallacious illusion of persistent IP addresses to the unwary developer. HIP can be used to solidify this illusion because HIP provides persistent, surrogate addresses to the application layer in the form of LSIs and HITs.
The persistent identifiers as provided by HIP are useful in multiple scenarios (see, e.g., [ylitalo-diss] or [komu-diss] for a more elaborate discussion):
-
When a mobile host moves physically between two different WLAN networks and obtains a new address, an application using the identifiers remains isolated regardless of the topology changes while the underlying HIP layer reestablishes connectivity (i.e., a horizontal handoff).
-
Similarly, the application utilizing the identifiers remains again unaware of the topological changes when the underlying host equipped with WLAN and cellular network interfaces switches between the two different access technologies (i.e., a vertical handoff).
-
Even when hosts are located in private address realms, applications can uniquely distinguish different hosts from each other based on their identifiers. In other words, it can be stated that HIP improves Internet transparency for the application layer [komu-diss].
-
Site renumbering events for services can occur due to corporate mergers or acquisitions, or due to a change of Internet service provider. They can involve changing the entire network prefix of an organization, which is problematic due to hard-coded addresses in service configuration files or cached IP addresses at the client side [RFC 5887]. Because such practices invite human error, a site employing location-independent identifiers as promoted by HIP may experience fewer problems while renumbering its network.
-
More agile IPv6 interoperability can be achieved, as discussed in Section 4.4. IPv6-based applications can communicate using HITs with IPv4-based applications that are using LSIs. Additionally, the underlying network type (IPv4 or IPv6) becomes independent of the addressing family of the application.
-
HITs (or LSIs) can be used in IP-based access control lists as a more secure replacement for IPv6 addresses. Besides security, HIT-based access control has two other benefits. First, the use of HITs can potentially halve the size of access control lists because separate rules for IPv4 are not needed [komu-diss]. Second, HIT-based configuration rules in HIP-aware middleboxes remain static and independent of topology changes, thus simplifying administrative efforts particularly for mobile environments. For instance, the benefits of HIT-based access control have been harnessed in the case of HIP-aware firewalls, but can be utilized directly at the end-hosts as well [RFC 6538].
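The access control scenario in the last item above can be sketched as follows. This is a minimal illustration, assuming that HITs are IPv6-formatted values drawn from the ORCHIDv2 prefix 2001:20::/28 [RFC 7343]; the `HitAcl` class and the example HIT values are hypothetical and do not correspond to any particular HIP implementation.

```python
import ipaddress

# ORCHIDv2 prefix from which HITs are allocated (RFC 7343).
HIT_PREFIX = ipaddress.ip_network("2001:20::/28")


def is_hit(addr: str) -> bool:
    """Return True if addr is an IPv6-formatted value inside the HIT prefix."""
    ip = ipaddress.ip_address(addr)
    return ip.version == 6 and ip in HIT_PREFIX


class HitAcl:
    """Allow-list keyed by peer HIT; the peer's current locator is irrelevant."""

    def __init__(self, allowed_hits):
        self._allowed = {ipaddress.ip_address(h) for h in allowed_hits}

    def permits(self, peer_hit: str) -> bool:
        # A single rule covers the peer whether it is reached over IPv4 or IPv6.
        return is_hit(peer_hit) and ipaddress.ip_address(peer_hit) in self._allowed


# Example HIT values below are made up for illustration.
acl = HitAcl(["2001:21:1234::1"])
```

Because the rule refers to the identity rather than to a locator, it would not need to change when the peer moves or renumbers.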
While some of these benefits could be and have been redundantly implemented by individual applications, providing such generic functionality at the lower layers is useful because it reduces software development effort and networking software bugs (as the layer is tested with multiple applications). It also allows the developer to focus on building the application itself rather than delving into the intricacies of mobile networking, thus facilitating separation of concerns.
HIP could also be realized by combining a number of different protocols, but the complexity of the resulting software may become substantially larger, and the interaction between multiple, possibly layered protocols may have adverse effects on latency and throughput. It is also worth noting that virtually nothing prevents realizing the HIP architecture, for instance, as an application-layer library, which has actually been implemented in the past [xin-hip-lib]. However, the trade-off in moving the HIP layer to the application layer is that legacy applications may not be supported.
In computer science, many problems can be solved with an extra layer of indirection. However, the indirection always involves some costs as there is no such thing as a "free lunch". In the case of HIP, the main costs could be stated as follows:
-
In general, an additional layer and a namespace always involve some initial effort in terms of implementation, deployment, and maintenance. Some education of developers and administrators may also be needed. However, the HIP community at the IETF has spent years in experimenting, exploring, testing, documenting, and implementing HIP to ease the adoption costs.
-
HIP introduces a need to manage HIs and requires a centralized approach to manage HIP-aware endpoints at scale. What were formerly IP address-based ACLs are now trusted HITs, and the HIT-to-IP address mappings as well as access policies must be managed. HIP-aware endpoints must also be able to operate autonomously to ensure mobility and availability (an endpoint must be able to run without having to have a persistent management connection). The users who want this better security and mobility of HIs instead of IP address-based ACLs have to then manage this additional 'identity layer' in a nonpersistent fashion. As exemplified in Appendix A.3.5, these challenges have been already solved in an infrastructure setting to distribute policy and manage the mappings and trust relationships between HIP-aware endpoints.
-
HIP decouples identifier and locator roles of IP addresses. Consequently, a mapping mechanism is needed to associate them together. A failure to map a HIT to its corresponding locator may result in failed connectivity because a HIT is "flat" by its nature and cannot be looked up from the hierarchically organized DNS. HITs are flat by design due to a security trade-off. The more bits that are allocated for the hash in the HIT, the less likely there will be (malicious) collisions.
-
From a performance viewpoint, HIP control and data plane processing introduces some overhead in terms of throughput and latency as elaborated below.
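The relationship between hash length and collision likelihood mentioned in the list above can be made concrete with the standard birthday-bound approximation. The sketch below is only an illustration: the function name is hypothetical, the 96-bit figure assumes the ORCHIDv2 layout (128 bits minus a 28-bit prefix and a 4-bit suite identifier), and the formula models accidental collisions among randomly distributed hashes rather than the work required of a deliberate attacker.

```python
import math


def collision_probability(num_hosts: int, hash_bits: int) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / 2^(b+1))."""
    exponent = -num_hosts * (num_hosts - 1) / 2 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


# Compare a hypothetical 64-bit hash with a 96-bit hash for a billion hosts.
for bits in (64, 96):
    print(bits, collision_probability(10**9, bits))
```

Under these assumptions, each additional hash bit roughly halves the probability of an accidental collision for a fixed number of hosts.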
Related to deployment drawbacks, firewalls are commonly used to control access to various services and devices in the current Internet. Since HIP introduces an additional namespace, it is expected that the HIP namespace would be filtered for unwanted connectivity also. While this can be achieved with existing tools directly in the end-hosts, filtering at the middleboxes requires modifications to existing firewall software or additional middleboxes [
RFC 6538].
The key exchange introduces some extra latency (two round trips) in the initial transport-layer connection establishment between two hosts. With TCP, additional delay occurs if the underlying network stack implementation drops the triggering SYN packet during the key exchange. The same cost may also occur during HIP handoff procedures. However, subsequent TCP sessions using the same HIP association will not bear this cost (within the key lifetime). Both the key exchange and handoff penalties can be minimized by caching TCP packets. The latter case can further be optimized with TCP user timeout extensions [RFC 5482] as described in further detail by Schütz et al. [schuetz-intermittent].
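As a rough back-of-the-envelope illustration of the latency cost, the sketch below estimates the time until a TCP connection is established with and without an existing HIP association. It is a simplification under stated assumptions: the function name is hypothetical, the TCP handshake is counted as one round trip, and cryptographic processing time and any SYN retransmission delay are ignored.

```python
HIP_KEY_EXCHANGE_RTTS = 2  # extra round trips for the key exchange, per the text
TCP_HANDSHAKE_RTTS = 1     # SYN / SYN-ACK before the connection is established


def connection_setup_delay(rtt_s: float, hip_association_exists: bool) -> float:
    """Estimate connection setup time in seconds, ignoring processing costs."""
    rtts = TCP_HANDSHAKE_RTTS
    if not hip_association_exists:
        # Only the first connection between two hosts pays for the key exchange.
        rtts += HIP_KEY_EXCHANGE_RTTS
    return rtts * rtt_s


first = connection_setup_delay(0.050, hip_association_exists=False)
later = connection_setup_delay(0.050, hip_association_exists=True)
```

With a 50 ms round-trip time, this model yields roughly 150 ms for the first connection and 50 ms for later connections that reuse the association.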
The most CPU-intensive operations involve the use of the asymmetric keys and Diffie-Hellman key derivation at the control plane, but this occurs only during the key exchange, its maintenance (handoffs and refreshing of key material), and teardown procedures of HIP associations. The data plane is typically implemented with ESP because it has a smaller overhead due to symmetric key encryption. Naturally, even ESP involves some overhead in terms of latency (processing costs) and throughput (tunneling) (see, e.g., [
ylitalo-diss] for a performance evaluation).
This section describes some deployment and adoption considerations related to HIP from a technical perspective.
HIP has been adapted and deployed in an industrial control network in a production factory, in which HIP's strong network-layer identity supports the secure coexistence of the control network with many untrusted network devices operated by third-party vendors [
paine-hip]. Similarly, HIP has also been included in a security product to support Layer 2 VPNs [
henderson-vpls] to enable security zones in a supervisory control and data acquisition (SCADA) network. However, HIP has not been a "wild success" [
RFC 5218] in the Internet as argued by
Levä et al. [
levae-barriers]. Here, we briefly highlight some of their findings based on interviews with 19 experts from the industry and academia.
From a marketing perspective, the demand for HIP has been low and substitute technologies have been favored. Another identified reason has been that some technical misconceptions related to the early stages of HIP specifications still persist. Two identified misconceptions are that HIP does not support NAT traversal and that HIP must be implemented in the OS kernel. Both of these claims are untrue; HIP does have NAT traversal extensions [
RFC 9028], and kernel modifications can be avoided with modern operating systems by diverting packets for userspace processing.
The analysis by
Levä et al. clarifies infrastructural requirements for HIP. In a minimal setup, a client and server machine have to run HIP software. However, to avoid manual configurations, usually DNS records for HIP are set up. For instance, the popular DNS server software Bind9 does not require any changes to accommodate DNS records for HIP because they can be supported in binary format in its configuration files [
RFC 6538]. HIP rendezvous servers and firewalls are optional. No changes are required to network address points, NATs, edge routers, or core networks. HIP may require holes in legacy firewalls.
The analysis also clarifies the requirements for the host components that consist of three parts. First, a HIP control plane component is required, typically implemented as a userspace daemon. Second, a data plane component is needed. Most HIP implementations utilize the so-called Bound End-to-End Tunnel (BEET) mode of ESP that has been available since Linux kernel 2.6.27, but the BEET mode is also included as a userspace component in a few of the implementations. Third, HIP systems usually provide a DNS proxy for the local host that translates HIP DNS records to LSIs and HITs, and communicates the corresponding locators to the HIP userspace daemon. While the third component is not mandatory, it is very useful for avoiding manual configurations. The three components are further described in [RFC 6538].
Based on the interviews, Levä et al. suggest further directions to facilitate HIP deployment. Transitioning a number of HIP specifications to the Standards Track in the IETF has already taken place, but the authors suggest additional measures. As a more radical measure, the authors suggest implementing HIP as a purely application-layer library [xin-hip-lib] or another kind of middleware. On the other hand, more conservative measures include focusing on private deployments controlled by a single stakeholder. As a more concrete example of such a scenario, HIP could be used by a single service provider to facilitate secure connectivity between its servers [komu-cloud].
The IEEE 802 standards have been defining MAC-layer security. Many of these standards use Extensible Authentication Protocol (EAP) [
RFC 3748] as a Key Management System (KMS) transport, but some like IEEE 802.15.4 [
IEEE.802.15.4] leave the KMS and its transport as "out of scope".
HIP is well suited as a KMS in these environments:
-
HIP is independent of IP addressing and can be directly transported over any network protocol.
-
Master keys in 802 protocols are commonly pair-based with group keys transported from the group controller using pairwise keys.
-
Ad hoc 802 networks can be better served by a peer-to-peer KMS than the EAP client/server model.
-
Some devices are very memory constrained, and a common KMS for both MAC and IP security represents a considerable code savings.
HIP requires a certain amount of computational resources from a device due to cryptographic processing. HIP scales down to phones and small system-on-chip devices (such as Raspberry Pis and Intel Edison), but small sensors operating with small batteries have remained problematic. Different extensions to HIP have been developed to scale HIP down to smaller devices, typically with different security trade-offs. For example, non-cryptographic identifiers have been proposed in RFID scenarios. The Slimfit approach [hummen] proposes a compression layer for HIP to make it more suitable for constrained networks. The approach is applied to a lightweight version of HIP (i.e., "Diet HIP") in order to scale down to small sensors.
The HIP Diet EXchange (DEX) [
hip-dex] design aims to reduce the overhead of the employed cryptographic primitives by omitting public-key signatures and hash functions. In doing so, the main goal is to still deliver security properties similar to the Base Exchange (BEX).
DEX is primarily designed for computation- or memory-constrained sensor/actuator devices. Like BEX, it is expected to be used together with a suitable security protocol such as the ESP for the protection of upper-layer protocol data. In addition, DEX can also be used as a keying mechanism for security primitives at the MAC layer, e.g., for IEEE 802.15.9 networks [
IEEE.802.15.9].
The main differences between HIP BEX and DEX are:
-
Minimum collection of cryptographic primitives to reduce the protocol overhead.
-
Static Elliptic Curve Diffie-Hellman (ECDH) key pairs for peer authentication and encryption of the session key.
-
AES-CTR for symmetric encryption and AES-CMAC for the MACing function.
-
A simple fold function for HIT generation.
-
Forfeit of perfect forward secrecy with the dropping of an ephemeral Diffie-Hellman key agreement.
-
Forfeit of digital signatures with the removal of a hash function. Reliance on the ECDH-derived key used in HIP_MAC to prove ownership of the private key.
-
Diffie-Hellman derived key ONLY used to protect the HIP packets. A separate secret exchange within the HIP packets creates the session key(s).
-
Optional retransmission strategy tailored to handle the potentially extensive processing time of the employed cryptographic operations on computationally constrained devices.
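To give an intuition for what a "simple fold function" could look like, the sketch below XOR-folds an arbitrary byte string into a fixed number of bits. This is an illustration only: the function name, the chunking order, and the zero-padding of the last chunk are assumptions made here, and the exact encoding used for HIT generation is defined by the DEX specification [hip-dex], not by this example.

```python
def xor_fold(data: bytes, out_bits: int = 96) -> int:
    """Fold data into out_bits bits by XORing consecutive fixed-size chunks."""
    if out_bits <= 0 or out_bits % 8 != 0:
        raise ValueError("out_bits must be a positive multiple of 8")
    chunk_len = out_bits // 8
    folded = 0
    for start in range(0, len(data), chunk_len):
        # Assumption of this sketch: a short final chunk is right-padded with zeros.
        chunk = data[start:start + chunk_len].ljust(chunk_len, b"\x00")
        folded ^= int.from_bytes(chunk, "big")
    return folded


# Hypothetical stand-in for a host identity (not a real public key).
example_identity = bytes(range(64))
folded_value = xor_fold(example_identity)
```

Unlike a cryptographic hash, such a fold is not collision resistant, which is consistent with the security trade-offs that the list above attributes to DEX.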
[RFC 6538] enumerates a number of client and server applications that have been trialed with HIP. Based on the report, this section highlights and complements some potential ways in which HIP could be exploited in existing infrastructure such as routers, gateways, and proxies.
HIP has been successfully used with forward web proxies (i.e., client-side proxies). HIP was used between a client host (web browser) and a forward proxy (Apache server) that terminated the HIP/ESP tunnel. The forward web proxy translated HIP-based traffic originating from the client into non-HIP traffic towards any web server in the Internet. Consequently, the HIP-capable client could communicate with HIP-incapable web servers. This way, the client could utilize mobility support as provided by HIP while using the fixed IP address of the web proxy, for instance, to access services that were allowed only from the IP address range of the proxy.
HIP with reverse web proxies (i.e., server-side proxies) has also been investigated, as described in more detail in [
komu-cloud]. In this scenario, a HIP-incapable client accessed a HIP-capable web service via an intermediary load balancer (a web-based load balancer implementation called HAProxy). The load balancer translated non-HIP traffic originating from the client into HIP-based traffic for the web service (consisting of front-end and back-end servers). Both the load balancer and the web service were located in a data center. One of the key benefits for encrypting the web traffic with HIP in this scenario was supporting a private-public cloud scenario (i.e., hybrid cloud) where the load balancer, front-end servers, and back-end servers were located in different data centers, and thus the traffic needed to be protected when it passed through potentially insecure networks between the borders of the private and public clouds.
While HIP could be used to secure access to intermediary devices (e.g., access to switches with legacy telnet), it has also been used to secure intermittent connectivity between middlebox infrastructure. For instance, earlier research [
komu-mitigation] utilized HIP between Simple Mail Transport Protocol (SMTP) servers in order to exploit the computational puzzles of HIP as a spam mitigation mechanism. A rather obvious practical challenge in this approach was the lack of HIP adoption on existing SMTP servers.
To avoid deployment hurdles with existing infrastructure, HIP could be applied in the context of new protocols with little deployment. Namely, HIP has been studied in the context of a new protocol, peer-to-peer SIP [
camarillo-p2psip]. The work has resulted in a number of related RFCs [
RFC 6078], [
RFC 6079], and [
RFC 7086]. The key idea in the research work was to avoid redundant, time-consuming ICE procedures by grouping different connections (i.e., SIP and media streams) together using the low-layer HIP, which executes NAT traversal procedures only once per host. An interesting aspect in the approach was the use of P2P-SIP infrastructure as rendezvous servers for the HIP control plane instead of utilizing the traditional HIP rendezvous services [
RFC 8004].
Researchers have proposed using HIP in cellular networks as a mobility, multihoming, and security solution. [
hip-lte] provides a security analysis and simulation measurements of using HIP in Long Term Evolution (LTE) backhaul networks.
HIP has been studied for securing cloud-internal connectivity, first with virtual machines [komu-cloud] and then between Linux containers [ranjbar-synaptic]. In both cases, HIP was suggested as a solution to NAT traversal that could be utilized both internally by a cloud network and between multi-cloud deployments. Specifically in the former case, HIP was beneficial for sustaining connectivity with a virtual machine while it migrated to a new location. In the latter case, a Software-Defined Networking (SDN) controller acted as a rendezvous server for HIP-capable containers. The controller enforced strong replay protection by adding middlebox nonces [heer-end-host] to the passing HIP base exchange and UPDATE messages.
Tempered Networks provides HIP-based products. They refer to their platform as [
tempered-networks] because of HIP's identity-first networking architecture. Their objective has been to make it simple and nondisruptive to deploy HIP-enabled services widely in production environments with the purpose of enabling transparent device authentication and authorization, cloaking, segmentation, and end-to-end networking. The goal is to eliminate much of the circular dependencies, exploits, and layered complexity of traditional "address-defined networking" that prevents mobility and verifiable device access control. The products in the portfolio of Tempered Networks that utilize HIP are as follows:
-
HIP Switches / Gateways
-
These are physical or virtual appliances that serve as the HIP gateway and policy enforcement point for non-HIP-aware applications and devices located behind it. No IP or infrastructure changes are required in order to connect, cloak, and protect the non-HIP-aware devices. Currently known supported platforms for HIP gateways are x86 and ARM chipsets, ESXi, Hyper-V, KVM, AWS, Azure, and Google clouds.
-
HIP Relays / Rendezvous
-
These are physical or virtual appliances that serve as identity-based routers authorizing and bridging HIP endpoints without decrypting the HIP session. A HIP relay can be deployed as a standalone appliance or in a cluster for horizontal scaling. All HIP-aware endpoints and the devices they're connecting and protecting can remain privately addressed. The appliances eliminate IP conflicts, tunnel through NAT and carrier-grade NAT, and require no changes to the underlying infrastructure. The only requirement is that a HIP endpoint should have outbound access to the Internet and that a HIP Relay should have a public address.
-
HIP-Aware Clients and Servers
-
This is software that is installed in the host's network stack and enforces policy for that host. HIP clients support split tunneling. Both the HIP client and HIP server can interface with the local host firewall, and the HIP server can be locked down to listen only on the port used for HIP, making the server invisible to unauthorized devices. Currently known supported platforms are Windows, OS X, iOS, Android, Ubuntu, CentOS, and other Linux derivatives.
-
Policy Orchestration Managers
-
These physical or virtual appliances serve as the engine to define and distribute network and security policy (HI and IP mappings, overlay networks, and whitelist policies, etc.) to HIP-aware endpoints. Orchestration does not need to persist to the HIP endpoints and vice versa, allowing for autonomous host networking and security.
The IRTF Name Space Research Group has posed a number of evaluating questions in [nsrg-report]. In this section, we provide answers to these questions.
-
How would a stack name improve the overall functionality of the Internet?
HIP decouples the internetworking layer from the transport layer, allowing each to evolve separately. The decoupling makes end-host mobility and multihoming easier, also across IPv4 and IPv6 networks. HIs make network renumbering easier, and they also make process migration and clustered servers easier to implement. Furthermore, being cryptographic in nature, they provide the basis for solving the security problems related to end-host mobility and multihoming.
-
What does a stack name look like?
A HI is a cryptographic public key. However, instead of using the keys directly, most protocols use a fixed-size hash of the public key.
-
What is its lifetime?
HIP provides both stable and temporary Host Identifiers. Stable HIs are typically long-lived, with a lifetime of years or more. The lifetime of temporary HIs depends on how long the upper-layer connections and applications need them, and can range from a few seconds to years.
-
Where does it live in the stack?
The HIs live between the transport and internetworking layers.
-
How is it used on the endpoints?
The Host Identifiers may be used directly or indirectly (in the form of HITs or LSIs) by applications when they access network services. Additionally, the Host Identifiers, as public keys, are used in the built-in key agreement protocol, called the HIP base exchange, to authenticate the hosts to each other.
-
What administrative infrastructure is needed to support it?
In some environments, it is possible to use HIP opportunistically, without any infrastructure. However, to gain full benefit from HIP, the HIs must be stored in the DNS or a PKI, and the rendezvous mechanism is needed [RFC 8005].
-
If we add an additional layer, would it make the address list in SCTP unnecessary?
Yes.
-
What additional security benefits would a new naming scheme offer?
HIP reduces dependency on IP addresses, making the so-called address ownership [Nik2001] problems easier to solve. In practice, HIP provides security for end-host mobility and multihoming. Furthermore, since HIP Host Identifiers are public keys, standard public key certificate infrastructures can be applied on the top of HIP.
-
What would the resolution mechanisms be, or what characteristics of a resolution mechanism would be required?
For most purposes, an approach where DNS names are resolved simultaneously to HIs and IP addresses is sufficient. However, if it becomes necessary to resolve HIs into IP addresses or back to DNS names, a flat resolution infrastructure is needed. Such an infrastructure could be based on the ideas of Distributed Hash Tables, but would require significant new development and deployment.
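Regarding the answer to what a stack name looks like, the idea of using a fixed-size hash of the public key can be sketched as follows. This is a deliberately simplified illustration and not the ORCHIDv2 algorithm of [RFC 7343]: the real procedure also mixes in a context identifier and prescribes which hash bits are kept. The function name, the choice of SHA-256, the truncation to the first 96 bits, and the default suite identifier are assumptions of this sketch.

```python
import hashlib
import ipaddress

PREFIX_28_BITS = 0x2001002  # top 28 bits of the 2001:20::/28 prefix


def simplified_hit(public_key: bytes, suite_id: int = 1) -> ipaddress.IPv6Address:
    """Build an IPv6-sized tag: 28-bit prefix | 4-bit suite ID | 96-bit hash."""
    if not 0 <= suite_id < 16:
        raise ValueError("suite_id must fit in 4 bits")
    digest = hashlib.sha256(public_key).digest()
    # Simplification: keep the first 96 bits of the digest.
    hash96 = int.from_bytes(digest[:12], "big")
    value = (PREFIX_28_BITS << 100) | (suite_id << 96) | hash96
    return ipaddress.IPv6Address(value)


# Hypothetical key material; a real HI would be an encoded public key.
tag = simplified_hit(b"example public key bytes")
```

The resulting value has the size and format of an IPv6 address, which is what allows applications to handle it through existing address-oriented interfaces.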