The goal of this guide is to recommend HashiCorp Consul deployment practices. This reference architecture conveys a general architecture that should be adapted to accommodate the specific needs of each implementation.
The following topics are addressed in this guide:
- Consul's fundamental building blocks
- Recommended architectures
- On-premises only architecture
- Deployment system requirements
- Next steps
»Fundamental building blocks
A Consul datacenter should be a single physical datacenter or a single cloud region. The underlying philosophy is that a Consul datacenter is a LAN construct and is optimized for LAN latency. The communication protocols that the server processes use to coordinate their actions, and that server and client processes use to communicate with each other, are optimized for LAN bandwidth. A single Consul datacenter should not, for example, span multiple cloud regions, or physical datacenters which are hundreds or thousands of kilometers apart.
Since Consul is a highly-available, clustered service the most reliable Consul deployments are spread across multiple failure domains. In a cloud deployment, for example, these might be called availability zones: geographically separate infrastructure with fast, low-latency networks between them. In a physical datacenter, this might be placing the Consul servers in different racks, with different power and network feeds and independent cooling. In general, the idea is that of other clustered services: spreading out the servers so that a failure in power, cooling, network, etc, affects only one server instead of them all.
Examples of a failure domain in this context are:
- An isolated datacenter
- A cage in a datacenter, if it is isolated from other cages in every other respect (power, network, etc.)
- An availability zone in AWS, Azure, or GCP
In any case, there should be high-bandwidth, low-latency (sub 8ms round trip) connectivity between the failure domains.
A typical distribution in a cloud environment is to spread Consul servers across separate availability zones (AZs) within a high-bandwidth, low-latency network, such as an AWS region. However, this may not be possible in a datacenter installation where only one datacenter exists within the required latency bounds.
It is important to understand how requirements and best practices have changed with the move toward greater use of highly distributed systems such as Consul. Operating environments built on distributed systems require a shift in the redundancy coefficient of underlying components. Consul relies on consensus negotiation to organize and replicate information, so the environment must provide three unique resilient paths in order to provide meaningful reliability. Essentially, a consensus system requires a simple majority of nodes to be available at any time. With 3 nodes, 2 must be available. If those 3 nodes are placed in only two failure domains, there is a 50% chance that losing a single failure domain results in a complete outage.
Consul is designed to handle different failure scenarios that have different probabilities. When deploying a Consul datacenter, the failure tolerance that you require should be considered and designed for. Consul Enterprise can achieve n-4 redundancy using 3 voting and 3 non-voting members. Consul OSS can achieve n-3 redundancy using 5 servers. See Consul Internals for more details.
In Consul Enterprise, the recommended number is 3 voting and 3 non-voting members in a datacenter, but more non-voting members can be added to scale read-only workloads. For more details, review the tutorial on providing fault tolerance with redundancy zones.
In Consul OSS, the recommendation is to run either 3 or 5 servers in a datacenter, as this maximizes availability without greatly sacrificing performance.
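As a sketch of how the server count is expressed in practice, a minimal server agent configuration for a 5-node OSS datacenter might look like the following (the datacenter name and join addresses are illustrative assumptions):

```hcl
# Minimal Consul server agent configuration (illustrative values).
server           = true
bootstrap_expect = 5          # wait for 5 servers before bootstrapping Raft
datacenter       = "dc1"      # hypothetical datacenter name
retry_join       = ["10.0.1.10", "10.0.2.10", "10.0.3.10"]  # example peer addresses
```

`bootstrap_expect` should be set to the same value on every server so the cluster only elects a leader once the expected quorum is present.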
»Network connectivity details
LAN gossip occurs between all agents in a single datacenter, with each agent sending a periodic probe to random agents from its member list. Both client and server agents participate in the gossip. The initial probe is sent over UDP every second. If a node fails to acknowledge within 200ms, the agent pings over TCP. If the TCP probe fails (10-second timeout), the agent asks a configurable number of random agents to probe the same non-responsive agent (also known as an indirect probe). If there is no response from the peers regarding the status of the agent, that agent is marked as down.
The agent's status directly affects the service discovery results. If an agent is down, the services it is monitoring will also be marked as down.
In addition, the agent periodically performs a full state sync over TCP, which gossips each agent's understanding of the member list (node names, IP addresses, and health status). These operations are expensive relative to the standard gossip protocol described above and are performed at a rate determined by datacenter size to keep overhead low, typically between every 30 seconds and 5 minutes.
In a larger network that spans L3 segments, traffic typically traverses through a firewall and/or a router. You must update your ACL or firewall rules to allow the following ports:
| Name | Port | Description |
| ---- | ---- | ----------- |
| Server RPC | 8300 | Used by servers to handle incoming requests from other agents. TCP only. |
| Serf LAN | 8301 | Used to handle gossip in the LAN. Required by all agents. TCP and UDP. |
| Serf WAN | 8302 | Used by servers to gossip over the LAN and WAN to other servers. TCP and UDP. |
| HTTP API | 8500 | Used by clients to talk to the HTTP API. TCP only. |
By default, agents only listen for HTTP and DNS traffic on the local interface.
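These defaults map to the agent's `ports` configuration block. The sketch below makes the standard port assignments explicit; note that widening `client_addr` beyond the loopback default exposes the HTTP and DNS endpoints on all interfaces and should be combined with ACLs and TLS:

```hcl
# Default Consul port assignments made explicit (these values are the defaults).
ports {
  server   = 8300   # server RPC
  serf_lan = 8301   # LAN gossip (TCP and UDP)
  serf_wan = 8302   # WAN gossip (TCP and UDP)
  http     = 8500   # HTTP API
  dns      = 8600   # DNS interface
}

# HTTP and DNS bind to 127.0.0.1 by default; only widen this deliberately.
# client_addr = "0.0.0.0"
```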
One of the key features of Consul is its support for multiple datacenters. The architecture of Consul is designed to promote a low coupling of datacenters so that connectivity issues or failure of any datacenter does not impact the availability of Consul in other datacenters. This means each datacenter runs independently, each having a dedicated group of servers and a private LAN gossip pool. Network areas and network segments can be used to prevent opening up firewall ports between different subnets.
The architecture below is the recommended best approach to Consul deployment and should be the target architecture for any installation. This is split into two parts:
- Consul datacenter - This is the recommended architecture for a Consul datacenter as a single entity.
- Consul federation - Building on the Consul datacenter recommended architecture, this is the recommended pattern for connecting multiple Consul datacenters to allow for regional, performance, and disaster recovery.
We recommend a single Consul datacenter for applications deployed in the same datacenter. Consul supports traditional three-tier applications as well as microservices.
Typically, you will need three to five servers to balance availability and performance. Together, these servers run the Raft-backed consistent state store for updating catalog, session, prepared query, ACL, and KV state.
The recommended maximum size for a single datacenter is 5,000 Consul clients. For a write-heavy and/or a read-heavy datacenter, you may need to reduce the maximum number of agents further, depending on the number and the size of KV pairs and the number of watches. As you add more client machines, it takes more time for gossip to converge. Similarly, when a new server joins an existing multi-thousand node datacenter with a large KV store it may take more time to replicate the store to the new server's log and the update rate may increase.
TIP For write-heavy datacenters, consider scaling the Consul servers vertically with larger machine instances and lower latency storage.
In cases where agents cannot all contact with each other, due to network segmentation, you can use Consul's network segments (Consul Enterprise only) to create multiple tenants that share Consul servers in the same datacenter. Each tenant has its own gossip pool and does not communicate with the agents outside its pool. All the tenants, however, do share the KV store. If you do not have access to Consul network segments, you can create discrete Consul datacenters to isolate agents from each other.
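A sketch of how network segments might be configured (the segment names, ports, and the client's segment assignment below are illustrative assumptions): servers define the available segments, and each client joins exactly one.

```hcl
# Server agent: define the segments available in this datacenter (Enterprise only).
segments = [
  {
    name = "alpha"   # hypothetical segment name
    port = 8303      # dedicated Serf port for this segment's gossip pool
  },
  {
    name = "beta"
    port = 8304
  }
]
```

```hcl
# Client agent: join a single segment.
segment = "alpha"
```

Each segment gets its own gossip pool on its own port, so agents in `alpha` never gossip with agents in `beta`, while both still reach the shared servers.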
»Single Consul datacenter (Enterprise)
In this scenario, the Consul deployment is hosted across three availability zones. This solution provides n-4 redundancy at the node level for Consul and n-1 at the availability zone level. This level of redundancy is achieved by using six nodes, three of them non-voting members. The Consul deployment is set up using redundancy zones so that if any voting node fails, a non-voting member is promoted by autopilot to full membership, maintaining quorum.
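A sketch of the server-side configuration behind this pattern, assuming each zone is recorded in node metadata under an illustrative `zone` key: every server tags its availability zone, and autopilot uses that tag to keep one voter per zone.

```hcl
# Each server records its availability zone (the value varies per server).
node_meta {
  zone = "us-east-1a"   # illustrative AZ name
}

# Autopilot keeps one voter per distinct value of the tagged key and promotes
# non-voters from the same zone on failure (Enterprise only).
autopilot {
  redundancy_zone_tag = "zone"
}
```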
»Single Consul datacenter (OSS)
In this scenario, the Consul deployment is hosted across three availability zones. This solution provides n-3 redundancy at the node level for Consul and n-1 at the availability zone level.
»Single Consul datacenter with a Kubernetes cluster
The Enterprise or OSS Consul datacenter designs above can be used to manage services in Kubernetes; Consul clients can be deployed within the Kubernetes cluster. This also allows Kubernetes-defined services to be synced to Consul, and enables Consul tools such as envconsul, consul-template, and more to work on Kubernetes.
This type of deployment in Kubernetes can also be set up with the official Helm chart. For further details review the Consul and Kubernetes Reference Architecture tutorial.
»Single Consul datacenter with ingress and terminating Gateways (Enterprise)
Within a Consul datacenter, ingress and terminating gateways allow services outside of the service mesh to access services inside the service mesh, and services inside the service mesh to securely connect to services outside the mesh while controlling access with Consul intentions. Between Consul datacenters mesh gateways allow you to route all Consul and service mesh traffic through the gateways.
In all cases, for any gateway type, you need a minimum of two instances and ideally one instance per availability zone.
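As an illustration, an ingress gateway is configured through a config entry; the listener port and service name below are assumptions for the sketch:

```hcl
# Ingress gateway config entry (applied with `consul config write`).
Kind = "ingress-gateway"
Name = "ingress-gateway"

Listeners = [
  {
    Port     = 8080          # illustrative listener port
    Protocol = "http"
    Services = [
      { Name = "web" }       # hypothetical mesh service exposed to external callers
    ]
  }
]
```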
You can join Consul datacenters with WAN federation so that the same service can be deployed in different datacenters and services across datacenters can discover each other. The datacenters operate independently and only communicate over the WAN on port 8302. Unless explicitly configured via the CLI or API, Consul servers will only return results from their local datacenter. Consul does not replicate data between multiple datacenters, but you can use the consul-replicate tool to replicate KV data periodically.
It is good practice to enable TLS server name checking in order to avoid accidental cross-joining of agents.
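A sketch of the agent configuration for a secondary datacenter joining a WAN federation, with TLS server name checking enabled (the hostnames below are illustrative):

```hcl
# Server agent in the secondary datacenter (illustrative addresses).
datacenter         = "dc2"
primary_datacenter = "dc1"
retry_join_wan     = ["dc1-server-1.example.com", "dc1-server-2.example.com"]

# Verify that server certificates match server.<datacenter>.<domain>,
# which prevents accidental cross-joining of agents.
verify_server_hostname = true
```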
The network areas feature in Consul Enterprise provides advanced federation. For example, imagine that datacenter1 (dc1) hosts services like LDAP and shares them with datacenter2 (dc2) and datacenter3 (dc3). However, due to compliance issues, servers in dc2 must not connect with servers in dc3. Basic WAN federation cannot isolate dc2 from dc3; it requires that all the servers in dc1, dc2, and dc3 are connected in a full mesh and opens both the gossip (8302 tcp/udp) and RPC (8300 tcp) ports for communication.
Network areas allow peering between datacenters to make shared services discoverable over the WAN. With network areas, servers in dc2 and dc3 can be configured to federate only with dc1, and not establish federation with other datacenters in the WAN. This meets the compliance requirement of the organization in our example use case. Servers that are part of the network area communicate over RPC only. This removes the overhead of sharing and maintaining the symmetric key used by the gossip protocol across datacenters. It also reduces the attack surface at the gossip ports since they no longer need to be opened in security gateways or firewalls.
Consul's prepared queries allow clients to fail over to another datacenter for service discovery. For example, if a payment service in the local datacenter (dc1) goes down, a prepared query lets users define a geographic fallback order to the nearest datacenter to check for healthy instances of the same service.
NOTE Consul datacenters must be WAN linked for a prepared query to work across datacenters.
Prepared queries, by default, resolve the query in the local datacenter first. Prepared query config/templates are maintained consistently in Raft and are executed on the servers.
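As a sketch, the JSON body below registers a prepared query with geographic failover via the HTTP API (`POST /v1/query`); the service name and failover order are assumptions:

```json
{
  "Name": "payment",
  "Service": {
    "Service": "payment",
    "Failover": {
      "NearestN": 2,
      "Datacenters": ["dc2", "dc3"]
    }
  }
}
```

`NearestN` tries the N nearest datacenters by network round-trip time before falling back to the explicit `Datacenters` list.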
»Consul Connect service mesh and multiple datacenters
In situations where directly connecting your datacenters is not possible (due to overlapping IP ranges, the cost of establishing direct connections to multiple cloud providers, and so on), mesh gateways can be deployed to provide connectivity for Connect-enabled service mesh traffic between datacenters.
Consul Connect service mesh supports multi-datacenter connections and replicates intentions. Mesh gateways utilize the Server Name Indication (SNI) extension of the TLS protocol to route traffic between services. This allows WAN federated datacenters to provide connectivity to proxies in any federated datacenter without requiring direct IP reachability between services.
»Multiple Consul datacenters (Enterprise)
Enterprise Only: The network areas functionality described here is available only in Consul Enterprise with the Global Visibility, Routing, and Scale module. To explore Consul Enterprise features, you can sign up for a free 30-day trial.
Consul's core federation capability uses the same gossip mechanism that is used for a single datacenter. This requires that every server from every datacenter be in a fully connected mesh with an open gossip port (8302/tcp and 8302/udp) and an open server RPC port (8300/tcp). Consul Enterprise offers a network area mechanism that allows operators to federate Consul datacenters together on a pairwise basis, enabling partially-connected network topologies. Once a link is created, Consul servers can make queries to the remote datacenter in service of both API and DNS requests for remote resources (in spite of the partially-connected nature of the topology as a whole).
For organizations with large numbers of datacenters, it becomes difficult to support a fully connected mesh. It is often desirable to have topologies like hub-and-spoke with central management datacenters and "spoke" datacenters that cannot interact with each other.
Consul datacenters can simultaneously participate in both network areas and the existing WAN pool, which eases migration.
»Multiple Consul datacenters with mesh gateways (Enterprise)
Mesh gateways enable routing of Connect service mesh traffic between different Consul datacenters. Those datacenters can reside in different clouds or runtime environments where general interconnectivity between all services in all datacenters is not feasible. These gateways operate by sniffing the SNI header from the Connect TLS session and routing the connection to the appropriate destination based on the server name requested. The data within the mTLS session is not decrypted by the gateway.
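A sketch of enabling mesh gateways for all Connect proxies at once via a `proxy-defaults` config entry (choosing `local` mode here is an assumption; `remote` is also possible):

```hcl
# Route all cross-datacenter Connect traffic through a gateway in the
# local datacenter (applied with `consul config write`).
Kind = "proxy-defaults"
Name = "global"

MeshGateway {
  Mode = "local"   # or "remote" to egress via the destination datacenter's gateway
}
```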
»Multiple Consul datacenters with mesh gateways for Consul as a shared service (Enterprise)
Enterprise Only: The namespace functionality described here is available only in Consul Enterprise with the Governance and Policy module. To explore Consul Enterprise features, you can sign up for a free 30-day trial.
Namespaces allow multiple teams within the same organization to share the same Consul datacenter(s) by separating services, Consul KV data, and other Consul data per team. This provides operators with the ability to more easily provide Consul as a service. Namespaces also enable operators to delegate ACL management.
Any service that is not registered in a namespace will be added to the default namespace. This means that all services are namespaced in Consul 1.7 and newer, even if the operator has not created any namespaces.
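For illustration, a namespace definition is a small HCL (or JSON) document written with `consul namespace write`; the name and description here are hypothetical:

```hcl
# Hypothetical namespace definition (Enterprise only).
name        = "team-frontend"
description = "Services and KV data owned by the frontend team"
```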
For more information on how to use namespaces with Consul Enterprise please review the following HashiCorp Learn tutorials.
- Register and Discover Services within Namespaces - Register multiple services within different namespaces in Consul.
- Setup Secure Namespaces - Secure resources within a namespace and delegate namespace ACL rights via ACL tokens.
»On-premises only architecture
In some deployments, there may be insurmountable restrictions that make the recommended architecture impossible, such as a lack of availability zones. In these cases, the architectures below detail the best options available.
Note that in the following architectures, the Consul leader could be any of the five Consul server nodes.
»Single Consul datacenter (OSS)
»Single availability zone
In this scenario, the Consul datacenter is hosted within one availability zone. This solution has a single point of failure at the availability zone level, but n-3 redundancy at the node level for Consul. This is not a HashiCorp-recommended architecture for production systems, as there is no redundancy at the availability zone level.
»Two availability zones
As discussed in the fundamental building blocks section, Consul relies on consensus negotiation, which requires a simple majority of nodes to be available at any time. With 3 nodes, 2 must be available. If those 3 nodes are placed in only two availability zones, there is a 50% chance that losing a single zone results in a complete outage, so two availability zones offer little more protection than one.
»Deployment system requirements
Consul servers maintain the datacenter state, respond to RPC queries (read operations), and process all write operations. Since Consul servers are critical to the overall performance, efficiency, and health of the datacenter, their host sizing is also critical.
The following table provides high-level server host guidelines. Of particular note is the strong recommendation to avoid non-fixed performance CPUs, or "Burstable CPU".
| Type | CPU | Memory | Disk | Typical Cloud Instance Types |
| ---- | --- | ------ | ---- | ---------------------------- |
| Small | 2-4 core | 8-16 GB RAM | 50 GB | AWS: m5.large, m5.xlarge<br>Azure: Standard_D2_v3, Standard_D4_v3<br>GCP: n2-standard-2, n2-standard-4 |
| Large | 8-16 core | 32-64 GB RAM | 100 GB | AWS: m5.2xlarge, m5.4xlarge<br>Azure: Standard_D8_v3, Standard_D16_v3<br>GCP: n2-standard-8, n2-standard-16 |
»Hardware sizing considerations
The small size would be appropriate for most initial production deployments or for development/testing environments.
The large size is for production environments where there is a consistently high workload.
NOTE For high workloads, ensure that the disks support a high number of IOPS to keep up with the rapid Raft log update rate.
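One relevant server tuning knob is the Raft multiplier, which trades CPU and disk load against leader-failure detection time; a sketch for production-grade hardware:

```hcl
# Scale Raft timing: 1 is the tightest timing, recommended for
# production-grade hardware; the default (5) suits lower-powered
# development environments.
performance {
  raft_multiplier = 1
}
```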
For more information on server requirements, review the server performance documentation.
In this guide, you reviewed the operational considerations necessary to deploy Consul in production, including hardware sizing, datacenter design, and network connectivity. Next, review the Deployment Guide to learn the steps required to install, configure, and secure a single HashiCorp Consul datacenter.