Vault Reference Architecture

The goal of this document is to recommend HashiCorp Vault deployment practices. This reference architecture conveys a general architecture that should be adapted to accommodate the specific needs of each implementation.

The following topics are addressed in this guide:

  • Glossary
  • Design Summary
  • Network Connectivity Details
  • Failure Tolerance
  • Single Region Deployment (Enterprise)
  • Multiple Region Deployment (Enterprise)
  • Best Case Architecture
  • Vault Replication (Enterprise Only)
  • Deployment System Requirements
  • Load Balancing

Glossary

Vault Cluster

A Vault cluster is a set of Vault processes that together run a Vault service. These Vault processes could be running on physical or virtual servers, or in containers.

Consul storage backend cluster

HashiCorp recommends and supports Consul being used as the storage backend for Vault. A Consul cluster is a set of Consul server processes that together run a Consul service. These Consul processes could be running on physical or virtual servers, or in containers.

Availability Zone

A single failure domain at the location level that hosts part of, or all of, a Vault cluster. Round-trip latency between availability zones should be less than 8 ms. A single Vault cluster may be spread across multiple availability zones.

Examples of an availability zone in this context are:

  • An isolated datacenter
  • An isolated cage in a datacenter, if it is isolated from other cages by all other means (power, network, etc.)
  • An availability zone in AWS, Azure or GCP

Region

A geographically separate collection of one or more availability zones. A region would host one or more Vault clusters. There is no defined maximum latency requirement between regions in Vault architecture. A single Vault cluster would not be spread across multiple regions.

Design Summary

This design is the recommended architecture for production environments, as it provides flexibility and resilience.

A key architectural recommendation is that the Consul servers be separate from the Vault servers, and that the Consul cluster be used only as a storage backend for Vault and not for other Consul-focused functionality (e.g. service segmentation and service discovery), which can introduce unpredictable resource utilization. Separating Vault and Consul allows each to have a system that can be sized appropriately in terms of CPU, memory and disk. Consul is a memory-intensive application, so separating it onto its own resources is advantageous to prevent resource contention or starvation. Dedicating a Consul cluster as a Vault storage backend is also advantageous because the Consul cluster then only needs to be upgraded as required to improve Vault storage backend functionality, which is likely to be much less frequent than for a Consul cluster that is also used for service discovery and service segmentation.

Vault to Consul backend connectivity is over HTTP and should be secured with TLS to encrypt all traffic, as well as a Consul ACL token to control access. See the Vault Deployment Guide for more information. As the Consul cluster for Vault storage may be used in addition to, and separate from, a Consul cluster for service discovery, it is recommended that the storage Consul process be run on non-default ports so that it does not conflict with other Consul functionality. Setting the Consul storage cluster to run on 7xxx ports and using this as the storage port in the Vault configuration will achieve this. It is also recommended that Consul be run using TLS.
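
As a minimal sketch, and assuming the Consul storage cluster's HTTPS API has been moved to a 7xxx port as suggested above, the storage stanza in the Vault server configuration might look like the following. The port, path and token values are placeholders, not prescribed values.

storage "consul" {
  address = "127.0.0.1:7501"       # local Consul client agent on an assumed 7xxx HTTPS port
  scheme  = "https"                # encrypt Vault-to-Consul traffic with TLS
  path    = "vault/"               # key prefix under which Vault data is stored
  token   = "<consul-acl-token>"   # Consul ACL token granting Vault access to its paths
}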

Network Connectivity Details

The following table outlines the network traffic requirements for Vault cluster nodes.

Source                     | Destination    | Port | Protocol    | Direction     | Purpose
---------------------------|----------------|------|-------------|---------------|-----------------------------------------------
Consul clients and servers | Consul servers | 7300 | TCP         | incoming      | Server RPC
Consul clients             | Consul clients | 7301 | TCP and UDP | bidirectional | LAN gossip communications
Vault clients              | Vault servers  | 8200 | TCP         | incoming      | Vault API
Vault servers              | Vault servers  | 8201 | TCP         | bidirectional | Vault replication traffic, request forwarding
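
On the Consul side, moving the storage cluster onto the 7xxx ports referenced above and in the table is done through the agent's ports stanza. The following is a sketch only; the exact port values are assumptions for illustration.

ports {
  server   = 7300   # server RPC
  serf_lan = 7301   # LAN gossip (TCP and UDP)
  https    = 7501   # HTTPS API that Vault's storage stanza points at (assumed value)
  http     = -1     # optionally disable the plaintext HTTP API
}
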
Alternative Network Configurations

Vault can be configured in several ways for communication between the Vault and Consul clusters:

  • Using host IP addresses or hostnames that are resolvable via a standard name resolution subsystem
  • Using load balancer IP addresses or hostnames that are resolvable via a standard name resolution subsystem
  • Using the attached Consul cluster DNS as service discovery to resolve Vault endpoints
  • Using a separate Consul service discovery cluster DNS as service discovery to resolve Vault endpoints

All of these options are explored more in the Vault Deployment Guide.
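
As a small illustration of the first two options, the Consul address in Vault's storage stanza can be a resolvable hostname or a load balancer name rather than an IP address. The names and port below are assumptions, and only one form would be used at a time.

# Resolvable hostname of a Consul client agent or server (assumed name)
storage "consul" {
  address = "consul-storage.example.internal:7501"
  scheme  = "https"
}

# Or a load balancer fronting the Consul storage cluster (assumed name)
# address = "consul-storage-lb.example.internal:7501"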

Failure Tolerance

Vault is designed to handle different failure scenarios that have different probabilities. When deploying a Vault cluster, the failure tolerance you require should be considered and designed for. In OSS Vault the recommended number of instances in a cluster is 3, as any more would have limited value. In Vault Enterprise the recommended number is also 3 per cluster, but more can be used if they are performance standbys to help with read-heavy workloads. The Consul cluster should have between one and seven instances, and this should be an odd number so that leadership elections can always resolve. It is recommended that the Consul cluster has at least five instances that are dedicated to performing backend storage functions for the Vault cluster only.

Node

The Vault and Consul cluster software allows for a failure domain at the node level by having replication within the cluster. In a single HA Vault cluster, all nodes share the same underlying storage backend and therefore the same data. Vault achieves this by having one of the Vault servers obtain a lock within the data store to become the active Vault node, which has write access. If at any time the leader is lost, another Vault node seamlessly takes its place as the leader. To achieve n-2 redundancy (where the loss of 2 objects within the failure domain can be tolerated), an ideal size for a Vault cluster is 3. Consul achieves replication and leadership through the use of its consensus and gossip protocols. In these protocols, a leader is elected by consensus, so a quorum of active servers must always exist. To achieve n-2 redundancy, an ideal size for a Consul cluster is 5. See Consul Internals for more details.

Availability Zone

Typical distribution in a cloud environment is to spread Consul/Vault nodes across separate Availability Zones (AZs) within a high-bandwidth, low-latency network, such as an AWS Region. However, this may not be possible in a datacenter installation where there is only one DC within the required latency.

It is important to understand a change in requirements or best practices that has come about as a result of the move towards greater utilization of highly distributed systems such as Consul. When operating environments comprised of distributed systems, a shift is required in the redundancy coefficient of the underlying components. Consul relies upon consensus negotiation to organize and replicate information, so the environment must provide 3 unique resilient paths in order to provide meaningful reliability. Essentially, a consensus system requires a simple majority of nodes to be available at any time. In the example of 3 nodes, you must have 2 available. If those 3 nodes are placed in two failure domains, there is a 50% chance that losing a single failure domain would result in a complete outage.

Region

To protect against a failure at the region level, as well as provide additional geographic scaling capabilities, Vault Enterprise offers:

  • Disaster Recovery Replication
  • Performance Replication

Please see the Recommended Patterns on Vault Replication for a full description of these options.

Because of the constraints listed above, the recommended architecture is Vault and Consul Enterprise distributed across three availability zones within a cluster, with clusters replicated across regions using DR and Performance Replication. There are also several “Best Case” architecture solutions for one and two Availability Zones, and for Consul OSS. These are not the recommended architecture, but are the best solutions if your deployment is restricted by Consul version or number of availability zones.

The architecture below is the recommended best approach to Vault deployment and should be the target architecture for any installation. This is split into two parts:

  • Vault cluster - This is the recommended architecture for a Vault cluster as a single entity, but it should also use replication as per the second diagram
  • Vault replication - This is the recommended architecture for multiple Vault clusters, allowing for regional distribution, performance replication and disaster recovery.

Single Region Deployment (Enterprise)

Reference Diagram

In this scenario, the nodes in the Vault and associated Consul clusters are hosted across three Availability Zones. This solution has n-2 at the node level for Vault and n-3 at the node level for Consul. At the Availability Zone level, Vault is at n-2 and Consul at n-1. This differs from the OSS design in that the Consul cluster has six nodes, three of them as non-voting members. The Consul cluster is set up using Redundancy Zones so that if any node were to fail, a non-voting member would be promoted by Autopilot to become a full member and so maintain quorum.
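
The Redundancy Zone behaviour described here is driven by Consul Enterprise Autopilot. The following is a sketch of the relevant server agent settings; the zone label is an assumption and would normally match the availability zone the server runs in.

node_meta {
  zone = "us-east-1a"            # availability zone of this Consul server (assumed label)
}

autopilot {
  redundancy_zone_tag = "zone"   # one voter per zone; the others remain non-voting members
}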

Multiple Region Deployment (Enterprise)

Reference Diagram Resilience against Region Failure

In this scenario, the clusters are replicated to guard against a full region failure. There are three Performance Replica Vault clusters (clusters A, B, C) each with its own DR cluster (clusters D, E, F) in a different Region. Each cluster has its associated Consul cluster for storage backend.

This architecture allows for n-2 at the region level provided all secrets and secret engines are replicated across all clusters. Failure of the full Region 1 would require DR cluster F to be promoted. Once this was done, the Vault solution would be fully functional, with some loss of redundancy until Region 1 was restored. Applications would not have to re-authenticate, as the DR cluster for each failed cluster contains all leases and tokens.

Reference Diagram Resilience against Cluster Failure

This solution provides full resilience at the cluster level, but does not guard against region failure, as the DR clusters for the Performance Replicas are in the same region. There are certain use cases where this is the preferred approach, such as when data cannot be replicated to other regions due to governance restrictions like GDPR. Some infrastructure frameworks may also not have the ability to route application traffic to different regions.

Best Case Architecture

In some deployments there may be insurmountable restrictions that mean the recommended architecture is not possible. This could be due to a lack of availability zones or the use of Vault OSS. In these cases, the architectures below detail the best case options available.

Note that in the following architectures, the Consul leader can be any of the five Consul server nodes and the Vault active node can be any of the three Vault nodes.

Deployment of Vault in one Availability Zone (all)

Reference Diagram

In this scenario, all nodes in the Vault and associated Consul cluster are hosted within one Availability Zone. This solution has a single point of failure at the Availability Zone level, but n-2 at the node level for both Consul and Vault. This is not the HashiCorp recommended architecture for production systems, as there is no redundancy at the Availability Zone level. There is also no DR capability, so at a minimum this deployment should have a DR replica in a separate region.

Deployment of Vault in two Availability Zones (OSS)

Reference Diagram

In this scenario, the nodes in the Vault and associated Consul cluster are hosted across two Availability Zones. This solution has n-2 at the node level for Vault and Consul and n-1 for Vault at the Availability Zone level, but the addition of an Availability Zone does not significantly increase the availability of the Consul cluster. This is because the Raft protocol requires a quorum of (n/2)+1, and if Zone B were to fail in the above diagram, the cluster would not be quorate and so would also fail. This is not the HashiCorp recommended architecture for production systems, as there is only partial redundancy at the Availability Zone level and an Availability Zone failure may or may not result in an outage.

Deployment of Vault in two Availability Zones (Enterprise)

Reference Diagram

Due to the need to maintain quorum in the Consul cluster, having only two Availability Zones is not ideal. There is no way to spread a Consul cluster across two AZs with any guarantee of added resiliency. The best case solution in Vault Enterprise is to treat the two AZs as regions and have separate Vault clusters in each.

The secondary Vault cluster can either be a Performance Replica or a DR Replica, each having its own advantages:

  • PR secondary: If the Vault address is managed by Consul or by a load balancer, then a failure of either cluster results in traffic being directed to the other cluster with no outage, provided there is logic in your application or in the Vault Agent (see the sketch after this list) to manage re-requesting tokens that are not replicated between the clusters.
  • DR secondary: In this case the failure of the primary cluster requires operator intervention to promote the DR to primary, but as all leases and tokens are replicated, there is no need for any additional logic in the application to handle this.
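
For the PR secondary case, the token re-request logic can be delegated to the Vault Agent instead of being built into each application. The following is a minimal auto-auth sketch, assuming an AppRole auth method; the address and file paths are placeholders for your environment.

# Vault Agent configuration sketch: re-authenticates automatically, so a new
# token is obtained after traffic moves to the other Performance Replica.
vault {
  address = "https://active.vault.service.consul:8200"   # assumed Consul-managed address
}

auto_auth {
  method "approle" {
    config = {
      role_id_file_path   = "/etc/vault-agent/role-id"
      secret_id_file_path = "/etc/vault-agent/secret-id"
    }
  }

  sink "file" {
    config = {
      path = "/etc/vault-agent/token"   # applications read the current token from here
    }
  }
}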

Deployment of Vault in three Availability Zones (OSS)

Reference Diagram

In this scenario, the nodes in the Vault and associated Consul cluster are hosted across three Availability Zones. This solution has n-2 at the node level for Vault and Consul and n-2 for Vault at the Availability Zone level. It also has n-1 at the Availability Zone level for Consul and as such is considered the most resilient of all architectures for a single Vault cluster with a Consul storage backend in the OSS product.

Vault Replication (Enterprise Only)

In these architectures the Vault cluster is illustrated as a single entity and would be one of the single clusters detailed above, based on your number of Availability Zones. Running multiple Vault clusters as a single Vault solution with replication between them is available in Vault Enterprise only. OSS Vault can be set up in multiple clusters, but each would be an individual Vault solution with no support for replication between clusters.

The Vault documentation provides more detailed information on the replication capabilities within Vault Enterprise.

Performance Replication

Vault performance replication allows for secrets management across many sites. Static secrets, authentication methods, and authorization policies are replicated so they are active and available in multiple locations; however, leases, tokens, and dynamic secrets are not.

Disaster Recovery Replication

Vault disaster recovery replication ensures that a standby Vault cluster is kept synchronized with an active Vault cluster. This mode of replication includes data such as ephemeral authentication tokens, time-based token information, and token usage data. This provides for an aggressive recovery point objective in environments where preventing loss of ephemeral operational data is of the utmost concern. In any enterprise solution, Disaster Recovery Replicas are considered essential.

Corruption or Sabotage Disaster Recovery

Another common scenario to protect against, more prevalent in cloud environments that provide very high levels of intrinsic resiliency, is the purposeful or accidental corruption of data and configuration, and/or a loss of cloud account control. Vault's DR Replication is designed to replicate live data, which would propagate intentional or accidental data corruption or deletion. To protect against these possibilities, you should back up Vault's storage backend. This is supported through the Consul Snapshot feature, which can be automated for regular archival backups. A cold site or new infrastructure could be rehydrated from a Consul snapshot.

Replication Notes

There is no set limit on the number of clusters within a replication set; the largest deployments today are in the 30+ cluster range. A Performance Replica cluster can have a Disaster Recovery cluster associated with it and can also replicate to multiple Disaster Recovery clusters.

While a Vault cluster can possess a replication role (or roles), there are no special considerations required in terms of infrastructure, and clusters can be promoted or demoted between roles. Special circumstances related to mount filters and HSM usage may limit the swapping of roles, but those are based on specific organizational configurations.

Using replication with Vault clusters integrated with HSM (or cloud auto-unseal) devices for automated unseal operations has some details that should be understood during the planning phase:

  • If a performance primary cluster uses an HSM, all other clusters within that replication set must use an HSM as well.
  • If a performance primary cluster does NOT use an HSM (uses Shamir secret sharing method), the clusters within that replication set can be mixed, such that some may use an HSM, others may use Shamir.
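
As an example of the first rule, a performance primary configured for auto-unseal with AWS KMS implies a seal stanza such as the sketch below on every cluster in the replication set; the region and key ID are placeholders.

seal "awskms" {
  region     = "us-east-1"      # assumed region
  kms_key_id = "<kms-key-id>"   # placeholder KMS key used for auto-unseal
}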

Reference Diagram

Deployment System Requirements

The following table provides guidelines for server sizing. Of particular note is the strong recommendation to avoid non-fixed performance CPUs, or Burstable CPU in AWS terms, such as T-series instances.

Sizing for Vault Servers

Size  | CPU      | Memory       | Disk  | Typical Cloud Instance Types
------|----------|--------------|-------|------------------------------------------------------------------------------------------------------
Small | 2 core   | 4-8 GB RAM   | 25 GB | AWS: m5.large; Azure: Standard_D2_v3; GCE: n1-standard-2, n1-standard-4
Large | 4-8 core | 16-32 GB RAM | 50 GB | AWS: m5.xlarge, m5.2xlarge; Azure: Standard_D4_v3, Standard_D8_v3; GCE: n1-standard-8, n1-standard-16

Sizing for Consul Servers

Size  | CPU      | Memory        | Disk   | Typical Cloud Instance Types
------|----------|---------------|--------|------------------------------------------------------------------------------------------------------
Small | 2 core   | 8-16 GB RAM   | 50 GB  | AWS: m5.large, m5.xlarge; Azure: Standard_D2_v3, Standard_D4_v3; GCE: n1-standard-4, n1-standard-8
Large | 4-8 core | 32-64+ GB RAM | 100 GB | AWS: m5.2xlarge, m5.4xlarge; Azure: Standard_D4_v3, Standard_D8_v3; GCE: n1-standard-16, n1-standard-32

Hardware Considerations

The small size category would be appropriate for most initial production deployments, or for development/testing environments.

The large size is for production environments where there is a consistent high workload. That might be a large number of transactions, a large number of secrets, or a combination of the two.

In general, processing requirements will depend on the encryption workload and messaging workload (operations per second, and types of operations). Memory requirements will depend on the total size of secrets/keys stored in memory and should be sized according to that data (as should the hard drive storage). Vault itself has minimal storage requirements, but the underlying storage backend should have a relatively high-performance hard disk subsystem. If many secrets are being generated or rotated frequently, this information will need to be flushed to disk often and can impact performance if slower hard drives are used.

The Consul servers' function in this deployment is to serve as the storage backend for Vault. This means that all content stored for persistence in Vault is encrypted by Vault and written to the storage backend at rest. This data is written to the key-value store section of Consul's Service Catalog, which must be stored in its entirety in memory on each Consul server. This means that memory can be a constraint on scaling as more clients authenticate to Vault, more secrets are persistently stored in Vault, and more temporary secrets are leased from Vault. It also means that the Consul servers must be scaled vertically on memory if additional space is required, as the entire Service Catalog is stored in memory on each Consul server.

Furthermore, network throughput is a common consideration for Vault and Consul servers. As both systems are HTTPS API driven, all incoming requests, communications between Vault and Consul, underlying gossip communication between Consul cluster members, communications with external systems (per auth or secret engine configuration, and some audit logging configurations) and responses consume network bandwidth.

Due to network performance considerations in Consul cluster operations, replication of Vault datasets across network boundaries should be achieved through Performance or DR Replication, rather than spreading the Consul cluster across network and physical boundaries. If a single Consul cluster is spread across network segments that are distant or inter-regional, it can cause synchronization issues within the cluster or incur additional data transfer charges with some cloud providers.

Other Considerations

Vault Production Hardening Recommendations provides guidance on best practices for a production hardened deployment of Vault.

Load Balancing

Load Balancing Using Consul Interface

Consul can provide load balancing capabilities through service discovery, but it requires that Vault clients are Consul-aware. This means that a client can use either the Consul DNS or API interfaces to resolve the active Vault node. A client might access Vault via a URL like the following: http://active.vault.service.consul:8200

This relies upon the operating system's DNS resolution system, and the request can be forwarded to Consul for the actual IP address response. The operation can be completely transparent to legacy applications and operates just like a typical DNS resolution. See Consul DNS forwarding for more information.
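
Because Vault registers itself in the Consul catalog when Consul is used as its storage backend, the main cluster-side requirement is that the Consul agents answer DNS queries. A sketch of the relevant Consul agent settings is below; the upstream resolver address is an assumption, and forwarding *.consul queries from the operating system resolver to this port is covered in the Consul DNS forwarding documentation.

ports {
  dns = 8600                 # Consul's DNS interface (default port)
}
recursors = ["10.0.0.2"]     # assumed upstream resolver for non-.consul lookups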

Load Balancing Using External Load Balancer

Vault Behind a Load Balancer

External load balancers are supported as an entry point to a Vault cluster. The external load balancer should poll the sys/health endpoint to detect the active node and route traffic accordingly. The load balancer should be configured to make an HTTP request to the following URL on each node in the cluster: http://<Vault Node URL>:8200/v1/sys/health

The active Vault node will respond with a 200 while the standby nodes will return a 4xx response.

The following is a sample configuration block from HAProxy to illustrate:

listen vault
    bind 0.0.0.0:80
    balance roundrobin
    option httpchk GET /v1/sys/health
    server vault1 192.168.33.10:8200 check
    server vault2 192.168.33.11:8200 check
    server vault3 192.168.33.12:8200 check

Note that the above block could be generated by Consul (with consul-template) when a software load balancer is used. This could be the case when the load balancer is software like Nginx, HAProxy, or Apache.

Example Consul Template for the above HAProxy block:

listen vault
   bind 0.0.0.0:8200
   balance roundrobin
   option httpchk GET /v1/sys/health{{range service "vault"}}
   server {{.Node}} {{.Address}}:{{.Port}} check{{end}}
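
The template above would typically be rendered by a consul-template process that rewrites the HAProxy configuration and reloads the load balancer whenever the Vault service catalog changes. A sketch of that runner configuration follows; the file paths and reload command are assumptions.

consul {
  address = "127.0.0.1:8500"                   # local Consul client agent
}

template {
  source      = "/etc/haproxy/haproxy.ctmpl"   # file containing the template shown above
  destination = "/etc/haproxy/haproxy.cfg"
  command     = "systemctl reload haproxy"     # run after the rendered file changes
}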

Client IP Address Handling

There are two supported methods for handling client IP addressing behind a proxy or load balancer: X-Forwarded-For headers and PROXY v1. Both require a trusted load balancer and IP address whitelisting to adhere to security best practices.
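
Both behaviours are configured on Vault's TCP listener. The sketch below shows the X-Forwarded-For variant, with the PROXY protocol alternative commented out; the trusted CIDR and certificate paths are placeholders.

listener "tcp" {
  address       = "0.0.0.0:8200"
  tls_cert_file = "/etc/vault.d/tls/vault.crt"
  tls_key_file  = "/etc/vault.d/tls/vault.key"

  # Only trust X-Forwarded-For headers arriving from the load balancer subnet
  x_forwarded_for_authorized_addrs = ["10.0.1.0/24"]

  # Alternative: accept PROXY protocol v1 from the load balancer instead
  # proxy_protocol_behavior         = "use_always"
  # proxy_protocol_authorized_addrs = ["10.0.1.0/24"]
}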

Additional References

  • Vault architecture documentation explains each Vault component
  • To integrate Vault with an existing LDAP server, refer to the LDAP Auth Method documentation
  • Refer to the AppRole Pull Authentication guide to programmatically generate a token for a machine or app
  • Consul is an integral part of running a resilient Vault cluster, regardless of location. Refer to the online Consul documentation to learn more.

Next steps

  • Read Production Hardening to learn best practices for a production hardened deployment of Vault.
  • Read Deployment Guide to learn the steps required to install and configure a single HashiCorp Vault cluster.