

Vault with Integrated Storage Reference Architecture

The goal of this document is to recommend HashiCorp Vault deployment practices. This reference architecture conveys a general architecture, leveraging the raft storage backend, that should be adapted to accommodate the specific needs of each implementation.

Using Raft as the storage backend eliminates reliance on any third-party systems: it implements high availability, supports Vault Enterprise replication features, and provides backup/restore workflows.

The following topics are addressed in this guide:

  • Glossary
  • Design Summary
  • Failure Tolerance
  • Recommended Architecture
  • Best Case Architecture
  • Vault Replication (Enterprise)
  • Deployment System Requirements
  • Load Balancing

»Glossary

»Vault Cluster

A Vault cluster is a set of Vault processes that together run a Vault service. These Vault processes could be running on physical or virtual servers, or in containers.

»Availability Zone

A single failure domain on a location level that hosts part or all of a Vault cluster. The round-trip latency between availability zones should be less than eight milliseconds. A single Vault cluster may be spread across multiple availability zones. Examples of an availability zone in this context are:

  • An isolated datacenter
  • An isolated cage in a datacenter, if it is isolated from other cages by all other means (power, network, etc.)
  • An availability zone in AWS, Azure, or GCP

»Region

A geographically separate collection of one or more availability zones. A region would host one or more Vault clusters. There is no defined maximum latency requirement between regions in Vault architecture. A single Vault cluster would not be spread across multiple regions.

»Design Summary

This design is the recommended architecture for production environments, as it provides flexibility and resilience. When using the Raft storage backend, decisions on resilience for Vault are modified to account for the necessity of maintaining quorum for the Raft protocol.

»Network Connectivity Details

The following table outlines the network traffic requirements for Vault cluster nodes.

Source          Destination     Port   Protocol   Direction       Purpose
Vault clients   Vault servers   8200   tcp        incoming        Vault API
Vault servers   Vault servers   8201   tcp        bidirectional   Vault replication traffic, request forwarding
Vault servers   Vault servers   8201   tcp        bidirectional   Raft gossip traffic
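
As an illustration of how these ports map onto a node's configuration, the following is a minimal sketch of a single node's server configuration using integrated storage. The data path, node names, hostnames, and TLS certificate locations are assumptions and should be replaced with values appropriate to your environment.

```hcl
# Illustrative /etc/vault.d/vault.hcl for one node of a Raft-backed cluster.
storage "raft" {
  path    = "/opt/vault/data"    # local directory for Raft data (assumed path)
  node_id = "vault-node-1"

  # Attempt to join the other nodes of the cluster on startup.
  retry_join {
    leader_api_addr = "https://vault-node-2.example.internal:8200"
  }
  retry_join {
    leader_api_addr = "https://vault-node-3.example.internal:8200"
  }
}

listener "tcp" {
  address         = "0.0.0.0:8200"   # Vault API (see table above)
  cluster_address = "0.0.0.0:8201"   # replication, request forwarding, Raft
  tls_cert_file   = "/etc/vault.d/tls/vault.crt"
  tls_key_file    = "/etc/vault.d/tls/vault.key"
}

api_addr     = "https://vault-node-1.example.internal:8200"
cluster_addr = "https://vault-node-1.example.internal:8201"
```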

»Failure Tolerance

Vault is designed to handle different failure scenarios that have different probabilities. When deploying a Vault cluster, the failure tolerance that you require should be considered and designed for. In OSS Vault, the recommended cluster size is five nodes, as any more would provide limited value.

In Vault Enterprise, the recommended cluster size is also five nodes, but more nodes can be added if they are performance standbys that help with the read-only workload. When leveraging this feature, it is also advisable to configure the performance standby nodes as non-voting nodes.
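
As a sketch of how nodes are added to the cluster, the commands below assume the vault CLI is run on the new node and that an existing node is reachable at an illustrative address; the non-voter option is a Vault Enterprise feature, so verify the flag against your Vault version.

```shell
# Join a new node to an existing Raft cluster (run on the new node).
vault operator raft join https://vault-node-1.example.internal:8200

# Vault Enterprise: join a performance standby as a non-voting member instead.
vault operator raft join -non-voter https://vault-node-1.example.internal:8200
```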

»Node

The Vault software provides for a failure domain at the node level by having replication within the cluster. In a single HA Vault cluster, all nodes share the same underlying storage backend and therefore the same data. Vault achieves this by having one of the Vault servers obtain a lock within the data store to become the active Vault node, which has write access. If at any time the leader is lost, another Vault node will seamlessly take its place as the leader. To achieve n-2 redundancy (where the loss of two objects within the failure domain can be tolerated), an ideal size for a Vault cluster leveraging Raft storage is five nodes.
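
To observe which node currently holds the active role and which peers are voters, the following commands can be run against the cluster (a minimal sketch; the exact output depends on your deployment):

```shell
# Show whether the local node is the active node or a standby
# (see the "HA Enabled" and "HA Mode" fields).
vault status

# List the Raft peers, including each node's state (leader/follower)
# and whether it is a voter.
vault operator raft list-peers
```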

»Availability Zone

Typical distribution in a cloud environment is to spread Vault nodes into separate Availability Zones (AZs) within a high-bandwidth, low-latency network, such as an AWS Region; however, this may not be possible in a datacenter (DC) installation where there is only one DC within the required level of latency.

It is important to understand the change in requirements and best practices that comes with the move toward greater use of highly distributed systems such as Raft. When operating environments composed of distributed systems, a shift is required in the redundancy coefficient of underlying components. Raft relies upon consensus negotiation to organize and replicate information, so the environment must provide three unique resilient paths in order to provide meaningful reliability. Essentially, a consensus system requires a simple majority of nodes to be available at any time. In the example of three nodes, you must have two available. If those three nodes are placed in two failure domains, there is a 50% chance that losing a single failure domain would result in a complete outage.
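
For reference, a cluster of n voting nodes needs a quorum of (n/2)+1 nodes (integer division), which yields the following node-level failure tolerance:

Cluster size   Quorum ((n/2)+1)   Node failures tolerated
3              2                  1
5              3                  2
7              4                  3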

»Region

To protect against failure at the region level, as well as provide additional geographic scaling capabilities, Vault Enterprise offers:

  • Disaster Recovery (DR) Replication
  • Performance Replication

Please see the Recommended Patterns on Vault Replication for a full description of these options.

Because of the constraints listed above, the recommended architecture for Vault is to distribute nodes across three availability zones within a cluster and for clusters to be replicated across regions using DR and Performance replication. There are also several “Best Case” architecture solutions for one and two Availability Zones. These are not the recommended architecture but are the best solutions if your deployment is restricted by the number of availability zones.

»Recommended Architecture

The architecture below is the recommended approach to deploying a single Vault cluster and should be the target architecture for any installation.

»Deployment of Vault in three Availability Zones

Reference Diagram Details

In this scenario, the nodes in the Vault cluster are distributed between three Availability Zones. This solution has n-2 redundancy at the node level for Vault and n-1 redundancy at the Availability Zone level, and as such is considered the most resilient of all architectures for a single Vault cluster with the integrated storage backend in the OSS product.

»Multiple Region Deployment (Enterprise)

The recommended architecture for multiple Vault clusters to allow for regional, performance and disaster recovery remains the same as what is described in our standard Recommended Architecture guide (with the exception that you would not need to deploy any Consul clusters).

»Best Case Architecture

In some deployments, there may be insurmountable restrictions that mean the recommended architecture is not possible. This could be due to a lack of availability zones, as an example. In these cases, the architectures below detail the best case options available.

»Deployment of Vault in one Availability Zone

Reference Diagram Details

In this scenario, all nodes in the Vault cluster are hosted within one Availability Zone. This solution has a single point of failure at the availability zone level, but an n-2 at the node level for Vault. This is not HashiCorp's recommended architecture for production systems since there is no redundancy at the Availability Zone level. Also, there is no DR capability and so at a minimum this should have a DR replica in a separate region.

»Deployment of Vault in two Availability Zones

Reference Diagram Details

In this scenario, the nodes in the Vault cluster are distributed between two Availability Zones. This solution has n-2 at the node level and n-1 at the Availability Zone level, but the addition of an Availability Zone does not significantly increase the availability of the Vault cluster. This is because the Raft protocol requires a quorum of (n/2)+1 nodes, and if Zone A in the diagram above were to fail, the cluster would lose quorum and therefore also fail. This is not HashiCorp's recommended architecture for production systems since there is only partial redundancy at the Availability Zone level, and an Availability Zone failure may or may not result in an outage.

»Vault Replication (Enterprise)

In these architectures, the "Vault Cluster" is illustrated as a single entity, and would be one of the single clusters detailed above based on your number of Availability Zones. Multiple Vault clusters acting as a single Vault solution and replicating between them is available in Vault Enterprise only. OSS Vault can be set up in multiple clusters, but they would each be individual Vault solutions and would not support replication between clusters. The Vault documentation provides more detailed information on the replication capabilities within Vault Enterprise.

»Performance Replication

Vault performance replication allows for secrets management across many sites. Static secrets, authentication methods, and authorization policies are replicated to be active and available in multiple locations; however, leases, tokens and dynamic secrets are not.
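
As a rough sketch of how a replication relationship is established, the commands below use the Vault Enterprise replication API endpoints; the secondary ID is an illustrative assumption and the wrapping token placeholder would come from the primary.

```shell
# On the intended performance primary cluster:
vault write -f sys/replication/performance/primary/enable
vault write sys/replication/performance/primary/secondary-token id="secondary-eu"

# On the performance secondary cluster, using the wrapping token returned above:
vault write sys/replication/performance/secondary/enable token="<wrapping_token>"

# DR replication follows the same pattern under the sys/replication/dr/ paths.
```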

»Disaster Recovery Replication

Vault disaster recovery replication ensures that a standby Vault cluster is kept synchronized with an active Vault cluster. This mode of replication includes data such as ephemeral authentication tokens, time-based token information, and token usage data. This provides for an aggressive recovery point objective in environments where preventing loss of ephemeral operational data is of the utmost concern. In any enterprise solution, Disaster Recovery Replicas are considered essential.

»Corruption or Sabotage Disaster Recovery

Another common scenario to protect against, more prevalent in cloud environments that provide very high levels of intrinsic resiliency, might be the purposeful or accidental corruption of data and configuration, and/or a loss of cloud account control. Vault's DR replication is designed to replicate live data, which would propagate intentional or accidental data corruption or deletion. To protect against these possibilities, you should back up Vault's storage backend. This is supported through the Raft snapshot feature, which can be automated for regular archival backups. A cold site or new infrastructure could be rehydrated from a Raft snapshot.
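
A minimal sketch of the snapshot workflow with the vault CLI is shown below; the backup path is an assumption, and scheduled automated snapshots are a Vault Enterprise feature.

```shell
# Take a point-in-time snapshot of the integrated storage (Raft) data.
vault operator raft snapshot save /backups/vault.snap

# Restore a snapshot into a rebuilt or replacement cluster.
vault operator raft snapshot restore /backups/vault.snap
```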

»Replication Notes

There is no set limit on the number of clusters within a replication set; the largest deployments today are in the 30+ cluster range. A Performance Replica cluster can have a Disaster Recovery cluster associated with it and can also replicate to multiple Disaster Recovery clusters. While a Vault cluster can possess a replication role (or roles), there are no special considerations required in terms of infrastructure, and clusters can assume another role (or be promoted or demoted to one). Special circumstances related to mount filters and HSM usage may limit swapping of roles, but those are based on specific organization configurations.

»Considerations Related to HSM Integration

Using replication with Vault clusters integrated with Hardware Security Module (HSM) or cloud auto-unseal devices for automated unseal operations has some details that should be understood during the planning phase.

  • If a performance primary cluster uses an HSM, all other clusters within that replication set should use an HSM as well.
  • If a performance primary cluster does NOT use an HSM (that is, it uses the Shamir secret sharing method), the clusters within that replication set can be mixed, such that some may use an HSM and others may use Shamir.

Reference Diagram

This behavior is by design. A downstream Shamir cluster presents a potential attack vector in the replication group, since a threshold of key holders could recreate the master key, thereby bypassing the upstream HSM key protection.
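
For reference, auto-unseal is configured through a seal stanza in the server configuration. The following is an illustrative sketch using AWS KMS; the region and key alias are assumptions, and other seal types (for example, pkcs11 for an HSM) follow the same pattern.

```hcl
# Illustrative auto-unseal configuration (AWS KMS); values are assumptions.
seal "awskms" {
  region     = "us-east-1"
  kms_key_id = "alias/vault-unseal-key"
}
```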

»Deployment System Requirements

The following table provides guidelines for server sizing. Of particular note is the strong recommendation to avoid non-fixed-performance CPUs, or "Burstable CPU" in AWS terms, such as T-series instances. Additionally, non-burstable SSDs should be used for disks.

»Sizing for Vault Servers

Size    CPU        Memory         Disk     Typical Cloud Instance Types
Small   2 core     8-16 GB RAM    100 GB   AWS: m5.large, m5.xlarge
                                           Azure: Standard_D2_v3, Standard_D4_v3
                                           GCE: n2-standard-2, n2-standard-4
Large   4-8 core   32-64 GB RAM   200 GB   AWS: m5.2xlarge, m5.4xlarge
                                           Azure: Standard_D8_v3, Standard_D16_v3
                                           GCE: n2-standard-8, n2-standard-16

»Hardware Considerations

The small size category would be appropriate for most initial production deployments, or for development and testing environments.

The large size is for production environments where there is a consistent high workload. That might be a large number of transactions, a large number of secrets, or a combination of the two.

In general, processing requirements will be dependent on encryption workload and messaging workload (operations per second, and types of operations). Memory requirements will be dependent on the total size of secrets/keys stored in memory and should be sized according to that data (as should the hard drive storage).

Vault itself has minimal storage requirements when not using integrated storage (raft). However, when using integrated storage the infrastructure should have a relatively high-performance hard disk subsystem, hence the non-burstable SSD requirement. If many secrets are being generated or rotated frequently, this information will need to flush to disk often and can impact performance if slower hard drives are used.

Furthermore, network throughput is a common consideration for Vault servers. As Vault is HTTPS API driven, all incoming requests, communications between Vault cluster members, communications with external systems (per auth or secret engine configuration, and some audit logging configurations), and responses consume network bandwidth. Replication of Vault datasets across network boundaries should be achieved through Performance or DR replication.

»Other Considerations

Vault Production Hardening Recommendations provides guidance on best practices for a production hardened deployment of Vault.

»Load Balancing

»Load Balancing Using External Load Balancer

External load balancers are supported as an entry point to a Vault cluster. The external load balancer should poll the sys/health endpoint to detect the active node and route traffic accordingly. The load balancer should be configured to make an HTTP request to the following URL on each node in the cluster: http://<Vault Node URL>:8200/v1/sys/health.

The active Vault node will respond with a 200, while the standby nodes will return a 4xx response (by default, 429 for a standby and 473 for a performance standby).
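
A minimal sketch of such a health probe using curl is shown below; the node hostname is an assumption, and the response codes can be tuned with sys/health query parameters such as standbyok and perfstandbyok.

```shell
# Print only the HTTP status code returned by a node's health endpoint.
curl -s -o /dev/null -w "%{http_code}\n" \
  https://vault-node-1.example.internal:8200/v1/sys/health
```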

»Client IP Address Handling

There are two supported methods for handling client IP addressing behind a proxy or load balancer: X-Forwarded-For headers and PROXY protocol v1. Both require a trusted load balancer and its IP address to be listed as an allowed address in order to adhere to security best practices.
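
Both methods are enabled on the tcp listener. The sketch below shows the relevant listener parameters; the load balancer subnet is an assumption, and only one of the two methods would normally be configured.

```hcl
listener "tcp" {
  address       = "0.0.0.0:8200"
  tls_cert_file = "/etc/vault.d/tls/vault.crt"
  tls_key_file  = "/etc/vault.d/tls/vault.key"

  # Option 1: trust X-Forwarded-For headers only from the load balancers
  # (subnet is an assumption).
  x_forwarded_for_authorized_addrs = ["10.0.10.0/24"]

  # Option 2: accept PROXY protocol v1 from the load balancers.
  # proxy_protocol_behavior         = "allow_authorized"
  # proxy_protocol_authorized_addrs = ["10.0.10.0/24"]
}
```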
