Enterprise Only: The functionality described in this tutorial is available only in Vault Enterprise. To explore Vault Enterprise features, you can sign up for a free 30-day trial.
»Challenge
A Vault version upgrade is a delicate operation in any production environment, so it is important to have best practices in place that simplify the process where possible.
»Solution
Vault Enterprise provides automated version upgrades with the autopilot feature when using Integrated Storage. The feature allows you to start new Vault nodes alongside the old ones and automatically switch to the new nodes as soon as they are able to reach quorum.
This automates the leader election process and ensures that the new leader is elected from among the new nodes, so that removing the old nodes from the cluster does not trigger a leader election.
»Prerequisites
To test the automated upgrades feature explained in this tutorial you will need:
- A Vault Enterprise cluster with three nodes running Vault Enterprise 1.11.0 or later.
- Three extra nodes with Vault Enterprise 1.11.0 or later to use as the new servers once the upgrade is concluded.
You will also need a text editor, the curl executable to test the API endpoints, and optionally the jq command to format the output for curl.
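Before you start, you may want to confirm that the tools listed above are installed. The following is a minimal sketch; the require_tools function name is hypothetical and not part of the tutorial's scripts.

```shell
# Hypothetical helper: report every tool in the argument list that is not
# on the PATH. Returns non-zero if at least one tool is missing.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call (jq is optional, so it is left out here):
# require_tools vault curl
```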
»Scenario introduction
To demonstrate the new autopilot behavior, start a cluster with 3 nodes that has no automatic upgrade (see the Step 1 diagram below). Then, start 3 additional nodes with automatic upgrade enabled and add them to the cluster (see the Step 2 diagram below).
You will run a script to start a cluster.
- vault_1 (http://127.0.0.1:8100) is initialized and unsealed. The root token creates a transit key that enables auto-unseal for the other Vault servers. This Vault server is not a part of the cluster.
- vault_2 (http://127.0.0.1:8200) is initialized and unsealed. This Vault starts as the cluster leader.
- vault_3 (http://127.0.0.1:8300) is started and automatically joins the cluster via retry_join.
- vault_4 (http://127.0.0.1:8400) is started and automatically joins the cluster via retry_join.
If this is your first time setting up a Vault cluster with Integrated Storage, go through the Vault HA Cluster with Integrated Storage tutorial.
»Set up an initial cluster
Retrieve the configuration by cloning the hashicorp/learn-vault-raft repository from GitHub.
This repository contains supporting content for all of the Vault learn tutorials. The content specific to this tutorial can be found within a sub-directory.
Change the working directory to learn-vault-raft/raft-auto-upgrade/local.
Set the setup_1.sh file to executable.
Execute the setup_1.sh script to spin up a Vault cluster.
You can find the server configuration files and the log files in the working directory.
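The steps so far can be sketched as a single shell function. This is an illustration, not the tutorial's own script: the setup_initial_cluster name is hypothetical, and REPO_URL is an assumption derived from the repository name given above.

```shell
# Assumed clone URL, built from the repository name hashicorp/learn-vault-raft.
REPO_URL="https://github.com/hashicorp/learn-vault-raft.git"

# Hypothetical wrapper: clone the repository, change into the tutorial
# directory, make setup_1.sh executable, and run it.
setup_initial_cluster() {
  git clone "$REPO_URL" &&
    cd learn-vault-raft/raft-auto-upgrade/local &&
    chmod +x setup_1.sh &&
    ./setup_1.sh
}
```

Call setup_initial_cluster from the directory where you want the repository to be cloned. The function leaves your shell in learn-vault-raft/raft-auto-upgrade/local, which is where the later steps expect you to be.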
Use your preferred text editor and open the config-vault_2.hcl file to examine the generated server configuration for vault_2.
Review the generated server configuration for vault_3 in the config-vault_3.hcl file.
The retry_join block is configured so that the vault_3 and vault_4 nodes automatically join the cluster.
Export an environment variable for the vault CLI to address the vault_2 server.
Verify the cluster members.
View the autopilot's upgrade state information. In the output, notice that the Upgrade Info field shows a Status of idle.
If the watch command (or a similar tool) is installed, you can use it to monitor the upgrade status as you add more nodes, for example by checking the autopilot state every half a second.
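The verification steps in this section might look like the following sketch. The function names are hypothetical; the sketch assumes the vault CLI is already authenticated against the cluster, and it uses the vault_2 address from the scenario introduction.

```shell
# Point the vault CLI at vault_2.
export VAULT_ADDR="http://127.0.0.1:8200"

# List the Raft peers to verify the cluster members.
show_cluster_members() {
  vault operator raft list-peers
}

# Print the autopilot state, which includes the Upgrade Info status.
show_autopilot_state() {
  vault operator raft autopilot state
}

# Re-run the state check every half a second (requires the watch command).
watch_autopilot_state() {
  watch -n 0.5 vault operator raft autopilot state
}
```

Running watch_autopilot_state in a second terminal keeps the status visible while you add nodes in the next section.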
»Add new nodes
When autopilot detects that a node with a newer version of Vault has joined the cluster, it waits to promote the new node to voter until enough newer-versioned nodes have been added to the cluster to reach quorum. When the count of new nodes equals or exceeds that of the old nodes, autopilot begins promoting the new nodes to voters and demoting the old nodes to non-voters.
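The counting rule described above can be illustrated with a small sketch. It only mirrors the rule as stated in this tutorial; it is not Vault's internal implementation, and the ready_to_promote name is hypothetical.

```shell
# Illustration only: succeeds (exit 0) when the number of new-version nodes
# equals or exceeds the number of old-version nodes.
ready_to_promote() {
  new_count=$1
  old_count=$2
  [ "$new_count" -ge "$old_count" ]
}

# With 3 old nodes, the rule is first satisfied when the third new node joins.
for new in 1 2 3; do
  if ready_to_promote "$new" 3; then
    echo "new=$new old=3: promote new nodes"
  else
    echo "new=$new old=3: keep waiting"
  fi
done
```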
Use your preferred text editor and open the config-vault_5.hcl file to examine the generated server configuration for vault_5.
To enable automatic version upgrade, add the autopilot_upgrade_version parameter in the storage stanza where its value is a SemVer compatible version string of your choosing.
Vault Configuration: The vault_5, vault_6 and vault_7 nodes have the autopilot_upgrade_version parameter configured. This implies that those nodes have a specific target Vault version.
Set the setup_2.sh file to executable.
Execute the setup_2.sh script to add three additional nodes to the cluster.
Monitor the autopilot's upgrade status as it progresses, either by checking the state on demand or by polling it with watch.
The Status changes from idle to await-new-voters.
The status will change to promoting as autopilot promotes the 3 new nodes to be voters. Then the status will change to demoting, as autopilot demotes 2 out of the 3 old nodes to be non-voters. Then, the leader will change from vault_2 to vault_5.
Finally, the status changes to await-server-removal.
Autopilot Status: The progression of autopilot statuses during an upgrade looks like: idle → await-new-voters → promoting → demoting → leader-transfer → await-server-removal → idle.
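The steps in this section can be sketched with the helper functions below. The function names are hypothetical. The CLI variant relies on the state output containing an Upgrade Info section, as mentioned earlier in this tutorial. The API variant assumes that VAULT_ADDR and VAULT_TOKEN are set and that the response nests the upgrade details under data.upgrade_info.

```shell
# Confirm that a node's configuration file sets autopilot_upgrade_version.
# Usage: show_upgrade_version config-vault_5.hcl
show_upgrade_version() {
  grep autopilot_upgrade_version "$1"
}

# Make setup_2.sh executable and run it to add the three new nodes.
add_new_nodes() {
  chmod +x setup_2.sh && ./setup_2.sh
}

# Print the autopilot state from the "Upgrade Info" line to the end.
show_upgrade_status() {
  vault operator raft autopilot state | sed -n '/Upgrade Info/,$p'
}

# Alternative using curl and jq against the HTTP API.
# Assumption: the upgrade details are found under data.upgrade_info.
show_upgrade_status_api() {
  curl --silent --header "X-Vault-Token: $VAULT_TOKEN" \
    "$VAULT_ADDR/v1/sys/storage/raft/autopilot/state" | jq '.data.upgrade_info'
}
```

Calling show_upgrade_status repeatedly while setup_2.sh runs should let you follow the status progression described above.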
»Remove non-voter nodes
Once the autopilot upgrade status changes to await-server-removal, you can remove the old non-voting nodes from the cluster.
List the current peers before removing any nodes.
Export an environment variable for the vault CLI to address the server.
Remove vault_2 from the cluster.
Remove vault_3 from the cluster.
Remove vault_4 from the cluster.
Verify that the non-voter nodes have been removed from the cluster.
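A minimal sketch of the removal steps follows. The remove_old_nodes name is hypothetical. The address is also an assumption: it targets the new leader vault_5 and follows the port pattern used by the other nodes in this tutorial, so adjust it if your environment differs.

```shell
# Assumed address of the new leader (vault_5); not stated in the text above.
export VAULT_ADDR="http://127.0.0.1:8500"

# List the peers, remove each old node by its node ID, then list the
# remaining peers to verify the result.
remove_old_nodes() {
  vault operator raft list-peers || return 1
  for node in vault_2 vault_3 vault_4; do
    vault operator raft remove-peer "$node" || return 1
  done
  vault operator raft list-peers
}
```

The function stops at the first failing command, so a failed removal does not go unnoticed before the final peer listing.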
»Autopilot configuration
By default, automated upgrade migrations are enabled for Vault Enterprise. You can confirm this by reading the current autopilot configuration.
To disable automated upgrade migrations, set the -disable-upgrade-migration parameter to true.
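Reading and changing this setting might look like the sketch below. The function names are hypothetical, and the sketch assumes the vault CLI is authenticated and addressed at a node of the cluster.

```shell
# Read the current autopilot configuration.
show_autopilot_config() {
  vault operator raft autopilot get-config
}

# Turn off automated upgrade migrations.
disable_upgrade_migration() {
  vault operator raft autopilot set-config -disable-upgrade-migration=true
}
```

Running show_autopilot_config again after disable_upgrade_migration lets you check that the change took effect.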
»Clean up
The cluster.sh script provides a clean operation that removes all services, configuration, and modifications to your local system.
Clean up your local workstation.
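A minimal sketch of the clean-up step, assuming that cluster.sh is in the current working directory and is executable, and that the clean operation is passed as the first argument. The clean_up name is hypothetical.

```shell
# Run the clean operation of cluster.sh from the working directory.
# Assumption: the operation name is given as the first argument.
clean_up() {
  ./cluster.sh clean
}
```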
»Next steps
In this tutorial, you upgraded your Vault cluster by using autopilot's automated upgrades functionality. Automated upgrades let you automatically upgrade a cluster of Vault nodes to a new version as updated server nodes join the cluster. Once the number of nodes on the new version is equal to or greater than the number of nodes on the old version, autopilot promotes the newer-versioned nodes to voters, demotes the older-versioned nodes to non-voters, and initiates a leadership transfer from the older-version leader to one of the newer-versioned nodes. After the leadership transfer completes, you can remove the older-versioned non-voting nodes from the cluster.