Leverage Nomad's Vault Integration

Generate mTLS Certificates for Nomad using Vault

You can use consul-template in your Nomad cluster to integrate with Vault's PKI Secrets Engine to generate and renew dynamic X.509 certificates. With this method, each node gets a unique certificate with a relatively short time-to-live (TTL). Short-lived certificates, combined with automatic certificate rotation, let you scale your cluster securely while using mutual TLS (mTLS).

In this guide, your goal will be to secure your existing Nomad cluster with mTLS. To accomplish this, you will configure Vault's PKI secrets engine to create both a root and intermediate CA. You will also use consul-template to fetch, renew, and periodically rotate your mTLS certificates on your Nomad nodes.

»Prerequisites

To perform the tasks described in this guide, you need:

  • A Nomad environment with Consul and Vault installed. You can use this repository to provision a sandbox environment. This guide assumes a cluster with one server node and three client nodes.

  • consul-template installed on your nodes.

»Prepare Vault

If you already have a Vault environment that you are integrating with, you can skip ahead to "Log in to Vault".

»Initialize Vault server

Run the following command to initialize the Vault server and receive an unseal key and initial root token. If you are running the environment provided in this guide, the Vault server is co-located with the Nomad server. Be sure to note the unseal key and initial root token, as you will need both in the following steps.

$ vault operator init -key-shares=1 -key-threshold=1

The vault operator init command above creates a single Vault unseal key for convenience. For a production environment, it is recommended that you create at least five unseal key shares and securely distribute them to independent operators. The vault operator init command defaults to five key shares and a key threshold of three. If you provisioned more than one server, the others will become standby nodes but should still be unsealed.

»Unseal Vault

Run the following command and then provide your unseal key to Vault.

$ vault operator unseal

The output of unsealing Vault will look similar to the following.

Key                    Value
---                    -----
Seal Type              shamir
Initialized            true
Sealed                 false
Total Shares           1
Threshold              1
Version                1.0.3
Cluster Name           vault-cluster-d1b6513f
Cluster ID             87d6d13f-4b92-60ce-1f70-41a66412b0f1
HA Enabled             true
HA Cluster             n/a
HA Mode                standby
Active Node Address    <none>

»Log in to Vault

Use the login command to authenticate yourself against Vault using the initial root token you received earlier. You will need to authenticate to run the necessary commands to write policies, create roles, and configure your root and intermediate CAs.

$ vault login <your initial root token>

If your login is successful, the vault login command will generate output similar to this.

Success! You are now authenticated. The token information displayed below
is already stored in the token helper. You do NOT need to run "vault login"
again. Future Vault requests will automatically use this token.
...

»Prepare the PKI environment

»Generate the root CA

Enable the PKI secrets engine at the pki path.

$ vault secrets enable pki

Tune the PKI secrets engine to issue certificates with a maximum time-to-live (TTL) of 87600 hours.

$ vault secrets tune -max-lease-ttl=87600h pki

Generate the root certificate and save the certificate as CA_cert.crt.

$ vault write -field=certificate pki/root/generate/internal \
    common_name="global.nomad" ttl=87600h > CA_cert.crt

»Generate the intermediate CA and CSR

Enable the PKI secrets engine at the pki_int path.

$ vault secrets enable -path=pki_int pki

Tune the PKI secrets engine at the pki_int path to issue certificates with a maximum time-to-live (TTL) of 43800 hours.

$ vault secrets tune -max-lease-ttl=43800h pki_int
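As a quick sanity check on the two maximum TTLs used so far, the following shell arithmetic converts hours to whole 365-day years. The hours_to_years helper is a hypothetical name used only for this illustration; it is not part of Vault.

```shell
#!/bin/sh
# Hypothetical helper: convert a TTL in hours to whole 365-day years.
hours_to_years() {
  echo $(( $1 / 24 / 365 ))
}

root_years=$(hours_to_years 87600)
int_years=$(hours_to_years 43800)

echo "root CA max TTL:         ${root_years} years"
echo "intermediate CA max TTL: ${int_years} years"
```

Ignoring leap days, 87600 hours is 10 years and 43800 hours is 5 years, so the intermediate CA is set to expire well before the root that signs it.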

Generate a CSR from your intermediate CA and save it as pki_intermediate.csr.

$ vault write -format=json pki_int/intermediate/generate/internal \
    common_name="global.nomad Intermediate Authority" \
    ttl="43800h" | jq -r '.data.csr' > pki_intermediate.csr

»Sign and deploy the intermediate CA certificate

Sign the intermediate CA CSR with the root certificate and save the generated certificate as intermediate.cert.pem.

$ vault write -format=json pki/root/sign-intermediate \
    csr=@pki_intermediate.csr format=pem_bundle \
    ttl="43800h" | jq -r '.data.certificate' > intermediate.cert.pem

Once the CSR is signed and the root CA returns a certificate, it can be imported back into Vault.

$ vault write pki_int/intermediate/set-signed certificate=@intermediate.cert.pem
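Before relying on the new intermediate, you may want to confirm that it chains back to the root certificate. The sketch below assumes the openssl command-line tool is installed and that CA_cert.crt and intermediate.cert.pem are in the current directory, as in the steps above. The verify_intermediate helper is a hypothetical name for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: check that an intermediate certificate was signed by
# the given root certificate, using the openssl CLI (assumed to be installed).
verify_intermediate() {
  root_cert=$1
  int_cert=$2

  # Fail early with status 1 if either file is missing.
  for f in "$root_cert" "$int_cert"; do
    if [ ! -f "$f" ]; then
      echo "missing file: $f" >&2
      return 1
    fi
  done

  # Status 2 signals that the check could not be attempted at all.
  if ! command -v openssl >/dev/null 2>&1; then
    echo "openssl not found; cannot verify" >&2
    return 2
  fi

  # openssl exits non-zero if the chain does not validate.
  openssl verify -CAfile "$root_cert" "$int_cert"
}

verify_intermediate CA_cert.crt intermediate.cert.pem \
  || echo "intermediate certificate was not verified"
```

On success, openssl verify prints the certificate file name followed by OK.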

»Create a role

A role is a logical name that maps to a policy used to generate credentials. In this example, the role sets parameters that control certificate common names, alternate names, and subdomains, along with a few other key settings.

Create a role named nomad-cluster that specifies the allowed domains, allows certificates for subdomains, and caps certificate TTLs at 86400 seconds (24 hours).

$ vault write pki_int/roles/nomad-cluster allowed_domains=global.nomad \
    allow_subdomains=true max_ttl=86400s require_cn=false generate_lease=true

You will receive a success message if your role was created properly.

Success! Data written to: pki_int/roles/nomad-cluster
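
To make the role's naming rules concrete, here is a rough local approximation of which names this role would accept. It is a simplification written for this guide, not Vault's actual validation logic: it assumes allow_subdomains=true as set above, and relies on the PKI role defaults allow_localhost=true and allow_bare_domains=false. The name_allowed function is hypothetical.

```shell
#!/bin/sh
# Hypothetical approximation of the nomad-cluster role's name rules.
# Not Vault's real validation; it only mirrors the settings discussed above.
name_allowed() {
  name=$1
  domain=global.nomad
  case "$name" in
    localhost)    return 0 ;;  # allow_localhost defaults to true
    ?*."$domain") return 0 ;;  # allow_subdomains=true
    *)            return 1 ;;  # bare domain and unrelated names are rejected
  esac
}

for n in server.global.nomad client.global.nomad localhost global.nomad example.com; do
  if name_allowed "$n"; then
    echo "$n: allowed"
  else
    echo "$n: rejected"
  fi
done
```

This is why the templates later in this guide request names such as server.global.nomad and client.global.nomad, with localhost as an alternate name.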

»Create a policy to access the role endpoint

Recall that you generated a root token earlier and used it to log in to Vault. Although you could use that token in the next steps to generate your TLS certificates, the recommended security approach is to create a new token based on a specific policy with limited privileges.

Create a policy file named tls-policy.hcl and provide it the following contents.

path "pki_int/issue/nomad-cluster" {
  capabilities = ["update"]
}

Note that you are specifying the update capability on the path pki_int/issue/nomad-cluster. All other privileges will be denied. You can read more about Vault policies here.
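
One way to create the file without opening an editor is a shell here-document. This is only a convenience sketch; it writes tls-policy.hcl to the current directory with exactly the contents shown above.

```shell
#!/bin/sh
# Write the policy file to the current directory. The quoted 'EOF' delimiter
# stops the shell from expanding anything inside the here-document.
cat > tls-policy.hcl <<'EOF'
path "pki_int/issue/nomad-cluster" {
  capabilities = ["update"]
}
EOF

# Print the file back to confirm its contents.
cat tls-policy.hcl
```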

Write the policy you just created into Vault.

$ vault policy write tls-policy tls-policy.hcl
Success! Uploaded policy: tls-policy

»Configure consul-template

»Generate a token based on tls-policy

Create a token based on tls-policy with the following command.

$ vault token create -policy="tls-policy" -period=24h -orphan

On success, you will receive output similar to the following.

Key                  Value
---                  -----
token                s.m069Vpul3c4lfGnJ6unpxgxD
token_accessor       HiZALO25hDQzSgyaglkzty3M
token_duration       24h
token_renewable      true
token_policies       ["default" "tls-policy"]
identity_policies    []
policies             ["default" "tls-policy"]

Make a note of this token as you will need it in the upcoming steps.

»Create and populate the templates directory

You need to create templates that consul-template can use to render the actual certificates and keys on the nodes in your cluster. In this guide, you will place these templates in /opt/nomad/templates.

Create a directory called templates in /opt/nomad.

$ sudo mkdir /opt/nomad/templates

Below are the templates that the consul-template configuration will use. You will provide different templates to the nodes depending on whether they are server nodes or client nodes. All of the nodes will get the CLI templates (since you want to use the CLI on any of the nodes).

»Nomad Servers

agent.crt.tpl:

{{ with secret "pki_int/issue/nomad-cluster" "common_name=server.global.nomad" "ttl=24h" "alt_names=localhost" "ip_sans=127.0.0.1"}}
{{ .Data.certificate }}
{{ end }}

agent.key.tpl:

{{ with secret "pki_int/issue/nomad-cluster" "common_name=server.global.nomad" "ttl=24h" "alt_names=localhost" "ip_sans=127.0.0.1"}}
{{ .Data.private_key }}
{{ end }}

ca.crt.tpl:

{{ with secret "pki_int/issue/nomad-cluster" "common_name=server.global.nomad" "ttl=24h"}}
{{ .Data.issuing_ca }}
{{ end }}

»Nomad Clients

Replace the word server in the common_name option in each template with the word client.

agent.crt.tpl:

{{ with secret "pki_int/issue/nomad-cluster" "common_name=client.global.nomad" "ttl=24h" "alt_names=localhost" "ip_sans=127.0.0.1"}}
{{ .Data.certificate }}
{{ end }}

agent.key.tpl:

{{ with secret "pki_int/issue/nomad-cluster" "common_name=client.global.nomad" "ttl=24h" "alt_names=localhost" "ip_sans=127.0.0.1"}}
{{ .Data.private_key }}
{{ end }}

ca.crt.tpl:

{{ with secret "pki_int/issue/nomad-cluster" "common_name=client.global.nomad" "ttl=24h"}}
{{ .Data.issuing_ca }}
{{ end }}
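
If you have already written the server templates, the common_name replacement can be scripted rather than done by hand. The following sketch uses sed on a scratch copy so that it is safe to run anywhere. The temporary directory and the server-agent.crt.tpl and client-agent.crt.tpl file names are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Work in a scratch directory so no real templates are touched.
tmpdir=$(mktemp -d)

# A copy of the server certificate template from this guide.
cat > "$tmpdir/server-agent.crt.tpl" <<'EOF'
{{ with secret "pki_int/issue/nomad-cluster" "common_name=server.global.nomad" "ttl=24h" "alt_names=localhost" "ip_sans=127.0.0.1"}}
{{ .Data.certificate }}
{{ end }}
EOF

# Rewrite only the common_name option; everything else is left as-is.
sed 's/common_name=server\./common_name=client./' \
  "$tmpdir/server-agent.crt.tpl" > "$tmpdir/client-agent.crt.tpl"

cat "$tmpdir/client-agent.crt.tpl"
```

Anchoring the substitution on common_name= avoids changing any other occurrence of the word server that a template might contain.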

»Nomad CLI

cli.crt.tpl:

{{ with secret "pki_int/issue/nomad-cluster" "ttl=24h" }}
{{ .Data.certificate }}
{{ end }}

cli.key.tpl:

{{ with secret "pki_int/issue/nomad-cluster" "ttl=24h" }}
{{ .Data.private_key }}
{{ end }}

»Configure consul-template on all nodes

If you are using the AWS environment provided in this guide, you already have consul-template installed on all nodes. If you are using your own environment, ensure consul-template is installed. You can download it here.

Add the token you created from tls-policy to the consul-template configuration file located at /etc/consul-template.d/consul-template.hcl. You will also need template stanzas that render each of the following on your nodes, at the destination you specify, from the templates you created in the previous step.

  • Node certificate
  • Node private key
  • CA public certificate

You will also add template stanzas that render certificates and keys for the Nomad CLI from the CLI templates you created earlier. The CLI defaults to HTTP but will need to use HTTPS once TLS is enabled in your cluster.

Your consul-template.hcl configuration file should look similar to the following (you will need to provide this to each node in the cluster).

# This denotes the start of the configuration section for Vault. All values
# contained in this section pertain to Vault.
vault {
  # This is the address of the Vault leader. The protocol (http(s)) portion
  # of the address is required.
  address      = "http://active.vault.service.consul:8200"

  # This value can also be specified via the environment variable VAULT_TOKEN.
  token        = "s.m069Vpul3c4lfGnJ6unpxgxD"

  # This should also be less than or around 1/3 of your TTL for a predictable
  # behaviour. Consult https://github.com/hashicorp/vault/issues/3414
  grace        = "1s"

  # This tells consul-template that the provided token is actually a wrapped
  # token that should be unwrapped using Vault's cubbyhole response wrapping
  # before being used. Consult Vault's cubbyhole response wrapping documentation
  # for more information.
  unwrap_token = false

  # This option tells consul-template to automatically renew the Vault token
  # given. If you are unfamiliar with Vault's architecture, Vault requires
  # tokens be renewed at some regular interval or they will be revoked. Consul
  # Template will automatically renew the token at half the lease duration of
  # the token. The default value is true, but this option can be disabled if
  # you want to renew the Vault token using an out-of-band process.
  renew_token  = true
}

# This block defines the configuration for connecting to a syslog server for
# logging.
syslog {
  enabled  = true

  # This is the name of the syslog facility to log to.
  facility = "LOCAL5"
}

# This block defines the configuration for a template. Unlike other blocks,
# this block may be specified multiple times to configure multiple templates.
template {
  # This is the source file on disk to use as the input template. This is often
  # called the "consul-template template".
  source      = "/opt/nomad/templates/agent.crt.tpl"

  # This is the destination path on disk where the source template will render.
  # If the parent directories do not exist, consul-template will attempt to
  # create them, unless create_dest_dirs is false.
  destination = "/opt/nomad/agent-certs/agent.crt"

  # This is the permission to render the file. If this option is left
  # unspecified, consul-template will attempt to match the permissions of the
  # file that already exists at the destination path. If no file exists at that
  # path, the permissions are 0644.
  perms       = 0700

  # This is the optional command to run when the template is rendered. The
  # command will only run if the resulting template changes.
  command     = "systemctl reload nomad"
}

template {
  source      = "/opt/nomad/templates/agent.key.tpl"
  destination = "/opt/nomad/agent-certs/agent.key"
  perms       = 0700
  command     = "systemctl reload nomad"
}

template {
  source      = "/opt/nomad/templates/ca.crt.tpl"
  destination = "/opt/nomad/agent-certs/ca.crt"
  command     = "systemctl reload nomad"
}

# The following template stanzas are for the CLI certs

template {
  source      = "/opt/nomad/templates/cli.crt.tpl"
  destination = "/opt/nomad/cli-certs/cli.crt"
}

template {
  source      = "/opt/nomad/templates/cli.key.tpl"
  destination = "/opt/nomad/cli-certs/cli.key"
}

»Start the consul-template service

Start the consul-template service on each node.

$ sudo systemctl start consul-template

You can quickly confirm the appropriate certs and private keys were generated in the destination directory you specified in your consul-template configuration by listing them out.

$ ls /opt/nomad/agent-certs/ /opt/nomad/cli-certs/
/opt/nomad/agent-certs/:
agent.crt  agent.key  ca.crt

/opt/nomad/cli-certs/:
cli.crt  cli.key
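
To look inside a rendered certificate rather than just confirming that it exists, you can decode it with openssl. This sketch assumes the openssl command-line tool is available on the node and that your user can read the file (you may need sudo, given the restrictive permissions set in the template stanza). show_cert is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the subject, issuer, and validity window of a
# PEM-encoded certificate using the openssl CLI (assumed to be installed).
show_cert() {
  cert=$1
  if [ ! -r "$cert" ]; then
    echo "cannot read certificate: $cert" >&2
    return 1
  fi
  openssl x509 -in "$cert" -noout -subject -issuer -dates
}

show_cert /opt/nomad/agent-certs/agent.crt \
  || echo "could not inspect the agent certificate"
```

With the templates from this guide, the notBefore and notAfter dates should be about 24 hours apart, matching the ttl=24h requested in each template.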

»Configure Nomad to use TLS

Add the following tls stanza to the configuration of all Nomad agents (servers and clients) in the cluster (configuration file located at /etc/nomad.d/nomad.hcl in this example).

tls {
  http = true
  rpc  = true

  ca_file   = "/opt/nomad/agent-certs/ca.crt"
  cert_file = "/opt/nomad/agent-certs/agent.crt"
  key_file  = "/opt/nomad/agent-certs/agent.key"

  verify_server_hostname = true
  verify_https_client    = true
}

Additionally, ensure the rpc_upgrade_mode option is set to true inside the tls stanza on your server nodes. This ensures the Nomad servers accept both TLS and non-TLS connections during the upgrade.

rpc_upgrade_mode       = true

Reload Nomad's configuration on all nodes.

$ sudo systemctl reload nomad

Once Nomad has been reloaded on all nodes, go back to your server nodes and change the rpc_upgrade_mode option to false (or remove the line since the option defaults to false) so that your Nomad servers will only accept TLS connections.

rpc_upgrade_mode       = false

You will need to reload Nomad on your servers after changing this setting. You can read more about RPC Upgrade Mode here.

If you run nomad status, you will now receive the following error.

Error querying jobs: Get http://172.31.52.215:4646/v1/jobs: net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x01\x00\x02\x02"

This is because the Nomad CLI defaults to communicating via HTTP instead of HTTPS. You can configure the local Nomad client to connect using TLS and specify your custom key and certificates by setting the following environment variables.

export NOMAD_ADDR=https://localhost:4646
export NOMAD_CACERT="/opt/nomad/agent-certs/ca.crt"
export NOMAD_CLIENT_CERT="/opt/nomad/cli-certs/cli.crt"
export NOMAD_CLIENT_KEY="/opt/nomad/cli-certs/cli.key"

After these environment variables are correctly configured, the CLI will respond as expected.

$ nomad status
No running jobs
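
These exports only last for the current shell session. One option, sketched below, is to keep them in a small file and source it whenever you need the CLI. The nomad-tls.env file name and its location in the current directory are assumptions made for this example, not a Nomad convention.

```shell
#!/bin/sh
# Save the CLI settings to a file (hypothetical name: nomad-tls.env).
cat > nomad-tls.env <<'EOF'
export NOMAD_ADDR=https://localhost:4646
export NOMAD_CACERT="/opt/nomad/agent-certs/ca.crt"
export NOMAD_CLIENT_CERT="/opt/nomad/cli-certs/cli.crt"
export NOMAD_CLIENT_KEY="/opt/nomad/cli-certs/cli.key"
EOF

# Load the settings into the current shell and confirm they are set.
. ./nomad-tls.env
echo "NOMAD_ADDR=$NOMAD_ADDR"
```

In a new session, running `. ./nomad-tls.env` from the same directory restores the settings before you use the nomad CLI.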

»Encrypt server gossip

At this point all of Nomad's RPC and HTTP communication is secured with mTLS. However, Nomad servers also communicate with a gossip protocol, Serf, that does not use TLS.

  • HTTP - Used to communicate between CLI and Nomad agents. Secured by mTLS.
  • RPC - Used to communicate between Nomad agents. Secured by mTLS.
  • Serf - Used to communicate between Nomad servers. Secured by a shared key.

You can learn how to configure gossip encryption in the Enable Gossip Encryption for Nomad guide.

»Next Steps

In this guide, you:

  • explored the Vault PKI secrets engine,
  • used consul-template to fetch certificates from Vault, and
  • enabled a Nomad cluster to use mTLS certificates to encrypt traffic.

»Reference Material