Nomad's first-class integration with Consul allows operators to design jobs that
natively leverage Consul Connect. However, in Consul clusters that are ACL-enabled,
there are a few extra steps required to ensure that your Nomad servers and clients
have Consul ACL tokens with sufficient privileges to create the additional services
required for the sidecar proxies. This tutorial walks through those steps, has you run a
sample Connect workload, and introduces the allow_unauthenticated value, so that
you can configure your own Nomad cluster to run Connect jobs against
your own ACL-enabled Consul cluster.
»Prerequisites
- Nomad v0.10.4 or greater
- a Nomad environment with Nomad and Consul installed. You can use this Terraform environment to provision a sandbox environment. This guide assumes a cluster with one node running both Consul and Nomad in server mode and one or more nodes running Nomad and Consul in client mode.
- a Consul cluster that is ACL-enabled and bootstrapped
- a management token for that cluster
You can use the "Secure Consul with ACLs" tutorial to configure a Consul cluster for this guide.
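If you still need to enable ACLs on that cluster, a minimal sketch of the Consul agent configuration and bootstrap step is shown below; the file path and default_policy shown are assumptions for illustration, and the linked tutorial covers the complete procedure.
# /etc/consul.d/acl.hcl (path is an assumption)
acl {
  enabled                  = true
  default_policy           = "deny"
  enable_token_persistence = true
}
After restarting Consul with this configuration on every agent, bootstrap the ACL system once and save the management token for the later steps in this guide.
$ consul acl bootstrap | tee consul.bootstrap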
If your Consul cluster is TLS-enabled for agent communication and you are using Nomad version 0.10, you will need to provide some Consul configuration as environment variables for your Nomad process. This can be done by modifying your init scripts or systemd system units. This will be discussed later in the guide.
Note: This tutorial is for demo purposes and only uses a single Nomad server with a Consul server configured alongside it. In a production cluster, 3 or 5 Nomad server nodes are recommended along with a separate Consul cluster. Consult the Consul Reference Architecture to learn how to securely deploy a Consul cluster.
NOTE: A similar, interactive lab is also available if you do not have a Nomad environment to perform the steps described in this guide. Click the Show Tutorial button to launch the lab experience.
»Generate Consul ACL tokens for Nomad
»Create a Nomad server policy
Define the Nomad server policy by making a file named nomad-server-policy.hcl
with this content.
agent_prefix "" {
  policy = "read"
}
node_prefix "" {
  policy = "read"
}
service_prefix "" {
  policy = "write"
}
acl = "write"
Create the Nomad server policy by uploading this file.
$ consul acl policy create \
-name "nomad-server" \
-description "Nomad Server Policy" \
-rules @nomad-server-policy.hcl
The command outputs information about the newly created policy and its rules.
$ consul acl policy create \
> -name "nomad-server" \
> -description "Nomad Server Policy" \
> -rules @nomad-server-policy.hcl
ID: 4ca519e1-d480-5fd2-160e-8a84cc22eefa
Name: nomad-server
Description: Nomad Server Policy
Datacenters:
Rules:
agent_prefix "" {
  policy = "read"
}
node_prefix "" {
  policy = "read"
}
service_prefix "" {
  policy = "write"
}
acl = "write"
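You can re-read the policy later at any time to confirm its rules were stored as expected.
$ consul acl policy read -name "nomad-server"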
»Create a Nomad client policy
Define the Nomad client policy by making a file named nomad-client-policy.hcl
with this content.
agent_prefix "" {
  policy = "read"
}
node_prefix "" {
  policy = "read"
}
service_prefix "" {
  policy = "write"
}
# uncomment if using Consul KV with Consul Template
# key_prefix "" {
#   policy = "read"
# }
Create the Nomad client policy by uploading this file.
$ consul acl policy create \
-name "nomad-client" \
-description "Nomad Client Policy" \
-rules @nomad-client-policy.hcl
The command outputs information about the newly created policy and its rules.
$ consul acl policy create \
> -name "nomad-client" \
> -description "Nomad Client Policy" \
> -rules @nomad-client-policy.hcl
ID: b093d1c5-a800-7973-4d73-6c7ac2c8ec01
Name: nomad-client
Description: Nomad Client Policy
Datacenters:
Rules:
agent_prefix "" {
  policy = "read"
}
node_prefix "" {
  policy = "read"
}
service_prefix "" {
  policy = "write"
}
# uncomment if using Consul KV with Consul Template
# key_prefix "" {
#   policy = "read"
# }
»Create a token for Nomad
Generate a token associated with these policies and save it to a file named nomad-agent.token. Because this tutorial is written for a node that is both a client and a server, apply both policies to the token. Typically, you would generate tokens with the nomad-server policy for your Nomad server nodes and tokens with the nomad-client policy for your Nomad client nodes.
Consider applying roles instead of rotating tokens
If your Nomad node already has a Consul token, it is better to add the required policies to that token or to its roles rather than switching to a new token.
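As a sketch of that role-based alternative (not used in the rest of this tutorial; the nomad-agent role name is an assumption), you could attach both policies to a role and issue the agent token from the role, so that future permission changes only touch the role.
$ consul acl role create \
  -name "nomad-agent" \
  -description "Role for Nomad agents" \
  -policy-name "nomad-server" \
  -policy-name "nomad-client"

$ consul acl token create \
  -description "Nomad Demo Agent Token" \
  -role-name "nomad-agent" | tee nomad-agent.token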
$ consul acl token create \
-description "Nomad Demo Agent Token" \
-policy-name "nomad-server" \
-policy-name "nomad-client" | tee nomad-agent.token
The command will return a new Consul token for use in your Nomad configuration.
$ consul acl token create \
> -description "Nomad Demo Agent Token" \
> -policy-name "nomad-server" \
> -policy-name "nomad-client" | tee nomad-agent.token
AccessorID: 98eb9a6a-5823-6138-93b4-bf9958e6d16c
SecretID: 4ca3820e-1bc4-2980-94ef-e6a421eddd7d
Description: Nomad Demo Agent Token
Local: false
Create Time: 2020-03-31 19:04:03.734810397 +0000 UTC
Policies:
4ca519e1-d480-5fd2-160e-8a84cc22eefa - nomad-server
b093d1c5-a800-7973-4d73-6c7ac2c8ec01 - nomad-client
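The SecretID in this output is the value you will place in the Nomad configuration in the next step. Because the output was piped to nomad-agent.token, you can extract it with the same awk pattern used later in this guide.
$ awk '/SecretID/ {print $2}' nomad-agent.token
4ca3820e-1bc4-2980-94ef-e6a421eddd7d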
»Update Nomad's Consul configuration
Open your Nomad configuration file on all of your nodes and add a consul stanza with your token.
consul {
  token = "«your nomad agent token»"
}
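If you would rather not write the token into the configuration file, Nomad can also read it from the CONSUL_HTTP_TOKEN environment variable. For a systemd-managed Nomad, a sketch of a unit drop-in would look like the following; the token value shown is the demo SecretID from above.
[Service]
Environment="CONSUL_HTTP_TOKEN=4ca3820e-1bc4-2980-94ef-e6a421eddd7d"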
»Provide environment variables for TLS-enabled Consul
If you are using Nomad version 0.10 and your Consul cluster is TLS-enabled, you
will need to provide additional Consul configurations as environment
variables to the Nomad process. This works around a known issue in
Nomad (hashicorp/nomad#6594). Refer to the TLS-enabled Consul
environment section in the "Advanced considerations" appendix of this tutorial
for details. You can return here after you read that material.
»Alternative architectures (non-x86/amd64)
If you are running on ARM or another non-x86/amd64 architecture, jump to the Alternative architectures section in the "Advanced considerations" appendix of this tutorial for details. You can return here after you read that material.
»Restart Nomad to load new configuration
Run systemctl restart nomad to restart Nomad and load these changes.
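To confirm that Nomad reconnected to Consul with the new token, you can list the Consul catalog; the Nomad agents register themselves as the nomad and nomad-client services by default, so you should see entries similar to the following.
$ consul catalog services
consul
nomad
nomad-client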
»Run a Connect-enabled job
»Create the job specification
Create the "countdash" job by copying this job specification into a file named
countdash.nomad
.
job "countdash" {
datacenters = ["dc1"]
group "api" {
network {
mode = "bridge"
}
service {
name = "count-api"
port = "9001"
connect {
sidecar_service {}
}
}
task "web" {
driver = "docker"
config {
image = "hashicorpnomad/counter-api:v1"
}
}
}
group "dashboard" {
network {
mode ="bridge"
port "http" {
static = 9002
to = 9002
}
}
service {
name = "count-dashboard"
port = "9002"
connect {
sidecar_service {
proxy {
upstreams {
destination_name = "count-api"
local_bind_port = 8080
}
}
}
}
}
task "dashboard" {
driver = "docker"
env {
COUNTING_SERVICE_URL = "http://${NOMAD_UPSTREAM_ADDR_count_api}"
}
config {
image = "hashicorpnomad/counter-dashboard:v1"
}
}
}
}
»Create an intention
In Consul, the default intention behavior is defined by the default ACL policy. If the default ACL policy is "allow all", then all service mesh connections are allowed by default. If the default ACL policy is "deny all", then all service mesh connections are denied by default.
To avoid unexpected behavior around this, it is better to create an explicit intention. Create an intention to allow traffic from the count-dashboard service to the count-api service.
Run consul intention create count-dashboard count-api to create the intention.
The command will output that it created an intention to allow traffic from the "count-dashboard" service to the "count-api" service.
$ consul intention create count-dashboard count-api
Created: count-dashboard => count-api (allow)
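If you want to confirm the effect of the intention, consul intention check reports whether a connection from the source service to the destination would be allowed.
$ consul intention check count-dashboard count-api
Allowed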
»Run the job
Run the job by calling nomad run countdash.nomad.
The command will output the result of running the job and show the allocation IDs of the two new allocations that are created.
$ nomad run countdash.nomad
==> Monitoring evaluation "3e7ebb57"
Evaluation triggered by job "countdash"
Evaluation within deployment: "9eaf6878"
Allocation "012eb94f" created: node "c0e8c600", group "api"
Allocation "02c3a696" created: node "c0e8c600", group "dashboard"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "3e7ebb57" finished with status "complete"
Once you are done, run nomad stop countdash to prepare for the next step.
The command will output evaluation information about the stop request and stop the allocations in the background.
$ nomad stop countdash
==> Monitoring evaluation "d4796df1"
Evaluation triggered by job "countdash"
Evaluation within deployment: "18b25bb6"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "d4796df1" finished with status "complete"
»Use Consul authentication on jobs
By default, Nomad does not require the job submitter to authenticate to Consul and will create the Connect-related Consul objects with whatever privileges the Nomad server's token allows. In some scenarios, this can allow an operator to escalate their privileges to those of the Nomad server.
To prevent this, you can set the allow_unauthenticated option to false.
»Update Nomad configuration
Open your Nomad configuration file on all of your nodes and add the allow_unauthenticated value inside of the consul configuration block.
consul {
  # ...
  allow_unauthenticated = false
}
Run the systemctl restart nomad command to restart Nomad and load these changes.
»Submit the job with a Consul token
Start by unsetting the Consul token in your shell session.
$ unset CONSUL_HTTP_TOKEN
Now, try running countdash.nomad again. This time you will receive an error indicating that you need to supply a Consul ACL token in order to run the job.
$ nomad run countdash.nomad
Error submitting job: Unexpected response code: 500 (operator token denied: missing consul token)
Nomad will not allow you to submit a job to the cluster without providing a Consul token that has write access to the Consul service that the job defines.
You can supply the token in a few ways:
- the CONSUL_HTTP_TOKEN environment variable
- the -consul-token flag on the command line
- the X-Consul-Token header on API calls
Reload your management token into the CONSUL_HTTP_TOKEN
environment variable.
$ export CONSUL_HTTP_TOKEN=$(awk '/SecretID/ {print $2}' consul.bootstrap)
Now, try running countdash.nomad again. This time it will succeed.
$ nomad run countdash.nomad
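Alternatively, you could have left CONSUL_HTTP_TOKEN unset and passed the token with the command-line flag instead; a sketch using the management token saved earlier in consul.bootstrap follows.
$ nomad run -consul-token="$(awk '/SecretID/ {print $2}' consul.bootstrap)" countdash.nomad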
»Advanced considerations
»TLS-enabled Consul environment
To use the Nomad Connect integration with Nomad version 0.10 and a TLS-enabled Consul cluster, you will need to provide a few Consul environment variables to the Nomad process through the init script or systemd unit. This applies to all Nomad nodes configured as a client, even when the node is both a client and a server.
For a systemd unit, you can provide these environment variables using either an EnvironmentFile setting or Environment settings in the [Service] section of the service unit.
Environment="CONSUL_HTTP_SSL=true"
Environment="CONSUL_CACERT=/opt/consul/tls/consul-agent-ca.pem"
Environment="CONSUL_CLIENT_CERT=/opt/consul/tls/dc1-server-consul-0.pem"
Environment="CONSUL_CLIENT_KEY=/opt/consul/tls/dc1-server-consul-0-key.pem"
These values should be set to agree with the Nomad agent's consul
stanza.
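For reference, a sketch of a Nomad consul stanza that agrees with the variables above; the certificate paths are the same assumptions used in the example, and the token placeholder matches the earlier configuration step.
consul {
  ssl       = true
  ca_file   = "/opt/consul/tls/consul-agent-ca.pem"
  cert_file = "/opt/consul/tls/dc1-server-consul-0.pem"
  key_file  = "/opt/consul/tls/dc1-server-consul-0-key.pem"
  token     = "«your nomad agent token»"
}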
Once set, you will need to stop and start the Nomad service to pick up these new variables. For a systemd service defined in nomad.service, you would run the following.
$ systemctl restart nomad.service
If you came here from "Provide environment variables for TLS-enabled Consul", return there now.
»Alternative architectures
Nomad provides a default link to a pause image. This image, however, is architecture specific and is only provided for the amd64 architecture. In order to use Consul service mesh on non-x86/amd64 hardware, you will need to configure Nomad to use a different pause container. If Nomad is trying to use a version of Envoy earlier than 1.16, you will need to specify a different version of Envoy as well. Read through the section on airgapped networks below; it explains the same configuration elements that you will need to set to use alternative containers for service mesh.
Special thanks to @GusPS, who reported this working configuration.
Envoy 1.16 now has ARM64 support. Configure it as your sidecar image by setting the connect.sidecar_image meta variable on each of your ARM64 clients.
meta {
  "connect.sidecar_image" = "envoyproxy/envoy:v1.16.0"
}
The rancher/pause
container has versions for several different architectures
as well. Override the default pause container and use it instead. In your client
configuration, add an infra_image
to your docker plugin configuration
overriding the default with the rancher version.
plugin "docker" {
config {
infra_image = "rancher/pause:3.2"
}
}
If you came here from the "Alternative architectures (non-x86/amd64)" note above, return there now.
»Airgapped networks or proxied environments
If you are in an airgapped network or need to access Docker Hub via a proxy, you will have to perform some additional configuration on your Nomad clients to enable Nomad's Consul Connect integration.
»Set the "infra_image" path
Set the infra_image
configuration option for the Docker driver plugin on
your Nomad clients to a path that is accessible in your environment. For
example,
plugin "docker" {
config {
infra_image = "dockerhub.myproxy.com:8080/google_containers/pause-amd64:3.0"
}
}
Changing this value will require a restart of Nomad.
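In a fully airgapped environment, you would typically mirror the upstream pause image into that internal registry first. A sketch using the Docker CLI follows; the source path is assumed to match the default image and the destination matches the example registry above.
$ docker pull gcr.io/google_containers/pause-amd64:3.0
$ docker tag gcr.io/google_containers/pause-amd64:3.0 dockerhub.myproxy.com:8080/google_containers/pause-amd64:3.0
$ docker push dockerhub.myproxy.com:8080/google_containers/pause-amd64:3.0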
»Set the "sidecar_image" path
You will also need the Envoy proxy image used for Consul service mesh networking. Override the default container path on your Nomad clients by adding a "connect.sidecar_image" value to the client.meta stanza of your Nomad client configuration. If you do not have a meta stanza inside of your top-level client stanza, add one as follows.
client {
  # ...
  meta {
    # Set this value to a proxy or internal registry that can provide an
    # appropriate envoy image.
    "connect.sidecar_image" = "dockerhub.myproxy.com:8080/envoyproxy/envoy:v1.11.2@sha256:a7769160c9c1a55bb8d07a3b71ce5d64f72b1f665f10d81aa1581bc3cf850d09"
  }
  # ...
}
Changing this value will require a restart of Nomad.
»Next steps
Now that you have completed this guide, you have:
- configured your cluster with a Consul token
- run a non-validated job using the Nomad server's Consul token
- run a validated job using the permissions of a user-provided Consul token
Now that you have run both a non-validated job and a user-token-validated job, which is right for your environment? All of these steps can also be done using the Nomad API directly; which path might you use for your use case?
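As a sketch of the API path, you could convert the job specification to JSON and submit it with the X-Consul-Token header; the localhost address assumes you are running the commands on a Nomad server, and depending on your Nomad version you may need to adjust the JSON envelope.
$ nomad job run -output countdash.nomad > countdash.json
$ curl --request POST \
    --header "X-Consul-Token: ${CONSUL_HTTP_TOKEN}" \
    --data @countdash.json \
    http://localhost:4646/v1/jobs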