Cloud infrastructure, applications, and services emit data, which Terraform can query and act on using data sources. Terraform uses data sources to fetch information from cloud provider APIs, such as disk image IDs, or information about the rest of your infrastructure through the outputs of other Terraform configurations.
Data sources allow you to load data from APIs or other Terraform workspaces. You can use this data to make your project's configuration more flexible, and to connect workspaces that manage different parts of your infrastructure. You can also use data sources to connect and share data between workspaces in Terraform Cloud and Terraform Enterprise.
In this tutorial, you will use Terraform to deploy a workspace containing a VPC and security groups on AWS in the us-east-1 region. Next, you will use the aws_availability_zones data source to configure your VPC's Availability Zones (AZs), allowing you to deploy this configuration in any AWS region. Then, you will use the terraform_remote_state data source to deploy another workspace containing your application infrastructure to your VPC. Finally, you will use the aws_ami data source to configure the correct AMI for the current region.
»Prerequisites
In order to follow this tutorial, you will need:
- The Terraform CLI, version 0.13 or later.
- AWS Credentials configured for use with Terraform.
- The git CLI.
Note: Some of the infrastructure in this tutorial may not qualify for the AWS free tier. Destroy the infrastructure at the end of the guide to avoid unnecessary charges. We are not responsible for any charges that you incur.
»Clone example repositories
The example configuration for this tutorial is hosted in two GitHub repositories.
- The VPC workspace contains the configuration to deploy a VPC and security groups for your application.
Clone the VPC repository.
- The application workspace contains the configuration to deploy your application infrastructure, including a load balancer and EC2 instances.
Clone the application repository.
»Update VPC region
Change to the VPC repository directory.
The VPC configuration uses a variable called aws_region with a default value of us-east-1 to set the region.
However, changing the value of the aws_region variable will not successfully change the region because the VPC configuration includes an azs argument to set Availability Zones, which is a hard-coded list of availability zones in the us-east-1 region.
Use the aws_availability_zones data source to load the available AZs for the current region. Add the following to main.tf.
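The block this step refers to is not reproduced here. A minimal sketch of what it could look like, assuming the data source is given the local name available (that name is an assumption made for illustration):

```hcl
# Look up the Availability Zones for the region the AWS provider is
# currently configured with. The local name "available" is an assumption
# chosen for this sketch.
data "aws_availability_zones" "available" {
  # Only return AZs that are currently in the "available" state.
  state = "available"
}
```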
The aws_availability_zones data source is part of the AWS provider, and its documentation is under its provider in the Terraform registry. Like resources, data source blocks support arguments to specify how they behave. In this case, the state argument limits the availability zones to only those that are currently available.
You can reference data source attributes with the pattern data.<TYPE>.<NAME>.<ATTRIBUTE>. Update the VPC configuration to use this data source to set the list of availability zones.
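As a rough illustration of this change, the fragment below assumes the VPC is defined through a module block named vpc that accepts the azs argument, and that main.tf declares an aws_availability_zones data source named available. The module source and both names are assumptions, not confirmed details of the example repository.

```hcl
# Sketch only: the module name "vpc", its source, and the data source name
# "available" are assumptions made for illustration.
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  # ... other VPC arguments stay unchanged ...

  # Replace the hard-coded us-east-1 list with the AZ names that the
  # data source returns for the current region.
  azs = data.aws_availability_zones.available.names
}
```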
Initialize this configuration.
Apply this configuration, setting the value of aws_region to us-west-1.
Respond to the confirmation prompt with a yes.
Before moving on, have the VPC workspace output the region, which the application workspace requires as an input. Add a data source to main.tf to access region information.
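A minimal sketch of such a data source, assuming the local name current (the name is an assumption). The aws_region data source needs no arguments to report the provider's current region:

```hcl
# Expose the region that the AWS provider is currently configured to use.
# The local name "current" is an assumption chosen for this sketch.
data "aws_region" "current" {}
```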
Add an output for the region to outputs.tf.
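One way this output might look, assuming main.tf declares an aws_region data source named current and that the output is called aws_region (both names are assumptions). The attribute that holds the region name can differ between AWS provider versions, so check the provider documentation for the version you use.

```hcl
# Publish the region as a root-level output so that other workspaces can
# read it. The output name and data source name are assumptions.
output "aws_region" {
  description = "AWS region the VPC is deployed in"

  # "name" holds the region name (for example "us-west-1"); newer provider
  # versions may expose it under a different attribute.
  value = data.aws_region.current.name
}
```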
Run terraform apply to see the region name included in the outputs. Remember to respond to the confirmation prompt with a yes.
Tip: In this scenario, you could use the aws_region variable to define the output parameter instead of using the data source. However, there are multiple ways to configure the AWS region. Using the aws_region data source will get the AWS provider's current region no matter how it was configured.
»Configure Terraform remote state
Now that your VPC workspace is deployed, go to the learn-terraform-data-sources-app workspace directory.
This directory contains the Terraform configuration for your application. Like the VPC workspace, this configuration includes hard-coded values for the us-east-1 region. You can use the terraform_remote_state data source to use another Terraform workspace's output data.
Tip: We recommend using provider-specific data sources when convenient. terraform_remote_state is more flexible, but requires access to the whole Terraform state.
Add a terraform_remote_state data source to the main.tf file inside the learn-terraform-data-sources-app directory.
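A hedged sketch of such a block follows. The local name vpc and the state file path are assumptions; the path in particular is hypothetical and must point at wherever your VPC workspace actually stores its terraform.tfstate file.

```hcl
# Read the root-level outputs of the VPC workspace from its local state file.
data "terraform_remote_state" "vpc" {
  backend = "local"

  config = {
    # Hypothetical relative path: adjust it to the location of the VPC
    # workspace's state file on your machine.
    path = "../learn-terraform-data-sources-vpc/terraform.tfstate"
  }
}
```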
This remote state block uses the local backend to load state data from the path in the config section. The terraform_remote_state data source also supports remote backend types for use with remote systems such as Terraform Cloud or Consul.
Replace the hard-coded region configuration in main.tf with the region output from the VPC workspace.
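For illustration, the provider block might end up looking like the sketch below. It assumes the remote state data source is named vpc and that the VPC workspace publishes an output named aws_region; both names are assumptions.

```hcl
provider "aws" {
  # Use the region published by the VPC workspace instead of a hard-coded
  # "us-east-1" value. Outputs are read through the "outputs" attribute.
  region = data.terraform_remote_state.vpc.outputs.aws_region
}
```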
Configure the load balancer security group and subnet arguments with the corresponding outputs from your VPC workspace.
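The sketch below shows the general shape of this step using a plain aws_lb resource. The load balancer definition in the example repository may differ (for instance, it may use a module), and the resource name and the output names lb_security_group_ids and public_subnet_ids are hypothetical.

```hcl
# Sketch only: resource name, load balancer name, and output names are
# assumptions made for illustration.
resource "aws_lb" "app" {
  name               = "example-app-lb"
  load_balancer_type = "application"

  # Both values come from root-level outputs of the VPC workspace.
  security_groups = data.terraform_remote_state.vpc.outputs.lb_security_group_ids
  subnets         = data.terraform_remote_state.vpc.outputs.public_subnet_ids
}
```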
Note: Terraform remote state can only load "root-level" output values from the source workspace; it cannot directly access values from resources or modules in the source workspace. To retrieve those values, you must add a corresponding output to the source workspace.
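As an example of the note above, a value that lives inside a module in the source workspace can be re-exported as a root-level output. The sketch assumes a module named vpc that exposes a private_subnets output; both the module name and the output name are assumptions.

```hcl
# In the source (VPC) workspace: re-export a module value at the root level
# so that terraform_remote_state in another workspace can read it.
output "private_subnet_ids" {
  description = "IDs of the private subnets"
  value       = module.vpc.private_subnets
}
```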
»Scale EC2 instances
The configuration in main.tf only uses a single EC2 instance. Update the configuration to use multiple EC2 instances per subnet.
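A minimal sketch of this change, assuming the instance resource is named app, the remote state data source is named vpc, and the VPC workspace publishes a private_subnet_ids output (all three names are assumptions, as is the default value of 2):

```hcl
variable "instances_per_subnet" {
  description = "Number of EC2 instances in each private subnet"
  type        = number
  default     = 2 # assumed default, chosen for illustration
}

resource "aws_instance" "app" {
  # Total instances = instances per subnet * number of private subnets
  # reported by the VPC workspace.
  count = var.instances_per_subnet * length(data.terraform_remote_state.vpc.outputs.private_subnet_ids)

  # ... remaining instance arguments stay unchanged ...
}
```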
You can use values from data sources just like any other Terraform values, including by passing them to functions. Now when you apply this configuration, Terraform will provision var.instances_per_subnet instances for each private subnet configured in your VPC workspace.
»Configure region-specific AMIs
The AWS instance configuration also uses a hard-coded AMI ID, which is only valid for the us-east-1 region. Use an aws_ami data source to load the correct AMI ID for the current region. Add the following to main.tf.
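The block itself is not reproduced here. One plausible version is sketched below; the local name amazon_linux and the name filter pattern are assumptions, so adjust the filter to match the image family your application expects.

```hcl
# Find the most recent Amazon-owned image matching the name pattern in the
# provider's current region.
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"] # assumed pattern
  }
}
```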
Replace the hard-coded AMI ID with the one loaded from the new data source.
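In the instance resource, that replacement could look like the following fragment, assuming an aws_ami data source named amazon_linux is declared in main.tf and the instance resource is named app (both names are assumptions):

```hcl
resource "aws_instance" "app" {
  # Use the AMI ID that the data source resolved for the current region
  # instead of a hard-coded, us-east-1-only ID.
  ami = data.aws_ami.amazon_linux.id

  # ... remaining instance arguments stay unchanged ...
}
```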
»Configure EC2 subnet and security groups
Finally, update the EC2 instance configuration to use the subnet and security group configuration from the VPC workspace.
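A hedged sketch of these two arguments follows. It assumes the instance resource uses count, the remote state data source is named vpc, and the VPC workspace publishes outputs named private_subnet_ids and app_security_group_ids; all of these names are hypothetical.

```hcl
resource "aws_instance" "app" {
  # ... count, ami, and instance type arguments stay unchanged ...

  # Spread instances across the private subnets: the modulo wraps
  # count.index around the length of the subnet list.
  subnet_id = data.terraform_remote_state.vpc.outputs.private_subnet_ids[count.index % length(data.terraform_remote_state.vpc.outputs.private_subnet_ids)]

  # Attach the application security groups defined in the VPC workspace.
  vpc_security_group_ids = data.terraform_remote_state.vpc.outputs.app_security_group_ids
}
```

Using the modulo of count.index keeps the subnet assignment valid for any number of instances per subnet.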
Now that your app workspace uses remote state data from the VPC workspace, initialize the app workspace.
Apply the configuration and Terraform will provision the application infrastructure. Respond to the confirmation prompt with a yes.
After a few minutes, the load balancer health checks will pass, and the load balancer will return the example response.
Tip: It can take several minutes for the load balancer to become available. If the curl command returns an error, try again after a few minutes.
You deployed your application using one workspace for the VPC and security groups, and another for the application infrastructure. You used the terraform_remote_state data source to share data between them. You also replaced region-specific configuration with dynamic values from several other data sources.
»Clean up your infrastructure
Before moving on, destroy the infrastructure you created in this tutorial.
In the application workspace, destroy the application infrastructure. Respond to the confirmation prompt with yes.
Note: You must destroy the application workspace before the VPC workspace. Since the resources in the application workspace depend on those in the VPC workspace, the AWS API will return an error if you attempt to destroy the VPC first.
Now change to the VPC workspace directory and destroy this infrastructure as well. Once again, respond to the confirmation prompt with yes.
»Next steps
Now that you have used Terraform data sources, check out the following resources for more information.
- Read the Terraform Data Sources documentation.
- Connect Terraform Cloud Workspaces with run triggers, and use outputs from one workspace to configure another workspace.
- Inject secrets into Terraform using the Vault provider.