August 10, 2018

Writing reusable Terraform modules

Introduction

One of the standard infrastructure patterns for web applications, and one that I apply to most of my projects, is to split the infrastructure into multiple logical environments. The most common ones are dev, staging and production. They use the same types of resources (load balancers, instances, databases, etc.), but they differ in scale and public accessibility: dev will use a small number of small instances and will be protected from the Internet by a password or a VPN, while production will use a larger number of instances and will be accessible to anyone.

Because I try to apply DevOps best practices, these types of infrastructures are deployed from the beginning as infrastructure-as-code. This gives us the ability to easily reproduce the infrastructure and version the code via Git or any other VCS. And for that, one of the tools of choice is Terraform.

Note that all the code samples in this article are written using Terraform v0.11.7 and for an AWS-based infrastructure, but these patterns can be applied to any of the cloud providers that Terraform supports.

You can see the example code in this repository, if you want to have a full overview of how it all ties in: https://github.com/nearform/tf-modules-example

The problem

While Terraform is a great tool and a great time-saver, when it comes to more complex architectures like the one described above, things can get unwieldy.

Let’s assume the current directory structure for this kind of infrastructure looks like this (and this is a directory structure we would recommend for such projects):

├── environments
│   ├── dev
│   │   └── main.tf
│   ├── production
│   │   └── main.tf
│   └── staging
│       └── main.tf
├── main.tf
└── provider.tf

We mentioned that each environment uses the same types of resources. This means that if the application requires a new resource, for example Redis via AWS’s ElastiCache, an engineer will first introduce the resource into the dev environment by adding the following code to environments/dev/main.tf:

resource "aws_elasticache_replication_group" "elasticache-cluster" {
    availability_zones            = ["us-west-2a", "us-west-2b"]
    replication_group_id          = "tf-rep-group-1"
    replication_group_description = "Dev replication group"
    node_type                     = "cache.m3.medium"
    number_cache_clusters         = 1
    parameter_group_name          = "default.redis3.2"
    port                          = 6379
}

What happens after the new features are introduced and tested? Before deploying the code to staging and then production, the same code will have to be copy-pasted (and modified) to both environments/staging/main.tf and environments/production/main.tf.

This pattern will repeat itself with every new resource added, making the codebase bigger, harder to read and, above all, harder to modify: any change to how a resource is used has to be applied to every environment.

This also makes the setup prone to configuration drift. If a quick change is required only in the production environment, it will often be added to environments/production/main.tf and never make it to the other environments. Over time dev becomes more and more different from production, which defeats the purpose of having environments like dev and staging: they should mimic production closely enough that tests can be run on them without affecting live traffic and users.

The solution

One of the ways to mitigate these issues is to create reusable and configurable Terraform modules. Let’s break this down and start off with:

Reusable modules

Using the ElastiCache example from above, we will create another directory in the root of our codebase, called elasticache. The directory structure will look like this:

.
├── elasticache
│   └── main.tf
├── environments
│   ├── dev
│   │   └── main.tf
│   ├── production
│   │   └── main.tf
│   └── staging
│       └── main.tf
├── main.tf
└── provider.tf

Terraform’s way of creating modules is very simple: create a directory that holds a bunch of .tf files. That module can then be called from each of the environment modules. You can think of the module as a class in OOP, which you can instantiate in other parts of the code.

The module’s main.tf file will hold essentially the same Terraform resource that we used above and copy-pasted into each environment:

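resource "aws_elasticache_replication_group" "elasticache-cluster" {
    availability_zones            = ["us-west-2a", "us-west-2b"]
    replication_group_id          = "tf-rep-group"
    # Required argument; kept generic because the module is shared by all environments
    replication_group_description = "Replication group"
    node_type                     = "cache.m3.medium"
    number_cache_clusters         = 1
    parameter_group_name          = "default.redis3.2"
    port                          = 6379
}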

We can now add in environments/dev/main.tf:

module "dev-elasticache" {
    source = "../../elasticache"
}

Note that the source parameter must point to the module’s directory, using a path relative to the module it is being called from.

You can continue by adding the same piece of code to both environments/staging/main.tf and environments/production/main.tf. The only thing you will need to modify in each file is the module name, because each instance of a module needs a unique name (e.g. instead of module dev-elasticache, use module staging-elasticache).
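
As a quick sketch of what that looks like, the call in environments/staging/main.tf would be identical to the dev one apart from the module name:

module "staging-elasticache" {
    # Same relative path as in dev, since both environments sit at the same depth
    source = "../../elasticache"
}

Keep in mind that at this stage every instance of the module asks for a replication group with the same hard-coded ID, which will most likely clash if all environments live in the same AWS account and region. This is one more reason to make the module configurable.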

Configurable modules

Now that we have our reusable module in place, we hit another problem: each environment might have its own requirements for a given resource. To continue with our example, in dev we might need just one cache.m3.medium node in our ElastiCache cluster, but in production we might need three cache.m3.large nodes. The solution is to make the module configurable through input parameters. These are variables that are available only in the module’s scope and whose values are passed to the module when it is instantiated (called).

You can add these variables directly in the module’s main.tf file, but one of the cleaner ways - especially if the number of input parameters grows - is to have a separate variables.tf file in the module’s directory. After adding that, our final directory structure looks like this:

.
├── elasticache
│   ├── main.tf
│   └── variables.tf
├── environments
│   ├── dev
│   │   └── main.tf
│   ├── production
│   │   └── main.tf
│   └── staging
│       └── main.tf
├── main.tf
└── provider.tf

The variables.tf file will hold the variables that configure the module. In our case, we want to be able to configure the number of nodes in the cluster, the type of nodes, the environment name (used in the cluster’s ID and description, so it is easy to tell which environment it runs in) and the availability zones in which it runs:

variable "environment" {}
variable "node_count" {}
variable "node_type" {}
variable "availability_zones" { type = "list" }
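
As an optional refinement, which the example repository may or may not use, variables can also carry a description and a default value. A minimal sketch in Terraform v0.11 syntax, with default values picked purely for illustration:

# elasticache/variables.tf (variant with descriptions and defaults)

# No default here: every environment has to state explicitly which one it is.
variable "environment" {
    description = "Name of the environment, used in the replication group ID and description"
}

variable "node_count" {
    description = "Number of cache clusters (nodes) in the replication group"
    default     = 1   # illustrative default, not taken from the original example
}

variable "node_type" {
    description = "Instance type of the cache nodes"
    default     = "cache.m3.medium"   # illustrative default
}

variable "availability_zones" {
    type        = "list"
    description = "Availability zones in which the cache clusters are created"
}

With defaults in place, an environment only needs to pass the values it wants to override, while variables without a default remain mandatory.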

Of course, the module needs to know where to use these variables, so the elasticache/main.tf file will look like:

resource "aws_elasticache_replication_group" "elasticache-cluster" {
    availability_zones            = ["${var.availability_zones}"]
    replication_group_id          = "tf-${var.environment}-rep-group"
    replication_group_description = "${var.environment} replication group"
    node_type                     = "${var.node_type}"
    number_cache_clusters         = "${var.node_count}"
    parameter_group_name          = "default.redis3.2"
    port                          = 6379
}

In each of the environment main.tf files, the module now needs these variables defined and passed to it. In our example, environments/dev/main.tf will look like:

module "dev-elasticache" {
    source             = "../../elasticache"
    environment        = "dev"
    node_count         = 1
    node_type          = "cache.m3.medium"
    availability_zones = ["us-east-1a", "us-east-1b"]
}

And environments/production/main.tf will look like:

module "production-elasticache" {
    source             = "../../elasticache"
    environment        = "production"
    node_count         = 3
    node_type          = "cache.m3.large"
    availability_zones = ["us-east-1a", "us-east-1b"]
}
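
The staging environment follows the same pattern. As an illustration, with a node count and node type that are assumptions rather than values from the original setup, environments/staging/main.tf could look like:

module "staging-elasticache" {
    source             = "../../elasticache"
    environment        = "staging"
    node_count         = 2                   # assumed value, for illustration only
    node_type          = "cache.m3.medium"   # assumed value, for illustration only
    availability_zones = ["us-east-1a", "us-east-1b"]
}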

Wrapping up

At this point, all that is left to do is to call the environment modules in the main.tf file of the root:

module "dev" {
    # Local module paths should start with ./ or ../
    source = "./environments/dev"
}

module "staging" {
    source = "./environments/staging"
}

module "production" {
    source = "./environments/production"
}

After terraform init has loaded the modules, running terraform plan and terraform apply should bring up an ElastiCache cluster in each environment, configured the way it is needed.
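
The provider.tf file from the directory structure has not been shown so far. A minimal sketch of what it might contain, assuming the us-east-1 region (to match the availability zones used in the configurable examples) and AWS credentials supplied through the environment:

# provider.tf (hypothetical contents, not taken from the example repository)
provider "aws" {
    region = "us-east-1"   # assumed; must match the availability zones passed to the modules
}

In Terraform v0.11, a provider configured in the root module is inherited by the modules it calls, so the environment and elasticache modules do not need provider blocks of their own.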

Conclusion

With a relatively small amount of effort, Terraform code can be structured from the beginning in such a way that growing the codebase won’t bring growing pains with it. Configurable, reusable modules are one of the basic building blocks of clean, readable and scalable Terraform code.