Fix Terraform Bug
The following advanced bug-fixing tasks cover DevSecOps problems in Terraform configurations driven by GitLab CI/CD pipelines. Each comes with a concrete example snippet and a solution:
1. Production Pipeline: Misconfigured Terraform State Backends
Problem:
The terraform apply step in your production GitLab CI pipeline fails with errors about locked state files or concurrent access conflicts.
Example and Solution:
In your Terraform configuration (e.g., backend.tf), configure the S3 backend with state locking through a DynamoDB table:
terraform {
  backend "s3" {
    bucket         = "my-production-tfstate-bucket"
    key            = "envs/prod/terraform.tfstate"
    region         = "us-west-2"
    dynamodb_table = "terraform-lock-table"
    encrypt        = true
  }
}
Your .gitlab-ci.yml snippet that initializes Terraform should pass the backend settings dynamically, including the lock table:
stages:
  - validate
  - plan
  - apply

variables:
  TF_BACKEND_BUCKET: "my-production-tfstate-bucket"
  TF_BACKEND_REGION: "us-west-2"
  TF_BACKEND_KEY: "envs/prod/terraform.tfstate"
  TF_LOCK_TABLE: "terraform-lock-table"

terraform_init:
  stage: validate
  script:
    - |
      terraform init -input=false \
        -backend-config="bucket=$TF_BACKEND_BUCKET" \
        -backend-config="region=$TF_BACKEND_REGION" \
        -backend-config="key=$TF_BACKEND_KEY" \
        -backend-config="dynamodb_table=$TF_LOCK_TABLE"
Key points:
- The DynamoDB table handles locking, preventing concurrent state access conflicts.
- Initializing with -input=false avoids prompts in CI.
- Bucket names and keys isolate states per environment.
- Validate by running concurrent pipeline jobs and ensuring state lock prevents race conditions.
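The init step above can also be wrapped so that a job stops early when a backend variable is missing, before it touches remote state. The following is a minimal sketch in POSIX shell under stated assumptions: tf_backend_args is a hypothetical helper name, and the variable names are the ones used in the snippet above.

```shell
#!/bin/sh
# Sketch: build the -backend-config arguments for `terraform init` from the
# CI variables used above. tf_backend_args is a hypothetical helper name.
tf_backend_args() {
  for name in TF_BACKEND_BUCKET TF_BACKEND_REGION TF_BACKEND_KEY TF_LOCK_TABLE; do
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      return 1
    fi
  done
  printf '%s\n' \
    "-backend-config=bucket=$TF_BACKEND_BUCKET" \
    "-backend-config=region=$TF_BACKEND_REGION" \
    "-backend-config=key=$TF_BACKEND_KEY" \
    "-backend-config=dynamodb_table=$TF_LOCK_TABLE"
}

# Example use inside a CI job (values must not contain whitespace):
#   args=$(tf_backend_args) || exit 1
#   terraform init -input=false $args
#   terraform plan -input=false -lock-timeout=5m -out=tfplan
```

The -lock-timeout flag in the usage comment makes Terraform wait for a held lock for the given duration instead of failing immediately, which can help when two pipelines briefly overlap.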
2. Development Pipeline: Insecure Variable Management
Problem:
Secret variables (e.g., cloud credentials) leak into CI logs in the development pipeline, exposing sensitive data.
Example and Solution:
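In .gitlab-ci.yml, never hardcode secrets. Store them as GitLab CI/CD variables and mark them “masked” so their values are hidden in job logs. Also mark them “protected” when the jobs that need them run only on protected branches or tags. For example: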
variables:
  TF_VAR_db_password: $DB_PASSWORD  # Injected from a GitLab CI/CD variable set in the UI
  TF_VAR_api_key: $API_KEY

terraform_plan:
  stage: plan
  script:
    # Do not echo variables or print plan output that contains sensitive values
    - terraform plan -no-color -out=tfplan
In your Terraform variable definitions (variables.tf), mark sensitive variables:
variable "db_password" {
  description = "The database password"
  type        = string
  sensitive   = true
}
Additional best practice:
Use -no-color in Terraform CLI commands in CI to keep ANSI control characters out of the logs. The flag does not hide values by itself; redaction in plan output comes from marking variables and outputs as sensitive.
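As an extra guard, a job can check its own captured output for secret values before finishing. The sketch below is one hedged way to do that: check_no_secret_leak is a hypothetical helper, not a Terraform or GitLab feature, and it only catches values that appear verbatim on a single line.

```shell
#!/bin/sh
# Sketch: fail if the value of any named variable appears verbatim in a log file.
# check_no_secret_leak is a hypothetical helper name.
# Usage: check_no_secret_leak FILE VAR_NAME...
check_no_secret_leak() {
  file=$1
  shift
  for name in "$@"; do
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    # Unset or empty variables have nothing to leak.
    [ -n "$value" ] || continue
    # -F matches the value as a fixed string, not as a regular expression.
    if grep -qF -- "$value" "$file"; then
      echo "value of $name found in $file" >&2
      return 1
    fi
  done
  return 0
}

# Example use inside a CI job:
#   terraform plan -no-color -input=false -out=tfplan > plan.log 2>&1
#   check_no_secret_leak plan.log TF_VAR_db_password TF_VAR_api_key
```

This checks only the text log. The binary plan file itself can still contain sensitive values, so it should stay in non-public artifacts.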
3. Testing Pipeline: Incomplete Security Scanning Integration
Problem:
IaC security scans such as tfsec or checkov run, but failures do not block merges.
Example and Solution:
Update your .gitlab-ci.yml to make the security scanning job mandatory and fail the pipeline if any issue is detected:
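stages:
  - lint
  - security_scan
  - test

tfsec_scan:
  stage: security_scan
  image:
    name: aquasec/tfsec:latest
    entrypoint: [""]       # Override the image's tfsec entrypoint so the runner can start a shell
  script:
    - tfsec .
  allow_failure: false     # Ensures pipeline fails if tfsec finds issues
  only:
    - merge_requests
    - develop
    - /^feature\/.*$/      # only: accepts regular expressions, not globs such as feature/*
# Branch rules: production branches are left out on purpose; this job targets development and testing branches.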
This ensures the pipeline blocks merges if security issues are present in Terraform manifests on testing and development branches.
These examples combine Terraform backend configurations with proper state locking, secure variable management using GitLab CI protected variables and Terraform’s sensitive attribute, and enforcing mandatory security scanning in GitLab CI pipelines. They offer practical solutions to typical advanced DevSecOps bugs in Terraform cloud-native pipelines.
Here are full example .gitlab-ci.yml files for production, development, and testing pipelines for DevSecOps with Terraform in GitLab CI/CD. These examples build on best practices like remote state locking, secure variable handling, and mandatory security scanning.
1. Production Pipeline .gitlab-ci.yml
stages:
  - validate
  - plan
  - apply

variables:
  TF_BACKEND_BUCKET: "my-production-tfstate-bucket"
  TF_BACKEND_REGION: "us-west-2"
  TF_BACKEND_KEY: "envs/prod/terraform.tfstate"
  TF_LOCK_TABLE: "terraform-lock-table"

before_script:
  - terraform --version
  # Literal block so the shell, not YAML, handles the line continuations
  - |
    terraform init -input=false \
      -backend-config="bucket=$TF_BACKEND_BUCKET" \
      -backend-config="region=$TF_BACKEND_REGION" \
      -backend-config="key=$TF_BACKEND_KEY" \
      -backend-config="dynamodb_table=$TF_LOCK_TABLE"

validate:
  stage: validate
  script:
    - terraform validate

plan:
  stage: plan
  script:
    - terraform plan -out=tfplan -input=false
    - terraform show -json tfplan > tfplan.json
  artifacts:
    paths:
      - tfplan
      - tfplan.json
    expire_in: 1 week
    public: false

apply:
  stage: apply
  script:
    - terraform apply -input=false tfplan
  dependencies:
    - plan
  rules:
    - if: '$CI_COMMIT_BRANCH == "main"'  # Only apply on main branch
      when: manual                        # Manual approval recommended for production
    - when: never                         # Otherwise, do not run apply
- Uses AWS S3 backend with DynamoDB table locking for state consistency.
- Only allows manual apply on main branch for production safety.
- Plan artifacts are kept non-public (public: false) because plan files can contain sensitive values.
2. Development Pipeline .gitlab-ci.yml
stages:
  - validate
  - plan
  - apply

variables:
  TF_BACKEND_BUCKET: "my-dev-tfstate-bucket"
  TF_BACKEND_REGION: "us-west-2"
  TF_BACKEND_KEY: "envs/dev/terraform.tfstate"
  TF_LOCK_TABLE: "terraform-lock-table"

before_script:
  # Do not print any sensitive info here
  - terraform --version
  # Literal block so the shell, not YAML, handles the line continuations
  - |
    terraform init -input=false \
      -backend-config="bucket=$TF_BACKEND_BUCKET" \
      -backend-config="region=$TF_BACKEND_REGION" \
      -backend-config="key=$TF_BACKEND_KEY" \
      -backend-config="dynamodb_table=$TF_LOCK_TABLE"

validate:
  stage: validate
  script:
    - terraform validate

plan:
  stage: plan
  script:
    - terraform plan -no-color -out=tfplan -input=false
  artifacts:
    paths:
      - tfplan
    expire_in: 1 week
    public: false

apply:
  stage: apply
  script:
    - terraform apply -input=false tfplan
  dependencies:
    - plan
  rules:
    - if: '$CI_COMMIT_BRANCH =~ /^dev/ || $CI_COMMIT_BRANCH =~ /^feature\//'
      when: on_success
    - when: never
# Sensitive variables (like TF_VAR_db_password) are provided only as masked CI/CD variables set in the GitLab UI.
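- Uses a separate S3 state bucket and key for dev to isolate state. The lock table name is shared in this example, which works because each lock entry is keyed by the state path.
- Injects secrets from GitLab CI/CD variables (not shown in YAML). Protected variables are exposed only to pipelines on protected branches and tags, so secrets needed on dev/feature branches must be masked rather than protected, unless those branches are themselves protected.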
- Applied automatically on dev/feature branches.
3. Testing Pipeline .gitlab-ci.yml
stages:
  - validate
  - security_scan
  - plan

variables:
  TF_BACKEND_BUCKET: "my-test-tfstate-bucket"
  TF_BACKEND_REGION: "us-west-2"
  TF_BACKEND_KEY: "envs/test/terraform.tfstate"
  TF_LOCK_TABLE: "terraform-lock-table"

before_script:
  - terraform --version
  # Literal block so the shell, not YAML, handles the line continuations
  - |
    terraform init -input=false \
      -backend-config="bucket=$TF_BACKEND_BUCKET" \
      -backend-config="region=$TF_BACKEND_REGION" \
      -backend-config="key=$TF_BACKEND_KEY" \
      -backend-config="dynamodb_table=$TF_LOCK_TABLE"

validate:
  stage: validate
  script:
    - terraform validate

tfsec_scan:
  stage: security_scan
  image:
    name: aquasec/tfsec:latest
    entrypoint: [""]        # Override the image's tfsec entrypoint so the runner can start a shell
  before_script: []         # Skip the global terraform init; this image has no terraform binary
  script:
    - tfsec .
  allow_failure: false      # Pipeline fails if security scan fails
  rules:
    - if: '$CI_MERGE_REQUEST_ID'  # Run on merge requests
    - when: never

plan:
  stage: plan
  script:
    - terraform plan -out=tfplan -input=false
  artifacts:
    paths:
      - tfplan
    expire_in: 1 week
    public: false
  # Stage order already makes plan wait for tfsec_scan, which produces no artifacts to download
  dependencies: []
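- Adds mandatory tfsec static security scanning on Terraform files.
- Fails the pipeline if security issues are found; the scan must pass before merging.
- Jobs without rules run only in branch and tag pipelines, so give validate and plan the same merge-request rule (or define workflow rules) if they should run in the same pipeline as tfsec_scan.
- Uses an isolated state bucket and key, with DynamoDB locking, for the test environment.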
Notes Across Pipelines
- All pipelines initialize Terraform with environment-specific backend configs that isolate state and enforce locking.
- Sensitive secrets are never hardcoded in pipeline YAML but injected from GitLab protected CI/CD variables.
- The production pipeline requires manual approval for apply steps.
- Security scanning (tfsec) is enforced on testing branches and merge requests.
- Artifacts containing plans are protected and expire after a week.
- You can extend these examples with further steps like terraform destroy, workspace management, or more sophisticated branching rules.
Here is a comprehensive example for a Terraform repo structure with GitLab CI/CD incorporating advanced handling like OIDC token usage for cloud authentication and workspace-based environment selection. This example builds on best practices and reflects common real-world setups:
Example Repo Structure
terraform-repo/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── backend.tf
│   │   └── outputs.tf
│   ├── staging/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── backend.tf
│   │   └── outputs.tf
│   └── prod/
│       ├── main.tf
│       ├── variables.tf
│       ├── backend.tf
│       └── outputs.tf
├── modules/
│   ├── network/
│   │   └── main.tf
│   ├── compute/
│   │   └── main.tf
│   └── storage/
│       └── main.tf
├── .gitlab-ci.yml
├── README.md
└── scripts/
    └── bootstrap.sh
- environments/ contains environment-specific configs with isolated state backends and variables.
- modules/ holds reusable Terraform modules for core infrastructure components.
- scripts/ optionally holds helper scripts for pre/post CI job tasks.
Example .gitlab-ci.yml with OIDC and Workspace Usage
stages:
  - init
  - validate
  - plan
  - apply

variables:
  # ENVIRONMENT is a custom variable; CI_ENVIRONMENT_NAME is set only for jobs that define environment:name
  ENVIRONMENT: "dev"
  TF_ROOT: "environments/${ENVIRONMENT}"
  # Not named TF_WORKSPACE: Terraform reads that variable itself and then rejects `workspace select`
  WORKSPACE_NAME: "${ENVIRONMENT}"

default:
  image:
    name: hashicorp/terraform:latest
    entrypoint: [""]
  # ID token for OIDC; replaces CI_JOB_JWT_V2, which was removed in GitLab 17.0.
  # The aud value must match the identity provider configured in AWS.
  # If your GitLab version rejects id_tokens under default, declare it in each job.
  id_tokens:
    GITLAB_OIDC_TOKEN:
      aud: https://gitlab.com
  before_script:
    - 'echo "Using environment: $ENVIRONMENT"'
    # OIDC for AWS: write the job's ID token to a file for the AWS provider to use
    - echo "$GITLAB_OIDC_TOKEN" > web_identity_token
    - export AWS_WEB_IDENTITY_TOKEN_FILE=$(pwd)/web_identity_token
    - export AWS_ROLE_ARN="arn:aws:iam::123456789012:role/GitLabCIRoleForTerraform"
    - export AWS_ROLE_SESSION_NAME="gitlab-ci-session-${CI_PIPELINE_ID}"
    # Initialize Terraform backend with environment-specific config
    - cd "$TF_ROOT"
    - terraform --version
    - terraform init -input=false
    # Select or create workspace matching environment
    - terraform workspace select "$WORKSPACE_NAME" || terraform workspace new "$WORKSPACE_NAME"

init:
  stage: init
  script:
    - terraform init -input=false

validate:
  stage: validate
  script:
    - terraform validate

plan:
  stage: plan
  script:
    - terraform plan -out=tfplan -input=false
    - terraform show -json tfplan > tfplan.json
  artifacts:
    paths:
      - $TF_ROOT/tfplan
      - $TF_ROOT/tfplan.json
    expire_in: 1 week
    public: false

apply:
  stage: apply
  script:
    - terraform apply -input=false tfplan
  dependencies:
    - plan
  rules:
    - if: '$CI_COMMIT_BRANCH == "main" && $ENVIRONMENT == "prod"'
      when: manual
    - if: '$ENVIRONMENT != "prod"'
      when: on_success
Explanation and Key Points
- Repository Structure:
  - The environment directories (dev, staging, prod) each contain their own Terraform configuration and a backend.tf that points to a dedicated remote state backend.
  - This allows isolated state management and environment-specific configurations.
  - Modules in the modules/ directory promote reuse and cleaner code.
- Workspace Usage:
  - The pipeline selects or creates a Terraform workspace matching the environment name (dev, staging, or prod) before planning or applying.
  - This allows using the same configuration directory for multiple environments if desired, with workspaces isolating the state.
- OIDC Token Usage:
  - The pipeline uses GitLab’s job ID token to authenticate with AWS via OpenID Connect (OIDC) without storing static secrets. Older GitLab releases exposed this token as CI_JOB_JWT_V2; that variable was removed in GitLab 17.0 in favor of tokens declared with id_tokens.
  - The AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_ARN environment variables enable the Terraform AWS provider to assume the correct IAM role.
  - This enhances security by using short-lived credentials and enables fine-grained, auditable permission management for CI jobs.
- Pipeline Job Behavior:
  - Plan artifacts (tfplan, tfplan.json) are saved with restricted access and an expiry.
  - The apply job runs manually for production (main branch + prod environment) for safety but automatically for non-production environments.
  - Terraform is initialized freshly in every job to ensure a clean working directory and provider plugin setup.
- Backend Configuration (backend.tf):
  Each environment’s backend.tf should configure a remote backend with environment-specific state isolation. For example, for the AWS S3 backend:
  terraform {
    backend "s3" {
      bucket         = "my-tf-state-bucket"
      key            = "prod/terraform.tfstate"  # Change per environment
      region         = "us-west-2"
      dynamodb_table = "terraform-lock-table"
      encrypt        = true
    }
  }
This approach scales well for complex organizations with multi-environment, secure DevSecOps practices using Terraform and GitLab CI/CD.
To support a complex organization with multi-environment, secure DevSecOps practices using Terraform and GitLab CI/CD, here is a detailed approach including:
- Sample Terraform environment modules that isolate environment concerns and promote reuse,
- Backend configuration for remote state with locking and environment separation,
- Bootstrapping scripts that aid pipeline initialization and environment setup.
1. Sample Terraform Environment Modules
A recommended pattern is to create reusable Terraform modules for infrastructure components and then compose them in environment-specific folders with minimal environment-specific config.
Example: Reusable Network Module (modules/network/main.tf)
variable "vpc_cidr" {
  type = string
}

variable "public_subnets" {
  type = list(string)
}

resource "aws_vpc" "this" {
  cidr_block = var.vpc_cidr
  tags       = { Name = "vpc" }
}

resource "aws_subnet" "public" {
  count                   = length(var.public_subnets)
  vpc_id                  = aws_vpc.this.id
  cidr_block              = var.public_subnets[count.index]
  map_public_ip_on_launch = true
  tags                    = { Name = "public_subnet_${count.index}" }
}

output "vpc_id" {
  value = aws_vpc.this.id
}

output "public_subnet_ids" {
  value = aws_subnet.public[*].id
}
Example: Environment Overlay (environments/dev/main.tf)
module "network" {
  source         = "../../modules/network"
  vpc_cidr       = "10.1.0.0/16"
  public_subnets = ["10.1.1.0/24", "10.1.2.0/24"]
}

output "vpc_id" {
  value = module.network.vpc_id
}
Similarly, the staging and prod folders have their own main.tf that overrides the CIDR and subnets as needed.
Variables and Outputs
- Keep variable types and validations in the module's variables.tf,
- Define outputs for shared resource IDs in outputs.tf inside modules,
- Pass environment-specific vars in environment folders or through Terraform workspaces.
2. Detailed Backend Configuration for Remote State Management
Each environment’s backend.tf should isolate Terraform state remotely and use locking (e.g., AWS S3 with DynamoDB, or Azure Blob Storage, which locks state through blob leases).
Example for AWS S3 backend with DynamoDB locking (place in each environment folder):
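terraform {
  backend "s3" {
    bucket               = "my-terraform-state-bucket"
    key                  = "terraform.tfstate"
    workspace_key_prefix = "envs"  # Non-default workspaces are stored at envs/<workspace>/terraform.tfstate
    region               = "us-west-2"
    dynamodb_table       = "terraform-lock-table"
    encrypt              = true
  }
}
- Backend blocks cannot contain interpolations such as ${terraform.workspace}. The S3 backend instead stores each non-default workspace under workspace_key_prefix, so workspaces still map to separate state keys,
- DynamoDB table handles locking to avoid concurrent state corruptions,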
- Encryption ensures state files are secure at rest.
If using Azure, a similar pattern exists using the azurerm backend type, specifying resource_group_name, storage_account_name, and container_name.
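To make the workspace-to-state mapping concrete, the sketch below computes where the S3 backend places the state object for a given workspace. It assumes the backend's documented layout (the default workspace uses the configured key as-is; other workspaces are stored under a prefix that defaults to env:). state_object_key is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: compute the S3 object key under which the S3 backend stores state
# for a given workspace. state_object_key is a hypothetical helper name.
# Usage: state_object_key WORKSPACE KEY [WORKSPACE_KEY_PREFIX]
state_object_key() {
  workspace=$1
  key=$2
  prefix=${3:-env:}   # "env:" is the backend's default workspace_key_prefix
  if [ "$workspace" = "default" ]; then
    # The default workspace uses the configured key unchanged.
    printf '%s\n' "$key"
  else
    printf '%s/%s/%s\n' "$prefix" "$workspace" "$key"
  fi
}

# Example use:
#   state_object_key prod terraform.tfstate envs
#   aws s3 ls "s3://my-terraform-state-bucket/$(state_object_key prod terraform.tfstate envs)"
```

A helper like this is mainly useful for audits or bucket policies that need to name the exact object path per environment.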
3. Bootstrapping Scripts for Pipeline Preparation
When pipelines run, pre-flight bootstrap scripts can set up environments, prepare credentials (like OIDC tokens), and configure Terraform workspaces.
Example: scripts/bootstrap.sh
#!/bin/sh
# Source this script (". ./scripts/bootstrap.sh") so the exports and the cd persist in the calling job.
set -e
: "${ENVIRONMENT:?ENVIRONMENT must be set}"
echo "Bootstrapping Terraform environment: $ENVIRONMENT"
# Authenticate to cloud provider via OIDC token, example for AWS.
# GITLAB_OIDC_TOKEN is an assumed id_tokens name; CI_JOB_JWT_V2 exists only on GitLab releases before 17.0.
printf '%s' "${GITLAB_OIDC_TOKEN:-$CI_JOB_JWT_V2}" > web_identity_token
export AWS_WEB_IDENTITY_TOKEN_FILE="$(pwd)/web_identity_token"
export AWS_ROLE_ARN="arn:aws:iam::123456789012:role/GitLabCIRoleForTerraform"
export AWS_ROLE_SESSION_NAME="gitlab-ci-session-${CI_PIPELINE_ID}"
# Initialize Terraform backend for the selected environment
cd "environments/${ENVIRONMENT}"
terraform init -input=false
# Select or create workspace matching environment
terraform workspace select "$ENVIRONMENT" || terraform workspace new "$ENVIRONMENT"
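Make sure the pipeline job provides the necessary environment variables (ENVIRONMENT, the OIDC token variable such as CI_JOB_JWT_V2, etc.). Source the script rather than executing it, so that its exports and directory change persist for the remaining commands in the job.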
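Because the token file is a credential, it is worth writing it with restrictive permissions and refusing to continue when the token is empty. The following is a minimal sketch of that step; write_token_file is a hypothetical helper name, and the token variable passed to it depends on how your GitLab version exposes the job token.

```shell
#!/bin/sh
# Sketch: write an OIDC token to a file readable only by the current user.
# write_token_file is a hypothetical helper name.
# Usage: write_token_file TOKEN PATH
write_token_file() {
  token=$1
  path=$2
  if [ -z "$token" ]; then
    echo "OIDC token is empty; check the job's token configuration" >&2
    return 1
  fi
  # The subshell keeps the restrictive umask from leaking into the caller.
  ( umask 077 && printf '%s' "$token" > "$path" )
}

# Example use in a bootstrap script:
#   write_token_file "$CI_JOB_JWT_V2" "$PWD/web_identity_token" || exit 1
#   export AWS_WEB_IDENTITY_TOKEN_FILE="$PWD/web_identity_token"
#   trap 'rm -f "$AWS_WEB_IDENTITY_TOKEN_FILE"' EXIT
```

Failing on an empty token surfaces a misconfigured job immediately, instead of as a less obvious authentication error from the AWS provider later on.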
4. Typical GitLab CI Job Using the Bootstrap Script
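variables:
  ENVIRONMENT: "dev"  # Override per pipeline or branch as needed

stages:
  - prepare
  - validate
  - plan
  - apply

default:
  # ID token for OIDC; CI_JOB_JWT_V2 was removed in GitLab 17.0. aud must match the AWS identity provider.
  id_tokens:
    GITLAB_OIDC_TOKEN:
      aud: https://gitlab.com
  # Each job starts in a fresh environment, so the bootstrap script is sourced before every job
  before_script:
    - . ./scripts/bootstrap.sh

prepare:
  stage: prepare
  script:
    - echo "Bootstrap succeeded for $ENVIRONMENT"

validate:
  stage: validate
  script:
    - terraform validate

plan:
  stage: plan
  script:
    - terraform plan -out=tfplan -input=false
  artifacts:
    paths:
      - environments/$ENVIRONMENT/tfplan
    expire_in: 1 week

apply:
  stage: apply
  script:
    - terraform apply -input=false tfplan
  dependencies:
    - plan
  rules:
    - if: '$CI_COMMIT_BRANCH == "main" && $ENVIRONMENT == "prod"'
      when: manual
    - if: '$ENVIRONMENT != "prod"'
      when: on_success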
Customize ENVIRONMENT based on branch or pipeline variables to support multi-environment workflows.
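One way to derive ENVIRONMENT from the branch is a small mapping function called at the start of a job. The sketch below is illustrative only: env_for_branch is a hypothetical helper, and the branch-to-environment mapping is an assumed convention, not something the pipelines above define.

```shell
#!/bin/sh
# Sketch: map a Git branch name to a target environment.
# env_for_branch and the mapping itself are assumptions for illustration.
env_for_branch() {
  case "$1" in
    main)               echo "prod" ;;
    staging|release/*)  echo "staging" ;;
    dev*|feature/*)     echo "dev" ;;
    *)                  echo "unsupported branch: $1" >&2; return 1 ;;
  esac
}

# Example use at the start of a job script:
#   ENVIRONMENT=$(env_for_branch "$CI_COMMIT_BRANCH") || exit 1
#   export ENVIRONMENT
```

A value computed inside a script is not visible to rules, which GitLab evaluates before the job starts. When rules must depend on the environment, set ENVIRONMENT through pipeline or job-level variables instead.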
Summary of the Approach
- Use modules for reusable infrastructure components to enforce DRY and consistency,
- Have isolated environment overlays with minimal overrides, containing backend.tf with remote state config,
- Use Terraform workspaces together with the backend's workspace key prefix for clean environment state separation (backend blocks do not accept ${terraform.workspace}),
- Apply OIDC token authentication for secure credential handling without static secrets,
- Have bootstrap scripts to unify cloud auth, initialization, and workspace management,
- Configure GitLab CI to run structured jobs for prepare/init, validate, plan, and apply stages,
- Use manual apply for production to reduce risks,
- Protect sensitive data using remote backends with encryption and pipeline variables.