Challenge 31: Infrastructure as Code strategy

Platform: comparison

This challenge covers both GitHub Actions and Azure Pipelines for IaC deployment workflows.

Exam skills mapped

  • Recommend a configuration management technology for application infrastructure
  • Implement a configuration management strategy for application infrastructure
  • Define an IaC strategy, including source control and automation of testing and deployment

Scenario

Contoso Ltd manages 200+ Azure resources across 5 environments (dev, test, staging, production-east, production-west). All infrastructure changes have been performed manually through the Azure Portal by a team of 4 operations engineers. This has led to:

  • Configuration drift between environments (staging has different SKUs than production)
  • No audit trail for who changed what and when
  • 3 production incidents in the past quarter caused by manual misconfigurations
  • 2-week lead time for provisioning new environments

The CTO has mandated a move to Infrastructure as Code with automated testing, peer review, and CI/CD deployment. The team must choose between Bicep, Terraform, and ARM templates, then implement a complete pipeline.

The target architecture includes:

contoso-infrastructure/
├── modules/
│   ├── networking/
│   ├── compute/
│   ├── database/
│   └── monitoring/
├── environments/
│   ├── dev.bicepparam (or dev.tfvars)
│   ├── test.bicepparam
│   ├── staging.bicepparam
│   ├── prod-east.bicepparam
│   └── prod-west.bicepparam
├── main.bicep (or main.tf)
├── .github/workflows/
└── azure-pipelines/

Task 1: Compare IaC technologies and create a decision matrix

Evaluate Bicep, Terraform, and ARM templates for Contoso's requirements:

| Criteria | ARM templates | Bicep | Terraform |
|---|---|---|---|
| Learning curve | Verbose JSON, steep | Simplified DSL, moderate | HCL, moderate |
| Multi-cloud support | Azure only | Azure only | Multi-cloud |
| State management | Stateless (Azure is source of truth) | Stateless (Azure is source of truth) | Requires remote state |
| Modularity | Linked/nested templates | Modules with registry | Modules with registry |
| What-if / Plan | az deployment group/sub what-if | az deployment group/sub what-if | terraform plan |
| IDE support | Limited | VS Code extension with IntelliSense | VS Code extension |
| Community modules | Azure Quickstart Templates | Azure Verified Modules | Terraform Registry (including Azure Verified Modules) |
| Drift detection | None built-in (what-if can approximate it) | None built-in (what-if can approximate it) | terraform plan detects drift |

For Contoso (Azure-only, wants drift detection, some team knows HCL):

# Decision: Use Bicep for new Azure-native projects (simpler syntax, no state to manage)
# Decision: Use Terraform where drift detection or multi-cloud is needed

# Verify Bicep CLI is installed
az bicep version
az bicep upgrade

# Verify Terraform is installed
terraform version

Task 2: Implement Bicep deployment via GitHub Actions

Create a modular Bicep structure with a GitHub Actions deployment pipeline:

// modules/networking/main.bicep
@description('The Azure region for deployment')
param location string = resourceGroup().location

@description('Environment name used for naming conventions')
@allowed(['dev', 'test', 'staging', 'prod'])
param environmentName string

@description('Address space for the virtual network')
param vnetAddressPrefix string = '10.0.0.0/16'

var nameSuffix = '${environmentName}-${location}'

resource vnet 'Microsoft.Network/virtualNetworks@2023-09-01' = {
  name: 'vnet-contoso-${nameSuffix}'
  location: location
  properties: {
    addressSpace: {
      addressPrefixes: [vnetAddressPrefix]
    }
    subnets: [
      {
        name: 'snet-app'
        properties: {
          addressPrefix: cidrSubnet(vnetAddressPrefix, 24, 0)
        }
      }
      {
        name: 'snet-data'
        properties: {
          addressPrefix: cidrSubnet(vnetAddressPrefix, 24, 1)
          serviceEndpoints: [
            { service: 'Microsoft.Sql' }
            { service: 'Microsoft.Storage' }
          ]
        }
      }
    ]
  }
}

output vnetId string = vnet.id
output appSubnetId string = vnet.properties.subnets[0].id
output dataSubnetId string = vnet.properties.subnets[1].id
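
The module derives both subnet ranges with cidrSubnet(vnetAddressPrefix, 24, n). To make that arithmetic concrete, the sketch below reproduces the same calculation for IPv4 in plain shell. The function name cidr_subnet is hypothetical and exists only for this illustration; it is not part of Bicep or the Azure CLI.

```shell
#!/usr/bin/env bash
# Illustrative only: mirrors the IPv4 result of Bicep's cidrSubnet(network, newPrefix, index).
cidr_subnet() {
  local network=$1 new_prefix=$2 index=$3
  local base=${network%/*} prefix=${network#*/}
  local a b c d
  IFS=. read -r a b c d <<< "$base"

  # The new prefix must be at least as long as the parent prefix
  if (( new_prefix < prefix || new_prefix > 32 )); then
    echo "invalid prefix /$new_prefix for $network" >&2
    return 1
  fi

  # Number of /new_prefix subnets that fit inside the parent range
  local count=$(( 1 << (new_prefix - prefix) ))
  if (( index < 0 || index >= count )); then
    echo "index $index is outside $network" >&2
    return 1
  fi

  # Convert to an integer, clear the host bits, then step by the subnet size
  local ip=$(( (a << 24) | (b << 16) | (c << 8) | d ))
  ip=$(( ip & ~((1 << (32 - prefix)) - 1) ))
  local size=$(( 1 << (32 - new_prefix) ))
  local start=$(( ip + index * size ))

  printf '%d.%d.%d.%d/%d\n' \
    $(( (start >> 24) & 255 )) $(( (start >> 16) & 255 )) \
    $(( (start >> 8) & 255 )) $(( start & 255 )) "$new_prefix"
}

cidr_subnet 10.0.0.0/16 24 0   # prints 10.0.0.0/24 (snet-app)
cidr_subnet 10.0.0.0/16 24 1   # prints 10.0.1.0/24 (snet-data)
```

With the default 10.0.0.0/16 address space, index 0 and index 1 therefore give two adjacent, non-overlapping /24 ranges.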

// main.bicep
targetScope = 'subscription'

@description('Environment to deploy')
@allowed(['dev', 'test', 'staging', 'prod'])
param environmentName string

@description('Primary Azure region')
param location string = 'eastus2'

var resourceGroupName = 'rg-contoso-${environmentName}'

resource rg 'Microsoft.Resources/resourceGroups@2023-07-01' = {
  name: resourceGroupName
  location: location
  tags: {
    environment: environmentName
    managedBy: 'bicep'
    costCenter: 'engineering'
  }
}

module networking 'modules/networking/main.bicep' = {
  scope: rg
  name: 'deploy-networking-${environmentName}'
  params: {
    location: location
    environmentName: environmentName
  }
}
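
The deployment commands in this challenge pass environments/dev.bicepparam, but the file itself is not shown. A minimal sketch, assuming it only needs to set the two parameters that main.bicep declares (the values are illustrative):

```shell
mkdir -p environments

# Write a parameter file that binds to main.bicep one directory up
cat > environments/dev.bicepparam << 'EOF'
using '../main.bicep'

param environmentName = 'dev'
param location = 'eastus2'
EOF
```

The production files would follow the same shape. Because @allowed accepts only 'prod', both prod-east.bicepparam and prod-west.bicepparam would set environmentName to 'prod' and differ in location.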

Create the GitHub Actions workflow at .github/workflows/infrastructure.yml:

name: Infrastructure Deployment

on:
  push:
    branches: [main]
    paths:
      - "modules/**"
      - "environments/**"
      - "main.bicep"
  pull_request:
    branches: [main]
    paths:
      - "modules/**"
      - "environments/**"
      - "main.bicep"

permissions:
  id-token: write
  contents: read
  pull-requests: write

env:
  AZURE_SUBSCRIPTION_ID: ${{ vars.AZURE_SUBSCRIPTION_ID }}

jobs:
  validate:
    name: Validate Bicep
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Run Bicep linter
        run: az bicep build --file main.bicep --stdout > /dev/null

      - name: Log in to Azure
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ env.AZURE_SUBSCRIPTION_ID }}

      - name: Validate deployment
        run: |
          az deployment sub validate \
            --location eastus2 \
            --template-file main.bicep \
            --parameters environments/dev.bicepparam

  what-if:
    name: What-if analysis
    needs: validate
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Log in to Azure
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ env.AZURE_SUBSCRIPTION_ID }}

      - name: Run what-if
        id: whatif
        run: |
          RESULT=$(az deployment sub what-if \
            --location eastus2 \
            --template-file main.bicep \
            --parameters environments/dev.bicepparam \
            --no-pretty-print 2>&1)
          echo "whatif_output<<EOF" >> "$GITHUB_OUTPUT"
          echo "$RESULT" >> "$GITHUB_OUTPUT"
          echo "EOF" >> "$GITHUB_OUTPUT"

      - name: Post what-if to PR
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v7
        env:
          # Pass the output through the environment instead of interpolating it
          # into the script, so backticks or ${...} in the output cannot break
          # or alter the JavaScript.
          WHATIF_OUTPUT: ${{ steps.whatif.outputs.whatif_output }}
        with:
          script: |
            const output = `#### Infrastructure What-If Results
            \`\`\`
            ${process.env.WHATIF_OUTPUT}
            \`\`\`
            *Triggered by @${{ github.actor }} in commit ${{ github.sha }}*`;
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: output
            });

  deploy-dev:
    name: Deploy to dev
    needs: what-if
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: infrastructure-dev
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Log in to Azure
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ env.AZURE_SUBSCRIPTION_ID }}

      - name: Deploy infrastructure
        run: |
          az deployment sub create \
            --location eastus2 \
            --template-file main.bicep \
            --parameters environments/dev.bicepparam \
            --name "deploy-dev-$(date +%Y%m%d-%H%M%S)"

  deploy-prod:
    name: Deploy to production
    needs: deploy-dev
    runs-on: ubuntu-latest
    environment: infrastructure-prod
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Log in to Azure
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ env.AZURE_SUBSCRIPTION_ID }}

      - name: Deploy infrastructure
        run: |
          az deployment sub create \
            --location eastus2 \
            --template-file main.bicep \
            --parameters environments/prod-east.bicepparam \
            --name "deploy-prod-$(date +%Y%m%d-%H%M%S)"

Task 3: Implement Terraform with Azure backend via Azure Pipelines

Configure Terraform with remote state in Azure Storage and deploy via Azure Pipelines:

# Create storage account for Terraform state
az group create --name rg-contoso-tfstate --location eastus2

az storage account create \
  --name stcontosoterraform \
  --resource-group rg-contoso-tfstate \
  --sku Standard_LRS \
  --encryption-services blob \
  --allow-blob-public-access false

az storage container create \
  --name tfstate \
  --account-name stcontosoterraform

# Enable soft delete for state recovery
# (these flags belong to the account-level blob-service-properties command)
az storage account blob-service-properties update \
  --account-name stcontosoterraform \
  --resource-group rg-contoso-tfstate \
  --enable-delete-retention true \
  --delete-retention-days 30

# backend.tf
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.80"
    }
  }

  backend "azurerm" {
    resource_group_name  = "rg-contoso-tfstate"
    storage_account_name = "stcontosoterraform"
    container_name       = "tfstate"
    key                  = "contoso-infra.tfstate"
    use_oidc             = true
  }
}

provider "azurerm" {
  features {}
  use_oidc = true
}

# variables.tf
variable "environment" {
  description = "Environment name"
  type        = string
  validation {
    condition     = contains(["dev", "test", "staging", "prod"], var.environment)
    error_message = "Environment must be dev, test, staging, or prod."
  }
}

variable "location" {
  description = "Azure region for resources"
  type        = string
  default     = "eastus2"
}

variable "tags" {
  description = "Common tags for all resources"
  type        = map(string)
  default     = {}
}

# main.tf
resource "azurerm_resource_group" "main" {
  name     = "rg-contoso-${var.environment}"
  location = var.location
  tags = merge(var.tags, {
    environment = var.environment
    managedBy   = "terraform"
  })
}

module "networking" {
  source      = "./modules/networking"
  environment = var.environment
  location    = var.location
  rg_name     = azurerm_resource_group.main.name
}
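
The plan step in the pipeline reads environments/dev.tfvars, which is not shown. A minimal sketch, assuming it only supplies the three variables declared in variables.tf (the tag value is illustrative, borrowed from the Bicep example):

```shell
mkdir -p environments

# Values for the variables declared in variables.tf
cat > environments/dev.tfvars << 'EOF'
environment = "dev"
location    = "eastus2"

tags = {
  costCenter = "engineering"
}
EOF
```

Keeping one .tfvars file per environment means the same main.tf is applied everywhere, and only the reviewed variable values differ.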

Create the Azure Pipelines YAML at azure-pipelines/infrastructure.yml:

trigger:
  branches:
    include:
      - main
  paths:
    include:
      - "*.tf"
      - "modules/**"
      - "environments/**"

pr:
  branches:
    include:
      - main
  paths:
    include:
      - "*.tf"
      - "modules/**"
      - "environments/**"

pool:
  vmImage: "ubuntu-latest"

variables:
  - group: terraform-backend
  - name: TF_VERSION
    value: "1.6.4"

stages:
  - stage: Validate
    displayName: "Validate Terraform"
    jobs:
      - job: Validate
        displayName: "Format check and validate"
        steps:
          - task: TerraformInstaller@1
            displayName: "Install Terraform $(TF_VERSION)"
            inputs:
              terraformVersion: $(TF_VERSION)

          - script: terraform fmt -check -recursive
            displayName: "Check formatting"
            workingDirectory: $(System.DefaultWorkingDirectory)

          - task: TerraformTaskV4@4
            displayName: "Terraform init"
            inputs:
              provider: "azurerm"
              command: "init"
              backendServiceArm: "contoso-terraform-sc"
              backendAzureRmResourceGroupName: "rg-contoso-tfstate"
              backendAzureRmStorageAccountName: "stcontosoterraform"
              backendAzureRmContainerName: "tfstate"
              backendAzureRmKey: "contoso-infra.tfstate"

          - task: TerraformTaskV4@4
            displayName: "Terraform validate"
            inputs:
              provider: "azurerm"
              command: "validate"

  - stage: Plan
    displayName: "Terraform Plan"
    dependsOn: Validate
    jobs:
      - job: Plan
        displayName: "Generate execution plan"
        steps:
          - task: TerraformInstaller@1
            inputs:
              terraformVersion: $(TF_VERSION)

          - task: TerraformTaskV4@4
            displayName: "Terraform init"
            inputs:
              provider: "azurerm"
              command: "init"
              backendServiceArm: "contoso-terraform-sc"
              backendAzureRmResourceGroupName: "rg-contoso-tfstate"
              backendAzureRmStorageAccountName: "stcontosoterraform"
              backendAzureRmContainerName: "tfstate"
              backendAzureRmKey: "contoso-infra.tfstate"

          - task: TerraformTaskV4@4
            displayName: "Terraform plan"
            inputs:
              provider: "azurerm"
              command: "plan"
              commandOptions: "-var-file=environments/dev.tfvars -out=tfplan"
              environmentServiceNameAzureRM: "contoso-terraform-sc"

          - task: PublishPipelineArtifact@1
            displayName: "Publish plan artifact"
            inputs:
              targetPath: "$(System.DefaultWorkingDirectory)/tfplan"
              artifactName: "terraform-plan"

  - stage: Apply
    displayName: "Terraform Apply"
    dependsOn: Plan
    condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
    jobs:
      - deployment: Apply
        displayName: "Apply to dev"
        environment: "infrastructure-dev"
        strategy:
          runOnce:
            deploy:
              steps:
                - checkout: self

                - task: TerraformInstaller@1
                  inputs:
                    terraformVersion: $(TF_VERSION)

                - task: TerraformTaskV4@4
                  displayName: "Terraform init"
                  inputs:
                    provider: "azurerm"
                    command: "init"
                    backendServiceArm: "contoso-terraform-sc"
                    backendAzureRmResourceGroupName: "rg-contoso-tfstate"
                    backendAzureRmStorageAccountName: "stcontosoterraform"
                    backendAzureRmContainerName: "tfstate"
                    backendAzureRmKey: "contoso-infra.tfstate"

                - task: DownloadPipelineArtifact@2
                  displayName: "Download plan"
                  inputs:
                    artifactName: "terraform-plan"
                    targetPath: "$(System.DefaultWorkingDirectory)"

                - task: TerraformTaskV4@4
                  displayName: "Terraform apply"
                  inputs:
                    provider: "azurerm"
                    command: "apply"
                    commandOptions: "tfplan"
                    environmentServiceNameAzureRM: "contoso-terraform-sc"

Task 4: Implement IaC testing strategy

Configure automated testing for both Bicep and Terraform:

# Bicep linting - configure bicepconfig.json
cat > bicepconfig.json << 'EOF'
{
  "analyzers": {
    "core": {
      "rules": {
        "no-hardcoded-env-urls": { "level": "error" },
        "no-unused-params": { "level": "warning" },
        "prefer-interpolation": { "level": "warning" },
        "secure-parameter-default": { "level": "error" },
        "simplify-interpolation": { "level": "warning" },
        "use-recent-api-versions": { "level": "warning", "maxAllowedAgeInDays": 730 }
      }
    }
  }
}
EOF

# Run Bicep linter
az bicep build --file main.bicep 2>&1 | grep -E "(Warning|Error)"

# Terraform validation commands
terraform init -backend=false
terraform validate
terraform fmt -check -recursive

# Terraform static analysis with tflint
tflint --init
tflint --recursive

Add a testing job to the GitHub Actions workflow:

  test:
    name: Static analysis
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write # required to upload SARIF results
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Run Bicep linter
        # Test the command directly: the default shell exits on the first
        # failure, so a separate check of $? would never run.
        run: |
          if ! az bicep build --file main.bicep --stdout > /dev/null; then
            echo "::error::Bicep linting failed"
            exit 1
          fi

      - name: Run checkov for security scanning
        uses: bridgecrewio/checkov-action@v12
        with:
          directory: .
          framework: bicep
          output_format: sarif
          output_file_path: results.sarif

      - name: Upload SARIF results
        if: always()
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: results.sarif

Task 5: Implement PR workflow with plan-on-PR and apply-on-merge

Configure branch protection and the review workflow:

# Configure branch protection requiring IaC review
# The API expects nested JSON objects and a "restrictions" key, so send a JSON body
gh api repos/{owner}/{repo}/branches/main/protection --method PUT --input - << 'EOF'
{
  "required_pull_request_reviews": { "required_approving_review_count": 1 },
  "required_status_checks": { "strict": true, "contexts": ["Validate Bicep", "What-if analysis"] },
  "enforce_admins": true,
  "restrictions": null
}
EOF

The PR workflow posts what-if results as a comment (shown in Task 2). The key principle:

  • On pull request: validate, lint, plan/what-if (read-only, informational)
  • On merge to main: apply the changes (write operations)

This ensures every infrastructure change is peer-reviewed with full visibility of what will change before it is applied.
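
The plan-on-PR, apply-on-merge rule reduces to a small decision on the trigger. The sketch below expresses it as a shell function; pipeline_mode is a hypothetical helper for illustration, and its two arguments mirror github.event_name and github.ref from the workflow conditions in Task 2.

```shell
# Decide what the pipeline may do for a given trigger.
pipeline_mode() {
  local event_name=$1 ref=$2
  if [[ $event_name == "pull_request" ]]; then
    echo "plan"      # read-only: validate, lint, what-if
  elif [[ $event_name == "push" && $ref == "refs/heads/main" ]]; then
    echo "apply"     # write: deploy the reviewed change
  else
    echo "skip"      # any other trigger does nothing
  fi
}

pipeline_mode pull_request refs/pull/42/merge   # prints "plan"
pipeline_mode push refs/heads/main              # prints "apply"
pipeline_mode push refs/heads/feature/vnet      # prints "skip"
```

In the workflow itself the same decision is made declaratively through the if: conditions on the deploy jobs, which is preferable to scripting it; the function only makes the rule explicit.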

Task 6: State management for Terraform

Configure secure remote state with locking:

# State locking is automatic with azurerm backend (uses blob leases)
# To view current state:
terraform state list

# To inspect a specific resource:
terraform state show azurerm_resource_group.main

# Import existing resources into state:
terraform import azurerm_resource_group.main \
  /subscriptions/{sub-id}/resourceGroups/rg-contoso-dev

# Move state between configurations during refactoring:
terraform state mv module.old_name module.new_name

State management best practices for the pipeline:

# In Azure Pipelines, use separate state files per environment
- task: TerraformTaskV4@4
  displayName: "Terraform init - $(environment)"
  inputs:
    provider: "azurerm"
    command: "init"
    backendServiceArm: "contoso-terraform-sc"
    backendAzureRmResourceGroupName: "rg-contoso-tfstate"
    backendAzureRmStorageAccountName: "stcontosoterraform"
    backendAzureRmContainerName: "tfstate"
    backendAzureRmKey: "contoso-$(environment).tfstate"

# Enable versioning for state recovery
# (--enable-versioning belongs to the account-level blob-service-properties command)
az storage account blob-service-properties update \
  --account-name stcontosoterraform \
  --resource-group rg-contoso-tfstate \
  --enable-versioning true

# List state file versions for recovery
az storage blob list \
  --account-name stcontosoterraform \
  --container-name tfstate \
  --include v \
  --output table
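
Outside the pipeline, the same per-environment separation can be reproduced with Terraform's partial backend configuration, where -backend-config overrides the key set in backend.tf. A minimal sketch: tf_init_command is a hypothetical helper that only prints the command so it can be reviewed before running, and the key pattern follows the contoso-<environment>.tfstate naming used in the pipeline task.

```shell
# Build the init command for one environment.
tf_init_command() {
  local environment=$1
  case $environment in
    dev|test|staging|prod) ;;
    *) echo "unknown environment: $environment" >&2; return 1 ;;
  esac
  printf 'terraform init -reconfigure -backend-config="key=contoso-%s.tfstate"\n' "$environment"
}

tf_init_command staging
# prints: terraform init -reconfigure -backend-config="key=contoso-staging.tfstate"
```

Rejecting unknown environment names up front avoids silently creating a new, empty state file because of a typo.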

Task 7: Drift detection with scheduled pipelines

Create a scheduled pipeline that detects configuration drift:

# GitHub Actions - .github/workflows/drift-detection.yml
name: Infrastructure drift detection

on:
  schedule:
    - cron: "0 6 * * 1-5" # Every weekday at 06:00 UTC
  workflow_dispatch:

permissions:
  id-token: write
  contents: read
  issues: write

jobs:
  detect-drift:
    name: Check for configuration drift
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Log in to Azure
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}

      - name: Run what-if to detect drift
        id: drift
        run: |
          # Keep stderr out of RESULT so the JSON stays parseable
          RESULT=$(az deployment sub what-if \
            --location eastus2 \
            --template-file main.bicep \
            --parameters environments/prod-east.bicepparam \
            --no-pretty-print)

          # Count the resources that what-if would create, modify, or delete.
          # Searching for "noChange" anywhere in the output would hide drift
          # whenever at least one resource is unchanged.
          CHANGES=$(echo "$RESULT" | jq '[.changes[]? | select((.changeType // "") | test("^(create|modify|delete)$"; "i"))] | length')

          if [ "$CHANGES" -eq 0 ]; then
            echo "drift_detected=false" >> "$GITHUB_OUTPUT"
          else
            echo "drift_detected=true" >> "$GITHUB_OUTPUT"
            echo "drift_details<<EOF" >> "$GITHUB_OUTPUT"
            echo "$RESULT" >> "$GITHUB_OUTPUT"
            echo "EOF" >> "$GITHUB_OUTPUT"
          fi

      - name: Create issue for drift
        if: steps.drift.outputs.drift_detected == 'true'
        uses: actions/github-script@v7
        env:
          # Read the details from the environment rather than interpolating
          # them into the script source
          DRIFT_DETAILS: ${{ steps.drift.outputs.drift_details }}
        with:
          script: |
            const details = process.env.DRIFT_DETAILS;
            await github.rest.issues.create({
              owner: context.repo.owner,
              repo: context.repo.repo,
              title: `Infrastructure drift detected - ${new Date().toISOString().split('T')[0]}`,
              body: `## Drift detection report\n\nConfiguration drift was detected in the production environment.\n\n\`\`\`\n${details}\n\`\`\`\n\nPlease investigate and either update the IaC templates or revert the manual change.`,
              labels: ['infrastructure', 'drift', 'urgent']
            });

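For Terraform, drift detection is simpler, because terraform plan -detailed-exitcode exits with code 2 when the live resources differ from the configuration: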

# Azure Pipelines - scheduled drift detection
schedules:
  - cron: "0 6 * * 1-5"
    displayName: "Weekday drift check"
    branches:
      include: [main]
    always: true

stages:
  - stage: DriftCheck
    jobs:
      - job: DetectDrift
        steps:
          - task: TerraformInstaller@1
            inputs:
              terraformVersion: $(TF_VERSION)

          - task: TerraformTaskV4@4
            displayName: "Terraform init"
            inputs:
              provider: "azurerm"
              command: "init"
              backendServiceArm: "contoso-terraform-sc"
              backendAzureRmResourceGroupName: "rg-contoso-tfstate"
              backendAzureRmStorageAccountName: "stcontosoterraform"
              backendAzureRmContainerName: "tfstate"
              backendAzureRmKey: "contoso-prod.tfstate"

          - task: TerraformTaskV4@4
            displayName: "Terraform plan (drift check)"
            name: plan
            inputs:
              provider: "azurerm"
              command: "plan"
              commandOptions: "-var-file=environments/prod.tfvars -detailed-exitcode"
              environmentServiceNameAzureRM: "contoso-terraform-sc"

          # The plan task reports exit code 2 through its changesPresent
          # output variable (verify against your installed task version)
          - script: |
              if [ "$(plan.changesPresent)" = "true" ]; then
                echo "##vso[task.logissue type=warning]Drift detected in production"
                echo "##vso[task.setvariable variable=driftDetected]true"
              fi
            displayName: "Evaluate drift status"
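
The drift check hinges on the three exit codes that terraform plan -detailed-exitcode can return. A small sketch of how a script step might interpret them; plan_status is a hypothetical helper name used only for this illustration.

```shell
# Map the exit code of `terraform plan -detailed-exitcode` to a status.
plan_status() {
  case $1 in
    0) echo "in-sync" ;;   # plan succeeded, no changes
    1) echo "error" ;;     # plan failed
    2) echo "drift" ;;     # plan succeeded, changes are present
    *) echo "unknown" ;;
  esac
}

# Typical use in a script step (requires an initialized working directory):
#   terraform plan -detailed-exitcode -var-file=environments/prod.tfvars
#   plan_status $?
plan_status 2   # prints "drift"
```

Treating exit code 1 separately matters for a scheduled job: a failed plan (for example, expired credentials) should not be reported as drift.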

Break and fix

Exercise 1: Fix the broken Bicep deployment

The following Bicep template and pipeline have issues. Identify and fix them:

// BROKEN: main.bicep
targetScope = 'subscription'

param environmentName string = 'production' // ERROR 1: Default value for prod is dangerous
param location string

resource rg 'Microsoft.Resources/resourceGroups@2023-07-01' = {
  name: 'rg-contoso' // ERROR 2: No environment differentiation
  location: location
}

module storage 'modules/storage.bicep' = {
  scope: resourceGroup(rg.name) // ERROR 3: Must use rg reference directly
  name: 'storageDeployment'
  params: {
    storageAccountName: 'stcontoso${environmentName}' // ERROR 4: May exceed 24 chars
  }
}

# BROKEN: GitHub Actions workflow
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: azure/login@v2
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }} # ERROR 5: Using legacy auth, not OIDC

      - run: |
          az deployment sub create \
            --template-file main.bicep \
            --location eastus2
          # ERROR 6: Missing --parameters flag

Corrected version:

// FIXED: main.bicep
targetScope = 'subscription'

@allowed(['dev', 'test', 'staging', 'prod'])
param environmentName string // No default - must be explicitly provided

param location string = 'eastus2'

resource rg 'Microsoft.Resources/resourceGroups@2023-07-01' = {
  name: 'rg-contoso-${environmentName}'
  location: location
}

module storage 'modules/storage.bicep' = {
  scope: rg
  name: 'storageDeployment'
  params: {
    storageAccountName: take('stcontoso${environmentName}', 24)
  }
}

# FIXED: GitHub Actions workflow
jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v4

      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}

      - run: |
          az deployment sub create \
            --template-file main.bicep \
            --parameters environments/dev.bicepparam \
            --location eastus2
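
The take() call in the fixed template only guards the 24-character limit. Storage account names must also be 3 to 24 characters of lowercase letters and digits, so a generated name can be checked locally before deploying. A minimal sketch; is_valid_storage_name is a hypothetical helper, not an Azure CLI command, and it does not check global uniqueness.

```shell
# Storage account names: 3-24 characters, lowercase letters and digits only.
is_valid_storage_name() {
  [[ $1 =~ ^[a-z0-9]{3,24}$ ]]
}

is_valid_storage_name "stcontosodev" && echo "ok"            # prints "ok"
is_valid_storage_name "st-contoso-dev" || echo "rejected"    # prints "rejected"
```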

Exercise 2: Fix the Terraform state locking error

A developer reports this error when running terraform apply:

Error: Error acquiring the state lock
Lock Info:
ID: a1b2c3d4-e5f6-7890-abcd-ef1234567890
Path: contoso-infra.tfstate
Operation: OperationTypeApply
Who: runner@fv-az123-456
Created: 2024-01-15 08:30:00.000000000 +0000 UTC

Diagnosis: A previous pipeline run crashed without releasing the state lock.

Fix:

# Verify the lock is stale (previous run no longer active)
az storage blob show \
  --account-name stcontosoterraform \
  --container-name tfstate \
  --name contoso-infra.tfstate \
  --query "properties.lease.status"

# Force unlock (use only when confirmed stale)
terraform force-unlock a1b2c3d4-e5f6-7890-abcd-ef1234567890

# Prevention: Add timeout to pipeline steps to prevent indefinite hangs
# In azure-pipelines.yml:
# timeoutInMinutes: 30

Knowledge check

1. What is the primary advantage of using 'az deployment sub what-if' in a PR pipeline?

2. Which Terraform command exit code indicates that drift has been detected?

3. Why should Terraform state files use a remote backend with locking in a CI/CD pipeline?

4. In a Bicep module architecture, what is the recommended approach for environment-specific values?

Cleanup

# Remove deployed resource groups (if testing)
az group delete --name rg-contoso-dev --yes --no-wait
az group delete --name rg-contoso-test --yes --no-wait

# Remove Terraform state storage (if no longer needed)
az group delete --name rg-contoso-tfstate --yes --no-wait

# Remove GitHub environments
gh api --method DELETE repos/{owner}/{repo}/environments/infrastructure-dev
gh api --method DELETE repos/{owner}/{repo}/environments/infrastructure-prod

# Clean up Terraform local files
rm -rf .terraform/
rm -f tfplan
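# .terraform.lock.hcl is normally committed to source control;
# delete it only if you are discarding the lab repository
rm -f .terraform.lock.hcl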