0x55aa

CloudFormation vs Terraform: The IaC Battle I Wish Someone Explained to Me ☁️⚔️

Real talk: The first time I deployed infrastructure on AWS, I was a console cowboy. Click, click, create VPC, click, launch EC2, click, configure security group... 47 clicks later, I had a working environment! 🎉

Then my boss asked, "Can you recreate this in staging?" Me: "Uhh... let me write down all the settings?" Narrator: That did NOT go well. 😅

Welcome to Infrastructure as Code - the skill that separates senior engineers from people who still manually click "Launch Instance" in production!

What Even Is Infrastructure as Code? 🤔

IaC = Writing code to manage infrastructure instead of clicking buttons

Think of it like this:

Manual approach (Console Clicking):

You → AWS Console → Click "Create VPC" → Fill form → Click 47 more times
                   → Hope you remember what you did
                   → Pray you don't fat-finger a setting
                   → Good luck recreating this! 💀

IaC approach:

You → Write vpc.tf → Run terraform apply → Infrastructure created!
                   → Version controlled in Git
                   → Reproducible forever
                   → Sleep well at night 😴

Translation: IaC is treating your infrastructure like application code - versioned, tested, repeatable!

The Wake-Up Call That Taught Me IaC 📞

When architecting our e-commerce backend, I manually created everything in the AWS Console. Here's what happened:

The disaster timeline:

Week 1: Spent 6 hours clicking through AWS Console setting up production VPC, subnets, RDS, Lambda functions, API Gateway, CloudFront... 🖱️

Week 2: Boss: "We need a staging environment that matches production." Me: opens Console "This will take a while..." 😰

Week 3: Created staging environment. Forgot to add NAT Gateway in one subnet. Lambda functions couldn't reach the internet. Debugging took 4 hours! 🐛

Week 4: Junior dev accidentally deleted a security group in production (Console makes it too easy!). Site went down. We had NO record of what the settings were! 💥

Week 5: Me: "There has to be a better way!" Googles "Infrastructure as Code" 🔍

Week 6: Converted everything to Terraform. Now I can:

  • Recreate entire infrastructure in 5 minutes ✅
  • See exactly what changed in Git diffs ✅
  • Roll back bad changes instantly ✅
  • Never lose sleep over "what was that setting again?" ✅

In production, I've deployed hundreds of AWS resources using IaC. Never going back to console clicking! 🚀

CloudFormation: The AWS Native Way 📜

What it is: AWS's own IaC tool. Uses JSON or YAML templates to define infrastructure.

Why I started with CloudFormation:

# vpc.yaml - CloudFormation template
AWSTemplateFormatVersion: '2010-09-09'
Description: 'Production VPC'

Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
      EnableDnsHostnames: true
      EnableDnsSupport: true
      Tags:
        - Key: Name
          Value: production-vpc

  PublicSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: 10.0.1.0/24
      AvailabilityZone: !Select [0, !GetAZs '']
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: public-subnet-1a

  InternetGateway:
    Type: AWS::EC2::InternetGateway
    Properties:
      Tags:
        - Key: Name
          Value: production-igw

  AttachGateway:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      VpcId: !Ref VPC
      InternetGatewayId: !Ref InternetGateway

Deploy it:

aws cloudformation create-stack \
  --stack-name production-vpc \
  --template-body file://vpc.yaml

What happens:

  • CloudFormation reads the template
  • Creates resources in dependency order (the VPC first, then the subnet and gateway attachment that reference it)
  • Tracks all resources in a "stack"
  • Updates or deletes everything together

What I loved:

  • ✅ Native AWS integration (works perfectly with all AWS services)
  • ✅ No extra tools to install (comes with AWS)
  • ✅ Free! (no additional cost)
  • ✅ Automatic rollback on failure
  • ✅ IAM integration is seamless

What drove me crazy:

  • ❌ YAML/JSON syntax is verbose and clunky
  • ❌ Error messages are cryptic AF ("Resource failed to create" - WHY?!)
  • ❌ AWS-centric (third-party resource types exist in the CloudFormation Registry, but managing GitHub, Cloudflare, Datadog, etc. is awkward at best)
  • ❌ Previewing changes is clunky (change sets exist, but they are painful to use)
  • ❌ State management is implicit (sometimes confusing)

Terraform: The Swiss Army Knife 🛠️

What it is: HashiCorp's multi-cloud IaC tool. Uses HCL (HashiCorp Configuration Language).

Why I switched to Terraform:

# vpc.tf - Terraform config
provider "aws" {
  region = "us-east-1"
}

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name = "production-vpc"
  }
}

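# Look up the availability zones in the configured region
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_subnet" "public" {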
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.1.0/24"
  availability_zone       = data.aws_availability_zones.available.names[0]
  map_public_ip_on_launch = true

  tags = {
    Name = "public-subnet-1a"
  }
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "production-igw"
  }
}

Deploy it:

terraform init    # Download AWS provider
terraform plan    # Preview changes (AMAZING!)
terraform apply   # Create infrastructure

What I loved:

  • ✅ HCL syntax is clean and readable
  • ✅ terraform plan shows EXACTLY what will change before applying
  • ✅ Works with 3000+ providers (AWS, GCP, Azure, GitHub, Cloudflare, etc.)
  • ✅ State is explicit and inspectable (terraform state list shows every tracked resource)
  • ✅ Amazing error messages
  • ✅ Huge community and module ecosystem

What annoyed me:

  • ❌ Extra tool to install and learn
  • ❌ State file management (can be tricky in teams)
  • ❌ Sometimes lags behind new AWS features
  • ❌ Community modules vary in quality

The Head-to-Head Comparison ⚔️

Let me show you REAL production scenarios:

Scenario 1: Creating an S3 Bucket

CloudFormation:

Resources:
  MyBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: my-production-bucket
      VersioningConfiguration:
        Status: Enabled
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true
        BlockPublicPolicy: true
        IgnorePublicAcls: true
        RestrictPublicBuckets: true
      LifecycleConfiguration:
        Rules:
          - Id: DeleteOldVersions
            Status: Enabled
            NoncurrentVersionExpirationInDays: 30

Terraform:

resource "aws_s3_bucket" "main" {
  bucket = "my-production-bucket"
}

resource "aws_s3_bucket_versioning" "main" {
  bucket = aws_s3_bucket.main.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_public_access_block" "main" {
  bucket = aws_s3_bucket.main.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

resource "aws_s3_bucket_lifecycle_configuration" "main" {
  bucket = aws_s3_bucket.main.id

  rule {
    id     = "delete-old-versions"
    status = "Enabled"

    filter {} # empty filter: apply the rule to every object in the bucket

    noncurrent_version_expiration {
      noncurrent_days = 30
    }
  }
}

Winner: CloudFormation - More concise for AWS-only resources!

Scenario 2: Creating Lambda with Dependencies

CloudFormation:

Resources:
  LambdaExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole

  MyLambda:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: my-function
      Runtime: nodejs18.x
      Handler: index.handler
      Code:
        ZipFile: |
          exports.handler = async (event) => {
            return { statusCode: 200, body: 'Hello!' };
          };
      Role: !GetAtt LambdaExecutionRole.Arn

Terraform:

resource "aws_iam_role" "lambda" {
  name = "lambda-execution-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = {
        Service = "lambda.amazonaws.com"
      }
      Action = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy_attachment" "lambda_basic" {
  role       = aws_iam_role.lambda.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}

resource "aws_lambda_function" "main" {
  filename      = "lambda.zip"
  function_name = "my-function"
  role          = aws_iam_role.lambda.arn
  handler       = "index.handler"
  runtime       = "nodejs18.x"
}

Winner: Tie - Both are reasonable!

Scenario 3: Multi-Cloud Setup (AWS + Cloudflare DNS)

CloudFormation:

# Can't do it! CloudFormation only supports AWS! 😭

Terraform:

# AWS resources
provider "aws" {
  region = "us-east-1"
}

resource "aws_s3_bucket" "website" {
  bucket = "my-website"
}

resource "aws_cloudfront_distribution" "website" {
  origin {
    domain_name = aws_s3_bucket.website.bucket_regional_domain_name
    origin_id   = "S3-my-website"
  }
  # ... more config
}

# Cloudflare DNS
provider "cloudflare" {
  api_token = var.cloudflare_token
}

resource "cloudflare_record" "website" {
  zone_id = var.cloudflare_zone_id
  name    = "www"
  value   = aws_cloudfront_distribution.website.domain_name
  type    = "CNAME"
  proxied = true
}

# GitHub repo webhooks
provider "github" {
  token = var.github_token
}

resource "github_repository_webhook" "deploy" {
  repository = "my-website"
  events     = ["push"]

  configuration {
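    # Assumes an aws_lambda_function_url "deploy" resource defined elsewhere;
    # the URL comes from that resource, not from aws_lambda_function itself.
    url          = aws_lambda_function_url.deploy.function_url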
    content_type = "json"
  }
}

Winner: Terraform - CloudFormation can't even compete here! 🏆
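
One detail the snippet above skips: providers that live outside the hashicorp namespace need an explicit source, otherwise terraform init goes looking for hashicorp/cloudflare and fails. Here's a minimal sketch of the missing plumbing. The version constraints are my assumptions for illustration (pin to whatever you've actually tested), and the variable declarations simply back the var.* references used above:

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    cloudflare = {
      source  = "cloudflare/cloudflare"
      version = "~> 4.0" # assumption: a 4.x release that still ships cloudflare_record
    }
    github = {
      source  = "integrations/github"
      version = "~> 6.0" # assumption: any recent major should work for webhooks
    }
  }
}

# Secrets stay out of the code; pass them via TF_VAR_* env vars or a tfvars file
variable "cloudflare_token" {
  type      = string
  sensitive = true
}

variable "cloudflare_zone_id" {
  type = string
}

variable "github_token" {
  type      = string
  sensitive = true
}
```

Marking the tokens as sensitive keeps them out of plan output, though they can still end up in the state file, which is one more reason to lock down your state backend.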

Scenario 4: Preview Changes Before Applying

CloudFormation:

# Create change set (clunky!)
aws cloudformation create-change-set \
  --stack-name my-stack \
  --template-body file://template.yaml \
  --change-set-name my-changes

# Wait for it to process...
aws cloudformation wait change-set-create-complete \
  --stack-name my-stack \
  --change-set-name my-changes

# View the change set (hard to read!)
aws cloudformation describe-change-set \
  --stack-name my-stack \
  --change-set-name my-changes

# Output:
# {
#   "Changes": [
#     {
#       "Type": "Resource",
#       "ResourceChange": {
#         "Action": "Modify",
#         "LogicalResourceId": "MyBucket",
#         "ResourceType": "AWS::S3::Bucket",
#         "Replacement": "False"
#       }
#     }
#   ]
# }
# "What exactly is changing?!" 🤔

Terraform:

terraform plan

# Output (beautiful and clear!):
# Terraform will perform the following actions:
#
#   # aws_s3_bucket.main will be updated in-place
#   ~ resource "aws_s3_bucket" "main" {
#         bucket = "my-bucket"
#       ~ versioning {
#           ~ enabled = false -> true
#         }
#     }
#
# Plan: 0 to add, 1 to change, 0 to destroy.

Winner: Terraform - Plan preview is a GAME CHANGER! 🎯

When I Use CloudFormation vs Terraform 🤔

After 7 years of AWS deployments, here's my decision tree:

Use CloudFormation When:

✅ AWS-only infrastructure

# Simple Lambda + API Gateway
# CloudFormation has great AWS integration!

✅ Serverless Framework / SAM

# Both use CloudFormation under the hood
serverless deploy
sam deploy

✅ AWS Service Catalog

# Your company mandates CloudFormation templates
# (compliance/governance reasons)

✅ CDK (Cloud Development Kit)

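// Write in TypeScript, synthesizes to CloudFormation!
import * as s3 from 'aws-cdk-lib/aws-s3';

// Inside a Stack or Construct class, where `this` is the scope: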
const bucket = new s3.Bucket(this, 'MyBucket', {
  versioned: true
});

My CloudFormation projects:

  • Serverless APIs (Serverless Framework)
  • AWS CDK infrastructure
  • Simple single-service deployments

Use Terraform When:

✅ Multi-cloud infrastructure

# AWS + GCP + Cloudflare + Datadog

✅ Complex infrastructure with modules

# Reusable VPC module used across 10 projects
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"
  # ...
}

✅ Need plan preview before every change

terraform plan  # See before you destroy! 🛡️

✅ Managing non-AWS resources

# GitHub repos, Cloudflare DNS, PagerDuty, etc.

My Terraform projects:

  • Production VPC infrastructure
  • Multi-account AWS setups
  • Infrastructure spanning AWS + Cloudflare + GitHub
  • Anything I want tight control over

The Hybrid Approach (What I Actually Do) 🎯

Plot twist: I use BOTH in production!

Architecture:

My Production Infrastructure:
├── Terraform (Core Infrastructure)
│   ├── VPC, Subnets, Route Tables
│   ├── RDS Databases
│   ├── ElastiCache Clusters
│   ├── IAM Roles and Policies
│   ├── S3 Buckets and Policies
│   └── CloudFront Distributions
│
└── CloudFormation (Applications)
    ├── Serverless Framework (API endpoints)
    ├── SAM (Lambda functions)
    └── CDK (Event-driven workflows)

Why this works:

Terraform = Foundation (changes rarely, managed carefully)

  • VPCs don't change often
  • Database infrastructure is stable
  • Perfect for terraform plan workflow

CloudFormation = Applications (deployed frequently)

  • Lambda functions update daily
  • Serverless Framework makes deployments easy
  • SAM templates are simple and fast

When architecting on AWS, I learned: Use the right tool for the job! Terraform for infrastructure, CloudFormation for serverless apps! 🏗️
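
A practical question with this split: how does the CloudFormation side learn the IDs that Terraform created? One option (a sketch, not the only way) is to publish them to SSM Parameter Store, which CloudFormation templates can read through SSM parameter types or dynamic references. The parameter paths below are hypothetical, and the snippet assumes the aws_vpc.main and aws_subnet.public resources from the VPC example earlier:

```hcl
# Publish network IDs so stacks outside Terraform can look them up by name
resource "aws_ssm_parameter" "vpc_id" {
  name  = "/production/network/vpc_id" # hypothetical naming scheme
  type  = "String"
  value = aws_vpc.main.id
}

resource "aws_ssm_parameter" "public_subnet_id" {
  name  = "/production/network/public_subnet_id" # hypothetical naming scheme
  type  = "String"
  value = aws_subnet.public.id
}
```

The nice part of this design is that the application stacks depend on a stable parameter name instead of a hardcoded ID, so the foundation can be rebuilt without touching them.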

Common IaC Mistakes I Made (Learn From My Pain!) 🪤

Mistake #1: Not Using Remote State

Bad (local state):

# terraform.tfstate stored locally
# Team member makes changes
# State conflicts! 💥

Good (remote state):

terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "production/vpc/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"  # Prevents concurrent changes!
  }
}

Lesson learned: Remote state in S3 + DynamoDB locking from DAY ONE! 🔒
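
The backend block assumes the bucket and lock table already exist. Here's a minimal sketch of bootstrapping them, reusing the names from the backend config above. It's a chicken-and-egg situation, so I'd assume this lives in a small separate configuration that you apply once with local state:

```hcl
resource "aws_s3_bucket" "tf_state" {
  bucket = "my-terraform-state"
}

# Versioning lets you recover an older state file if one gets corrupted
resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_dynamodb_table" "tf_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID" # the S3 backend expects exactly this key name

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

On-demand billing keeps the lock table effectively free for a small team, since it only sees a couple of requests per plan or apply.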

Mistake #2: Hardcoding Values

Bad:

resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1f0"  # Hardcoded!
  instance_type = "t3.micro"
  key_name      = "my-laptop-key"  # Works on my machine!
}

Good:

variable "ami_id" {
  description = "AMI ID for web server"
  type        = string
}

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro"
}

resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = var.instance_type
  key_name      = aws_key_pair.deploy.key_name  # assumes an aws_key_pair "deploy" resource defined elsewhere
}

Lesson learned: Variables make infrastructure reusable! 🎯
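
Variables also give you a place to catch bad input before anything reaches AWS. As a sketch, here's an extended version of the ami_id declaration above with a validation block. The regex is my own loose assumption about what an AMI ID looks like, not an official format check:

```hcl
variable "ami_id" {
  description = "AMI ID for web server"
  type        = string

  validation {
    # Reject obvious typos early, at plan time
    condition     = can(regex("^ami-[0-9a-f]+$", var.ami_id))
    error_message = "The ami_id value must look like ami-0123456789abcdef0."
  }
}
```

Pair this with one tfvars file per environment and pick it at apply time with terraform apply -var-file=staging.tfvars (the file name is just an example).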

Mistake #3: No Tagging Strategy

Bad:

resource "aws_instance" "web" {
  # No tags - good luck finding this in AWS Console!
}

Good:

locals {
  common_tags = {
    Environment = "production"
    Project     = "ecommerce"
    ManagedBy   = "terraform"
    CostCenter  = "engineering"
  }
}

resource "aws_instance" "web" {
  # ...
  tags = merge(local.common_tags, {
    Name = "web-server-1"
    Role = "frontend"
  })
}

Lesson learned: Tags = cost tracking + organization + sanity! 🏷️
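
If every resource in a configuration shares the same base tags, the AWS provider can apply them for you through default_tags, so you don't need merge() on each resource. A minimal sketch, reusing the tag values from the locals block above:

```hcl
provider "aws" {
  region = "us-east-1"

  # Applied to every taggable resource this provider creates
  default_tags {
    tags = {
      Environment = "production"
      Project     = "ecommerce"
      ManagedBy   = "terraform"
      CostCenter  = "engineering"
    }
  }
}
```

Individual resources then only declare what is specific to them (Name, Role), and resource-level tags win when a key collides with a default.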

Mistake #4: Not Using Modules

Bad (copy-paste everywhere):

# vpc-dev.tf
resource "aws_vpc" "dev" { ... }
resource "aws_subnet" "dev_public_1a" { ... }
resource "aws_subnet" "dev_public_1b" { ... }
# ... 50 more lines

# vpc-staging.tf
resource "aws_vpc" "staging" { ... }  # Copy-paste!
resource "aws_subnet" "staging_public_1a" { ... }
# ... same 50 lines

Good (reusable module):

# modules/vpc/main.tf
resource "aws_vpc" "main" {
  cidr_block = var.cidr_block
  # ...
}

# environments/dev.tf
module "dev_vpc" {
  source     = "../modules/vpc"
  cidr_block = "10.0.0.0/16"
  env        = "dev"
}

# environments/staging.tf
module "staging_vpc" {
  source     = "../modules/vpc"
  cidr_block = "10.1.0.0/16"
  env        = "staging"
}

Lesson learned: Modules = DRY principle for infrastructure! 🔁
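
The module calls above pass cidr_block and env, so the module has to declare both. Here's a rough sketch of what the rest of modules/vpc could look like. The tag layout and the vpc_id output are my assumptions about a sensible interface, not the exact module from my projects:

```hcl
# modules/vpc/variables.tf
variable "cidr_block" {
  description = "CIDR range for the VPC"
  type        = string
}

variable "env" {
  description = "Environment name, used for naming and tagging"
  type        = string
}

# modules/vpc/main.tf
resource "aws_vpc" "main" {
  cidr_block = var.cidr_block

  tags = {
    Name        = "${var.env}-vpc"
    Environment = var.env
  }
}

# modules/vpc/outputs.tf
output "vpc_id" {
  description = "ID of the VPC created by this module"
  value       = aws_vpc.main.id
}
```

Callers can then reference module.dev_vpc.vpc_id or module.staging_vpc.vpc_id without knowing anything about the module's internals.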

Mistake #5: Not Testing Changes

Bad workflow:

# Just YOLO it!
terraform apply -auto-approve
# "What could go wrong?" 😱

Good workflow:

# Always plan first!
terraform plan -out=tfplan

# Review the plan (tfplan is a binary file, so cat won't help)
terraform show tfplan  # Read EVERYTHING

# Apply only if plan looks good
terraform apply tfplan

# Monitor AWS Console during apply
# Watch CloudWatch metrics

In production, I've learned: terraform plan has saved me from destroying production databases at least 5 times! 🛡️
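
Reading the plan is the habit. A guard rail is the backup for the day you skim it. One option is the prevent_destroy lifecycle flag, which makes Terraform refuse any plan that would destroy the resource. This is a sketch with placeholder values (identifier, engine, sizing, and username are all hypothetical), not my actual database config:

```hcl
resource "aws_db_instance" "main" {
  identifier        = "production-db" # hypothetical name
  engine            = "postgres"
  instance_class    = "db.t3.micro"
  allocated_storage = 20
  username          = "app_admin" # hypothetical

  # Let RDS generate and store the master password in Secrets Manager
  manage_master_user_password = true

  # AWS-side protection: the API rejects delete calls while this is true
  deletion_protection       = true
  skip_final_snapshot       = false
  final_snapshot_identifier = "production-db-final"

  lifecycle {
    # Terraform-side protection: plan fails if it would destroy this resource
    prevent_destroy = true
  }
}
```

The two protections overlap on purpose: prevent_destroy stops Terraform itself, while deletion_protection also covers someone deleting the database from the Console or CLI.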

The Decision Matrix: Which Should You Learn? 🎓

Learn CloudFormation if:

  • ✅ You're AWS-only (no multi-cloud plans)
  • ✅ You use Serverless Framework or SAM
  • ✅ You want zero external dependencies
  • ✅ Your team already uses it

Learn Terraform if:

  • ✅ You want to learn once, use everywhere (multi-cloud)
  • ✅ You need terraform plan workflow
  • ✅ You value community modules
  • ✅ You're starting fresh (most modern choice)

Learn BOTH if:

  • ✅ You're a professional DevOps/Cloud engineer (like me!)
  • ✅ You want maximum flexibility
  • ✅ You're serious about infrastructure

My recommendation: Start with Terraform (more transferable skills), learn CloudFormation when you need Serverless Framework! 🚀

Quick Start: Your First Terraform Project ✅

Ready to ditch console clicking? Here's your starter template:

1. Install Terraform:

# macOS
brew tap hashicorp/tap
brew install hashicorp/tap/terraform

# Linux
wget https://releases.hashicorp.com/terraform/1.7.0/terraform_1.7.0_linux_amd64.zip
unzip terraform_1.7.0_linux_amd64.zip
sudo mv terraform /usr/local/bin/

2. Create main.tf:

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

resource "aws_s3_bucket" "my_first_bucket" {
  bucket = "my-unique-bucket-name-12345"

  tags = {
    Name        = "My First Terraform Bucket"
    Environment = "learning"
  }
}

output "bucket_name" {
  value = aws_s3_bucket.my_first_bucket.bucket
}

3. Deploy:

terraform init     # Download AWS provider
terraform plan     # Preview changes
terraform apply    # Create the bucket!

4. Verify:

aws s3 ls | grep my-unique-bucket
# You'll see your bucket! 🎉

5. Destroy (cleanup):

terraform destroy  # Remove everything

That's it! You just managed AWS infrastructure as code! 🏆

The Bottom Line 💡

IaC isn't just "best practice" - it's the difference between professional infrastructure and chaos!

The essentials:

  1. Version control everything (infrastructure = code)
  2. Automate deployments (no more console clicking)
  3. Use remote state (team collaboration)
  4. Plan before apply (prevent disasters)
  5. Tag everything (cost tracking + organization)

The truth about IaC:

It's not "CloudFormation vs Terraform" - it's "clicking buttons like a caveman vs managing infrastructure like a pro!" Both tools are great. Pick one and LEARN IT!

When architecting our serverless backends, I learned this: Console clicking doesn't scale. Teams need reproducible infrastructure. Git history shows who changed what. IaC is mandatory for professional cloud engineering! And honestly? Terraform's plan command has saved my ass more times than I can count! 🙏

You don't need perfect IaC from day one - you need to START using it and iterate! 🚀

Your Action Plan 🎯

This week:

  1. Stop creating resources in AWS Console (seriously, stop!)
  2. Install Terraform or set up AWS CLI for CloudFormation
  3. Convert ONE resource to IaC (start with an S3 bucket)
  4. Commit your .tf or .yaml file to Git

This month:

  1. Convert your entire dev environment to IaC
  2. Set up remote state backend (S3 + DynamoDB)
  3. Create reusable modules for common patterns
  4. Write deployment documentation for your team

This quarter:

  1. Migrate production to IaC (carefully!)
  2. Implement CI/CD for infrastructure deployments
  3. Set up cost monitoring with proper tagging
  4. Become the IaC guru on your team! 🏆

Real talk: The best IaC tool is the one you actually USE! Stop debating, start building!


Still clicking buttons in AWS Console? Connect with me on LinkedIn and share your IaC journey!

Want to see my Terraform modules? Check out my GitHub - production-ready VPC, Lambda, and RDS modules!

Now go forth and codify your infrastructure! ☁️💻


P.S. If you've manually recreated your infrastructure in multiple environments, you know the pain! IaC eliminates this forever. Your future self will thank you! 🙏

P.P.S. I once spent 8 hours debugging why staging didn't match production. Turns out I clicked the wrong checkbox in Console 3 months ago. Never happened again after switching to Terraform. Learn from my pain - USE IaC! 💸