Terraform vs Pulumi vs AWS CDK vs OpenTofu: Real-World Comparison

What Actually Happens When You Pick Each Tool

Reality Check	Terraform	Pulumi	AWS CDK	OpenTofu
What you'll be writing	HCL (looks like JSON had a baby with YAML)	Real code (TypeScript, Python, etc.)	TypeScript that generates CloudFormation	Same HCL as Terraform
How often it breaks	State corruption every 6 months	TypeScript errors make you question life	CloudFormation templates hit 500-resource limit	Same issues as Terraform
3am debugging involves	`terraform state rm` and praying	Stack traces from hell	Reading 10MB CloudFormation errors	OpenTofu forums + old TF docs
Cloud support	Works everywhere, slowly	Works most places, faster	AWS only, but actually works	Same as Terraform without vendor lock-in
Learning curve	Learn HCL + 200 providers	Learn programming + infrastructure + Pulumi	Learn AWS + programming + CDK + CloudFormation	Copy your Terraform configs
State file nightmares	Every fucking week	Handled for you (thank god)	CloudFormation's problem now	Same weekly nightmares
When it works	Bulletproof for simple stuff	Great for complex logic	Perfect AWS integration	Terraform without the corporate overlords
License situation	Vendor lock-in disguised as BSL	Actually open source	AWS owns your soul anyway	Truly open source

What I Learned Deploying These Tools in Production

I've deployed infrastructure with all four tools, broken production with all of them, and debugged the aftermath at ungodly hours. Here's what really happens, not the polished bullshit on their marketing sites.

Terraform: The Boring Choice That Actually Works

Terraform is like driving a Honda Civic - not exciting, but it gets you there. I've deployed everything from 3-server startups to 500-node Kubernetes clusters with it.

What works: HCL is easy to read during code reviews. Your ops team can learn it in a week. The provider ecosystem is massive - if it has an API, there's probably a Terraform provider for it. Over 3,000 providers covering everything from AWS to GitHub to PagerDuty.

What doesn't: Try doing anything dynamic and you'll want to scream. I spent 4 hours debugging why `count` wouldn't work with computed values. The error was "Error: value of 'count' cannot be computed" - super fucking helpful, right? This is why `for_each` exists, but good luck explaining that to someone learning HCL.

The state file will corrupt eventually. I've seen it happen to every team. You'll be running `terraform state pull | jq` at 2am trying to figure out why your load balancer disappeared from the state but not from AWS. The state corruption GitHub issue has 500+ comments because this happens constantly.

Recent bullshit: Terraform 1.7+ finally fixed some of the count/for_each crap that's been broken since 2019. The "deferred actions" feature helps, but we found out the hard way that it breaks if you use depends_on with computed values. Broke our CI pipeline three fucking times in two weeks.

Pulumi: For When You're Tired of HCL's Bullshit

Pulumi lets you write actual code. After years of fighting HCL's limitations, being able to use real loops and conditionals feels like freedom. You can write infrastructure in TypeScript, Python, Go, C#, and even Java.

What works: TypeScript autocompletion is a game changer. Unit testing your infrastructure code with Jest feels natural. Complex logic that took 200 lines of HCL becomes 20 lines of TypeScript. The Pulumi Registry has native providers that are way faster than the bridged ones.

What doesn't: The learning curve is brutal if your team doesn't live in an IDE. I literally watched a senior ops engineer with 15 years experience stare at a TypeScript promise chain for 20 minutes and then ask me what .then() does. When shit breaks, you get stack traces that look like someone vomited Java all over your terminal - especially when the real error is 12 frames deep in some AWS SDK v3 bullshit. And cross-language stack references? Good fucking luck - I've seen grown engineers quit over trying to pass outputs between a TypeScript stack and a Python one.

Recent failure: Our staging went dark for 4 hours because Pulumi decided to set our instance count to zero. Turns out Number("") returns 0, not 3 like we expected. TypeScript's compiler was totally fine with it because technically it's valid code. I spent the rest of that Friday setting up every fucking ESLint rule Pulumi recommends.
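The root cause here is that `Number("")` evaluates to `0`, and the compiler has no reason to complain about it. A minimal sketch of a stricter parser, assuming the count arrives as an optional string (say, from an environment variable or stack config). `parseInstanceCount` and the fallback of 3 are hypothetical names and values for illustration, not part of any Pulumi API:

```typescript
// Hypothetical helper: parse an instance count from an optional string.
// Falls back to a default when the value is missing or blank, and throws
// on anything that is not a positive integer instead of silently using 0.
function parseInstanceCount(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") {
    return fallback; // Number("") would have produced 0 here
  }
  const parsed = Number(raw);
  if (!isFinite(parsed) || Math.floor(parsed) !== parsed || parsed < 1) {
    throw new Error(`Invalid instance count: "${raw}"`);
  }
  return parsed;
}

console.log(Number(""));                 // 0, the value that took staging down
console.log(parseInstanceCount("", 3));  // 3
console.log(parseInstanceCount("5", 3)); // 5
```

Failing loudly on garbage input is a design choice: a deployment that refuses to run is cheaper than one that scales to zero.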

War story: Pulumi's state encryption saved our ass when a contractor accidentally committed AWS credentials to Git. Traditional Terraform state files would have exposed everything in plaintext. The Pulumi Service encrypts secrets by default.

AWS CDK: AWS Lock-in Disguised as Developer Convenience

AWS CDK is perfect if you're all-in on AWS and never plan to leave. It generates CloudFormation templates, which means you get AWS's change management for free, but also inherit all of CloudFormation's limitations. The CDK Construct Hub has over 7,000 reusable components.

What works: New AWS features are available immediately through the AWS Construct Library. The constructs are well-designed - creating an entire VPC with subnets, route tables, and NAT gateways is 5 lines of code. The generated CloudFormation is actually readable, unlike hand-written templates.

What doesn't: The 500-resource limit per CloudFormation stack will bite you in the ass. We had to split our infrastructure into 8 different CDK apps just to stay under the limit. Cross-stack references become a nightmare when you have circular dependencies.

Production disaster: CDK spit out an 850KB CloudFormation template. CloudFormation only accepts templates up to 51,200 bytes inline; anything bigger has to be uploaded to S3 first, where the cap is 1MB (it used to be 450KB before 2020). CDK tries to be smart and handles that upload for you, but our deployment role couldn't write to S3, so our deployment just... stopped working. At 2am on a Tuesday. The error? "Unable to upload template" - that's it. Took me 4 hours of digging through CloudTrail logs to figure out the permissions issue.
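Both limits are cheap to check before you deploy. A rough sketch of a pre-deploy check, assuming you feed it the text of a synthesized template (for example a file under `cdk.out/`). `checkTemplate` and `TemplateReport` are hypothetical names, and the size is approximated by string length, which only matches the byte count for ASCII-only templates:

```typescript
// CloudFormation quotas as documented by AWS at the time of writing.
const MAX_RESOURCES_PER_STACK = 500;
const MAX_INLINE_TEMPLATE_BYTES = 51200;   // template body passed directly in the API call
const MAX_S3_TEMPLATE_BYTES = 1024 * 1024; // "1 MB" for S3 uploads, taken here as 1024 * 1024

interface TemplateReport {
  resourceCount: number;
  approxBytes: number;
  needsS3Upload: boolean;
  problems: string[];
}

// Hypothetical helper: inspect a synthesized template before deploying it.
function checkTemplate(templateJson: string): TemplateReport {
  const template = JSON.parse(templateJson) as { Resources?: Record<string, unknown> };
  const resourceCount = Object.keys(template.Resources || {}).length;
  const approxBytes = templateJson.length; // approximation, see lead-in
  const problems: string[] = [];
  if (resourceCount > MAX_RESOURCES_PER_STACK) {
    problems.push(`${resourceCount} resources exceeds the ${MAX_RESOURCES_PER_STACK}-resource limit`);
  }
  if (approxBytes > MAX_S3_TEMPLATE_BYTES) {
    problems.push(`template is about ${approxBytes} bytes, over the 1 MB limit`);
  }
  return {
    resourceCount,
    approxBytes,
    needsS3Upload: approxBytes > MAX_INLINE_TEMPLATE_BYTES,
    problems,
  };
}

const sample = JSON.stringify({ Resources: { Bucket: { Type: "AWS::S3::Bucket" } } });
console.log(checkTemplate(sample));
```

If `needsS3Upload` comes back true, that is your cue to confirm the deployment role can actually write to the bucket before 2am rolls around.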

Version hell: CDK v1 to v2 migration broke every construct we'd written. The import paths changed, the APIs changed, even the basic app structure changed. It was like rewriting everything from scratch. The migration guide is 50 pages long for a reason.

OpenTofu: Terraform Without the Corporate Overlords

OpenTofu is what Terraform should have stayed - truly open source. It's 100% compatible with existing Terraform code, which means migration is literally s/terraform/tofu/g. The Linux Foundation backs it, so no surprise license changes.

What works: Drop-in replacement for Terraform. Recent releases fixed security vulnerabilities and improved state encryption. All your existing modules, providers, and state files work unchanged. The community governance model means no surprise license changes.

What doesn't: It's still Terraform, so all the same gotchas apply. State file corruption, limited HCL capabilities, and debugging nightmares are all still there. The community is smaller, so finding help can be harder - check the OpenTofu Slack instead of Stack Overflow.

The license drama: HashiCorp changed Terraform to BSL in 2023, which means you can't use it in competing products. Most companies don't care, but if you're building infrastructure tooling or SaaS platforms, OpenTofu is your escape hatch. Read the license FAQ if you're paranoid.

Team Reality Check

Your choice depends more on your team than the technology:

Ops teams that manage infrastructure: Stick with Terraform/OpenTofu. HCL is configuration, not programming.
Dev teams doing infrastructure: Go with Pulumi or CDK. Being able to write actual code is worth the learning curve.
Mixed teams: CDK if you're AWS-only, Terraform if you're multi-cloud. Pulumi if half your team are developers.

I've seen companies switch tools three times in two years trying to find the "perfect" solution. The perfect tool is the one your team will actually use correctly.

Alright, you've seen the carnage each tool can create. Now let's talk money and the shit they don't put on their pricing pages.

The Shit They Don't Tell You: Real Feature Comparison

Pain Point	Terraform	Pulumi	AWS CDK	OpenTofu
State File Corruption	Weekly occurrence	Handled by service	CloudFormation's problem	Same as Terraform
Provider Lag	Months behind AWS	Weeks behind AWS	Day-0 AWS support	Same as Terraform
Debugging Hell	HCL stack traces are useless	TypeScript stack traces from hell	CloudFormation errors are novels	Same debugging nightmares
Resource Limits	None (until your laptop dies)	None (until your wallet dies)	500 resources per stack	None (until your laptop dies)
Import Existing Resources	`terraform import` works 60% of the time	`pulumi import` mostly works	`cdk import` is hit or miss	`tofu import` same as terraform
Plan Takes Forever	15+ minutes with 1000+ resources	Usually fast	CloudFormation change sets take forever	15+ minutes with 1000+ resources
Parallelism Broken	Default parallelism=10 breaks things	Smart parallelism works	CloudFormation handles it	Default parallelism=10 breaks things

How to Pick a Tool Without Getting Fired

I've been through three tool migrations in five years.

Here's what I learned about making decisions that won't ruin your career.

Team Skills Trump Everything

The biggest mistake I see is choosing tools based on technical features instead of team capabilities. I watched a CTO pick Pulumi because "TypeScript is the future" while his entire ops team had zero programming experience. Six months later, they were back to Terraform after wasting $200K in consulting fees.

Reality check: Your senior ops engineer who's been managing infrastructure for 10 years isn't going to become a TypeScript developer overnight.

And your JavaScript developers aren't going to suddenly understand VPC routing just because they can write loops.

What actually works:

Ops-heavy team: Terraform/OpenTofu. HCL looks like config files they already understand.
Dev-heavy team: Pulumi or CDK. Let them use real programming languages.
Mixed team: Start with what the people managing production are comfortable with.

The AWS vs Multi-Cloud Decision

Most startups claim they're going multi-cloud.

They're usually bullshitting themselves.

Pick CDK if you're AWS-only and admit it. The deep AWS integration is worth the lock-in.

When EKS adds a new feature, CDK supports it the same day. Terraform providers lag by months.

Pick Terraform/OpenTofu if you actually deploy to multiple clouds. Not because you might someday, but because you already do. The provider ecosystem is unmatched: I've used Terraform to manage everything from GitHub repos to PagerDuty escalation policies.

Pick Pulumi if you're multi-cloud and your team codes. The abstraction layer helps when you need to deploy the same app to AWS and GCP.

Migration Costs Are Always Higher Than You Think

I told my CTO our Terraform → Pulumi migration would take 3 months. 8 months later, we were still debugging provider edge cases and our consultant was dodging my calls.

That $200K project became a $500K nightmare.

What we underestimated:

Converting modules and shared libraries
Retraining the team
Debugging provider differences
Updating all our documentation and runbooks
CI/CD pipeline changes

The 2-week rule: If you can't migrate a representative sample of your infrastructure in 2 weeks, multiply your estimate by 3.
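Applied to my own migration, the rule would have been uncomfortably accurate. A tiny sketch of the arithmetic, with `adjustedEstimateMonths` as a hypothetical helper name:

```typescript
// The rule of thumb from this article: if a representative sample cannot be
// migrated in two weeks, triple the original estimate.
function adjustedEstimateMonths(estimateMonths: number, sampleMigratedInTwoWeeks: boolean): number {
  return sampleMigratedInTwoWeeks ? estimateMonths : estimateMonths * 3;
}

console.log(adjustedEstimateMonths(3, false)); // 9
```

My 3-month estimate becomes 9 months, which is a lot closer to the 8+ months it actually took.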

Budget Reality vs Marketing Claims

In my experience, free tiers never stay free at scale.

Terraform Cloud starts at $20/user/month but you'll need the $70/user tier for policy enforcement and advanced features.

With 50 engineers, that's $3,500/month.

Pulumi Cloud starts at $40/month but hits you with 18¢ per resource over 500. We deployed a medium-sized K8s cluster and our bill jumped to $800/month overnight, because every pod, service, and ingress counts as a resource. The "Individual" tier gives you 500 deployment minutes, which sounds generous until you realize a full deployment takes 45 minutes and you deploy 3 times a day during development.
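To make those numbers concrete, here is a back-of-the-envelope sketch using only the prices quoted in this section. The function names are hypothetical, the prices are the ones I was quoted and will drift, and the 4,722-resource figure is just what the overage formula implies for an $800 bill:

```typescript
// Figures quoted in this article; check the vendors' current pricing pages
// before relying on them.
const TFC_POLICY_TIER_PER_USER = 70;      // USD per user per month
const PULUMI_BASE = 40;                   // USD per month
const PULUMI_INCLUDED_RESOURCES = 500;
const PULUMI_OVERAGE_PER_RESOURCE = 0.18; // USD per resource over the included 500

function terraformCloudMonthly(users: number): number {
  return users * TFC_POLICY_TIER_PER_USER;
}

function pulumiMonthly(resources: number): number {
  const overage = Math.max(0, resources - PULUMI_INCLUDED_RESOURCES);
  return PULUMI_BASE + overage * PULUMI_OVERAGE_PER_RESOURCE;
}

// How many days a deployment-minute allowance lasts at a given deploy cadence.
function daysOfDeployMinutes(allowance: number, minutesPerDeploy: number, deploysPerDay: number): number {
  return allowance / (minutesPerDeploy * deploysPerDay);
}

console.log(terraformCloudMonthly(50));                  // 3500
console.log(pulumiMonthly(4722).toFixed(2));             // 799.96
console.log(daysOfDeployMinutes(500, 45, 3).toFixed(1)); // 3.7
```

So an $800 bill works out to roughly 4,700 billable resources, and 500 deployment minutes are gone in under four days at that deploy cadence.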

AWS CDK has no platform cost, but your AWS bill will explode if you're not careful.

CDK makes it too easy to create expensive resources.

OpenTofu is actually free, but you pay in operational overhead.

You're responsible for runners, state storage, and backup strategies.

The License Trap

HashiCorp's license change in 2023 blindsided everyone.

Companies using Terraform in SaaS products technically need commercial licenses now.

Most companies don't care. You're probably fine if you're just managing your own infrastructure.

You should care if:

You're building infrastructure tooling as a product
You're a managed service provider
Your legal team is paranoid about vendor licensing

OpenTofu exists for a reason.

The Linux Foundation backing means no surprise license changes.

Version Hell and Breaking Changes

Terraform: Every major version breaks something. 0.12 → 0.13 → 0.14 → 1.0 each required rewriting parts of our codebase.

Pulumi: Rapid development means frequent breaking changes.

I've seen APIs change between minor versions.

CDK: The v1 → v2 migration was brutal.

Basically rewrote everything.

OpenTofu: Same breaking changes as Terraform since they maintain compatibility.

What Success Actually Looks Like

After five years of tool migrations, successful deployments have the same characteristics:

The team understands the tool: not just senior engineers, but everyone who might need to debug at 3am
Clear ownership model: who writes infrastructure code vs who reviews it vs who operates it
Standardized patterns: cookie-cutter templates for common use cases
Disaster recovery procedures: how to rebuild from scratch when everything breaks
Gradual adoption: start with non-critical environments, prove it works

My Recommendation Process

When teams ask me what tool to pick, I ask these questions:

Who will be on-call when this breaks? Pick the tool that person is comfortable with.
Are you actually multi-cloud today? Not planning to be, but actually are.
How complex is your infrastructure logic? Simple resources vs dynamic configurations.
What's your team's programming skill level? Be honest about this.
How risk-averse is your organization? Boring solutions are often right.

The most successful migrations I've seen were driven by real pain points, not technology trends. If your current tool works, don't change it just because something newer exists.

Starting fresh? Pick based on what your team actually knows, not what looks cool on Hacker News. You can migrate later when you hit real limits, not imaginary ones.

No matter which tool you pick, shit's going to break. And when it does at 3am on Saturday morning while you're trying to enjoy a beer, here are the questions you'll actually be googling and the answers that might save your weekend.

Questions You'll Actually Ask at 3am

My Terraform state file is corrupted and production is down. What do I do?

First, don't panic and don't run `terraform apply` blindly.

Quick fix:

1. `terraform state pull > backup.tfstate` to save what you have
2. `terraform state list` to see what's actually tracked
3. `terraform import` the critical resources that are missing
4. `terraform plan` to see the damage before fixing

Nuclear option: `terraform state rm` everything and re-import, but you'll lose all the metadata.

Pulumi users: This is why we switched. State corruption is handled by the service, not your laptop.

CDK users: CloudFormation handles state, so this isn't your problem.
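For Terraform users, the tedious part of that recovery is working out which resources fell out of state. A small sketch that diffs the addresses you expect against the output of `terraform state list`. `missingFromState` is a hypothetical helper and the resource addresses are made up for illustration:

```typescript
// Hypothetical helper: given the resource addresses you expect to be managed
// and the raw output of `terraform state list`, return the addresses that are
// missing from state and are therefore candidates for `terraform import`.
function missingFromState(expected: string[], stateListOutput: string): string[] {
  const tracked: { [address: string]: boolean } = {};
  stateListOutput.split(/\r?\n/).forEach((line) => {
    const address = line.trim();
    if (address !== "") {
      tracked[address] = true;
    }
  });
  return expected.filter(
    (address) => !Object.prototype.hasOwnProperty.call(tracked, address)
  );
}

// Illustrative addresses only.
const expected = ["aws_lb.public", "aws_instance.web", "aws_s3_bucket.assets"];
const stateList = "aws_instance.web\naws_s3_bucket.assets\n";
console.log(missingFromState(expected, stateList)); // logs the one missing address, aws_lb.public
```

Keeping the list of expected addresses somewhere outside the state file is the assumption doing the heavy lifting here; without it you are back to eyeballing the AWS Console.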

Why is my terraform plan taking 25 minutes?

Because Terraform queries every single resource to check its current state, and AWS APIs are slow.

Immediate relief:

`terraform plan -parallelism=20` to pump up the concurrency from the pathetic default of 10
`terraform plan -target=specific.resource` to only check what you actually changed
`terraform plan -refresh=false` to skip the state refresh if you're sure nothing changed

Longer fixes:

Break your monolith into smaller configs before you lose your mind
Use remote state for shared shit so teams aren't stepping on each other
Switch to Pulumi if you're tired of waiting; their engine is way faster

CDK users: CloudFormation change sets are also slow, but at least it's AWS's problem.

Which tool should I pick if I want to sleep at night?

Terraform if your team knows HCL and you don't mind state file babysitting. It's boring but predictable.
CDK if you're AWS-only. CloudFormation handles state management and AWS takes the blame when things break.
OpenTofu if you want Terraform without the licensing drama. Same stability, no vendor lock-in.
Avoid Pulumi if your on-call team doesn't write code. TypeScript stack traces at 3am are not fun.

My CDK deployment is failing with "Template too large" errors. WTF?

AWS chokes on CloudFormation templates bigger than 1MB (used to be 450KB before they increased it in 2020). CDK generated something massive and AWS is having none of it.

Immediate fix: CDK automatically uses S3 for large templates, so rerun `cdk deploy --require-approval never` and make sure your deployment role can write to the S3 bucket CDK uploads to.

If that fails:

1. Split your stack into multiple smaller stacks
2. Use `cdk synth` to see the generated CloudFormation
3. Look for repeated inline policies or large data sections

Long-term fix: Redesign your constructs to be smaller and more focused.

Can I migrate from Terraform to OpenTofu without breaking everything?

Yes, it's a drop-in replacement.

Migration steps:

1. `brew install opentofu` (or grab the binary from GitHub releases)
2. Find-replace `terraform` with `tofu` in all your scripts/CI configs
3. `tofu init -migrate-state` (it'll ask nicely before touching your state)
4. `tofu plan` to make sure nothing's fucked up

Gotcha: Don't forget your CI/CD pipeline, GitHub Actions, Docker images, etc. They all need `tofu` now.

Time estimate: Half day for the migration if you're organized, 2 weeks to find all the places you forgot to update.
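One caveat on the find-replace step: a blind replace also hits things that must keep their names, like the `.terraform/` directory and `terraform.tfstate` files. A rough sketch that only rewrites CLI invocations in a script. `rewriteCliCalls` and the subcommand list are hypothetical and deliberately incomplete:

```typescript
// Subcommands we expect to see in CI scripts; extend as needed (illustrative list).
const SUBCOMMANDS = ["init", "plan", "apply", "destroy", "validate", "fmt", "import", "state", "output", "show"];

// Matches `terraform <subcommand>` only when `terraform` starts a shell word,
// so `.terraform/`, `terraform.tfstate`, and `hashicorp/terraform` are left alone.
const CLI_CALL = new RegExp(
  `(^|[\\s;&|(])terraform(\\s+(?:${SUBCOMMANDS.join("|")})\\b)`,
  "gm"
);

// Hypothetical helper: swap the binary name in CLI calls, nothing else.
function rewriteCliCalls(script: string): string {
  return script.replace(CLI_CALL, "$1tofu$2");
}

console.log(rewriteCliCalls("terraform init && terraform plan -out=tf.plan"));
// tofu init && tofu plan -out=tf.plan
console.log(rewriteCliCalls("rm -rf .terraform && cat terraform.tfstate"));
// rm -rf .terraform && cat terraform.tfstate
```

It won't catch calls hidden behind wrappers or aliases, so treat it as a first pass and still grep your pipelines afterwards.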

Why does Pulumi cost so much compared to the others?

Because you're paying for the managed state service and compute resources.

Free tier reality: 2000 resources sounds generous until you deploy one medium K8s cluster and blow through it in a day. Each pod, service, ingress, configmap: they all count.

Cost breakdown (reality check):

Pulumi Team: $50/month base + resource overages
Pulumi Business: $100/month + compute costs for deployments
Terraform Cloud: $20-70/user/month (20 users = $400-1400/month)
OpenTofu: $0 but you manage everything

Hidden cost: Pulumi deployments run in their cloud, so complex deployments cost more compute time.

My terraform apply is stuck. How do I force it to continue?

Don't force it. Terraform is probably waiting on a resource that's taking forever to create/update.

Safe debugging:

1. Check AWS Console to see what's actually happening
2. `terraform show` to see the current state
3. Wait it out if AWS is just being slow (ELB creation takes 5+ minutes)

Nuclear options (dangerous):

`terraform apply -lock=false` to bypass locking
`terraform force-unlock LOCK_ID` if you're sure no other process is running
`terraform taint resource.name` then `terraform apply` to force recreation (on Terraform 0.15.2 and later, `terraform apply -replace=resource.name` is the recommended replacement for `taint`)

Better solution: Set realistic timeouts in your resource configurations.

Which tool has the least vendor lock-in?

OpenTofu: Linux Foundation governance, truly open source, no corporate owner.
Pulumi: Open source with commercial service, but you can self-host the backend.
Terraform: Open core but controlled by HashiCorp, BSL license limits some uses.
AWS CDK: Completely locked into AWS, generates CloudFormation templates you can't easily port.

Reality check: All IaC tools create some lock-in through your configuration code. The bigger risk is operational knowledge lock-in with your team.

Should I use workspaces or separate state files?

Separate state files. Workspaces are confusing and error-prone.

Why workspaces suck:

Easy to accidentally deploy to the wrong workspace
State corruption affects all environments
Hard to give different teams access to different environments

Better pattern:

environments/
├── dev/
├── staging/
└── prod/

Each directory has its own state file and configuration. More code, but much harder to accidentally destroy prod.

My team wants to switch from Terraform to Pulumi. Should we?

Don't fucking switch unless you have a real problem that's costing you sleep.

Good reasons to switch:

Your infrastructure logic is too complex for HCL
Your team is primarily developers who want real programming languages
You need better testing and CI/CD integration

Bad reasons to switch:

"TypeScript is more modern" (HCL works fine)
"The new developer prefers Pulumi" (train the developer)
"It looks cooler in demos" (you'll regret this)

Reality: Migration will take 3x longer than estimated and cost more than you think.

How do I debug CloudFormation errors from CDK?

Step 1: `cdk synth` to see the generated CloudFormation template
Step 2: Check the CloudFormation console for the actual error (CDK output is often useless)
Step 3: Look for the most common issues:

IAM permissions missing
Resource limits exceeded (500 resources per stack)
Circular dependencies between resources
Names that are too long (63 character limit for many AWS resources)

Step 4: Add more granular error handling in your CDK code

Pro tip: Enable CloudTrail to see exactly what AWS API calls are failing.
