Every vendor claims their tool is "free" until you actually try to use it in production. Then you get hit with bills that make you question your career choices.
CloudFormation: "Free" Like a Puppy Is Free
AWS CloudFormation doesn't cost anything upfront, which is nice until you realize it's like getting a free car that only runs on premium gas and breaks down every weekend.
Sure, basic AWS resources are free to provision. But the moment you need anything beyond the built-in AWS resource types, like third-party resource types from the CloudFormation registry, AWS starts charging $0.0009 per handler operation. Sounds cheap? Try running 10,000 handler operations a month against a Datadog monitor resource. That's $9 just for one type of resource. We had 247 Datadog monitors across 3 environments, and every update triggered handler operations. The AWS Free Tier only covers 1,000 handler operations per month, which sounds generous until you realize third-party providers can burn through that in a single deployment.
I learned this the hard way in November 2023 when we migrated our monitoring setup from Terraform to CloudFormation. What we thought would be a free migration ended up costing $347/month just in CloudFormation handler operations. The Datadog provider alone was generating 12,000+ operations per month because it was polling for state changes every 30 seconds. The bill breakdown looked like we were paying AWS for the privilege of using other people's tools. Check out this CloudFormation cost optimization guide if you want to understand just how deep this rabbit hole goes. There's also extensive documentation on custom resources that fails to mention the cost implications.
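If you want to sanity-check a bill like this, the per-operation arithmetic is easy to script. Here's a minimal sketch in Python, assuming the $0.0009 rate and the 1,000-operation free tier quoted above and ignoring every other charge. The function names are made up for illustration.

```python
PRICE_PER_HANDLER_OP = 0.0009  # USD per handler operation, as quoted in this post
FREE_TIER_OPS = 1_000          # free handler operations per month, as quoted in this post


def monthly_handler_cost(ops: int, free_ops: int = FREE_TIER_OPS) -> float:
    """Estimated monthly charge for `ops` third-party handler operations."""
    billable = max(0, ops - free_ops)
    return billable * PRICE_PER_HANDLER_OP


def ops_implied_by_bill(bill_usd: float, free_ops: int = FREE_TIER_OPS) -> int:
    """Roughly how many operations a bill implies if it is all per-operation charges."""
    return round(bill_usd / PRICE_PER_HANDLER_OP) + free_ops


print(f"10,000 ops: ${monthly_handler_cost(10_000):.2f}")
print(f"12,000 ops: ${monthly_handler_cost(12_000):.2f}")
print(f"$347 bill implies about {ops_implied_by_bill(347):,} ops")
```

With the free tier applied, 10,000 operations come out a little under the $9 headline figure; pass `free_ops=0` to reproduce the raw number. Running the inverse on your own invoice is a quick way to see how many operations you are really generating.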
Terraform: "Free" Until You Need to Sleep
Terraform itself costs nothing. Running Terraform without losing your sanity? That's where they get you.
You start with local state files. Then someone else touches your code and overwrites your state. So you set up an S3 backend. Then you need state locking with DynamoDB. The AWS best practices guide covers this setup, but doesn't mention the operational nightmare that follows. Here's a detailed analysis of why state locking still doesn't solve all your problems. Then someone accidentally corrupts the state file and you get this fucking error:
Error: Error acquiring the state lock
Error message: ConditionalCheckFailedException: The conditional request failed
And you spend 6 hours rebuilding infrastructure from memory because your backup state file is also corrupted (this was Friday night, 9 PM, while my kid was screaming in the background).
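If that error message looks cryptic, it helps to know where it comes from. With the S3 backend and a DynamoDB lock table, Terraform takes the lock by writing an item on the condition that no item with that lock ID already exists. If another run holds the lock, or a crashed run never released it, the write is rejected with ConditionalCheckFailedException. Here's a minimal in-memory sketch of that pattern in Python. It is a simulation for illustration only: `LockTable` and its methods are made up, and no AWS calls are made.

```python
class ConditionalCheckFailed(Exception):
    """Stand-in for DynamoDB's ConditionalCheckFailedException."""


class LockTable:
    """Hypothetical in-memory stand-in for a DynamoDB lock table."""

    def __init__(self):
        self._items = {}

    def put_if_absent(self, lock_id: str, holder: str) -> None:
        # Mimics a conditional put: succeed only if no item with this ID exists.
        if lock_id in self._items:
            raise ConditionalCheckFailed(
                f"{lock_id} is already locked by {self._items[lock_id]}"
            )
        self._items[lock_id] = holder

    def release(self, lock_id: str) -> None:
        # A clean run deletes its lock item when it finishes.
        self._items.pop(lock_id, None)


table = LockTable()
table.put_if_absent("my-bucket/prod.tfstate", "alice")  # first run takes the lock

try:
    table.put_if_absent("my-bucket/prod.tfstate", "bob")  # second run is refused
except ConditionalCheckFailed as err:
    print("Error acquiring the state lock:", err)

table.release("my-bucket/prod.tfstate")
table.put_if_absent("my-bucket/prod.tfstate", "bob")  # works once the lock is gone
```

A run that crashes mid-apply never reaches the release step, which is why a stale lock sticks around until someone clears it by hand, typically with `terraform force-unlock`.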
Within 3 months, you have an engineer dedicated full-time to Terraform operations. That's $120k/year before you even get to the infrastructure costs. Add monitoring, security scanning, and CI/CD integration, and you're easily at $200k annually just to make the "free" tool work. This state management guide breaks down the six most common issues you'll face. There's also this comprehensive troubleshooting guide that documents solutions to problems you didn't even know existed yet.
We ran pure Terraform for 2 years, from v0.12 through v1.3. I calculated we spent more on engineering time debugging state file issues than we would have on HCP Terraform. The breaking point was when our lead DevOps engineer quit after spending a weekend in February 2023 recovering from a corrupted state file that took down staging. The state got corrupted during a partial apply when Terraform v1.2.8 crashed mid-execution. This horror story from AWS shows what happens when state corruption goes undetected. There's also this recovery guide that walks through fixing corrupted state files - bookmark it now because you'll need it later.
Pulumi: Death by a Thousand Credits
Pulumi's credit system is designed to confuse you into spending money. They count every individual resource, including shit you didn't even know existed.
We thought we had 200 resources in our Kubernetes cluster. Pulumi's console showed 847 billable resources - I remember the exact number because I stared at that fucking dashboard for 3 hours trying to figure out where all the resources came from. Turns out every security group rule counts separately. Every IAM policy attachment is a separate resource. Every individual subnet route is a resource.
Our "simple" EKS cluster with 5 t3.medium nodes suddenly cost $423/month just in Pulumi credits. The actual compute was only $180/month. The worst part? You don't find out until after deployment. Their cost estimator is useless because nobody knows their actual resource count until Pulumi counts it for you. This detailed pricing analysis breaks down exactly how their credit system works and why it's designed to be confusing. There's also this pricing overview that explains the difference between declared resources and billable resources.
I spent a weekend in March 2024 trying to optimize our resource usage, combining security groups and IAM roles. Cut it down from 847 to 392 resources and felt proud. Then we added one more microservice (just a simple Express.js API) and were back up to 623 resources. Pulumi counted every single environment variable injection as a separate resource.
Want to see how fucked you really are? Run this:
pulumi stack --show-urns | grep -E "(aws:|kubernetes:)" | wc -l
That's your actual billable resource count. Spoiler: it's way higher than you think.
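For a per-provider breakdown instead of a single number, you can count entries in the checkpoint that `pulumi stack export` prints. Here's a rough sketch in Python, assuming the export has a `deployment.resources` list where each entry carries a `type` such as `aws:ec2/securityGroupRule:SecurityGroupRule`. The sample data is invented, and whether every entry maps one-to-one to what Pulumi bills is an assumption to check against your own dashboard.

```python
import json
from collections import Counter


def count_by_provider(export: dict) -> Counter:
    """Count resources per provider prefix (the part of `type` before the first colon)."""
    resources = export.get("deployment", {}).get("resources", [])
    return Counter(r["type"].split(":", 1)[0] for r in resources if "type" in r)


# Invented sample shaped like the JSON from `pulumi stack export`.
sample = json.loads("""
{
  "version": 3,
  "deployment": {
    "resources": [
      {"urn": "urn:pulumi:dev::app::pulumi:pulumi:Stack::app-dev",
       "type": "pulumi:pulumi:Stack"},
      {"urn": "urn:pulumi:dev::app::aws:ec2/securityGroup:SecurityGroup::web",
       "type": "aws:ec2/securityGroup:SecurityGroup"},
      {"urn": "urn:pulumi:dev::app::aws:ec2/securityGroupRule:SecurityGroupRule::web-443",
       "type": "aws:ec2/securityGroupRule:SecurityGroupRule"},
      {"urn": "urn:pulumi:dev::app::kubernetes:apps/v1:Deployment::api",
       "type": "kubernetes:apps/v1:Deployment"}
    ]
  }
}
""")

counts = count_by_provider(sample)
for provider, n in counts.most_common():
    print(f"{provider}: {n}")
print("total:", sum(counts.values()))
```

Against a real stack, write the checkpoint to disk with `pulumi stack export --file stack.json` and feed the result of `json.load` to `count_by_provider`. Seeing which provider owns the bulk of the count tells you where consolidation is worth the weekend.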
Spacelift: At Least They're Honest About Screwing You
Spacelift charges $399/month minimum. Sounds expensive until you realize it doesn't scale with your infrastructure size. You pay the same whether you manage 100 resources or 10,000.
The pricing is actually predictable, which is refreshing after dealing with Pulumi's credit roulette. You know exactly what you're paying each month. No surprises, no weird resource counting, just straightforward "pay this much, get this much concurrency."
We switched to Spacelift after the Pulumi bill hit $800/month for the same infrastructure. Yeah, $399 seemed expensive at first, but at least I could budget for it without having nightmares about resource drift adding $200 to next month's bill.
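The flat-versus-metered trade-off is easy to put numbers on. Here's a back-of-the-envelope sketch in Python, assuming the metered cost scales linearly at the rate implied by our $423 bill for 847 resources (real credit pricing may not be linear) and that the flat plan stays at the $399 floor. The function names are hypothetical.

```python
import math

FLAT_MONTHLY = 399.0          # USD, Spacelift minimum quoted in this post
METERED_RATE = 423.0 / 847    # USD per resource per month, implied by our Pulumi bill


def metered_cost(resources: int, rate: float = METERED_RATE) -> float:
    """Monthly cost if every resource is billed at a flat per-resource rate."""
    return resources * rate


def break_even_resources(flat: float = FLAT_MONTHLY, rate: float = METERED_RATE) -> int:
    """Smallest resource count at which the metered bill reaches the flat fee."""
    return math.ceil(flat / rate)


for n in (392, 623, 847):
    print(f"{n} resources: metered ${metered_cost(n):.2f} vs flat ${FLAT_MONTHLY:.2f}")
print("break-even at about", break_even_resources(), "resources")
```

Below the break-even count the metered plan is cheaper on paper; above it, the flat fee wins, and it also stops moving every time someone adds a microservice.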