After looking at the basic pricing above, you might think you understand what you'll pay. You don't. Data platform pricing is deliberately designed to confuse you. They give you credits, DBUs, compute units, and storage tiers, then act surprised when your bill is 3x what you expected.
I've worked on several migrations to these platforms, and they all get hit by costs nobody saw coming. Pricing that looks reasonable in demos becomes a nightmare once you're running real workloads.
Snowflake's Credit Shell Game
Snowflake credits sound simple until you realize they're not. A Small warehouse burns 2 credits per hour at $2.40-$3.10 each (depending on your pricing plan) - so $4.80-$6.20/hour, right? Wrong. The 60-second minimum billing means a 10-second query costs the same as running for a full minute. Run 20 quick queries in an hour and you've paid for 20 minutes of warehouse time to get barely 3 minutes of actual work.
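The math on that minimum is easy to sketch. Here's a rough calculator using the rates above (2 credits/hour for a Small warehouse, $3.10/credit at the top of the quoted range) and assuming, for simplicity, that each query triggers its own 60-second billing window:

```python
# Sketch: effect of Snowflake's 60-second minimum billing on short queries.
# Rates are assumptions taken from the text: Small warehouse = 2 credits/hour,
# $3.10 per credit (upper end of the quoted range). Simplification: each
# query is treated as its own billed resume of the warehouse.

CREDITS_PER_HOUR = 2
DOLLARS_PER_CREDIT = 3.10

def query_cost(seconds: float) -> float:
    """Cost of one query, billing at least 60 seconds of warehouse time."""
    billed_seconds = max(seconds, 60)
    return billed_seconds / 3600 * CREDITS_PER_HOUR * DOLLARS_PER_CREDIT

# 20 ten-second queries: 200s of real compute, 1200s billed
actual = 20 * 10 / 3600 * CREDITS_PER_HOUR * DOLLARS_PER_CREDIT
billed = 20 * query_cost(10)
print(f"actual compute used: ${actual:.2f}, billed: ${billed:.2f}")
# -> actual compute used: $0.34, billed: $2.07
```

Scale that gap up to a Large warehouse running real traffic and the minimum stops looking like a rounding error.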
Had one startup get completely destroyed on their first month's bill - something like $47K. Their Airflow setup was keeping a Large warehouse running 24/7 with health check pings every few minutes. Auto-suspend was set to 10 minutes, but the health check kept resetting the timer. Took weeks to figure out because the queries looked harmless in the query history - tiny execution times but burning credits constantly.
Storage costs look reasonable at $23/TB until you realize that's after compression. Your data might compress really well or barely at all depending on what you're storing. JSON logs compress terribly. Good luck estimating without testing first.
But the real killer is when you factor in all the platforms together - because most companies aren't choosing just one anymore.
Databricks' DBU Deception
Databricks pricing is even more messed up. DBU rates start at $0.40 for basic jobs but jump to $0.87 for ML workloads. A small standard cluster might burn 1 DBU per hour, but consumption scales with instance size and node count, and figuring out which instance type you actually need takes trial and error.
Startup time kills you - clusters take forever to spin up, maybe 3-5 minutes, and you pay for all of it. Short jobs cost more in startup than actual work. Serverless compute at $0.70/DBU is pricey but at least starts faster.
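A quick way to see how badly startup dominates short jobs - a sketch assuming ~4 minutes of billed spin-up, in the middle of the 3-5 minute range above (real startup time varies by instance type and region):

```python
# Sketch: fraction of a cluster's billed time that goes to startup rather
# than the job itself. The 4-minute spin-up is an assumption from the text;
# it's billed at the same rate as useful work.

def startup_overhead(job_minutes: float, startup_minutes: float = 4.0) -> float:
    """Share of total billed minutes spent waiting for the cluster."""
    return startup_minutes / (startup_minutes + job_minutes)

for job in (2, 10, 60):
    print(f"{job:>2}-minute job: {startup_overhead(job):.0%} of the bill is startup")
```

For a 2-minute job, roughly two thirds of what you pay is spin-up - which is exactly why serverless, despite the higher per-DBU rate, can come out cheaper for short bursty work.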
Saw one team leave a cluster running for like a month because someone turned off auto-termination while debugging and forgot about it. Cost them thousands in DBUs plus AWS instance costs. The cluster was doing nothing for most of that time. Nobody noticed because the person who set up cost alerts had left the company and nobody replaced them.
The pattern repeats across every platform - what starts as "simple" pricing becomes a minefield of gotchas.
BigQuery's Deceptively Simple Trap
BigQuery looked simple at $5/TB until Google bumped it to $6.25/TB with barely any notice. Then some analyst writes SELECT * FROM events e JOIN users u ON e.user_id = u.id across our huge tables without any WHERE clause. It scanned something like 60TB and cost us almost $400 before anyone could cancel it. BigQuery's dry run would have caught it, but nobody uses dry run for quick queries.
Everyone says to set query cost limits. BigQuery does let you cap a query with a maximum-bytes-billed setting that fails it before it over-scans, but almost nobody sets it on ad-hoc queries, so in practice you find out the cost after the scan finishes. The 1TB free tier helps small teams, but real workloads burn through that fast.
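A dry run (setting dry_run=True on the query job config in the client libraries) returns the bytes a query would scan without running it, and turning that number into dollars is one line. A sketch at the $6.25/TB on-demand rate quoted above:

```python
# Sketch: convert a dry-run byte count into an on-demand cost estimate.
# Assumes BigQuery's $6.25/TB scan rate from the text; the dry run itself
# is free, so this check costs nothing.

TB = 1024 ** 4
PRICE_PER_TB = 6.25

def estimated_cost(bytes_processed: int) -> float:
    """Dollar estimate for scanning `bytes_processed` at on-demand rates."""
    return bytes_processed / TB * PRICE_PER_TB

# The runaway join above scanned roughly 60TB:
print(f"${estimated_cost(60 * TB):.2f}")  # -> $375.00
```

Wiring this into a pre-submit hook for anything over a few TB is cheap insurance against the SELECT * incident above.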
Storage is cheap at $20/TB, but the long-term rate of $10/TB only kicks in after a table (or partition) goes 90 days without modification. Guess what happens if your monthly ETL job touches that "archived" data? Every table or partition it modifies snaps back to full price, and the 90-day clock starts over.
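The reset is easy to model. A sketch using the two rates above, with the caveat that BigQuery tracks the 90-day clock per table or partition, so the blast radius depends on what the ETL job actually modifies:

```python
# Sketch: monthly BigQuery storage cost for one table, using the rates in
# the text ($20/TB active, $10/TB long-term after 90 untouched days).
# Any modification resets the clock for that table/partition.

ACTIVE_PER_TB = 20.0
LONG_TERM_PER_TB = 10.0

def monthly_storage_cost(tb: float, days_since_modified: int) -> float:
    rate = LONG_TERM_PER_TB if days_since_modified >= 90 else ACTIVE_PER_TB
    return tb * rate

print(monthly_storage_cost(50, 120))  # left alone: 500.0
print(monthly_storage_cost(50, 0))    # ETL touched it: 1000.0
```

For a 50TB table, one careless MERGE doubles the storage line item for the next three months.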
Azure Synapse's Confusing Mess
Synapse has three different pricing models depending on which features you use:
- Dedicated SQL pools: $1,398/month for DW100c (don't be fooled - you need at least DW500c for real work)
- Serverless SQL: $5 per TB processed (same trap as BigQuery)
- Apache Spark pools: the same DBU-style bullshit as Databricks, just metered per vCore-hour
The docs make it sound like you can seamlessly mix Dedicated SQL pools, Serverless SQL, and Spark pools. What they don't tell you is that moving data between pools hits you with Azure Storage transaction costs ($0.0004 per 10K transactions) plus egress charges. Had one client with multiple DW500c pools running 24/7 but only processing data during business hours. Took forever to convince them to use auto-pause because the resume process fails randomly and they didn't trust it.
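Those transaction charges sound negligible until you multiply them out against how chatty Spark jobs actually are against storage. A sketch at the $0.0004-per-10K rate quoted above (egress is a separate, region-dependent line item not modeled here):

```python
# Sketch: Azure Storage transaction charges for cross-pool data movement,
# at the $0.0004 per 10,000 transactions rate quoted above. Egress charges
# come on top and vary by region pair.

def transaction_cost(transactions: int, rate_per_10k: float = 0.0004) -> float:
    return transactions / 10_000 * rate_per_10k

# A Spark job hammering storage with 500 million small reads:
print(f"${transaction_cost(500_000_000):.2f}")  # -> $20.00
```

Twenty dollars per run sounds harmless until it's a dozen jobs running hourly, at which point it quietly becomes a four-figure monthly line item.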
The Hidden Costs They Don't Tell You About
Data egress is the silent killer. Every platform lets you pump data in for free but charges AWS/Azure/GCP rates to pull it out. Cross-region transfers for disaster recovery can add 25% to your monthly bill.
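Egress is simple arithmetic once you know your replication volume. A sketch assuming a typical ~$0.02/GB inter-region rate - an assumption, not a quoted price; actual rates vary by cloud and region pair, so check your provider's price sheet:

```python
# Sketch: monthly cross-region replication cost for disaster recovery.
# The $0.02/GB rate is an assumption; real inter-region rates differ by
# provider and region pair.

def monthly_replication_cost(gb_per_month: float, rate_per_gb: float = 0.02) -> float:
    return gb_per_month * rate_per_gb

# Replicating 10TB of changed data a month:
print(f"${monthly_replication_cost(10_000):.2f}")  # -> $200.00
```

Run that against your actual change volume before signing off on a multi-region DR design - it's usually the number nobody put in the original estimate.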
Professional services are mandatory unless you like hemorrhaging money. Snowflake's "Migration Accelerator" starts at $250K minimum and includes gems like "we'll help you understand your data patterns" (translation: we'll watch you fuck up for 3 months then tell you what you did wrong). Databricks certified consultants run $350-400/hour, and you need both data engineering and ML expertise. Azure's "free" FastTrack is great for PowerPoint architecture diagrams, useless when your SQL pools keep timing out with ExecutionTimeout.
Training costs will murder your budget. SnowPro Core certification is $175 per person and expires in 2 years. Databricks wants $2,400 per person for their 4-day training - and most of it's stuff you can learn from their notebooks for free. Azure requires 3 separate certifications to actually understand Synapse: DP-203 ($165), AZ-104 ($165), and DP-900 ($99). So $429 per person to maybe understand why your queries keep failing.