Before you start the Bazel migration that will consume the next 12-18 months of your life, let's do the math on whether you actually need this nightmare.
The Real Questions Your PM Won't Ask
Do you have 500+ developers? No? Don't migrate. Seriously. Bazel's benefits only kick in at Google scale. Your team of 30 engineers will spend more time fighting Bazel than benefiting from it.
Is your current build genuinely broken? "It takes 5 minutes" is not broken. "It randomly fails and we can't figure out why" might be. But if Gradle works fine and your biggest complaint is waiting for tests, you're solving the wrong problem.
Do you have a dedicated build engineer? Bazel isn't something you set up once and forget. You need someone who understands Starlark, can debug remote execution failures, and won't quit when builds break for the third time this month.
The Pinterest Timeline That Scared Everyone
Pinterest's migration took 18 months with a dedicated team and they're considered a success story. Here's what they didn't tell you in their blog post:
- First 3 months: Learning Starlark and figuring out why nothing compiles
- Months 4-6: Rewriting every BUILD file and discovering dependency hell
- Months 7-12: Fighting with remote execution and blaming AWS
- Months 13-15: Migrating from WORKSPACE to Bzlmod because Google deprecated WORKSPACE
- Months 16-18: Actually getting builds working consistently
Their senior engineer quit in month 10. The quote was "I'd rather debug Jenkins than deal with another Bazel error message."
What "Hermetic Builds" Actually Means
Bazel promises hermetic builds - your build only uses inputs you declare. Sounds great until you realize:
Your build depends on shit you didn't know about. That script that calls curl to download something? Dead. The test that reads /etc/hostname? Dead. The binary that expects to find java in PATH? Also dead.
System dependencies are everywhere. You'll spend weeks hunting down every place your code assumes GNU vs BSD tools, different Python versions, or that "which gcc" returns something useful.
The sandboxing breaks everything creative. Any clever build hack your team did - running Docker in Docker, calling out to external tools, generating code by shelling out - all broken.
I watched a team spend 2 weeks debugging the stupidest shit. Tests passed locally, failed in Bazel CI with:
FATAL ERROR in native method: No timezone info found for America/New_York
Turned out the tests were reading timezone data from /usr/share/zoneinfo and the sandbox doesn't include system directories. The "hermetic" fix? Vendor the entire fucking timezone database into their repo. 47MB of timezone data just to make tests pass.
This complexity is what makes Bazel migrations so painful - every hidden dependency becomes a sandbox violation you have to track down and fix.
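You can get a rough head start on that hunt before touching Bazel at all. The sketch below scans script text for the kinds of host assumptions mentioned above: network downloads, absolute system paths, and tool lookups on PATH. The pattern list and the function name find_host_dependencies are made up for illustration and nowhere near exhaustive, so treat the output as a list of leads, not an audit.

```python
import re

# Hypothetical patterns for host assumptions; extend these for your own codebase.
HOST_DEPENDENCY_PATTERNS = {
    "network download": re.compile(r"\b(curl|wget)\b"),
    "absolute system path": re.compile(r"(?<![\w.])/(etc|usr|opt|var)/[\w./-]+"),
    "tool lookup on PATH": re.compile(r"\bwhich\s+\w+"),
}


def find_host_dependencies(text):
    """Return (line_number, label, stripped_line) for every pattern hit."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in HOST_DEPENDENCY_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label, line.strip()))
    return findings


if __name__ == "__main__":
    # A made-up build script with three hidden dependencies and one clean line.
    sample = """\
curl -sSL https://example.com/tool.tar.gz | tar xz
HOST=$(cat /etc/hostname)
CC=$(which gcc)
echo "building"
"""
    for number, label, line in find_host_dependencies(sample):
        print(f"line {number}: {label}: {line}")
```

Run it over your shell scripts, Makefiles, and test helpers first. Every hit is something the sandbox is likely to break.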
The WORKSPACE → Bzlmod Migration Tax
Here's the fun part: WORKSPACE is dead. Bazel 9 removes it entirely. So if you migrate to Bazel now using WORKSPACE (because it's easier), you get to migrate again to Bzlmod within 2 years.
The Bzlmod migration breaks everything again:
- All your load() statements change
- Repository names change
- Module resolution conflicts with your carefully crafted dependency versions
- Every BUILD file needs updates
Teams are literally migrating twice. First WORKSPACE, then Bzlmod 18 months later when the first migration is barely stable.
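To size the repository-name problem before it bites, it helps to inventory which external repos your load() statements actually reference. Here's a minimal sketch that counts them in BUILD or .bzl text. The regex and the function name external_repos_in_loads are assumptions for illustration; it only looks at the first argument of load() and will miss anything built dynamically. The sample uses io_bazel_rules_go, the name rules_go has conventionally gone by under WORKSPACE, as an example of a repo whose name tends to change.

```python
import re
from collections import Counter

# Matches the repository part of load("@repo//pkg:file.bzl", ...) statements.
# Loads from the main repository ("//pkg:file.bzl") are intentionally skipped.
LOAD_REPO = re.compile(r'load\(\s*"@([A-Za-z0-9_.\-~+]+)//')


def external_repos_in_loads(build_text):
    """Count how often each external repository name appears in load() calls."""
    return Counter(LOAD_REPO.findall(build_text))


if __name__ == "__main__":
    # Made-up BUILD content for illustration.
    sample = '''
load("@rules_java//java:defs.bzl", "java_library")
load("@io_bazel_rules_go//go:def.bzl", "go_binary")
load("//tools:macros.bzl", "my_macro")
load("@rules_java//java:defs.bzl", "java_binary")
'''
    for repo, count in external_repos_in_loads(sample).most_common():
        print(f"@{repo}: {count} load() statement(s)")
```

The repos with the highest counts are where a rename hurts most, so that's where to check the Bzlmod module name first.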
The Infrastructure Bill Nobody Budgeted
Remote execution isn't optional at scale - it's the only way Bazel makes sense. But nobody tells you the real costs:
Compute costs explode. You're not just running builds on your laptop anymore. Every build hits a cluster of machines. I've seen AWS bills jump from $2k/month to $15k/month overnight.
Network bandwidth matters. Uploading build artifacts to remote cache for every commit adds up fast. One team I know hit their corporate 10Gbps limit after 3 weeks of Bazel remote execution. They had to explain to their IT director why the build system was consuming more bandwidth than their entire customer-facing website.
Cache storage costs. The remote cache that makes everything fast? It needs a lot of storage, easily terabytes and far more at large scale. Budget $5k+/month for cache infrastructure that actually works.
Dedicated ops team. Remote execution clusters need babysitting. Machines crash, caches corrupt, network partitions happen. You need someone on call for your build system.
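Annualize those numbers before they land in a budget meeting. The sketch below uses only the figures quoted above ($2k to $15k per month for compute, $5k per month for cache). Treating the cache as a separate line item from the compute bill is an assumption, and the on-call staffing cost is left out entirely because there's no figure for it here.

```python
MONTHS_PER_YEAR = 12


def annual_cost(monthly_usd):
    """Convert a monthly dollar figure into a yearly one."""
    return monthly_usd * MONTHS_PER_YEAR


# Figures quoted in this section.
compute_before_monthly = 2_000
compute_after_monthly = 15_000
cache_monthly = 5_000  # assumed to be billed separately from compute

extra_compute_per_year = annual_cost(compute_after_monthly - compute_before_monthly)
cache_per_year = annual_cost(cache_monthly)

if __name__ == "__main__":
    print(f"extra compute per year: ${extra_compute_per_year:,}")
    print(f"cache per year:         ${cache_per_year:,}")
```

That's $156,000 a year in extra compute and $60,000 a year for cache, before anyone is paid to keep the cluster alive.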
When Migration Actually Makes Sense
You should migrate to Bazel if you look like Google:
- 1000+ engineers touching the same codebase
- Multi-language monorepo (Go + Java + Python + JavaScript)
- Build times measured in hours, not minutes
- Cross-language dependencies everywhere
- Dedicated platform team with Bazel expertise
Everyone else should probably stick with language-native tools and save themselves 18 months of pain.
The question isn't "Should we use Bazel?" It's "Are we big enough that the 18-month migration cost is justified?"
For most teams, the answer is no.
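If you want to actually do the math, here's a rough break-even sketch. Every input is a hypothetical placeholder: the minutes saved per engineer, the loaded hourly cost, the migration cost, and the 21 working days per month are all numbers you have to supply yourself. The function name break_even_months is made up for this example.

```python
def break_even_months(
    engineers,
    minutes_saved_per_engineer_per_day,
    loaded_cost_per_hour,
    migration_cost,
    extra_infra_per_month,
    working_days_per_month=21,  # assumption, adjust for your team
):
    """Months until build-time savings repay the migration, or None if never."""
    hours_saved_per_day = engineers * minutes_saved_per_engineer_per_day / 60
    monthly_savings = hours_saved_per_day * loaded_cost_per_hour * working_days_per_month
    net_monthly_savings = monthly_savings - extra_infra_per_month
    if net_monthly_savings <= 0:
        # The extra infrastructure costs more than the time you get back.
        return None
    return migration_cost / net_monthly_savings


if __name__ == "__main__":
    # Both scenarios use invented inputs, purely for illustration.
    small_team = break_even_months(30, 10, 100, 1_000_000, 13_000)
    large_org = break_even_months(1000, 30, 100, 2_000_000, 18_000)
    print("30 engineers:", small_team)
    print("1000 engineers:", large_org)
```

With these made-up inputs, the 30-person team never breaks even because the infrastructure bill exceeds the time saved, while the 1000-person org pays the migration back in about two months. Plug in your own numbers and see which side you land on.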