My first AI build cost $1,800. Current setup ran $6,200. The next one I'm planning will probably hit $10k because I apparently hate money. Anyone expecting MSRP prices lives in fantasy land - I've been building these rigs since 2022 and it just keeps getting more expensive.
GPUs Eat Your Budget
Forget balanced PC builds. For AI, the GPU costs more than everything else combined and determines whether your models run at all. An RTX 4070 with 12GB VRAM is the cheapest option that won't make you want to throw things. It runs quantized 7B models decently - anything bigger spills into system RAM and gets slow as hell.
The VRAM math is roughly 2GB per billion parameters at FP16 (2 bytes per weight), but that swings with batch size, context length, and whether PyTorch decides to be a memory hog. Quantization cuts it by 2-4x, but good luck debugging when it breaks.
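Here's the napkin math I run before buying anything. A minimal sketch - the 20% overhead factor is my own fudge for KV cache and CUDA context, not a hard number:

```python
def vram_needed_gb(params_billions: float, bits_per_param: int = 16,
                   overhead: float = 1.2) -> float:
    """Rule-of-thumb VRAM estimate for inference.

    overhead (~20%) is a rough fudge for KV cache, CUDA context,
    and framework allocations - tune it to your setup.
    """
    weights_gb = params_billions * (bits_per_param / 8)
    return weights_gb * overhead

# 7B at FP16: ~16.8GB -> doesn't fit a 12GB card without quantizing
print(f"7B FP16: {vram_needed_gb(7):.1f} GB")
# 7B at 4-bit: ~4.2GB -> comfortable on the RTX 4070
print(f"7B Q4:   {vram_needed_gb(7, bits_per_param=4):.1f} GB")
# 70B at 4-bit: ~42GB -> still too big for a single 32GB card
print(f"70B Q4:  {vram_needed_gb(70, bits_per_param=4):.1f} GB")
```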
The RTX 5090 launched January 30 at $2,000 MSRP with 32GB VRAM - enough to run 70B models if you quantize hard (3-bit territory) and could actually buy one. They're going for $3,500+ on scalper sites when they're not sold out, which is always. Been on Newegg waitlists since March.
Memory Speed Actually Matters
Learned this the expensive way - cheap DDR4-2400 bottlenecked my $2,000 GPU because I'm an idiot. Model loading and any CPU offload run at RAM speed, not GPU speed. Meanwhile the Mac Studio M3 Ultra starts at $4,000 with 96GB of unified memory and works great for AI despite what NVIDIA fanboys tell you.
For PC builds, you want ECC memory if you're running 24/7 - I killed two regular DIMMs before switching to server RAM. 64GB minimum or you'll be swapping to disk constantly. 128GB of DDR5 runs around $800, but models actually load without the machine grinding into swap.
Don't be cheap on system RAM. Seen too many people with RTX 4090s and 16GB system memory wondering why models crash. The OS uses 4GB, PyTorch eats 8GB just sitting there, and your model needs whatever's left.
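A quick sanity check I run before loading anything big. This uses psutil, a third-party package (pip install psutil), and the 1.2x safety margin is my own guess for framework overhead:

```python
import psutil

def fits_in_ram(model_gb: float, safety_margin: float = 1.2) -> bool:
    """Check free system RAM before a model load sends you into swap.

    safety_margin pads for PyTorch's own allocations - a guess, tune it.
    """
    available_gb = psutil.virtual_memory().available / 1e9
    needed_gb = model_gb * safety_margin
    print(f"available: {available_gb:.1f} GB, needed: {needed_gb:.1f} GB")
    return available_gb >= needed_gb

# e.g. a 70B model quantized to ~40GB on disk
if not fits_in_ram(40):
    raise MemoryError("Not enough RAM - you'll be thrashing swap")
```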
When Local Hardware Makes Sense (Probably Not Yet)
OpenAI charges $0.03 per 1K input tokens for GPT-4 (output costs double). If you're doing under a million tokens daily - call it $30/day, $900/month tops - just use the API and save yourself the pain. I spent $15,000 on hardware to save $200/month in API costs - probably my dumbest financial move ever.
But if you're processing millions of tokens daily, local starts paying off fast. Our company broke even after 8 months with 4x RTX 4090s - we run inference 24/7 and the cloud bills were getting nuts.
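The break-even math is simple enough to put in a function. The $12k hardware and $1,800/month cloud figures below are my back-of-envelope reconstruction to show the shape of it, not actual invoices, and the power estimate is assumed too:

```python
def breakeven_months(hardware_cost: float, monthly_api_bill: float,
                     monthly_power: float = 300.0) -> float:
    """Months until local hardware pays for itself vs the API.

    monthly_power is an assumed figure for a multi-GPU box running 24/7.
    """
    monthly_savings = monthly_api_bill - monthly_power
    if monthly_savings <= 0:
        return float("inf")  # local never pays off - just use the API
    return hardware_cost / monthly_savings

# My dumb move: $15k rig vs a $200/month API bill -> never breaks even
print(f"{breakeven_months(15_000, 200):.0f} months")
# A 4x 4090 box vs heavy 24/7 cloud inference -> roughly 8 months
print(f"{breakeven_months(12_000, 1_800):.0f} months")
```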
Enterprise is fucking expensive. H200 GPUs cost $40k-50k each, and DGX H200 systems run $400k-500k for 8 GPUs. Then you need a $100k cooling system and electrical work for 10kW+ power draws.
My electric bill went up $80/month with one RTX 4090. Scale that to enterprise and you're looking at serious infrastructure costs.
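That $80 checks out if you do the math - assuming a card pulling ~450W around the clock at roughly $0.25/kWh, which is my rate, not yours:

```python
def monthly_power_cost(watts: float, rate_per_kwh: float = 0.25,
                       hours_per_day: float = 24.0) -> float:
    """Electricity cost of one GPU over a 30-day month."""
    kwh = watts / 1000 * hours_per_day * 30
    return kwh * rate_per_kwh

# RTX 4090 at ~450W, 24/7: ~324 kWh -> ~$81/month at $0.25/kWh
print(f"${monthly_power_cost(450):.0f}/month")
```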
Hardware's just the start. Add cooling, power, software licenses, and shit breaking constantly - your "cheap" AI rig becomes a money pit real fast.