Started using Gemini 2.5 Pro back in June. It's different from regular AI - it actually pauses and works through a problem before answering. That takes longer and costs way more, but sometimes that's exactly what you need.
When The Thinking Thing Is Worth It
Was pretty skeptical when Google said they'd made an AI that "thinks" - sounds like marketing BS. But we had this legacy payment system that kept breaking in weird ways, and regular AI kept suggesting fixes that missed the point.
Threw our database schema at it - bunch of tables with no foreign keys because reasons. Other AI usually just says "add indexes" or "normalize everything." This one sat there for like 30 seconds, then said our user session table was causing all the cascade failures. Gave us a migration plan that wouldn't kill production.
Claude wanted to normalize the whole thing. ChatGPT wrote this perfect schema that ignored all our legacy constraints. Gemini actually understood we needed a fix that wouldn't break everything.
What Actually Matters In Practice
The benchmarks are decent - 88% on math problems, 69% on coding stuff. But here's what I care about:
- Catches obvious bugs before they ship
- Remembers what you're working on across long conversations
- Actually thinks through problems instead of just guessing
The thinking delay is annoying though. Simple stuff takes a few seconds, complex problems make you wait 30+ seconds. First time it happened I thought it was broken.
It's Expensive As Hell
Costs way more than regular AI: $1.25 per million input tokens, $10-15 per million for output (the higher rate kicks in on long-context requests). A single big code review can run you 5 bucks. I've been spending around $600/month.
Set thinking budgets or you'll get fucked on the bill - found that out when I got charged $800 for letting it analyze one big codebase. Now I keep the budget on "low" for simple stuff.
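Back-of-envelope math makes the budget risk concrete. Here's a minimal sketch using the rates quoted above; the assumption that hidden "thinking" tokens get billed at the output rate matches what I saw on my invoice, but double-check against current pricing:

```python
# Rough per-request cost estimator for Gemini 2.5 Pro.
# Rates are the ones quoted above ($1.25/M input, $10/M output);
# treating thinking tokens as output-priced is an assumption.
INPUT_RATE = 1.25 / 1_000_000   # dollars per input token
OUTPUT_RATE = 10.00 / 1_000_000  # dollars per output token

def estimate_cost(input_tokens: int, output_tokens: int, thinking_tokens: int = 0) -> float:
    """Estimate one request's cost in dollars, counting thinking as output."""
    return (input_tokens * INPUT_RATE
            + (output_tokens + thinking_tokens) * OUTPUT_RATE)

# A mid-size review: 400K tokens in, 20K out, plus 30K of hidden thinking.
print(round(estimate_cost(400_000, 20_000, 30_000), 2))  # roughly $1.00
```

Note how the thinking tokens here cost more than the visible answer - that's the part that blindsides you on the bill.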
When It's Worth The Money (And When It Isn't)
Good for:
- Architecture stuff where context matters
- Debugging weird edge cases
- Code reviews that need to think through multiple things
Bad for:
- Boilerplate (Claude is way faster)
- Quick syntax questions (just use ChatGPT)
- Anything where you need fast responses
Stuff That'll Bite You
The 1M context window sounds great until you realize feeding it a big codebase takes forever to process - like 45 seconds before it even starts thinking through our API docs.
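The workaround I landed on is not sending the whole repo. A quick pre-filter that only pulls in files mentioning whatever you're asking about cuts the prompt way down; this sketch is my own stdlib hack (the keyword match and byte cap are arbitrary heuristics, nothing Gemini-specific):

```python
# Pre-filter a codebase before building a prompt: keep only files that
# mention the keyword, capped at a total byte budget. Pure heuristic.
from pathlib import Path

def relevant_files(root: str, keyword: str, max_bytes: int = 200_000) -> list[str]:
    """Collect .py files under root that mention keyword, within a byte budget."""
    picked, used = [], 0
    for path in sorted(Path(root).rglob("*.py")):
        text = path.read_text(errors="ignore")
        if keyword in text and used + len(text) <= max_bytes:
            picked.append(str(path))
            used += len(text)
    return picked
```

Even a dumb filter like this got our "start thinking" delay from 45 seconds down to a few, because the model isn't chewing through unrelated modules.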
Rate limits are confusing - the thinking tokens count against your quota, but nothing in the response makes that obvious. I hit limits all the time because the model spent 5 minutes thinking about what seemed like a simple question.
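Since you can't predict when thinking will blow through quota, wrapping calls in exponential backoff saves you from babysitting retries. This is a generic sketch - `RateLimitError` is a stand-in for whatever 429-style exception your client actually raises:

```python
# Generic exponential-backoff wrapper for rate-limited calls.
# RateLimitError is a placeholder for your client's 429 exception.
import random
import time

class RateLimitError(Exception):
    pass

def with_backoff(call, max_attempts: int = 5, base_delay: float = 1.0):
    """Run call(); on a rate limit, sleep base_delay * 2**attempt plus jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of retries, let the caller deal with it
            time.sleep(base_delay * 2 ** attempt + random.random() * 0.1)
```

The jitter matters if you're running parallel reviews - without it every retry lands at the same instant and you get limited again.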
Sometimes it gets stuck in loops when your prompt is vague. Had it think for 2 minutes about a database migration and then spit out garbage because I wasn't specific enough about backwards compatibility.
Stuff Google Won't Mention
The experimental version is supposedly better at coding but breaks all the time - it times out mid-response and you lose all that thinking progress.
The image analysis thing is actually useful though. Threw an architecture diagram at it with our code and it found inconsistencies that took us days to spot.
Bottom line: if you need something more than basic CRUD generation, the thinking is worth the extra cost. Just don't expect magic - it's like having a thorough junior dev who sometimes goes down weird rabbit holes.