Nobody tells you the real numbers upfront. Vendors say "contact sales" for migration costs, which is code for "this is going to hurt." After doing this multiple times, here's what you're actually looking at.
The Data Export Nightmare
First thing that hits you is getting your vectors out. AWS charges about 9 cents per GB for data transfer out. Our 50TB dataset cost us around $4,500 in egress alone (50,000 GB at $0.09) - and that was before we did anything useful with it.
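The egress math is worth sanity-checking before you start. A quick sketch, assuming a flat $0.09/GB rate and decimal terabytes. Real AWS pricing is tiered and varies by region, so treat the default rate as a placeholder and plug in your own:

```python
def egress_cost_usd(dataset_tb: float, rate_per_gb: float = 0.09) -> float:
    """Estimate data-transfer-out cost for a dataset of the given size."""
    gb = dataset_tb * 1000  # decimal TB -> GB
    return round(gb * rate_per_gb, 2)


# Our 50TB dataset at the flat rate assumed above
print(egress_cost_usd(50))
```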
Pinecone's export is a joke. Their API rate limits mean downloading large datasets takes forever. We started our export on a Friday thinking it'd be done by Monday. Try three weeks later with the export crapping out twice - some bullshit timeout error that their support couldn't explain. The whole time you're paying for both systems while nothing productive happens.
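If your export is going to die halfway through, at least make it resumable. Here's a minimal sketch of a checkpointed export loop with retries and exponential backoff. `fetch_page` is a hypothetical callable you'd write around whatever paginated read API your vendor offers; it is not a real client method, and the file layout is just an assumption for illustration:

```python
import json
import time
from pathlib import Path


def export_with_checkpoint(fetch_page, out_path, checkpoint_path,
                           max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Stream records to a JSONL file, resuming from the last saved cursor.

    fetch_page(cursor) must return (records, next_cursor); next_cursor is
    None on the last page. It is a hypothetical wrapper you supply.
    """
    checkpoint = Path(checkpoint_path)
    cursor = None
    if checkpoint.exists():
        cursor = json.loads(checkpoint.read_text())["cursor"]

    with open(out_path, "a", encoding="utf-8") as out:
        while True:
            for attempt in range(max_retries):
                try:
                    records, next_cursor = fetch_page(cursor)
                    break
                except (TimeoutError, ConnectionError):
                    if attempt == max_retries - 1:
                        raise  # give up; the checkpoint lets us resume later
                    sleep(base_delay * 2 ** attempt)  # exponential backoff

            for record in records:
                out.write(json.dumps(record) + "\n")
            out.flush()

            if next_cursor is None:
                break
            cursor = next_cursor
            # Saved after the page is written, so a crash can replay one
            # page at most. Dedupe by id downstream if that matters.
            checkpoint.write_text(json.dumps({"cursor": cursor}))

    checkpoint.unlink(missing_ok=True)  # finished cleanly, nothing to resume
```

It won't make the rate limits go away, but a timeout on day twelve costs you one page instead of the whole run.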
Here's the kicker: vector data barely compresses. Regular database dumps compress 5:1 easy. Vectors? Maybe 20% if you're lucky. All those float values don't shrink much, so you pay full freight for every GB.
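You can see this for yourself in a few lines. This is a toy comparison using made-up data (random Gaussian float32 "embeddings" versus a fake CSV-style dump), so the exact ratios won't match your dataset, but the gap between the two should look familiar:

```python
import random
import zlib
from array import array

random.seed(42)  # reproducible fake data


def compression_ratio(raw: bytes) -> float:
    """Original size divided by zlib-compressed size (max compression level)."""
    return len(raw) / len(zlib.compress(raw, 9))


# 1,000 fake 128-dimensional float32 vectors (synthetic, not real embeddings)
vectors = array("f", (random.gauss(0.0, 1.0) for _ in range(1000 * 128)))
vector_ratio = compression_ratio(vectors.tobytes())

# Fake row-oriented dump with the kind of repetition real tables have
rows = "".join(f"{i},user_{i % 50},active,2024-01-01\n" for i in range(20000))
text_ratio = compression_ratio(rows.encode("utf-8"))

print(f"vectors: {vector_ratio:.2f}:1, text dump: {text_ratio:.2f}:1")
```

The mantissa bits of a float are close to random noise as far as a general-purpose compressor is concerned, which is why the vector ratio stays near 1:1.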
Re-indexing Hell
Once you've got your vectors, you need to rebuild the index. This is where the real pain starts. HNSW indexing is CPU-intensive as fuck and there's no shortcut.
For our 100 million vectors, Qdrant took about 16 hours on a bunch of cores. AWS bill was over three grand just for compute. And that's if everything works perfectly - which it fucking doesn't.
We hit memory issues twice and had to restart the whole process. First time we ran out of memory after like 8 hours of indexing. Second time the OS just killed it around the 12-hour mark. The docs don't mention that Qdrant needs way more RAM than advertised during indexing. Plan for 3x the final index size in memory or you'll be debugging this shit at 3 AM too.
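Here's the back-of-envelope version of that rule. It's a rough sketch, not Qdrant's actual memory model: it assumes float32 vectors, 4-byte neighbor ids, and up to 2*M links per point on the bottom HNSW layer (upper layers are ignored). The 768 dimensions and M=16 in the example are placeholders, and the 3x headroom is just our rule of thumb from getting burned:

```python
def estimate_index_bytes(num_vectors: int, dim: int, m: int = 16) -> int:
    """Rough size of raw float32 vectors plus bottom-layer HNSW links."""
    vector_bytes = num_vectors * dim * 4   # float32 = 4 bytes per component
    link_bytes = num_vectors * 2 * m * 4   # up to 2*M neighbor ids, 4 bytes each
    return vector_bytes + link_bytes


def indexing_ram_budget_gib(num_vectors: int, dim: int, m: int = 16,
                            headroom: float = 3.0) -> float:
    """RAM to provision while indexing, using the 3x rule of thumb."""
    return estimate_index_bytes(num_vectors, dim, m) * headroom / 2**30


# 100 million vectors; 768 dimensions is an assumed value for illustration
print(f"{indexing_ram_budget_gib(100_000_000, 768):.0f} GiB")
```

If the number that comes out is bigger than the box you were planning to rent, better to find out now than eight hours in.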
Milvus was even worse. Their distributed indexing supposedly scales better. In practice, it crashed constantly and took almost a day across 8 nodes. The error messages were useless - mostly just "indexing failed, try again."
Engineering Time - Your Biggest Cost
Engineers are expensive and migrations eat months of their time. I spent 60% of my time for four months on our last migration. That's roughly 2.4 engineer-months, more than half a senior engineer's pay for that whole stretch, just for me, and I wasn't the only one.
The problem is vector databases are weird as hell. Your team knows Postgres and Redis. They don't know why Qdrant needs different M and ef_construction values than Weaviate, or why their cosine similarity results don't match Pinecone's.
We hired a consultant at $350/hour who claimed he'd done "dozens" of these. He lasted three weeks before admitting he'd never done a production migration this size. Burned like $25,000 and we had to figure it out ourselves anyway. His most helpful contribution was a broken Python script that couldn't handle our vector dimensions.
The real time sink is debugging performance. Your queries that ran in 50ms on Pinecone now take 200ms on Qdrant. Nobody can tell you why. The documentation assumes you're an ML PhD who understands HNSW parameter tuning.
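Before you start twiddling parameters, get real numbers instead of vibes. A minimal latency harness, assuming you can wrap a single search call in a function. `run_query` is a hypothetical callable you provide, and the idea is to run the report once per setting (for example, once per search-time ef value) and compare:

```python
import time


def percentile(sorted_values, p):
    """Nearest-rank percentile; p is an integer from 1 to 100."""
    rank = max(1, (p * len(sorted_values) + 99) // 100)  # integer ceil
    return sorted_values[rank - 1]


def latency_report(run_query, queries, warmup=5):
    """Time each query and return p50/p95/max latency in milliseconds."""
    if not queries:
        raise ValueError("need at least one query")
    for q in queries[:warmup]:
        run_query(q)  # warm caches and connections before measuring

    latencies = []
    for q in queries:
        start = time.perf_counter()
        run_query(q)
        latencies.append((time.perf_counter() - start) * 1000.0)

    latencies.sort()
    return {
        "p50_ms": percentile(latencies, 50),
        "p95_ms": percentile(latencies, 95),
        "max_ms": latencies[-1],
    }
```

Look at p95, not the average. A 50ms median with a 400ms tail is a very different problem from a flat 200ms.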
API Hell - Nothing Maps Cleanly
Every vector database has its own bullshit API. Pinecone uses its own weird thing. Weaviate forces GraphQL on you. Qdrant pretends REST is simple until you hit their nested filter syntax.
Your application code needs a complete rewrite. Our recommendation engine had 15 different query patterns. Every single one needed changes:
- Pinecone's top_k became Qdrant's limit
- Metadata filtering completely changed - Pinecone's simple {"category": "electronics"} became Qdrant's nested bool nightmare
- Batch uploads went from simple POST to complex streaming APIs
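To give a feel for the first two changes, here's a sketch of a translation shim for the simplest case only: flat equality filters. It works on plain dicts shaped like each system's request body rather than calling either client library, and the function names are made up for illustration. Anything with operators (ranges, $in, and so on) needs its own mapping, which is where most of our 200 hours went:

```python
from typing import Any


def pinecone_filter_to_qdrant(flat_filter: dict[str, Any]) -> dict[str, Any]:
    """Turn {"category": "electronics"} into a Qdrant-style `must` clause."""
    conditions = []
    for key, value in flat_filter.items():
        if isinstance(value, dict):
            # Operator filters like {"price": {"$gte": 10}} are out of scope here
            raise ValueError(f"operator filter not handled in this sketch: {key}")
        conditions.append({"key": key, "match": {"value": value}})
    return {"must": conditions}


def pinecone_query_to_qdrant(query: dict[str, Any]) -> dict[str, Any]:
    """Map a Pinecone-style query dict onto a Qdrant-style search body."""
    body: dict[str, Any] = {"vector": query["vector"], "limit": query["top_k"]}
    if query.get("filter"):
        body["filter"] = pinecone_filter_to_qdrant(query["filter"])
    return body


old_query = {"vector": [0.1, 0.2, 0.3], "top_k": 10,
             "filter": {"category": "electronics"}}
print(pinecone_query_to_qdrant(old_query))
```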
Took our team 200 hours to rewrite everything. And that's after we thought we understood the new APIs.
What You're Really Paying For
Running dual systems while migrating doubles your infrastructure costs. Our monthly bill went from like $15k to almost $38k during the three-month migration because we had to keep both systems at full capacity. Nobody mentions this upfront.
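Put a number on the overlap before you commit to a timeline, because every extra month costs the same again. Using our figures:

```python
def dual_run_overhead_usd(normal_monthly: int, migration_monthly: int, months: int) -> int:
    """Extra infrastructure spend caused by running both systems in parallel."""
    return (migration_monthly - normal_monthly) * months


print(dual_run_overhead_usd(15_000, 38_000, 3))  # 69000
```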
Testing is a nightmare too. Your search results need to match between systems or users notice. We built custom tools to compare similarity scores because nothing exists for this. Took 150 hours of QA time just to validate our results weren't complete garbage.
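The core of a comparison tool doesn't have to be fancy. This is a simplified sketch of the idea, not our actual tooling: run the same queries against both systems, then check how much of the old top-k shows up in the new top-k. It compares ids rather than raw scores on purpose, since some systems report similarity and others report distance, so the numbers aren't directly comparable. The 0.8 threshold is an arbitrary starting point:

```python
def overlap_at_k(old_ids, new_ids, k):
    """Fraction of the old system's top-k ids also present in the new top-k."""
    old_top, new_top = set(old_ids[:k]), set(new_ids[:k])
    if not old_top:
        return 1.0  # nothing to match against
    return len(old_top & new_top) / len(old_top)


def flag_regressions(results, k=10, threshold=0.8):
    """results maps query_id -> (old_ids, new_ids); return ids below threshold."""
    return sorted(
        query_id
        for query_id, (old_ids, new_ids) in results.items()
        if overlap_at_k(old_ids, new_ids, k) < threshold
    )


results = {
    "q1": (["a", "b", "c", "d"], ["a", "b", "d", "c"]),  # same set, new order
    "q2": (["a", "b", "c", "d"], ["a", "x", "y", "z"]),  # mostly different
}
print(flag_regressions(results, k=4))  # ['q2']
```

Approximate indexes will never match exactly, so decide up front how much drift you can live with instead of arguing about it query by query.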
The pgvector Escape Hatch
Honestly? Skip the fancy vector databases. PostgreSQL with pgvector cost us about 80% less and your team already knows Postgres.
Yeah, it's slower. Our queries went from 50ms to around 125ms. But we cut our monthly costs from like $12k to around $2k and eliminated all the vendor bullshit. Sometimes slower and cheaper beats fast and expensive, especially when "fast" comes with surprise price increases every quarter.
pgvector handles our 50 million vectors fine. Unless you're doing real-time recommendations for millions of users, you probably don't need the performance of specialized vector databases.
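For the curious, here's roughly what the pgvector side looks like. This is a sketch, not our production schema: the table name, columns, 3-dimensional vectors, and index parameters are placeholders, the HNSW index type needs pgvector 0.5.0 or newer, and the `%(name)s` placeholders assume a psycopg-style driver. Note that `<=>` returns cosine distance, so smaller is better:

```python
def to_pgvector_literal(vec):
    """Format a sequence of numbers as pgvector text input, e.g. [0.1,0.2,0.3]."""
    return "[" + ",".join(repr(float(x)) for x in vec) + "]"


# Placeholder schema; set vector(N) to your real embedding dimension
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS items (
    id bigserial PRIMARY KEY,
    category text,
    embedding vector(3)
);
CREATE INDEX IF NOT EXISTS items_embedding_idx
    ON items USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);
"""

# Metadata filtering is just a WHERE clause
SEARCH_SQL = """
SELECT id, embedding <=> %(query)s::vector AS cosine_distance
FROM items
WHERE category = %(category)s
ORDER BY embedding <=> %(query)s::vector
LIMIT %(limit)s;
"""

params = {
    "query": to_pgvector_literal([0.1, 0.2, 0.3]),
    "category": "electronics",
    "limit": 10,
}
# With a psycopg-style cursor: cur.execute(SEARCH_SQL, params)
```

The filter that needed a nested bool structure elsewhere is one line of SQL your team can already read.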