Vector search is a massive pain in the ass. You've got millions of high-dimensional vectors from your ML models, and you need to find similar ones fast. PostgreSQL shits the bed at 100k vectors - pgvector query times go from 10ms to 30 seconds when you cross that threshold. Elasticsearch gets expensive real quick - we blew through $3k/month just indexing 50M embeddings. Most vector databases are just FAISS with marketing polish and a 10x price tag.
FAISS cuts through the bullshit. It's a C++ library with Python bindings from Meta's AI team that's been battle-tested on billion-vector datasets since 2017. Recent releases added NVIDIA cuVS GPU integration and fewer ways to accidentally kill your server.
What you actually get
Index Hell Made Simple
FAISS has 15+ index types because vector search is complicated. Want exact results? Use IndexFlatL2. Want speed? IndexIVFFlat. Want your 64GB RAM to actually fit 100M vectors? IndexPQ compresses them down 10-100x. Each index trades off speed, memory, and accuracy differently.
GPU Acceleration That Works
The CUDA implementation can hit thousands of queries per second on modern hardware. I've seen 5-20x speedups over CPU, assuming you survive the CUDA dependency hell.
Production-Ready Pain
FAISS handles the edge cases that kill other libraries. It works with billions of vectors, supports different distance metrics, and won't randomly crash when your dataset doesn't fit in memory.
Where FAISS Actually Gets Used
Image Search
When you upload a photo to find similar ones, that's probably FAISS under the hood. CNN embeddings go in, similar image IDs come out. Works great until someone uploads a corrupted JPEG and your entire index build shits itself with a cryptic std::bad_alloc error at 3am. Pinterest and Instagram both use FAISS for image similarity - learned that the hard way when our Pinterest clone started OOMing on user uploads.
Text Embeddings
RAG systems use FAISS to find relevant documents. You embed your query with BERT or whatever, FAISS finds the closest document embeddings, and you pray the LLM doesn't hallucinate some bullshit. LangChain integration makes this relatively painless - took me 2 hours instead of 2 days. Hugging Face Datasets ships with FAISS support, which saved our asses when we needed to index Wikipedia.
Recommendation Engines
E-commerce sites embed user behavior and product features, then use FAISS to find similar users or products. The embeddings are usually garbage, but FAISS makes searching through garbage really fast. Spotify's recommendations and Netflix's personalization reportedly lean on exactly this kind of ANN retrieval.
The Stuff No One Talks About
Content moderation, fraud detection, duplicate detection, anything where you need to find "things like this thing" in a giant dataset. Most similarity search is boring enterprise shit, not sexy AI demos. I spent 6 months building a duplicate product detector for e-commerce - FAISS found 2M duplicate listings in our catalog overnight.
The bottom line: if you're dealing with vectors at scale, you'll end up using FAISS whether you like it or not. Either directly (masochist route), or through one of the dozen vector databases that are just FAISS with a fancy REST API and monthly billing that'll bankrupt your startup.