ChromaDB got a lot better after they rewrote it in Rust. Version 0.5.0 broke a bunch of stuff, but the recent releases are way more stable. Still not fast though - tried it on a 5M vector dataset and queries took forever.
Three Ways People Deploy This Thing
Docker: Works fine until you fuck up the volume mount. You need to mount /data, not /chroma like the old docs said. Spent a whole Saturday figuring out why my data kept disappearing. The error messages don't help either.
Kubernetes: Uses a lot of memory and setting up persistent storage is a pain. There's a community Helm chart that I've used. The default settings are garbage though - you'll need to bump up the memory limits.
Chroma Cloud: Free tier plus usage costs. No monthly minimum anymore. Worth it if you don't want to deal with infrastructure bullshit. Their pricing is confusing though - you pay per collection operation and vector storage.
Memory Usage Reality
ChromaDB's docs recommend 2GB minimum. That's optimistic. Here's what I've seen running this in production:
- Under 1M vectors: 4GB works most of the time
- 1M-10M vectors: 8-16GB, really depends on your queries
- Over 10M vectors: 32GB+ or just use something else
Memory usage just keeps growing and never comes back down. I restart mine every few days.
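If you want those tiers in a capacity-planning script, here's a rough sketch. The tier boundaries are just my numbers from the list above, not anything official. The raw-size helper assumes embeddings are stored as 4-byte floats and ignores index and metadata overhead, so treat it as a floor, not an estimate. Both function names are made up for this example.

```python
def memory_tier(n_vectors: int) -> str:
    """Return the rough RAM tier from the list above for a vector count."""
    if n_vectors < 0:
        raise ValueError("n_vectors must be non-negative")
    if n_vectors < 1_000_000:
        return "4GB"
    if n_vectors <= 10_000_000:
        return "8-16GB"
    return "32GB+ (or a different database)"


def raw_vector_gib(n_vectors: int, dim: int, bytes_per_value: int = 4) -> float:
    """Lower bound in GiB: raw embedding payload only.

    Assumes 4-byte floats by default; index structures and metadata
    come on top of this and are not modeled here.
    """
    return n_vectors * dim * bytes_per_value / 2**30


if __name__ == "__main__":
    # dim=768 is only an example embedding size, swap in your own.
    for n in (500_000, 5_000_000, 20_000_000):
        print(f"{n:>10,} vectors: plan for {memory_tier(n)}, "
              f"raw payload at dim=768 is about {raw_vector_gib(n, 768):.1f} GiB")
```

The gap between the raw payload and the tier is the point: the index, metadata, and query working set all need room on top of the vectors themselves.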
Getting Persistence Right
ChromaDB uses SQLite for persistence. SQLite is reliable but needs proper volume mounting.
Persistence was broken in early versions, but it works fine now if you mount correctly:
docker run -d \
  --name chromadb \
  -p 8000:8000 \
  -v /local/path:/data \
  -e IS_PERSISTENT=true \
  chromadb/chroma:latest
Mount /data inside the container. The docs used to say /chroma/chroma, but that never worked for me. Wasted way too much time on this.
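A quick way to catch the disappearing-data problem before it eats your Saturday is to look at the host side of the mount. The sketch below checks that the host directory exists and is writable, then checks whether a SQLite file has shown up in it after the first write. The filename chroma.sqlite3 is an assumption about how the store is laid out on disk, so adjust it if your version differs. The function names are mine.

```python
import os
from pathlib import Path

# Assumption: the persistent store appears as this file in the mounted
# directory. Change it if your ChromaDB version uses a different layout.
SQLITE_FILENAME = "chroma.sqlite3"


def check_host_mount(host_path: str) -> list[str]:
    """Return a list of problems with the host side of the -v mount."""
    path = Path(host_path)
    if not path.exists():
        return [f"{host_path} does not exist"]
    if not path.is_dir():
        return [f"{host_path} is not a directory"]
    problems = []
    if not os.access(path, os.W_OK):
        problems.append(f"{host_path} is not writable by this user")
    return problems


def database_present(host_path: str) -> bool:
    """True if the SQLite file is visible on the host after a write."""
    return (Path(host_path) / SQLITE_FILENAME).is_file()
```

Run check_host_mount("/local/path") before starting the container, add a document, then call database_present("/local/path"). If it still returns False after a write, the data is landing somewhere inside the container instead of in your volume.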
Storage that works:
- Local SSD volumes (fastest)
- AWS EBS (reliable, decent performance with gp3)
- GCP Persistent Disks (expensive but solid)
Storage that doesn't:
- Network storage like EFS or Azure Files (SQLite hates network latency)
- Containers without persistent volumes (data disappears, obviously)
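If you're not sure what kind of storage a path actually sits on, you can check the mount table on Linux. This sketch parses text in the /proc/mounts format and reports the filesystem type behind a path. The set of types I flag as "network" is my own guess at the usual suspects (NFS for EFS, CIFS/SMB for Azure Files), not an exhaustive list, and it doesn't handle mount points with escaped spaces.

```python
import os
from typing import Optional

# Assumption: these are the filesystem types worth flagging. Extend as needed.
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "smb3", "smbfs"}


def fs_type_for(path: str, mounts_text: str) -> Optional[str]:
    """Return the fs type of the longest mount point that contains path."""
    target = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if target == mount_point or target.startswith(prefix):
            if len(mount_point) > len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


def is_network_fs(path: str, mounts_text: str) -> bool:
    """True if path lives on one of the flagged network filesystem types."""
    return fs_type_for(path, mounts_text) in NETWORK_FS_TYPES


if __name__ == "__main__":
    # /local/path matches the docker run example; Linux only.
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as f:
            mounts = f.read()
        print("/local/path is on:", fs_type_for("/local/path", mounts))
```

If is_network_fs comes back True for your data directory, move it to a local or block-storage volume before blaming ChromaDB for slow or flaky writes.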
Security Setup
ChromaDB has basic auth but it's off by default. Most people put nginx or Traefik in front:
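server {
    listen 443 ssl;
    server_name your-chroma.company.com;

    # nginx won't start an ssl listener without a certificate.
    # Both paths are placeholders; point them at your own files.
    ssl_certificate     /path/to/fullchain.pem;
    ssl_certificate_key /path/to/privkey.pem;

    location / {
        # Forward everything to the ChromaDB container
        proxy_pass http://chromadb:8000;
        proxy_set_header Host $host;
    }
}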
Don't trust ChromaDB's built-in security for anything important. Treat it like an internal database and use proper network security if you're on K8s. Check out OWASP guidelines for API security best practices.
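One cheap sanity check for the "treat it like an internal database" advice is to confirm that only the proxy port answers from outside. Here's a minimal sketch using a plain TCP connect. It only tells you whether a port accepts connections, nothing about auth or TLS, and the function name is mine.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name didn't resolve.
        return False
```

Run it from a machine outside your network against the hostname from the nginx config: port_is_open("your-chroma.company.com", 443) should be True, and port_is_open("your-chroma.company.com", 8000) should be False. If 8000 answers, ChromaDB is exposed directly and the proxy isn't protecting anything.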