Wasted three days trying to get STDIO working in production. Save yourself the time.
STDIO Transport: Doesn't work. Send a bunch of requests, most fail. Get timeouts, connection refused errors, response times over 20 seconds. Not performance issues - actual failures.
Documentation doesn't mention this. Deploy to production and find out when AI agents start hitting your endpoints.
SSE Transport: Works better than STDIO but deprecated. Building on dead tech isn't smart.
Streamable HTTP: Only transport that works. Session strategy matters - you get 30 RPS or a couple hundred depending on how you configure it.
Tested on staging: shared sessions performed well, unique sessions were terrible. Huge difference.
Session pooling became required after the crashes. Not optional anymore.
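Rough sketch of what session pooling can look like on the client side: cap the number of sessions and reuse them round-robin instead of opening one per request. Everything here is hypothetical (`Session`, `SessionPool`, the `create` factory); how a session actually gets opened depends on your transport and SDK.

```typescript
// Hypothetical session handle; a real one would wrap whatever the transport returns.
interface Session {
  id: string;
  uses: number;
}

class SessionPool {
  private sessions: Session[] = [];
  private next = 0;
  private maxSessions: number;
  private create: () => Session;

  // `create` is an assumed factory that opens one session on the server.
  constructor(maxSessions: number, create: () => Session) {
    if (maxSessions < 1) {
      throw new Error("maxSessions must be at least 1");
    }
    this.maxSessions = maxSessions;
    this.create = create;
  }

  // Open new sessions only until the cap is reached, then reuse round-robin.
  acquire(): Session {
    if (this.sessions.length < this.maxSessions) {
      const created = this.create();
      this.sessions.push(created);
      created.uses += 1;
      return created;
    }
    const session = this.sessions[this.next];
    this.next = (this.next + 1) % this.sessions.length;
    session.uses += 1;
    return session;
  }

  get size(): number {
    return this.sessions.length;
  }
}
```

With a cap of 2, five calls to `acquire()` open two sessions and reuse them for the rest. Right cap depends on your server; measure it on staging.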
AI Traffic Patterns Break Everything
AI agents hit differently than regular users. Burst traffic that overwhelms databases running default configs.
Single AI conversation creates dozens of parallel requests. Database was still on PostgreSQL defaults - around 100 max connections. AI agents tried opening way more. Got "FATAL: sorry, too many clients already" and everything crashed.
AI agents retry without backoff. Keep hammering until you add circuit breakers.
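Minimal circuit breaker sketch, server side, wrapped around whatever the agents keep hammering (database, downstream API). Not the exact code from production. `CircuitBreaker`, the threshold, and the cooldown are illustrative names and values; the `now` parameter is only there so the clock can be faked.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private current: BreakerState = "closed";
  private failureThreshold: number;
  private cooldownMs: number;
  private now: () => number;

  constructor(
    failureThreshold: number,
    cooldownMs: number,
    now: () => number = () => Date.now(),
  ) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  get state(): BreakerState {
    return this.current;
  }

  // Closed: always allow. Open: reject until the cooldown has passed,
  // then let a single probe through (half-open).
  allowRequest(): boolean {
    if (this.current === "closed") {
      return true;
    }
    if (this.current === "open" && this.now() - this.openedAt >= this.cooldownMs) {
      this.current = "half-open";
      return true;
    }
    return false;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.current = "closed";
  }

  // A failed probe, or too many failures in a row, (re)opens the circuit.
  recordFailure(): void {
    this.failures += 1;
    if (this.current === "half-open" || this.failures >= this.failureThreshold) {
      this.current = "open";
      this.openedAt = this.now();
    }
  }

  // Run `task` through the breaker; reject immediately while the circuit is open.
  async call<T>(task: () => Promise<T>): Promise<T> {
    if (!this.allowRequest()) {
      throw new Error("circuit open: request rejected without calling downstream");
    }
    try {
      const result = await task();
      this.recordSuccess();
      return result;
    } catch (err) {
      this.recordFailure();
      throw err;
    }
  }
}
```

Point is that a retry storm gets a fast rejection instead of piling more work onto a dependency that is already failing. Wrap each downstream call in `breaker.call(() => ...)` and turn the rejection into a 503 or similar.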
Connection Pool Reality
Static pools don't handle AI burst patterns. Found out when AI agent requested "analyze all customer data" and server tried opening more database connections than possible.
PostgreSQL's max_connections defaults to around 100 - way too low for AI traffic bursts.
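One way to keep a burst from ever asking for more connections than the database allows: put a gate in front of the pool that caps in-flight work and queues the rest. This is a swapped-in technique for illustration, not necessarily the fix used here. `ConnectionGate` and both limits are made-up names and values.

```typescript
class ConnectionGate {
  private active = 0;
  private waiters: Array<() => void> = [];
  private maxActive: number;
  private maxWaiting: number;

  constructor(maxActive: number, maxWaiting: number) {
    this.maxActive = maxActive;
    this.maxWaiting = maxWaiting;
  }

  get activeCount(): number {
    return this.active;
  }

  get waitingCount(): number {
    return this.waiters.length;
  }

  // Take a slot if one is free, wait in line if not, fail fast if the line is full.
  acquire(): Promise<void> {
    if (this.active < this.maxActive) {
      this.active += 1;
      return Promise.resolve();
    }
    if (this.waiters.length >= this.maxWaiting) {
      return Promise.reject(new Error("connection gate queue is full"));
    }
    return new Promise<void>((resolve) => {
      this.waiters.push(resolve);
    });
  }

  // Hand the slot straight to the next waiter, or free it if nobody is waiting.
  release(): void {
    const next = this.waiters.shift();
    if (next) {
      next();
    } else if (this.active > 0) {
      this.active -= 1;
    }
  }

  // Run a task while holding a slot; the slot is released even if the task throws.
  async run<T>(task: () => Promise<T>): Promise<T> {
    await this.acquire();
    try {
      return await task();
    } finally {
      this.release();
    }
  }
}
```

Set `maxActive` below the database connection limit, leaving headroom for migrations and admin tools. Every query goes through `gate.run(() => ...)`, so "analyze all customer data" waits its turn instead of opening hundreds of connections at once.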
Memory Management Under AI Load
AI responses get massive. Started with small responses, then suddenly getting 30-50MB JSON payloads that crashed Node trying to serialize.
Production went down for hours when AI query returned huge dataset. Server died processing it all. Was customer data or product catalog, can't remember which.
Stream large responses or server dies when AI requests "all available data."
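Sketch of the streaming idea: emit the JSON array in pieces so no single string ever holds the whole payload. `jsonArrayChunks` and the batch size are hypothetical; rows are assumed to arrive from something iterable, like a database cursor.

```typescript
// Yield a JSON array as a sequence of small string chunks.
// Concatenating all chunks gives the same text as JSON.stringify on the full array.
function* jsonArrayChunks<T>(rows: Iterable<T>, batchSize: number = 500): Generator<string> {
  if (batchSize < 1) {
    throw new Error("batchSize must be at least 1");
  }
  yield "[";
  let first = true;
  let batch: string[] = [];
  for (const row of rows) {
    batch.push(JSON.stringify(row));
    if (batch.length >= batchSize) {
      // Every batch after the first starts with a comma to keep the array valid.
      yield (first ? "" : ",") + batch.join(",");
      first = false;
      batch = [];
    }
  }
  if (batch.length > 0) {
    yield (first ? "" : ",") + batch.join(",");
  }
  yield "]";
}
```

In Node, each chunk would go to `res.write(chunk)`, pausing for the `drain` event whenever `write` returns false. `Readable.from(jsonArrayChunks(rows))` piped into the response is another option that handles backpressure. Memory then scales with the batch size instead of the dataset size.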