kubectl hammers your API server with inefficient requests because the defaults were clearly designed by someone who never managed a production cluster. In our massive cluster, kubectl get pods --all-namespaces
takes forever and eats a shit-ton of memory. This isn't a Kubernetes bug - it's kubectl being dumb about how it fetches data.
The Real Problems (From Someone Who's Debugged This 50 Times)
kubectl's Stupid Defaults: The client-go library defaults to QPS=5 and Burst=10. That's ridiculously conservative. I've tested clusters that handle 100+ QPS just fine, but kubectl crawls along at 5 requests per second like we're still using dial-up.
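If you build your own tooling on client-go, those limits are just two fields on the rest config and trivial to raise. A minimal sketch, assuming a kubeconfig in the default location; the 50/100 values are illustrative, not a recommendation - test what your API server actually tolerates:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load the same kubeconfig kubectl reads.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}

	// Left at zero, client-go falls back to its defaults of QPS=5 and Burst=10.
	// These values are illustrative; tune them against your own API server.
	config.QPS = 50
	config.Burst = 100

	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	pods, err := clientset.CoreV1().Pods("").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Printf("listed %d pods\n", len(pods.Items))
}
```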
Memory Hog Behavior: kubectl loads everything into memory first, then shows you results. List 5,000 pods? kubectl downloads all the pod manifests (often tens of kilobytes each, and far more once people pile annotations and managedFields onto them), dumps them in RAM, then prints a table. That's why your laptop starts swapping to death trying to list pods in a big namespace.
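The list API does support pagination, which is how you keep any single response (and your client's memory) bounded. A rough sketch of chunked listing with client-go - it assumes the clientset from the previous snippet, and the 500-item page size just mirrors kubectl's own --chunk-size default:

```go
import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// listPodNamesInChunks pages through pods 500 at a time instead of pulling the
// whole cluster's worth of manifests in one response. Only namespace/name is
// kept, so memory stays flat regardless of cluster size.
func listPodNamesInChunks(ctx context.Context, clientset *kubernetes.Clientset) ([]string, error) {
	var names []string
	opts := metav1.ListOptions{Limit: 500}
	for {
		page, err := clientset.CoreV1().Pods("").List(ctx, opts)
		if err != nil {
			return nil, err
		}
		for _, pod := range page.Items {
			names = append(names, pod.Namespace+"/"+pod.Name)
		}
		// An empty continue token means this was the last page.
		if page.Continue == "" {
			return names, nil
		}
		opts.Continue = page.Continue
	}
}
```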
Connection Overhead: every kubectl invocation is a brand-new process, so every command sets up a brand-new HTTPS connection and pays for a full TLS handshake before it sends a single useful request. In cloud environments that round-trip overhead adds up fast; run kubectl constantly throughout the day and you've burned real time on nothing but connection setup.
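This is also why a long-lived client (a controller, a watch loop, or even a small script that polls) feels so much snappier than the equivalent shell loop of kubectl calls: the Go client keeps the TLS connection alive and reuses it. A hedged sketch of that pattern - the namespace and the two-second poll interval are arbitrary:

```go
package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Ten polls over one client: only the first request pays for a TLS
	// handshake, the rest reuse the same keep-alive connection. Ten separate
	// kubectl invocations would pay that setup cost every single time.
	for i := 0; i < 10; i++ {
		pods, err := clientset.CoreV1().Pods("default").List(context.TODO(), metav1.ListOptions{Limit: 1})
		if err != nil {
			panic(err)
		}
		fmt.Printf("poll %d: %d pod(s) in first page\n", i, len(pods.Items))
		time.Sleep(2 * time.Second)
	}
}
```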
The Caching is Broken: kubectl caches API discovery info in ~/.kube/cache, but it's conservative as hell. The cache expires more often than it should, the cache directory fills up with stale junk for every cluster you've ever touched, and half the time kubectl bypasses it anyway and re-discovers everything from scratch.
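Under the hood that cache is client-go's disk-backed discovery client, and you can use the same mechanism in your own tools with whatever TTL you can live with. A sketch under assumptions - the cache paths mimic kubectl's layout and the one-hour TTL is purely illustrative:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"

	diskcached "k8s.io/client-go/discovery/cached/disk"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}

	// Cache layout modelled on kubectl's: a discovery cache and an HTTP cache
	// under ~/.kube/cache. The one-hour TTL is an assumption for illustration.
	home, _ := os.UserHomeDir()
	discoCache := filepath.Join(home, ".kube", "cache", "discovery")
	httpCache := filepath.Join(home, ".kube", "cache", "http")

	cached, err := diskcached.NewCachedDiscoveryClientForConfig(config, discoCache, httpCache, time.Hour)
	if err != nil {
		panic(err)
	}

	// The first call may hit the API server; repeat calls within the TTL are
	// answered from disk with no network round-trip at all.
	groups, err := cached.ServerGroups()
	if err != nil {
		panic(err)
	}
	fmt.Printf("discovered %d API groups\n", len(groups.Groups))
}
```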
How Bad It Gets in Real Clusters
Small clusters are fine. You won't notice problems until you hit maybe 500+ pods, then kubectl starts getting sluggish.
Medium-sized clusters start getting annoying. kubectl get pods --all-namespaces
might take 10-15 seconds, which is tolerable but frustrating when you're debugging something urgent.
Big clusters make you want to quit. Commands time out with "context deadline exceeded" errors. I've waited so long for simple pod listings that I forgot what I was debugging. Your laptop fan starts spinning up just to list pods, which is insane.
Once you hit those monster clusters with 1000+ nodes, kubectl is basically broken. You'll get TLS timeouts, memory exhaustion, and commands that just hang forever. At that point you should probably switch to k9s or just accept that kubectl isn't meant for interactive work anymore.
Error Messages You'll See When kubectl Shits the Bed
When kubectl fails in large clusters, you get unhelpful error messages that don't tell you shit:
error: context deadline exceeded - the API server took too long to respond (or got overwhelmed)
error: unable to connect to the server: EOF - the connection dropped, probably partway through a massive response
error: the server was unable to return a response within 60 seconds - the API server gave up trying
Unable to connect to the server: net/http: TLS handshake timeout - the network is fucked or overloaded
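One small mitigation on the CLI side is kubectl's --request-timeout flag, which at least turns an indefinite hang into a fast, explicit failure. In client-go the equivalent knobs are a client-wide timeout on the rest config plus a per-request context deadline; a rough sketch, with the 90-second and 30-second values picked arbitrarily:

```go
package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	// Client-side cap on how long any single request may take; the default of
	// zero means "wait forever", which is how you end up staring at a hung
	// terminal instead of getting an error you can act on.
	config.Timeout = 90 * time.Second

	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// Per-call deadline: combined with a paginated list, a slow API server
	// surfaces as a quick, explicit error instead of an indefinite hang.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()

	pods, err := clientset.CoreV1().Pods("").List(ctx, metav1.ListOptions{Limit: 500})
	if err != nil {
		fmt.Println("list failed:", err) // e.g. "context deadline exceeded"
		return
	}
	fmt.Printf("got first page of %d pods\n", len(pods.Items))
}
```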
The official docs mention these issues but offer zero practical solutions. Stack Overflow has better advice than the Kubernetes documentation, which tells you everything you need to know about the state of the docs.