Kong Gateway Performance Optimization: AI-Optimized Technical Reference
Executive Summary
Kong Gateway performance degrades significantly from marketing benchmarks (50,000+ RPS) to production reality (8,000-28,000 RPS depending on plugin load). Performance bottlenecks occur at database connections, plugin processing overhead, memory cache limitations, and network configuration. Critical optimization requires specific configuration changes, not just infrastructure scaling.
Performance Baselines and Reality
Realistic Performance Expectations
- Bare Kong (no plugins): 35,000-45,000 RPS on 8-core, 16GB RAM
- Essential plugins (auth, rate limiting): 20,000-28,000 RPS
- Full enterprise plugin stack: 8,000-15,000 RPS
- AI Gateway plugins enabled: 5,000-12,000 RPS
Memory Usage Reality
- Base worker process: 500MB + plugin overhead
- Production example: 4-worker instance with 500 routes and 12 plugins = 6.2GB RAM (documentation suggests 4GB - insufficient)
- Memory scaling factors: Worker count × Entity count × Connection pools
Critical Configuration Requirements
Core Performance Settings (kong.conf)
```
# Worker and connection settings
nginx_worker_processes = 8               # 2x CPU cores (not 1x default)
nginx_events_worker_connections = 4096   # 4x default for burst traffic
upstream_keepalive_pool_size = 512       # Connections kept alive per upstream pool
upstream_keepalive_max_requests = 1024   # Requests per keepalive connection before recycling

# Memory settings
mem_cache_size = 8192m                   # 8GB cache (not 128MB default)
lua_shared_dict_size = 512m              # Increase from 5m default

# Database connections (critical bottleneck)
pg_max_concurrent_queries = 50           # Per worker
pg_keepalive_timeout = 600000            # 10 minutes

# Performance logging
proxy_access_log = /dev/stdout           # Avoid disk I/O
admin_access_log = off                   # Disable admin logging
proxy_error_log = /dev/stderr            # Keep errors on stderr
log_level = warn                         # Warnings and errors only
```
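To apply these settings without bouncing the gateway, validate the file and hot-reload the workers. A minimal sketch, assuming a package install with the config at /etc/kong/kong.conf:

```
# Validate and hot-reload the kong.conf changes above (paths are assumptions)
set -euo pipefail
KONG_CONF=/etc/kong/kong.conf

kong check "$KONG_CONF"        # syntax/validity check before touching the running node
kong reload -c "$KONG_CONF"    # graceful worker reload (SIGHUP); in-flight requests drain

# Per-environment overrides also work via KONG_* variables, e.g.:
# KONG_MEM_CACHE_SIZE=8192m kong start -c "$KONG_CONF"
```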
PostgreSQL Configuration Requirements
```
max_connections = 200          # 2x Kong instances × workers × pool
shared_buffers = 2GB           # 25% of available RAM
effective_cache_size = 6GB     # 75% of available RAM
work_mem = 64MB                # For rate limiting queries
random_page_cost = 1.1         # SSD optimization
```
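These can be applied with ALTER SYSTEM instead of hand-editing postgresql.conf; a sketch with placeholder connection details:

```
PSQL="psql -h db.internal -U postgres -d kong"   # placeholder connection string

$PSQL -c "ALTER SYSTEM SET max_connections = 200;"
$PSQL -c "ALTER SYSTEM SET shared_buffers = '2GB';"
$PSQL -c "ALTER SYSTEM SET effective_cache_size = '6GB';"
$PSQL -c "ALTER SYSTEM SET work_mem = '64MB';"
$PSQL -c "ALTER SYSTEM SET random_page_cost = 1.1;"

$PSQL -c "SELECT pg_reload_conf();"   # picks up the reloadable settings
# max_connections and shared_buffers still require a full PostgreSQL restart
```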
Plugin Performance Impact
Latency Cost Per Plugin
- Rate Limiting (database): +3-5ms, +200MB memory per worker
- Rate Limiting (Redis): +1-2ms, +100MB memory per worker
- JWT Validation: +0.5-1ms, +50MB memory per worker
- OAuth 2.0: +5-8ms, +300MB memory per worker
- Prometheus metrics: +2ms, +500MB memory (memory hog)
- Request Transformer: +1-3ms depending on complexity
- CORS: +0.5ms (efficient)
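To check these costs against your own stack, Kong's X-Kong-Proxy-Latency response header reports time spent inside the gateway (plugins plus routing). A rough sampling sketch; the proxy URL and plugin ID are placeholders:

```
URL=http://kong.internal:8000/api/orders   # placeholder proxy endpoint

# Average gateway-side latency over 50 requests
for i in $(seq 1 50); do
  curl -s -o /dev/null -D - "$URL" | grep -i 'X-Kong-Proxy-Latency'
done | awk -F': ' '{sum+=$2; n++} END {print "avg proxy latency:", sum/n, "ms"}'

# Disable the plugin under test, rerun, and diff the averages:
# curl -s -X PATCH http://localhost:8001/plugins/<plugin-id> --data enabled=false
```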
Optimal Plugin Execution Order
1. Request Termination (fail fast)
2. Rate Limiting (block early)
3. Authentication (JWT > Key Auth > OAuth)
4. Authorization (RBAC, ACL)
5. Request Validation
6. Request Transformation
7. Proxy/Upstream plugins
8. Response Transformation
9. Logging/Analytics (last - avoid processing rejected requests)
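Kong runs plugins in a fixed priority order, so the practical lever is which plugins are enabled and at what scope. A quick audit sketch using the Admin API (localhost:8001 and jq are assumptions):

```
# List every configured plugin with its scope so expensive plugins aren't
# running globally when a route-level attachment would do
curl -s http://localhost:8001/plugins \
  | jq -r '.data[] | [.name, (.route.id // "global"), (.service.id // "-"), (.enabled|tostring)] | @tsv' \
  | sort
```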
Critical Failure Points
Database Connection Exhaustion
- Problem: Default PostgreSQL 100 connections vs 4 Kong workers × 30 connections = 120 needed
- Symptoms: 502 errors under load, connection timeouts
- Solution: Increase PostgreSQL connections + PgBouncer connection pooling
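A minimal PgBouncer sketch for fronting the Kong database, written as a heredoc for illustration; hostnames, credentials, and pool sizes are placeholders, and pool_mode should be verified against your Kong version before relying on transaction pooling:

```
cat > /etc/pgbouncer/pgbouncer.ini <<'EOF'
[databases]
kong = host=db.internal port=5432 dbname=kong

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction        ; most aggressive reuse - test with your Kong version
default_pool_size = 50         ; per database/user pair
max_client_conn = 500          ; all Kong workers combined
EOF

# Then point Kong at PgBouncer instead of PostgreSQL directly:
#   pg_host = pgbouncer.internal
#   pg_port = 6432
```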
Memory Cache Undersizing
- Problem: Default 128MB cache insufficient for production entity counts
- Impact: Constant database hits for configuration data
- Solution: Set mem_cache_size to 50% of available RAM
Plugin Processing Order
- Problem: Expensive plugins (logging) running before cheap rejection (auth)
- Impact: CPU waste processing requests that should be rejected early
- Solution: Reorder plugins to fail fast
Connection Pool Saturation
- Problem: Default upstream keepalive (1000) × number of upstreams = excessive connections
- Impact: Socket exhaustion, connection thrashing
- Solution: Tune upstream_keepalive_pool_size and upstream_keepalive_max_requests
Monitoring Critical Metrics
Performance Indicators
- kong_request_duration_ms{quantile="0.95"} > 100ms = Kong bottleneck
- kong_upstream_target_health < 1 = upstream failures
- kong_memory_lua_shared_dict_bytes > 80% = memory pressure
- kong_datastore_reachable = 0 = database connectivity issues
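These indicators can be spot-checked straight from the Prometheus plugin output; the endpoint location varies by Kong version (Status API :8100/metrics on newer releases, Admin API :8001/metrics on older ones), so the URL below is an assumption:

```
METRICS=http://localhost:8001/metrics   # may be :8100/metrics in your deployment

curl -s "$METRICS" | grep -E 'kong_upstream_target_health|kong_datastore_reachable'
curl -s "$METRICS" | grep 'kong_memory_lua_shared_dict_bytes'
```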
Debug Commands
- CPU analysis: `top -H -p "$(pgrep -d, nginx)"` shows per-worker thread CPU usage
- Connection monitoring: `ss -tnp | grep nginx | wc -l` counts established connections (`ss -tulpn` lists the listeners)
- Request timing: set `log_level = debug` in kong.conf (or `KONG_LOG_LEVEL=debug`) and reload for a per-request latency breakdown
Scaling Decision Matrix
Scenario | Max RPS | P95 Latency | Memory/Instance | Database Load | Scaling Action |
---|---|---|---|---|---|
Single optimized instance | 25,000 | 20-50ms | 8-12GB | Low | Vertical scaling limit |
Multi-instance cluster | 100,000+ | 10-30ms | 6-8GB each | Very Low | Horizontal scaling |
Database bottleneck | Variable | >100ms | Normal | High | Redis for stateful plugins |
Plugin saturation | <15,000 | >200ms | High | Variable | Disable non-essential plugins |
Redis Migration Requirements
Critical for Production Scale
- Database rate limiting: Becomes bottleneck at >5,000 RPS
- Redis migration: Reduces latency from 5ms to 1ms per rate limit check
- Configuration: Redis cluster required for multi-instance Kong deployments
Redis Settings
```
redis:
  timeout: 2000              # Fail fast
  keepalive_pool_size: 100   # Per Kong worker
  keepalive_backlog: 50      # Connection queue
```
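A sketch for switching the rate-limiting plugin onto Redis through the Admin API; the service name and Redis host are placeholders, and the flat config.redis_* fields are the long-standing form (newer releases also accept a nested redis object - check your version):

```
curl -s -X POST http://localhost:8001/services/orders-api/plugins \
  --data "name=rate-limiting" \
  --data "config.minute=1000" \
  --data "config.policy=redis" \
  --data "config.redis_host=redis.internal" \
  --data "config.redis_port=6379" \
  --data "config.redis_timeout=2000"
```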
Common Failure Scenarios
502 Bad Gateway Causes
- Backend connection limits exceeded - Check app server connection limits vs Kong upstream connections
- DNS resolution failures - Use IP addresses or fix /etc/resolv.conf
- Network timeouts - Reduce timeout settings, check network stability
- Health check failures - Verify health check endpoints and intervals
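Two first checks when 502s spike, sketched with placeholder upstream and backend names:

```
# Per-target health as Kong sees it (HEALTHY / UNHEALTHY / DNS_ERROR)
curl -s http://localhost:8001/upstreams/orders-backend/health | jq .

# Bypass Kong and time the backend directly to separate gateway from upstream issues
curl -s -o /dev/null \
  -w 'dns=%{time_namelookup}s connect=%{time_connect}s total=%{time_total}s\n' \
  http://backend.internal:8080/healthz
```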
Performance Degradation Over Time
- Connection pool growth - Monitor established connections with `ss -tnp | grep nginx | wc -l`
- Plugin state accumulation - OAuth tokens, rate limit counters growing
- Log file growth - Rotate logs aggressively
- Lua garbage collection - Enable GC statistics monitoring
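Shared-dict and per-worker Lua memory can be sampled from the node status endpoint to catch slow growth; a sketch assuming Kong 2.x+ and jq:

```
# Poll memory usage every minute; rising http_allocated_gc or a shared dict near
# capacity points at plugin state accumulation or GC pressure
watch -n 60 '
  curl -s http://localhost:8001/status \
    | jq "{shared_dicts: .memory.lua_shared_dicts, workers: [.memory.workers_lua_vms[].http_allocated_gc]}"
'
```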
Scaling Bottlenecks at Load Increases
- 5,000 RPS threshold: Worker saturation, database pool exhaustion
- Solution order: Scale workers → database connections → optimize plugins → add instances
Resource Requirements by Scale
Small Production (5,000-15,000 RPS)
- Kong instances: 1-2
- CPU cores: 4-8 per instance
- Memory: 8-12GB per instance
- Database: PostgreSQL with 200 connections
- Redis: Single instance for rate limiting
Medium Production (15,000-50,000 RPS)
- Kong instances: 3-5
- CPU cores: 8-16 per instance
- Memory: 8-16GB per instance
- Database: PostgreSQL with PgBouncer connection pooling
- Redis: Redis cluster for high availability
Large Production (50,000+ RPS)
- Kong instances: 5+ (horizontal scaling)
- CPU cores: 8-16 per instance
- Memory: 12-24GB per instance
- Database: PostgreSQL cluster with dedicated connection pooling
- Redis: Redis cluster with sharding
Time Investment Requirements
Initial Setup Optimization
- Configuration tuning: 4-8 hours
- Load testing validation: 8-16 hours
- Monitoring setup: 4-8 hours
- Documentation: 2-4 hours
Ongoing Performance Management
- Performance monitoring: 2-4 hours/week
- Capacity planning: 4-8 hours/quarter
- Configuration updates: 1-2 hours/month
- Incident response: 2-8 hours/incident (depending on complexity)
Prerequisites and Dependencies
Technical Requirements
- PostgreSQL expertise: Database tuning, connection pooling
- Redis knowledge: Clustering, memory management
- Load testing tools: K6, Apache Bench, or equivalent
- Monitoring stack: Prometheus, Grafana for metrics collection
Infrastructure Requirements
- Network latency: <5ms between Kong and upstreams for optimal performance
- Storage: SSD storage for database and logs
- DNS: Fast, reliable DNS resolution (use IP addresses when possible)
- Load balancer: Layer 4 load balancer in front of Kong instances
Implementation Checklist
Pre-Production Validation
- Load test with realistic traffic patterns (burst + sustained) - a starter sketch follows this checklist
- Verify database connection pool usage <80%
- Confirm upstream connection pooling working
- Test plugin disable/enable procedures
- Validate health check intervals don't overwhelm backends
- Set up P95 latency monitoring
- Document scaling procedures
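A rough sketch of the first two checklist items using Apache Bench (listed under load-testing tools above); endpoint, concurrency, and database details are assumptions:

```
# Sustained keep-alive load: 200 concurrent clients, 100k requests
ab -k -c 200 -n 100000 http://kong.internal:8000/api/orders

# While the test runs, confirm connection usage stays under 80% of max_connections
psql -h db.internal -U postgres -d kong -c \
  "SELECT count(*) AS in_use, current_setting('max_connections') AS max FROM pg_stat_activity;"
```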
Post-Deployment Monitoring
- P95 request latency trending
- Database connection pool utilization
- Memory usage per Kong worker
- Upstream connection counts
- Plugin processing times
- Error rate by plugin and endpoint
Useful Links for Further Investigation
Kong Performance Resources (Actually Helpful Links)
Link | Description |
---|---|
Kong Gateway Performance Benchmarks | Official benchmark results that show ideal-case performance. Useful for understanding theoretical limits, but expect 30-50% lower performance in production with real backends and network latency. |
Kong Production Deployment Guide | Comprehensive guide covering deployment topologies, security, and basic performance considerations. Good starting point but lacks specific tuning recommendations for high-traffic scenarios. |
Kong Gateway Sizing Guidelines | Resource allocation recommendations based on expected load. The memory estimates are conservative - plan for 50-100% more RAM in production. |
Kong Configuration Reference | Complete reference for all kong.conf parameters. Essential for understanding performance-related settings but lacks context on which ones actually matter. |
PostgreSQL Performance Tuning for Kong | Basic database optimization guide. Covers connection pooling and basic PostgreSQL tuning but doesn't go deep enough for high-traffic deployments. |
PgBouncer Configuration Guide | Essential for Kong deployments beyond 5-10 instances. Connection pooling becomes mandatory when you have multiple Kong instances hitting the same database. |
Redis for Kong Plugins | Documentation for Redis-backed rate limiting. Critical for production deployments - database-backed rate limiting doesn't scale beyond moderate traffic levels. |
Redis Cluster Setup Guide | If you're using Redis for multiple stateful Kong plugins, clustering becomes necessary for high availability and performance at scale. |
Kong Prometheus Plugin | Essential for performance monitoring. The plugin exposes the right metrics but be careful - it uses significant memory at scale. Monitor the monitor. |
Kong Logging Best Practices | Structured logging configuration that doesn't destroy performance. File-based logging can become a bottleneck faster than you'd expect. |
Kong Debug Mode Setup | How to enable request tracing for debugging performance issues. Use sparingly - debug logging impacts performance significantly. |
Grafana Dashboards for Kong | Community-maintained dashboard with the right performance metrics. Focus on P95 latency, upstream health, and resource utilization. |
Kong Upstream Health Checks | Configuration guide for upstream health monitoring. Default settings can overwhelm backends - tune intervals based on your infrastructure. |
Kong Load Balancing Guide | Explains Kong's load balancing algorithms and their performance characteristics. Ring-hash is good for caching workloads, weighted-round-robin for general use. |
Kong SSL/TLS Configuration | SSL termination performance optimization. TLS handshakes are CPU-intensive - proper configuration and session caching matter for high-traffic sites. |
Kong Kubernetes Ingress Controller | The right way to run Kong in Kubernetes. Performance considerations are different in container environments - resource limits and networking matter more. |
Kong Helm Chart Configuration | Helm charts with production-ready defaults. The default resource requests are too small for production - plan to override them. |
Kong Docker Images | Official Docker images. Use the specific version tags in production, not "latest". The alpine images are smaller but the standard images have better debugging tools. |
Kong Plugin Development Guide | How to write custom plugins that don't destroy performance. Plugin inefficiency is a common cause of Kong performance problems. |
Kong Plugin Priority and Execution Order | Understanding plugin execution order is critical for performance. Expensive plugins should run after cheap authentication/authorization checks. |
Kong Lua Performance Best Practices | Guidelines for writing efficient Lua code in Kong plugins. Memory management and connection pooling patterns that actually work. |
Kong Community Forum | Active community with real production experiences. Search for performance-related topics - lots of practical advice from people who've debugged similar issues. |
Kong Performance on Stack Overflow | Performance-specific questions and answers. Good source of real-world debugging scenarios and solutions. |
Kong Engineering Blog | Technical deep dives from Kong's engineering team. The performance-related posts cover advanced optimization techniques not documented elsewhere. |
Kong Terraform Provider | Infrastructure-as-code for Kong configuration. Useful for managing performance settings consistently across environments. |
K6 Load Testing Scripts for Kong | Load testing tool with good Kong integration. Essential for validating performance optimizations before production deployment. |
Kong Performance Monitoring with Datadog | APM integration that provides deep visibility into Kong performance bottlenecks. Paid tool but worth it for complex deployments. |
Kong Support Knowledge Base | Enterprise customer support articles. Many performance-related troubleshooting guides that aren't publicly documented elsewhere. |
Kong GitHub Issues | Real bug reports and performance issues from the community. Search for performance-related keywords to find solutions to specific problems. |
Kong Admin API Reference | Admin API documentation for performance monitoring and troubleshooting. Covers the API endpoints most commonly used when debugging. |
High Performance Browser Networking | Not Kong-specific but essential background for understanding API gateway performance. Network optimization principles that apply to Kong deployments. |
Kong Production Deployment Topologies | Architecture patterns and deployment strategies for Kong infrastructure scaling. How to plan topology and growth requirements. |
Kong Load Testing Examples | Official performance testing fixtures and examples. Use these as templates for testing your own Kong configurations. |
Kong Stress Testing Documentation | Helper utilities and examples for stress testing Kong deployments. Good for understanding Kong's performance characteristics in context. |