Kong Gateway Performance Optimization: AI-Optimized Technical Reference
Executive Summary
Kong Gateway performance degrades significantly from marketing benchmarks (50,000+ RPS) to production reality (8,000-28,000 RPS depending on plugin load). Performance bottlenecks occur at database connections, plugin processing overhead, memory cache limitations, and network configuration. Critical optimization requires specific configuration changes, not just infrastructure scaling.
Performance Baselines and Reality
Realistic Performance Expectations
- Bare Kong (no plugins): 35,000-45,000 RPS on 8-core, 16GB RAM
- Essential plugins (auth, rate limiting): 20,000-28,000 RPS
- Full enterprise plugin stack: 8,000-15,000 RPS
- AI Gateway plugins enabled: 5,000-12,000 RPS
Memory Usage Reality
- Base worker process: 500MB + plugin overhead
- Production example: 4-worker instance with 500 routes and 12 plugins = 6.2GB RAM (documentation suggests 4GB - insufficient)
- Memory scaling factors: Worker count × Entity count × Connection pools
Critical Configuration Requirements
Core Performance Settings (kong.conf)
```
# Worker and connection settings
nginx_worker_processes = 8               # 2x CPU cores (not 1x default)
nginx_events_worker_connections = 4096   # 4x default for burst traffic
upstream_keepalive_pool_size = 512       # Connections kept alive per upstream pool
upstream_keepalive_max_requests = 1024   # Requests per keepalive connection before recycling

# Memory settings
mem_cache_size = 8192m                   # 8GB cache (not 128MB default)
lua_shared_dict_size = 512m              # Increase from 5m default

# Database connections (critical bottleneck)
pg_max_concurrent_queries = 50           # Per worker
pg_keepalive_timeout = 600000            # 10 minutes

# Performance logging
proxy_access_log = /dev/stdout           # Avoid disk I/O
admin_access_log = off                   # Disable admin logging
proxy_error_log = /dev/stderr            # Keep errors on stderr
log_level = warn                         # Warnings and errors only
```
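To apply these settings without bouncing the gateway, validate the file and hot-reload the workers. A minimal sketch, assuming a package install with the config at /etc/kong/kong.conf:

```
# Validate and hot-reload the kong.conf changes above (paths are assumptions)
set -euo pipefail
KONG_CONF=/etc/kong/kong.conf

kong check "$KONG_CONF"        # syntax/validity check before touching the running node
kong reload -c "$KONG_CONF"    # graceful worker reload (SIGHUP); in-flight requests drain

# Per-environment overrides also work via KONG_* variables, e.g.:
# KONG_MEM_CACHE_SIZE=8192m kong start -c "$KONG_CONF"
```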
PostgreSQL Configuration Requirements
```
max_connections = 200          # 2x Kong instances × workers × pool
shared_buffers = 2GB           # 25% of available RAM
effective_cache_size = 6GB     # 75% of available RAM
work_mem = 64MB                # For rate limiting queries
random_page_cost = 1.1         # SSD optimization
```
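These can be applied with ALTER SYSTEM instead of hand-editing postgresql.conf; a sketch with placeholder connection details:

```
PSQL="psql -h db.internal -U postgres -d kong"   # placeholder connection string

$PSQL -c "ALTER SYSTEM SET max_connections = 200;"
$PSQL -c "ALTER SYSTEM SET shared_buffers = '2GB';"
$PSQL -c "ALTER SYSTEM SET effective_cache_size = '6GB';"
$PSQL -c "ALTER SYSTEM SET work_mem = '64MB';"
$PSQL -c "ALTER SYSTEM SET random_page_cost = 1.1;"

$PSQL -c "SELECT pg_reload_conf();"   # picks up the reloadable settings
# max_connections and shared_buffers still require a full PostgreSQL restart
```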
Plugin Performance Impact
Latency Cost Per Plugin
- Rate Limiting (database): +3-5ms, +200MB memory per worker
- Rate Limiting (Redis): +1-2ms, +100MB memory per worker
- JWT Validation: +0.5-1ms, +50MB memory per worker
- OAuth 2.0: +5-8ms, +300MB memory per worker
- Prometheus metrics: +2ms, +500MB memory (memory hog)
- Request Transformer: +1-3ms depending on complexity
- CORS: +0.5ms (efficient)
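To check these costs against your own stack, Kong's X-Kong-Proxy-Latency response header reports time spent inside the gateway (plugins plus routing). A rough sampling sketch; the proxy URL and plugin ID are placeholders:

```
URL=http://kong.internal:8000/api/orders   # placeholder proxy endpoint

# Average gateway-side latency over 50 requests
for i in $(seq 1 50); do
  curl -s -o /dev/null -D - "$URL" | grep -i 'X-Kong-Proxy-Latency'
done | awk -F': ' '{sum+=$2; n++} END {print "avg proxy latency:", sum/n, "ms"}'

# Disable the plugin under test, rerun, and diff the averages:
# curl -s -X PATCH http://localhost:8001/plugins/<plugin-id> --data enabled=false
```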
Optimal Plugin Execution Order
1. Request Termination (fail fast)
2. Rate Limiting (block early)
3. Authentication (JWT > Key Auth > OAuth)
4. Authorization (RBAC, ACL)
5. Request Validation
6. Request Transformation
7. Proxy/Upstream plugins
8. Response Transformation
9. Logging/Analytics (last - avoid processing rejected requests)
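Kong runs plugins in a fixed priority order, so the practical lever is which plugins are enabled and at what scope. A quick audit sketch using the Admin API (localhost:8001 and jq are assumptions):

```
# List every configured plugin with its scope so expensive plugins aren't
# running globally when a route-level attachment would do
curl -s http://localhost:8001/plugins \
  | jq -r '.data[] | [.name, (.route.id // "global"), (.service.id // "-"), (.enabled|tostring)] | @tsv' \
  | sort
```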
Critical Failure Points
Database Connection Exhaustion
- Problem: Default PostgreSQL 100 connections vs 4 Kong workers × 30 connections = 120 needed
- Symptoms: 502 errors under load, connection timeouts
- Solution: Increase PostgreSQL connections + PgBouncer connection pooling
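A minimal PgBouncer sketch for fronting the Kong database, written as a heredoc for illustration; hostnames, credentials, and pool sizes are placeholders, and pool_mode should be verified against your Kong version before relying on transaction pooling:

```
cat > /etc/pgbouncer/pgbouncer.ini <<'EOF'
[databases]
kong = host=db.internal port=5432 dbname=kong

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction        ; most aggressive reuse - test with your Kong version
default_pool_size = 50         ; per database/user pair
max_client_conn = 500          ; all Kong workers combined
EOF

# Then point Kong at PgBouncer instead of PostgreSQL directly:
#   pg_host = pgbouncer.internal
#   pg_port = 6432
```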
Memory Cache Undersizing
- Problem: Default 128MB cache insufficient for production entity counts
- Impact: Constant database hits for configuration data
- Solution: Set mem_cache_size to 50% of available RAM
Plugin Processing Order
- Problem: Expensive plugins (logging) running before cheap rejection (auth)
- Impact: CPU waste processing requests that should be rejected early
- Solution: Reorder plugins to fail fast
Connection Pool Saturation
- Problem: Default upstream keepalive (1000) × number of upstreams = excessive connections
- Impact: Socket exhaustion, connection thrashing
- Solution: Tune upstream_keepalive_pool_size and upstream_keepalive_max_requests
Monitoring Critical Metrics
Performance Indicators
- kong_request_duration_ms{quantile="0.95"} > 100ms = Kong bottleneck
- kong_upstream_target_health < 1 = upstream failures
- kong_memory_lua_shared_dict_bytes > 80% = memory pressure
- kong_datastore_reachable = 0 = database connectivity issues
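These indicators can be spot-checked straight from the Prometheus plugin output; the endpoint location varies by Kong version (Status API :8100/metrics on newer releases, Admin API :8001/metrics on older ones), so the URL below is an assumption:

```
METRICS=http://localhost:8001/metrics   # may be :8100/metrics in your deployment

curl -s "$METRICS" | grep -E 'kong_upstream_target_health|kong_datastore_reachable'
curl -s "$METRICS" | grep 'kong_memory_lua_shared_dict_bytes'
```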
Debug Commands
- CPU analysis: `top -H -p "$(pgrep -d, nginx)"` shows per-worker thread CPU usage
- Connection monitoring: `ss -tnp | grep nginx | wc -l` counts established connections (`ss -tulpn` lists the listeners)
- Request timing: set `log_level = debug` in kong.conf (or `KONG_LOG_LEVEL=debug`) and reload for a per-request latency breakdown
Scaling Decision Matrix
Scenario | Max RPS | P95 Latency | Memory/Instance | Database Load | Scaling Action |
---|---|---|---|---|---|
Single optimized instance | 25,000 | 20-50ms | 8-12GB | Low | Vertical scaling limit |
Multi-instance cluster | 100,000+ | 10-30ms | 6-8GB each | Very Low | Horizontal scaling |
Database bottleneck | Variable | >100ms | Normal | High | Redis for stateful plugins |
Plugin saturation | <15,000 | >200ms | High | Variable | Disable non-essential plugins |
Redis Migration Requirements
Critical for Production Scale
- Database rate limiting: Becomes bottleneck at >5,000 RPS
- Redis migration: Reduces latency from 5ms to 1ms per rate limit check
- Configuration: Redis cluster required for multi-instance Kong deployments
Redis Settings
```
redis:
  timeout: 2000              # Fail fast
  keepalive_pool_size: 100   # Per Kong worker
  keepalive_backlog: 50      # Connection queue
```
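A sketch for switching the rate-limiting plugin onto Redis through the Admin API; the service name and Redis host are placeholders, and the flat config.redis_* fields are the long-standing form (newer releases also accept a nested redis object - check your version):

```
curl -s -X POST http://localhost:8001/services/orders-api/plugins \
  --data "name=rate-limiting" \
  --data "config.minute=1000" \
  --data "config.policy=redis" \
  --data "config.redis_host=redis.internal" \
  --data "config.redis_port=6379" \
  --data "config.redis_timeout=2000"
```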
Common Failure Scenarios
502 Bad Gateway Causes
- Backend connection limits exceeded - Check app server connection limits vs Kong upstream connections
- DNS resolution failures - Use IP addresses or fix /etc/resolv.conf
- Network timeouts - Reduce timeout settings, check network stability
- Health check failures - Verify health check endpoints and intervals
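Two first checks when 502s spike, sketched with placeholder upstream and backend names:

```
# Per-target health as Kong sees it (HEALTHY / UNHEALTHY / DNS_ERROR)
curl -s http://localhost:8001/upstreams/orders-backend/health | jq .

# Bypass Kong and time the backend directly to separate gateway from upstream issues
curl -s -o /dev/null \
  -w 'dns=%{time_namelookup}s connect=%{time_connect}s total=%{time_total}s\n' \
  http://backend.internal:8080/healthz
```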
Performance Degradation Over Time
- Connection pool growth - Monitor established connections with `ss -tnp | grep nginx | wc -l`
- Plugin state accumulation - OAuth tokens, rate limit counters growing
- Log file growth - Rotate logs aggressively
- Lua garbage collection - Enable GC statistics monitoring
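Shared-dict and per-worker Lua memory can be sampled from the node status endpoint to catch slow growth; a sketch assuming Kong 2.x+ and jq:

```
# Poll memory usage every minute; rising http_allocated_gc or a shared dict near
# capacity points at plugin state accumulation or GC pressure
watch -n 60 '
  curl -s http://localhost:8001/status \
    | jq "{shared_dicts: .memory.lua_shared_dicts, workers: [.memory.workers_lua_vms[].http_allocated_gc]}"
'
```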
Scaling Bottlenecks at Load Increases
- 5,000 RPS threshold: Worker saturation, database pool exhaustion
- Solution order: Scale workers → database connections → optimize plugins → add instances
Resource Requirements by Scale
Small Production (5,000-15,000 RPS)
- Kong instances: 1-2
- CPU cores: 4-8 per instance
- Memory: 8-12GB per instance
- Database: PostgreSQL with 200 connections
- Redis: Single instance for rate limiting
Medium Production (15,000-50,000 RPS)
- Kong instances: 3-5
- CPU cores: 8-16 per instance
- Memory: 8-16GB per instance
- Database: PostgreSQL with PgBouncer connection pooling
- Redis: Redis cluster for high availability
Large Production (50,000+ RPS)
- Kong instances: 5+ (horizontal scaling)
- CPU cores: 8-16 per instance
- Memory: 12-24GB per instance
- Database: PostgreSQL cluster with dedicated connection pooling
- Redis: Redis cluster with sharding
Time Investment Requirements
Initial Setup Optimization
- Configuration tuning: 4-8 hours
- Load testing validation: 8-16 hours
- Monitoring setup: 4-8 hours
- Documentation: 2-4 hours
Ongoing Performance Management
- Performance monitoring: 2-4 hours/week
- Capacity planning: 4-8 hours/quarter
- Configuration updates: 1-2 hours/month
- Incident response: 2-8 hours/incident (depending on complexity)
Prerequisites and Dependencies
Technical Requirements
- PostgreSQL expertise: Database tuning, connection pooling
- Redis knowledge: Clustering, memory management
- Load testing tools: K6, Apache Bench, or equivalent
- Monitoring stack: Prometheus, Grafana for metrics collection
Infrastructure Requirements
- Network latency: <5ms between Kong and upstreams for optimal performance
- Storage: SSD storage for database and logs
- DNS: Fast, reliable DNS resolution (use IP addresses when possible)
- Load balancer: Layer 4 load balancer in front of Kong instances
Implementation Checklist
Pre-Production Validation
- Load test with realistic traffic patterns (burst + sustained) - a starter sketch follows this checklist
- Verify database connection pool usage <80%
- Confirm upstream connection pooling working
- Test plugin disable/enable procedures
- Validate health check intervals don't overwhelm backends
- Set up P95 latency monitoring
- Document scaling procedures
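A rough sketch of the first two checklist items using Apache Bench (listed under load-testing tools above); endpoint, concurrency, and database details are assumptions:

```
# Sustained keep-alive load: 200 concurrent clients, 100k requests
ab -k -c 200 -n 100000 http://kong.internal:8000/api/orders

# While the test runs, confirm connection usage stays under 80% of max_connections
psql -h db.internal -U postgres -d kong -c \
  "SELECT count(*) AS in_use, current_setting('max_connections') AS max FROM pg_stat_activity;"
```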
Post-Deployment Monitoring
- P95 request latency trending
- Database connection pool utilization
- Memory usage per Kong worker
- Upstream connection counts
- Plugin processing times
- Error rate by plugin and endpoint
Useful Links for Further Investigation
Kong Performance Resources (Actually Helpful Links)
Link | Description |
---|---|
Kong Gateway Performance Benchmarks | Official benchmark results that show ideal-case performance. Useful for understanding theoretical limits, but expect 30-50% lower performance in production with real backends and network latency. |
Kong Production Deployment Guide | Comprehensive guide covering deployment topologies, security, and basic performance considerations. Good starting point but lacks specific tuning recommendations for high-traffic scenarios. |
Kong Gateway Sizing Guidelines | Resource allocation recommendations based on expected load. The memory estimates are conservative - plan for 50-100% more RAM in production. |
Kong Configuration Reference | Complete reference for all kong.conf parameters. Essential for understanding performance-related settings but lacks context on which ones actually matter. |
PostgreSQL Performance Tuning for Kong | Basic database optimization guide. Covers connection pooling and basic PostgreSQL tuning but doesn't go deep enough for high-traffic deployments. |
PgBouncer Configuration Guide | Essential for Kong deployments beyond 5-10 instances. Connection pooling becomes mandatory when you have multiple Kong instances hitting the same database. |
Redis for Kong Plugins | Documentation for Redis-backed rate limiting. Critical for production deployments - database-backed rate limiting doesn't scale beyond moderate traffic levels. |
Redis Cluster Setup Guide | If you're using Redis for multiple stateful Kong plugins, clustering becomes necessary for high availability and performance at scale. |
Kong Prometheus Plugin | Essential for performance monitoring. The plugin exposes the right metrics but be careful - it uses significant memory at scale. Monitor the monitor. |
Kong Logging Best Practices | Structured logging configuration that doesn't destroy performance. File-based logging can become a bottleneck faster than you'd expect. |
Kong Debug Mode Setup | How to enable request tracing for debugging performance issues. Use sparingly - debug logging impacts performance significantly. |
Grafana Dashboards for Kong | Community-maintained dashboard with the right performance metrics. Focus on P95 latency, upstream health, and resource utilization. |
Kong Upstream Health Checks | Configuration guide for upstream health monitoring. Default settings can overwhelm backends - tune intervals based on your infrastructure. |
Kong Load Balancing Guide | Explains Kong's load balancing algorithms and their performance characteristics. Ring-hash is good for caching workloads, weighted-round-robin for general use. |
Kong SSL/TLS Configuration | SSL termination performance optimization. TLS handshakes are CPU-intensive - proper configuration and session caching matter for high-traffic sites. |
Kong Kubernetes Ingress Controller | The right way to run Kong in Kubernetes. Performance considerations are different in container environments - resource limits and networking matter more. |
Kong Helm Chart Configuration | Helm charts with production-ready defaults. The default resource requests are too small for production - plan to override them. |
Kong Docker Images | Official Docker images. Use the specific version tags in production, not "latest". The alpine images are smaller but the standard images have better debugging tools. |
Kong Plugin Development Guide | How to write custom plugins that don't destroy performance. Plugin inefficiency is a common cause of Kong performance problems. |
Kong Plugin Priority and Execution Order | Understanding plugin execution order is critical for performance. Expensive plugins should run after cheap authentication/authorization checks. |
Kong Lua Performance Best Practices | Guidelines for writing efficient Lua code in Kong plugins. Memory management and connection pooling patterns that actually work. |
Kong Community Forum | Active community with real production experiences. Search for performance-related topics - lots of practical advice from people who've debugged similar issues. |
Kong Performance on Stack Overflow | Performance-specific questions and answers. Good source of real-world debugging scenarios and solutions. |
Kong Engineering Blog | Technical deep dives from Kong's engineering team. The performance-related posts cover advanced optimization techniques not documented elsewhere. |
Kong Terraform Provider | Infrastructure-as-code for Kong configuration. Useful for managing performance settings consistently across environments. |
K6 Load Testing Scripts for Kong | Load testing tool with good Kong integration. Essential for validating performance optimizations before production deployment. |
Kong Performance Monitoring with Datadog | APM integration that provides deep visibility into Kong performance bottlenecks. Paid tool but worth it for complex deployments. |
Kong Support Knowledge Base | Enterprise customer support articles. Many performance-related troubleshooting guides that aren't publicly documented elsewhere. |
Kong GitHub Issues | Real bug reports and performance issues from the community. Search for performance-related keywords to find solutions to specific problems. |
Kong Admin API Reference | Admin API documentation for performance monitoring and troubleshooting. Covers the API endpoints most commonly used when debugging. |
High Performance Browser Networking | Not Kong-specific but essential background for understanding API gateway performance. Network optimization principles that apply to Kong deployments. |
Kong Production Deployment Topologies | Architecture patterns and deployment strategies for Kong infrastructure scaling. How to plan topology and growth requirements. |
Kong Load Testing Examples | Official performance testing fixtures and examples. Use these as templates for testing your own Kong configurations. |
Kong Stress Testing Documentation | Helper utilities and examples for stress testing Kong deployments. Good for understanding Kong's performance characteristics in context. |