What Kong Gateway Actually Is (And Why Your Backend Engineers Don't Hate It)

Kong Gateway is an open-source API gateway built on OpenResty/NGINX that actually works in production. Version 3.9.1, released in June 2025, includes serious AI capabilities, incremental configuration sync, and performance improvements that don't break existing setups.

The Reality Check

Kong's networking makes me want to throw my laptop when debugging hybrid mode, but it's the least terrible option for managing APIs at scale. Major enterprises use it for critical infrastructure - if it can handle enterprise traffic, it can probably handle your startup's 12 API calls per minute.

Architecture That Actually Makes Sense

Kong splits into data plane and control plane nodes in hybrid mode. The control plane handles configuration, the data plane processes traffic. This isn't marketing bullshit - it means you can configure policies centrally while keeping traffic processing distributed.

The data plane runs on port 8000 (proxy), the control plane on port 8001 (admin API). In hybrid deployments, data planes connect to control planes via mTLS on port 8005. The control plane config sync will fuck you over during rolling updates if you don't sequence them properly - ask me how I know.
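That port layout maps directly to kong.conf. A minimal sketch - the values shown are the defaults described above, not recommendations:

```ini
# kong.conf sketch: the default port layout
proxy_listen = 0.0.0.0:8000      # data plane: client traffic
admin_listen = 127.0.0.1:8001    # admin API - keep it off public interfaces
# hybrid mode only: data planes connect to this port over mTLS
cluster_listen = 0.0.0.0:8005
```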

Production Reality

Here's what actually matters: Kong can handle 50,000+ requests per second on decent hardware. That's not theoretical - it's measured with K6 load testing against real deployments.

The catch? Performance degrades with plugin complexity. Rate limiting adds ~2ms latency. OAuth validation adds ~5ms. JWT validation is surprisingly fast at ~1ms. The AI Gateway plugins (semantic caching, prompt guards) add 10-50ms depending on your LLM roundtrip.

The Plugin Ecosystem Problem

Kong has 300+ plugins. Half are useful, a quarter are enterprise-only, and a quarter make you ask "why does this exist?" The good ones:

  • Rate Limiting: Actually works, unlike AWS API Gateway's version
  • OAuth 2.0: Proper implementation, not a toy
  • JWT: Fast, handles rotation correctly
  • Request/Response Transformer: Saves you from writing middleware

The problematic ones:

  • CORS: Overly complex for what should be simple
  • Prometheus: Resource-heavy, consider alternatives
  • File Log: Will fill your disk, guarantee it

AI Gateway Capabilities (The New Hotness)

Kong AI Gateway launched in February 2024 and version 3.10 added automated RAG pipelines and PII sanitization. Version 3.11 includes prompt compression that reduces token costs by up to 5x.

Supports OpenAI, Azure OpenAI, AWS Bedrock, Anthropic Claude, Google Gemini, and Cohere. Semantic caching prevents redundant LLM calls - but it can also burn you when cache keys don't invalidate properly. Found that out at 2am when a routine config update took down half our API endpoints.

Database Requirements (Critical Decision)

Kong supports PostgreSQL for its database layer. Cassandra was supported in older releases but was removed in Kong Gateway 3.0.

PostgreSQL: Always PostgreSQL. Cassandra sounds cool but the operational overhead isn't worth it unless you're Netflix-scale. Kong 3.x requires PostgreSQL 12+ and the migration scripts actually work.

DB-less mode: Use declarative YAML configuration instead of a database. Perfect for containerized deployments but with limited plugin compatibility. The OAuth 2.0 plugin doesn't work at all (it needs to persist tokens), and rate limiting only works with local or Redis counters - the cluster policy needs a database.
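A minimal declarative config sketch - the service name and upstream URL are placeholders. Start Kong with KONG_DATABASE=off and KONG_DECLARATIVE_CONFIG pointing at the file:

```yaml
# kong.yml sketch for DB-less mode (names and URL are hypothetical)
_format_version: "3.0"
services:
  - name: users-api
    url: http://users.internal:8080
    routes:
      - name: users-route
        paths:
          - /api/v1/users
```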

I've run both in production. PostgreSQL is boring and reliable. Cassandra was exciting until the 3am pages about split-brain scenarios. For cloud deployments, check the database requirements and supported versions in the compatibility matrix.

Kong Gateway vs The Competition (Honest Assessment)

| Feature | Kong Gateway | AWS API Gateway | NGINX Plus | Azure APIM | Envoy Proxy |
| --- | --- | --- | --- | --- | --- |
| Performance | 50k+ RPS | 10k RPS limit | 40k+ RPS | 20k+ RPS | 60k+ RPS |
| Cost (Monthly) | Free (OSS); $25+ (Konnect) | $3.50/1M requests | $2.5k/instance | $0.035/1k calls | Free (OSS) |
| Setup Time | 30 minutes | 5 minutes | 2 hours | 15 minutes | 4+ hours |
| Plugin Ecosystem | 300+ plugins | AWS-specific only | Limited modules | Built-in policies | Complex extensions |
| Learning Curve | Moderate | Easy | Steep | Moderate | Very steep |
| Production Ready | āœ… Battle-tested | āœ… Managed service | āœ… Enterprise grade | āœ… Microsoft backed | ā“ DIY complexity |
| Multi-Cloud | āœ… Anywhere | āŒ AWS only | āœ… Anywhere | ā“ Azure preferred | āœ… Anywhere |
| AI Gateway | āœ… Native support | āŒ None | āŒ None | āŒ Limited | āŒ None |
| Database Required | Optional | No | No | No | Optional |
| Container Support | āœ… Excellent | āœ… Native | āœ… Good | āœ… Native | āœ… Native |
| Documentation | Good | Excellent | Poor | Good | Terrible |
| When Things Break | Stack Overflow | AWS Support | Pray to the NGINX gods | Microsoft Support | GitHub issues |

Kong Gateway Pricing, Deployment, and Why DevOps Teams Actually Use It

The Pricing Minefield

Kong's pricing structure is like every other enterprise software's - starts reasonable, gets expensive fast. The Konnect Plus managed service offers different tiers. Here's the brutal breakdown:

Kong Gateway (Open Source): Free, but you're on your own for support, monitoring, and advanced features. Good luck explaining to your CTO why the API gateway went down and you're debugging Lua code at 3am.

Kong Konnect Plus: Starts at $25/month for serverless control plane. Sounds cheap until you add:

  • Hybrid control planes: $200/month each
  • Additional developer portals: $200/month each
  • Advanced analytics: Extra cost
  • Support: Extra cost

Real example: One team I consulted for had their admin API exposed for 6 months (don't ask). Someone from Ukraine added cryptocurrency mining services through it, and the team ate $50k in fraudulent charges before AWS shut down the instances. Their Kong Gateway? Still running fine, completely unaware of the chaos around it.

Deployment Options That Don't Suck

Kong runs beautifully in containers. The official Docker images are maintained, small (~200MB), and don't include surprise dependencies. Check the Docker installation guide for different deployment options.

apiVersion: v1
kind: Service
metadata:
  name: kong-gateway
spec:
  ports:
  - port: 8000
    name: proxy
  - port: 8001
    name: admin
  selector:
    app: kong

The Kong Ingress Controller is the right way to run Kong in Kubernetes. It syncs Kubernetes Ingress resources to Kong configuration automatically. When it works, it's magical. When it breaks, you're debugging custom resource definitions at 2am. The Gateway API support is more robust than Ingress.
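A sketch of what the controller consumes - a standard Ingress with Kong-specific behavior layered on via annotations. The backend service name and port are placeholders:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: users-api
  annotations:
    konghq.com/strip-path: "true"   # Kong behavior configured via annotations
spec:
  ingressClassName: kong
  rules:
    - http:
        paths:
          - path: /api/v1/users
            pathType: Prefix
            backend:
              service:
                name: users-api      # hypothetical backend service
                port:
                  number: 8080
```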

Traditional Deployment (Still Works)

Install via package managers on Ubuntu, CentOS, or Amazon Linux. The packages handle systemd service files, log rotation, and upgrades without breaking your configuration.

Ubuntu: apt install kong
CentOS: yum install kong
Amazon Linux: yum install kong

Hybrid Mode (Enterprise-Grade)

Split control plane and data plane across regions. Control planes handle configuration, data planes handle traffic. Solves the "how do we manage 50 Kong instances across 3 continents" problem.

Config sync will bite you hard during rolling updates if you don't sequence them properly. Deploy control planes first, wait for sync, then update data planes. Learn from my 4 hours of downtime and 47 Slack messages from angry engineers.
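The role split lives in kong.conf. A sketch with placeholder hostnames and cert paths - upgrade the nodes running role = control_plane first, then the data planes:

```ini
# control plane node - upgrade these first
role = control_plane
cluster_cert = /etc/kong/cluster.crt
cluster_cert_key = /etc/kong/cluster.key

# data plane node - upgrade only after control planes are healthy
role = data_plane
cluster_control_plane = kong-cp.internal:8005   # placeholder hostname
cluster_cert = /etc/kong/cluster.crt
cluster_cert_key = /etc/kong/cluster.key
```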

Community and Ecosystem Reality

Kong has the most mature plugin ecosystem in the API gateway space. The Plugin Hub isn't just marketing fluff - these plugins solve actual production problems:

Rate Limiting: Handles burst traffic properly, unlike naive implementations that leak requests
OAuth 2.0: Proper PKCE support, token introspection, all the enterprise OAuth flows
Request Transformer: Modify headers/bodies without writing middleware
Prometheus: Export metrics that actually matter for monitoring

The community fills gaps that documentation misses. Stack Overflow has 5k+ Kong questions with decent answers. The official Kong Nation forum is where Kong employees actually respond to bug reports.

Performance Benchmarks (Real Numbers)

Kong's official benchmarks show impressive numbers, but here's production reality:

Baseline Performance: 50k+ requests/second on 8 CPU cores
With Rate Limiting: ~45k requests/second (10% overhead)
With JWT Validation: ~48k requests/second (4% overhead)
With OAuth Plugin: ~35k requests/second (30% overhead)

These numbers are from dedicated Kong instances. If you're running Kong alongside other services, expect 20-40% performance degradation due to resource contention.

Why DevOps Teams Choose Kong

Three reasons Kong wins in enterprise environments:

  1. Predictable Behavior: Kong doesn't surprise you. Configuration changes apply immediately, rollbacks work, and error messages actually help debug problems.

  2. Operations Tooling: Proper health checks, metrics endpoints, structured logging. The admin API lets you build automation without screen scraping.

  3. Escape Hatches: When AWS API Gateway limits hit, when Azure APIM pricing explodes, when NGINX configuration becomes unmaintainable - Kong provides a migration path that doesn't require rewriting applications.

The downside? You own the operational complexity. Kong won't auto-scale, auto-patch, or auto-recover from failures. You need monitoring, alerting, backup/restore procedures, and engineers who understand how reverse proxies work.

The Honest Assessment

Kong is the Swiss Army knife of API gateways. Powerful, flexible, and sharp enough to cut yourself if you're not careful. Perfect for teams that want control and have the expertise to manage infrastructure properly.

Skip Kong if you want a fully managed service or don't have dedicated platform engineering resources. The operational overhead is real, and the learning curve is steeper than vendor marketing suggests.

Choose Kong if you need to solve complex API management problems and don't mind debugging proxy configurations when things go sideways. The plugin ecosystem and community support make the operational complexity worth it for most enterprise deployments.

Kong Gateway FAQ (The Questions You're Actually Asking)

Q: Why is Kong eating all my CPU during startup?

A: Kong compiles Lua code on startup, which is CPU-intensive. On container deployments, increase CPU limits during startup or use init containers to warm up the cache. The CPU usage drops to normal levels after ~30 seconds. Also check your plugin configuration - some plugins (like Prometheus) scan all routes on startup. If you have 1000+ routes, this takes forever.
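In Kubernetes, a startupProbe keeps the pod from being killed or marked unready during that compilation window. A sketch, assuming the status API is enabled on port 8100 via status_listen:

```yaml
# Deployment fragment (sketch): CPU headroom plus a generous startup probe
containers:
  - name: kong
    resources:
      requests:
        cpu: "1"
      limits:
        cpu: "2"              # headroom for Lua compilation at boot
    startupProbe:
      httpGet:
        path: /status
        port: 8100            # assumes status_listen = 0.0.0.0:8100
      failureThreshold: 30
      periodSeconds: 2
```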
Q: Can Kong handle WebSocket connections without falling over?

A: Yes, but there are gotchas. WebSocket connections bypass most plugins (rate limiting, auth, etc.) once upgraded. You need to authenticate WebSocket connections at the HTTP upgrade stage, not after. Set upstream_keepalive to handle connection pooling properly. Default settings will create new upstream connections for every WebSocket, which exhausts connection pools fast.
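The relevant kong.conf knobs look like this - the values are illustrative, so tune them against your actual connection counts:

```ini
# kong.conf sketch: upstream connection pooling for WebSocket-heavy traffic
upstream_keepalive_pool_size = 1024      # connections kept alive per pool
upstream_keepalive_max_requests = 10000  # recycle a connection after this many requests
upstream_keepalive_idle_timeout = 60     # seconds before an idle connection closes
```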

Q: How do I debug "no route matched" errors that make no sense?

A: Kong route matching is strict about path prefixes. /api/v1/users doesn't match /api/v1/users/ (trailing slash). Use regex patterns or wildcard paths to handle both. Check route priority if multiple routes could match. Kong processes routes by creation order, not specificity, so create more specific routes first. Most importantly: enable request debugging in logs temporarily to see exactly what Kong is matching against.
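For the trailing-slash case, a regex path handles both forms. In Kong 3.x, regex paths must start with ~ - a decK-style sketch with a placeholder route name:

```yaml
routes:
  - name: users-route
    paths:
      - "~/api/v1/users/?$"   # matches with and without the trailing slash
    strip_path: false
```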

Q: Why does Kong's admin API return 404 for endpoints that definitely exist?

A: Two common causes:

  1. Wrong admin port: The admin API runs on 8001 by default, the proxy on 8000. Don't confuse them.
  2. Hybrid mode confusion: In hybrid deployments, the admin API only works on control plane nodes. Data plane nodes reject admin API calls with 404.

If running in Kubernetes, make sure you're hitting the right service. The admin API service is usually separate from the proxy service.

Q: Kong keeps restarting in Docker - what's broken?

A: Check PostgreSQL connectivity first. Kong fails hard if it can't reach the database. Common issues:

  • Wrong database host (use service names in Docker Compose)
  • Database not ready when Kong starts (use depends_on with health checks)
  • Network policies blocking database access

Use the kong health command in a debug container to test connectivity before troubleshooting Kong configuration.
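A Docker Compose sketch of the "wait for the database" fix - the image tags and credentials are placeholders:

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_DB: kong
      POSTGRES_USER: kong
      POSTGRES_PASSWORD: kong          # example only - use secrets in production
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U kong"]
      interval: 5s
      retries: 10
  kong:
    image: kong:3.9
    environment:
      KONG_DATABASE: postgres
      KONG_PG_HOST: db                 # the Compose service name, not localhost
      KONG_PG_PASSWORD: kong
    depends_on:
      db:
        condition: service_healthy
```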
Q: How do I migrate from version X to Y without breaking everything?

A: Kong provides migration commands but they're not magic. Steps that actually work:

  1. Backup the database first: pg_dump for PostgreSQL, nodetool snapshot for Cassandra
  2. Test the migration on staging: run kong migrations up on a database copy
  3. Read the release notes: breaking changes are documented (usually)
  4. Plan a rollback: Kong migrations aren't always reversible

For zero-downtime upgrades, use blue-green deployment with separate Kong instances. Migrate one environment, test, then cut over traffic.

Q: Rate limiting isn't working - requests are getting through

A: Kong rate limiting is eventually consistent with the PostgreSQL backend. Rapid bursts can exceed limits before the counter updates. Use Redis for strict rate limiting or accept that bursts happen. Also check the rate limiting scope - global, consumer, service, or route. The wrong scope means rate limiting applies differently than expected. For APIs with unpredictable traffic, combine Kong rate limiting with upstream circuit breakers.
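A sketch of the Redis-backed setup - the hostname is a placeholder, and the Redis field names have shifted slightly across Kong versions, so check your version's plugin schema:

```yaml
plugins:
  - name: rate-limiting
    config:
      minute: 100
      policy: redis               # shared counters instead of eventually consistent DB counters
      redis_host: redis.internal  # placeholder
      redis_port: 6379
```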
Q: Can I run Kong without a database (DB-less mode)?

A: Yes, for simple use cases. DB-less mode uses declarative YAML configuration instead of database storage. Limitations:

  • Rate limiting only with local or Redis counters (the cluster policy needs persistent storage)
  • No OAuth plugins (they need to store tokens)
  • Configuration reloads require a restart or a push of the full declarative config
  • No admin API writes (read-only)

Perfect for microservices where each service has its own Kong instance. Terrible for shared API gateways serving multiple teams.
Q: Kong is using way too much memory - how do I fix it?

A: Kong memory usage scales with:

  • Number of active connections
  • Plugin complexity (especially Lua plugins)
  • Route/service count
  • Upstream connection pools

Quick fixes:

  • Reduce worker_connections (default 1024 per worker)
  • Lower upstream_keepalive pool sizes
  • Disable unnecessary plugins
  • Use connection limits on upstream services

If running in Kubernetes, set memory requests/limits based on actual usage patterns, not guesswork.

Q: How do I handle SSL/TLS certificates without pulling my hair out?

A: Kong supports multiple certificate sources:

  • File-based: mount certificates as volumes
  • Database: store certificates in PostgreSQL (not recommended)
  • Let's Encrypt: use the ACME plugin for automatic certificates
  • External: terminate SSL at the load balancer level

For production, use the Let's Encrypt ACME plugin or external certificate management. Don't store certificates in the database - backup/restore becomes a nightmare.
Q: Kong returns 502 errors randomly - what's happening?

A: 502 errors mean upstream connection failures. Common causes:

  1. Upstream timeouts: the default 60s might be too short for slow APIs
  2. Connection pool exhaustion: too many concurrent requests, not enough upstream connections
  3. Network issues: DNS resolution, routing problems, firewall rules
  4. Upstream health: backend services failing without Kong knowing

Enable upstream health checks to detect failing backends automatically. Set appropriate timeout values for your upstream APIs - don't use the defaults blindly.
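A decK-style sketch combining both fixes - explicit timeouts on the service plus active health checks on the upstream. Names, target address, and health path are placeholders:

```yaml
services:
  - name: users-api
    host: users-upstream
    connect_timeout: 5000    # milliseconds - fail fast instead of waiting
    read_timeout: 30000
    write_timeout: 30000
upstreams:
  - name: users-upstream
    targets:
      - target: 10.0.0.12:8080
    healthchecks:
      active:
        http_path: /healthz
        healthy:
          interval: 10
          successes: 2
        unhealthy:
          interval: 10
          http_failures: 3
```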
Q: Is Kong Enterprise worth the money?

A: Depends on your team and use case. Kong Enterprise adds:

  • Commercial support (actually helpful)
  • Advanced plugins (RBAC, OIDC, GraphQL)
  • Kong Manager UI (beats API-only configuration)
  • 24/7 support (saves you from 3am debugging)

If you're running Kong in production with multiple teams, the support alone justifies the cost. If you're a startup with one API, stick with open source until you need enterprise features. The real question: can your team manage Kong operations without vendor support? If no, buy Enterprise. If yes, save the money and contribute to the open source project instead.

Q: How do I monitor Kong properly?

A: Essential metrics to track:

  • Request rate: requests per second by route
  • Error rate: 4xx/5xx responses by service
  • Latency: P50, P95, P99 response times
  • Upstream health: backend availability
  • Resource usage: CPU, memory, connections

Use the Prometheus plugin for metrics collection and Grafana for visualization. Set up alerts for error rate spikes and latency increases. Don't monitor everything - focus on metrics that indicate user-facing problems.
