minikube is fucking useless for testing real production scenarios. Here's what actually matters when your clusters serve real traffic and downtime costs money.
SSL Certificate Management Hell

SSL certificates are where most people lose their minds with ingress controllers. NGINX Ingress Controller integrates with cert-manager to automate Let's Encrypt certificate provisioning, but you need to understand the certificate lifecycle or you'll be debugging expired certs at 2am.
The automatic certificate provisioning works through annotations on Ingress resources. cert-manager watches for cert-manager.io/cluster-issuer annotations, requests certificates from ACME providers, and stores them in Kubernetes secrets. NGINX Ingress Controller automatically picks up the certificates and configures SNI properly.
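Here's roughly what that wiring looks like end to end - the issuer name, hostname, and secret names below are placeholders, not anything you have to match:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod                     # placeholder issuer name
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com                   # placeholder contact for expiry notices
    privateKeySecretRef:
      name: letsencrypt-prod-account-key     # ACME account key lives here
    solvers:
    - http01:
        ingress:
          class: nginx
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod   # cert-manager watches for this
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - app.example.com                        # placeholder host
    secretName: app-example-com-tls          # cert-manager creates and renews this Secret
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: app
            port:
              number: 80
```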

But certificate rotation is where things break. Let's Encrypt certificates expire every 90 days, and cert-manager handles renewal automatically - until it doesn't. I've seen clusters where cert-manager failed to renew certificates because of DNS propagation issues, rate limiting, or ACME challenge failures. Nothing quite like getting paged at 3am because all your SSL certs expired and customers can't access the site.
The real gotcha is wildcard certificates. They require DNS-01 challenges instead of HTTP-01, which means cert-manager needs cloud DNS API credentials. Route53, CloudFlare, and Google DNS all work, but credential management becomes another failure point.
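A DNS-01 issuer looks something like this sketch using Route53 - the region, key ID, and secret names are placeholders, and in a real cluster you'd rather lean on IAM roles than static keys:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-wildcard                 # placeholder issuer name
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com                   # placeholder contact
    privateKeySecretRef:
      name: letsencrypt-wildcard-account-key
    solvers:
    - dns01:
        route53:
          region: us-east-1                  # placeholder region
          accessKeyID: AKIAEXAMPLE           # placeholder; IAM roles beat static keys
          secretAccessKeySecretRef:
            name: route53-credentials        # placeholder Secret holding the key
            key: secret-access-key
```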
For production, you want multiple certificate issuers configured as backups. If Let's Encrypt rate limits kick in (50 certificates per registered domain per week), having a secondary ACME provider saves you from emergency certificate purchasing.
Load Balancing and Service Discovery
NGINX Ingress Controller handles Kubernetes service discovery automatically, but understanding the internals prevents mysterious traffic routing issues. The controller watches Endpoints resources and maintains NGINX upstream blocks with current pod IPs.
When pods scale or restart, there's a brief window where NGINX might still route to dead pod IPs. The controller usually detects endpoint changes fast, but I've seen it take 30 seconds when the API server was having a bad day. The nginx.ingress.kubernetes.io/upstream-fail-timeout annotation helps by marking failed upstreams as down faster - when it works right.
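If you'd rather have NGINX retry another endpoint than hand a client a 502 during that window, the community controller's retry annotations are worth setting - a sketch with illustrative values:

```yaml
metadata:
  annotations:
    # retry the next endpoint on connection errors and upstream 5xx instead of failing the request
    nginx.ingress.kubernetes.io/proxy-next-upstream: "error timeout http_502 http_503 http_504"
    nginx.ingress.kubernetes.io/proxy-next-upstream-tries: "3"
    nginx.ingress.kubernetes.io/proxy-next-upstream-timeout: "5"   # seconds to spend retrying
```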
Session affinity is another production headache. The nginx.ingress.kubernetes.io/affinity annotation provides cookie-based session persistence (IP-hash style routing goes through the separate upstream-hash-by annotation), but both fight against Kubernetes' load balancing philosophy. If you need session affinity, your application architecture probably needs work.
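If you're stuck needing it anyway, cookie affinity is a handful of annotations on the Ingress - the cookie name and max-age below are just examples:

```yaml
metadata:
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/affinity-mode: "persistent"      # keep sticking even as the deployment scales
    nginx.ingress.kubernetes.io/session-cookie-name: "route"     # placeholder cookie name
    nginx.ingress.kubernetes.io/session-cookie-max-age: "3600"   # one hour
```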
Rate Limiting That Actually Protects You
NGINX's rate limiting is one of its killer features, but configuring it correctly requires understanding burst handling and distributed rate limiting. The community ingress controller provides basic rate limiting through annotations, but F5's version offers sophisticated controls through Policy resources.
Basic rate limiting with nginx.ingress.kubernetes.io/limit-rpm works for simple cases, but it applies per ingress controller replica, not globally across all ingress pods. Rate limits were completely fucked for weeks until I realized each replica enforces its own counter independently. Took me way too long to figure out why we were getting flooded with 12x the traffic we expected. Should've actually read the docs instead of assuming it worked like a sane person would expect.
F5's advanced rate limiting supports tiered limits based on JWT claims - premium users get higher limits than free users. This requires deep integration between your authentication system and ingress configuration, but it's incredibly powerful for SaaS applications.
The biggest rate limiting mistake is not configuring burst properly. Without burst handling, legitimate traffic spikes trigger rate limits even when average rates are acceptable. The burst setting allows temporary spikes above the base rate, then enforces the average over time.
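With the community annotations that combination looks roughly like this - and remember, these numbers apply per controller replica, not cluster-wide:

```yaml
metadata:
  annotations:
    nginx.ingress.kubernetes.io/limit-rps: "10"                # steady-state limit, per replica
    nginx.ingress.kubernetes.io/limit-burst-multiplier: "5"    # tolerate spikes up to roughly 50 req/s
    nginx.ingress.kubernetes.io/limit-whitelist: "10.0.0.0/8"  # placeholder CIDR to exempt internal traffic
```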
Security Beyond Basic TLS
NGINX Ingress Controller provides security features that go beyond basic SSL termination, but most people don't enable them properly. Security note: In March 2025, several critical vulnerabilities (CVE-2025-1974 and others) were disclosed affecting the community ingress-nginx controller. Make sure you're running patched versions and have proper network isolation. Web Application Firewall integration, geographic restrictions, and authentication delegation all work, but require careful configuration.
NGINX App Protect integration (F5 version only) provides WAF capabilities directly in the ingress controller. It can block SQL injection, XSS, and other OWASP Top 10 attacks at the ingress layer before they reach your applications. The detection accuracy is good, but tuning rules for your specific applications prevents false positives.
Geographic restrictions through the GeoIP module work well for compliance requirements. You can block or allow traffic from specific countries using MaxMind's GeoIP database. But IP geolocation isn't perfect - VPNs, proxies, and mobile carriers can make geographic restrictions unreliable.
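With the community controller the starting point is flipping on GeoIP2 in the controller ConfigMap and feeding it a MaxMind license key through the controller's MaxMind flags; a sketch assuming the default Helm install names:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller     # assumes the default Helm release name
  namespace: ingress-nginx
data:
  use-geoip2: "true"                 # load the MaxMind GeoIP2 databases for country lookups
```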
Authentication delegation with the auth_request module lets you offload authentication to external services. The ingress controller makes a subrequest to your auth service for every request, caching responses based on your configuration. This centralizes authentication logic but adds latency and creates a dependency on your auth service.
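Wiring that up is a few annotations pointing at whatever auth service you run - the URLs below are placeholders:

```yaml
metadata:
  annotations:
    # every request triggers a subrequest here; 2xx lets it through, 401/403 blocks it
    nginx.ingress.kubernetes.io/auth-url: "https://auth.example.com/validate"     # placeholder auth service
    # where unauthenticated browsers get redirected
    nginx.ingress.kubernetes.io/auth-signin: "https://auth.example.com/signin"    # placeholder
    # headers copied from the auth response onto the upstream request
    nginx.ingress.kubernetes.io/auth-response-headers: "X-Auth-User, X-Auth-Groups"
```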
Monitoring and Debugging Production Issues
When NGINX Ingress Controller breaks in production, you need visibility into what's happening at the NGINX level, not just the Kubernetes level. The default metrics are useful but incomplete for serious troubleshooting.
The nginx-prometheus-exporter provides detailed NGINX metrics including request rates, response codes, upstream health, and SSL handshake statistics. Combined with Kubernetes metrics from kube-state-metrics, you get full visibility into ingress performance.
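However you export them (the community controller also exposes a built-in Prometheus endpoint on port 10254), scraping usually comes down to a ServiceMonitor - a rough sketch assuming the Prometheus Operator and the default Helm-chart labels:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx   # assumes the default chart labels on the metrics Service
  endpoints:
  - port: metrics                              # assumes the metrics port is named "metrics"
    interval: 30s
```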
But metrics don't help when traffic isn't reaching your pods at all. The nginx.ingress.kubernetes.io/enable-access-log annotation toggles detailed access logging per ingress. Combined with kubectl logs from the ingress pods, you can trace request routing decisions.
The most useful debugging technique is enabling NGINX debug logging temporarily. It shows exactly how NGINX processes requests, evaluates location blocks, and selects upstreams. The logs are verbose and will impact performance, but they reveal why requests aren't routing as expected.
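In the community controller that's a ConfigMap key away - a sketch assuming the default Helm install names; just remember to put it back, because debug output is a firehose:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller     # assumes the default Helm release name
  namespace: ingress-nginx
data:
  error-log-level: "debug"           # very noisy; drop back to "notice" when you're done
```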
Request tracing with OpenTelemetry (available in F5's recent releases) provides distributed tracing across your entire request path - when you can get it configured properly. You can see latency breakdown from ingress → service → pod → database, identifying bottlenecks and failures in complex microservice architectures.
High Availability and Scaling Patterns
Running NGINX Ingress Controller in production means planning for failures and scaling beyond single pod deployments. The typical pattern is running ingress controllers as a DaemonSet on dedicated nodes with NodePort services, but if DNS or an upstream load balancer points at specific node IPs, a dead node quietly blackholes whatever traffic was aimed at it.
For true high availability, you want multiple ingress controller replicas behind a cloud load balancer. The cloud LB handles health checking and traffic distribution across ingress pods. If one pod dies, traffic routes to healthy instances automatically.
Node affinity rules ensure ingress pods run on different worker nodes, preventing a single node failure from taking down ingress entirely. Anti-affinity rules spread pods across availability zones in multi-AZ clusters.
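As an excerpt from the controller Deployment's pod spec, those constraints look roughly like this - the label selector assumes the default Helm-chart labels:

```yaml
# excerpt from the controller Deployment's pod spec
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app.kubernetes.io/name: ingress-nginx   # assumes the default chart labels
      topologyKey: kubernetes.io/hostname          # never put two replicas on the same node
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone         # spread replicas across availability zones
  whenUnsatisfiable: ScheduleAnyway
  labelSelector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
```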
But scaling ingress controllers isn't just about pod replicas - NGINX configuration complexity affects reload times. Large clusters with hundreds of ingresses start choking on config reloads. Someone added a fuckload of regex patterns and custom annotations to every ingress, and suddenly our 2-second reloads turned into 30-second nightmares. Never found out which genius thought that was a good idea. The F5 version's dynamic reconfiguration API reduces this clusterfuck by avoiding full NGINX reloads for certain changes.