KEDA (Kubernetes Event-driven Autoscaler) is a CNCF graduated project originally created by Microsoft and Red Hat. The latest version is v2.17.2, released in June 2025, and it fixes Kubernetes' biggest autoscaling fuckup: out of the box, you can't scale on the metrics that actually matter.
The Problem: CPU Metrics Are Bullshit
The built-in Kubernetes Horizontal Pod Autoscaler (HPA) scales on CPU and memory by default. That's like judging a restaurant's popularity by how hot the kitchen gets. Your message queue could have 10,000 pending jobs, but if your workers aren't pegging CPU, HPA doesn't give a shit.
I learned this the hard way when our Redis queue backed up to somewhere north of 40,000 messages (hard to say exactly) and HPA just sat there like a useless brick while our response times turned to absolute shit. CPU was fine, so as far as HPA could tell, there was nothing to do.
How KEDA Actually Works
KEDA has three components that don't suck:
KEDA Operator watches the ScaledObjects and ScaledJobs you define and monitors the external stuff they point at - your message queues, databases, whatever - activating and deactivating your workloads when events show up.
Metrics Server translates external metrics into something Kubernetes HPA can understand. It's basically the middleware that makes KEDA play nice with existing K8s autoscaling.
Scalers connect to 60+ external services including Apache Kafka, Redis, RabbitMQ, AWS SQS, Azure Service Bus, Prometheus, and tons more.
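Here's what that looks like wired together. A minimal sketch - the Deployment name `redis-worker` and the list name `jobs` are made up:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: redis-worker-scaler
  namespace: default
spec:
  scaleTargetRef:
    name: redis-worker               # the Deployment KEDA scales (hypothetical)
  minReplicaCount: 0                 # scale to zero when the list is empty
  maxReplicaCount: 20
  triggers:
    - type: redis
      metadata:
        address: redis.default.svc.cluster.local:6379
        listName: jobs               # hypothetical queue name
        listLength: "100"            # target pending items per replica
```

KEDA creates and manages the underlying HPA for you, so you never write one by hand.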
Scale-to-Zero: Actually Useful
KEDA can scale your pods to zero when there's no work. Not "minimum 1 replica" - actual zero. When messages show up in your queue, expect the first pod to be serving in about 30 seconds once you add up the polling interval, scheduling, image pull, and app startup (don't believe the "within seconds" marketing BS).
This cut our staging environment costs by roughly 60%, hard to say exactly. Production? That's where you learn the hard way that your startup time is actually 45 seconds on a bad day, not the 5 seconds you thought, and users start bitching about timeouts.
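One boring mitigation: scale to zero where cold starts are tolerable, keep a warm replica where they aren't. As a sketch, the only fragment of the ScaledObject that needs to change between environments:

```yaml
# staging ScaledObject fragment: eat the cold start, save the money
spec:
  minReplicaCount: 0
---
# production ScaledObject fragment: pay for one idle replica, skip the timeout complaints
spec:
  minReplicaCount: 1
```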
Real Production Gotchas Nobody Tells You
KEDA operator resource usage: The docs say 200MB RAM but that's bullshit. Plan for at least 400-500MB RAM, more if you're running dozens of ScaledObjects. Each trigger polls its event source every 30 seconds by default - hope you like those sweet, sweet cloud provider API bills. At least the polling interval is tunable, as shown below.
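Both knobs are plain ScaledObject fields; the values here are illustrative, not recommendations:

```yaml
spec:
  pollingInterval: 120    # check the event source every 120s instead of the default 30s
  cooldownPeriod: 300     # wait this long after the last active trigger before scaling to zero (default 300s)
```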
Scale-to-zero timing: That "within seconds" claim? Complete bullshit. Expect 30-60 seconds for the first pod, longer if your image is huge. If you need sub-5-second response times, scale-to-zero will piss off your users.
Event source failures: If your Redis/Kafka/whatever goes down, KEDA keeps your app at its current scale. It doesn't freak out and scale to zero, which is actually pretty smart.
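If "stay at current scale" isn't explicit enough for you, the ScaledObject `fallback` block lets you pin a replica count after repeated scaler failures. A sketch with illustrative numbers (note this only applies to triggers scaling on average-value metrics):

```yaml
spec:
  fallback:
    failureThreshold: 3    # after 3 consecutive failed checks of the event source...
    replicas: 5            # ...hold the workload at 5 replicas until it recovers
```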
Authentication debugging: TriggerAuthentication fails silently, like a passive-aggressive coworker who leaves Post-it notes about your failures instead of just telling you. You'll spend 6 hours debugging why your ScaledObject sits there doing absolutely nothing, only to discover you fat-fingered the secret name. Again. Always check `kubectl logs -l app=keda-operator -n keda` first, not after you've already questioned your career choices.
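For reference, here's the wiring that has to line up. Every name below is hypothetical; the point is that `name` under `secretTargetRef` must match a real Secret, character for character:

```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: redis-auth
spec:
  secretTargetRef:
    - parameter: password      # the trigger parameter this value feeds
      name: redis-secret       # must exactly match the Secret's metadata.name
      key: redis-password      # key inside that Secret
```

The trigger then references it via `authenticationRef: {name: redis-auth}`. Fat-finger any of these names and the only trace is in the operator logs.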
When KEDA Actually Makes Sense
- Event-driven workloads (message queues, batch processing)
- Variable traffic patterns where CPU scaling is useless
- Cost-sensitive environments where scale-to-zero matters
- Integration with cloud services that HPA can't see
KEDA works with Deployments, StatefulSets, and Jobs, plus any custom resource that implements the `/scale` sub-resource. It plays nice with most existing Kubernetes tooling, including VPA, as long as the two aren't fighting over the same metrics.
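Pointing a ScaledObject at something other than a Deployment is just a `scaleTargetRef` change. A sketch - the StatefulSet name is made up:

```yaml
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet       # anything exposing the /scale sub-resource works
    name: worker-set        # hypothetical StatefulSet
```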
Anyway, here's how KEDA compares to the other autoscaling options that probably aren't working for you either.