How Cilium Actually Works

Cilium sticks eBPF programs at various points in the Linux kernel networking stack. Instead of your packets going through the normal Linux networking path (and hitting thousands of iptables rules), eBPF intercepts them early and makes forwarding decisions in the kernel.

eBPF: Skip the Bullshit

eBPF lets you run code in the kernel without recompiling it. Cilium uses this to do networking stuff faster than userspace solutions. The downside? Debugging is a nightmare when things break.

Traditional CNI plugins like Flannel create VXLAN tunnels and rely on iptables for everything. Once you hit about 1,000 services, those iptables rules become a performance bottleneck. Every packet has to traverse this massive rule chain to figure out where to go.

Cilium says "fuck that" and implements forwarding logic directly in eBPF. Your packets get processed by kernel code that knows exactly what to do with them, no rule traversal needed.

eBPF programs hook into different points in the Linux networking stack - from the XDP layer for early packet processing to TC hooks for policy enforcement. Check out the eBPF and XDP Reference Guide for technical details on how this works.
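If you want to see where those hooks actually land on a node, standard kernel tooling will show you. A quick sketch - the interface name is an example and the program names vary by Cilium version:

# List eBPF programs attached to network devices on this node
# (Cilium's show up on the XDP and tc hooks).
bpftool net show

# Inspect the tc ingress hook on one interface; Cilium attaches its
# programs to a clsact qdisc with names like cil_from_netdev.
tc filter show dev eth0 ingress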

Identity-Based Security That Actually Works

Instead of tracking IP addresses that change every 5 minutes in Kubernetes, Cilium tracks workload identities that stay consistent. Each pod gets a security identity based on its labels, and packets carry that identity information.

This matters because traditional Kubernetes network policies break when pods get rescheduled and IPs change. Cilium's approach means your security policies keep working even when everything's moving around. Read more about identity management and security identity allocation.
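You can see those identities straight from the agent. A rough sketch, assuming the usual kube-system install; on newer releases the in-pod binary is named cilium-dbg instead of cilium:

# List the security identities Cilium has allocated and the label sets
# they map to.
kubectl -n kube-system exec ds/cilium -- cilium identity list

# Show which identity each local endpoint (pod) was assigned.
kubectl -n kube-system exec ds/cilium -- cilium endpoint list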

The Agent Setup Reality

Every node runs a Cilium agent that compiles and loads eBPF programs. When this works, it's magical. When it doesn't, you're debugging kernel-level networking with tools that assume you have a PhD in eBPF.

The Cilium architecture consists of the agent (running on each node), the Cilium operator (cluster-wide management), the CNI plugin (pod networking), and Hubble (observability). These components work together to replace traditional iptables-based networking with eBPF programs.

The agent talks to the Kubernetes API to get policy updates, then translates them into eBPF code and loads it into the kernel. If your kernel doesn't support the eBPF features Cilium needs, you're fucked.

I learned this the hard way when trying to run Cilium on CentOS 7 with kernel 3.10. The agent just kept restarting with "invalid argument" errors. Recent releases document kernel 4.19 as the floor, but you really want 5.10+ to avoid random eBPF loading failures. Check the system requirements before installation or you'll waste hours debugging cryptic kernel errors.
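A minimal pre-flight and post-install check, assuming you have the cilium CLI installed locally (the config.gz path depends on your distro):

# Before installing: is the kernel new enough, and is BPF compiled in?
uname -r
zgrep CONFIG_BPF /proc/config.gz   # expect CONFIG_BPF=y

# After installing: summarize agent/operator health across the cluster.
cilium status --wait
kubectl -n kube-system get pods -l k8s-app=cilium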

kube-proxy Replacement

kube-proxy creates iptables rules for every service endpoint. With 1,000 services and 10 endpoints each, that's 10,000+ iptables rules your packets have to potentially traverse.

Cilium's eBPF load balancing uses hash tables for O(1) lookups. It can handle millions of services without the linear performance degradation you get with iptables. See the kube-proxy replacement guide and service load balancing documentation.
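Whether the replacement is actually active is worth verifying rather than assuming. A quick check, assuming a standard kube-system install (the in-pod CLI is cilium-dbg on newer versions):

# What the agent thinks it's doing about kube-proxy.
cilium config view | grep kube-proxy-replacement

# Per-node detail: which pieces (NodePort, socket LB, host ports) are
# handled in eBPF.
kubectl -n kube-system exec ds/cilium -- cilium status --verbose | grep -A 10 KubeProxyReplacement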

I've seen clusters where replacing kube-proxy with Cilium reduced average response latency by 30-40%, but I've also seen it completely break NodePort services on GKE because Google's load balancer expects kube-proxy iptables rules. Always test in a staging cluster that matches your production setup exactly.

Performance Reality Check

The official benchmarks show impressive numbers, but they're running on clean test clusters with optimal configurations. Real performance differences between eBPF and traditional iptables become dramatic as you scale - eBPF maintains consistent O(1) lookup times while iptables performance degrades linearly with the number of rules. Also check out the CNI performance benchmark blog and scalability testing.

In the real world, Cilium is definitely faster than traditional CNI plugins, but the performance gain depends on your workload. If you're doing mostly east-west traffic between microservices, the difference is significant. If you're just running a few CRUD apps, you might not notice.

The most dramatic improvement is connection setup time. eBPF load balancing eliminates the NAT operations that create bottlenecks during high connection churn. Read about connection tracking and socket acceleration.
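If you want to poke at the structures involved, the agent can dump its eBPF load-balancing and connection-tracking maps. A sketch, again assuming the standard kube-system DaemonSet:

# Service-to-backend mappings held in the eBPF LB maps (what replaces
# kube-proxy's iptables chains).
kubectl -n kube-system exec ds/cilium -- cilium bpf lb list

# The eBPF connection-tracking table for this node.
kubectl -n kube-system exec ds/cilium -- cilium bpf ct list global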

Multi-Cluster Networking (ClusterMesh)

ClusterMesh lets pods in different clusters talk to each other like they're in the same cluster. It's actually pretty cool when it works.

ClusterMesh creates a secure mesh between clusters: each cluster runs a clustermesh-apiserver that exposes local services to remote clusters, and the clusters establish mutual TLS connections and sync service discovery information across the mesh. Check the multi-cluster setup guide and troubleshooting documentation.

The catch? Setting it up requires understanding Kubernetes networking, BGP routing, and how Cilium's identity system works across clusters. I've spent entire weekends debugging certificate trust issues between clusters that worked fine individually. Budget 3x longer than you think for ClusterMesh setup.
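For reference, the happy path with the cilium CLI looks roughly like this. The context names are placeholders, and it assumes both clusters were installed with unique cluster names and IDs:

# Enable the clustermesh-apiserver in each cluster.
cilium clustermesh enable --context cluster-1
cilium clustermesh enable --context cluster-2

# Wire the two clusters together (exchanges certificates and endpoints).
cilium clustermesh connect --context cluster-1 --destination-context cluster-2

# Verify both sides agree before debugging anything deeper.
cilium clustermesh status --context cluster-1 --wait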

Current Status (September 2025)

Cilium has over 22k GitHub stars as of September 2025. The project graduated from CNCF in October 2023, so it won't disappear when a maintainer switches jobs.

They release pretty regularly but each version breaks something different. Version 1.15.7 had eBPF program loading issues on Ubuntu 22.04 with kernel 5.15. Version 1.16.2 fixed that but broke ClusterMesh between different minor versions. Version 1.17.x works well unless you're using AWS Load Balancer Controller v2.6+ - they conflict and pods randomly lose network connectivity.

I'm running 1.16.15 in production right now because 1.17 kept having weird identity allocation failures under heavy pod churn. Your mileage may vary but test thoroughly before upgrading.

Real Talk CNI Comparison

| Reality Check | Cilium | Calico | Flannel |
|---|---|---|---|
| What it actually does | eBPF kernel networking | iptables + BGP routing | VXLAN tunnels everywhere |
| Setup complexity | Complex as hell | Moderate learning curve | Just fucking works |
| When it breaks | Good luck debugging eBPF | Check BGP and iptables | Restart the pod |
| Performance | Fast once working | Decent, scales well | Fine for most workloads |
| Documentation | Comprehensive but assumes expertise | Pretty good | Basic but sufficient |
| Community | Active but niche | Large enterprise user base | Simple, stable community |
| Enterprise support | Isovalent (now Cisco) | Tigera (solid company) | Red Hat/CoreOS |

What Actually Runs on Your Nodes

Linux Kernel Networking

eBPF Programs Everywhere

Cilium installs eBPF programs at multiple points in your kernel networking stack. When it works, packets flow through these programs instead of the usual Linux networking path. When it breaks, you're debugging kernel code with tools that barely exist.

The 3AM eBPF Debugging Reality:

  • `bpftool` shows you what programs are loaded, but good luck understanding what they actually do to your packets
  • Error messages are cryptic kernel-level failures like "Invalid argument" - I've spent hours on GitHub issues trying to decode what that means
  • When eBPF programs fail to load, networking just stops. No graceful fallback, no helpful error message
  • Kernel logs give you gems like "BPF JIT compilation failed" at 3am when you need to fix production

Cilium Agent: The Thing That Breaks

The Cilium agent runs on every node and does all the heavy lifting: watching the Kubernetes API for policy and endpoint changes, translating them into eBPF programs, and loading those programs into the kernel.

What Actually Goes Wrong:

level=error msg="Unable to compile BPF program" error="cannot load program: permission denied"

This usually means your kernel doesn't support the eBPF features Cilium needs, or someone enabled a security policy that blocks BPF syscalls. I hit this exact error on RHEL 8.4 because they disabled unprivileged BPF by default. The "fix" was either downgrading security or running containers privileged, which defeats half the point of using Kubernetes.
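When I hit that class of error now, these are the first things I check. A sketch - the sysctl and SELinux bits only apply on distros that ship those restrictions:

# Is unprivileged BPF disabled? (1 or 2 = disabled; the agent runs
# privileged, but hardened policies sometimes block bpf() anyway.)
sysctl kernel.unprivileged_bpf_disabled

# SELinux enforcing? Denied bpf() calls end up in the audit log.
getenforce

# The kernel-side reason, if there is one.
dmesg | grep -iE 'bpf|selinux' | tail -20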

Layer 7 Policies: The HTTP Example That Works Sometimes

The textbook HTTP policy example looks clean:

apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: backend-allow-frontend-http   # example name
spec:
  endpointSelector:
    matchLabels:
      app: backend
  ingress:
  - fromEndpoints:
    - matchLabels:
        app: frontend
    toPorts:
    - ports:
      - port: "8080"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/api/v1/.*"

The Reality:

  • Works great if your HTTP requests match exactly what you expect
  • Breaks when your frontend sends POST requests you forgot about
  • The regex "/api/v1/.*" might not match /api/v1/users?limit=10 depending on how your app constructs URLs
  • Debugging requires looking at Hubble flows to see which requests are getting denied, as shown below
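A sketch of those Hubble queries, assuming Hubble is enabled and the pod carries app=backend in the default namespace:

# Show HTTP flows to the backend that the policy dropped.
hubble observe --namespace default --pod backend --protocol http --verdict DROPPED

# Or watch everything it receives, with method/path/status visible.
hubble observe --namespace default --pod backend --protocol http --follow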

Performance: It's Complicated

Cilium is definitely faster than iptables-based solutions, but the performance gain depends on your specific workload and configuration.

Hash Table Lookups vs iptables Chains:
iptables rules form a linked list that packets traverse sequentially. With 10,000 service endpoints, that's potentially 10,000 rule evaluations per packet. Cilium uses kernel hash tables for O(1) lookups.

Socket-Level Load Balancing:
For connections between pods (east-west traffic), Cilium intercepts socket operations and picks endpoints before the connection is established. This eliminates the NAT translation overhead you get with kube-proxy.

XDP Support (If Your NICs Support It):
XDP processes packets before they hit the normal kernel networking stack. Most cloud provider NICs don't support all XDP features, so you might not get the full benefit.

Hubble: Actually Useful Observability

Network Observability

Hubble is one of Cilium's genuinely useful features. It captures network flow data using the same eBPF programs that handle networking, so you get comprehensive visibility without additional performance overhead.

The Hubble UI provides an interactive service map that visualizes network flows in real-time, showing which services are communicating, HTTP status codes, latency metrics, and policy decisions. It's like having a detailed network diagram that updates itself as your applications run.

What You Actually See:

  • Every network connection with source/destination pod identities
  • HTTP request methods, response codes, and latency
  • DNS queries and responses (useful for debugging service discovery issues)
  • Network policy decisions (which connections were allowed/denied)

Debugging Example:
Your frontend can't reach your backend. Hubble shows:

frontend-pod -> backend-service DENIED (Policy denied)

You can see exactly which policy rule blocked the connection and fix it. Without Hubble, you'd be guessing why connections are failing.
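That flow is the kind of thing you can pull up directly. A sketch, assuming the pods live in the default namespace and are named frontend and backend:

# Just the traffic between these two workloads, most recent 20 flows.
hubble observe --from-pod default/frontend --to-pod default/backend --last 20

# JSON output includes the verdict and the exact drop reason per flow.
hubble observe --from-pod default/frontend --verdict DROPPED -o json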

ClusterMesh: Cool But Complex

ClusterMesh lets services in different clusters discover and connect to each other. The architecture is actually elegant:

  • Each cluster runs a clustermesh-apiserver that exposes local services
  • Clusters connect using mutual TLS and sync service information
  • Pods can connect to services in remote clusters through shared global services: same DNS name, endpoints synced from every cluster

What Will Go Wrong:

  • Network connectivity between clusters - your firewall team will block random ports
  • Certificate trust issues that make no sense - clusters that worked yesterday suddenly can't talk
  • IP address conflicts when someone forgot to check CIDR ranges before creating clusters
  • Identity conflicts causing random 403s when the same service exists in multiple clusters

I spent a full day debugging a ClusterMesh setup where everything looked correct but services couldn't connect. Turned out the mutual TLS certificates had different CA chains and were silently failing validation. Budget 3x longer than the docs suggest.

BGP Mode: For On-Premises Reality

If you're running on bare metal or want to avoid overlay networks, Cilium supports BGP routing. Your pods get routable IP addresses, and Cilium announces routes to your network infrastructure.

In BGP mode, Cilium eliminates VXLAN tunneling overhead by having each node announce pod routes directly to your network switches. This gives you native performance and makes traffic visible to existing network monitoring tools.

Requirements:

  • Your network team needs to understand BGP
  • Your switches/routers need to be configured correctly
  • IP address planning becomes critical (no more overlapping pod CIDRs)

The Benefit:
No VXLAN tunnels means better performance and easier troubleshooting. Network packets look normal to your existing monitoring tools.
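For orientation, a minimal sketch of what BGP mode looks like with Cilium's BGP control plane: a helm flag enables it, and a CiliumBGPPeeringPolicy tells matching nodes who to peer with. The ASNs, peer address, and node label here are made up; newer releases also offer CiliumBGPClusterConfig resources instead.

# Enable the BGP control plane (assumes a helm-managed install).
helm upgrade cilium cilium/cilium -n kube-system --reuse-values \
  --set bgpControlPlane.enabled=true

# One virtual router per matching node, announcing the pod CIDR to a
# top-of-rack peer.
kubectl apply -f - <<'EOF'
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: rack0-peering
spec:
  nodeSelector:
    matchLabels:
      rack: rack0
  virtualRouters:
  - localASN: 64512
    exportPodCIDR: true
    neighbors:
    - peerAddress: "10.0.0.1/32"
      peerASN: 64512
EOF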

When Things Break: The Real Architecture

Cilium Agent Down:

  • New pods can't get IP addresses
  • Existing network policies stop being enforced
  • eBPF programs stay loaded but don't get updates

eBPF Programs Fail to Load:

  • Networking stops working entirely on that node
  • Kubernetes events usually show unhelpful error messages
  • Check kernel logs and dmesg for actual eBPF errors

Memory Pressure:

  • eBPF maps have size limits
  • High connection churn can exhaust map entries
  • Symptoms: new connections fail while existing ones work
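A rough way to spot that situation, assuming the standard kube-system DaemonSet (and cilium-dbg as the in-pod binary name on newer versions):

# How full is the connection-tracking map on this node?
kubectl -n kube-system exec ds/cilium -- cilium bpf ct list global | wc -l

# The agent also exports map pressure as a metric.
kubectl -n kube-system exec ds/cilium -- cilium metrics list | grep map_pressure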

The architecture is elegant when everything works correctly. When it doesn't, you need to understand Linux networking, eBPF, and Kubernetes all at once.

Questions People Actually Ask

Q

Why does my Cilium agent keep crashing with "permission denied"?

A

Your kernel doesn't support the eBPF features Cilium needs, or BPF syscalls are restricted. Check:

# Verify eBPF support
zgrep CONFIG_BPF /proc/config.gz
# Should show CONFIG_BPF=y

# Check kernel version
uname -r
# Need 4.19+, preferably 5.10+

Fix it by upgrading your kernel or adjusting security policies that restrict BPF operations. Some security-hardened distributions disable BPF by default.

Q

How do I debug "connection refused" errors with Cilium network policies?

A

Use Hubble to see what's actually happening:

hubble observe --follow --pod frontend-pod

You'll see output like:

frontend-pod:54321 -> backend-pod:8080 DENIED (Policy denied)

The policy denied the connection. Check your CiliumNetworkPolicy resources and make sure the selectors match your pod labels. Layer 7 policies are especially picky about HTTP method and path matching.

Q

Can I just replace kube-proxy with Cilium without breaking everything?

A

Sometimes, but expect shit to break in weird ways.

Enable kube-proxy replacement like this:

cilium install --set kubeProxyReplacement=strict

What will definitely break:

  • NodePort services on GKE, because Google's load balancer expects kube-proxy iptables rules
  • Any monitoring that scrapes kube-proxy metrics - those endpoints disappear
  • Legacy apps that make assumptions about iptables NAT behavior
  • Anything using hostNetwork pods with services - weird edge case, but it happens

Test in staging first. When it works, it's noticeably faster. When it doesn't, you'll spend hours figuring out why your load balancer health checks started failing.

Q

Why is Cilium using so much memory compared to Flannel?

A

Cilium does more stuff.

It maintains:

  • eBPF maps for connection tracking
  • Identity mappings for all pods
  • Policy state for Layer 7 filtering
  • Flow data for Hubble observability

Typical memory usage: 150-300MB per node vs 50MB for Flannel. The memory usage scales with the number of pods and network policies. If you're running a simple cluster, Flannel's lower overhead might be better.

Q

How do I know if my NIC supports XDP for better performance?

A

Check the driver:

ethtool -i eth0 | grep driver

Good XDP support: mlx5_core, i40e, ixgbe, veth (containers)
Limited XDP: most cloud provider NICs, virtio drivers

You can enable XDP mode in Cilium, but you might not see performance gains if your NIC driver doesn't support the required features.

Q

Why are my HTTP network policies not working?

A

Layer 7 policies are finicky.

Common issues:

  • Your pods aren't actually sending HTTP traffic (might be HTTPS)
  • The HTTP path regex doesn't match what your app actually sends
  • The policy is applied to the wrong pods (label selectors)
  • You forgot to allow the specific HTTP method (POST, PUT, etc.)

Debug with:

hubble observe --http-status 403 --pod your-pod-name

Q

Can I migrate from Calico to Cilium without downtime?

A

No.

Don't believe anyone who tells you otherwise. Both CNIs install kernel modules and eBPF programs that conflict. You need to:

  1. Backup your network policies (and rewrite them, because the syntax is different)
  2. Drain nodes completely and reinstall with Cilium
  3. Or build a new cluster and migrate workloads (safer but more work)

I tried the "gradual migration" approach once. Spent two days with half my nodes unable to route to the other half. Just plan for downtime and do it cleanly.

Q

What happens when the Cilium agent dies?

A

Existing connections: keep working (eBPF programs stay loaded)
New connections: fail after existing conntrack entries expire
New pods: can't get IP addresses or network policies
Policy updates: stop being enforced

The agent restart usually takes 30-60 seconds. If it keeps crashing, check the logs:

kubectl logs -n kube-system ds/cilium

Q

How do I troubleshoot ClusterMesh connectivity issues?

A

Start with the basics:

# Check clustermesh status
cilium clustermesh status

# Verify certificates
kubectl get secret -n kube-system clustermesh-apiserver-remote-cert

Common failures:

  • Network connectivity between cluster API servers
  • Certificate trust issues
  • IP address space conflicts between clusters
  • Firewall rules blocking clustermesh traffic (port 2379)
Q

Why is my Cilium installation slower than kube-proxy?

A

You probably have a configuration issue:

  • Running in overlay mode when you could use native routing
  • XDP mode enabled on NICs that don't support it properly
  • Unnecessary Layer 7 policies adding overhead
  • Hubble collecting too much flow data

Check your Cilium configuration:

cilium config view | grep -E "(routing-mode|kube-proxy-replacement)"

Q

Does Cilium work with my service mesh?

A

It can work alongside Istio/Linkerd, but you're duplicating functionality. Cilium provides its own service mesh features through eBPF.

With a traditional service mesh: you get double encryption and double observability overhead.
Cilium-only: lighter weight, but fewer advanced traffic management features.

Choose one approach. Running both is usually overkill unless you have specific requirements that only one can meet.

Q

How do I debug eBPF programs when things go wrong?

A

You mostly can't, and that's the real problem with eBPF. The debugging tools are hot garbage:

# See what programs are loaded (but not what they do)
bpftool prog list

# Check eBPF maps (cryptic output)
bpftool map list

# Kernel logs for eBPF errors (good luck)
dmesg | grep -i bpf

When eBPF programs fail to load, you get error messages like "Invalid argument" with zero context. I've had production outages where the only fix was rebooting nodes because some eBPF map got corrupted.

Your best bet is checking GitHub issues for your exact kernel version + Cilium version combo. Someone else probably hit the same cryptic error.

Q

What's new in Cilium for 2025?

A

The upcoming 1.19 release focuses on improved multi-cluster support and better cloud integration.

Key improvements include:

  • Enhanced XDP support for AWS Nitro and Google GKE Autopilot
  • Better integration with Kubernetes Gateway API v1.0
  • Improved eBPF program compilation for ARM64 architectures
  • Enhanced Tetragon integration for runtime security

The Cisco acquisition in 2024 brought enterprise focus without changing the open-source model. Expect more enterprise features in the commercial Isovalent distribution while keeping core functionality free.
