Cilium attaches eBPF programs at various hook points in the Linux kernel networking stack. Instead of your packets going through the normal Linux networking path (and hitting thousands of iptables rules), eBPF intercepts them early and makes forwarding decisions in the kernel.
eBPF: Skip the Bullshit
eBPF lets you run sandboxed code inside the kernel without recompiling it or loading kernel modules. Cilium uses this to do packet forwarding, load balancing, and policy enforcement faster than userspace solutions can. The downside? Debugging is a nightmare when things break.
Traditional CNI plugins like Flannel create VXLAN tunnels and rely on iptables for everything. Once you hit about 1,000 services, those iptables rules become a performance bottleneck. Every packet has to traverse this massive rule chain to figure out where to go.
Cilium says "fuck that" and implements forwarding logic directly in eBPF. Your packets get processed by kernel code that knows exactly what to do with them, no rule traversal needed.
eBPF programs hook into different points in the Linux networking stack - from the XDP layer for early packet processing to TC hooks for policy enforcement. Check out the eBPF and XDP Reference Guide for technical details on how this works.
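To make that concrete, here's roughly what a TC-attached eBPF program looks like. This is a minimal sketch of the shape of the thing, not Cilium's actual datapath code: a C program compiled to BPF bytecode and attached to an interface's tc hook, where the kernel calls it for every packet.

```c
// Minimal TC eBPF sketch (not Cilium's real datapath). Compile with:
//   clang -O2 -g -target bpf -c tc_sketch.c -o tc_sketch.o
// then attach it to an interface's tc ingress hook.
#include <linux/bpf.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>

SEC("tc")
int handle_ingress(struct __sk_buff *skb)
{
    // A real datapath program parses headers here and decides whether to
    // forward, redirect, or drop the packet without ever leaving the kernel.
    if (skb->len == 0)
        return TC_ACT_SHOT;   // drop
    return TC_ACT_OK;         // continue up the stack
}

char _license[] SEC("license") = "GPL";
```

The agent builds and attaches far bigger versions of this at runtime; the point is that the forwarding logic is a function the kernel calls per packet, not a rule chain it walks.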
Identity-Based Security That Actually Works
Instead of tracking IP addresses that change every 5 minutes in Kubernetes, Cilium tracks workload identities that stay consistent. Each pod gets a security identity based on its labels, and packets carry that identity information.
This matters because traditional Kubernetes network policies break when pods get rescheduled and IPs change. Cilium's approach means your security policies keep working even when everything's moving around. Read more about identity management and security identity allocation.
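Here's a rough sketch of the idea. The map layout and names are my own invention, not Cilium's real policy maps, but it shows the core trick: the policy lookup keys on a numeric identity instead of a source IP, so the verdict doesn't care which node the pod landed on or what IP it got.

```c
// Sketch of identity-based policy enforcement (illustrative only; the key
// layout and map names are assumptions, not Cilium's actual structures).
#include <linux/bpf.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>

struct policy_key {
    __u32 src_identity;   // numeric identity derived from pod labels
    __u16 dport;          // destination port, network byte order
    __u8  protocol;       // IPPROTO_TCP, IPPROTO_UDP, ...
    __u8  pad;
};

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 16384);
    __type(key, struct policy_key);
    __type(value, __u8);              // present = allowed
} policy_map SEC(".maps");

static __always_inline int policy_verdict(__u32 src_identity,
                                          __u16 dport, __u8 proto)
{
    struct policy_key key = {
        .src_identity = src_identity,
        .dport = dport,
        .protocol = proto,
    };
    __u8 *allowed = bpf_map_lookup_elem(&policy_map, &key);

    // The source IP never enters the lookup: reschedule the pod and the
    // identity (and therefore the verdict) stays the same.
    return allowed ? TC_ACT_OK : TC_ACT_SHOT;
}

SEC("tc")
int enforce_policy(struct __sk_buff *skb)
{
    // In a real datapath the sender's identity arrives with the packet
    // (tunnel metadata, packet mark, etc.); here the mark is a stand-in,
    // and port/protocol are omitted for brevity.
    return policy_verdict(skb->mark, 0, 0);
}

char _license[] SEC("license") = "GPL";
```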
The Agent Setup Reality
Every node runs a Cilium agent that compiles and loads eBPF programs. When this works, it's magical. When it doesn't, you're debugging kernel-level networking with tools that assume you have a PhD in eBPF.
The Cilium architecture consists of the agent (running on each node), the Cilium operator (cluster-wide management), the CNI plugin (pod networking), and Hubble (observability). These components work together to replace traditional iptables-based networking with eBPF programs.
The agent talks to the Kubernetes API to get policy updates, then translates them into eBPF code and loads it into the kernel. If your kernel doesn't support the eBPF features Cilium needs, you're fucked.
I learned this the hard way when trying to run Cilium on CentOS 7 with kernel 3.10. The agent just kept restarting with "invalid argument" errors. The documented minimum kernel has crept upward over the years - 4.9 was enough for early releases, recent ones want 4.19 or newer - and you really want 5.10+ to avoid random eBPF loading failures. Check the system requirements for your version before installation or you'll waste hours debugging cryptic kernel errors.
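If you want to see where those errors come from, here's a bare-bones libbpf loader - roughly what happens under the hood when any agent pushes a compiled BPF object into the kernel. The "datapath.o" filename is made up for illustration, not a file Cilium ships. Load time is when the kernel's verifier and feature checks run, so an old kernel rejects the program right here and all you get back is something like "invalid argument".

```c
// Bare-bones libbpf loader sketch ("datapath.o" is a made-up object name).
// Build with: cc loader.c -o loader -lbpf
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <bpf/libbpf.h>

int main(void)
{
    struct bpf_object *obj = bpf_object__open_file("datapath.o", NULL);
    if (!obj) {
        fprintf(stderr, "open failed: %s\n", strerror(errno));
        return 1;
    }

    // Load time is when the kernel's verifier runs. Programs that need
    // helpers, map types, or instructions the running kernel doesn't have
    // get rejected here, and userspace just sees an errno like EINVAL.
    if (bpf_object__load(obj)) {
        fprintf(stderr, "load failed: %s\n", strerror(errno));
        bpf_object__close(obj);
        return 1;
    }

    printf("programs verified and loaded; next step is attaching them\n");
    bpf_object__close(obj);
    return 0;
}
```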
kube-proxy Replacement
kube-proxy creates iptables rules for every service endpoint. With 1,000 services and 10 endpoints each, that's 10,000+ iptables rules your packets have to potentially traverse.
Cilium's eBPF load balancing uses hash tables for O(1) lookups. It can handle millions of services without the linear performance degradation you get with iptables. See the kube-proxy replacement guide and service load balancing documentation.
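Here's a sketch of why that lookup is O(1). The struct and map names are invented for illustration (Cilium's real LB maps are more elaborate, with backend slots and session affinity), but the core idea holds: service IP and port go in as the hash key, a backend comes out, and the cost doesn't grow with the number of services.

```c
// Illustrative service lookup via a BPF hash map (struct and map names
// are invented for this sketch, not Cilium's actual LB maps).
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct service_key {
    __be32 vip;     // service (cluster) IP
    __be16 port;    // service port, network byte order
    __u16  pad;
};

struct backend {
    __be32 ip;      // pod IP to forward to
    __be16 port;
    __u16  pad;
};

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 65536);
    __type(key, struct service_key);
    __type(value, struct backend);
} services SEC(".maps");

static __always_inline struct backend *pick_backend(__be32 vip, __be16 port)
{
    struct service_key key = { .vip = vip, .port = port };

    // One hash lookup regardless of how many services exist - this is the
    // O(1) contrast with walking a 10,000-entry iptables chain.
    return bpf_map_lookup_elem(&services, &key);
}

char _license[] SEC("license") = "GPL";
```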
I've seen clusters where replacing kube-proxy with Cilium reduced average response latency by 30-40%, but I've also seen it completely break NodePort services on GKE because Google's load balancer expects kube-proxy iptables rules. Always test in a staging cluster that matches your production setup exactly.
Performance Reality Check
The official benchmarks show impressive numbers, but they're running on clean test clusters with optimal configurations. Real performance differences between eBPF and traditional iptables become dramatic as you scale - eBPF maintains consistent O(1) lookup times while iptables performance degrades linearly with the number of rules. Also check out the CNI performance benchmark blog and scalability testing.
In the real world, Cilium is definitely faster than traditional CNI plugins, but the performance gain depends on your workload. If you're doing mostly east-west traffic between microservices, the difference is significant. If you're just running a few CRUD apps, you might not notice.
The most dramatic improvement is connection setup time. Cilium's socket-level load balancing picks a backend when the application calls connect(), so established connections skip the per-packet NAT operations that become a bottleneck during high connection churn. Read about connection tracking and socket acceleration.
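A rough sketch of the socket-level trick, simplified down to a single hard-coded backend (the real thing picks among many backends and handles UDP, IPv6, and reverse translation): a cgroup connect hook rewrites the destination once, when the app calls connect(), so the packets themselves never need per-packet NAT.

```c
// Sketch of connect-time load balancing via a cgroup/connect4 hook.
// The hard-coded addresses are placeholders for illustration only.
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("cgroup/connect4")
int lb_connect4(struct bpf_sock_addr *ctx)
{
    const __u32 service_vip = bpf_htonl(0x0A600001); // 10.96.0.1 (placeholder)
    const __u32 backend_ip  = bpf_htonl(0x0A000A05); // 10.0.10.5 (placeholder)

    // Rewrite the destination once, when the app calls connect(); every
    // packet on the resulting connection already goes straight to the
    // backend, so there is no per-packet DNAT in the data path.
    if (ctx->user_ip4 == service_vip) {
        ctx->user_ip4 = backend_ip;
        ctx->user_port = bpf_htons(8080);
    }
    return 1; // 1 = allow the connect() to proceed
}

char _license[] SEC("license") = "GPL";
```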
Multi-Cluster Networking (ClusterMesh)
ClusterMesh lets pods in different clusters talk to each other like they're in the same cluster. It's actually pretty cool when it works.
ClusterMesh works by running a clustermesh-apiserver in each cluster that exposes local services to remote clusters. The clusters establish mutual TLS connections and sync service discovery information across the mesh. Check the multi-cluster setup guide and troubleshooting documentation.
The catch? Setting it up requires understanding Kubernetes networking, BGP routing, and how Cilium's identity system works across clusters. I've spent entire weekends debugging certificate trust issues between clusters that worked fine individually. Budget 3x longer than you think for ClusterMesh setup.
Current Status (September 2025)
Cilium has over 22k GitHub stars as of September 2025. The project graduated from CNCF in October 2023, so it won't disappear when a maintainer switches jobs.
They release pretty regularly but each version breaks something different. Version 1.15.7 had eBPF program loading issues on Ubuntu 22.04 with kernel 5.15. Version 1.16.2 fixed that but broke ClusterMesh between different minor versions. Version 1.17.x works well unless you're using AWS Load Balancer Controller v2.6+ - they conflict and pods randomly lose network connectivity.
I'm running 1.16.15 in production right now because 1.17 kept having weird identity allocation failures under heavy pod churn. Your mileage may vary but test thoroughly before upgrading.