What is Longhorn?

Longhorn Architecture Overview

Longhorn is basically distributed storage that doesn't make you want to quit your job. Built by Rancher Labs and now maintained by SUSE, it's a CNCF Incubating project that actually works without requiring a PhD in distributed systems.

Instead of setting up a separate storage cluster (which is a pain in the ass), Longhorn just uses the disks you already have on your Kubernetes nodes. Each volume gets its own dedicated storage engine, and it replicates your data across multiple nodes so when hardware inevitably dies, your data doesn't.

I've been running this in production for 8 months. It's survived two node failures, one kernel panic, and me accidentally deleting a namespace. The replica rebuilds took forever, but nothing broke.

What Actually Works

It Doesn't Break Everything When One Thing Breaks: Each volume runs its own dedicated storage engine instead of sharing one giant clusterfuck like Ceph. When something goes wrong, it's usually just that one volume, not your entire storage system. I learned this the hard way when our old Ceph cluster took down prod for 2 hours because one OSD went sideways.

Snapshots That Don't Suck: Point-in-time snapshots actually work and don't eat all your disk space. They're incremental, so you're not storing 50 copies of the same data. Saved my ass when someone deployed code that corrupted a database - rolled back to 30 minutes before and called it a day.

Backups to Real Storage: Backs up to S3 or NFS, not some proprietary bullshit. We send ours to an S3 bucket and can restore to a completely different cluster. Tested this during our DC migration - worked flawlessly, though it took 4 hours to restore 200GB because S3 egress is what it is.

Longhorn Backup Architecture

Thin Provisioning: Only uses the disk space you're actually consuming. A 20GB volume with 1GB of data uses 1GB on disk. This is standard now, but worth mentioning because some legacy storage systems are still stupid about this.

UI That Doesn't Suck: The Longhorn dashboard is actually usable. You can see your volumes, check replica status, and trigger backups without memorizing kubectl commands. Gets slow with 50+ volumes, but still better than debugging Ceph through cryptic CLI tools.

Longhorn Dashboard Interface

How This Thing Actually Works

Longhorn splits things into two parts: the storage engines that handle your data, and the managers that orchestrate everything. The Longhorn Manager runs on every node as a DaemonSet, which means it's always there watching things. When you create a volume, it spins up a dedicated storage engine just for that volume.

Longhorn Volume Architecture

It hooks into Kubernetes through CSI, so your apps just use regular PersistentVolumeClaims. No special APIs or modifications needed - if your app works with standard Kubernetes storage, it works with Longhorn.
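Nothing changes on the app side either - a plain PVC is all it takes. A minimal sketch, assuming the default StorageClass Longhorn creates is still named longhorn (it is unless you renamed it); the claim name is made up:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data                # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn     # default class created by Longhorn
  resources:
    requests:
      storage: 5Gi
EOF

Mount it in a pod like any other PVC; the dedicated engine and replicas happen behind the CSI driver.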

Current Status: Longhorn v1.9.1 came out on July 23, 2025 and is stable as of September 2025. The offline replica rebuilding is actually useful - it means you can fix broken replicas without tanking your application's performance. They release new versions every 4 months, which is fast enough to get fixes but not so fast you're constantly upgrading.

Real Performance Numbers: In production with SSDs, we see 4,000-6,000 IOPS for random 4K reads and about 60% of that for writes. Latency stays under 10ms for most operations. With HDDs, cut those numbers in half and add some prayer.
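If you want to sanity-check those numbers on your own hardware, fio from inside a pod gets you close enough. A rough sketch - it assumes fio is installed in the container and the Longhorn volume is mounted at /data, both of which are assumptions to adjust for your setup:

## random 4K reads for 60 seconds at queue depth 32 - comparable to the read numbers above
fio --name=randread --filename=/data/fio-test --size=2G \
    --rw=randread --bs=4k --ioengine=libaio --iodepth=32 \
    --direct=1 --runtime=60 --time_based --group_reporting

## same run for writes - expect roughly 60% of the read IOPS
fio --name=randwrite --filename=/data/fio-test --size=2G \
    --rw=randwrite --bs=4k --ioengine=libaio --iodepth=32 \
    --direct=1 --runtime=60 --time_based --group_reporting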

Tribal Knowledge from 8 Months in Production (the first couple of checks are sketched right after this list):

  • Always check kubectl get volumeattachments when pods won't start - sometimes they get stuck
  • If replica rebuilds take forever, check your network - we had a flaky switch port causing 30% packet loss
  • The "Unknown" volume state usually means networking issues or a dead node - restart the manager pod first
  • Backup restoration can timeout on slow S3 connections - increase the timeout in advanced settings or you'll be debugging for hours
  • Never delete all replicas of a volume at once - learned this the hard way, took 6 hours to recover from backups
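A minimal sketch of those first two checks - the volume name and node IP below are placeholders:

## pods won't start? look for attachments stuck with ATTACHED=false
kubectl get volumeattachments

## Longhorn's own view of the volume (the CRDs live in longhorn-system)
kubectl -n longhorn-system get volumes.longhorn.io
kubectl -n longhorn-system describe volumes.longhorn.io pvc-xxxxxxxx   ## placeholder volume name

## rough packet-loss check between nodes while a rebuild is crawling
ping -c 100 -q 10.0.0.12   ## placeholder node IP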

Bottom Line: Longhorn isn't revolutionary, but it works without making you want to change careers. If you need storage that's reliable, simple to manage, and won't keep you up at night debugging weird edge cases, it's your best bet. The day-to-day operational experience is smooth enough that you'll forget it's there - which is exactly what you want from infrastructure.

Installation and System Requirements

Prerequisites (The Shit That Actually Matters)

Longhorn requires a Kubernetes cluster that doesn't suck and some packages your distro probably doesn't include by default:

Kubernetes Version: Kubernetes v1.25 or newer. RKE2, K3s, and cloud provider clusters are fine. If you're running some custom Kubernetes build, you're on your own.

Node Requirements: Each node needs:

  • Container runtime (Docker, containerd, CRI-O - whatever)
  • open-iscsi package installed and running (this breaks on Ubuntu if you forget it)
  • At least 4GB RAM and 2 CPU cores (more if you have lots of volumes)
  • Local disks (SSDs for production unless you enjoy slow databases)

Ubuntu/Debian: apt install open-iscsi && systemctl enable --now iscsid
RHEL/CentOS: yum install iscsi-initiator-utils && systemctl enable --now iscsid

Cluster Size: Need at least 3 nodes or it won't work. Two-node clusters are useless because you can't do proper quorum. I learned this the hard way trying to run Longhorn on a 2-node homelab cluster.

Longhorn Cluster Requirements

Network: Your nodes need to actually talk to each other without dropping packets. High latency kills write performance because replicas sync synchronously. We had weird performance issues until we found a switch with flaky ports.

Installation (The 5 Minutes That Becomes 2 Hours)

Longhorn has several installation methods. Use Helm unless you hate yourself:

Helm Installation (Actually works):

## First, check if open-iscsi is actually running
sudo systemctl status iscsid

## If that's dead, fix it first or you'll waste an hour debugging

## Add the official Longhorn Helm repository
## Get the current repo URL from: https://longhorn.io/docs/latest/deploy/install/install-with-helm/
helm repo add longhorn [official-charts-url]
helm repo update

## Install into its own namespace
kubectl create namespace longhorn-system
helm install longhorn longhorn/longhorn --namespace longhorn-system

On RKE2: You need --set csi.kubeletRootDir=/var/lib/kubelet or the CSI plugin can't find your pods.

kubectl Method (for masochists):

kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.9.1/deploy/longhorn.yaml

This works but you get zero configuration options.

Rancher Catalog: One-click install through Rancher Apps if you're using Rancher. Actually works pretty well.

GitOps: Works with Fleet, Flux, ArgoCD. Just remember to set the kubelet path if you're not using standard locations.

After Installation (The Fun Part)

Access the UI with kubectl port-forward -n longhorn-system svc/longhorn-frontend 8080:80 and go to http://localhost:8080. The dashboard shows you what's broken and what's working.

Longhorn Replica Architecture

Storage Class: Longhorn creates a StorageClass automatically with 3 replicas by default. This is usually fine, but 2 replicas might be better for performance if you can tolerate slightly less availability.
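If you go the 2-replica route, add a second StorageClass instead of editing the default one - a sketch using the standard Longhorn provisioner and parameters; the class name is made up:

kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-2rep            # hypothetical name
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "2"
  staleReplicaTimeout: "30"      # minutes before a failed replica is given up on
EOF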

Backup Setup: Configure backup targets to S3 or NFS or you'll regret it when shit hits the fan. We use an S3 bucket with versioning enabled.
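One way to wire this up reproducibly is through Helm values - a sketch, assuming an S3 bucket (the bucket name is a placeholder), the AWS_* secret keys Longhorn expects, and that your chart version still honors defaultSettings.backupTarget (newer releases can also set this through the UI or a BackupTarget resource):

## credentials Longhorn uses to reach the bucket - assumes the two env vars are already exported
kubectl -n longhorn-system create secret generic aws-backup-secret \
  --from-literal=AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" \
  --from-literal=AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY"

## point Longhorn at the bucket - target format is s3://<bucket>@<region>/<path>
helm upgrade longhorn longhorn/longhorn --namespace longhorn-system \
  --set defaultSettings.backupTarget=s3://my-longhorn-backups@us-east-1/ \
  --set defaultSettings.backupTargetCredentialSecret=aws-backup-secret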

What Actually Breaks During Install

Version Gotchas: v1.8.2 had a bug where replica rebuilds would hang if you had mixed disk types (SSD + HDD on same node). v1.9.0 RC1 broke the UI for clusters with 50+ volumes. Always test upgrades in staging first.

Undocumented Behaviors:

  • Ubuntu 20.04 ships with iscsid disabled by default - will fail silently until you enable it
  • Volume attach fails silently if your node runs out of loop devices (check with losetup -l)
  • The UI becomes unusable with more than 100 volumes - API still works but dashboard times out
  • RKE2 clusters need the kubelet path override or pods can't mount volumes
  • Cloud provider load balancers sometimes break the manager endpoints during upgrades
  • If you have custom CNI configs, the manager DaemonSet might not start - check the logs

Production Reality Check

Performance: Use SSDs or your databases will be slow as hell. Two replicas gives you 50% better write performance but one node failure makes things tense until replicas rebuild.

Memory Usage: 256MB per TB sounds reasonable until you realize that's per replica. With 3 replicas of a 1TB volume, you're looking at 768MB just for metadata. This adds up fast - we hit 2GB memory usage with just 4TB allocated across multiple volumes.

Scale Limitations: Works fine up to about 500 volumes per cluster according to official docs. In practice, the UI starts choking around 100 volumes and becomes unusable at 200+. The API still works fine though, so kubectl and Prometheus scraping continue normally.

Monitoring: Prometheus metrics are available and actually useful. Monitor replica rebuild status because that's when things break. Set alerts for volumes stuck in "Unknown" state - that usually means someone fucked up the networking.
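If the Prometheus operator is already scraping longhorn-manager, two rules cover exactly those cases - a sketch, assuming the stock longhorn_volume_robustness metric (2 = degraded, 3 = faulted, 0 = unknown) and that your Prometheus ruleSelector picks up rules in this namespace:

kubectl apply -f - <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: longhorn-volume-alerts   # hypothetical name
  namespace: longhorn-system
spec:
  groups:
    - name: longhorn
      rules:
        - alert: LonghornVolumeDegraded
          expr: longhorn_volume_robustness == 2
          for: 10m
          labels:
            severity: warning
        - alert: LonghornVolumeFaultedOrUnknown
          expr: longhorn_volume_robustness == 3 or longhorn_volume_robustness == 0
          for: 5m
          labels:
            severity: critical
EOF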

Longhorn Volume Status

The Reality: Getting Longhorn up and running takes 5 minutes if everything works, or 4 hours if you hit the usual Linux storage nonsense. Once it's stable though, you can basically ignore it. That's the sweet spot - boring infrastructure that just works. Your databases get persistent storage, your monitoring sees healthy volumes, and you can focus on actual application problems instead of debugging why Ceph decided to corrupt itself again.

Kubernetes Storage Solutions Comparison

| Feature | Longhorn | Rook-Ceph | OpenEBS | StorageOS |
|---|---|---|---|---|
| Architecture | Microservices, dedicated engine per volume | Ceph distributed storage with Rook operator | Multiple storage engines (Jiva, cStor, Mayastor) | Distributed block storage with consensus |
| Minimum Nodes | 3 nodes | 3 nodes (recommended 5+) | 1 node | 3 nodes |
| Storage Types | Block storage only | Block, Object, File | Block storage | Block storage |
| Backup Integration | Native S3/NFS backup | External backup tools required | Manual backup configuration | Enterprise backup features |
| UI Dashboard | Intuitive web UI included | Basic dashboard via Ceph | Web UI for some engines | Enterprise web UI |
| Resource Usage | Light base, heavy per-TB | Heavy as shit (2GB+ per node) | Variable by engine | Medium resource usage |
| Installation Complexity | Simple until networking breaks | Complex nightmare | Moderate if you pick the right engine | Moderate |
| Snapshot Support | Native incremental snapshots | RBD snapshots | Engine-dependent snapshots | Snapshot capabilities |
| Performance | Decent, tanks during rebuilds | Fast but complex | Mayastor is fast, others meh | Fast when it works |
| Enterprise Support | SUSE commercial support | Red Hat/IBM support available | MayaData commercial support | StorageOS commercial |
| CNCF Status | Incubating | Graduated | Sandbox | Not CNCF project |
| Maturity | Production ready | Very mature | Mature (varies by engine) | Production ready |
| Learning Curve | Low | Steep as fuck | Medium | Medium |
| Best For | When you want "good enough" storage | When you have a storage team | When you're not sure what you want | High-performance applications |

Questions People Actually Ask

Q: Is this actually production-ready or just marketing bullshit?

A: It's production-ready for most workloads. We use it for GitLab, Jenkins, and other stateful apps that can handle brief IO pauses during replica rebuilds. I wouldn't put my main database on it yet, but for everything else it's solid. Just test your disaster recovery before you need it.

Q: What do I actually need to run this?

A: 3 Kubernetes nodes minimum, Kubernetes v1.25+, and open-iscsi installed and running on every node. That last part will bite you in the ass if you forget it - Ubuntu doesn't install it by default. Check the requirements, but seriously, just run sudo apt install open-iscsi && systemctl enable --now iscsid on Ubuntu before you start.
Q: What happens when nodes die?

A: It detects the failure pretty quickly and starts rebuilding replicas on other nodes. Your volumes become read-only until enough replicas are healthy again. The rebuild process can take hours for large volumes and your IO performance goes to shit during that time. Plan your maintenance windows accordingly.

Q: Can I run this on 2 nodes?

A: Nope, you need 3 minimum or it won't work. Trust me, I tried this on my homelab cluster and spent 2 hours debugging before reading the docs properly. Two nodes can't do proper quorum for distributed consensus.

Q: How do backups work?

A: Backups go to S3 or NFS and they're incremental, so you're not uploading everything every time. Can restore to different clusters, which saved our asses during a DC migration. Set up recurring backups or you'll forget to do them manually.
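Recurring backups are a RecurringJob resource, not a cron job you have to remember - a sketch, assuming the default group (volumes without explicit recurring-job labels belong to it) and a made-up name and schedule:

kubectl apply -f - <<'EOF'
apiVersion: longhorn.io/v1beta2
kind: RecurringJob
metadata:
  name: daily-backup             # hypothetical name
  namespace: longhorn-system
spec:
  task: backup
  cron: "0 3 * * *"              # 03:00 every day
  retain: 7                      # keep the last 7 backups
  concurrency: 2
  groups:
    - default
EOF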

Longhorn Snapshot Management

Q: How's the performance compared to Ceph or other options?

A: Better than Ceph for smaller deployments - there's no distributed placement layer in the data path, just each volume's own engine and its replicas. Not as fast as dedicated storage arrays, but way easier to manage. Use SSDs or your databases will be painfully slow. Write performance drops about 70% during replica rebuilds.

Q: Can I pay someone to fix it when it breaks?

A: Yeah, SUSE has commercial support if you're using their Rancher stack. Community support is pretty good though - the Slack channel is active and people actually help instead of telling you to RTFM.
Q: Can I make volumes bigger after creating them?

A: Yes, expanding volumes works fine through kubectl or the UI while apps are running. Shrinking doesn't work because it's block storage and would probably destroy your data anyway.
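The kubectl path is just patching the claim - a minimal example; the namespace and claim name are placeholders:

## bump the claim to 30Gi - Longhorn picks it up and expands the volume online
kubectl -n my-namespace patch pvc my-data-claim \
  --type merge -p '{"spec":{"resources":{"requests":{"storage":"30Gi"}}}}'

## watch the new size land
kubectl -n my-namespace get pvc my-data-claim -w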

Q: How painful are upgrades?

A: Rolling upgrades usually work fine with minimal downtime. Each volume controller upgrades independently so other volumes keep working. Just don't skip versions - you have to go through each minor version. v1.8 to v1.9 took us 6 hours because we didn't read the release notes about the migration process.
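The Helm side of an upgrade is short - a sketch; the part that matters is the explicit version pin so you step through one minor at a time:

## see what you're running and what's available before touching anything
helm list -n longhorn-system
helm repo update
helm search repo longhorn/longhorn --versions | head

## upgrade with an explicit version - never skip a minor
helm upgrade longhorn longhorn/longhorn --namespace longhorn-system --version 1.9.1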

Longhorn Node Management

Q: How much RAM does this thing eat?

A: About 256MB per TB per replica for metadata. Sounds reasonable until you realize that's per replica, not per volume. With the default 3-replica setup, a 1TB volume uses 768MB just for indexes. We hit 2GB memory usage with 4TB allocated across multiple volumes. The base install is only 300MB though.

Q: My pods are stuck in ContainerCreating, what's wrong?

A: Nine times out of ten it's open-iscsi not running: sudo systemctl status iscsid. If that's dead, start it: sudo systemctl start iscsid. The error message "MountVolume.WaitForAttach failed for volume" is usually this. Ubuntu 20.04 doesn't start iscsid by default - bit me on three different clusters.
Q: Volume stuck in "Unknown" state, now what?

A: Usually means a node died or networking is fucked. Check kubectl get nodes first. If all nodes are healthy, restart the Longhorn manager pod on the affected node: kubectl delete pod -n longhorn-system longhorn-manager-xxxxx. Takes 30 seconds to come back up.
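Finding which manager pod sits on the affected node is the only fiddly bit - a quick sketch; the volume name is a placeholder:

## which node is the volume attached to?
kubectl -n longhorn-system get volumes.longhorn.io -o wide | grep pvc-xxxxxxxx   ## placeholder volume name

## list the manager pods with their nodes, then bounce the one on that node
kubectl -n longhorn-system get pods -l app=longhorn-manager -o wide
kubectl -n longhorn-system delete pod longhorn-manager-xxxxx   ## the DaemonSet recreates it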

Q: Replica rebuild taking forever, is this normal?

A: Define forever. 1GB takes about 5 minutes on decent hardware with good networking. 100GB can take 2 hours. If it's been stuck for 6+ hours, something's wrong. Check dmesg | grep iscsi for network timeouts. We had a switch with bad ports that made rebuilds hang.

Resources That Actually Help