I've been running Fluentd in production since 2019, and it's basically a log router that doesn't suck. It reads logs from wherever they are, transforms them if needed, and ships them to whatever storage you want. The killer feature is that it treats everything as JSON streams, which means you can process logs consistently instead of fighting regex patterns for every different log format.
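To be concrete: every event flowing through Fluentd is just a tag, a timestamp, and a JSON record. Something like this (values made up for illustration):

tag:    app.logs
time:   2021-03-14 10:22:01 +0000
record: {"level":"warn","message":"disk usage at 85%","host":"web-03"}

The tag is what all the routing keys off of later.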
How It Actually Works in Production
Here's the reality: Fluentd sits between your applications spitting out logs and your log storage system trying to make sense of them. You configure input plugins to slurp logs from files, HTTP endpoints, or message queues. Then filter plugins can modify, enrich, or route the data. Finally, output plugins dump everything to Elasticsearch, S3, or whatever you're using.
The plugin architecture is genuinely useful because you can swap destinations without touching your app configs. I've migrated from Splunk to ELK to S3 without changing a single application - just swapped the output plugin config. The CNCF graduated status means it's not going anywhere, unlike some logging tools that get abandoned.
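To make that concrete, a migration is literally just the match block changing - the apps and the rest of the pipeline never notice. A sketch with made-up bucket and host names (the S3 parameters are from fluent-plugin-s3, worth verifying against its docs):

# before: ship to Elasticsearch
<match app.logs>
  @type elasticsearch
  host elasticsearch
  port 9200
</match>

# after: same tag, same apps, different destination
<match app.logs>
  @type s3
  s3_bucket my-log-archive
  s3_region us-east-1
  path logs/app/
</match>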
Plugins That Don't Completely Suck
The plugin ecosystem is actually one of Fluentd's strengths. There are 500+ plugins for pretty much everything:
- Elasticsearch output: Works reliably, handles backpressure properly
- S3 output: Batches files, compresses them, doesn't lose data
- Kafka output: Actually maintains partition ordering (sketch after this list)
- tail input: Follows log files without missing rotations (usually)
- Kubernetes integration: DaemonSet configs that work out of the box
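Since partition ordering is the thing people get wrong, here's roughly what that Kafka output looks like. This is from memory of fluent-plugin-kafka's kafka2 type - the parameter names (brokers, default_topic, partition_key_key) and the broker addresses are mine, so check the plugin docs before copying:

<match app.logs>
  @type kafka2
  brokers kafka-1:9092,kafka-2:9092
  default_topic app-logs
  # events with the same record["user_id"] go to the same partition,
  # which is what preserves per-key ordering
  partition_key_key user_id
  <format>
    @type json
  </format>
</match>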
Installing plugins is fluent-gem install fluent-plugin-whatever. Just make sure you restart the daemon after installing, or you'll wonder why nothing works. The plugin development guide is decent if you need to write custom plugins.
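For example, pulling in the Elasticsearch plugin and bouncing the daemon (the service name depends on how you installed Fluentd - td-agent installs call it td-agent instead):

# installs into Fluentd's embedded Ruby, not your system gems
fluent-gem install fluent-plugin-elasticsearch

# restart so the running daemon actually loads the new plugin
sudo systemctl restart fluentd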
Real Performance Numbers (From Actual Usage)
In my experience, a single Fluentd process handles a few thousand events per second before you start seeing buffer backups. Memory usage starts low but can spike if you're doing heavy regex matching or JSON parsing on large payloads.
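When the backups start, the knobs you want live in the <buffer> section of the output plugin. A sketch of the kind of tuning I mean - the numbers are illustrative starting points, not recommendations:

<match app.logs>
  @type elasticsearch
  host elasticsearch
  port 9200
  <buffer>
    @type file                 # spill to disk instead of RAM, survives restarts
    path /var/log/fluentd/buffer
    chunk_limit_size 8MB       # smaller chunks flush sooner
    total_limit_size 2GB       # cap disk usage before Fluentd starts refusing events
    flush_interval 5s
    flush_thread_count 4       # parallel flushes help I/O-bound outputs
    overflow_action block      # push back on inputs instead of throwing data away
  </buffer>
</match>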
The recent v1.19.0 release finally switched from yajl-ruby to the standard JSON gem, which gives you better throughput on Ruby 3.x. They also added Zstandard compression, which compresses better than gzip but costs more CPU. Worth checking the performance tuning docs for actual optimization tips.
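As I read the release notes, the zstd support hangs off the buffer's compress parameter, where gzip used to be the only real option. Something like:

<buffer>
  @type file
  path /var/log/fluentd/buffer
  compress zstd    # new in v1.19.0; gzip is still the safe default
</buffer>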
When Fluentd Will Ruin Your Day
Here's the thing nobody talks about - Ruby's GIL means Fluentd is basically single-threaded for most operations. Not a huge deal for I/O bound work like log processing, but it does cap your throughput. If you need to process 50K+ events/sec, you'll need multi-process workers or you should probably use Fluent Bit instead.
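The multi-process route is at least built in: the <system> directive spawns N worker processes that each run the pipeline, and plugins that can't safely run multiplied (a tail on one set of files, say) get pinned to a single worker. A minimal sketch:

<system>
  workers 4
</system>

# pin the tail input to worker 0 so four processes don't read the same files
<worker 0>
  <source>
    @type tail
    path /var/log/app/*.log
    pos_file /var/log/fluentd/app.log.pos
    tag app.logs
    <parse>
      @type json
    </parse>
  </source>
</worker>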
The configuration syntax is also annoying - it's this weird Ruby-ish DSL that looks like neither Ruby nor YAML. You'll spend time debugging config parsing errors that should be caught at startup but aren't. Here's a typical config that actually works in production:
<source>
  @type tail
  path /var/log/app/*.log
  pos_file /var/log/fluentd/app.log.pos
  tag app.logs
  <parse>
    @type json
  </parse>
</source>

<filter app.logs>
  @type grep
  <exclude>
    key message
    pattern /health-check/
  </exclude>
</filter>

<match app.logs>
  @type elasticsearch
  host elasticsearch
  port 9200
  index_name app-logs
</match>
That config took me way longer to figure out than it should have - probably 3-4 hours of trial and error because the syntax errors are cryptic as hell. Error messages just say "parsing failed" without telling you which line is fucked.
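The one mitigation I've found since: fluentd has a --dry-run flag that parses the config and exits before starting any inputs or outputs, so you can at least catch the parse-level garbage in CI instead of at deploy time (path here is wherever your config lives):

# parse and validate the config without actually starting the pipeline
fluentd --dry-run -c /etc/fluentd/fluent.conf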