Let's cut the bullshit: BEAM isn't fast for CPU-heavy work. It's about as fast as Python, which means slow as shit for number crunching. But that's not the point - BEAM is built for latency and fault tolerance, not raw speed.
In exchange, BEAM hands you two performance superpowers:
- Concurrency is very cheap - you can spawn 200,000+ concurrent processes like it's nothing
- VM is optimized for latency - per-process heaps mean garbage collection never stops the world, and scheduling is preemptive
This design proved its worth at scale: WhatsApp handles 40+ billion messages daily and Discord stores billions of messages using the same BEAM foundation that powers Gleam.
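To see how cheap that concurrency is, here's a minimal sketch that spawns 200,000 processes. It assumes the `gleam_erlang` package, whose spawning function is `process.spawn` in recent releases (older ones call it `process.start`), and the workload is a throwaway placeholder:

```gleam
import gleam/erlang/process
import gleam/list

pub fn main() {
  // Spawn 200,000 processes, each with its own tiny isolated heap.
  // The n * 2 body is a placeholder for real work.
  list.range(1, 200_000)
  |> list.each(fn(n) { process.spawn(fn() { n * 2 }) })
}
```

Each spawned process gets garbage-collected independently - exactly the per-process-heap design described above.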
Memory Layout That Actually Matters
Every BEAM process gets four blocks of memory:
- Stack: Return addresses, function arguments, local variables
- Heap: Larger structures like lists and tuples
- Message area (mailbox): Messages from other processes
- Process Control Block: Process metadata
Each process burns about 2KB minimum, which sounds expensive until you realize you can spawn hundreds of thousands without your server catching fire. This lightweight process model is nothing like OS threads - it's what lets BEAM handle insane concurrency.
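If you want to check the per-process footprint yourself, you can bind Erlang's `erlang:process_info/2` with a hand-rolled FFI declaration. This is a sketch, not a library API: the `InfoItem` type is my own assumption, and it works because zero-field Gleam constructors compile to Erlang atoms:

```gleam
import gleam/erlang/process
import gleam/int
import gleam/io

// Hypothetical FFI binding: erlang:process_info(Pid, memory) returns
// {memory, Bytes}. The zero-field Memory constructor compiles to the
// `memory` atom on the Erlang target, so the types line up.
type InfoItem {
  Memory
}

@external(erlang, "erlang", "process_info")
fn process_info(pid: process.Pid, item: InfoItem) -> #(InfoItem, Int)

pub fn main() {
  let #(_, bytes) = process_info(process.self(), Memory)
  io.println("This process: " <> int.to_string(bytes) <> " bytes")
}
```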
Why Your Gleam App Is Probably Slow
Pattern matching overhead: Gleam's pattern matching is powerful but not free. Efficient compilation of pattern matching is a surprisingly challenging problem, and complex nested patterns can create expensive dispatch trees.
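As a hypothetical illustration, every layer of nesting in a pattern like this one becomes another test in the compiled decision tree:

```gleam
import gleam/option.{type Option, Some}

// Three layers deep: Option constructor, then list shape, then tuple
// fields. Each layer is another runtime test on the hot path.
pub fn first_admin(users: Option(List(#(String, Bool)))) -> String {
  case users {
    Some([#(name, True), ..]) -> name
    _ -> "no admin"
  }
}
```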
List operations: Gleam lists are linked lists, not arrays. `list.length()` is O(n), not O(1). If you're calling `list.length(my_list) > 1000`, you're already fucked. I spent 6 hours debugging why our API went to shit - some genius was calling `list.length()` on 50k-item lists in a hot path. Don't be that person.
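The cheap fix, sketched here as a hypothetical helper: never measure more of the list than the comparison needs. `list.take` walks at most `threshold + 1` elements no matter how long the list is:

```gleam
import gleam/list

// Walks at most threshold + 1 elements, however long the list is,
// instead of traversing all 50k just to compare a length.
pub fn longer_than(items: List(a), threshold: Int) -> Bool {
  let sample = list.take(items, threshold + 1)
  list.length(sample) > threshold
}
```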
No tail call optimization awareness: Gleam supports tail call optimization, but the compiler won't warn you when you're not using it. Writing recursive functions without proper accumulators will eat your stack.
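Here's the shape of the problem with a made-up sum function; only the accumulator version runs in constant stack space:

```gleam
// Body-recursive: the addition happens after the recursive call
// returns, so every element adds a stack frame.
pub fn sum(xs: List(Int)) -> Int {
  case xs {
    [] -> 0
    [x, ..rest] -> x + sum(rest)
  }
}

// Tail-recursive: the recursive call is the last thing that happens,
// so BEAM reuses the same frame no matter how long the list is.
pub fn sum_tco(xs: List(Int)) -> Int {
  sum_loop(xs, 0)
}

fn sum_loop(xs: List(Int), acc: Int) -> Int {
  case xs {
    [] -> acc
    [x, ..rest] -> sum_loop(rest, acc + x)
  }
}
```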
String operations: Gleam strings are UTF-8 binaries on the BEAM. Concatenating them copies into a new binary every time, so building a string in a loop goes quadratic. Collect the pieces as a `StringTree` (Gleam's wrapper around Erlang iodata) and convert once at the end.
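A minimal sketch, assuming the `gleam/string_tree` module from a recent `gleam_stdlib` (older releases call it `string_builder`):

```gleam
import gleam/list
import gleam/string_tree

// Accumulate pieces as a tree of references, then build one binary
// at the end, instead of copying the whole string on every append.
pub fn join_lines(lines: List(String)) -> String {
  lines
  |> list.fold(string_tree.new(), fn(acc, line) {
    acc
    |> string_tree.append(line)
    |> string_tree.append("\n")
  })
  |> string_tree.to_string
}
```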
Performance Improvements That Actually Shipped
The Gleam team isn't sitting around - they shipped real performance wins:
- v1.11.0 (June 2025) shipped 30% faster JavaScript compilation
- v1.12.0 enabled function inlining, with a conservative configuration
- Binary operations got optimized - taking a sub-slice is now constant time on JavaScript to match Erlang behavior
These aren't marketing bullshit numbers - Richard Viney actually benchmarked real workloads to prove it.
Debug Performance Issues First, Optimize Second
The `echo` keyword is your best friend for quick performance debugging:
```gleam
pub fn slow_function(data) {
  data
  |> echo // input data
  |> expensive_operation
  |> echo // after the expensive operation
  |> another_expensive_operation
  |> echo // final result
}
```
Note that `echo` is a keyword, not a function - no import, no arguments - and it prints each value alongside the file and line it came from, so you can tell the outputs apart without labels. The compiler tracks `echo` usage and will warn you if you try to publish with debug statements still in your code. Use it liberally while profiling, then remove it when you're done; the language server even provides a code action to strip every `echo` from a module.
Pro tip: wrap the slow step with a monotonic clock. The standard library doesn't ship a stopwatch, so here's a minimal sketch that binds Erlang's `erlang:monotonic_time/1` directly, using the same zero-field-constructor-to-atom trick as before:
```gleam
import gleam/int
import gleam/io

// The zero-field Millisecond constructor compiles to the
// `millisecond` atom that erlang:monotonic_time/1 expects.
type TimeUnit {
  Millisecond
}

@external(erlang, "erlang", "monotonic_time")
fn monotonic_time(unit: TimeUnit) -> Int

pub fn timed_operation(data) {
  let start = monotonic_time(Millisecond)
  let result = data |> expensive_operation |> echo // operation completed
  let stop = monotonic_time(Millisecond)
  io.println("Operation took " <> int.to_string(stop - start) <> "ms")
  result
}
```
This gives you millisecond-resolution timing without external tools; add a `Microsecond` constructor to `TimeUnit` if you need finer grain.