How Supabase Realtime Actually Works

Supabase Realtime is an Elixir cluster that syncs data over WebSockets. Built on Phoenix Framework, it can supposedly handle millions of connections across regions. Works great in demos, flaky as hell in production.

Supabase Realtime Architecture

Phoenix Channels handle the pub/sub stuff using Elixir processes. Messages supposedly take the shortest path between regions - when it works. Sometimes your Singapore users get routed through Virginia for no fucking reason.

Core Components

Phoenix Channels run the messaging using Phoenix.PubSub. Works fine until it doesn't.

Global state sync supposedly keeps presence data consistent across regions. In reality, ghost users pile up like digital zombies and your "online users" count becomes meaningless.

Database integration streams changes from PostgreSQL's WAL through replication slots. When your database gets hammered, WAL replication lags like hell and your "real-time" updates turn into "whenever-the-fuck-they-feel-like-it" updates.

Broadcast from Database (2025 Update)

The latest Broadcast from Database feature sends messages when database changes happen. Creates a partitioned realtime.messages table that publishes changes over WebSockets. Built-in authorization through RLS policies actually works.

Messages get purged after 3 days by dropping partitions - at least that part doesn't require manual cleanup.
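
Consuming those database-driven broadcasts from the client looks roughly like this. A minimal sketch assuming supabase-js v2, an existing supabase client, and a trigger that publishes to a 'room-1' topic with the operation name as the event - all of those names are placeholders for whatever your setup actually uses.

// Broadcast from Database, client side: join the topic your trigger publishes to.
// private: true means the RLS policies on realtime.messages decide who can listen.
const dbBroadcast = supabase
  .channel('room-1', { config: { private: true } })
  .on('broadcast', { event: 'INSERT' }, ({ payload }) => {
    console.log('Row inserted:', payload)
  })
  .on('broadcast', { event: 'UPDATE' }, ({ payload }) => {
    console.log('Row updated:', payload)
  })
  .subscribe()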

Fault-tolerant my ass. Here's what actually breaks in production:

When Realtime Breaks (And It Will)

The "fault-tolerant" system still fails in predictable ways. I've debugged all of these at 3am:

Connection Pool Death Spiral: During traffic spikes, connection pools get exhausted and new WebSocket connections start timing out. Your users get the dreaded "connection failed" error with zero useful context. The solution? Restart everything and pray - or at least automate the client-side part of the restart (see the sketch after this list).

WAL Replication Lag Hell: WAL replication lags like hell when your DB is getting hammered. Your "real-time" changes turn into "eventual-time" changes and users see stale data while thinking everything is live.

Message Ordering Chaos: Broadcast messages don't arrive in order during network congestion. Your collaborative cursor app turns into a seizure-inducing mess as cursors jump randomly around the screen. There's no built-in message sequencing - you'll need to add timestamps and handle out-of-order delivery yourself.

The Phantom Presence Problem: Users who force-quit their browser or lose WiFi stay "online" forever. Ghost users pile up like digital zombies until your user list is meaningless. The CRDT "eventual consistency" sometimes means "never consistent."

Regional Routing Madness: Messages "usually" take the shortest path between regions, except when AWS has connectivity issues and your Singapore users get routed through Virginia for no fucking reason. Latency spikes from 50ms to 500ms and there's nothing you can debug.

The Database Connection Black Hole: When your database goes down, Realtime tries to reconnect from the "nearest available region." In practice, this means 30-60 seconds of complete silence while your users wonder if their internet broke.
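
None of these failure modes are fully fixable from the client, but you can at least fail loudly and re-subscribe automatically instead of leaving users staring at a dead screen. A rough sketch assuming supabase-js v2; createMessagesChannel() and showDegradedBanner() are hypothetical helpers you'd write yourself:

// Defensive wrapper: tear down the dead channel and build a fresh one instead of praying.
// createMessagesChannel() and showDegradedBanner() are hypothetical app helpers.
function subscribeWithRecovery(supabase, attempt = 0) {
  const channel = createMessagesChannel(supabase) // builds the channel and its .on() handlers
  let recovering = false

  channel.subscribe((status) => {
    if (status === 'SUBSCRIBED') {
      showDegradedBanner(false)
      attempt = 0
      return
    }
    if (recovering) return
    if (status === 'CHANNEL_ERROR' || status === 'TIMED_OUT' || status === 'CLOSED') {
      recovering = true
      showDegradedBanner(true)        // tell users something broke instead of silent failure
      supabase.removeChannel(channel) // drop the dead channel entirely
      const delay = Math.min(1000 * 2 ** attempt, 30000) // capped exponential backoff
      setTimeout(() => subscribeWithRecovery(supabase, attempt + 1), delay)
    }
  })
}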

Supabase Realtime Features Comparison

Postgres Changes

  • Purpose: Listen to database changes
  • Data persistence: Database-driven
  • Authorization: Row Level Security
  • Latency: ~100-500ms
  • Production reality: WAL lags when DB is hammered
  • Use cases: Live data sync, notifications
  • Scalability: Limited by DB connections
  • Message size: Unlimited (DB record)
  • Delivery guarantee: At-least-once
  • Regional support: Global with DB proximity
  • Setup complexity: Publication required
  • Pricing impact: WAL streaming costs

Broadcast

  • Purpose: Send ephemeral messages
  • Data persistence: Ephemeral (not stored)
  • Authorization: RLS + custom policies
  • Latency: <50ms
  • Production reality: 500ms+ during network hiccups
  • Use cases: Chat, gaming, cursors
  • Scalability: High (millions of messages)
  • Message size: 256KB max per message
  • Delivery guarantee: Best-effort (aka "good luck")
  • Regional support: Global cluster
  • Setup complexity: Channel subscription
  • Pricing impact: $2.50/million messages (cursor moves add up fast)

Presence

  • Purpose: Track user state
  • Data persistence: In-memory only
  • Authorization: RLS + custom policies
  • Latency: <50ms
  • Production reality: Ghost users pile up forever
  • Use cases: Online users, activity
  • Scalability: High (thousands of users)
  • Message size: 64KB max per presence state
  • Delivery guarantee: Best-effort (aka "good luck")
  • Regional support: Global cluster
  • Setup complexity: Channel subscription
  • Pricing impact: Connection-based

Implementation Guide and Best Practices

Database Changes Implementation

The most reliable (but slowest) of Realtime's three features. Postgres Changes needs a publication to stream database events to connected clients. Uses PostgreSQL's logical replication to capture INSERT, UPDATE, and DELETE operations.

-- Enable realtime for specific table
ALTER PUBLICATION supabase_realtime ADD TABLE messages;
// The docs version - works in dev, breaks in prod
const channel = supabase.channel('messages-changes')

// The version that actually works in production. Heartbeats, timeouts and
// reconnect backoff are client-level options, not channel config - set them
// when you create the client.
import { createClient } from '@supabase/supabase-js'

const supabase = createClient(SUPABASE_URL, SUPABASE_ANON_KEY, {
  realtime: {
    timeout: 10000,              // fail fast instead of leaving users hanging
    heartbeatIntervalMs: 15000,  // keep flaky proxies from silently dropping you
    reconnectAfterMs: (tries) => Math.min(1000 * 2 ** tries, 10000) // capped backoff
  }
})

const channel = supabase
  .channel('messages-changes')
  .on('postgres_changes',
    { event: 'INSERT', schema: 'public', table: 'messages' },
    (payload) => console.log('New message:', payload)
  )
  .subscribe((status, err) => {
    if (status === 'CHANNEL_ERROR') {
      console.error('Realtime connection died:', err)
      // Add your resurrection logic here
    }
    if (status === 'TIMED_OUT' || status === 'CLOSED') {
      console.warn('Disconnected:', status)
      // Mobile browsers, shitty WiFi, corporate firewalls - pick one
    }
  })

Your replication slot becomes a chokepoint when you're doing heavy writes. Batch your operations or watch your replication lag climb like a fucking rocket.
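
One concrete way to take pressure off the slot is to batch writes instead of inserting rows one at a time - one array insert is a single request and a single transaction. A sketch assuming the messages table from the example above; the one-second flush interval is arbitrary:

// Buffer writes and flush them as one array insert instead of one request per row.
const writeBuffer = []

function queueMessage(msg) {
  writeBuffer.push(msg)
}

// Flush once a second: one request, one transaction, same rows.
setInterval(async () => {
  if (writeBuffer.length === 0) return
  const batch = writeBuffer.splice(0, writeBuffer.length)
  const { error } = await supabase.from('messages').insert(batch)
  if (error) console.error('Batch insert failed:', error) // the batch is dropped here - add retries if you care
}, 1000)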

Version-Specific Gotchas I've Learned the Hard Way:

  • Connection timeout bullshit: Newer versions of realtime-js changed the default timeout from 10s to 30s. Your users will sit there for 30 fucking seconds before getting a connection error. Override with shorter timeouts or they'll think your app is broken.

  • Authorization enforcement changes: RLS policies started getting enforced differently for Broadcast channels in recent versions. Code that worked fine suddenly fails silently. No migration guide, just stack traces and confusion.

  • WAL replication randomly stops: On Postgres 15+, logical replication occasionally just stops working during high-throughput periods. Your database changes stop flowing and you lose several minutes of updates. Restart the replication slot, pray, and backfill whatever you missed when the stream comes back (sketch below).
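
Since changes that happen while replication is down are simply never delivered, treat every successful (re)subscribe as a chance to catch up from the table itself. A sketch reusing the messages table from above; created_at and handleMessage() are assumed stand-ins for your own column and handler:

// On every successful (re)subscribe, backfill rows that arrived while the stream was dead.
let lastSeen = new Date().toISOString() // persist this somewhere if you need history across reloads

const catchUpChannel = supabase
  .channel('messages-changes')
  .on('postgres_changes',
    { event: 'INSERT', schema: 'public', table: 'messages' },
    (payload) => {
      lastSeen = payload.new.created_at ?? lastSeen
      handleMessage(payload.new) // hypothetical app handler
    }
  )
  .subscribe(async (status) => {
    if (status !== 'SUBSCRIBED') return
    // Catch-up query: anything inserted while we weren't listening.
    const { data, error } = await supabase
      .from('messages')
      .select('*')
      .gt('created_at', lastSeen)
      .order('created_at', { ascending: true })
    if (!error && data) data.forEach(handleMessage) // dedupe by id if double delivery matters
  })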

Broadcast for Real-time Messaging

The fast but dangerous option. Broadcast sends instant messages between connected clients without touching the database. Messages are ephemeral and only reach currently connected users - great for live interactions until your network hiccups.

// Send broadcast message - looks simple, right?
channel.send({
  type: 'broadcast',
  event: 'cursor_move',
  payload: { x: 100, y: 200, user_id: 'user-123', timestamp: Date.now() }
})

// Receive broadcast messages - add timestamp handling or regret it later.
// The callback gets the whole message envelope; your data is under .payload.
const lastCursorUpdate = {}
channel.on('broadcast', { event: 'cursor_move' }, ({ payload }) => {
  // Messages arrive out of order during network congestion
  // Without timestamps, cursors jump around like broken mice
  if (payload.timestamp < (lastCursorUpdate[payload.user_id] || 0)) {
    return // Ignore outdated message
  }
  updateCursor(payload.user_id, payload.x, payload.y)
  lastCursorUpdate[payload.user_id] = payload.timestamp
})

The Broadcast Billing Trap: That innocent cursor movement? Each mouse move is a message. Users dragging their cursor generates 100+ messages per second. Our whiteboard generated something insane like 40-50 million messages in a week - think it was $120-something just for cursor movements. Rate limit everything or watch your bill explode.
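
The cheapest message is the one you never send, so throttle high-frequency events before they hit channel.send(). A minimal sketch - the 50ms interval (roughly 20 messages per second per user) is an arbitrary number, tune it to whatever your UX tolerates:

// Throttle cursor broadcasts so a mouse drag doesn't turn into 100+ billable messages per second.
const CURSOR_INTERVAL_MS = 50 // ~20 messages/sec max per user - tune to taste
let lastSent = 0

function sendCursor(x, y) {
  const now = Date.now()
  if (now - lastSent < CURSOR_INTERVAL_MS) return // drop intermediate positions
  lastSent = now
  channel.send({
    type: 'broadcast',
    event: 'cursor_move',
    payload: { x, y, user_id: 'user-123', timestamp: now }
  })
}

document.addEventListener('mousemove', (e) => sendCursor(e.clientX, e.clientY))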

Authorization: The August 2024 authorization update introduced Row Level Security policies for Broadcast channels, ensuring secure message routing based on user permissions.
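
In practice that means forwarding the user's JWT to Realtime and marking the channel private so those RLS policies are actually consulted. A sketch assuming supabase-js v2 and a signed-in user:

// Forward the user's JWT to Realtime so RLS policies on realtime.messages can see who's asking.
const { data: { session } } = await supabase.auth.getSession()
if (session) await supabase.realtime.setAuth(session.access_token)

// private: true makes the channel enforce those policies instead of letting anyone join.
const privateRoom = supabase
  .channel('room-1', { config: { private: true } })
  .subscribe((status, err) => {
    if (status === 'CHANNEL_ERROR') {
      console.error('Join rejected - check your realtime.messages RLS policies:', err)
    }
  })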

Presence for User State Tracking

The feature that works great in demos and poorly in production. Presence maintains distributed state of connected users using CRDTs. Provides "eventual consistency" across nodes without database writes - when it feels like working.

// Register presence handlers before subscribing or you'll miss the first sync
channel.on('presence', { event: 'sync' }, () => {
  const presences = channel.presenceState()
  console.log('Online users:', Object.keys(presences).length)
})

// Track user presence - track() only does something once the channel is subscribed
channel.subscribe(async (status) => {
  if (status === 'SUBSCRIBED') {
    await channel.track({
      user_id: 'user-123',
      name: 'John Doe',
      cursor_position: { x: 150, y: 300 },
      last_active: Date.now()
    })
  }
})
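
To keep ghost users out of that count (the phantom presence problem from earlier), one workaround is a client-side heartbeat plus a staleness filter when you read presence state. A sketch - the 10-second heartbeat and 30-second cutoff are arbitrary choices:

// Heartbeat: re-track every 10 seconds so last_active stays fresh.
// track() replaces your whole presence payload, so include everything you still want visible.
setInterval(() => {
  channel.track({
    user_id: 'user-123',
    name: 'John Doe',
    last_active: Date.now()
  })
}, 10000)

// When reading presence, ignore anyone who hasn't heartbeated recently.
const STALE_AFTER_MS = 30000

function onlineUsers() {
  const now = Date.now()
  return Object.values(channel.presenceState())
    .flat() // each presence key maps to an array of tracked states
    .filter((p) => now - p.last_active < STALE_AFTER_MS)
}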

Client Libraries and Integration

Official and community client libraries cover JavaScript (supabase-js), Flutter, Swift, Kotlin, Python, and C#. APIs are supposedly consistent across platforms. In practice, each has its own special way of breaking.

Success with Supabase Realtime isn't avoiding the quirks - it's understanding them and coding defensively around the inevitable failures.

Questions You'll Actually Ask While Debugging at 3AM

Q: What's the difference between Broadcast and Postgres Changes?

A: Broadcast sends ephemeral messages directly between connected clients without database storage, ideal for live interactions like cursor movements or chat messages. Postgres Changes streams actual database events (INSERT, UPDATE, DELETE) to subscribers, ensuring data persistence and consistency for critical updates.

Q: Why do my WebSocket connections keep dying?

A: Because WebSockets are fragile as hell. Mobile browsers kill background connections, corporate firewalls randomly drop WebSocket traffic, and load balancers don't understand heartbeats. Your connection shows "connected" but messages disappear into the void. Add aggressive reconnection logic or your users will think your app is broken.

Q: Does Supabase Realtime guarantee message delivery?

A: Fuck no. Realtime uses "best-effort delivery" which means "we'll try but no promises." Your chat message might vanish into the digital ether during a network hiccup. For anything important, store it in the database first, then broadcast a notification. Don't trust ephemeral messaging for critical data.

Q: Why is my Realtime bill 10x higher than expected?

A: Because every heartbeat, cursor movement, presence update, and failed reconnection attempt counts as a message. Your collaborative app with 20 users generated something like 40-60 million messages in a week - think the bill was around $150 just for cursor movements. Rate limit everything or your bill will explode. $2.50 per million sounds cheap until you realize a single user can generate thousands of messages per minute.

Q: Can I use Realtime across multiple regions?

A: Yes, Supabase Realtime operates as a global cluster. Messages automatically route through the shortest path between regions. A user in Singapore can communicate with users in the US with minimal added latency.

Q: What are the connection limits for Realtime?

A: The default configuration supports up to 16,384 WebSocket connections per node, with 100 acceptor processes handling incoming connections. Enterprise plans can customize these limits based on specific requirements.

Q: How do I secure Realtime channels?

A: Use Row Level Security (RLS) policies to control access to channels. Realtime respects PostgreSQL's authentication and authorization rules, ensuring users only receive data they're permitted to access.

Q: What happens if my database connection is lost?

A: Realtime automatically attempts to reconnect to the database from the nearest available region. Each region has multiple nodes for redundancy. Clients will experience a temporary interruption but should reconnect automatically once the connection is restored.

Q: Can I filter which database changes I receive?

A: Yes, you can filter by specific tables, schemas, and event types when subscribing to Postgres Changes. Use table publications and RLS policies to control data access at the database level.

Q: Is there a message size limit for Broadcast?

A: Broadcast messages are limited to 256KB per message. For larger payloads, consider storing data in the database and sending smaller notification messages through Broadcast to trigger data fetching.
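
In practice that store-then-notify pattern looks something like this - a sketch assuming a documents table and a channel both sides are already subscribed to; renderDocument() is a hypothetical UI helper:

// Store the large payload in the database, then broadcast only a tiny pointer to it.
async function shareDocument(body) {
  const { data, error } = await supabase
    .from('documents')
    .insert({ title: 'Q3 report', body })
    .select('id')
    .single()
  if (error) return console.error('Insert failed:', error)

  channel.send({
    type: 'broadcast',
    event: 'document_created',
    payload: { id: data.id } // a few bytes instead of a few hundred KB
  })
}

// Receivers fetch the full row only when they actually need it.
channel.on('broadcast', { event: 'document_created' }, async ({ payload }) => {
  const { data: doc } = await supabase
    .from('documents')
    .select('*')
    .eq('id', payload.id)
    .single()
  renderDocument(doc) // hypothetical UI helper
})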

Q: How does Presence handle network partitions?

A: Presence uses CRDTs (Conflict-Free Replicated Data Types) to maintain eventual consistency across network partitions. When connections are restored, presence state automatically synchronizes without conflicts.

Q: Why do my presence users never go offline?

A: Because presence state is built on "eventual consistency" which sometimes means "never consistent." Users who force-quit their browser or lose WiFi connection stay "online" forever. Ghost users accumulate like digital zombies until your user list becomes meaningless. You'll need to implement your own heartbeat system to purge stale presence data.

Q: My messages arrive out of order - is this normal?

A: Unfortunately, yes. During network congestion or server restarts, broadcast messages arrive whenever they feel like it. Your chat app shows "Hello" after "How are you?" and users get confused. Add sequence numbers to every message and sort them client-side, or accept the chaos.

Q: Can I use Realtime with my existing PostgreSQL database?

A: Maybe. Your database needs PostgreSQL 10+ with logical replication enabled, which means wal_level = logical and available replication slots. Most managed database providers (AWS RDS, Google Cloud SQL) don't enable this by default, and unlocking it means parameter-group surgery or a pricier plan. You might need to upgrade your plan or migrate to Supabase's PostgreSQL to get real-time features working.
