How Selenium Grid Actually Works (And Why It's Painful)

Selenium Grid takes your normal Selenium WebDriver tests and runs them on multiple browsers at once. Instead of your laptop running one Chrome browser testing your app, Grid spins up 5 Chrome browsers on different machines and splits your tests between them.

The catch? Setting up Grid is like assembling IKEA furniture - the instructions look simple until you actually try it. Selenium 4 supposedly made this easier by splitting everything into a microservices architecture. In practice, it just gave you more moving parts to break.

The Six Components That Need To Talk To Each Other

Grid 4 is split into six services that need to play nice together. The Distributor and Session Map are the ones that'll ruin your day when they break. The Distributor decides which browser gets which test - when it dies, all your tests just hang forever. The Session Map remembers which browser is doing what, and when it gets confused (a daily occurrence), tests start sending commands to random browsers.

The Router is basically a traffic cop, the Session Queue holds tests in line, and the Event Bus lets everything gossip with each other. The Node actually runs browsers and is where Chrome crashes, Firefox leaks memory, and everything goes sideways.

Hub-Node vs Fully Distributed (Both Have Problems)

Grid 3 had a simple hub-node setup where one hub managed everything. Grid 4's distributed approach splits the hub into separate services.

The distributed model is supposedly more reliable because if one piece crashes, the others keep working. In reality, you now have 6 things that can break instead of 2. Each component failure manifests differently, making debugging distributed systems a special kind of hell.

The old hub-node model still works and is simpler to understand when things break. For small teams, it's probably the right choice unless you enjoy troubleshooting distributed systems at 2 AM.

When You Request A Browser Session

Here's what happens when your test asks for a Chrome browser:

  1. Router gets your request for Chrome
  2. Distributor checks if any Chrome nodes have a free slot
  3. If every Chrome slot is busy, the Session Queue holds your test in line
  4. When a slot frees up, the Distributor hands your test to a Node
  5. The Node starts Chrome and the Session Map records which session lives on which node
  6. Router sends your test commands to the right Chrome instance

This works great until step 5 fails because Chrome ran out of memory, or step 6 fails because the Session Map forgot which browser was which. The Event Bus is supposed to keep everything synchronized, but events can get lost during network hiccups or component restarts.

When it works, you get parallel test execution. When it breaks, you get tests hanging indefinitely with no clear error messages.
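
One partial defense is to stop trusting Grid to fail loudly and set aggressive timeouts on the client side, so a dead Distributor or forgotten session becomes an error instead of an all-day hang. A rough sketch, assuming Selenium 4's RemoteWebDriver builder and ClientConfig (the Grid URL and durations are placeholders):

import java.time.Duration;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.remote.RemoteWebDriver;
import org.openqa.selenium.remote.http.ClientConfig;

public class FailFastSession {
    public static void main(String[] args) {
        // Cap how long the HTTP client waits on Grid before giving up.
        ClientConfig config = ClientConfig.defaultConfig()
            .connectionTimeout(Duration.ofSeconds(30))  // can't even reach the Router
            .readTimeout(Duration.ofMinutes(5));        // session creation or a command hangs

        WebDriver driver = RemoteWebDriver.builder()
            .address("http://grid-url:4444")   // placeholder Grid endpoint
            .config(config)
            .oneOf(new ChromeOptions())
            .build();
        try {
            driver.get("https://example.com");
        } finally {
            driver.quit();   // always release the slot, even after a failure
        }
    }
}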

When Grid Makes Sense (And When It Doesn't)

Grid is good for one thing: running lots of tests faster by splitting them across multiple browsers. But it comes with enough baggage that you should think twice before setting it up.

Selenium Grid UI Dashboard

Parallel Testing (The Main Reason Anyone Uses This)

If your test suite takes forever to run, Grid can split it across multiple browsers to finish faster. A 4-hour test suite can run in 30 minutes if you spread it across 8 browsers. The math works - until browsers start crashing randomly and you're back to debugging instead of testing.

Jenkins, GitHub Actions, and other CI tools can hit Grid endpoints directly. Your existing tests don't need to change - just point them at http://grid-url:4444 instead of starting browsers locally.

The catch is that parallel testing only works if your tests are actually independent. If they share data, create conflicting state, or depend on specific timing, parallel execution will make them flaky as hell.
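
Keeping each session isolated to its own thread is most of that battle. A minimal sketch of the usual pattern, assuming JUnit 5 with parallel execution enabled in junit-platform.properties and a placeholder Grid URL:

// Sketch: one isolated RemoteWebDriver per test thread, nothing shared between tests.
// Assumes junit.jupiter.execution.parallel.enabled=true in junit-platform.properties.
import java.net.MalformedURLException;
import java.net.URL;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.remote.RemoteWebDriver;

class ParallelGridTest {
    private static final ThreadLocal<WebDriver> DRIVER = new ThreadLocal<>();

    @BeforeEach
    void startBrowser() throws MalformedURLException {
        DRIVER.set(new RemoteWebDriver(
            new URL("http://grid-url:4444"), new ChromeOptions()));  // placeholder URL
    }

    @AfterEach
    void stopBrowser() {
        DRIVER.get().quit();   // release the Grid slot
        DRIVER.remove();
    }

    @Test
    void homepageLoads() {
        WebDriver driver = DRIVER.get();
        driver.get("https://your-app.example");   // placeholder app URL
        // assertions go here; keep test data unique per test so runs stay independent
    }
}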

Cross-Browser Testing (Chrome vs Firefox vs Safari)

Grid can run the same test on Chrome 119, Firefox 118, Safari 17, and Edge all at once. This is useful for catching browser-specific bugs that only show up in production.

Browser Support Matrix

Chrome eats RAM like a teenager eats pizza. Firefox is more stable until profiles get corrupted around test #100. Safari costs more than your mortgage and only works on expensive Mac hardware. Edge exists. IE compatibility mode? Just don't.

In practice, most teams run 90% of tests on Chrome and spot-check on other browsers. Cross-browser testing sounds comprehensive but becomes maintenance hell quickly.
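
Spot-checking another browser doesn't require new tests - just different options when you ask Grid for a session. A sketch, assuming Chrome and Firefox nodes are registered and using a placeholder Grid URL:

import java.net.URL;
import java.util.List;
import org.openqa.selenium.Capabilities;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.firefox.FirefoxOptions;
import org.openqa.selenium.remote.RemoteWebDriver;

public class CrossBrowserSpotCheck {
    public static void main(String[] args) throws Exception {
        // Grid matches these options against whatever its nodes registered;
        // ask for a browser no node offers and the request just sits in the queue.
        for (String browser : List.of("chrome", "firefox")) {
            WebDriver driver = new RemoteWebDriver(
                new URL("http://grid-url:4444"), optionsFor(browser));  // placeholder URL
            try {
                driver.get("https://your-app.example");  // run the same checks on each browser
            } finally {
                driver.quit();
            }
        }
    }

    static Capabilities optionsFor(String browser) {
        switch (browser) {
            case "chrome":  return new ChromeOptions();
            case "firefox": return new FirefoxOptions();
            default: throw new IllegalArgumentException("No node offers " + browser);
        }
    }
}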

Cost Analysis (Self-Hosted vs Cloud Services)

Running your own Grid can save money if you have enough tests. The break-even point is around 100-200 test sessions per day. Above that, self-hosting becomes cheaper than BrowserStack, Sauce Labs, LambdaTest, or AWS Device Farm.

But the cost comparison is misleading because it ignores operational overhead. You need someone who understands Docker, Kubernetes, and distributed systems troubleshooting. That person will spend 20% of their time keeping Grid running instead of doing other work.

Cloud services cost more but eliminate the maintenance headache. They also provide better browser coverage and handle the infrastructure scaling for you.

Deployment Options (All Have Trade-offs)

Docker Compose - Works for small teams with predictable testing loads. Use official Docker images and accept that you'll need to restart containers regularly when Chrome leaks memory.

Kubernetes - Required for auto-scaling and production workloads. KEDA can scale browser nodes based on test queue depth, but you need Kubernetes expertise to make this work reliably.

KEDA Auto-scaling Architecture

CI Integration - Your test framework doesn't care whether browsers are local or remote. Change the WebDriver URL from localhost to your Grid endpoint and tests run the same. The integration is the easy part.
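
One low-effort way to do that is a tiny driver factory that reads the Grid URL from an environment variable your CI job sets. A sketch, with GRID_URL as a made-up variable name:

// Sketch: same test code locally and in CI - the only difference is one env var.
// GRID_URL is a name you define in your CI config; nothing magical about it.
import java.net.URL;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.remote.RemoteWebDriver;

class DriverFactory {
    static WebDriver create() throws Exception {
        String gridUrl = System.getenv("GRID_URL");   // e.g. http://grid-url:4444 in CI
        ChromeOptions options = new ChromeOptions();
        if (gridUrl == null || gridUrl.isEmpty()) {
            return new ChromeDriver(options);          // local run on a laptop
        }
        return new RemoteWebDriver(new URL(gridUrl), options);  // CI run against Grid
    }
}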

Monitoring (You'll Need This When Things Break)

Grid 4 has GraphQL endpoints for checking status and OpenTelemetry tracing for debugging. These sound fancy but you'll mostly be checking whether browsers are alive and sessions aren't stuck.
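
Checking whether Grid is alive and whether the queue is backing up is one HTTP POST. A sketch using the JDK's HttpClient - the field names (maxSession, sessionCount, sessionQueueSize) follow Grid 4's GraphQL schema, so verify them against your version, and the URL is a placeholder:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class GridHealthCheck {
    public static void main(String[] args) throws Exception {
        // Ask Grid how many sessions are running and how many are stuck in the queue.
        String query = "{\"query\":\"{ grid { maxSession, sessionCount, sessionQueueSize } }\"}";
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("http://grid-url:4444/graphql"))   // placeholder endpoint
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(query))
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        // If sessionQueueSize keeps climbing, nodes are dying or everything is busy.
        System.out.println(response.body());
    }
}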

Video recording captures test failures for debugging, but generates massive amounts of data. A busy Grid can create terabytes of videos monthly. Most teams enable recording only for failed tests to keep storage costs reasonable.

Without proper monitoring, Grid failures look like random test flakiness. You'll waste hours debugging "flaky tests" that are actually infrastructure problems.

Selenium Grid Deployment Reality Check

| Aspect | Standalone Mode | Hub-Node Mode | Fully Distributed | Cloud Services |
|---|---|---|---|---|
| Setup Complexity | 30 minutes | Half a day | 2-3 weeks | Sign up and pay |
| Realistic Session Limit | ~8-12 browsers | ~30-50 browsers | 100-300 browsers | Whatever you pay for |
| Resource Requirements | 4GB RAM minimum | 8-16GB RAM | 32GB+ RAM | Credit card |
| What Happens When It Breaks | Restart everything | Find which node died | Good luck debugging | Call support |
| Scaling Method | docker-compose restart | Add more containers | Kubernetes magic | Automatic |
| Browser Support | Chrome + Firefox | Chrome + Firefox | Chrome + Firefox + headaches | Everything |
| Monthly Cost (rough) | $50 to who-knows-what | Starts at $300, spirals | Anywhere from $1k to "holy shit the AWS bill" | Whatever you pay for not debugging Grid at 3am |
| Time Spent Maintaining | 1-2 hours/week | 4-6 hours/week | 8-16 hours/week | 0 hours |
| Customization | Change Docker args | Write more YAML | Kubernetes configs | API calls |
| When It Works | 85% of the time | 75% of the time | 60% of the time | 95% of the time |
| Best For | Development | Small teams | Masochists | Teams with money |

Setting Up Selenium Grid (Prepare for Pain)

Setting up Grid is where the theory meets reality and reality wins. Plan for this to take 3 times longer than expected and fail at least twice before working.

Docker Compose Quick Start (Famous Last Words)

The Docker Compose approach looks simple in tutorials. Here's what they don't tell you:

Docker Selenium Grid Setup

# docker-compose.yml - This will break the first time you try it
version: '3.8'
services:
  selenium-hub:
    image: selenium/hub:4.35.0
    container_name: selenium-hub
    ports:
      - "4442:4442"  # Event Bus publish port (nodes register through this)
      - "4443:4443"  # Event Bus subscribe port
      - "4444:4444"  # Router - point your tests here
    environment:
      - SE_SESSION_REQUEST_TIMEOUT=300  # stop queueing requests after 5 minutes
      - SE_SESSION_RETRY_INTERVAL=5

  chrome:
    image: selenium/node-chrome:4.35.0
    shm_size: 2gb  # Chrome WILL crash without this
    depends_on:
      - selenium-hub
    environment:
      - SE_EVENT_BUS_HOST=selenium-hub
      - SE_EVENT_BUS_PUBLISH_PORT=4442
      - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
      - SE_NODE_MAX_SESSIONS=2  # Don't get greedy, 2 max
      - SE_NODE_SESSION_TIMEOUT=300  # kill sessions that hang
    deploy:
      replicas: 3  # Start with 3, scale up if it survives

  firefox:
    image: selenium/node-firefox:4.35.0
    shm_size: 2gb  # Firefox needs this too
    depends_on:
      - selenium-hub
    environment:
      - SE_EVENT_BUS_HOST=selenium-hub
      - SE_EVENT_BUS_PUBLISH_PORT=4442
      - SE_EVENT_BUS_SUBSCRIBE_PORT=4443
      - SE_NODE_MAX_SESSIONS=2
      - SE_NODE_SESSION_TIMEOUT=300
    deploy:
      replicas: 2

Run docker-compose up and watch it crash. Common failures on first run:

  • Chrome containers exit with code 125 (permission issues)
  • Containers can't reach each other (Docker networking)
  • Hub starts but nodes never connect (firewall/DNS issues)
  • Everything starts but sessions hang (memory limits too low)

Code Changes (The Easy Part)

Changing your tests to use Grid is actually straightforward:

// Before: local ChromeDriver
WebDriver driver = new ChromeDriver();

// After: RemoteWebDriver pointing at Grid
// (new URL(...) throws the checked MalformedURLException, so declare or catch it)
ChromeOptions options = new ChromeOptions();
WebDriver driver = new RemoteWebDriver(
    new URL("http://localhost:4444"), options
);

The hard part isn't the code change - it's debugging why http://localhost:4444 returns connection refused errors even though the hub is supposedly running. I once spent 12 hours debugging this before realizing the hub container was binding to 127.0.0.1 instead of 0.0.0.0. Docker networking can eat shit.

Choosing Your Deployment Hell

Docker Compose - Good for development and small teams. Expect to restart containers daily when Chrome leaks memory. Works 80% of the time, which is better than the alternatives.

Kubernetes - Required for production but brings complexity you didn't ask for. You'll spend more time debugging YAML than running tests. Only attempt this if you already have Kubernetes expertise.

Cloud Services - Costs 3x more but someone else deals with the infrastructure. If your time is worth anything, this is usually the right choice.

Configuration That Actually Works

The official examples are optimistic. Here's what works in practice:

Chrome needs special care:

  • shm_size: 2gb minimum or it crashes randomly - I spent 8 hours debugging mysterious Chrome crashes before finding this gem
  • `--no-sandbox --disable-dev-shm-usage` in Chrome options or Chrome refuses to start (see the sketch after this list)
  • Restart Chrome nodes every 50-100 tests to prevent memory leaks
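
For reference, those flags set from the test side look like this (a sketch with a placeholder Grid URL; --no-sandbox weakens Chrome's isolation, so keep it to throwaway containers):

import java.net.URL;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.remote.RemoteWebDriver;

public class ContainerChromeOptions {
    public static void main(String[] args) throws Exception {
        ChromeOptions options = new ChromeOptions();
        // The two flags Chrome-in-Docker usually needs to start and stay up.
        options.addArguments("--no-sandbox", "--disable-dev-shm-usage");
        WebDriver driver = new RemoteWebDriver(
            new URL("http://grid-url:4444"), options);  // placeholder Grid endpoint
        try {
            driver.get("https://your-app.example");     // placeholder app URL
        } finally {
            driver.quit();
        }
    }
}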

Firefox is more stable but:

  • Profile cleanup every 50 tests or profiles get corrupted
  • Avoid addons completely - they break session isolation
  • Use clean profiles (--profile /tmp/firefox-profile-$RANDOM)

Session limits matter:

  • Max 2 sessions per Chrome node (1 is safer)
  • Max 3 sessions per Firefox node
  • Higher limits = random failures and debugging nightmares

Monitoring (You'll Need This When It Breaks)

The Grid status page at http://grid:4444/ui shows pretty charts but won't tell you why tests are hanging.

Selenium Grid Status Dashboard

Watch for these warning signs:

  • Session queue depth >5: Nodes are dying faster than tests can run
  • Session assignment time >30 seconds: Distributor is overloaded
  • Node restarts >3/hour: Memory leaks or browser crashes
  • Tests timing out randomly: Network issues between components

Set up Grafana dashboards if you want pretty graphs, but mostly you'll be tailing Docker logs and restarting containers.

When Things Go Wrong (They Will)

The debugging process for most common failures is the same: check Docker logs, restart containers, sacrifice a goat, repeat.

Questions You'll Ask (And Wish You Hadn't)

Q: Should I use Selenium Grid 3 or 4?

A: Use Grid 4 if you enjoy debugging microservices. Use Grid 3 if you want something that actually works.

Grid 4 split the monolithic hub into 6 separate components that need to talk to each other perfectly. This supposedly provides better scalability and fault tolerance. In practice, you get 6 things that can break instead of 1. Grid 4's architecture looks impressive on paper but debugging distributed failures is a nightmare.

Grid 3's hub-node model is simpler: one hub, multiple nodes. When it breaks, you know where to look. When Grid 4 breaks, good luck figuring out which of the 6 components decided to stop working.

Q: How many sessions can I run simultaneously?

A: Depends on how much pain you can tolerate. In theory: unlimited. In practice: way fewer than you think.

Start with 10-20 sessions and see what breaks first. Usually it's Chrome eating all your RAM or the Distributor giving up when session assignment takes >30 seconds. I've seen setups handle 100+ sessions, but they require dedicated infrastructure babysitting.

The official docs claim you can run 1000+ sessions. They don't mention you'll need a full-time DevOps engineer to keep it running.

Q: What hardware do I actually need?

A: More than the docs suggest. Chrome alone uses 1-3GB per session and crashes randomly if you don't give it enough memory. Firefox is lighter but corrupts profiles after ~100 tests.

For a basic setup that doesn't fall over immediately:

  • 8GB RAM minimum (16GB if you want Chrome to not crash)
  • 4 CPU cores (more if you value your sanity)
  • Fast SSD storage (browser profiles generate tons of temp files)

The resource calculators online assume browsers behave predictably. They don't.

Q: Docker or Kubernetes?

A: Docker Compose for development and small teams. It's simpler and you can restart the whole thing with one command when it inevitably breaks.

Kubernetes for production if you already have K8s expertise. Otherwise, you're adding container orchestration problems on top of Grid problems. That's two complex systems to debug instead of one.

Cloud services if your time is worth more than $50/hour.

Q: How does it compare to BrowserStack/Sauce Labs?

A: Self-hosted Grid is cheaper if you ignore the operational overhead. Cloud services cost more but someone else deals with browser crashes at 3 AM.

Break-even point is around 100-200 daily test sessions. Below that, cloud services are cheaper when you factor in your time. Above that, self-hosting saves money but costs sanity.

Cloud services provide more browser/OS combinations and better support. Your Grid will run Chrome and Firefox reliably, maybe Safari if you hate money.

Q: Which browsers actually work?

A: Chrome works best but eats memory like a black hole. Firefox is more stable but profile corruption will drive you insane. Safari only works on expensive Mac hardware. Edge... just don't.

In reality, most teams run 95% of tests on Chrome and spot-check on Firefox. Cross-browser testing sounds comprehensive but maintaining multiple browser configurations is exhausting.

Q: What happens when browsers crash?

A: They crash a lot. Chrome runs out of memory, Firefox corrupts profiles, Safari does mysterious macOS things. Grid tries to detect crashes but the Session Map often forgets which browser was doing what.

Your tests will hang indefinitely waiting for a browser that crashed 10 minutes ago. Set aggressive timeouts (5 minutes max) and restart nodes regularly. Browser crashes are a feature, not a bug.

Q: Can I run mobile tests?

A: Chrome mobile emulation works decently for basic responsive testing. Real device testing requires USB connections or ADB wireless debugging, both of which add complexity you probably don't need.

iOS testing needs macOS machines and is expensive to set up correctly. Android testing is more feasible but cloud services handle this better than self-hosted Grid.

Q: How long until I get this working?

A: Plan for 2-3 weeks if you're new to container orchestration. Plan for 1-2 months to get it stable enough for production. Plan for ongoing maintenance forever.

The "quick start" tutorials skip the parts where containers fail to communicate, Chrome crashes on startup, and tests hang randomly. You'll spend more time debugging infrastructure than writing tests.

Q: How do I debug when everything breaks?

A: Check Docker logs first: docker-compose logs -f. Look for OOM kills, connection failures, and browser crash dumps. When in doubt, restart everything and try again.

Common debugging steps:

  1. Are containers actually running? (docker ps)
  2. Can containers reach each other? (docker exec -it container ping other-container)
  3. Is Chrome getting enough shared memory? (Check /dev/shm usage)
  4. Are browser processes still alive? (ps aux | grep chrome)
  5. Restart everything and hope it works this time

The Grid status page shows pretty graphs but rarely explains why tests are failing.

Q: Is Grid secure?

A: No. Don't expose it to the internet. Grid accepts arbitrary WebDriver commands from anyone who can reach it. Put it behind a VPN or firewall and pray.

Container scanning, network isolation, and regular updates help but Grid wasn't designed with security as a priority. Cloud services handle security better than you will.
