SQLite Performance Optimization: AI-Optimized Technical Reference
Configuration Settings That Actually Work
Critical Performance Settings
```sql
-- Essential performance configuration
PRAGMA journal_mode = WAL;         -- Enable Write-Ahead Logging
PRAGMA synchronous = NORMAL;       -- Reduce disk sync overhead
PRAGMA cache_size = -64000;        -- 64MB cache (negative = KB)
PRAGMA mmap_size = 268435456;      -- 256MB memory mapping
PRAGMA temp_store = MEMORY;        -- Keep temp tables in RAM
PRAGMA wal_autocheckpoint = 1000;  -- Default checkpoint interval
```
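Note that `journal_mode = WAL` persists in the database file, but `synchronous`, `cache_size`, `mmap_size`, and `temp_store` reset per connection, so apply them every time you connect. A minimal sketch using Python's built-in `sqlite3`; the helper name is illustrative:

```python
import sqlite3

def open_tuned(path: str = "app.db") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode = WAL")     # persistent: stored in the db file
    conn.execute("PRAGMA synchronous = NORMAL")   # per-connection
    conn.execute("PRAGMA cache_size = -64000")    # per-connection, negative = KB
    conn.execute("PRAGMA mmap_size = 268435456")  # per-connection
    conn.execute("PRAGMA temp_store = MEMORY")    # per-connection
    return conn
```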
Performance Impact by Configuration
Setting | Write Performance | Read Performance | Memory Usage | Data Safety | Trade-off |
---|---|---|---|---|---|
Default Settings | 30-60 writes/sec | Decent | 2MB | Safe | Trades performance for safety |
WAL + Normal Sync | 3000+ writes/sec | Decent | Medium | 99.9% safe | Seconds of data-loss risk on power failure |
WAL + Off Sync | 10000+ writes/sec | Decent | Medium | High risk | Trades all safety for speed |
Large Cache (64MB+) | Variable | Up to 10x faster | High | Safe | Trades RAM for speed |
Memory Mapping | Variable | Up to 5x faster | High | Safe | Trades RAM for read speed |
Critical Failure Modes and Solutions
Transaction Batching - Most Common Performance Killer
Problem: Individual INSERTs cause disk sync per operation
- Symptoms: 30-50 inserts/second maximum, import scripts taking hours
- Root Cause: Each INSERT is its own transaction requiring disk confirmation
- Solution: Batch operations in transactions
```python
import sqlite3

conn = sqlite3.connect("app.db")

# WRONG: one transaction (and one disk sync) per row -- 200K syncs
for name, email in data:
    conn.execute("INSERT INTO users (name, email) VALUES (?, ?)", (name, email))
    conn.commit()

# CORRECT: one transaction for the whole batch -- a single disk sync
with conn:  # wraps the block in BEGIN ... COMMIT
    conn.executemany("INSERT INTO users (name, email) VALUES (?, ?)", data)
```
- Performance Impact: 8 hours → 8 minutes (a ~60x improvement)
- Batch Size Limits: 5K-10K records per batch; larger batches hold the write lock long enough to starve readers (see the chunking sketch below)
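A chunked-batch sketch that respects the limits above, assuming the `users` table from the example (the helper name and batch size are illustrative):

```python
import sqlite3

def insert_in_batches(conn: sqlite3.Connection, rows, batch_size=5000):
    # One COMMIT (one disk sync) per batch; the write lock is released
    # between batches so readers aren't starved by a single huge transaction
    for i in range(0, len(rows), batch_size):
        with conn:  # BEGIN ... COMMIT around each batch
            conn.executemany(
                "INSERT INTO users (name, email) VALUES (?, ?)",
                rows[i:i + batch_size],
            )
```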
WAL Mode Silent Failures
Problem: WAL mode silently disabled on incompatible filesystems
- Docker for Mac: WAL mode fails silently, falls back to DELETE journal
- Network Filesystems: NFS doesn't support shared memory required for WAL
- Detection: Run `PRAGMA journal_mode;` to verify the actual mode
- Symptoms: Expected performance gains don't materialize
- Workaround: Use containers with native filesystem or PostgreSQL for network storage
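One way to fail fast instead of silently: `PRAGMA journal_mode = WAL` returns the mode actually in effect, so a startup check can catch the fallback. A sketch; the error message is illustrative:

```python
import sqlite3

conn = sqlite3.connect("app.db")
mode = conn.execute("PRAGMA journal_mode = WAL").fetchone()[0]
if mode.lower() != "wal":
    # The filesystem refused WAL (NFS, osxfs, ...) and SQLite fell back silently
    raise RuntimeError(f"WAL unavailable: journal_mode is {mode!r}")
```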
Backup Script Failures with WAL Mode
Problem: WAL mode spreads the database across three files (.db, .db-wal, .db-shm), but backup scripts often copy only the .db file
- Data Loss Risk: Recently committed data sits in the WAL file and never makes it into the backup
- Solution: Checkpoint before backup: `PRAGMA wal_checkpoint(TRUNCATE);`
- Alternative: Copy all three files atomically
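A sketch of both approaches in Python; the stdlib `Connection.backup()` API copies a consistent snapshot regardless of WAL state, which sidesteps the three-file problem entirely (file names are illustrative):

```python
import sqlite3

src = sqlite3.connect("app.db")

# Option 1: fold WAL contents into the main file so a plain file copy is complete
src.execute("PRAGMA wal_checkpoint(TRUNCATE)")

# Option 2 (safer): the online backup API snapshots the whole database
dst = sqlite3.connect("backup.db")
src.backup(dst)
dst.close()
```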
Index Design Failures
Critical Limitation: SQLite uses only ONE index per table per query
```sql
-- INEFFECTIVE: Separate indexes don't combine
CREATE INDEX idx_users_name ON users(name);
CREATE INDEX idx_users_status ON users(status);

-- Query uses only ONE index, scans for the rest
SELECT * FROM users WHERE name = 'Alice' AND status = 'active';

-- CORRECT: Composite index
CREATE INDEX idx_users_name_status ON users(name, status);
```
- Column Order: Put the most selective equality-tested column first
- Leftmost Prefix Rule: A composite index is only usable through a leftmost prefix of its columns; a query filtering only on `status` can't use `(name, status)`
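`EXPLAIN QUERY PLAN` confirms whether the composite index is actually picked up: look for `SEARCH ... USING INDEX` rather than `SCAN`. A sketch; the detail text is the last column of each plan row:

```python
import sqlite3

conn = sqlite3.connect("app.db")
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM users WHERE name = ? AND status = ?",
    ("Alice", "active"),
).fetchall()
for row in plan:
    print(row[-1])  # e.g. "SEARCH users USING INDEX idx_users_name_status (name=? AND status=?)"
```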
Memory Configuration Disasters
OOMKill Risk: Memory mapping + large cache can exceed container limits
- Kubernetes: mmap_size = 2GB + cache + app memory > container limit = OOMKill
- Safe Formula: mmap_size + cache_size < 50% of available memory
- Platform Issues: macOS virtual memory behavior is unpredictable with large mappings; Windows is not recommended for production
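A sizing sketch for the 50% rule above; the split between cache and mmap is an assumption, tune it for your workload:

```python
import sqlite3

container_limit_mb = 512
budget_mb = container_limit_mb // 2       # keep SQLite under 50% of the limit
cache_mb = budget_mb // 4                 # assumption: 1/4 cache, 3/4 mmap
mmap_mb = budget_mb - cache_mb

conn = sqlite3.connect("app.db")
conn.execute(f"PRAGMA cache_size = -{cache_mb * 1024}")      # negative = KB
conn.execute(f"PRAGMA mmap_size = {mmap_mb * 1024 * 1024}")  # bytes
```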
Resource Requirements and Scaling Limits
Memory Requirements by Use Case
- Development: 32MB cache, 256MB mmap
- Production Web App: 64-128MB cache per connection
- Analytics Workload: Up to 25% of system RAM
- Container (512MB): 8-16MB cache, minimal mmap
- Container (2GB+): 64MB cache, 256MB mmap
When to Abandon SQLite
Hard Limits:
- Concurrent Writers: More than ~100 writes/second from multiple connections
- Database Size: Multi-terabyte databases (technically possible, operationally painful)
- Geographic Distribution: No built-in replication
- Complex Analytics: No custom types or parallel query; JSON (json1) and full-text search (FTS5) exist but trail server-side equivalents
Migration Threshold: Expensify processes millions of requests/day on SQLite - don't migrate prematurely
Connection Pool Anti-Pattern
Problem: More connections hurt SQLite performance
- Why: Each connection has separate cache (50 connections × 10MB = 500MB duplicated cache)
- Better: 5 connections × 100MB cache each
- Thread Safety: Use one connection per thread to avoid corruption (see the sketch below)
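A thread-local connection sketch following the "few connections, big caches" rule; the helper name and cache size are illustrative:

```python
import sqlite3
import threading

_local = threading.local()

def get_conn(path: str = "app.db") -> sqlite3.Connection:
    # One connection per thread: no cross-thread sharing, no duplicated pool
    if not hasattr(_local, "conn"):
        _local.conn = sqlite3.connect(path)
        _local.conn.execute("PRAGMA cache_size = -102400")  # ~100MB for this connection
    return _local.conn
```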
Debugging and Monitoring
Essential Diagnostic Commands
```sql
-- Performance analysis
EXPLAIN QUERY PLAN SELECT ...;     -- Find table scans and missing indexes
.timer on                          -- sqlite3 shell only: measure query execution time
.stats on                          -- sqlite3 shell only: monitor per-query statistics

-- Health checks
PRAGMA journal_mode;               -- Verify WAL mode enabled
PRAGMA cache_size;                 -- Check memory allocation
PRAGMA wal_checkpoint(TRUNCATE);   -- Force checkpoint and cleanup
```
Critical Warning Signs
- "SCAN TABLE" in query plan: Missing index, checking every row
- "USING TEMP B-TREE": Building temporary indexes in memory
- WAL file >1GB: Checkpoints failing, disk space risk
- "Database is locked": Transaction never committed or connection leak
- Cache hit ratio <90%: Insufficient cache for workload
Production Monitoring Checklist
- Slow Query Threshold: >100ms indicates problems
- File Size Monitoring: Database + WAL file growth
- Error Rate: "Database is locked" errors indicate serious issues
- Memory Usage: Cache + mmap vs available memory
- Checkpoint Frequency: WAL should checkpoint regularly
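A sketch of a health probe covering the checklist items SQLite itself can report; paths and thresholds are assumptions:

```python
import os
import sqlite3

def sqlite_health(db_path: str = "app.db") -> dict:
    conn = sqlite3.connect(db_path)
    wal_path = db_path + "-wal"
    wal_mb = os.path.getsize(wal_path) / 2**20 if os.path.exists(wal_path) else 0.0
    health = {
        "journal_mode": conn.execute("PRAGMA journal_mode").fetchone()[0],
        "cache_pages": conn.execute("PRAGMA cache_size").fetchone()[0],
        "db_mb": os.path.getsize(db_path) / 2**20,
        "wal_mb": wal_mb,
        "wal_too_big": wal_mb > 1024,  # >1GB suggests checkpoints are failing
    }
    conn.close()
    return health
```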
Emergency Performance Recovery
Immediate Actions for Production Issues
- Checkpoint WAL file: `PRAGMA wal_checkpoint(TRUNCATE);`
- Update statistics: `ANALYZE;`
- Increase cache: `PRAGMA cache_size = -128000;` (128MB)
- Enable memory temp storage: `PRAGMA temp_store = memory;`
Maintenance Window Actions
- Defragment database: `VACUUM;`
- Rebuild indexes: `REINDEX;`
- Check file fragmentation: `filefrag -v database.db`
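The same maintenance pass as a script for scheduled windows; note that `VACUUM` needs free disk roughly equal to the database size and takes an exclusive lock (the helper name is illustrative):

```python
import sqlite3

def maintenance(path: str = "app.db") -> None:
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA wal_checkpoint(TRUNCATE)")  # fold WAL back into the db
    conn.execute("VACUUM")    # rewrite the file to defragment; exclusive lock
    conn.execute("REINDEX")   # rebuild all indexes
    conn.execute("ANALYZE")   # refresh query-planner statistics
    conn.close()
```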
Lock Debugging Strategy
```python
# Log transaction lifecycle to find hanging transactions
import logging
import threading
import time

def debug_transaction(conn):
    # assumes the connection is in autocommit mode (isolation_level=None),
    # so the explicit BEGIN/COMMIT below are the only transaction boundaries
    thread_id = threading.get_ident()
    transaction_start = time.time()
    try:
        conn.execute("BEGIN")
        logging.info(f"Transaction started on thread {thread_id}")
        # ... your operations ...
        conn.execute("COMMIT")
        duration = time.time() - transaction_start
        logging.info(f"Transaction completed in {duration:.2f}s")
    except Exception as e:
        logging.error(f"Transaction failed on thread {thread_id} after "
                      f"{time.time() - transaction_start:.2f}s: {e}")
        conn.execute("ROLLBACK")
        raise
```
Performance Testing Framework
Load Testing with Real Data Patterns
```sql
-- Generate realistic test data (assumes test_table(id, status, created_at));
-- abs() keeps random() from producing negative ids and offsets
INSERT INTO test_table
SELECT
    abs(random()) % 1000000 AS id,
    CASE WHEN abs(random()) % 10 = 0 THEN 'premium' ELSE 'standard' END AS status,
    datetime('now', '-' || (abs(random()) % 365) || ' days') AS created_at
FROM (
    WITH RECURSIVE series(x) AS (
        SELECT 0 UNION ALL SELECT x + 1 FROM series LIMIT 1000000
    )
    SELECT x FROM series
);
```
Automated Performance Regression Detection
```python
import time

def benchmark_critical_queries(conn):
    critical_queries = [
        ("User lookup", "SELECT * FROM users WHERE email = ?", ['test@example.com']),
        ("Status filter", "SELECT * FROM users WHERE status = ?", ['active']),
        ("Date range", "SELECT * FROM orders WHERE created_at > ?", ['2024-01-01']),
    ]
    for name, query, params in critical_queries:
        times = []
        for _ in range(100):
            start = time.perf_counter()
            conn.execute(query, params).fetchall()
            times.append(time.perf_counter() - start)
        avg_time = sum(times) / len(times)
        p95_time = sorted(times)[94]  # 95th percentile of 100 samples
        if avg_time > 0.1 or p95_time > 0.1:  # 100ms threshold
            print(f"PERFORMANCE REGRESSION: {name} avg {avg_time:.3f}s, p95 {p95_time:.3f}s")
```
Technical Specifications and Thresholds
Safe Operating Limits
- Transaction Batch Size: 5K-10K operations
- Cache Size: 25-50% of available RAM
- WAL File Size: <1GB (checkpoint when larger)
- Memory Mapping: <50% of container memory limit
- Connection Pool: 3-5 connections maximum
- Busy Timeout: `PRAGMA busy_timeout = 30000;` (30 seconds) so lock contention waits instead of failing immediately
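Both the Python driver and the pragma set the same underlying lock-wait limit; a sketch of either route:

```python
import sqlite3

conn = sqlite3.connect("app.db", timeout=30)  # driver-level: wait up to 30s for locks
conn.execute("PRAGMA busy_timeout = 30000")   # pragma-level: same limit, in milliseconds
```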
Platform-Specific Considerations
- Linux: Optimal performance, supports all features
- macOS: Unpredictable memory mapping behavior
- Windows: Avoid for production workloads
- Docker: WAL mode fails on osxfs, use Linux containers
- Network Storage: File locking is unreliable; use PostgreSQL instead
This reference provides the operational intelligence needed to successfully implement and maintain SQLite in production environments while avoiding common failure modes that cause performance degradation and data integrity issues.
Useful Links for Further Investigation
SQLite Performance Resources That Actually Help
Link | Description |
---|---|
SQLite PRAGMA Statements | The configuration reference you'll actually use for managing and optimizing SQLite database behavior. |
Write-Ahead Logging (WAL) | Official WAL mode docs when you need to understand what broke, detailing its benefits and operational aspects. |
EXPLAIN QUERY PLAN | How to debug slow queries by analyzing the execution plan of SQL statements in SQLite. |
phiresky's SQLite Performance Guide | A real-world optimization guide for SQLite performance tuning that provides practical advice and actually works. |
Simon Willison's SQLite TILs | Practical tips and tricks from someone who knows what they're talking about, covering various SQLite use cases. |
Expensify's 4M QPS on SQLite | A detailed case study on how Expensify managed to scale SQLite to millions of requests per day on a single server. |
SQLite vs Filesystem Performance | An analysis explaining why SQLite can often be faster and more efficient than direct file system access for data storage. |
Node.js better-sqlite3 | A fast SQLite wrapper for Node.js applications, built around a deliberately synchronous API. |
Python sqlite3 docs | The official Python SQLite documentation, covering the `sqlite3` module for interacting with SQLite databases. |