Discovery Phase: Finding Out How Fucked You Really Are
Look, I get it.
Management wants this "simple database migration" done by Friday. You're thinking "how hard can it be?" Let me save you some pain: this discovery shit will save you months of debugging and prevent the kind of data loss that ends careers.
Before you even think about touching production, you need to understand what clusterfuck you're actually dealing with.
Database Inventory: The Shit Nobody Told You About
Start with the obvious, then prepare to be horrified. That "simple" application your PM described? It probably has more databases than a fucking casino.
Map Your MySQL Ecosystem (And Cry)
Run this query on each MySQL server to see how many databases are hiding in your infrastructure:
SELECT
    s.SCHEMA_NAME,
    SUM(t.DATA_LENGTH + t.INDEX_LENGTH) / 1024 / 1024 / 1024 AS SIZE_GB,
    COUNT(*) AS TABLE_COUNT
FROM information_schema.SCHEMATA s
JOIN information_schema.TABLES t ON s.SCHEMA_NAME = t.TABLE_SCHEMA
WHERE s.SCHEMA_NAME NOT IN ('information_schema', 'mysql', 'performance_schema', 'sys')
GROUP BY s.SCHEMA_NAME
ORDER BY SIZE_GB DESC;
What you'll discover (and why you'll need therapy):
- Databases named test_db_final_REALLY_FINAL_v2 that are somehow still in production
- Log tables eating 80% of your storage that haven't been rotated since 2018
- One database shared by three different applications because "it was easier at the time"
- Development databases mixed with production because the previous DBA was a masochist
Skyvia found that 60% of migrations uncover "forgotten" databases still being used by legacy systems.
Translation: that API from 2019 that "nobody uses anymore" is actually hit 10,000 times a day.
MySQL Workbench will crash if you try to map more than 50 tables, so good luck there. pt-query-digest takes forever and the output looks like it was designed by someone who hates readable reports.
Find MySQL Features That'll Make PostgreSQL Laugh At You
PostgreSQL doesn't give a shit about MySQL's weird quirks and will reject half your data.
Here's how to find the stuff that'll break:
-- Find MySQL-specific data types (skip the system schemas)
SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE
FROM information_schema.COLUMNS
WHERE DATA_TYPE IN ('ENUM', 'SET', 'YEAR', 'TINYINT', 'MEDIUMINT')
  AND TABLE_SCHEMA NOT IN ('information_schema', 'mysql', 'performance_schema', 'sys');
-- Find MySQL-specific functions in stored procedures
SELECT ROUTINE_SCHEMA, ROUTINE_NAME, ROUTINE_DEFINITION
FROM information_schema.ROUTINES
WHERE ROUTINE_DEFINITION LIKE '%DATE_FORMAT%'
   OR ROUTINE_DEFINITION LIKE '%STR_TO_DATE%'
   OR ROUTINE_DEFINITION LIKE '%GROUP_CONCAT%';
-- Find tables with MySQL-specific options
SELECT TABLE_SCHEMA, TABLE_NAME, ENGINE, AUTO_INCREMENT
FROM information_schema.TABLES
WHERE TABLE_SCHEMA NOT IN ('information_schema', 'mysql', 'performance_schema', 'sys')
  AND (ENGINE = 'MyISAM' OR AUTO_INCREMENT IS NOT NULL);
The shit that'll break first:
- ENUM columns: PostgreSQL has ENUMs but they're rigid as hell. You can bolt on new values with ALTER TYPE ... ADD VALUE, but you can't drop or reorder them without rebuilding the type (PostgreSQL ENUM docs)
- AUTO_INCREMENT becomes sequences that you'll fuck up at least twice (sequence docs; see the sketch below)
- MySQL's zero dates (0000-00-00): PostgreSQL will laugh and reject your entire table (datetime docs)
- Case-insensitive comparisons: MySQL is casual about case, PostgreSQL is a pedantic asshole (collation docs)
The [PostgreSQL wiki](https://wiki.postgresql.org/wiki/Converting_from_other_Databases_to_PostgreSQL#MySQL) has conversion info, but it's incomplete and assumes you know what you're doing. Estuary's guide is better but still glosses over the painful edge cases.
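To make the AUTO_INCREMENT point concrete, here's a minimal sketch of the PostgreSQL side, assuming a hypothetical orders table. The classic screwup: you bulk-load the old rows with their original ids but forget to advance the sequence, and the very next INSERT dies on a duplicate key.

-- AUTO_INCREMENT becomes an identity column (GENERATED BY DEFAULT lets the
-- bulk load supply explicit ids; GENERATED ALWAYS would reject them)
CREATE TABLE orders (
    id BIGINT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    customer_id BIGINT NOT NULL
);

-- After loading the MySQL data, push the sequence past the highest id
SELECT setval(pg_get_serial_sequence('orders', 'id'),
              (SELECT MAX(id) FROM orders));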
Schema Analysis: Where Dreams Go To Die
Data Type Compatibility Hell
Every developer thinks data types are simple until they start a migration. MySQL and PostgreSQL disagree on basically everything. Here's the shit you'll spend weeks debugging:
| MySQL Type | PostgreSQL Type | Migration Notes |
|---|---|---|
| TINYINT(1) | BOOLEAN | MySQL uses 0/1, PostgreSQL uses true/false |
| DATETIME | TIMESTAMP | Handle timezone conversions |
| ENUM('a','b','c') | VARCHAR CHECK (...) | Create constraints for each enum |
| MEDIUMTEXT | TEXT | Direct mapping, but verify character limits |
| DOUBLE | DOUBLE PRECISION | Precision handling may differ |
Find The Garbage Data That'll Kill Your Migration
PostgreSQL has standards, unlike MySQL which accepts any bullshit you feed it.
Fix this crap before migration or watch pgloader crash spectacularly:
-- Find invalid dates that will crash PostgreSQL
SELECT COUNT(*) FROM your_table WHERE date_column = '0000-00-00 00:00:00';
-- Find non-ASCII data (potential character set / encoding mismatches)
SELECT COUNT(*) FROM your_table WHERE column_name REGEXP '[^\\x00-\\x7F]';
-- Check for constraint violations PostgreSQL won't accept
SELECT * FROM child_table c
LEFT JOIN parent_table p ON c.parent_id = p.id
WHERE p.id IS NULL;
The Percona pgloader guide shows that 40% of migrations fail initially because of data quality issues that MySQL happily tolerated but PostgreSQL enforces strictly.
Use pgloader's data transformation capabilities and PostgreSQL's constraint validation tools to handle these issues systematically.
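For the zero-date problem specifically, one option is to scrub the data in MySQL before pgloader ever touches it. A minimal sketch, assuming a hypothetical orders.created_at column (pgloader's zero-dates-to-null cast rule can do the equivalent at load time if you'd rather not modify the source):

-- The column must be nullable before you can NULL out the garbage
ALTER TABLE orders MODIFY created_at DATETIME NULL;

-- Replace zero dates with NULL; PostgreSQL accepts NULL, never 0000-00-00
UPDATE orders
SET created_at = NULL
WHERE created_at = '0000-00-00 00:00:00';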
Performance Baseline: Know What You're Leaving Behind
Step 5: Document Current MySQL Performance
Before migration, establish performance baselines using MySQL Performance Schema and pt-query-digest so you can validate that PostgreSQL performs adequately with pg_stat_statements and pgbench:
Query Performance:
-- Enable MySQL slow query log
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;
-- Run your application's typical workload for 24 hours
-- Then analyze the slow query log
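Once the slow log has a day of traffic in it, pt-query-digest turns it into a ranked report. The log path below is an assumption; ask MySQL where it actually writes:

# Find where the slow log lives on this server
mysql -e "SHOW VARIABLES LIKE 'slow_query_log_file';"

# Summarize the worst offenders; archive this as your pre-migration baseline
pt-query-digest /var/lib/mysql/hostname-slow.log > mysql_baseline_digest.txt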
Resource Usage:
-- Check current connection counts
SHOW STATUS LIKE 'Threads_connected';
-- Check the MyISAM key buffer size (it's a variable, not a status counter)
SHOW VARIABLES LIKE 'key_buffer_size';
-- Check buffer pool utilization
SHOW ENGINE INNODB STATUS\G
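On the PostgreSQL side, the equivalent baseline comes from pg_stat_statements. A hedged sketch: it assumes the extension is loaded via shared_preload_libraries, and the mean_exec_time column name is PostgreSQL 13+ (older versions call it mean_time):

-- One-time setup (requires shared_preload_libraries = 'pg_stat_statements')
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top 10 queries by average execution time, for comparing against the
-- MySQL slow-log digest after cutover
SELECT query, calls, mean_exec_time, rows
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;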
Application Dependencies:
- Document all applications connecting to each MySQL database (the PROCESSLIST query below is a decent starting point)
- Note connection pooling configurations
- Identify batch jobs and their schedules
- Map out read replica usage patterns
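For that first item, a quick sketch of who's connected right now. Run it repeatedly over a full day, because batch jobs and cron-driven reports only show up while they're running:

-- Group live connections by user, client host, and database
SELECT USER, SUBSTRING_INDEX(HOST, ':', 1) AS client_host, DB,
       COUNT(*) AS connections
FROM information_schema.PROCESSLIST
GROUP BY USER, client_host, DB
ORDER BY connections DESC;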
Step 6: Test Your Backup and Recovery Process
CRITICAL: Test your MySQL backup restoration BEFORE starting migration.
The GitLab incident happened partly because their backups hadn't been tested.
# Create a full backup (add --events so scheduled events aren't silently dropped)
mysqldump --all-databases --routines --triggers --events --single-transaction > full_backup.sql

# Test restore on a separate MySQL instance
mysql -u root -p < full_backup.sql

# Verify data integrity
mysql -u root -p -e "SELECT COUNT(*) FROM your_database.important_table;"
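Counting one table is a smoke test, not verification. A hedged bash sketch that prints a row count for every user table; run it against the source and the restore-test instance, then diff the output:

#!/bin/bash
# Emit "schema.table: rowcount" for every non-system table
for t in $(mysql -N -e "SELECT CONCAT(TABLE_SCHEMA, '.', TABLE_NAME)
                        FROM information_schema.TABLES
                        WHERE TABLE_TYPE = 'BASE TABLE'
                          AND TABLE_SCHEMA NOT IN ('mysql', 'sys',
                              'information_schema', 'performance_schema')"); do
    echo "$t: $(mysql -N -e "SELECT COUNT(*) FROM $t")"
done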
Migration Strategy Decision Matrix
Based on your discovery findings, choose your migration approach:
For databases < 100GB with simple schemas:
- Tool: pgloader (one-liner sketch after this list)
- Downtime: 2-8 hours
- Complexity: Medium
- Best for: One-time migrations, development environments

For databases > 100GB or complex applications:
- Tool: AWS DMS (see the AWS Database Migration Service Overview) or custom replication
- Downtime: 30 minutes to 2 hours
- Complexity: High
- Best for: Production systems requiring minimal downtime

For applications requiring zero downtime:
- Tool: Dual-write with gradual cutover
- Downtime: Theoretically zero (realistically 10-30 minutes)
- Complexity: Very high
- Best for: Mission-critical systems with 24/7 availability requirements
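For the pgloader route, the simplest invocation is one command with two connection strings (hosts and credentials here are hypothetical; real migrations usually graduate to a load-command file so you can add CAST rules like the zero-date handling above):

# Read everything from MySQL, create matching tables in PostgreSQL, load data
pgloader mysql://migrator:secret@mysql-host/legacy_db \
         postgresql://migrator:secret@pg-host/legacy_db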
The assessment phase typically reveals that your "simple" migration is actually complex, but it's better to know this upfront than discover it during the production cutover window.