CPU Utilization Will Destroy Your Legacy Servers
AWS Discovery Agent Performance Impact: CPU usage spikes to 40%+ on legacy servers, memory consumption increases over time, and performance degrades on anything older than 5 years.
Installing AWS Application Discovery Agents sounds simple until you try it on that ancient CentOS 6.9 box running your payment system. The agent documentation says "minimal performance impact" but doesn't mention that "minimal" means 40% CPU usage on anything older than 2015.
Real scenario from production: Installed discovery agents on 12 servers during business hours. Three went offline because they couldn't handle the CPU load. The Windows 2008 R2 domain controller became unresponsive for 20 minutes while the agent attempted to inventory every single registry key.
The fix: Install agents during maintenance windows and test on non-critical systems first. On servers with less than 4GB RAM or older than 5 years, expect performance degradation. Monitor CPU usage for the first 2 hours - if it stays above 30%, kill the agent and try agentless discovery instead.
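A rough first-pass check, assuming the Linux agent runs as the aws-discovery-daemon service (verify the process name on your own install) - sample its CPU for the first two hours and stop it if it stays pegged:

```bash
#!/usr/bin/env bash
# Rough 2-hour check after install. Assumes the agent process is named
# aws-discovery-daemon (confirm with `ps aux | grep discovery` on your box).
# ps %cpu is a lifetime average, so treat this as a coarse signal, not pidstat.
THRESHOLD=30
STREAK=0
for i in $(seq 1 120); do                       # 120 one-minute samples ~= 2 hours
  cpu=$(ps -o %cpu= -C aws-discovery-daemon | awk '{s += $1} END {printf "%d", s}')
  if [ "${cpu:-0}" -gt "$THRESHOLD" ]; then
    STREAK=$((STREAK + 1))
  else
    STREAK=0
  fi
  if [ "$STREAK" -ge 10 ]; then                 # ~10 straight minutes above threshold
    echo "Agent CPU stuck above ${THRESHOLD}% - stopping it" >&2
    systemctl stop aws-discovery-daemon
    exit 1
  fi
  sleep 60
done
```

Kick it off under nohup right after the install so you're not babysitting top for two hours.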
Memory Leaks That Kill Servers
Discovery agents have a known memory leak that AWS doesn't advertise. After running for 2-3 weeks, the agent process can consume 500MB+ of RAM on busy servers. On systems already running near capacity, this kills performance.
Error you'll see: Application timeouts, database connection failures, general system sluggishness. The agent shows as "healthy" in the console while your server dies.
The workaround: Restart the discovery agent weekly using a cron job:
0 2 * * 0 systemctl restart aws-discovery-daemon
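If a blanket weekly restart feels too blunt, a watchdog that only bounces the agent once its resident memory crosses the 500MB mark is a reasonable sketch - same assumption that both the service and the process are named aws-discovery-daemon:

```bash
#!/usr/bin/env bash
# Restart the agent only when its RSS exceeds ~500MB (the leak ceiling above).
# Drop this in cron every 15 minutes instead of the blanket weekly restart.
LIMIT_KB=$((500 * 1024))
rss_kb=$(ps -o rss= -C aws-discovery-daemon | awk '{s += $1} END {print s + 0}')
if [ "$rss_kb" -gt "$LIMIT_KB" ]; then
  logger "aws-discovery-daemon RSS ${rss_kb}KB over limit - restarting"
  systemctl restart aws-discovery-daemon
fi
```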
Network Discovery: Missing the Obvious Connections
Network Dependency Mapping: Visualization showing server connections with arrows, but missing 20% of critical dependencies that only appear during monthly batch jobs or system failures.
The network visualization looks impressive until you realize it misses 20% of your critical connections. The agent only captures active network connections - if your backup job runs at 3 AM and you install the agent at 9 AM, that dependency won't show up.
War story: Migrated a web application that worked fine for 3 weeks. Then the monthly reporting job failed because it couldn't connect to an Oracle database that only gets accessed once per month. The dependency wasn't discovered because nobody thought to run all scheduled jobs during the discovery period.
How to actually map dependencies:
- Run discovery for at least 14 days to catch weekly jobs - a full month if you can afford it, for the monthly ones
- Manually trigger all scheduled tasks during discovery
- Check application logs and live connection tables for outbound connections the agent missed (see the sketch after this list)
- Document every custom service account - they often indicate hidden dependencies
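One low-tech way to cover the log-checking step above: snapshot the box's live TCP connections on a schedule for the whole discovery window, so a 3 AM batch job leaves a trace even if the agent never sees it. A sketch, assuming a Linux host with ss available:

```bash
#!/usr/bin/env bash
# Append a de-duplicated list of established TCP connections to an hourly file.
# Run from cron every 10-15 minutes during discovery, then diff the files
# against what the agent reported.
OUT=/var/log/conn-snapshots
mkdir -p "$OUT"
ss -tn | awk '$1 == "ESTAB" {print $4, "->", $5}' \
  | sort -u >> "$OUT/$(hostname)-$(date +%Y%m%d%H).txt"
```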
Agentless Discovery Limitations
AWS Agentless Discovery Connector sounds perfect until you try to use it. It requires VMware vCenter 5.5+ and can only see what VMware knows about - which excludes most of your custom applications and all of your bare metal servers.
Reality check: Agentless discovery finds your servers but tells you nothing useful about what they do. You get basic specs (CPU, RAM, disk) but no process information, no network connections, and no application dependencies. It's basically an expensive version of vmware-toolbox-cmd stat hosttime.
Authentication Nightmares
Setting up the proper IAM roles for Migration Hub feels like navigating a Byzantine bureaucracy. The required policies documentation is outdated and doesn't mention half the permissions you actually need.
Permission error you'll hit: "User is not authorized to perform: discovery:GetDiscoverySummary" - even though you followed the official setup guide. The IAM simulator says everything should work, but the console throws permission errors.
The actual permissions you need (beyond what AWS documents):
- discovery:* (Application Discovery Service actions)
- mgh:* (Migration Hub actions)
- AWSApplicationMigrationAgentPolicy (AWS managed policy)
- AWSApplicationMigrationReplicationServerPolicy (AWS managed policy)
- Custom policy for CloudWatch logs access (sketch below)
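For that last custom policy, something like the following is a starting point - the role name (MigrationHubDiscoveryRole) is a placeholder, and you should scope the Resource down to the agent's actual log groups once you know them:

```bash
# Attach an inline CloudWatch Logs policy to a placeholder role name.
# Substitute your own role; tighten "Resource" once the log groups exist.
aws iam put-role-policy \
  --role-name MigrationHubDiscoveryRole \
  --policy-name discovery-cloudwatch-logs \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogStreams"
      ],
      "Resource": "*"
    }]
  }'
```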
Pro tip: Use AWS CloudTrail to see exactly which API calls are failing, then add those specific permissions. Don't trust the documentation. Also check the AWS IAM Policy Simulator to test permissions before deploying. The AWS Well-Architected Security Pillar has best practices for IAM role design, though it doesn't cover Migration Hub specifics. For complex setups, use AWS Organizations Service Control Policies to prevent accidental permission escalation.
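If you have jq installed, this pulls the most recent Application Discovery Service calls out of CloudTrail and groups the failures by error code and API name - run it in your Migration Hub home region:

```bash
# List failing discovery.amazonaws.com API calls from recent CloudTrail events,
# grouped by error code and API name. Requires jq.
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventSource,AttributeValue=discovery.amazonaws.com \
  --max-results 50 --output json \
  | jq -r '.Events[].CloudTrailEvent | fromjson
           | select(.errorCode != null)
           | "\(.errorCode)\t\(.eventName)"' \
  | sort | uniq -c
```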
Home Region Confusion
You can only view migration data in your "home region" but AWS makes it unclear how to change this. If you accidentally set the wrong home region during setup, you're stuck with it unless you contact support.
The problem: Set up Migration Hub in us-east-1 but your infrastructure is in us-west-2. All your migration tracking data lives in the wrong region and you can't move it.
The solution: Before installing ANY agents, verify your home region in the Migration Hub console. If it's wrong, you need to contact AWS Support to reset it. This process takes 1-2 business days.
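The home region lives in the separate Migration Hub Config service, so the check takes one CLI call (assuming a reasonably current AWS CLI that ships the migrationhub-config commands):

```bash
# Confirm the home region before installing any agents.
aws migrationhub-config get-home-region
# Expected output: { "HomeRegion": "us-west-2" }
# If none is set yet, create-home-region-control in the same service sets it -
# that's your one chance to pick it yourself without going through Support.
```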
API Rate Limiting During Large Migrations
The Migration Hub APIs have undocumented rate limits that kick in when you're tracking 100+ servers. Your monitoring scripts start failing with HTTP 429 errors, but the AWS documentation doesn't mention any limits.
When this hits: During the data collection phase with 200+ discovery agents running. The console becomes unresponsive and API calls timeout. AWS Support's initial response: "Migration Hub is designed to scale automatically."
Workaround: Implement exponential backoff in your automation scripts and batch your API calls (minimal sketch below). Beyond that:
- Monitor your CloudWatch metrics - if API errors spike, slow down your requests.
- Use the AWS SDK retry configuration for automatic backoff.
- Use AWS Service Quotas to request limit increases for large migrations; the AWS Support API can help automate quota requests.
- Trace API call patterns with AWS X-Ray to identify bottlenecks.
- Track configuration changes during large-scale migrations with AWS Config.
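A minimal backoff wrapper in shell, using aws discovery describe-agents as a stand-in for whatever call your scripts poll - swap in your own. AWS CLI v2 also honors AWS_RETRY_MODE=adaptive and AWS_MAX_ATTEMPTS if you'd rather not hand-roll it:

```bash
#!/usr/bin/env bash
# Exponential backoff around a Migration Hub / Discovery API call.
# Doubles the sleep after each throttled attempt, gives up after six tries.
max_attempts=6
delay=2
for attempt in $(seq 1 "$max_attempts"); do
  if aws discovery describe-agents > agents.json 2> err.log; then
    echo "Succeeded on attempt $attempt"
    exit 0
  fi
  if grep -qE "ThrottlingException|TooManyRequests|429" err.log; then
    echo "Throttled - sleeping ${delay}s before retry" >&2
    sleep "$delay"
    delay=$((delay * 2))
  else
    cat err.log >&2            # a different error: don't retry blindly
    exit 1
  fi
done
echo "Still throttled after $max_attempts attempts" >&2
exit 1
```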