Why is the discovery agent maxing out CPU on my server?

The agent scans every process, connection, and file handle on the system every 15 minutes. On busy servers or those with hundreds of processes, this creates massive CPU spikes. Legacy servers with single-core CPUs become unresponsive during scans. **Quick fix**: Edit `/opt/aws/discovery/config/agent.properties` and change the collection interval from 900 seconds (15 minutes) to 3600 seconds (1 hour). Restart the agent: `sudo systemctl restart aws-discovery-daemon`.
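
The edit can be scripted so it is repeatable across servers. This is a minimal sketch, assuming the property key is `collection_interval` (the name used in the configuration excerpt later in this guide) and that the file is a plain `key=value` properties file; `set_interval` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# set_interval FILE SECONDS
# Set collection_interval in a key=value properties file, appending the key
# if it is not present. A FILE.bak copy is kept when an existing value changes.
set_interval() {
  local file="$1" seconds="$2"
  if grep -q '^collection_interval=' "$file"; then
    sed -i.bak "s/^collection_interval=.*/collection_interval=${seconds}/" "$file"
  else
    echo "collection_interval=${seconds}" >> "$file"
  fi
}
```

Typical use would be `sudo bash -c 'source ./set_interval.sh; set_interval /opt/aws/discovery/config/agent.properties 3600'` followed by the agent restart; the script filename here is illustrative.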

The agent shows "healthy" but I don't see any data in the console. What's wrong?

This usually means the agent can't reach the AWS endpoints because of firewall rules or proxy settings. The agent reports "healthy" because its process is running, even though it can't upload data. **Check connectivity**: run `curl -I https://application-discovery.us-west-2.amazonaws.com` from the server. If this fails, configure proxy settings in `/opt/aws/discovery/config/agent.properties` or open firewall ports 443 and 8888.
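
When the `curl` check fails, its exit code narrows down where the traffic is being stopped. The sketch below maps the common curl exit codes to a likely cause; the function names are hypothetical and the diagnoses are rules of thumb, not a definitive classification.

```bash
#!/usr/bin/env bash
# explain_curl_exit CODE
# Map common curl exit codes to a likely cause for an agent that cannot upload.
explain_curl_exit() {
  case "$1" in
    0)  echo "reachable" ;;
    5)  echo "proxy hostname did not resolve: check proxy settings" ;;
    6)  echo "endpoint hostname did not resolve: check DNS" ;;
    7)  echo "connection refused or blocked: check firewall rules" ;;
    28) echo "timed out: traffic is probably dropped by a firewall" ;;
    35|60) echo "TLS failure: a proxy may be intercepting HTTPS" ;;
    *)  echo "curl failed with exit code $1" ;;
  esac
}

# check_endpoint URL
# Probe the URL with a 10-second limit and print the diagnosis.
check_endpoint() {
  local rc=0
  curl -sS -I --max-time 10 -o /dev/null "$1" 2>/dev/null || rc=$?
  explain_curl_exit "$rc"
}
```

Note that any HTTP response, including a 403 or 404, counts as "reachable" here: the question is whether the network path is open, not whether the request is authorized.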

Can I install the discovery agent on the same server as my database?

Technically yes, but don't. Database servers are already I/O intensive and adding discovery agent scanning makes everything worse. Use agentless discovery if possible, or install the agent during maintenance windows only.

How do I uninstall this thing when it's breaking my server?

```bash
sudo systemctl stop aws-discovery-daemon
sudo systemctl disable aws-discovery-daemon
sudo /opt/aws/discovery/uninstall
sudo rm -rf /opt/aws/discovery
```

If the uninstaller fails (it often does), manually kill the processes: `sudo pkill -f discovery` and delete the directory.

Why does the agent keep restarting every few hours?

Memory leak. The agent accumulates memory over time and hits system limits. On servers with limited RAM, the OOM killer terminates the agent process. AWS claims this is "fixed" in newer versions but it still happens. **Workaround**: Set up a weekly restart cron job or monitor memory usage and restart when it exceeds 500MB.
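
The memory-based variant of the workaround can be sketched as follows. It assumes a systemd host (the `--value` flag needs systemd 230 or newer) and only measures the service's main process; both function names are hypothetical.

```bash
#!/usr/bin/env bash
# over_limit RSS_KB LIMIT_MB
# Succeed if a resident set size in kilobytes exceeds LIMIT_MB megabytes.
over_limit() {
  [ "$1" -gt $(( $2 * 1024 )) ]
}

# restart_if_bloated SERVICE LIMIT_MB
# Look up the service's main PID, read its resident memory, and restart the
# service when it is over the limit. Child processes are not counted.
restart_if_bloated() {
  local pid rss_kb
  pid=$(systemctl show -p MainPID --value "$1")
  [ "${pid:-0}" -gt 0 ] || return 0            # service not running
  rss_kb=$(ps -o rss= -p "$pid" | tr -d ' ')
  if over_limit "${rss_kb:-0}" "$2"; then
    systemctl restart "$1"
  fi
}
```

Calling `restart_if_bloated aws-discovery-daemon 500` from a root cron entry every 15 minutes or so would restart the agent only when it has actually grown, rather than on a fixed weekly schedule.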

My migration shows "completed" but the application doesn't work. What happened?

Migration Hub tracks the server migration but knows nothing about application functionality. The server migrated successfully, but the application configuration is wrong (database connection strings, license servers, network routing). **Reality check**: "Completed" means the files copied successfully, not that your application works. Plan for 2-4 weeks of post-migration troubleshooting for any non-trivial application.

Application Migration Service replicated my server but it won't boot. Now what?

Boot failures are common, affecting roughly 30% of Windows servers and 15% of custom Linux configurations. The replication copied the disk but didn't account for hardware differences, driver issues, or boot sector problems.

**Emergency fix process**:

1. Launch a rescue instance from a known-good AMI
2. Attach the replicated EBS volume to it as a secondary disk and mount it
3. Fix `/etc/fstab` (Linux) or registry entries (Windows) for the new hardware
4. Install AWS-compatible drivers before attempting to boot

**Prevention**: Test boot the migrated instance in a non-production environment first. Every time.
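
For the `/etc/fstab` fix on Linux, a frequent culprit is an entry that names a raw device path such as `/dev/sda1` or `/dev/xvdf`. Nitro-based EC2 instances expose EBS volumes under NVMe names like `/dev/nvme0n1`, so those entries no longer resolve. The hypothetical helper below lists the fragile entries in a mounted copy of the file; it is a sketch and does not modify anything.

```bash
#!/usr/bin/env bash
# fragile_fstab_entries FSTAB
# Print fstab lines whose first field is a raw device path (/dev/sdX,
# /dev/xvdX, /dev/hdX) rather than UUID= or LABEL=. Comment lines and blank
# lines never match because their first field does not start with /dev/.
fragile_fstab_entries() {
  awk '$1 ~ /^\/dev\/(sd|xvd|hd)[a-z]/ { print }' "$1"
}
```

With the migrated volume mounted at, say, `/mnt/rescue` (an illustrative path), run `fragile_fstab_entries /mnt/rescue/etc/fstab`, then replace each reported device path with the matching `UUID=` value from `blkid`.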

How long should I wait for replication to finish?

AWS says "a few hours" but reality is days or weeks depending on data size and network speed. For a 2TB server over a 100Mbps connection, expect 48+ hours for initial replication.

**Real timelines from production**:

- 500GB server: 6-12 hours
- 2TB server with database: 2-3 days
- 10TB file server: 1-2 weeks
- Add 50% to any estimate for network hiccups and AWS throttling
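
A quick way to sanity-check an estimate is to compute the line-rate floor: data size divided by link speed, with the buffer applied on top. This sketch assumes decimal units (1 GB = 8000 megabits) and a fully available link; `transfer_hours` is a hypothetical helper, and real replication adds protocol overhead and change-rate catch-up that it does not model.

```bash
#!/usr/bin/env bash
# transfer_hours SIZE_GB LINK_MBPS [BUFFER_PCT]
# Theoretical hours to push SIZE_GB (decimal gigabytes) through a link of
# LINK_MBPS megabits per second, padded by BUFFER_PCT percent (default 0).
transfer_hours() {
  awk -v gb="$1" -v mbps="$2" -v buf="${3:-0}" 'BEGIN {
    seconds = gb * 8000 / mbps            # 1 GB = 8000 megabits
    printf "%.1f\n", seconds / 3600 * (1 + buf / 100)
  }'
}
```

`transfer_hours 2000 100` prints `44.4`, which lines up with the 48+ hour figure for a 2TB server, and `transfer_hours 2000 100 50` prints `66.7` once the 50% buffer is applied. For a 10TB server the floor is `222.2` hours, a little over nine days.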

The network diagram shows my servers are connected but the application can't reach the database after migration. Why?

AWS doesn't migrate network configuration. Your on-premises network routing, VLANs, and firewall rules don't automatically translate to AWS VPC security groups and route tables.

**What's missing after migration**:

- Security group rules for inter-server communication
- Route table entries for subnet routing
- NACLs that block traffic Migration Hub doesn't know about
- Custom DNS configurations

**Fix it before you migrate**: Document every network flow and translate it to AWS networking before starting server migration.
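
One way to make the documented flows actionable is to keep them in a small CSV and generate the security group rules from it. The sketch below assumes a hypothetical four-column format (`source_sg,dest_sg,protocol,port`) and only prints the AWS CLI commands so they can be reviewed before anything is applied.

```bash
#!/usr/bin/env bash
# flows_to_sg_commands FLOWS_CSV
# Each CSV line: source_sg,dest_sg,protocol,port  (e.g. sg-app,sg-db,tcp,5432)
# Prints one AWS CLI command per flow; nothing is executed.
flows_to_sg_commands() {
  while IFS=, read -r src dst proto port; do
    # Skip blank lines and comment lines.
    case "$src" in ''|'#'*) continue ;; esac
    printf 'aws ec2 authorize-security-group-ingress --group-id %s --protocol %s --port %s --source-group %s\n' \
      "$dst" "$proto" "$port" "$src"
  done < "$1"
}
```

Each rule is created on the destination group and allows traffic from the source group, which mirrors how an on-premises "app server may reach database on 5432" flow is usually expressed in a VPC.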

Can I pause a migration that's failing?

No. Once Application Migration Service starts the cutover process, you can't pause it. You can fail back to source, but that requires starting over. This is why testing is critical.

**When to fail back**:

- Boot failures that you can't fix within your downtime window
- Database corruption or data inconsistency
- Application performance issues that make the system unusable
- Network connectivity problems that prevent users from accessing the application

How do I know if my migration actually worked?

Test everything. Migration Hub showing "completed" means nothing for application functionality. Run your full test suite on the migrated systems before declaring success.

**Minimum testing checklist**:

- Application starts and responds to requests
- Database connections work and data is accessible
- File shares and network drives mount correctly
- Scheduled jobs execute successfully
- Monitoring and backup systems connect to migrated servers
- End-user acceptance testing in production-like conditions

**Time estimate**: Plan for testing to take as long as the actual migration. A 4-hour server migration needs 4-8 hours of testing.
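
The mechanical parts of that checklist can be wrapped in a small runner so every migrated server gets the same checks. This is a sketch: `run_checks` is a hypothetical helper, and the individual check commands are whatever fits your application.

```bash
#!/usr/bin/env bash
# run_checks "name::command" ...
# Runs each command in a subshell, prints PASS or FAIL per check, and returns
# the number of failed checks as its exit status (0 means all passed).
run_checks() {
  local failures=0 entry name cmd
  for entry in "$@"; do
    name="${entry%%::*}"
    cmd="${entry#*::}"
    if bash -c "$cmd" >/dev/null 2>&1; then
      echo "PASS ${name}"
    else
      echo "FAIL ${name}"
      failures=$((failures + 1))
    fi
  done
  return "$failures"
}
```

For example, `run_checks "app responds::curl -fsS http://app.internal.example/health" "db port open::nc -z db.internal.example 5432"` covers the first two checklist items; the hostnames and health path are placeholders. End-user acceptance testing still has to be done by people.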

AWS Migration Hub Implementation Guide - AI-Optimized Technical Reference

Critical Configuration Requirements

Discovery Agent Installation

CPU Impact: 40%+ on legacy servers (older than 5 years)
Memory Consumption: 500MB+ after 2-3 weeks (memory leak)
Performance Threshold: Systems with <4GB RAM will experience degradation
Failure Scenario: Windows 2008 R2 domain controllers become unresponsive for 20+ minutes during registry scanning

Production-Ready Configuration:

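# /opt/aws/discovery/config/agent.properties
# Raise the interval from the default 900 seconds to 3600 (1 hour).
# Keep comments on their own line: properties parsers generally treat a
# trailing "# ..." as part of the value.
collection_interval=3600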

Required Maintenance:

# Weekly restart (Sundays at 02:00, root crontab) to prevent memory leak crashes
0 2 * * 0 systemctl restart aws-discovery-daemon

Network Discovery Limitations

Coverage Gap: Misses 20% of critical dependencies
Sampling Issue: Only captures active connections during scan period
Hidden Dependencies: Monthly/quarterly jobs, backup processes, scheduled tasks
Minimum Discovery Period: 14 days to catch weekly/monthly processes

Authentication Requirements

Actual Permissions Needed (beyond documented):

discovery:*
mgh:*
AWSApplicationMigrationAgentPolicy
AWSApplicationMigrationReplicationServerPolicy
Custom CloudWatch logs access policy

Debug Process: Use CloudTrail to identify failing API calls, then add specific permissions

Home Region Constraints

Critical Limitation: Cannot change home region without AWS Support intervention
Resolution Time: 1-2 business days via AWS Support
Impact: All migration data locked to initial region selection

Resource Requirements and Time Estimates

Migration Timeline Reality

| Server Size | Network Speed | Actual Time |
|---|---|---|
| 500GB | 100Mbps | 6-12 hours |
| 2TB with database | 100Mbps | 2-3 days |
| 10TB file server | 100Mbps | 1-2 weeks |

Rule of thumb: Add 50% buffer for network issues and AWS throttling

API Rate Limiting

Threshold: 100+ servers trigger undocumented rate limits
Symptoms: HTTP 429 errors, console unresponsiveness
Workaround: Exponential backoff, batch API calls
Monitoring: CloudWatch metrics for API error spikes
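
The exponential backoff workaround can be sketched as a generic wrapper around any command. `retry_with_backoff` is a hypothetical helper; it retries on any non-zero exit rather than inspecting the response for HTTP 429 specifically, and it expects a whole number of seconds for the base delay.

```bash
#!/usr/bin/env bash
# retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds, doubling the delay after each failure.
# Returns 1 once MAX_ATTEMPTS attempts have all failed.
retry_with_backoff() {
  local max="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi
    sleep "$delay"
    delay=$(( delay * 2 ))
    attempt=$(( attempt + 1 ))
  done
}
```

With `retry_with_backoff 5 2 aws discovery describe-agents`, the waits between attempts would be 2, 4, 8, and 16 seconds before the wrapper gives up.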

Critical Failure Modes

Discovery Agent Failures

CPU Overload on Legacy Systems
- Cause: Agent scans every process/connection every 15 minutes
- Impact: Single-core systems become unresponsive
- Fix: Increase scan interval to 1 hour
Memory Leak Crashes
- Cause: Known leak in agent process
- Timeline: 2-3 weeks to failure
- Detection: Process >500MB RAM usage
- Solution: Weekly automated restarts
Network Connectivity Issues
- Symptom: Agent shows "healthy" but no data in console
- Cause: Firewall/proxy blocking AWS endpoints
- Test: curl -I https://application-discovery.us-west-2.amazonaws.com
- Fix: Configure proxy or open ports 443, 8888

Migration Tracking Failures

Status Mapping Breaks
- Frequency: 40% of migrations lose tracking
- Cause: Automatic mapping fails between tools and discovered servers
- Manual Fix: Migration Hub → Updates → Edit → Manually map servers
Application Group Logic Fails
- Issue: Servers assigned to wrong applications or multiple groups
- Impact: "Partially migrated" status for completed migrations
- Solution: Delete auto-generated groups, create manual groups
Multi-Region Tracking Impossible
- Limitation: Single home region view only
- Impact: Cannot track migrations spanning multiple regions
- Workaround: Custom CloudWatch dashboards and Lambda functions

Migration Execution Failures

Boot Failures After Migration
- Frequency: 30% of Windows servers, 15% of custom Linux
- Cause: Hardware differences, driver issues, boot sector problems
- Emergency Fix:
  - Launch rescue instance
  - Mount migrated volume as secondary disk
  - Fix /etc/fstab or Windows registry for new hardware
Application Non-Functionality
- Issue: Server migrates successfully but application fails
- Common Causes: Database connection strings, license servers, network routing
- Testing Required: 2-4 weeks post-migration troubleshooting

Performance Monitoring Inadequacies

Discovery Agent Metrics Limitations

Averaging Period: 15-minute intervals mask peak loads
Missing Data: Night-time backups, month-end processing spikes
Impact: Undersized AWS instances cause performance issues
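
A small illustration of why averaged samples mislead sizing decisions: the sketch below (with a hypothetical `avg_and_peak` helper and made-up sample values) reports both the mean and the maximum of a series of CPU readings.

```bash
#!/usr/bin/env bash
# avg_and_peak SAMPLE...
# Print the mean and the maximum of a series of CPU-percent samples.
avg_and_peak() {
  printf '%s\n' "$@" | awk '
    { v = $1 + 0; sum += v; if (NR == 1 || v > max) max = v }
    END { printf "avg=%.1f peak=%d\n", sum / NR, max }'
}
```

`avg_and_peak 10 10 10 95 10` prints `avg=27.0 peak=95`: an instance sized for the 27% average would be saturated during the one interval that matters.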

Better Monitoring Approach:

Maintain existing monitoring tools during discovery
Export VMware vCenter performance data
Use AWS Systems Manager for detailed metrics
Run stress tests for actual resource requirements

Decision Support Matrix

Agent vs Agentless Discovery

| Criteria | Agent-Based | Agentless |
|---|---|---|
| Dependency Mapping | Complete (with caveats) | Basic specs only |
| Performance Impact | High (40% CPU) | None |
| Server Compatibility | All platforms | VMware vCenter 5.5+ only |
| Application Visibility | Process-level | None |
| Installation Complexity | High | Medium |

When to Use Alternative Tools

AWS Application Migration Service (MGN, the successor to CloudEndure Migration): Better for large-scale migrations
Turbonomic: More accurate right-sizing analysis
Zerto: Enterprise-grade replication with better tracking

Emergency Procedures

Agent Removal (When Breaking Server)

sudo systemctl stop aws-discovery-daemon
sudo systemctl disable aws-discovery-daemon
sudo /opt/aws/discovery/uninstall
sudo rm -rf /opt/aws/discovery
# If uninstaller fails:
sudo pkill -f discovery

Migration Rollback Triggers

Boot failures exceeding downtime window
Database corruption or data inconsistency
Unusable application performance
Network connectivity preventing user access

Support Escalation Paths

Basic Issues: AWS re:Post Migration Hub questions
Critical Failures: AWS Enterprise Support (24/7)
Complex Migrations: AWS Professional Services
Large Scale: AWS Migration Acceleration Program

Operational Intelligence

What AWS Documentation Doesn't Tell You

"Minimal performance impact" = 40% CPU on legacy systems
"Automated mapping" works 60% of the time
Rate limits exist but aren't documented
Memory leaks require weekly restarts
Boot failure rate is 30% for Windows migrations

Hidden Costs

Time Investment: 2-4 weeks post-migration troubleshooting per application
Expertise Required: Network engineering, systems administration, AWS architecture
Infrastructure: Separate testing environment mandatory
Support: Enterprise Support recommended for production migrations

Success Criteria

Technical: Application functionality, not just server replication
Performance: Stress testing under production loads
Integration: All dependencies and scheduled jobs working
Monitoring: Full observability stack operational
Rollback: Tested and documented fallback procedures

Breaking Points

100+ servers: API rate limiting kicks in
Legacy servers (>5 years): Performance degradation likely
Complex applications: Manual dependency mapping required
Multi-region: Native tracking fails, custom solution needed
Large datasets (>2TB): Week+ migration windows required

This guide prioritizes real-world operational intelligence over vendor marketing claims, focusing on what actually breaks and how to fix it.

Useful Links for Further Investigation

Resources for When Everything Goes Wrong

Link	Description
Migration Hub Troubleshooting Guide	Basic troubleshooting that covers 20% of real problems, offering initial guidance for common issues encountered during migration processes.
Application Discovery Service Troubleshooting	Addresses agent installation and connectivity issues, providing steps to resolve problems with the Application Discovery Service.
Application Migration Service Troubleshooting	Provides solutions for scenarios where servers won't boot after migration, helping to diagnose and fix post-migration startup failures.
Migration Hub API Error Codes	A reference guide to decode cryptic error messages returned by the Migration Hub API, aiding in understanding and resolving API-related issues.
AWS re:Post Migration Hub Questions	Explore real problems and solutions shared by real engineers on AWS re:Post, focusing on questions related to Migration Hub.
AWS Developer Forums - Migration	Participate in community discussions and find solutions for various migration troubleshooting challenges within the AWS Developer Forums.
AWS Migration Samples Repository	Access workshop materials and sample code for application migration, providing practical examples and guidance for implementation.
Stack Overflow Migration Hub Tags	Find solutions and discussions for code-level troubleshooting related to AWS Migration Hub on Stack Overflow, a popular developer Q&A site.
AWS CLI Migration Hub Config Commands	Utilize AWS CLI commands for Migration Hub configuration when the console is inaccessible or for scripting automated management tasks.
AWS MigOps CloudFormation Templates	Leverage CloudFormation templates to automate the setup and configuration of Migration Operations (MigOps) that might otherwise fail manually.
PowerShell AWS Migration Module	Access PowerShell cmdlets for AWS Migration Hub to create and manage migration resources and automate Windows-based migration tasks.
Terraform AWS Migration Hub Provider	Manage AWS Migration Hub applications and resources using Terraform, enabling infrastructure as code for migration tracking and orchestration.
IDrive Business Backup	Explore IDrive Business Backup for reliable server replication and data protection, offering a robust solution that actually works for business continuity.
Turbonomic Migration Planning	Discover Turbonomic for right-sizing analysis and cloud migration planning, providing more accurate resource recommendations than native AWS tools.
CloudEndure User Forum	Access community knowledge and discussions from the CloudEndure User Forum, which is now part of AWS MGN but still offers valuable insights.
Zerto IT Resilience Platform	Learn about Zerto's IT Resilience Platform for enterprise-grade migration, offering advanced replication, recovery, and better tracking capabilities.
AWS Enterprise Support	Opt for AWS Enterprise Support when you need someone to answer the phone at 3 AM, providing critical assistance for urgent issues.
AWS Migration Acceleration Program	Engage with the AWS Migration Acceleration Program for professional services and funding to accelerate large-scale cloud migrations.
AWS Professional Services	Consult with AWS Professional Services, a team of experts who have seen it all before and can provide tailored guidance for complex migrations.
AWS Partner Network Migration Specialists	Find third-party help and specialized expertise from AWS Partner Network Migration Specialists when native AWS support isn't enough.
CloudWatch Migration Metrics	Configure CloudWatch to monitor migration metrics and set up alerts for potential migration failures, ensuring timely detection of issues.
Migration Hub Events for EventBridge	Integrate Migration Hub events with EventBridge to automate responses and trigger workflows based on various migration-related occurrences.
AWS Systems Manager for Migration Monitoring	Utilize AWS Systems Manager to monitor the health and operational status of servers after they have been migrated to AWS.
Third-party Migration Monitoring Tools	Explore third-party migration monitoring tools available through AWS partners when native monitoring solutions do not meet specific requirements.
AWS Migration Whitepaper	Review the AWS Migration Whitepaper for theoretical insights and best practices, though it may not always perfectly match real-world migration challenges.
AWS re:Invent Migration Sessions	Access recordings of AWS re:Invent conference sessions focused on migration strategies, best practices, and lessons learned from various customer scenarios.
AWS Migration Hub Orchestrator User Guide	Consult the official user guide for AWS Migration Hub Orchestrator to understand and implement hands-on automation workflows for complex migration tasks.
AWS Migration Blog	Read the AWS Migration Blog for case studies, lessons learned, and updates on migration strategies and services from AWS experts.