
AWS Elastic Disaster Recovery - Recovery Instances Status Check Failed


Hi everyone,

I’m facing an issue with AWS Elastic Disaster Recovery and was hoping someone here might be able to help!

We’re replicating some of our servers from the primary region to the secondary region. The source servers use a custom AMI based on RHEL 9, and each server is registered with our Red Hat Satellite.

We’ve also successfully installed the **SSM Agent** and the DRS Replication Agent, with the replication agent pointing to the secondary region.

Replication seems to be working fine, but when we start a recovery job, all the recovery instances fail the status check. We receive the following error message:

Instance reachability check failed

I’ve tried multiple times to pull the system logs from the recovery instances created by the service, but the logs appear to be empty.

Has anyone else experienced a similar issue with custom AMIs on AWS Elastic Disaster Recovery? Would really appreciate any advice or suggestions!

1 Answer

The issue you're experiencing with AWS Elastic Disaster Recovery (DRS) where recovery instances fail to pass the status check is not uncommon, especially when using custom AMIs. Here are a few points to consider and potential solutions:

  1. Custom AMI Compatibility: Ensure that your custom RHEL 9 AMI is fully compatible with AWS Elastic Disaster Recovery. Some customizations might interfere with the recovery process.

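  2. Network Configuration: Check that the network settings in your recovery region are correctly configured, including the VPC, subnet, and network interfaces defined in the launch settings. Keep in mind that the instance status check is based on whether the operating system answers on its network interface, so it usually fails because the guest OS did not bring up networking (for example, a stale static IP or MAC-bound interface configuration carried over from the source server) rather than because of security group rules.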

  3. Red Hat Satellite Integration: The registration with Red Hat Satellite might be causing issues during the recovery process. Try a recovery drill with a source server that is temporarily unregistered to see if it resolves the problem.

  4. SSM Agent and DRS Replication Agent: While you've successfully installed these agents, verify that they are the correct versions and are properly configured for your secondary region.

  5. Instance Type Compatibility: Ensure that the instance types you're using for recovery are compatible with your custom AMI and the AWS Elastic Disaster Recovery service in your secondary region.

  6. Operating System Support: Confirm that RHEL 9 is fully supported by AWS Elastic Disaster Recovery in your target region.

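  7. Logs and Monitoring: Since the system logs appear empty, try other ways of seeing what the instance is doing at boot. An EC2 console screenshot often shows a kernel panic, an emergency shell, or a boot loader prompt even when the console output is blank, and the EC2 Serial Console can be used on supported instance types. Also check the DRS job log for the recovery job for conversion or launch errors.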

  8. Post-Launch Actions: AWS Elastic Disaster Recovery has a feature for post-launch actions. You might want to set up some custom scripts to run after instance launch to help diagnose the issue or make necessary configurations.
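
Before working through the list, it can help to confirm which status check is failing. A passing system check combined with a failing instance check generally points at the guest OS (boot, drivers, file systems, network configuration) rather than at AWS infrastructure. The following is a minimal sketch, assuming Python with boto3 and credentials for the recovery region; the function names are hypothetical, and only the `describe_instance_status` call and its response shape come from the EC2 API.

```python
def summarize_status_checks(response):
    """Summarize an EC2 DescribeInstanceStatus response.

    Returns {instance_id: {"system": ..., "instance": ..., "failed": [...]}}.
    """
    summary = {}
    for item in response.get("InstanceStatuses", []):
        failed = []
        # Each check group carries a list of detail entries such as "reachability".
        for scope, key in (("system", "SystemStatus"), ("instance", "InstanceStatus")):
            for detail in item.get(key, {}).get("Details", []):
                if detail.get("Status") != "passed":
                    failed.append(f"{scope}:{detail.get('Name')}={detail.get('Status')}")
        summary[item["InstanceId"]] = {
            "system": item.get("SystemStatus", {}).get("Status"),
            "instance": item.get("InstanceStatus", {}).get("Status"),
            "failed": failed,
        }
    return summary


def fetch_status(instance_ids, region):
    """Call EC2 for the given recovery instances (requires boto3 and credentials)."""
    import boto3  # imported lazily so the parser above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.describe_instance_status(
        InstanceIds=instance_ids,
        IncludeAllInstances=True,  # include instances that are not in the running state
    )
```

Calling `summarize_status_checks(fetch_status([...], region))` with your recovery instance IDs and secondary region gives a per-instance view of which check did not pass.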

If the problem persists after checking these points, it would be advisable to contact AWS support for more detailed troubleshooting, as they can access more specific information about your setup and the recovery process.
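
Because the system logs come back empty, a console screenshot is a useful artifact to collect, both for your own diagnosis and to attach to a support case. Here is a rough sketch, assuming Python with boto3; the helper names and the output path are hypothetical. The EC2 `get_console_screenshot` call returns the image as base64-encoded data in the `ImageData` field, which the helper decodes and writes to disk.

```python
import base64
from pathlib import Path


def save_console_screenshot(image_data_b64, path):
    """Decode base64 image data and write it to path; returns bytes written."""
    data = base64.b64decode(image_data_b64)
    Path(path).write_bytes(data)
    return len(data)


def fetch_console_screenshot(instance_id, region, path):
    """Fetch a console screenshot for one instance (requires boto3 and credentials)."""
    import boto3  # imported lazily so the decoder above works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    # WakeUp=True asks EC2 to wake a blanked console before capturing it.
    resp = ec2.get_console_screenshot(InstanceId=instance_id, WakeUp=True)
    return save_console_screenshot(resp["ImageData"], path)
```

Taking a screenshot a few minutes after launch, once the instance has had time to reach the point where it stalls, tends to be the most informative.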

answered 9 months ago
