The error you're encountering suggests an issue with AWS DMS (Database Migration Service) attempting to download a backup from your RDS SQL Server instance during the CDC (Change Data Capture) replication process. This error is occurring consistently on Saturdays, which might indicate a conflict with scheduled maintenance or backup operations on your RDS instance.
To collect further diagnostic data:
- Check the CloudWatch logs for your DMS task. The "task log" mentioned in the error message is likely referring to these logs, which may contain more detailed information about the failure.
- Review the RDS event logs for your SQL Server instance, particularly focusing on events occurring around the time of the failure on Saturdays.
- Examine RDS Performance Insights and CloudWatch metrics for your SQL Server instance to identify any resource constraints or unusual activity during the failure period.
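As a rough illustration of the second diagnostic step, the sketch below pulls the RDS events recorded within a window around the failure time. It assumes boto3's RDS `describe_events` call; the function name `rds_events_near`, the instance identifier, and the 30-minute window are hypothetical choices for this example. It ignores pagination, and as far as I know RDS only keeps recent events (about two weeks), so it would need to run soon after a Saturday failure.

```python
from datetime import datetime, timedelta, timezone


def rds_events_near(rds_client, db_instance_id, failure_time, window_minutes=30):
    """Return (date, message) pairs for RDS events near failure_time.

    rds_client is expected to behave like boto3.client("rds").
    db_instance_id and window_minutes are illustrative parameters.
    """
    delta = timedelta(minutes=window_minutes)
    response = rds_client.describe_events(
        SourceIdentifier=db_instance_id,
        SourceType="db-instance",
        StartTime=failure_time - delta,
        EndTime=failure_time + delta,
    )
    # Only the first page of results is read; pagination is not handled here.
    return [(event["Date"], event["Message"]) for event in response.get("Events", [])]


# Possible usage (needs boto3 and AWS credentials; the identifier is a placeholder):
#   import boto3
#   failure = datetime(2025, 4, 26, 8, 26, 39, tzinfo=timezone.utc)
#   for date, message in rds_events_near(boto3.client("rds"), "my-sqlserver-instance", failure):
#       print(date, message)
```

Events such as a backup or a maintenance action starting shortly before the DMS failure would support the scheduling-conflict theory.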
To address this issue:
- Verify that the IAM role associated with your DMS task has the necessary permissions to access RDS backups.
- Check if there are any conflicting maintenance windows or backup schedules for your RDS instance that coincide with the failure time on Saturdays. You may need to adjust these schedules to avoid conflicts with the DMS task.
- Ensure that your RDS instance has sufficient storage space and that you're not hitting any storage limits, which could prevent new backups from being created.
- Consider increasing the backup retention period for your RDS instance. If the backup retention period is set to 0 (disabled), this could cause issues with CDC replication.
- If the issue persists, you may need to reach out to AWS Support as suggested in the error message. Provide them with the task ID (2461 in this case) and any additional logs or information you've gathered.
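The schedule and retention checks above can be sketched in code. The example below assumes the instance description returned by boto3's RDS `describe_db_instances` call, which includes `BackupRetentionPeriod`, `PreferredBackupWindow` (format `hh:mm-hh:mm`) and `PreferredMaintenanceWindow` (format `ddd:hh:mm-ddd:hh:mm`), both in UTC. The helper names and the sample window values are made up for illustration, and it assumes the DMS log timestamps are in UTC.

```python
from datetime import datetime, timezone

DAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]  # matches datetime.weekday()
DAY_MINUTES = 24 * 60
WEEK_MINUTES = 7 * DAY_MINUTES


def _minutes(hhmm):
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def in_backup_window(window, moment):
    """True if moment falls in a daily UTC window such as '03:00-03:30'."""
    start_s, end_s = window.split("-")
    start, end = _minutes(start_s), _minutes(end_s)
    utc = moment.astimezone(timezone.utc)
    now = utc.hour * 60 + utc.minute
    # Modular arithmetic handles windows that wrap past midnight.
    return (now - start) % DAY_MINUTES < (end - start) % DAY_MINUTES


def in_maintenance_window(window, moment):
    """True if moment falls in a weekly UTC window such as 'sat:08:00-sat:08:30'."""
    start_s, end_s = window.lower().split("-")
    start = DAYS.index(start_s[:3]) * DAY_MINUTES + _minutes(start_s[4:])
    end = DAYS.index(end_s[:3]) * DAY_MINUTES + _minutes(end_s[4:])
    utc = moment.astimezone(timezone.utc)
    now = utc.weekday() * DAY_MINUTES + utc.hour * 60 + utc.minute
    # Modular arithmetic handles windows that wrap past the end of Sunday.
    return (now - start) % WEEK_MINUTES < (end - start) % WEEK_MINUTES


def schedule_findings(db_instance, failure_time):
    """db_instance is one entry of describe_db_instances()['DBInstances']."""
    return {
        "backups_disabled": db_instance["BackupRetentionPeriod"] == 0,
        "in_backup_window": in_backup_window(
            db_instance["PreferredBackupWindow"], failure_time),
        "in_maintenance_window": in_maintenance_window(
            db_instance["PreferredMaintenanceWindow"], failure_time),
    }
```

Passing the timestamp of the failure from the task log as `failure_time` shows whether it coincides with either window. A `True` for either window only suggests a conflict worth investigating; it does not prove the cause.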
As a workaround, you mentioned that restarting the task resolves the issue temporarily. You could consider setting up an automated process to restart the task if it fails, perhaps using AWS Lambda triggered by a CloudWatch alarm on the task status.
Remember that while this workaround might keep your replication running, it's crucial to identify and address the root cause of the problem to ensure long-term stability of your DMS replication process.
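A minimal sketch of the automated restart described above, written as a Lambda-style handler. It assumes boto3's DMS client (`describe_replication_tasks` and `start_replication_task` with the `resume-processing` start type). The environment variable `DMS_TASK_ARN`, the function names, and the choice to restart only tasks in the `failed` state are assumptions made for this example. How the function gets triggered (an alarm, a scheduled rule, or a DMS event notification) is left open.

```python
import os


def restart_if_needed(dms_client, task_arn, restart_states=("failed",)):
    """Resume the task when its status is in restart_states; return the status seen.

    dms_client is expected to behave like boto3.client("dms").
    Only 'failed' is restarted by default so that deliberate stops are left alone.
    """
    response = dms_client.describe_replication_tasks(
        Filters=[{"Name": "replication-task-arn", "Values": [task_arn]}]
    )
    tasks = response["ReplicationTasks"]
    if not tasks:
        raise ValueError(f"No replication task found for {task_arn}")
    status = tasks[0]["Status"]
    if status in restart_states:
        # 'resume-processing' continues CDC from where the task left off.
        dms_client.start_replication_task(
            ReplicationTaskArn=task_arn,
            StartReplicationTaskType="resume-processing",
        )
    return status


def lambda_handler(event, context):
    # boto3 is imported here so the helper above can be tested without AWS access.
    import boto3

    # DMS_TASK_ARN is a hypothetical environment variable for this sketch.
    status = restart_if_needed(boto3.client("dms"), os.environ["DMS_TASK_ARN"])
    return {"observed_status": status}
```

Logging each automatic restart with its timestamp would also give a record of how often and when the failure recurs, which is useful evidence for the root-cause investigation.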
Looking further into the logs, I noticed that about 10 minutes before the error, CDC was reported as stopped, which perhaps indicates something wrong with the instance. There is not much information in the instance logs though :(
What should I check next?
2025-04-26T08:26:39 [SOURCE_CAPTURE ]I: STOP signal was detected. Terminating CAPTURE loop. (sqlserver_endpoint_capture.c:748)
2025-04-26T08:26:39 [SOURCE_CAPTURE ]I: End of CDC / CAPTURE events for MS-SQL endpoint. (sqlserver_endpoint_capture.c:1303)
2025-04-26T08:26:39 [TASK_MANAGER ]I: Subtask #0 ended (replicationtask_util.c:595)
2025-04-26T08:26:39 [TASK_MANAGER ]I: Task - <task-nam> is in STOPPED state, updating starting status to AR_NOT_APPLICABLE (repository.c:5483)
2025-04-26T08:26:39 [TASK_MANAGER ]I: Updated Task,<task-name>, info with TaskState - 0 (replicationtask.c:4546)