I want to improve the change data capture (CDC) replication performance on my full load and CDC AWS Database Migration Service (AWS DMS) task. The source latency isn't high, but the target latency is high or is increasing.
Short description
By default, AWS DMS uses transactional apply to replicate data in the CDC phase. If your task captures a high number of transactions from the source and causes target latency, then you can activate the batch apply setting.
Note: The Amazon Redshift target uses batch apply by default. The Amazon Simple Storage Service (Amazon S3) target must use transactional apply.
Batch apply works only on tables with a primary key or unique index. For tables without a primary key or unique index, batch apply applies only the inserts in batch mode, and then performs updates and deletes one-by-one. If the table has a primary key or unique index but switches to one-by-one mode, see How can I troubleshoot why Amazon Redshift switched to one-by-one mode?
When you include large binary object (LOB) columns in the replication, you can only use BatchApplyEnabled in limited LOB mode. For more information, see Target metadata task settings.
Note: If you set BatchApplyEnabled to true, and your target has a unique constraint, then AWS DMS generates an error message.
Resolution
Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshooting errors for the AWS CLI. Also, make sure that you're using the most recent AWS CLI version.
BatchApplyEnabled is set to false by default. To activate this setting, use either the AWS CLI or the AWS DMS console. Before you activate batch apply, create an IAM user with programmatic access.
Use the AWS CLI to activate batch apply
Complete the following steps:
- Open the system where you use the AWS CLI.
- Run the aws configure command to open the AWS CLI prompt.
- Enter your AWS access key ID.
- Enter your AWS secret access key.
- Enter the AWS Region of your AWS DMS resources.
- Enter the output format.
- Confirm that the task is in the stopped state.
- Run the modify-replication-task command with the following batch setting:
aws dms modify-replication-task --replication-task-arn arn:aws:dms:region:123456789123:task:4VUCZ6ROH4ZYRIA25M3SE6NXCM --replication-task-settings "{\"TargetMetadata\":{\"BatchApplyEnabled\":true}}"
Note: Replace the example value of --replication-task-arn with your task's Amazon Resource Name (ARN), and region with your Region.
- Open the AWS DMS console.
- In the navigation pane, under Migrate or replicate, choose Tasks.
- Select your task, and then choose Task settings (JSON).
- Confirm that BatchApplyEnabled is set to true.
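The CLI steps above can be sketched as a small script. This is a minimal sketch, not a definitive procedure: the function names are hypothetical, the task ARN that you pass in is a placeholder, and the script assumes that the AWS CLI is already configured. It writes the settings JSON to a temporary file and passes it with the file:// prefix, which avoids shell-specific quote escaping.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical and the task ARN is a placeholder.

# Write the minimal task-settings JSON to a file so that no quote escaping is needed.
write_batch_apply_settings() {
  local settings_file="$1"
  printf '%s\n' '{"TargetMetadata":{"BatchApplyEnabled":true}}' > "$settings_file"
}

# Print the current status of the task (for example, "stopped" or "running").
task_status() {
  aws dms describe-replication-tasks \
    --filters "Name=replication-task-arn,Values=$1" \
    --query 'ReplicationTasks[0].Status' --output text
}

# Modify the task only when it is in the stopped state.
enable_batch_apply() {
  local task_arn="$1" settings_file
  settings_file="$(mktemp)"
  write_batch_apply_settings "$settings_file"
  if [ "$(task_status "$task_arn")" != "stopped" ]; then
    echo "Task is not stopped; stop it before you modify the settings." >&2
    rm -f "$settings_file"
    return 1
  fi
  aws dms modify-replication-task \
    --replication-task-arn "$task_arn" \
    --replication-task-settings "file://$settings_file"
  rm -f "$settings_file"
}
```

To use the sketch, source the script and then run enable_batch_apply with your task ARN as the only argument.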
Use the AWS DMS console to activate batch apply
Complete the following steps:
- Open the AWS DMS console.
- In the navigation pane, under Migrate or replicate, choose Tasks.
- Select your task, and then choose Modify.
- From the Task settings section, choose JSON editor.
- Under TargetMetadata, change BatchApplyEnabled to true.
- Choose Save.
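After you save the change, you can also check the stored value from the AWS CLI instead of the console. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the task ARN is a placeholder, and the value is extracted with grep and sed rather than a JSON parser, so treat it as a quick check only.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical and the task ARN is a placeholder.

# Read task-settings JSON from stdin and print the BatchApplyEnabled value.
batch_apply_value() {
  grep -o '"BatchApplyEnabled"[[:space:]]*:[[:space:]]*[a-z]*' \
    | head -n 1 \
    | sed 's/.*:[[:space:]]*//'
}

# Fetch the stored settings for a task and print the BatchApplyEnabled value.
show_batch_apply() {
  aws dms describe-replication-tasks \
    --filters "Name=replication-task-arn,Values=$1" \
    --query 'ReplicationTasks[0].ReplicationTaskSettings' --output text \
    | batch_apply_value
}
```

Run show_batch_apply with your task ARN. If the change was saved, then the function is expected to print true.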
Troubleshoot a high CDCLatencyTarget after you run a task in batch mode
If the CDCLatencyTarget is high after you run the task in batch mode, then you might experience latency for the following reasons:
- You have a long-running transaction on the target because the target tables have no primary or secondary index.
- You have insufficient resource availability to process the workload on the target.
- You have high resource contention on your AWS DMS replication instance.
To troubleshoot high latency, see Troubleshooting latency issues in AWS Database Migration Service.
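To see how CDCLatencyTarget compares with CDCLatencySource over time, you can pull both metrics from Amazon CloudWatch. The following is a minimal sketch, not a definitive monitoring setup: the function names are hypothetical, all identifiers and timestamps are placeholders, and it assumes that the ReplicationTaskIdentifier dimension takes the task's resource ID (the last segment of the task ARN).

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical; identifiers are placeholders.
# Assumption: ReplicationTaskIdentifier is the task's resource ID from its ARN.

# Print timestamped 5-minute averages of one AWS DMS task metric, oldest first.
dms_task_metric() {
  local metric="$1" instance_id="$2" task_id="$3" start="$4" end="$5"
  aws cloudwatch get-metric-statistics \
    --namespace AWS/DMS \
    --metric-name "$metric" \
    --dimensions "Name=ReplicationInstanceIdentifier,Value=$instance_id" \
                 "Name=ReplicationTaskIdentifier,Value=$task_id" \
    --start-time "$start" --end-time "$end" \
    --period 300 --statistics Average \
    --query 'sort_by(Datapoints,&Timestamp)[].[Timestamp,Average]' --output text
}

# Print target latency minus source latency, in whole seconds.
latency_gap() {
  awk -v t="$1" -v s="$2" 'BEGIN { printf "%d\n", t - s }'
}
```

For example, run dms_task_metric once with CDCLatencyTarget and once with CDCLatencySource for the same time window, and then pass a pair of values to latency_gap. A gap that keeps growing while source latency stays low suggests that the delay is on the target apply side.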
Related information
Monitoring AWS DMS tasks
Change processing tuning settings