AWS DMS (DB2 LUW source and S3 as target): Full load data appears to be “replaced” by CDC even with Target table preparation mode = DO_NOTHING

0

When running a DMS (Full load+CDC) task with DB2 LUW as the source and S3 as target, the target shows that data loaded during full load later gets “replaced” by CDC changes. The task’s Target table preparation mode is set to DO_NOTHING, but the behavior still occurs. My expectation was that DO_NOTHING would prevent any replacement/overwrite of the full-load results on the target (or at least preserve the full-load output as-is).

asked 18 days ago · 53 views
2 Answers
0

When using Amazon S3 as a target for AWS DMS, the behavior you're experiencing is actually expected, even with the Target table preparation mode set to DO_NOTHING. This is due to how AWS DMS handles data files in S3 during full load and CDC phases.

With S3 as a target, DO_NOTHING mode means that AWS DMS assumes target tables (or in this case, S3 paths) are pre-created and won't perform any cleanup or truncation operations. However, this doesn't prevent CDC changes from being written to S3 - it simply means DMS won't clear existing data before starting the migration.
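For reference, the preparation mode is set in the task settings JSON under `FullLoadSettings`. The following is a minimal Python sketch that builds only that fragment; the helper name `full_load_settings` is made up for illustration, and a real task settings document contains many more keys.

```python
import json

# Values accepted by TargetTablePrepMode in the DMS task settings.
VALID_PREP_MODES = {"DO_NOTHING", "DROP_AND_CREATE", "TRUNCATE_BEFORE_LOAD"}


def full_load_settings(prep_mode: str) -> str:
    """Hypothetical helper: return a task-settings JSON fragment as a string."""
    if prep_mode not in VALID_PREP_MODES:
        raise ValueError(f"unknown TargetTablePrepMode: {prep_mode}")
    fragment = {"FullLoadSettings": {"TargetTablePrepMode": prep_mode}}
    return json.dumps(fragment, indent=2)


print(full_load_settings("DO_NOTHING"))
```

Note that this setting only governs what happens to the target before the full load starts; it says nothing about what CDC writes afterwards.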

The key issue is that when intensive writing occurs to the source table during the full load phase, or if a task stops and resumes during full load with DO_NOTHING mode, AWS DMS may write duplicate records to the S3 bucket. Additionally, S3 doesn't support automatic updates or cleanups the way traditional database targets do.

During CDC, AWS DMS writes change records to S3 as separate files with timestamps. These CDC files contain the changes (inserts, updates, deletes) that occurred on the source. The "replacement" you're seeing is likely the CDC process writing new files that represent the changed state of the data, rather than literally overwriting the full load files.
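One way to check whether the full-load files are still present is to list the object keys under the table prefix and sort them by file name. The sketch below assumes the naming described in the DMS S3 target documentation (full-load files named `LOAD` plus a hex counter, CDC files named with a date and time stamp); the regular expressions, the function name, and the sample keys are illustrative assumptions, so adjust them to what you actually see in your bucket.

```python
import re
from collections import defaultdict

# Assumed naming patterns (verify against your own bucket):
#   full load: LOAD<hex counter>.<ext>      e.g. LOAD00000001.csv
#   CDC:       <yyyymmdd>-<digits>.<ext>    e.g. 20141029-1134010000.csv
FULL_LOAD_RE = re.compile(r"^LOAD[0-9A-Fa-f]+\.\w+$")
CDC_RE = re.compile(r"^\d{8}-\d+\.\w+$")


def classify_keys(keys):
    """Group S3 object keys into full_load, cdc, and other by file name."""
    groups = defaultdict(list)
    for key in keys:
        name = key.rsplit("/", 1)[-1]  # file name without the folder path
        if FULL_LOAD_RE.match(name):
            groups["full_load"].append(key)
        elif CDC_RE.match(name):
            groups["cdc"].append(key)
        else:
            groups["other"].append(key)
    return dict(groups)


# Hypothetical keys, as they might be returned by an S3 listing.
sample_keys = [
    "myschema/mytable/LOAD00000001.csv",
    "myschema/mytable/2024/01/15/20240115-093012345.csv",
    "myschema/mytable/_notes.txt",
]
print(classify_keys(sample_keys))
```

If the `full_load` group is empty after CDC starts, the files really were removed, which is a different situation from CDC files merely superseding them.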

To better control this behavior, you might consider:

  1. Using TRUNCATE_BEFORE_LOAD mode instead, which ensures a clean state before the full load begins
  2. Implementing date-based folder partitioning to better organize full load versus CDC data
  3. Reviewing your CDC handling settings to understand how changes are being captured and stored in S3
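As a rough illustration of options 2 and 3, the S3 target endpoint settings below enable date-based partitioning and add an operation column and a commit timestamp column to the output. The setting names come from the DMS `S3Settings` structure; the bucket name, folder, column name, and the helper function are placeholders chosen for this sketch. Whether these settings change the behavior described in the question is not something this sketch establishes.

```python
def s3_target_settings(bucket: str, folder: str) -> dict:
    """Hypothetical helper: build an S3Settings dict for a DMS target endpoint."""
    return {
        "BucketName": bucket,
        "BucketFolder": folder,
        # Partition output folders by transaction commit date.
        "DatePartitionEnabled": True,
        "DatePartitionSequence": "YYYYMMDD",
        "DatePartitionDelimiter": "SLASH",
        # Write an operation column for full-load rows too, so full-load
        # and CDC files share the same layout.
        "IncludeOpForFullLoad": True,
        # Add a timestamp column; the column name here is a placeholder.
        "TimestampColumnName": "dms_commit_ts",
    }


settings = s3_target_settings("example-bucket", "dms-output")
print(settings)

# The dict could then be passed to boto3, for example:
#   boto3.client("dms").modify_endpoint(EndpointArn=..., S3Settings=settings)
# (not executed here; the endpoint ARN is left elided on purpose).
```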

The fundamental difference with S3 targets is that data is stored as immutable files rather than updateable database records, so the concept of "replacement" works differently than with traditional database targets.
Sources
Resolve inconsistent data with Amazon S3 as the target for AWS DMS migration | AWS re:Post
Using Amazon S3 as a target for AWS Database Migration Service - AWS Database Migration Service
Target table preparation mode - AWS Database Migration Service

answered 18 days ago
EXPERT
reviewed 18 days ago
  • We have already implemented date-based folder partitioning to better separate full-load data from CDC data, but DMS is still deleting the full-load partition.

0

AWS DMS with S3 as a target does not preserve full-load output “as-is” when CDC is enabled, even with TargetTablePrepMode=DO_NOTHING. This is expected behavior because S3 is a file-based target: CDC writes new change files that reflect ongoing inserts, updates, and deletes from the source, and those files can appear to “replace” or supersede the full-load data.
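Because the change files are appended rather than merged, a consumer usually has to replay them over the full-load rows to get the current state. The sketch below shows that idea in Python, assuming each CDC record carries an `Op` field with `I`, `U`, or `D` (as DMS writes for S3 targets) and that rows have a single primary-key column; the function name, the `id` key, and the sample rows are illustrative assumptions.

```python
def apply_cdc(full_load_rows, cdc_records, key="id"):
    """Replay CDC records, in order, over full-load rows keyed by `key`."""
    state = {row[key]: dict(row) for row in full_load_rows}
    for rec in cdc_records:
        op = rec["Op"]
        row = {k: v for k, v in rec.items() if k != "Op"}
        if op in ("I", "U"):
            # Insert or update: the latest record wins.
            state[row[key]] = row
        elif op == "D":
            # Delete: drop the row if it is present.
            state.pop(row[key], None)
        else:
            raise ValueError(f"unexpected Op value: {op!r}")
    return state


# Hypothetical data standing in for parsed full-load and CDC files.
full_load = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
changes = [
    {"Op": "U", "id": 1, "name": "a2"},
    {"Op": "D", "id": 2, "name": "b"},
    {"Op": "I", "id": 3, "name": "c"},
]
print(apply_cdc(full_load, changes))
```

The replay only gives correct results if CDC records are processed in commit order, so sort the CDC files by name (or by a timestamp column, if one is configured) before applying them.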

https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Tasks.CustomizingTasks.TaskSettings.FullLoad.html

EXPERT
answered 18 days ago
EXPERT
reviewed 18 days ago
