Extra Files Added to Source Endpoint During a Full Load


When a DMS data migration task is configured as Full Load with an S3 bucket as the target endpoint, phantom files are added for a few tables when the task runs. The extra files are named with a timestamp rather than LOAD00000X (image below).

Extra Files

My assumption is that these are changes made to the source table during or after the table's migration, since they have an extra column containing 'I', 'U', or 'D' to indicate the data operation, as is common with CDC output.

These files are being replicated to a second S3 bucket, which is the source of another data migration task. However, the "extra" files are causing Table Errors in the second task during full load.
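As a rough illustration of how the two kinds of files could be told apart before they are replicated, here is a minimal Python sketch that classifies object keys by file name. The two patterns are assumptions based only on the names described above (LOAD00000X for full-load files, a timestamp for the extra ones); `classify_key` and the example keys are hypothetical, and the patterns would need adjusting for other file extensions or compression.

```python
import re

# Assumed naming patterns, inferred from the question (not from the task's actual output):
# full-load files look like LOAD00000001.csv, extra files like 20141029-1134010000.csv.
FULL_LOAD_RE = re.compile(r"^LOAD[0-9A-Fa-f]+\.csv$")
TIMESTAMP_RE = re.compile(r"^\d{8}-\d+\.csv$")


def classify_key(key: str) -> str:
    """Classify an S3 object key as 'full_load', 'timestamped', or 'other'."""
    # Only the file name matters, so drop any schema/table prefix.
    name = key.rsplit("/", 1)[-1]
    if FULL_LOAD_RE.match(name):
        return "full_load"
    if TIMESTAMP_RE.match(name):
        return "timestamped"
    return "other"


def full_load_only(keys):
    """Keep only the keys that look like full-load files."""
    return [k for k in keys if classify_key(k) == "full_load"]
```

A replication step could call `full_load_only` on the listed keys and copy just those, so the second task never sees the timestamp-named files.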

For example, when the extra files are there, I am getting errors indicating that the task is expecting 6 columns but found 7 for a given table. When the extra files above are not in the S3 source bucket, there are no Table Errors when the task starts.
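To make the 6-versus-7 column mismatch concrete, the following Python sketch separates a leading operation marker from each row when a row is exactly one column wider than expected. The function name `split_operation_column` and the sample rows are made up for illustration; it also assumes the marker is the first field, as described in the documentation on indicating source DB operations.

```python
import csv
import io

# Operation markers that AWS DMS writes for inserts, updates, and deletes.
OPERATIONS = {"I", "U", "D"}


def split_operation_column(csv_text: str, expected_columns: int):
    """Return a list of (operation, row) pairs.

    operation is 'I', 'U', or 'D' when the row carries a leading marker,
    otherwise None. Rows are returned without the marker.
    """
    result = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        if len(row) == expected_columns + 1 and row[0] in OPERATIONS:
            result.append((row[0], row[1:]))
        else:
            result.append((None, row))
    return result
```

Note that simply dropping the marker is not the same as applying the change: a row marked 'D' describes a delete, so the caller still has to decide what to do with it.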

I tried looking for settings to control this behavior or configure the second task to correctly process the extra files, but I am having no luck.

2 Answers
Accepted Answer

Hi,

Glad to hear that it was answered!

Please also note that the extra fields you are observing in the CSV files are expected behavior: additional fields are added to each migrated record when migrating with AWS DMS to an S3 bucket. As you correctly mentioned, the additional field indicates the operation applied to the record at the source database, i.e. it contains the letter I (INSERT), U (UPDATE), or D (DELETE) to indicate whether the row was inserted, updated, or deleted at the source database.

This behavior can be controlled based on the migration type and by configuring the extra connection attributes includeOpForFullLoad, cdcInsertsOnly, and cdcInsertsAndUpdates.

To learn more about this in detail, please refer to the documentation below [+]:

[+] Indicating source DB operations in migrated S3 data - https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.S3.html#CHAP_Target.S3.Configuring.InsertOps
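As a sketch of how these three attributes might be set programmatically, the Python below builds the corresponding `S3Settings` values and passes them to the boto3 `modify_endpoint` call. The helper names `build_s3_settings` and `apply_s3_settings` are hypothetical, the endpoint ARN is supplied by the caller, and how the update interacts with your endpoint's other existing settings should be checked against the documentation before relying on it.

```python
def build_s3_settings(include_op_for_full_load=False,
                      cdc_inserts_only=False,
                      cdc_inserts_and_updates=False):
    """Build the S3Settings fragment for the operation-indicator attributes."""
    # The documentation states that these two options cannot both be true.
    if cdc_inserts_only and cdc_inserts_and_updates:
        raise ValueError(
            "cdcInsertsOnly and cdcInsertsAndUpdates cannot both be true"
        )
    return {
        "IncludeOpForFullLoad": include_op_for_full_load,
        "CdcInsertsOnly": cdc_inserts_only,
        "CdcInsertsAndUpdates": cdc_inserts_and_updates,
    }


def apply_s3_settings(endpoint_arn, settings):
    """Apply the settings to an existing S3 endpoint (needs AWS credentials)."""
    import boto3  # imported here so the builder works without boto3 installed

    dms = boto3.client("dms")
    return dms.modify_endpoint(EndpointArn=endpoint_arn, S3Settings=settings)
```

For example, `build_s3_settings(cdc_inserts_only=True)` leaves `IncludeOpForFullLoad` false, which per the linked documentation means inserted rows are written without the operation field.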

AWS
SUPPORT ENGINEER
Deepam
answered a year ago
  • @Deepam - Thanks for the response. I read the documentation and have one question. When the task is set to full load and ongoing replication, do these files represent the changes that occur while full load is running and prior to CDC starting? I just want to verify that the changes in these files will not also be captured by CDC.


This was answered in another post by @Pulkit_B. https://repost.aws/questions/QUCqKo0aqQTjGc37B48Usdcg/dms-migration-task-is-not-creating-cdc-files-when-source-endpoint-is-s-3

At some point the PreserveTransactions property was removed, but the cdc-path remained.

answered a year ago
