Why is no data migrated from my Amazon S3 source endpoint even though my AWS DMS task is successful?

My AWS Database Migration Service (AWS DMS) task is successful, but no data is migrated from my Amazon Simple Storage Service (Amazon S3) source endpoint. Why isn't my data migrating, and how do I troubleshoot this issue?

Short description

The following are the most common scenarios where an AWS DMS task is successful but no data is migrated:

  • The task status is Load complete, replication ongoing, but no data is loaded on the target.
  • The task status is Load complete, replication ongoing, but no table is in the Table statistics section.
  • The task status is Running and the table is created in the target endpoint, but no data is loaded. Also, you receive a No response body error in the replication log.

Resolution

The task status is Load complete, replication ongoing, but no data is loaded on the target

Confirm that the Amazon S3 path defined for the source endpoint is correct. Review the replication log for entries that indicate that AWS DMS can't find the data files in that path. See the following example replication log entries:

[SOURCE_UNLOAD ]I: Unload finished for table 'dms_schema'.'dms_table' (Id = 1). 0 rows sent. (streamcomponent.c:3396)
[TARGET_LOAD ]I: Load finished for table 'dms_schema'.'dms_table' (Id = 1). 0 rows received. 0 rows skipped. Volume transferred 0. (streamcomponent.c:3667)

In Amazon S3, the data file for the full load phase (data.csv) and the data file for ongoing changes (change_data.csv) are stored in paths like the following:

  • S3-bucket/dms-folder/sub-folder/dms_schema/dms_table/data.csv
  • S3-bucket/dms-folder/sub-folder/dms-cdc-path/dms-cdc-sub-path/change_data.csv

The Amazon S3 source endpoint uses three important fields to find the data files:

  • Bucket folder
  • Change data capture (CDC) path
  • Table structure

In the example file paths listed previously, dms-folder/sub-folder is the Bucket folder. The CDC path that you enter when creating the Amazon S3 source endpoint is dms-cdc-path/dms-cdc-sub-path. The following example Table structure uses the same example file path listed previously:

{
  "TableCount": 1,
  "Tables": [
    {
      "TableColumns": […],
      "TableColumnsTotal": "1",
      "TableName": "dms_table",
      "TableOwner": "dms_schema",
      "TablePath": "dms_schema/dms_table/"
    }
  ]
}

Important: Don't include the bucket folder path (dms-folder/sub-folder) in the TablePath of the table structure.

When specifying your Endpoint configuration, consider the following points (example endpoint settings follow this list):

  • The bucket folder is optional. If a bucket folder is specified, then the CDC path and table path (the TablePath field for a full load) must be in the same folder in Amazon S3. If the bucket folder isn't specified, then the TablePath and CDC path are directly under the S3 bucket.
  • The Bucket folder field of the Amazon S3 source endpoint can be any folder directory between the S3 bucket name and the schema name of the table structure (in the previous example, the schema name is dms_schema). If you don't have a hierarchy of folders under the S3 bucket, then you can leave this field blank.
  • Bucket folders or CDC paths can be individual folders or they can include subfolders, such as dms-folder or dms-folder/sub-folder.
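If you create or modify the source endpoint programmatically, the Bucket folder, CDC path, and Table structure fields correspond to the BucketFolder, CdcPath, and ExternalTableDefinition fields of the S3Settings structure in the AWS DMS API. The following is a minimal sketch of these settings for the example paths in this article. The service access role ARN and bucket name are placeholder values, and ExternalTableDefinition holds the table structure JSON as an escaped string (shortened here):

{
  "ServiceAccessRoleArn": "arn:aws:iam::123456789012:role/dms-s3-source-role",
  "BucketName": "S3-bucket",
  "BucketFolder": "dms-folder/sub-folder",
  "CdcPath": "dms-cdc-path/dms-cdc-sub-path",
  "ExternalTableDefinition": "{ \"TableCount\": 1, \"Tables\": [ ... ] }"
}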

If your DMS task uses Amazon S3 as the source endpoint, then you must include the schema and table in the table mapping, as in the following example. This is required to successfully migrate data to the target. For more information, see Source data types for Amazon S3.
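
The following table mapping is a minimal sketch of a selection rule for the example table structure in this article; the rule ID and rule name are arbitrary placeholder values:

{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "include-dms-table",
      "object-locator": {
        "schema-name": "dms_schema",
        "table-name": "dms_table"
      },
      "rule-action": "include"
    }
  ]
}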

If you use Drop tables on target as the table preparation mode for the task, then AWS DMS creates the target table dms_schema.dms_table. See the following example:

CREATE TABLE 'dms_schema'.'dms_table' (...);

Note: Folder and object names in Amazon S3 are case-sensitive. Be sure to specify both folder and object names with correct case in the S3 endpoint.

The task status is Load complete, replication ongoing, but no table is in the Table statistics section

You might find that no table was created in the target endpoint even though Drop tables on target was used as the table preparation mode. In this case, the issue might be caused by the table structure that is specified for the Amazon S3 source endpoint.

Confirm that the Amazon S3 path for the source endpoint is correct, as described previously. Then, verify that your data type is supported by the Amazon S3 endpoint.

After confirming that the Amazon S3 path is correct and that your data type is supported, check whether the selection rules or filters defined in the table mapping of your DMS task exclude the missing tables. For each table that the task needs, confirm that the table mapping selects it and that the table is defined in the table structure of the Amazon S3 source endpoint, as in the following example.
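
For example, a selection rule like the following uses a hypothetical schema pattern that doesn't match the TableOwner value (dms_schema) in the table structure. The table is never selected, so it doesn't appear in the Table statistics section:

{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "non-matching-schema-pattern",
      "object-locator": {
        "schema-name": "prod_%",
        "table-name": "%"
      },
      "rule-action": "include"
    }
  ]
}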

The task status is Running and the table is created in the target endpoint, but no data is loaded. Also, you receive a No response body error in the replication log

If AWS DMS can't get the content from the Amazon S3 path, you can find errors in the replication log. See the following examples:

[SOURCE_CAPTURE ]E: No response body. Response code: 403 [1001730] (transfer_client.cpp:589)
[SOURCE_CAPTURE ]E: failed to download file </dms-folder/sub-folder/dms_schema/dms_table/data.csv> from bucket <dms-test> as </rdsdbdata/data/tasks/NKMBA237MEB4UFSRDF5ZAF3EZQ/bucketFolder/dms-folder/sub-folder/dms_schema/dms_table/data.csv>,
                status = 4 (FAILED) [1001730] (transfer_client.cpp:592)

This error occurs when the AWS Identity and Access Management (IAM) role for the Amazon S3 source endpoint doesn't have the s3:GetObject permission. To resolve this error, confirm that the data file exists in the Amazon S3 path that is in the error message. Then, confirm that the IAM role for the source endpoint has the s3:GetObject permission, as in the following example policy.
Note: If the source Amazon S3 bucket has versioning turned on, then the additional s3:GetObjectVersion permission is also required.
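
The following is a minimal sketch of a permissions policy for the endpoint's service access role, using the bucket name dms-test from the example error message as a placeholder. It also includes s3:ListBucket, which AWS DMS uses to list the data files in the source bucket, and s3:GetObjectVersion for buckets that have versioning turned on:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion"
      ],
      "Resource": "arn:aws:s3:::dms-test/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::dms-test"
    }
  ]
}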


Related information

Using Amazon S3 as a source for AWS DMS
