By using AWS re:Post, you agree to the Terms of Use
/DMS - How to retain non-ASCII values when running DMS task from MySQL source to Redshift target/

DMS - How to retain non-ASCII values when running DMS task from MySQL source to Redshift target

0

Hello,

We're facing an issue where when running a DMS task with a MySQL source to a Redshift target, columns containing non-ASCII values are being converted/replaced with UTF-8 hex values. Is there a way to disable this? We would like to keep these values as-is when migrated over to Redshift.

*For example, the below source on MySQL is automatically converted to the below target value when DMS'd to Redshift:

Source value(MySQL): ' feW®ê EªAÆð sWÊ' zÒ*%ÿ 5 ß ' ' Target value(Redshift): '92666557AEEA1E45AA41C6F0160A7357CA27027AD22A25FF1135020ADF122706'

Edit:

Issue is actually with BLOB datatypes in general. BLOB data types from MySQL are automatically converted to hex values when DMSed to Redshift.

Thanks!

  • Hi,

    Based on the description, issue seems related to non-ASCII. I can help investigate the issue if you may share below:

    1. Table DDL from MySQL source.
    2. Depends on TARGET Prep Mode (If DO_NOTHING and DMS create the target table then no need Redshift DDL). Please share the Redshift DDL
    3. Let me know which column contains this value
    4. I will try to test it out

    Note: If table DDL not able to be shared, you can just let me know the column datatype for the MySQL and Target Redshift.

  • Thanks for reaching out!

    I investigated this more and found that the issue is actually with BLOB datatype columns in general. So if we run a DMS full load task from source MySQL to target Redshift, the source values are automatically converted to hex values, regardless if the values are ASCII or not.

    For example, I created a test table in the MySql source with one column of BLOB data type, and then inserted value 'TEST 3'. Then, I ran a full load DMS task to Redshift. When the task completed, the table was successfully moved to Redshift, but value 'TEST 3' changed to '544553542033', data type character varying(65535).

    Still unsure why this happens, but I hope this additional info helps.

    Let me know your thoughts!

2 Answers
0

Hello AWS-User-8809738, Do you still need help? If so, kindly provide the information.

MODERATOR
answered a month ago
0

Hi AWS-User-8809738

Thanks for the additional info. This seems to be the data converting issue.

Data converting steps:

MySQL (BLOB data) --> DMS data type (BYTES(66535)) --> VARCHAR (Length)

https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.MySQL.html#CHAP_Source.MySQL.DataTypes https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Target.Redshift.html#CHAP_Target.Redshift.DataTypes

I am thinking we can test using https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Transformations.html#CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Transformations.Values

So convert the Bytes to String using the mapping change-data-type see if this can help fix the issue.

Note: as mentioned this datatype change is regarding to DMS datatype not source or target datatype.

If this does not work, feel free to report this to the support team if you have support plan. If not I can help report this internally.

SUPPORT ENGINEER
answered 17 days ago
  • I did a quick test for this issue using s3 as target, Note: Redshift is loading s3 so essentially this can be the same thing.

    I do notice the same that BLOB column from source MySQL covert to hex value at target s3:

    mysql> select * from blobtest;
    +----+------------+
    | id | blobcolumn |
    +----+------------+
    |  1 | 2          |
    |  2 | -240       |
    +----+------------+
    2 rows in set (0.05 sec)
    
    mysql> select hex(blobcolumn) from blobtest;
    +-----------------+
    | hex(blobcolumn) |
    +-----------------+
    | 32              |
    | 2D323430        |
    +-----------------+
    2 rows in set (0.05 sec)

    S3 target:

    Op,DMS_TIMESTAMP,id,blobcolumn
    I,2022-05-13 22:30:27.460339,1,32
    I,2022-05-13 22:30:27.460364,2,2D323430

    Note: change datatype won't work in this case.

    I believe Service team is aware of the issue. You may also report this as feature request if needed.

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions