Transformations rules with MongoDB Source to Aurora Mysql Target in document mode do not see _doc column

0

I have an AWS DMS task that I've setup to migrate my MongoDB collection to AWS RDS (Aurora MySql). Both endpoints connect succesfully and I can perform a simple migration in Document mode that has minimal transformation rules (only a schema rename) successfully. The resulting table, complete with the _id preserved as a separate field, is complete with the rest of the document in _doc column. As is commonly necessary, we need to normalize this data into a relational structure for RDS, and we've found that a secondary normalizing insert step is very slow with the volume of data that we have, so the next attempt was to include transformation rules, primarily consisting of add-column entries which allow for a SQLite expression to set a value. In this way the target table was to contain a set of columns with pre-extracted values.

We are attempting to use the below transformation rule:

{
                "rule-type": "transformation",
                "rule-id": "2",
                "rule-name": "extract-field",
                "rule-action": "add-column",
                "rule-target": "column",
                "object-locator": {
                  "table-name": "metadata",
                  "schema-name": "%"
                },
                "value": "field",
                "expression": "json_extract(_doc,'$.field')",
                "data-type": {
                  "type": "string",
                  "length": 15
                }
              }

In the expression we are attempting to use the SQLite json_extract() function to extract the top-level field "field", but in our CloudWatch logs, it is reported that the _doc column cannot be found.

2023-10-13T14:34:22 [DATA_STRUCTURE ]E: SQLite general error. Code <1>, Message <no such column: _doc>. [1000504] (at_sqlite.c:336)
2023-10-13T14:34:22 [TRANSFORMATION  ]E:  Failed to init column calculation expression 'json_extract(_doc,'$.field')' [1000504]  (manipulator.c:1798)

What transformations need to take place such that I can end up with a migration table that consists of the 3 fields: _id, _doc, and field?

asked 7 months ago227 views
1 Answer
0

Thanks for contacting AWS.

The error you are facing is because the column names should start with $ in the transformation rule, therefore the correct value should be '$_doc'.

In AWS documentation you can see examples of using $ [+] https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Tasks.CustomizingTasks.TableMapping.SelectionTransformation.Expressions.html

Further, on testing following could be seen:

  {
            "rule-type": "transformation",
            "rule-id": "425815267",
            "rule-name": "extract-field",
            "rule-action": "add-column",
            "rule-target": "column",
            "object-locator": {
                "table-name": "new",
                "schema-name": "catalog"
            },
            "value": "field",
            "expression": "$_doc",
            "data-type": {
                "type": "string",
                "length": 15
            }
        }
> select * from new;
+--------------------------+------------------------------------------------------------------------------------+-------+
| _id                      | _doc                                                                               | field |
+--------------------------+------------------------------------------------------------------------------------+-------+
| 652caea3a4d85c5d6af773cc | { "_id" : { "$oid" : "652caea3a4d85c5d6af773cc" }, "item" : "card", "qty" : 15.0 } | NULL  |
| 652caeaaa4d85c5d6af773cd | { "_id" : { "$oid" : "652caeaaa4d85c5d6af773cd" }, "item" : "card", "qty" : 15.0 } | NULL  |
+--------------------------+------------------------------------------------------------------------------------+-------+
2 rows in set (0.00 sec)
  • When changed the expression to '$_id', the column was migrated.
     {
            "rule-type": "transformation",
            "rule-id": "425815267",
            "rule-name": "extract-field",
            "rule-action": "add-column",
            "rule-target": "column",
            "object-locator": {
                "table-name": "new",
                "schema-name": "catalog"
            },
            "value": "field",
            "expression": "$_id",
            "data-type": {
                "type": "string",
                "length": 15
            }
        }
> select * from new;
+--------------------------+------------------------------------------------------------------------------------+-----------------+
| _id                      | _doc                                                                               | field           |
+--------------------------+------------------------------------------------------------------------------------+-----------------+
| 652caea3a4d85c5d6af773cc | { "_id" : { "$oid" : "652caea3a4d85c5d6af773cc" }, "item" : "card", "qty" : 15.0 } | 652caea3a4d85c5 |
| 652caeaaa4d85c5d6af773cd | { "_id" : { "$oid" : "652caeaaa4d85c5d6af773cd" }, "item" : "card", "qty" : 15.0 } | 652caeaaa4d85c5 |
+--------------------------+------------------------------------------------------------------------------------+-----------------+
2 rows in set (0.00 sec)
> select * from new;
+--------------------------+------+------+
| oid__id                  | item | qty  |
+--------------------------+------+------+
| 652caea3a4d85c5d6af773cc | card |   15 |
| 652caeaaa4d85c5d6af773cd | card |   15 |
+--------------------------+------+------+
2 rows in set (0.00 sec)
AWS
SUPPORT ENGINEER
answered 7 months ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions