Glue Data Catalog configuration when updating with Database Migration Service


I set up a replication task with AWS Database Migration Service to implement full load + CDC from an RDS instance to an S3 bucket. Since I want to use Athena to query the data in S3, I set the option "GlueCatalogGeneration": true so that I wouldn't need to configure a separate crawler to run periodically and pick up the latest data. However, I realized that when DMS generates the tables in the Glue Data Catalog, it sets the option escape.delim to null. This doesn't seem to be a problem for Athena, but if I try to access any table using Spark (e.g. with the create_dynamic_frame_from_catalog option) I receive an IllegalEscaper error. Is there some option in DMS I can configure so that this parameter doesn't get created at all?

Asked 1 month ago, 192 views
2 Answers

Accepted Answer

When DMS replicates data from a database to S3 with Glue catalog generation enabled, it sets certain properties on the generated Glue tables. One such property is escape.delim, which is set to null.

This null value does not cause issues when querying the data from Athena. However, it can cause problems when trying to access the tables from Spark using the create_dynamic_frame_from_catalog option, as Spark expects a non-null escape delimiter value.

There is currently no option in DMS to configure the escape.delim property value. Two workarounds are available:

  • After the initial load and replication are complete, update the Glue table definition manually through the Glue console or API to set a non-null escape delimiter value.
  • Alternatively, instead of using create_dynamic_frame_from_catalog in Spark, you can directly query the data from S3 using Spark SQL without going through the Glue catalog.
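
The first workaround (updating the table definition through the Glue API) could be sketched roughly as follows with boto3. This is a minimal sketch, not a confirmed fix: it assumes that escape.delim lives in the table's SerDe parameters (StorageDescriptor.SerdeInfo.Parameters) and that a backslash is an acceptable escape character for your data. The function names and the database and table names in the usage comment are hypothetical.

```python
import copy

# Fields that Glue's UpdateTable accepts inside TableInput. get_table also
# returns read-only fields (DatabaseName, CreateTime, ...) that must be dropped.
TABLE_INPUT_KEYS = {
    "Name", "Description", "Owner", "LastAccessTime", "LastAnalyzedTime",
    "Retention", "StorageDescriptor", "PartitionKeys", "ViewOriginalText",
    "ViewExpandedText", "TableType", "Parameters", "TargetTable",
}


def build_table_input(table, escape_char="\\"):
    """Build a TableInput from a get_table result, with escape.delim overwritten."""
    table_input = {
        key: copy.deepcopy(value)
        for key, value in table.items()
        if key in TABLE_INPUT_KEYS
    }
    # Assumption: the property is stored in the SerDe parameters.
    serde_params = (
        table_input.get("StorageDescriptor", {})
        .get("SerdeInfo", {})
        .get("Parameters")
    )
    if serde_params is not None and "escape.delim" in serde_params:
        serde_params["escape.delim"] = escape_char
    return table_input


def fix_escape_delim(glue_client, database, table_name, escape_char="\\"):
    """Read the table, replace the escape.delim value, and write it back."""
    table = glue_client.get_table(DatabaseName=database, Name=table_name)["Table"]
    table_input = build_table_input(table, escape_char)
    glue_client.update_table(DatabaseName=database, TableInput=table_input)
    return table_input


# Usage (hypothetical names):
#   import boto3
#   fix_escape_delim(boto3.client("glue"), "my_dms_database", "my_table")
```

The client is passed in as an argument so that the same function can be tried against a stub before it touches a real catalog.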
Answered 1 month ago by an Expert; reviewed 1 month ago by an AWS Expert

I noticed that when using create_dynamic_frame_from_options and reading directly from S3 I don't have the same problem, and I was curious as to why that was the case. Thank you! Now it's clear.
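
For reference, a direct S3 read of this kind might look like the sketch below. It assumes the DMS output is comma-separated CSV without a header row, which may not match your endpoint settings. The helper names and the S3 path in the usage comment are hypothetical, and the awsglue library is only available inside a Glue job or a Glue development environment.

```python
def direct_s3_read_options(s3_path, separator=",", with_header=False):
    """Keyword arguments for create_dynamic_frame.from_options (CSV on S3)."""
    return {
        "connection_type": "s3",
        "connection_options": {"paths": [s3_path], "recurse": True},
        "format": "csv",
        # No escaper is set here, so the catalog's escape.delim is never consulted.
        "format_options": {"separator": separator, "withHeader": with_header},
    }


def read_dms_output(glue_context, s3_path):
    """Read the files under s3_path directly, bypassing the Glue Data Catalog."""
    options = direct_s3_read_options(s3_path)
    return glue_context.create_dynamic_frame.from_options(**options)


# Usage inside a Glue job (hypothetical path):
#   from awsglue.context import GlueContext
#   from pyspark.context import SparkContext
#   glue_context = GlueContext(SparkContext.getOrCreate())
#   dyf = read_dms_output(glue_context, "s3://example-bucket/dms-output/schema/table/")
```

Because the format options are supplied explicitly, the read does not depend on any table properties that DMS wrote to the catalog.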

Answered 1 month ago
