Reading dynamo DB based glue catalog table using redshift external table

0

I have created a glue catalog table on top of a dynamo DB table using glue crawler.

I'm trying to read this glue table by creating a external schema in redshift on top of glue database.

Glue catalog and redshift are located in different accounts, so I have configured required IAM permissions (DynamoDB and glue is located on the same account).

Following is the CREATE EXTERNAL SCHEMA used to create the external schema :

create external schema a2dynamodb from data catalog database 'a2-dynamodb' region 'us-east-1' iam_role 'arn:aws:iam::<RedshiftAccountID>:role/CrossAccountGlueToRedshift,arn:aws:iam::<GlueAccountID>:role/GlueDynamoAnalyticsAccountRole';

Once the schema is created based on Glue database, redshift is automatically fetching all the tables in the glue database as external tables under the a2dynamodb external schema.

When I try to query these tables, I get the follwoing error on redshift:

[XX000] ERROR: S3Location [2023-10-27 15:18:41] Detail: [2023-10-27 15:18:41] ----------------------------------------------- [2023-10-27 15:18:41] error: S3Location [2023-10-27 15:18:41] code: 8001 [2023-10-27 15:18:41] context: Invalid S3 Path: arn:aws:dynamodb:us-east-1:<DynamoDBAccountID>:table/Users [2023-10-27 15:18:41] query: 32988533 [2023-10-27 15:18:41] location: s3location.cpp:146 [2023-10-27 15:18:41] process: padbmaster [pid=1073840402]

Folowing is the CREATE TABLE query of the auto fetched external table :

create external table a2dynamodb.users ( devadmin boolean, costcenterid string, systemuser boolean, buildingcode string, costcentername string, fullname string, userid string, email string, status string, username string, managerusername string ) row format serde 'com.amazon.ionhiveserde.IonHiveSerDe' stored as inputformat 'com.amazon.ionhiveserde.formats.IonInputFormat' outputformat 'com.amazon.ionhiveserde.formats.IonOutputFormat' location 'arn:aws:dynamodb:us-east-1:<DynamoDBAccountID>:table/Users' table properties ('CrawlerSchemaDeserializerVersion'='1.0', 'CrawlerSchemaSerializerVersion'='1.0', 'UPDATED_BY_CRAWLER'='A2-DynamoDB-Users');

Why am I getting Invalid S3 path error when the external tables are based on dynamoDB tables?

質問済み 7ヶ月前320ビュー
2回答
0
承認された回答

Use DynamoDB as source for Glue table and direct integration with Amazon Redshift is not yet supported. Alternatively you can get the data synced into Redshift and then query it.

The Redshift COPY command supports loading DynamoDB table into Redshift table. This is good for one-time load. Refer https://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-data-source-dynamodb.html

For continued replication refer https://aws.amazon.com/blogs/database/dynamodb-streams-use-cases-and-design-patterns/ which describes leveraging DynamoDB Streams.

profile pictureAWS
回答済み 7ヶ月前
0

Regarding the continuous replication, an article was recently posted which shows how to use a stored procedure to convert the JSON payload to tables/columns and to even handle schema changes:

https://repost.aws/articles/ARlyMMJAeUQLWZd8p3c1Jz6A/automated-change-data-capture-cdc-data-ingestion-from-dynamodb-to-redshift

profile pictureAWS
エキスパート
Rajiv_G
回答済み 6ヶ月前

ログインしていません。 ログイン 回答を投稿する。

優れた回答とは、質問に明確に答え、建設的なフィードバックを提供し、質問者の専門分野におけるスキルの向上を促すものです。

質問に答えるためのガイドライン

関連するコンテンツ