Al usar AWS re:Post, aceptas las AWS re:Post Términos de uso

Reading dynamo DB based glue catalog table using redshift external table

0

I have created a glue catalog table on top of a dynamo DB table using glue crawler.

I'm trying to read this glue table by creating a external schema in redshift on top of glue database.

Glue catalog and redshift are located in different accounts, so I have configured required IAM permissions (DynamoDB and glue is located on the same account).

Following is the CREATE EXTERNAL SCHEMA used to create the external schema :

create external schema a2dynamodb from data catalog database 'a2-dynamodb' region 'us-east-1' iam_role 'arn:aws:iam::<RedshiftAccountID>:role/CrossAccountGlueToRedshift,arn:aws:iam::<GlueAccountID>:role/GlueDynamoAnalyticsAccountRole';

Once the schema is created based on Glue database, redshift is automatically fetching all the tables in the glue database as external tables under the a2dynamodb external schema.

When I try to query these tables, I get the follwoing error on redshift:

[XX000] ERROR: S3Location [2023-10-27 15:18:41] Detail: [2023-10-27 15:18:41] ----------------------------------------------- [2023-10-27 15:18:41] error: S3Location [2023-10-27 15:18:41] code: 8001 [2023-10-27 15:18:41] context: Invalid S3 Path: arn:aws:dynamodb:us-east-1:<DynamoDBAccountID>:table/Users [2023-10-27 15:18:41] query: 32988533 [2023-10-27 15:18:41] location: s3location.cpp:146 [2023-10-27 15:18:41] process: padbmaster [pid=1073840402]

Folowing is the CREATE TABLE query of the auto fetched external table :

create external table a2dynamodb.users ( devadmin boolean, costcenterid string, systemuser boolean, buildingcode string, costcentername string, fullname string, userid string, email string, status string, username string, managerusername string ) row format serde 'com.amazon.ionhiveserde.IonHiveSerDe' stored as inputformat 'com.amazon.ionhiveserde.formats.IonInputFormat' outputformat 'com.amazon.ionhiveserde.formats.IonOutputFormat' location 'arn:aws:dynamodb:us-east-1:<DynamoDBAccountID>:table/Users' table properties ('CrawlerSchemaDeserializerVersion'='1.0', 'CrawlerSchemaSerializerVersion'='1.0', 'UPDATED_BY_CRAWLER'='A2-DynamoDB-Users');

Why am I getting Invalid S3 path error when the external tables are based on dynamoDB tables?

preguntada hace un año571 visualizaciones
2 Respuestas
0
Respuesta aceptada

Use DynamoDB as source for Glue table and direct integration with Amazon Redshift is not yet supported. Alternatively you can get the data synced into Redshift and then query it.

The Redshift COPY command supports loading DynamoDB table into Redshift table. This is good for one-time load. Refer https://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-data-source-dynamodb.html

For continued replication refer https://aws.amazon.com/blogs/database/dynamodb-streams-use-cases-and-design-patterns/ which describes leveraging DynamoDB Streams.

profile pictureAWS
EXPERTO
respondido hace un año
0

Regarding the continuous replication, an article was recently posted which shows how to use a stored procedure to convert the JSON payload to tables/columns and to even handle schema changes:

https://repost.aws/articles/ARlyMMJAeUQLWZd8p3c1Jz6A/automated-change-data-capture-cdc-data-ingestion-from-dynamodb-to-redshift

profile pictureAWS
EXPERTO
respondido hace un año

No has iniciado sesión. Iniciar sesión para publicar una respuesta.

Una buena respuesta responde claramente a la pregunta, proporciona comentarios constructivos y fomenta el crecimiento profesional en la persona que hace la pregunta.

Pautas para responder preguntas