Athena: Missing columns from DDB data source

3

Hi,

I'm using a DynamoDB connector in Athena to be able to query a DDB table and also use the data in QuickSight. I was able to set up the data source and query the table fine for the most part, but I've noticed that there are several columns missing in Athena. I've verified that they exist in DDB, but when I try to query in Athena (e.g. SELECT column_name FROM table_name), it says that the column cannot be resolved.

I'd appreciate any tips/pointers -- thank you!

  • Same issue is occurring for me now. The initial scan of the table works on the install, but the table definitions are not updated with newly added columns.

asked 2 years ago1622 views
2 Answers
0

I have been struggling with this one myself. I achieved solving my problem by crawling the DynamoDB table with AWS Glue and let AWS Glue infer the schema.

  1. In AWS Glue, create a new database with its Location URI or Location set to the value dynamo-db-flag - use this exact value; otherwise, Athena won't see the future table.

  2. In AWS Glue, create a crawler:

  • Select DynamoDB as the data source.
  • For the table name, give the same table name as your dynamodb table.
  • For the target database, choose the database you just created.
  • Run the crawler.
  1. In AWS Glue, after the Crawler ran successfully, verify the table's schema, which the crawler just populated. Edit the table in Glue to add two additional table properties (Table > Edit Table Property > Table Properties):
  • sourceTable is the exact name of your dynamodb table, matching case. E.g. if my dynamodb table is named DbUsersTableXDXDXDXDX-SZSZSZSZSZSZSZ, sourceTable property's value has to be DbUsersTableXDXDXDXDX-SZSZSZSZSZSZSZ.
  • columnMapping defines column name mappings. In AWS Glue catalog, column names should be in lowercase. So, this entry is necessary. E.g. if your table has 3 column: id, email, createdAt, the value of columnMapping is id=id,email=email,createdat=createdAt, (Glue column name) in the left side of each mapping (=).
  1. After saving the changes, go to Amazon Athena, for the DataSource DynamoDB (from the dynamodb connector), select the new database you created in (1), the table should be here.
Herve
answered a year ago
  • I'm worried about this solution. I have used the existing database (dynamoDBconnection.default) as the data source for QuickSight, which has an ETL built on those tables. I don't know if I can change that data source easily.

-2
  • This would be a slightly different issue. Right now, queries in Athena don't show the column, so they are not expected in Quicksight which is using Athena as a data source

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions