Amazon Athena error on querying DynamoDB exported data

Background

We've configured an export to s3 from dynamodb using the native dynamodb s3 export, and ION as the format output.

After this, we've created a table in Athena

CREATE EXTERNAL TABLE export_2022_07_01_v4 (
  `_PK` STRING,
  URL STRING,
  Item struct<
    `_PK`:string,
    URL:string
  >  
)
ROW FORMAT SERDE
 'com.amazon.ionhiveserde.IonHiveSerDe'
WITH SERDEPROPERTIES (
  "ignore_malformed_ion" = "true"
)
STORED AS ION
LOCATION '...';

Querying this works all right for small simple queries, but attempting to produce a full output with

UNLOAD (
  SELECT Item.URL FROM "helioptileexports"."export_2022_07_01_v4"  WHERE Item.URL IS NOT NULL
 ) to '...'  WITH (format = 'TEXTFILE')

Results in this error

HIVE_CURSOR_ERROR: Syntax error at line 1 offset 2: invalid syntax [state:STATE_BEFORE_ANNOTATION_DATAGRAM on token:TOKEN_SYMBOL_OPERATOR]
This query ran against the "helioptileexports" database, unless qualified by the query. Please post the error message on our forum  or contact customer support  with Query Id: f4ca5812-1194-41f1-bfda-00a1a5b2471b

Questions

Is there a way to make Athena more tolerant of formatting errors on specific files? As shown in the example, we are attempting without success to use ignore_malformed_ion. Is there anything beyond that that can be done?
Is this a bug on DynamoDB ION export process?
Is there any mechanism or logging to identify the files which have the malformed data and remove them?

Shantanu Oak
2年前
Try "ion.ignore_malformed" instead of "ignore_malformed_ion"

トピック

分析データベース

タグ

Amazon Athena Amazon DynamoDB

言語

English

rePost-User-5203860

質問済み 2年前125ビュー

回答なし

新しい順
投票が多い順
コメントが多い順

Amazon Athena error on querying DynamoDB exported data

関連するコンテンツ