Need solution for this Error: Row is not a valid JSON Object - JSONException: Duplicate key "lastupdated"


We imported MIMIC IV data (which is already in .ndjson format) into a HealthLake data store and then exported it. During the import we noticed that the "lastupdated" column was being imported three times, and the exported data shows it three times as well. Querying the data in Athena fails with the error below. If there is a query that removes duplicate keys from a table row, please share it. If anyone has found a solution for this error, please share that too. Thanks in advance.

Query Id: de44bccc-36af-488b-8c3d-bcf7e6d9360f

Row is not a valid JSON Object - JSONException: Duplicate key "lastupdated"

asked 2 years ago, 584 views
1 Answer

I understand that you imported MIMIC IV data (already in .ndjson format) into a HealthLake data store and exported it, that the "lastupdated" column was imported three times, and that your Athena query now fails with the following error:

—————

Row is not a valid JSON Object - JSONException: Duplicate key "lastupdated"

—————

Please note that Athena treats JSON key names as case insensitive by default. This error usually occurs when a record in the underlying source data contains several keys with the same name that differ only in case, for example one in uppercase and another in lowercase. After the names are normalized they collide, and because Athena does not allow duplicate keys within a row, you see this error.
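To confirm that this is what is happening in your exported files, you can check each NDJSON line for keys that collide once case is ignored. The following is a minimal Python sketch, not part of HealthLake or Athena; the function name duplicate_keys is hypothetical, and it assumes each line is a single JSON object.

```python
import json
from collections import Counter


def duplicate_keys(line: str) -> list[str]:
    """Return lowercased key names that occur more than once in any
    object of the JSON line when compared case-insensitively."""
    found: list[str] = []

    def hook(pairs):
        # json calls this hook for every object, nested ones included,
        # with the raw (key, value) pairs before duplicates are collapsed.
        counts = Counter(key.lower() for key, _ in pairs)
        found.extend(sorted(k for k, n in counts.items() if n > 1))
        return dict(pairs)

    json.loads(line, object_pairs_hook=hook)
    return found


sample = '{"lastUpdated": "a", "lastupdated": "b", "id": 1}'
print(duplicate_keys(sample))
```

Running this over a few exported lines shows whether the repeated "lastupdated" keys differ in case (where the case-sensitivity settings can help) or are exact duplicates (where the source data itself would need to be cleaned).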

Therefore, to resolve this issue, you could modify the table so that key names are treated as case sensitive ('case.insensitive'='false') and add a mapping for each problematic column. Alternatively, you could create a new table for testing, using the same DDL as the original table plus these settings.

Using ALTER TABLE:

ALTER TABLE <yourTableName> SET TBLPROPERTIES (
'case.insensitive'='false',
'mapping.col'='Col_Name',
'mapping.write enabled'='Write Enabled')


Creating a new external table:

CREATE EXTERNAL TABLE <new_tablename> (
eventType string, -- example column; provide the columns from your original table DDL here
........
)
ROW FORMAT SERDE '..........'
WITH SERDEPROPERTIES (
'case.insensitive'='false', -- treat key names as case sensitive
'mapping.Column_name'='New_Column_name' -- provide the mapping for the problematic column here
)
LOCATION 's3://<YOUR BUCKET HERE>'


To get the DDL for the original table, you can run the below query:

SHOW CREATE TABLE table_name;
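If the repeated keys are exact duplicates, or if changing the table definition is not an option, a different approach from the one described above is to rewrite the NDJSON files before querying them. The sketch below is an assumption-laden illustration in Python, not an Athena query: the function name drop_duplicate_keys is hypothetical, it keeps the spelling of the first occurrence of a key, and it keeps the value of the last occurrence. Whether the last value is the right one to keep depends on your data.

```python
import json


def drop_duplicate_keys(line: str) -> str:
    """Rewrite one NDJSON line so that no object contains two keys
    that are equal when compared case-insensitively."""

    def hook(pairs):
        merged = {}
        first_spelling = {}  # lowercased key -> spelling seen first
        for key, value in pairs:
            canonical = first_spelling.setdefault(key.lower(), key)
            merged[canonical] = value  # a later value overwrites an earlier one
        return merged

    return json.dumps(json.loads(line, object_pairs_hook=hook))


sample = '{"lastUpdated": "a", "lastupdated": "b", "id": 1}'
print(drop_duplicate_keys(sample))
```

Apply the function to every line of a file and write the results to a new S3 location, then point a test table at that location so the original data stays untouched.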

A similar issue is discussed in detail in the following AWS documentation and Stack Overflow thread:

https://aws.amazon.com/premiumsupport/knowledge-center/json-duplicate-key-error-athena-config/

https://stackoverflow.com/questions/53922517/duplicate-keys-with-amazon-athena-and-open-jsonx-serde

AWS
SUPPORT ENGINEER
answered 2 years ago
AWS
EXPERT
reviewed 2 years ago
