loading geospatial data into table
I'm trying to load a JSON file containing a geometry in the "EnclosedJsonInputFormat" following a pattern demonstrated in the Athena geospatial query sample. The table is created but contains no data and I'm assuming there is some sort of parsing error. Is there any way to get more information on the problem?
Hi,
As per the documentation, you need to store "earthquakes.csv" and "california-counties.json" inside of the folder in the S3 bucket.
If you look at the sample earthquake code example, you need to mention the s3 folder path (TEXTFILE LOCATION) when you run the query.
CREATE external TABLE earthquakes
(
earthquake_date string,
latitude double,
longitude double,
depth double,
magnitude double,
magtype string,
mbstations string,
gap string,
distance string,
rms string,
source string,
eventid string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE LOCATION 's3://<your_bucket_name/csv/';
You need to store the .csv file under "csv" folder inside your bucket. Likewise, you also need to store the .json file under "csv" folder inside your bucket.
Make sure you set the Query result location parameter in "Query result and encryption settings" under the settings tab in Athena Console.
Once you do, then you can run the queries and you'll see all the records created in the Athena data catalog table.
Note: Do not add the S3 object name at the end of the TEXTFILE LOCATION in earquake sql table query as well as LOCATION in countries sql query.
For eg. add s3://bucketabc/csv/ instead of s3://<your_bucket_name/csv/earthquakes.csv
I have done the above steps and I can see all the records in both the tables:
Records in Earthquake Table: 77037
Records in Countries Table: 30
I hope this helps.
Relevant questions
Athena and Analytics
asked a month agoAthena query consistently fails with HIVE_CURSOR_ERROR: Failed to read ORC file
asked 2 months agoAmazon Quicksight SPICE Data set partition
Accepted Answerasked 3 years agoHIVE_UNKNOWN_ERROR: Duplicate key string
asked 3 months agoOdd Athena Query Performance Issue
asked 2 days agoAthena - Execute multiple query and capture output in a file on e2
asked a month agocrawled data can not be queried in athena
Accepted Answerasked 5 years agoloading geospatial data into table
asked 4 months agoGlue table not showing in console
asked a month agoHow to store the Athena Query Results in DynamoDB table?
asked 3 years ago
Thanks for your reply. I too can get the earthquakes example to work but I'm using my own JSON files in an S3 bucket. I'm getting them to load now (my original problem was missing a trailing slash on the s3 address) but I'm still wondering how to best debug situations where the data don't parse correctly.