- Newest
- Most votes
- Most comments
This error occurs because the Open CSV SerDe in Athena doesn't properly handle empty or null values in columns defined as numeric data types (like DOUBLE). When you created your table from the CSV file, the crawler likely defined the column as a DOUBLE, but some rows contain empty or null values that can't be converted to the numeric type.
There are a few ways to solve this issue:
- Create the column with null values as STRING instead of DOUBLE, and then use CAST in your query to convert it to a numeric type while handling nulls:
SELECT CAST(your_column AS DOUBLE) AS your_column_converted FROM your_table
- Or provide a default value for nulls when casting:
SELECT COALESCE(CAST(your_column AS DOUBLE), 0) AS your_column_converted FROM your_table
- Modify your table schema in AWS Glue to change the data type of the problematic column from DOUBLE to STRING, then recrawl the data or manually update the table definition.
This issue is common when working with CSV data in Athena because the Open CSV SerDe recognizes data types like BOOLEAN, BIGINT, INT, and DOUBLE, but doesn't properly handle empty or null values in columns defined as numeric data types, leaving them as strings which then cause type conversion errors during queries.
Sources
Amazon Athena HIVE_BAD_DATA: Error Parsing a column in the table: Cannot convert value of type String to a REAL value | AWS re:Post
Troubleshoot issues in Athena - Amazon Athena
Open CSV SerDe for processing CSV - Amazon Athena
Relevant content
- asked 3 years ago
