When I run queries on my Amazon Athena table, I get an "HIVE_CURSOR_ERROR" exception.
Resolution
You might get this exception under one of the following conditions:
- The data objects in your table location are corrupt, aren't valid, or are incorrectly compressed.
- The records within the table data aren't valid (for example, a malformed JSON record).
Common HIVE_CURSOR_ERROR exceptions
You might get one of the following HIVE_CURSOR_ERROR exceptions when there is an issue with the underlying objects:
- HIVE_CURSOR_ERROR: incorrect header check
- HIVE_CURSOR_ERROR: invalid block type
- HIVE_CURSOR_ERROR: incorrect data check
- HIVE_CURSOR_ERROR: Unexpected end of input stream
- HIVE_CURSOR_ERROR: invalid stored block lengths
- HIVE_CURSOR_ERROR: invalid distance code
If you recently added new objects to your Amazon Simple Storage Service (Amazon S3) table location, then be sure that these objects are valid and aren't corrupt. Download the objects, and then inspect them using an appropriate tool. For example, decompress the GZIP-compressed objects to check whether the compression is valid.
If your table is partitioned, check whether you can query individual partitions. New partitions in the table might contain objects that aren't valid.
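For example, a query similar to the following can help you narrow the problem down to the objects in a single partition. This is a sketch that assumes a hypothetical partition column named dt; the "$path" pseudo-column shows the Amazon S3 object that each row comes from:

-- Query one partition at a time to isolate the objects that aren't valid
SELECT "$path", example_column
FROM example_table
WHERE dt = '2024-01-01'
LIMIT 10;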
Specific HIVE_CURSOR_ERROR exceptions
HIVE_CURSOR_ERROR: Row is not a valid JSON Object
You get this error when a row within the table is not a valid JSON record.
To resolve this issue, do the following:
- Recreate the table with the SerDe property 'ignore.malformed.json' = 'true' (see the sketch after this list).
- Query the new table to identify the files with malformed records by running a command similar to the following:
SELECT "$path" FROM example_table WHERE example_column = NULL
For more information, see Resolve JSON errors in Amazon Athena and OpenX JSON SerDe Options.
HIVE_CURSOR_ERROR: Corrupted uncompressed block
You get this error when the objects in your table location are compressed with the LZO format, but the table isn't defined with an LZO-aware input format.
To avoid this error with LZO-compressed data, recreate the table with the INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'. For example:
CREATE EXTERNAL TABLE example_table (
  example_column_1 STRING,
  example_column_2 STRING
)
STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://example_bucket/example_prefix/';
You can also resolve this error by using a different compression format.
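For example, if you recompress the data as GZIP, a plain text table reads the .gz objects without a special input format. This is a sketch with placeholder names and an assumed comma-delimited layout:

-- Athena decompresses .gz text objects automatically for text-format tables
CREATE EXTERNAL TABLE example_table_gzip (
  example_column_1 STRING,
  example_column_2 STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LOCATION 's3://example_bucket/example_gzip_prefix/';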
HIVE_CURSOR_ERROR: Can not read value at 0 in block 0 in file
You get this error when there are Parquet objects that aren't valid in the S3 path. Check whether your Parquet objects contain DECIMAL data types. If the Parquet objects contain DECIMAL data types and were written with Spark, then they might be incompatible with Hive. To check this condition, try to read the Parquet objects from a Spark job.
If you can read the Parquet objects from the Spark job, then rewrite the Parquet objects in legacy mode in Spark with the following configuration:
spark.sql.parquet.writeLegacyFormat = true
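As a sketch, one way to apply this setting is in a Spark SQL session and then rewrite the data with a CTAS statement. The table and location names here are placeholders, and the approach assumes that the source data is already registered as a table in Spark:

-- Run in a Spark SQL session (for example, the spark-sql shell)
SET spark.sql.parquet.writeLegacyFormat=true;

-- Rewrite the Parquet data to a new S3 prefix in the legacy layout
CREATE TABLE example_rewritten
USING PARQUET
LOCATION 's3://example_bucket/example_rewritten_prefix/'
AS SELECT * FROM example_source;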
For information about Spark configuration options for Parquet, see Configuration in the Spark SQL guide.
HIVE_CURSOR_ERROR: org.apache.hadoop.io.ArrayWritable cannot be cast to org.apache.hadoop.io.Text
You get this error when the table definition uses an incorrect SerDe. For example, the table might use a JSON SerDe while the source data includes Parquet objects.
To resolve this error, check the source data and confirm that the correct SerDe is used. For more information, see Supported SerDes and data formats.
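For example, you can check the table's current SerDe, and, if the underlying objects are actually Parquet, recreate the table with the Parquet format. This is a sketch with placeholder table and column names:

-- Show the table's current DDL, including its SerDe
SHOW CREATE TABLE example_table;

-- If the objects are Parquet, recreate the table accordingly
CREATE EXTERNAL TABLE example_table_parquet (
  example_column_1 STRING,
  example_column_2 STRING
)
STORED AS PARQUET
LOCATION 's3://example_bucket/example_prefix/';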
Related information
How do I resolve "HIVE_CURSOR_ERROR: Row is not a valid JSON Object - JSONException: Duplicate key" when reading files from AWS Config in Athena?
Troubleshooting in Athena