When I run a SELECT COUNT(*) FROM TABLE query, the output is "1", but the input file has multiple JSON records. I used org.openx.data.jsonserde.JsonSerDe to create the table.
Resolution
When records aren't separated by a newline character (\n), the SELECT COUNT(*) FROM TABLE query returns "1."
For example, the three records in the following JSON file are not separated by a newline character:
{"ID":"1","MYTIMESTAMP":"2020-06-09 17:50:46.267","Name":"John"}{"ID":"2","MYTIMESTAMP":"2020-06-10 17:50:46.267","Name":"Joel"}{"ID":"3","MYTIMESTAMP":"2020-06-10 17:50:46.267","Name":"Patrick"}
Add a newline character after each record in the previous JSON file to get the following JSON file:
{"ID":"1","MYTIMESTAMP":"2020-06-09 17:50:46.267","Name":"John"}
{"ID":"2","MYTIMESTAMP":"2020-06-10 17:50:46.267","Name":"Joel"}
{"ID":"3","MYTIMESTAMP":"2020-06-10 17:50:46.267","Name":"Patrick"}
If you run a SELECT COUNT(*) FROM TABLE query on the new JSON file, then the query returns "3."
Note: The SerDe expects each JSON document to be on a single line of text with no line termination characters separating the fields in the record. For more information, see Best practices for reading JSON data.
Related information
SELECT
CREATE TABLE
Run SQL queries in Amazon Athena