Date_parse large files for Grafana with Athena

0

I am trying to use Amazon Grafana with files on Amazon S3. For that, I need to change the date format of the original file to fit on Grafana. I use the following command : SELECT cast(date_parse("date", '%d.%m.%Y %H:%i:%s') as timestamp) FROM "example_logs";

However, this command works as long as the log file "example_logs" is not too long (~10,000 rows). If the file is too long, I get the following error : INVALID_FUNCTION_ARGUMENT: Invalid format: "..."

I tried to use a crawler on Glue to partition the data, but I get the same error.

  • The error you provided seems like there's a timestamp on your data that is not in the format you specified, so date_parse() will fail.

1 Answer
0
Accepted Answer

Yes, but I checked the data and there is no row with the format "...". It seems like Amazon does not acknowledge the rows after a certain amount and add and ellipsis (...) row on which it applies the query.

answered 8 months ago
  • Have you really checked all records ? you can easily do it now with spark notebooks on athena.

  • I found out that some rows didn't fit the date_parse format by reviewing the files. I modified the request to exclude rows that don't fit the format with : SELECT cast(date_parse("date", '%d.%m.%Y %H:%i:%s') as timestamp) FROM "example_logs" WHERE "date" LIKE '0%' OR "date" LIKE '1%'OR "date" LIKE '2%'OR "date" LIKE '3%'. It means I only take rows where the first character is 0;1;2 or 3. It works because the specific lines where it doesn't fit, the data is corrupted and is of format Nan but I expect it is not the most robust solution.

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions