Date_parse large files for Grafana with Athena

0

I am trying to use Amazon Grafana with files on Amazon S3. For that, I need to change the date format of the original file to fit on Grafana. I use the following command : SELECT cast(date_parse("date", '%d.%m.%Y %H:%i:%s') as timestamp) FROM "example_logs";

However, this command works as long as the log file "example_logs" is not too long (~10,000 rows). If the file is too long, I get the following error : INVALID_FUNCTION_ARGUMENT: Invalid format: "..."

I tried to use a crawler on Glue to partition the data, but I get the same error.

  • The error you provided seems like there's a timestamp on your data that is not in the format you specified, so date_parse() will fail.

1 Antwort
0
Akzeptierte Antwort

Yes, but I checked the data and there is no row with the format "...". It seems like Amazon does not acknowledge the rows after a certain amount and add and ellipsis (...) row on which it applies the query.

beantwortet vor 9 Monaten
  • Have you really checked all records ? you can easily do it now with spark notebooks on athena.

  • I found out that some rows didn't fit the date_parse format by reviewing the files. I modified the request to exclude rows that don't fit the format with : SELECT cast(date_parse("date", '%d.%m.%Y %H:%i:%s') as timestamp) FROM "example_logs" WHERE "date" LIKE '0%' OR "date" LIKE '1%'OR "date" LIKE '2%'OR "date" LIKE '3%'. It means I only take rows where the first character is 0;1;2 or 3. It works because the specific lines where it doesn't fit, the data is corrupted and is of format Nan but I expect it is not the most robust solution.

Du bist nicht angemeldet. Anmelden um eine Antwort zu veröffentlichen.

Eine gute Antwort beantwortet die Frage klar, gibt konstruktives Feedback und fördert die berufliche Weiterentwicklung des Fragenstellers.

Richtlinien für die Beantwortung von Fragen