Date_parse large files for Grafana with Athena

0

I am trying to use Amazon Grafana with files on Amazon S3. For that, I need to change the date format of the original file to fit on Grafana. I use the following command : SELECT cast(date_parse("date", '%d.%m.%Y %H:%i:%s') as timestamp) FROM "example_logs";

However, this command works as long as the log file "example_logs" is not too long (~10,000 rows). If the file is too long, I get the following error : INVALID_FUNCTION_ARGUMENT: Invalid format: "..."

I tried to use a crawler on Glue to partition the data, but I get the same error.

  • The error you provided seems like there's a timestamp on your data that is not in the format you specified, so date_parse() will fail.

1回答
0
承認された回答

Yes, but I checked the data and there is no row with the format "...". It seems like Amazon does not acknowledge the rows after a certain amount and add and ellipsis (...) row on which it applies the query.

回答済み 9ヶ月前
  • Have you really checked all records ? you can easily do it now with spark notebooks on athena.

  • I found out that some rows didn't fit the date_parse format by reviewing the files. I modified the request to exclude rows that don't fit the format with : SELECT cast(date_parse("date", '%d.%m.%Y %H:%i:%s') as timestamp) FROM "example_logs" WHERE "date" LIKE '0%' OR "date" LIKE '1%'OR "date" LIKE '2%'OR "date" LIKE '3%'. It means I only take rows where the first character is 0;1;2 or 3. It works because the specific lines where it doesn't fit, the data is corrupted and is of format Nan but I expect it is not the most robust solution.

ログインしていません。 ログイン 回答を投稿する。

優れた回答とは、質問に明確に答え、建設的なフィードバックを提供し、質問者の専門分野におけるスキルの向上を促すものです。

質問に答えるためのガイドライン

関連するコンテンツ