Date_parse large files for Grafana with Athena

0

I am trying to use Amazon Grafana with files on Amazon S3. For that, I need to change the date format of the original file to fit on Grafana. I use the following command : SELECT cast(date_parse("date", '%d.%m.%Y %H:%i:%s') as timestamp) FROM "example_logs";

However, this command works as long as the log file "example_logs" is not too long (~10,000 rows). If the file is too long, I get the following error : INVALID_FUNCTION_ARGUMENT: Invalid format: "..."

I tried to use a crawler on Glue to partition the data, but I get the same error.

  • The error you provided seems like there's a timestamp on your data that is not in the format you specified, so date_parse() will fail.

1 回答
0
已接受的回答

Yes, but I checked the data and there is no row with the format "...". It seems like Amazon does not acknowledge the rows after a certain amount and add and ellipsis (...) row on which it applies the query.

已回答 9 个月前
  • Have you really checked all records ? you can easily do it now with spark notebooks on athena.

  • I found out that some rows didn't fit the date_parse format by reviewing the files. I modified the request to exclude rows that don't fit the format with : SELECT cast(date_parse("date", '%d.%m.%Y %H:%i:%s') as timestamp) FROM "example_logs" WHERE "date" LIKE '0%' OR "date" LIKE '1%'OR "date" LIKE '2%'OR "date" LIKE '3%'. It means I only take rows where the first character is 0;1;2 or 3. It works because the specific lines where it doesn't fit, the data is corrupted and is of format Nan but I expect it is not the most robust solution.

您未登录。 登录 发布回答。

一个好的回答可以清楚地解答问题和提供建设性反馈,并能促进提问者的职业发展。

回答问题的准则