Date_parse large files for Grafana with Athena

0

I am trying to use Amazon Grafana with files on Amazon S3. For that, I need to change the date format of the original file to fit on Grafana. I use the following command : SELECT cast(date_parse("date", '%d.%m.%Y %H:%i:%s') as timestamp) FROM "example_logs";

However, this command works as long as the log file "example_logs" is not too long (~10,000 rows). If the file is too long, I get the following error : INVALID_FUNCTION_ARGUMENT: Invalid format: "..."

I tried to use a crawler on Glue to partition the data, but I get the same error.

  • The error you provided seems like there's a timestamp on your data that is not in the format you specified, so date_parse() will fail.

1개 답변
0
수락된 답변

Yes, but I checked the data and there is no row with the format "...". It seems like Amazon does not acknowledge the rows after a certain amount and add and ellipsis (...) row on which it applies the query.

답변함 9달 전
  • Have you really checked all records ? you can easily do it now with spark notebooks on athena.

  • I found out that some rows didn't fit the date_parse format by reviewing the files. I modified the request to exclude rows that don't fit the format with : SELECT cast(date_parse("date", '%d.%m.%Y %H:%i:%s') as timestamp) FROM "example_logs" WHERE "date" LIKE '0%' OR "date" LIKE '1%'OR "date" LIKE '2%'OR "date" LIKE '3%'. It means I only take rows where the first character is 0;1;2 or 3. It works because the specific lines where it doesn't fit, the data is corrupted and is of format Nan but I expect it is not the most robust solution.

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인

관련 콘텐츠