
Random Queries with [ErrorCode: INTERNAL_ERROR_QUERY_ENGINE] Amazon Athena experienced an internal error


Hi

I'm testing a Lambda function that executes batches of small Athena queries against an S3 Glue table.

When I execute the function code locally as my admin user, it schedules the queries and they never error. When the Lambda function executes them, perhaps 1 in 30 fail (each query is the same but with different partition constraints, which change the date range of files to search).

I have even set the code to execute the identical query again if the original fails. In the majority of cases (not 100% of them, but most), the second execution succeeds, with no change in permissions or in the query itself.

If I select the failed query in the Athena console and re-run it, it executes without error.

If anyone from AWS support can look at the error, here are two identical queries in us-east-1, one that failed and one that then succeeded:

042735b7-31e6-4ac4-90ef-913ac8c5201f FAILED
04a415cd-3f01-42b8-a05c-8b344d06085e SUCCEEDED

Regards

Matthew

Asked 3 years ago · 877 views
1 Answer
Accepted Answer

Could you please share the exact error with which the query fails?

Based on the limited information, it could be that when the Lambda function executes the queries, it submits more queries than your quota allows, as explained here.

If this is the case, you may be able to resolve the issue in one of these ways:

  1. Requesting a quota increase (this may not work if you later continue to submit more queries than the quota allows).
  2. Throttling the number of queries that your Lambda function can run in parallel.
  3. Setting up a retry mechanism.

Hope this helps.
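
For illustration, a retry mechanism could look roughly like the sketch below. It is only a sketch under assumptions: `athena` is assumed to be a `boto3.client("athena")` (or anything with the same `start_query_execution` / `get_query_execution` methods), and the function name, attempt count, and delays are hypothetical values chosen for the example, not recommendations from AWS.

```python
import time


def run_query_with_retry(athena, start_kwargs, max_attempts=3,
                         base_delay=1.0, poll_interval=1.0, sleep=time.sleep):
    """Run one Athena query, resubmitting it if it ends in FAILED.

    athena       -- assumed to behave like boto3.client("athena")
    start_kwargs -- keyword arguments passed to start_query_execution
    Returns the QueryExecutionId of the execution that succeeded.
    """
    last_reason = None
    for attempt in range(max_attempts):
        query_id = athena.start_query_execution(**start_kwargs)["QueryExecutionId"]

        # Poll until the query reaches a terminal state.
        while True:
            execution = athena.get_query_execution(QueryExecutionId=query_id)
            status = execution["QueryExecution"]["Status"]
            if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
                break
            sleep(poll_interval)

        if status["State"] == "SUCCEEDED":
            return query_id
        if status["State"] == "CANCELLED":
            raise RuntimeError(f"Query {query_id} was cancelled")

        # FAILED: remember the reason and back off exponentially before retrying.
        last_reason = status.get("StateChangeReason", "unknown")
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)

    raise RuntimeError(f"Query failed after {max_attempts} attempts: {last_reason}")
```

With a real client this would be called along the lines of `run_query_with_retry(boto3.client("athena"), {"QueryString": sql, "WorkGroup": "primary"})`, where `sql` and the workgroup name are placeholders.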
AWS Expert · answered 3 years ago
Expert · reviewed 2 years ago
  • Thanks, that was more or less the entire message. I had previously hit the quota exceeded errors, and when that happens you get a meaningful error. I have asked for a quota increase, but this particular message was not quota based (or at least the error did not mention a quota); even when only one process was running, I was getting random instances of it. That was on the 10th. Yesterday, however, this behaviour stopped: the same processes ran without failing randomly in the same way. My gut feeling is that something was failing behind the scenes and AWS have resolved the issue in this region. I did spend a day adding retry mechanisms and reducing the batch sizes of the queries to make my process more tolerant of things like this.
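
As a rough illustration of the "smaller batches" idea mentioned in this comment (not the commenter's actual code), the sketch below splits the work into small batches and pauses between them. The names `run_in_batches` and `submit`, and the default batch size and pause, are hypothetical; `submit` stands for whatever function starts one query.

```python
import time


def run_in_batches(items, submit, batch_size=5, pause_seconds=1.0, sleep=time.sleep):
    """Call submit(item) for every item, batch_size items at a time.

    A pause is inserted between batches (but not after the last one)
    so that fewer queries are in flight at any moment.
    """
    results = []
    for start in range(0, len(items), batch_size):
        batch = items[start:start + batch_size]
        results.extend(submit(item) for item in batch)
        # Only pause if another batch is still to come.
        if start + batch_size < len(items):
            sleep(pause_seconds)
    return results
```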
