Athena - Query exhausted resources at this scale factor

I received the following error when running a query in Athena: "Query exhausted resources at this scale factor"

The query I ran was a count(*) to get the number of rows for S3 data consisting of 17k files, which is only 12 GB in total. This query used to work for the same amount of data, but recently I've been receiving this error.

Some insight on what's causing this error would be appreciated.

Thanks.

asked 2 years ago, 2611 views
2 Answers

Hi there,

As you may already know, Athena is a serverless service that makes use of shared resources. Athena uses Presto as its query engine, which runs the different stages of a query in memory. For a small number of queries and for certain operators, Presto brings all the data into a single node's memory and may fail because it cannot spill pages to disk when memory is exhausted. This is not a resource-related issue, but more related to how specific operators cannot handle large amounts of data. This error is transient in nature, so if you submit the same query again, it might succeed. However, if you keep getting the same error consistently, consider the following suggestions:

  • If you are running your query on Athena engine version 1, consider running it on Athena engine version 2, which performs better.
  • Optimize the query as described in Performance Tuning Best Practices for Athena.
  • Rerun the query using exponential backoff.
  • If the queries are being submitted at the top of the hour, try running them at a different time.
  • You can also run your query on an EMR cluster using Hive or Presto. This option gives you the flexibility to tune memory parameters to your requirements.
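
As an illustration of the exponential backoff suggestion in the list above, here is a minimal Python sketch. It is not tied to any AWS SDK: `run_query` stands for whatever callable submits your query and waits for the result, and `QueryExhaustedError` is a hypothetical exception that your own wrapper would raise when Athena reports "Query exhausted resources at this scale factor". The attempt count and delays are arbitrary assumptions, not recommended values.

```python
import random
import time


class QueryExhaustedError(Exception):
    """Hypothetical error raised by your own wrapper when the query fails
    with "Query exhausted resources at this scale factor"."""


def run_with_backoff(run_query, max_attempts=5, base_delay=1.0,
                     max_delay=60.0, sleep=time.sleep):
    """Call run_query(), retrying with exponential backoff and jitter.

    run_query    -- zero-argument callable that runs the query (assumed)
    max_attempts -- total number of tries before giving up
    base_delay   -- upper bound of the first wait, in seconds
    max_delay    -- cap on the upper bound of any single wait
    sleep        -- injectable sleep function, handy for testing
    """
    for attempt in range(max_attempts):
        try:
            return run_query()
        except QueryExhaustedError:
            # Out of attempts: let the caller see the last failure.
            if attempt == max_attempts - 1:
                raise
            # The upper bound doubles on every retry, up to max_delay.
            upper = min(max_delay, base_delay * 2 ** attempt)
            # "Full jitter": wait a random time in [0, upper] so that many
            # clients retrying at once do not all hit the service together.
            sleep(random.uniform(0, upper))
```

You would call it as `run_with_backoff(my_query_function)`, where `my_query_function` is your own code. Only the transient error is retried; any other exception propagates immediately.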

I hope this helps.

Cebi
answered 2 years ago
  • This passage makes no sense to me: "For a small number of queries and for certain operators, Presto brings all the data into a single nodes memory and may fail because it cannot spill pages to disk when memory is exhausted. This is not a resource-related issue" since memory is a resource. Besides, why would subsequent attempts succeed for exactly the same code and the same input? Could you please add more clarity here?

If your data is in CSV format and not a columnar format like Parquet or ORC, Athena may have to read the entire contents of each file, and that can sometimes exhaust the shared resources available to you during some queries.

answered a year ago
  • That makes sense. But why would it succeed if you tried again given that both query and data are the same? Is this system clever enough to allocate more memory next time you try? Can we give it a hint before the query runs?
