Query exhausted resources at this scale factor

0

Got an Query exhausted resources at this scale factor error when running Query Id: f0263908-31e5-48d7-a8d7-e1ec2492ed6b.

I have some gzipped parquet files in S3 in the following layout -

/data
  /day=0
    data_0.parquet
  /day=1
    data_0.parquet
:::

I created the table using the following queries -

create external table if not exists mytablename (
    i1 int,
    i2 int,
    i3 int,
    i4 int,
    i5 int,
    i6 int,
    i7 int,
    i8 int,
    i9 int,
    i10 int,
    i11 int,
    i12 int,
    i13 int,
    s1 string,
    s2 string,
    s3 string,
    s4 string,
    s5 string,
    s6 string,
    s7 string,
    s8 string,
    s9 string,
    s10 string,
    s11 string,
    s12 string,
    s13 string,
    s14 string,
    s15 string,
    s16 string,
    s17 string,
    s18 string,
    s19 string,
    s21 string,
    s22 string,
    s23 string,
    s24 string,
    s25 string,
    s26 string,
    label int
)
PARTITIONED BY (day STRING)
STORED AS PARQUET
LOCATION 's3://my/s3/location/data/'
tblproperties ("parquet.compression"="GZIP");

MSCK REPAIR TABLE mytablename;

And then I ran the following test query where I hit the error:

select i1, s1 from mytablename limit 10;
asked 2 years ago929 views
1 Answer
0

“Query exhausted resources at this scale factor” indicates that you hit memory limits on the Athena cluster that runs your query. Most commonly, this can be probably as the query doesn’t follow the best practices. New Athena query engine version 3 is now available in most regions - try upgrading to that or to Version 2 (You should see a notification at the top of Athena interface with that information). This issue may sometime be transient and it may go away if you retry. If there are joins the query in the you ran or if you have not partitioned the data, I could try to optimizing it - but this does not seem to be the case here. Check the below items as well:

  • How big is your data after compression in the parquet/gzip format? and how is your data accessed? (What the typical WHERE clause elements?).
  • Please check if your partition strategy is right for your data, as you have already partitioned by day
  • Try bucketing your data by specifying CLUSTERED BY (<bucketed columns>) INTO <number of buckets> BUCKETS
  • Use the new query stats visualizer (click Query stats next to Query results, then Execution details) to figure out if it shows any errors.

Please refer to this blog for some tuning tips for your query to perform better: https://aws.amazon.com/blogs/big-data/top-10-performance-tuning-tips-for-amazon-athena/

profile pictureAWS
answered 2 years ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions