How can I resolve "Query Timeout" errors in Athena?

4 minute read
0

I ran a query in Amazon Athena and it failed with "Query Timeout" error message.

Short description

You might receive query timeout errors or queries that run slow when the DML query timeout limit is exceeded.

Resolution

Follow these best practices to optimize your queries and avoid timeout errors.

Reduce the amount of time to run the query from Athena

The following are steps that you can take in Athena to reduce the query runtime:

Increase the query runtime for Amazon Athena

When you import data from Athena to Amazon QuickSight SPICE, you can receive query timeout errors because the DML query reaches its maximum runtime.

To resolve this issue:

  1. Check your Athena query history to find the query that QuickSight generated.
  2. Check how long the query ran before it failed.
  3. Follow the steps in the preceding section "Reduce the amount of time to run the query from Athena" and run the query again. If this doesn't resolve the query issue, then proceed to step 4.
  4. If the amount of time is close to the maximum DML query timeout quota in minutes, then increase the service quota.

For more information on Service Quotas and to request a quota increase, see Service quotas.

Set up notifications for an Athena query timeout

To receive a notification for queries that time out and fail, use an Amazon CloudWatch event rule to capture the event. Then, use Amazon Simple Notification Service (Amazon SNS) to set up notifications.

For instructions, see How do I generate notifications for an Athena query timeout?

Amazon S3 access logs queries

If you run queries for Amazon Simple Storage Service (Amazon S3) access logs and the query times out, you can partition the S3 data. Use an AWS Glue ETL job to partition your Amazon S3 data. Then, run Athena queries on limited partitions.

For instructions, see How do I use a partitioned Amazon S3 access log to prevent an Athena query timeout?

CloudTrail data queries

If you use Athena to query AWS CloudTrail data, the queries might take a long time to run or time out. This is because CloudTrail logs can grow in size over time even if you partition the CloudTrail table to reduce the runtime of the queries.

Use partition projection to manually create a CloudTrail table. This allows Athena to dynamically calculate the value of CloudTrail tables, which reduces query runtime. With partition projection, you don't need to manage partitions because partition values and locations are calculated from the configuration.

For instructions, see How can I use partition projection to create CloudTrail tables for Athena queries?

Use CTAS queries

A CREATE TABLE AS SELECT (CTAS) query creates a new table in Athena from the results of a SELECT statement from another query. CTAS creates tables from query results in one step which makes it easier to work with raw data sets. CTAS queries also improves query performance and reduces query costs.

For examples, see Examples of CTAS queries.

For more information, see How can I set the number or size of files when I run a CTAS query in Athena?

Use EXPLAIN and EXPLAIN ANALYZE statements

Use EXPLAIN and EXPLAIN ANALYZE in Athena for complex queries. The EXPLAIN statement can optimize queries by refining SQL statements and verify partition pruning.

Make sure that you review the considerations and limitations with the EXPLAIN and EXPLAIN ANALYZE statements.

For more information, see EXPLAIN statement examples.

Related information

Top 10 performance tuning tips for Amazon Athena

How do I use the results of an Amazon Athena query in another query?

AWS OFFICIAL
AWS OFFICIALUpdated 2 months ago