Why do my Amazon Athena queries take a long time to run?

My Amazon Athena queries take a long time to run, and the query queue times are high.

Resolution

Athena might temporarily queue your queries before they run. A query can take a long time to run because of a high queue time, a high plan time, or a high engine process time.

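To find out where a query spends its time, call the GetQueryExecution API with your query ID. This API returns information about a single query execution. The timing details are in the Statistics object (type QueryExecutionStatistics) of the API response, in fields such as QueryQueueTimeInMillis, QueryPlanningTimeInMillis, and EngineExecutionTimeInMillis. You can also use the Athena query editor to view statistics and details for completed queries.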
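
As a minimal sketch of the GetQueryExecution check described above, the following Python code reads the timing fields from the API response. The function names and the choice to default missing fields to 0 are assumptions made for illustration. The boto3 call requires the AWS SDK for Python and valid credentials.

```python
from typing import Any, Dict


def summarize_query_timing(response: Dict[str, Any]) -> Dict[str, int]:
    """Extract timing fields from a GetQueryExecution response dictionary.

    Fields that are absent from the response default to 0
    (an assumption of this sketch, not documented behavior).
    """
    stats = response.get("QueryExecution", {}).get("Statistics", {})
    return {
        "queue_ms": stats.get("QueryQueueTimeInMillis", 0),
        "planning_ms": stats.get("QueryPlanningTimeInMillis", 0),
        "engine_ms": stats.get("EngineExecutionTimeInMillis", 0),
        "total_ms": stats.get("TotalExecutionTimeInMillis", 0),
    }


def fetch_query_timing(query_execution_id: str) -> Dict[str, int]:
    """Call GetQueryExecution through boto3 and summarize the timings."""
    import boto3  # imported lazily so summarize_query_timing works without the SDK

    client = boto3.client("athena")
    response = client.get_query_execution(QueryExecutionId=query_execution_id)
    return summarize_query_timing(response)
```

If queue_ms accounts for most of total_ms, see the queue time section. If planning_ms or engine_ms is the larger share, see the plan time and process time sections.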

Higher queue time

Your queries might have a high queue time because of high resource usage on the backend. The queue time in Athena depends on resource allocation. After you submit your queries, Athena assigns resources to them and processes them based on the following criteria:

  • Overall service load
  • Number of new requests

If your queries have a high queue time, take the following actions to improve query performance:

  • Distribute your queries over a period of time. If you submit queries in batches, then submit small batches more frequently instead of large batches less frequently. This can reduce the time that a query stays in the QUEUED state.
  • Run a combination of simple and complex queries instead of a set of complex queries at the same time. Submit simple queries first and then complex queries. Because simple queries are processed quickly, they release resources sooner, so the complex queries can be assigned resources without high queue times.
  • For scheduled queries, avoid the timeframes at the start of the hour and 30 minutes past the hour. This is because most automated scripts and cron jobs run in these timeframes. The service load is usually high in these periods and might result in increased queue times.
  • If your use case permits, run your queries in multiple AWS Regions to distribute the load and help acquire more backend resources.

Important: You might incur Amazon Simple Storage Service (Amazon S3) cross-Region charges.
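
The batching and scheduling suggestions above can be sketched as follows. This is an illustrative sketch only: the function names, the default batch size, the pause length, and the 5-minute margin around the peak timeframes are all assumptions, and the submit argument stands in for whatever function you use to start a query.

```python
import time
from typing import Callable, Iterable, Iterator, List


def small_batches(queries: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield the queries in consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for query in queries:
        batch.append(query)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def submit_staggered(
    queries: Iterable[str],
    submit: Callable[[str], object],
    batch_size: int = 5,           # hypothetical default
    pause_seconds: float = 30.0,   # hypothetical default
    sleep: Callable[[float], object] = time.sleep,
) -> int:
    """Submit small batches with a pause between them; return the count submitted."""
    submitted = 0
    for index, batch in enumerate(small_batches(queries, batch_size)):
        if index > 0:
            sleep(pause_seconds)  # spread the load over time
        for query in batch:
            submit(query)
            submitted += 1
    return submitted


def is_peak_minute(minute: int, margin: int = 5) -> bool:
    """Return True if minute is within margin of :00 or :30 (margin is an assumption)."""
    offset = minute % 30
    return min(offset, 30 - offset) <= margin
```

A scheduler can call is_peak_minute before it starts a job and delay the job by a few minutes when the result is True.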

Higher plan time

When you over-partition a table, you might cause a higher plan time. Tables with hundreds or thousands of partitions can result in slower query process times. To improve query performance, take any of the following actions:

  • Reduce the number of partitions.
  • Query over one partition at a time and join the results.
  • Use partition projection to speed up query process time of highly partitioned tables and automate partition management.
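
As a sketch of the partition projection option above, the following Python code builds an ALTER TABLE statement that sets projection table properties for a date-typed partition column. The function name, table name, column name, and S3 location used here are hypothetical. The sketch doesn't escape quotation marks in its inputs, so pass only trusted values.

```python
def projection_ddl(
    table: str,
    column: str,
    date_range: str,
    date_format: str,
    location_template: str,
) -> str:
    """Build an ALTER TABLE statement that turns on date-based partition projection."""
    properties = {
        "projection.enabled": "true",
        f"projection.{column}.type": "date",
        f"projection.{column}.range": date_range,
        f"projection.{column}.format": date_format,
        "storage.location.template": location_template,
    }
    body = ",\n".join(f"  '{key}' = '{value}'" for key, value in properties.items())
    return f"ALTER TABLE {table} SET TBLPROPERTIES (\n{body}\n)"


# Hypothetical table, partition column, and bucket
statement = projection_ddl(
    table="logs",
    column="dt",
    date_range="2023/01/01,NOW",
    date_format="yyyy/MM/dd",
    location_template="s3://amzn-s3-demo-bucket/logs/${dt}/",
)
print(statement)
```

With projection turned on, Athena calculates partition values and locations from these properties instead of retrieving them from the metastore.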

Higher process time

If your queries have a higher engine process time, then take the following actions to improve query performance:

  • Partition your tables to restrict the amount of data that each query scans. For more information, see Partitioning data in Athena.
  • If the Amazon S3 files that you query are small, such as less than 128 MB each, then the query process time might be higher. The increase comes from the per-file overhead of tasks such as opening the S3 file, listing directories, and setting up data transfer. Use the S3DistCp tool in Amazon EMR to combine smaller S3 files into larger objects. Larger objects require fewer Amazon S3 requests and reduce the query process time.
  • Perform other storage and query optimizations.
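
To check whether small files are a likely factor, you can count how many objects under a table's S3 prefix fall below the 128 MB example threshold mentioned above. The following is a minimal sketch. The function names and the report layout are assumptions, and the listing helper requires boto3 and valid credentials.

```python
from typing import Dict, Iterable, Iterator, Tuple

SMALL_OBJECT_BYTES = 128 * 1024 * 1024  # 128 MB, the example threshold from the text


def small_object_report(
    objects: Iterable[Tuple[str, int]],
    threshold: int = SMALL_OBJECT_BYTES,
) -> Dict[str, float]:
    """Count the objects and how many of them are smaller than threshold bytes."""
    total = 0
    small = 0
    for _key, size in objects:
        total += 1
        if size < threshold:
            small += 1
    share = small / total if total else 0.0
    return {"total": total, "small": small, "small_share": share}


def list_object_sizes(bucket: str, prefix: str) -> Iterator[Tuple[str, int]]:
    """Yield (key, size in bytes) for each object under the prefix."""
    import boto3  # imported lazily so the report function works without the SDK

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            yield obj["Key"], obj["Size"]
```

For example, small_object_report(list_object_sizes(bucket, prefix)) returns the share of small objects under that prefix. A high share suggests that combining files might reduce the process time.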

Note: You can submit several queries to Athena at the same time, up to the default query-related quotas in your Region. To process queries, Athena assigns resources based on the overall service load and the number of new requests. Therefore, your submitted queries might not all run concurrently.

Related information

Performance tuning in Athena

AWS OFFICIAL, updated 2 months ago