In SYS_QUERY_HISTORY, the execution_time, compile_time, queue_time, planning_time and lock_wait_time columns don't add up to elapsed_time

0

Hi, we use SYS_QUERY_HISTORY for our internal monitoring purposes. One of our systems surfaced a couple of SELECT queries with comparatively high execution time.

Following those queries, and looking at their corresponding in SYS_QUERY_HISTORY revealed something interesting and controversial at the same time.

The elapsed_time time of the queries did not add up to execution_time, compile_time, queue_time, planning_time and lock_wait_time columns.

Further investigating the issue revealed the same could be seen for almost all the SELECT queries in the system.

with delta_time as (
    select
        start_time,
        query_id,
        (elapsed_time -
         execution_time - compile_time - queue_time - planning_time -
         - lock_wait_time) as total_delta_time,
        elapsed_time,
        execution_time,
        queue_time,
        compile_time,
        planning_time,
        lock_wait_time
    from sys_query_history
    where query_type = 'SELECT'
)
select
    count(*) as total_queries,
    sum(case when total_delta_time > 0 then 1 else 0 end) as total_delta_time_gt_0,
    total_delta_time_gt_0 * 100.0 / total_queries as pct_queries_with_delta_time
from delta_time;

+-------------+---------------------+---------------------------+
|total_queries|total_delta_time_gt_0|pct_queries_with_delta_time|
+-------------+---------------------+---------------------------+
|13552        |13384                |98.760330578512396         |
+-------------+---------------------+---------------------------+

Can we please shed light on why the sum of execution_time, compile_time, queue_time, planning_time, lock_wait_time not equal to elapsed_time?

Sumeet
asked 2 months ago279 views
1 Answer
1
Accepted Answer

I raised an AWS support ticket and got the following response:


By researching internally, I would like to inform, although the “elapsed_time” of a query encompasses all phases of processing the request, the formula below cannot match exactly because the phases are not executed sequentially.

elapsed_time = execution_time + lock_wait_time + queue_time + compile_time + planning_time

Indeed, as you may already know, when processing a query, the Redshift query optimizer translates the query plan into “steps”, “segments”, and “streams” (link [2]). A segment is the smallest compilation unit executable by a “compute node slice”. Segments are compiled in parallel, which means that the sum of the individual segment compilation times will sometimes be greater than expected. Additionally, some segments may be compiled while others are running. The “compile_time” recorded in the “sys_query_history” table corresponds to the sum of the compilation times of the individual segments. Since compilation is done in parallel, it is normal that, in some cases, this sum is greater than the difference (in microseconds) between the end time of compilation of the last segment and the start time of compilation of the first segment. This explains why the right-hand side of the equation above could be greater than the “elapsed_time”.

As for cases where the “elapsed_time” is much greater than the sum of all other durations (some durations are 0), I want to confirm that this can happen in cases where the request is served from one of Redshift's internal caches. In these cases, the “elapsed_time” will correspond to the time needed to serve the request from the cache. The individual durations in the “sys_query_history” table can be 0 in the following cases:

  • the “execution_time” can be equal to 0 if the request is served from the cache.

  • the “compile_time” and the “planning_time” can also be equal to 0 if the compilation of segments is served from the compilation cache.


Sumeet
answered 2 months ago
profile picture
EXPERT
reviewed a month ago
profile pictureAWS
EXPERT
reviewed 2 months ago
  • Thank you for sharing the answer. Please let us know if you have any additional questions.

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions