In SYS_QUERY_HISTORY, the execution_time, compile_time, queue_time, planning_time and lock_wait_time columns don't add up to elapsed_time

0

Hi, we use SYS_QUERY_HISTORY for our internal monitoring purposes. One of our systems surfaced a couple of SELECT queries with comparatively high execution time.

Following those queries, and looking at their corresponding in SYS_QUERY_HISTORY revealed something interesting and controversial at the same time.

The elapsed_time time of the queries did not add up to execution_time, compile_time, queue_time, planning_time and lock_wait_time columns.

Further investigating the issue revealed the same could be seen for almost all the SELECT queries in the system.

with delta_time as (
    select
        start_time,
        query_id,
        (elapsed_time -
         execution_time - compile_time - queue_time - planning_time -
         - lock_wait_time) as total_delta_time,
        elapsed_time,
        execution_time,
        queue_time,
        compile_time,
        planning_time,
        lock_wait_time
    from sys_query_history
    where query_type = 'SELECT'
)
select
    count(*) as total_queries,
    sum(case when total_delta_time > 0 then 1 else 0 end) as total_delta_time_gt_0,
    total_delta_time_gt_0 * 100.0 / total_queries as pct_queries_with_delta_time
from delta_time;

+-------------+---------------------+---------------------------+
|total_queries|total_delta_time_gt_0|pct_queries_with_delta_time|
+-------------+---------------------+---------------------------+
|13552        |13384                |98.760330578512396         |
+-------------+---------------------+---------------------------+

Can we please shed light on why the sum of execution_time, compile_time, queue_time, planning_time, lock_wait_time not equal to elapsed_time?

Sumeet
posta 2 mesi fa291 visualizzazioni
1 Risposta
1
Risposta accettata

I raised an AWS support ticket and got the following response:


By researching internally, I would like to inform, although the “elapsed_time” of a query encompasses all phases of processing the request, the formula below cannot match exactly because the phases are not executed sequentially.

elapsed_time = execution_time + lock_wait_time + queue_time + compile_time + planning_time

Indeed, as you may already know, when processing a query, the Redshift query optimizer translates the query plan into “steps”, “segments”, and “streams” (link [2]). A segment is the smallest compilation unit executable by a “compute node slice”. Segments are compiled in parallel, which means that the sum of the individual segment compilation times will sometimes be greater than expected. Additionally, some segments may be compiled while others are running. The “compile_time” recorded in the “sys_query_history” table corresponds to the sum of the compilation times of the individual segments. Since compilation is done in parallel, it is normal that, in some cases, this sum is greater than the difference (in microseconds) between the end time of compilation of the last segment and the start time of compilation of the first segment. This explains why the right-hand side of the equation above could be greater than the “elapsed_time”.

As for cases where the “elapsed_time” is much greater than the sum of all other durations (some durations are 0), I want to confirm that this can happen in cases where the request is served from one of Redshift's internal caches. In these cases, the “elapsed_time” will correspond to the time needed to serve the request from the cache. The individual durations in the “sys_query_history” table can be 0 in the following cases:

  • the “execution_time” can be equal to 0 if the request is served from the cache.

  • the “compile_time” and the “planning_time” can also be equal to 0 if the compilation of segments is served from the compilation cache.


Sumeet
con risposta 2 mesi fa
profile picture
ESPERTO
verificato 2 mesi fa
profile pictureAWS
ESPERTO
verificato 2 mesi fa
  • Thank you for sharing the answer. Please let us know if you have any additional questions.

Accesso non effettuato. Accedi per postare una risposta.

Una buona risposta soddisfa chiaramente la domanda, fornisce un feedback costruttivo e incoraggia la crescita professionale del richiedente.

Linee guida per rispondere alle domande