Redshift SYS_QUERY_DETAIL returns incorrect entries for the tables that are scanned during query execution

0

We use Redshift for internal purposes. As a part of that, we want to determine the data lineage of the tables available in the Redshift cluster.


To generate lineage entries, we traditionally used STL_INSERT and STL_SCAN views.

For a given query ID, the entries in STL_SCAN are considered source_tables, and the entry in STL_INSERT is considered as the destination_table. Thus we can create an edge in the Lineage graph. Once we repeat this process for all DML queries, we have an excellent lineage of tables.

Note: For a given query there can be multiple entries in STL_INSERT that reference redshift-centric internal intermediate destination tables, however, there can only be a single entry corresponding to a user-defined destination table. We ignore the entries for intermediate tables since they are used by Redshift internally and are not publicly accessible.


The query execution is divided into multiple steps e.g. scanning the source table, applying filters, inserting the final result in the destination table, etc. Again this is an over-simplification of the query execution process.

Based on this documentation: https://docs.aws.amazon.com/redshift/latest/dg/r_STL_INSERT.html redshift recommends using SYS_QUERY_DETAIL view to get a complete list of steps executed for a given query. This also includes the SCAN and INSERT steps. Furthermore, this view contains data from both the main and the concurrency scaling cluster thus it would result in comparatively accurate information.


We migrated our lineage systems to the above-recommended view successfully.

During validation, we found that the SYS_QUERY_DETAIL contains incorrect table IDs for the scan steps. Precisely, the table_id associated with the first SCAN step is incorrect and is the same as the table_id corresponding to the INSERT step.

Thus the lineage graph we generated was missing a bunch of edges and we saw a sudden increase in self-edges in the graph which is counter-intuitive since none of our tables inserts into themselves. Because of this, our integration tests started failing thereby leading to this detailed investigation.

Note: If we continue using the old views i.e. STL_INSERT and STL_SCAN the issue disappears.


The redshift version we are using is as follows:

PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.61678

How to reproduce the issue

  • Run the following SQL to generate the test tables
create table parent_1
(
    source varchar(32),
    a      int
);
insert into parent_1
values ('parent_1', 1),
       ('parent_1', 2),
       ('parent_1', 3);
create table parent_2
(
    source varchar(32),
    a      int
);
insert into parent_2
values ('parent_2', 1),
       ('parent_2', 2),
       ('parent_2', 3);
create table parent_3
(
    source varchar(32),
    a      int
);
insert into parent_3
values ('parent_3', 1),
       ('parent_3', 2),
       ('parent_3', 3);

-- The following SQL is equivalent to what our DBT redshift connector would generate while running the models
CREATE TABLE "downstream"
AS (WITH SOURCE AS
             (SELECT *
              FROM "parent_1"
              UNION
              SELECT *
              FROM "parent_2"
              UNION
              SELECT *
              FROM "parent_3"),
         renamed AS
             (SELECT *
              FROM SOURCE)
    SELECT *
    FROM renamed)
  • Find the table_id of the destination table
select *
from pg_catalog.svv_table_info
where "table" = 'downstream';

+----------------------------------+------+--------+----------+---------------+----------+-------------+-----------+------------+-----------+----+--------+-----+--------+---------+--------+-------------+---------+----------------------+----------+-------------------+--------------------------+
|database                          |schema|table_id|table     |encoded        |diststyle |sortkey1     |max_varchar|sortkey1_enc|sortkey_num|size|pct_used|empty|unsorted|stats_off|tbl_rows|skew_sortkey1|skew_rows|estimated_visible_rows|risk_event|vacuum_sort_benefit|create_time               |
+----------------------------------+------+--------+----------+---------------+----------+-------------+-----------+------------+-----------+----+--------+-----+--------+---------+--------+-------------+---------+----------------------+----------+-------------------+--------------------------+
|do_not_delete_used_for_intg_test_1|public|1580265 |downstream|Y, AUTO(ENCODE)|AUTO(EVEN)|AUTO(SORTKEY)|32         |null        |0          |5   |0.0001  |0    |null    |0.00     |9       |null         |null     |9                     |null      |null               |2024-01-19 01:48:51.024926|
+----------------------------------+------+--------+----------+---------------+----------+-------------+-----------+------------+-----------+----+--------+-----+--------+---------+--------+-------------+---------+----------------------+----------+-------------------+--------------------------+
  • Find the query ID of the above CTAS
select *
from pg_catalog.svv_table_info
where "table" = 'downstream';

+----------------------------------+------+--------+----------+---------------+----------+-------------+-----------+------------+-----------+----+--------+-----+--------+---------+--------+-------------+---------+----------------------+----------+-------------------+--------------------------+
|database                          |schema|table_id|table     |encoded        |diststyle |sortkey1     |max_varchar|sortkey1_enc|sortkey_num|size|pct_used|empty|unsorted|stats_off|tbl_rows|skew_sortkey1|skew_rows|estimated_visible_rows|risk_event|vacuum_sort_benefit|create_time               |
+----------------------------------+------+--------+----------+---------------+----------+-------------+-----------+------------+-----------+----+--------+-----+--------+---------+--------+-------------+---------+----------------------+----------+-------------------+--------------------------+
|do_not_delete_used_for_intg_test_1|public|1580265 |downstream|Y, AUTO(ENCODE)|AUTO(EVEN)|AUTO(SORTKEY)|32         |null        |0          |5   |0.0001  |0    |null    |0.00     |9       |null         |null     |9                     |null      |null               |2024-01-19 01:48:51.024926|
+----------------------------------+------+--------+----------+---------------+----------+-------------+-----------+------------+-----------+----+--------+-----+--------+---------+--------+-------------+---------+----------------------+----------+-------------------+--------------------------+
  • Find the query ID of the above CTAS
select *
from sys_query_detail
where table_id = <table id of the destination table>
  and step_name = 'insert'
  and start_time >= '<add a valid date filter here to avoid a full scan>';

+-------+--------+--------------------+---------+----------+-------+---------+--------+----------+---------+--------------------------+--------------------------+--------+-----+-----------+----------+------------+-----------+-----------+------------+-------------+--------------+------+-------------+-------------+---------+------------------------+-------------------------+--------------+
|user_id|query_id|child_query_sequence|stream_id|segment_id|step_id|step_name|table_id|table_name|is_rrscan|start_time                |end_time                  |duration|alert|input_bytes|input_rows|output_bytes|output_rows|blocks_read|blocks_write|local_read_io|remote_read_io|source|data_skewness|time_skewness|is_active|spilled_block_local_disk|spilled_block_remote_disk|step_attribute|
+-------+--------+--------------------+---------+----------+-------+---------+--------+----------+---------+--------------------------+--------------------------+--------+-----+-----------+----------+------------+-----------+-----------+------------+-------------+--------------+------+-------------+-------------+---------+------------------------+-------------------------+--------------+
|100    |17492523|1                   |3        |5         |2      |insert   |1580265 |downstream|f        |2024-01-19 01:48:59.569734|2024-01-19 01:48:59.575038|5304    |     |0          |0         |0           |9          |0          |0           |0            |0             |      |50           |5            |f        |0                       |0                        |              |
+-------+--------+--------------------+---------+----------+-------+---------+--------+----------+---------+--------------------------+--------------------------+--------+-----+-----------+----------+------------+-----------+-----------+------------+-------------+--------------+------+-------------+-------------+---------+------------------------+-------------------------+--------------+
  • Plugin in the query ID in the following SQL to get the corresponding scan and insert steps
select *
from sys_query_detail
where table_name <> ''
  and query_id = <query id from above>
  and start_time >= '<add a valid date filter here to avoid a full scan>'
order by start_time;

+-------+--------+--------------------+---------+----------+-------+---------+--------+--------------------------------------------------+---------+--------------------------+--------------------------+--------+-----+-----------+----------+------------+-----------+-----------+------------+-------------+--------------+---------------+-------------+-------------+---------+------------------------+-------------------------+--------------+
|user_id|query_id|child_query_sequence|stream_id|segment_id|step_id|step_name|table_id|table_name                                        |is_rrscan|start_time                |end_time                  |duration|alert|input_bytes|input_rows|output_bytes|output_rows|blocks_read|blocks_write|local_read_io|remote_read_io|source         |data_skewness|time_skewness|is_active|spilled_block_local_disk|spilled_block_remote_disk|step_attribute|
+-------+--------+--------------------+---------+----------+-------+---------+--------+--------------------------------------------------+---------+--------------------------+--------------------------+--------+-----+-----------+----------+------------+-----------+-----------+------------+-------------+--------------+---------------+-------------+-------------+---------+------------------------+-------------------------+--------------+
|100    |17492523|1                   |0        |0         |0      |scan     |1580265 |do_not_delete_used_for_intg_test_1.public.parent_1|f        |2024-01-19 01:48:55.341610|2024-01-19 01:48:59.708793|4367183 |     |0          |12        |360         |12         |0          |0           |0            |0             |Redshift(local)|66           |40           |f        |0                       |0                        |              |
|100    |17492523|1                   |1        |1         |0      |scan     |1580261 |do_not_delete_used_for_intg_test_1.public.parent_2|f        |2024-01-19 01:48:55.351864|2024-01-19 01:48:55.355093|3229    |     |0          |3         |72          |3          |0          |0           |0            |0             |Redshift(local)|50           |6            |f        |0                       |0                        |              |
|100    |17492523|1                   |2        |2         |0      |scan     |1580263 |do_not_delete_used_for_intg_test_1.public.parent_3|f        |2024-01-19 01:48:55.355709|2024-01-19 01:48:55.358835|3126    |     |0          |3         |72          |3          |0          |0           |0            |0             |Redshift(local)|50           |7            |f        |0                       |0                        |              |
|100    |17492523|1                   |3        |5         |2      |insert   |1580265 |downstream                                        |f        |2024-01-19 01:48:59.569734|2024-01-19 01:48:59.575038|5304    |     |0          |0         |0           |9          |0          |0           |0            |0             |               |50           |5            |f        |0                       |0                        |              |
+-------+--------+--------------------+---------+----------+-------+---------+--------+--------------------------------------------------+---------+--------------------------+--------------------------+--------+-----+-----------+----------+------------+-----------+-----------+------------+-------------+--------------+---------------+-------------+-------------+---------+------------------------+-------------------------+--------------+
  • The table_id for parent_1 is 1580259 as per the following result. However, the first entry in the above result shows the table_id as 1580265 i.e. exactly the same as the destination table
select *
from pg_catalog.svv_table_info
where "table" = 'parent_1';

+----------------------------------+------+--------+--------+---------------+---------+-------------+-----------+------------+-----------+----+--------+-----+--------+---------+--------+-------------+---------+----------------------+----------+-------------------+--------------------------+
|database                          |schema|table_id|table   |encoded        |diststyle|sortkey1     |max_varchar|sortkey1_enc|sortkey_num|size|pct_used|empty|unsorted|stats_off|tbl_rows|skew_sortkey1|skew_rows|estimated_visible_rows|risk_event|vacuum_sort_benefit|create_time               |
+----------------------------------+------+--------+--------+---------------+---------+-------------+-----------+------------+-----------+----+--------+-----+--------+---------+--------+-------------+---------+----------------------+----------+-------------------+--------------------------+
|do_not_delete_used_for_intg_test_1|public|1580259 |parent_1|Y, AUTO(ENCODE)|AUTO(ALL)|AUTO(SORTKEY)|32         |null        |0          |5   |0.0001  |0    |null    |0.00     |3       |null         |null     |3                     |null      |null               |2024-01-19 01:47:31.986812|
+----------------------------------+------+--------+--------+---------------+---------+-------------+-----------+------------+-----------+----+--------+-----+--------+---------+--------+-------------+---------+----------------------+----------+-------------------+--------------------------+

Cont'd in comments

  • Following similar steps on STL_INSERT AND STL_SCAN yields correct results

  • Just to add to the above issue:

    • The issue is persistent on a Redshift provisioned cluster of instance type "ra3.xlplus"
    • The issue is not persistent on a Redshift provisioned cluster of instance type "ra3.4xlarge"
  • We have created an AWS-supported ticket, the internal team was able to reproduce the issue and they are currently investigating the root cause.

Sumeet
asked 3 months ago276 views
1 Answer
0

Hello,

I executed the same queries on my provisioned cluster and saw correct table_id in SYS_QUERY_DETAIL. Could you check if this is one off issue or this happens for all your queries ? Are you running this on a provisioned or serverless workgroup ?

I would recommend reaching out to AWS support if you see more occurrences of the same issue.

Enter image description here

Enter image description here

Enter image description here

AWS
EXPERT
answered 3 months ago
  • Thanks, Ranjan for trying it out. We are using a provisioned cluster and the issue is consistently reproducible.

  • Hi Ranjan - can you please check the Redshift version you are running and confirm if it matches with the following version?

    SELECT VERSION();
    
    PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.61678

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions