Redshift diagnostics queries to identify the performance bottleneck

3 minute read
Content level: Advanced
0

This article provides the tool/queries to identify redshift tables having missing statistics, data skewness and longest running query.

  1. Query to find out tables w/ missing statistics

To find queries impacted by missing statistics in Redshift, you can query stl_explain and look for "missing statistics" in the plannode field. Group by the substring of the plannode, order by the count descending, like this:

select substring(trim(plannode),1,100) as plannode, count(*) from stl_explain where plannode like '%missing statistics%' group by plannode order by 2 desc; This will provide a count of the queries impacted by missing statistics on each table, identified by the plannode value.

  1. Query to identify tables with data skew or unsorted rows in Redshift cluster

select trim(pgn.nspname) as schema, trim(a.name) as table, id as tableid, decode(pgc.reldiststyle,0, 'even',1,det.distkey ,8,'all') as distkey, dist_ratio.ratio::decimal(10,4) as skew, det.head_sort as "sortkey", det.n_sortkeys as "#sks", b.mbytes,
decode(b.mbytes,0,0,((b.mbytes/part.total::decimal)100)::decimal(5,2)) as pct_of_total, decode(det.max_enc,0,'n','y') as enc, a.rows, decode( det.n_sortkeys, 0, null, a.unsorted_rows ) as unsorted_rows , decode( det.n_sortkeys, 0, null, decode( a.rows,0,0, (a.unsorted_rows::decimal(32)/a.rows)100) )::decimal(5,2) as pct_unsorted from (select db_id, id, name, sum(rows) as rows, sum(rows)-sum(sorted_rows) as unsorted_rows from stv_tbl_perm a group by db_id, id, name) as a join pg_class as pgc on pgc.oid = a.id join pg_namespace as pgn on pgn.oid = pgc.relnamespace left outer join (select tbl, count() as mbytes from stv_blocklist group by tbl) b on a.id=b.tbl inner join (select attrelid, min(case attisdistkey when 't' then attname else null end) as "distkey", min(case attsortkeyord when 1 then attname else null end ) as head_sort , max(attsortkeyord) as n_sortkeys, max(attencodingtype) as max_enc from pg_attribute group by 1) as det on det.attrelid = a.id inner join ( select tbl, max(mbytes)::decimal(32)/min(mbytes) as ratio from (select tbl, trim(name) as name, slice, count() as mbytes from svv_diskusage group by tbl, name, slice ) group by tbl, name ) as dist_ratio on a.id = dist_ratio.tbl join ( select sum(capacity) as total from stv_partitions where part_begin=0 ) as part on 1=1 where mbytes is not null order by mbytes desc;

  1. Query to identify longest running queries in your cluster :

select trim(database) as db, count(query) as n_qry, max(substring (qrytext,1,80)) as qrytext, min(run_minutes) as "min" , max(run_minutes) as "max", avg(run_minutes) as "avg", sum(run_minutes) as total,
max(query) as max_query_id, max(starttime)::date as last_run, sum(alerts) as alerts, aborted from (select userid, label, stl_query.query, trim(database) as database, trim(querytxt) as qrytext, md5(trim(querytxt)) as qry_md5, starttime, endtime, (datediff(seconds, starttime,endtime)::numeric(12,2))/60 as run_minutes,
alrt.num_events as alerts, aborted from stl_query left outer join (select query, 1 as num_events from stl_alert_event_log group by query ) as alrt on alrt.query = stl_query.query where userid <> 1 and starttime >= dateadd(day, -7, current_date)) group by database, label, qry_md5, aborted order by total desc limit 50;

profile pictureAWS
EXPERT
published a month ago140 views