Redshift: Slow read speeds


We have a golang client running on k8s pod reading data from redshift table. Table has 50 columns and 10 million records and query looks like this

select * from tablename where id > 0 and id < 5000000

we are seeing that reading 5000 records is taking around 30 seconds which is very slow. what are the redshift metrics one should look at in these cases?

  • Why are you getting 5M records from Redshift at a time? That isn't a typical pattern for Redshift.

asked 7 months ago144 views
2 Answers

Redshift is a columnar database. So instead of using select * from a table, selecting specific columns will perform lot better. Based on query you have provided, please try creating sort key on column id and see if it helps. Typically Redshift takes care of updating statistics automatically but you can also update it using analyze table command.

answered 7 months ago

Hello, to improve performance for this specific query and table, I would first explore data model optimizations such as ensuring that you have optimal compression and sort keys (e.g. id column) for the table (distribution style is also important in most cases but since this query doesn't involve joins, not so much). You can easily add these characteristics to your table via the ALTER command. Try looking at the Redshift Advisor recommendations in the Redshift console to see if there any data model optimizations recommended by the Redshift ML algorithms. Another aspect worth considering is if you have an underpowered Redshift cluster vis-a-vis this workload and/or other concurrent workloads. Try examining the CPU utilization for example to see if it is peaking. Try experimenting with an increased node count to see if it results in improved query runtimes.

answered 7 months ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions