Redshift column VARCHAR(MAX) question


A customer is looking at creating a Redshift table with a large varchar(65535) column (although only a tiny percentage of the data will actually be large), but best practice suggests using the smallest possible column size, because oversized columns can waste memory during query processing.

They’d like to clarify:

  1. Does that mean having a varchar column with large max size will only affect performance if we store a lot of large strings in the column?
  2. Would there be any problems if say 90% of the data is only 20 bytes?

I found a support article that mentions trailing blanks can occupy the full declared length in memory during query processing, but it is still not clear enough to answer the questions above.

asked 4 years ago · 7915 views
1 Answer
Accepted Answer

The impact of wide columns is felt mainly when you query the data: if users query only the other columns in the table, the wide column will not affect query performance. If the wide column is queried often, a better approach is to parse out the pertinent piece(s) of information and load them into dedicated, right-sized columns for most reporting purposes. You can leave the raw data in the current column but advise users to avoid accessing it.
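As a sketch of that approach (the table and column names here are hypothetical, not from the question):

```sql
-- Keep the rarely-large raw value in a wide column, and expose the
-- commonly queried piece in a small, right-sized column.
CREATE TABLE event_log (
    event_id    BIGINT,
    event_code  VARCHAR(20),     -- parsed, right-sized column: point reports here
    raw_payload VARCHAR(65535)   -- raw data: retained, but avoid in routine queries
);
```

Queries that reference only `event_code` size their in-memory buffers from that column's 20-byte declared width, so routine reporting avoids the memory overhead of the 65535-byte column even though the raw data is still available when needed.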

AWS
EXPERT
Rajiv_G
answered 4 years ago
