Redshift column VARCHAR(MAX) question


A customer is looking at creating a Redshift table with a large VARCHAR(65535) column (although only a tiny percentage of the data will actually be large), but our best practice suggests using the smallest possible column size (due to potential memory waste).
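For illustration, the table would look something like this (table and column names are made up; only the wide column is the point):

```sql
-- Hypothetical table: one very wide VARCHAR column,
-- even though most rows only hold short values.
CREATE TABLE customer_events (
    event_id    BIGINT,
    event_time  TIMESTAMP,
    event_type  VARCHAR(50),
    raw_payload VARCHAR(65535)  -- ~90% of values are only ~20 bytes
);
```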

They’d like to clarify:

  1. Does that mean a varchar column with a large maximum size will only affect performance if we store a lot of large strings in it?
  2. Would there be any problems if, say, 90% of the data is only 20 bytes?

I found a support article that mentions trailing blanks can occupy the full length in memory during query processing, but it still isn't clear enough to answer the questions above.

Asked 4 years ago · 7,971 views
1 Answer
Accepted Answer

The impact of wide columns is mainly felt when you query the data; i.e., if a user queries other columns in the table, the wide column will not affect query performance. If the wide column is queried often, the better approach is to parse out the pertinent piece(s) of information and load them into dedicated, right-sized columns for most reporting purposes. You can leave the raw data in the current column but advise users to avoid accessing it.
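A minimal sketch of that pattern, reusing the hypothetical table from the question (column names and the 20-byte prefix are assumptions, not a prescription):

```sql
-- Keep the raw wide column, but extract the commonly queried piece
-- into a dedicated, right-sized column.
ALTER TABLE customer_events ADD COLUMN event_code VARCHAR(20);

UPDATE customer_events
SET event_code = SUBSTRING(raw_payload, 1, 20);

-- Reporting queries then filter/aggregate on the narrow column
-- instead of touching the VARCHAR(65535) column.
SELECT event_code, COUNT(*)
FROM customer_events
GROUP BY event_code;
```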

AWS
EXPERT
Rajiv_G
Answered 4 years ago
