Redshift column VARCHAR(MAX) question


A customer is looking at creating a Redshift table with a large VARCHAR(65535) column (although only a tiny percentage of the data will be large), but our best practice suggests using the smallest possible column size (due to potential memory waste).

They’d like to clarify:

  1. Does that mean a VARCHAR column with a large maximum size will only affect performance if we store many large strings in it?
  2. Would there be any problems if, say, 90% of the data is only 20 bytes?

I found a support article that mentions trailing blanks can occupy the full declared length in memory during query processing, but that is still not clear enough to answer the questions above.
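For concreteness, a minimal sketch of the kind of table in question (the table and column names are hypothetical, not from the customer's schema):

```sql
-- Hypothetical table: one wide free-text column alongside small columns.
-- Roughly 90% of "notes" values are expected to be ~20 bytes.
CREATE TABLE customer_events (
    event_id   BIGINT,
    event_time TIMESTAMP,
    notes      VARCHAR(65535)   -- declared at the Redshift maximum "just in case"
);
```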

asked 4 years ago · 7,971 views

1 Answer
Accepted Answer

The impact of wide columns shows up mainly when you query the data; that is, if users query only the other columns in the table, the wide column will not affect query performance. If the column is queried often, the better approach is to parse out the pertinent piece(s) of information and load them into dedicated, right-sized columns for most reporting purposes. You can leave the raw data in the current column but advise users to avoid accessing it.
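A minimal sketch of that approach, assuming the pertinent piece is a short code at the start of the raw string (all table and column names, and the LEFT(notes, 20) extraction, are illustrative):

```sql
-- Right-sized column for the commonly queried piece; raw text kept alongside it.
CREATE TABLE customer_events (
    event_id   BIGINT,
    event_time TIMESTAMP,
    event_code VARCHAR(20),     -- parsed, right-sized; point reporting queries here
    notes_raw  VARCHAR(65535)   -- full raw text; advise users to avoid selecting it
);

-- Populate the narrow column from the raw value at load time.
INSERT INTO customer_events (event_id, event_time, event_code, notes_raw)
SELECT id, ts, LEFT(notes, 20), notes
FROM staging_events;
```

Reports that filter or group on event_code then never touch the wide column, so its declared width does not inflate the memory allocated for those queries.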

AWS
EXPERT
Rajiv_G
answered 4 years ago
