DataBrew: Is it possible to run the same step over multiple columns?

It's surprisingly difficult to find information about this. Can I say "format all strings as lowercase" or "fill in missing values with NULL for these five columns" in a single step?

asked a year ago, 241 views
1 Answer

Hello,

I understand that you want to run the same step over multiple columns in AWS Glue DataBrew. To do this, add the step to your recipe once for each column (for example, one lowercase-formatting step per column). There is no limit on repeating the same step across multiple columns. You can read more about DataBrew recipes in [1] and [2].

[1] Creating and using AWS Glue DataBrew recipes - https://docs.aws.amazon.com/databrew/latest/dg/recipes.html
[2] Data cleaning recipe steps - https://docs.aws.amazon.com/databrew/latest/dg/recipe-actions.data-cleaning.html
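
If repeating the step by hand is the concern, one option is to generate the repeated steps with a small script instead of clicking through them. The following is only a minimal sketch, not an official DataBrew feature: the helper `repeat_step` and the column names are hypothetical, and the operation name `LOWER_CASE` with its `sourceColumn` parameter is assumed from the recipe actions reference, so check the exact names for your action in [2].

```python
import json


def repeat_step(operation, columns, extra_params=None):
    """Build one recipe step per column, all using the same operation.

    Hypothetical helper: the step shape (Action -> Operation/Parameters)
    follows the DataBrew recipe JSON format; the parameter name
    "sourceColumn" is an assumption to verify for each action.
    """
    steps = []
    for column in columns:
        params = {"sourceColumn": column}
        # Merge any additional parameters the action needs (all strings).
        params.update(extra_params or {})
        steps.append({"Action": {"Operation": operation, "Parameters": params}})
    return steps


# Hypothetical column names, for illustration only.
columns = ["first_name", "last_name", "city"]
steps = repeat_step("LOWER_CASE", columns)

# Print the generated steps as JSON so they can be reviewed or saved.
print(json.dumps(steps, indent=2))
```

The column list then lives in one place, so adding a column means editing one line rather than adding a step manually. The generated list could be saved as a recipe JSON file or passed as the steps of a recipe-creation API call; either route is an assumption about your workflow, not something confirmed in this thread.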

answered a year ago
  • I appreciate you taking the time to respond, but my question is about "fill in missing values with NULL for these five columns," not about repeating steps for multiple columns. Based on your answer, it sounds like the answer is "no, this isn't possible," but I don't think your solution is a valid workaround. A dataset with 10 columns becomes 10x harder to maintain if you have to repeat the same step 10x for each "universal transformation" you want to apply. Thank you for the answer, though!
