DataBrew: Is it possible to run the same step over multiple columns?

0

It's surprisingly difficult to find information about this. Am I able to say "Format all strings as lowercase?" or "Fill in missing values with NULL for these five columns?"

Asked 1 year ago, 241 views
1 Answer
-1

Hello,

I understand that you want to run the same step over multiple columns in AWS Glue DataBrew. To do this, add the step to your recipe once for each column (for example, formatting a string column as lowercase). There is no limit on repeating the same step for multiple columns. You can read more about DataBrew recipes in the references below.

[1] Creating and using AWS Glue DataBrew recipes - https://docs.aws.amazon.com/databrew/latest/dg/recipes.html
[2] Data cleaning recipe steps - https://docs.aws.amazon.com/databrew/latest/dg/recipe-actions.data-cleaning.html
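
If repeating the step by hand is the maintenance burden, one possible workaround is to generate the repeated steps with a small script and then import or upload the resulting recipe. The sketch below is only an illustration. The `build_steps` helper and the column names are hypothetical, and the `LOWER_CASE` operation name and `sourceColumn` parameter are my assumption of how the recipe actions reference spells them, so check them against the documentation before relying on this.

```python
import json


def build_steps(operation, columns, extra_parameters=None):
    """Return one recipe step per column, all using the same operation.

    Hypothetical helper: it only builds plain dictionaries in the
    Action/Operation/Parameters shape used by DataBrew recipe JSON.
    """
    steps = []
    for column in columns:
        # Assumption: the operation takes a single "sourceColumn" parameter.
        parameters = {"sourceColumn": column}
        parameters.update(extra_parameters or {})
        steps.append(
            {"Action": {"Operation": operation, "Parameters": parameters}}
        )
    return steps


# Hypothetical column names, for illustration only.
string_columns = ["first_name", "last_name", "city"]

# Assumption: "LOWER_CASE" is the operation name for lowercasing a string column.
steps = build_steps("LOWER_CASE", string_columns)

# Print the steps as JSON so they can be reviewed or saved to a recipe file.
print(json.dumps(steps, indent=2))
```

This keeps the list of columns in one place, so adding a column means editing one list rather than adding another step by hand. It does not make DataBrew apply a single step to many columns. It only automates the repetition.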

Answered 1 year ago
  • I appreciate you taking the time to respond, but my question is about "fill in missing values with NULL for these five columns," not about repeating steps for multiple columns. Based on your answer, it sounds like the answer is "no, this isn't possible," but I don't think your solution is a valid workaround. A dataset with 10 columns becomes 10x harder to maintain if you have to repeat the same step 10x for each "universal transformation" you want to apply. Thank you for the answer, though!
