DataBrew: Is it possible to run the same step over multiple columns?

It's surprisingly difficult to find information about this. Am I able to say "Format all strings as lowercase?" or "Fill in missing values with NULL for these five columns?"

Asked 1 year ago · Viewed 241 times
1 Answer

Hello,

I understand that you want to run the same step over multiple columns in AWS Glue DataBrew. To do this, add the step to your recipe once for each column (for example, a step that formats a string column as lowercase). There is no limit on repeating the same step for multiple columns. You can read more about DataBrew recipes in the references below.

[1] Creating and using AWS Glue DataBrew recipes - https://docs.aws.amazon.com/databrew/latest/dg/recipes.html
[2] Data cleaning recipe steps - https://docs.aws.amazon.com/databrew/latest/dg/recipe-actions.data-cleaning.html
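
If repeating the step by hand is tedious, one option is to generate the repeated steps with a small script and load the result into a recipe. The following is a minimal sketch, not an official method. It assumes the recipe JSON shape of `Action` / `Operation` / `Parameters`, and that the lowercase operation is named `LOWER_CASE` with a `sourceColumn` parameter; please check both against the recipe actions reference before relying on them. The helper `build_steps` and the column names are hypothetical.

```python
import json


def build_steps(operation, columns, extra_params=None):
    """Return one DataBrew-style recipe step per column.

    Assumption: each step has the shape
    {"Action": {"Operation": ..., "Parameters": {"sourceColumn": ...}}}.
    """
    steps = []
    for column in columns:
        params = {"sourceColumn": column}
        # Merge any additional parameters the chosen operation needs.
        params.update(extra_params or {})
        steps.append({"Action": {"Operation": operation, "Parameters": params}})
    return steps


# Hypothetical column names, for illustration only.
columns = ["first_name", "last_name", "city", "state", "country"]
steps = build_steps("LOWER_CASE", columns)

# Print the steps as JSON so they can be saved to a recipe file.
print(json.dumps(steps, indent=2))
```

The column list then lives in one place, so adding a sixth column means changing one line rather than adding another step by hand. The generated list could be saved as a recipe JSON file, or passed as the `Steps` argument to the boto3 DataBrew client's `create_recipe` or `update_recipe` call, assuming your account and permissions allow it.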

Answered 1 year ago
  • I appreciate you taking the time to respond, but my question is about "fill in missing values with NULL for these five columns," not about repeating steps for multiple columns. Based on your answer, it sounds like the answer is "no, this isn't possible," but I don't think your solution is a valid workaround. A dataset with 10 columns becomes 10x harder to maintain if you have to repeat the same step 10x for each "universal transformation" you want to apply. Thank you for the answer, though!
