Is there an option in AWS Glue DataBrew to keep only the required columns and delete the rest?


I have many datasets in AWS Glue with hundreds of columns, but I need only a few of those columns for feature selection. I don't see an option in AWS Glue DataBrew for keeping the required columns and deleting the rest. Is there a feature or capability that does this?

pechung (AWS, Expert)
Asked 3 years ago · 1263 views

1 Answer

Accepted Answer

You can use the text box in the Columns tab to view the required columns in the AWS Glue DataBrew console. You can search for the columns, select the required columns, and deselect the rest.

To actually remove columns from your final dataset, you need to apply the delete column recipe step, which does not offer the same global filter/search functionality.

To delete multiple columns, you can download the recipe as a JSON file, add your columns in the DELETE step, and then upload the recipe.

Example:

{
  "Action": {
    "Operation": "DELETE",
    "Parameters": {
      "sourceColumns": "[\"victory_status\",\"winner\",\"turns\"]"
    }
  }
}
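
Since the question is about keeping a few columns rather than listing hundreds to delete, the DELETE step above could be generated instead of typed by hand. The following is a minimal sketch, not an official DataBrew tool: `build_delete_step` is a hypothetical helper, the column names other than those in the example are made up, and it assumes the `sourceColumns` parameter takes a JSON-encoded string exactly as shown in the example above.

```python
import json


def build_delete_step(all_columns, keep_columns):
    """Build a DELETE recipe step that drops every column not in keep_columns.

    Hypothetical helper: the step layout mirrors the example in this answer.
    """
    keep = set(keep_columns)
    missing = keep - set(all_columns)
    if missing:
        # Fail early if a column to keep is misspelled or absent
        raise ValueError(f"Columns to keep not found: {sorted(missing)}")

    # Preserve the dataset's column order for the columns being deleted
    to_delete = [c for c in all_columns if c not in keep]

    return {
        "Action": {
            "Operation": "DELETE",
            "Parameters": {
                # Assumption: the list is passed as a JSON-encoded string,
                # as in the example above
                "sourceColumns": json.dumps(to_delete, separators=(",", ":")),
            },
        }
    }


# Illustrative column names; replace with your dataset's schema
all_columns = ["id", "rated", "turns", "victory_status", "winner", "opening_name"]
step = build_delete_step(all_columns, keep_columns=["id", "rated", "opening_name"])
print(json.dumps(step, indent=2))
```

The printed step can then be pasted into the downloaded recipe JSON before uploading it again.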
AWS
Answered 3 years ago
  • Is this still true? I don't see the sourceColumns parameter documented anywhere, only sourceColumn.
