Is there an option in AWS Glue DataBrew to keep only the required columns and delete the rest?


I have many datasets in AWS Glue with hundreds of columns, but I need only a few of those columns for feature selection. I don't see an option in AWS Glue DataBrew for keeping the required columns and deleting the rest. Is there a feature or capability that does this?

Asked by pechung (AWS, EXPERT) 3 years ago, 1264 views
1 answer

Accepted answer

In the AWS Glue DataBrew console, you can use the text box in the Columns tab to view only the required columns: search for the columns, select the ones you need, and deselect the rest.

To actually remove columns from your final dataset, you need to apply the delete column recipe step, which doesn't have the global filter/search functionality.

To delete multiple columns, you can download the recipe as a JSON file, add your columns in the DELETE step, and then upload the recipe.

Example:

{
  "Action": {
    "Operation": "DELETE",
    "Parameters": {
      "sourceColumns": "[\"victory_status\",\"winner\",\"turns\"]"
    }
  }
}
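
When there are hundreds of columns, typing the list of columns to delete by hand is error-prone. Here is a minimal sketch that builds the DELETE step from the list of columns you want to keep. It assumes you already have the full column list (for example, copied from the dataset schema), and that the step uses the same "Action" / "sourceColumns" shape as the example above; the column names and the helper name build_delete_step are hypothetical.

```python
import json


def build_delete_step(all_columns, keep_columns):
    """Return a DELETE recipe step that drops every column not in keep_columns."""
    keep = set(keep_columns)
    missing = keep - set(all_columns)
    if missing:
        # Fail early on a typo instead of silently deleting the wrong columns.
        raise ValueError(f"Columns not found in dataset: {sorted(missing)}")

    # Preserve the dataset's column order for readability.
    to_delete = [col for col in all_columns if col not in keep]

    return {
        "Action": {
            "Operation": "DELETE",
            "Parameters": {
                # As in the example above, the column list is passed as a
                # JSON-encoded string, not as a JSON array.
                "sourceColumns": json.dumps(to_delete, separators=(",", ":")),
            },
        }
    }


# Hypothetical column names, for illustration only.
all_columns = ["id", "rated", "turns", "victory_status", "winner"]
step = build_delete_step(all_columns, keep_columns=["id", "rated"])
print(json.dumps(step, indent=2))
```

You can paste the printed step into the downloaded recipe JSON and then upload the recipe as described above.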
Answered by AWS 3 years ago
  • Is this still true? I don't see the sourceColumns parameter documented anywhere, only sourceColumn.
