Is there an option in AWS Glue DataBrew to keep only the required columns and delete the rest?

I have many datasets in AWS Glue with hundreds of columns, but I need only a few of those columns for feature selection. I don't see an option in AWS Glue DataBrew for keeping the required columns and deleting the rest. Is there a feature or capability to achieve this?

AWS
EXPERT
pechung
asked 3 years ago, 1263 views
1 Answer
Accepted Answer

You can use the text box in the Columns tab to view the required columns in the AWS Glue DataBrew console. You can search for the columns, select the required columns, and deselect the rest.

To actually remove columns from your final dataset, you need to add a delete column step to the recipe. Note that the delete column step doesn't have the global filter/search functionality, so you can't reuse the search from the Columns tab there.

To delete multiple columns, you can download the recipe as a JSON file, add your columns in the DELETE step, and then upload the recipe.

Example:

{
  "Action": {
    "Operation": "DELETE",
    "Parameters": {
      "sourceColumns": "[\"victory_status\",\"winner\",\"turns\"]"
    }
  }
}
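
If you know which columns you want to keep, the list for the DELETE step can be generated rather than typed by hand. The following is a minimal sketch in Python, not an official DataBrew utility: it assumes you already have the dataset's full column list, the helper name `build_delete_step` and the column names are hypothetical, and the step layout simply mirrors the example above.

```python
import json


def build_delete_step(all_columns, keep_columns):
    """Build a DELETE recipe step that drops every column not in keep_columns."""
    keep = set(keep_columns)
    missing = keep - set(all_columns)
    if missing:
        raise ValueError(f"Columns not found in dataset: {sorted(missing)}")

    # Keep the dataset's original column order so the recipe stays readable.
    to_delete = [col for col in all_columns if col not in keep]

    return {
        "Action": {
            "Operation": "DELETE",
            "Parameters": {
                # Assumption taken from the example above: sourceColumns is a
                # JSON-encoded string, not a JSON array.
                "sourceColumns": json.dumps(to_delete, separators=(",", ":")),
            },
        }
    }


# Hypothetical column names, for illustration only.
all_columns = ["id", "rated", "turns", "victory_status", "winner", "opening_name"]
step = build_delete_step(all_columns, keep_columns=["id", "rated", "opening_name"])

# Print the generated step so it can be copied into the downloaded recipe file.
print(json.dumps(step, indent=2))
```

You can then paste the printed step into the recipe JSON you downloaded and upload the recipe again, as described above.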
AWS
answered 3 years ago
  • Is this still true? I don't see the sourceColumns parameter documented anywhere, only sourceColumn.
