Refresh DataSet in Glue DataBrew


I have an S3-backed dataset in Glue DataBrew with JSON and gzipped CSV files in it. I removed the JSON files from the S3 bucket. Do I need to refresh the dataset or re-add it for the changes to be picked up? If so, how?

I couldn't find the answer in the documentation, but I may have missed it.

Asked a year ago, 437 views
1 Answer
Accepted Answer

Hi,

If you removed files from the S3 bucket that your Glue DataBrew job reads from, manually re-run the job and it will pick up the changes. You can also have DataBrew process or refresh data automatically by using dynamic datasets for files in S3: you define time-based, pattern-based, and custom parameters in the S3 path, and the dataset resolves the matching files each time it is used.

This blog post goes into more detail: https://aws.amazon.com/blogs/big-data/simplify-incoming-data-ingestion-with-dynamic-parameterized-datasets-in-aws-glue-databrew/
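If you'd rather trigger the re-run from a script than from the console, something like the following should work. It's only a minimal sketch: it assumes the boto3 DataBrew client and its start_job_run and describe_job_run calls, and the helper name rerun_job and the polling interval are made up for illustration, not anything DataBrew prescribes.

```python
import time

# Job run states after which DataBrew will not change the run any further.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT"}


def rerun_job(client, job_name, poll_seconds=15.0, sleep=time.sleep):
    """Start a new run of an existing DataBrew job and wait for it to finish.

    `client` is expected to behave like boto3.client("databrew").
    `rerun_job` itself is a hypothetical helper, not part of any AWS SDK.
    Returns the final state string of the run.
    """
    # A new run reads the dataset's S3 location again, so files deleted
    # from the bucket are no longer part of the input.
    run_id = client.start_job_run(Name=job_name)["RunId"]

    while True:
        state = client.describe_job_run(Name=job_name, RunId=run_id)["State"]
        if state in TERMINAL_STATES:
            return state
        # Still STARTING, RUNNING or STOPPING: wait before polling again.
        sleep(poll_seconds)
```

With boto3 installed and credentials configured, you would call it as rerun_job(boto3.client("databrew"), "my-job"), where "my-job" is a placeholder for your own job name. The client is passed in as an argument so the polling logic can be tried out against a stub without touching AWS.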

Hope this helps!

AWS
Answered a year ago
