AWS Glue Studio visual editor: S3 target not updating schema

Hi,

I am using the AWS Glue Studio visual editor to write an ETL job with an S3 target. In the target configuration I selected the option "Create a table in the Data Catalog and on subsequent runs, update the schema and add new partitions" so that the schema would be updated automatically, but it doesn't work, and the logs show no errors. The first time I run the job, the table is created correctly, but if I then change, for example, the output format from Parquet to JSON, the Glue table is not updated. Any idea why?

Thanks

Paolo
Asked a year ago, 566 views
1 Answer

Please check the "Job bookmark" option under Job details. When job bookmarks are enabled, AWS Glue persists state information from each job run to track data that has already been processed, which prevents old data from being reprocessed. I suspect the bookmark is enabled in your case: when you re-run the job, it skips the data because nothing has changed since the first run. You can force a re-run either by disabling the job bookmark or by making some changes to the source data.
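
One way to confirm the bookmark setting outside the console is to read the job definition and look at the `--job-bookmark-option` default argument. The sketch below is only an illustration under stated assumptions: the helper names `bookmark_state` and `fetch_job` and the job name `my-etl-job` are hypothetical, and the live lookup assumes boto3 is installed and AWS credentials are configured.

```python
from typing import Any, Dict

# Job argument that controls bookmarks, and the values AWS Glue accepts for it.
BOOKMARK_ARG = "--job-bookmark-option"
BOOKMARK_STATES = {
    "job-bookmark-enable": "enabled",
    "job-bookmark-disable": "disabled",
    "job-bookmark-pause": "paused",
}


def bookmark_state(job: Dict[str, Any]) -> str:
    """Return the bookmark state recorded in a Glue job definition.

    `job` is the "Job" dictionary from a GetJob response. Bookmarks are
    disabled by default, so a missing argument is reported as disabled.
    """
    args = job.get("DefaultArguments") or {}
    value = args.get(BOOKMARK_ARG, "job-bookmark-disable")
    return BOOKMARK_STATES.get(value, f"unknown ({value})")


def fetch_job(job_name: str) -> Dict[str, Any]:
    """Fetch a job definition with boto3 (requires AWS credentials)."""
    import boto3  # imported lazily so bookmark_state works without boto3

    return boto3.client("glue").get_job(JobName=job_name)["Job"]


if __name__ == "__main__":
    # Hypothetical job definition shaped like the "Job" part of a GetJob response.
    # Replace with fetch_job("my-etl-job") to inspect a real job.
    sample = {
        "Name": "my-etl-job",
        "DefaultArguments": {BOOKMARK_ARG: "job-bookmark-enable"},
    }
    print(bookmark_state(sample))
```

This reads only the job-level default; arguments passed to an individual run can override it, so the run's own arguments are worth checking too.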

AWS, Expert
Answered a year ago
  • I checked and the job bookmark is set to disabled