How to rewind a Job bookmark programmatically


I am using AWS Glue to read files from S3 and migrate them to a database. The same script runs for 30-40 tables; the S3 path and table name change dynamically through a CSV file that I pass in. I do this to avoid creating that many jobs. Each data source that is read has its own dedicated transformation_ctx property, so the next time the job runs it picks up from where each table was last read.

The problem arises when the load for any one table fails. For that table the files have already been read but the write did not happen, so in the next run I lose the data that was read but never written for that specific table. These are the possibilities I have come up with:

1. Make the entire job fail if any table load fails.
2. Send an email notification for the failed table (so that I can troubleshoot), rewind the bookmark for that table only, and continue processing the next tables.

I would prefer the second option because I don't want to stop the other tables from being written, but I have not been able to implement it. I would like to rewind the bookmark in code, or reprocess the files, for the failed table only and not for all tables.

Is there any other way to achieve this?

Asked 2 years ago, 1299 views
1 Answer

A job bookmark can only be rewound for the job as a whole, to the state of a previous job run: https://docs.aws.amazon.com/cli/latest/reference/glue/reset-job-bookmark.html

Since multiple tables are processed in a single job, rewinding would mean reprocessing data for all of the tables, including those where the issue didn't happen. The first option therefore seems better: make the entire job fail if even one table load fails.
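As a rough illustration of the first option, here is a minimal Python sketch that stops at the first failed table and skips the commit step. The names `run_all_tables`, `load_table`, and `TableLoadError` are hypothetical; `commit` stands in for the Glue script's `job.commit()`, on the assumption that bookmark state is only persisted when that call runs.

```python
import logging
from typing import Callable, Iterable

logger = logging.getLogger(__name__)


class TableLoadError(RuntimeError):
    """Hypothetical error type: raised so that the whole job run is marked failed."""


def run_all_tables(
    tables: Iterable[str],
    load_table: Callable[[str], None],
    commit: Callable[[], None],
) -> None:
    """Load every table in order; commit only if all of them succeed.

    load_table is a placeholder for the per-table read/transform/write logic.
    commit is a placeholder for job.commit() (assumed to persist bookmarks).
    """
    for table in tables:
        try:
            load_table(table)
        except Exception as exc:
            logger.error("Load failed for table %s: %s", table, exc)
            # Re-raise without calling commit(), so the run fails as a whole.
            raise TableLoadError(f"Load failed for table {table}") from exc
    # Reached only when every table loaded without an exception.
    commit()
```

In a Glue script this would be called as something like `run_all_tables(table_names, load_one_table, job.commit)`. One caveat of this design: tables that were already written before the failure would be read again in the next run, so the writes need to tolerate reprocessing.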

AWS
Answered 2 years ago
