aws glue create_dynamic_frame from data in PostgreSQL with custom bookmark key

0

Hi AWS expert, I have a code read data from AWS aurora PostgreSQL, I want to bookmark the table with custom column named 'ceres_mono_index'. But it seems like the bookmark is still uses the primary key as the bookmark key instead of column 'ceres_mono_index'. Here is the code


cb_ceres = glueContext.create_dynamic_frame.from_options(
    connection_type="postgresql",
    connection_options={
        "url": f"jdbc:postgresql://{ENDPOINT}:5432/{DBNAME}",
        "dbtable": "xxxxx_raw_ceres",
        "user": username,
        "password": password,
    },
    additional_options={"jobBookmarkKeys": ["ceres_mono_index"], "jobBookmarkKeysSortOrder": "asc"},
    transformation_ctx="cb_ceres_bookmark",
)

How could I fix the issue? Thank you

질문됨 일 년 전1439회 조회
1개 답변
0

I believe PostgreSQL column names are case-sensitive(when double quoted as given in the code). AWS Glue does not support using case-sensitive columns as job bookmark keys. I think that is why glue is still using primary key as the job bookmark key. Check out the documentation for the same when using with JDBC sources https://docs.aws.amazon.com/glue/latest/dg/monitor-continuations.html

답변함 일 년 전
  • Thank you so much for your reply. I am confuse whether it is possible to change it to non-case-sensitive (with single quoted? ), and make it work for PostgreSQL or it is impossible or it is not possible to work with PostgreSQL?

  • yes of course it is possible to work with PostgreSQL. As per your code you are using 'additional_options' param in 'create_dynamic_frame.from_options' function which is not correct way. If you look into the documentation link, you would notice this - You can specify jobBookmarkKeys and jobBookmarkKeysSortOrder in the following ways: create_dynamic_frame.from_catalog — Use additional_options. create_dynamic_frame.from_options — Use connection_options. you can try giving the same in "connection_options" as mentioned and check if it works for you.

    Alternatively, you can also try another way - create a glue crawler to crawl your PostgreSQL datastore and run it to create glue catalog metadata table. you can pass in classifiers in crawler config or even map columns to different name/type using applyMapping feature. Then create the dynamic frame using 'glueContext.create_dynamic_frame.from_catalog' function and pass in bookmark keys in 'additional_options' param.

  • Thank you so much for the help. Move the arguments to 'connection_options' works

로그인하지 않았습니다. 로그인해야 답변을 게시할 수 있습니다.

좋은 답변은 질문에 명확하게 답하고 건설적인 피드백을 제공하며 질문자의 전문적인 성장을 장려합니다.

질문 답변하기에 대한 가이드라인

관련 콘텐츠