aws glue create_dynamic_frame from data in PostgreSQL with custom bookmark key

0

Hi AWS expert, I have a code read data from AWS aurora PostgreSQL, I want to bookmark the table with custom column named 'ceres_mono_index'. But it seems like the bookmark is still uses the primary key as the bookmark key instead of column 'ceres_mono_index'. Here is the code


cb_ceres = glueContext.create_dynamic_frame.from_options(
    connection_type="postgresql",
    connection_options={
        "url": f"jdbc:postgresql://{ENDPOINT}:5432/{DBNAME}",
        "dbtable": "xxxxx_raw_ceres",
        "user": username,
        "password": password,
    },
    additional_options={"jobBookmarkKeys": ["ceres_mono_index"], "jobBookmarkKeysSortOrder": "asc"},
    transformation_ctx="cb_ceres_bookmark",
)

How could I fix the issue? Thank you

posta un anno fa1440 visualizzazioni
1 Risposta
0

I believe PostgreSQL column names are case-sensitive(when double quoted as given in the code). AWS Glue does not support using case-sensitive columns as job bookmark keys. I think that is why glue is still using primary key as the job bookmark key. Check out the documentation for the same when using with JDBC sources https://docs.aws.amazon.com/glue/latest/dg/monitor-continuations.html

con risposta un anno fa
  • Thank you so much for your reply. I am confuse whether it is possible to change it to non-case-sensitive (with single quoted? ), and make it work for PostgreSQL or it is impossible or it is not possible to work with PostgreSQL?

  • yes of course it is possible to work with PostgreSQL. As per your code you are using 'additional_options' param in 'create_dynamic_frame.from_options' function which is not correct way. If you look into the documentation link, you would notice this - You can specify jobBookmarkKeys and jobBookmarkKeysSortOrder in the following ways: create_dynamic_frame.from_catalog — Use additional_options. create_dynamic_frame.from_options — Use connection_options. you can try giving the same in "connection_options" as mentioned and check if it works for you.

    Alternatively, you can also try another way - create a glue crawler to crawl your PostgreSQL datastore and run it to create glue catalog metadata table. you can pass in classifiers in crawler config or even map columns to different name/type using applyMapping feature. Then create the dynamic frame using 'glueContext.create_dynamic_frame.from_catalog' function and pass in bookmark keys in 'additional_options' param.

  • Thank you so much for the help. Move the arguments to 'connection_options' works

Accesso non effettuato. Accedi per postare una risposta.

Una buona risposta soddisfa chiaramente la domanda, fornisce un feedback costruttivo e incoraggia la crescita professionale del richiedente.

Linee guida per rispondere alle domande