AWS Glue with Dynamo - Not getting DuplicateItemException when duplicate Primary keys are inserted


# Part of the Glue job code that writes the frame into the DynamoDB table

# Write df_dyn_target_1 into the DynamoDB table
glueContext.write_dynamic_frame_from_options(
    frame=df_dyn_target_1,
    connection_type="dynamodb",
    connection_options={
        "dynamodb.output.tableName": "dy_lookup_table",
        "dynamodb.throughput.write.percent": "1.0"
    }
)

I'm trying to write the contents of df_dyn_target_1 into the "dy_lookup_table" table. Since these records (primary key = partition key + sort key) are already present in the DynamoDB table, I am expecting "DuplicateItemException: Duplicate primary key exists in table" and, eventually, a Glue job failure. But it isn't working that way and the job succeeds. Do I need to pass any parameters in connection_options={} to solve the issue? Please suggest.

Asked 1 year ago · 381 views

1 Answer
Accepted Answer

You will not get a DuplicateItemException when you write an item that already exists; instead, DynamoDB simply overwrites the existing item, because that is how DynamoDB's PutItem works.
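For illustration, here is a minimal boto3 sketch of that overwrite behavior, assuming a table named dy_lookup_table with placeholder key attributes pk (partition) and sk (sort); writing the same primary key twice raises no error, the second PutItem just replaces the item:

import boto3

# Assumed table and key attribute names (pk / sk) - placeholders for illustration.
table = boto3.resource("dynamodb").Table("dy_lookup_table")

# First write creates the item.
table.put_item(Item={"pk": "A", "sk": "1", "value": "old"})

# Second write with the same primary key does NOT raise an exception;
# PutItem silently replaces the whole item with the new attributes.
table.put_item(Item={"pk": "A", "sk": "1", "value": "new"})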

Where you could receive a DuplicateItemException is when Glue builds a batch of 25 items for its BatchWriteItem call (which happens under the hood) and two or more items within that batch of 25 share the same key.

Unfortunately, you cannot do a conditional write with Glue, as it uses BatchWriteItem, which does not support condition expressions.
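If you need the load to fail on pre-existing keys, one workaround outside of Glue's DynamoDB writer is a conditional PutItem per record. This is only a sketch, reusing the same placeholder table and key names (dy_lookup_table, pk, sk) as above, and it gives up the throughput of BatchWriteItem in exchange for duplicate detection:

import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("dy_lookup_table")  # assumed table name

def put_if_absent(item):
    """Write the item only if no item with the same primary key exists."""
    try:
        # The condition is evaluated against the existing item with the same
        # full primary key; if one exists, the write is rejected.
        table.put_item(
            Item=item,
            ConditionExpression="attribute_not_exists(pk)",
        )
    except ClientError as e:
        if e.response["Error"]["Code"] == "ConditionalCheckFailedException":
            # Surface the duplicate so the job can fail instead of overwriting.
            raise RuntimeError(
                f"Duplicate primary key: {item['pk']}/{item['sk']}"
            ) from e
        raise

You would call put_if_absent for each record (for example from a foreach over the frame's rows), accepting that this is per-item traffic rather than batched writes.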

AWS Expert · Answered 1 year ago · Reviewed by an AWS Expert 1 year ago
