AWS Glue with DynamoDB - Not getting DuplicateItemException when duplicate primary keys are inserted


    # Part of the Glue job code that writes the DataFrame to the DynamoDB table

    # Write df_dyn_target_1 into DynamoDB
    connection_options={"dynamodb.output.tableName": "dy_lookup_table",
        "dynamodb.throughput.write.percent": "1.0"}

I'm trying to write the contents of a DataFrame (df_dyn_target_1) to the "dy_lookup_table" table. Because these records (primary key = partition key + sort key) are already present in the DynamoDB table, I expected "DuplicateItemException: Duplicate primary key exists in table" and, eventually, a Glue job failure. Instead, the job succeeds. Do I need to pass any parameters in connection_options={} to get that behavior? Please advise.

Asked a year ago · 510 views
1 answer
Accepted answer

You will not get a DuplicateItemException when you write an item that already exists. DynamoDB simply overwrites the existing item that has the same primary key; that is how a put works in DynamoDB.

Where you could receive a DuplicateItemException is within a single batch: under the hood, Glue groups items into batches of up to 25 for each BatchWriteItem call, and DynamoDB rejects the call if two or more items in the same batch have the same key.
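To see whether the data being written repeats a key at all, the rows can be checked before the write. The following is a minimal pure-Python sketch, not Glue code: it assumes the rows are available as dictionaries and that the key attributes are named `pk` and `sk` (hypothetical names, since the post does not give them). It also assumes batches are taken in order, 25 at a time, which may not match how Glue actually groups items.

```python
from collections import Counter

BATCH_SIZE = 25  # BatchWriteItem accepts at most 25 put/delete requests per call


def find_duplicate_keys(items, key_attrs=("pk", "sk")):
    """Return the primary-key tuples that occur more than once in items."""
    counts = Counter(tuple(item[attr] for attr in key_attrs) for item in items)
    return sorted(key for key, n in counts.items() if n > 1)


def batches_with_duplicates(items, key_attrs=("pk", "sk"), batch_size=BATCH_SIZE):
    """Return the indexes of consecutive batches that contain a repeated key."""
    bad = []
    for index, start in enumerate(range(0, len(items), batch_size)):
        batch = items[start:start + batch_size]
        if find_duplicate_keys(batch, key_attrs):
            bad.append(index)
    return bad
```

In a Glue job, the equivalent check would more likely be a group-by on the key columns of the DataFrame, and dropping duplicates on those columns before the write avoids the problem entirely.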

Unfortunately, you cannot do a conditional write with Glue, because it uses BatchWriteItem, which does not support condition expressions.
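If failing on an existing key matters more than throughput, one possible workaround outside the Glue DynamoDB connector is to write items one at a time with PutItem and a condition expression. The sketch below assumes `table` is a boto3 DynamoDB `Table` resource and that the partition key attribute is named `pk` (a hypothetical name); `put_if_absent` is an illustrative helper, not part of Glue or boto3.

```python
def put_if_absent(table, item, partition_key="pk"):
    """Write item only if no item with the same primary key exists.

    `table` is expected to behave like a boto3 DynamoDB Table resource.
    Returns True if the item was written, False if the key already existed.
    """
    try:
        table.put_item(
            Item=item,
            # The condition is evaluated against the existing item (if any)
            # that has the same full primary key as `item`.
            ConditionExpression="attribute_not_exists(#pk)",
            ExpressionAttributeNames={"#pk": partition_key},
        )
        return True
    except table.meta.client.exceptions.ConditionalCheckFailedException:
        return False
```

With real credentials, the table would be obtained with something like `boto3.resource("dynamodb").Table("dy_lookup_table")`. To make the job fail instead of skipping the item, raise an error when the function returns False. Per-item writes are considerably slower than batched writes, so this trade-off is only worthwhile for modest volumes.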

AWS
Answered a year ago
Reviewed a year ago
