How to perform reliable asynchronous CDC for Amazon Aurora PostgreSQL DB


I’m planning to use Aurora RDS as my service database, and I’ll be versioning every record I store in it. Is there a managed or reliable way to do asynchronous CDC from the DB and store each version in a historical data store (S3) for audit purposes?

I’ve come across approaches using DMS or triggers, but I’d like to avoid DMS since it doesn’t seem like the right fit for this use case, and I don’t want to use triggers because I’d like the CDC to be asynchronous.

In simple terms, I’d like to build something like DynamoDB Streams for my Aurora DB.

Asked 24 days ago · 182 views
1 Answer

I haven't done this setup end to end myself, but at a high level the following approach should work: Aurora Database Activity Streams with AWS Lambda and S3. https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/DBActivityStreams.Overview.html

1/ Enable Database Activity Streams: This built-in Aurora feature captures database modifications (inserts, updates, deletes) in near real-time. It automatically creates a Kinesis data stream to push this activity data.

2/ AWS Lambda Function: Create a Lambda function triggered by the Kinesis data stream. This function will process the activity stream events and extract the relevant data changes.

3/ Data Transformation (Optional): The Lambda function can transform the extracted data if needed to match your desired format for historical storage in S3.

4/ Store Versions in S3: Use the AWS SDK within the Lambda function to write the extracted and potentially transformed data to S3 objects. Each object can represent a specific version of your data.
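
Steps 2 to 4 could look roughly like the Python sketch below. I haven't run it against a real activity stream, so treat it as a starting point only. It assumes a Lambda function subscribed to the Kinesis data stream, and it stores each record's payload as received; activity stream payloads are encrypted, so decryption and parsing would have to happen where the transformation comment is. The names `process_records`, `build_object_key`, the `aurora-history` prefix and the `HISTORY_BUCKET` environment variable are made up for illustration.

```python
import base64


def build_object_key(prefix, partition_key, sequence_number):
    """Build one S3 key per stream record so each version is stored separately."""
    return f"{prefix}/{partition_key}/{sequence_number}.json"


def process_records(event, s3_client, bucket, prefix="aurora-history"):
    """Write every Kinesis record in the Lambda event to S3 and return the keys."""
    written = []
    for record in event.get("Records", []):
        kinesis = record["kinesis"]
        # Kinesis hands the payload to Lambda base64-encoded.
        payload = base64.b64decode(kinesis["data"])
        # Step 3 (optional): decrypt, parse and transform the payload here.
        # This sketch stores the bytes unchanged (assumption).
        key = build_object_key(
            prefix, kinesis["partitionKey"], kinesis["sequenceNumber"]
        )
        s3_client.put_object(Bucket=bucket, Key=key, Body=payload)
        written.append(key)
    return written


def lambda_handler(event, context):
    # Imported here so the helpers above can be exercised without AWS access.
    import os
    import boto3

    bucket = os.environ["HISTORY_BUCKET"]  # hypothetical environment variable
    keys = process_records(event, boto3.client("s3"), bucket)
    return {"written": len(keys)}
```

Using the Kinesis sequence number in the key means a retried batch overwrites the same objects instead of creating duplicates.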

Alternatively, something similar to this may work, AWS CDC (Change Data Capture) with Amazon EventBridge and S3: https://aws.amazon.com/blogs/database/capturing-data-changes-in-amazon-aurora-using-aws-lambda/

Hope it helps.

AWS
akad
Answered 23 days ago
  • Thank you for your answer.

    For Asynchronous Database Activity Streams, the error handling doesn't seem to be straightforward (https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/DBActivityStreams.Overview.html#DBActivityStreams.Overview.sync-mode). It's saying we would receive an RDS event; however, I'm unsure how we can use that to reprocess events. Additionally, failures could potentially disrupt the order of events (I forgot to mention this in my question, but the ordering of events is critical for my use case). Please let me know if you are aware of how we can handle this gracefully.

    Regarding the asynchronous Lambda trigger approach, I really don't want to deal with triggers because they create a lot of unmanageable configuration in the database.
