I understand you're trying to set up change data capture (CDC) from an RDS MySQL database to Apache Iceberg tables in S3 using Amazon Data Firehose, and you're encountering issues. Let's review the process and potential areas to check:
- Database Configuration: You've correctly set up the MySQL RDS instance with a parameter group that includes the necessary properties for CDC. Increasing the retention period to 30 hours is also a good step.
- Network Configuration: You've created a target group with the RDS instance's IPv4 address, set up a Network Load Balancer (NLB) with this target group, and created a VPC endpoint service using the NLB. This setup is intended to allow Firehose to connect to your RDS instance securely.
- Firehose Configuration: You've configured Firehose and used AWS Secrets Manager for RDS credentials, which is the recommended approach for secure credential management.
- Connectivity: You mentioned using an EC2 jump host to connect to the RDS instance, which suggests your RDS is in a private subnet.
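To double-check the database side from the jump host, the binary log settings can be compared against what binlog-based CDC generally expects. The sketch below is a minimal example under stated assumptions: `check_binlog_settings` is a hypothetical helper, the expected values (`log_bin=ON`, `binlog_format=ROW`, `binlog_row_image=FULL`) are the usual requirements for row-level CDC on MySQL rather than values confirmed for your stream, and the 30-hour threshold is simply the figure you mentioned. You would collect the inputs yourself by running the two SQL statements with any MySQL client.

```python
# Settings that binlog-based CDC readers generally expect on MySQL.
# Assumption: these are the common requirements, not values read from your setup.
EXPECTED_VARIABLES = {
    "log_bin": "ON",
    "binlog_format": "ROW",
    "binlog_row_image": "FULL",
}

# Statements to run from the jump host to collect the inputs for the check.
SHOW_VARIABLES_SQL = (
    "SHOW VARIABLES WHERE Variable_name IN "
    "('log_bin', 'binlog_format', 'binlog_row_image')"
)
# RDS-specific stored procedure that reports 'binlog retention hours'.
SHOW_RETENTION_SQL = "CALL mysql.rds_show_configuration"


def check_binlog_settings(variables, retention_hours, min_retention_hours=30):
    """Return a list of problems found; an empty list means nothing looked wrong.

    variables: dict of variable name -> value, as returned by SHOW_VARIABLES_SQL.
    retention_hours: the 'binlog retention hours' value, or None if it is NULL.
    min_retention_hours: assumed threshold (30 is the figure from the question).
    """
    problems = []
    for name, expected in EXPECTED_VARIABLES.items():
        actual = str(variables.get(name, "")).upper()
        if actual != expected:
            problems.append(f"{name} is {actual or 'unset'}, expected {expected}")
    if retention_hours is None or retention_hours < min_retention_hours:
        problems.append(
            f"binlog retention hours is {retention_hours}, "
            f"expected at least {min_retention_hours}"
        )
    return problems


if __name__ == "__main__":
    # Example values typed in by hand after running the two statements above.
    observed = {"log_bin": "ON", "binlog_format": "MIXED", "binlog_row_image": "FULL"}
    for problem in check_binlog_settings(observed, retention_hours=30):
        print(problem)
```

If the list comes back empty, the parameter group and retention settings are probably not the cause, and the network path is the next thing to look at.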
Given that you're not seeing any errors but also not getting data in Firehose, here are some areas to investigate:
- Verify that the Firehose stream is configured to use the VPC endpoint service you created, and that the endpoint service name in the stream's source settings matches it exactly.
- Verify that the VPC endpoint service's allowed principals and acceptance settings let Firehose establish a connection.
- Ensure that the IAM role used by Firehose has the permissions it needs to read the secret in Secrets Manager and to write to S3.
- Confirm that the database user stored in Secrets Manager has the privileges required for CDC operations on the RDS instance.
- Verify that the target Iceberg tables in S3 are properly set up and that Firehose has permission to write to them.
- Check the Firehose metrics and logs for any indication of connection issues or data processing problems.
- Ensure that your RDS instance is actually generating CDC events. Make a few test changes (an insert, an update, a delete) on a captured table to trigger them.
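Two of the checks above are easy to script. The following is a rough sketch, not an official procedure: `can_connect` and `missing_privileges` are hypothetical helpers, and `REQUIRED_PRIVILEGES` is an assumption based on what binlog-based CDC readers for MySQL commonly need, so compare it with the Firehose documentation before relying on it. The connectivity check only proves that the port is reachable from wherever you run it (for example an EC2 instance in the same VPC, pointed at the NLB DNS name); it does not prove that Firehose can get through the endpoint service.

```python
import re
import socket

# Assumption: privileges commonly required to read the MySQL binlog for CDC.
REQUIRED_PRIVILEGES = {
    "SELECT",
    "RELOAD",
    "SHOW DATABASES",
    "REPLICATION SLAVE",
    "REPLICATION CLIENT",
}


def can_connect(host, port=3306, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def missing_privileges(grant_rows):
    """Return a sorted list of required privileges absent from SHOW GRANTS rows.

    grant_rows: the strings returned by SHOW GRANTS FOR the CDC user.
    Simplification: grant scope (global vs. per-database) and column-level
    grants are ignored, so treat the result as a hint, not a verdict.
    """
    granted = set()
    for row in grant_rows:
        match = re.match(r"GRANT\s+(.*?)\s+ON\s", row, flags=re.IGNORECASE)
        if not match:
            continue
        granted.update(p.strip().upper() for p in match.group(1).split(","))
    if "ALL PRIVILEGES" in granted:
        return []
    return sorted(REQUIRED_PRIVILEGES - granted)


if __name__ == "__main__":
    # Placeholder host name: replace with your NLB DNS name or RDS endpoint.
    print("reachable:", can_connect("my-nlb.example.internal", 3306))
    # Example row pasted from SHOW GRANTS output; user name is illustrative.
    rows = ["GRANT SELECT, RELOAD, SHOW DATABASES ON *.* TO `cdc_user`@`%`"]
    print("missing:", missing_privileges(rows))
```

If the port is reachable from inside the VPC and no privileges are reported missing, the remaining suspects are mostly on the endpoint service and Firehose side.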
If after checking these areas you still don't see data flowing, you may want to engage AWS support for further troubleshooting, as they can provide more detailed diagnostics on the Firehose service side.
Sources
Replicate changes from databases to Apache Iceberg tables using Amazon Data Firehose (in preview) | AWS News Blog
Replicate database changes to Apache Iceberg Tables with Amazon Data Firehose - Amazon Data Firehose
Amazon Data Firehose supports continuous replication of database changes to Apache Iceberg Tables in Amazon S3 - AWS