I hope this will help:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/backup-for-s3.html
https://aws.amazon.com/blogs/storage/point-in-time-recovery-and-continuous-backup-for-amazon-rds-with-aws-backup/
Hi,
Please have a look at the aws_s3 extension for pg: it was designed for use cases very close to yours.
See https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/postgresql-s3-export.html
It is highly cost-efficient: it runs inside your RDS instance at no additional cost. You only pay for the S3 storage itself (choose the right storage class for your use case to stay cost-optimal).
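For illustration, here is a minimal sketch of such an export call; the table name, bucket, path, and region below are placeholders to adapt to your own schema and account:

```sql
-- Install the extension once per database (placeholder names throughout).
CREATE EXTENSION IF NOT EXISTS aws_s3 CASCADE;

-- Export rows older than 7 days to a CSV object in S3.
SELECT *
FROM aws_s3.query_export_to_s3(
    'SELECT * FROM tenant_events WHERE created_at < now() - interval ''7 days''',
    aws_commons.create_s3_uri('my-archive-bucket', 'tenant_events/export.csv', 'us-east-1'),
    options := 'format csv'
);
```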
If you use this extension within a stored procedure, you don't need any external compute service (no Lambda to run).
Finally, you can schedule it directly within pg by following this blog post: https://aws.amazon.com/blogs/database/schedule-jobs-with-pg_cron-on-your-amazon-rds-for-postgresql-or-amazon-aurora-for-postgresql-databases/
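Putting it together, here is a rough sketch of such a procedure plus its schedule. The procedure name, table, retention window, schedule, and S3 details are illustrative assumptions, not tested code, and pg_cron must first be enabled in your parameter group as described in the blog post above:

```sql
-- pg_cron must be added to shared_preload_libraries via the DB parameter group first.
CREATE EXTENSION IF NOT EXISTS pg_cron;

CREATE OR REPLACE PROCEDURE archive_old_events()
LANGUAGE plpgsql
AS $$
DECLARE
    -- Single cutoff so the export and the delete cover exactly the same rows.
    cutoff timestamptz := now() - interval '7 days';
BEGIN
    -- Export the old rows to S3 (bucket, path, and region are placeholders).
    PERFORM aws_s3.query_export_to_s3(
        format('SELECT * FROM tenant_events WHERE created_at < %L', cutoff),
        aws_commons.create_s3_uri(
            'my-archive-bucket',
            'tenant_events/' || to_char(cutoff, 'YYYY-MM-DD') || '.csv',
            'us-east-1'),
        options := 'format csv'
    );

    -- Then drop the archived rows from pg.
    DELETE FROM tenant_events WHERE created_at < cutoff;
END;
$$;

-- Run the archive job every day at 03:00 UTC.
SELECT cron.schedule('archive-old-events', '0 3 * * *', 'CALL archive_old_events()');
```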
So, by doing this, you'll have a very "compact" (all in pg) and cost-efficient solution that only adds S3 storage to your current costs.
Best,
Didier
Hi Didier,
Thank you very much for your quick and cost-optimal solution.
I hope this works for huge volumes of data transfer as well: we might transfer about 30 GB of data per tenant every 7 days, and with 2,000 tenants in total that means up to 60 TB every 7 days in the worst case.
Actually, we had another latency issue once the data moved to S3: the Glue Crawler takes time to index the S3 data. Hence we would like to use CDC to feed S3 continuously so the Glue Crawler can index it in parallel, and we can then directly delete data older than 7 days from pg.
Thanks, Baji
Hi Mi Sha, I was not looking for a backup solution. Our data is growing very quickly, so we would like to move data older than 7 days to S3, because in about 80% of use cases clients only work with the last 7 days of data. If any tenant asks for data older than 7 days, we retrieve it from S3 with Glue and Athena.
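For reference, the kind of Athena query we run against the Glue-catalogued archive looks roughly like this; the database, table, and column names are placeholders, not our actual schema:

```sql
-- Hypothetical Athena query over the archived data indexed by the Glue Crawler.
SELECT *
FROM archive_db.tenant_events
WHERE tenant_id = 'tenant-123'
  AND created_at >= timestamp '2024-01-01 00:00:00'
ORDER BY created_at;
```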