
Time Taken for DDB export to S3.

0

I was testing AWS Glue's DynamoDB export connector, which internally exports DDB tables to S3 using DynamoDB's Export to S3 feature. I was concerned to see that even the small tables I created for this test (up to 5 entries) sometimes take an hour to export, while at other times they export within 15 minutes. Is there a way to estimate how long an export will take, or a way to optimize it?

asked 8 months ago, 531 views
2 Answers
2
Accepted Answer

There is no way to estimate it, as it depends on too many factors, such as data skew and available resources. What I would say is that if you are OK with consuming read capacity from your tables, it's faster and more cost-effective to read directly from the table rather than use the export functionality through Glue.
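
To make the two approaches concrete, here is a minimal sketch of the connection options each one would pass to Glue's DynamoDB connector. The helper names `ddb_read_options` and `ddb_export_options` are hypothetical, and the table name, ARN, bucket, and prefix are placeholders; the option keys are the ones I understand the Glue DynamoDB connector to accept, so check them against the Glue documentation for your Glue version.

```python
def ddb_read_options(table_name, read_percent=0.5):
    """Options for reading the table directly (this consumes read capacity)."""
    # Assumption: Glue accepts values from 0.1 to 1.5 for this setting.
    if not 0.1 <= read_percent <= 1.5:
        raise ValueError("read_percent should be between 0.1 and 1.5")
    return {
        "dynamodb.input.tableName": table_name,
        "dynamodb.throughput.read.percent": str(read_percent),
    }


def ddb_export_options(table_arn, bucket, prefix):
    """Options for the export connector (uses Export to S3, no read capacity)."""
    return {
        "dynamodb.export": "ddb",
        "dynamodb.tableArn": table_arn,
        "dynamodb.s3.bucket": bucket,
        "dynamodb.s3.prefix": prefix,
    }


# Inside a Glue job, either dict would be passed like this (not run here):
# dyf = glue_context.create_dynamic_frame.from_options(
#     connection_type="dynamodb",
#     connection_options=ddb_read_options("my-test-table"),
# )
```

For a table with a handful of items, the direct read avoids waiting on the asynchronous export entirely, at the cost of a small amount of read capacity.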

AWS
EXPERT
answered 8 months ago
1

When exporting DynamoDB tables to S3, the time taken can vary significantly, even for small tables. Unfortunately, there isn't a precise way to estimate the export time for DynamoDB tables.

The DynamoDB export to S3 feature is an asynchronous process that doesn't consume read capacity units and has no impact on table performance. However, several factors can influence the export duration:

  1. Table partitioning: DynamoDB typically creates approximately one file per partition during export. If you're using on-demand capacity mode, you'll have a minimum of 4 partitions, resulting in at least 4 files on S3, regardless of how small your table is.

  2. AWS service prioritization: Since this is an asynchronous process, AWS may prioritize different operations at different times, which could explain why sometimes the same small table exports in 15 minutes and other times takes an hour.

  3. Background processes: The export process involves several background steps that occur before data is actually written to S3.
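
Because the export is asynchronous, the practical way to get a feel for its duration is to record it over several runs. The following is a rough sketch that polls an export until it leaves the `IN_PROGRESS` state and reports the elapsed time. It assumes a boto3 DynamoDB client (or anything with the same `describe_export` method) is passed in; the function name `wait_for_export`, the polling interval, and the timeout are my own choices, not something prescribed by AWS.

```python
import time


def wait_for_export(client, export_arn, poll_seconds=30,
                    timeout_seconds=7200, sleep=time.sleep):
    """Poll describe_export until the export finishes.

    Returns (status, duration_seconds); duration is None if the
    response does not carry both StartTime and EndTime.
    """
    waited = 0
    while True:
        desc = client.describe_export(ExportArn=export_arn)["ExportDescription"]
        status = desc["ExportStatus"]  # IN_PROGRESS, COMPLETED, or FAILED
        if status != "IN_PROGRESS":
            start, end = desc.get("StartTime"), desc.get("EndTime")
            duration = (end - start).total_seconds() if start and end else None
            return status, duration
        if waited >= timeout_seconds:
            raise TimeoutError(f"export still in progress after {waited}s")
        sleep(poll_seconds)
        waited += poll_seconds


# Example usage with boto3 (not run here; the ARN is a placeholder):
# import boto3
# status, seconds = wait_for_export(boto3.client("dynamodb"), "<export-arn>")
```

Logging the returned duration for a few exports of the same table gives an empirical range, which is about the best estimate available given the variability described above.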

For optimization, there aren't many levers you can pull since this is a managed service. However, understanding your table's partitioning structure might help set expectations. Unfortunately, you won't be able to identify exactly how many partitions you have until you perform the export.
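
Once an export has completed, the number of data files it wrote gives a rough proxy for the partition count. Here is a small sketch that counts them from the export's `manifest-files.json`. It assumes that file contains one JSON object per line with an `itemCount` field, which is my understanding of the export output format; the function name `summarize_manifest` and the sample keys are hypothetical.

```python
import json


def summarize_manifest(manifest_text):
    """Count data files and items listed in a manifest-files.json body."""
    # Assumption: one JSON object per non-empty line, one line per data file.
    entries = [json.loads(line) for line in manifest_text.splitlines() if line.strip()]
    return {
        "data_files": len(entries),
        "items": sum(entry.get("itemCount", 0) for entry in entries),
    }


# Made-up sample content for illustration only.
sample = "\n".join([
    '{"itemCount": 2, "dataFileS3Key": "example-prefix/data/file-a.json.gz"}',
    '{"itemCount": 3, "dataFileS3Key": "example-prefix/data/file-b.json.gz"}',
])
print(summarize_manifest(sample))
```

Since the mapping is only approximately one file per partition, treat the file count as an indication rather than an exact figure.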

The variability you're experiencing (15 minutes to an hour) for small tables is within normal operational parameters for this type of asynchronous AWS service operation.
Sources
How to estimate DynamoDB ExportToS3 cost? | AWS re:Post
DynamoDB data export to Amazon S3: how it works - Amazon DynamoDB

answered 8 months ago
AWS
SUPPORT ENGINEER
reviewed 8 months ago
