AWS DMS Serverless full load OOM (full load + CDC): best practices to reduce failures with cost constraints

Hi everyone, we’re running an AWS DMS Serverless task with migration type Full load + CDC. We previously used DMS on-demand replication instances but hit out-of-memory (OOM) issues, so we moved to DMS Serverless. We’re still seeing frequent OOM errors during the full load phase (CDC is generally more stable).

Constraint: cost optimisation is a priority, so we’ve configured the task with min DCU = 2 and a max DCU based on the sizing guidance/formula for our source type. Even with this, full load OOM happens repeatedly.

What are the best practices to minimise full load OOM frequency in DMS Serverless, especially when trying to keep the baseline DCU low?

3 Answers

Tip from my experience, in addition to the re:Post Agent answer:

To further reduce OOM risks, I recommend a more granular approach based on your specific data structure:

  • Audit CloudWatch Logs: Identify exactly which table is being processed when the OOM occurs. Often, it isn't the total volume of tables, but one specific "wide" table with many VARCHAR(MAX), TEXT, or JSON columns that spikes memory usage.
  • The Isolation Strategy: If you identify a problematic table, remove it from the main replication task. Create a dedicated task just for that table with MaxFullLoadSubTasks set to 1. This ensures DMS focuses all available memory from your DCUs on that single stream without competition.
  • The 2 DCU Reality Check: Keep in mind that a baseline of 2 DCUs provides very limited RAM (about 4 GB, since each DCU corresponds to 2 GB of memory). If your largest table is several hundred GB or has very wide rows, the full load may need more buffer memory than a low-DCU configuration can provide, regardless of tuning. In such cases, temporarily increasing the Max DCU for the Full Load phase is the most cost-effective way to prevent repeated, expensive failures.
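
The isolation strategy above can be sketched as two table-mapping documents plus a settings fragment for the dedicated task. This is only a minimal sketch: the schema and table names (`app`, `wide_events`) are hypothetical, and `selection_rule` is an illustrative helper, not part of any AWS SDK.

```python
import json

# Hypothetical names, for illustration only.
SCHEMA = "app"
PROBLEM_TABLE = "wide_events"


def selection_rule(rule_id, schema, table, action):
    """Build one DMS table-mapping selection rule ("include" or "exclude")."""
    return {
        "rule-type": "selection",
        "rule-id": str(rule_id),
        "rule-name": str(rule_id),
        "object-locator": {"schema-name": schema, "table-name": table},
        "rule-action": action,
    }


# Main task: include every table in the schema, then exclude the wide table.
main_task_mappings = {
    "rules": [
        selection_rule(1, SCHEMA, "%", "include"),
        selection_rule(2, SCHEMA, PROBLEM_TABLE, "exclude"),
    ]
}

# Dedicated task: only the wide table, loaded as a single stream.
dedicated_task_mappings = {
    "rules": [selection_rule(1, SCHEMA, PROBLEM_TABLE, "include")]
}
dedicated_task_settings = {"FullLoadSettings": {"MaxFullLoadSubTasks": 1}}

print(json.dumps(main_task_mappings, indent=2))
print(json.dumps(dedicated_task_settings, indent=2))
```

The two mapping documents are kept complementary on purpose, so the wide table is migrated by exactly one task.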

Quick question to help narrow this down: How large is your single largest table, and does it contain many LOB or JSON columns? Knowing this can help determine if 2 DCUs is a realistic floor for your specific schema.

EXPERT
answered a month ago
EXPERT
reviewed 25 days ago

To minimize full load OOM issues in AWS DMS Serverless while maintaining cost constraints, here are the best practices:

Task Configuration Optimization:

  1. Split Large Migrations: Divide your migration into multiple smaller tasks rather than one large task. This reduces memory pressure on individual tasks and can help prevent OOM errors during full load.

  2. Optimize Table Loading:

  • Set MaxFullLoadSubTasks to control the number of tables loaded in parallel (default is 8, maximum is 49). Start conservatively with lower values like 4-6 when DCU is constrained.
  • Adjust ParallelLoadThreads carefully - the total thread count is approximately ParallelLoadThreads × MaxFullLoadSubTasks. High thread counts consume significant memory.
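  • Tune CommitRate (default 10,000, maximum 50,000), which sets how many records are transferred together per batch. Values of 30,000-50,000 reduce per-batch overhead and speed up the load, but each batch then holds more rows in memory, so raise it gradually on a low-DCU task and watch memory as you do.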
  3. LOB Handling:
  • If your tables don't contain LOB columns, disable LOB support entirely (SupportLobs: false) for significant performance gains and reduced memory usage.
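  • If LOBs are necessary, use limited LOB mode (LimitedSizeLobMode: true, FullLobMode: false) and set an appropriate LobMaxSize (in KB, e.g., 32) to prevent large objects from consuming excessive memory. Note that LOBs larger than LobMaxSize are truncated, so size it against your actual data.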
  4. Delay Index Creation: Set CreatePkAfterFullLoad to true to defer primary key and unique index creation until after the full load completes. This dramatically speeds up inserts and reduces memory pressure during the load phase.

  5. Table Preparation: Use TargetTablePrepMode: DROP_AND_CREATE to let DMS recreate target tables, which works well with deferred index creation.
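
The task-configuration points above could translate into a settings fragment like the one below. This is a sketch under stated assumptions: the target honors ParallelLoadThreads (not every endpoint type does), the values chosen (2 threads, 4 subtasks, a commit rate of 30,000) are illustrative starting points rather than AWS recommendations, and `estimate_threads` is a hypothetical helper that just applies the rule of thumb from the list.

```python
import json


def estimate_threads(parallel_load_threads, max_full_load_subtasks):
    """Approximate total thread count: ParallelLoadThreads x MaxFullLoadSubTasks."""
    return parallel_load_threads * max_full_load_subtasks


task_settings = {
    "TargetMetadata": {
        "SupportLobs": True,         # set to False if no table has LOB columns
        "FullLobMode": False,
        "LimitedSizeLobMode": True,
        "LobMaxSize": 32,            # KB; larger LOBs are truncated
        "ParallelLoadThreads": 2,    # illustrative value
    },
    "FullLoadSettings": {
        "TargetTablePrepMode": "DROP_AND_CREATE",
        "CreatePkAfterFullLoad": True,
        "MaxFullLoadSubTasks": 4,    # below the default of 8
        "CommitRate": 30000,         # illustrative value
    },
}

threads = estimate_threads(
    task_settings["TargetMetadata"]["ParallelLoadThreads"],
    task_settings["FullLoadSettings"]["MaxFullLoadSubTasks"],
)
print(f"approximate load threads: {threads}")
print(json.dumps(task_settings, indent=2))
```

Checking the thread estimate before each run makes it easier to see when a small change to either setting multiplies the memory footprint.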

Memory Management:

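  • Configure MemoryLimitTotal (in MB, default 1024, under ChangeProcessingTuning) appropriately for your DCU allocation. It caps how much memory transactions can occupy before being written to disk; allocate 50-70% of available memory to this setting.
  • Set MemoryKeepTime to control how long changes are kept in memory before being written to disk (the default is 60 seconds). Both settings govern cached and ongoing changes, so they mainly help once changes accumulate during the full load and in CDC.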
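
As a worked example of the 50-70% guideline, the arithmetic can be sketched as follows. The 4 GB figure assumes 2 DCUs at 2 GB each, and `memory_limit_total_mb` is a hypothetical helper; how much of a serverless task's memory is really available to this setting is an assumption you should check against your own metrics.

```python
def memory_limit_total_mb(available_memory_gb, fraction):
    """Return a MemoryLimitTotal value in MB as a fraction of available memory."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return int(available_memory_gb * 1024 * fraction)


AVAILABLE_GB = 4  # assumption: 2 DCUs x 2 GB each

low = memory_limit_total_mb(AVAILABLE_GB, 0.5)
high = memory_limit_total_mb(AVAILABLE_GB, 0.7)

change_processing_tuning = {
    "ChangeProcessingTuning": {
        "MemoryLimitTotal": low,   # start at the conservative end of the range
        "MemoryKeepTime": 60,      # seconds
    }
}

print(f"MemoryLimitTotal range: {low}-{high} MB")
```

Starting at the lower end leaves more headroom for the full load itself, which is where the OOM errors are occurring.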

Monitoring and Adjustment:

  • Monitor CloudWatch metrics closely, particularly FreeableMemory and SwapUsage. If swap usage is high, you're hitting memory limits.
  • Consider temporarily increasing max DCU during the full load phase, then reducing it once CDC begins (which is generally more stable and memory-efficient).

Phased Approach:

Since CDC is more stable for you, consider running separate tasks:

  1. Run a Full Load only task first (set StopTaskCachedChangesApplied: false)
  2. After completion, start a CDC-only task with BatchApplyEnabled: true for better throughput

This approach allows you to allocate resources differently for each phase and can be more cost-effective than maintaining high DCU throughout a combined task.
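
The two-phase split might look like the pair of configurations below. This is a hedged sketch: the field names (ReplicationType, ComputeConfig, MinCapacityUnits, MaxCapacityUnits, ReplicationSettings) follow my understanding of the DMS Serverless replication config and should be verified against the current API reference, and the DCU values are purely illustrative.

```python
import json

# Phase 1: full load only, with a higher DCU ceiling (illustrative values).
full_load_phase = {
    "ReplicationType": "full-load",
    "ComputeConfig": {"MinCapacityUnits": 2, "MaxCapacityUnits": 16},
    "ReplicationSettings": {
        "FullLoadSettings": {"StopTaskCachedChangesApplied": False}
    },
}

# Phase 2: CDC only, with a lower ceiling and batch apply enabled.
cdc_phase = {
    "ReplicationType": "cdc",
    "ComputeConfig": {"MinCapacityUnits": 2, "MaxCapacityUnits": 4},
    "ReplicationSettings": {
        "TargetMetadata": {"BatchApplyEnabled": True}
    },
}

for name, phase in (("full load", full_load_phase), ("cdc", cdc_phase)):
    # The settings document is passed to the API as a JSON string.
    settings_json = json.dumps(phase["ReplicationSettings"])
    print(name, phase["ComputeConfig"], settings_json)
```

A CDC-only task also needs a start point (for example a timestamp or log position captured before the full load began) so that changes made during the load are not skipped; it is left out here because it depends on the source engine.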

Additional Considerations:

  • Ensure your target endpoint can handle the write throughput to prevent backpressure that increases memory usage
  • Review table selection - prioritize smaller tables first or exclude very large tables for separate handling
  • Check if specific tables are causing issues and consider using parallel load settings per table
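
Per-table parallel load is configured with a table-settings rule in the table mappings. The sketch below builds a range-based rule; the schema, table, column, and boundary values are hypothetical, and `parallel_load_rule` is an illustrative helper rather than an AWS-provided function.

```python
import json


def parallel_load_rule(rule_id, schema, table, column, boundaries):
    """Build a table-settings rule that splits one table into key ranges."""
    return {
        "rule-type": "table-settings",
        "rule-id": str(rule_id),
        "rule-name": str(rule_id),
        "object-locator": {"schema-name": schema, "table-name": table},
        "parallel-load": {
            "type": "ranges",
            "columns": [column],
            # Each boundary is the upper bound of one segment.
            "boundaries": [[str(b)] for b in boundaries],
        },
    }


# Hypothetical table split on its numeric primary key into three segments.
rule = parallel_load_rule(2, "app", "wide_events", "id", [1_000_000, 2_000_000])
segments = len(rule["parallel-load"]["boundaries"]) + 1

print(f"segments: {segments}")
print(json.dumps(rule, indent=2))
```

Each segment is loaded as its own stream, so on a memory-constrained task the number of segments should be weighed together with MaxFullLoadSubTasks rather than tuned in isolation.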

By implementing these practices, you should see reduced OOM frequency while maintaining cost efficiency with lower baseline DCU settings.

Sources
Optimize full load and CDC AWS DMS tasks | AWS re:Post
Full-load task settings - AWS Database Migration Service
Optimizing AWS DMS Replication Task Performance: Best Practices & Sample Configurations | AWS re:Post

answered a month ago
EXPERT
reviewed a month ago
