Tip from my experience, in addition to the re:Post Agent answer:
To further reduce OOM risks, I recommend a more granular approach based on your specific data structure:
- Audit CloudWatch Logs: Identify exactly which table is being processed when the OOM occurs. Often, it isn't the total volume of tables, but one specific "wide" table with many VARCHAR(MAX), TEXT, or JSON columns that spikes memory usage.
- The Isolation Strategy: If you identify a problematic table, remove it from the main replication task. Create a dedicated task just for that table with MaxFullLoadSubTasks set to 1. This ensures DMS focuses all available memory from your DCUs on that single stream without competition.
- The 2 DCU Reality Check: Keep in mind that 2 DCUs provide very limited RAM (about 4 GB, since each DCU corresponds to 2 GB of memory). If your largest table is several hundred GB or has extremely high row density, it may exceed the buffer limits of a low-DCU configuration regardless of tuning. In such cases, temporarily increasing the Max DCU for the Full Load phase is often the most cost-effective way to prevent repeated, expensive failures.
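As a rough illustration of the isolation strategy, the sketch below builds two table-mapping documents: one for the main task that excludes the problematic table, and one for a dedicated task that includes only that table. The schema name app and table name audit_events are hypothetical placeholders, and the helper functions are illustrative rather than part of any AWS SDK.

```python
import json


def selection_rule(rule_id, schema, table, action):
    """Build one DMS table-mapping selection rule."""
    return {
        "rule-type": "selection",
        "rule-id": str(rule_id),
        "rule-name": str(rule_id),
        "object-locator": {"schema-name": schema, "table-name": table},
        "rule-action": action,
    }


def split_mappings(schema, wide_table):
    """Return (main_task_mapping, isolated_task_mapping) for one wide table."""
    main = {
        "rules": [
            selection_rule(1, schema, "%", "include"),         # every table in the schema
            selection_rule(2, schema, wide_table, "exclude"),  # except the wide one
        ]
    }
    isolated = {"rules": [selection_rule(1, schema, wide_table, "include")]}
    return main, isolated


# Hypothetical schema and table names, used only for illustration.
main_mapping, isolated_mapping = split_mappings("app", "audit_events")

# Task settings for the dedicated task: load a single table at a time.
isolated_settings = {"FullLoadSettings": {"MaxFullLoadSubTasks": 1}}

print(json.dumps(main_mapping, indent=2))
print(json.dumps(isolated_mapping, indent=2))
```

The two JSON documents would be supplied as the table mappings of the respective tasks, with isolated_settings merged into the dedicated task's settings.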
Quick question to help narrow this down: How large is your single largest table, and does it contain many LOB or JSON columns? Knowing this can help determine if 2 DCUs is a realistic floor for your specific schema.
To minimize full load OOM issues in AWS DMS Serverless while maintaining cost constraints, here are the best practices:
Task Configuration Optimization:
- Split Large Migrations: Divide your migration into multiple smaller tasks rather than one large task. This reduces memory pressure on individual tasks and can help prevent OOM errors during full load.
- Optimize Table Loading:
  - Set MaxFullLoadSubTasks to control the number of tables loaded in parallel (default is 8, maximum is 49). Start conservatively with lower values like 4-6 when DCU is constrained.
  - Adjust ParallelLoadThreads carefully: the total thread count is approximately ParallelLoadThreads × MaxFullLoadSubTasks, and high thread counts consume significant memory.
  - Increase CommitRate to 30,000-50,000 (the default is 10,000 and the maximum is 50,000) to transfer more records per batch. This reduces per-batch overhead, but larger batches also hold more rows in memory at once, so raise it gradually when DCU is constrained.
- LOB Handling:
  - If your tables don't contain LOB columns, disable LOB support entirely (SupportLobs: false) for significant performance gains and reduced memory usage.
  - If LOBs are necessary, use LimitedSizeLobMode instead of FullLobMode and set an appropriate LobMaxSize (e.g., 32 KB) to prevent large objects from consuming excessive memory.
- Delay Index Creation: Set CreatePkAfterFullLoad to true to defer primary key and unique index creation until after the full load completes. This speeds up inserts and reduces memory pressure during the load phase.
- Table Preparation: Use TargetTablePrepMode: DROP_AND_CREATE to let DMS recreate target tables, which works well with deferred index creation.
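To make the settings above concrete, here is a minimal sketch that assembles them into a task-settings document and estimates the resulting thread count. The function names and default values are illustrative assumptions (the defaults simply follow the conservative ranges suggested above), and ParallelLoadThreads only takes effect on target endpoints that support it.

```python
import json


def build_full_load_settings(max_subtasks=4, parallel_load_threads=2,
                             commit_rate=30000, has_lobs=True,
                             lob_max_size_kb=32):
    """Assemble a conservative full-load task-settings document."""
    if not 1 <= max_subtasks <= 49:
        raise ValueError("MaxFullLoadSubTasks must be between 1 and 49")
    if not 1 <= commit_rate <= 50000:
        raise ValueError("CommitRate must be between 1 and 50000")

    target_metadata = {
        "SupportLobs": has_lobs,
        "ParallelLoadThreads": parallel_load_threads,
    }
    if has_lobs:
        # Limited-size LOB mode truncates LOBs larger than LobMaxSize (in KB).
        target_metadata.update({
            "FullLobMode": False,
            "LimitedSizeLobMode": True,
            "LobMaxSize": lob_max_size_kb,
        })

    return {
        "TargetMetadata": target_metadata,
        "FullLoadSettings": {
            "TargetTablePrepMode": "DROP_AND_CREATE",
            "CreatePkAfterFullLoad": True,
            "MaxFullLoadSubTasks": max_subtasks,
            "CommitRate": commit_rate,
        },
    }


def approx_thread_count(settings):
    """Rough estimate: ParallelLoadThreads x MaxFullLoadSubTasks."""
    threads = settings["TargetMetadata"]["ParallelLoadThreads"]
    subtasks = settings["FullLoadSettings"]["MaxFullLoadSubTasks"]
    return threads * subtasks


settings = build_full_load_settings()
no_lob_settings = build_full_load_settings(has_lobs=False)
print(json.dumps(settings, indent=2))
print("approximate threads:", approx_thread_count(settings))
```

Keeping the estimate next to the settings makes it easier to see how quickly the thread count grows when either value is raised.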
Memory Management:
- Configure MemoryLimitTotal appropriately for your DCU allocation. Allocate 50-70% of available memory to this setting.
- Set MemoryKeepTime to control how long changes are kept in memory before writing to disk (typically 60 seconds).
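The 50-70% guideline can be turned into a small calculation. The sketch below assumes 2 GB of RAM per DCU and that MemoryLimitTotal is expressed in MB; the helper name and the 60% default are illustrative choices, not values taken from AWS documentation.

```python
MB_PER_DCU = 2048  # assumption: each DCU corresponds to 2 GB of RAM


def change_processing_tuning(dcus, fraction=0.6, keep_time_seconds=60):
    """Derive MemoryLimitTotal (MB) as a fraction of the DCU memory."""
    if not 0.5 <= fraction <= 0.7:
        raise ValueError("fraction should stay within the 50-70% guideline")
    total_mb = dcus * MB_PER_DCU
    return {
        "ChangeProcessingTuning": {
            "MemoryLimitTotal": int(total_mb * fraction),
            "MemoryKeepTime": keep_time_seconds,
        }
    }


tuning_2_dcu = change_processing_tuning(2, fraction=0.5)
tuning_4_dcu = change_processing_tuning(4, fraction=0.5)
print(tuning_2_dcu)
print(tuning_4_dcu)
```

If the maximum DCU is raised temporarily for the full load, the same calculation can be rerun with the new value so the limit scales with it.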
Monitoring and Adjustment:
- Monitor CloudWatch metrics closely, particularly FreeableMemory and SwapUsage. If swap usage is high, you're hitting memory limits.
- Consider temporarily increasing max DCU during the full load phase, then reducing it once CDC begins (which is generally more stable and memory-efficient).
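A simple check over exported metric samples can flag memory pressure before the task fails. The sketch below works on plain lists of values in MB that you would collect yourself from CloudWatch; the threshold defaults (512 MB free, 128 MB swap) are hypothetical and should be tuned to your DCU size.

```python
def memory_pressure(freeable_mb, swap_mb, min_free_mb=512, max_swap_mb=128):
    """Return warnings for FreeableMemory and SwapUsage samples (in MB)."""
    warnings = []
    # The lowest FreeableMemory sample shows how close the task came to the limit.
    if freeable_mb and min(freeable_mb) < min_free_mb:
        warnings.append("low FreeableMemory")
    # Any sustained swap use suggests the task has outgrown its RAM.
    if swap_mb and max(swap_mb) > max_swap_mb:
        warnings.append("high SwapUsage")
    return warnings


healthy = memory_pressure([1800, 1500, 1200], [0, 0, 4])
stressed = memory_pressure([900, 300, 150], [0, 200, 650])
print(healthy)
print(stressed)
```

A non-empty result during full load is a reasonable trigger for raising the max DCU or lowering MaxFullLoadSubTasks.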
Phased Approach:
Since CDC is more stable for you, consider running separate tasks:
- Run a Full Load only task first (set StopTaskCachedChangesApplied: false)
- After completion, start a CDC-only task with BatchApplyEnabled: true for better throughput
This approach allows you to allocate resources differently for each phase and can be more cost-effective than maintaining high DCU throughout a combined task.
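The two phases can be described as separate configurations with different DCU ceilings. This is a sketch only: the key names ReplicationType, ComputeConfig, MaxCapacityUnits, and ReplicationSettings are chosen to mirror the DMS Serverless replication config parameters, while the phase key, the helper function, and the DCU values are hypothetical.

```python
def phased_plan(full_load_max_dcu, cdc_max_dcu):
    """Describe a full-load phase followed by a cheaper CDC-only phase."""
    return [
        {
            "phase": "full-load",  # hypothetical label, not a DMS field
            "ReplicationType": "full-load",
            "ComputeConfig": {"MaxCapacityUnits": full_load_max_dcu},
            "ReplicationSettings": {
                "FullLoadSettings": {"StopTaskCachedChangesApplied": False},
            },
        },
        {
            "phase": "cdc",  # hypothetical label, not a DMS field
            "ReplicationType": "cdc",
            "ComputeConfig": {"MaxCapacityUnits": cdc_max_dcu},
            "ReplicationSettings": {
                "TargetMetadata": {"BatchApplyEnabled": True},
            },
        },
    ]


# Example values only: a higher ceiling for the load, a lower one for CDC.
plan = phased_plan(full_load_max_dcu=8, cdc_max_dcu=2)
for step in plan:
    print(step["phase"], step["ComputeConfig"])
```

Separating the phases this way keeps the higher DCU ceiling limited to the period when it is actually needed.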
Additional Considerations:
- Ensure your target endpoint can handle the write throughput to prevent backpressure that increases memory usage
- Review table selection - prioritize smaller tables first or exclude very large tables for separate handling
- Check if specific tables are causing issues and consider using parallel load settings per table
By implementing these practices, you should see reduced OOM frequency while maintaining cost efficiency with lower baseline DCU settings.
Sources
Optimize full load and CDC AWS DMS tasks | AWS re:Post
Full-load task settings - AWS Database Migration Service
Optimizing AWS DMS Replication Task Performance: Best Practices & Sample Configurations | AWS re:Post
