2 Answers
3
Below are some practices you can use as a reference:
- Handling Large BSON Documents
  - Split Large Documents: If possible, split large MongoDB documents into smaller sub-documents before migration. This can be done programmatically using MongoDB's aggregation framework or custom scripts.
  - Use LOB Mode: AWS DMS supports migrating large objects (LOBs) through its LOB settings. You can adjust the LobChunkSize and LobMaxSize task settings to handle large BSON documents more effectively.
  - Enable Compression: Compressing the data before migration can help reduce the size of BSON documents.
- Adjusting DMS Settings
  - NestingLevel: If your documents are deeply nested, consider using "NestingLevel": "one" instead of "none" on the MongoDB source endpoint. This can help flatten the structure and reduce size issues.
  - S3 Target Settings: Fine-tune the CdcMinFileSize and CdcMaxBatchInterval endpoint settings to optimize the CDC process for large documents.
  - Output Format: Switching to JSON instead of Parquet might simplify the migration process, as JSON is more flexible with document structures.
- Alternative AWS Services
  - AWS DataSync: For large datasets, AWS DataSync can be a better alternative. It supports high-speed data transfers and can handle large files efficiently.
  - MongoDB Atlas Data Federation: If you're using MongoDB Atlas, you can use its Data Federation feature to export data directly to S3 in Parquet format.
  - AWS Snowball: For extremely large datasets, AWS Snowball provides a physical device for secure data transfer to S3.
- Segmentation for Performance
  - Use segmentation to improve performance during migration. AWS DMS supports auto-segmentation and range segmentation for MongoDB collections, which can help distribute the load and avoid bottlenecks.
- Error Handling and Monitoring
  - Implement robust error handling to retry failed migrations.
  - Use Amazon CloudWatch to monitor DMS tasks and identify potential issues early.
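To make the LOB and segmentation points above more concrete, here is a minimal Python sketch that builds the two JSON documents a DMS task takes: a task-settings fragment and a table mapping with auto-segmentation. The key names (`LobMaxSize`, `LobChunkSize`, `partitions-auto`, `number-of-partitions`) are as I understand them from the DMS documentation, so please check them against the current docs before use. The sizes, the partition count, and the `salesdb` / `orders` names are placeholder assumptions, not recommendations.

```python
import json


def lob_task_settings(lob_max_size_kb=16384, lob_chunk_size_kb=64):
    """Task-settings fragment for limited-size LOB mode (sizes in KB, assumed values)."""
    return {
        "TargetMetadata": {
            "SupportLobs": True,
            "FullLobMode": False,
            "LimitedSizeLobMode": True,
            # Upper bound for a single LOB in limited-size mode.
            "LobMaxSize": lob_max_size_kb,
            # Chunk size; only relevant if you switch to full LOB mode.
            "LobChunkSize": lob_chunk_size_kb,
        }
    }


def auto_segmented_mapping(database, collection, partitions=8):
    """Table mapping that selects one collection and asks for auto-segmentation."""
    locator = {"schema-name": database, "table-name": collection}
    return {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "select-collection",
                "object-locator": locator,
                "rule-action": "include",
            },
            {
                "rule-type": "table-settings",
                "rule-id": "2",
                "rule-name": "segment-collection",
                "object-locator": locator,
                "parallel-load": {
                    "type": "partitions-auto",
                    "number-of-partitions": partitions,
                },
            },
        ]
    }


if __name__ == "__main__":
    # Hypothetical database and collection names, for illustration only.
    print(json.dumps(lob_task_settings(), indent=2))
    print(json.dumps(auto_segmented_mapping("salesdb", "orders"), indent=2))
```

The printed JSON can be pasted into the task settings and table mappings editors in the DMS console, or passed as strings when creating the task through the API.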
0
The failure occurs because MongoDB enforces a strict 16 MB document limit, and AWS DMS reads documents as BSON before converting them to the target format. When a single document exceeds that limit, DMS cannot buffer or chunk it, even if LobChunkSize is increased; those settings only apply to binary or text LOBs stored within otherwise valid documents. To migrate these oversized records, you’ll need to restructure them at the source. The most reliable pattern is to pre-process the collection, using MongoDB’s aggregation pipeline or a script, so that large documents are split into smaller logical fragments before you run DMS.
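A minimal sketch of the script-based splitting described above, assuming the bulk of each oversized document sits in one large array field. The function name, the `parent_id` and `seq` fragment fields, and the use of JSON length as a stand-in for BSON size are all assumptions for illustration; a real script would measure BSON size and read from the actual collection.

```python
import json


def split_document(doc, array_field, max_bytes):
    """Split doc[array_field] into fragments whose JSON size stays under max_bytes.

    Each fragment repeats the small top-level fields and carries parent_id and
    seq so the original document can be reassembled later. A single array item
    that is itself larger than max_bytes is emitted alone and stays oversized.
    """
    base = {k: v for k, v in doc.items() if k not in (array_field, "_id")}

    def make(items, seq):
        return {"parent_id": doc["_id"], "seq": seq, **base, array_field: items}

    def size(fragment):
        # JSON length is only a rough proxy for BSON size (assumption).
        return len(json.dumps(fragment, default=str).encode("utf-8"))

    fragments, chunk = [], []
    for item in doc[array_field]:
        candidate = chunk + [item]
        if chunk and size(make(candidate, len(fragments))) > max_bytes:
            # Adding this item would exceed the budget: close the current fragment.
            fragments.append(make(chunk, len(fragments)))
            chunk = [item]
        else:
            chunk = candidate
    if chunk or not fragments:
        fragments.append(make(chunk, len(fragments)))
    return fragments


if __name__ == "__main__":
    # Hypothetical document shape, for illustration only.
    order = {
        "_id": 1,
        "customer": "acme",
        "events": [{"n": i, "payload": "a" * 50} for i in range(20)],
    }
    for fragment in split_document(order, "events", max_bytes=300):
        print(fragment["seq"], len(fragment["events"]))
```

The fragments could then be written to a staging collection (for example with pymongo's `insert_many`) and that staging collection used as the DMS source instead of the original.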
—Taz
answered 5 months ago
