Skip to content

DocumentDB to Postgres Migration

0

I am migrating a table from DocumentDB to postgres using DMS and i am not sure how to handle JSON document . Please advise .

1 Answer
1

Hello,

Thank you for reaching out with your query. I understand you're migrating from DocumentDB to PostgreSQL using AWS DMS and need guidance on how to handle JSON documents.

AWS DMS provides two migration modes that determine how your JSON data is handled when using DocumentDB as a source: Document Mode and Table Mode.

1. Document Mode (Default - Recommended for preserving JSON structure)

  • Migrates the entire JSON document as-is into a single column named _doc.

  • Creates two columns in PostgreSQL:

    oid_id - the document's ObjectId

    _doc - the full JSON document

For example, consider the following document in a Amazon DocumentDB collection called myCollection.

db.myCollection.find() { "_id" : ObjectId("5a94815f40bd44d1b02bdfe0"), "a" : 1, "b" : 2, "c" : 3 }

Example result in PostgreSQL:

oid_id_doc
5a94815f40bd44d1b02bdfe0{ "a" : 1, "b" : 2, "c" : 3 }

CDC Behavior:

  • If you add a new collection to the source, AWS DMS automatically creates a new target table and replicates the documents.
  • If using CDC with multi-document transactions, extractDocID must be set to true, otherwise the DMS task will fail.

2. Table Mode (Flattens JSON into columns)

  • Transforms each top-level field into a separate column.
  • Nested values are flattened using dot notation (e.g., {"a" : {"b" : {"c": 1}}} becomes column a.b.c).
  • DMS scans documents to determine column structure (default: 1,000 documents, configurable via docsToInvestigate parameter).

Example result in PostgreSQL:

oid_idabc
5a94815f40bd44d1b02bdfe0123

General Behavior

  • When you add a document to an existing collection, the document is replicated. If there are fields that don't exist in the target, those fields aren't replicated.
  • When you update a document, the updated document is replicated. If there are fields that don't exist in the target, those fields aren't replicated.
  • Deleting a document is fully supported.

CDC Specific Behavior:

  • Adding a new collection during CDC won't create a new table on the target.
  • Renaming a collection is not supported during the CDC phase.

Configuration:

  1. You can specify the migration mode when creating your DocumentDB source endpoint:

Console: Use the "Metadata mode" option

  1. CLI/Extra Connection Attributes:

nestingLevel=none // Document mode (default)

nestingLevel=one // Table mode

  1. Additional parameters:

extractDocID=true - Creates a separate _id column as primary key (required for CDC with multi-document transactions)

docsToInvestigate=1000 - Number of documents to scan for schema discovery in Table mode

Recommendations:

➤ Use Document Mode if:

  • Your documents have variable/dynamic schemas
  • You want to preserve the full JSON structure
  • You plan to use PostgreSQL's native JSON querying capabilities

➤ Use Table Mode if:

  • Your documents have a consistent, flat structure
  • You need to query individual fields frequently without JSON operators
  • You want a traditional relational table structure

Post-Migration:

For best practices on querying and optimizing JSON data in PostgreSQL, refer to:

[+] PostgreSQL as a JSON Database: Advanced Patterns and Best Practices (AWS Blog) - https://aws.amazon.com/blogs/database/postgresql-as-a-json-database-advanced-patterns-and-best-practices/

[+] PostgreSQL JSON Functions and Operators (PostgreSQL Documentation) - https://www.postgresql.org/docs/current/functions-json.html

Note: I recommend testing both migration modes in a test environment to determine which option best suits your requirements before proceeding with the full migration.

If you encounter any issues while configuring these options or need further assistance, I encourage you to open a support case with AWS. Please ensure to create the case from the AWS account where your DMS resources reside.

Link to raise case: https://support.console.aws.amazon.com/support/home#/case/create

Thanks!

Sources:

[1] Using Amazon DocumentDB (with MongoDB compatibility) as a source for AWS DMS - https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.DocumentDB.html

answered 5 days ago
AWS
SUPPORT ENGINEER
revised 3 days ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.