Migrating Postgres Databases to S3 Data Lake: Boolean Column Issue

0

We are in the process of migrating several Postgres databases from RDS to an S3 data lake. During this migration, we've encountered a peculiar issue with boolean columns in the Change Data Capture (CDC).

Issue Observed:

  • When using AWS DMS, all boolean columns in the CDC are consistently marked as true, regardless of their actual value in the source database.
  • Initially, DMS migrated all boolean values as varchar(5). To address this, we enabled the "MapBooleanAsBoolean" option on the source endpoint, expecting to see accurate boolean values in the data lake.
  • For one database, this adjustment worked flawlessly. However, for all other databases, every boolean value in the CDC remains as true.

Additional Observations:

  • Disabling the "MapBooleanAsBoolean" option led to the following results:
    • For the database where there wasn't an issue, boolean values appeared as "true" and "false" strings.
    • For databases with the boolean issue, values appeared as "t" and "f" strings.

We can manage these string conversions in our ETL processes or modify the parquet files, but the root cause of this boolean discrepancy remains elusive.

Troubleshooting Done:

  • We've checked configurations of our DMS task, replication instance, and both source and target endpoints; they are identical across all databases.
  • Documentation only states that booleans are converted to strings. Interestingly, it indicates that the "MapBooleanAsBoolean" option is only for Redshift targets. However, this option works (at least for one database) when our target is S3.

Key Question: Is there a specific setting or aspect within Postgres that might trigger this behavior?

Environment Details:

  • DMS replication instance: 3.4.7
  • All databases: Postgres 12.14

Any insights or guidance would be greatly appreciated!

yatmee
demandé il y a 7 mois133 vues
Aucune réponse

Vous n'êtes pas connecté. Se connecter pour publier une réponse.

Une bonne réponse répond clairement à la question, contient des commentaires constructifs et encourage le développement professionnel de la personne qui pose la question.

Instructions pour répondre aux questions