Hello,
AWS Glue and Spark both use the MongoDB Spark Connector when reading from MongoDB. According to the connector's read configuration documentation at https://www.mongodb.com/docs/spark-connector/current/configuration/read/ there are no options for handling corrupted or bad records. A feature request for this was also filed recently with the community: https://jira.mongodb.org/browse/SPARK-327
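As a workaround, you can export the MongoDB collection as JSON or CSV files with a tool such as mongoexport (https://www.mongodb.com/docs/database-tools/mongoexport/), place the files in an S3 location, and read them into Spark DataFrames. Note that mongoexport writes to a local file or stdout, so the output has to be uploaded to S3 as a separate step. Spark's JSON and CSV readers support handling bad records through their parse mode options.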
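To illustrate the bad-record handling, here is a minimal PySpark sketch for a JSON export. It assumes the files are newline-delimited JSON, and the schema fields (`name`, `age`) and the column name `_corrupt_record` are placeholders chosen for this example, not something taken from your collection. `PERMISSIVE` mode keeps malformed lines and stores their raw text in the corrupt-record column instead of failing the job.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # type hints only; pyspark is supplied by the Glue/Spark runtime
    from pyspark.sql import DataFrame, SparkSession

# Hypothetical schema for the exported collection; replace with your own fields.
# The extra string column receives the raw text of any line Spark cannot parse.
EXPORT_SCHEMA = "name STRING, age INT, _corrupt_record STRING"

# Standard options of Spark's JSON reader.
JSON_READ_OPTIONS = {
    "mode": "PERMISSIVE",  # keep malformed rows instead of failing or dropping them
    "columnNameOfCorruptRecord": "_corrupt_record",
}


def read_export(spark: SparkSession, path: str) -> tuple[DataFrame, DataFrame]:
    """Read exported JSON files and split them into (good, bad) DataFrames."""
    df = (
        spark.read.schema(EXPORT_SCHEMA)
        .options(**JSON_READ_OPTIONS)
        .json(path)
        # Caching lets later queries reference only the corrupt-record column,
        # which Spark otherwise rejects when reading raw JSON/CSV files.
        .cache()
    )
    good = df.filter(df["_corrupt_record"].isNull()).drop("_corrupt_record")
    bad = df.filter(df["_corrupt_record"].isNotNull())
    return good, bad
```

You would call it as `good, bad = read_export(spark, "s3://your-bucket/export/")` (the bucket path is a placeholder) and then write `bad` somewhere for inspection. If you would rather discard or fail on bad rows, the same reader accepts `DROPMALFORMED` or `FAILFAST` as the mode.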
You can also convert between a Spark DataFrame and a Glue DynamicFrame easily, as shown in the links below.
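For reference, the conversion is a single call in each direction inside a Glue job. This is a small sketch; the helper names and the frame name `mongo_export` are made up for this example, and `glue_context` is assumed to be the job's existing `GlueContext`.

```python
def to_dynamic_frame(df, glue_context, name="mongo_export"):
    """Wrap a Spark DataFrame as a Glue DynamicFrame."""
    # Imported here because awsglue is only available inside the Glue runtime.
    from awsglue.dynamicframe import DynamicFrame

    return DynamicFrame.fromDF(df, glue_context, name)


def to_data_frame(dynamic_frame):
    """Convert a Glue DynamicFrame back to a Spark DataFrame."""
    return dynamic_frame.toDF()
```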