Cannot connect to MongoDB tables in catalog

Hi all, I'm trying to use AWS Glue to run ETL scripts, extracting data from MongoDB collections stored in AWS DocumentDB, and loading it into PostgreSQL tables in AWS RDS.

For ease of use in the Glue jobs themselves, I created a connection to my MongoDB and added the tables I will read from to the Data catalog. When creating the connection, the AWS console only let me proceed when the connection URL was of the form mongodb://<host>:<port>/<database>.

In the Python script, I tried to create a dynamic frame from the catalog by calling

glueContext.create_dynamic_frame_from_catalog(
    database='database', # Database name as defined in the catalog
    table_name='table_name', # Table name as defined in the catalog
    additional_options={
        'database': 'database', # Database name in MongoDB
        'collection': 'collection', # Collection name in MongoDB
        'ssl': 'true'
    }
)

However, this results in an error:

pyspark.sql.utils.IllegalArgumentException: "requirement failed: Invalid uri: 'mongodb://<username>:<password>@<host>:<port>/<database>/?ssl=true'"

Notice the '/' after <database>, which I believe is the cause of the exception being thrown.
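
To illustrate the suspicion above, here is a minimal sketch that uses Python's standard urllib.parse to pull out the part of such a URI that sits between the host and the options. The host, credentials, and the helper name database_segment are made up for illustration, and this is not the validation the Spark connector itself performs; it only shows that the database segment ends up carrying a trailing slash.

```python
from urllib.parse import urlsplit


def database_segment(uri: str) -> str:
    """Return the text between the host and the '?' options, minus the leading '/'."""
    return urlsplit(uri).path[1:]


# Hypothetical values standing in for the placeholders in the error message.
failing = "mongodb://user:secret@docdb.example.com:27017/mydb/?ssl=true"
expected = "mongodb://user:secret@docdb.example.com:27017/mydb?ssl=true"

print(repr(database_segment(failing)))   # 'mydb/'
print(repr(database_segment(expected)))  # 'mydb'
```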

I tried setting <database> to both the database that contains the collection I want to query and the admin database, to no avail. And, like I said previously, leaving it empty didn't work either. Any idea on what I am doing wrong? How can I connect to my DB?

Asked 2 years ago, 268 views

1 Answer

You could write your connection code in the following manner to fix the issue:

documentdb_write_uri = 'mongodb://yourdocumentdbcluster.amazonaws.com:27017'
write_documentdb_options = {
    "uri": documentdb_write_uri,
    "database": "yourdbname",
    "collection": "yourcollectionname",
    "username": "###",
    "password": "###"
}

See the following reference documentation for further information: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-connect.html#aws-glue-programming-etl-connect-documentdb
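
Since the question is about reading rather than writing, the same kind of options can be passed to a read. Below is a minimal sketch, assuming the job already has a GlueContext. The helper names build_documentdb_read_options and read_collection, and all host, database, and credential values, are placeholders for illustration and not part of the Glue API. The "documentdb" connection type and the create_dynamic_frame.from_options call are the ones described in the linked documentation.

```python
def build_documentdb_read_options(host, database, collection,
                                  username, password, port=27017):
    """Assemble connection options, keeping the database name out of the URI."""
    return {
        "uri": f"mongodb://{host}:{port}",  # host and port only, no path
        "database": database,               # database name in DocumentDB
        "collection": collection,           # collection name in DocumentDB
        "username": username,
        "password": password,
        "ssl": "true",
    }


def read_collection(glue_context, options):
    """Read a collection into a DynamicFrame without going through the catalog."""
    return glue_context.create_dynamic_frame.from_options(
        connection_type="documentdb",
        connection_options=options,
    )


# Placeholder values; replace with your own cluster details.
options = build_documentdb_read_options(
    host="yourdocumentdbcluster.amazonaws.com",
    database="yourdbname",
    collection="yourcollectionname",
    username="###",
    password="###",
)

# Inside a Glue job, with glueContext already created:
# dynamic_frame = read_collection(glueContext, options)
```

Passing the database and collection as separate options means the URI never needs a path component, which sidesteps the malformed '<database>/' segment from the error.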

AWS
Answered 10 months ago
