Cannot connect to MongoDB tables in catalog

Hi all, I'm trying to use AWS Glue to run ETL scripts, extracting data from MongoDB collections stored in AWS DocumentDB, and loading it into PostgreSQL tables in AWS RDS.

For ease of use in the Glue jobs themselves, I created a connection to my MongoDB and added the tables I will read from to the Data catalog. When creating the connection, the AWS console only let me proceed when the connection URL was of the form mongodb://<host>:<port>/<database>.

In the Python script, I tried to create a dynamic frame from the catalog by calling

glueContext.create_dynamic_frame_from_catalog(
    database='database', # Database name as defined in the catalog
    table_name='table_name', # Table name as defined in the catalog
    additional_options={
        'database': 'database', # Database name in MongoDB
        'collection': 'collection', # Collection name in MongoDB
        'ssl': 'true'
    }
)

However, this results in an error:

pyspark.sql.utils.IllegalArgumentException: "requirement failed: Invalid uri: 'mongodb://<username>:<password>@<host>:<port>/<database>/?ssl=true'"

Notice the '/' after <database>, which I believe is the cause of the exception being thrown.
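
To check that reading of the error, the URI can be split with Python's standard library. This is only a sketch: the host, port, credentials, and database name are placeholders, and it shows how the string parses, not how Glue assembles it internally.

```python
from urllib.parse import urlsplit

# Placeholder URI with the same shape as the one in the error message.
uri = "mongodb://user:secret@docdb.example.com:27017/mydb/?ssl=true"

parts = urlsplit(uri)
database_segment = parts.path  # everything between the port and the '?'

print(database_segment)             # /mydb/
print(database_segment.strip("/"))  # mydb
print(parts.query)                  # ssl=true
```

The path component comes back as '/mydb/' rather than '/mydb', so a parser that expects a bare database name there would see the extra '/'.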

I tried setting <database> to both the database that contains the collection I want to query and the admin database, to no avail. And, like I said previously, leaving it empty didn't work either. Any idea on what I am doing wrong? How can I connect to my DB?

Asked 2 years ago, 268 views
1 Answer

You could write your connection code in the following manner to fix the issue:

documentdb_write_uri = 'mongodb://yourdocumentdbcluster.amazonaws.com:27017'
write_documentdb_options = {
    "uri": documentdb_write_uri,
    "database": "yourdbname",
    "collection": "yourcollectionname",
    "username": "###",
    "password": "###"
}
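
For the read side in the question, a minimal sketch along the same lines might look like this. The helper name build_documentdb_options and all host, database, and credential values are hypothetical. The "documentdb" connection type and the option keys follow the Glue connection documentation, but check them against your Glue version. The Glue call itself is left as a comment because it only runs inside a Glue job.

```python
from urllib.parse import urlsplit


def build_documentdb_options(catalog_url, database, collection, username, password):
    """Build Glue connection options from a catalog-style URL (hypothetical helper)."""
    parts = urlsplit(catalog_url)
    # Keep only scheme, host, and port: the database goes in its own key,
    # so no path segment is left on the URI.
    host_only_uri = f"{parts.scheme}://{parts.hostname}:{parts.port}"
    return {
        "uri": host_only_uri,
        "database": database,
        "collection": collection,
        "username": username,
        "password": password,
        "ssl": "true",
    }


options = build_documentdb_options(
    "mongodb://docdb.example.com:27017/mydb",  # placeholder host and database
    database="mydb",
    collection="mycollection",
    username="user",
    password="secret",
)
print(options["uri"])

# Inside a Glue job, where glueContext is available, the options would be passed as:
# frame = glueContext.create_dynamic_frame.from_options(
#     connection_type="documentdb",
#     connection_options=options,
# )
```

Passing the database and collection as separate keys keeps the URI free of a path, which avoids the '/<database>/' shape seen in the error.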

See the following reference for further information: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-connect.html#aws-glue-programming-etl-connect-documentdb

AWS
Answered 10 months ago
