Cannot connect to MongoDB tables in catalog

Hi all, I'm trying to use AWS Glue to run ETL scripts, extracting data from MongoDB collections stored in AWS DocumentDB, and loading it into PostgreSQL tables in AWS RDS.

For ease of use in the Glue jobs themselves, I created a connection to my MongoDB and added the tables I will read from to the Data catalog. When creating the connection, the AWS console only let me proceed when the connection URL was of the form mongodb://<host>:<port>/<database>.

In the Python script, I tried to create a DynamicFrame from the catalog by calling

glueContext.create_dynamic_frame_from_catalog(
    database='database', # Database name as defined in the catalog
    table_name='table_name', # Table name as defined in the catalog
    additional_options={
        'database': 'database', # Database name in MongoDB
        'collection': 'collection', # Collection name in MongoDB
        'ssl': 'true'
    }
)

However, this results in an error:

pyspark.sql.utils.IllegalArgumentException: "requirement failed: Invalid uri: 'mongodb://<username>:<password>@<host>:<port>/<database>/?ssl=true'"

Notice the '/' after <database>, which I believe is the cause of the exception being thrown.
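To illustrate the suspicion above, here is a minimal sketch that uses only Python's standard library to split a URI of the shape shown in the error. It is not the MongoDB driver's actual validation logic, only a stand-in for inspecting the string; the helper names and the example host are hypothetical.

```python
from urllib.parse import urlsplit


def database_segment(uri: str) -> str:
    """Return the text between the '/' that follows host:port and the '?'."""
    # urlsplit puts "/<database>/" in .path; drop the leading slash only.
    return urlsplit(uri).path[1:]


def has_stray_slash(uri: str) -> bool:
    """True if the database segment itself still contains a '/'."""
    return "/" in database_segment(uri)


# Hypothetical URIs mirroring the shape in the error message.
generated = "mongodb://user:secret@docdb.example.com:27017/mydb/?ssl=true"
expected = "mongodb://user:secret@docdb.example.com:27017/mydb?ssl=true"

print(database_segment(generated))  # mydb/
print(has_stray_slash(generated))   # True
print(has_stray_slash(expected))    # False
```

If the connector treats everything before the '?' as the database name, the trailing '/' would end up inside that name, which would be consistent with the "Invalid uri" failure.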

I tried setting <database> to both the database that contains the collection I want to query and the admin database, to no avail. And, like I said previously, leaving it empty didn't work either. Any idea on what I am doing wrong? How can I connect to my DB?

Asked 2 years ago, 268 views
1 answer

You could write your connection code in the following manner to fix the issue. Note that the URI contains only the host and port; the database and collection are passed as separate options:

documentdb_write_uri = 'mongodb://yourdocumentdbcluster.amazonaws.com:27017'
write_documentdb_options = {
    "uri": documentdb_write_uri,
    "database": "yourdbname",
    "collection": "yourcollectionname",
    "username": "###",
    "password": "###"
}
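Since the question is about reading rather than writing, here is a minimal sketch of how the same option shape might be used on the read side. It assumes the `documentdb` connection type of `glueContext.create_dynamic_frame.from_options`, as described in the AWS Glue connection documentation; the helper names `build_documentdb_options` and `read_collection` are hypothetical, and the `ssl` option is carried over from the question.

```python
def build_documentdb_options(host, database, collection,
                             username, password, port=27017, ssl=True):
    """Build connection options with a URI that holds only host and port."""
    options = {
        "uri": f"mongodb://{host}:{port}",  # no database or query string here
        "database": database,
        "collection": collection,
        "username": username,
        "password": password,
    }
    if ssl:
        options["ssl"] = "true"  # same string value used in the question
    return options


def read_collection(glue_context, options):
    """Read a collection into a DynamicFrame; glue_context is a GlueContext."""
    return glue_context.create_dynamic_frame.from_options(
        connection_type="documentdb",
        connection_options=options,
    )
```

Inside a Glue job this could be called as `read_collection(glueContext, build_documentdb_options(...))`. Because the database is passed as its own option, the URI never needs a path component, which sidesteps the trailing '/' seen in the error.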

For further information, see the reference documentation: https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-connect.html#aws-glue-programming-etl-connect-documentdb

AWS
Answered 10 months ago
