Trying to create a CSV file in S3 using Glue with MongoDB as the data source

I have installed a MongoDB server on a t2.micro instance. I was able to connect to it from MongoDB Compass without an SSH tunnel, using only authentication and the public IP.

I then created a crawler and ran it against the source. The crawler successfully created a table, and I can see the names of all the fields.

Now I am trying to create a Glue job, but I keep getting this error: An error occurred while calling o96.getDynamicFrame. scala.collection.immutable.HashMap$HashTrieMap cannot be cast to java.lang.String

I can successfully run another Glue job on sample JSON data sitting in S3. Job run ID: jr_19c00d6ff707bd8af110e007a207d9d92d0f64e41dacffa98250398b57dbf30b

I have been stuck on this error for two days now. Any help would be highly appreciated.

Asked 2 years ago · 661 views
1 Answer

Could you try adjusting the additional_options in your code? I am assuming it looks like the snippet below. The error suggests that a String was expected but you passed, say, a Boolean, a list, or another incompatible data type. If that does not help, please share your code snippet.

source_df = glue_context.create_dynamic_frame_from_catalog(
    database=catalogDB,
    table_name=catalogTable,
    additional_options={
        "database": "database_name",
        "collection": "collection_name",
    },
)
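
One way to look for the type mismatch described above is to scan the options dictionary for values that are not strings before handing it to Glue. The sketch below is plain Python and does not call Glue; the helper name `find_non_string_options` and the sample values are hypothetical, and it assumes (as the error message hints) that a nested dictionary or a boolean is the kind of value that cannot be cast to a String.

```python
def find_non_string_options(options):
    """Return {key: type name} for every option whose value is not a str."""
    return {
        key: type(value).__name__
        for key, value in options.items()
        if not isinstance(value, str)
    }


# Hypothetical options for illustration only: the nested dict and the
# boolean stand in for the "incompatible data types" the answer mentions.
suspect_options = {
    "database": "database_name",
    "collection": {"name": "collection_name"},
    "ssl": True,
}

bad = find_non_string_options(suspect_options)
print(bad)  # {'collection': 'dict', 'ssl': 'bool'}
```

An empty result means every value is already a string, in which case the cause is probably elsewhere (for example in the catalog table or connection properties).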
AWS
Answered 2 years ago
  • new error: An error occurred while calling o93.getDynamicFrame. Timed out after 30000 ms while waiting to connect. Client view of cluster state is {type=UNKNOWN, servers=[{address=172.31.17.170:27017, type=UNKNOWN, state=CONNECTING, exception={com.mongodb.MongoSocketOpenException: Exception opening socket}, caused by {java.net.SocketTimeoutException: connect timed out}}]

    Code:

    DataCatalogtable_node1 = glueContext.create_dynamic_frame.from_catalog(
        database="mongo",
        table_name="qainnovate_test",
        transformation_ctx="DataCatalogtable_node1",
        additional_options={"database": "qainnovate", "collection": "test"},
    )

    I think Glue is a very bad choice if you have to connect to a service that is not provided by AWS. It does not even have proper documentation with code snippets like other regular Python libraries.
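
The timeout in the comment above is reported against the private address 172.31.17.170:27017, whereas the question describes connecting from Compass over the public IP. A minimal sketch for checking whether that address and port accept TCP connections at all is shown here, using only the Python standard library. The helper name `can_connect` is hypothetical, and the check is only meaningful if it is run from a host on the same network path the Glue job uses, which is an assumption about the setup.

```python
import socket


def can_connect(host, port, timeout_seconds=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


# Example call (address and port taken from the error message above):
# can_connect("172.31.17.170", 27017)
```

A `False` result here points at network reachability (security groups, subnets, or the MongoDB bind address) rather than at the Glue job options, though it does not say which of those is responsible.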
