AWS Glue: Crawler worked with connection, but using it in a Glue job results in "while calling o145.pyWriteDynamicFrame. The connection attempt failed."


When I created a crawler to crawl an RDS (Postgres) database, it was able to connect and crawl the one table I specified. But when I created a job using the node type "AWS Glue Data Catalog table with PostgreSQL as the data target" and pointed it to the same database and table, it won't connect to the target. It gives me "An error occurred while calling o145.pyWriteDynamicFrame. The connection attempt failed." I've checked the security group and subnet of the RDS instance and the connection in Glue. What else should I be checking?

Asked 1 year ago · 507 views
2 Answers

Hi,

I see that you are receiving the following error while trying to connect to RDS as the target from your Glue job:

"An error occurred while calling o145.pyWriteDynamicFrame. The connection attempt failed."

This error is commonly caused by the subnet the job runs in being unable to reach the RDS instance, or by the RDS instance's security group not allowing access from the security group used by the Glue job. As a first reference, we have a step-by-step guide on setting up the environment for access to RDS data stores, which covers the configurations needed here - https://docs.aws.amazon.com/glue/latest/dg/setup-vpc-for-glue-access.html
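One requirement from the guide above is a self-referencing inbound rule on the security group attached to the Glue connection (all TCP ports, with the source set to the security group itself). The sketch below checks for such a rule given rule dictionaries in the same shape that boto3's `ec2.describe_security_groups()` returns; the security group IDs and sample rules are hypothetical, and in practice you would feed in the real `IpPermissions` list fetched via boto3.

```python
# Sketch: does this security group have the self-referencing inbound rule
# AWS Glue needs (all TCP ports, source = the group itself)?
# Rule dicts mirror boto3's describe_security_groups() output shape;
# the sample data below is hypothetical.

def has_glue_self_reference(sg_id, ip_permissions):
    """Return True if an inbound rule allows all TCP ports from sg_id itself."""
    for rule in ip_permissions:
        proto = rule.get("IpProtocol")
        all_traffic = proto == "-1"  # "-1" means all protocols/ports
        all_tcp = (
            proto == "tcp"
            and rule.get("FromPort") == 0
            and rule.get("ToPort") == 65535
        )
        if not (all_traffic or all_tcp):
            continue
        # A self-referencing rule names the group itself as the source.
        for pair in rule.get("UserIdGroupPairs", []):
            if pair.get("GroupId") == sg_id:
                return True
    return False


# Hypothetical rules: the first satisfies Glue's requirement,
# the second only opens the Postgres port and does not.
rules_ok = [{"IpProtocol": "tcp", "FromPort": 0, "ToPort": 65535,
             "UserIdGroupPairs": [{"GroupId": "sg-0abc"}]}]
rules_bad = [{"IpProtocol": "tcp", "FromPort": 5432, "ToPort": 5432,
              "UserIdGroupPairs": [{"GroupId": "sg-0abc"}]}]

print(has_glue_self_reference("sg-0abc", rules_ok))   # True
print(has_glue_self_reference("sg-0abc", rules_bad))  # False
```

Note that a rule opening only port 5432 is enough for the crawler-to-RDS hop in some setups, but Glue's documented requirement for the connection's security group is the all-TCP self-reference, which is an easy detail to miss.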

Since you mentioned your Glue crawler was working with RDS using the same Glue connection, I would also ask you to check the data source side of the Glue job, in case you used a different connection for reading the Data Catalog table. Note that AWS Glue supports one connection per job: if you specify more than one connection in a job, AWS Glue uses only the first. If your job requires access to more than one virtual private cloud (VPC), you have to create a dedicated VPC for the Glue connection and then configure a peering connection with the VPCs where you have your data sources. Please check the links below for more details:

https://aws.amazon.com/premiumsupport/knowledge-center/connection-timeout-glue-redshift-rds/

https://aws.amazon.com/blogs/big-data/connecting-to-and-running-etl-jobs-across-multiple-vpcs-using-a-dedicated-aws-glue-vpc/

If the issue still persists, please open a support case with AWS, providing the connection details and the code snippet used - https://docs.aws.amazon.com/awssupport/latest/user/case-management.html#creating-a-support-case

Thank you.

AWS
Support Engineer
Answered 1 year ago

Following up in case anyone else is wondering about this: what I actually did was start with a new canvas, fill in the target first, and work backwards, adding the source last. This fixed the issue and the run was successful. Very quirky; I don't know if AWS is aware of this issue.

Answered 1 year ago
