AWS Glue: Crawler worked with the connection, but using it in a Glue job results in "while calling o145.pyWriteDynamicFrame. The connection attempt failed."


When I created a crawler to crawl an RDS (PostgreSQL) instance, it was able to connect and crawl the one table I specified. But when I created a job using the node type "AWS Glue Data Catalog table with PostgreSQL as the data target" and pointed it at the same database and table, it failed to connect to the target, giving me "An error occurred while calling o145.pyWriteDynamicFrame. The connection attempt failed." I've checked the security group and subnet of the RDS instance and the connection in Glue. What else should I be checking?

Asked a year ago · 507 views
2 answers

Hi,

I see that you are receiving the following error while trying to connect to RDS as a target from a Glue job:

"An error occurred while calling o145.pyWriteDynamicFrame. The connection attempt failed."

This error is commonly caused by the subnet the job runs in being unable to reach the RDS instance, or by the RDS instance's security group not allowing access from the security group used by the Glue job. As a first reference, we have a step-by-step guide on setting up the environment for access to RDS data stores, which covers the configuration needed here: https://docs.aws.amazon.com/glue/latest/dg/setup-vpc-for-glue-access.html
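One requirement the guide above calls out is a self-referencing inbound rule on the security group attached to the Glue connection. As a purely illustrative sketch (the helper function and sample data below are hypothetical, not an AWS API), this is the kind of check you could run against the `IpPermissions` list returned by boto3's `ec2.describe_security_groups`:

```python
# Illustrative helper (not an AWS API): Glue needs the connection's security
# group to have an inbound rule that allows traffic from the group itself.
def has_self_referencing_rule(ingress_rules, group_id):
    """Return True if any ingress rule allows traffic from group_id itself."""
    for rule in ingress_rules:
        for pair in rule.get("UserIdGroupPairs", []):
            if pair.get("GroupId") == group_id:
                return True
    return False

# Sample data shaped like ec2.describe_security_groups()
# -> SecurityGroups[n]["IpPermissions"]; the group ID is made up.
rules = [
    {"IpProtocol": "-1",
     "UserIdGroupPairs": [{"GroupId": "sg-0123456789abcdef0"}]},
]
print(has_self_referencing_rule(rules, "sg-0123456789abcdef0"))  # True
```

If this check comes back False for the security group on your Glue connection, adding a self-referencing all-TCP inbound rule is the usual fix.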

Since you mentioned your Glue crawler worked with RDS using the same Glue connection, I would also ask you to check the data source of the Glue job, in case you used a different connection to read the Data Catalog table. Note that AWS Glue supports one connection per job: if you specify more than one connection, AWS Glue uses only the first. If your job requires access to more than one virtual private cloud (VPC), you have to create a dedicated VPC for the Glue connection and then configure a peering connection with the VPCs where your data sources live. Please see the links below for more details.

https://aws.amazon.com/premiumsupport/knowledge-center/connection-timeout-glue-redshift-rds/

https://aws.amazon.com/blogs/big-data/connecting-to-and-running-etl-jobs-across-multiple-vpcs-using-a-dedicated-aws-glue-vpc/
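To make the "first connection only" behavior concrete, here is a minimal plain-Python sketch (the function name is hypothetical; this only mimics the documented behavior of a Glue job's Connections list, it is not Glue code):

```python
# Illustrative only: when a Glue job lists several connections, Glue uses
# just the first one; the others are silently ignored, so only the first
# connection's VPC, subnet, and security groups apply to the job.
def effective_connection(job_connections):
    """Return the connection Glue would actually use, or None if empty."""
    return job_connections[0] if job_connections else None

print(effective_connection(["rds-postgres-conn", "redshift-conn"]))
# rds-postgres-conn -- the second connection contributes nothing
```

This is why a job that reads from one VPC and writes to another fails unless both data stores are reachable from the first connection's network, e.g. via the dedicated-VPC-plus-peering setup described in the blog post above.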

If the issue still persists, please open a support case with AWS, providing the connection details and the code snippet used: https://docs.aws.amazon.com/awssupport/latest/user/case-management.html#creating-a-support-case

Thank you.

AWS
Support Engineer
Answered a year ago

Following up in case people are wondering about this: what I actually did was start with a new canvas, fill in the target first (working backwards), and add my source last. This fixed the issue and the run succeeded. Very quirky; I don't know if AWS is aware of this issue.

Answered a year ago
