EMR Serverless Spark jobs connection with PostgreSQL


Hi, I want to run a job on EMR Serverless that reads data from and writes data to PostgreSQL. I downloaded the driver jar file, pushed it to S3, and set "spark.jars" in the Spark properties in the management console. However, the job is still failing.

Thank you, Muthu

  • More details are needed, including the error message when the job fails. That said, if you are connecting to Postgres, make sure your Serverless application is created in a VPC and that the security groups have access to the database. Reachability Analyzer can be used to debug network connectivity issues.

  • Hi Muthu,

    Can you please share what error you are getting, and a snippet of your code if possible, so we can see how you are trying to connect?
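The VPC advice in the first comment can be sketched with boto3's EMR Serverless client. This is a minimal sketch, not the poster's actual setup: the subnet and security group IDs, application name, and release label below are placeholders.

```python
# Sketch: attach a VPC to an EMR Serverless application via boto3, as the
# first comment suggests. All IDs below are hypothetical placeholders.
def network_configuration(subnet_ids, security_group_ids):
    """Build the networkConfiguration argument for create_application."""
    return {
        "subnetIds": list(subnet_ids),
        "securityGroupIds": list(security_group_ids),
    }

net_cfg = network_configuration(["subnet-0abc"], ["sg-0def"])

# With AWS credentials configured, the application would be created like
# this (not executed here; name and release label are assumptions):
# import boto3
# emr = boto3.client("emr-serverless")
# emr.create_application(
#     name="pg-jobs", releaseLabel="emr-6.9.0", type="SPARK",
#     networkConfiguration=net_cfg,
# )
```

The security groups referenced here must also be allowed inbound on the database's port (5432 by default for PostgreSQL) in the RDS instance's security group.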

muthu
Asked 1 year ago · Viewed 1058 times
1 Answer
Accepted Answer

Thank you all for taking the time to reply. I solved the issue by following Dacort's comment and setting up my Serverless application inside a VPC whose security groups have access to the database. This is my code snippet:

sample_data = spark.read.format("jdbc").options(
    url='jdbc:postgresql://<sample-name>.<region-name>.rds.amazonaws.com/dev',
    dbtable='public."<sample-name>"',
    user='<sample-user>',
    password='<sample-pass>',
    driver='org.postgresql.Driver').load()

I was getting this error:

Caused by: java.net.SocketTimeoutException: connect timed out : org.postgresql.util.PSQLException: The connection attempt failed.
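For completeness, since the question also mentions writing to PostgreSQL: a write-back uses the same JDBC options as the read snippet above. This hypothetical helper just assembles that option dict; the host, table, and credentials are the same placeholders as in the answer, not real values.

```python
# Hypothetical helper mirroring the read snippet in the accepted answer,
# showing that a write uses the same JDBC options. Placeholders throughout.
def jdbc_options(host, database, table, user, password):
    """Assemble the option dict passed to DataFrameReader/Writer .options()."""
    return {
        "url": f"jdbc:postgresql://{host}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "org.postgresql.Driver",
    }

opts = jdbc_options("<sample-name>.<region-name>.rds.amazonaws.com", "dev",
                    'public."<sample-name>"', "<sample-user>", "<sample-pass>")

# With an active SparkSession, network access to the database, and the
# driver jar on the classpath, the write would look like (not executed here):
# df.write.format("jdbc").options(**opts).mode("append").save()
```

Passing the options as a dict keeps the read and write paths consistent, so a connectivity fix (like the VPC change above) only has to be verified once.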
muthu
Answered 1 year ago
