EMR Serverless Spark job connection to PostgreSQL


Hi, I want to run a job on EMR Serverless that reads and writes data from PostgreSQL. I downloaded the PostgreSQL JDBC driver jar, uploaded it to S3, and set "spark.jars" in the Spark properties in the management console. However, the job is still failing.

Thank you, Muthu
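
For context, a `spark.jars`-style setting can also be passed at submission time through the job's Spark submit parameters. A minimal sketch using boto3 (the application ID, role ARN, S3 paths, and driver version below are all placeholders, not values from the original post):

```python
# Sketch: starting an EMR Serverless Spark job whose --jars points at a
# PostgreSQL JDBC driver uploaded to S3. All IDs, ARNs, and S3 paths are
# hypothetical placeholders.
job_driver = {
    "sparkSubmit": {
        "entryPoint": "s3://<your-bucket>/scripts/etl_job.py",
        "sparkSubmitParameters": "--jars s3://<your-bucket>/jars/postgresql-42.6.0.jar",
    }
}

def start_job(application_id: str, execution_role_arn: str):
    """Start the job run; requires AWS credentials and boto3 installed."""
    import boto3  # imported lazily so the sketch can be read without boto3
    client = boto3.client("emr-serverless")
    return client.start_job_run(
        applicationId=application_id,
        executionRoleArn=execution_role_arn,
        jobDriver=job_driver,
    )
```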

  • More details are needed, including the error message when the job fails. That said, if you are connecting to PostgreSQL, make sure your Serverless application is created in a VPC and that the security groups have access to the database. The VPC Reachability Analyzer can be used to debug network connectivity issues.

  • Hi Muthu,

    Can you please share what error you are getting and, if possible, a snippet of your code so we can see how you are trying to connect?
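
The VPC advice in the first comment can be applied when the application is created. A sketch with boto3 (the subnet IDs, security-group IDs, and release label are placeholders, not values from the thread):

```python
# Sketch: creating an EMR Serverless application inside a VPC so it can reach
# an RDS PostgreSQL instance. Subnet and security-group IDs are placeholders;
# the chosen security group must be allowed inbound on the database's SG.
network_configuration = {
    "subnetIds": ["subnet-xxxxxxxx"],
    "securityGroupIds": ["sg-xxxxxxxx"],
}

def create_application(name: str):
    """Create the application; requires AWS credentials and boto3 installed."""
    import boto3  # imported lazily so the sketch can be read without boto3
    client = boto3.client("emr-serverless")
    return client.create_application(
        name=name,
        releaseLabel="emr-6.9.0",  # example release label
        type="SPARK",
        networkConfiguration=network_configuration,
    )
```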

muthu
asked a year ago · 996 views
1 Answer
Accepted Answer

Thank you all for taking the time to reply. I solved the issue by following Dacort's comment and setting up my Serverless application inside a VPC whose security groups have access to the database. This is my code snippet:

sample_data = spark.read.format("jdbc").options(
    url='jdbc:postgresql://<sample-name>.<region-name>.rds.amazonaws.com/dev',
    dbtable='public."<sample-name>"',
    user='<sample-user>',
    password='<sample-pass>',
    driver='org.postgresql.Driver').load()

Before the fix, I was getting this error:

Caused by: java.net.SocketTimeoutException: connect timed out : org.postgresql.util.PSQLException: The connection attempt failed.
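
The original question also mentions writing data back to PostgreSQL. A minimal write sketch mirroring the accepted read snippet (all connection values are the same kind of placeholders used above, and the append mode is an assumption):

```python
# Sketch: writing a DataFrame back to PostgreSQL over JDBC, mirroring the
# read options in the accepted answer. Host, table, and credentials are
# placeholders.
jdbc_options = {
    "url": "jdbc:postgresql://<sample-name>.<region-name>.rds.amazonaws.com/dev",
    "dbtable": 'public."<sample-name>"',
    "user": "<sample-user>",
    "password": "<sample-pass>",
    "driver": "org.postgresql.Driver",
}

def write_back(df):
    """Append df to the target table; requires a live SparkSession and a
    reachable database, so this is illustrative only."""
    df.write.format("jdbc").options(**jdbc_options).mode("append").save()
```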
muthu
answered a year ago
