AWS Glue Postgres JDBC Data Source Failing with NullPointerException

I am writing a Spark job in Java to run on AWS Glue. The job connects to a Postgres database using the glueContext.getSource() method, and it fails with the following NullPointerException:

2023-04-20 14:57:01,183 ERROR [main] glue.ProcessLauncher (Logging.scala:logError(94)): Exception in User Class
java.lang.NullPointerException
	at com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$apply$6(JDBCUtils.scala:914)
	at scala.collection.MapLike$MappedValues.$anonfun$foreach$3(MapLike.scala:252)
	at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:788)
	at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:230)
	at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:461)
	at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:787)
	at scala.collection.MapLike$MappedValues.foreach(MapLike.scala:252)
	at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:58)
	at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:49)
	at scala.collection.mutable.MapBuilder.$plus$plus$eq(MapBuilder.scala:25)
	at scala.collection.immutable.DefaultMap.$plus(DefaultMap.scala:40)
	at scala.collection.immutable.DefaultMap.$plus$(DefaultMap.scala:38)
	at scala.collection.immutable.MapLike$$anon$2.$plus(MapLike.scala:101)
	at scala.collection.immutable.MapLike.$anonfun$$plus$plus$1(MapLike.scala:87)
	at scala.collection.TraversableOnce.$anonfun$foldLeft$1(TraversableOnce.scala:156)
	at scala.collection.TraversableOnce.$anonfun$foldLeft$1$adapted(TraversableOnce.scala:156)
	at scala.collection.immutable.Map$Map3.foreach(Map.scala:192)
	at scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:156)
	at scala.collection.TraversableOnce.foldLeft$(TraversableOnce.scala:154)
	at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104)
	at scala.collection.TraversableOnce.$div$colon(TraversableOnce.scala:150)
	at scala.collection.TraversableOnce.$div$colon$(TraversableOnce.scala:150)
	at scala.collection.AbstractTraversable.$div$colon(Traversable.scala:104)
	at scala.collection.immutable.MapLike.$plus$plus(MapLike.scala:87)
	at scala.collection.immutable.MapLike.$plus$plus$(MapLike.scala:86)
	at scala.collection.immutable.MapLike$$anon$2.$plus$plus(MapLike.scala:101)
	at com.amazonaws.services.glue.util.JDBCWrapper$.connectionProperties(JDBCUtils.scala:956)
	at com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties$lzycompute(JDBCUtils.scala:739)
	at com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties(JDBCUtils.scala:739)
	at com.amazonaws.services.glue.util.JDBCWrapper.tableDF(JDBCUtils.scala:865)
	at com.amazonaws.services.glue.util.NoCondition$.tableDF(JDBCUtils.scala:87)
	at com.amazonaws.services.glue.util.NoJDBCPartitioner$.tableDF(JDBCUtils.scala:173)
	at com.amazonaws.services.glue.JDBCDataSource.getDynamicFrame(DataSource.scala:1088)
	at com.amazonaws.services.glue.DataSource.getDynamicFrame(DataSource.scala:101)
	at com.amazonaws.services.glue.DataSource.getDynamicFrame$(DataSource.scala:101)
	at com.amazonaws.services.glue.AbstractSparkSQLDataSource.getDynamicFrame(DataSource.scala:725)
	at com.amazonaws.services.glue.DataSource.getDataFrame(DataSource.scala:118)
	at com.amazonaws.services.glue.DataSource.getDataFrame$(DataSource.scala:118)
	at com.amazonaws.services.glue.AbstractSparkSQLDataSource.getDataFrame(DataSource.scala:725)

My code for connecting and building a Data Frame is as follows:

scala.collection.mutable.Map<String, String> optionsMap = new scala.collection.mutable.HashMap<>();

optionsMap.put("url", "jdbc:postgresql://[hostname]:5432/postgres");
optionsMap.put("dbtable", "public.test");
optionsMap.put("user", "postgres");
optionsMap.put("password", "Passw0rd!123");

JsonOptions jsonOptions = new JsonOptions(optionsMap);

DataSource source = super.getGlueContext().getSource(
		getConfiguration().getFormat().toLowerCase(),
		jsonOptions,
		"",
		"");

I have confirmed that the job's IAM role has sufficient permissions to read from RDS. Does anyone have any suggestions?

asked a year ago · 380 views
2 Answers
Accepted Answer

The cause turned out to be null option values. The user and password options were pulled from a separate class that was not parsing them correctly, so it returned nulls. Those nulls triggered the NullPointerException deep in the Glue code.

So if you get this NPE, make sure that all of the option inputs to the JsonOptions map are non-null.
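
A check like this can run before the options are handed to JsonOptions. The following is only a minimal sketch in plain Java: the class name OptionValidator and its methods are hypothetical helpers, not part of the Glue API, and it assumes the options are first collected in a java.util.Map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Glue libraries.
public class OptionValidator {

    // Returns the keys whose values are null, in the map's iteration order.
    public static List<String> findNullKeys(Map<String, String> options) {
        List<String> nullKeys = new ArrayList<>();
        for (Map.Entry<String, String> entry : options.entrySet()) {
            if (entry.getValue() == null) {
                nullKeys.add(entry.getKey());
            }
        }
        return nullKeys;
    }

    // Fails fast with a descriptive message instead of letting a null reach Glue.
    public static void requireNoNullValues(Map<String, String> options) {
        List<String> nullKeys = findNullKeys(options);
        if (!nullKeys.isEmpty()) {
            throw new IllegalArgumentException(
                    "Null values for connection options: " + nullKeys);
        }
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("url", "jdbc:postgresql://[hostname]:5432/postgres");
        options.put("dbtable", "public.test");
        options.put("user", null);      // simulates a value that failed to parse
        options.put("password", null);  // simulates a value that failed to parse

        try {
            requireNoNullValues(options);
        } catch (IllegalArgumentException e) {
            // Prints: Null values for connection options: [user, password]
            System.out.println(e.getMessage());
        }
    }
}
```

Failing here names the offending keys directly, which is easier to diagnose than the stack trace from inside JDBCWrapper.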

answered a year ago
I'm not sure that JsonOptions(optionsMap) handles the configuration map correctly.

I've always passed a JSON string to build JsonOptions, as in the documentation: https://docs.aws.amazon.com/glue/latest/dg/glue-etl-scala-apis-glue-gluecontext.html#glue-etl-scala-apis-glue-gluecontext-defs-getSource
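
As a rough illustration of the JSON-string approach, the sketch below builds such a string from a java.util.Map in plain Java. The class JsonOptionsString and its methods are hypothetical helpers. Passing the result to a String-based JsonOptions constructor is an assumption based on the linked Scala documentation, so that call appears only in a comment.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the Glue libraries.
public class JsonOptionsString {

    // Escapes quotes, backslashes and control characters for a JSON string literal.
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Serializes the options as a flat JSON object; rejects null values up front.
    public static String toJson(Map<String, String> options) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : options.entrySet()) {
            if (e.getValue() == null) {
                throw new IllegalArgumentException("Option '" + e.getKey() + "' is null");
            }
            if (!first) {
                sb.append(", ");
            }
            sb.append('"').append(escape(e.getKey())).append("\": \"")
              .append(escape(e.getValue())).append('"');
            first = false;
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("url", "jdbc:postgresql://[hostname]:5432/postgres");
        options.put("dbtable", "public.test");
        options.put("user", "postgres");

        String json = toJson(options);
        System.out.println(json);
        // Assumption: in the Glue job this string would then be passed along
        // the lines of new JsonOptions(json), per the linked documentation.
    }
}
```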

AWS EXPERT
answered a year ago