AWS Glue Postgres JDBC Data Source Failing with NullPointerException

I am writing a Spark job in Java to run on AWS Glue. It connects to a Postgres database using the glueContext.getSource() method, but it fails with the following NullPointerException:

2023-04-20 14:57:01,183 ERROR [main] glue.ProcessLauncher (Logging.scala:logError(94)): Exception in User Class
java.lang.NullPointerException
	at com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$apply$6(JDBCUtils.scala:914)
	at scala.collection.MapLike$MappedValues.$anonfun$foreach$3(MapLike.scala:252)
	at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:788)
	at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:230)
	at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:461)
	at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:787)
	at scala.collection.MapLike$MappedValues.foreach(MapLike.scala:252)
	at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:58)
	at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:49)
	at scala.collection.mutable.MapBuilder.$plus$plus$eq(MapBuilder.scala:25)
	at scala.collection.immutable.DefaultMap.$plus(DefaultMap.scala:40)
	at scala.collection.immutable.DefaultMap.$plus$(DefaultMap.scala:38)
	at scala.collection.immutable.MapLike$$anon$2.$plus(MapLike.scala:101)
	at scala.collection.immutable.MapLike.$anonfun$$plus$plus$1(MapLike.scala:87)
	at scala.collection.TraversableOnce.$anonfun$foldLeft$1(TraversableOnce.scala:156)
	at scala.collection.TraversableOnce.$anonfun$foldLeft$1$adapted(TraversableOnce.scala:156)
	at scala.collection.immutable.Map$Map3.foreach(Map.scala:192)
	at scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:156)
	at scala.collection.TraversableOnce.foldLeft$(TraversableOnce.scala:154)
	at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104)
	at scala.collection.TraversableOnce.$div$colon(TraversableOnce.scala:150)
	at scala.collection.TraversableOnce.$div$colon$(TraversableOnce.scala:150)
	at scala.collection.AbstractTraversable.$div$colon(Traversable.scala:104)
	at scala.collection.immutable.MapLike.$plus$plus(MapLike.scala:87)
	at scala.collection.immutable.MapLike.$plus$plus$(MapLike.scala:86)
	at scala.collection.immutable.MapLike$$anon$2.$plus$plus(MapLike.scala:101)
	at com.amazonaws.services.glue.util.JDBCWrapper$.connectionProperties(JDBCUtils.scala:956)
	at com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties$lzycompute(JDBCUtils.scala:739)
	at com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties(JDBCUtils.scala:739)
	at com.amazonaws.services.glue.util.JDBCWrapper.tableDF(JDBCUtils.scala:865)
	at com.amazonaws.services.glue.util.NoCondition$.tableDF(JDBCUtils.scala:87)
	at com.amazonaws.services.glue.util.NoJDBCPartitioner$.tableDF(JDBCUtils.scala:173)
	at com.amazonaws.services.glue.JDBCDataSource.getDynamicFrame(DataSource.scala:1088)
	at com.amazonaws.services.glue.DataSource.getDynamicFrame(DataSource.scala:101)
	at com.amazonaws.services.glue.DataSource.getDynamicFrame$(DataSource.scala:101)
	at com.amazonaws.services.glue.AbstractSparkSQLDataSource.getDynamicFrame(DataSource.scala:725)
	at com.amazonaws.services.glue.DataSource.getDataFrame(DataSource.scala:118)
	at com.amazonaws.services.glue.DataSource.getDataFrame$(DataSource.scala:118)
	at com.amazonaws.services.glue.AbstractSparkSQLDataSource.getDataFrame(DataSource.scala:725)

My code for connecting and building a Data Frame is as follows:

scala.collection.mutable.Map<String, String> optionsMap = new scala.collection.mutable.HashMap<>();

optionsMap.put("url", "jdbc:postgresql://[hostname]:5432/postgres");
optionsMap.put("dbtable", "public.test");
optionsMap.put("user", "postgres");
optionsMap.put("password", "Passw0rd!123");

JsonOptions jsonOptions = new JsonOptions(optionsMap);

DataSource source = super.getGlueContext().getSource(
		getConfiguration().getFormat().toLowerCase(),
		jsonOptions,
		"",
		"");

I have confirmed that my job has an IAM role with sufficient permissions to read RDS. Does anyone have any suggestions?

asked a year ago, 405 views
2 Answers

Accepted Answer

The cause turned out to be a null value hidden among the option inputs. The username and password options, which were pulled from a separate class, were not being parsed correctly and came back as null. That triggered the NullPointerException deep inside the Glue code.

So if you hit this NPE, make sure that every option value passed into the JsonOptions map is non-null.
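
As a rough illustration of that check, here is a minimal sketch that looks for null values in the options before they are handed to JsonOptions. It uses a plain java.util.Map so it runs without the Glue libraries; the class name JdbcOptionsCheck, its two helper methods, and the DB_USER / DB_PASSWORD environment variables are hypothetical names made up for this example, not part of the Glue API or of the original job.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: not part of AWS Glue.
public class JdbcOptionsCheck {

    // Returns the keys whose values are null, in the map's iteration order.
    public static List<String> findNullOptions(Map<String, String> options) {
        List<String> nullKeys = new ArrayList<>();
        for (Map.Entry<String, String> entry : options.entrySet()) {
            if (entry.getValue() == null) {
                nullKeys.add(entry.getKey());
            }
        }
        return nullKeys;
    }

    // Fails fast with a readable message instead of an NPE deep inside Glue.
    public static void requireNonNullOptions(Map<String, String> options) {
        List<String> nullKeys = findNullOptions(options);
        if (!nullKeys.isEmpty()) {
            throw new IllegalArgumentException("Null JDBC options: " + nullKeys);
        }
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("url", "jdbc:postgresql://[hostname]:5432/postgres");
        options.put("dbtable", "public.test");
        // Assumed source of credentials; System.getenv returns null when unset.
        options.put("user", System.getenv("DB_USER"));
        options.put("password", System.getenv("DB_PASSWORD"));

        // Report the problem keys rather than throwing, so this demo always runs.
        System.out.println("Null options: " + findNullOptions(options));
    }
}
```

In a real job, something like requireNonNullOptions(...) would be called just before the values are copied into the map passed to JsonOptions, so that a missing credential shows up with its key name rather than as a stack trace from JDBCUtils.scala.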

answered a year ago

I'm not sure that JsonOptions(optionsMap) handles the configuration map correctly.

I have always passed a JSON string to build JsonOptions, as in the documentation: https://docs.aws.amazon.com/glue/latest/dg/glue-etl-scala-apis-glue-gluecontext.html#glue-etl-scala-apis-glue-gluecontext-defs-getSource
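
For anyone who wants to try the JSON-string route, the following is a small sketch of building that string from the same option values. It is self-contained Java with no Glue dependency; the class JsonOptionsString and its methods are hypothetical helpers written for this example, and the escaping is deliberately minimal. Passing the result to JsonOptions assumes the Glue Scala library is on the classpath and that its String constructor is used, as the linked documentation suggests.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: not part of AWS Glue.
public class JsonOptionsString {

    // Minimal escaping: handles backslashes and double quotes only,
    // not control characters.
    public static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds a flat JSON object of string values; rejects null values up front.
    public static String toJson(Map<String, String> options) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> entry : options.entrySet()) {
            if (entry.getValue() == null) {
                throw new IllegalArgumentException("Option '" + entry.getKey() + "' is null");
            }
            if (!first) {
                sb.append(",");
            }
            sb.append('"').append(escape(entry.getKey())).append("\":\"")
              .append(escape(entry.getValue())).append('"');
            first = false;
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("url", "jdbc:postgresql://[hostname]:5432/postgres");
        options.put("dbtable", "public.test");
        options.put("user", "postgres");

        String json = toJson(options);
        System.out.println(json);
        // Assumption: with Glue on the classpath, this string could be passed
        // as new JsonOptions(json).
    }
}
```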

AWS EXPERT, answered a year ago
