I am writing a Spark job in Java that runs on AWS Glue. It connects to a PostgreSQL database using the glueContext.getSource() method, but fails with the following NullPointerException:
2023-04-20 14:57:01,183 ERROR [main] glue.ProcessLauncher (Logging.scala:logError(94)): Exception in User Class
java.lang.NullPointerException
at com.amazonaws.services.glue.util.JDBCWrapper$.$anonfun$apply$6(JDBCUtils.scala:914)
at scala.collection.MapLike$MappedValues.$anonfun$foreach$3(MapLike.scala:252)
at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:788)
at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:230)
at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:461)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:787)
at scala.collection.MapLike$MappedValues.foreach(MapLike.scala:252)
at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:58)
at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:49)
at scala.collection.mutable.MapBuilder.$plus$plus$eq(MapBuilder.scala:25)
at scala.collection.immutable.DefaultMap.$plus(DefaultMap.scala:40)
at scala.collection.immutable.DefaultMap.$plus$(DefaultMap.scala:38)
at scala.collection.immutable.MapLike$$anon$2.$plus(MapLike.scala:101)
at scala.collection.immutable.MapLike.$anonfun$$plus$plus$1(MapLike.scala:87)
at scala.collection.TraversableOnce.$anonfun$foldLeft$1(TraversableOnce.scala:156)
at scala.collection.TraversableOnce.$anonfun$foldLeft$1$adapted(TraversableOnce.scala:156)
at scala.collection.immutable.Map$Map3.foreach(Map.scala:192)
at scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:156)
at scala.collection.TraversableOnce.foldLeft$(TraversableOnce.scala:154)
at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104)
at scala.collection.TraversableOnce.$div$colon(TraversableOnce.scala:150)
at scala.collection.TraversableOnce.$div$colon$(TraversableOnce.scala:150)
at scala.collection.AbstractTraversable.$div$colon(Traversable.scala:104)
at scala.collection.immutable.MapLike.$plus$plus(MapLike.scala:87)
at scala.collection.immutable.MapLike.$plus$plus$(MapLike.scala:86)
at scala.collection.immutable.MapLike$$anon$2.$plus$plus(MapLike.scala:101)
at com.amazonaws.services.glue.util.JDBCWrapper$.connectionProperties(JDBCUtils.scala:956)
at com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties$lzycompute(JDBCUtils.scala:739)
at com.amazonaws.services.glue.util.JDBCWrapper.connectionProperties(JDBCUtils.scala:739)
at com.amazonaws.services.glue.util.JDBCWrapper.tableDF(JDBCUtils.scala:865)
at com.amazonaws.services.glue.util.NoCondition$.tableDF(JDBCUtils.scala:87)
at com.amazonaws.services.glue.util.NoJDBCPartitioner$.tableDF(JDBCUtils.scala:173)
at com.amazonaws.services.glue.JDBCDataSource.getDynamicFrame(DataSource.scala:1088)
at com.amazonaws.services.glue.DataSource.getDynamicFrame(DataSource.scala:101)
at com.amazonaws.services.glue.DataSource.getDynamicFrame$(DataSource.scala:101)
at com.amazonaws.services.glue.AbstractSparkSQLDataSource.getDynamicFrame(DataSource.scala:725)
at com.amazonaws.services.glue.DataSource.getDataFrame(DataSource.scala:118)
at com.amazonaws.services.glue.DataSource.getDataFrame$(DataSource.scala:118)
at com.amazonaws.services.glue.AbstractSparkSQLDataSource.getDataFrame(DataSource.scala:725)
My code for connecting and building the DataFrame is as follows:
// Build the JDBC connection options for the Glue JDBC source.
scala.collection.mutable.Map<String, String> optionsMap =
        new scala.collection.mutable.HashMap<>();
optionsMap.put("url", "jdbc:postgresql://[hostname]:5432/postgres");
optionsMap.put("dbtable", "public.test");
optionsMap.put("user", "postgres");
optionsMap.put("password", "Passw0rd!123");
JsonOptions jsonOptions = new JsonOptions(optionsMap);

// getSource(connectionType, connectionOptions, transformationContext, pushDownPredicate)
DataSource source = super.getGlueContext().getSource(
        getConfiguration().getFormat().toLowerCase(), // resolves to "postgresql"
        jsonOptions,
        "",   // transformation context
        "");  // push-down predicate
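In case the Scala-map-based JsonOptions constructor is part of the problem, I also tried expressing the same options as a raw JSON string (JsonOptions accepts a JSON string as well). This is just a sketch; `toJson` is a helper I wrote for illustration, and the hostname is still a placeholder:

```java
public class JdbcOptionsSketch {

    // Hypothetical helper: renders the same four connection options as a
    // single JSON string. Assumes none of the values contain characters
    // that would need JSON escaping.
    static String toJson(String url, String dbtable, String user, String password) {
        return "{"
                + "\"url\": \"" + url + "\", "
                + "\"dbtable\": \"" + dbtable + "\", "
                + "\"user\": \"" + user + "\", "
                + "\"password\": \"" + password + "\""
                + "}";
    }

    public static void main(String[] args) {
        String json = toJson("jdbc:postgresql://[hostname]:5432/postgres",
                "public.test", "postgres", "Passw0rd!123");
        System.out.println(json);
        // In the Glue job this string would then be passed as:
        // JsonOptions jsonOptions = new JsonOptions(json);
    }
}
```

This produced the same NullPointerException, so the constructor choice does not appear to be the cause.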
I have confirmed that the job's IAM role has sufficient permissions to read from RDS. Does anyone have any suggestions?