Glue job failing with NullPointerException when writing DataFrame

I am running a job that fetches data from S3 and writes it to GCP BigQuery using the Glue BigQuery connector by AWS. Everything else is fine, but for one table the second run always seems to fail with the error below. The first run completes fine; I have bookmarks enabled to fetch only the new data added to S3 and write it to BigQuery. It fails with the error below on the write call.

I am unable to understand why this NullPointerException is thrown.

Caused by: java.lang.NullPointerException
	at com.google.cloud.bigquery.connector.common.BigQueryClient.loadDataIntoTable(BigQueryClient.java:532)
	at com.google.cloud.spark.bigquery.BigQueryWriteHelper.loadDataToBigQuery(BigQueryWriteHelper.scala:87)
	at com.google.cloud.spark.bigquery.BigQueryWriteHelper.writeDataFrameToBigQuery(BigQueryWriteHelper.scala:66)
	... 42 more
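
For context, a minimal sketch of the job shape described above, assuming a PySpark Glue script. All database, table, bucket, and connection names are placeholders, and the write call follows the open-source Spark BigQuery connector API that the stack trace points to; the exact options may differ for the Glue marketplace build.

import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# transformation_ctx enables job bookmarks, so only new S3 data is read
dyf = glue_context.create_dynamic_frame.from_catalog(
    database="my_database",        # placeholder
    table_name="my_table",         # placeholder
    transformation_ctx="read_s3",
)
df = dyf.toDF()

# Write via the Spark BigQuery connector shown in the stack trace;
# the indirect write path stages data through a GCS bucket first.
(df.write.format("bigquery")
    .option("table", "my_dataset.my_table")          # placeholder
    .option("temporaryGcsBucket", "my-temp-bucket")  # placeholder
    .save())

job.commit()  # advances the bookmark for the next run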

asked 2 years ago
1 Answer

Hi,

if you have bookmarks enabled, are you sure there is new data in S3 for the second run?

If not, the read step will create an empty DataFrame, which might cause the write to BigQuery to fail.

For now, you might want to implement a try/catch or conditional logic that tests whether the DataFrame you read contains data and writes to BigQuery only if it does; otherwise, just log a message that no input is available at the moment.
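
A minimal sketch of that guard, assuming a PySpark script where df is the DataFrame produced by the read step (the table name is a placeholder):

# Only write when the bookmark-filtered read actually returned rows
if df.rdd.isEmpty():
    print("No new input available in S3 for this run; skipping the BigQuery write.")
else:
    (df.write.format("bigquery")
        .option("table", "my_dataset.my_table")  # placeholder
        .save())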

Hope this helps,

AWS
EXPERT
answered 2 years ago
  • Yes, more data is present in S3. I have printed the data and checked just before writing, but it is still throwing this error. I thought it might be something around the nullability of the columns, but I have fixed that too, by setting the nullable property of the source columns to True, the same as the target, but the error is still the same. I am clueless now!
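
For reference, a sketch of one way to force every column nullable in PySpark, as described in the comment above (df stands for the DataFrame read from S3 and spark for the active SparkSession; both names are placeholders):

from pyspark.sql.types import StructField, StructType

# Rebuild the schema with every field marked nullable, then reapply it
nullable_schema = StructType(
    [StructField(f.name, f.dataType, nullable=True) for f in df.schema.fields]
)
df = spark.createDataFrame(df.rdd, schema=nullable_schema)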
