Glue job failing with NullPointerException when writing DataFrame


I'm running a job that fetches data from S3 and writes it to GCP BigQuery using the Glue BigQuery connector from AWS. Everything else works fine, but for one table the second run always seems to fail with the error below. The first run succeeds; I have bookmarks enabled to fetch new data added in S3 and write it to BigQuery. The job fails with the error below in the write function.

I'm unable to understand why this NullPointerException is thrown.

Caused by: java.lang.NullPointerException
	at com.google.cloud.bigquery.connector.common.BigQueryClient.loadDataIntoTable(BigQueryClient.java:532)
	at com.google.cloud.spark.bigquery.BigQueryWriteHelper.loadDataToBigQuery(BigQueryWriteHelper.scala:87)
	at com.google.cloud.spark.bigquery.BigQueryWriteHelper.writeDataFrameToBigQuery(BigQueryWriteHelper.scala:66)
	... 42 more

Asked 2 years ago · 281 views
1 Answer

Hi,

If you have bookmarks enabled, are you sure you have new data in S3 for the second run?

If not, the read step will create an empty DataFrame, which might cause the write to BigQuery to fail.

For now, you might want to implement try/catch or conditional logic: check whether the DataFrame you read contains data, write to BigQuery only if it does, and otherwise just log a message that no input is available at the moment.
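
As a rough illustration, here is a minimal sketch of that conditional logic in a Glue PySpark script. The catalog database/table names and the BigQuery write options ("table", "temporaryGcsBucket") are placeholders for your actual configuration; the write options follow the open-source Spark BigQuery connector that the Glue connector wraps, as seen in your stack trace.

import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# With bookmarks enabled, this read may return zero rows on a rerun.
source = glue_context.create_dynamic_frame.from_catalog(
    database="my_database",       # placeholder
    table_name="my_table",        # placeholder
    transformation_ctx="source",  # required for bookmarks to track state
)
df = source.toDF()

# head(1) is a cheap emptiness check (avoids a full count of the DataFrame).
if df.head(1):
    (df.write.format("bigquery")
        .option("table", "my_dataset.my_table")         # placeholder
        .option("temporaryGcsBucket", "my-gcs-bucket")  # placeholder; used for indirect writes
        .mode("append")
        .save())
else:
    print("No new input available for this run; skipping the BigQuery write.")

job.commit()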

Hope this helps,

AWS
Expert
Answered 2 years ago
  • Yes, more data is present in S3. I printed the data and checked just before writing, but it still throws this error. I thought it might be something around the nullability of the columns, but I have fixed that too by setting the nullable property of the source to True, the same as the target, yet I still get the same error. I am clueless now!
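
  (For reference, the nullability alignment mentioned in the comment above can be done with a small helper along these lines. This is a sketch assuming plain PySpark; with_all_nullable is an illustrative name, not part of the Glue or connector API.)

  from pyspark.sql import SparkSession
  from pyspark.sql.types import StructField, StructType

  spark = SparkSession.builder.getOrCreate()

  def with_all_nullable(df):
      # Rebuild the schema with every field marked nullable=True, then
      # recreate the DataFrame against the relaxed schema.
      schema = StructType(
          [StructField(f.name, f.dataType, nullable=True) for f in df.schema.fields]
      )
      return spark.createDataFrame(df.rdd, schema)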
