Glue job failing with NullPointerException when writing DataFrame


I am running a Glue job that fetches data from S3 and writes it to GCP BigQuery using the Glue BigQuery connector by AWS. Everything else is fine, but for one table the second run always seems to fail with the error below. The first run completes fine; I have job bookmarks enabled to fetch new data added to S3 and write it to BigQuery. The job fails with the following error in the write function.

I am unable to understand why this NullPointerException is thrown.

Caused by: java.lang.NullPointerException
	at com.google.cloud.bigquery.connector.common.BigQueryClient.loadDataIntoTable(BigQueryClient.java:532)
	at com.google.cloud.spark.bigquery.BigQueryWriteHelper.loadDataToBigQuery(BigQueryWriteHelper.scala:87)
	at com.google.cloud.spark.bigquery.BigQueryWriteHelper.writeDataFrameToBigQuery(BigQueryWriteHelper.scala:66)
	... 42 more
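
For context, the stack trace points at the spark-bigquery-connector's write path. Stripped down, the failing call has roughly the shape sketched below; the path, table, and bucket names are hypothetical placeholders, not values from the actual job:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("s3-to-bigquery").getOrCreate()

# Read the (bookmark-filtered) input from S3 -- path is a placeholder.
df = spark.read.parquet("s3://my-bucket/my-prefix/")

# Write via the spark-bigquery-connector; this is the call whose stack
# trace appears above. Table and temporary bucket are placeholders.
(df.write
    .format("bigquery")
    .option("table", "my_project.my_dataset.my_table")
    .option("temporaryGcsBucket", "my-temp-gcs-bucket")
    .mode("append")
    .save())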

Asked 2 years ago · 281 views
1 Answer

Hi,

if you have bookmarks enabled, are you sure you have new data in S3 for the second run?

If not, the read step will create an empty DataFrame, which might cause the write to BigQuery to fail.

In the meantime, you might want to implement a try/catch or conditional logic that tests whether the DataFrame you read has data, writes to BigQuery only if it does, and otherwise just logs a message that there is no input available at the moment, as in the sketch below.
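
A minimal sketch of that guard (the database, table, and option values are illustrative, not from the original job):

from awsglue.context import GlueContext
from pyspark.context import SparkContext

glueContext = GlueContext(SparkContext.getOrCreate())

# Bookmark-aware read; transformation_ctx is what ties this read to the
# job bookmark state. Catalog names are illustrative.
dyf = glueContext.create_dynamic_frame.from_catalog(
    database="my_database",
    table_name="my_table",
    transformation_ctx="read_from_s3",
)

# Only write when the bookmark-filtered read returned rows; an empty
# DataFrame is what can make the BigQuery write fail.
if dyf.count() > 0:
    (dyf.toDF().write
        .format("bigquery")
        .option("table", "my_project.my_dataset.my_table")
        .option("temporaryGcsBucket", "my-temp-gcs-bucket")
        .mode("append")
        .save())
else:
    print("No new input available for this run; skipping the BigQuery write.")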

Hope this helps,

AWS
EXPERT
answered 2 years ago
  • Yes, more data is present in S3; I printed the data and checked just before writing, but it still throws this error. I thought it might be something around the nullability of the columns, but I fixed that too by setting the nullable property of the source to True, same as the target, and I still get the same error. I am clueless now!
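
    (For anyone attempting the same fix, the nullability change described in this comment can be done roughly as follows; the source path is a placeholder:)

    from pyspark.sql import SparkSession
    from pyspark.sql.types import StructField, StructType

    spark = SparkSession.builder.getOrCreate()
    df = spark.read.parquet("s3://my-bucket/my-prefix/")  # placeholder source

    # Rebuild the schema with every field marked nullable, matching the
    # nullability of the BigQuery target, then re-apply it to the DataFrame.
    nullable_schema = StructType(
        [StructField(f.name, f.dataType, nullable=True) for f in df.schema.fields]
    )
    df = spark.createDataFrame(df.rdd, schema=nullable_schema)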
