AWS glue job runs but doesn't insert data into DB

0

My AWS glue job is configured to extract data from one rds postgres DB and insert it into another one (after some transormations).

Although the job completes "succesfully", I can't see any data i the destination DB.

I see this in the job logs:

23/07/16 13:52:21 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

Please advise on how to overcome this. Thanks

已提問 10 個月前檢視次數 501 次
1 個回答
0

This is just a warning and should not be cause for not writing data. I'd suggest you to enable driver logs, which would provide you additional details to look into the issue and isolate it accordingly.

If you'd enable SparkUI logs and there you can check in history server to see what the driver is doing. check again, was the driver started a Spark job/stage and didn't get resources or it got resources but something further happened.

Also, sometimes it depends on stages in the job and sparkUI should able to help you find, in which stage the problem exists.

References:

Glue Known Issues Glue parallel read jobs

Hope you find this useful.

profile pictureAWS
專家
已回答 10 個月前
  • I have access to the spark logs, but it's a big 17207 line json file. I downloaded them to my machine. I also installed spark locally and opened the spark web ui.

    Not sure how to view the downloaded logs in my local spark web ui. Any pointers?

您尚未登入。 登入 去張貼答案。

一個好的回答可以清楚地回答問題並提供建設性的意見回饋,同時有助於提問者的專業成長。

回答問題指南