AWS Glue job runs but doesn't insert data into DB

My AWS Glue job is configured to extract data from one RDS PostgreSQL database and insert it into another (after some transformations).

Although the job completes "successfully", I can't see any data in the destination DB.

I see this in the job logs:

23/07/16 13:52:21 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

Please advise on how to overcome this. Thanks

Asked 10 months ago, 502 views
1 Answer

This is just a warning and should not, by itself, prevent data from being written. I'd suggest enabling driver logs, which provide additional details to help you investigate and isolate the issue.

If you enable Spark UI logs, you can check in the history server to see what the driver is doing. Check whether the driver started a Spark job/stage and didn't get resources, or whether it got resources but something went wrong afterwards.

Also, the problem sometimes depends on a specific stage of the job, and the Spark UI should be able to help you find which stage it occurs in.
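
As a rough sketch of what enabling those logs could look like, the snippet below builds the job arguments that turn on Spark UI event logging and continuous CloudWatch logging. The helper name `glue_debug_arguments` and the bucket path are made up for illustration; the resulting dictionary is meant to be supplied as the job's parameters (in the console, or as `Arguments` when starting a run from a script).

```python
def glue_debug_arguments(event_log_path):
    """Return Glue job arguments that enable Spark UI and continuous logging.

    event_log_path is a hypothetical S3 prefix where Spark event logs are written.
    """
    if not event_log_path.startswith("s3://"):
        raise ValueError("event_log_path should be an S3 URI, e.g. s3://bucket/prefix/")
    return {
        # Write Spark event logs so the job can be inspected in a history server
        "--enable-spark-ui": "true",
        "--spark-event-logs-path": event_log_path,
        # Stream driver and executor logs to CloudWatch while the job runs
        "--enable-continuous-cloudwatch-log": "true",
    }


# Placeholder bucket name, replace with your own
args = glue_debug_arguments("s3://example-bucket/glue-spark-logs/")
for key, value in sorted(args.items()):
    print(key, value)
```

With these set, the event logs land under the given S3 prefix after the run and can be opened in a Spark history server.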

References:

  • Glue Known Issues
  • Glue parallel read jobs

Hope you find this useful.

AWS Expert, answered 10 months ago
  • I have access to the Spark logs, but it's a big 17,207-line JSON file. I downloaded it to my machine. I also installed Spark locally and opened the Spark web UI.

    I'm not sure how to view the downloaded logs in my local Spark web UI. Any pointers?
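
Event logs are normally rendered by the Spark History Server (pointed at the log folder via `spark.history.fs.logDirectory`) rather than by the live web UI. As a quick alternative, and assuming the downloaded file is a standard Spark event log (one JSON object per line with an `"Event"` field), a small script can already show whether the driver ever started a Spark job. The function name and the inline sample below are illustrative only.

```python
import json
from collections import Counter


def summarize_event_log(lines):
    """Count Spark listener events by type and collect IDs of jobs that did not succeed."""
    counts = Counter()
    failed_jobs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(event, dict):
            continue
        name = event.get("Event", "unknown")
        counts[name] += 1
        if name == "SparkListenerJobEnd":
            result = event.get("Job Result", {}).get("Result")
            if result != "JobSucceeded":
                failed_jobs.append(event.get("Job ID"))
    return counts, failed_jobs


# Tiny inline sample standing in for the downloaded file; for the real log use:
#   with open("path/to/downloaded-event-log") as f:
#       counts, failed = summarize_event_log(f)
sample = [
    '{"Event": "SparkListenerJobStart", "Job ID": 0}',
    '{"Event": "SparkListenerJobEnd", "Job ID": 0, "Job Result": {"Result": "JobSucceeded"}}',
    '{"Event": "SparkListenerJobStart", "Job ID": 1}',
    '{"Event": "SparkListenerJobEnd", "Job ID": 1, "Job Result": {"Result": "JobFailed"}}',
]
counts, failed = summarize_event_log(sample)
print(dict(counts))
print("jobs that did not succeed:", failed)
```

If the real log contains no `SparkListenerJobStart` events at all, the driver never submitted a Spark job, which would point at the script logic rather than at cluster resources.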
