AWS Glue job runs but doesn't insert data into DB

My AWS Glue job is configured to extract data from one RDS PostgreSQL DB and insert it into another one (after some transformations).

Although the job completes "successfully", I can't see any data in the destination DB.

I see this in the job logs:

23/07/16 13:52:21 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

Please advise on how to overcome this. Thanks

asked 9 months ago · 479 views
1 Answer

This is just a warning and by itself should not prevent data from being written. I'd suggest enabling driver logs, which will give you additional details to investigate and isolate the issue.

If you enable Spark UI logs, you can use the Spark history server to see what the driver was doing: check whether the driver started a Spark job/stage but never got resources, or whether it got resources and something failed further along.

Also, the issue may be specific to one stage of the job, and the Spark UI should help you find which stage the problem is in.
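
For reference, one way to turn on these logs without touching the ETL script itself is through the job's default arguments. Below is a minimal boto3 sketch, assuming a hypothetical job name and S3 log path (the job's IAM role needs write access to that path):

    import boto3

    # Hypothetical names; replace with your actual Glue job and log bucket.
    JOB_NAME = "rds-to-rds-etl"
    EVENT_LOG_PATH = "s3://my-glue-logs/spark-events/"

    glue = boto3.client("glue")

    # Fetch the current job definition so the new arguments can be merged in.
    job = glue.get_job(JobName=JOB_NAME)["Job"]
    default_args = dict(job.get("DefaultArguments", {}))
    default_args.update({
        # Stream driver/executor logs to CloudWatch continuously.
        "--enable-continuous-cloudwatch-log": "true",
        # Emit Spark event logs so the Spark UI / history server can replay the run.
        "--enable-spark-ui": "true",
        "--spark-event-logs-path": EVENT_LOG_PATH,
    })

    # Note: update_job replaces the job definition, so carry over any other
    # settings you rely on (Connections, GlueVersion, WorkerType, etc.).
    glue.update_job(
        JobName=JOB_NAME,
        JobUpdate={
            "Role": job["Role"],
            "Command": job["Command"],
            "DefaultArguments": default_args,
        },
    )

The same options can also be switched on in the console under the job's Job details tab (continuous logging and Spark UI).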

References:

  • Glue Known Issues
  • Glue parallel read jobs

Hope you find this useful.

AWS EXPERT
answered 9 months ago
  • I have access to the Spark logs, but it's a big 17,207-line JSON file. I downloaded them to my machine. I also installed Spark locally and opened the Spark web UI.

    Not sure how to view the downloaded logs in my local Spark web UI. Any pointers?
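
To replay a downloaded event log in a local Spark web UI, you can point a Spark history server at the directory containing the file. A minimal sketch, assuming a standard Spark 3.x distribution and hypothetical local paths (the downloaded JSON file should sit in a directory by itself):

    import os
    import subprocess

    # Hypothetical paths; adjust to your Spark install and log location.
    SPARK_HOME = os.path.expanduser("~/spark-3.3.0-bin-hadoop3")
    EVENT_LOG_DIR = os.path.expanduser("~/Downloads/glue-spark-events")

    env = dict(os.environ)
    # The history server reads completed event logs from this directory.
    env["SPARK_HISTORY_OPTS"] = f"-Dspark.history.fs.logDirectory=file://{EVENT_LOG_DIR}"

    # Start the bundled history server; it serves the replayed Spark UI on port 18080.
    subprocess.run(
        [os.path.join(SPARK_HOME, "sbin", "start-history-server.sh")],
        env=env,
        check=True,
    )

Once it is running, open http://localhost:18080 and select the application entry for the Glue run; the Jobs and Stages tabs will show whether the driver ever got executors and where the work stopped.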
