AWS Glue job runs but doesn't insert data into DB


My AWS Glue job is configured to extract data from one RDS PostgreSQL DB and insert it into another one (after some transformations).

Although the job completes "successfully", I can't see any data in the destination DB.

I see this in the job logs:

23/07/16 13:52:21 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

Please advise on how to overcome this. Thanks

asked 10 months ago, 501 views
1 Answer

This is just a warning and should not be the reason the data isn't written. I'd suggest enabling driver logs, which give you additional details to investigate and isolate the issue.

If you enable Spark UI logs, you can use the history server to see what the driver is doing. Check whether the driver started a Spark job or stage and never got resources, or whether it got resources and something went wrong afterwards.
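As a rough sketch of what enabling these logs could look like, the helper below builds the Glue job parameters for Spark UI event logs and continuous CloudWatch logging (which I'm assuming is what "driver logs" means here). The function name and the bucket path are made up for illustration; the resulting values would go into the job's parameters in the console or into DefaultArguments through the API.

```python
def spark_ui_job_arguments(event_log_path: str) -> dict[str, str]:
    """Build Glue job parameters that enable Spark UI event logs
    and continuous CloudWatch logging.

    event_log_path is the S3 location where Glue should write the
    Spark event logs (hypothetical bucket in the example below).
    """
    if not event_log_path.startswith("s3://"):
        raise ValueError("event_log_path must be an s3:// URI")
    return {
        # Write Spark event logs so the job can be inspected in the Spark UI.
        "--enable-spark-ui": "true",
        "--spark-event-logs-path": event_log_path,
        # Stream driver and executor logs to CloudWatch while the job runs.
        "--enable-continuous-cloudwatch-log": "true",
    }


# Example with a placeholder bucket name (an assumption, replace with your own).
for key, value in spark_ui_job_arguments("s3://my-example-bucket/glue-spark-events/").items():
    print(key, value)
```

The parameters take effect on the next run of the job, so the job has to be rerun before any event logs show up at that S3 path.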

The cause can also depend on the individual stages of the job, and the Spark UI should help you find the stage in which the problem occurs.

References:

Glue Known Issues
Glue parallel read jobs

Hope you find this useful.

AWS
EXPERT
answered 10 months ago
  • I have access to the Spark logs, but they're one big 17,207-line JSON file. I downloaded it to my machine. I also installed Spark locally and opened the Spark web UI.

    Not sure how to view the downloaded logs in my local Spark web UI. Any pointers?
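One possible pointer, offered as a sketch and not as a confirmed fix: the web UI of a locally running Spark application only shows that application, while the Spark History Server can replay saved event logs from the directory set in spark.history.fs.logDirectory. If you just want a quick look without any UI, a Spark event log is normally one JSON object per line, so a short script can summarize it. The script below assumes that format and the usual event names (SparkListenerJobEnd, SparkListenerStageCompleted); the function name is made up for illustration.

```python
import json
import sys
from collections import Counter


def summarize_event_log(lines):
    """Summarize a Spark event log given as an iterable of JSON lines.

    Returns (event_counts, job_results, failed_stages).
    Assumes one JSON object per line with an "Event" field.
    """
    event_counts = Counter()
    job_results = {}    # job id -> result string, e.g. "JobSucceeded"
    failed_stages = {}  # stage id -> failure reason
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        name = event.get("Event", "unknown")
        event_counts[name] += 1
        if name == "SparkListenerJobEnd":
            result = event.get("Job Result", {}).get("Result")
            job_results[event.get("Job ID")] = result
        elif name == "SparkListenerStageCompleted":
            info = event.get("Stage Info", {})
            reason = info.get("Failure Reason")
            if reason:
                failed_stages[info.get("Stage ID")] = reason
    return event_counts, job_results, failed_stages


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize_events.py <path-to-downloaded-event-log>
    with open(sys.argv[1], encoding="utf-8") as handle:
        counts, jobs, failed = summarize_event_log(handle)
    print("Events:", dict(counts))
    print("Job results:", jobs)
    print("Failed stages:", failed)
```

If the summary shows no SparkListenerJobStart events at all, the driver never submitted a Spark job, which would match the first case described in the answer above. If jobs started and ended with a failure result, the failed stage and its reason are the place to look next.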
