Step function state to execute a Glue job seems to be stalling

Hi - I have a Step Functions state machine that invokes a Glue job. The job starts successfully and finishes in about 30 seconds, but the state stays in Running and never moves on to the next state. I waited about an hour after the Glue job succeeded.

Here is my state for the glue job:

"Glue-Transform": {
  "Type": "Task",
  "Resource": "arn:aws:states:::glue:startJobRun.sync",
  "Parameters": {
    "JobName": "job-name",
    "Arguments": {
      "--argument.$": "$"
    }
  },
  "Catch": [
    {
      "ErrorEquals": [
        "States.DataLimitExceeded",
        "States.Runtime",
        "States.Timeout",
        "States.TaskFailed",
        "States.Permissions"
      ],
      "ResultPath": "$.Error",
      "Next": "MapFailed"
    },
    {
      "ErrorEquals": ["States.ALL"],
      "ResultPath": "$.Error",
      "Next": "MapFailed"
    }
  ],
  "End": true
}

I haven't been able to find any troubleshooting documentation for this, apart from this page, which only explains how to start a simple job (and I believe I am following it correctly): https://docs.aws.amazon.com/step-functions/latest/dg/connect-glue.html

ratiugo
asked 3 years ago, 3492 views
1 Answer
Found my answer here - https://stackoverflow.com/questions/56812780/how-to-have-a-python-glue-job-return-when-called-in-step-function

The solution to my actual problem was permissions. You need four permissions when running a startJobRun.sync task:

glue:StartJobRun
glue:GetJobRun
glue:GetJobRuns
glue:BatchStopJobRun

Those are the values as I wrote them in my Terraform config, but they should help anybody struggling with this.
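
For anyone writing the policy by hand rather than through Terraform, here is a minimal sketch of an IAM policy statement granting the four Glue actions listed above. It assumes the policy is attached to the role that the state machine executes with, and the `"Resource": "*"` value is a placeholder assumption; you would likely want to narrow it to your own job.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "glue:StartJobRun",
        "glue:GetJobRun",
        "glue:GetJobRuns",
        "glue:BatchStopJobRun"
      ],
      "Resource": "*"
    }
  ]
}
```

With only glue:StartJobRun granted, the job can be started but Step Functions cannot poll it for completion, which would match the "stuck on running" behaviour described in the question.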

ratiugo
answered 3 years ago
