Step function state to execute a Glue job seems to be stalling


Hi, I have a Step Functions state machine that invokes a Glue job. The job starts successfully and succeeds in about 30 seconds, but the Step Functions state stays in Running and never moves to the next state. I waited about an hour after the Glue job succeeded.

Here is my state for the glue job:

"Glue-Transform": {
  "Type": "Task",
  "Resource": "arn:aws:states:::glue:startJobRun.sync",
  "Parameters": {
    "JobName": "job-name",
    "Arguments": {
      "--argument.$": "$"
    }
  },
  "Catch": [
    {
      "ErrorEquals": [
        "States.DataLimitExceeded",
        "States.Runtime",
        "States.Timeout",
        "States.TaskFailed",
        "States.Permissions"
      ],
      "ResultPath": "$.Error",
      "Next": "MapFailed"
    },
    {
      "ErrorEquals": ["States.ALL"],
      "ResultPath": "$.Error",
      "Next": "MapFailed"
    }
  ],
  "End": true
}

I haven't been able to find documentation on troubleshooting anything related to this, besides this page which only details how to start a simple job, which I believe I am following correctly: https://docs.aws.amazon.com/step-functions/latest/dg/connect-glue.html

ratiugo
asked 3 years ago, 3528 views
1 Answer

Found my answer here - https://stackoverflow.com/questions/56812780/how-to-have-a-python-glue-job-return-when-called-in-step-function

The solution to my actual problem was permissions. The state machine's role needs four permissions to run a startJobRun.sync task:

glue:StartJobRun
glue:GetJobRun
glue:GetJobRuns
glue:BatchStopJobRun
Those are actually the Terraform values, but they should help anybody struggling with this.
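
For anyone not using Terraform, the four actions listed above could be collected into an IAM policy document for the state machine's execution role. Here is a minimal Python sketch that only builds and prints the JSON. The names `GLUE_SYNC_ACTIONS` and `build_glue_sync_policy` are hypothetical, and the `"*"` resource is a placeholder assumption rather than something from my setup; scope it to your own job where you can.

```python
import json

# The four IAM actions named in the answer above for glue:startJobRun.sync.
GLUE_SYNC_ACTIONS = [
    "glue:StartJobRun",
    "glue:GetJobRun",
    "glue:GetJobRuns",
    "glue:BatchStopJobRun",
]


def build_glue_sync_policy(resource="*"):
    """Return an IAM policy document (as a dict) allowing the four actions.

    `resource` defaults to "*" purely as a placeholder (an assumption);
    pass a narrower resource for your own Glue job if possible.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(GLUE_SYNC_ACTIONS),
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be pasted into an IAM role.
    print(json.dumps(build_glue_sync_policy(), indent=2))
```

The printed JSON can then be attached to the role that the state machine assumes, by whatever means you normally manage IAM.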

ratiugo
answered 3 years ago
