1 Answer
If you are using a Standard workflow, you can invoke the Lambda function with the .waitForTaskToken service integration. This allows you to pass a task token to the Lambda function. Unless you specify a task timeout or heartbeat timeout, Step Functions will pause indefinitely and wait for an external process or workflow to complete. After invoking the Spark script from Lambda, you can call the SendTaskSuccess or SendTaskFailure API, depending on the result of the Spark script (or however you like), to let Step Functions know the task status.
Standard Step Functions workflow -> Lambda task state (.waitForTaskToken) -> Lambda (call Spark, then SendTaskSuccess | SendTaskFailure) # this workflow waits until the Lambda function calls one of the SendTask* APIs
{ "StartAt":"GetManualReview", "States":{ "GetManualReview":{ "Type":"Task", "Resource":"arn:aws:states:::lambda:invoke.waitForTaskToken", "Parameters":{ "FunctionName":"get-model-review-decision", "Payload":{ "model.$":"$.new_model", "token.$":"$$.Task.Token" }, "Qualifier":"prod-v1" }, "End":true } } }
answered 15 days ago
Hi @Taka_M, thanks for your answer; the proposed solution looks great. I just have a couple of questions: what if the Spark script takes longer than the Lambda timeout (15 minutes)? And is there a way to react to an EMR Serverless failure event? Thanks for all your support.
You can retry the Lambda task if it times out by adding a Retry field to the task. A better way is to do it asynchronously: if you can make that SendTask* API call from your EMR cluster, then your Lambda won't need to wait just to get the Spark job status. You still pass the task token to Lambda, but Lambda passes it on to the Spark job when it starts it (see the sketch below). If that's not feasible, you can simply invoke a Lambda function and put a Wait state right after it (if you know the job takes roughly x amount of time), followed by another Lambda task state to check the status. You can put this in a loop using a Choice state: Lambda to kick off the Spark job -> Wait state -> Lambda to check the job -> Choice (to determine if the job is done, failed, or needs more time), repeated as needed. Alternatively, you can add a Retry field to the last Lambda with a max retry count, and return a failure from it if the job still needs more time.
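To illustrate the async variant, here is a minimal sketch of the Spark driver script side, assuming the Lambda passes the task token to the job as its first command-line argument (the argument position and the run_job helper are illustrative assumptions, not from the original answer). The job itself reports its outcome, so the Lambda can return immediately after starting it.

import json
import sys
import boto3

def main():
    # Assumption: the Lambda that started this job passed the Step Functions
    # task token as the first command-line argument
    task_token = sys.argv[1]
    sfn = boto3.client("stepfunctions")
    try:
        # Hypothetical function containing the actual Spark logic
        result = run_job()
        sfn.send_task_success(
            taskToken=task_token,
            output=json.dumps({"status": "SUCCEEDED", "result": result}),
        )
    except Exception as err:
        sfn.send_task_failure(
            taskToken=task_token,
            error="SparkJobFailed",
            cause=str(err),
        )

if __name__ == "__main__":
    main()

With this pattern the Lambda's 15-minute limit no longer matters, because the workflow waits on the token, not on the Lambda invocation.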