1 Answer
If you are using a Standard workflow, you can invoke the Lambda function with the .waitForTaskToken service integration, which allows you to pass a task token to the Lambda function. Unless you specify a task timeout or heartbeat timeout, Step Functions will pause indefinitely and wait for an external process or workflow to complete. After invoking the Spark script from Lambda, you can call the SendTaskSuccess or SendTaskFailure API, depending on the result of the Spark script (or on whatever condition you like), to let Step Functions know the task status.
Standard Step Functions workflow -> Lambda task state (.waitForTaskToken) -> Lambda (call Spark / SendTaskSuccess | SendTaskFailure) # this workflow waits until the Lambda function calls one of the SendTask* APIs
{
  "StartAt": "GetManualReview",
  "States": {
    "GetManualReview": {
      "Type": "Task",
      "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
      "Parameters": {
        "FunctionName": "get-model-review-decision",
        "Payload": {
          "model.$": "$.new_model",
          "token.$": "$$.Task.Token"
        },
        "Qualifier": "prod-v1"
      },
      "End": true
    }
  }
}
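On the Lambda side, a minimal sketch of the callback could look like this. The `run_spark_script` helper and the `event` keys other than `token` are hypothetical placeholders for however you actually kick off your Spark job; only the `$$.Task.Token` mapping and the SendTaskSuccess / SendTaskFailure calls come from the pattern above.

```python
import json

def build_task_callback(token, job_succeeded, result=None, error="SparkJobFailed"):
    """Pure helper: build the (boto3 method name, kwargs) for the
    Step Functions SendTaskSuccess / SendTaskFailure call, kept free of
    AWS dependencies so the decision logic is easy to test."""
    if job_succeeded:
        return ("send_task_success",
                {"taskToken": token, "output": json.dumps(result or {})})
    return ("send_task_failure",
            {"taskToken": token, "error": error,
             "cause": "Spark script returned a failure"})

def lambda_handler(event, context):
    # "token" arrives via the "token.$": "$$.Task.Token" mapping in the
    # state machine's Parameters block.
    token = event["token"]

    # Hypothetical helper: run (or submit) the Spark job and report its outcome.
    ok, result = run_spark_script(event["model"])

    method, kwargs = build_task_callback(token, ok, result)

    import boto3  # imported lazily; the handler runs inside AWS Lambda
    sfn = boto3.client("stepfunctions")
    getattr(sfn, method)(**kwargs)  # sfn.send_task_success(...) or sfn.send_task_failure(...)
```

Note that with the synchronous version of this handler, the Spark script must finish within the Lambda timeout, which is exactly the limitation raised in the comments below.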
answered 2 years ago
Hi @Taka_M, thanks for your answer, the proposed solution looks great. I just have a couple of questions: what if the Spark script takes longer than the Lambda timeout (15 minutes)? And is there a way to react to an EMR Serverless failure event? Thanks for all your support.
You can retry the Lambda task if it times out by adding a Retry field to the task. A better approach is to do it asynchronously: if you can make the SendTask* API call from your EMR cluster itself, then your Lambda won't need to wait around just to get the Spark job status. You still pass the task token to Lambda, but Lambda hands it to the Spark job when it starts the job. If that's not feasible, you can simply invoke a Lambda function and follow it with a Wait state (if you know the job takes roughly x amount of time), then add another Lambda task state to check the job status. You can put this in a loop using a Choice state: Lambda to kick off the Spark job -> Wait state -> Lambda to check the job -> Choice (to determine whether the job is done, failed, or needs more time) # repeat as needed. Alternatively, you can add a Retry field to the last Lambda with a maximum retry count (and have the function return a failure while the job still needs more time).
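A minimal sketch of the "check the job" Lambda in that loop, assuming the job was submitted to EMR Serverless and that the state machine passes `application_id` and `job_run_id` through the event (both names are assumptions, not from the thread):

```python
def classify_job_state(state):
    """Map an EMR Serverless job-run state onto the three branches the
    Choice state needs: done, failed, or still in progress."""
    if state == "SUCCESS":
        return "done"
    if state in ("FAILED", "CANCELLING", "CANCELLED"):
        return "failed"
    # SUBMITTED / PENDING / SCHEDULED / RUNNING all mean "keep waiting"
    return "in_progress"

def lambda_handler(event, context):
    import boto3  # imported lazily; the handler runs inside AWS Lambda
    emr = boto3.client("emr-serverless")
    run = emr.get_job_run(applicationId=event["application_id"],
                          jobRunId=event["job_run_id"])
    # Pass the classification through so the Choice state can branch on it.
    event["job_status"] = classify_job_state(run["jobRun"]["state"])
    return event
```

The Choice state would then branch on `$.job_status`: loop back to the Wait state on "in_progress", succeed on "done", and fail on "failed".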