A practical guide to distinguishing between API response codes and actual workflow execution status in AWS Step Functions EXPRESS workflows, ensuring proper error handling in production applications.
SHORT DESCRIPTION
When you call the AWS Step Functions StartSyncExecution API for EXPRESS workflows, an HTTP 200 response only indicates that the Step Functions service received and processed the request. It does not indicate that the workflow itself succeeded. To determine whether the workflow succeeded and no step in the state machine failed, check the 'status' field in the API response body, which can be 'SUCCEEDED', 'FAILED', or 'TIMED_OUT'. This distinction is crucial for implementing proper error handling in your applications, and it affects the following integrations:
Amazon API Gateway Integration with AWS Step Functions:
- Amazon API Gateway sees a 200 from AWS Step Functions and returns 200 to the client.
- Actual workflow failures are masked with successful HTTP status codes.
Direct Invocation with AWS Lambda/AWS SDK
- The response status code doesn't reflect the true execution status.
- Error details are buried in the response body.
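To make the problem concrete, the following minimal sketch shows a trimmed-down response for a failed execution. The dictionary is a hypothetical example shaped like a boto3 start_sync_execution result, and the helper names (http_call_succeeded, workflow_succeeded) are illustrative, not part of any AWS SDK.

```python
# Hypothetical, trimmed-down shape of a boto3 start_sync_execution response
# for an execution whose state machine failed. Values are illustrative only.
sample_response = {
    "ResponseMetadata": {"HTTPStatusCode": 200},
    "status": "FAILED",
    "error": "States.TaskFailed",
    "cause": "Example cause text",
}


def http_call_succeeded(response: dict) -> bool:
    """True when the API call itself returned HTTP 200."""
    return response["ResponseMetadata"]["HTTPStatusCode"] == 200


def workflow_succeeded(response: dict) -> bool:
    """True only when the execution status is SUCCEEDED."""
    return response.get("status") == "SUCCEEDED"


print(http_call_succeeded(sample_response))  # True
print(workflow_succeeded(sample_response))   # False
```

A caller that inspects only the HTTP status code would treat this response as a success.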
THE SOLUTION
Amazon API Gateway Integration
- Open the Amazon API Gateway console.
- Choose your existing API, and then choose the resource.
- Select the HTTP method that is set up for the AWS Step Functions integration.
- Choose 'Integration response', and then choose 'Edit'.
- Under 'Mapping templates', add the following mapping template, and then save. The status codes then reflect workflow execution outcomes. The template assumes that the workflow output is a JSON object with 'statusCode' and 'body' fields.
#set($inputRoot = $input.path('$'))
#set($output = $util.parseJson($inputRoot.output))
#if($output.statusCode)
#set($context.responseOverride.status = $output.statusCode)
#end
$output.body
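For readers less familiar with mapping template syntax, the logic of the template can be approximated in Python as follows. This is an illustration only, not code that runs in Amazon API Gateway. The function name map_integration_response and the default status of 200 are assumptions made for this sketch.

```python
import json


def map_integration_response(sfn_body: dict, default_status: int = 200):
    """Rough Python equivalent of the mapping template (illustration only)."""
    # The workflow output arrives as a JSON string in the 'output' field.
    output = json.loads(sfn_body["output"])
    # Override the HTTP status only when the workflow supplied a statusCode.
    status = output.get("statusCode") or default_status
    return status, output.get("body")


# Example: a workflow that caught its own error and returned a 404 payload.
body = {"output": json.dumps({"statusCode": 404, "body": {"message": "not found"}})}
status, payload = map_integration_response(body)
print(status, payload)
```

The sketch, like the template, relies on the state machine producing output of that shape, for example by catching errors inside the workflow and returning a 'statusCode' and 'body' itself.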
Direct Invocation - AWS Lambda/AWS SDK
- AWS Lambda Integration - Implement status code checking in function code → Lambda now properly handles Step Functions execution statuses.
- AWS SDK Integration - Implement error handling using SDK response parsing → SDK calls now surface actual workflow execution results.
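import json
import boto3

stepfunctions = boto3.client('stepfunctions')
input_data = {}  # placeholder: replace with your workflow input

response = stepfunctions.start_sync_execution(
    stateMachineArn='YOUR_STATE_MACHINE_ARN',
    input=json.dumps(input_data)
)
# Check the actual execution status, not the API response code
execution_status = response['status']
if execution_status != 'SUCCEEDED':
    # Failed or timed-out executions report 'error' and 'cause'; 'output' may be absent
    print(f"Workflow {execution_status}: {response.get('error')} - {response.get('cause')}")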
The sample code above is in Python; you can implement similar logic in other languages.
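Inside an AWS Lambda function, the same status check can be turned into an API-style response. The following is a minimal sketch under stated assumptions: WorkflowExecutionError, parse_sync_result, and to_http_response are hypothetical names, and the choice of 500 for 'FAILED' and 504 for 'TIMED_OUT' is one possible convention, not something Step Functions prescribes.

```python
import json


class WorkflowExecutionError(Exception):
    """Raised when a synchronous execution does not end in SUCCEEDED."""

    def __init__(self, status, error=None, cause=None):
        super().__init__(f"{status}: {error} ({cause})")
        self.status, self.error, self.cause = status, error, cause


def parse_sync_result(response: dict):
    """Return the decoded workflow output, or raise if the execution failed."""
    status = response.get("status")
    if status != "SUCCEEDED":
        raise WorkflowExecutionError(status, response.get("error"), response.get("cause"))
    # 'output' is a JSON string; treat a missing output as null.
    return json.loads(response.get("output") or "null")


def to_http_response(response: dict) -> dict:
    """Map an execution result to an API-style response (codes are a design choice)."""
    try:
        return {"statusCode": 200, "body": json.dumps(parse_sync_result(response))}
    except WorkflowExecutionError as exc:
        code = 504 if exc.status == "TIMED_OUT" else 500
        return {"statusCode": code, "body": json.dumps({"error": exc.error, "cause": exc.cause})}
```

A handler would pass the result of start_sync_execution to to_http_response and return the resulting dictionary. Raising a dedicated exception keeps the failure path explicit for callers that prefer to handle it themselves.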
KEY POINTS
- Always check response['status'] in addition to the HTTP status code.
- Read response['error'] and response['cause'] for failure details; parse response['output'] for the result of a successful execution.
- Use mapping templates in Amazon API Gateway to propagate actual status codes.
BEST PRACTICES
- Use Express Workflows for synchronous API calls (faster, lower latency).
- Structure error responses consistently across your workflows.
- Enable Amazon CloudWatch logging for better debugging.
- Implement proper retry mechanisms for transient failures.
- Keep synchronous executions under 29 seconds (the default Amazon API Gateway integration timeout) when you integrate with Amazon API Gateway.
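The retry recommendation above can be sketched on the caller side as follows. This is a minimal illustration, not a definitive implementation: run_with_retries and its parameters are hypothetical, and which failures count as transient (here decided by the caller-supplied is_transient function) depends on your workflow.

```python
import time


def run_with_retries(start_execution, is_transient, max_attempts=3,
                     base_delay=0.5, sleep=time.sleep):
    """Call start_execution() until it succeeds, a non-transient failure
    occurs, or max_attempts is reached. Returns the last response."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = start_execution()
        if response.get("status") == "SUCCEEDED":
            return response
        if attempt == max_attempts or not is_transient(response):
            return response
        # Exponential backoff between attempts: base_delay, 2x, 4x, ...
        sleep(base_delay * 2 ** (attempt - 1))
```

In practice, start_execution would wrap the start_sync_execution call, for example as a lambda that closes over the client and input. Keep the total time across attempts within any upstream timeout, such as the Amazon API Gateway limit mentioned above.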