There is no built-in way to do what you want, but there are a few options to implement it.
One option is to catch the Step Functions SUCCEEDED event in EventBridge and use it to invoke a Lambda that checks whether there is another message in the queue; if there is, it starts a new state machine execution. You will not be able to use the SQS-to-Lambda trigger (Step 3 above). So your Lambda 1 checks if there is a running execution: if there is, it just puts the message in the queue; if there is none, it starts the state machine.
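A rough sketch of what Lambda 1 could look like under this option. The ARN, queue URL, group ID value, and the `has_running_execution` helper are illustrative names, not part of the original answer:

```python
import json

# Hypothetical resource identifiers for the sketch
STATE_MACHINE_ARN = "arn:aws:states:us-east-1:123456789012:stateMachine:my-machine"
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/objects.fifo"

def has_running_execution(executions):
    """Decide from a list_executions result whether an execution is in flight."""
    return len(executions) > 0

def handler(event, context):
    # boto3 is preinstalled in the Lambda runtime; imported here so the
    # sketch can be loaded without it
    import boto3
    sfn = boto3.client("stepfunctions")
    sqs = boto3.client("sqs")

    running = sfn.list_executions(
        stateMachineArn=STATE_MACHINE_ARN,
        statusFilter="RUNNING",
        maxResults=1,
    )["executions"]

    if has_running_execution(running):
        # An execution is in flight: park the event in the FIFO queue
        sqs.send_message(
            QueueUrl=QUEUE_URL,
            MessageBody=json.dumps(event),
            MessageGroupId="all-objects",  # single group => strict ordering
        )
    else:
        # Nothing running: start the state machine directly
        sfn.start_execution(
            stateMachineArn=STATE_MACHINE_ARN,
            input=json.dumps(event),
        )
```

The EventBridge-triggered Lambda that fires on the SUCCEEDED event would then read one message back off the queue and, if it gets one, start the next execution.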
Another option is to start a Step Functions execution for each object, but before starting the Glue jobs, check whether another execution is already running them. If there is one, sleep for a few seconds and check again; if there is none, start the Glue jobs. You can use DynamoDB to mark when you start the Glue jobs and when you are done.
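A minimal sketch of that DynamoDB marker, assuming a hypothetical table named `glue-job-lock` with a string partition key `LockId`. The conditional write succeeds only while no lock item exists, which is what makes the check-then-start atomic:

```python
import time

TABLE_NAME = "glue-job-lock"  # hypothetical table, partition key "LockId"
LOCK_ID = "glue-jobs"

def lock_put_request(now):
    """Build the conditional PutItem request that acquires the lock."""
    return {
        "Item": {"LockId": LOCK_ID, "AcquiredAt": now},
        # The write succeeds only if no item with this key exists yet
        "ConditionExpression": "attribute_not_exists(LockId)",
    }

def run_glue_jobs_exclusively(start_jobs, poll_seconds=5):
    import boto3  # preinstalled in the Lambda runtime
    table = boto3.resource("dynamodb").Table(TABLE_NAME)
    exceptions = table.meta.client.exceptions
    # Sleep-and-retry loop from the answer above
    while True:
        try:
            table.put_item(**lock_put_request(int(time.time())))
            break
        except exceptions.ConditionalCheckFailedException:
            # Another execution holds the lock: wait and check again
            time.sleep(poll_seconds)
    try:
        start_jobs()
    finally:
        # Mark completion so the next execution can proceed
        table.delete_item(Key={"LockId": LOCK_ID})
```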
When the messages are sent to the FIFO SQS queue from Lambda 1, consider using a single message group ID. This ensures that only one Lambda instance processes messages for a particular message group ID at a time, and that messages within a group are handled sequentially.
Then, in the FIFO SQS -> Lambda setup, set the batch size to 1, which ensures only one message is consumed from the queue at a time.
Effectively, S3 invokes Lambda 1, Lambda 1 sends messages to the FIFO queue with a single message group ID, the single group ID ensures only one Lambda instance processes messages, and with a batch size of 1, Lambda 2 processes a single message at a time and invokes the state machine.
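The Lambda 1 send might be sketched like this; the queue URL and the group ID value are illustrative, and the deduplication ID shown is one way to satisfy FIFO deduplication if content-based deduplication is not enabled on the queue:

```python
import hashlib
import json

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/objects.fifo"  # hypothetical

def fifo_message(s3_key):
    """Build send_message arguments: one fixed group ID serializes processing."""
    body = json.dumps({"key": s3_key})
    return {
        "QueueUrl": QUEUE_URL,
        "MessageBody": body,
        "MessageGroupId": "all-objects",  # single group => in-order, one consumer
        # Deduplication ID lets SQS drop duplicate deliveries of the same event
        "MessageDeduplicationId": hashlib.sha256(body.encode()).hexdigest(),
    }

def handler(event, context):
    import boto3  # preinstalled in the Lambda runtime
    sqs = boto3.client("sqs")
    for record in event["Records"]:  # S3 event notification records
        sqs.send_message(**fifo_message(record["s3"]["object"]["key"]))
```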
If I were you, I think I'd just use EventBridge S3 notifications to trigger your Step Functions state machine directly (no need to deploy and manage intermediary pieces), then implement the concurrency control in the workflow logic. I'm assuming you won't have massive volume; if you did, placing a queue in the middle for more flow control might be helpful. The advantage of doing it this way is that you have fewer moving parts and better visibility and control (e.g., you can see which executions are waiting and cancel some if you don't want them to proceed).
I wrote up this blog post a while back on how to implement concurrency control in your workflow definition: https://aws.amazon.com/blogs/compute/controlling-concurrency-in-distributed-systems-using-aws-step-functions/
More recently I included a slightly altered approach in this app, where the logic for acquiring and releasing the locks is externalized for re-use and where the key you use for your lock is dynamic. https://github.com/aws-samples/aws-stepfunctions-examples/tree/main/sam/app-meeting-summarization

So the process would be:

- S3 file placed -> Lambda checks SF status
- If SF running -> Lambda sends message to SQS
- If SF not running -> Lambda triggers SF
- EventBridge gets notified of SF success -> triggers Lambda (same as above?) to check for SQS messages
- If SQS message -> Lambda triggers SF (How do I check if an SQS message is waiting? I cannot find a way to do this)
- If no SQS message -> process completed until new S3 file received

Does that flow make sense? Can I do everything in one Lambda or would I need two?
Yes. Exactly.
To check if there is a message in the queue, you do not use an SQS integration, but rather, you use the AWS SQS SDK to read a message from the queue. If you get a message back, you handle it. If not, you do nothing.
The functions do not need to be the same. Usually I would recommend having two different functions. However, in this case, to prevent race conditions, it may be better to have one function and limit its concurrency to 1 (using reserved concurrency). Note that each event source has a different event object structure, so you will need to check for that in your handler.
Edit: I realize my error. I was using a colon instead of an equals sign for the QueueUrl.

Ok great! One final question. I was working on this process yesterday, and the step I'm working on is the Lambda checking if there are messages in SQS. I'm trying to check if there's a message in the queue; if there is, I'll read it and pass it to the state machine. But I keep getting an error when I try to call get_queue_attributes(). I'm using Python:

```python
sqs = boto3.client('sqs')
sqs_response = sqs.get_queue_attributes(
    QueueUrl: 'https://sqs.here-is-my-sqs-url.fifo',  # <- this colon should be an equals sign
    AttributeNames=['ApproximateNumberOfMessages']
)
```

No matter the order I put the parameters for get_queue_attributes, I always get an error on the first one, and it's just a basic Runtime.UserCodeSyntaxError, so I can't tell what's wrong. Hopefully once that is figured out I'll be able to get this process working. I appreciate your help, Uri!
There is no reason to check if there are messages in the queue. Just read one message, specifying 0 for the wait time and 1 for the batch size.
If there is a message, the response's Messages attribute will include one message; if there are none, it will be empty.
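A sketch of that read, with hypothetical queue URL and state machine ARN filled in for illustration:

```python
# Hypothetical resource identifiers for the sketch
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/objects.fifo"
STATE_MACHINE_ARN = "arn:aws:states:us-east-1:123456789012:stateMachine:my-machine"

def first_message(response):
    """Pull the single message out of a receive_message response, or None."""
    messages = response.get("Messages", [])
    return messages[0] if messages else None

def handler(event, context):
    import boto3  # preinstalled in the Lambda runtime
    sqs = boto3.client("sqs")
    response = sqs.receive_message(
        QueueUrl=QUEUE_URL,
        MaxNumberOfMessages=1,  # batch size 1
        WaitTimeSeconds=0,      # return immediately if the queue is empty
    )
    message = first_message(response)
    if message is None:
        return  # nothing waiting: done until the next S3 file arrives
    boto3.client("stepfunctions").start_execution(
        stateMachineArn=STATE_MACHINE_ARN,
        input=message["Body"],
    )
    # Delete only after the execution was started successfully
    sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=message["ReceiptHandle"])
```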
The only reason to check the count first might be cost ($0.40 vs. $0.50 per million requests), but I do not think that is a factor in your case.