The way it works is the following: SQS is a PULL-based service, i.e., if you want to know whether there are messages in the queue, you need to go and ask for them; you will not be notified. Lambda is a PUSH-based service, i.e., in order to invoke a function, someone needs to call the Lambda Invoke API. When we want to trigger a function from a queue, we need to put a component in the middle that polls the queue constantly (using long polling) and, when it gets messages, invokes the function. This is exactly what Lambda Event Source Mapping does for you. It is an internal Lambda service process (note: the Lambda service, not the Lambda function) that polls the queue and invokes the function.
You do not pay for that service, only for the SQS requests ($0.40/million requests).
This is why you see the constant polling on the queue, but you do not see Lambda invocations.
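The polling you see in the metrics is roughly equivalent to the following sketch. This is an illustration of long polling, not Lambda's actual internal code; the queue URL and the loop shape are assumptions (the real poller also scales the number of concurrent pollers up and down):

```python
def poll_once(sqs, queue_url):
    """One long-poll call, as the event source mapping would issue it.

    `sqs` is assumed to be a boto3 SQS client. With WaitTimeSeconds=20,
    the call blocks up to 20 seconds waiting for messages, so an idle
    queue costs roughly one billable request every 20 seconds per poller.
    """
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # batch size, like the ESM BatchSize setting
        WaitTimeSeconds=20,      # long polling
    )
    return resp.get("Messages", [])
```

The event source mapping runs this loop for you and calls Invoke whenever the returned list is non-empty; that is why the SQS request metric is never zero even when there are no invocations.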
You can achieve similar functionality using EventBridge Scheduler, i.e., when you receive the WebHook, create a schedule for 10 minutes from now that will invoke the function.
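A minimal sketch of what that could look like with boto3's EventBridge Scheduler client; the schedule name, ARNs, and payload are hypothetical, and this only builds the `create_schedule` arguments:

```python
import json
from datetime import datetime, timedelta, timezone

def one_time_schedule_args(name, lambda_arn, role_arn, payload, delay_minutes=10):
    """Arguments for scheduler.create_schedule: fire once, N minutes from now."""
    run_at = datetime.now(timezone.utc) + timedelta(minutes=delay_minutes)
    return {
        "Name": name,
        # at(...) expressions are one-time schedules, unlike cron()/rate()
        "ScheduleExpression": f"at({run_at:%Y-%m-%dT%H:%M:%S})",
        "FlexibleTimeWindow": {"Mode": "OFF"},
        # Ask the Scheduler to delete the schedule itself after it fires
        "ActionAfterCompletion": "DELETE",
        "Target": {
            "Arn": lambda_arn,
            "RoleArn": role_arn,           # role the Scheduler assumes to invoke
            "Input": json.dumps(payload),  # small JSON payload for the function
        },
    }
```

Usage would be along the lines of `boto3.client("scheduler").create_schedule(**one_time_schedule_args(...))`.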
How about using StepFunctions? Using the Wait state of StepFunctions, you can invoke a Lambda function after waiting 10 minutes.
https://docs.aws.amazon.com/step-functions/latest/dg/amazon-states-language-wait-state.html
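A sketch of the corresponding Amazon States Language definition, built as a Python dict (state names and the Lambda ARN are placeholders):

```python
def wait_then_invoke(lambda_arn, wait_seconds=600):
    """ASL definition: a Wait state, then a Task state that invokes the Lambda."""
    return {
        "Comment": "Wait 10 minutes, then process the uploaded files",
        "StartAt": "WaitTenMinutes",
        "States": {
            "WaitTenMinutes": {
                "Type": "Wait",
                "Seconds": wait_seconds,  # 600 s = 10 minutes
                "Next": "ProcessFiles",
            },
            "ProcessFiles": {
                "Type": "Task",
                "Resource": lambda_arn,
                "End": True,
            },
        },
    }
```

With a Standard workflow you are charged per state transition, not for the time spent in the Wait state, so nothing is "running" during the 10 minutes.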
Thanks - but I think I looked into that. Doesn't it require the lambda function to keep "alive" while awaiting? I don't want that as I have set the RAM/CPU very high so that it processes the files more quickly, and it may end up being costly to sit idle for long periods. I may have misunderstood StepFunctions though.
Yes, there is no need to run Lambda longer than necessary.
Since StepFunctions manages the state, running Lambda while it waits for 10 minutes is unnecessary.
I recommend the following configuration.
Since StepFunctions does not receive WebHooks directly, accept the WebHook via Lambda or API Gateway and start the StepFunctions execution from there (the Lambda can terminate at that point, and since it only starts StepFunctions, a small resource allocation is sufficient).
After the StepFunctions execution starts, it waits 10 minutes before launching the Lambda that processes the file.
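The webhook-receiving Lambda then only needs one API call. A sketch of the `start_execution` arguments it would build (the state machine ARN and payload shape are assumptions):

```python
import json
import uuid

def start_execution_args(state_machine_arn, webhook_body):
    """Arguments for stepfunctions.start_execution from the webhook Lambda."""
    return {
        "stateMachineArn": state_machine_arn,
        # Execution names must be unique within the state machine
        "name": f"webhook-{uuid.uuid4()}",
        "input": json.dumps(webhook_body),
    }
```

In the handler this would be `boto3.client("stepfunctions").start_execution(**start_execution_args(...))`, after which the function returns immediately.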
You could make use of EventBridge to achieve the same. The tutorial below may help. https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-run-lambda-schedule.html
Thanks @uri and @Ravisankar C for your quick answers. OK, it all seems a lot more confusing and complicated than it need be.
I think many people find the term "trigger" particularly confusing because we often think of a "trigger" as being something that causes an event, rather than something that keeps looking for an event to happen and using up quota.
With your suggestions finally I have figured out how to do this (with a bit of help from Claude AI! Sadly Amazon Q didn't seem to understand what I wanted).
Step 1: Create a rule by running the code below manually once (adding the arguments, of course).
Step 2: Go to your Lambda function and add a CloudWatch Events trigger with the name of the rule.
Step 3: Now set your Lambda code to run the code below when the Dropbox webhook is received (or whatever trigger you want) and it should work.
Here's the code if anyone else needs it:
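(The original snippet is not reproduced in this thread. For anyone following along, here is a minimal sketch of what the put_rule/put_targets calls described in the steps above might look like; the rule name, target id, and payload are hypothetical, and this only builds the arguments.)

```python
import json
from datetime import datetime, timedelta, timezone

def rule_args(rule_name, delay_minutes=10):
    """put_rule arguments: a cron expression that fires once, N minutes from now."""
    t = datetime.now(timezone.utc) + timedelta(minutes=delay_minutes)
    # EventBridge cron fields: minutes hours day-of-month month day-of-week year
    return {
        "Name": rule_name,
        "ScheduleExpression": f"cron({t.minute} {t.hour} {t.day} {t.month} ? {t.year})",
        "State": "ENABLED",
    }

def target_args(rule_name, lambda_arn, payload):
    """put_targets arguments: deliver a small JSON payload to the function."""
    return {
        "Rule": rule_name,
        "Targets": [{"Id": "1", "Arn": lambda_arn, "Input": json.dumps(payload)}],
    }
```

Wired up as `events = boto3.client("events"); events.put_rule(**rule_args(...)); events.put_targets(**target_args(...))`, plus the Lambda permission from Step 2.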
You are correct regarding the trigger; however, as SQS doesn't support triggering other services, and we want to make the integrations as seamless as possible and open up new workloads, we created the pollers that look for the messages. BTW, polling an empty queue like this costs less than $0.26 per month.
A few points regarding the code that you wrote:
I am not sure how DropBox calls the webhook, but I would assume it does so for every file uploaded, which means you will create multiple schedules for each folder. Make sure you handle that properly. Maybe each time you get a new webhook, you should update the previous schedule, so that it invokes a few minutes after the last file is uploaded, to make sure that all the files are there.
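That "update the previous one" debounce can be sketched like this, assuming the EventBridge Scheduler API (where creating a schedule whose name already exists raises a ConflictException):

```python
def debounce_schedule(scheduler, args):
    """Create the one-time schedule, or push its fire time forward.

    `scheduler` is assumed to be a boto3 EventBridge Scheduler client and
    `args` the full create_schedule/update_schedule argument dict. Using the
    folder name as the schedule Name means each new webhook for the same
    folder replaces the pending schedule instead of adding another one, so
    the processing function runs N minutes after the LAST upload.
    """
    try:
        scheduler.create_schedule(**args)
    except scheduler.exceptions.ConflictException:
        scheduler.update_schedule(**args)
```

Note that `update_schedule` replaces the whole schedule, so pass the same complete argument set both times.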
Thank you @uri for that detailed but clear explanation.
I ended up creating a trigger like that because I asked Amazon Q what the best alternative to SQS and it suggested EventBridge, but wasn't so good at writing the code. So I asked a different AI and it suggested the code above. I couldn't find any way to create an event the "new" way that used cron and could deliver a small payload. It seems to work perfectly for my case.
Regarding multiple triggers: yes, you are right, it might retrigger lots of times, so this is what I put in place, based on quite a bit of testing. We know it never takes more than 10 minutes to upload the files. Some files are small and will produce multiple rapid triggers, so we want Lambda to respond as fast as possible so a concurrent Lambda isn't triggered. Therefore:
As soon as Lambda is triggered by the first webhook, drop a lockfile and set the timer going. When the next webhook comes in, if there's a lockfile, quickly return an "OK" to Dropbox but go no further. When the EventBridge trigger arrives, process the Dropbox folder. Because the webhook requests arrive in quick succession, it's very unlikely that the Lambda will have gone to sleep and deleted the lockfile in between, but try to delete the lockfile anyway in case it still exists. From my testing, Lambda is responding in <20ms to webhooks, and there's been no concurrency, so that's fine.
When I have more time later in the year I might try and re-work it the "proper" way :)
What do you mean by a lock file? A file in the /tmp folder of the function? If so, this is not the right approach. Although it MAY be that we will reuse the same instance, it may well be that we will create new ones, especially if you have files uploaded to 2 folders calling the webhook. Keeping state inside the function is not recommended, and if it worked for you, it is pure luck.
You should keep the state in some external source, like DynamoDB. Use the DropBox folder name as the key and use DDB conditional writes to make sure that only one instance triggers the schedule.
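A sketch of that conditional write; the table name and attribute names are hypothetical, and this only builds the `put_item` arguments:

```python
def acquire_lock_args(table_name, folder):
    """put_item arguments: succeeds only for the first writer per folder.

    The ConditionExpression makes the write fail with
    ConditionalCheckFailedException for every Lambda instance except the
    first one, so only one instance creates the schedule.
    """
    return {
        "TableName": table_name,
        "Item": {"folder": {"S": folder}},
        "ConditionExpression": "attribute_not_exists(folder)",
    }
```

Used as `boto3.client("dynamodb").put_item(**acquire_lock_args(...))` inside a try/except; catching the conditional-check failure is the "lockfile exists" branch. Adding a TTL attribute to the item would let the lock expire on its own after processing.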
BTW, based on what I see in your code, you do not need cron expressions; you need one-time schedules. If you use EventBridge Bus rules for this, you will very soon run out of your rules limit (300). EB does not delete the rules; you will need to delete them yourself. Maybe you can find some good code examples here.
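If you stay with Bus rules, one option is to have the processing function delete its own rule at the end of each run. A sketch, assuming a boto3 EventBridge client and the same target id used when the rule was created:

```python
def delete_one_time_rule(events, rule_name, target_id="1"):
    """Delete a fired one-time rule.

    EventBridge refuses to delete a rule that still has targets
    attached, so the targets must be removed first.
    """
    events.remove_targets(Rule=rule_name, Ids=[target_id])
    events.delete_rule(Name=rule_name)
```

With EventBridge Scheduler instead, the same cleanup is built in: one-time schedules can be created with an option to delete themselves after completion.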