Wanting to trigger Lambda 10 minutes after receiving webhook.

0

I want to re-trigger a Lambda function 10 minutes after a different event happens.

  • If a certain payload with body is received (Dropbox webhook), start timer.
  • After 10 minutes, send a simple "process" request to my Lambda.
  • This might happen once, maybe twice per week.
  • The "message" I need to store and delay is about 12 characters long.

I thought this would be perfect for SQS, as it can be set to "trigger" a Lambda after a certain delay. I set it up, made the trigger, and it works great... until I noticed that SQS seems to be active all the time.

It looks like Lambda was polling almost continuously, but I couldn't find any evidence of Lambda being active in the logs at these times.

I spent a good couple of hours finding endless posts where people were as confused as I was. I then tried almost every combination of settings and delays in both queue types, and I still get this high usage.

And yes, I have changed the receive message wait time and queue type as suggested in other posts, but all that has done is push the usage from 6 requests a minute to 15 a minute! Receive message wait time: 20s. Delivery delay: 10m.

Cost-wise it doesn't matter, as it's only predicted to use about 20% of my free tier for SQS, but something must be wrong here. Yes, I can see suggestions of alternatives like creating a one-off CloudWatch trigger, but that seems counter-intuitive, as my use case is exactly what SQS was made for.

Background for anyone wondering: I have a Lambda function that processes a Dropbox folder full of files. Dropbox sends webhooks for each upload event, but the folder must be processed in one go. I know that once the uploads start, it never takes more than 5 minutes to complete the upload. Therefore, waiting 10 minutes ensures fairly quick processing without having a regular trigger polling for an event which happens (randomly) just once a week. The entire folder is then transcoded, indexed and copied to S3.

I've got it working just fine - it's just that SQS is completely baffling me...

3 Answers
2
Accepted Answer

The way it works is the following: SQS is a PULL-based service, i.e., if you want to know whether there are messages in the queue, you need to go and ask for them; you will not be notified. Lambda is a PUSH-based service, i.e., in order to invoke a function, someone needs to call the Lambda Invoke API. When we want to trigger a function from a queue, we need to put a component in the middle that polls the queue constantly (using long polling) and, when it gets messages, invokes the function. This is exactly what the Lambda Event Source Mapping does for you. It is an internal Lambda service process (note: Lambda service, not the Lambda function) that polls the queue and invokes the function.

You do not pay for that service, only for the SQS requests ($0.40/million requests).

This is why you see the constant polling on the queue, but you do not see Lambda invocations.
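
To put rough numbers on it: the event source mapping typically keeps around five parallel long-poll connections on an idle queue, and with a 20-second receive message wait time each connection makes about 3 ReceiveMessage calls per minute, so roughly 5 x 3 = 15 requests per minute, which matches what you are seeing. Over a month that is about 15 x 60 x 24 x 30 ≈ 650,000 requests, or around 650,000 x $0.40 / 1,000,000 ≈ $0.26.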

You can achieve similar functionality using EventBridge Scheduler, i.e., when you receive the webhook, create a schedule for 10 minutes from now that will invoke the function.
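
A minimal sketch of that in JavaScript (SDK v3); the function name, schedule name and role ARN are placeholders, and the role must allow EventBridge Scheduler to call lambda:InvokeFunction on the target:

    const { SchedulerClient, CreateScheduleCommand } = require('@aws-sdk/client-scheduler')
    const scheduler = new SchedulerClient({})

    async function scheduleProcessRun (functionArn, roleArn, payload, delayMinutes = 10) {
      const runAt = new Date(Date.now() + delayMinutes * 60 * 1000)
      // at() expressions take a timestamp without a timezone suffix; the default timezone is UTC.
      const when = runAt.toISOString().slice(0, 19)

      await scheduler.send(new CreateScheduleCommand({
        Name: `dropbox-process-${runAt.getTime()}`,
        ScheduleExpression: `at(${when})`,
        FlexibleTimeWindow: { Mode: 'OFF' },
        ActionAfterCompletion: 'DELETE', // delete the schedule once it has fired
        Target: {
          Arn: functionArn,              // the Lambda that processes the folder
          RoleArn: roleArn,              // role that Scheduler assumes to invoke the function
          Input: JSON.stringify(payload) // the small "process" message
        }
      }))
    }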

Uri (AWS, EXPERT)
answered 23 days ago
  • Thanks @uri and @Ravisankar C for your quick answers. OK, it all seems a lot more confusing and complicated than it needs to be.

    I think many people find the term "trigger" particularly confusing because we often think of a "trigger" as being something that causes an event, rather than something that keeps looking for an event to happen, using up quota as it does so.

    With your suggestions I have finally figured out how to do this (with a bit of help from Claude AI! Sadly, Amazon Q didn't seem to understand what I wanted).

    Step 1: Create a rule by running the code below manually once (adding the arguments, of course).
    Step 2: Go to your Lambda function and add a CloudWatch Events trigger with the name of the rule.
    Step 3: Now set your Lambda code to run the code below when the Dropbox webhook is received (or whatever trigger you want) and it should work.

  • Here's the code if anyone else needs it:

    // AWS SDK v3 EventBridge client (assumed available in the function's dependencies).
    const { EventBridgeClient, PutRuleCommand, PutTargetsCommand } = require('@aws-sdk/client-eventbridge')
    const ebClient = new EventBridgeClient({})

    async function createOneOffSchedule (ruleName, delayMinutes, functionArn, payload) {
      // Build a one-off cron expression for "now + delayMinutes" (UTC).
      const cronExpression = createFutureCronExpression(delayMinutes)
      const putRuleCommand = new PutRuleCommand({
        Name: ruleName,
        ScheduleExpression: `cron(${cronExpression})`,
        State: 'ENABLED'
      })

      // Point the rule at the Lambda function, passing the payload through as the event.
      const putTargetsCommand = new PutTargetsCommand({
        Rule: ruleName,
        Targets: [{ Id: 'LambdaTarget', Arn: functionArn, Input: JSON.stringify(payload) }]
      })

      try {
        await ebClient.send(putRuleCommand)
        await ebClient.send(putTargetsCommand)
        console.log(`One-off schedule "${ruleName}" created successfully.`)
      } catch (error) {
        console.error('Error creating one-off schedule:', error)
      }
    }

    // EventBridge cron fields: minutes hours day-of-month month day-of-week year (all UTC).
    function createFutureCronExpression (minutesOffset) {
      const date = new Date()

      date.setMinutes(date.getMinutes() + minutesOffset)

      const hours = date.getUTCHours()
      const minutes = date.getUTCMinutes()
      const dayOfMonth = date.getUTCDate()
      const month = date.getUTCMonth() + 1
      const year = date.getUTCFullYear()

      return `${minutes} ${hours} ${dayOfMonth} ${month} ? ${year}`
    }
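
    For example, the webhook handler might call it along these lines (the rule name, environment variable and payload are just illustrative):

      await createOneOffSchedule('dropbox-delayed-process', 10, process.env.PROCESS_FUNCTION_ARN, { action: 'process' })

    Adding the CloudWatch Events trigger in Step 2 is also what gives EventBridge permission to invoke the function for that rule.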
    
  • You are correct regarding the trigger; however, as SQS doesn't support triggering other services, and we want to make the integrations as seamless as possible and open up new workloads, we created the pollers that look for the messages. BTW, polling an empty queue like this costs less than $0.26 per month.

    A few points regarding the code that you wrote:

    1. You wrote the code (in a Lambda function?) to configure the infrastructure. We have other, better-suited options for that, like CloudFormation, SAM and CDK.
    2. You are using the old way of creating schedules, using the Event Bus rules. We have EventBridge Scheduler, which allows you to create one-time schedules (as well as cron- and rate-based ones). There is no need to create rules; when you create the schedule, you define the target.

    I am not sure how Dropbox calls the webhook, but I would assume it does so for every file uploaded, which means you will create multiple schedules for each folder. Make sure you handle that properly. Maybe each time you get a new webhook you should update the previous schedule, so that it invokes a few minutes after the last file is uploaded, to make sure all of the files are there.
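
    A rough sketch of that "update the previous schedule" idea with EventBridge Scheduler (the name pattern, role ARN and ConflictException handling are assumptions):

      const { SchedulerClient, CreateScheduleCommand, UpdateScheduleCommand } = require('@aws-sdk/client-scheduler')
      const scheduler = new SchedulerClient({})

      // One schedule per folder: create it on the first webhook, and on later
      // webhooks push the same schedule another delayMinutes into the future.
      async function upsertFolderSchedule (folderName, functionArn, roleArn, delayMinutes = 10) {
        const runAt = new Date(Date.now() + delayMinutes * 60 * 1000)
        const params = {
          Name: `dropbox-process-${folderName}`, // folderName assumed safe for a schedule name
          ScheduleExpression: `at(${runAt.toISOString().slice(0, 19)})`, // one-time, UTC
          FlexibleTimeWindow: { Mode: 'OFF' },
          ActionAfterCompletion: 'DELETE',
          Target: { Arn: functionArn, RoleArn: roleArn, Input: JSON.stringify({ folderName }) }
        }

        try {
          await scheduler.send(new CreateScheduleCommand(params))
        } catch (error) {
          if (error.name === 'ConflictException') {
            // A schedule for this folder already exists: move it further out.
            await scheduler.send(new UpdateScheduleCommand(params))
          } else {
            throw error
          }
        }
      }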

  • Thank you @uri for that detailed but clear explanation.

    I ended up creating a trigger like that because I asked Amazon Q what the best alternative to SQS was and it suggested EventBridge, but it wasn't so good at writing the code. So I asked a different AI and it suggested the code above. I couldn't find any way to create an event the "new" way that used cron and could deliver a small payload. It seems to work perfectly for my case.

    Regarding multiple triggers: Yes, you are right, it might retrigger lots of times, so this is what I put in place, based on quite a bit of testing. We know it never takes more than 10 minutes to upload the files. Some files are small and will produce multiple rapid triggers, so we want Lambda to respond as fast as possible so that a concurrent Lambda isn't triggered. Therefore:

    As soon as Lambda is triggered by the first webhook, drop a lockfile and set the timer going. When the next webhook comes in, if there's a lockfile, quickly return an "OK" to Dropbox but go no further. When the EventBridge trigger arrives, process the Dropbox folder. Because the webhook requests arrive in quick succession, it's very unlikely that the Lambda will have gone to sleep and deleted the lockfile in between, but try to delete the lockfile anyway in case it still exists. From my testing, Lambda is responding in <20ms to webhooks, and there's been no concurrency, so that's fine.

    When I have more time later in the year I might try and re-work it the "proper" way :)

  • What do you mean by a lock file? A file in the /tmp folder of the function? If so, this is not the right approach. Although we MAY reuse the same instance, it may well be that we create new ones, especially if you have files uploaded to 2 folders calling the webhook. Using state inside the function is not recommended, and if it worked for you, it is pure luck.

    You should keep the state in some external source, like DynamoDB. Use the Dropbox folder name as the key and use DDB conditional writes to make sure that only one instance triggers the schedule (sketched below).

    BTW, based on what I see in your code, you do not need cron expressions; you need one-time schedules. If you use the EventBridge Bus rules for this, you will very soon run out of your rule limit (300). EB does not delete the rules; you will need to delete them yourself. Maybe you can find some good code examples here.
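
    A minimal sketch of that DynamoDB conditional write (the table name and key attribute are assumptions):

      const { DynamoDBClient, PutItemCommand } = require('@aws-sdk/client-dynamodb')
      const ddb = new DynamoDBClient({})

      // Returns true if this invocation "won" the folder and should create the schedule,
      // false if another invocation already claimed it.
      async function claimFolder (folderName) {
        try {
          await ddb.send(new PutItemCommand({
            TableName: 'dropbox-folder-locks',          // assumed table with partition key "folderName"
            Item: {
              folderName: { S: folderName },
              claimedAt: { N: String(Date.now()) }
            },
            ConditionExpression: 'attribute_not_exists(folderName)' // fail if a lock item already exists
          }))
          return true
        } catch (error) {
          if (error.name === 'ConditionalCheckFailedException') return false
          throw error
        }
      }

    The processing function can delete the item once the folder has been handled, or the table can use a TTL attribute so stale locks expire on their own.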

1

How about using Step Functions? Using the Wait state of Step Functions, you can invoke a Lambda function after waiting 10 minutes.

https://docs.aws.amazon.com/step-functions/latest/dg/amazon-states-language-wait-state.html
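
A minimal sketch of what such a state machine definition could look like (the state names and function ARN are placeholders; 600 seconds is the 10-minute wait):

    // Amazon States Language definition, expressed here as a JavaScript object:
    // wait 10 minutes, then invoke the Lambda that processes the folder.
    const definition = {
      StartAt: 'WaitTenMinutes',
      States: {
        WaitTenMinutes: { Type: 'Wait', Seconds: 600, Next: 'ProcessFolder' },
        ProcessFolder: {
          Type: 'Task',
          Resource: 'arn:aws:lambda:REGION:ACCOUNT_ID:function:process-dropbox-folder',
          End: true
        }
      }
    }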

shibata (EXPERT)
answered 22 days ago
  • Thanks - but I think I looked into that. Doesn't it require the Lambda function to stay "alive" while waiting? I don't want that, as I have set the RAM/CPU very high so that it processes the files more quickly, and it may end up being costly to sit idle for long periods. I may have misunderstood Step Functions though.

  • Yes, there is no need to run Lambda longer than necessary.

    Since Step Functions manages the state, there is no need to keep Lambda running while it waits for 10 minutes.

    I recommend the following configuration.

    Since Step Functions does not directly receive webhooks, it accepts them via Lambda or API Gateway, which then starts Step Functions (the Lambda can terminate at this point, and since it only starts Step Functions, a small resource allocation is sufficient).

    After Step Functions starts, it waits 10 minutes before launching the Lambda that processes the files.
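
    A sketch of the webhook-receiving side under this configuration (the state machine ARN environment variable is an assumption):

      const { SFNClient, StartExecutionCommand } = require('@aws-sdk/client-sfn')
      const sfn = new SFNClient({})

      // Called by the small Lambda (or API Gateway integration) that receives the
      // Dropbox webhook: start the state machine and return to Dropbox immediately.
      async function startDelayedProcessing (payload) {
        await sfn.send(new StartExecutionCommand({
          stateMachineArn: process.env.STATE_MACHINE_ARN, // assumed environment variable
          input: JSON.stringify(payload)                  // the small "process" message
        }))
      }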

1

You could make use of EventBridge to achieve the same thing. Refer to the tutorial below, which could help. https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-run-lambda-schedule.html

answered 23 days ago
