Infinite retries due to exceeded SQS visibility timeout

0

I am using SQS to queue messages that are processed by workers running in an ECS task. I understand that if processing takes longer than the SQS visibility timeout, the same message becomes visible to other workers again. This is causing an infinite loop, and the queue gets stuck on the same message. Is there a way to limit the retries or unblock the other messages?

karan
Asked a month ago, 103 views
2 answers
1

Hello.

Basically, the application must be designed so that nothing breaks even if it processes the same SQS message more than once.
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/standard-queues-at-least-once-delivery.html

If this occurs, the copy of the message isn't deleted on that unavailable server, and you might get that message copy again when you receive messages. Design your applications to be idempotent (they should not be affected adversely when processing the same message more than once).

The Lambda documentation recommends setting the queue's visibility timeout to at least six times the function timeout.
By the same reasoning, I think it would be a good idea to set a fairly long visibility timeout even though you are using ECS.
https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html

To allow your function time to process each batch of records, set the source queue's visibility timeout to at least six times the timeout that you configure on your function. The extra time allows for Lambda to retry if your function is throttled while processing a previous batch.

If you set up a dead-letter queue (DLQ) and lower the maximum receive count (maxReceiveCount) on the source queue, a message that keeps timing out is moved to the DLQ after that many receives instead of being retried forever, which also limits duplicate processing to some extent.
https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-configure-dead-letter-queue.html
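As a rough illustration of the DLQ setup described above, here is a minimal Python sketch. It assumes a boto3 SQS client is passed in as `sqs`; the helper names `redrive_attributes` and `attach_dlq`, the queue URL, the ARN, and the default of 3 receives are all hypothetical and not taken from the thread.

```python
import json


def redrive_attributes(dlq_arn, max_receive_count=3):
    """Build the queue attributes that attach a DLQ to a source queue."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    return {
        "RedrivePolicy": json.dumps(
            {
                "deadLetterTargetArn": dlq_arn,
                # After this many receives without a delete, SQS moves
                # the message to the DLQ instead of redelivering it.
                "maxReceiveCount": str(max_receive_count),
            }
        )
    }


def attach_dlq(sqs, queue_url, dlq_arn, max_receive_count=3):
    """Apply the redrive policy to the source queue.

    `sqs` is assumed to be a boto3 SQS client, e.g. boto3.client("sqs").
    """
    sqs.set_queue_attributes(
        QueueUrl=queue_url,
        Attributes=redrive_attributes(dlq_arn, max_receive_count),
    )
```

With a small maxReceiveCount, a message that repeatedly exceeds the visibility timeout stops blocking the workers after a few attempts and can be inspected in the DLQ.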

Expert
Answered a month ago
Expert
Reviewed a month ago
  • Thanks for the suggestion. The processing time varies from message to message, so I cannot set a high visibility timeout for the whole queue. I want to use a DLQ and reprocess the same message in the source queue with a higher visibility timeout for that specific message. Is there a standard way of doing this? My idea is to trigger a Lambda function when a message is moved to the DLQ. The function would send the message back to the source queue with a higher visibility timeout.

0

You should make sure that your visibility timeout is longer than the processing time. The challenge is that if the consumer fails to process the message and you configured a long visibility timeout, it will take a long time before the message becomes visible again and is retried. In this case the best practice is to set a shorter visibility timeout and extend it from the consumer while it is processing the message.

In addition, set up a DLQ and set the maximum receive count to a small number, so that even if you do not extend the timeout, you will not get into a loop.
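The "short timeout, extended while processing" approach could look roughly like the following Python sketch. It assumes a boto3 SQS client is passed in as `sqs`; the function name `process_with_heartbeat`, the `handler` callable, and the 60-second and 30-second defaults are hypothetical choices for illustration.

```python
import threading


def process_with_heartbeat(sqs, queue_url, receipt_handle, handler,
                           extend_by=60, interval=30.0):
    """Run handler() while periodically extending the message's visibility.

    `sqs` is assumed to be a boto3 SQS client. The message is deleted only
    if handler() returns without raising.
    """
    stop = threading.Event()

    def heartbeat():
        # Event.wait returns False on timeout, so each elapsed interval
        # triggers one extension; it returns True once stop is set.
        while not stop.wait(interval):
            sqs.change_message_visibility(
                QueueUrl=queue_url,
                ReceiptHandle=receipt_handle,
                # New timeout, counted from the moment of this call.
                VisibilityTimeout=extend_by,
            )

    worker = threading.Thread(target=heartbeat, daemon=True)
    worker.start()
    try:
        result = handler()
    finally:
        stop.set()
        worker.join()

    # Reached only on success; on failure the message reappears after the
    # last extension expires and counts toward maxReceiveCount.
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=receipt_handle)
    return result
```

Keeping `interval` comfortably below `extend_by` leaves room for a delayed heartbeat, and a crashed worker releases the message after at most `extend_by` seconds.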

Uri (AWS)
Expert
Answered a month ago
  • Thanks for the suggestion. Changing the visibility timeout from the consumer is not in scope, because the messages are consumed as Celery tasks. I want to use a DLQ and reprocess the same message in the source queue with a higher visibility timeout. Is there a standard way of doing this? My idea is to trigger a Lambda function when a message is moved to the DLQ. The function would send the message back to the source queue with a higher visibility timeout.

  • You can't specify a visibility timeout when sending a message, so this will not work. Why not set a higher default visibility timeout on the queue to start with, so that every message is processed only once?
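To illustrate the point above: the visibility timeout is chosen on the receiving side, not by the sender. A minimal Python sketch, assuming a boto3 SQS client is passed in as `sqs` (the helper name `receive_with_timeout` and its defaults are hypothetical), shows the per-receive override. It only helps when you control the receive call yourself, which may not be the case when a framework such as Celery does the polling.

```python
def receive_with_timeout(sqs, queue_url, visibility_timeout,
                         max_messages=1, wait_seconds=20):
    """Receive messages with a visibility timeout chosen for this call.

    `sqs` is assumed to be a boto3 SQS client. The value overrides the
    queue's default only for the messages returned by this receive.
    """
    # SQS accepts 0 to 43200 seconds (12 hours).
    if not 0 <= visibility_timeout <= 43200:
        raise ValueError("visibility_timeout must be between 0 and 43200")
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,
        WaitTimeSeconds=wait_seconds,  # long polling
        VisibilityTimeout=visibility_timeout,
    )
    # The "Messages" key is absent when the queue returned nothing.
    return response.get("Messages", [])
```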
