Same Message Group Tasks executing in parallel in Celery with Amazon SQS FIFO


I have a setup where I'm using Celery as the task queue with Amazon SQS FIFO. My goal is to ensure sequential processing of tasks within the same message group ID, while allowing tasks with different message group IDs to be processed in parallel. However, despite following the recommended configurations and understanding the behavior of SQS message groups, I'm experiencing parallel processing of tasks within the same message group by multiple Celery worker processes. How can I ensure that tasks with the same message group ID are processed sequentially by a single worker process, while maintaining parallel processing for tasks with different message group IDs?

Some extra details (for reference): For Celery I haven't used the --concurrency setting, so by default it spawns 4 pool processes (the number of cores). I am passing the message group ID using the following syntax:

```python
message_properties = {"MessageGroupId": f"{supplier_id}"}
celery_task.s(param1, **message_properties).apply_async(**message_properties)
```

I have made sure the queue is FIFO and its name ends with `.fifo`. Additional broker transport settings:

```python
{'polling_interval': 60, 'wait_time_seconds': 10, 'visibility_timeout': 600}
```
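For context, a minimal, self-contained sketch of the setup described above. The app name, queue name, and the `supplier_id`/`param1` values are placeholders I've added; the transport options and the dispatch syntax are taken from the details above:

```python
# Minimal sketch of the setup described in the question.
# "supplier_tasks", "my-queue.fifo", supplier_id and param1 are placeholders;
# the broker transport options and the dispatch syntax mirror the question.
from celery import Celery

app = Celery("supplier_tasks", broker="sqs://")  # AWS credentials come from the environment

app.conf.broker_transport_options = {
    "polling_interval": 60,     # seconds between polls when the queue is idle
    "wait_time_seconds": 10,    # SQS long polling
    "visibility_timeout": 600,  # seconds a received message stays invisible to other consumers
}
app.conf.task_default_queue = "my-queue.fifo"  # FIFO queue names must end with .fifo


@app.task
def celery_task(param1, **kwargs):
    ...  # task body omitted


# Dispatch, passing the message group ID exactly as shown in the question
# (note it ends up both as a task kwarg and as a message property).
supplier_id = "supplier-123"
param1 = "some-value"
message_properties = {"MessageGroupId": f"{supplier_id}"}
celery_task.s(param1, **message_properties).apply_async(**message_properties)
```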

Manish
asked 10 months ago · 291 views
1 Answer

Not very familiar with Celery, but my guess is that it reads messages from SQS in batches and then distributes those batches to the different workers without honoring the message group ID. If there is a way, configure the batch size to 1; this will solve the issue. Otherwise, you will need to set your workers to 1, and then you will only be able to process a single message at a time.
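There is no direct batch-size setting in Celery as far as I know; the closest knobs are the prefetch multiplier and late acknowledgements. A minimal sketch, using a placeholder app name:

```python
# Sketch: the closest Celery equivalents to a "batch size of 1" that I'm aware of.
# These limit how far ahead each worker prefetches; they do not, by themselves,
# make Celery honor SQS message group IDs.
from celery import Celery

app = Celery("supplier_tasks", broker="sqs://")  # placeholder app name

app.conf.worker_prefetch_multiplier = 1  # reserve roughly one message per pool process
app.conf.task_acks_late = True           # delete the SQS message only after the task finishes
```

Combined with --concurrency=1 on the worker command line, this effectively processes one message at a time, at the cost of losing parallelism across message groups.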

AWS
EXPERT
Uri
answered 10 months ago
  • Thanks for your response. There does not seem to be a direct way to configure the batch size in Celery, and setting workers to 1 will cause performance issues.

    As per the Amazon SQS FIFO documentation:

    'When receiving messages from a FIFO queue with multiple message group IDs, Amazon SQS first attempts to return as many messages with the same message group ID as possible. This allows other consumers to process messages with a different message group ID. When you receive a message with a message group ID, no more messages for the same message group ID are returned unless you delete the message or it becomes visible.'

    According to the above description, I understand that even with multiple consumers (multiple Celery workers in my case) it should still honor message group IDs. Or am I missing something? Any help would be appreciated, thanks. (A boto3 sketch of the ReceiveMessage behavior follows this comment.)
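One way to see what the SQS side actually returns, independently of Celery, is a bare boto3 receive call that prints the group ID of each message in a batch (a sketch; the queue URL is a placeholder). Per the documentation quoted above, a single ReceiveMessage batch can contain several messages from the same group, so a consumer that fans one batch out to a pool of processes will run same-group messages in parallel:

```python
# Sketch: inspect which message groups come back in a single ReceiveMessage batch.
# The queue URL is a placeholder. One batch may contain several messages from the
# same group; the group is only locked against *other* receive calls while those
# messages are in flight.
import boto3

sqs = boto3.client("sqs")
resp = sqs.receive_message(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/my-queue.fifo",
    MaxNumberOfMessages=10,
    WaitTimeSeconds=10,
    AttributeNames=["MessageGroupId"],
)
for msg in resp.get("Messages", []):
    print(msg["Attributes"]["MessageGroupId"], msg["MessageId"])
```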
