Inconsistency of SQS (Simple Queue Service) message reception from SNS (Simple Notification Service) fan-out

0

We are currently experiencing a critical issue in our production environment related to the inconsistency of SQS (Simple Queue Service) message reception from SNS (Simple Notification Service) fan-out. Our setup involves an SNS topic triggering SQS queues based on message attributes. While the email delivery mechanism triggered by the same SNS topic and message attribute is functioning properly, the SQS queues intermittently fail to receive messages. This inconsistency poses a significant operational risk to our system.

Asked 2 months ago · 184 views
2 Answers
1
Accepted Answer

How can you tell that the messages do not arrive in the queue? Did you check the CloudWatch metrics for the queue for the number of sent messages?
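One way to run the check described above is to query CloudWatch for the queue's NumberOfMessagesSent metric. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are configured; the function names and the queue name `orders-queue` are hypothetical and not taken from the original post.

```python
from datetime import datetime, timedelta, timezone


def build_sent_metric_query(queue_name, hours=1, period_seconds=300, now=None):
    """Build the keyword arguments for CloudWatch GetMetricStatistics."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/SQS",
        "MetricName": "NumberOfMessagesSent",
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": ["Sum"],
    }


def total_messages_sent(queue_name, hours=1):
    """Sum NumberOfMessagesSent over the window (needs boto3 and credentials)."""
    import boto3  # imported lazily so the query builder works without boto3

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **build_sent_metric_query(queue_name, hours)
    )
    return sum(point["Sum"] for point in response["Datapoints"])


# Hypothetical usage: total_messages_sent("orders-queue", hours=2)
```

If the total matches the number of messages published with matching attributes, the messages did reach the queue and the loss is more likely on the consuming side.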

Could it be that you have more than one consumer on the queue? Each message is delivered to only one consumer, which deletes it from the queue after processing it. If you have more than one consumer, each one will get only a subset of the messages, so a consumer that expects to see every message should be the only one on its queue.
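The effect of several consumers sharing one queue can be illustrated with a toy model. This is only a sketch: the round-robin hand-off and the consumer names `app_a` and `app_b` are assumptions made for illustration, and real SQS does not guarantee any particular distribution between consumers.

```python
from collections import deque


def simulate_competing_consumers(messages, consumer_names):
    """Hand each message to exactly one consumer, as a toy model of a shared queue."""
    queue = deque(messages)
    received = {name: [] for name in consumer_names}
    turn = 0
    while queue:
        # Receive and delete: once taken, no other consumer sees this message.
        message = queue.popleft()
        consumer = consumer_names[turn % len(consumer_names)]
        received[consumer].append(message)
        turn += 1
    return received


result = simulate_competing_consumers(
    ["m1", "m2", "m3", "m4", "m5"], ["app_a", "app_b"]
)
print(result)
# {'app_a': ['m1', 'm3', 'm5'], 'app_b': ['m2', 'm4']}
```

From the point of view of `app_a` alone, messages m2 and m4 appear to have never arrived, which looks the same as intermittent delivery failure.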

Uri (AWS, Expert)
Answered 2 months ago
Reviewed by an Expert 2 months ago
  • Hello Uri,

    Yes, that's right. We have more than one consumer accidentally configured for the same SQS. That's the root cause of the issue. Thank you so much for your answer.

    Regards, Vijay

0

Are you able to take the messages that failed to be received by SQS and resend them? That would allow you to confirm whether the queue just failed to receive them or whether the subscription filter policy is why the queue isn't receiving the messages. Also, do you have a DLQ to see if there was a problem with SQS processing the message? The SNS integration with SQS will retry any failed delivery attempts.
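Before resending, the attributes of a failed message can be checked against the subscription filter policy offline. The sketch below is a deliberately simplified matcher that handles exact string matching only; real SNS filter policies also support other operators, which it ignores. The attribute name `event_type` and its values are hypothetical.

```python
def matches_filter_policy(policy, attributes):
    """Return True if every policy key is present and its value is allowed.

    Simplified: exact string matching only, no prefix or numeric operators.
    """
    for key, allowed_values in policy.items():
        if key not in attributes:
            return False
        if attributes[key] not in allowed_values:
            return False
    return True


# Hypothetical policy and message attributes, for illustration only.
policy = {"event_type": ["order_placed", "order_cancelled"]}
print(matches_filter_policy(policy, {"event_type": "order_placed"}))
print(matches_filter_policy(policy, {"event_type": "refund_issued"}))
```

If a message that went missing passes this kind of check, the filter policy is an unlikely cause and attention can shift to delivery and consumption.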

AWS
Answered 2 months ago
  • Hi Matthew,

    Yes, I attempted to send the same message, and it eventually worked after a few attempts. I don't see any issues with the filter policy. Another subscription for email, using the same filter policy, consistently works fine.

    Additionally, I have configured a Dead Letter Queue (DLQ) for the SQS, but there are no messages in it.

    The current SNS settings include 3 retries. I attempted to disable encryption, but it still did not resolve the issue. Interestingly, the same settings in my lower environments are functioning correctly; they have not failed even once.

  • SNS's integration with SQS retries 3 times immediately, then switches to backoff, and makes over 100k attempts in total over 23 days. I agree the filter policy isn't the problem if sending the same message eventually goes through and it works in your lower environments.

    How are you confirming that SQS is processing the message? Are you using the CloudWatch NumberOfMessagesSent metric to check that a message was received by SQS from SNS?
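The "over 100k attempts over 23 days" figure mentioned above can be reproduced with simple arithmetic. The phase counts and delays below are assumptions based on the default SNS delivery policy for AWS-managed endpoints such as SQS as documented by AWS; they are not stated in this thread and should be checked against the current documentation.

```python
# Assumed default phases for AWS-managed endpoints: (attempts, delay in seconds).
IMMEDIATE = (3, 0)          # no delay between attempts
PRE_BACKOFF = (2, 1)        # 1 second apart
POST_BACKOFF = (100_000, 20)  # 20 seconds apart

# The backoff phase grows from 1 to 20 seconds, so it is bounded, not exact.
BACKOFF_ATTEMPTS = 10
BACKOFF_MIN_DELAY = 1
BACKOFF_MAX_DELAY = 20

SECONDS_PER_DAY = 86_400


def total_attempts():
    """Total delivery attempts across all four phases."""
    return IMMEDIATE[0] + PRE_BACKOFF[0] + BACKOFF_ATTEMPTS + POST_BACKOFF[0]


def retry_window_days():
    """Lower and upper bound, in days, on how long the retries last."""
    fixed = sum(count * delay for count, delay in (IMMEDIATE, PRE_BACKOFF, POST_BACKOFF))
    low = fixed + BACKOFF_ATTEMPTS * BACKOFF_MIN_DELAY
    high = fixed + BACKOFF_ATTEMPTS * BACKOFF_MAX_DELAY
    return low / SECONDS_PER_DAY, high / SECONDS_PER_DAY


print(total_attempts())
print(retry_window_days())
```

Under these assumptions the post-backoff phase dominates: 100,000 attempts at 20 seconds apart is 2,000,000 seconds, a little over 23 days, so a message that is ultimately undeliverable would take weeks to surface in a subscription DLQ.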
