Inconsistency of SQS (Simple Queue Service) message reception from SNS (Simple Notification Service) fan-out

We are currently experiencing a critical issue in our production environment related to the inconsistency of SQS (Simple Queue Service) message reception from SNS (Simple Notification Service) fan-out. Our setup involves an SNS topic triggering SQS queues based on message attributes. While the email delivery mechanism triggered by the same SNS topic and message attribute is functioning properly, the SQS queues intermittently fail to receive messages. This inconsistency poses a significant operational risk to our system.

Asked 2 months ago, viewed 184 times

2 answers

Accepted answer

How can you tell that the messages do not arrive in the queue? Did you check the CloudWatch metrics for the queue for the number of sent messages?
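If it helps, here is a minimal sketch of that check in Python. It assumes boto3 is installed and that default credentials and region are configured; the queue name passed in is a placeholder, not something from the question. It sums the SQS NumberOfMessagesSent metric for one queue over a recent window.

```python
from datetime import datetime, timedelta, timezone


def build_metric_query(queue_name, hours=24, period_seconds=300):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/SQS",
        "MetricName": "NumberOfMessagesSent",
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": ["Sum"],
    }


def messages_sent(queue_name, hours=24):
    """Return the total NumberOfMessagesSent for the queue over the window."""
    # Imported lazily so the query builder can be used without AWS access.
    import boto3

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(
        **build_metric_query(queue_name, hours)
    )
    return int(sum(point["Sum"] for point in response["Datapoints"]))
```

Calling `messages_sent("my-queue-name")` with the real queue name and comparing the result against the number of messages published to the topic should show whether messages are reaching the queue at all.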

Could it be that you have more than one consumer on the queue? You should only have one, because a consumer deletes each message from the queue after processing it and no other consumer will then see that message. If you have more than one consumer, each one will get only a subset of the messages.
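To illustrate the effect, here is a small local simulation. It is only a sketch: it uses Python's in-memory `queue.Queue` as a stand-in for SQS, and `run_consumers` is a made-up helper name. Each message is taken by exactly one consumer, so no single consumer sees the full set once there are several.

```python
import queue
import threading


def run_consumers(messages, consumer_count):
    """Simulate consumers sharing one queue; return what each one received."""
    shared = queue.Queue()
    for message in messages:
        shared.put(message)

    received = [[] for _ in range(consumer_count)]

    def consume(index):
        while True:
            try:
                # get_nowait removes the message, like receive + delete in SQS.
                message = shared.get_nowait()
            except queue.Empty:
                return
            received[index].append(message)

    threads = [
        threading.Thread(target=consume, args=(i,))
        for i in range(consumer_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return received


if __name__ == "__main__":
    batches = run_consumers([f"msg-{i}" for i in range(1000)], 2)
    for index, batch in enumerate(batches):
        print(f"consumer {index} received {len(batch)} messages")
```

How the messages split between consumers varies from run to run, but the per-consumer counts always add up to the total, which is why a single consumer looking at its own log can appear to be missing messages.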

Uri (AWS, Expert), answered 2 months ago
Reviewed by an Expert 2 months ago
  • Hello Uri,

    Yes, that's right. We accidentally had more than one consumer configured for the same SQS queue. That was the root cause of the issue. Thank you so much for your answer.

    Regards, Vijay

Are you able to take the messages that SQS failed to receive and resend them? That would let you confirm whether the queue simply failed to receive them or whether the subscription filter policy is the reason the queue isn't receiving the messages. Also, do you have a DLQ that would show whether there was a problem delivering the messages to SQS? The SNS integration with SQS retries failed delivery attempts.

AWS, answered 2 months ago
  • Hi Matthew,

    Yes, I attempted to send the same message, and it eventually worked after a few attempts. I don't see any issues with the filter policy. Another subscription for email, using the same filter policy, consistently works fine.

    Additionally, I have configured a Dead Letter Queue (DLQ) for the SQS queue, but there are no messages in it.

    The current SNS settings include 3 retries. I attempted to disable encryption, but it still did not resolve the issue. Interestingly, the same settings in my lower environments are functioning correctly; they have not failed even once.

  • SNS's integration with SQS retries 3 times immediately, then switches to backoff, and makes over 100k attempts in total over 23 days. I agree the filter policy isn't the problem, given that resending the same message eventually goes through and the same settings work in your lower environments.

    How are you confirming that SQS is processing the message? Are you using the CloudWatch NumberOfMessagesSent metric to check that SQS received a message from SNS?
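For reference, the "over 100k attempts over 23 days" figure can be reproduced with some quick arithmetic. The phase counts and delays below are assumptions taken from the SNS delivery retry documentation for AWS-managed endpoints such as SQS, so they are worth checking against the current docs; `PHASES` and the helper names are made up for this sketch.

```python
# Phase counts and delays are assumed from the SNS delivery retry
# documentation for AWS-managed endpoints; verify against current docs.
PHASES = [
    # (phase name, attempts, seconds between attempts)
    ("immediate", 3, 0),
    ("pre-backoff", 2, 1),
    ("backoff", 10, None),  # exponential from 1 s to 20 s, varies per attempt
    ("post-backoff", 100_000, 20),
]


def total_attempts(phases=PHASES):
    """Sum the attempts across all retry phases."""
    return sum(count for _, count, _ in phases)


def post_backoff_days(phases=PHASES):
    """Duration of the post-backoff phase alone, in days."""
    seconds = sum(
        count * delay
        for name, count, delay in phases
        if name == "post-backoff"
    )
    return seconds / 86_400


if __name__ == "__main__":
    print(total_attempts())
    print(round(post_backoff_days(), 1))
```

Under those assumptions the post-backoff phase dominates the timeline, which is where the roughly 23 days comes from. It also means a message that SNS could not deliver would normally keep being retried for a long time before reaching any DLQ configured on the subscription.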
