Synchronous queue implementation on AWS


I have a queue in which producers add data and consumers read and process it.

In the diagram below, producers add entries to the queue of the form (Px, Tx, X), for example (P3, T3, 10): P3 is the producer ID, T3 means 3 packets are required to complete the processing, and 10 is the data.

For (P3, T3, 10), a consumer needs to read 3 packets from producer P3. So in the image below, one consumer needs to pick (P3, T3, 10), (P3, T3, 15) and (P3, T3, 5), apply a function to the data that simply adds the numbers (10 + 15 + 5 = 30), and save 30 to the DB.

Similarly, for producer P1 there is the case (P1, T2, 1) and (P1, T2, 10): sum = 1 + 10 = 11, saved to the DB.
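In rough Python, the per-producer aggregation I need looks something like the sketch below; the function and field names are illustrative only, not tied to any particular service:

```python
# Sketch of the desired per-producer aggregation, independent of transport.
# Names (on_packet, db_save) are illustrative only.
from collections import defaultdict

buffers = defaultdict(list)  # producer_id -> values received so far

def on_packet(producer_id, total_packets, value, db_save):
    """Buffer values per producer; once all packets have arrived, sum and persist."""
    buffers[producer_id].append(value)
    if len(buffers[producer_id]) == total_packets:
        db_save(producer_id, sum(buffers.pop(producer_id)))

# The P3 example: after three calls this ends with db_save("P3", 30).
for v in (10, 15, 5):
    on_packet("P3", 3, v, lambda pid, total: print(pid, total))
```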

I have read about AWS Kinesis, but it has an issue: all consumers read the same data, which doesn't fit my case.

The major issue is how we can constrain the consumers so that:

1 - They read the queue synchronously.

2 - If one consumer has read (P1, T2, 1), then only that consumer may read the next packet from producer P1 (this point is the major issue for me, as the consumer needs to add those two numbers).

3 - This can also cause deadlock: some consumers will be forced to read from one particular producer only, because they have already read one of its packets and must now wait for its next packet before they can perform the addition.

I have also read about SQS and Amazon MQ, but the above challenges exist for them too.

Image: https://i.stack.imgur.com/7b3Mm.png

My current approach:

For N producers I have started N EC2 instances; producers send data to their EC2 instance through a WebSocket (WebSocket is not a requirement), and I can process it there easily. As you can see, running N EC2 instances to process N producers causes budget issues. How can I improve on this solution?

  • How many producers? How many consumers? How many messages?

  • 100+ producers (this may increase going forward); each producer can have a data rate of 20 KB/sec. The number of consumers is something I am trying to optimize; there is no limit on consumers.

1 Answer

I assume it is a given that the producers need to send their payload in multiple messages; otherwise, combine the packets and send them as a single message. If the reason for not consolidating is payload size, you could save the payload in S3/DynamoDB/etc. and send only a pointer to the data in the queue.
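As a minimal sketch of that pointer ("claim check") pattern with boto3, where the bucket name, queue URL, and key scheme are placeholder assumptions:

```python
# Store the payload in S3 and enqueue only a pointer to it.
# Bucket name, queue URL, and key scheme are placeholder assumptions.
import json
import uuid

import boto3

s3 = boto3.client("s3")
sqs = boto3.client("sqs")

BUCKET = "example-payload-bucket"  # placeholder
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/example-queue"  # placeholder

def send_large_payload(producer_id, payload: bytes):
    key = f"{producer_id}/{uuid.uuid4()}"
    s3.put_object(Bucket=BUCKET, Key=key, Body=payload)  # payload goes to S3
    sqs.send_message(                                    # queue carries the pointer
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps({"producer_id": producer_id, "s3_key": key}),
    )
```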

If you still need to send multiple messages, you can use Kafka (or the managed version, Amazon MSK). Create a single topic with multiple partitions and use the producer ID as the message key, which routes all messages from the same producer to the same partition within the topic. Then create a single consumer group subscribed to that topic; within a consumer group, each partition is handled by exactly one consumer. Note that a consumer may still handle messages from multiple partitions, i.e., multiple producers, so it will need to keep the relevant data in memory until it has received all of a producer's messages.
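A minimal sketch with the kafka-python client follows; the broker address, topic name, and JSON value format are assumptions (an MSK cluster looks the same from the client's side). The producer keys each message by producer ID; the consumer, as a member of one consumer group, buffers values per producer until all Tx packets have arrived:

```python
# Key messages by producer ID so all packets from one producer land on the
# same partition, then aggregate per producer on the consumer side.
# Broker address, topic name, and value format are assumptions.
import json
from collections import defaultdict

from kafka import KafkaConsumer, KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="broker:9092",            # placeholder broker address
    key_serializer=str.encode,
    value_serializer=lambda v: json.dumps(v).encode(),
)

# (P3, T3, 10): producer ID as the key, packet count and data in the value.
producer.send("packets", key="P3", value={"total": 3, "data": 10})
producer.flush()

consumer = KafkaConsumer(
    "packets",
    bootstrap_servers="broker:9092",
    group_id="aggregators",     # one group: each partition -> exactly one member
    value_deserializer=lambda v: json.loads(v.decode()),
)

pending = defaultdict(list)     # producer_id -> values received so far

for msg in consumer:
    pid = msg.key.decode()
    pending[pid].append(msg.value["data"])
    if len(pending[pid]) == msg.value["total"]:
        total = sum(pending.pop(pid))
        print(f"{pid}: saving {total} to DB")   # e.g. P3 -> 10 + 15 + 5 = 30
```

Because each consumer keeps polling every partition assigned to it, it is never blocked waiting on a single producer, which avoids the deadlock described in point 3; and since a partition is owned by exactly one group member, point 2 (only one consumer sees a given producer's packets) holds as well. At the stated scale, 100 producers × 20 KB/sec is only about 2 MB/sec aggregate, which a handful of partitions can absorb comfortably.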

Uri (AWS EXPERT), answered 2 years ago
