RDS Database Query

0

Hi AWS, I have a question about the following exam scenario:

A company has an API that receives real-time data from a fleet of monitoring devices. The API stores this data in an Amazon RDS DB instance for later analysis. The amount of data that the monitoring devices send to the API fluctuates. During periods of heavy traffic, the API often returns timeout errors.

After an inspection of the logs, the company determines that the database is not capable of processing the volume of write traffic that comes from the API. A solutions architect must minimize the number of connections to the database and must ensure that data is not lost during periods of heavy traffic.

Which solution will meet these requirements?

  1. Increase the size of the DB instance to an instance type that has more available memory.
  2. Modify the DB instance to be a Multi-AZ DB instance. Configure the application to write to all active RDS DB instances.
  3. Modify the API to write incoming data to an Amazon Simple Queue Service (Amazon SQS) queue. Use an AWS Lambda function that Amazon SQS invokes to write data from the queue to the database.
  4. Modify the API to write incoming data to an Amazon Simple Notification Service (Amazon SNS) topic. Use an AWS Lambda function that Amazon SNS invokes to write data from the topic to the database.

In my view, the right answer is option (B), the Multi-AZ option: in an RDS Multi-AZ deployment, Amazon RDS automatically creates a primary database (DB) instance and synchronously replicates the data to a standby instance in a different AZ. When it detects a failure, Amazon RDS automatically fails over to the standby instance without manual intervention.

Option (C), the SQS and Lambda option, could also be the answer, but doesn't it require a lot more work than simply enabling the Multi-AZ setting on the RDS DB instance itself?

3 Answers
2
Accepted Answer

The AWS documentation (https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_BestPractices.html) lists several best practices for dealing with high traffic and making sure data is not lost. Some of the guidelines include:

  • Scaling up your DB instance when you are approaching storage capacity limits.
  • Allocating enough RAM so that your working set resides almost completely in memory.
  • Using metrics to monitor memory, CPU, replica lag, and storage usage, setting up Amazon CloudWatch to notify when usage patterns change or when you approach the capacity of your deployment.
  • If using Multi-AZ DB clusters, monitoring the time difference between the latest transaction on the writer DB instance and the latest applied transaction on a reader DB instance, called replica lag.

Option B, where you consider a Multi-AZ DB instance, is a good approach for high availability and failover support for DB instances. But it's important to note that in the context of Multi-AZ, the secondary instance is mainly for failover support. It's not designed to scale the write operations. The application doesn't write to all active DB instances. Instead, it writes to the primary instance, and then Amazon RDS synchronously replicates the data to the standby replica in a different Availability Zone (AZ).

Option C, where you consider using Amazon SQS with AWS Lambda, is a common pattern for decoupling database load from high-traffic data ingestion. This approach could help to even out the write traffic load on your RDS instance by placing incoming data into an SQS queue. The Lambda function is then triggered to write data from the queue to the RDS instance, which can be paced to match the write capacity of the RDS instance.

While option C might require more initial setup, it provides a scalable solution that separates the data ingestion from the data persistence, which may help to avoid overloading the RDS instance.
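
The consumer side of option C can be sketched roughly as follows. This is only an illustration under stated assumptions, not a definitive implementation: the `readings` table and the `device_id` and `value` fields are hypothetical, and an in-memory SQLite database stands in for the RDS instance so the example runs anywhere. The `batchItemFailures` return value is only honored when `ReportBatchItemFailures` is enabled on the SQS event source mapping.

```python
import json
import sqlite3

# Hypothetical stand-in for the RDS connection. A real function would open
# one connection to the database outside the handler and reuse it across
# invocations, which is what keeps the connection count low.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (device_id TEXT, value REAL)")


def handler(event, context):
    """Write one SQS batch to the database and report per-message failures."""
    failures = []
    rows = []
    for record in event["Records"]:
        try:
            body = json.loads(record["body"])
            rows.append((body["device_id"], float(body["value"])))
        except (ValueError, KeyError, TypeError):
            # Malformed message: report it so SQS can retry it or move it
            # to a dead-letter queue instead of failing the whole batch.
            failures.append({"itemIdentifier": record["messageId"]})
    # One transaction per batch. If the insert raises, the exception
    # propagates and SQS redelivers the batch after the visibility timeout.
    with conn:
        conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    return {"batchItemFailures": failures}
```

Because each invocation writes a whole batch over a single reused connection, the number of database connections follows the function's concurrency rather than the number of incoming API requests.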

EXPERT
answered a year ago
EXPERT
reviewed a year ago
  • Agreed; a solutions architect would suggest a method like this. Multi-AZ gives high availability, and 99% of the time the other instances are read-only, which doesn't help with write traffic.

2

There are two important points in the question: the database cannot cope with "the volume of write traffic that comes from the API", and the solution "must ensure that data is not lost during periods of heavy traffic."

In that case, you need to persist the incoming data and make the design asynchronous so that the database is not overwhelmed with requests.

Option C is a good choice.
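
A minimal sketch of the API side, assuming the client object exposes a `send_message(QueueUrl=..., MessageBody=...)` call like the boto3 SQS client does. The function name `enqueue_reading`, the queue URL, and the `FakeSqs` class are hypothetical and exist only so the example runs without AWS credentials:

```python
import json


def enqueue_reading(sqs_client, queue_url, reading):
    """Persist one device reading in the queue and return its message ID."""
    response = sqs_client.send_message(
        QueueUrl=queue_url,
        MessageBody=json.dumps(reading),
    )
    return response["MessageId"]


class FakeSqs:
    """In-memory stand-in for an SQS client (illustration only)."""

    def __init__(self):
        self.messages = []

    def send_message(self, QueueUrl, MessageBody):
        self.messages.append((QueueUrl, MessageBody))
        return {"MessageId": f"msg-{len(self.messages)}"}


fake = FakeSqs()
message_id = enqueue_reading(
    fake,
    "https://example.invalid/queue",  # placeholder, not a real queue URL
    {"device_id": "d1", "value": 21.5},
)
```

The API can return to the device as soon as the queue accepts the message, so it never opens a database connection itself.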

Option B only provides high availability, because one database instance is the writer node and the second is a reader node. The number of database connections can still be a problem with a Multi-AZ implementation.

AWS
answered a year ago
EXPERT
reviewed a year ago

1

"In my view, the right answer is option (B), the Multi-AZ option: in an RDS Multi-AZ deployment, Amazon RDS automatically creates a primary database (DB) instance and synchronously replicates the data to a standby instance in a different AZ. When it detects a failure, Amazon RDS automatically fails over to the standby instance without manual intervention."

Think about the question being asked: the database is not capable of processing the volume of write traffic that comes from the API. And you are being asked to minimize the number of connections to the database and ensure that data is not lost during periods of heavy traffic.

Your current database is struggling. Turning it into a Multi-AZ instance just means you have (in effect) a single writer node of the same instance type, plus one or more reader nodes in other AZs. That single writer node still has to do all the write operations, so it will still struggle at peak periods and return timeout errors just as it does now as a single-AZ instance.

Option 1 (increase the size of the DB instance to an instance type that has more available memory): do we know that memory is what we are short of? The question doesn't tell us. We could move the RDS instance to a type with more memory and the same number of CPUs and find that we still have the problem.

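Option 4 doesn't make sense to me: SNS pushes each message to its subscribers as soon as it is published rather than holding it in a queue, so it would not smooth out the write load.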

So that leaves us with option 3. You are correct when you say it is a lot of additional work, but it does satisfy the criteria asked in the question.
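
To make the reasoning concrete, here is a toy simulation with made-up numbers (the arrival counts and the capacity of 150 writes per second are assumptions for illustration, not figures from the question). It compares writing straight to a database with a fixed write capacity against buffering the excess in a queue:

```python
def simulate(arrivals, capacity, buffered):
    """Return (written, lost, final_backlog) for per-second arrival counts."""
    written = lost = backlog = 0
    for arriving in arrivals:
        pending = backlog + arriving
        done = min(pending, capacity)  # the writer handles at most `capacity` per second
        written += done
        if buffered:
            backlog = pending - done   # the queue holds the excess for later
        else:
            lost += pending - done     # the excess requests time out
            backlog = 0
    return written, lost, backlog


# Hypothetical traffic: a three-second burst followed by quiet seconds
arrivals = [50, 300, 300, 300, 50, 50, 50, 50, 50, 50]
direct = simulate(arrivals, capacity=150, buffered=False)  # (800, 450, 0)
queued = simulate(arrivals, capacity=150, buffered=True)   # (1250, 0, 0)
```

With the same writer capacity, the direct path drops 450 of the 1,250 writes during the burst, while the queued path writes all of them once the burst has passed. Adding standby instances does not change the capacity term, which is why Multi-AZ alone does not help here.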

EXPERT
Steve_M
answered a year ago
