Does using stored procedures with Amazon Aurora reduce costs and/or speed up response time?

0

I'm making a webapplication that can serve pages that require data from multiple tables. The queries necessary for fetching this data have been bundled into one stored procedure. Now it takes only one database call to this stored procedure in order to receive the resultsets from all the multiple tables. If i had executed all the queries separately it would have entailed making multiple database calls.

I've read that queries that are executed from a stored procedure are just as fast as queries that are performed separately, however i can imagine that opening a new database connection or transaction for each separate query will make the process time a lot longer then when all of these queries are executed in one stored procedure using only one database connection or transaction. Thats why i decided to make stored procedures instead of doing separate database calls. I also assumed that i would be charged per DB query made when using Amazon aurora.

However on the pricing page https://aws.amazon.com/rds/aurora/pricing/ i'm reading that i get charged per I/O operation instead. If the query result can be received from the memory cache then i don't get charged in that case.

So i'm kind of wondering whether the use of stored procedures instead of separate db calls will really save me money or not. I'm also wondering whether using stored procedures really is faster then simply doing all the required queries separately. Lets assume that all the separate queries made by the application are done within one transaction. How would this compare to using a stored procedure that executes the same queries?

I can imagine that each query has to travel from the EC2 server to the amazon aurora server and the result has to travel back again. I would assume that this adds a few milliseconds of extra processing time to each query that is made separately, but i'm not sure so thats why i'm asking.

asked 2 years ago589 views
1 Answer
0

Forgive the length of this answer, this kind of question will have many different answers/solutions, I'm trying to remain as general as I can...

The Structured Query Language, by design, is able to query relational data across multiple tables. You know this because you've created a Stored Procedure to do so (I'm assuming you're using the JOIN syntax to retrieve relational data using Primary Keys & Conditions).

I can't think of a reason (off hand) that would prevent you from being able to execute the same query from within your Application as a single database call. In fact, I would expect the query to perform ever so slightly quicker from your Application because you haven't included the overhead of calling the Stored Procedure in the middle. However, when it comes down to general read performance and how the DBMS executes the query, you should not notice any difference in query execution as the query will be executed the same way.

I come from a PHP background and, although there are specific situations where multiple DB connections may be required, in general would expect a Web Application to start/maintain a single database connection between each running Script and the DBMS. It's established practice nowadays that once a script has begun, it will open a DB connection when required that will remain open for the duration of the script's execution. Yes, you could take the approach that you only open/maintain a DB connection while you actually need to use it but unless your scripts are running for longer that the seconds needed to handle a HTTP request in my opinion you are adding unnecessary overhead to your applicaiton.

The first thing that came to mind while reading your question was "Cache". By default, nearly all DMBS will automatically cache individual DB queries until a period of time has passed or the service is told to ignore the cache or the underlying data is changed. This is automatic behaviour designed to reduce the amount of disk reads over time (cache being stored in system memory). So from the point of view of an application developer, you would expect repeat calls of the same Query to return faster because the hard work was done when the cache was populated.

However, although the query cache will take the load off the backend storage, it does nothing to reduce the number of times the application servers will need to query the DMBS for the data. At this level you should start thinking about using a caching service like ElastiCache to hold frequently accessed non-volatile data and save the servers having to query the DBMS at all. In a well designed solution, where you are caching data that doesn't frequently change, this implementation will dramatically reduce Query load on the DBMS and should also improve application performance due to faster turnarounds.

This answer is already turning into an essay and I'm getting hungry so if you don't mind I'll be a little more flippant with the rest of it:

  • Should I be concerned about I/O operation charges in an average DB? No, the charges will be negligible and dwarfed by the server costs
  • Will using Stored Procedures save me money? No, they were designed for specific situations (that I won't go into) but in your case they are no different to executing queries on the DB.
  • Will the travel time of each query have a material impact on my application's performance? No, assuming the DB is located in the same region and your VPC design is sufficiently optimised, you should be able to perform hundreds of queries in each script without breaking a sweat.
answered 2 years ago
  • Will the travel time of each query have a material impact on my application's performance? No, assuming the DB is located in the same region and your VPC design is sufficiently optimised, you should be able to perform hundreds of queries in each script without breaking a sweat. What do you mean with in each script? Are you referring to the stored procedure? What happens when you perform hundreds of queries outside of a script, sequentially? (comment continues below)

  • Say i would make 300 queries to the database separately, wouldnt the traveltime for each query add up and become noticable? Both the EC2 instance and the Amazon aurora instance will be in the same region. However what i've learned from this stack overflow answer is that the latency between a database and an EC2 instance within the same region can be anywhere from 3 milliseconds up to 10 milliseconds. Lets assume the most optimistic estimate is true. (comment continues below).

  • That would make the total response time for making 300 separate queries 300 * 3 = 900 milliseconds. The response back probably takes the same amount of time, so the total processing time will be 1800 milliseconds at minimum. And that still excludes the amount of time the database needs to run each query. If i were to put all 300 queries in one stored procedure and call that instead, i could save a lot of time.

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions