Aurora PostgreSQL - recommended maximum number of databases (for a multi-tenant data layer)



TL;DR - I have a customer with a SaaS solution that includes a multi-tenant data layer. They want to improve tenant isolation by creating a logical database per tenant.

Aurora PostgreSQL is a core component of the architecture. To improve isolation, they would like to create a database per tenant (to clarify: multiple logical databases within the same cluster). They are asking whether a single Aurora PostgreSQL cluster can hold 10K databases.

  1. Is this a reasonable requirement for Aurora PostgreSQL (and for PostgreSQL in general)?

  2. At what point (in terms of the number of databases) do solutions like sharding start to make sense? (I'm talking only about the number of DBs in a cluster. To clarify, I'm also separately discussing per-database performance requirements, availability requirements, etc. with them.)

  3. In addition, I am discussing other considerations with the customer, such as splitting the databases across clusters to improve maintenance and operations and to limit the blast radius of any one cluster.
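For context on the isolation model in question (one logical database per tenant within a shared cluster), here is a minimal sketch of the provisioning DDL such a setup might generate. The tenant naming scheme, the per-tenant role, and the revoke-from-PUBLIC pattern are all illustrative assumptions, not the customer's actual design:

```python
# Sketch of the DDL a database-per-tenant setup might issue for one
# tenant: its own logical database plus a role that can only connect
# to that database. Names and grant scheme are assumptions.

def tenant_provisioning_sql(tenant_id):
    db = f"tenant_{tenant_id:05d}"
    role = f"{db}_rw"
    return [
        f"CREATE ROLE {role} LOGIN;",
        f"CREATE DATABASE {db} OWNER {role};",
        # Prevent other tenants' roles from even connecting:
        f"REVOKE CONNECT ON DATABASE {db} FROM PUBLIC;",
        f"GRANT CONNECT ON DATABASE {db} TO {role};",
    ]

for stmt in tenant_provisioning_sql(42):
    print(stmt)
```

Revoking CONNECT from PUBLIC is what makes the per-database boundary meaningful: without it, any login role in the cluster can connect to every tenant's database by default.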

Many thanks in advance.

asked 4 years ago · 1,383 views
1 Answer
Accepted Answer

Multi-tenant DB architecture will come down to their use case, as you've hinted at. There are many papers and articles that discuss the tradeoffs between the alternative approaches in terms of isolation, maintainability, migrations, development pain, schema changes, and other factors.

Given that, I'll focus my answer on questions 1 and 2: we certainly have customers on both Aurora and RDS PostgreSQL with 10k databases, some even orders of magnitude more. It works. However, they'd certainly want to think about their future scaling plans. There is a point where the sheer number of files causes PostgreSQL (or the OS) grief. If their schema is complex, with many relations per database, they could run into issues much sooner. It's tough to give solid numbers, but at 10k it's feasible enough to be worth the effort of a real evaluation.
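To make the file-count concern concrete, here is a back-of-the-envelope sketch. It assumes community PostgreSQL's file-per-relation storage layout (tables carry extra free-space-map and visibility-map forks; Aurora's distributed storage layer is different, so treat this purely as illustration), and all workload numbers are assumptions, not measurements:

```python
# Rough estimate of on-disk file counts for a database-per-tenant
# layout on community PostgreSQL. files_per_table assumes the main
# fork plus FSM and VM forks; files_per_index assumes one main fork.
# Large relations are split into additional 1 GB segment files,
# which this sketch ignores.

def estimate_files(num_databases, tables_per_db, indexes_per_db,
                   files_per_table=3, files_per_index=1):
    per_db = (tables_per_db * files_per_table
              + indexes_per_db * files_per_index)
    return num_databases * per_db

# 10k tenants with a modest schema of 50 tables and 100 indexes each:
print(estimate_files(10_000, 50, 100))      # 2,500,000 files
# The same tenants with 500 tables and 1,000 indexes each:
print(estimate_files(10_000, 500, 1_000))   # 25,000,000 files
```

Even the modest case lands in the millions of files, which is why schema complexity multiplies the database count so quickly.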

Another potential sticking point is connection management. If their workload calls for frequent access to many tenant databases, it's worth noting that a PostgreSQL connection is bound to a single database. Even at 10k databases, they won't realistically be able to maintain connection pools for all tenants. Creating and tearing down connections introduces overhead they wouldn't encounter with an approach like table-per-tenant.
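The arithmetic behind that point is simple. Because pools are effectively per-database, keeping even a tiny warm pool per tenant multiplies out past any plausible server limit (the max_connections value below is an assumed setting, not an Aurora recommendation):

```python
# Why per-database pooling breaks down at 10k databases: a pool is
# effectively per-database, so warm connections scale with tenant
# count. max_connections here is an assumed server limit.

def pooled_connections_needed(num_databases, pool_size_per_db):
    return num_databases * pool_size_per_db

max_connections = 5_000                       # assumed server setting
needed = pooled_connections_needed(10_000, pool_size_per_db=2)
print(needed)                                 # 20000
print(needed <= max_connections)              # False
```

So at this scale, connections to rarely-used tenant databases have to be opened on demand (or multiplexed by an external pooler), and that churn is the overhead the answer refers to.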

There are many other things to consider, as I mentioned, but yes, people do run 10k databases. As always, they should test with something representative of their specific workload.

answered 4 years ago
