Process large data file (up to 10 GB) and save to RDS


Hi, I have an architecture like the one below:

user uploads file -> S3 -> Lambda triggers Glue job -> Glue job pulls the file, reads its content, and saves it to a record in a table in Aurora PostgreSQL

Everything is fine with a small file, but when the file size increases (up to 10 GB), I don't think this architecture fits anymore. I researched and found that PostgreSQL has something like an external table that can store much more data than a regular table. I am also thinking about switching to a NoSQL database such as DynamoDB or MongoDB.
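For reference, the Lambda piece is only a thin shim that starts the Glue job when a file lands in S3. A minimal sketch, assuming a hypothetical job name (process-upload) and the standard S3 put-event shape:

```python
import boto3
from urllib.parse import unquote_plus

glue = boto3.client("glue")

def handler(event, context):
    # One record per uploaded object in a standard S3 event.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events.
        key = unquote_plus(record["s3"]["object"]["key"])
        # Hand the object location to the Glue job as run arguments.
        glue.start_job_run(
            JobName="process-upload",  # hypothetical job name
            Arguments={"--source_bucket": bucket, "--source_key": key},
        )
```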

I have some questions:

  • Is Aurora PostgreSQL good for storing and searching large content?
  • If Aurora PostgreSQL is not a good fit, which NoSQL database fits this scenario?
1 Answer

Hello.

Is Aurora PostgreSQL good for storing and searching large content?

PostgreSQL is a relational database, so it is well suited to complex queries such as table joins.
It is therefore a good fit for systems that need complex updates and searches.
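As a small illustration of the kind of relational query this enables (all table and column names here are hypothetical), joining file metadata against its owners:

```python
import psycopg2

conn = psycopg2.connect(host="db-host", dbname="mydb", user="app", password="secret")
with conn, conn.cursor() as cur:
    # A join plus a filtered search -- the kind of complex query
    # relational databases are designed for.
    cur.execute(
        """
        SELECT u.name, f.file_name, f.uploaded_at
        FROM files f
        JOIN users u ON u.id = f.owner_id
        WHERE f.file_name ILIKE %s
        ORDER BY f.uploaded_at DESC
        """,
        ("%report%",),
    )
    for row in cur.fetchall():
        print(row)
```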

If Aurora PostgreSQL is not a good fit, which NoSQL database fits this scenario?

NoSQL databases such as DynamoDB offer faster reads and writes than relational databases.
However, they are not suitable for systems that require complex data updates.
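For contrast, a minimal DynamoDB sketch (the table and attribute names are hypothetical): access is by primary key, which is what makes reads and writes fast, but there are no joins or ad hoc relational queries.

```python
import boto3

# Hypothetical table with "file_id" as its partition key.
table = boto3.resource("dynamodb").Table("files")

# Write and read by key -- fast and simple, but key-based only.
table.put_item(Item={"file_id": "abc123", "file_name": "report.csv"})
item = table.get_item(Key={"file_id": "abc123"}).get("Item")
print(item)
```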

We recommend that you check the following documentation to select the appropriate database for your solution.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/SQLtoNoSQL.WhyDynamoDB.html

EXPERT
answered 8 months ago
AWS
EXPERT
verified 8 months ago
  • Can you tell me more about the architecture? Would it fit a 10 GB file size? On "Aurora Postgres is good for storing and searching large content": I mean that each file's content is stored in a single row, so if the file size is 10 GB, that row will hold a 10 GB data string. There is no complex query, just a simple search, and the update mechanism is performed by the Glue job through the Glue database.

  • If you don't have complex queries, I think a NoSQL database such as DynamoDB is better. Also, if one row is about 10 GB in size, I think it would be difficult to store it in a relational database (a single text or bytea value in PostgreSQL is capped at 1 GB).
