Process large data files (up to 10 GB) and save to RDS


Hi, I have an architecture like the one below:
user uploads file -> S3 -> Lambda triggers Glue job -> Glue job pulls the file, reads its content, and saves it to a record in a table in Aurora Postgres
Everything works with small files, but as the file size increases (up to 10 GB), I don't think the architecture fits anymore. I researched and found that Postgres has something like an external table that can store much more data than a regular table. I am also thinking about switching to a NoSQL database like DynamoDB or MongoDB. I have some questions:

  • Is Aurora Postgres good for storing and searching large content?
  • If Aurora Postgres is not a good fit, which NoSQL database fits this scenario?
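
For reference, the Lambda step in the pipeline described above could look roughly like the sketch below. It assumes the standard S3 event notification payload; the Glue job name (`process-upload-job`) and the argument names (`--bucket`, `--key`) are hypothetical and not taken from the question. The Glue client is passed in as an optional parameter so the function can be exercised without AWS access.

```python
from urllib.parse import unquote_plus

JOB_NAME = "process-upload-job"  # hypothetical Glue job name


def build_job_arguments(event):
    """Extract bucket and key from the first record of an S3 event."""
    record = event["Records"][0]["s3"]
    return {
        "--bucket": record["bucket"]["name"],
        # Object keys arrive URL-encoded in S3 event notifications.
        "--key": unquote_plus(record["object"]["key"]),
    }


def handler(event, context, glue_client=None):
    """Start the Glue job for the uploaded object and return the run id."""
    if glue_client is None:
        import boto3  # imported lazily so the module loads without boto3
        glue_client = boto3.client("glue")
    response = glue_client.start_job_run(
        JobName=JOB_NAME,
        Arguments=build_job_arguments(event),
    )
    return response["JobRunId"]
```

Passing only the bucket and key keeps the Lambda payload small regardless of file size; the Glue job then reads the object from S3 itself.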

1 Answer

Hello.

Is Aurora Postgres good for storing and searching large content?

PostgreSQL is a relational database, so it is suitable for performing complex queries such as table joins.
Therefore, I think it is suitable for performing complex updates and searches.

If Aurora Postgres is not a good fit, which NoSQL database fits this scenario?

NoSQL databases such as DynamoDB are generally faster than relational databases for simple reads and writes.
However, they are not suitable for systems that require complex data updates.

We recommend that you check the following documentation to select the appropriate database for your solution.
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/SQLtoNoSQL.WhyDynamoDB.html

Answered 8 months ago by an Expert; reviewed 8 months ago by an AWS Expert
  • Can you tell me more about the architecture? Does it work with a 10 GB file? By "Aurora Postgres is good for storing and searching large content" I mean that each file's content will be stored in a single row. If the file is 10 GB, that row will hold a 10 GB data string. There are no complex queries, just a simple search; updates will be performed by the Glue job through the Glue database.

  • If you don't have complex queries, I think a NoSQL database such as DynamoDB is a better fit. Also, if one row is about 10 GB in size, I think it would be difficult to store it in a relational database.
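
To put the row-size concern in numbers: the documented per-value limits (worth re-checking against the current docs) are 400 KB for a single DynamoDB item and 1 GB for a single PostgreSQL field, so a 10 GB payload does not fit in either as one value. The sketch below is a minimal illustration of that arithmetic and of reading a stream in bounded chunks; the helper names are hypothetical, and real DynamoDB chunks would have to be somewhat smaller because attribute names count toward the item size.

```python
import io

DYNAMODB_ITEM_LIMIT = 400 * 1024   # bytes; assumed from the DynamoDB quotas page
POSTGRES_FIELD_LIMIT = 1024 ** 3   # bytes; assumed from the PostgreSQL limits appendix


def chunks_needed(total_bytes, limit_bytes):
    """Smallest number of pieces such that each piece is <= limit_bytes."""
    return -(-total_bytes // limit_bytes)  # integer ceiling division


def iter_chunks(stream, chunk_size):
    """Yield successive chunks of at most chunk_size bytes from a binary stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


ten_gb = 10 * 1024 ** 3
print(chunks_needed(ten_gb, POSTGRES_FIELD_LIMIT))  # 10
print(chunks_needed(ten_gb, DYNAMODB_ITEM_LIMIT))   # 26215
print(list(iter_chunks(io.BytesIO(b"abcdefg"), 3)))  # [b'abc', b'def', b'g']
```

Either database would therefore need the content split across many rows or items rather than stored as a single 10 GB string.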
