Trying to optimise DocumentDB performance

0

I have a DocumentDB cluster with a single instance that is mainly used by a Node.js application. I need to run a simple query across a collection that is linked to two other collections. This query returns a lot of results, say from 80,000 to 600,000 documents. The Node application then serializes the necessary information to return to the client, and the client can apply filters to this data for their analysis.

The request takes approximately 30 seconds: about 28 seconds waiting for the server and under 2 seconds of content download. I would like to understand whether this is because my instance is only 4 GiB and 2 vCPU, or whether there is some bottleneck when transferring data between DocumentDB and EC2; maybe I made a mistake in the configuration.

Or should I just try to optimize how I save data to this collection and get rid of the relations to the other collections? Or should I move to a model where I make an API call and a query every time the client changes filters on their side?
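If you do move filtering server-side, the client's filter state can be translated into a MongoDB query document before it reaches DocumentDB, so each filter change fetches only the matching subset. A minimal sketch (the field names `status` and `createdAt` and the helper name are hypothetical, not from your schema):

```javascript
// Translate client-side filter parameters into a MongoDB query document.
// Field names (status, from, to) are illustrative; adapt to your schema.
function buildQuery(filters) {
  const query = {};
  if (filters.status) {
    query.status = filters.status; // exact match; can use an index
  }
  if (filters.from || filters.to) {
    query.createdAt = {}; // range filter on a date field
    if (filters.from) query.createdAt.$gte = new Date(filters.from);
    if (filters.to) query.createdAt.$lte = new Date(filters.to);
  }
  return query;
}

// The API handler would pass the result to Model.find(query)
const q = buildQuery({ status: 'open', from: '2024-01-01' });
```

Paired with indexes on the filtered fields, this keeps result sets small instead of shipping the full 80,000–600,000 documents on every request.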

Thanks for any suggestions. M

  • Have you created indexes? You need indexes on the fields used in your query. If you haven't created them yet, appropriate indexes could significantly speed up query execution.

asked 10 months ago · 934 views
3 Answers
1

There are several things you can try to improve your DocumentDB performance.

Instance sizing: First, make sure you are using a right-sized instance for the performance you expect. There is some really good information in the documentation to help you make an educated decision. Additionally, there are third-party sizing calculators.

Indexes: Creating indexes can drastically improve your query performance. However, there is a cost during write transactions.

While indexes can improve query performance by avoiding the need to scan every document in a collection, this improvement comes with a tradeoff. For each index on a collection, every time a document is inserted, updated, or deleted, the database must update the collection and write the fields to each of the indexes on the collection. For example, if a collection has nine indexes, the database must perform ten writes before acknowledging the operation to the client. Thus, each additional index incurs additional write latency, I/Os, and overall storage utilization.
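The write amplification described above is easy to quantify: every insert, update, or delete touches the collection itself plus one entry per index. A trivial sketch, matching the nine-index example:

```javascript
// Each insert/update/delete writes the document itself plus one entry
// per index on the collection.
function writesPerOperation(indexCount) {
  return 1 + indexCount; // 1 for the collection, plus each index
}

console.log(writesPerOperation(9)); // 10, as in the nine-index example
```

This is why it pays to drop indexes that queries never use, not just to add ones they do.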

https://docs.aws.amazon.com/documentdb/latest/developerguide/best_practices.html#best_practices-indexes

Monitoring: Finally, consult CloudWatch metrics to identify performance bottlenecks. https://docs.aws.amazon.com/documentdb/latest/developerguide/best_practices.html#best_practices-performance
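To pull those metrics programmatically instead of through the console, you can build the request parameters for CloudWatch's GetMetricStatistics API. A sketch that only constructs the parameter object (the instance identifier is a placeholder; you would pass the result to the AWS SDK, e.g. `GetMetricStatisticsCommand` from `@aws-sdk/client-cloudwatch`):

```javascript
// Build GetMetricStatistics parameters for a DocumentDB instance metric.
// DocumentDB metrics live in the 'AWS/DocDB' CloudWatch namespace.
function buildMetricParams(instanceId, metricName, hours) {
  const end = new Date();
  const start = new Date(end.getTime() - hours * 3600 * 1000);
  return {
    Namespace: 'AWS/DocDB',
    MetricName: metricName, // e.g. 'CPUUtilization', 'ReadIOPS'
    Dimensions: [{ Name: 'DBInstanceIdentifier', Value: instanceId }],
    StartTime: start,
    EndTime: end,
    Period: 300, // 5-minute datapoints
    Statistics: ['Average', 'Maximum'],
  };
}

// 'my-docdb-instance' is a placeholder identifier
const params = buildMetricParams('my-docdb-instance', 'ReadIOPS', 24);
```

Watching CPUUtilization, ReadIOPS, and NetworkTransmitThroughput over the window of a slow request should show whether the instance or the network is the bottleneck.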

I hope these help!

EXPERT
answered 10 months ago
0

Yes, I created indexes, but I am not sure I am fully using them. I first tried to create the indexes in the Mongoose schema, using "index: true" on the fields I want to index, but when checking the schema with Compass I could not see them, so I created them directly in Compass. Then I enabled the DocumentDB profiler, and I can see the indexes are used when querying from Compass. For some reason, when using the application, the profiler doesn't log anything.

Is there any other way I can check if the application is leveraging indexes?
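One way to check from the application itself is to run the query through explain() (Mongoose supports `Model.find(...).explain()`) and look for an index-scan stage in the winning plan. A small helper that walks a plan tree, with an illustrative sample plan (note: DocumentDB's explain output differs somewhat from MongoDB's, so verify the stage names against your cluster's actual output):

```javascript
// Walk an explain() winning plan and report whether any stage is an
// index scan (IXSCAN). The plan shape mirrors queryPlanner.winningPlan;
// child stages may be nested under inputStage or inputStages.
function usesIndex(plan) {
  if (!plan) return false;
  if (typeof plan.stage === 'string' && plan.stage.includes('IXSCAN')) {
    return true;
  }
  const children =
    plan.inputStages || (plan.inputStage ? [plan.inputStage] : []);
  return children.some((child) => usesIndex(child));
}

// Illustrative plan shaped like explain().queryPlanner.winningPlan:
const samplePlan = { stage: 'FETCH', inputStage: { stage: 'IXSCAN' } };
console.log(usesIndex(samplePlan)); // true
```

A COLLSCAN stage anywhere in the plan for your hot query would confirm the application is not hitting the indexes you created in Compass.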

answered 10 months ago
0

For index optimization: you can check for unused indexes with the DocumentDB index review tool (https://github.com/awslabs/amazon-documentdb-tools/tree/master/performance/index-review) as well as Performance Insights. By the way, you can also try compressing the database with document compression in DocumentDB 5.0: https://docs.aws.amazon.com/documentdb/latest/developerguide/doc-compression.html

            db.createCollection(collectionName, {
                compression: { enable: <true | false> }
            })
For the query, are there any ReadIOPS spikes or signs of network congestion in the cluster metrics?

AWS
jjpe
answered 8 months ago
