Why MUST an index be sparse to be used in a DocumentDB query with an $exists clause?

0

Given a DocumentDB collection called test, with 200 documents of the following shape (i.e. containing a simple key/value pair):

{ "num" : 1 }

and an index on the key num:

{ v: 4, key: { num: 1 }, name: 'num-idx', ns: 'db.test2' }

the index is not used for the following query:

db.test.find({'num': {$exists: true}})
...
winningPlan: { stage: 'COLLSCAN' }

This is consistent with the AWS documentation which states:

Amazon DocumentDB does not currently support the ability to use indexes with the $ne, $nin, $nor, $not, $exists, $distinct, and $elemMatch operators. As a result, utilizing these operators will result in collection scans

However when I recreate the index, specifying it is sparse:

{  v: 4, key: { num: 1 }, name: 'num-idx', ns: 'db.test2', sparse: true }

then the index IS used:

db.test.find({'num': {$exists: true}})
...
winningPlan: { stage: 'IXSCAN', indexName: 'num-idx', direction: 'forward' } }

According to the MongoDB documentation, making an index sparse has the following implications:

  • documents missing the indexed keys will not be present in the index
  • this makes the index smaller in memory
  • you cannot use it for queries containing { $exists: false }

There is no mention of the sparsity of an index being a requirement in order for it to be used in the case of { $exists: true } queries.

The closest to an answer I can find is https://docs.aws.amazon.com/documentdb/latest/developerguide/functional-differences.html#functional-differences.sparse-index piece of AWS documentation which states:

To use a sparse index that you have created in a query, you must use the $exists clause on the fields that cover the index. If you omit $exists, Amazon DocumentDB does not use the sparse index

However, this clashes with the previous documentation quote.

Behaviour for MongoDB (db.test.find({'num': {$exists: <VALUE>}})):

Headexists:trueexists:false
sparse indexIDXCOLLSCAN
non sparse indexIDXIDX

Behaviour for DocumentDB (db.test.find({'num': {$exists: <VALUE>}})):

Headexists:trueexists:false
sparse indexIDXCOLLSCAN
non sparse indexCOLLSCANCOLLSCAN

Is this an unintended loophole?

Can I rely on this behaviour to guarantee that a query containing an { $exists: true } clause is able to use an index if it is sparse?

asked 2 years ago1306 views
1 Answer
0

Not sure what you mean by unintended loophole in this context, but your tests validate what the DocumentDB documentation states.

To use a sparse index that you have created in a query, you must use the $exists clause on the fields that cover the index. If you omit $exists, Amazon DocumentDB does not use the sparse index

means that if you have a sparse index and you use $exist: true condition the index will be used, which is what happens. Maybe the documentation could explicitly state $exist: true.

On the other hand, DocumentDB does not support using the index with $exists: true if the index is not sparse, and your tests demonstrate it, as pointed out by:

Amazon DocumentDB does not currently support the ability to use indexes with the $ne, $nin, $nor, $not, $exists, $distinct, and $elemMatch operators. As a result, utilizing these operators will result in collection scans

AWS
EXPERT
answered a year ago
  • Yes, it would be better if the documentation was made clearer but combining these 2 pieces of information in one place:

    "Amazon DocumentDB does not currently support the ability to use indexes with the $ne, $nin, $nor, $not, $exists, $distinct, and $elemMatch operators. As a result, utilizing these operators will result in collection scans. Except in the case of sparse indexes which can be used for queries containing a $exists: true clause".

    I am still interested in knowing why this behaviour exists though. As I mentioned above, it doesn't make sense that the sparsity of the index affects whether or not it can be used in a query containing $exists:true. This is what I meant by 'unintended loophole'. Because it doesn't make sense, I wondered if this was intentional behaviour. And if not, someone might come along and fix it and suddenly sparse indexes stop working for particular queries.

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions