IXSCAN slow on $regex search including "."

0

I have a simple collection, sites with ~10m obejcts. It contains two fields, domain and reverse_domain, both having a unique index.

Now finding documents with a domain starting with something works like a charm db.sites.find( { domain: { $regex:/^amazon/} } ).hint({domain:1})

exectionStats shows: executionStats: { executionSuccess: true, executionTimeMillis: '5.023', planningTimeMillis: '0.448', executionStages: { stage: 'SUBSCAN', nReturned: '597', executionTimeMillisEstimate: '4.506', inputStage: { stage: 'SORT', nReturned: '597', executionTimeMillisEstimate: '4.326', sortPattern: { domain: 1 }, inputStage: { stage: 'FETCH', nReturned: '597', executionTimeMillisEstimate: '2.837', inputStage: { stage: 'IXOR', nReturned: '0', executionTimeMillisEstimate: '0.327', inputStages: [ { stage: 'IXSCAN', nReturned: '600', executionTimeMillisEstimate: '0.318', indexName: 'domain_1' }, { stage: 'IXSCAN', nReturned: '0', executionTimeMillisEstimate: '0.007', indexName: 'domain_1' } ] } } } } }

The problem is that using it on the reverse_domain it scans it all.

db.sites.find( { reverse_domain: { $regex:/^com\.amazon/} } ).hint({reverse_domain:1})

The stats are: executionStats: { executionSuccess: true, executionTimeMillis: '27047.026', planningTimeMillis: '0.366', executionStages: { stage: 'SUBSCAN', nReturned: '345', executionTimeMillisEstimate: '27046.556', inputStage: { stage: 'IXSCAN', nReturned: '345', executionTimeMillisEstimate: '27046.456', indexName: 'reverse_domain_1', direction: 'forward' } } }

The reverse_domain index, works, as it is used when trying: db.sites.find({reverse_domain:'com.amazon.aws'})

with stats: executionStats: { executionSuccess: true, executionTimeMillis: '0.184', planningTimeMillis: '0.137', executionStages: { stage: 'IXSCAN', nReturned: '1', executionTimeMillisEstimate: '0.027', indexName: 'reverse_domain_1', direction: 'forward' } }

My gut feeling tells me that it has something to do with the affinity of the "com." start. Searching using a different superdomain fx. "shop." gives much better results. Trying searching for "com111.example" is ultra fast while searching for "com.111" is slow, which is weird as "com" and "com." should have roughly the same matches in a huge db of domains.

My question is if anyone can explain what is going on, and why the "." matters so much.

Cheers

Kezzel
質問済み 7ヶ月前41ビュー
回答なし

ログインしていません。 ログイン 回答を投稿する。

優れた回答とは、質問に明確に答え、建設的なフィードバックを提供し、質問者の専門分野におけるスキルの向上を促すものです。

質問に答えるためのガイドライン

関連するコンテンツ