OpenSearch Serverless search latency spikes blocking searches

We are having issues with the auto-scaling of our OpenSearch Serverless collection. We are trying to shift traffic to the new collection gradually, but as soon as it receives even a small amount of traffic, search latency spikes, requests pile up, and the collection becomes so overloaded that searches effectively stop working.

I was able to trigger scaling of the search OCUs by manually ramping up requests to about 2.5 requests/s, but going from 0 to that rate took about an hour, and the collection only scaled to 3 OCUs. I'm concerned that we won't be able to handle rapid traffic increases.
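
For anyone trying to reproduce the ramp, here is a minimal sketch of the pacing described above. It assumes a linear ramp from 0 to 2.5 requests/s over one hour; the post does not say the ramp was exactly linear, and the names `targetRate` and `delayMs` are hypothetical helpers, not part of any AWS tooling.

```typescript
// Sketch only: assumes a linear ramp; helper names are hypothetical.

// Target request rate (req/s) at a given elapsed time, rising linearly
// from 0 to peakRate over rampSeconds, then holding at peakRate.
function targetRate(
  elapsedSeconds: number,
  peakRate: number = 2.5,
  rampSeconds: number = 3600,
): number {
  if (elapsedSeconds <= 0) return 0;
  return Math.min(peakRate, (peakRate * elapsedSeconds) / rampSeconds);
}

// Delay between consecutive requests needed to hit the target rate.
// Infinity means "do not send anything yet".
function delayMs(elapsedSeconds: number): number {
  const rate = targetRate(elapsedSeconds);
  return rate > 0 ? 1000 / rate : Infinity;
}

// Print the schedule at a few checkpoints (minutes into the ramp).
for (const minute of [0, 15, 30, 60]) {
  const t = minute * 60;
  console.log(`t=${minute}min rate=${targetRate(t)} req/s delay=${delayMs(t)} ms`);
}
```

A load generator would call `delayMs` with the elapsed time before each request and sleep for that long, which makes it easy to compare OCU scaling against a known request rate.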

We exclusively use the msearch API for our searches.

We had to add timeouts:

  • cancel_after_time_interval: '2s' on all queries
  • maxRetries: 0 on msearch
  • requestTimeout: 2000 on msearch
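
For reference, a rough sketch of how these three settings fit together in an msearch call. It assumes the OpenSearch JavaScript client and assumes that `cancel_after_time_interval` is set in each msearch header line; the index name `products`, the sample queries, and the helper `buildMsearchBody` are hypothetical and only there for illustration.

```typescript
// Sketch only: index name, queries, and helper names are hypothetical.

interface MsearchHeader {
  index: string;
  // Assumption: per-search cancellation is set in the msearch header line.
  cancel_after_time_interval: string;
}

type MsearchLine = MsearchHeader | Record<string, unknown>;

// Build the alternating header/body lines that msearch expects,
// attaching the same cancellation interval to every search.
function buildMsearchBody(
  index: string,
  queries: Record<string, unknown>[],
  cancelAfter: string = "2s",
): MsearchLine[] {
  const lines: MsearchLine[] = [];
  for (const query of queries) {
    lines.push({ index, cancel_after_time_interval: cancelAfter });
    lines.push(query);
  }
  return lines;
}

// Per-request transport options from the list above: fail fast, never retry.
const transportOptions = { requestTimeout: 2000, maxRetries: 0 };

const body = buildMsearchBody("products", [
  { query: { match: { title: "shoes" } }, size: 10 },
  { query: { match_all: {} }, size: 5 },
]);

// With the OpenSearch JavaScript client this would be sent roughly as:
//   await client.msearch({ body }, transportOptions);
console.log(JSON.stringify(body, null, 2), transportOptions);
```

Disabling retries matters here because a retried msearch multiplies load on a collection that is already saturated.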

These settings let us recover from the spikes that would otherwise clog our system, but they are only a band-aid.

Our setup is CloudFront -> API Gateway -> Lambda -> OpenSearch Serverless collection through a VPC endpoint.

We're also having trouble finding resources on troubleshooting OpenSearch Serverless. Most cover provisioned domains, such as this one: https://repost.aws/knowledge-center/opensearch-latency-spikes

  • Thanks for exploring AOSS. Can you send an email to anapat@amazon.com with details about your collection and the queries you are trying to run? We can then look into the issue you are facing.

No answers