OpenSearch Serverless search latency spikes blocking searches

We are having issues with auto-scaling on our OpenSearch Serverless collection. We are trying to shift traffic to the new collection gradually, but as soon as it receives even a little traffic, search latency spikes, requests pile up, and searches effectively stop working because the collection is overloaded.

I was able to trigger scaling of the search OCUs by manually sending requests at a gradually increasing rate, up to ~2.5 requests/s. However, it took about 1 hour to get from zero to that rate, and the collection only scaled to 3 OCUs. I'm concerned we won't be able to handle rapid traffic increases.
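For anyone trying to reproduce the warm-up, here is a minimal sketch of a linear ramp schedule. The 1 hour duration and the ~2.5 requests/s peak come from the description above; the function names `targetRate` and `delayMs` and the choice of a linear ramp are assumptions for illustration, not how the original requests were actually paced.

```typescript
// Hypothetical linear warm-up schedule (assumption: the post does not say how
// the manual requests were paced, only the end rate and the total duration).

// Target request rate (requests/s) after `elapsedSec` seconds of a ramp that
// reaches `peakRate` after `rampSec` seconds and then stays flat.
function targetRate(elapsedSec: number, rampSec: number, peakRate: number): number {
  if (rampSec <= 0) return peakRate;
  const fraction = Math.min(Math.max(elapsedSec / rampSec, 0), 1);
  return peakRate * fraction;
}

// Delay between consecutive requests in milliseconds for a given rate.
// Returns Infinity when the rate is zero (send nothing yet).
function delayMs(rate: number): number {
  return rate > 0 ? 1000 / rate : Infinity;
}

const RAMP_SECONDS = 60 * 60; // about 1 hour, as in the post
const PEAK_RATE = 2.5;        // about 2.5 requests/s, as in the post

// Print the schedule at a few checkpoints.
for (const minute of [0, 15, 30, 60]) {
  const rate = targetRate(minute * 60, RAMP_SECONDS, PEAK_RATE);
  console.log(`t=${minute}min target=${rate.toFixed(3)} req/s delay=${delayMs(rate)} ms`);
}
```

A load generator could call `targetRate` on each tick and sleep for `delayMs(rate)` between requests, which makes the ramp repeatable when comparing how quickly the search OCUs react.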

We exclusively use the msearch API for our searches.

We had to add timeouts:

  • cancel_after_time_interval: '2s' on all queries
  • maxRetries: 0 on msearch
  • requestTimeout: 2000 on msearch
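The settings above could be wired together roughly as follows. This is a sketch under assumptions: the index name, the query, and the helper `buildMsearchBody` are hypothetical, and placing `cancel_after_time_interval` in each sub-request header is one possible reading of "on all queries" (the post does not say where it is set, and Serverless support for that placement should be verified).

```typescript
// Sketch only: builds the msearch payload and transport options described in
// the list above. No client library is imported, so the block runs standalone.

type JsonObject = Record<string, unknown>;

interface SubSearch {
  index: string;    // hypothetical index name
  body: JsonObject; // the search body for this sub-request
}

// Builds the alternating header/body array that msearch expects.
// Assumption: the cancellation interval goes into every header line.
function buildMsearchBody(searches: SubSearch[], cancelAfter: string): JsonObject[] {
  const lines: JsonObject[] = [];
  for (const s of searches) {
    lines.push({ index: s.index, cancel_after_time_interval: cancelAfter });
    lines.push(s.body);
  }
  return lines;
}

// Client-side limits from the post: no retries, 2000 ms request timeout.
const transportOptions = { maxRetries: 0, requestTimeout: 2000 };

const msearchBody = buildMsearchBody(
  [
    { index: "products", body: { query: { match: { title: "shoes" } }, size: 10 } },
    { index: "products", body: { query: { match: { title: "boots" } }, size: 10 } },
  ],
  "2s",
);

// Assumed call shape with an OpenSearch JavaScript client instance:
//   await client.msearch({ body: msearchBody }, transportOptions);
console.log(JSON.stringify(msearchBody, null, 2));
console.log(transportOptions);
```

Disabling retries matters here because a retried msearch re-submits every sub-query to a collection that is already saturated.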

These timeouts let us recover from the spikes that would otherwise clog our system, but they are only a band-aid.

Our setup is CloudFront -> API Gateway -> Lambda -> OpenSearch Serverless collection through a VPC endpoint.

We're also having trouble finding resources on how to troubleshoot OpenSearch Serverless. Most of what we find covers provisioned domains, such as this one: https://repost.aws/knowledge-center/opensearch-latency-spikes

Search monitor

  • Thanks for exploring AOSS. Can you send an email to anapat@amazon.com with details about your collection and the query you are trying to run? We can look into the issue you are facing.

Asked 9 months ago, viewed 376 times
No answers