Question on OpenSearch Serverless scale and pricing

The serverless-overview states:

Customers will be billed for a minimum of 4 OCUs (2x indexing includes primary and standby, and 2x search that includes one replica for HA) for the first collection in an account.

and

Each OCU is a combination of 6 GiB of memory and corresponding virtual CPU (vCPU), as well as data transfer to Amazon S3. Each OCU includes enough hot ephemeral storage for 120 GiB of index data.

Have a few questions for the full-text search use case:

  1. Is the 120 GiB of index data the size of the source documents, or the actual OpenSearch index segment file size?
  2. Assume one application has 2 collections and each collection has 120 GiB of index data. When both collections are accessed, how many OCUs will run? 8 OCUs?
  3. Assume OpenSearch Serverless can automatically scale down to 4 OCUs when one collection is not accessed. How is data cached for the collections? Later, when load arrives for both collections around the same time, will OpenSearch Serverless scale up to 8 OCUs? What would the query latency be? Assume the minimal 4 OCUs keep 120 GiB of index data in local storage, but the new 4 OCUs need to download the remaining 120 GiB of index data before a query can be served. If so, could the query be very slow?
  • Could someone from the OpenSearch Serverless team help answer this? I need to know whether OpenSearch Serverless is suitable for the workload. Thanks!

2 Answers
Accepted Answer

The 120GiB of index data is the size of documents? or the actual OpenSearch index segment file size?

120 GiB is the hot storage available for that OCU. It stores the data in the index's shards (primaries and replicas), so it's index size, not source size.

Assume one application has 2 collections and each collection has 120GiB of index data. When both collections are accessed, how many OCUs will run? 8 OCUs?

Collections with the same KMS key share compute resources. So if they use the same key, it will be 4 OCUs (the minimum), then scale based on the total index size.

Assume OpenSearch Serverless can automatically scale down to 4 OCUs when one collection is not accessed. How will data be cached for the collections? Later when the load arrives for both collections around the same time, OpenSearch Serverless will scale up to 8 OCUs? What would be the latency for the query? Assume the minimal 4 OCUs will keep 120GiB of index data in OCU's local storage. But the new 4 OCUs will need to download the rest 120GiB of index data before the query is served? If so, the query could be very slow?

All data is moved from indexing to S3 and retrieved from there by the search OCUs. How much data is kept hot depends on the collection type (search or time series). Search collections keep all data hot to provide the best latency. Time series collections keep the latest day of each index hot. At present we do not provide controls for the retention strategy.

Whether the OCUs scale depends on how much data they contain. If you have more than 240 GB of index total across the 2 collections, then yes, the OCUs will scale to accommodate the indices. Load time depends on many factors, so I can't really predict the latency for the first query.

answered a year ago
  • Thanks, Jon, for the answer! Too bad that the latency of the first query is hard to predict. One follow-up question.

    Whether the OCUs scale depends on how much data they contain. If you have more than 240GB of index total in the 2 collections, then yes, the OCUs will scale to accommodate the indices.

    Let's say all data is ingested and indexed, so the OCUs do not need to scale for indexing. For search, when the OCUs scale for 240 GB of index across the 2 collections, how many OCUs are added to serve the second 120 GB of index? 2 more OCUs? So when search requests come in for the 2 collections with 240 GB of index, the total charge would be 2 OCUs for indexing and 4 OCUs for search. Once the search requests stop, the search OCUs scale back from 4 to 2. Is this understanding correct? Thanks!

Let's say all data are ingested and indexed. So the OCUs do not need to scale for the index. For search, when the OCUs scale for 240GB of index in the 2 collections, how many OCUs will be added to serve the second 120GB of index? 2 more OCUs?

2 OCUs can handle the whole 240 GB. Let's take a case that's not right on the boundary: say 2 collections that use 150 GB of storage each. When you create the first collection and index data, Serverless will use 2 OCUs each for search and indexing. When you then create and populate the second collection, you need 300 GB of storage. OpenSearch Serverless will add 1 more OCU each for search and indexing, 6 total, bringing total storage capacity to 360 GB with 300 GB used.
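The storage-driven scaling described above can be sketched as simple ceiling arithmetic. This is an illustrative model only (function names and the per-tier layout are my own assumptions, based on the 120 GiB-per-OCU figure and the 2-OCU-per-tier minimum quoted from the docs):

```python
import math

OCU_STORAGE_GIB = 120   # hot storage per OCU (from the serverless overview)
MIN_OCUS_PER_TIER = 2   # minimum per tier: primary + standby / replica pair

def ocus_per_tier(total_index_gib: float) -> int:
    """Storage-driven OCU count for one tier (indexing or search)."""
    return max(MIN_OCUS_PER_TIER, math.ceil(total_index_gib / OCU_STORAGE_GIB))

def total_ocus(total_index_gib: float) -> int:
    # Indexing and search tiers each hold a full copy of the data.
    return 2 * ocus_per_tier(total_index_gib)

print(total_ocus(240))  # 240 GB fits in 2 OCUs per tier -> 4 total
print(total_ocus(300))  # ceil(300/120) = 3 per tier   -> 6 total
```

This reproduces the 300 GB example: 3 indexing plus 3 search OCUs, 6 in total.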

So when the search requests come for the 2 collections with 240GB of index, the total charge will be 2 OCUs for index

Actually, we keep the source data in the indexing tier as well as the search tier. Keeping the data in the indexing tier enables us to handle updates. In the 300 GB example, you will have 3 indexing OCUs and 3 search OCUs, for a total of 6.

and 4 OCUs for search. Once the search requests stop, the search OCUs will scale back from 4 to 2. Is this understanding correct?

We scale for search by spreading the data across more OCUs. In the 300 GB example, if you bring the traffic from X qps to 2X qps, you would likely see search OCUs double to 6. They would scale back to 3 when traffic drops back to X qps.
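The traffic-driven behavior above can be sketched the same way. Note this is a rough mental model, not the service's actual algorithm: search OCUs grow roughly in proportion to query load but never drop below the storage-driven minimum:

```python
import math

def search_ocus(storage_ocus: int, qps: float, baseline_qps: float) -> int:
    """Sketch: search OCUs scale with query load, floored at the
    number needed to hold the data (storage_ocus)."""
    return max(storage_ocus, math.ceil(storage_ocus * qps / baseline_qps))

print(search_ocus(3, 100, 100))  # baseline X qps -> 3 search OCUs
print(search_ocus(3, 200, 100))  # 2X qps         -> 6 search OCUs
```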

As a sidebar, I wanted to mention that storage used is a good way to think about this problem, but storage used is not 1:1 with the source data you send to the service to index. You need 150 GB of storage for something like 70-75 GB of source data. When OpenSearch Serverless creates an index from the source, it roughly doubles in size: indexes use different data structures and layouts than the source data, and depending on mapping we typically see about 10% inflation; then we add a replica, which doubles the storage used.
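The storage arithmetic in the sidebar works out as follows (the ~10% inflation factor and single replica are the assumptions stated above; the function name is my own):

```python
def estimated_storage_gb(source_gb: float,
                         inflation: float = 1.10,  # ~10% index overhead over source
                         replicas: int = 1) -> float:
    # One indexed copy = source * inflation; each replica adds a full copy.
    return source_gb * inflation * (1 + replicas)

print(round(estimated_storage_gb(70)))  # ~154 GB of storage used
print(round(estimated_storage_gb(75)))  # ~165 GB of storage used
```

So 70-75 GB of source lands at roughly 150-165 GB of storage used, matching the "150 GB per collection" figure in the example.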

Jon

AWS
answered a year ago
  • Thanks a lot for the detailed answers! This is very helpful! It looks like I misunderstood: I thought the search OCUs would scale as active/standby pairs, while Serverless actually spreads the data across OCUs. For the 300 GB example, we only need 3 search OCUs. If 1 OCU goes down for some reason (node crash, etc.), a new OCU automatically joins, and query latency may be impacted while the new OCU loads data. Thanks for the details about storage usage!
