OpenSearch Serverless - Initial ingest performance issue

Hello everyone,
I am working on ingestion into OpenSearch Serverless and I'm facing multiple issues.
For context, I wrote a script that performs bulk requests, mainly using parallel_bulk, to ingest our data into the OpenSearch instance we maintain on EC2. I then created a Step Function to perform the initial load using the bulk API.

When I try to ingest our data (1.8M documents, which should be fairly small as a dataset for ES / OS), I get this exception:
{ ... , 'status': 429, 'error': {'type': 'circuit_breaking_exception', 'reason': 'rejected execution of primary operation [throttled]', 'bytes_wanted': 0, 'bytes_limit': 0, 'durability': 'TRANSIENT'}, ...}.

With this error, I lose a lot of data during ingestion. I've read that it could be caused by the data being sent being too large, but on our EC2 instance we don't have any problems.
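
For illustration, here is a minimal sketch of one way to avoid dropping documents on 429 responses: resend only the rejected items, with exponential backoff between attempts. It assumes the usual bulk response shape (an `items` list with a per-action `status`). `send_bulk` is a hypothetical callable standing in for the real client call, and the retry count and delays are arbitrary values, not tuned recommendations.

```python
import time
from typing import Callable, Dict, List

# Hypothetical type: send_bulk takes a list of documents and returns a
# bulk-style response such as
# {"errors": True, "items": [{"index": {"status": 429}}, ...]}
BulkSender = Callable[[List[dict]], Dict]


def bulk_with_retry(
    docs: List[dict],
    send_bulk: BulkSender,
    max_retries: int = 5,        # assumption: arbitrary retry budget
    base_delay: float = 1.0,     # assumption: arbitrary starting delay (seconds)
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Send docs and resend only those rejected with HTTP 429.

    Returns the documents that were still rejected after all retries.
    Other error statuses are not handled in this sketch.
    """
    pending = list(docs)
    for attempt in range(max_retries + 1):
        if not pending:
            break
        response = send_bulk(pending)
        rejected = []
        # Items come back in the same order as the submitted actions.
        for doc, item in zip(pending, response.get("items", [])):
            # Each item is keyed by its action name ("index", "create", ...).
            result = next(iter(item.values()))
            if result.get("status") == 429:
                rejected.append(doc)
        pending = rejected
        if pending and attempt < max_retries:
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
    return pending
```

Anything returned by `bulk_with_retry` is still unindexed, so it can be logged or written to a dead-letter store rather than silently lost.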

I understand it may be related to the initial 2 OCUs, but in that case, how can this be handled? I need to load my dataset even when the storage is in a "cold" state. Are there any tips or tricks to avoid this error? Does anyone know an optimised way to ingest this data?
Would it be possible to have a "warm-up" solution for the collection?

Thank you for your time and your help. Regards.

oraluka
asked 5 months ago, 84 views
No Answers
