OpenSearch Serverless - Initial ingest performance issue

Hello everyone,
I am working on the ingest part of OpenSearch Serverless and I'm facing multiple issues.
For context, I wrote a script that performs bulk requests, mainly using parallel_bulk, to ingest our data into the OpenSearch instance we maintain on EC2. I then created a Step Function to perform the initial load using the bulk API.

When I try to ingest our data (1.8M documents, which should be fairly small as dataset sizes go for ES / OS), I get this exception:
{ ... , 'status': 429, 'error': {'type': 'circuit_breaking_exception', 'reason': 'rejected execution of primary operation [throttled]', 'bytes_wanted': 0, 'bytes_limit': 0, 'durability': 'TRANSIENT'}, ...}.

With this error, I lose a lot of data during the ingest process. I've read that it could be caused by the requests being too large, but on our EC2 instance we don't have any problems.

I understand it is probably related to the initial 2 OCUs, but in that case, how can I handle it? I need to load my dataset even when the storage is in a "cold" state. Are there any tips or tricks to avoid this error? Does anyone know an optimised way to ingest this data?
Could there be a "warmup" solution for the collection?
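
To make the question concrete, here is a minimal Python sketch of one possible mitigation: send smaller batches and retry only the documents rejected with status 429, waiting longer between attempts, instead of dropping them. It is an illustration under assumptions, not a confirmed fix for OpenSearch Serverless. The names `ingest_with_backoff` and `send_bulk` are hypothetical: `send_bulk` stands for any wrapper around the bulk API that returns one HTTP status code per document, in order. The chunk size and delays are placeholder values, not recommendations.

```python
import random
import time
from typing import Callable, Iterator, List, Sequence


def chunked(docs: Sequence[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield list(docs[start:start + size])


def ingest_with_backoff(
    docs: Sequence[dict],
    send_bulk: Callable[[List[dict]], List[int]],  # hypothetical bulk wrapper
    chunk_size: int = 500,      # assumed value, tune for your collection
    max_retries: int = 8,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Send docs in small batches and retry throttled (429) items.

    Returns the documents that could not be indexed, so nothing is
    lost silently.
    """
    failed: List[dict] = []
    for batch in chunked(docs, chunk_size):
        pending = batch
        for attempt in range(max_retries + 1):
            statuses = send_bulk(pending)
            results = list(zip(pending, statuses))
            # Non-throttling errors are not retried, only reported.
            failed.extend(d for d, s in results if s >= 300 and s != 429)
            # Only throttled documents are sent again.
            pending = [d for d, s in results if s == 429]
            if not pending:
                break
            if attempt < max_retries:
                # Exponential backoff with jitter, capped at max_delay.
                delay = min(max_delay, base_delay * 2 ** attempt)
                sleep(delay + random.uniform(0, delay / 2))
        # Whatever is still throttled after the last attempt is reported.
        failed.extend(pending)
    return failed
```

The returned list can be written to a file or a queue and replayed later, so a throttled initial load slows down rather than losing documents.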

Thank you for your time and your help. Regards.

oraluka
asked 6 months ago, 89 views
No answers