OpenSearch Serverless - Initial ingest performance issue


Hello everyone,
I am working on the ingest side of OpenSearch Serverless and I'm facing multiple issues.
For context, I wrote a script that performs bulk requests, mainly using parallel_bulk, to ingest our data into the OpenSearch instance we maintain on EC2. I then created a Step Function to perform the initial load using the bulk API.

When I try to ingest our data (1.8M documents, which should be fairly small for ES / OS in terms of dataset size), I get this exception:
{ ... , 'status': 429, 'error': {'type': 'circuit_breaking_exception', 'reason': 'rejected execution of primary operation [throttled]', 'bytes_wanted': 0, 'bytes_limit': 0, 'durability': 'TRANSIENT'}, ...}.

With this error, I lose a lot of data during the ingest process. I've read that it could be caused by the data being sent being too large, but on our EC2 instance we don't have any problems.

I understand this is related to the initial 2 OCUs, but how can this case be handled? I need to load my dataset even if the storage is in a "cold" state. Are there any tips or tricks to avoid this error? Does anyone know an optimized way to ingest this data?
Is there some kind of "warm-up" option for the collection?
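
To make the data-loss part of the question concrete, here is a minimal sketch of one way the throttled documents could be retried instead of dropped: send smaller batches and resend only the items rejected with status 429, waiting with exponential backoff between attempts. This is not the script described above and not an OpenSearch Serverless API; `ingest_with_backoff`, `send_batch`, and all parameter defaults are hypothetical names and values chosen for illustration. `send_batch` stands in for whatever actually calls the bulk API and is assumed to return one status code per document, in order.

```python
import time
from typing import Callable, List, Sequence

# Hypothetical type: sends one batch and returns one HTTP-like status code
# per document, in the same order as the batch.
SendBatch = Callable[[Sequence[dict]], List[int]]


def ingest_with_backoff(
    docs: Sequence[dict],
    send_batch: SendBatch,
    batch_size: int = 500,          # assumed value, tune for the collection
    max_retries: int = 5,           # assumed value
    initial_backoff: float = 1.0,   # seconds, assumed value
    max_backoff: float = 60.0,      # seconds, assumed value
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Send docs in batches and retry only the documents rejected with 429.

    Returns the documents that still failed after all retries.
    """
    failed: List[dict] = []
    for start in range(0, len(docs), batch_size):
        pending = list(docs[start:start + batch_size])
        for attempt in range(max_retries + 1):
            statuses = send_batch(pending)
            # Errors other than 429 are recorded as failed and not retried.
            failed.extend(
                d for d, s in zip(pending, statuses) if s >= 400 and s != 429
            )
            # Only throttled documents are kept for the next attempt.
            pending = [d for d, s in zip(pending, statuses) if s == 429]
            if not pending:
                break
            if attempt < max_retries:
                # Exponential backoff, capped at max_backoff.
                sleep(min(max_backoff, initial_backoff * 2 ** attempt))
        # Anything still pending here exhausted its retries.
        failed.extend(pending)
    return failed
```

The returned list makes it possible to log or requeue whatever was still rejected, so throttling slows the load down rather than silently losing documents. Whether backoff alone is enough for a collection starting at 2 OCUs is an open question here.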

Thank you for your time and your help. Regards.

oraluka
asked 6 months ago, 89 views
No answers
