Elasticsearch reindex - self-hosted to AWS - timing out, consistently failing
I have a set of what I assumed were fairly small Elasticsearch indices in a self-hosted cluster of EC2 instances. I'm in the middle of migrating this data into an AWS-managed Elasticsearch cluster, and I've been having trouble getting the reindex tasks to _consistently_ reproduce the documents in the target cluster.
The source cluster holds about 330,000 documents. My approach has been to expose the source cluster to the destination cluster and issue a reindex-from-remote operation against the destination cluster, using some scripts written in Ruby:
```ruby
def reindex(index:)
  logger.info("reindexing #{index}")
  # Kick off a reindex-from-remote on the destination (AWS) cluster,
  # pulling documents from the self-hosted source cluster.
  task = dest_es.reindex(
    body: {
      source: {
        remote: {
          host: source,
          username: source_username,
          password: source_password,
          socket_timeout: '5m',
          connect_timeout: '30s',
          external: true
        },
        index: index
      },
      dest: { index: index },
      conflicts: 'proceed'
    },
    refresh: true,
    timeout: '3m',
    wait_for_completion: wait_for_completion? # set to false by default
  )
  logger.info(task)
end
```
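This gets driven once per index, roughly like so (the index names here are just placeholders):

```ruby
# Placeholder index names; the real list comes from the source cluster.
%w[users orders events].each { |index| reindex(index: index) }
```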
This task, however, never reindexes 100% of the documents, and frequently times out with the task status stating:
```json
{
  "completed" : true,
  "task" : {
    ...
    "type" : "transport",
    "action" : "indices:data/write/reindex",
    "status" : {
      "total" : 317554,
      "updated" : 0,
      "created" : 184000,
      "deleted" : 0,
      "batches" : 184,
      "version_conflicts" : 0,
      "noops" : 0,
      "retries" : {
        "bulk" : 0,
        "search" : 0
      },
      "throttled_millis" : 0,
      "requests_per_second" : -1.0,
      "throttled_until_millis" : 0
    },
    "description" : """
    ...
    """,
    "start_time_in_millis" : 1660058838837,
    "running_time_in_nanos" : 312625819581,
    "cancellable" : true,
    "headers" : { }
  },
  "error" : {
    "type" : "socket_timeout_exception",
    "reason" : null
  }
}
```
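For reference, since `wait_for_completion` is false, I'm pulling that status with something like this (the task id is whatever the initial reindex call returns):

```ruby
# Check on the async reindex task; `task_id` comes from the initial
# reindex response (its "task" field).
status = dest_es.tasks.get(task_id: task_id)
logger.info(status)
```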
If I break this reindex operation into multiple asynchronous tasks that each reindex, say, 2,000 documents at a time, I can kick them all off and, judging by `POST /.tasks/_search`, they all seem to complete without issue. But I still can't quite get to 100%: the total number of documents that land in the destination varies between 60% and 99% of the source.
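The chunking looks roughly like the sketch below, not the exact script: `sequence_id` is a hypothetical numeric field standing in for whatever the documents can be range-filtered on, and I'm filtering in the source query because, as far as I can tell, reindex-from-remote doesn't support the `slices` parameter.

```ruby
# Sketch of the chunked approach: each call starts its own async
# reindex task on the destination cluster for a small range of docs.
# `sequence_id` is a hypothetical field used only for illustration.
def reindex_chunk(index:, gte:, lt:)
  dest_es.reindex(
    body: {
      source: {
        remote: {
          host: source,
          username: source_username,
          password: source_password
        },
        index: index,
        query: { range: { sequence_id: { gte: gte, lt: lt } } }
      },
      dest: { index: index },
      conflicts: 'proceed'
    },
    wait_for_completion: false
  )
end

# Roughly 320k documents, 2,000 per task.
(0...320_000).step(2_000) do |start|
  reindex_chunk(index: 'my-index', gte: start, lt: start + 2_000)
end
```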
Are there config settings that are generally used with reindexing across clusters? I feel like this should be a lot more straightforward than it's turning out to be.
Thanks!