Writer Instance Scaling

Hello,

I understand that increasing a writer instance's tier causes downtime; however, I'm exploring ways to avoid that.

One way that comes to mind is to add a higher-tier read replica to the cluster and then promote it to be the writer instance. Can this be achieved without causing downtime? If so, which Neptune APIs would be invoked to do it?

Thanks

asked 4 years ago, 754 views
3 Answers

Hi Austin,

The method you mention is actually the preferred way to scale the write master in a Neptune cluster. In most cases, you should already have at least one read replica for HA purposes. If not, create a new read replica with the instance size that you want for the new write master. Once this new read replica is online, issue a cluster failover with the new read replica as the failover target. Once the failover completes, the read replica becomes the write master and the old write master becomes a read replica.

The failover process can take a few seconds (this will vary based on the load on the cluster). On an idle cluster, you should expect around 3-5 seconds.

If you don't already have a read replica, you can create one with this AWS CLI command:

aws neptune create-db-instance --db-instance-identifier <new_instance_name> \
  --db-instance-class <instance_size> --engine neptune \
  --db-cluster-identifier <cluster_name>
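A new replica is not a valid failover target until it is online. As a minimal sketch, assuming the AWS CLI waiter `aws neptune wait db-instance-available` is available in your CLI version, a small wrapper could block until the instance is ready. The function name `wait_for_replica` is hypothetical and only used here for illustration:

```shell
# Hypothetical helper (not part of the AWS CLI): block until the given
# instance reports the "available" state, using the CLI's built-in waiter.
wait_for_replica() {
  local instance_id="$1"
  if [ -z "$instance_id" ]; then
    echo "usage: wait_for_replica <instance_id>" >&2
    return 2
  fi
  # The waiter polls the instance status and returns non-zero on timeout.
  aws neptune wait db-instance-available \
    --db-instance-identifier "$instance_id"
}
```

You would call it as `wait_for_replica <new_instance_name>` before moving on to the failover step.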

If you already have a read replica, you can change its size via the following command:

aws neptune modify-db-instance --db-instance-identifier <name_of_instance> \
  --db-instance-class <instance_size> --apply-immediately


Once you have a read replica at the target size that you require, you can initiate a failover to the read replica using the following command. This makes the read replica instance the write master:

aws neptune failover-db-cluster --db-cluster-identifier <cluster_name> \
  --target-db-instance-identifier <name_of_new_instance>
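To confirm that the failover has finished, one option is to ask the cluster which member is currently the writer. The sketch below assumes the `IsClusterWriter` flag in the `DBClusterMembers` list returned by `aws neptune describe-db-clusters`. The function names `current_writer` and `wait_for_writer`, and the default polling values, are hypothetical:

```shell
# Hypothetical helper: print the identifier of the instance that the
# cluster currently reports as its writer.
current_writer() {
  local cluster_id="$1"
  aws neptune describe-db-clusters \
    --db-cluster-identifier "$cluster_id" \
    --query 'DBClusters[0].DBClusterMembers[?IsClusterWriter==`true`].DBInstanceIdentifier' \
    --output text
}

# Hypothetical helper: poll until the expected instance is the writer.
# Returns 0 on success, 1 if max_tries polls pass without a match.
wait_for_writer() {
  local cluster_id="$1" expected="$2" max_tries="${3:-30}" delay="${4:-2}"
  local i=0
  while [ "$i" -lt "$max_tries" ]; do
    if [ "$(current_writer "$cluster_id")" = "$expected" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_writer <cluster_name> <name_of_new_instance>` would return once the new instance has taken over the writer role, or fail after the polling budget is used up.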

When the failover process is done, you can address the remaining read replica in a couple of ways. If it is a read replica that you want to keep for HA purposes (recommended), then you should now scale this read replica to match the size of the write master. You want to have a read replica of the same size as the write master for HA events so that you only experience minimal write performance degradation if an HA event were to occur.

If this is a development environment where you no longer require the read replica, then you can delete it after the failover process has occurred.

AWS
answered 4 years ago

Thanks Taylor!

Your response answers my question entirely. Now I have a follow-up question.

If we have active connections to the cluster writer endpoint, and then we promote a different instance to become writer, what happens to those active connections?

Thanks

answered 4 years ago

Hi,

A failover causes both the reader and writer processes to restart so that they can assume their new roles (the reader becomes the writer and vice versa). As part of the restart, any open WebSocket connections are closed from the server side, which surfaces as client-side errors when the connections drop. Clients then have to re-establish their connections; some Gremlin clients re-establish dropped connections automatically.
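For clients or scripts that do not reconnect on their own, a retry loop around the failing call is one way to ride out the restart. The following is a generic sketch, not Neptune-specific behavior: `retry_with_backoff` is a hypothetical helper that re-runs any command with exponentially growing delays, and the attempt counts and delays are illustrative assumptions:

```shell
# Hypothetical helper: run a command, retrying with exponential backoff.
# Usage: retry_with_backoff <max_attempts> <initial_delay_seconds> <command> [args...]
retry_with_backoff() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while true; do
    # Run the command as given; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))        # double the wait before the next attempt
    attempt=$((attempt + 1))
  done
}
```

A call such as `retry_with_backoff 5 1 curl -sf "https://<cluster_endpoint>:8182/status"` would keep probing until the endpoint answers again; the endpoint placeholder and port are assumptions to adjust for your cluster.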

Regards,
Kunal.

Edited by: kunal-aws on Nov 8, 2019 4:37 PM

AWS
answered 4 years ago
