How do I resize an Amazon Redshift cluster?

I want to resize an Amazon Redshift cluster. How does that impact performance and billing?

Resolution

There are four ways to resize an Amazon Redshift cluster:

  • Elastic resize: If elastic resize is available for your cluster, use it to change the node type, number of nodes, or both. When you change only the number of nodes, queries are temporarily paused and connections are kept open. An elastic resize takes 10-15 minutes. During the resize operation, the cluster is read-only.
  • Classic resize: Use classic resize to change the node type, number of nodes, or both. Choose this option when you're resizing to a configuration that isn't available through elastic resize. A classic resize takes two hours or more and can take up to several days, depending on your data size. During the resize operation, the source cluster is read-only.
  • Snapshot, restore, and resize: To keep the cluster available during a classic resize operation, copy the existing cluster, and then resize the new cluster. If data is written to the source cluster after the snapshot is taken, you must manually copy that data to the newly created target cluster after the migration completes.
  • Fast classic resize: Fast classic resize is as quick as elastic resize and functions similarly to classic resize. This resize operation has two main stages. In stage 1 (critical path), data is migrated from the source cluster to a target cluster, and the cluster is in read-only mode. In stage 2 (off critical path), the data is redistributed in the background according to the previous data distribution style. The duration of this stage depends on the volume of data to redistribute and on the cluster workload.

For more information, see Overview of managing clusters in Amazon Redshift.
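
As a rough illustration of the options above, the following shell sketch wraps the AWS CLI calls for an elastic resize, a classic resize, and the snapshot, restore, and resize approach. The function names and all cluster identifiers, snapshot identifiers, node types, and node counts are placeholders chosen for this example, and the sketch assumes that the AWS CLI is installed and configured with permissions for Amazon Redshift.

```shell
#!/usr/bin/env bash
# Sketch only: function names and argument values are illustrative assumptions.

# Start an elastic (default) or classic resize on an existing cluster.
start_resize() {
    local cluster_id="$1" node_type="$2" node_count="$3" mode="${4:-elastic}"
    local classic_flag
    case "$mode" in
        elastic) classic_flag="--no-classic" ;;
        classic) classic_flag="--classic" ;;
        *) echo "mode must be 'elastic' or 'classic'" >&2; return 2 ;;
    esac
    aws redshift resize-cluster \
        --cluster-identifier "$cluster_id" \
        --node-type "$node_type" \
        --number-of-nodes "$node_count" \
        "$classic_flag"
}

# Copy the source cluster from a new snapshot, then classic-resize the copy
# so that the source cluster stays available during the resize.
snapshot_restore_resize() {
    local source_id="$1" snapshot_id="$2" target_id="$3" node_type="$4" node_count="$5"
    aws redshift create-cluster-snapshot \
        --cluster-identifier "$source_id" \
        --snapshot-identifier "$snapshot_id" || return 1
    # Block until the snapshot can be restored.
    aws redshift wait snapshot-available \
        --snapshot-identifier "$snapshot_id" || return 1
    aws redshift restore-from-cluster-snapshot \
        --cluster-identifier "$target_id" \
        --snapshot-identifier "$snapshot_id" || return 1
    # Block until the restored copy accepts a resize request.
    aws redshift wait cluster-available \
        --cluster-identifier "$target_id" || return 1
    start_resize "$target_id" "$node_type" "$node_count" classic
}
```

For example, `start_resize my-cluster ra3.4xlarge 4` would request an elastic resize, and adding `classic` as a fifth argument would request a classic resize instead. Data written to the source cluster after the snapshot is taken still has to be copied over manually.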

Prerequisites for resize

To verify whether your cluster is eligible for elastic resize, run the following command in the AWS CLI or AWS CloudShell:

aws redshift describe-node-configuration-options --cluster-identifier <cluster-id> --action-type resize-cluster

Note: If you receive errors when running AWS Command Line Interface (AWS CLI) commands, make sure that you’re using the most recent AWS CLI version.

If the cluster is eligible for elastic resize, the output is similar to the following:

{
    "NodeConfigurationOptionList": [
        {
            "NodeType": "dc2.large",
            "NumberOfNodes": 2,
            "EstimatedDiskUtilizationPercent": 0.01
        },
        {
            "NodeType": "ra3.16xlarge",
            "NumberOfNodes": 2,
            "EstimatedDiskUtilizationPercent": 0.01
        }
    ]
}

If the cluster isn't eligible for elastic resize, the output is similar to the following:

{
    "NodeConfigurationOptionList": []
}
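
To script this check, one possible approach is to count the entries in NodeConfigurationOptionList with the AWS CLI's global --query option. The function names below are hypothetical helpers for this example, not part of the AWS CLI.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative assumptions.

# Print how many target configurations the eligibility command returns.
elastic_resize_option_count() {
    local cluster_id="$1"
    aws redshift describe-node-configuration-options \
        --cluster-identifier "$cluster_id" \
        --action-type resize-cluster \
        --query 'length(NodeConfigurationOptionList)' \
        --output text
}

# Succeed (exit 0) when at least one configuration is listed,
# fail with 1 when the list is empty, and with 2 when the CLI call fails.
is_elastic_resize_eligible() {
    local count
    count="$(elastic_resize_option_count "$1")" || return 2
    [ "$count" -gt 0 ]
}
```

A caller could then branch on the result, for example `if is_elastic_resize_eligible my-cluster; then echo "elastic resize available"; fi`, where my-cluster is a placeholder identifier.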

Performance benchmark

Before you resize, you can benchmark your workloads on the existing cluster and on the target cluster configuration to inform your resize decision.

Resize operation speed

If elastic resize is used to resize a cluster with the same node type, the operation doesn't create a new cluster. As a result, the operation completes quickly. The time required to complete a classic resize or a snapshot and restore operation might vary, depending on the following factors:

  • The workload on the source cluster.
  • The number and size of the tables being transferred from source to target cluster.
  • How evenly data is distributed across the compute nodes and slices.
  • The node configuration in the source and target clusters.

Note: If you perform a classic resize on a cluster with a large volume of data, and the nodes aren't RA3, data migration might be slow. It can take several days to migrate a cluster with multiple terabytes (TB) of data. Data transfer for RA3 nodes completes more quickly.

Optimize operation speed

For information about reducing the time required for a classic resize or a snapshot and restore operation, see Top 10 performance tuning techniques for Amazon Redshift.

To check the status of your resize operation using the Amazon Redshift console, choose the Status tab on the cluster details page. The Status tab shows the average rate of transfer, the elapsed time, and the remaining time.
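
The same figures can also be read from the AWS CLI. The sketch below assumes that the describe-resize command is the CLI counterpart of the Status tab, and resize_progress is a hypothetical helper name. It prints the status, average transfer rate, elapsed time, and estimated remaining time on one tab-separated line.

```shell
#!/usr/bin/env bash
# Sketch only: resize_progress is an illustrative helper name.

# Print status, average rate (MB/s), elapsed seconds, and estimated
# remaining seconds for the most recent resize of the given cluster.
resize_progress() {
    aws redshift describe-resize \
        --cluster-identifier "$1" \
        --query '[Status, AvgResizeRateInMegaBytesPerSecond, ElapsedTimeInSeconds, EstimatedTimeToCompletionInSeconds]' \
        --output text
}
```

Fields that the service hasn't populated yet are printed as None in text output.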

Troubleshooting

  • During a resize operation, your tables might increase or decrease in size. This behavior is expected. For more information, see Why does a table in my Amazon Redshift cluster consume more disk storage space than expected?
  • If the resize status is NONE in the AWS CLI, then the target cluster is still being provisioned and data hasn't been copied over yet. After your target cluster is provisioned, the status changes to IN_PROGRESS.
  • If your AWS CloudFormation StackSets operation fails to resize the cluster with the error "An internal error has occurred. Please try your query again at a later time", check whether the cluster is eligible for elastic resize. CloudFormation uses elastic resize because Classic:false is set by default.
  • If you receive an error message prompting you to "Please choose a larger target cluster," then your data does not fit into the target cluster. Resize your Amazon Redshift cluster with more nodes or a different node type.
  • To cancel a resize operation before it completes, choose cancel resize from the cluster list in the Amazon Redshift console. For more information, see Snapshot, restore, and resize.
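
The status values and the cancel action described above can also be handled from the AWS CLI. The following is a minimal sketch, assuming that describe-resize reports one of NONE, IN_PROGRESS, CANCELLING, SUCCEEDED, or FAILED. The helper names and the 60-second default polling interval are choices made for this example.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the polling interval are illustrative assumptions.

# Poll the resize status until it leaves the transitional states.
# Returns 0 on SUCCEEDED, 1 on any other final status, 2 if the CLI call fails.
wait_for_resize() {
    local cluster_id="$1" interval="${2:-60}" status
    while :; do
        status="$(aws redshift describe-resize \
            --cluster-identifier "$cluster_id" \
            --query 'Status' --output text)" || return 2
        echo "resize status: $status"
        case "$status" in
            NONE|IN_PROGRESS|CANCELLING) sleep "$interval" ;;
            SUCCEEDED) return 0 ;;
            *) return 1 ;;
        esac
    done
}

# Request cancellation of a resize that hasn't completed yet.
cancel_resize() {
    aws redshift cancel-resize --cluster-identifier "$1"
}
```

A status of NONE is treated as "still provisioning" here, which matches the behavior described above for a resize that was just requested.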

Billing for resized clusters

  • During the resize operation, you're billed for the clusters that are available to you. For example, during the resize operation, you're billed for the source configuration. After the resize is complete, you're no longer billed for the source configuration. Billing starts for the target configuration as soon as the cluster status changes to Available.
  • When you resize from smaller node types (large, xlarge) to larger node types (8xlarge), your cluster has more storage per node. The more storage you have per node, the more metadata is written when you run a COMMIT. Therefore, the base cost of a single COMMIT operation is higher for larger nodes. If you run several small COMMIT operations concurrently, you might see a decrease in performance. For improved performance, group multiple changes into a single COMMIT operation.
  • If you purchased Reserved Instances, then your billing depends on the resized cluster configuration, reserved node types, and reserved node count. For more information, see How reserved nodes work.
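
To illustrate the advice about grouping changes into a single COMMIT, the sketch below wraps a batch of SQL statements in one explicit transaction before they are sent to the cluster. The helper name wrap_in_single_commit, the file changes.sql, and the REDSHIFT_CONN connection string are hypothetical, and the sketch assumes that the statements don't contain their own BEGIN or COMMIT.

```shell
#!/usr/bin/env bash
# Sketch only: helper name, file name, and connection variable are assumptions.

# Emit the SQL statements read from stdin wrapped in one transaction,
# so that the whole batch is committed once instead of once per statement.
wrap_in_single_commit() {
    echo "BEGIN;"
    cat
    echo "COMMIT;"
}

# Possible usage with psql:
#   wrap_in_single_commit < changes.sql | psql "$REDSHIFT_CONN"
```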

Related information

Resizing clusters in Amazon Redshift

Troubleshooting connection issues in Amazon Redshift

Building high-quality benchmark tests for Amazon Redshift using SQLWorkbench and psql

AWS OFFICIAL, updated 8 months ago