Going by the AWS node conversion recommendations, I've mostly read that performance stays roughly the same when moving from DC2 to RA3 instances. In your case, though, it looks like you almost halved the capacity when switching to RA3. Is this the AWS-recommended configuration?
The documentation states the following:
"Create 3 nodes of ra3.xlplus for every 8 nodes of dc2.large"
https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html
So we would end up with 5.2 ra3.xlplus nodes. But since those nodes aren't cheap (10k/y), the decision wasn't easy. The documentation also states that S3 loads were improved compared to DC/DS nodes. But with loads taking 3x longer(!) and simple joins taking twice as long, I'm quite speechless. That's why I'm looking for other experiences; it might be that performance will improve over time. We started the new cluster yesterday, so it's fresh, and many loads are now running for the first time on the RA3 cluster.
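For anyone who wants to redo the sizing arithmetic, here is a minimal sketch of the ratio quoted from the documentation (3 ra3.xlplus per 8 dc2.large). The function name and the example cluster size of 16 nodes are made up for illustration; they are not our actual numbers, and the rounding-up step is my own assumption since you can't buy a fraction of a node.

```python
import math

# Ratio quoted from the AWS documentation:
# "Create 3 nodes of ra3.xlplus for every 8 nodes of dc2.large"
RA3_XLPLUS_PER_DC2_LARGE = 3 / 8


def recommended_ra3_xlplus_nodes(dc2_large_nodes):
    """Return (exact ratio result, result rounded up to whole nodes).

    Rounding up is an assumption made here, not something the docs prescribe.
    """
    exact = dc2_large_nodes * RA3_XLPLUS_PER_DC2_LARGE
    return exact, math.ceil(exact)


if __name__ == "__main__":
    # Hypothetical cluster size, for illustration only.
    exact, rounded = recommended_ra3_xlplus_nodes(16)
    print(f"exact: {exact}, whole nodes: {rounded}")
```

A fractional result like ours means choosing between rounding down (cheaper, fewer slices) and rounding up (closer to the recommendation).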
Hi, thank you for your patience.
For the load issue, I think what you are experiencing is expected given the kind of files you are loading (more than 500k files with a few rows each).
In this case the better throughput between S3 and Redshift does not help, because the delay most likely comes from opening all those files, not from moving a few rows.
ra3.xlplus nodes have the same number of slices as dc2.large nodes, and each slice loads one file at a time. Based on the node counts you shared, you now have roughly one third of the slices, so each slice has a longer queue of files to load.
Also, ra3.xlplus has 2 cores per slice, but the load operation uses only one of them, hence the 50% utilization you see.
As you already know, optimizing/consolidating the input files would definitely benefit both clusters.
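To make the queueing effect concrete, here is a rough sketch of how many files each slice has to work through when every slice loads one file at a time. The node counts (16 and 6) are hypothetical placeholders, not your real cluster sizes, and `slices_per_node=2` is an assumption used for both node types; check the actual slice count of your clusters before relying on the numbers.

```python
import math


def files_per_slice(total_files, nodes, slices_per_node=2):
    """Worst-case number of files queued on a single slice.

    Assumes files are spread evenly across slices and that each slice
    loads one file at a time. slices_per_node=2 is an assumption.
    """
    total_slices = nodes * slices_per_node
    return math.ceil(total_files / total_slices)


if __name__ == "__main__":
    total_files = 500_000  # "more than 500k files" from the thread

    # Hypothetical node counts, for illustration only.
    old_queue = files_per_slice(total_files, nodes=16)  # dc2.large cluster
    new_queue = files_per_slice(total_files, nodes=6)   # ra3.xlplus cluster

    print(f"files per slice before: {old_queue}")
    print(f"files per slice after:  {new_queue}")
    print(f"queue growth factor:    {new_queue / old_queue:.2f}")
```

With tiny files the per-file open cost dominates, so load time should grow roughly with the queue length per slice. Consolidating the input into fewer, larger files shortens that queue on either cluster.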
As for the join slowdown, it is not possible to give any feedback without seeing the EXPLAIN output from both clusters, knowing how much data the query reads, and having an idea of the table design.
Could you provide additional information or open a support case (if not already done)?
Hope this helps.