EC2 instance (c6in.2xlarge) takes more read time than the smaller instance type (c6in.xlarge)

EC2 instance comparison analysis

We are comparing the performance of EC2 instances when reading files from an S3 bucket. A Python script reads the Parquet files from the bucket using the pandas.read_parquet(S3 object key) function.

The EC2 instances considered for comparison are c6in.xlarge and c6in.2xlarge.

Instance name | vCPU | Memory | Storage | Network performance | EBS volume
c6in.xlarge | 4 | 8 GiB | EBS only | Up to 30000 Megabit | 8 GiB gp3
c6in.2xlarge | 8 | 16 GiB | EBS only | Up to 40000 Megabit | 8 GiB gp3

Steps followed for comparison

  1. Create the EC2 instances (c6in.xlarge and c6in.2xlarge).
  2. Run the Python script on each instance.
  3. Note the time taken to read all files from the S3 bucket.
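
The steps above can be sketched as a small timing helper. This is only a minimal sketch, not the actual script used in the test: the function names `read_parquet_from_s3` and `time_reads` and the bucket path in the usage note are hypothetical, and reading an `s3://` path with pandas assumes the `s3fs` package is installed.

```python
import time
from typing import Callable, Iterable


def read_parquet_from_s3(uri: str):
    """Read one Parquet object from S3 into a DataFrame (hypothetical helper)."""
    # Imported lazily so the timing helper can be used without pandas installed.
    import pandas as pd

    # pandas resolves s3:// URIs through fsspec/s3fs (assumed to be installed).
    return pd.read_parquet(uri)


def time_reads(
    uris: Iterable[str],
    reader: Callable[[str], object] = read_parquet_from_s3,
) -> float:
    """Read every URI one after another and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    for uri in uris:
        reader(uri)
    return time.perf_counter() - start
```

For example, `time_reads(["s3://example-bucket/data/part-0.parquet"])` would return the total read time in seconds for that (hypothetical) object. Running the same list of objects on both instances keeps the comparison like for like.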

Issue Description

When we compared the read time on the c6in.xlarge and c6in.2xlarge instances, we observed that the read time is higher on the c6in.2xlarge instance. Since c6in.2xlarge has double the number of vCPUs and higher network performance than c6in.xlarge, we expected the read time to be lower on c6in.2xlarge than on c6in.xlarge.

Observed read times and metrics on the c6in.xlarge and c6in.2xlarge instances

c6in.xlarge -> 247.705972 sec

[Image: c6in.xlarge-metrics]

c6in.2xlarge -> 279.7610955 sec

[Image: c6in.2xlarge-metrics]

Comparison between gp3, io1 and io2 EBS volume types

We also tried changing the EBS volume type while keeping the EBS volume size constant (8 GiB). We switched the volume type to gp3, io1 and io2 in turn and used the suggested default value for each.

Observations for changing EBS volume types

Here we observed that there is not much difference between the gp3, io1 and io2 EBS volume types on the c6in.xlarge instance, and only a little difference on the c6in.2xlarge instance. From the instance point of view, however, the read time is still higher on c6in.2xlarge than on c6in.xlarge.

EBS Volume Type | Default IOPS | c6in.xlarge Read Time (sec) | c6in.2xlarge Read Time (sec)
gp3 | 3000 | 203.922494409999 | 219.475556271999
io1 | 400 | 203.525826416999 | 228.025478206
io2 | 4000 | 203.134390571999 | 221.881286345
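
To put these numbers in proportion, the relative slowdown of c6in.2xlarge per volume type can be computed from the table. The snippet below is a small illustrative sketch; the names `read_times` and `slowdown_percent` are made up for this example, and the values are copied from the table above.

```python
# Read times in seconds, copied from the table: (c6in.xlarge, c6in.2xlarge).
read_times = {
    "gp3": (203.922494409999, 219.475556271999),
    "io1": (203.525826416999, 228.025478206),
    "io2": (203.134390571999, 221.881286345),
}


def slowdown_percent(xlarge_sec: float, twoxlarge_sec: float) -> float:
    """Return how much slower c6in.2xlarge is than c6in.xlarge, in percent."""
    return (twoxlarge_sec - xlarge_sec) / xlarge_sec * 100.0


# Slowdown per EBS volume type.
slowdowns = {
    volume: slowdown_percent(xlarge, twoxlarge)
    for volume, (xlarge, twoxlarge) in read_times.items()
}

for volume, pct in slowdowns.items():
    print(f"{volume}: c6in.2xlarge is {pct:.1f}% slower")
```

In all three cases the larger instance is slower by roughly 8 to 12 percent, so the gap does not appear to depend on the volume type.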

Please help me identify the root cause and how to resolve it. Thanks in advance.

asked 10 months ago, 291 views
1 Answer

It is not clear why you are seeing a difference. I'd ask whether these instances are placed in the same Availability Zone. In your graphs, the maximum network-in value is lower for the faster instance. Maybe you used a different-sized file?

But the test is neither CPU-intensive nor network-intensive, so you are over-provisioned for this test. The CPU utilization was 12% and 5%, which is pretty close to the 2x difference in vCPU count between the instance types. Whether you have 4 or 8 vCPUs, a single process copying a file from S3 won't be able to leverage multiple CPUs.
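
One way to check the single-process point above is to issue the S3 reads concurrently and compare the total time against the sequential run. The following is only a sketch under assumptions, not a confirmed fix: `read_parquet_from_s3`, `read_all_concurrently` and `timed` are hypothetical names, reading `s3://` paths with pandas assumes `s3fs` is installed, and whether threads actually shorten the read time depends on the workload.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def read_parquet_from_s3(uri: str):
    """Read one Parquet object from S3 into a DataFrame (hypothetical helper)."""
    # Imported lazily so the helpers below can be tried without pandas installed.
    import pandas as pd

    return pd.read_parquet(uri)


def read_all_concurrently(
    uris: Sequence[str],
    max_workers: int = 8,
    reader: Callable[[str], object] = read_parquet_from_s3,
) -> List[object]:
    """Read all URIs with a thread pool so several S3 requests are in flight at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the same order as the input URIs.
        return list(pool.map(reader, uris))


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

Calling `timed(read_all_concurrently, uris, max_workers=8)` on each instance, with `uris` being the same list of objects, would show whether the extra vCPUs and bandwidth of c6in.2xlarge make a difference once more than one read is in flight.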

Also a tip: add another vertical axis to your graph so that there is a separate scale for CPU utilization.

AWS
MODERATOR
philaws
answered 10 months ago
