ECS t3.micro Bottlerocket performance degradation


Hello,

We have an ECS service (2 tasks) running on Bottlerocket on 2 t3.micro instances (1 task per instance). The app is PHP on Apache. Under low load (1 request per second), the average response time (reported by the ALB as target response time) is around 150 ms; the app makes a few network calls (to ElastiCache, SNS, DynamoDB, etc.). However, when we increase the load for a test to about 20 requests per second, performance degrades: the average response time rises to about 1 second. Profiling the PHP app shows that most of the time is spent in network calls (e.g. a call to DynamoDB now takes around 170 ms).

My first thought was that we were hitting the open files limit, but raising the nofile ulimit didn't help much: response times improved by only 10-15%. Switching the network mode from bridge to awsvpc didn't help at all.

Then I compared simple curl timings:

low load:

 time curl -w "@/curl.txt" -o /dev/null -s "https://dynamodb.eu-west-1.amazonaws.com"
     time_namelookup:  0.001612s
        time_connect:  0.002452s
     time_appconnect:  0.021202s
    time_pretransfer:  0.021243s
       time_redirect:  0.000000s
  time_starttransfer:  0.022665s
          time_total:  0.022717s
                     ----------

real    0m0.040s
user    0m0.018s
sys     0m0.007s

vs. increased load:

# time curl -w "@/curl.txt" -o /dev/null -s "https://dynamodb.eu-west-1.amazonaws.com"
     time_namelookup:  0.007911s
        time_connect:  0.008654s
     time_appconnect:  0.106077s
    time_pretransfer:  0.106182s
       time_redirect:  0.000000s
  time_starttransfer:  0.111556s
          time_total:  0.111619s
                     ----------

real    0m0.233s
user    0m0.032s
sys     0m0.000s

So now I think it's something CPU-related; however, CPU utilization doesn't go over 40% in the metrics (and, just in case, memory doesn't go over 50%).

Any suggestions on what to check next, or are we hitting some "known" limit?
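To make the comparison easier to read, here is a rough breakdown of the two curl runs into phases. It relies on curl's `-w` timers being cumulative from the start of the request, so each phase is the difference between consecutive timers. The phase names (`tcp_connect`, `tls_handshake`, `server_wait`) are labels chosen for this sketch, not curl fields.

```python
# Timings copied from the two curl runs above (seconds, cumulative).
LOW = {
    "time_namelookup": 0.001612,
    "time_connect": 0.002452,
    "time_appconnect": 0.021202,
    "time_pretransfer": 0.021243,
    "time_starttransfer": 0.022665,
}
HIGH = {
    "time_namelookup": 0.007911,
    "time_connect": 0.008654,
    "time_appconnect": 0.106077,
    "time_pretransfer": 0.106182,
    "time_starttransfer": 0.111556,
}


def phases(t):
    """Split curl's cumulative timers into per-phase durations in ms."""
    return {
        "dns": t["time_namelookup"] * 1000,
        "tcp_connect": (t["time_connect"] - t["time_namelookup"]) * 1000,
        "tls_handshake": (t["time_appconnect"] - t["time_connect"]) * 1000,
        "server_wait": (t["time_starttransfer"] - t["time_pretransfer"]) * 1000,
    }


if __name__ == "__main__":
    low, high = phases(LOW), phases(HIGH)
    for name in low:
        print(f"{name:14s} {low[name]:8.3f} ms -> {high[name]:8.3f} ms")
```

In these numbers the TCP connect stays under 1 ms in both runs, while the TLS handshake grows from about 19 ms to about 97 ms. The handshake is the CPU-heavy part of the request, which is why the slowdown looks CPU-related rather than network-related.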

asked 2 years ago, 206 views
1 Answer

T-class instances are what's called burstable instances: they have a baseline CPU level and can burst above it for as long as CPU credits are available, after which (in standard credit mode) they are throttled back to the baseline.

(number of credits earned per hour / number of vCPUs) / 60 minutes = % baseline utilization

t3.micro: (12 / 2) / 60 = 10% baseline utilization per vCPU

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/burstable-credits-baseline-concepts.html
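As a worked version of that formula, here is a small sketch that also estimates how fast credits are spent at a given utilization. It assumes one CPU credit equals one vCPU running at 100% for one minute, and it plugs in the roughly 40% CPU utilization reported in the question; the function names are made up for illustration.

```python
def baseline_utilization_pct(credits_per_hour, vcpus):
    """Baseline CPU utilization per vCPU, in percent."""
    return credits_per_hour * 100 / (vcpus * 60)


def credits_spent_per_hour(cpu_utilization_pct, vcpus):
    """Credits consumed per hour at a given average CPU utilization.

    Assumes 1 credit = 1 vCPU at 100% for 1 minute.
    """
    return cpu_utilization_pct * vcpus * 60 / 100


if __name__ == "__main__":
    earned, vcpus = 12, 2  # t3.micro figures from the answer above
    observed_pct = 40      # CPU utilization reported in the question

    spent = credits_spent_per_hour(observed_pct, vcpus)
    print(f"baseline: {baseline_utilization_pct(earned, vcpus):.0f}% per vCPU")
    print(f"earned {earned}/h, spent {spent:.0f}/h, net {earned - spent:+.0f}/h")
```

If the 40% figure holds for a full hour, the instance spends about 48 credits while earning 12, so the balance drains by roughly 36 credits per hour.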

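Have you checked the CPU credit metrics (CPUCreditBalance and CPUCreditUsage) in CloudWatch? A sustained 40% CPU utilization is well above the 10% baseline, so the instances spend credits faster than they earn them. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/burstable-performance-instances-monitoring-cpu-credits.html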
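One way to pull the credit balance programmatically is sketched below with boto3's CloudWatch `get_metric_statistics` call. The instance ID is a hypothetical placeholder, the region is assumed from the DynamoDB endpoint in the question, and the helper names are invented for this example; it needs boto3 installed and credentials that allow `cloudwatch:GetMetricStatistics`.

```python
from datetime import datetime, timedelta, timezone


def credit_query(instance_id, metric_name="CPUCreditBalance", hours=3, now=None):
    """Build the get_metric_statistics parameters for one instance."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,  # credit metrics are reported every 5 minutes
        "Statistics": ["Average"],
    }


def fetch_datapoints(instance_id, region="eu-west-1"):
    """Return the metric datapoints sorted by time (requires boto3)."""
    import boto3  # third-party; imported here so the helper above stays standalone

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    response = cloudwatch.get_metric_statistics(**credit_query(instance_id))
    return sorted(response["Datapoints"], key=lambda p: p["Timestamp"])


if __name__ == "__main__":
    # Hypothetical instance ID; replace with one of the ECS container instances.
    for point in fetch_datapoints("i-0123456789abcdef0"):
        print(point["Timestamp"].isoformat(), point["Average"])
```

A balance that falls steadily during the load test and sits near zero when response times degrade would point to credit exhaustion; running the same query with `metric_name="CPUCreditUsage"` shows how quickly credits are being spent.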

AWS
answered a year ago
