Hello,
As S3 is a distributed service, a small number of 5xx errors is expected (typically less than 0.01% of the total request rate). As mentioned in the documentation, your application can achieve at least 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per partitioned Amazon S3 prefix. However, this request rate is not available to a bucket or a prefix as soon as it is created. As the request rate increases gradually, S3 should scale up automatically to accommodate the higher request rates. Refer to this blog to understand how S3 scales up.
To monitor the number of 5xx status error responses that you receive, use one of these options:
- Turn on Amazon CloudWatch metrics. Amazon S3 CloudWatch request metrics include a metric for 5xx status responses. Note that request metrics incur additional CloudWatch charges and are billed at the standard CloudWatch rate.
- Turn on Amazon S3 server access logging. Because server access logging captures all requests, you can filter the logs and review every request that received a 500 or 503 error response. You can use Amazon Athena to query the logs.
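If you want a quick look before setting up Athena, the filtering step for access logs can be sketched in a few lines of Python. This is only a rough sketch: it assumes that the first double-quoted field in each record is the Request-URI and that the HTTP status follows it directly, and the sample records are made up and shortened. The names `count_5xx` and `sample_lines` are illustrative, not part of any AWS tool.

```python
import re
from collections import Counter

# Assumption: the first double-quoted field of a record is the Request-URI,
# and the field immediately after it is the HTTP status code.
STATUS_AFTER_REQUEST_URI = re.compile(r'"[^"]*" (\d{3}) ')


def count_5xx(log_lines):
    """Return (Counter of 5xx status codes, number of records parsed)."""
    errors = Counter()
    total = 0
    for line in log_lines:
        match = STATUS_AFTER_REQUEST_URI.search(line)
        if match is None:
            continue  # skip lines that do not look like access log records
        total += 1
        status = match.group(1)
        if status.startswith("5"):
            errors[status] += 1
    return errors, total


# Made-up, shortened records for illustration only; real records have more fields.
sample_lines = [
    'owner bucket [06/Feb/2019:00:00:38 +0000] 192.0.2.3 user REQ1 REST.GET.OBJECT k1 "GET /bucket/k1 HTTP/1.1" 200 - 113 113 7 6 "-" "agent" -',
    'owner bucket [06/Feb/2019:00:00:39 +0000] 192.0.2.3 user REQ2 REST.PUT.OBJECT k2 "PUT /bucket/k2 HTTP/1.1" 503 SlowDown 0 - 3 - "-" "agent" -',
    'owner bucket [06/Feb/2019:00:00:40 +0000] 192.0.2.3 user REQ3 REST.PUT.OBJECT k3 "PUT /bucket/k3 HTTP/1.1" 500 InternalError 0 - 5 - "-" "agent" -',
]

errors, total = count_5xx(sample_lines)
error_rate = sum(errors.values()) / total
print(dict(errors), f"{error_rate:.2%} of {total} requests")
```

Comparing `error_rate` against the expected baseline tells you whether the 5xx volume is actually unusual.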
You can increase the number of retries to a higher value (such as 10) to reduce the number of 503 errors you are seeing. Under a sustained high load, S3 will eventually scale up.
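To illustrate why more retries help, here is a minimal sketch of retrying with capped exponential backoff and jitter. In practice you would normally adjust the retry settings of your SDK client rather than write this yourself. `ThrottledError`, `call_with_retries` and `flaky_put` are hypothetical names used only for this example, and the delay values are assumptions.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a retryable 500/503 response."""


def call_with_retries(operation, max_retries=10, base_delay=0.1, max_delay=20.0,
                      sleep=time.sleep, rng=random.random):
    """Call operation(); on ThrottledError retry up to max_retries times.

    The wait before each retry is a random fraction of a cap that doubles
    per attempt (base_delay, 2*base_delay, ...) and never exceeds max_delay.
    """
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_retries:
                raise  # retries exhausted, surface the error to the caller
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(rng() * cap)


# Simulated operation that is throttled three times and then succeeds.
attempts = {"count": 0}


def flaky_put():
    attempts["count"] += 1
    if attempts["count"] <= 3:
        raise ThrottledError("503 Slow Down")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = call_with_retries(flaky_put, sleep=delays.append)
```

The jitter spreads retries from many clients over time, so they do not all hit the same prefix again at the same instant while S3 is still scaling.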
Another option you could explore is to reduce the increment size from 1,000 TPS to 500 TPS and to lengthen the interval between increments from 5 to 10 minutes. This extends the warm-up period and gives S3 more time to scale up.
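To see how much longer the gentler ramp takes, the two schedules can be compared with a small helper. This is just a sketch: `ramp_schedule` is a hypothetical function, and the 3,000 TPS target is an assumed figure for illustration, not a value from your workload.

```python
def ramp_schedule(target_tps, step_tps=500, interval_minutes=10, start_tps=0):
    """Return a list of (minute_offset, tps) steps that ramp up to target_tps."""
    if step_tps <= 0 or interval_minutes <= 0:
        raise ValueError("step_tps and interval_minutes must be positive")
    schedule = []
    tps = start_tps
    minute = 0
    while tps < target_tps:
        tps = min(tps + step_tps, target_tps)  # never overshoot the target
        schedule.append((minute, tps))
        minute += interval_minutes
    return schedule


TARGET_TPS = 3000  # assumed target, for illustration only

# Suggested ramp: +500 TPS every 10 minutes.
gentle = ramp_schedule(TARGET_TPS, step_tps=500, interval_minutes=10)
# Current ramp: +1000 TPS every 5 minutes.
current = ramp_schedule(TARGET_TPS, step_tps=1000, interval_minutes=5)

for minute, tps in gentle:
    print(f"t+{minute:>3} min: {tps} TPS")
```

With these assumed numbers, the gentler ramp reaches the target at the 50-minute mark instead of the 10-minute mark, which is the extra time S3 gets to scale.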
If you still see a high 5xx error rate, we would need non-public details to investigate your issue. Please open a support case with AWS using the following link.
Regarding the SDK-related errors, possible reasons for 'Timeout waiting for connection from pool' include:
- Connections are not being closed properly by the client
- Number of concurrent connections is greater than the max number of connections setting
- The files being downloaded or uploaded are large, so connections stay in use for a longer time
Since you mentioned that you are closing connections explicitly, could you please check whether the number of concurrent connections ever exceeds the MaxConnections value, and what the average file size is? You can also engage the AWS Java SDK team directly via GitHub issues. Refer to the linked repos for Java SDK V1 and Java SDK V2.
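One way to check the peak concurrency is to wrap each transfer in a small in-flight counter and compare its peak with your pool size. The sketch below shows the idea in Python with simulated transfers; the same pattern can be written in Java around your SDK calls. `InFlightGauge`, `simulated_transfer` and the `MAX_CONNECTIONS` value of 5 are hypothetical and only for illustration.

```python
import threading


class InFlightGauge:
    """Counts operations currently in progress and remembers the peak."""

    def __init__(self):
        self._lock = threading.Lock()
        self.current = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.current += 1
            self.peak = max(self.peak, self.current)
        return self

    def __exit__(self, exc_type, exc, tb):
        with self._lock:
            self.current -= 1
        return False  # do not swallow exceptions from the wrapped operation


MAX_CONNECTIONS = 5  # hypothetical value; use your client's configured setting
WORKERS = 8

gauge = InFlightGauge()
barrier = threading.Barrier(WORKERS)


def simulated_transfer():
    with gauge:
        # Stands in for the real upload or download. The barrier holds every
        # worker inside the gauge until all of them have started.
        barrier.wait(timeout=5)


threads = [threading.Thread(target=simulated_transfer) for _ in range(WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

exceeds_pool = gauge.peak > MAX_CONNECTIONS
print(f"peak in-flight: {gauge.peak}, exceeds pool: {exceeds_pool}")
```

If the peak is above the pool size, some requests have to wait for a free connection, and with large files those waits can run long enough to hit the pool timeout.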
Thanks