How do I troubleshoot an HTTP 500 or 503 error from Amazon S3?


When I make a request to Amazon Simple Storage Service (Amazon S3), Amazon S3 returns a 5xx status error.

Short description

Amazon S3 returns a 5xx status error similar to the following examples:

  • "AmazonS3Exception: Internal Error (Service: Amazon S3; Status Code: 500; Error Code: 500 Internal Error; Request ID: A4DBBEXAMPLE2C4D)"
  • "AmazonS3Exception: Slow Down (Service: Amazon S3; Status Code: 503; Error Code: 503 Slow Down; Request ID: A4DBBEXAMPLE2C4D)"

The error code 500 Internal Error indicates that Amazon S3 can't handle the request at that time. The error code 503 Slow Down typically indicates that the number of requests to your S3 bucket is high. For example, you can send 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per partitioned Amazon S3 prefix. Amazon S3 can also return a 503 Slow Down response if your requests exceed the amount of bandwidth that's available for cross-Region copying.

To resolve or avoid 5xx status errors, complete the following tasks:

  • For the application that makes the requests, use a retry mechanism.
  • Configure your application to gradually increase request rates.
  • Distribute objects across multiple prefixes.
  • Monitor the number of 5xx error responses.

Note: When a prefix is created, Amazon S3 doesn't automatically assign additional resources for the supported request rate. Amazon S3 scales based on request patterns. As the request rate increases, Amazon S3 dynamically optimizes for the new request rate.

Resolution

Use a retry mechanism

Because of the distributed nature of Amazon S3, you can retry requests that return 500 or 503 errors. It's a best practice to build retry logic into applications that make requests to Amazon S3.

All AWS SDKs have a built-in retry mechanism with an algorithm that uses exponential backoff. This algorithm implements increasingly longer wait times between retries for consecutive error responses. Many exponential backoff algorithms use jitter (randomized delay) to prevent successive collisions. For more information, see Retry behavior.
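
As a minimal Python sketch of this pattern (the function names `backoff_delay` and `call_with_retries` are illustrative, not part of any SDK), exponential backoff with "full jitter" looks like the following:

```python
import random
import time

def backoff_delay(attempt, base=0.5, cap=20.0):
    """Return a "full jitter" delay: a random wait between 0 and
    min(cap, base * 2**attempt) seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))

def call_with_retries(fn, retryable=(Exception,), max_attempts=8):
    """Call fn(); on a retryable error, sleep with jittered exponential
    backoff, then retry up to max_attempts times before giving up."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retryable:
            if attempt == max_attempts - 1:
                raise
            time.sleep(backoff_delay(attempt))
```

With the AWS SDKs you normally configure the built-in retry behavior instead of writing your own, for example with boto3: `boto3.client("s3", config=botocore.config.Config(retries={"max_attempts": 10, "mode": "adaptive"}))`.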

Note: Amazon S3 supports request rates of up to 3,500 PUT requests per second per partitioned Amazon S3 prefix. In some scenarios, rapid concurrent PUT requests to the same key can result in a 503 response. It's a best practice to retry failed requests in these cases.

Configure your application to gradually increase request rates

If you make requests at a high rate that's close to the rate limit, then Amazon S3 might return 503 Slow Down errors. If there's a sudden increase in the request rate for objects in a prefix, then you might also receive 503 Slow Down errors. Configure your application to maintain a steady request rate and implement retries with exponential backoff. This configuration gives Amazon S3 time to monitor the request patterns and scale on the backend to handle the request rate.

Configure your application to start with a lower request rate (transactions per second) to avoid the 503 Slow Down error. Then, exponentially increase the application's request rate. Amazon S3 automatically scales to handle a higher request rate.
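
As an illustrative sketch (the helper name `ramp_schedule` and the doubling factor are assumptions, not AWS guidance), a client-side ramp can step its target rate geometrically and hold each step long enough for Amazon S3 to scale:

```python
def ramp_schedule(start_tps, target_tps, factor=2.0):
    """Yield request-rate targets (transactions per second) that grow
    geometrically from start_tps up to target_tps, so that Amazon S3
    has time to scale between steps."""
    rate = start_tps
    while rate < target_tps:
        yield rate
        rate = min(target_tps, rate * factor)
    yield target_tps
```

For example, hold each rate for several minutes before you step up, and drop back a step if you start to receive 503 Slow Down responses.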

Distribute objects across multiple prefixes

Request rates apply per prefix in an Amazon S3 bucket. To set up your bucket to handle overall higher request rates and to avoid 503 Slow Down errors, distribute objects across multiple prefixes. For example, if you use an Amazon S3 bucket to store images and videos, then distribute the files into two prefixes:

  • mybucket/images
  • mybucket/videos

If the request rate on the prefixes gradually increases, then Amazon S3 scales up to handle requests for each of the two prefixes separately. Amazon S3 scales up to handle 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per partitioned Amazon S3 prefix. As a result, the overall request rate that the bucket can handle doubles.
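
When keys would otherwise share one hot prefix, another option is to derive the prefix from a hash of the key. The following is a sketch (`prefixed_key` is a hypothetical helper); note that hash-derived prefixes sacrifice meaningful lexicographic ordering when you list objects:

```python
import hashlib

def prefixed_key(key, n_prefixes=16):
    """Map an object key to one of n_prefixes hash-derived prefixes so
    that requests spread across partitions, each with its own
    per-prefix request-rate allowance."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    index = int(digest, 16) % n_prefixes
    return f"{index:02x}/{key}"
```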

Monitor the number of 5xx status error responses

To monitor the number of 5xx status error responses that you receive, use Amazon CloudWatch request metrics for Amazon S3 (the 5xxErrors metric) or analyze your S3 server access logs.
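
For example, with request metrics enabled on the bucket, you can query the 5xxErrors metric in CloudWatch. The sketch below only builds the query parameters; `EntireBucket` is the filter name that the S3 console uses when you enable metrics for the whole bucket:

```python
from datetime import datetime, timedelta, timezone

def five_xx_metric_query(bucket, filter_id="EntireBucket", hours=24):
    """Build get_metric_statistics parameters for the Amazon S3 request
    metric 5xxErrors (requires request metrics to be enabled on the
    bucket under the given metrics-configuration filter_id)."""
    now = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/S3",
        "MetricName": "5xxErrors",
        "Dimensions": [
            {"Name": "BucketName", "Value": bucket},
            {"Name": "FilterId", "Value": filter_id},
        ],
        "StartTime": now - timedelta(hours=hours),
        "EndTime": now,
        "Period": 300,          # 5-minute buckets
        "Statistics": ["Sum"],
    }

# With boto3 installed and credentials configured:
# stats = boto3.client("cloudwatch").get_metric_statistics(
#     **five_xx_metric_query("amzn-s3-demo-bucket"))
```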

Additional reasons for 5xx errors

When you use the Expedited Restore Tier to retrieve archived objects, you can receive an error similar to the following examples:

  • "GlacierExpeditedRetrievalNotAvailable"
  • "Glacier expedited retrievals are currently not available, please try again later"

These errors occur if there's insufficient capacity to process the Expedited request. During a period of sustained high demand, Amazon S3 might deny Expedited retrieval requests and return a 503 error. Use provisioned capacity units (PCUs) to make sure that the retrieval capacity for Expedited retrievals is available on demand. Each unit allows for at least three Expedited retrievals to be performed every 5 minutes. Each unit provides up to 150 megabytes per second (MBps) of retrieval throughput. You can also use "Standard" or "Bulk" retrieval options.
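
A defensive pattern is to fall back from Expedited to Standard, and then to Bulk, when Expedited capacity isn't available. The following is a sketch, not an official recipe; `restore_fn` stands in for an S3 client's restore_object method, and the error code matches the one above:

```python
# Retrieval tiers from fastest to slowest, tried in order.
RETRIEVAL_TIERS = ["Expedited", "Standard", "Bulk"]

def restore_with_fallback(restore_fn, bucket, key, days=1):
    """Request an archive restore, falling back to a slower tier when
    Expedited capacity is unavailable. Returns the tier that succeeded."""
    last_error = None
    for tier in RETRIEVAL_TIERS:
        try:
            restore_fn(
                Bucket=bucket,
                Key=key,
                RestoreRequest={
                    "Days": days,
                    "GlacierJobParameters": {"Tier": tier},
                },
            )
            return tier
        except Exception as err:  # boto3 raises botocore ClientError here
            code = getattr(err, "response", {}).get("Error", {}).get("Code", "")
            if code != "GlacierExpeditedRetrievalNotAvailable":
                raise
            last_error = err
    raise last_error

# With boto3 installed and credentials configured:
# tier = restore_with_fallback(
#     boto3.client("s3").restore_object, "amzn-s3-demo-bucket", "archive/report.csv")
```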

You can retry the retrieval, though the retry doesn't guarantee success. Except in cases of extreme demand, Expedited retrievals succeed without provisioned capacity. However, because Expedited retrieval capacity that isn't provisioned fluctuates with demand, AWS doesn't provide a guaranteed SLA for it.

If you continue to receive a high rate of 5xx status errors, then contact AWS Support. Include multiple Amazon S3 request ID pairs for requests that failed with a 5xx status error code.

Related information

Troubleshooting

Monitoring metrics with Amazon CloudWatch
