When I initiate the RefreshCache operation on my file gateway in AWS Storage Gateway, the operation takes a long time to complete.
Resolution
Note: If you receive errors when you run AWS Command Line Interface (AWS CLI) commands, then see Troubleshoot AWS CLI errors. Also, make sure that you're using the most recent AWS CLI version.
The RefreshCache operation identifies the changes in Amazon Simple Storage Service (Amazon S3) objects since the last time the gateway identified and cached the objects. The changes can include uploaded, updated, or deleted objects. To complete this operation, the file gateway runs a recursive LIST operation on the Amazon S3 bucket. Then, the file gateway runs a HEAD object operation on every object that the LIST operation returns. The HEAD operation retrieves the object metadata, and the file gateway stores it in its cache.
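To gauge how much work a refresh involves, you can count the objects under a prefix before you run RefreshCache. The following is a minimal sketch, not the gateway's own logic. It assumes that you pass in a boto3 S3 client, and the function name count_objects is hypothetical. It mirrors only the LIST step and doesn't issue HEAD requests.

```python
def count_objects(s3_client, bucket, prefix=""):
    """Count the objects under a prefix with paginated ListObjectsV2 calls.

    s3_client is assumed to be a boto3 S3 client, for example
    boto3.client("s3"). The result approximates how many HEAD object
    requests a refresh of that prefix would need.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    total = 0
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page that returns no keys.
        total += len(page.get("Contents", []))
    return total
```

With boto3 installed and credentials configured, a call might look like count_objects(boto3.client("s3"), "amzn-s3-demo-bucket", "reports/"), where the bucket and prefix names are placeholders. The count itself costs one LIST request per page of results, so run it sparingly on large buckets.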
The following factors impact how long a RefreshCache operation takes:
- If there's a large number of objects in the S3 bucket, then the run time of RefreshCache increases. This is because the file gateway runs a HEAD object operation on every object in the bucket.
- RefreshCache operations are specific to individual file shares within a file gateway. One file share supports two RefreshCache API operations at a given time. If you send more requests to start a cache refresh before the in-progress operations complete, then you might receive an InvalidGatewayRequestException error.
- S3 buckets can support 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per prefix. These supported request rates also apply to the requests that the file gateway makes to your S3 buckets. The number of requests affects how quickly a RefreshCache operation completes. The run time of RefreshCache can increase if services other than the file gateway also use the S3 bucket.
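The request rate above puts a floor under the run time. The following sketch works out that floor, assuming one HEAD request per object and that the gateway could use the full 5,500 requests per second on each prefix. Both are simplifying assumptions: the gateway's actual request concurrency isn't stated here, so real refreshes can take much longer. The function name min_head_seconds is hypothetical.

```python
# GET/HEAD requests per second per prefix, taken from the text above.
S3_HEAD_RATE_PER_PREFIX = 5500


def min_head_seconds(object_count, prefixes=1,
                     rate_per_prefix=S3_HEAD_RATE_PER_PREFIX):
    """Return a lower bound, in seconds, for issuing one HEAD per object.

    Assumes the objects are spread evenly over `prefixes` and that no
    other client is consuming the bucket's request capacity.
    """
    if object_count < 0 or prefixes < 1 or rate_per_prefix <= 0:
        raise ValueError("counts and rates must be positive")
    return object_count / (prefixes * rate_per_prefix)


# 11 million objects under a single prefix: at least 2,000 seconds.
single_prefix = min_head_seconds(11_000_000)
# The same objects spread evenly over four prefixes: at least 500 seconds.
four_prefixes = min_head_seconds(11_000_000, prefixes=4)
```

Treat the result as a best case for comparison only, for example to see how much fewer objects or more prefixes could help.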
To decrease the run time of a RefreshCache operation, do any of the following:
- Reduce the number of objects in the bucket.
- Deploy multiple file shares that correspond to separate prefixes in the S3 bucket. Don't use one file share for the entire bucket.
Note: You can create up to 10 file shares for an individual file gateway. Because the RefreshCache operations are run per file share, this can help reduce the time it takes to complete individual RefreshCache operations.
- If you use one file share for an entire S3 bucket, then focus the RefreshCache operations on the specific prefixes or folders of the bucket that are updated with new objects. This reduces the scope of the operation and can help reduce the run time. To target RefreshCache operations to specific folders, use the AWS CLI or the Storage Gateway API to run the operation. This option isn't available in the Storage Gateway console.
- Run the RefreshCache operation at times when other request traffic to the S3 bucket is low. You can use AWS Lambda and Amazon CloudWatch to start the operation on a timer.
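The folder-targeted and timer-driven options above can be combined in a small AWS Lambda function. The following is a minimal sketch under stated assumptions: it uses the boto3 Storage Gateway client's refresh_cache call with its FolderList and Recursive parameters, and it retries when the call fails with InvalidGatewayRequestException. The function names, the environment variable names FILE_SHARE_ARN and REFRESH_FOLDERS, and the retry settings are all hypothetical choices for illustration.

```python
import os
import time


def refresh_folders(sgw_client, file_share_arn, folders, recursive=True,
                    max_attempts=3, wait_seconds=60, sleep=time.sleep):
    """Start RefreshCache for specific folders and return the NotificationId.

    sgw_client is assumed to be a boto3 "storagegateway" client. The call
    only starts the refresh; it returns before the refresh completes.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            response = sgw_client.refresh_cache(
                FileShareARN=file_share_arn,
                FolderList=list(folders),
                Recursive=recursive,
            )
            return response.get("NotificationId")
        except Exception as exc:
            # botocore's ClientError carries the error code in exc.response.
            code = getattr(exc, "response", {}).get("Error", {}).get("Code")
            # This error can have other causes than concurrent refreshes,
            # so the retry count is kept small (an assumption of this sketch).
            if code != "InvalidGatewayRequestException" or attempt == max_attempts:
                raise
            sleep(wait_seconds)


def lambda_handler(event, context):
    """Entry point for a scheduled Lambda invocation (hypothetical setup)."""
    import boto3  # imported here so the helper above has no hard dependency

    client = boto3.client("storagegateway")
    # Hypothetical environment variables: a share ARN and comma-separated folders.
    folders = os.environ.get("REFRESH_FOLDERS", "/").split(",")
    notification_id = refresh_folders(client, os.environ["FILE_SHARE_ARN"], folders)
    return {"NotificationId": notification_id}
```

Because the retry waits inside the function, the Lambda timeout must be longer than the total wait time. Passing the client and the sleep function as arguments keeps the helper easy to exercise without calling AWS.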
Related information
Automating cache refresh process for File Gateway on AWS Storage Gateway