In Amazon FSx, the terms "block size" and "stripe size" are used interchangeably and refer to the same concept. The stripe size is the amount of data written to each disk before the file system moves on to the next disk, so a larger stripe size lets a single I/O operation read or write more data.
You set the stripe size of an Amazon FSx file system when you create it; it cannot be changed after the file system has been created. Therefore, if you want a different stripe size for an existing file system, you will need to create a new file system with the desired stripe size and migrate your data to it.
Regarding your specific use case of random access of small files, a larger stripe size may not necessarily improve performance. In fact, as you noted, a larger stripe size carries a higher penalty on block misses. If your application primarily performs random access of small files, consider using a smaller stripe size instead. Additionally, you may want to add a caching layer, such as Amazon ElastiCache (for example, ElastiCache for Redis), to reduce the number of disk reads required for small files.
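To illustrate the caching idea, here is a minimal in-process sketch in Python, using `functools.lru_cache` as a local stand-in for an external cache such as ElastiCache. The path-keyed `read_cached` helper is an illustrative assumption, not an FSx or AWS API:

```python
from functools import lru_cache
from pathlib import Path

@lru_cache(maxsize=4096)
def read_cached(path: str) -> bytes:
    # The first call for a given path hits the file system; repeat
    # calls for the same path are served from memory, so hot small
    # files stop generating disk reads at all.
    return Path(path).read_bytes()
```

With ElastiCache the structure is the same, except the cache is shared across all 16 reader processes (keyed by file path) instead of living in one process's memory.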
Based on the information provided, the bottleneck in your use case is likely the small size of the image files you are working with. Reading and writing many small files achieves much lower throughput than moving the same amount of data in larger files, because each I/O operation carries fixed overhead (metadata lookups, seeks, and network round trips) regardless of how few bytes it transfers.
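One common way to amortize that per-file overhead is to bundle many small images into a single archive and stream it back sequentially, turning thousands of tiny reads into one large one. A sketch of the idea (the `pack_images` and `iter_packed` helper names are illustrative, not part of any AWS API):

```python
import tarfile
from pathlib import Path

def pack_images(src_dir: str, archive_path: str) -> int:
    """Bundle every file in src_dir into one uncompressed tar, so a
    loader can stream it with large sequential reads instead of one
    metadata lookup plus one small read per image."""
    count = 0
    with tarfile.open(archive_path, "w") as tar:
        for f in sorted(Path(src_dir).iterdir()):
            if f.is_file():
                tar.add(f, arcname=f.name)
                count += 1
    return count

def iter_packed(archive_path: str):
    """Yield (name, bytes) pairs sequentially from the archive."""
    with tarfile.open(archive_path, "r") as tar:
        for member in tar.getmembers():
            if member.isfile():
                yield member.name, tar.extractfile(member).read()
```

This is the same principle behind formats like TFRecord or WebDataset shards that ML pipelines use for exactly this small-image workload.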
One way to improve FSx performance with small files is to increase the block size used by the file system. Another potential issue is the number of concurrent processes reading the data: with 16 concurrent readers there may be contention for resources, which decreases performance. Try reducing the number of concurrent processes and see whether throughput improves.
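To experiment with the reader count, it helps to bound concurrency with a worker pool whose size you can vary. A minimal Python sketch (the `read_all` helper and its default of 4 workers are illustrative assumptions, not tuned values):

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def read_all(paths, max_workers=4):
    """Read files with a bounded worker pool. Sweep max_workers
    (e.g. 2, 4, 8, 16) and measure throughput at each setting to
    find where contention starts to hurt on your file system."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order in the results.
        return list(pool.map(lambda p: Path(p).read_bytes(), paths))
```

The same sweep applies if your readers are separate processes rather than threads; the point is to make the concurrency level an explicit, tunable parameter.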
Thanks for the response and suggestions!
How can we modify the block size of FSx? I only see documentation about changing the stripe size, but not for block size. Are these the same thing?
Suppose my data access pattern is random reads of small files across the 307 GB dataset: does a large block size still improve performance? I think a larger block size will have a higher penalty on block misses, and my block size can't be large enough to cover the whole dataset.