FSx OpenZFS Filesystem SSD storage capacity increase is stalled


I triggered an SSD storage capacity update for an OpenZFS file system, and it seems stalled: it has been in the "IN_PROGRESS" state for 3 hours now, whereas previous attempts completed in roughly 5 minutes for the same increase (300GB). My console output for the operation is:

$ aws fsx describe-file-systems   --file-system-ids fs-<my_fsx_id>    --query 'FileSystems[0].AdministrativeActions[*].{Type:AdministrativeActionType, Status:Status, Progress:ProgressPercent, Error:FailureDetails.Message}'
[
    {
        "Type": "FILE_SYSTEM_UPDATE",
        "Status": "IN_PROGRESS",
        "Progress": null,
        "Error": null
    }
]

The difference this time is that the disk was full and clients were trying to write to the NFS mount when the update was triggered.

Any ideas on how to further understand what's going on and troubleshoot it?

asked 8 days ago, 40 views
2 Answers
Accepted Answer

Well, after almost 18 hours in this stalled state, it finally changed to:

$ aws fsx describe-file-systems   --file-system-ids fs-<my-fs-id>    --query 'FileSystems[0].AdministrativeActions[*].{Type:AdministrativeActionType, Status:Status, Progress:ProgressPercent, Error:FailureDetails.Message}'
[
    {
        "Type": "FILE_SYSTEM_UPDATE",
        "Status": "FAILED",
        "Progress": null,
        "Error": "Storage capacity update failed due to insufficient capacity in this availability zone. Please try again later."
    }
]

I believe this kind of error could be returned much earlier.
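Since the message says to try again later, a check-then-resubmit step could look roughly like the sketch below. It is only an illustration: `resubmit_if_failed` and its arguments are hypothetical names, the target capacity in GiB is supplied by the caller, and it assumes that the first FAILED entry in AdministrativeActions is the one of interest. It uses `aws fsx update-file-system --storage-capacity` to request the increase again.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the AWS CLI): print the failure message of a
# FAILED administrative action, then request the capacity increase again.
resubmit_if_failed() {
    local fs_id="$1" target_gib="$2" msg

    # Assumption: the first FAILED entry is the storage update we care about.
    msg="$(aws fsx describe-file-systems \
        --file-system-ids "$fs_id" \
        --query 'FileSystems[0].AdministrativeActions[?Status==`FAILED`] | [0].FailureDetails.Message' \
        --output text)" || return 2

    # With --output text, a null query result is printed as "None".
    if [ -z "$msg" ] || [ "$msg" = "None" ]; then
        echo "no failed action found; nothing to resubmit"
        return 0
    fi

    echo "last failure: $msg"
    # Request the same target capacity (in GiB) again.
    aws fsx update-file-system \
        --file-system-id "$fs_id" \
        --storage-capacity "$target_gib" > /dev/null
}
```

For example, `resubmit_if_failed fs-<my-fs-id> <target-GiB>` after sourcing the file. The new request can itself end in FAILED for the same reason, so its status still needs to be checked afterwards.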

Thanks

answered 7 days ago

A stalled storage capacity increase on an FSx for OpenZFS file system is unusual. These increases typically complete quickly, as you observed in your previous attempts.

An update that stays in the "IN_PROGRESS" state for 3 hours, when the same 300GB increase previously took about 5 minutes, suggests a problem with the update process. The null progress percentage in your output is also noteworthy, since you would normally expect to see the progress of the update.

The difference you mentioned, clients writing to the NFS mount while the disk was full when the update was triggered, could be a contributing factor. A completely full file system can sometimes lead to unexpected behavior or complications during operations like capacity increases.

To further understand and troubleshoot the situation, you could try the following steps:

  1. Continue monitoring the status of the update. Sometimes, complex operations might take longer than expected, especially if the system is dealing with a full disk scenario.

  2. Check whether the Amazon FSx console shows any error messages or alerts for this file system or the update operation.

  3. Verify if you can still access and use the file system. If possible, try to free up some space on the file system, which might help the update process proceed.

  4. If the issue persists, it's advisable to contact AWS Support. They have access to more detailed information about the status of your file system and can provide specific guidance or intervene if necessary.

  5. For future updates, it's generally a good practice to ensure that the file system isn't completely full and to minimize write operations during the update process if possible.
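Step 1 can be scripted rather than checked by hand. The following is a minimal sketch that reuses the `describe-file-systems` call from the question. The function names, the polling interval, and the poll limit are illustrative choices, and it assumes that the first FILE_SYSTEM_UPDATE entry in AdministrativeActions is the update being watched. Any status other than IN_PROGRESS, PENDING, or FAILED is printed and treated as finished.

```shell
#!/usr/bin/env bash
# Illustrative helpers (not part of the AWS CLI).

# Print the status of the first FILE_SYSTEM_UPDATE administrative action.
fsx_update_status() {
    local fs_id="$1"
    aws fsx describe-file-systems \
        --file-system-ids "$fs_id" \
        --query 'FileSystems[0].AdministrativeActions[?AdministrativeActionType==`FILE_SYSTEM_UPDATE`] | [0].Status' \
        --output text
}

# Poll until the update leaves IN_PROGRESS/PENDING or the poll limit is hit.
# Returns 0 when finished, 1 on FAILED, 2 if the CLI call fails, 3 on timeout.
wait_for_fsx_update() {
    local fs_id="$1" interval="${2:-60}" max_polls="${3:-60}"
    local state i
    for ((i = 1; i <= max_polls; i++)); do
        state="$(fsx_update_status "$fs_id")" || return 2
        echo "$(date -u +%FT%TZ) poll ${i}: ${state}"
        case "$state" in
            IN_PROGRESS|PENDING) sleep "$interval" ;;
            FAILED) return 1 ;;
            *) return 0 ;;
        esac
    done
    return 3
}
```

For example, `wait_for_fsx_update fs-<my_fsx_id> 300 288` would poll every 5 minutes for up to 24 hours, which leaves a timestamped log to attach to a support case if the update stays stuck.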

Remember, while you can increase the storage capacity of an FSx for OpenZFS file system, you cannot decrease it. If this update eventually fails, you might need to consider creating a new file system with the desired capacity and migrating your data.

If the issue continues without resolution, AWS Support would be the best resource to help you understand what's happening behind the scenes and to resolve the stalled update.

answered 8 days ago
