
SageMaker fails to auto-mount EFS volume

Hello,

According to CloudTrail and the error files created in newly started SageMaker Studio JupyterLab spaces, the role AWSServiceRoleForAmazonSageMakerNotebooks encounters an AccessDenied error when trying to create an EFS access point:

Error details: EFS Access Point [fsap-REDACTED] does not exist. Please work with your administrator.

In the CloudTrail event, requestParameters is null and the recorded error message is:

    "errorMessage": "User: arn:aws:sts::REDACTED:assumed-role/AWSServiceRoleForAmazonSageMakerNotebooks/AmazonSageMaker is not authorized to perform: elasticfilesystem:CreateAccessPoint on the specified resource",

I cannot extend the permissions of AWSServiceRoleForAmazonSageMakerNotebooks, as it is a service-linked role. The relevant existing permissions are:

        {
            "Sid": "AllowEFSAccessPointCreation",
            "Effect": "Allow",
            "Action": "elasticfilesystem:CreateAccessPoint",
            "Resource": "arn:aws:elasticfilesystem:*:*:file-system/*",
            "Condition": {
                "StringLike": {
                    "aws:ResourceTag/ManagedByAmazonSageMakerResource": "*",
                    "aws:RequestTag/ManagedByAmazonSageMakerResource": "*"
                }
            }
        },

Could you help me troubleshoot this issue?

3 Answers

This issue appears to be related to the auto-mounting of an Amazon EFS volume in SageMaker Studio. When SageMaker encounters problems during the auto-mounting process, it typically creates an error file instead of mounting the EFS folder.

To troubleshoot this issue, consider the following steps:

  1. Check if the EFS file system exists and is properly tagged. The permissions on the AWSServiceRoleForAmazonSageMakerNotebooks role are conditional on the presence of a "ManagedByAmazonSageMakerResource" tag. Ensure this tag is present on the EFS file system.

  2. Verify that the EFS file system is in the same AWS region and VPC as your SageMaker domain.

  3. Ensure that the necessary VPC endpoints are set up correctly for EFS access.

  4. Check if there are any VPC security group issues preventing access to the EFS file system.

  5. Verify that the root directory of the EFS access point has the correct ownership and permissions set. If these are not specified on the access point, EFS will not create the root directory, causing mount attempts to fail.

  6. If the issue persists, you may want to consider opting out of EFS auto-mounting for troubleshooting purposes. You can do this by updating the domain or user profile settings to disable auto-mounting.

  7. Review the full contents of the error.txt file created by SageMaker in the user-default-efs folder within the $HOME directory of the application. This file should contain more detailed error information and potential resolution steps.
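
Steps 1 and 2 can be checked programmatically. The following is a minimal sketch, not an official procedure: the helper name `check_domain_efs` and the shape of the returned dictionary are assumptions made for illustration, and the clients are expected to be boto3 SageMaker and EFS clients passed in by the caller. It looks up the domain's home file system ID, checks whether that file system still exists, and reports the value of the ManagedByAmazonSageMakerResource tag if present.

```python
from typing import Any, Dict

# Tag key referenced by the service-linked role's policy condition.
TAG_KEY = "ManagedByAmazonSageMakerResource"


def check_domain_efs(sagemaker_client: Any, efs_client: Any, domain_id: str) -> Dict[str, Any]:
    """Report whether the domain's home EFS file system exists and is tagged.

    Hypothetical helper: both clients are assumed to be boto3 clients
    ("sagemaker" and "efs") created by the caller.
    """
    domain = sagemaker_client.describe_domain(DomainId=domain_id)
    fs_id = domain["HomeEfsFileSystemId"]

    try:
        response = efs_client.describe_file_systems(FileSystemId=fs_id)
    except Exception as err:
        # botocore reports a missing file system as a ClientError whose
        # error code is "FileSystemNotFound"; anything else is re-raised.
        code = getattr(err, "response", {}).get("Error", {}).get("Code")
        if code == "FileSystemNotFound":
            return {"file_system_id": fs_id, "exists": False, "tag_value": None}
        raise

    file_system = response["FileSystems"][0]
    tags = {tag["Key"]: tag["Value"] for tag in file_system.get("Tags", [])}
    return {"file_system_id": fs_id, "exists": True, "tag_value": tags.get(TAG_KEY)}
```

With boto3 installed and credentials configured, this could be called as `check_domain_efs(boto3.client("sagemaker"), boto3.client("efs"), domain_id)`. A result with `exists` set to `False`, or with `tag_value` set to `None`, would point to a deleted or untagged file system.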

If after trying these steps the issue continues, you may need to contact AWS Support for further assistance, as the AWSServiceRoleForAmazonSageMakerNotebooks is a service-linked role and cannot be directly modified.
Sources
Amazon EFS auto-mounting in Studio - Amazon SageMaker AI
Troubleshooting mount issues - Amazon Elastic File System
Opt out of Amazon EFS auto-mounting - Amazon SageMaker AI
UserSettings - Amazon SageMaker

answered a year ago

Broadly as the bot said, but to be a bit more specific, I would:

  • Check in the EFS console that your domain's EFS file system still exists, and consider raising a support case if it does not. I tentatively think there may be no "nice" way to fully recover your SageMaker domain if the file system was deleted, but you should be able to call UpdateDomain to set your users' AutoMountHomeEFS setting to Disabled if you just want to open your spaces without the EFS failure blocking access.
  • Check whether the file system was accidentally untagged, even if it still exists. You should usually see a ManagedByAmazonSageMakerResource tag whose value is the SageMaker domain's ARN.
  • Double-check whether an organization-level or account-level restriction (for example, an SCP) has been set up to deny access to the EFS service. As the policy you quoted shows, the service role should generally be able to create the access point as long as the file system exists with the correct tagging, unless some other control is preventing it.
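
The UpdateDomain change mentioned in the first bullet could look roughly like the sketch below. It is an illustration under stated assumptions: the helper name `disable_efs_auto_mount` is made up for this example, and the client is expected to be a boto3 SageMaker client supplied by the caller. It sets AutoMountHomeEFS to Disabled in the domain's default user settings; user profiles that set the value explicitly would need a separate update.

```python
from typing import Any, Dict


def disable_efs_auto_mount(sagemaker_client: Any, domain_id: str) -> Dict[str, Any]:
    """Turn off EFS auto-mounting in the domain's default user settings.

    Hypothetical helper: sagemaker_client is assumed to be a boto3
    "sagemaker" client created by the caller.
    """
    return sagemaker_client.update_domain(
        DomainId=domain_id,
        # AutoMountHomeEFS is a UserSettings field; "Disabled" opts out.
        DefaultUserSettings={"AutoMountHomeEFS": "Disabled"},
    )
```

With boto3 installed, a call such as `disable_efs_auto_mount(boto3.client("sagemaker"), domain_id)` would apply the setting. Passing the client in keeps the function easy to exercise against a stub before touching a real domain.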
AWS
EXPERT
answered a year ago
Accepted Answer

Thanks to you all!

Sorry to say this, but the issue was actually on our side: we somehow managed to delete the EFS file system by accident. Luckily, we had not yet stored any meaningful data on it.

answered a year ago
