SageMaker Studio fails to create a notebook job in VPC Only mode: "Unexpected error occurred during creation of job."

The CloudWatch logs show the following error:

Traceback (most recent call last):
      File "/opt/conda/envs/studio/lib/python3.9/site-packages/jupyter_scheduler/handlers.py", line 194, in post
        job_id = await ensure_async(self.scheduler.create_job(CreateJob(**payload)))
      File "/opt/conda/envs/studio/lib/python3.9/site-packages/jupyter_server/utils.py", line 182, in ensure_async
        result = await obj
      File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker_scheduling/logging.py", line 109, in wrapper
        raise excep
      File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker_scheduling/logging.py", line 105, in wrapper
        return await func(*args, **kwargs)
      File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker_scheduling/scheduler.py", line 210, in create_job
        s3_file_uploader = await self._prepare_job_artifacts(
      File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker_scheduling/scheduler.py", line 168, in _prepare_job_artifacts
        input_uri = S3URI(runtime_environment_parameters.s3_input)
      File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker_scheduling/runtime_environment_parameters.py", line 40, in s3_input
        return self.parameters.get(RuntimeEnvironmentParameterName.S3_INPUT.value)
    AttributeError: 'NoneType' object has no attribute 'get'
[E 2023-03-16 15:36:01.070 SchedulerApp] 'NoneType' object has no attribute 'get'

In addition, the advanced options in the "Create notebook job" form have disappeared.

The VPC has no internet access, and the permissions have already been updated according to this page: https://docs.aws.amazon.com/sagemaker/latest/dg/scheduled-notebook-policies.html

VPC endpoints for S3, SageMaker API and Runtime, SSM, STS, Metrics, Logs, ECR API, and ECR DKR have all been deployed, and notebooks work normally.

What could be going wrong here?

Asked by an expert 7 months ago, 55 views

1 Answer

Thank you for reporting this issue.

Could you also try configuring the following two VPC endpoints?

  1. Amazon EC2
  2. Amazon EventBridge
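
If it helps, here is a minimal sketch of creating those two interface endpoints with boto3. The helper names `endpoint_requests` and `create_endpoints` are hypothetical, and the VPC, subnet, and security group IDs are placeholders you would replace with your own. It assumes the usual `com.amazonaws.<region>.<service>` naming, where the EventBridge endpoint service is called `events`.

```python
# Service-name suffixes for the two suggested endpoints.
# "events" is the endpoint service name used by Amazon EventBridge.
EXTRA_SERVICES = ("ec2", "events")


def endpoint_requests(region, vpc_id, subnet_ids, security_group_ids):
    """Build one create_vpc_endpoint request (as kwargs) per suggested service."""
    return [
        {
            "VpcEndpointType": "Interface",
            "VpcId": vpc_id,
            "ServiceName": f"com.amazonaws.{region}.{service}",
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
            # Private DNS lets the default service hostnames resolve to the endpoint.
            "PrivateDnsEnabled": True,
        }
        for service in EXTRA_SERVICES
    ]


def create_endpoints(region, requests):
    """Send the requests to EC2 and return the new endpoint IDs."""
    import boto3  # imported lazily so the request builder works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    return [
        ec2.create_vpc_endpoint(**request)["VpcEndpoint"]["VpcEndpointId"]
        for request in requests
    ]


if __name__ == "__main__":
    # Placeholder IDs: review the printed requests before calling create_endpoints.
    for request in endpoint_requests(
        "us-east-1", "vpc-EXAMPLE", ["subnet-EXAMPLE"], ["sg-EXAMPLE"]
    ):
        print(request["ServiceName"])
```

The security group attached to the endpoints needs to allow inbound HTTPS (port 443) from the Studio subnets, otherwise the endpoints exist but cannot be reached.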

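In addition, if your S3 VPC gateway endpoint has a policy that restricts access to specific S3 buckets, please allow access to the sagemakerheadlessexecution-prod-* buckets with the statement shown below.

FYI: you can find more reference links at https://docs.aws.amazon.com/sagemaker/latest/dg/create-notebook-auto-execution-advanced.html.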

{
   "Action":[
      "s3:*"
   ],
   "Resource":[
      "arn:aws:s3:::sagemakerheadlessexecution-prod-*",
      "arn:aws:s3:::sagemakerheadlessexecution-prod-*/*"
   ],
   "Effect":"Allow",
   "Sid":"SCTASK14554266"
}
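
As a rough sketch, the statement above could be merged into an existing endpoint policy like this. The helper `add_statement` is hypothetical, and the `"Principal": "*"` entry is my assumption (it is not part of the statement above; endpoint policies normally expect a Principal element, so please check it against your own policy before applying).

```python
import copy
import json

# The statement from the answer, plus an assumed Principal element.
HEADLESS_EXECUTION_STATEMENT = {
    "Sid": "SCTASK14554266",
    "Effect": "Allow",
    "Principal": "*",  # assumption: not present in the original statement
    "Action": ["s3:*"],
    "Resource": [
        "arn:aws:s3:::sagemakerheadlessexecution-prod-*",
        "arn:aws:s3:::sagemakerheadlessexecution-prod-*/*",
    ],
}


def add_statement(policy, statement):
    """Return a copy of `policy` with `statement` appended.

    A statement that already has the same Sid is replaced, so running
    this twice does not duplicate the entry.
    """
    updated = copy.deepcopy(policy)
    existing = updated.get("Statement", [])
    if isinstance(existing, dict):  # a policy may hold a single statement object
        existing = [existing]
    kept = [s for s in existing if s.get("Sid") != statement["Sid"]]
    updated["Statement"] = kept + [copy.deepcopy(statement)]
    updated.setdefault("Version", "2012-10-17")
    return updated


if __name__ == "__main__":
    # Hypothetical existing policy that only allows one bucket.
    current = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "MyBuckets",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject"],
                "Resource": ["arn:aws:s3:::my-example-bucket/*"],
            }
        ],
    }
    print(json.dumps(add_statement(current, HEADLESS_EXECUTION_STATEMENT), indent=3))
```

The resulting JSON string can then be set on the gateway endpoint, for example with the EC2 `modify_vpc_endpoint` call and its `PolicyDocument` parameter.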
Answered by an expert 7 months ago
