SageMaker Studio fails to create a notebook job in VPC-only mode: "Unexpected error occurred during creation of job."

The CloudWatch logs show the following error:

Traceback (most recent call last):
      File "/opt/conda/envs/studio/lib/python3.9/site-packages/jupyter_scheduler/handlers.py", line 194, in post
        job_id = await ensure_async(self.scheduler.create_job(CreateJob(**payload)))
      File "/opt/conda/envs/studio/lib/python3.9/site-packages/jupyter_server/utils.py", line 182, in ensure_async
        result = await obj
      File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker_scheduling/logging.py", line 109, in wrapper
        raise excep
      File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker_scheduling/logging.py", line 105, in wrapper
        return await func(*args, **kwargs)
      File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker_scheduling/scheduler.py", line 210, in create_job
        s3_file_uploader = await self._prepare_job_artifacts(
      File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker_scheduling/scheduler.py", line 168, in _prepare_job_artifacts
        input_uri = S3URI(runtime_environment_parameters.s3_input)
      File "/opt/conda/envs/studio/lib/python3.9/site-packages/sagemaker_scheduling/runtime_environment_parameters.py", line 40, in s3_input
        return self.parameters.get(RuntimeEnvironmentParameterName.S3_INPUT.value)
    AttributeError: 'NoneType' object has no attribute 'get'
[E 2023-03-16 15:36:01.070 SchedulerApp] 'NoneType' object has no attribute 'get'
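
One reading of the traceback is that the scheduler's runtime environment parameters were never populated, so `.get` was called on `None`. The sketch below only reproduces the shape of that error. The class is a hypothetical stand-in, not the real `sagemaker_scheduling` code, and the `"s3_input"` key and the example bucket are assumptions made for illustration.

```python
from typing import Optional


class RuntimeEnvironmentParameters:
    """Hypothetical stand-in; NOT the real sagemaker_scheduling class."""

    def __init__(self, parameters: Optional[dict]):
        self.parameters = parameters

    @property
    def s3_input(self):
        # Same shape as the failing line in the traceback:
        # calling .get on None raises AttributeError.
        return self.parameters.get("s3_input")


def describe_failure(parameters: Optional[dict]) -> str:
    """Return the s3_input value, or the error text if the lookup fails."""
    try:
        return str(RuntimeEnvironmentParameters(parameters).s3_input)
    except AttributeError as exc:
        return f"AttributeError: {exc}"


# Parameters missing entirely: reproduces the error seen in the logs.
print(describe_failure(None))
# Parameters present: the lookup succeeds.
print(describe_failure({"s3_input": "s3://example-bucket/input"}))
```

If this reading is right, the thing to look for is whatever prevented the parameters from being loaded, rather than the S3 URI itself.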

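In addition, the advanced options in "create notebook job" have disappeared.

The VPC has no internet access, and the permissions have already been updated according to this page: https://docs.aws.amazon.com/sagemaker/latest/dg/scheduled-notebook-policies.html

VPC endpoints for S3, SageMaker API and Runtime, SSM, STS, Metrics, Logs, ECR API, and ECR DKR are all deployed, and notebooks work normally.

What could be going wrong here?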

1 Answer

Thank you for reporting this issue.

Could you also try configuring the following two VPC endpoints?

  1. Amazon EC2
  2. Amazon EventBridge
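
To see which endpoints are still missing, the endpoints listed in the question plus the two suggested here can be compared against what the VPC actually has. This is a minimal sketch, assuming the usual `com.amazonaws.<region>.<service>` naming. The mapping of "Metrics" to `monitoring` and "EventBridge" to `events` is an assumption, and `vpc_id` and `region` are placeholders you supply. It needs `boto3` and credentials only for the lookup function, and that call must be made from somewhere that can reach the EC2 API.

```python
from typing import Iterable, List

# Service suffixes for the endpoints named in this thread (assumed mapping).
REQUIRED_SERVICES = [
    "s3", "sagemaker.api", "sagemaker.runtime", "ssm", "sts",
    "monitoring", "logs", "ecr.api", "ecr.dkr", "ec2", "events",
]


def missing_endpoints(
    region: str,
    existing_service_names: Iterable[str],
    required: Iterable[str] = REQUIRED_SERVICES,
) -> List[str]:
    """Return the required service suffixes that have no endpoint yet."""
    existing = set(existing_service_names)
    return [s for s in required if f"com.amazonaws.{region}.{s}" not in existing]


def list_endpoint_services(vpc_id: str, region: str) -> List[str]:
    """Fetch the service names of the VPC's endpoints (first page of results only)."""
    import boto3  # imported lazily so the pure check above works without it

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_vpc_endpoints(
        Filters=[{"Name": "vpc-id", "Values": [vpc_id]}]
    )
    return [endpoint["ServiceName"] for endpoint in response["VpcEndpoints"]]
```

For example, `missing_endpoints(region, list_endpoint_services(vpc_id, region))` returns the suffixes that still need an endpoint. With the setup described in the question, the expected result would be `ec2` and `events`.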

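FYI: more reference links are available at https://docs.aws.amazon.com/sagemaker/latest/dg/create-notebook-auto-execution-advanced.html.

In addition, if your S3 VPC gateway endpoint has a policy with fine-grained control over which S3 buckets are allowed, please allow access to sagemakerheadlessexecution-prod-* as shown below.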

{
   "Action":[
      "s3:*"
   ],
   "Resource":[
      "arn:aws:s3:::sagemakerheadlessexecution-prod-*",
      "arn:aws:s3:::sagemakerheadlessexecution-prod-*/*"
   ],
   "Effect":"Allow",
   "Sid":"SCTASK14554266"
}
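
The statement above is a fragment, so it has to be merged into the endpoint's full policy document. The following is a rough sketch of one way to do that, not a confirmed procedure: the `"Principal": "*"` entry is an assumption (endpoint policies generally expect a principal, and the statement above does not include one), and `vpc_endpoint_id` and `region` are placeholders. Note that `modify_vpc_endpoint` replaces the whole policy, which is why the statement is appended to the existing document rather than sent on its own.

```python
import copy
import json

HEADLESS_EXECUTION_STATEMENT = {
    "Sid": "SCTASK14554266",
    "Effect": "Allow",
    "Principal": "*",  # assumption: not part of the statement in the answer
    "Action": ["s3:*"],
    "Resource": [
        "arn:aws:s3:::sagemakerheadlessexecution-prod-*",
        "arn:aws:s3:::sagemakerheadlessexecution-prod-*/*",
    ],
}


def merge_statement(policy: dict, statement: dict) -> dict:
    """Return a copy of `policy` with `statement` appended, unless its Sid already exists."""
    merged = copy.deepcopy(policy)
    statements = merged.get("Statement", [])
    if isinstance(statements, dict):
        # A policy may hold a single statement object instead of a list.
        statements = [statements]
    if not any(s.get("Sid") == statement["Sid"] for s in statements):
        statements.append(copy.deepcopy(statement))
    merged["Statement"] = statements
    merged.setdefault("Version", "2012-10-17")
    return merged


def apply_policy(vpc_endpoint_id: str, policy: dict, region: str) -> None:
    """Replace the endpoint's policy with `policy` (requires boto3 and credentials)."""
    import boto3  # imported lazily so merge_statement works without it

    ec2 = boto3.client("ec2", region_name=region)
    ec2.modify_vpc_endpoint(
        VpcEndpointId=vpc_endpoint_id,
        PolicyDocument=json.dumps(policy),
    )
```

Reviewing the merged document by hand before calling `apply_policy` is advisable, since a mistake in the endpoint policy can cut off S3 access for everything in the VPC.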