How can I use sagemaker_sklearn_extension in a SageMaker job?

I'm creating a data processing job in a SageMaker notebook:

from sagemaker.sklearn.processing import SKLearnProcessor

sklearn_processor = SKLearnProcessor(role=role,
                                     base_job_name='end-to-end-ml-sm-proc',
                                     instance_type='ml.m5.large',
                                     instance_count=1,
                                     framework_version='0.23-1')

My processing script uses:

from sagemaker_sklearn_extension.decomposition import RobustPCA

and I get an error during job execution:

Traceback (most recent call last):
  File "/opt/ml/processing/input/code/preprocessor.py", line 14, in <module>
    from sagemaker_sklearn_extension.decomposition import RobustPCA
ModuleNotFoundError: No module named 'sagemaker_sklearn_extension'

As far as I understand, framework_version='0.23-1' should make SageMaker create a Docker image based on the image from this repo: https://github.com/aws/sagemaker-scikit-learn-container. The 0.23-1 branch handles installation of the extension (if the extension/Dockerfile.cpu file is executed), but I don't see how I can make SageMaker run that script when creating the job.

How can I use sagemaker_sklearn_extension in a SageMaker job?

asked 2 years ago, 421 views
1 Answer

You can install the packages you need by calling pip through subprocess at the top of your entry point script (entry_point.py), before the import that fails:

import subprocess
import sys

# pip install the custom package before importing it
subprocess.check_call([sys.executable, "-m", "pip", "install", "sagemaker-scikit-learn-extension==(your version)"])
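The install step has to run before the `from sagemaker_sklearn_extension...` import in the processing script, and it only needs to run when the module is actually missing. Below is a minimal sketch of that idea. The helper name `ensure_installed` and its `run` parameter are hypothetical, introduced here for illustration; the version placeholder is left as in the answer. It also assumes the processing container can reach a package index, which may not hold if the job runs with network isolation.

```python
import importlib.util
import subprocess
import sys


def ensure_installed(module_name, pip_requirement, run=subprocess.check_call):
    """Install pip_requirement if the top-level module_name is not importable.

    Returns True if an install was attempted, False if the module was
    already available. `run` is injectable (hypothetical convenience) so the
    helper can be exercised without touching the network.
    """
    if importlib.util.find_spec(module_name) is not None:
        return False
    # Use the interpreter running this script so the package lands in the
    # same environment the script imports from.
    run([sys.executable, "-m", "pip", "install", pip_requirement])
    # Let the import system pick up the newly installed package.
    importlib.invalidate_caches()
    return True


# Hypothetical usage at the top of preprocessor.py, before the failing import:
# ensure_installed("sagemaker_sklearn_extension",
#                  "sagemaker-scikit-learn-extension==(your version)")
# from sagemaker_sklearn_extension.decomposition import RobustPCA
```

Checking first avoids a redundant pip call if the package is ever baked into the image later; the trade-off of installing at runtime is a slower job start and a dependency on network access.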

AWS
EXPERT
answered 2 years ago
