How can I use sagemaker_sklearn_extension in a SageMaker job?


I'm creating a data processing job in a SageMaker notebook:

from sagemaker.sklearn.processing import SKLearnProcessor

sklearn_processor = SKLearnProcessor(role=role,

My processing script uses:

from sagemaker_sklearn_extension.decomposition import RobustPCA

and I get an error during job execution:

Traceback (most recent call last):
  File "/opt/ml/processing/input/code/", line 14, in <module>
    from sagemaker_sklearn_extension.decomposition import RobustPCA
ModuleNotFoundError: No module named 'sagemaker_sklearn_extension'

As far as I understand, framework_version='0.23-1' should make SageMaker create a Docker image based on the image from that repo, and the 0.23-1 branch handles installing the extension (if the extension/Dockerfile.cpu file is executed). However, I don't see how to make SageMaker run that Dockerfile when it creates the job.

1 Answer

You can install the packages you need with subprocess from inside the processing script, before the import that fails:

import subprocess
import sys

# pip install the custom package using the interpreter that runs this script
subprocess.check_call([sys.executable, "-m", "pip", "install", "sagemaker-scikit-learn-extension==(your version)"])
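If you want the script to skip the install when the module is already present, the same idea can be wrapped in a small helper. This is only a sketch: pip_install_command and ensure_module are hypothetical names made up for illustration, not part of the SageMaker SDK, and it assumes the processing container can reach a package index.

```python
import importlib
import subprocess
import sys


def pip_install_command(package, version=None):
    """Build a pip command for the interpreter running this script (hypothetical helper)."""
    spec = f"{package}=={version}" if version else package
    return [sys.executable, "-m", "pip", "install", spec]


def ensure_module(module_name, package, version=None):
    """Import module_name, running pip install for package first if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError:
        # Assumes the container has network access to a package index.
        subprocess.check_call(pip_install_command(package, version))
        # Make sure the import system sees the freshly installed files.
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
```

In the processing script you would call ensure_module("sagemaker_sklearn_extension", "sagemaker-scikit-learn-extension", "(your version)") near the top, before the line "from sagemaker_sklearn_extension.decomposition import RobustPCA".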

answered a year ago

