
Machine Learning & AI

AWS offers the broadest and deepest set of machine learning services and supporting cloud infrastructure, putting machine learning in the hands of every developer, data scientist, and expert practitioner. AWS is helping more than one hundred thousand customers accelerate their machine learning journey.

Recent questions


AWS SageMaker - Extending Pre-built Container, Deploy Endpoint Failed. No such file or directory: 'serve'

I am trying to deploy a SageMaker inference endpoint by extending the pre-built image. However, it failed with `FileNotFoundError: [Errno 2] No such file or directory: 'serve'`.

My Dockerfile:

```
ARG REGION=us-west-2

# SageMaker PyTorch image
FROM 763104351884.dkr.ecr.$REGION.amazonaws.com/pytorch-inference:1.12.1-gpu-py38-cu116-ubuntu20.04-ec2

RUN apt-get update

ENV PATH="/opt/ml/code:${PATH}"

# This environment variable is used by the SageMaker PyTorch container
# to determine our user code directory.
ENV SAGEMAKER_SUBMIT_DIRECTORY /opt/ml/code

# /opt/ml and all subdirectories are utilized by SageMaker;
# use the /code subdirectory to store your user code.
COPY inference.py /opt/ml/code/inference.py

# Defines inference.py as the script entry point
ENV SAGEMAKER_PROGRAM inference.py
```

CloudWatch log from `/aws/sagemaker/Endpoints/mytestEndpoint` (the traceback is printed twice; one copy shown):

```
2022-09-30T04:47:09.178-07:00 Traceback (most recent call last):
  File "/usr/local/bin/dockerd-entrypoint.py", line 20, in <module>
    subprocess.check_call(shlex.split(' '.join(sys.argv[1:])))
  File "/opt/conda/lib/python3.8/subprocess.py", line 359, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/opt/conda/lib/python3.8/subprocess.py", line 340, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/opt/conda/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/opt/conda/lib/python3.8/subprocess.py", line 1704, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
2022-09-30T04:47:13.409-07:00 FileNotFoundError: [Errno 2] No such file or directory: 'serve'
```
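No answer is shown on this page, but for context: the container's `dockerd-entrypoint.py` tries to exec a `serve` command, which is installed by the SageMaker inference toolkit. That toolkit ships in the `-sagemaker` tag variants of the Deep Learning Containers, not the `-ec2` variants used above. A likely fix (an assumption, not a confirmed answer from this thread) is to base the image on the SageMaker variant:

```dockerfile
ARG REGION=us-west-2

# Assumption: the "-sagemaker" tag variant bundles TorchServe and the
# SageMaker inference toolkit that provides the 'serve' entry point;
# the "-ec2" variant is intended for self-managed serving and omits it.
FROM 763104351884.dkr.ecr.$REGION.amazonaws.com/pytorch-inference:1.12.1-gpu-py38-cu116-ubuntu20.04-sagemaker

ENV SAGEMAKER_SUBMIT_DIRECTORY /opt/ml/code
COPY inference.py /opt/ml/code/inference.py
ENV SAGEMAKER_PROGRAM inference.py
```

Note this sketch also drops the `ENV PATH="/opt/ml/code:${PATH}"` line, which is not needed when the toolkit locates the script via `SAGEMAKER_PROGRAM`.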
2 answers · 0 votes · 38 views · asked 3 days ago

Is it possible to compile a Neuron model on my local machine?

I'm following some guides, and from my understanding this should be possible, but I've been trying for hours to compile a yolov5 model into a Neuron model with no success. Is it even possible to do this on my local machine, or do I have to be on an Inferentia instance? This is what my environment looks like:

```
# packages in environment at /miniconda3/envs/neuron:
#
# Name                     Version                      Build   Channel
_libgcc_mutex              0.1                           main
_openmp_mutex              5.1                          1_gnu
absl-py                    1.2.0                       pypi_0   pypi
astor                      0.8.1                       pypi_0   pypi
attrs                      22.1.0                      pypi_0   pypi
backcall                   0.2.0                  pyhd3eb1b0_0
ca-certificates            2022.07.19               h06a4308_0
cachetools                 5.2.0                       pypi_0   pypi
certifi                    2022.9.24                   pypi_0   pypi
charset-normalizer         2.1.1                       pypi_0   pypi
cycler                     0.11.0                      pypi_0   pypi
debugpy                    1.5.1                py37h295c915_0
decorator                  5.1.1                  pyhd3eb1b0_0
dmlc-nnvm                  1.11.1.0+0                  pypi_0   pypi
dmlc-topi                  1.11.1.0+0                  pypi_0   pypi
dmlc-tvm                   1.11.1.0+0                  pypi_0   pypi
entrypoints                0.4                  py37h06a4308_0
fonttools                  4.37.3                      pypi_0   pypi
gast                       0.2.2                       pypi_0   pypi
google-auth                2.12.0                      pypi_0   pypi
google-auth-oauthlib       0.4.6                       pypi_0   pypi
google-pasta               0.2.0                       pypi_0   pypi
gputil                     1.4.0                       pypi_0   pypi
grpcio                     1.49.1                      pypi_0   pypi
h5py                       3.7.0                       pypi_0   pypi
idna                       3.4                         pypi_0   pypi
importlib-metadata         4.12.0                      pypi_0   pypi
inferentia-hwm             1.11.0.0+0                  pypi_0   pypi
iniconfig                  1.1.1                       pypi_0   pypi
ipykernel                  6.15.2               py37h06a4308_0
ipython                    7.34.0                      pypi_0   pypi
ipywidgets                 8.0.2                       pypi_0   pypi
islpy                      2021.1+aws2021.x.16.0.bld0  pypi_0   pypi
jedi                       0.18.1               py37h06a4308_1
jupyter_client             7.3.5                py37h06a4308_0
jupyter_core               4.10.0               py37h06a4308_0
jupyterlab-widgets         3.0.3                       pypi_0   pypi
keras-applications         1.0.8                       pypi_0   pypi
keras-preprocessing        1.1.2                       pypi_0   pypi
kiwisolver                 1.4.4                       pypi_0   pypi
ld_impl_linux-64           2.38                     h1181459_1
libffi                     3.3                      he6710b0_2
libgcc-ng                  11.2.0                   h1234567_1
libgomp                    11.2.0                   h1234567_1
libsodium                  1.0.18                   h7b6447c_0
libstdcxx-ng               11.2.0                   h1234567_1
llvmlite                   0.39.1                      pypi_0   pypi
markdown                   3.4.1                       pypi_0   pypi
markupsafe                 2.1.1                       pypi_0   pypi
matplotlib                 3.5.3                       pypi_0   pypi
matplotlib-inline          0.1.6                py37h06a4308_0
ncurses                    6.3                      h5eee18b_3
nest-asyncio               1.5.5                py37h06a4308_0
networkx                   2.4                         pypi_0   pypi
neuron-cc                  1.11.7.0+aec18907e          pypi_0   pypi
numba                      0.56.2                      pypi_0   pypi
numpy                      1.19.5                      pypi_0   pypi
oauthlib                   3.2.1                       pypi_0   pypi
opencv-python              4.6.0.66                    pypi_0   pypi
openssl                    1.1.1q                   h7f8727e_0
opt-einsum                 3.3.0                       pypi_0   pypi
packaging                  21.3                   pyhd3eb1b0_0
pandas                     1.3.5                       pypi_0   pypi
parso                      0.8.3                  pyhd3eb1b0_0
pexpect                    4.8.0                  pyhd3eb1b0_3
pickleshare                0.7.5               pyhd3eb1b0_1003
pillow                     9.2.0                       pypi_0   pypi
pip                        22.2.2                      pypi_0   pypi
pluggy                     1.0.0                       pypi_0   pypi
prompt-toolkit             3.0.31                      pypi_0   pypi
protobuf                   3.20.3                      pypi_0   pypi
psutil                     5.9.2                       pypi_0   pypi
ptyprocess                 0.7.0                  pyhd3eb1b0_2
py                         1.11.0                      pypi_0   pypi
pyasn1                     0.4.8                       pypi_0   pypi
pyasn1-modules             0.2.8                       pypi_0   pypi
pygments                   2.13.0                      pypi_0   pypi
pyparsing                  3.0.9                py37h06a4308_0
pytest                     7.1.3                       pypi_0   pypi
python                     3.7.13                   h12debd9_0
python-dateutil            2.8.2                  pyhd3eb1b0_0
pytz                       2022.2.1                    pypi_0   pypi
pyyaml                     6.0                         pypi_0   pypi
pyzmq                      23.2.0               py37h6a678d5_0
readline                   8.1.2                    h7f8727e_1
requests                   2.28.1                      pypi_0   pypi
requests-oauthlib          1.3.1                       pypi_0   pypi
rsa                        4.9                         pypi_0   pypi
scipy                      1.4.1                       pypi_0   pypi
seaborn                    0.12.0                      pypi_0   pypi
setuptools                 59.8.0                      pypi_0   pypi
six                        1.16.0                 pyhd3eb1b0_1
sqlite                     3.39.3                   h5082296_0
tensorboard                1.15.0                      pypi_0   pypi
tensorboard-data-server    0.6.1                       pypi_0   pypi
tensorboard-plugin-wit     1.8.1                       pypi_0   pypi
tensorflow                 1.15.0                      pypi_0   pypi
tensorflow-estimator       1.15.1                      pypi_0   pypi
termcolor                  2.0.1                       pypi_0   pypi
thop                       0.1.1-2209072238            pypi_0   pypi
tk                         8.6.12                   h1ccaba5_0
tomli                      2.0.1                       pypi_0   pypi
torch                      1.11.0                      pypi_0   pypi
torch-neuron               1.11.0.2.3.0.0              pypi_0   pypi
torchvision                0.12.0                      pypi_0   pypi
tornado                    6.2                  py37h5eee18b_0
tqdm                       4.64.1                      pypi_0   pypi
traitlets                  5.4.0                       pypi_0   pypi
typing-extensions          4.3.0                       pypi_0   pypi
urllib3                    1.26.12                     pypi_0   pypi
wcwidth                    0.2.5                  pyhd3eb1b0_0
werkzeug                   2.2.2                       pypi_0   pypi
wheel                      0.37.1                      pypi_0   pypi
widgetsnbextension         4.0.3                       pypi_0   pypi
wrapt                      1.14.1                      pypi_0   pypi
xz                         5.2.6                    h5eee18b_0
zeromq                     4.3.4                    h2531618_0
zipp                       3.8.1                       pypi_0   pypi
zlib                       1.2.12                   h5eee18b_3
```
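No accepted answer appears on this page, but for context: the Neuron compiler (`neuron-cc`, visible in the environment above) runs on ordinary x86-64 Linux and does not require Inferentia hardware; Inferentia is only needed to run the compiled model. A minimal compilation sketch, assuming the `torch-neuron` and `neuron-cc` packages from the environment above are installed and that the yolov5 model and 640×640 input shape are the ones intended (both are assumptions):

```python
import torch
import torch_neuron  # registers the torch.neuron API (AWS pip repository)

# Assumption: a yolov5 model loaded via torch.hub; any traceable
# torch.nn.Module is compiled the same way.
model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
model.eval()

# Compilation happens on the local (non-Inferentia) machine.
example = torch.zeros(1, 3, 640, 640)
model_neuron = torch.neuron.trace(model, example_inputs=[example])

# The saved TorchScript artifact is what gets deployed to an Inf1 instance.
model_neuron.save("yolov5s_neuron.pt")
```

Operators the compiler cannot place on Neuron fall back to CPU, so a partially supported model still traces; the trace log reports the supported/unsupported operator split.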
1 answer · 1 vote · 18 views · asked 3 days ago

Using transformers module with SageMaker Studio project: ModuleNotFoundError: No module named 'transformers'

As mentioned in my [other recent post](https://repost.aws/questions/QUAL9Vn9abQ6KKCs2ASwwmzg/adjusting-sagemaker-xgboost-project-to-tensorflow-or-even-just-different-folder-name), I'm trying to modify the SageMaker example abalone xgboost template to use tensorflow. My current problem is that running the pipeline fails, and in the logs I see:

```
ModuleNotFoundError: No module named 'transformers'
```

NOTE: I am importing 'transformers' in `preprocess.py`, not in `pipeline.py`.

I have 'transformers' listed as a dependency in various places, including:

* `setup.py` - `required_packages = ["sagemaker==2.93.0", "sklearn", "transformers", "openpyxl"]`
* `pipelines.egg-info/requires.txt` - `transformers` (auto-generated from setup.py?)

So I'm keen to understand: how can I ensure that additional dependencies are available in the pipeline itself? Many thanks in advance.

ADDITIONAL DETAILS ON HOW I ENCOUNTERED THE ERROR

From one particular notebook (see [previous post](https://repost.aws/questions/QUAL9Vn9abQ6KKCs2ASwwmzg/adjusting-sagemaker-xgboost-project-to-tensorflow-or-even-just-different-folder-name) for more details) I successfully constructed the new topic/tensorflow pipeline and ran the following steps:

```
pipeline.upsert(role_arn=role)
execution = pipeline.start()
execution.describe()
```

The `describe()` method gives this output:

```
{'PipelineArn': 'arn:aws:sagemaker:eu-west-1:398371982844:pipeline/topicpipeline-example',
 'PipelineExecutionArn': 'arn:aws:sagemaker:eu-west-1:398371982844:pipeline/topicpipeline-example/execution/0aiczulkjoaw',
 'PipelineExecutionDisplayName': 'execution-1664394415255',
 'PipelineExecutionStatus': 'Executing',
 'PipelineExperimentConfig': {'ExperimentName': 'topicpipeline-example',
  'TrialName': '0aiczulkjoaw'},
 'CreationTime': datetime.datetime(2022, 9, 28, 19, 46, 55, 147000, tzinfo=tzlocal()),
 'LastModifiedTime': datetime.datetime(2022, 9, 28, 19, 46, 55, 147000, tzinfo=tzlocal()),
 'CreatedBy': {'UserProfileArn': 'arn:aws:sagemaker:eu-west-1:398371982844:user-profile/d-5qgy6ubxlbdq/sjoseph-reg-genome-com-273',
  'UserProfileName': 'sjoseph-reg-genome-com-273',
  'DomainId': 'd-5qgy6ubxlbdq'},
 'LastModifiedBy': {'UserProfileArn': 'arn:aws:sagemaker:eu-west-1:398371982844:user-profile/d-5qgy6ubxlbdq/sjoseph-reg-genome-com-273',
  'UserProfileName': 'sjoseph-reg-genome-com-273',
  'DomainId': 'd-5qgy6ubxlbdq'},
 'ResponseMetadata': {'RequestId': 'f949d6f4-1865-4a01-b7a2-a96c42304071',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'f949d6f4-1865-4a01-b7a2-a96c42304071',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '882',
   'date': 'Wed, 28 Sep 2022 19:47:02 GMT'},
  'RetryAttempts': 0}}
```

Waiting for the execution, I get:

```
---------------------------------------------------------------------------
WaiterError                               Traceback (most recent call last)
<ipython-input-14-72be0c8b7085> in <module>
----> 1 execution.wait()

/opt/conda/lib/python3.7/site-packages/sagemaker/workflow/pipeline.py in wait(self, delay, max_attempts)
    581     waiter_id, model, self.sagemaker_session.sagemaker_client
    582 )
--> 583 waiter.wait(PipelineExecutionArn=self.arn)
    584
    585

/opt/conda/lib/python3.7/site-packages/botocore/waiter.py in wait(self, **kwargs)
     53     # method.
     54     def wait(self, **kwargs):
---> 55         Waiter.wait(self, **kwargs)
     56
     57     wait.__doc__ = WaiterDocstring(

/opt/conda/lib/python3.7/site-packages/botocore/waiter.py in wait(self, **kwargs)
    376     name=self.name,
    377     reason=reason,
--> 378     last_response=response,
    379 )
    380 if num_attempts >= max_attempts:

WaiterError: Waiter PipelineExecutionComplete failed: Waiter encountered a terminal failure state: For expression "PipelineExecutionStatus" we matched expected path: "Failed"
```

Which I assume corresponds to the failure I see in the logs:

![build pipeline error message on preprocessing step](/media/postImages/original/IMMpF6LeI6TgWxp20TnPZbUw)

I also ran `python setup.py build` to ensure my build directory was up to date. Here's the terminal output of that command:

```
sagemaker-user@studio$ python setup.py build
/opt/conda/lib/python3.9/site-packages/setuptools/dist.py:771: UserWarning: Usage of dash-separated 'description-file' will not be supported in future versions. Please use the underscore name 'description_file' instead
  warnings.warn(
/opt/conda/lib/python3.9/site-packages/setuptools/config/setupcfg.py:508: SetuptoolsDeprecationWarning: The license_file parameter is deprecated, use license_files instead.
  warnings.warn(msg, warning_class)
running build
running build_py
copying pipelines/topic/pipeline.py -> build/lib/pipelines/topic
running egg_info
writing pipelines.egg-info/PKG-INFO
writing dependency_links to pipelines.egg-info/dependency_links.txt
writing entry points to pipelines.egg-info/entry_points.txt
writing requirements to pipelines.egg-info/requires.txt
writing top-level names to pipelines.egg-info/top_level.txt
reading manifest file 'pipelines.egg-info/SOURCES.txt'
adding license file 'LICENSE'
writing manifest file 'pipelines.egg-info/SOURCES.txt'
```

It seems like the dependencies are being written to `pipelines.egg-info/requires.txt`, but are these not being picked up by the pipeline?
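No answer is shown on this page, but the underlying issue is worth noting: each pipeline step runs in its own container, so packages declared in the project's `setup.py` (which only affects the environment where the pipeline *definition* code runs) are not installed in the processing container that executes `preprocess.py`. One common workaround (an assumption here, not a confirmed answer from this thread; the version pin is illustrative) is to install the dependency at the top of the script that runs inside the container:

```python
# Top of preprocess.py: install extra dependencies inside the processing
# container before importing them.
import subprocess
import sys

subprocess.check_call(
    [sys.executable, "-m", "pip", "install", "transformers==4.21.0"]
)

import transformers  # now importable inside the step's container
```

A cleaner alternative, where the SDK version allows it, is to build the step with a `FrameworkProcessor` and pass a `source_dir` containing a `requirements.txt`, which the processor installs before running the script.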
1 answer · 0 votes · 36 views · asked 4 days ago

Adjusting SageMaker xgboost project to tensorflow (or even just a different folder name)

I have the SageMaker xgboost project template "build, train, deploy" working, but I'd like to modify it to use tensorflow instead of xgboost. First up, I was just trying to change the `abalone` folder to `topic` to reflect the data we are working with. I was experimenting with changing the `topic/pipeline.py` file like so:

```
image_uri = sagemaker.image_uris.retrieve(
    framework="tensorflow",
    region=region,
    version="1.0-1",
    py_version="py3",
    instance_type=training_instance_type,
)
```

i.e. just changing the framework name from "xgboost" to "tensorflow". But when I run the following from a notebook:

```
from pipelines.topic.pipeline import get_pipeline

pipeline = get_pipeline(
    region=region,
    role=role,
    default_bucket=default_bucket,
    model_package_group_name=model_package_group_name,
    pipeline_name=pipeline_name,
)
```

I get the following error:

```
ValueError                                Traceback (most recent call last)
<ipython-input-5-6343f00c3471> in <module>
      7     default_bucket=default_bucket,
      8     model_package_group_name=model_package_group_name,
----> 9     pipeline_name=pipeline_name,
     10 )

~/topic-models-no-monitoring-p-rboparx6tdeg/sagemaker-topic-models-no-monitoring-p-rboparx6tdeg-modelbuild/pipelines/topic/pipeline.py in get_pipeline(region, sagemaker_project_arn, role, default_bucket, model_package_group_name, pipeline_name, base_job_prefix, processing_instance_type, training_instance_type)
    188     version="1.0-1",
    189     py_version="py3",
--> 190     instance_type=training_instance_type,
    191 )
    192 tf_train = Estimator(

/opt/conda/lib/python3.7/site-packages/sagemaker/workflow/utilities.py in wrapper(*args, **kwargs)
    197     logger.warning(warning_msg_template, arg_name, func_name, type(value))
    198     kwargs[arg_name] = value.default_value
--> 199     return func(*args, **kwargs)
    200
    201 return wrapper

/opt/conda/lib/python3.7/site-packages/sagemaker/image_uris.py in retrieve(framework, region, version, py_version, instance_type, accelerator_type, image_scope, container_version, distribution, base_framework_version, training_compiler_config, model_id, model_version, tolerate_vulnerable_model, tolerate_deprecated_model, sdk_version, inference_tool, serverless_inference_config)
    152 if inference_tool == "neuron":
    153     _framework = f"{framework}-{inference_tool}"
--> 154 config = _config_for_framework_and_scope(_framework, image_scope, accelerator_type)
    155
    156 original_version = version

/opt/conda/lib/python3.7/site-packages/sagemaker/image_uris.py in _config_for_framework_and_scope(framework, image_scope, accelerator_type)
    277     image_scope = available_scopes[0]
    278
--> 279 _validate_arg(image_scope, available_scopes, "image scope")
    280 return config if "scope" in config else config[image_scope]
    281

/opt/conda/lib/python3.7/site-packages/sagemaker/image_uris.py in _validate_arg(arg, available_options, arg_name)
    443     "Unsupported {arg_name}: {arg}. You may need to upgrade your SDK version "
    444     "(pip install -U sagemaker) for newer {arg_name}s. Supported {arg_name}(s): "
--> 445     "{options}.".format(arg_name=arg_name, arg=arg, options=", ".join(available_options))
    446 )
    447

ValueError: Unsupported image scope: None. You may need to upgrade your SDK version (pip install -U sagemaker) for newer image scopes. Supported image scope(s): eia, inference, training.
```

I was skeptical that the upgrade suggested by the error message would fix this, but gave it a try:

```
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
pipelines 0.0.1 requires sagemaker==2.93.0, but you have sagemaker 2.110.0 which is incompatible.
```

So it seems I can't upgrade sagemaker without changing pipelines, and it's not clear that's the right thing to do; this project template may be designed around those particular earlier libraries. But should the "framework" name be different, e.g. "tf"?
Or is there some other setting that needs changing in order to get a tensorflow pipeline...?

However, I find that if I use the existing `abalone/pipeline.py` file, I can change the framework to "tensorflow" and there's no problem running that particular step in the notebook. I've searched all the files in the project to find any dependency on the `abalone` folder name, and the closest I came was in `codebuild-buildspec.yml`, but that hasn't helped. Has anyone else successfully changed the folder name from `abalone` to something else, or am I stuck with `abalone` if I want to make progress? Many thanks in advance.

p.s. Is there a Slack community for SageMaker Studio anywhere?

p.p.s. I have tried changing all instances of the term "Abalone" to "Topic" within the `topic/pipeline.py` file (matching case as appropriate), to no avail.

p.p.p.s. I discovered that I can get an error-free run of getting the pipeline from a unit test:

```
import pytest
from pipelines.topic.pipeline import *

region = 'eu-west-1'
role = 'arn:aws:iam::398371982844:role/SageMakerExecutionRole'
default_bucket = 'sagemaker-eu-west-1-398371982844'
model_package_group_name = 'TopicModelPackageGroup-Example'
pipeline_name = 'TopicPipeline-Example'

def test_pipeline():
    pipeline = get_pipeline(
        region=region,
        role=role,
        default_bucket=default_bucket,
        model_package_group_name=model_package_group_name,
        pipeline_name=pipeline_name,
    )
```

And strangely, if I go to a different copy of the notebook, everything runs fine there... so I have two seemingly identical ipynb notebooks; in one of them, when I switch to trying to get a topic pipeline, I get the above error, and in the other I get no error at all. Very strange.

p.p.p.p.s. I also notice that `conda list` returns very different results depending on whether I run it in the notebook or the terminal... but the `conda list` results are identical for the two notebooks.
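No answer appears on this page, but the error itself points at the likely cause: for the tensorflow framework the SDK's image config is split into `eia`, `inference`, and `training` scopes, so `retrieve` needs an explicit `image_scope` (xgboost's single-scope config lets it be omitted), and `"1.0-1"` is an xgboost version string, not a TensorFlow release. A sketch of an explicit call (the `version` and `py_version` values are assumptions; check which ones the pinned sagemaker 2.93.0 supports):

```python
from sagemaker import image_uris

# image_scope must be given explicitly for tensorflow, and the version
# must be a TensorFlow release rather than xgboost's "1.0-1".
image_uri = image_uris.retrieve(
    framework="tensorflow",
    region="eu-west-1",
    version="2.8",            # assumption: a TF version the SDK knows
    py_version="py39",        # assumption: pairs with the chosen version
    image_scope="training",
    instance_type="ml.m5.xlarge",
)
```

This would also explain why the unit test and the second notebook behave differently only if they resolve different arguments or SDK versions, which is worth checking with `sagemaker.__version__` in each environment.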
1 answer · 0 votes · 21 views · asked 4 days ago
