You can find the actual training script's `train_source_uri` on the LightGBM algorithm docs page, then download and unzip the bundle in your notebook to inspect what it's doing. Something like this:

```
!mkdir -p tmp
!aws s3 cp --no-sign-request {train_source_uri} tmp/train_source.tar.gz
!cd tmp && tar -xvf train_source.tar.gz
```
Unfortunately, the `sagemaker_jumpstart_tabular_script_utilities` package referenced by this particular script (in the main `transfer_learning.py` file) doesn't seem to be open-source anywhere... But the bundle includes a wheel file, `lib/sagemaker_jumpstart_tabular_script_utilities/sagemaker_jumpstart_tabular_script_utilities-1.0.0-py2.py3-none-any.whl`, which you can just `!unzip` from your notebook to see the implementation of `save_model()`. Result: it's actually saving the model file via `joblib.dump()`.
There's a nice diagram in the developer guide of the mapping between folders in the container(s) SageMaker spins up for your training job and input/output locations in Amazon S3. Your final S3 paths may depend on whether you're using the `sagemaker` Python SDK or the low-level `boto3` AWS SDK to create your training job. Either way though, you should be able to customise the output S3 location (see e.g. `output_path` and `base_job_name` on SageMaker Python SDK Estimator objects).
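To make that concrete, here's a small sketch of how those two parameters fit together. `output_path` and `base_job_name` are real Estimator parameters, but the bucket, prefix, and job name below are placeholders, and the commented-out Estimator call is illustrative rather than a complete training setup:

```python
def output_location_config(bucket: str, prefix: str, job_base_name: str) -> dict:
    """Build the kwargs that control where a SageMaker Estimator writes its
    model.tar.gz. The final artifact lands under roughly:
      s3://{bucket}/{prefix}/{job_base_name}-<timestamp>/output/model.tar.gz
    """
    return {
        "output_path": f"s3://{bucket}/{prefix}",
        "base_job_name": job_base_name,
    }


# Hypothetical usage with the SageMaker Python SDK (all values are placeholders):
#   from sagemaker.estimator import Estimator
#   estimator = Estimator(
#       image_uri=train_image_uri,
#       role=role,
#       instance_count=1,
#       instance_type="ml.m5.xlarge",
#       **output_location_config("my-bucket", "lightgbm-experiments", "lgbm-demo"),
#   )
```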
As mentioned in the doc, your training container's `/opt/ml/model` folder is always compressed to a `.tar.gz` archive (by SageMaker) before upload to S3.
So in summary:

- I'd suggest trying to read your model file via `joblib.load` (after downloading and extracting the `model.tar.gz`, e.g. with `tar -xvf`).
- If you're trying to customize where your models get saved in S3, check out the Estimator's `output_path` and `base_job_name` parameters if using the SageMaker Python SDK, or just the `OutputDataConfig` parameter if using the low-level `boto3` API.
- Customizing the format of the output object itself doesn't really make sense with pre-built algorithms, but for your own training scripts you can put pretty much whatever you like in the folder and it will be tarballed for you. You cannot change which folder within the training container is used for this output (it's always `/opt/ml/model`), but as a best practice most example scripts accept this as a CLI argument and/or environment variable to facilitate local debugging.
There are 3 components working together here: the SageMaker service itself, the container-side SageMaker Training Toolkit (which sets up CLI args and environment variables like `SM_MODEL_DIR` for you, installs your requirements.txt, and starts your script), and your algorithm code itself... or in this case, the pre-built LightGBM algorithm code.
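For your own training scripts, the common convention that builds on what the Training Toolkit provides looks like the sketch below: accept the model directory as a CLI argument, defaulting to the `SM_MODEL_DIR` environment variable, with a final fallback to `/opt/ml/model`. The function name is my own; only `SM_MODEL_DIR` and the folder path come from the source.

```python
import argparse
import os


def parse_args(argv=None) -> argparse.Namespace:
    """Read the model output dir per the usual SageMaker script convention:
    CLI flag > SM_MODEL_DIR env var (set by the Training Toolkit) >
    /opt/ml/model (the fixed in-container output folder)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--model-dir",
        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"),
        help="Where to write model artifacts; SageMaker tars this folder to S3.",
    )
    return parser.parse_args(argv)


# In a training script you'd then do something like:
#   args = parse_args()
#   joblib.dump(model, os.path.join(args.model_dir, "model.joblib"))
```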
Hope that helps!
The answer doesn't match the question. Please read the question carefully; the answer should contain a Python script to execute, not just theoretical jargon.