Questions tagged with Amazon SageMaker Model Training
Hi!
I have a training set of images for which I manually created a manifest file respecting the format required to train a Rekognition Custom Labels model for object detection. Both the images and...
1 answer, 0 votes, 295 views, asked a year ago
When trying to start a training job in sagemaker using AlgorithmEstimator (by inputting the algorithm arn), I get an error saying that the Algorithm arn does not exist. I have tried this with...
1 answer, 0 votes, 220 views, asked a year ago
I used the training script from [here](https://sagemaker.readthedocs.io/en/stable/frameworks/xgboost/using_xgboost.html) and am trying to train the model. Here is the code I used for configuring...
0 answers, 0 votes, 191 views, asked a year ago
I'm using SageMaker to train on my data.
It has a pre-trained model
“tensorflow-od1-ssd-resnet50-v1-fpn-640x640-coco17-tpu-8”
Create the SageMaker model instance. Note that we need to pass Predictor...
0 answers, 0 votes, 121 views, asked a year ago
I would like to fine-tune large language models (starting with 10+B parameters) on SageMaker.
Since we are working with PyTorch and Lightning, the idea would be to use DeepSpeed in combination with...
1 answer, 1 vote, 596 views, asked a year ago
I am trying to train the GPT2-large model in SageMaker Studio using an `ml.g4dn.2xlarge` instance. The training file is very small (13 KB). It gives the following error:
ExitCode 1
ErrorMessage...
1 answer, 0 votes, 473 views, asked a year ago
I'm using the functionality of `sagemaker.experiments`, where a run object is defined for tracking a job.
For logging of metrics, I'm using the `log_metric()` method of the run object, with name,...
2 answers, 0 votes, 216 views, asked a year ago
I have trained a timeseries model on SageMaker Canvas through 'Standard Build' and made predictions on it. But I am unable to see the trained timeseries model as an AutoMLJob in SageMaker Studio. Is...
1 answer, 0 votes, 362 views, asked a year ago
Hi!
As of a few days ago, the "**Uploading**" phase of my SageMaker training jobs jumped from **2 minutes to 3+ hours**. The size of my artifacts did not change, but I did enable checkpointing...
0 answers, 0 votes, 68 views, asked a year ago
A pipeline train step saves a custom JSON file in the output path, set in the estimator's `output_path` param, as seen below:
```
estimator = TensorFlow(
    entry_point=code_entry,
    ...
```
2 answers, 0 votes, 802 views, asked a year ago
I made a custom training image so that it can be run through CreateTrainingJob without the SageMaker Training Toolkit (requiring the "ContainerEntrypoint" option).
But when I'm trying to run...
1 answer, 0 votes, 260 views, asked a year ago
Hello,
I have a question about publishing to an SNS topic. I received an error message, shown below:
![Enter image description here](/media/postImages/original/IMeF5-PdI3TwqvnXWTTEO1aA)
Also, I try to set...
2 answers, 0 votes, 348 views, asked a year ago