Questions tagged with Amazon SageMaker Model Training
I'm using a PipeModeDataset to read the data from an augmented manifest file. Right now, if I try something like this:
```
ds = PipeModeDataset(channel=channel_name)
for epoch in...
```
2 answers, 0 votes, 234 views, asked a year ago
Hi,
I created a new account a few days ago to use the free tier. I have trained only one simple model in SageMaker Studio, but now when I try to train another simple model I get this error.
An error...
1 answer, 0 votes, 365 views, asked a year ago
Hello,
When I train an HPO job using the SageMaker SDK, it is much slower than training in a SageMaker Jupyter notebook.
Both variants have the same:
1. Hyperparameters (the same Model)
2. Data - Train /...
2 answers, 0 votes, 384 views, asked a year ago
Hi!
I have a training set of images for which I manually created a manifest file respecting the format required to train a Rekognition Custom Labels model for object detection. Both the images and...
1 answer, 0 votes, 345 views, asked a year ago
When trying to start a training job in SageMaker using AlgorithmEstimator (by passing the algorithm ARN), I get an error saying that the algorithm ARN does not exist. I have tried this with...
1 answer, 0 votes, 267 views, asked a year ago
I used the training script from [here](https://sagemaker.readthedocs.io/en/stable/frameworks/xgboost/using_xgboost.html) and am trying to train the model. Here is the code I used for configuring...
0 answers, 0 votes, 223 views, asked a year ago
I'm using SageMaker to train on my data.
It uses the pre-trained model
“tensorflow-od1-ssd-resnet50-v1-fpn-640x640-coco17-tpu-8”
**Create the SageMaker model instance. Note that we need to pass Predictor...
0 answers, 0 votes, 145 views, asked a year ago
I would like to fine-tune large language models (starting with 10B+ parameters) on SageMaker.
Since we are working with PyTorch and Lightning, the idea would be to use DeepSpeed in combination with...
1 answer, 1 vote, 849 views, asked a year ago
I am trying to train the GPT2-large model in SageMaker Studio using an `ml.g4dn.2xlarge` instance. The training file is very small (13 KB). It gives the following error:
ExitCode 1
ErrorMessage...
1 answer, 0 votes, 538 views, asked a year ago
I'm using the functionality of `sagemaker.experiments`, where a run object is defined for tracking a job.
For logging of metrics, I'm using the `log_metric()` method of the run object, with name,...
2 answers, 0 votes, 255 views, asked a year ago
I have trained a time-series model in SageMaker Canvas through 'Standard Build' and made predictions with it, but I am unable to see the trained time-series model as an AutoMLJob in SageMaker Studio. Is...
1 answer, 0 votes, 424 views, asked a year ago
Hi!
As of a few days ago, the "**Uploading**" phase of my SageMaker training jobs jumped from **2 minutes to 3+ hours.** The size of my artifacts did not change, but I did enable checkpointing...
0 answers, 0 votes, 82 views, asked a year ago