How to set up a training job in SageMaker?


I'm following a blog/sample here. How can I set up something similar with just SageMaker and the AWS CLI? (Sample code from the example is below.) The example uses the distilbert-base-uncased model, which is loaded with this code: tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

Where does the model get downloaded from? And if one were to set up a similar training job via boto3 or the CLI, can the model location be passed as a path in an S3 bucket?

# Importing necessary tools
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
from datasets import load_dataset
import tensorflow as tf
import numpy as np

# Loading our dataset
tweet_dataset = load_dataset(path="tweet_eval", name="emotion")

# Instantiating our DistilBERT tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = TFAutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=4)

asked 2 months ago, 162 views
1 Answer

Hi clouduser,

If you want to set up something similar with just SageMaker and the AWS CLI, here is an article that shows how to set up a training job directly using a Hugging Face model and Amazon SageMaker.

Here is another example setup with PyTorch Training Jobs.

Here is another example setup with TensorFlow Training Jobs.

I would recommend following one of these three blogs to set up your Amazon SageMaker training job based on which model you decide to go with.
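On the question of passing a model location in S3: one possible approach with boto3 is to declare the S3 prefix as an input channel of the training job. The sketch below only builds the request for the CreateTrainingJob API; the channel name "model", the instance type, the hyperparameter, and every ARN, image URI, and S3 path in the usage note are placeholders rather than values from the blog. It also assumes the training image already knows how to run your training script. In File input mode, SageMaker copies the channel's contents to /opt/ml/input/data/model inside the container before training starts.

```python
def build_training_job_request(job_name, role_arn, image_uri,
                               model_s3_uri, output_s3_uri,
                               instance_type="ml.p3.2xlarge"):
    """Build keyword arguments for the SageMaker CreateTrainingJob API.

    All argument values are supplied by the caller; nothing here is
    specific to the blog example.
    """
    return {
        "TrainingJobName": job_name,
        "RoleArn": role_arn,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "InputDataConfig": [
            {
                # Hypothetical channel name; its files are copied to
                # /opt/ml/input/data/model in the training container.
                "ChannelName": "model",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": model_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": instance_type,  # placeholder default
            "InstanceCount": 1,
            "VolumeSizeInGB": 50,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        # Hyperparameter values must be strings; "epochs" is illustrative.
        "HyperParameters": {"epochs": "3"},
    }


def submit(request):
    """Send the request to SageMaker (needs boto3 and AWS credentials)."""
    import boto3

    return boto3.client("sagemaker").create_training_job(**request)
```

With real values filled in, the job would be started with something like submit(build_training_job_request("my-job", role_arn, image_uri, "s3://my-bucket/models/distilbert/", "s3://my-bucket/output/")). The same request structure can be written as JSON and passed to the AWS CLI with aws sagemaker create-training-job --cli-input-json.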

answered 2 months ago
reviewed 2 months ago
  • @autrin - Thanks. If I want to fine-tune a model ("distilbert-base-uncased" in my case), how do I set up a training/fine-tuning job in SageMaker?
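For context on where the model comes from: when from_pretrained is given a model ID such as "distilbert-base-uncased", it downloads the files from the Hugging Face Hub and caches them locally; it also accepts a path to a local directory. A fine-tuning script could therefore prefer a copy supplied through a SageMaker input channel and fall back to the Hub otherwise. The sketch below assumes a channel named "model" (a hypothetical name), a container that sets the SM_CHANNEL_MODEL environment variable as the AWS-provided framework containers do, and a channel directory holding uncompressed files written by save_pretrained (config.json, tokenizer files, weights).

```python
import os

HUB_MODEL_ID = "distilbert-base-uncased"


def resolve_model_source(environ=None, channel="model"):
    """Return a local model directory if the channel provides one,
    otherwise the Hugging Face Hub model ID."""
    environ = os.environ if environ is None else environ
    local_dir = environ.get(f"SM_CHANNEL_{channel.upper()}")
    # Treat the directory as a saved model only if it has a config.json.
    if local_dir and os.path.isfile(os.path.join(local_dir, "config.json")):
        return local_dir
    return HUB_MODEL_ID


def load_tokenizer_and_model(num_labels=4):
    """Load the tokenizer and classifier from the resolved source."""
    # Imported here so resolve_model_source works without transformers.
    from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

    source = resolve_model_source()
    tokenizer = AutoTokenizer.from_pretrained(source)
    model = TFAutoModelForSequenceClassification.from_pretrained(
        source, num_labels=num_labels
    )
    return tokenizer, model
```

If the S3 prefix holds a model.tar.gz instead of loose files, the script would need to extract it before calling from_pretrained on the directory.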
