Hi, I'm trying to create an endpoint for my own pretrained model (a customized flan-t5 model).
I have already uploaded model.tar.gz to an S3 bucket called 'errocorrection', and I'm working on a SageMaker notebook instance (ml.g4dn.2xlarge) with a 100 GB EBS volume.
(model.tar.gz is 10 GB and contains: inference.py, pytorch_model.bin, tokenizer.json, tokenizer_config.json, config.json, generation_config.json, special_tokens_map.json)
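For reference, the archive layout and its uncompressed size can be checked with something like the sketch below. It assumes model.tar.gz sits in the notebook's current working directory; the helper name summarize_archive is only illustrative and is not part of any SageMaker API.

```python
import os
import tarfile


def summarize_archive(path):
    """Return (members, total_bytes) for the regular files in a tar archive."""
    members = []
    # "r:*" lets tarfile detect the compression (gzip here) automatically.
    with tarfile.open(path, "r:*") as tar:
        for info in tar:
            if info.isfile():
                members.append((info.name, info.size))
    total_bytes = sum(size for _, size in members)
    return members, total_bytes


# Assumed local path; adjust to wherever the archive actually lives.
ARCHIVE = "model.tar.gz"

if os.path.exists(ARCHIVE):
    members, total = summarize_archive(ARCHIVE)
    for name, size in members:
        print(f"{size / 1e9:8.2f} GB  {name}")
    print(f"compressed:   {os.path.getsize(ARCHIVE) / 1e9:.2f} GB")
    print(f"uncompressed: {total / 1e9:.2f} GB")
```

The uncompressed total is the more relevant number when estimating disk needs, since anything that unpacks the archive needs room for both the archive and its extracted contents.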
import time
import sagemaker
from sagemaker.pytorch import PyTorchModel

role = sagemaker.get_execution_role()
bucket = 'errocorrection'
%%bash -s "$role" "$bucket"
ROLE=$1
BUCKET=$2
aws s3 cp model.tar.gz s3://$BUCKET/model.tar.gz
%%time
model_path = 's3://{}/model.tar.gz'.format(bucket)
endpoint_name = "endpoint-{}".format(int(time.time()))
model = PyTorchModel(
    model_data=model_path,
    role=role,
    entry_point='inference.py',
    framework_version='1.8.0',
    py_version='py3',
)
predictor = model.deploy(
    initial_instance_count=1,
    instance_type='ml.g4dn.2xlarge',
)
The problem is that when I try to create an endpoint for this model (with the code above), it fails with OSError: [Errno 28] No space left on device. I checked the free space in a terminal with df -h, and as far as I can tell there is enough space.
I've attached a screenshot of the troubleshooting output and the disk usage reported when I checked the space.
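A rough Python equivalent of the df -h check is sketched below, in case per-path numbers are easier to compare. The paths are assumptions: "/" is the root volume, /home/ec2-user/SageMaker is where the notebook's EBS volume is usually mounted, and /tmp is a common location for temporary files. The helper name free_space_report is hypothetical.

```python
import shutil


def free_space_report(paths):
    """Return {path: (total_gb, used_gb, free_gb)} for each path that exists."""
    report = {}
    for path in paths:
        try:
            usage = shutil.disk_usage(path)
        except OSError:
            continue  # skip paths that do not exist on this machine
        report[path] = tuple(
            round(value / 1e9, 2)
            for value in (usage.total, usage.used, usage.free)
        )
    return report


# Assumed mount points on a SageMaker notebook instance; adjust as needed.
CANDIDATE_PATHS = ["/", "/home/ec2-user/SageMaker", "/tmp"]

for path, (total, used, free) in free_space_report(CANDIDATE_PATHS).items():
    print(f"{path}: total={total} GB, used={used} GB, free={free} GB")
```

These numbers describe only the machine the snippet runs on. Different mount points can have very different amounts of free space, so a large EBS volume does not by itself mean the root volume or /tmp has room.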
Please help me!!