SageMaker Deployment Issues: TypeError: not a string


Trying to get codegen25-7b-multi to launch on SageMaker and hitting issues on 2xlarge, 8xlarge, and 12xlarge instances. All are throwing the same errors:

Error #1

2023-08-04T15:30:16.074558Z ERROR shard-manager: text_generation_launcher: Error when initializing model

Error #2

File "/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py", line 124, in serve_inner
    model = get_model(model_id, revision, sharded, quantize, trust_remote_code)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/__init__.py", line 237, in get_model
    return FlashLlamaSharded(
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/flash_llama.py", line 166, in __init__
    tokenizer = LlamaTokenizer.from_pretrained(
  File "/usr/src/transformers/src/transformers/tokenization_utils_base.py", line 1812, in from_pretrained
    return cls._from_pretrained(
  File "/usr/src/transformers/src/transformers/tokenization_utils_base.py", line 1975, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/usr/src/transformers/src/transformers/models/llama/tokenization_llama.py", line 96, in __init__
    self.sp_model.Load(vocab_file)
  File "/opt/conda/lib/python3.9/site-packages/sentencepiece/__init__.py", line 905, in Load
    return self.LoadFromFile(model_file)
  File "/opt/conda/lib/python3.9/site-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)

Error #3

TypeError: not a string
rank=0

Error #4

2023-08-04T15:30:16.396605Z ERROR text_generation_launcher: Shard 1 failed to start:
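
For context, the traceback in Error #2 ends inside sentencepiece's LoadFromFile, which raises "TypeError: not a string" when the vocab_file it receives is None rather than a file path. My assumption is that this happens because the repo ships no sentencepiece tokenizer.model for LlamaTokenizer to load; a minimal standalone sketch of that failure mode:

from transformers import LlamaTokenizer

# Assumption: if the Hub repo has no tokenizer.model, vocab_file resolves to
# None and sentencepiece raises "TypeError: not a string" as in the logs above.
tokenizer = LlamaTokenizer.from_pretrained("Salesforce/codegen25-7b-multi")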

Notebook for deployment on SageMaker:

import json
import sagemaker
import boto3
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

try:
    role = sagemaker.get_execution_role()
except ValueError:
    iam = boto3.client('iam')
    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']

# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'Salesforce/codegen25-7b-multi',
    'SM_NUM_GPUS': '2',
    'HF_API_TOKEN': '<TOKEN>'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="0.8.2"),
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.12xlarge",
    container_startup_health_check_timeout=300,
    endpoint_name="codegen25",
    model_name="codegen25"
)
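
If deployment succeeds, the endpoint can be smoke-tested with a minimal request; the prompt and parameter values below are illustrative, following the Hugging Face LLM container's request schema:

# Illustrative test call; "inputs" and "parameters" follow the
# text-generation-inference request format used by the container.
response = predictor.predict({
    "inputs": "def fibonacci(n):",
    "parameters": {"max_new_tokens": 64}
})
print(response)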
texnoob
asked 9 months ago · 256 views
1 Answer

Hi, a first issue is 'SM_NUM_GPUS': '2'. As per https://huggingface.co/transformers/v4.6.0/sagemaker.html:

SM_NUM_GPUS: An integer representing the number of GPUs available to the host.

So, try 'SM_NUM_GPUS': 2, i.e. without quotes around the 2, to make it an integer rather than a string.
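
Applied to the notebook above, the hub configuration would then read as follows (the '<TOKEN>' placeholder kept as-is); this is just the suggested change dropped into the original dict:

# Hub model configuration with SM_NUM_GPUS passed as an integer,
# per the suggestion above.
hub = {
    'HF_MODEL_ID': 'Salesforce/codegen25-7b-multi',
    'SM_NUM_GPUS': 2,
    'HF_API_TOKEN': '<TOKEN>'
}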

Best,

Didier

AWS
EXPERT
answered 9 months ago
