Deploy YOLOv5 in sagemaker - ModelError: InvokeEndpoint operation: Received server error (0)
I'm trying to deploy a custom-trained YOLOv5 model in SageMaker for inference. (Note: the model was not trained in SageMaker.) I followed this doc for deploying the model and the inference script: SageMaker docs.
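For context, the deployment followed the doc's PyTorchModel pattern. The sketch below is illustrative only; the S3 path, entry point name, and instance type are placeholders, not my actual values.

from sagemaker import get_execution_role
from sagemaker.pytorch import PyTorchModel

role = get_execution_role()

# model.tar.gz packages the YOLOv5 weights plus the inference script
model = PyTorchModel(
    model_data='s3://my-bucket/yolov5/model.tar.gz',  # placeholder S3 path
    role=role,
    entry_point='inference.py',   # placeholder script name
    framework_version='1.6.0',    # matches the torch version noted below
    py_version='py3',
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.xlarge',  # placeholder instance type
)

Calling predictor.predict on the deployed endpoint then fails: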
ModelError Traceback (most recent call last)
<ipython-input-7-063ca701eab7> in <module>
----> 1 result1=predictor.predict("FILE0032.JPG")
2 print(result1)
~/anaconda3/envs/python3/lib/python3.6/site-packages/sagemaker/predictor.py in predict(self, data, initial_args, target_model, target_variant, inference_id)
159 data, initial_args, target_model, target_variant, inference_id
160 )
--> 161 response = self.sagemaker_session.sagemaker_runtime_client.invoke_endpoint(**request_args)
162 return self._handle_response(response)
163
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
399 "%s() only accepts keyword arguments." % py_operation_name)
400 # The "self" in this scope is referring to the BaseClient.
--> 401 return self._make_api_call(operation_name, kwargs)
402
403 _api_call.__name__ = str(py_operation_name)
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
729 error_code = parsed_response.get("Error", {}).get("Code")
730 error_class = self.exceptions.from_code(error_code)
--> 731 raise error_class(parsed_response, operation_name)
732 else:
733 return parsed_response
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (0) from primary with message "Your invocation timed out while waiting for a response from container primary. Review the latency metrics for each container in Amazon CloudWatch, resolve the issue, and try again.". See https://ap-south-1.console.aws.amazon.com/cloudwatch/home?region=ap-south-1#logEventViewer:group=/aws/sagemaker/Endpoints/pytorch-inference-2022-06-14-11-58-04-086 in account 772044684908 for more information.
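The error message points to the latency metrics in CloudWatch. For reference, those can also be pulled programmatically; a sketch (note that ModelLatency is reported in microseconds):

from datetime import datetime, timedelta
import boto3

cloudwatch = boto3.client('cloudwatch', region_name='ap-south-1')
response = cloudwatch.get_metric_statistics(
    Namespace='AWS/SageMaker',
    MetricName='ModelLatency',  # reported in microseconds
    Dimensions=[
        {'Name': 'EndpointName', 'Value': 'pytorch-inference-2022-06-14-11-58-04-086'},
        {'Name': 'VariantName', 'Value': 'AllTraffic'},
    ],
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=60,
    Statistics=['Average', 'Maximum'],
)
print(response['Datapoints'])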
After researching InvokeEndpoint, I tried invoking the endpoint directly with boto3:
import boto3

sagemaker_runtime = boto3.client("sagemaker-runtime", region_name='ap-south-1')
endpoint_name = 'pytorch-inference-2022-06-14-11-58-04-086'

response = sagemaker_runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    Body=bytes('{"features": ["This is great!"]}', 'utf-8')  # Replace with your own data.
)
print(response['Body'].read().decode('utf-8'))
But this didn't help either. Detailed output:
ReadTimeoutError Traceback (most recent call last)
<ipython-input-8-b5ca204734c4> in <module>
12 response = sagemaker_runtime.invoke_endpoint(
13 EndpointName=endpoint_name,
---> 14 Body=bytes('{"features": ["This is great!"]}', 'utf-8') # Replace with your own data.
15 )
16
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
399 "%s() only accepts keyword arguments." % py_operation_name)
400 # The "self" in this scope is referring to the BaseClient.
--> 401 return self._make_api_call(operation_name, kwargs)
402
403 _api_call.__name__ = str(py_operation_name)
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
716 apply_request_checksum(request_dict)
717 http, parsed_response = self._make_request(
--> 718 operation_model, request_dict, request_context)
719
720 self.meta.events.emit(
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/client.py in _make_request(self, operation_model, request_dict, request_context)
735 def _make_request(self, operation_model, request_dict, request_context):
736 try:
--> 737 return self._endpoint.make_request(operation_model, request_dict)
738 except Exception as e:
739 self.meta.events.emit(
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/endpoint.py in make_request(self, operation_model, request_dict)
105 logger.debug("Making request for %s with params: %s",
106 operation_model, request_dict)
--> 107 return self._send_request(request_dict, operation_model)
108
109 def create_request(self, params, operation_model=None):
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/endpoint.py in _send_request(self, request_dict, operation_model)
182 request, operation_model, context)
183 while self._needs_retry(attempts, operation_model, request_dict,
--> 184 success_response, exception):
185 attempts += 1
186 self._update_retries_context(
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/endpoint.py in _needs_retry(self, attempts, operation_model, request_dict, response, caught_exception)
306 event_name, response=response, endpoint=self,
307 operation=operation_model, attempts=attempts,
--> 308 caught_exception=caught_exception, request_dict=request_dict)
309 handler_response = first_non_none_response(responses)
310 if handler_response is None:
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/hooks.py in emit(self, event_name, **kwargs)
356 def emit(self, event_name, **kwargs):
357 aliased_event_name = self._alias_event_name(event_name)
--> 358 return self._emitter.emit(aliased_event_name, **kwargs)
359
360 def emit_until_response(self, event_name, **kwargs):
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/hooks.py in emit(self, event_name, **kwargs)
227 handlers.
228 """
--> 229 return self._emit(event_name, kwargs)
230
231 def emit_until_response(self, event_name, **kwargs):
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/hooks.py in _emit(self, event_name, kwargs, stop_on_response)
210 for handler in handlers_to_call:
211 logger.debug('Event %s: calling handler %s', event_name, handler)
--> 212 response = handler(**kwargs)
213 responses.append((handler, response))
214 if stop_on_response and response is not None:
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/retryhandler.py in __call__(self, attempts, response, caught_exception, **kwargs)
192 checker_kwargs.update({'retries_context': retries_context})
193
--> 194 if self._checker(**checker_kwargs):
195 result = self._action(attempts=attempts)
196 logger.debug("Retry needed, action of: %s", result)
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/retryhandler.py in __call__(self, attempt_number, response, caught_exception, retries_context)
266
267 should_retry = self._should_retry(attempt_number, response,
--> 268 caught_exception)
269 if should_retry:
270 if attempt_number >= self._max_attempts:
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/retryhandler.py in _should_retry(self, attempt_number, response, caught_exception)
292 # If we've exceeded the max attempts we just let the exception
293 # propogate if one has occurred.
--> 294 return self._checker(attempt_number, response, caught_exception)
295
296
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/retryhandler.py in __call__(self, attempt_number, response, caught_exception)
332 for checker in self._checkers:
333 checker_response = checker(attempt_number, response,
--> 334 caught_exception)
335 if checker_response:
336 return checker_response
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/retryhandler.py in __call__(self, attempt_number, response, caught_exception)
232 elif caught_exception is not None:
233 return self._check_caught_exception(
--> 234 attempt_number, caught_exception)
235 else:
236 raise ValueError("Both response and caught_exception are None.")
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/retryhandler.py in _check_caught_exception(self, attempt_number, caught_exception)
374 # the MaxAttemptsDecorator is not interested in retrying the exception
375 # then this exception just propogates out past the retry code.
--> 376 raise caught_exception
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/endpoint.py in _do_get_response(self, request, operation_model, context)
247 http_response = first_non_none_response(responses)
248 if http_response is None:
--> 249 http_response = self._send(request)
250 except HTTPClientError as e:
251 return (None, e)
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/endpoint.py in _send(self, request)
319
320 def _send(self, request):
--> 321 return self.http_session.send(request)
322
323
~/anaconda3/envs/python3/lib/python3.6/site-packages/botocore/httpsession.py in send(self, request)
449 raise ConnectTimeoutError(endpoint_url=request.url, error=e)
450 except URLLib3ReadTimeoutError as e:
--> 451 raise ReadTimeoutError(endpoint_url=request.url, error=e)
452 except ProtocolError as e:
453 raise ConnectionClosedError(
ReadTimeoutError: Read timeout on endpoint URL: "https://runtime.sagemaker.ap-south-1.amazonaws.com/endpoints/pytorch-inference-2022-06-14-11-58-04-086/invocations"
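One client-side variable worth noting: botocore's default read timeout is 60 seconds, so a slow container can surface as this ReadTimeoutError. It can be raised as below (a sketch; the server-side container timeout from the first error is independent of this setting):

import boto3
from botocore.config import Config

# Raise the client-side read timeout and disable retries so a single
# slow invocation is easier to diagnose.
config = Config(read_timeout=180, retries={'max_attempts': 0})
sagemaker_runtime = boto3.client('sagemaker-runtime',
                                 region_name='ap-south-1',
                                 config=config)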
Is there any error in your CloudWatch Logs that could point to the issue?
I see you are sending the string "FILE0032.JPG". The .predict function will send the literal string "FILE0032.JPG" to the endpoint, not the serialized contents of the file "FILE0032.JPG".
Kindly see how a YOLOv4 model is invoked here.
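To send the actual image bytes rather than the file name, read the file and attach a serializer. A sketch using the SageMaker Python SDK v2 (IdentitySerializer passes raw bytes through unchanged, labeled with the given content type):

from sagemaker.serializers import IdentitySerializer

# Send raw JPEG bytes, labeled with an image content type.
predictor.serializer = IdentitySerializer(content_type='application/x-image')

with open('FILE0032.JPG', 'rb') as f:
    payload = f.read()

result = predictor.predict(payload)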
Thanks for the reply. There is no error in the CloudWatch logs (pasted below). Sorry for the long description; I thought detailed info would be helpful.
2022-06-15T11:15:21.349+05:30 Warning: MMS is using non-default JVM parameters: -XX:-UseContainerSupport AllTraffic/i-0ed6739cdaf7cf56a
2022-06-15T11:15:21.349+05:30 log4j:WARN Continuable parsing error 2 and column 16 AllTraffic/i-0ed6739cdaf7cf56a
2022-06-15T11:15:21.349+05:30 log4j:WARN Document root element "Configuration", must match DOCTYPE root "null". AllTraffic/i-0ed6739cdaf7cf56a
2022-06-15T11:15:21.349+05:30 log4j:WARN Continuable parsing error 2 and column 16 AllTraffic/i-0ed6739cdaf7cf56a
2022-06-15T11:15:21.349+05:30 log4j:WARN Document is invalid: no grammar found. AllTraffic/i-0ed6739cdaf7cf56a
2022-06-15T11:15:21.349+05:30 log4j:ERROR DOM element is - not a <log4j:configuration> element. AllTraffic/i-0ed6739cdaf7cf56a
2022-06-15T11:15:21.349+05:30 log4j:WARN No appenders could be found for logger (io.netty.util.internal.PlatformDependent0). AllTraffic/i-0ed6739cdaf7cf56a
2022-06-15T11:15:21.349+05:30 log4j:WARN Please initialize the log4j system properly. AllTraffic/i-0ed6739cdaf7cf56a
2022-06-15T11:15:21.599+05:30 log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. AllTraffic/i-0ed6739cdaf7cf56a
2022-06-15T11:15:27.349+05:30 Model server started.
I tried this example. The tutorial says "An entry_point script isn't necessary and can be a blank file. The environment variables in the env parameter are also optional". But when I tried it, it threw this error:
---------------------------------------------------------------------------
ModelError Traceback (most recent call last)
<ipython-input-25-b706a4fea979> in <module>
13 for i in range(iters):
14 t0 = time.time()
---> 15 response = client.invoke_endpoint(EndpointName=optimized_predictor.endpoint_name, Body=body, ContentType=content_type)
16 t1 = time.time()
17 #convert to millis
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
399 "%s() only accepts keyword arguments." % py_operation_name)
400 # The "self" in this scope is referring to the BaseClient.
--> 401 return self._make_api_call(operation_name, kwargs)
402
403 _api_call.__name__ = str(py_operation_name)
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
729 error_code = parsed_response.get("Error", {}).get("Code")
730 error_class = self.exceptions.from_code(error_code)
--> 731 raise error_class(parsed_response, operation_name)
732 else:
733 return parsed_response
ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from primary with message "Content type applicatoin/x-image is not supported by this framework.
Please implement input_fn to to deserialize the request data or an output_fn to
serialize the response. For more information, see the SageMaker Python SDK README.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/sagemaker_inference/decoder.py", line 106, in decode
decoder = _decoder_map[content_type]
KeyError: 'applicatoin/x-image'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 128, in transform
result = self._transform_fn(self._model, input_data, content_type, accept)
File "/usr/local/lib/python3.6/site-packages/sagemaker_inference/transformer.py", line 233, in _default_transform_fn
data = self._input_fn(input_data, content_type)
File "/usr/local/lib/python3.6/site-packages/sagemaker_pytorch_serving_container/default_inference_handler.py", line 111, in default_input_fn
np_array = decoder.decode(input_data, content_type)
File "/usr/local/lib/python3.6/site-packages/sagemaker_inference/decoder.py", line 109, in decode
raise errors.UnsupportedFormatError(content_type)
sagemaker_inference.errors.UnsupportedFormatError: Content type applicatoin/x-image is not supported by this framework.
Please implement input_fn to to deserialize the request data or an output_fn to
serialize the response. For more information, see the SageMaker Python SDK README.
". See https://ap-south-1.console.aws.amazon.com/cloudwatch/home?region=ap-south-1#logEventViewer:group=/aws/sagemaker/Endpoints/sagemaker-inference-pytorch-ml-c5-2022-06-15-05-44-12-970 in account 772044684908 for more information.
FYI:
torch.__version__: 1.6.0
kernel: conda_pytorch_p36 (same steps followed as mentioned in the tutorial)
Very confused about how to proceed after this. Why is SageMaker this complex? Any kind of help would be appreciated. Thanks, Marc.
Thanks Marc, please see my comment in the next answer.