Questions tagged with Amazon SageMaker


Batch Transformation Mismatch

I'm performing a batch transformation in a pipeline and joining its output with the ground-truth labels to create a ModelQuality monitoring step, following the guide here: https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-pipelines/tabular/model-monitor-clarify-pipelines/sagemaker-pipeline-model-monitor-clarify-steps.ipynb

Overview:

* Using scikit-learn pipeline tools for data handling and prediction (i.e. the sklearn pipeline module)
* Using my own inference script (based on the build-your-own example provided by AWS)
* Reading in a CSV file of data, which successfully loads and predicts labels (example output: `['an01' 'hxn2' 'sv4' ... 'ngn' 'ssv' 'ssv']`)
* The transform step fails with the following message:

```
2022-11-08T16:22:59.531:[sagemaker logs]: sagemaker-us-east-1-766029086407/TEST-monitor-steps/cipm6ea9musp/projectKW-V3-Tune-TrainRegister-process/output/test/test.csv: Fail to join data: mismatched line count between the input and the output
```

I am having trouble understanding whether the *TransformInput* configuration passed to the step is wrong, or whether there is an error in the inference script itself. Below is the code for the transformer step:

```
transformer = Transformer(
    model_name=step_create_model.properties.ModelName,
    instance_type="ml.m5.xlarge",
    instance_count=1,
    accept="text/csv",
    assemble_with="Line",
    output_path=f"s3://{bucket}/Transform",
    sagemaker_session=sagemaker_session,
)

step_transform = TransformStep(
    name="TransformStep",
    transformer=transformer,
    inputs=TransformInput(
        data=preprocessing_step.properties.ProcessingOutputConfig.Outputs["test"].S3Output.S3Uri,
        join_source="Input",
        content_type="text/csv",
        split_type="Line",
    ),
)
```

Below is the code for the `output_fn` in the inference script:

```
def output_fn(prediction, accept):
    if accept == "application/json":
        instances = []
        for row in prediction.tolist():
            instances.append({"features": row})
        json_output = {"instances": instances}
        return worker.Response(json.dumps(json_output), mimetype=accept)
    elif accept == "application/x-npy":
        return worker.Response(encoders.encode(prediction, accept), mimetype=accept)
    elif accept == "text/csv":
        return worker.Response(encoders.encode(prediction, accept), mimetype=accept)
    else:
        # RuntimeError is the Python built-in; `RuntimeException` does not exist
        raise RuntimeError("{} accept type is not supported by this script.".format(accept))
```

Thanks for the help!
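For context on the error: with `join_source="Input"` and `split_type="Line"`, batch transform splits the input file on newlines and expects the container to return exactly one output line per input line, so a header row in `test.csv` that the model drops, or a response that emits more or fewer rows than it received, will trigger this mismatch. Below is a minimal sketch of a `text/csv` branch for `output_fn` that guarantees one output line per prediction; it assumes `prediction` is a 1-D array of labels and reuses the `worker` helper from `sagemaker_containers` as in the script above.

```
import csv
import io

from sagemaker_containers.beta.framework import worker


def output_fn(prediction, accept):
    if accept == "text/csv":
        # Emit exactly one CSV line per prediction so the output line
        # count matches the input and the "Input" join can succeed.
        buf = io.StringIO()
        writer = csv.writer(buf)
        for label in prediction.tolist():
            writer.writerow([label])
        return worker.Response(buf.getvalue(), mimetype=accept)
    raise RuntimeError("{} accept type is not supported by this script.".format(accept))
```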
2 answers · 0 votes · 47 views · cm_pall · asked a month ago

How to Resolve "ERROR execute(301) Failed to execute model:"

We have two applications running on the same AWS Panorama Appliance and processing different video streams. Unfortunately, we are hitting the following error:

```
2022-10-09 21:25:32.360 ERROR executionThread(358) Model 'model':
2022-10-09 21:25:32.359 ERROR execute(301) Failed to execute model:
TVMError:
---------------------------------------------------------------
An error occurred during the execution of TVM.
For more information, please see: https://tvm.apache.org/docs/errors.html
---------------------------------------------------------------
  Check failed: (context->execute(batch_size
Stack trace:
  File "/home/nvidia/neo-ai-dlr/3rdparty/tvm/src/runtime/contrib/tensorrt/tensorrt_runtime.cc", line 177
  [bt] (0) /data/cloud/assets/applicationInstance-6ta4fxv6hatsk62pf7aigge36e/a9adc18d31f58ce11dab117a31b7f47e7ee2ab83e04b52c2952ac8cd47b51f72/model/libdlr.so(+0x381358) [0x7f81e66358]
  [bt] (1) /data/cloud/assets/applicationInstance-6ta4fxv6hatsk62pf7aigge36e/a9adc18d31f58ce11dab117a31b7f47e7ee2ab83e04b52c2952ac8cd47b51f72/model/libdlr.so(tvm::runtime::detail::LogFatal::Entry::Finalize()+0x88) [0x7f81bb64a0]
  [bt] (2) /data/cloud/assets/applicationInstance-6ta4fxv6hatsk62pf7aigge36e/a9adc18d31f58ce11dab117a31b7f47e7ee2ab83e04b52c2952ac8cd47b51f72/model/libdlr.so(tvm::runtime::contrib::TensorRTRuntime::Run()+0x12b8) [0x7f81e243b0]
  [bt] (3) /data/cloud/assets/applicationInstance-6ta4fxv6hatsk62pf7aigge36e/a9adc18d31f58ce11dab117a31b7f47e7ee2ab83e04b52c2952ac8cd47b51f72/model/libdlr.so(std::_Function_handler<void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*), tvm::runtime::json::JSONRuntimeBase::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)::{lambda(tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)#3}>::_M_invoke(std::_Any_data const&, tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&)+0x5c) [0x7f81e1bfc4]
  [bt] (4) /data/cloud/assets/applicationInstance-6ta4fxv6hatsk62pf7aigge36e/a9adc18d31f58ce11dab117a31b7f47e7ee2ab83e04b52c2952ac8cd47b51f72/model/libdlr.so(+0x3c0dc4) [0x7f81ea5dc4]
  [bt] (5) /data/cloud/assets/applicationInstance-6ta4fxv6hatsk62pf7aigge36e/a9adc18d31f58ce11dab117a31b7f47e7ee2ab83e04b52c2952ac8cd47b51f72/model/libdlr.so(+0x3c0e4c) [0x7f81ea5e4c]
  [bt] (6) /data/cloud/assets/applicationInstance-6ta4fxv6hatsk62pf7aigge36e/a9adc18d31f58ce11dab117a31b7f47e7ee2ab83e04b52c2952ac8cd47b51f72/model/libdlr.so(dlr::TVMModel::Run()+0xc0) [0x7f81c258e0]
  [bt] (7) /data/cloud/assets/applicationInstance-6ta4fxv6hatsk62pf7aigge36e/a9adc18d31f58ce11dab117a31b7f47e7ee2ab83e04b52c2952ac8cd47b51f72/model/libdlr.so(RunDLRModel+0x1c) [0x7f81bea304]
  [bt] (8) /usr/lib/libAwsOmniInferLib.so(awsomniinfer::CNeoModel::SNeoModel::execute()+0x3c) [0x7f887db978]
2022-10-09 21:25:32.437 ERROR executionThread(358) Model 'model':
2022-10-09 21:25:32.437 ERROR setData(279) Failed to set model input 'data':
```

The error isn't persistent; it may happen once every 2-3 weeks, and I need to know where to investigate. The application logs are in the attachment. I am trying to avoid this issue, and I would appreciate it if somebody knew how to handle this properly.
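Not a root-cause fix, but since the failure is rare and transient, one common mitigation while investigating is to isolate the execute call so a single bad frame doesn't take down the execution thread. Below is a minimal, generic Python sketch; `model.run` and the retry parameters are hypothetical stand-ins, not Panorama SDK API, for however the application invokes the compiled (Neo/DLR) model.

```
import logging
import time


def run_model_safely(model, model_input, max_retries=2, backoff_s=0.1):
    """Run the model, retrying transient TVM/TensorRT execution failures.

    Returns the model output, or None if every attempt failed, so the
    caller can drop the current frame instead of killing the thread.
    """
    for attempt in range(1, max_retries + 2):
        try:
            # `model.run` is a hypothetical stand-in for the actual
            # inference call used by the application.
            return model.run(model_input)
        except Exception as exc:
            logging.error("Model execution failed (attempt %d/%d): %s",
                          attempt, max_retries + 1, exc)
            time.sleep(backoff_s)
    return None
```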
0 answers · 0 votes · 22 views · Rinat · asked a month ago