Based on the example here,
https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-triton/ensemble/sentence-transformer-trt/examples/ensemble_hf/ensemble/config.pbtxt, I am working on a configuration file for a multi-model endpoint for a BERT-based model, which takes a string as input and outputs a string. The max_batch_size and dims: [ 1 ] parameters below are not very clear to me. Is there any more information on them? The Triton server documentation is not very clear on this either, from what I saw.
name: "ensemble"
platform: "ensemble"
max_batch_size: 16
input [
  {
    name: "INPUT0"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]
output [
  {
    name: "finaloutput"
    data_type: TYPE_FP32
    dims: [ 384 ]
  }
]
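
A minimal sketch of how these two settings are usually read together, assuming Triton's convention that dims leaves out the batch dimension whenever max_batch_size is greater than 0. The helper full_shape is hypothetical (it is not part of Triton or SageMaker) and only mirrors that rule in plain Python; the sentences are made-up sample data.

```python
# Hypothetical helper, not a Triton API: it mirrors how the config is interpreted.
def full_shape(batch_size, dims, max_batch_size):
    """Return the full tensor shape for one request."""
    if max_batch_size == 0:
        # Batching disabled: dims already describes the whole tensor.
        return list(dims)
    if not 1 <= batch_size <= max_batch_size:
        raise ValueError(
            f"batch size {batch_size} must be between 1 and {max_batch_size}"
        )
    # Batching enabled: the batch dimension is prepended to dims.
    return [batch_size] + list(dims)


# Made-up sample input: three sentences sent in one request.
sentences = ["first sentence", "second sentence", "third sentence"]

# dims: [ 1 ] means each item in the batch is a single string.
input_rows = [[s] for s in sentences]

input_shape = full_shape(len(sentences), [1], max_batch_size=16)
output_shape = full_shape(len(sentences), [384], max_batch_size=16)

print(input_shape)   # shape of INPUT0 for this request
print(output_shape)  # shape of finaloutput for this request
```

Under that assumption, max_batch_size: 16 caps how many strings one request may carry, dims: [ 1 ] says each batch item holds exactly one string, and dims: [ 384 ] says each item comes back as a 384-value FP32 vector.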