How to serialize WAV file input for SageMaker CLI create-transform-job?


I'm an AWS noob, so please bear with me. My objective is to run inference on a Whisper model. I'd like the run to be triggered either automatically whenever a new file lands in a bucket, or manually from the command line.

I've used this tutorial that uses Python SDK syntax as a starting point. I saved the model to S3, and ran the following commands to prep for inference:

aws sagemaker create-model ....
aws sagemaker create-endpoint-config ...
aws sagemaker create-endpoint ...

After saving some WAV files in an input directory and creating an output directory, I run the following:

aws sagemaker create-transform-job --transform-job-name test-predict \
        --transform-resources "InstanceType"="ml.m5.2xlarge","InstanceCount"=1 \
        --model-name whisper-model \
        --transform-input DataSource={S3DataSource={S3DataType=S3Prefix,S3Uri=s3://sagemaker-us-east-1-myawsid/data}},ContentType="audio/wav" \
        --transform-output "S3OutputPath=s3://sagemaker-us-east-1-myawsid/output/"

Referring back to the tutorial, I'd expect this to fail because the WAV files need to be serialized, and indeed I see this error in the logs:

Content type audio/wav is not supported by this framework.
	2024-02-14T12:58:14.234-05:00	2024-02-14T17:58:13.638:[sagemaker logs]: sagemaker-us-east-1-/data/003.wav:
	2024-02-14T12:58:14.234-05:00	2024-02-14T17:58:13.638:[sagemaker logs]: sagemaker-us-east-1-/data/003.wav: Please implement input_fn to deserialize the request data or an output_fn to
	2024-02-14T12:58:14.234-05:00	2024-02-14T17:58:13.638:[sagemaker logs]: sagemaker-us-east-1-/data/003.wav: serialize the response. For more information, see the SageMaker Python SDK README.

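For reference, my understanding is that the error is asking for a custom inference script with an `input_fn` override. A minimal sketch of what that might look like (the function name and signature follow the SageMaker inference toolkit convention; the `code/inference.py` placement and the pass-through decoding strategy are my assumptions, not something confirmed by the tutorial):

```python
# Hypothetical code/inference.py sketch: the SageMaker inference toolkit
# calls input_fn(request_body, request_content_type) with the raw bytes
# of each request before handing the result to the prediction step.
def input_fn(request_body, request_content_type):
    if request_content_type == "audio/wav":
        # Pass the raw WAV bytes through unchanged; the prediction step is
        # assumed to handle the actual audio decoding.
        return {"inputs": bytes(request_body)}
    raise ValueError(f"Unsupported content type: {request_content_type}")
```

The script would then be packaged alongside the model so the container picks it up instead of its default deserializer.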
What is the best way to serialize these files and feed them in for inference? I'm not tied to the create-transform-job syntax, but I'd really prefer not to use the Python SDK for the MLOps part of the code. If I must use the Python SDK, please suggest a syntax that would let me separate the prediction steps from the MLOps code (perhaps a SageMaker Pipeline is what I need).

Also, I saw an alternative AWS-sourced way of using the Whisper model (instead of HuggingFace), but it seemed more cumbersome to do with CLI syntax than the HuggingFace approach I used. Taking a step back from this approach altogether, maybe it's more efficient for me to use AWS Lambda functions? The use case seems appropriate, since I want to run the transcription whenever a new file is loaded to a bucket.
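To make the Lambda idea concrete, here is a rough sketch of the handler I have in mind; the endpoint name `whisper-endpoint` is hypothetical, and the `invoke_endpoint` wiring is an assumption on my part, not tested:

```python
# Hypothetical Lambda handler: triggered by an S3 ObjectCreated event,
# it would send the newly uploaded WAV file to a SageMaker endpoint.
def parse_s3_event(event):
    """Extract bucket name and object key from a standard S3 event record."""
    record = event["Records"][0]["s3"]
    return record["bucket"]["name"], record["object"]["key"]

def handler(event, context):
    bucket, key = parse_s3_event(event)
    # Assumed wiring: fetch the object and invoke the endpoint.
    import boto3  # imported here so parse_s3_event stays dependency-free
    s3 = boto3.client("s3")
    runtime = boto3.client("sagemaker-runtime")
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    response = runtime.invoke_endpoint(
        EndpointName="whisper-endpoint",  # hypothetical endpoint name
        ContentType="audio/wav",
        Body=body,
    )
    return response["Body"].read()
```

This would still hit the same `audio/wav` content-type issue on the endpoint side, of course, so it's complementary to the serialization question rather than a replacement for it.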

Thanks

Asked 3 months ago · 88 views
No answers
