AWS Transcribe Medical BadRequestException: Your stream is too big. Reduce the frame size and try your request again.


I'm sending an audio file through an Amazon Kinesis Video Streams (KVS) stream to Amazon Transcribe Medical. I can get the media from KVS, send it to Transcribe Medical, and even receive a few TranscriptEvents.

Then the following error occurs:

2023-02-04T05:10:00.173Z	d0ee90a1-a61f-44f9-826a-fd2c4977425b	ERROR	Error processing transcribe stream. SessionId:  e484b9b6-1657-45d0-bf3d-12c3737e6f65 {
    "name": "BadRequestException",
    "$fault": "client",
    "$metadata": {},
    "message": "Your stream is too big. Reduce the frame size and try your request again."
}

The file being sent through KVS is created with:

    const audio = ffmpeg(uri)
        .format('matroska')
        // '-ar 8000' only works for Transcribe;
        // Transcribe Medical needs 16,000 Hz to 48,000 Hz
        .outputOptions(['-ar 16000', '-acodec pcm_s16le']);

The only changes to my previously working code:

  • Changed from StartStreamTranscriptionCommand to StartMedicalStreamTranscriptionCommand
  • Set StartMedicalStreamTranscriptionCommandInput to:

    {
        LanguageCode: 'en-US',
        MediaEncoding: 'pcm',
        AudioStream: audioStream(),

        // changes for Transcribe Medical
        MediaSampleRateHertz: 16000, // was 8000
        Specialty: 'CARDIOLOGY',
        Type: 'CONVERSATION',
    }

What could be causing this?

Using AWS SDK for JavaScript v3

2 Answers

Hi rain-mtucker,

With the Transcribe JavaScript SDK, if you see the error 'The chunk is too big', you can solve it by lowering the highWaterMark. The example below, which sets the highWaterMark, is from the Transcribe JavaScript SDK documentation.

const { PassThrough } = require("stream");
const { createReadStream } = require("fs");
const audioSource = createReadStream("path/to/speech.wav");
const audioPayloadStream = new PassThrough({ highWaterMark: 1 * 1024 }); // Stream chunk less than 1 KB
audioSource.pipe(audioPayloadStream);
const audioStream = async function* () {
  for await (const payloadChunk of audioPayloadStream) {
    yield { AudioEvent: { AudioChunk: payloadChunk } };
  }
};
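
If lowering highWaterMark alone does not help, it may be worth capping the chunk size explicitly. In Node.js, highWaterMark on a PassThrough sets the buffering threshold for backpressure; it does not split chunks that are written to it, so a large chunk from upstream can pass through unchanged. The following is a minimal sketch, not the SDK's own method: splitBuffer, limitedAudioStream, and the 1024-byte cap are hypothetical names and values chosen for illustration.

```javascript
// Hypothetical cap for illustration; not an official service limit.
const MAX_CHUNK_BYTES = 1024;

// Split one Buffer into slices of at most maxBytes each (views, no copying).
function splitBuffer(buf, maxBytes = MAX_CHUNK_BYTES) {
  const pieces = [];
  for (let offset = 0; offset < buf.length; offset += maxBytes) {
    pieces.push(buf.subarray(offset, offset + maxBytes));
  }
  return pieces;
}

// Wrap any async-iterable source of Buffers (for example a Readable stream)
// so that every AudioEvent carries at most maxBytes of audio.
async function* limitedAudioStream(source, maxBytes = MAX_CHUNK_BYTES) {
  for await (const payloadChunk of source) {
    for (const piece of splitBuffer(payloadChunk, maxBytes)) {
      yield { AudioEvent: { AudioChunk: piece } };
    }
  }
}
```

It would be used in place of the generator above, for example AudioStream: limitedAudioStream(audioPayloadStream).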
AWS
answered a year ago
  • The highWaterMark value was set to 128. I reduced it to 64 and it still gives the error:

    2023-02-04T17:31:04.829Z	77e692ab-1c41-46b9-99de-4e5fd12363ae	ERROR	Error processing transcribe stream. SessionId:  87997632-5d42-425e-b6ab-6cad9e9f0aff {
        "name": "BadRequestException",
        "$fault": "client",
        "$metadata": {},
        "message": "Your stream is too big. Reduce the frame size and try your request again."
    }
    

    The JavaScript demo client is putting the audio file on a Kinesis Video Streams stream and the backend is getting the audio and passing it to Transcribe streaming. I'm doing this to save the audio stream in a file in the backend. In the end, there will be various clients/systems with microphones that will send audio to KVS.

    Could something in that portion be causing the error?

    I can see that all fragments from KVS are being read, but the transcription stops after a few TranscriptEvents. I tried changing highWaterMark to 1024, but I get the same error.

  • Here is the Lambda code that reads audio fragments from KVS and sends them to a combined stream for Transcribe & audio file:

      import Block from 'block-stream2';
      const audioStream = new Block(2);
    
      const combinedStream = new PassThrough();
      const combinedStreamBlock = new Block(2);
      combinedStream.pipe(combinedStreamBlock);
      combinedStreamBlock.on('data', (chunk) => {
        // send to transcribe
        transcribePassthroughStream.write(chunk);
    
        // save to tmp file
        writeRecordingStream.write(chunk);
      });
    
      // audioStream comes from KVS
      audioStream.pipe(combinedStream);
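
To check what actually reaches Transcribe, one option is to record the size of every chunk inside the 'data' handler before writing it on. This is only a diagnostic sketch; createChunkStats and its fields are hypothetical names, not part of any AWS or block-stream2 API.

```javascript
// Tracks how many chunks were seen, their total size, and the largest one.
function createChunkStats() {
  const stats = { count: 0, totalBytes: 0, maxBytes: 0 };
  return {
    stats,
    // Record one chunk and return it unchanged so calls can be inlined.
    record(chunk) {
      stats.count += 1;
      stats.totalBytes += chunk.length;
      stats.maxBytes = Math.max(stats.maxBytes, chunk.length);
      return chunk;
    },
  };
}

// Possible use inside the handler above:
//   const tracker = createChunkStats();
//   combinedStreamBlock.on('data', (chunk) => {
//     transcribePassthroughStream.write(tracker.record(chunk));
//   });
//   // later: console.log(tracker.stats);
```

Logging tracker.stats when the error occurs shows whether any single chunk is larger than expected, or whether the chunks are small and the problem lies elsewhere.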
    
    

Assigning thread_queue_size to our ffmpeg stream in start.sh fixed this bug.

ffmpeg -loglevel $loglevel -thread_queue_size 1024 -re -sn -i $inputb -c:v copy -c:a copy -f flv - | flv+srt - transcript_fifo - | ffmpeg -loglevel $loglevel -thread_queue_size 1024 -y -i - -c:v copy -c:a copy -metadata:s:s:0 language=eng -f $format $output & 
answered a year ago
