Hi rain-mtucker,
With the Transcribe JavaScript SDK, the error 'The chunk is too big' can usually be resolved by lowering the highWaterMark. The example below, which shows where highWaterMark is set, is from the Transcribe JavaScript SDK documentation:
```javascript
const { PassThrough } = require("stream");
const { createReadStream } = require("fs");

const audioSource = createReadStream("path/to/speech.wav");
const audioPayloadStream = new PassThrough({ highWaterMark: 1 * 1024 }); // Stream chunk less than 1 KB
audioSource.pipe(audioPayloadStream);

const audioStream = async function* () {
  for await (const payloadChunk of audioPayloadStream) {
    yield { AudioEvent: { AudioChunk: payloadChunk } };
  }
};
```
The `highWaterMark` value was set to 128 and I reduced it to 64, but it still gives the error:

```
2023-02-04T17:31:04.829Z 77e692ab-1c41-46b9-99de-4e5fd12363ae ERROR Error processing transcribe stream. SessionId: 87997632-5d42-425e-b6ab-6cad9e9f0aff
{
  "name": "BadRequestException",
  "$fault": "client",
  "$metadata": {},
  "message": "Your stream is too big. Reduce the frame size and try your request again."
}
```
The JavaScript demo client puts the audio file on a Kinesis Video Streams stream, and the backend reads the audio and passes it to Transcribe streaming. I'm doing this so the backend can also save the audio stream to a file. Eventually there will be various clients/systems with microphones sending audio to KVS.
Is there something with that portion that could be resulting in the error?
I can see that all fragments from KVS are being read, but the transcription stops after a few TranscriptEvents. I tried changing highWaterMark to 1024, but got the same error.
Here is the Lambda code that reads audio fragments from KVS and sends them to a combined stream for Transcribe & audio file:
```javascript
import Block from 'block-stream2';

const audioStream = new Block(2);
const combinedStream = new PassThrough();
const combinedStreamBlock = new Block(2);

combinedStream.pipe(combinedStreamBlock);
combinedStreamBlock.on('data', (chunk) => {
  // send to transcribe
  transcribePassthroughStream.write(chunk);
  // save to tmp file
  writeRecordingStream.write(chunk);
});

// audioStream comes from KVS
audioStream.pipe(combinedStream);
```
Assigning thread_queue_size to our ffmpeg stream in start.sh fixed this bug.
```shell
ffmpeg -loglevel $loglevel -thread_queue_size 1024 -re -sn -i $inputb -c:v copy -c:a copy -f flv - \
  | flv+srt - transcript_fifo - \
  | ffmpeg -loglevel $loglevel -thread_queue_size 1024 -y -i - -c:v copy -c:a copy -metadata:s:s:0 language=eng -f $format $output &
```
Also seeing this error when trying to stream through a Docker container to Transcribe: https://github.com/aws-samples/amazon-transcribe-streaming-live-closed-captions