AWS Transcribe, best use case

Hey all. Two quick questions. I have this sample code in my Lambda now:

const params = {
  LanguageCode: "en-US",
  Media: {
    MediaFileUri: "https://transcribe-demo.s3-eu-central-1.amazonaws.com/hello_world.wav",
  },
  MediaFormat: "wav", // must match the actual format of the media file (.wav here)
  TranscriptionJobName: `TranscriptionJob-${Date.now()}`,
  OutputBucketName: "transcribe-output-bucket",
};

const data = await transcribeClient.send(
  new StartTranscriptionJobCommand(params)
);

Question 1: The plan is to send a voice recording URL to an API Gateway endpoint, transcribe the recording, and then send the transcript to an API that summarizes it. I don't think AWS has a service for the summarization step, so I will probably try ChatGPT. Is it possible to transcribe directly from the URL, or do I first have to download the file to a bucket?

Question 2: Through the AWS Console I can let Transcribe auto-detect the language. Is this also possible with the JavaScript SDK? I don't see that option.

Thanks all. If you have any suggestions on how to implement this better, feel free to let me know.

1 Answer

Q1: Is it possible to transcribe directly from the URL, or do I first have to download it to a bucket?

A1: According to the documentation [1], Amazon Transcribe takes audio data either as a media file in an Amazon S3 bucket or as a media stream, and converts it to text. Transcribing media files stored in an S3 bucket is a batch transcription; transcribing a media stream is a streaming transcription. The two modes have different rules and requirements. StartTranscriptionJob is the batch API, so MediaFileUri has to point to an object in S3. It will not fetch an arbitrary external URL, which means the recording has to be copied into a bucket first. For the summarization step, you could also use SageMaker JumpStart to deploy an LLM [2], which is a fairly straightforward process.
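
As a rough illustration of the "copy it into a bucket first" step, here is a minimal sketch, not a definitive implementation. The function names describeRecording and stageRecording, the "incoming/" key prefix, and the injected putObject callback are all assumptions made for this example; the list of formats reflects the MediaFormat values documented for batch jobs.

```javascript
// Media formats accepted by batch transcription jobs (per the Transcribe docs).
const SUPPORTED_FORMATS = ["mp3", "mp4", "wav", "flac", "ogg", "amr", "webm", "m4a"];

// Derive an S3 object key and a MediaFormat value from the incoming recording URL.
// The "incoming/" prefix is an arbitrary choice for this sketch.
function describeRecording(recordingUrl) {
  const { pathname } = new URL(recordingUrl); // pathname excludes any query string
  const fileName = decodeURIComponent(pathname.split("/").pop() || "");
  const extension = fileName.includes(".")
    ? fileName.split(".").pop().toLowerCase()
    : "";
  if (!SUPPORTED_FORMATS.includes(extension)) {
    throw new Error(`Unsupported or missing media extension: "${extension}"`);
  }
  return { key: `incoming/${fileName}`, mediaFormat: extension };
}

// Download the recording and stage it in S3.
// `putObject` is injected (hypothetical callback) so the sketch stays free of
// SDK imports; `fetchImpl` defaults to the global fetch available in Node 18+.
async function stageRecording(recordingUrl, bucket, putObject, fetchImpl = fetch) {
  const { key, mediaFormat } = describeRecording(recordingUrl);
  const response = await fetchImpl(recordingUrl);
  if (!response.ok) {
    throw new Error(`Download failed with HTTP ${response.status}`);
  }
  const body = Buffer.from(await response.arrayBuffer());
  await putObject({ Bucket: bucket, Key: key, Body: body });
  // These two values can go straight into Media.MediaFileUri and MediaFormat.
  return { mediaFileUri: `s3://${bucket}/${key}`, mediaFormat };
}
```

In a Lambda, putObject could be something like (input) => s3Client.send(new PutObjectCommand(input)) using @aws-sdk/client-s3, with the function's role allowed to write to the bucket. Buffering the whole file in memory is only reasonable for short recordings.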

Q2: Through the AWS Console I can let Transcribe auto-detect the language. Is this also possible with the JavaScript SDK?

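A2: Yes. Set IdentifyLanguage to true and leave out LanguageCode; LanguageOptions can narrow the candidate languages. You can also attach a custom vocabulary per language through LanguageIdSettings. Take a look at this code snippet: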

import {
  StartTranscriptionJobCommand,
  TranscribeClient,
} from "@aws-sdk/client-transcribe";

const REGION = "us-east-1";
const BUCKET = "YOUR_BUCKET";
const KEY = "YOUR_FILE";

const transcribeClient = new TranscribeClient({ region: REGION });

// Short random suffix so repeated runs get distinct job names.
const random = (Math.random() + 1).toString(36).substring(7);
console.log("key = " + KEY);

export const params = {
  // Let Transcribe detect the language; do not set LanguageCode alongside this.
  IdentifyLanguage: true,
  // Optional: restrict detection to the languages you expect.
  LanguageOptions: ["en-US", "fr-FR"],
  // Optional: per-language settings; these custom vocabularies must already exist.
  LanguageIdSettings: {
    "en-US": {
      VocabularyName: "custom-vocab-en_US",
    },
    "fr-FR": {
      VocabularyName: "custom-vocab-fr_FR",
    },
  },
  Media: {
    // S3 URI of the media file, in the form documented for MediaFileUri.
    MediaFileUri: `s3://${BUCKET}/${KEY}`,
  },
  MediaFormat: "mp3", // must match the actual format of YOUR_FILE
  TranscriptionJobName: `Transcribe-Job-${random}`,
  OutputBucketName: "YOUR_BUCKET",
};

export const run = async () => {
  try {
    // Starts the job and returns immediately; the transcript is written
    // to OutputBucketName once the job completes.
    const data = await transcribeClient.send(
      new StartTranscriptionJobCommand(params)
    );
    console.log("Success - job started", data);
    return data;
  } catch (err) {
    console.error("Error", err);
  }
};
run();
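
Since a request carries either a fixed LanguageCode or IdentifyLanguage, one way to keep the two from being mixed up is a small builder. This is only a sketch under assumptions: buildTranscriptionParams and its option names are hypothetical and not part of the SDK; only the returned property names are StartTranscriptionJob parameters.

```javascript
// Build StartTranscriptionJob input with exactly one language strategy:
// a fixed LanguageCode, or automatic identification (IdentifyLanguage).
// The function and its option names are hypothetical helpers for this sketch.
function buildTranscriptionParams(options) {
  const {
    jobName,
    mediaFileUri,
    mediaFormat,
    outputBucket,
    languageCode,
    languageOptions,
  } = options;

  const params = {
    TranscriptionJobName: jobName,
    Media: { MediaFileUri: mediaFileUri },
    MediaFormat: mediaFormat,
    OutputBucketName: outputBucket,
  };

  if (languageCode) {
    // A fixed language and candidate languages for detection do not mix.
    if (languageOptions && languageOptions.length > 0) {
      throw new Error("Pass either languageCode or languageOptions, not both");
    }
    params.LanguageCode = languageCode;
    return params;
  }

  // No fixed language: ask Transcribe to identify it.
  params.IdentifyLanguage = true;
  if (languageOptions && languageOptions.length > 0) {
    params.LanguageOptions = [...languageOptions];
  }
  return params;
}
```

The result can be passed to new StartTranscriptionJobCommand(...) exactly like the params object above; calling it without languageCode gives the auto-detect behaviour asked about in Q2.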

Resources:

[1] - https://docs.aws.amazon.com/transcribe/latest/dg/how-input.html

[2] - https://docs.aws.amazon.com/sagemaker/latest/dg/studio-jumpstart.html

AWS
answered 9 months ago
