Transcribe is not identifying speakers correctly in a conversation


Here is the audio file I am transcribing. AWS Transcribe is not identifying the speakers in the conversation properly; it mixes up who said what.

https://dl.sndup.net/zf5r/Conversation.mp3

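```python
import json
import time

import boto3

# bucket_name, file_name, and job_name are assumed to be defined earlier.
transcribe_client = boto3.client('transcribe')
s3_client = boto3.client('s3')

response = transcribe_client.start_transcription_job(
    TranscriptionJobName=job_name,
    Media={'MediaFileUri': f's3://{bucket_name}/{file_name}'},
    MediaFormat='mp3',
    LanguageCode='en-US',
    OutputBucketName=bucket_name,
    Settings={
        'ShowSpeakerLabels': True,
        'MaxSpeakerLabels': 2
    }
)

# Wait for the job to finish; the transcript is not in S3 until then.
while True:
    job = transcribe_client.get_transcription_job(TranscriptionJobName=job_name)
    status = job['TranscriptionJob']['TranscriptionJobStatus']
    if status in ('COMPLETED', 'FAILED'):
        break
    time.sleep(10)
if status == 'FAILED':
    raise RuntimeError(f"Transcription job {job_name} failed")

# Load the transcript from S3.
transcript_key = f"{job_name}.json"
transcript_obj = s3_client.get_object(Bucket=bucket_name, Key=transcript_key)
transcript_text = transcript_obj['Body'].read().decode('utf-8')
transcript_json = json.loads(transcript_text)
output_text = ""
current_speaker = None

items = transcript_json['results']['items']

for item in items:
    # Punctuation items may carry no speaker label, hence .get().
    speaker_label = item.get('speaker_label')
    content = item['alternatives'][0]['content']

    # Start a new line with the speaker label when the speaker changes.
    if speaker_label is not None and speaker_label != current_speaker:
        current_speaker = speaker_label
        output_text += f"\n{current_speaker}: "

    # Punctuation attaches to the previous word, so drop the trailing space.
    if item['type'] == 'punctuation':
        output_text = output_text.rstrip()

    output_text += f"{content} "

# Save the transcript to a text file.
with open(f'{job_name}.txt', 'w', encoding='utf-8') as f:
    f.write(output_text)
```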

spk_0: Hello, thank you for calling airline customer support. My name is Sarah. How can I assist 
spk_1: you today? Hi, Sarah, I need to cancel my flight booking that I made last 
spk_0: week. I'm sorry to hear that you need to cancel your booking. May I have your mobile number to pull up your reservation? 
spk_1: Sure. It's 5551234567. 
spk_0: Thank you. Could you please verify your full name and the departure date of your flight for security purposes? 
spk_1: My name is John Smith and the departure date is February 5th. Thank you, 
spk_0: Mr Smith. Ive located your booking. I see that you are eligible for a refund. Would you like to proceed with the cancellation? Yes, please. All your booking has been canceled successfully. The refund will be processed within 24 hours to the original form of payment. Is there anything else I can assist you with today? No, 
spk_1: that's all. Thank you for your help. 
spk_0: You're welcome, Mister Smith. If you have any further questions or need assistance in the future, feel free to reach out to us. Have a great day.
Sushant
asked 2 months ago, 156 views
1 Answer

To extract the speaker-identified transcription text from the JSON output for the full audio file, you can use or modify the aws-transcribe-transcript Python script.

The steps would be:

1. Run your audio file through the Amazon Transcribe service to generate the JSON output file.
2. Use the aws-transcribe-transcript script to parse the JSON output. The script is available at https://github.com/trhr/aws-transcribe-transcript.
3. The script parses the JSON and writes a text file with the transcription organized by speaker, including speaker labels and timestamps.
4. Work with this text file, which contains the full transcription with speakers identified, rather than extracting it manually from the JSON.

You may need to modify the script to handle any special aspects of your JSON output, such as additional fields or different formatting.
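
As a rough illustration of the parsing step described above, here is a minimal sketch. It is not the code of the aws-transcribe-transcript script. It assumes the usual Transcribe output layout, where `results.items` holds the words and punctuation and `results.speaker_labels.segments` holds the per-word speaker assignments. The function name `transcript_to_turns` is hypothetical.

```python
def transcript_to_turns(transcript_json):
    """Group the words of a Transcribe result into (speaker, text) turns.

    Assumes the layout results.items / results.speaker_labels.segments.
    """
    results = transcript_json["results"]

    # Map each word's start_time to the speaker given in speaker_labels.
    speaker_by_start = {}
    for segment in results.get("speaker_labels", {}).get("segments", []):
        for word in segment.get("items", []):
            speaker_by_start[word["start_time"]] = word["speaker_label"]

    turns = []  # each entry is [speaker, text]
    for item in results["items"]:
        content = item["alternatives"][0]["content"]

        if item["type"] == "punctuation":
            # Punctuation has no timestamp; attach it to the current turn.
            if turns:
                turns[-1][1] += content
            continue

        # Prefer the speaker_labels mapping, fall back to the item's own label.
        speaker = speaker_by_start.get(
            item.get("start_time"), item.get("speaker_label")
        )

        if turns and turns[-1][0] == speaker:
            turns[-1][1] += " " + content
        else:
            turns.append([speaker, content])

    return [(speaker, text) for speaker, text in turns]
```

Calling `transcript_to_turns(transcript_json)` on the loaded JSON returns a list of `(speaker, text)` pairs that can be written out one per line. Note that this only regroups the labels Transcribe already assigned: if a word at a sentence boundary is attributed to the wrong speaker in the JSON, it will still appear under that speaker here.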

EXPERT
answered 2 months ago
