Duplicate transcription issue when using BDA API for videos longer than 120 minutes

I have a video stored in Amazon S3 that is 200 minutes long. Since the BDA API only supports up to 120 minutes, I process it in two parts:

First part: startTime = 00:00, endTime = 120:00 — works fine.

Second part: startTime = 120:01, endTime = 200:00 — chapters are generated, but the transcriptions are overlapping.

Example:

Chapter 1: 120:01 to 125:00

Chapter 2: 120:01 to 130:00 (contains same content as Chapter 1)

This duplication continues in further chapters. How can I split videos longer than 120 minutes without overlapping transcription content?

1 Answer

You appear to be getting overlapping transcriptions when processing a long video in segments. However, the sources listed below contain no specific information about a "BDA API" with a 120-minute limit, or about this particular overlap issue.

For processing long videos in AWS, you might consider these approaches:

  1. When segmenting videos, give the segments clear, non-overlapping boundaries. For example, if your first segment ends at 120:00, start the second segment at exactly 120:00 if the service treats the end time as exclusive, or at 120:01 if it treats the end time as inclusive. Check which convention your service uses.

  2. Consider using Amazon Rekognition's segmentation capabilities to properly partition your video into distinct segments. This would help ensure that scene changes are properly identified and processed separately.

  3. After processing individual segments, you may need to implement custom post-processing to merge the results and remove any duplicated content. This could involve checking for overlapping timestamps and merging or trimming content as needed.

  4. For transcript generation specifically, you might explore using Amazon Transcribe, which has capabilities for handling longer content and includes features like speaker partitioning that might help with organizing your transcription output.
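
Approaches 1 and 3 could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not the BDA API or its output format: the names `split_ranges`, `dedupe_chapters`, `Chapter`, and `Line` are hypothetical, all times are seconds from the start of the full video, segment ranges are treated as half-open (end exclusive), and each chapter is assumed to carry timestamped transcript lines.

```python
from dataclasses import dataclass, field


@dataclass
class Line:
    start: float  # seconds from the start of the full video (assumed)
    end: float
    text: str


@dataclass
class Chapter:
    start: float
    end: float
    lines: list[Line] = field(default_factory=list)


def split_ranges(total_s, max_s):
    """Cover [0, total_s) with consecutive half-open windows of at most max_s."""
    if max_s <= 0:
        raise ValueError("max_s must be positive")
    ranges = []
    start = 0
    while start < total_s:
        end = min(start + max_s, total_s)
        ranges.append((start, end))
        start = end  # the next window begins exactly where this one ended
    return ranges


def dedupe_chapters(chapters):
    """Clip each chapter so it begins where the previous kept chapter ended.

    Transcript lines that start before the clipped chapter start are dropped,
    and chapters that are fully covered by earlier ones are skipped.
    """
    result = []
    cursor = None  # end time of the last chapter kept so far
    for ch in sorted(chapters, key=lambda c: (c.start, c.end)):
        start = ch.start if cursor is None else max(ch.start, cursor)
        if start >= ch.end:
            continue  # nothing new in this chapter
        kept = [ln for ln in ch.lines if ln.start >= start]
        result.append(Chapter(start, ch.end, kept))
        cursor = ch.end
    return result
```

For a 200-minute video with a 120-minute limit, `split_ranges(200 * 60, 120 * 60)` returns `[(0, 7200), (7200, 12000)]`, so the second request starts exactly where the first ended. Applied to the chapters in the question, `dedupe_chapters` would keep Chapter 1 as is and clip Chapter 2 to start at 125:00, provided the transcript lines carry their own timestamps.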

Without more specific information about the "BDA API" you're referring to, I can't provide targeted guidance for that particular service's limitations.
Sources
Exploring accessible audio descriptions with Amazon Nova | Artificial Intelligence
Media2Cloud on AWS Guidance: Scene and ad-break detection and contextual understanding for advertising using generative AI | AWS for M&E Blog
Enabling speaker partitioning - Amazon Transcribe

answered 10 months ago
