AWS Rekognition: slate detection (technical cue) on stored videos

Hi there,

I can't get slate detection working when using AWS Rekognition's Segment Detection to find technical cues in videos stored on S3. The job only ever finds Content technical cues, never the Slate at the beginning of the clip. I've tested with various clips that start with a clapperboard. I'm unsure if this is a config problem, a region problem, an input source problem, or if I'm just misunderstanding what the model can do. Any ideas?

Region: eu-west-1 (Ireland, which should be supported judging by this blog post).

Extract from job:

        const labelDetectionResponse = await rekClient.send(new StartSegmentDetectionCommand({
            ...
            SegmentTypes: ["TECHNICAL_CUE"],
            ...
        }));

Extracts from response:

    types: [ { ModelVersion: '2.0', Type: 'TECHNICAL_CUE' } ]
    JobStatus: 'SUCCEEDED',
    Segments: [
      {
        DurationFrames: 1140,
        DurationMillis: 47500,
        DurationSMPTE: '00:00:47:12',
        EndFrameNumber: 1140,
        EndTimecodeSMPTE: '00:00:47:12',
        EndTimestampMillis: 47500,
        StartFrameNumber: 0,
        StartTimecodeSMPTE: '00:00:00:00',
        StartTimestampMillis: 0,
        TechnicalCueSegment: {
          Confidence: 100, Type: 'Content'
        },
        Type: 'TECHNICAL_CUE'
      }
    ],

The above seems to show that it's including the slate/clapperboard in the Content; the clapperboard is clearly present at the '00:00:00:00' timecode.
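For reference, here is a minimal sketch of how the cue types in a response like the one above could be tallied, to confirm that no Slate cue came back at all. The helper name `countTechnicalCueTypes` is made up for illustration, and it assumes the response has the same shape as the extract (a `Segments` array whose entries carry `Type` and `TechnicalCueSegment.Type`).

```javascript
// Tally technical cue types in a segment detection response.
// Assumes the response shape shown in the extract above; the function
// name is hypothetical and not part of the AWS SDK.
function countTechnicalCueTypes(response) {
  const counts = {};
  for (const segment of response.Segments ?? []) {
    // Skip anything that is not a technical cue (e.g. SHOT segments).
    if (segment.Type !== "TECHNICAL_CUE") continue;
    const cueType = segment.TechnicalCueSegment?.Type ?? "Unknown";
    counts[cueType] = (counts[cueType] ?? 0) + 1;
  }
  return counts;
}

const sampleResponse = {
  Segments: [
    { Type: "TECHNICAL_CUE", TechnicalCueSegment: { Confidence: 100, Type: "Content" } },
  ],
};
console.log(countTechnicalCueTypes(sampleResponse)); // { Content: 1 }
```

With my clips the tally only ever contains `Content`, which matches the single 47.5-second segment shown above.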

I can't share the clips, I'm afraid!

Edit: Azure’s Video Indexer finds the clapperboard on the same clips. Their AI model isn’t doing brilliantly with parsing the data (yet), though. I’d love to compare it to AWS’ offering if only I could get it to work.

Thanks, Graham

Graham
asked 9 months ago, 207 views
1 Answer
Hello Graham,

Good question. This happens because Rekognition's model defines a slate as an essentially blank screen carrying text or metadata, not as a clapperboard in the shot. Sorry for the confusion: the segment detection models aren't trained to recognize objects in the shot, and perhaps the documentation could be more explicit about this.

As a workaround, you could identify clapperboards with the StartLabelDetection API.

const labelDetectionResponse = await rekClient.send(new StartLabelDetectionCommand({
  ...
  Settings: {
    GeneralLabels: {
      LabelInclusionFilters: [
        "Clapperboard",
      ],
    },
  },
  ...
}));

This will give you a result that looks something like this:

{
  "JobStatus": "SUCCEEDED",
  "VideoMetadata": {
    ...
  },
  "Labels": [
    {
      "Timestamp": 0,
      "Label": {
        "Name": "Clapperboard",
        "Confidence": 90.0569076538086,
        "Instances": [
          {
            "BoundingBox": {
              "Width": 0.3286278247833252,
              "Height": 0.7593286037445068,
              "Left": 0.262031614780426,
              "Top": 0.21159084141254425
            },
            "Confidence": 90.02198028564453
          }
        ],
        "Parents": [],
        "Aliases": [],
        "Categories": [
          {
            "Name": "Hobbies and Interests"
          }
        ]
      }
    },
... // more detections for other timestamps
  ]
}

These are frame-based detections, not segment-based, so you will get a separate detection for every frame in which the clapperboard is detected.
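If you want something closer to a segment, you could merge those per-frame detections into time ranges yourself. The following is only a rough sketch: the function name `groupDetectionsIntoSegments` and the `maxGapMillis` threshold are assumptions for illustration, not part of the Rekognition API, and the threshold would need tuning to however far apart the timestamps in your results actually are.

```javascript
// Merge per-frame label detections into contiguous time ranges.
// `labels` is assumed to be the "Labels" array from the result above.
// `maxGapMillis` is a hypothetical tuning knob: detections further apart
// than this start a new range.
function groupDetectionsIntoSegments(labels, labelName, maxGapMillis = 1000) {
  const timestamps = labels
    .filter((entry) => entry.Label?.Name === labelName)
    .map((entry) => entry.Timestamp)
    .sort((a, b) => a - b);

  const segments = [];
  for (const timestamp of timestamps) {
    const current = segments[segments.length - 1];
    if (current && timestamp - current.endMillis <= maxGapMillis) {
      current.endMillis = timestamp; // close enough: extend the open range
    } else {
      segments.push({ startMillis: timestamp, endMillis: timestamp });
    }
  }
  return segments;
}

const exampleLabels = [
  { Timestamp: 0, Label: { Name: "Clapperboard" } },
  { Timestamp: 500, Label: { Name: "Clapperboard" } },
  { Timestamp: 1000, Label: { Name: "Clapperboard" } },
  { Timestamp: 9000, Label: { Name: "Clapperboard" } },
];
// Yields two ranges: 0 to 1000 ms, and a single-frame range at 9000 ms.
console.log(groupDetectionsIntoSegments(exampleLabels, "Clapperboard"));
```

A range that starts at or near timestamp 0 would then be a reasonable candidate for the clapperboard at the head of the clip.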

Best, Lucas Jarman, Rekognition Video

AWS
answered 9 months ago