AWS Rekognition: slate detection (technical cue) on stored videos


Hi there,

I'm failing to get slate detection working whilst using AWS Rekognition's Segment Detection to find technical cues on videos stored on S3. The process only ever finds Content technical cues, but never the Slate at the beginning of the clip. I've tested with various clips that start with a clapperboard. I'm unsure if this is a config problem, a region problem, an input source problem, or if I'm just misunderstanding what the model can do. Any ideas?

Region: eu-west-1 (Ireland, which should be supported judging by this blog post).

Extract from job:

        const labelDetectionResponse = await rekClient.send(new StartSegmentDetectionCommand({
            ...
            SegmentTypes: ["TECHNICAL_CUE"],
            ...
        }));

Extracts from response:

types: [ { ModelVersion: '2.0', Type: 'TECHNICAL_CUE' } ]
    JobStatus: 'SUCCEEDED',
    Segments: [ 
    {
      DurationFrames: 1140,
      DurationMillis: 47500,
      DurationSMPTE: '00:00:47:12',
      EndFrameNumber: 1140,
      EndTimecodeSMPTE: '00:00:47:12',
      EndTimestampMillis: 47500,
      StartFrameNumber: 0,
      StartTimecodeSMPTE: '00:00:00:00',
      StartTimestampMillis: 0,
      TechnicalCueSegment: {
         Confidence: 100, Type: 'Content'
      },
      Type: 'TECHNICAL_CUE'
    }
    ],

The above seems to show that it's including the slate/clapperboard in the Content; the clapperboard is clearly present at the '00:00:00:00' timecode.

I can't share the clips, I'm afraid!

Edit: Azure’s Video Indexer finds the clapperboard on the same clips. Their AI model isn’t doing brilliantly with parsing the data (yet), though. I’d love to compare it to AWS’ offering if only I could get it to work.

Thanks, Graham

Graham
asked 9 months ago, 199 views
1 Answer

Hello Graham,

Good question. This happens because Rekognition's model defines slates as essentially blank screens with text/metadata, not as a clapperboard in the shot. Sorry for the confusion: the segment detection models aren't trained to recognize objects in the shot, and perhaps the documentation could be more explicit about this.

As a workaround, you could identify clapperboards with the StartLabelDetection API.

const labelDetectionResponse = await rekClient.send(new StartLabelDetectionCommand({
  ...
    Settings: {
      GeneralLabels: { 
        LabelInclusionFilters: [
          "Clapperboard",
        ],
      }
    }
  ...
}));

This will give you a result that looks something like this:

{
  "JobStatus": "SUCCEEDED",
  "VideoMetadata": {
    ...
  },
  "Labels": [
    {
      "Timestamp": 0,
      "Label": {
        "Name": "Clapperboard",
        "Confidence": 90.0569076538086,
        "Instances": [
          {
            "BoundingBox": {
              "Width": 0.3286278247833252,
              "Height": 0.7593286037445068,
              "Left": 0.262031614780426,
              "Top": 0.21159084141254425
            },
            "Confidence": 90.02198028564453
          }
        ],
        "Parents": [],
        "Aliases": [],
        "Categories": [
          {
            "Name": "Hobbies and Interests"
          }
        ]
      }
    },
... // more detections for other timestamps
  ]
}

These are frame-based detections, not segment-based, so you will get a separate detection for each frame where the clapperboard is detected.
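If you need something closer to a segment (a start and end time for each clapperboard appearance), one option is to merge consecutive detections yourself. The following is only a minimal sketch: `groupIntoSegments`, `LabelSegment` and the `maxGapMillis` threshold are hypothetical names made up for illustration and are not part of the AWS SDK, and the input type covers only the `Timestamp`, `Name` and `Confidence` fields from the `Labels` array shown above.

```typescript
// Minimal shape of one entry in the "Labels" array of the response.
interface LabelDetection {
  Timestamp: number; // milliseconds from the start of the video
  Label: { Name: string; Confidence: number };
}

// Hypothetical output type: one merged run of detections.
interface LabelSegment {
  startMillis: number;
  endMillis: number;
  maxConfidence: number;
}

// Hypothetical helper (not an SDK function): merges detections of one label
// whose timestamps are at most maxGapMillis apart into a single segment.
function groupIntoSegments(
  detections: LabelDetection[],
  labelName: string,
  maxGapMillis: number,
): LabelSegment[] {
  // filter() returns a new array, so sorting does not mutate the input.
  const sorted = detections
    .filter((d) => d.Label.Name === labelName)
    .sort((a, b) => a.Timestamp - b.Timestamp);

  const segments: LabelSegment[] = [];
  for (const d of sorted) {
    const last = segments[segments.length - 1];
    if (last !== undefined && d.Timestamp - last.endMillis <= maxGapMillis) {
      // Close enough to the previous detection: extend the current segment.
      last.endMillis = d.Timestamp;
      last.maxConfidence = Math.max(last.maxConfidence, d.Label.Confidence);
    } else {
      // Gap too large (or first detection): start a new segment.
      segments.push({
        startMillis: d.Timestamp,
        endMillis: d.Timestamp,
        maxConfidence: d.Label.Confidence,
      });
    }
  }
  return segments;
}
```

You would call it as `groupIntoSegments(labels, "Clapperboard", 1000)` once you have collected the `Labels` entries from the job result. The 1000 ms gap is an arbitrary assumption; pick a value based on the spacing between timestamps you actually see in your results.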

Best, Lucas Jarman, Rekognition Video

AWS
answered 9 months ago
