Textract + HITL request: HumanInTheLoop Mechanical Turk job fails with error: ValidationError

I was testing the latency of Textract + HITL via Mechanical Turk. I created a flow definition and a human loop config, then submitted a request on a document using:

import boto3

textract = boto3.client('textract')

# flowDefinitionArn and human_loop_unique_id are defined earlier (not shown)
humanLoopConfig = {
    'FlowDefinitionArn': flowDefinitionArn,
    'HumanLoopName': human_loop_unique_id,
    'DataAttributes': {'ContentClassifiers': ['FreeOfPersonallyIdentifiableInformation']}
}

def analyze_document_with_a2i(document_name, bucket):
    # Run Textract tables/forms analysis and attach the human loop config
    response = textract.analyze_document(
        Document={'S3Object': {'Bucket': bucket, 'Name': document_name}},
        FeatureTypes=["TABLES", "FORMS"],
        HumanLoopConfig=humanLoopConfig
    )
    return response

While the response was fine, when I list all the human loops related to a flow definition, I get this response:

{'HumanLoopName': 'human-loop-id-here', 'HumanLoopStatus': 'Failed', 'CreationTime': datetime.datetime(2023, 6, 20, 8, 29, 35, 606000, tzinfo=tzlocal()), 'FailureReason': 'ValidationError', 'FlowDefinitionArn': 'flow-definition-arn-here'}

How do I figure out what the exact issue is?
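
For reference, a minimal sketch of one way to pull a little more detail out of the failed loops than the list call shows. It assumes the boto3 `sagemaker-a2i-runtime` client, whose `list_human_loops` and `describe_human_loop` calls return a `FailureReason` and, where available, a `FailureCode`. The helper name `describe_failed_loops` is hypothetical, and the extra fields may still not pinpoint the cause.

```python
def describe_failed_loops(a2i_client, flow_definition_arn):
    """Collect failure details for every failed human loop of a flow definition.

    a2i_client is assumed to be boto3.client('sagemaker-a2i-runtime')
    or any object exposing the same two methods.
    """
    failures = []
    kwargs = {"FlowDefinitionArn": flow_definition_arn}
    while True:
        page = a2i_client.list_human_loops(**kwargs)
        for summary in page.get("HumanLoopSummaries", []):
            if summary.get("HumanLoopStatus") != "Failed":
                continue
            # describe_human_loop may carry a FailureCode next to the reason
            detail = a2i_client.describe_human_loop(
                HumanLoopName=summary["HumanLoopName"]
            )
            failures.append({
                "HumanLoopName": summary["HumanLoopName"],
                "FailureReason": detail.get("FailureReason"),
                "FailureCode": detail.get("FailureCode"),
            })
        # Follow pagination until no NextToken is returned
        token = page.get("NextToken")
        if not token:
            return failures
        kwargs["NextToken"] = token
```

With boto3 installed and credentials configured, this could be called as `describe_failed_loops(boto3.client('sagemaker-a2i-runtime'), flowDefinitionArn)`.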

[EDITED]

FLOW DEFINITION FUNCTION:

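import json
import boto3

sagemaker = boto3.client('sagemaker')

# ROLE, WORKTEAM_ARN, humanTaskUiArn and OUTPUT_PATH are defined elsewhere (not shown)
def create_flow_definition(flow_definition_name):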
    '''
    Creates a Flow Definition resource

    Returns:
    struct: FlowDefinitionArn
    '''
    # Visit https://docs.aws.amazon.com/sagemaker/latest/dg/a2i-human-fallback-conditions-json-schema.html for more information on this schema.
    humanLoopActivationConditions = json.dumps(
        {
            "Conditions": [
                {
                  "Or": [

                    {
                        "ConditionType": "ImportantFormKeyConfidenceCheck",
                        "ConditionParameters": {
                            "ImportantFormKey": "Total",
                            "ImportantFormKeyAliases": ["Total Amount", "Amount Payable", "Balance Due"],
                            "KeyValueBlockConfidenceLessThan": 100,
                            "WordBlockConfidenceLessThan": 100
                        }
                    },
                    {
                        "ConditionType": "MissingImportantFormKey",
                        "ConditionParameters": {
                            "ImportantFormKey": "Vendor Name",
                            "ImportantFormKeyAliases": ["Vendor Name", "Remit to:", "Remit payment to"]
                        }
                    },
                    {
                        "ConditionType": "ImportantFormKeyConfidenceCheck",
                        "ConditionParameters": {
                            "ImportantFormKey": "Phone Number",
                            "ImportantFormKeyAliases": ["Phone number:", "Phone No.:", "Number:"],
                            "KeyValueBlockConfidenceLessThan": 100,
                            "WordBlockConfidenceLessThan": 100
                        }
                    },
                    {
                        "ConditionType": "ImportantFormKeyConfidenceCheck",
                        "ConditionParameters": {
                            "ImportantFormKey": "*",
                            "KeyValueBlockConfidenceLessThan": 100,
                            "WordBlockConfidenceLessThan": 100
                        }
                    },
                    {
                        "ConditionType": "ImportantFormKeyConfidenceCheck",
                        "ConditionParameters": {
                            "ImportantFormKey": "*",
                            "KeyValueBlockConfidenceGreaterThan": 0,
                            "WordBlockConfidenceGreaterThan": 0
                        }
                    }
                  ]
                }
            ]
        }
    )

    response = sagemaker.create_flow_definition(
        FlowDefinitionName=flow_definition_name,
        RoleArn=ROLE,
        HumanLoopConfig={
            "WorkteamArn": WORKTEAM_ARN,
            "PublicWorkforceTaskPrice": {
                "AmountInUsd": {
                    "Cents": 1,
                    "Dollars": 0,
                    "TenthFractionsOfACent": 2
                }
            },
            "HumanTaskUiArn": humanTaskUiArn,
            "TaskCount": 1,
            "TaskDescription": "Document analysis sample task description",
            "TaskTitle": "Document analysis sample task"
        },
        HumanLoopRequestSource={
            "AwsManagedHumanLoopRequestSource": "AWS/Textract/AnalyzeDocument/Forms/V1"
        },
        HumanLoopActivationConfig={
            "HumanLoopActivationConditionsConfig": {
                "HumanLoopActivationConditions": humanLoopActivationConditions
            }
        },
        OutputConfig={
            "S3OutputPath": OUTPUT_PATH
        }
    )

    return response['FlowDefinitionArn']

Nirali
asked a year ago, 281 views
1 Answer

Hi, it may be because of the flow definition configuration. Could you provide more details here so we can reproduce the issue?

AWS
answered a year ago
  • Hi, sure! Since comments have a 600 character limit, I edited my question and added the function I used to create a flow definition. Hope this helps!

    But, if there was an issue with the flow definition configuration, why was it created successfully and active? Any idea?

  • Any updates here? Can anyone point me in the right direction?
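
Since the flow definition was created without complaint, a local check cannot say why the loop failed, but it can at least rule out structural slips in the activation conditions before they are submitted. The sketch below is one possible sanity check, not an official validator: the helper name `check_activation_conditions` is hypothetical, and the sets of condition types and logical operators are assumptions taken from the schema page linked in the question's code, so they may be incomplete.

```python
import json

# Assumed from the A2I activation-conditions schema for Textract; may be incomplete
KNOWN_CONDITION_TYPES = {
    "ImportantFormKeyConfidenceCheck",
    "MissingImportantFormKey",
    "Sampling",
}
LOGICAL_OPERATORS = {"And", "Or", "Not"}


def check_activation_conditions(conditions_json):
    """Return a list of structural problems found in the conditions JSON string."""
    try:
        doc = json.loads(conditions_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]

    conditions = doc.get("Conditions") if isinstance(doc, dict) else None
    if not isinstance(conditions, list) or not conditions:
        return ['top level must be {"Conditions": [...]} with at least one entry']

    problems = []

    def walk(node, path):
        if not isinstance(node, dict):
            problems.append(f"{path}: expected an object")
            return
        operators = LOGICAL_OPERATORS & node.keys()
        if operators:
            # Recurse into nested And/Or lists or a single Not operand
            for op in sorted(operators):
                value = node[op]
                children = value if isinstance(value, list) else [value]
                for i, child in enumerate(children):
                    walk(child, f"{path}.{op}[{i}]")
            return
        ctype = node.get("ConditionType")
        if ctype not in KNOWN_CONDITION_TYPES:
            problems.append(f"{path}: unknown ConditionType {ctype!r}")
        if not isinstance(node.get("ConditionParameters"), dict):
            problems.append(f"{path}: ConditionParameters must be an object")

    for i, condition in enumerate(conditions):
        walk(condition, f"Conditions[{i}]")
    return problems
```

Calling `check_activation_conditions(humanLoopActivationConditions)` before `create_flow_definition` returns an empty list when nothing structural stands out.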
