Textract: is it possible that the Analyze Document demo and the regular API are not aligned?

0

Hello all,

I cannot post a sample document here, but let's say I'm working with PDF invoices. Today I found a couple of oddballs where my scripts that consume Textract consistently fail to return one structured table (out of 3; the other 2 are semi-structured) present in the document. I've tried everything, including calling boto3 analyze_document stripped of everything but the 'TABLES' feature, and I only get 2 tables for these invoices. If I put the same invoices through the Analyze Document demo (https://us-east-1.console.aws.amazon.com/textract/home?region=us-east-1#/demo), I always get the 3 tables present in the invoices, as expected. So I was thinking that maybe the demo is not actually consuming the same service that I am through scripting. Is that even a possibility? I can't think of any other explanation.

Thanks!
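
For reference, the synchronous call described above could look roughly like the sketch below. The helper name `count_tables_sync` is hypothetical and not from the original scripts; the client is passed in so that it can be created with `boto3.client("textract")` in real use. It only counts the `TABLE` blocks in the response, which is enough to reproduce the "2 tables instead of 3" observation.

```python
def count_tables_sync(client, pdf_bytes):
    """Call the synchronous Textract API with only the TABLES feature.

    `client` is expected to be a Textract client, for example
    boto3.client("textract"). `pdf_bytes` is the raw content of a
    single-page PDF (an assumption: the sync API does not accept
    multi-page PDFs).
    """
    response = client.analyze_document(
        Document={"Bytes": pdf_bytes},
        FeatureTypes=["TABLES"],
    )
    # Every detected table is reported as one block of type TABLE.
    return sum(
        1 for block in response.get("Blocks", [])
        if block.get("BlockType") == "TABLE"
    )


# Possible usage (requires AWS credentials; the file name is hypothetical):
#   import boto3
#   with open("invoice.pdf", "rb") as handle:
#       print(count_tables_sync(boto3.client("textract"), handle.read()))
```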

  • Hello,

    From the information you have provided so far, I think there might be a bug in your scripts. Without reviewing the script I won't be able to provide any more details.

    AFAIK, under the hood the service works just the same. The only variable here is your scripts. We can collaborate offline if you'd like me to review and determine a solution.

asked 2 months ago, 128 views
2 Answers
1

It seems you are using a one-page PDF with the sync API analyze_document through boto3? The console demo uses the async API even for a one-page PDF, and the sync and async APIs use different paths for rendering PDFs. That might explain the difference. I would recommend trying the async start_document_analysis.
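
A minimal sketch of the async route, assuming the PDF has already been uploaded to S3 (the async API reads from S3 rather than from raw bytes). The function names, the polling interval, and the bucket and key in the usage comment are illustrative assumptions; only `start_document_analysis` and `get_document_analysis` are actual Textract client operations.

```python
import time


def count_tables(blocks):
    """Count TABLE blocks in a list of Textract Block dicts."""
    return sum(1 for block in blocks if block.get("BlockType") == "TABLE")


def analyze_tables_async(client, bucket, key, poll_seconds=5.0, max_polls=60):
    """Run an async TABLES analysis on an S3 object and return all blocks.

    `client` is expected to be a Textract client, for example
    boto3.client("textract").
    """
    job = client.start_document_analysis(
        DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}},
        FeatureTypes=["TABLES"],
    )
    job_id = job["JobId"]

    # Poll until the job is no longer IN_PROGRESS.
    for _ in range(max_polls):
        result = client.get_document_analysis(JobId=job_id)
        status = result["JobStatus"]
        if status != "IN_PROGRESS":
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"Textract job {job_id} did not finish in time")

    if status not in ("SUCCEEDED", "PARTIAL_SUCCESS"):
        raise RuntimeError(f"Textract job {job_id} ended with status {status}")

    # Results are paginated; follow NextToken until it is absent.
    blocks = list(result.get("Blocks", []))
    next_token = result.get("NextToken")
    while next_token:
        result = client.get_document_analysis(JobId=job_id, NextToken=next_token)
        blocks.extend(result.get("Blocks", []))
        next_token = result.get("NextToken")
    return blocks


# Possible usage (requires AWS credentials; bucket and key are hypothetical):
#   import boto3
#   blocks = analyze_tables_async(boto3.client("textract"), "my-bucket", "invoice.pdf")
#   print(count_tables(blocks))
```

Comparing this table count with the one from the sync call on the same invoice should show whether the sync/async difference is the cause. For production use, an SNS notification channel is a common alternative to polling.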

AWS
answered 2 months ago
0

Thank you! I actually found a related question right after this and the replies indeed point to the same explanation. I'll definitely try.

Best!

answered 2 months ago
