Textract: is it possible that the Analyze Document demo and the regular API are not aligned?

0

Hello all,

I can't post a sample document here, but let's say I'm working with PDF invoices. Today I found a couple of oddballs where my scripts that consume Textract consistently fail to return 1 structured table (out of 3; the other 2 are semi-structured) present in the document. I've tried everything, including calling boto3 analyze_document stripped of everything but the 'TABLES' feature, and I still only get 2 tables for these invoices. If I put the same invoices through the Analyze Document demo (https://us-east-1.console.aws.amazon.com/textract/home?region=us-east-1#/demo), I always get all 3 tables present in the invoices, as expected. So I was thinking that maybe the demo is not actually consuming the same service that I am through scripting. Is that even a possibility? I can't think of any other explanation.
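For reference, the scripted call described above looks roughly like the sketch below. It is only an illustration: the helper names `analyze_tables_sync` and `count_tables`, the file name `invoice.pdf`, and the region are assumptions, not the actual script.

```python
def count_tables(blocks):
    """Count the TABLE blocks in a list of Textract Block dicts."""
    return sum(1 for block in blocks if block.get("BlockType") == "TABLE")


def analyze_tables_sync(client, pdf_bytes):
    """Call the synchronous AnalyzeDocument API with only the TABLES feature.

    `client` is expected to be a boto3 Textract client.
    """
    response = client.analyze_document(
        Document={"Bytes": pdf_bytes},
        FeatureTypes=["TABLES"],
    )
    return response["Blocks"]


def main():
    # Hypothetical usage; requires boto3 and valid AWS credentials.
    import boto3

    client = boto3.client("textract", region_name="us-east-1")
    with open("invoice.pdf", "rb") as f:  # assumed single-page PDF
        blocks = analyze_tables_sync(client, f.read())
    print("tables found:", count_tables(blocks))
```

Calling `main()` against one of the problem invoices prints the number of tables the sync API detects, which can then be compared with what the demo shows.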

Thanks!

  • Hello,

    From the information you have provided so far, I think there might be a bug in your scripts. Without reviewing the script I won't be able to provide any more details.

    AFAIK, under the hood the service works just the same. The only variable here is your scripts. We can collaborate offline if you'd like me to review and determine a solution.

asked 2 months ago · 128 views
2 Answers
1

It seems you are using a one-page PDF with the sync API, analyze_document, via boto3? The console demo uses the async API even for a one-page PDF, and the sync and async APIs use different paths for rendering PDFs. That might explain the difference. I would recommend trying the async start_document_analysis.
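A minimal sketch of the async route, assuming the PDF has already been uploaded to S3 (the async API reads the document from S3 rather than from inline bytes). The function names, bucket, key, and polling intervals are hypothetical placeholders. For simplicity it treats any final status other than SUCCEEDED as an error.

```python
import time


def count_tables(blocks):
    """Count the TABLE blocks in a list of Textract Block dicts."""
    return sum(1 for block in blocks if block.get("BlockType") == "TABLE")


def analyze_tables_async(client, bucket, key, poll_seconds=2.0, timeout_seconds=300.0):
    """Start an async table analysis, poll until it finishes, return all blocks.

    `client` is expected to be a boto3 Textract client.
    """
    job = client.start_document_analysis(
        DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}},
        FeatureTypes=["TABLES"],
    )
    job_id = job["JobId"]

    # Poll until the job leaves the IN_PROGRESS state or we give up.
    deadline = time.monotonic() + timeout_seconds
    while True:
        result = client.get_document_analysis(JobId=job_id)
        status = result["JobStatus"]
        if status != "IN_PROGRESS":
            break
        if time.monotonic() > deadline:
            raise TimeoutError(f"Textract job {job_id} did not finish in time")
        time.sleep(poll_seconds)

    if status != "SUCCEEDED":
        message = result.get("StatusMessage")
        raise RuntimeError(f"Textract job {job_id} ended with {status}: {message}")

    # Results may be paginated; follow NextToken until it is absent.
    blocks = list(result.get("Blocks", []))
    next_token = result.get("NextToken")
    while next_token:
        result = client.get_document_analysis(JobId=job_id, NextToken=next_token)
        blocks.extend(result.get("Blocks", []))
        next_token = result.get("NextToken")
    return blocks


def main():
    # Hypothetical usage; requires boto3, valid AWS credentials, and the
    # invoice uploaded to the (assumed) bucket and key below.
    import boto3

    client = boto3.client("textract", region_name="us-east-1")
    blocks = analyze_tables_async(client, "my-invoice-bucket", "invoices/sample.pdf")
    print("tables found:", count_tables(blocks))
```

Comparing the table count from this call with the count from the sync analyze_document call on the same invoice should show whether the sync and async paths really differ for that document.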

AWS
answered 2 months ago
0

Thank you! I actually found a related question right after this and the replies indeed point to the same explanation. I'll definitely try.

Best!

answered 2 months ago
