Does Textract have a way to extract a table of specified dimensions?


I am trying to pull the SKUs, name, and price from Target and Walmart receipts, but when I try expense analysis, it only pulls the name and price, ignoring the SKUs, and when I do document analysis, it will often lump the SKU column in with the name column. Sometimes, it will just add a column on the end for no apparent reason. I need the SKUs to be associated with the name and price, so I can't just look for things matching the format of SKUs. Is there a way for me to specify that I want 3 columns from the table when doing document analysis?

asked 2 years ago, 515 views
1 Answer

Hello, Amazon Textract has a specific feature to analyze Receipts and Invoices (link). If you have not tried that out, I'd recommend having a look.
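As a rough sketch of what reading that output could look like with boto3: the AnalyzeExpense response lists each line item's fields by type, and a SKU, when Textract recognizes one, is reported as PRODUCT_CODE. The helper names and the fallback regex (a run of 9 to 12 digits searched in the EXPENSE_ROW text) are assumptions for illustration, not something Textract provides, so adjust them to the receipts you actually have.

```python
import re

# Assumption: SKUs on these receipts are runs of 9 to 12 digits.
SKU_PATTERN = re.compile(r"\b\d{9,12}\b")


def line_items_with_sku(response):
    """Return one {sku, name, price} dict per line item in an AnalyzeExpense response."""
    rows = []
    for doc in response.get("ExpenseDocuments", []):
        for group in doc.get("LineItemGroups", []):
            for item in group.get("LineItems", []):
                # Map each field type (ITEM, PRICE, PRODUCT_CODE, EXPENSE_ROW, ...) to its text.
                fields = {}
                for field in item.get("LineItemExpenseFields", []):
                    field_type = field.get("Type", {}).get("Text")
                    fields[field_type] = field.get("ValueDetection", {}).get("Text", "")
                sku = fields.get("PRODUCT_CODE")
                if not sku:
                    # Fallback: look for a SKU-like number in the full row text,
                    # so it stays tied to this item's name and price.
                    match = SKU_PATTERN.search(fields.get("EXPENSE_ROW", ""))
                    sku = match.group(0) if match else None
                rows.append({"sku": sku, "name": fields.get("ITEM"), "price": fields.get("PRICE")})
    return rows


def analyze_receipt(image_bytes):
    """Call AnalyzeExpense on raw image bytes (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the parsing helper works without boto3 installed

    client = boto3.client("textract")
    return client.analyze_expense(Document={"Bytes": image_bytes})
```

Usage would be along the lines of `line_items_with_sku(analyze_receipt(open("receipt.jpg", "rb").read()))`, where the file name is a placeholder.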

If you already have, was the Detect Text feature (link) able to capture the content? Its output corresponds to the Raw text tab when running Document Analysis in the AWS Console.
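One way to check this outside the console is to look at the LINE blocks that DetectDocumentText returns. The sketch below groups those lines into receipt rows by their vertical position, so a SKU, name, and price printed on the same row stay together. The function names and the `tolerance` value are assumptions for illustration; the right tolerance depends on the image.

```python
def lines_to_rows(response, tolerance=0.01):
    """Group LINE blocks whose Top coordinates are within `tolerance` into rows."""
    lines = [b for b in response.get("Blocks", []) if b.get("BlockType") == "LINE"]
    lines.sort(key=lambda b: b["Geometry"]["BoundingBox"]["Top"])

    rows = []
    for block in lines:
        top = block["Geometry"]["BoundingBox"]["Top"]
        # Compare against the first line of the current row (coordinates are 0..1 ratios).
        if rows and abs(top - rows[-1]["top"]) <= tolerance:
            rows[-1]["blocks"].append(block)
        else:
            rows.append({"top": top, "blocks": [block]})

    # Within each row, order the text left to right.
    return [
        [b["Text"] for b in sorted(r["blocks"], key=lambda b: b["Geometry"]["BoundingBox"]["Left"])]
        for r in rows
    ]


def detect_text(image_bytes):
    """Call DetectDocumentText on raw image bytes (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the grouping helper works without boto3 installed

    client = boto3.client("textract")
    return client.detect_document_text(Document={"Bytes": image_bytes})
```

If the SKUs appear in these rows, the text is being read correctly and the problem is only in how the table columns are inferred.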

Overall, a few things can improve the accuracy of Textract results, such as preprocessing the image with sharpening, blurring, thresholding, or rotation (for misaligned scanned documents). For more advanced document layouts, an NLP model can be plugged in to help detect the content (link).
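A minimal preprocessing sketch along those lines, assuming the Pillow library is available: it converts the image to grayscale, optionally rotates it, sharpens it, and binarizes it. The threshold is picked with Otsu's method, which is my choice for the example and not something Textract requires; the function names and the `angle` parameter are also illustrative.

```python
def otsu_threshold(hist):
    """Pick a threshold from a 256-bin grayscale histogram using Otsu's method."""
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    weight_bg = 0
    sum_bg = 0
    best_t, best_var = 0, -1.0
    for t, count in enumerate(hist):
        weight_bg += count
        if weight_bg == 0:
            continue  # no pixels at or below this level yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already in the background class
        sum_bg += t * count
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance (up to a constant factor).
        var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def preprocess(path, angle=0.0):
    """Grayscale, rotate, sharpen, and binarize a receipt image before sending it to Textract."""
    from PIL import Image, ImageFilter  # assumption: Pillow is installed

    img = Image.open(path).convert("L")
    if angle:
        # Positive angles rotate counter-clockwise; fill the new corners with white.
        img = img.rotate(angle, expand=True, fillcolor=255)
    img = img.filter(ImageFilter.SHARPEN)
    t = otsu_threshold(img.histogram())
    return img.point(lambda p: 255 if p > t else 0)
```

It is worth comparing Textract's output on the original and the preprocessed image, since aggressive thresholding can also erase faint print.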

AWS
answered 2 years ago
AWS EXPERT Chris_G
reviewed 2 years ago
  • Hello, if the answer met your expectation, please mark that as accepted. Thank you
