Amazon Textract misses digits before and after commas and periods


Hello,

I am using Amazon Textract to transcribe tables into CSV output. It usually does a very good job, but it sometimes drops the digits before a comma and the digits after a decimal point. I have attached an image that shows the problem in the demo version. As you can see, Textract does not always fail to capture the full number, and the failures seem to be correlated with the punctuation. Has anyone encountered this problem, or does anyone know of a way to fix it? My problem seems similar to one posted many years ago (https://repost.aws/questions/QUJgifajQpQYesjrkIocR9lw/numbers-amount-reading-problem), but the solution seems to have been relayed in a private message. Any help, or questions for further clarification, would be much appreciated. Thank you!

Textract Screenshot

  • Can you share the original document or at least parts of it? Then I could run some tests to validate.

  • Thank you for using Textract. To better assist you, we will need to try this out at our end with the original image and gather a few details. It would be helpful if you can share the original image or you can also create a support ticket, and we will have our support engineer look into this for you.

  • Hello, thank you for both responses. I don't think the file hosted here [https://drive.google.com/file/d/1yGlYK6BI5popn9uwQ971AnUe4vLcCRjL/view?usp=sharing] is the exact same one as I used above, but it exhibits the same problematic behavior. I very much appreciate your attention to this matter.
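
For anyone trying to reproduce this outside the demo console, one option is to call the API directly and inspect the table cells, which makes it easier to see which numbers lose digits. The sketch below is only an illustration and not a fix from this thread. It assumes boto3 with configured AWS credentials; the helper names (`cell_rows`, `suspicious_cells`, `analyze_tables`), the 90.0 confidence threshold, and the "numeric-looking" pattern are hypothetical choices made for the example.

```python
import re

# Assumption: a cell counts as "numeric-looking" if it contains only these characters.
NUMERIC = re.compile(r"^[\d.,\s$%()-]+$")


def cell_rows(response):
    """Yield (row, column, text, lowest word confidence) for each CELL block."""
    blocks = {b["Id"]: b for b in response["Blocks"]}
    for block in response["Blocks"]:
        if block["BlockType"] != "CELL":
            continue
        # A cell's text lives in its CHILD WORD blocks, not on the cell itself.
        words = [
            blocks[child_id]
            for rel in block.get("Relationships", [])
            if rel["Type"] == "CHILD"
            for child_id in rel["Ids"]
            if child_id in blocks and blocks[child_id]["BlockType"] == "WORD"
        ]
        text = " ".join(w["Text"] for w in words)
        confidence = min((w["Confidence"] for w in words), default=None)
        yield block["RowIndex"], block["ColumnIndex"], text, confidence


def suspicious_cells(response, threshold=90.0):
    """Flag numeric cells that begin or end with '.' or ',' or have low confidence."""
    flagged = []
    for row, col, text, conf in cell_rows(response):
        if not text or not NUMERIC.match(text):
            continue
        # ",234" or "12." suggests digits were lost next to the punctuation.
        edge_punct = text[0] in ".," or text[-1] in ".,"
        if edge_punct or (conf is not None and conf < threshold):
            flagged.append((row, col, text, conf))
    return flagged


def analyze_tables(path):
    """Run synchronous table analysis on a local image (requires AWS credentials)."""
    import boto3  # imported here so the parsing helpers work without boto3 installed

    client = boto3.client("textract")
    with open(path, "rb") as f:
        return client.analyze_document(
            Document={"Bytes": f.read()},
            FeatureTypes=["TABLES"],
        )
```

A call such as `suspicious_cells(analyze_tables("page.png"))` (where `page.png` is a placeholder file name) returns the row, column, text, and confidence of each flagged cell. This only helps locate truncated numbers for manual review or for a support ticket; it does not recover the missing digits.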

Jackson
Asked 2 years ago, 129 views
No answers