Script to extract queries from JSON


We have an asynchronous pipeline that extracts queries from PDF documents. Here's a sample from the JSON: " block_type='QUERY_RESULT', relationships=None, confidence=77.0, text='1 NON-SENSITIVE', " This confirms that our AWS pipeline is working for us. However, no matter what combination of the sample AWS scripts we use, we get errors. Does anybody out there have an idea how to format the extraction script so that it works as AWS intends?

Input In [29], in <cell line: 19>()
     16 print('d =', d)
     18 #get_query_answers
---> 19 query_answers = d.get_query_answers(page=page)
     21 #for x in query_answers:
     22 #    print(f"{image_filename},{x[1]},{x[2]}")
     24 print(tabulate(query_answers, tablefmt="github"))

File ~/anaconda3/envs/aws-local/lib/python3.9/site-packages/trp/trp2.py:569, in TDocument.get_query_answers(self, page)
    567 if answers:
    568     for answer in answers:
--> 569         result_list.append([query.query.text, query.query.alias, answer.text])
    570 else:
    571     result_list.append([query.query.text, query.query.alias, ""])

AttributeError: 'NoneType' object has no attribute 'text'

Asked 2 years ago · Viewed 459 times
2 Answers

Could you please post the entire json response you are getting from Textract Queries API before it reaches the post-processing, extraction script? It would help in our debugging efforts. If you cannot post that publicly on a forum post, please open a customer support ticket with us and attach the image/pdf of concern, your entire code file, and the entire stack trace. Thanks.
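While collecting that response, it may help to check the raw JSON directly. The traceback shows that either `query.query` or `answer` was `None` on line 569 of `trp2.py`, so the sketch below pairs each `QUERY` block with its `QUERY_RESULT` blocks by following `ANSWER` relationships and tolerates missing pieces instead of raising. It assumes the response is the plain dictionary returned by Textract, with a top-level `Blocks` list. The function name `extract_query_answers`, the sample queries, the aliases, and the block IDs are hypothetical and used only for illustration; only the answer text and confidence come from the post above.

```python
def extract_query_answers(response):
    """Return [query text, alias, answer text, confidence] rows from a raw response dict."""
    blocks = response.get("Blocks") or []
    # Index every block by Id so ANSWER relationships can be resolved.
    by_id = {b["Id"]: b for b in blocks if "Id" in b}
    rows = []
    for block in blocks:
        if block.get("BlockType") != "QUERY":
            continue
        # A QUERY block may lack the Query field; fall back to an empty dict.
        query = block.get("Query") or {}
        text = query.get("Text", "")
        alias = query.get("Alias", "")
        # Relationships can be missing or None when the query has no answer.
        answer_ids = [
            block_id
            for rel in (block.get("Relationships") or [])
            if rel.get("Type") == "ANSWER"
            for block_id in rel.get("Ids", [])
        ]
        # Skip IDs that do not resolve to a block in this response.
        answers = [by_id[i] for i in answer_ids if i in by_id]
        if not answers:
            rows.append([text, alias, "", None])
        for answer in answers:
            rows.append([text, alias, answer.get("Text", ""), answer.get("Confidence")])
    return rows


# Hypothetical sample: one answered query and one unanswered query.
sample_response = {
    "Blocks": [
        {
            "BlockType": "QUERY",
            "Id": "q1",
            "Query": {"Text": "What is the classification?", "Alias": "CLASSIFICATION"},
            "Relationships": [{"Type": "ANSWER", "Ids": ["a1"]}],
        },
        {"BlockType": "QUERY_RESULT", "Id": "a1", "Text": "1 NON-SENSITIVE", "Confidence": 77.0},
        {
            "BlockType": "QUERY",
            "Id": "q2",
            "Query": {"Text": "What is the date?", "Alias": "DATE"},
        },
    ]
}

rows = extract_query_answers(sample_response)
for row in rows:
    print(row)
```

If the job is asynchronous and the results arrive in several pages, the `Blocks` lists from all pages would need to be merged into one response before running a check like this, since a query and its answer are resolved by ID within the combined list.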

AWS
Answered 2 years ago
