The problem with using API calls is that the client is charged for the list requests, and that cost goes up with the number of objects in the bucket. Details about S3-related costs are here: https://aws.amazon.com/s3/pricing/
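To get a rough feel for how list charges scale, here is a small sketch. It assumes each list call returns at most 1,000 keys (the ListObjectsV2 page limit) and takes the per-1,000-request price as a parameter. The price used in the example is a placeholder, not a quoted rate, so check the pricing page for your region and storage class. The function name `estimate_list_cost` is made up for illustration.

```python
import math

# Assumption: one list call returns at most 1,000 keys.
KEYS_PER_LIST_CALL = 1000


def estimate_list_cost(object_count, price_per_1000_requests):
    """Return (requests, cost) for one full listing of a bucket."""
    requests = math.ceil(object_count / KEYS_PER_LIST_CALL)
    cost = requests / 1000 * price_per_1000_requests
    return requests, cost


# Placeholder price; look up the real figure on the S3 pricing page.
requests, cost = estimate_list_cost(10_000_000, price_per_1000_requests=0.005)
print(f"{requests} list requests, about ${cost:.2f} per full scan")
```

The charge applies per scan, so a search feature that re-lists the bucket on every query multiplies this figure by the query volume.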
The easiest way is to use S3 Inventory and run your searches against the generated report. More details about this feature, including which fields are available in the report, are here: https://docs.aws.amazon.com/AmazonS3/latest/userguide/configure-inventory.html
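As a minimal sketch of searching an inventory report locally, the snippet below filters a CSV report by key name. It assumes the report is in CSV format with no header row, that the column order comes from the `fileSchema` entry in the inventory manifest, and that key names are URL-encoded. The helper name `search_inventory` and the sample rows are invented for illustration. For large inventories, a query engine such as Athena is probably a better fit than loading the file in Python.

```python
import csv
import io
from urllib.parse import unquote


def search_inventory(csv_text, file_schema, needle):
    """Return inventory rows whose decoded key contains `needle` (case-insensitive)."""
    # Assumption: file_schema is the comma-separated column list from the manifest.
    columns = [name.strip() for name in file_schema.split(",")]
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames=columns)
    matches = []
    for row in reader:
        # Assumption: keys are URL-encoded in CSV reports and need decoding.
        key = unquote(row["Key"])
        if needle.lower() in key.lower():
            matches.append({**row, "Key": key})
    return matches


# Made-up sample data standing in for a downloaded inventory file.
SAMPLE_SCHEMA = "Bucket, Key, Size"
SAMPLE_REPORT = (
    '"demo","reports/prescription.pdf","48213"\n'
    '"demo","images/cat%20photo.png","1024"\n'
)

hits = search_inventory(SAMPLE_REPORT, SAMPLE_SCHEMA, "prescription")
print(hits)
```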
Although it is not an AWS-native service, Mixpeek runs text extraction tools such as Tika, Tesseract, and ImageAI on your S3 files, then places the extracted text in a Lucene index to make it searchable.
You integrate it as follows:

- Download the module: https://github.com/mixpeek/mixpeek-python

- Import the module and your API keys:

      from mixpeek import Mixpeek, S3
      from config import mixpeek_api_key, aws

- Instantiate the S3 class (which uses boto3 and requests):

      s3 = S3(
          aws_access_key_id=aws['aws_access_key_id'],
          aws_secret_access_key=aws['aws_secret_access_key'],
          region_name='us-east-2',
          mixpeek_api_key=mixpeek_api_key
      )

- Upload one or more existing S3 files:

      # upload all S3 files in bucket "demo"
      s3.index(bucket_name="demo")

      # upload one single file called "prescription.pdf" in bucket "demo"
      s3.index_one(s3_file_name="prescription.pdf", bucket_name="demo")

- Now simply search using the Mixpeek module:

      # mixpeek api direct
      mix = Mixpeek(
          api_key=mixpeek_api_key
      )

      # search
      result = mix.search(query="Heartgard")
      print(result)

- Where result can be:

      [
          {
              "_id": "REDACTED",
              "api_key": "REDACTED",
              "highlights": [
                  {
                      "path": "document_str",
                      "score": 0.8759502172470093,
                      "texts": [
                          {
                              "type": "text",
                              "value": "Vetco Prescription\nVetcoClinics.com\n\nCustomer:\n\nAddress: Canine\n\nPhone: Australian Shepherd\n\nDate of Service: 2 Years 8 Months\n\nPrescription\nExpiration Date:\n\nWeight: 41.75\n\nSex: Female\n\n℞ "
                          },
                          {
                              "type": "hit",
                              "value": "Heartgard"
                          },
                          {
                              "type": "text",
                              "value": " Plus Green 26-50 lbs (Ivermectin 135 mcg/Pyrantel 114 mg)\n\nInstructions: Give one chewable tablet by mouth once monthly for protection against heartworms, and the treatment and\ncontrol of roundworms, and hookworms. "
                          }
                      ]
                  }
              ],
              "metadata": {
                  "date_inserted": "2021-10-07 03:19:23.632000",
                  "filename": "prescription.pdf"
              },
              "score": 0.13313256204128265
          }
      ]
Then you parse the results to pull out the fields you need, such as the matching filename and the highlighted text.
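One way to parse a result of that shape is sketched below. It relies only on the structure of the sample response shown in this answer (`highlights`, `texts`, `metadata.filename`), not on any documented Mixpeek schema, so treat the field names as assumptions if the response format differs. The helper name `summarize_hits` and the trimmed sample are illustrative.

```python
def summarize_hits(results):
    """Flatten search results into one summary dict per highlight."""
    summaries = []
    for doc in results:
        filename = doc.get("metadata", {}).get("filename")
        for highlight in doc.get("highlights", []):
            parts = highlight.get("texts", [])
            # Parts tagged "hit" are the matched terms; the rest is context.
            terms = [p["value"] for p in parts if p.get("type") == "hit"]
            snippet = "".join(p["value"] for p in parts)
            summaries.append({
                "filename": filename,
                "score": highlight.get("score"),
                "terms": terms,
                "snippet": snippet,
            })
    return summaries


# Trimmed copy of the sample response, used here instead of a live API call.
sample = [{
    "highlights": [{
        "path": "document_str",
        "score": 0.8759502172470093,
        "texts": [
            {"type": "text", "value": "℞ "},
            {"type": "hit", "value": "Heartgard"},
            {"type": "text", "value": " Plus Green 26-50 lbs"},
        ],
    }],
    "metadata": {"filename": "prescription.pdf"},
}]

for hit in summarize_hits(sample):
    print(hit["filename"], hit["terms"], hit["snippet"])
```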