How to range query over a JSON file using S3 select

0

Hi Team, I am looking forward to a solution that can help in fetching the JSON objects in batches using S3 select query. For example : I have a JSON file with 50 records present in it. with S3 select query , I could retrieve first 10 records completely and then next 10 and so on.

I had tried scanRange before , but that works on bytes by providing start and end range , which doesn't ensure full records to be fetched . as the object size can vary for a single record.

{ { status: "Success", created_at: "19 AUG 2019", employees:[ {"name":"name1", "id":"1"}, {"name":"name2", "id":"2"}, {"name":"name3", "id":"3"} ], contacts: [ {"a": "123"}, {"b": "456"} ] }, { status: "Success", created_at: "19 AUG 2019", employees:[ {"name":"name1", "id":"1"}, {"name":"name2", "id":"2"}, {"name":"name3", "id":"3"} ] }, { status: "In_progress", created_at: "19 AUG 2019", employees:[ {"name":"name1", "id":"1"}, {"name":"name2", "id":"2"}, {"name":"name3", "id":"3"} ] }, { status: "Failed", created_at: "19 AUG 2019", employees:[ {"name":"name1", "id":"1"}, {"name":"name2", "id":"2"}, {"name":"name3", "id":"3"} ] } }

I could get any 2 complete record in one batch & next 2 records next. Batch 1 : { status: "In_progress", created_at: "19 AUG 2019", employees:[ {"name":"name1", "id":"1"}, {"name":"name2", "id":"2"}, {"name":"name3", "id":"3"} ] }, { status: "Failed", created_at: "19 AUG 2019", employees:[ {"name":"name1", "id":"1"}, {"name":"name2", "id":"2"}, {"name":"name3", "id":"3"} ] }

batch 2 : { status: "Success", created_at: "19 AUG 2019", employees:[ {"name":"name1", "id":"1"}, {"name":"name2", "id":"2"}, {"name":"name3", "id":"3"} ], contacts: [ {"a": "123"}, {"b": "456"} ] }, { status: "Success", created_at: "19 AUG 2019", employees:[ {"name":"name1", "id":"1"}, {"name":"name2", "id":"2"}, {"name":"name3", "id":"3"} ] }

asked a year ago375 views
1 Answer
0

Greetings from AWS. If I understood correctly your requirement is to fetch JSON objects in batches using S3 select query. For example : in a JSON file with 50 records present in it. with S3 select query you want to retrieve first 10 records completely and then next 10 and so on.

Have you tried using LIMIT, for example the following query would return 10 records with all fields.

SELECT * from S3Object[*] LIMIT 10

Please refer to the examples here to query specific fields based on conditions: https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-glacier-select-sql-reference-select.html

Please reply with example input and expected output if there are any questions. Thanks for contacting AWS, stay safe and have a wonderful day.

AWS
answered a year ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions