Is it possible to use S3 Select to identify the schema of data in an S3 file?

The file contains one JSON record per line. Can S3 Select be used to find the first-level fields of the JSON objects?

Example: for a file containing the two JSON records {a:1, b:2, c:3} and {a:1, b:2, c:3, d:4},

I would like S3 Select to return a, b, c, d.

Asked a year ago, 157 views
1 Answer
No. Amazon S3 Select does not support automatic schema inference, so it cannot directly identify or return the list of first-level fields (keys) of JSON records stored in an S3 object. As stated in the AWS documentation, "S3 Select supports querying data that is structured or semi-structured (like CSV or JSON), but does not support schema inference." In practice, you must already know the field names and structure of your data to write SQL queries against it.

To work around this limitation, you can use a tool such as AWS Glue, which can automatically infer schemas from JSON files and generate metadata tables for use with Amazon Athena or other query services. Another option is a custom AWS Lambda function or a local script (e.g., Python with Pandas or json-schema-infer) that reads sample data from the file and programmatically extracts all unique top-level keys across the JSON records.
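
The script option can be sketched roughly as follows. This is a minimal illustration rather than an official AWS approach: it assumes the object is valid JSON Lines (one object per line, with quoted keys), and the function name `top_level_keys` and the sample records are made up for the example.

```python
import json
from typing import Iterable, List


def top_level_keys(lines: Iterable[str]) -> List[str]:
    """Collect the unique top-level keys across JSON Lines records.

    Keys are returned in the order they are first seen.
    """
    seen = {}  # a dict keeps insertion order, so it doubles as an ordered set
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if isinstance(record, dict):  # ignore lines that are not JSON objects
            for key in record:
                seen.setdefault(key, None)
    return list(seen)


# Hypothetical sample mirroring the two records in the question
sample = [
    '{"a": 1, "b": 2, "c": 3}',
    '{"a": 1, "b": 2, "c": 3, "d": 4}',
]
print(top_level_keys(sample))  # ['a', 'b', 'c', 'd']
```

To run this against S3 from a Lambda function or a local script, one possibility is to pass in the lines of the object body, for example `s3.get_object(Bucket=..., Key=...)["Body"].iter_lines()` with boto3. The bucket and key are placeholders here, and for large objects you may prefer to stop after a sample of lines.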

Reference: AWS S3 Select Documentation - JSON Support

Answered a year ago
