S3 Select ignores rows starting with #

Hi, I have uploaded a CSV file to a bucket with the following contents:

A,B
0,1
#1,2
1,2

I used the query SELECT * FROM s3object. The returned result dropped the row starting with #:

A,B
0,1
1,2

If I download the file, it still contains the row #1,2. Is this a known limitation? Any help is appreciated.

Ben
asked 2 months ago, 106 views
1 Answer
Accepted Answer

Yes, what you're observing is expected behavior for S3 Select. S3 Select lets you run SQL queries directly against objects in S3 without loading them into a database. When the input is CSV, S3 Select treats any row that starts with the comment character as a comment and skips it, and the default comment character is #. This is why the row #1,2 is missing from your query results even though the file itself is unchanged. The comment character is configurable through the Comments setting of the CSV input serialization, so you can change it to a character that never starts a row in your data.

This behavior follows a common convention in CSV and other text formats, where lines beginning with # are treated as comments and ignored during processing. The convention is useful for embedding metadata or notes in a file that should not be treated as data.
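As a rough illustration of the comment convention described above, here is a minimal Python sketch. It assumes boto3 is installed and credentials are configured for the actual query. The helper names (build_input_serialization, select_all_rows, simulate_comment_skipping), the bucket and key arguments, and the "~" replacement character are all assumptions made for illustration. The simulate_comment_skipping function is only a local stand-in for the described behavior, not S3 Select's real parser.

```python
def build_input_serialization(comment_char="~"):
    """Build the InputSerialization argument for a CSV object.

    "Comments" sets the character that marks a row as a comment
    (S3 Select's default is "#"). The "~" here is an assumption: pick any
    single character that never starts a data row in your file.
    """
    return {"CSV": {"Comments": comment_char}}


def select_all_rows(bucket, key, comment_char="~"):
    """Run SELECT * against one object. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the rest of the sketch runs offline

    s3 = boto3.client("s3")
    response = s3.select_object_content(
        Bucket=bucket,
        Key=key,
        ExpressionType="SQL",
        Expression="SELECT * FROM s3object",
        InputSerialization=build_input_serialization(comment_char),
        OutputSerialization={"CSV": {}},
    )
    # The response payload is an event stream; collect the Records events.
    chunks = []
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")


def simulate_comment_skipping(text, comment_char="#"):
    """Local stand-in: drop every row that starts with comment_char."""
    return [
        line for line in text.splitlines()
        if not line.startswith(comment_char)
    ]


if __name__ == "__main__":
    sample = "A,B\n0,1\n#1,2\n1,2\n"
    print(simulate_comment_skipping(sample))       # "#" rows are dropped
    print(simulate_comment_skipping(sample, "~"))  # "#" rows are kept
```

With the comment character moved away from #, the row #1,2 should be returned like any other row. If your data could contain rows starting with the replacement character, choose a different one.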

EXPERT
answered 2 months ago
AWS
EXPERT
reviewed 2 months ago
