S3 Select SelectObjectContent with AllowQuotedRecordDelimiter & ScanRange

We're using S3 Select SelectObjectContent to convert CSV input to JSON output.

The input CSV files are very large, so we process them in chunks using ScanRange. Recently we ran into an issue with CSV files that contain record delimiters (\r\n) inside quoted fields. Both the docs and the API error indicate that you can't use AllowQuotedRecordDelimiter together with ScanRange.

Is there any workaround to this issue? Would this be something AWS adds support for in the future? Is there another service to do this type of conversion instead of S3 Select?

Thanks for any help!

asked 3 months ago, 237 views
1 Answer
Hi,

I understand your concern about using S3 Select with AllowQuotedRecordDelimiter and ScanRange. As you found, S3 Select does not currently support combining the two: a request that sets both is rejected.
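For reference, here is a minimal sketch of the kind of chunked request described in the question. The helper name build_select_params, the bucket and key values, and the SQL expression are placeholders I made up for illustration; the parameter names are the ones boto3's select_object_content accepts. The point is that AllowQuotedRecordDelimiter has to stay at False whenever ScanRange is present.

```python
def build_select_params(bucket, key, start, end):
    """Build keyword arguments for an S3 Select call over one byte range.

    Hypothetical helper: the name and the query are illustrative only.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": "SELECT * FROM S3Object s",
        "InputSerialization": {
            "CSV": {
                "FileHeaderInfo": "NONE",
                # Must stay False (the default) whenever ScanRange is set.
                "AllowQuotedRecordDelimiter": False,
            }
        },
        "OutputSerialization": {"JSON": {}},
        # Byte offsets of the chunk to scan.
        "ScanRange": {"Start": start, "End": end},
    }


# Usage (needs boto3 and AWS credentials, so it is left as a comment):
#   import boto3
#   s3 = boto3.client("s3")
#   params = build_select_params("my-bucket", "data.csv", 0, 1048576)
#   response = s3.select_object_content(**params)
```

Keeping the request construction in a plain function makes it easy to check the parameters before sending anything to S3.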

Don't worry though, there are a few ways you could work around it:

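  1. Instead of a newline-based record delimiter (\r\n in your files), you can try a record delimiter that never appears inside a quoted field, and pass it to S3 Select through the RecordDelimiter setting of the CSV input. This only works if you control how the file is written. Avoid a comma (",") or a semicolon (";") if that character is already the field delimiter or can occur in the data.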

  2. If the CSV file is generated by another process, you can try changing that process so that quoted fields never contain record delimiters, for example by escaping or replacing line breaks inside fields before upload. Changing the quote character alone does not help, because the line break would still be inside the field.

  3. If neither of the above solutions works for you, you can consider switching to a different AWS service. You can perform the transformation using AWS Glue, which supports more advanced features such as handling quoted record delimiters.
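As an illustration of option 2, here is a rough sketch that rewrites a CSV so no field contains a line break, using only Python's standard csv module. The function name flatten_quoted_newlines and the choice of a literal backslash-n as the placeholder are my own assumptions, not anything S3 Select requires. Once the file is rewritten this way, every record sits on one line, so ScanRange can be used without AllowQuotedRecordDelimiter.

```python
import csv
import io


def flatten_quoted_newlines(src, dst, placeholder="\\n"):
    """Copy CSV rows from src to dst, replacing line breaks inside fields.

    Hypothetical helper: src and dst are text streams opened with newline="".
    The placeholder defaults to the two characters backslash and n.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\r\n")
    for row in reader:
        cleaned = [
            field.replace("\r\n", placeholder)
            .replace("\n", placeholder)
            .replace("\r", placeholder)
            for field in row
        ]
        writer.writerow(cleaned)


# Usage with in-memory streams; real files would be opened with newline="".
source = io.StringIO('id,note\r\n1,"first\r\nsecond"\r\n', newline="")
target = io.StringIO(newline="")
flatten_quoted_newlines(source, target)
```

Whoever consumes the JSON output would need to turn the placeholder back into a real line break if the original formatting matters.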

answered 3 months ago
EXPERT
reviewed 24 days ago
