S3 Select SelectObjectContent with AllowQuotedRecordDelimiter & ScanRange


We're using S3 Select SelectObjectContent to convert CSV input to JSON output.

CSV files on input are very large, so we're passing chunks using ScanRange. Recently we ran into an issue with CSV files containing record delimiters in quoted fields (\r\n). Docs and API error indicate that you can't use AllowQuotedRecordDelimiter with ScanRange.
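For reference, here is a minimal sketch of the chunked call described above, assuming boto3 and inclusive byte offsets for ScanRange. The helper names (`scan_ranges`, `build_select_request`, `run_chunk`) and the SQL expression are illustrative and not taken from our actual code.

```python
def scan_ranges(object_size, chunk_size):
    """Yield inclusive (start, end) byte ranges that cover the whole object."""
    for start in range(0, object_size, chunk_size):
        yield start, min(start + chunk_size, object_size) - 1


def build_select_request(bucket, key, start, end):
    """Build keyword arguments for the S3 client's select_object_content call."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": "SELECT * FROM S3Object s",  # illustrative query
        "InputSerialization": {
            "CSV": {
                "RecordDelimiter": "\r\n",
                "FieldDelimiter": ",",
                "QuoteCharacter": '"',
                # Setting this to True together with ScanRange is what the API rejects.
                "AllowQuotedRecordDelimiter": False,
            }
        },
        "OutputSerialization": {"JSON": {"RecordDelimiter": "\n"}},
        "ScanRange": {"Start": start, "End": end},
    }


def run_chunk(s3_client, request):
    """Yield the raw JSON byte payloads returned for one scan range."""
    response = s3_client.select_object_content(**request)
    for event in response["Payload"]:
        if "Records" in event:
            yield event["Records"]["Payload"]
```

Here `s3_client` would be `boto3.client("s3")`, and the object size could come from a `head_object` call.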

Is there any workaround to this issue? Would this be something AWS adds support for in the future? Is there another service to do this type of conversion instead of S3 Select?

Thanks for any help!



You're right: S3 Select does not currently support AllowQuotedRecordDelimiter together with ScanRange, so the error you're seeing is expected.

There are a few ways to work around it:

  1. Instead of \r\n, use a record delimiter that never appears inside a quoted field, and set RecordDelimiter in the CSV input settings to match. This only helps if you can regenerate the file with that delimiter, and it must differ from the field delimiter, so a comma will not work for comma-separated data.

  2. If the CSV file is generated by another process, you can modify that process so quoted fields never contain record delimiters, for example by replacing embedded line breaks with a space or an escaped literal such as \n before writing. Changing the quote character alone will not help, because the line break is still inside the field.

  3. If neither of the above works for you, consider a different AWS service. AWS Glue can perform the CSV-to-JSON transformation and can handle record delimiters inside quoted fields.
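As a rough illustration of option 2, here is a minimal sketch using Python's standard csv module to rewrite a file so that no field contains a line break. It assumes comma-separated, double-quoted input; the function name `flatten_embedded_newlines` and the choice of a space as the replacement are illustrative, not something S3 Select requires.

```python
import csv
import io


def flatten_embedded_newlines(src, dst, replacement=" "):
    """Copy CSV rows from src to dst, replacing line breaks inside fields."""
    reader = csv.reader(src)  # parses quoted fields that span several lines
    writer = csv.writer(dst, lineterminator="\r\n")
    for row in reader:
        writer.writerow(
            [
                field.replace("\r\n", replacement)
                .replace("\n", replacement)
                .replace("\r", replacement)
                for field in row
            ]
        )


# Small in-memory example; real files should be opened with newline="".
source = io.StringIO('id,note\r\n1,"line one\r\nline two"\r\n')
target = io.StringIO()
flatten_embedded_newlines(source, target)
```

After this pass every record sits on a single line, so the file can be queried with ScanRange while AllowQuotedRecordDelimiter stays at its default of false.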

Answered 5 months ago
Reviewed 3 months ago
