S3 Select SelectObjectContent with AllowQuotedRecordDelimiter & ScanRange


We're using S3 Select SelectObjectContent to convert CSV input to JSON output.

The input CSV files are very large, so we process them in chunks using ScanRange. Recently we ran into an issue with CSV files that contain record delimiters (\r\n) inside quoted fields. Both the docs and the API error indicate that you can't use AllowQuotedRecordDelimiter together with ScanRange.
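For context, here is a minimal sketch of the kind of chunked request described above, assuming boto3. The helper names (`build_select_request`, `select_chunk`), the SQL expression, and the `FileHeaderInfo` value are illustrative assumptions, not the exact code in use.

```python
def build_select_request(bucket, key, start, end):
    """Build keyword arguments for one S3 Select call over a byte range."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": "SELECT * FROM S3Object s",  # assumed query
        "InputSerialization": {
            "CSV": {
                "FileHeaderInfo": "NONE",  # assumed; adjust for your files
                # Adding "AllowQuotedRecordDelimiter": True here is the
                # combination that is rejected when ScanRange is present.
            }
        },
        "OutputSerialization": {"JSON": {"RecordDelimiter": "\n"}},
        "ScanRange": {"Start": start, "End": end},
    }


def select_chunk(bucket, key, start, end):
    """Yield raw JSON-lines bytes for one chunk (needs AWS credentials)."""
    import boto3  # imported lazily so the builder above has no dependencies

    s3 = boto3.client("s3")
    response = s3.select_object_content(
        **build_select_request(bucket, key, start, end)
    )
    for event in response["Payload"]:
        if "Records" in event:
            yield event["Records"]["Payload"]
```

Each chunk is requested with a different `Start`/`End` pair, and the yielded bytes are concatenated to form the JSON output.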

Is there any workaround to this issue? Would this be something AWS adds support for in the future? Is there another service to do this type of conversion instead of S3 Select?

Thanks for any help!

1 Answer

Hi,

You're right: S3 Select does not currently support using AllowQuotedRecordDelimiter together with ScanRange, so a request that sets both is rejected.

There are a few ways you could work around it, though.

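  1. If you control how the file is written, use a record delimiter that never appears inside a quoted field instead of a line break, and pass it to S3 Select through the RecordDelimiter setting of the CSV input serialization. A comma (",") or semicolon (";") is usually a poor choice here, because one of them is typically already the field delimiter.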

  2. If the CSV file is generated by another process, you can try modifying that process so that record delimiters never appear inside quoted fields, for example by escaping or replacing line breaks within quoted values before the file is written. Changing the quote character alone won't help, since the line break would still sit inside the field.

  3. If neither of the above works for you, consider switching to a different AWS service. You can perform the conversion with AWS Glue, which supports more advanced features such as handling quoted record delimiters.
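As a rough illustration of the second option, the sketch below rewrites a CSV so that line breaks inside quoted fields are replaced by a placeholder, leaving one record per physical line. It uses only Python's standard `csv` module. The function name `flatten_embedded_newlines` and the default placeholder (the two-character text `\n`) are assumptions for illustration, and whether a placeholder is acceptable depends on what consumes the JSON afterwards.

```python
import csv
import io


def flatten_embedded_newlines(src, dst, placeholder="\\n"):
    """Copy CSV rows from src to dst, replacing line breaks inside fields.

    src and dst are text file objects; real files should be opened with
    newline="" as the csv module documentation recommends.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        writer.writerow(
            [
                field.replace("\r\n", placeholder)
                .replace("\r", placeholder)
                .replace("\n", placeholder)
                for field in row
            ]
        )


if __name__ == "__main__":
    # Small in-memory demo: the second field of the first record
    # contains an embedded CRLF inside quotes.
    source = io.StringIO('id,comment\r\n1,"line one\r\nline two"\r\n')
    target = io.StringIO()
    flatten_embedded_newlines(source, target)
    print(target.getvalue())
```

After this step the file no longer needs AllowQuotedRecordDelimiter, so it can be queried with ScanRange as before. The cost is an extra pass over the data, which may matter for very large inputs.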

Answered 3 months ago. Reviewed by an expert a month ago.
