How do I troubleshoot issues when I copy data from Amazon S3 into Amazon Redshift?

3 minute read
0

I want to resolve errors that occur when I copy data from Amazon Simple Storage Service (Amazon S3) into Amazon Redshift.

Resolution

Important: You must associate the AWS Identity and Access Management (IAM) role that you specify in the COPY command with the Amazon Redshift provisioned cluster or Amazon Redshift Serverless.

The following are issues that you might experience when you copy data from Amazon S3 to Amazon Redshift.

The number and order of columns don't match the table definition

If there are extra columns in the file, then you get "Extra column(s) found" error message.

If there are fewer columns in the file, then you get the "Delimiter not found" error message.

To resolve these issues, make sure that the columns in the file and the table definition match. To skip some of the columns from the flat file, use column mapping options.

The data type doesn't match the table definition

For example, if the COPY command tries to load 'x' to the column that's defined as INTEGER, then you get the "Invalid digit, Value 'x', Pos 0, Type: Integer" error message.

To resolve this issue, modify the input file or table definition to be consistent. For more information, see ALTER TABLE.

The text length in your data file exceeds the table column length

For example, if the COPY command tries to load 'xxxx' to the column that's defined as VARCHAR(2), then you get the "String length exceeds DDL length" error message.

To resolve this issue, modify the input file or table definition to be consistent. You can also use the TRUNCATECOLUMNS option to truncate data to fit the defined column.

Invalid UTF-8 characters

The COPY command only supports UTF-8 characters that are up to 4 bytes long.

If you include an invalid UTF-8 character in the file, then you get the "String contains invalid or unsupported UTF8 codepoints. Bad UTF8 hex sequence:" error message.

To resolve this issue, modify the input file to only contain UTF-8 characters that are up to four bytes long, or use the ACCEPTINVCHARS option to replace the invalid characters.

For more information, see Multibyte character load errors.

Note: For information about COPY command errors, see SYS_LOAD_ERROR_DETAIL, SYS_LOAD_HISTORY and SYS_LOAD_DETAIL.

Related information

Load LISTING using columnar data in Parquet format

Preparing your input data

Amazon Redshift best practices for loading data

Loading tables with the COPY command

AWS OFFICIAL
AWS OFFICIALUpdated 8 days ago