How do I identify errors related to Data API in Amazon Redshift?

4 minute read
0

How do I identify the reason why a Data API query in Amazon Redshift failed?

Resolution

Amazon Redshift Data API is asynchronous, meaning you can run long-running queries without having to wait for it to complete. When a Data API query fails, the status of the query isn't displayed immediately. To determine the reasons for failure, use the DescribeStatement action for single or multiple queries. To run the DescribeStatement, you must have the statement ID.

Single query

To run a single query against the cluster, use the ExecuteStatement action to return a statement ID:

Note: The following example command uses the AWS Secrets Manager authentication method. The command runs an SQL statement against a cluster and returns an identifier to fetch the results.

aws redshift-data execute-statement
    --region us-east-1
    --secret arn:aws:secretsmanager:us-east-1:123456789012:secret:myuser-secret-hKgPWn
    --cluster-identifier redshift-cluster-1
    --sql "select * from test_table;"
    --database dev

Note: If you receive errors when running AWS CLI commands, make sure that you’re using the most recent version of the AWS CLI.

The output looks similar to the following:

{
    "ClusterIdentifier": "redshift-cluster-1",
    "CreatedAt": "2022-09-16T12:22:31.894000+05:30",
    "Database": "dev",
    "Id": "458c568d-717b-4f36-90bd-e642bfb06cbf",
    "SecretArn": "arn:aws:secretsmanager:us-east-1:123456789012:secret:myuser-secret-hKgPWn"
}

The preceding SQL statement returns an ExecuteStatementOutput, which includes the statement Id. You can check the status of the query using DescribeStatement and entering the statement ID:

aws redshift-data describe-statement --id 458c568d-717b-4f36-90bd-e642bfb06cbf

The output for DescribeStatement provides the following additional details:

  • RedshiftPid
  • Query duration
  • Number of rows in
  • Size of the result set
  • RedshiftQueryID

The output looks similar to the following:

{
    "ClusterIdentifier": "redshift-cluster-1",
    "CreatedAt": "2022-09-16T12:22:31.894000+05:30",
    "Duration": -1,
    "Error": "ERROR: relation \"test_table\" does not exist",
    "HasResultSet": false,
    "Id": "458c568d-717b-4f36-90bd-e642bfb06cbf",
    "QueryString": "select * from test_table;",
    "RedshiftPid": 1074727629,
    "RedshiftQueryId": -1,
    "ResultRows": -1,<
    "ResultSize": -1,
    "SecretArn": "arn:aws:secretsmanager:us-east-1:123456789012:secret:myuser-secret-hKgPWn",
    "Status": "FAILED",
    "UpdatedAt": "2022-09-16T12:22:32.365000+05:30"
}

The "Error": section in the preceding response displays the exact error. Which in the preceding example is "ERROR: relation "test_table" does not exist".

Multiple queries

To run multiple queries against the cluster use the BatchExecuteStatement action to return a statement ID:

aws redshift-data batch-execute-statement
    --region us-east-1
    --secret-arn arn:aws:secretsmanager:us-east-1:123456789012:secret:myuser-secret-hKgPWn     --cluster-identifier redshift-cluster-1
        --database dev
    --sqls "select * from test_table;" "select * from another_table;"

The output looks similar to the following:

{
    "ClusterIdentifier": "redshift-cluster-1",
    "CreatedAt": "2022-09-16T12:37:16.707000+05:30",
    "Database": "dev",
    "Id": "08b4b917-9faf-498a-964f-e82a5959d1cb",
    "SecretArn": "arn:aws:secretsmanager:us-east-1:123456789012:secret:myuser-secret-hKgPWn"
}

To get the status of the queries, use the DescribeStatement with the statement ID from the preceding response:

aws redshift-data describe-statement --id 08b4b917-9faf-498a-964f-e82a5959d1cb

The output looks similar to the following:

{
    "ClusterIdentifier ": "redshift-cluster-1 ",
    "CreatedAt ": "2022-09-16T12:37:16.707000+05:30 ",
    "Duration ": 0,
    "Error ": "Query #1 failed with ERROR: relation \ "test_table\"
does not exist ",
    "HasResultSet ": false,
    "Id ": "08b4b917-9faf-498a-964f-e82a5959d1cb ",
    "RedshiftPid ": 1074705048,
    "RedshiftQueryId ": 0,
    "ResultRows ":-1,
    "ResultSize ": -1,
    "SecretArn ": "arn:aws:secretsmanager:us-east-1:123456789012:secret:myuser-secret-hKgPWn ",
    "Status ": "FAILED ",
    "SubStatements ": [
        {
            "CreatedAt ": "2022-09-16T12:37:16.905000+05:30 ",
            "Duration ": -1,
            "Error ": "ERROR: relation \ "test_table\" does not exist ",
            "HasResultSet ": false,
            "Id ": "08b4b917-9faf-498a-964f-e82a5959d1cb:1",
            "QueryString ": "select * from test_table; ",
            "RedshiftQueryId ": -1,
            "ResultRows ": -1,
            "ResultSize ": -1,
            "Status ": "FAILED ",
            "UpdatedAt ": "2022-09-16T12:37:17.263000+05:30 "
        },
        {
            "CreatedAt ": "2022-09-16T12:37:16.905000+05:30",
            "Duration ": -1,
            "Error ": "Connection or an prior query failed. ",
            "HasResultSet ": false,
            "Id ": "08b4b917-9faf-498a-964f-e82a5959d1cb:2 ",
            "QueryString ": "select * from another_table;",
            "RedshiftQueryId ": 0,
            "ResultRows ": -1,
            "ResultSize": -1,
            "Status ": "ABORTED ",
            "UpdatedAt ": "2022-09-16T12:37:17.263000+05:30 "
        }
    ],
    "UpdatedAt ": "2022-09-16T12:37:17.288000+05:30 "
}

The preceding output displays the status of all sub-statements for a multi-statement query. The "Error": section in the preceding response displays the exact error for each sub-statement.

To troubleshoot problems with the Data API, see Troubleshooting issues for Amazon Redshift Data API.

Monitoring Data API events

Data API events can be monitored using Amazon EventBridge. This information can be sent to an AWS Lambda function that's integrated with Amazon Simple Notification Service (Amazon SNS) to send notifications. For more information, see Building an event-driven application with AWS Lambda and the Amazon Redshift Data API.


AWS OFFICIAL
AWS OFFICIALUpdated a year ago