Redshift Spectrum giving IERR_READ_ERROR (Access Denied) error

0

I'm trying to use Redshift Spectrum to query gz JSON data in S3. I've used a Glue crawler to create the database and table schema which all appears to be correct. I've created the external schema using

create external schema spectrum from data catalog ...

which populates my SQL IDE with the schema from Glue. But when I go to query data from this table, I get the following error:

S3 Query Exception (Fetch). Task failed due to an internal error. 
Error: HTTP response error code: 403 Message: AccessDenied Access Denied
x-amz-request-id: xxx
x-amz-id-2: xxx
Ion/JSON Stream error while opening buffer: IERR_READ_ERROR
 (s3://<bucket>/.../<file...>...

The S3 data is in a separate AWS account from the Redshift cluster, but I have the S3 bucket policy set up to allow the spectrum-role access based on this link: https://docs.aws.amazon.com/redshift/latest/dg/c-spectrum-iam-policies.html

I've given the spectrum-role full get and list access:

{
    "Sid": "Redshift Spectrum permissions",
    "Effect": "Allow",
    "Principal": {
        "AWS": "arn:aws:iam::<account>:role/spectrum-role"
    },
    "Action": [
        "s3:Get*",
        "s3:List*"
    ],
    "Resource": [
        "arn:aws:s3:::<bucket>",
        "arn:aws:s3:::<bucket>/*"
    ]
}

This thread https://forums.aws.amazon.com/thread.jspa?threadID=280879 suggests that IERR_READ_ERROR is always malformed gz JSON, but seeing "AccessDenied" in the error message makes me question that.

Can anyone help me diagnose what the problem is?

kwhite
已提问 4 年前1006 查看次数
1 回答
0

The Redshift AWS account did not have permissions for spectrum-role to read from the cross-account AWS S3 bucket (the glue role did, but not the spectrum-role). Adding Get** and List** permissions to the spectrum-role allowed the query to complete.

kwhite
已回答 4 年前

您未登录。 登录 发布回答。

一个好的回答可以清楚地解答问题和提供建设性反馈,并能促进提问者的职业发展。

回答问题的准则