Issues Connecting to Redshift from EMR with IAM Authentication: "FATAL: user 'IAM:yarn' does not exist"


Hello Everyone,

I am having trouble loading data from an Amazon Redshift cluster into a Spark dataframe using the Redshift connector included in EMR release 6.13.0. My setup reads from the Redshift cluster using IAM authentication, and I've ensured that the IAM role associated with the Redshift cluster has the required permissions. However, I keep getting the following error:

java.sql.SQLException: FATAL: user "IAM:yarn" does not exist
	at com.amazon.redshift.util.RedshiftException.getSQLException(RedshiftException.java:56)

The code snippet I'm using is as simple as this:

# endpoint and iam_role_arn are defined elsewhere in the job
url0 = "jdbc:redshift:iam://" + endpoint + "/dev"
df = spark.read \
    .format("io.github.spark_redshift_community.spark.redshift") \
    .option("url", url0) \
    .option("dbtable", "areas") \
    .option("tempdir", "s3://bucket-name/tmp_redshift/") \
    .option("aws_iam_role", iam_role_arn) \
    .load()

I don't understand where the user IAM:yarn comes from. I never specified that I want to connect as this user.

Other considerations:

  • My EMR cluster runs in a private subnet. My Redshift cluster is in another private subnet in the same VPC. I created an endpoint from EMR to the Redshift service.
  • I've ensured that there is network access between the Spark cluster and the Redshift cluster (a plain jdbc:redshift: URL works fine when I add a user name and password).
  • The IAM role in use should have the permissions needed to connect to Redshift, since it has the AmazonRedshiftAllCommandsFullAccess policy attached.

Thanks for your help.

Asked 7 months ago · Viewed 526 times
1 Answer

Hello,

"FATAL: user "IAM:yarn" does not exist"

Here EMR treats 'yarn' as the user name. That name is passed as DbUser to the GetClusterCredentials API, and the database user name that comes back is prefixed with "IAM:", which is why the connection is attempted as "IAM:yarn".

The documentation [1] states that if a user name matching DbUser exists in the database, the temporary user credentials have the same permissions as the existing user. If no database user matches the DbUser value, the GetClusterCredentials call still succeeds, but the connection attempt fails because the user does not exist in the database. This means that the DbUser values 'emruser' and 'yarn' do not exist in the database.
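To check this behavior outside of Spark, the same API call can be made directly. The following is only a minimal sketch, assuming boto3 is available and the calling role is allowed to call redshift:GetClusterCredentials. The helper name fetch_temp_credentials and the cluster identifier used in the note below are made up for illustration; the client is passed in as an argument rather than created inside the helper.

```python
def fetch_temp_credentials(redshift_client, cluster_id, db_name, db_user,
                           auto_create=False):
    """Request temporary database credentials for db_user.

    redshift_client is expected to be a boto3 Redshift client
    (an assumption of this sketch); fetch_temp_credentials itself
    is a hypothetical helper, not part of any AWS library.
    """
    response = redshift_client.get_cluster_credentials(
        ClusterIdentifier=cluster_id,
        DbName=db_name,
        DbUser=db_user,
        AutoCreate=auto_create,
    )
    # The DbUser in the response is the name the connection will log in
    # with, which is the requested name with the "IAM:" prefix added.
    return response["DbUser"], response["DbPassword"]
```

With a real client this could be called as fetch_temp_credentials(boto3.client("redshift"), "example-cluster", "dev", "yarn"). If the call succeeds but logging in with the returned user name fails, that points to the missing database user rather than to missing IAM permissions.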

In addition, GetClusterCredentials returns a temporary database user name and password with temporary authorization to log on to the Redshift database [2]. The returned database user name is prefixed with "IAM:". So, you can use DbUser with AutoCreate: if DbUser doesn't exist in the database and AutoCreate is true, a new user named DbUser is created.
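One way to apply this from Spark is to set DbUser and AutoCreate as parameters on the JDBC URL, which the Redshift JDBC driver accepts as IAM connection options. The following is only a sketch: build_iam_jdbc_url is a hypothetical helper, the endpoint and the user name spark_reader are placeholders, and it assumes the EMR role is allowed to call redshift:GetClusterCredentials (and redshift:CreateClusterUser when AutoCreate is true).

```python
from urllib.parse import urlencode


def build_iam_jdbc_url(endpoint, database, db_user, auto_create=True):
    """Build a Redshift IAM JDBC URL that names the database user explicitly.

    endpoint is expected to be "host:port". DbUser and AutoCreate are
    Redshift JDBC driver options; the helper itself is hypothetical.
    """
    params = {
        "DbUser": db_user,
        # The driver expects the literal strings "true" or "false".
        "AutoCreate": str(auto_create).lower(),
    }
    return f"jdbc:redshift:iam://{endpoint}/{database}?{urlencode(params)}"


# Placeholder endpoint and user name, for illustration only.
url0 = build_iam_jdbc_url(
    "example-cluster.abc123.us-east-1.redshift.amazonaws.com:5439",
    "dev",
    "spark_reader",
)
```

The resulting url0 can then be passed to .option("url", url0) in place of the URL without parameters, so the connection is requested for the named user instead of the default 'yarn'.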

[1] - https://docs.aws.amazon.com/redshift/latest/APIReference/API_GetClusterCredentials.html#API_GetClusterCredentials_RequestParameters

[2] - https://docs.aws.amazon.com/redshift/latest/APIReference/API_GetClusterCredentials.html

AWS
Support Engineer
Answered 7 months ago
