Hi, I'm trying to connect Airflow to Redshift Serverless. My role has the RedshiftFullAccess policy, and when I copy files from S3 to Redshift using S3ToRedshiftOperator I get a "Cluster not found" error. I'm stuck on connecting Airflow to Redshift Serverless; I've checked the Airflow docs, but they're not clear to me. I'd appreciate any resource or tip.
The task definition:

copy_to_redshift = S3ToRedshiftOperator(
    task_id='copy_to_redshift',
    aws_conn_id='redshift_default',
    schema='db_schema',
    table='table_name',
    s3_bucket=S3_BUCKET_NAME,
    s3_key='folder/output_folder',
    copy_options=['FORMAT AS PARQUET'],
    dag=dag,
)
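Going by the "expected connection type 'aws', got 'redshift'" warning further down, my current guess is that I mixed up the two connection ids: S3ToRedshiftOperator seems to take a `redshift_conn_id` for the SQL side and a separate `aws_conn_id` for the S3 credentials, and I passed my Redshift connection as `aws_conn_id`. Here is a sketch of the kwargs split I *think* is intended (the `aws_default` name is an assumption; I haven't created a dedicated AWS connection yet):

```python
# Sketch only -- my guess at the intended split between the two connection
# ids, based on the connection-type warning in the log; not verified.
operator_kwargs = {
    "task_id": "copy_to_redshift",
    "redshift_conn_id": "redshift_default",  # Redshift-type connection: runs the COPY SQL
    "aws_conn_id": "aws_default",            # AWS-type connection: S3 credentials (assumed name)
    "schema": "db_schema",
    "table": "table_name",
    "s3_key": "folder/output_folder",
    "copy_options": ["FORMAT AS PARQUET"],
}
# copy_to_redshift = S3ToRedshiftOperator(s3_bucket=S3_BUCKET_NAME, dag=dag, **operator_kwargs)
```

Is that split correct, or does the Redshift connection really belong in `aws_conn_id`?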
The connection created in the Airflow UI:

Connection Id: redshift_default
Connection Type: Amazon Redshift
Host: WORKGROUP_NAME.ACCOUNT_ID.REGION.redshift-serverless.amazonaws.com
Database: DbName
User: my-username
Password: (empty)
Port: 5439
Extra:
{
  "iam": true,
  "is_serverless": true,
  "serverless_token_duration_seconds": 3600,
  "port": 5439,
  "region": "REGION",
  "database": "DBName",
  "profile": "default"
}
Relevant warnings and error from the task log:
UserWarning: AWS Connection (conn_id='redshift_default', conn_type='redshift') expected connection type 'aws', got 'redshift'. This connection might not work correctly. Please use Amazon Web Services Connection type.
{connection_wrapper.py:378} INFO - AWS Connection (conn_id='redshift_default', conn_type='redshift') credentials retrieved from login and password.
{logging_mixin.py:154} WARNING - <string>:9 UserWarning: Found 'profile' without specifying 's3_config_file' in AWS Connection (conn_id='redshift_default', conn_type='redshift') extra. If required profile from AWS Shared Credentials please set 'profile_name' in AWS Connection (conn_id='redshift_default', conn_type='redshift') extra.
[2024-01-21, 12:32:34 UTC] {logging_mixin.py:154} WARNING - <string>:9 AirflowProviderDeprecationWarning: Host WORKGROUP_NAME.ACCOUNT_ID.REGION.redshift-serverless.amazonaws.com specified in the connection is not used. Please, set it on extra['endpoint_url'] instead
[2024-01-21, 12:32:35 UTC] {s3_to_redshift.py:192} INFO - Executing COPY command...
[2024-01-21, 12:32:35 UTC] {base.py:73} INFO - Using connection ID 'redshift_default' for task execution.
[2024-01-21, 12:32:35 UTC] {base_aws.py:581} WARNING - Unable to find AWS Connection ID 'aws_default', switching to empty.
[2024-01-21, 12:32:35 UTC] {base_aws.py:161} INFO - No connection ID provided. Fallback on boto3 credential strategy (region_name=None). See: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html
[2024-01-21, 12:32:36 UTC] {credentials.py:1052} INFO - Found credentials from IAM Role: AttachedRole-role
[2024-01-21, 12:32:36 UTC] {taskinstance.py:1937} ERROR - Task failed with exception
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/common/sql/hooks/sql.py", line 385, in run
with closing(self.get_conn()) as conn:
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/redshift_sql.py", line 173, in get_conn
conn_params = self._get_conn_params()
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/redshift_sql.py", line 84, in _get_conn_params
conn.login, conn.password, conn.port = self.get_iam_token(conn)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/redshift_sql.py", line 115, in get_iam_token
cluster_creds = redshift_client.get_cluster_credentials(
File "/home/airflow/.local/lib/python3.8/site-packages/botocore/client.py", line 535, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/home/airflow/.local/lib/python3.8/site-packages/botocore/client.py", line 980, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.errorfactory.ClusterNotFoundFault: An error occurred (ClusterNotFound) when calling the GetClusterCredentials operation: Cluster WORKGROUP_NAME not found.
[2024-01-21, 12:32:36 UTC] {taskinstance.py:1400} INFO - Marking task as FAILED. dag_id=redshift-pipe, task_id=copy_to_redshift, execution_date=20240121T123224, start_date=20240121T123234, end_date=20240121T123236
[2024-01-21, 12:32:36 UTC] {standard_task_runner.py:104} ERROR - Failed to execute job 95 for task copy_to_redshift (An error occurred (ClusterNotFound) when calling the GetClusterCredentials operation: Cluster WORKGROUP_NAME not found.; 12865)
[2024-01-21, 12:32:36 UTC] {local_task_job_runner.py:228} INFO - Task exited with return code 1
[2024-01-21, 12:32:36 UTC] {taskinstance.py:2778} INFO - 0 downstream tasks scheduled from follow-on schedule check
I found this GitHub issue and the connection docs, but both are vague to me: https://github.com/apache/airflow/issues/35805 and https://airflow.apache.org/docs/apache-airflow-providers-amazon/stable/connections/redshift.html