How can I integrate Amazon S3 with my Aurora PostgreSQL-Compatible DB cluster?

3 minute read
0

I want to turn on the Amazon Simple Storage Solution (Amazon S3) integration feature in my Amazon Aurora PostgreSQL-Compatible Edition DB cluster.

Short description

You can import data that is stored in an Amazon S3 bucket into a table on an Aurora PostgreSQL-Compatible DB cluster. First, install the Aurora PostgreSQL-Compatible aws_s3 extension. This extension provides the functions that you need to import data from an S3 bucket. The data can be in a comma-separate value (CSV) file, a text file (txt), or a compressed (gzip) file. Your database must be running PostgreSQL version 10.7 or higher to import from Amazon S3 into Aurora PostgreSQL-Compatible.

To integrate your Aurora DB cluster with an S3 bucket:

  • Install the aws_s3 extension
  • Set up access to an S3 bucket
  • Add the option to the Aurora cluster
  • Use a psql query to import or export the data from the database

Resolution

Install the aws_s3 extension

To install the aws_s3 extension that allows you to connect to the database, run this command:

postgres=> CREATE EXTENSION aws_s3 CASCADE;

To verify that aws_s3 is installed, use the psql \dx meta-command. Or you can use this query from any DB client tool:

select * from pg_available_extensions where installed_version is not null;

Set up access to an S3 bucket

To give your Aurora PostgreSQL-Compatible cluster access to Amazon S3, create an AWS Identity and Access Management (IAM) policy. This policy provides the bucket and object permissions that allow your DB cluster to access the S3 bucket.

In the IAM policy that you create, include these required actions. These actions allow the transfer of files from the S3 bucket to your DB cluster:

  • s3:GetObject
  • s3:ListBucket
  • s3:PutObject
  • s3:AbordMultipartUpload

For more information, see Setting up access to an Amazon S3 bucket.

Add the option to the Aurora cluster

Add an IAM role to your cluster using either the Amazon Relational Database Service (Amazon RDS) console or the AWS Command Line Interface (AWS CLI).

Using the console

  1. Log in to the Amazon RDS console.
  2. From the navigation pane, choose the PostgreSQL cluster that you want to modify
  3. Chose the Connectivity & security tab.
  4. From the Manage IAM roles section, under Add IAM roles to this instance, choose the role that you want to add.
  5. Under Feature, choose s3Export, and then choose Add role.

Import or export data from the database

Import data to your PostgreSQL cluster from the S3 bucket or export data to your S3 bucket from your PostgreSQL cluster.

For more information, see Export and import data from Amazon S3 to Amazon Aurora PostgreSQL.

After you complete these steps, import data to an Aurora PostgreSQL-Compatible cluster from Amazon S3 using aws_s3.table_import_from_s3. You can export data from Aurora PostgreSQL-Compatible cluster to Amazon S3 using the aws_s3.query_export_to_s3 function.


AWS OFFICIAL
AWS OFFICIALUpdated 2 years ago
2 Comments

There are a few things missing in this post. 1 VPC endpoint set up to allow RDS (typically in private subnet) connect to S3 (in public) 2 IAM role grant to the cluster as this post leading to or IAM role grant to RDS instance

aws rds add-role-to-db-instance \
   --db-instance-identifier test-db-1-s3-copy \
   --feature-name s3Import \
   --role-arn arn:aws:iam::1234567890:role/rds-s3-import-role \
   --region eu-west-1

As I posted in https://repost.aws/questions/QU9UIDaEE4ToyMNZdhEfYUmQ that IAM role grant to the cluster doesn't allow the RDS instance access S3. I have a thought that it is AWS' cunning plan to stop Aurora DBs become a competitor to Redshift.

JingyeL
replied 2 months ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

profile pictureAWS
MODERATOR
replied 2 months ago