2 Answers
The easiest option is to enable data sharing between the provisioned cluster and the serverless instance:
- Take a snapshot of the provisioned cluster and restore it to serverless. This will be your baseline.
- Set up data sharing between the provisioned and serverless instances.
- Run INSERT INTO scripts to copy data from the data share tables into local tables on serverless (see the SQL sketch after this list). Ideally you have audit fields so you can pull only incremental records; for tables that do not, a full table refresh is required.
Also, for a Blue/Green scenario, consider the following steps at the time of cut-over.
- Disconnect all inbound and outbound connections from the old instance and route them to the serverless instance.
- Pause the provisioned cluster (you can still query the data share from the consumer serverless instance even though the producer is paused).
- Ensure you have copied all the data from the data share into your local tables on serverless (a reconciliation sketch follows this list).
- Take a final snapshot for future reference and then delete the old provisioned instance.
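As a rough reconciliation check before deleting the old cluster, you could compare row counts between the shared tables and their local copies (reusing the hypothetical objects from the sketch above):

```sql
-- The producer is paused, but the data share remains readable from the consumer.
SELECT 'orders' AS table_name,
       (SELECT COUNT(*) FROM shared_db.public.orders) AS rows_in_share,
       (SELECT COUNT(*) FROM public.orders)           AS rows_local;
-- If the counts differ, re-run the incremental INSERT INTO for that table
-- before taking the final snapshot and deleting the provisioned cluster.
```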
@Denis, you can run the INSERT INTO statements from a stored procedure or a Lambda function; either way is fine. If it is a one-time task, you can just run them from the query editor. For running it repeatedly, I would suggest a stored procedure scheduled with Query Editor V2.
References:
- https://docs.aws.amazon.com/redshift/latest/mgmt/query-editor-v2-schedule-query.html
- https://docs.aws.amazon.com/redshift/latest/dg/performing-a-deep-copy.html
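A sketch of wrapping the incremental copies in a stored procedure that a Query Editor V2 schedule (or a Lambda) could run repeatedly; the procedure name and the table/column names are assumptions carried over from the sketch above:

```sql
CREATE OR REPLACE PROCEDURE public.sp_incremental_copy()
AS $$
BEGIN
  -- Repeat one INSERT per table to migrate; updated_at is an assumed audit column.
  INSERT INTO public.orders
  SELECT s.*
  FROM shared_db.public.orders AS s
  WHERE s.updated_at > (SELECT COALESCE(MAX(updated_at), '1900-01-01'::timestamp)
                        FROM public.orders);
END;
$$ LANGUAGE plpgsql;

-- This is the statement a Query Editor V2 schedule would execute:
CALL public.sp_incremental_copy();
```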
"Run "insert into" scripts to copy data from data share tables to local tables in serverless."
How do I do that? Is it configured from the data share, or do I have to write a custom Lambda for it?
I accepted the answers from the two comments by Satesh Sonti-Redshift.
It seems it would work, but I decided we should go with a different solution that is specific to our team: currently we update Redshift data through COPY statements that load data from an S3 bucket into Redshift. These statements are executed in one of our Lambda functions. We will change that Lambda function's code to send updates not only to Redshift but also to the new Redshift Serverless instance that we will create (through the CDK). To do this we will use the RedshiftDataClient class.
This way all new data is uploaded to both RS and RS Serverless, keeping them in sync.
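For illustration, the same COPY statement can be sent to both targets; the bucket, prefix, IAM role, and table below are hypothetical. With the Redshift Data API (which RedshiftDataClient wraps), the provisioned cluster is addressed by its cluster identifier and the serverless side by its workgroup name.

```sql
-- The Lambda would issue this same COPY twice via the Data API:
-- once against the provisioned cluster and once against the serverless workgroup.
COPY public.orders
FROM 's3://example-bucket/incoming/orders/'                        -- hypothetical prefix
IAM_ROLE 'arn:aws:iam::123456789012:role/example-redshift-copy'    -- hypothetical role
FORMAT AS PARQUET;                                                 -- assumed file format
```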
Thank you for the support Satesh, highly appreciated.