Redshift snapshots - incremental/full and retention

Hi,

For compliance reasons, my client requires two types of backup: a daily backup with 35-day retention (although I assume they read the docs and picked 35 days because it's the limit) and a monthly full backup that is kept for 2 years (24 months).

I'm a little confused by the documentation.

Firstly, the snapshots are incremental, but each one can be used to do a full restore to a new cluster. How is this possible? Is each snapshot incremental relative to the previous one and, if so, what happens if the previous snapshots are deleted?

Secondly, I can see there's a limit of 20 snapshots (which you can request to have raised). Before I even consider manual monthly snapshots, if my automated snapshot is daily and is retained for 35 days, I am going to have 35 or more automated snapshots at any point in time. Will this be an issue?

Thirdly, if my DWH size is 24 TB and I am somehow (?) able to create a full database backup via snapshots, I'm going to be paying for the storage of 576 TB (24 months x 24 TB) in S3, which will be very expensive. Ideally, we'd be able to store this in Glacier, but I understand we don't have access to the S3 bucket containing the snapshots.
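To make the worst-case figure above explicit, here is a minimal sketch of the arithmetic. It assumes every monthly backup is a complete, independent 24 TB copy with no block sharing or compression; the function name is only illustrative.

```python
def retained_full_copies_tb(cluster_tb: int, copies_retained: int) -> int:
    """Worst-case storage if every retained backup is a full, independent copy."""
    return cluster_tb * copies_retained


# 24 TB warehouse, one full backup per month, kept for 24 months.
total_tb = retained_full_copies_tb(cluster_tb=24, copies_retained=24)
print(f"{total_tb} TB")  # prints "576 TB"
```

If snapshots share unchanged blocks, the real figure would be lower than this upper bound.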

So my questions are:

  • How is a full cluster restore performed from an incremental snapshot?
  • Is this still possible if there's only one incremental snapshot left (all the others having been deleted)?
  • Will I exceed the 20-snapshot limit by having a 35-day retention period?
  • Is it possible to create a "full" backup in a snapshot?
  • Is it possible to access snapshot locations on S3 so that we can move them to Glacier and still make them available to Redshift for restore?

Thanks in advance,
J

asked 7 years ago, 971 views
1 Answer

Not sure you'll get an AWS response to this on here. Raise it with your account manager if you need an official answer.

Regarding snapshots, we users have no access to the mechanism, but AFAICT it keeps track of changed blocks and ensures that it has at least one copy of each populated block for the time period specified. That would explain why any single retained snapshot can still be restored in full after older ones are deleted. Consult the docs for details of what a block is: http://docs.aws.amazon.com/redshift/latest/dg/c_columnar_storage_disk_mem_mgmnt.html
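As an illustration only, here is a toy model of block-level incremental snapshots. It is not Redshift's actual implementation (which is not visible to users); the `BlockStore` class and its method names are invented for this sketch, and blocks are identified by content hash as a stand-in for changed-block tracking. It shows how each snapshot can hold a complete manifest while only new blocks cost extra storage, and why deleting an older snapshot need not break a newer one.

```python
import hashlib


class BlockStore:
    """Toy snapshot store: blocks are saved once, snapshots are manifests."""

    def __init__(self):
        self.blocks = {}     # content hash -> block data
        self.snapshots = {}  # snapshot name -> {block position: content hash}

    def take_snapshot(self, name, cluster_blocks):
        manifest = {}
        for position, data in cluster_blocks.items():
            digest = hashlib.sha256(data).hexdigest()
            # Only blocks not already stored take up new space (the "increment").
            self.blocks.setdefault(digest, data)
            manifest[position] = digest
        self.snapshots[name] = manifest

    def delete_snapshot(self, name):
        del self.snapshots[name]
        # Keep any block that a remaining snapshot still references.
        live = {d for m in self.snapshots.values() for d in m.values()}
        for digest in list(self.blocks):
            if digest not in live:
                del self.blocks[digest]

    def restore(self, name):
        # Every snapshot's manifest is complete, so a full restore is possible.
        return {pos: self.blocks[d] for pos, d in self.snapshots[name].items()}


store = BlockStore()
cluster = {0: b"block-a", 1: b"block-b"}
store.take_snapshot("day1", cluster)

cluster[1] = b"block-c"               # one block changes between snapshots
store.take_snapshot("day2", cluster)  # stores only the one new block

store.delete_snapshot("day1")         # the older snapshot goes away
restored = store.restore("day2")      # the newer one is still a full copy
print(restored == cluster)            # prints True
```

In this model, a snapshot is "incremental" in what it costs to store and "full" in what it can restore.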

Re full backups retained for 2 years, the snapshot mechanism doesn't offer this. You'd have to UNLOAD every table to S3 and use a lifecycle rule on the bucket to move the objects to Glacier automatically after, say, 7 days. This sounds like a huge pain in the ass to me.
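A rough sketch of what the UNLOAD-plus-lifecycle approach might look like follows. The bucket, prefix, table names and IAM role ARN are placeholders, not values from this thread. The code only builds the UNLOAD statements and a lifecycle configuration dict in the shape that boto3's `put_bucket_lifecycle_configuration` accepts; it does not connect to Redshift or S3.

```python
def build_unload_sql(schema, table, bucket, prefix, iam_role_arn):
    """Return a Redshift UNLOAD statement that exports one table to S3."""
    target = f"s3://{bucket}/{prefix}/{schema}/{table}/"
    return (
        f"UNLOAD ('SELECT * FROM {schema}.{table}') "
        f"TO '{target}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "GZIP ALLOWOVERWRITE;"
    )


def build_glacier_lifecycle(prefix, days=7):
    """Return a lifecycle configuration moving objects under prefix to Glacier."""
    return {
        "Rules": [
            {
                "ID": f"archive-{prefix}-after-{days}-days",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


# Placeholder names, for illustration only.
tables = [("public", "orders"), ("public", "customers")]
role = "arn:aws:iam::123456789012:role/example-unload-role"

statements = [
    build_unload_sql(schema, table, "example-backup-bucket", "monthly", role)
    for schema, table in tables
]
lifecycle = build_glacier_lifecycle("monthly", days=7)

for sql in statements:
    print(sql)
```

The statements would then be run against the cluster once a month, and the lifecycle dict applied once to the bucket. Restoring means retrieving the objects from Glacier and reloading them with COPY, so table DDL needs to be backed up separately.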

As an alternative, we retain all of the incoming load data on S3 and design our ETL so that it can be reloaded as needed. Our lifecycle rule moves the objects to Infrequent Access after 7 days and to Glacier after 90 days.

answered 7 years ago