Redshift snapshots - incremental/full and retention


Hi,

For compliance reasons, my client requires two types of backup: a daily backup with 35-day retention (I assume they read the docs and chose 35 days because it's the limit) and a monthly full backup that is kept for 2 years (24 months).

I'm a little confused by the documentation:

Firstly, the snapshots are incremental but each one can be used to do a full restore to a new cluster - how is this possible? Is it incremental since the last snapshot and if so, what happens if the previous snapshots are deleted?

Secondly, I can see there's a limit of 20 snapshots (which you can request to have raised). Before I even consider manual monthly snapshots: if my automated snapshot is daily and is retained for 35 days, I am going to have at least 35 automated snapshots at any point in time. Will this be an issue?

Thirdly, if my DWH size is 24 TB and I am somehow (?) able to create a full database backup via snapshots, I'm going to be paying for 576 TB of storage (24 months x 24 TB) in S3, which will be very expensive. Ideally, we'd be able to store this in Glacier, but I understand we don't have access to the S3 bucket containing the snapshots.

So my questions are:

  • How is a full cluster restore performed from an incremental snapshot?
  • Is this still possible if there's only one incremental snapshot (all the others are deleted)?
  • Will I exceed the 20 snapshot limit by having a 35 day retention period?
  • Is it possible to create a "full" backup in a snapshot?
  • Is it possible to access snapshot locations on S3 so that we can move them to Glacier and still make them available to Redshift for restore?

Thanks in advance,
J

Asked 7 years ago, 983 views
1 answer

Not sure if you'll get an AWS response on this here. Raise it with your account manager if you need an official answer.

Regarding snapshots, we users have no access to the mechanism, but AFAICT it keeps track of changed blocks and ensures that it has at least one copy of each populated block for the time period specified. Consult the docs for details of what a block is: http://docs.aws.amazon.com/redshift/latest/dg/c_columnar_storage_disk_mem_mgmnt.html
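To make the "changed blocks" idea concrete, here is a toy model in Python. It is only a sketch of how block-level incremental snapshots can work in general; the class and method names are made up for illustration, and nothing here is taken from Redshift's actual internals, which we can't see. Each snapshot is a manifest pointing at stored blocks, only new blocks are uploaded, and a block is discarded only once no remaining snapshot references it. That is why any single snapshot could still give a full restore after older ones are deleted.

```python
# Toy model only: illustrates block-level incremental snapshots in general.
# It is NOT Redshift's real implementation; all names are hypothetical.
class SnapshotStore:
    def __init__(self):
        self.blocks = {}     # (block_no, content) -> content, each stored once
        self.snapshots = {}  # snapshot name -> {block_no: key into self.blocks}

    def take(self, name, cluster_blocks):
        """Record a snapshot; return how many blocks had to be uploaded."""
        manifest = {}
        uploaded = 0
        for block_no, data in cluster_blocks.items():
            key = (block_no, data)
            if key not in self.blocks:  # only changed/new blocks are copied
                self.blocks[key] = data
                uploaded += 1
            manifest[block_no] = key
        self.snapshots[name] = manifest
        return uploaded

    def delete(self, name):
        """Drop a snapshot and free blocks no other snapshot still needs."""
        del self.snapshots[name]
        still_needed = {k for m in self.snapshots.values() for k in m.values()}
        for key in list(self.blocks):
            if key not in still_needed:
                del self.blocks[key]

    def restore(self, name):
        """Rebuild the full set of blocks from a single snapshot manifest."""
        manifest = self.snapshots[name]
        return {block_no: self.blocks[key] for block_no, key in manifest.items()}


store = SnapshotStore()
day1 = {0: "a", 1: "b", 2: "c"}
day2 = {0: "a", 1: "B", 2: "c"}      # only block 1 changed overnight

up1 = store.take("day1", day1)       # first snapshot copies every block
up2 = store.take("day2", day2)       # second snapshot copies only block 1
store.delete("day1")                 # old version of block 1 is freed
restored = store.restore("day2")     # still a complete copy of day 2
```

In this model the first snapshot is effectively the "full" one and later ones are cheap, but deleting the first does not break the later ones because the shared blocks are kept.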

Re full backups retained for 2 years, the snapshot mechanism doesn't offer this. You'd have to UNLOAD every table to S3 and use a lifecycle rule on the bucket to move the objects to Glacier automatically after, say, 7 days. This sounds like a huge pain in the ass to me.
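A rough sketch of what that could look like, in Python. The bucket name, IAM role ARN, prefix layout and table list are all placeholders I made up, and the script only builds the UNLOAD statements and the lifecycle rule as data; it doesn't connect to anything. You would run the statements with whatever SQL client you use, and the rule dictionary follows the shape boto3's put_bucket_lifecycle_configuration expects for its LifecycleConfiguration argument.

```python
# Sketch only: BUCKET, IAM_ROLE, the prefix layout and the table list are
# hypothetical placeholders, not values from the original post.
BUCKET = "example-archive-bucket"
IAM_ROLE = "arn:aws:iam::123456789012:role/ExampleUnloadRole"


def build_unload(schema, table, month):
    """Return an UNLOAD statement that writes one table under a monthly prefix."""
    target = f"s3://{BUCKET}/monthly/{month}/{schema}.{table}/"
    return (
        f"UNLOAD ('SELECT * FROM {schema}.{table}') "
        f"TO '{target}' "
        f"IAM_ROLE '{IAM_ROLE}' "
        "GZIP ALLOWOVERWRITE;"
    )


# Lifecycle rule: move everything under monthly/ to Glacier after 7 days.
# Could be applied with boto3, e.g.
#   s3.put_bucket_lifecycle_configuration(
#       Bucket=BUCKET, LifecycleConfiguration=lifecycle)
lifecycle = {
    "Rules": [
        {
            "ID": "monthly-unload-to-glacier",
            "Filter": {"Prefix": "monthly/"},
            "Status": "Enabled",
            "Transitions": [{"Days": 7, "StorageClass": "GLACIER"}],
        }
    ]
}

tables = [("public", "orders"), ("public", "customers")]  # example tables
statements = [build_unload(s, t, "2017-06") for s, t in tables]
for stmt in statements:
    print(stmt)
```

Restoring from this kind of archive means retrieving the objects from Glacier first and then running COPY for each table, so it is much slower than a snapshot restore.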

As an alternative, we retain all of the incoming load data on S3 and design our ETL so that it can be reloaded as needed. Our lifecycle rule moves the objects to Infrequent Access after 7 days and to Glacier after 90 days.

Answered 7 years ago
Expert reviewed 21 days ago
