Netezza -> S3 copy

A customer is running IBM Netezza:

  • They want to keep a copy of the data stored in Netezza in AWS.
  • Netezza will still be used for some time, so the copy needs to stay in sync.
  • Over time, Netezza will gradually be replaced by functionality in SAP running on EC2.
  • Multiple solutions will use the copy in AWS as the single source of truth.

I was thinking of having them use the SCT Data Extractors to store the copy in S3: https://aws.amazon.com/blogs/database/how-to-migrate-your-data-warehouse-to-amazon-redshift-using-the-aws-schema-conversion-tool-data-extractors/

Redshift will be an option, but it won't be the only solution that needs to access this data. I understand that SCT prepares the data for Redshift, so does it make sense to use the copy in S3 as a source for the other solutions? And is this SCT process reliable enough to keep the copy in sync on a daily basis over a relatively long period?

Asked 6 years ago · Viewed 352 times
1 Answer
Accepted Answer

SCT can use Netezza as a source for the schema only, not the actual data. DMS uses Change Data Capture (CDC) to keep a source and a target synchronized, but Netezza is not a supported source for DMS: CDC relies on transaction logs, which Netezza does not use for transaction control. So Netezza cannot be kept in sync with a target using DMS.

The workaround is to have the ETL systems that load Netezza also write to a second target, in this case SAP, so that the data changes can be applied to each target independently. There is a lot of complexity in making sure there is no split-brain, where the two systems drift out of sync. This is mitigated by using Audit/Balance/Control (ABC) mechanisms in the ETL. The ABC layer will likely need to be built as net-new for this migration, because most Netezza customers do not have ABC built into their ETL architecture.
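
To make the "Balance" part of ABC more concrete, here is a minimal sketch of a reconciliation check between two targets. It is an illustration under assumptions, not part of any AWS or Netezza tooling: `sqlite3` stands in for both targets, and `table_fingerprint` and `balance_check` are hypothetical helper names. A real check would run against Netezza and the second target, and would likely compare aggregates per load batch rather than whole tables.

```python
import hashlib
import sqlite3


def table_fingerprint(conn, table, key_column):
    """Return (row_count, checksum) for a table, reading rows in key order.

    The table and column names are interpolated into the SQL, so they are
    assumed to come from trusted configuration, never from user input.
    """
    rows = conn.execute(
        f"SELECT * FROM {table} ORDER BY {key_column}"
    ).fetchall()
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return len(rows), digest.hexdigest()


def balance_check(primary, secondary, table, key_column):
    """Return True when both targets hold the same rows for the table."""
    return (
        table_fingerprint(primary, table, key_column)
        == table_fingerprint(secondary, table, key_column)
    )
```

Running a check like this after every load gives an early signal that the two targets have diverged, before downstream consumers notice.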

Sending the changes to S3 is not recommended, because they usually contain UPDATE and DELETE requests, which S3 cannot support. The target needs to be able to perform these (in addition to INSERT) in their original order, to ensure the two databases stay equivalent and synchronized.

Some customers perform this synchronization with low latency; others prefer to update the second target in batches. Either way, the order of the transactions matters, so the changes must be applied in a sequence that is consistent with the chosen strategy.
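
As a rough illustration of why ordering matters, the sketch below applies one batch of change records to a target in sequence order, inside a single transaction. Everything in it is an assumption made for the example: `sqlite3` stands in for the second target, the `customer` table and the record layout (`seq`, `op`, `id`, `name`) are hypothetical, and `apply_changes` is not an existing API.

```python
import sqlite3


def apply_changes(conn, changes):
    """Apply a batch of change records in sequence order, atomically."""
    with conn:  # commits on success, rolls back if any statement fails
        for change in sorted(changes, key=lambda c: c["seq"]):
            op = change["op"]
            if op == "INSERT":
                conn.execute(
                    "INSERT INTO customer (id, name) VALUES (?, ?)",
                    (change["id"], change["name"]),
                )
            elif op == "UPDATE":
                conn.execute(
                    "UPDATE customer SET name = ? WHERE id = ?",
                    (change["name"], change["id"]),
                )
            elif op == "DELETE":
                conn.execute(
                    "DELETE FROM customer WHERE id = ?", (change["id"],)
                )
            else:
                raise ValueError(f"unknown operation: {op}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")

# The batch arrives out of order; the "seq" field restores the source order.
batch = [
    {"seq": 3, "op": "DELETE", "id": 2},
    {"seq": 1, "op": "INSERT", "id": 1, "name": "Ann"},
    {"seq": 2, "op": "INSERT", "id": 2, "name": "Bob"},
    {"seq": 4, "op": "UPDATE", "id": 1, "name": "Anna"},
]
apply_changes(conn, batch)
rows = conn.execute("SELECT id, name FROM customer ORDER BY id").fetchall()
print(rows)  # [(1, 'Anna')]
```

If the same batch were applied in arrival order, the DELETE for id 2 would run before its INSERT and the row would survive, leaving the target out of sync with the source.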

Answered 6 years ago
