Redshift Materialized View Auto Refresh Failed for Kinesis Streaming Ingestion

0

I've created an external table on top of the Kinesis Data Streams and created a materialized view on top of the external table. It is running properly when I added a new column key/value JSON pair to the existing data stream to the kinesis my materialized view auto refresh fails. My doubt is I am creating the materialized view with only the required columns so it shouldn't be a problem even though there is an additional column in an external table right but my materialized view auto refresh failed due to this. So what I did was I dropped the data with additional columns and repushed the data to the Kinesis Data Stream after one day I got the recent data at that time how did the auto-refresh succeed without fail?

My Doubts:

  1. Why does adding an additional column of data to the external table affect the materialized view which is created with required columns?
  2. Reloading the data to the data stream again with proper columns how does the auto-refresh succeed after one day? why not immediately
MSkalai
asked 5 months ago305 views
1 Answer
3

Modify a streaming Materialized View (MV) to have a new column is not possible (check the following documentation on ALTER MATERIALIZED VIEWS).

In order to add a new column to a MV, follow these steps:

  1. Create a new MV "new_streaming_MV", including the additional column/s. To turn on auto refresh, use AUTO REFRESH YES. The default behavior is manual refresh.
  2. Back up the data in the old MV into the new MV using the ALTER TABLE APPEND command. The following command will move the data records from "old_streaming_MV" into the "new_streaming_MV": ALTER TABLE <new_streaming_MV> APPEND FROM <old_streaming_MV>;
  3. Manually refresh the Materialized View using the command: REFRESH MATERIALIZED VIEW <new_streaming_MV>;
  4. You can now drop the "old_streaming_MV".

Regarding the delay on the autorefresh of the streaming MV, Amazon Redshift considers multiple factors in order to complete refresh of the most important materialized views with minimal impact to active workloads in your cluster. These factors include current system load, the resources needed for refresh, available cluster resources, and how often the materialized views are used. Amazon Redshift prioritizes your workloads over autorefresh and might stop autorefresh to preserve the performance of user workload.

Please, refer to this additional documentation about Refreshing Materialized Views and Redshift Streaming Ingestion:

profile pictureAWS
EXPERT
answered 5 months ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions