- Newest
- Most votes
- Most comments
U sure the base zero etl base tables no longer show the rows? The deletes might not sync if they are triggered by cascades which zero etl (dms) doesn’t support. Also maybe the delete has a delay, make two mat views, fully refresh one then incrementally refresh the other to ensure ur not incrementally refreshing too soon.
I mean it might just be internally flagging the row as deleted but the MV team didn’t accommodate for that in their incremental refresh code. I’ve seen that code break, especially with joining multiple unnested supers. I would try to make a dead simple reproduction, so you can open a support case and give them step by step instructions that show the same exact error. I mean instructions including “create a mysql rds, create a table, create a redshift serverless, set up zeroetl, make a incremental mat view with this exact sql statement, ensure adding a row to mysql table shows it in the mat view after incremental refresh, delete the row in mysql, see that it’s still in the mat view after another incremental refresh, do a full refresh and now see it’s gone”. They need clean scratch reproduction steps.
The behavior you're experiencing with incremental refreshes of materialized views in Amazon Redshift not removing deleted rows is not the expected behavior for properly configured materialized views. Typically, incremental refreshes should identify and apply changes from the base tables, including deletions.
However, there are a few factors that could potentially explain this situation:
-
Transaction visibility: Incremental refreshes use only the base table rows that are already committed. If the deletions were part of a transaction that wasn't fully committed when the refresh operation started, those changes might not be visible to the refresh process.
-
Refresh method: Amazon Redshift automatically chooses the refresh method based on the SELECT query used to define the materialized view. It's possible that for your specific materialized view definition, the system is not able to efficiently track deletions during an incremental refresh.
-
Limitations of incremental refresh: Some SQL constructs and functions used in the materialized view definition might limit the ability to perform efficient incremental refreshes, especially for deletions.
-
Data consistency: Ensure that the Zero-ETL integration is maintaining proper consistency between your RDS MySQL source and the Redshift target, especially for delete operations.
To address this issue, you could try the following:
- Review your materialized view definitions to ensure they're optimized for incremental refreshes.
- Check the SVL_MV_REFRESH_STATUS system view to see the details of recent refresh operations and their effectiveness.
- If the issue persists, you might need to schedule periodic full refreshes to ensure complete synchronization, including deletions.
- Contact AWS support for a more in-depth investigation of your specific setup and materialized view definitions.
Remember that while incremental refreshes are generally faster and more efficient, there can be edge cases where they don't capture all changes, especially in complex scenarios involving multiple data sources or intricate query structures.
Sources
REFRESH MATERIALIZED VIEW - Amazon Redshift
Refreshing a materialized view - Amazon Redshift
Refreshing materialized views - AWS Prescriptive Guidance
Relevant content
- asked a year ago
- asked 2 years ago
