2 Answers
0
Using bookmarks is the recommended approach with Glue. However, nothing stops you from passing a parameter to the Glue script [https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-crawler-pyspark-extensions-get-resolved-options.html] and then generating the query dynamically based on that parameter, if you prefer.
The other question is why you want to use Glue rather than a Redshift federated query, which might be an easier option.
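As a rough sketch of the parameter approach, the job could read an optional date argument with `getResolvedOptions` and build a one-day range query from it. The argument name `--load_date`, the table `orders`, and the column `updated_at` are hypothetical placeholders, not taken from the question; the fallback to "yesterday" is also an assumption about the desired behavior.

```python
import sys
from datetime import date, datetime, timedelta


def resolve_load_date(raw, today=None):
    """Return the day to extract: the parsed parameter, or yesterday if none was given."""
    if raw:
        return datetime.strptime(raw, "%Y-%m-%d").date()
    return (today or date.today()) - timedelta(days=1)


def build_query(table, ts_column, load_date):
    """Build a half-open range query that covers exactly one calendar day."""
    next_day = load_date + timedelta(days=1)
    return (
        f"SELECT * FROM {table} "
        f"WHERE {ts_column} >= '{load_date.isoformat()}' "
        f"AND {ts_column} < '{next_day.isoformat()}'"
    )


def main():
    # awsglue is only available inside the Glue runtime, so import it lazily.
    from awsglue.utils import getResolvedOptions

    raw = None
    # getResolvedOptions raises if a requested argument is missing,
    # so only ask for the (hypothetical) --load_date when it was passed.
    if "--load_date" in sys.argv:
        raw = getResolvedOptions(sys.argv, ["load_date"])["load_date"]

    # "orders" and "updated_at" are placeholder names for illustration.
    query = build_query("orders", "updated_at", resolve_load_date(raw))
    print(query)

# In the Glue job script, call main() at the bottom of the file.
```

The resulting string could then be handed to whatever reader the job uses against MySQL (for example, the Spark JDBC `query` option). Because the day boundaries are computed explicitly, the result does not depend on what time after midnight the job actually starts.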
Answered 2 years ago
0
One option is to use DMS to incrementally load Redshift using MySQL as a source. Here's a whitepaper on the topic: https://docs.aws.amazon.com/whitepapers/latest/optimizing-dms-with-amazon-redshift/optimizing-dms-with-amazon-redshift.html
Because I use Redshift Serverless, which still has no query scheduler, even if I used a federated query to refresh a materialized view (MV), I would still have to schedule it with Step Functions to refresh and monitor the extraction workflow.
The issue with using bookmarks is that I need only the data that was loaded yesterday. By default, running the job after 12 AM will cause the bookmark to pick up some unneeded data, which might affect my extraction logic.