Athena PySpark Notebooks - how to schedule/automate recurrent notebook runs? Can you use Step Function Workflow Jobs?


Hi. I am using Athena PySpark notebooks that I would like to schedule for weekly automated runs. Is there a way to do this with AWS Step Functions workflows? The documentation only mentions scheduling queries; it does not mention notebooks, and I do not currently have the permissions to test it. Note that we are not using SageMaker. If Step Functions won't work, is there another method you would recommend to automate this? Thanks!

kfure
Asked 8 months ago, 360 views
1 Answer

Athena Spark notebooks are designed primarily for interactive use, not for scheduled tasks. Athena Spark is an interactive tool for analyzing Data Catalog data with Spark. It is not intended to be integrated into pipelines or other processes; it operates as a standalone tool that is accessible only through the AWS Console, similar to Glue interactive sessions. It works well for testing and development, but it isn't suited to scheduling or integration into workflows.

If you need scheduling, we recommend AWS Glue jobs. Glue also runs PySpark and supports scheduled runs.

[+] https://docs.aws.amazon.com/glue/latest/dg/what-is-glue.html
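
As a rough illustration of the Glue route, here is a minimal sketch of a weekly schedule set up with boto3's `create_trigger`. It assumes the notebook logic has already been ported to an existing Glue job; the job name, trigger name, and the Monday 06:00 UTC cron expression are hypothetical placeholders, not values from this thread. Treat it as a starting point under those assumptions rather than a definitive setup.

```python
"""Sketch: attach a weekly schedule to an existing AWS Glue job.

Assumptions (not from the thread): a Glue job already exists, and the
names and cron expression below are placeholders to replace.
"""


def build_weekly_trigger_request(job_name, trigger_name,
                                 cron="cron(0 6 ? * MON *)"):
    """Build the keyword arguments for Glue's CreateTrigger call.

    The default cron expression means every Monday at 06:00 UTC.
    """
    return {
        "Name": trigger_name,
        "Type": "SCHEDULED",                 # time-based trigger
        "Schedule": cron,
        "Actions": [{"JobName": job_name}],  # job to start on each firing
        "StartOnCreation": True,             # activate the trigger immediately
    }


def create_weekly_trigger(job_name, trigger_name, region_name=None):
    """Create the scheduled trigger; needs AWS credentials and Glue permissions."""
    import boto3  # imported here so the request builder works without boto3

    glue = boto3.client("glue", region_name=region_name)
    request = build_weekly_trigger_request(job_name, trigger_name)
    return glue.create_trigger(**request)


# Example call (hypothetical names):
# create_weekly_trigger("weekly-report-job", "weekly-report-trigger")
```

Keeping the request builder separate from the API call makes it possible to check the schedule definition without AWS access, which may help if permissions are limited as described in the question.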

AWS
Answered 8 months ago
