Athena PySpark Notebooks - how to schedule/automate recurrent notebook runs? Can you use Step Function Workflow Jobs?

Hi. I am using Athena PySpark notebooks that I would like to schedule for weekly automated runs. Is there a way to use Step Function Workflow Jobs to do this? The documentation only mentions scheduling queries, not notebooks, and I do not currently have the permissions to test it. Note that we are not using SageMaker. Or is there another method that you would recommend to automate this? Thanks!

kfure
Asked 8 months ago · Viewed 361 times
1 Answer

Athena Spark notebooks are designed primarily for interactive use, not for scheduled tasks. Athena Spark is an interactive tool for analyzing Catalog data with Spark. It is not intended to integrate with pipelines or other processes; it operates as a standalone tool, accessible solely via the AWS Console, similar to Glue interactive sessions. It works well for testing and development, but it is not suited to scheduling or integration into workflows.

If you need scheduling, we recommend AWS Glue jobs. Glue also runs PySpark and supports scheduled runs.

[+] https://docs.aws.amazon.com/glue/latest/dg/what-is-glue.html
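
As a rough illustration of the Glue route, the sketch below attaches a weekly schedule to a Glue job using boto3's `create_trigger` call. It assumes the notebook logic has already been ported into an existing Glue job; the job name, the trigger name, and the Monday 06:00 UTC cron expression are placeholders chosen for illustration, not values from this thread.

```python
from typing import Any, Dict


def weekly_trigger_params(
    job_name: str,
    trigger_name: str,
    cron: str = "cron(0 6 ? * MON *)",  # assumed schedule: Mondays at 06:00 UTC
) -> Dict[str, Any]:
    """Build the keyword arguments for Glue's CreateTrigger API."""
    return {
        "Name": trigger_name,
        "Type": "SCHEDULED",
        "Schedule": cron,
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,  # activate the trigger as soon as it is created
    }


def create_weekly_trigger(job_name: str, trigger_name: str) -> str:
    """Create the scheduled trigger in Glue and return its name.

    Needs boto3, AWS credentials, and permission to call glue:CreateTrigger.
    """
    import boto3  # imported lazily so the helper above works without boto3

    glue = boto3.client("glue")
    response = glue.create_trigger(**weekly_trigger_params(job_name, trigger_name))
    return response["Name"]
```

Calling `create_weekly_trigger("my-weekly-job", "my-weekly-trigger")` (both names hypothetical) would register the schedule; `weekly_trigger_params` can be inspected on its own first to check the request before anything is sent to AWS.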

AWS
Answered 8 months ago
