Athena PySpark Notebooks - how to schedule/automate recurrent notebook runs? Can you use Step Function Workflow Jobs?


Hi. I am using Athena PySpark notebooks that I would like to schedule for weekly automated runs. Is there a way to use Step Function Workflow Jobs to do this? The documentation only mentions scheduling queries; it does not mention notebooks, and I do not currently have the permissions to test it. Note that we are not using SageMaker. Or is there another method that you would recommend to automate this? Thanks!

kfure
asked 8 months ago · 359 views
1 Answer

Please be advised that Athena Spark notebooks are designed primarily for interactive use, not for scheduled tasks. Athena Spark is an interactive tool for analyzing Catalog data with Spark. It is not intended to integrate with pipelines or other processes; it operates as a standalone tool, accessible solely via the AWS Console, similar to Glue interactive sessions. It works well for testing and development, but it is not suited to scheduling or integration into workflows.

If you need scheduling, we recommend AWS Glue jobs. Glue also runs PySpark and supports scheduled runs.

[+] https://docs.aws.amazon.com/glue/latest/dg/what-is-glue.html
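
As a rough illustration of the Glue suggestion above, here is a minimal sketch of a weekly schedule created with boto3's `create_trigger`. It assumes the notebook code has already been ported into a Glue job; the job name, trigger name, and cron expression (Mondays at 06:00 UTC) are placeholders, not values from this thread, and the call needs Glue permissions on the caller's credentials.

```python
def build_weekly_trigger(job_name,
                         trigger_name="weekly-notebook-job",   # hypothetical name
                         schedule="cron(0 6 ? * MON *)"):      # assumed: Mondays 06:00 UTC
    """Build the keyword arguments for glue.create_trigger (scheduled type)."""
    return {
        "Name": trigger_name,
        "Type": "SCHEDULED",
        "Schedule": schedule,
        "Actions": [{"JobName": job_name}],
        "StartOnCreation": True,  # activate the trigger as soon as it is created
    }


def create_weekly_trigger(job_name, **kwargs):
    """Create the trigger in AWS Glue. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder above works without AWS access

    glue = boto3.client("glue")
    return glue.create_trigger(**build_weekly_trigger(job_name, **kwargs))


if __name__ == "__main__":
    # "my-pyspark-job" is a placeholder for an existing Glue job.
    print(build_weekly_trigger("my-pyspark-job"))
```

Calling `create_weekly_trigger("my-pyspark-job")` would register the schedule. Keeping the parameter builder separate makes it possible to inspect the request before sending anything to AWS.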

AWS
answered 8 months ago
