Athena PySpark Notebooks - how to schedule/automate recurrent notebook runs? Can you use Step Function Workflow Jobs?


Hi. I am using Athena PySpark notebooks that I would like to schedule for weekly automated runs. Is there a way to use Step Function Workflow Jobs to do this? The documentation only mentions scheduling queries; it does not mention notebooks, and I do not currently have the permissions to test it. Note that we are not using SageMaker. Or is there another method that you would recommend to automate this? Thanks!

kfure
Asked 8 months ago, 360 views
1 Answer

Athena Spark notebooks are designed primarily for interactive use, not for scheduled tasks. Athena Spark is an interactive tool for analyzing Catalog data with Spark. It is not intended to be integrated into pipelines or other processes; it operates as a standalone tool, accessible only through the AWS Console, similar to Glue interactive sessions. It works well for testing and development, but it is not suited to scheduling or integration into workflows.

If you need scheduling, we recommend AWS Glue jobs. Glue also runs PySpark and supports scheduled runs.

https://docs.aws.amazon.com/glue/latest/dg/what-is-glue.html
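As a rough illustration of the Glue suggestion above, here is a minimal sketch of attaching a weekly schedule to a Glue job with boto3. It assumes the notebook logic has already been ported into an existing Glue job; the job name, the trigger name, and the Monday 06:00 UTC schedule are all hypothetical placeholders, and the helper function names are made up for this example.

```python
from typing import Any, Dict


def build_weekly_trigger_request(
    job_name: str,
    trigger_name: str,
    schedule: str = "cron(0 6 ? * MON *)",  # assumed schedule: Mondays at 06:00 UTC
) -> Dict[str, Any]:
    """Build the keyword arguments for Glue's CreateTrigger call.

    Kept separate from the AWS call so the request can be inspected
    without credentials.
    """
    return {
        "Name": trigger_name,
        "Type": "SCHEDULED",                  # time-based trigger
        "Schedule": schedule,                 # Glue cron: min hour day-of-month month day-of-week year
        "Actions": [{"JobName": job_name}],   # the Glue job to start on each firing
        "StartOnCreation": True,              # activate the trigger immediately
    }


def create_weekly_trigger(job_name: str, trigger_name: str) -> Dict[str, Any]:
    """Create the scheduled trigger in the account's default region.

    Requires boto3 and AWS credentials that allow glue:CreateTrigger.
    """
    import boto3  # imported here so the request builder works without boto3 installed

    glue = boto3.client("glue")
    return glue.create_trigger(**build_weekly_trigger_request(job_name, trigger_name))


if __name__ == "__main__":
    # Hypothetical names; replace with your own Glue job and trigger names.
    print(build_weekly_trigger_request("weekly-report-job", "weekly-report-trigger"))
```

Calling `create_weekly_trigger("weekly-report-job", "weekly-report-trigger")` would register the schedule; the same schedule can also be set up from the Glue console when creating a trigger.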

AWS
Answered 8 months ago
