Athena PySpark Notebooks - how to schedule/automate recurrent notebook runs? Can you use Step Function Workflow Jobs?

Hi. I am using Athena PySpark notebooks that I would like to schedule for automated weekly runs. Is there a way to use Step Function Workflow Jobs to do this? The documentation only mentions scheduling queries; it does not mention notebooks, and I do not currently have the permissions to test it. Note that we are not using SageMaker. Or is there another method you would recommend to automate this? Thanks!

kfure
asked 8 months ago · 360 views
1 Answer

Athena Spark notebooks are designed primarily for interactive use, not for scheduled tasks. Athena Spark is an interactive tool for analyzing Catalog data with Spark. It is not intended to integrate with pipelines or processes; it operates as a standalone tool, accessible solely via the AWS Console, similar to Glue interactive sessions. While it works well for testing and development, it isn't suitable for scheduling or integration into workflows.

If you need scheduling capabilities, we recommend using AWS Glue jobs. Glue jobs also run PySpark and support scheduled runs.

[+] https://docs.aws.amazon.com/glue/latest/dg/what-is-glue.html
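
As a rough illustration of the weekly schedule asked about, here is a minimal sketch that attaches a scheduled trigger to a Glue job with boto3. It assumes the notebook code has already been moved into a Glue PySpark job; the job name, trigger name, and the Monday 06:00 UTC cron expression are hypothetical placeholders, and the caller needs permission to call `glue:CreateTrigger`.

```python
from typing import Any, Dict


def weekly_trigger_params(job_name: str, trigger_name: str,
                          cron: str = "cron(0 6 ? * MON *)") -> Dict[str, Any]:
    """Build the arguments for Glue's CreateTrigger call.

    The default cron expression (an assumption, adjust as needed) fires
    every Monday at 06:00 UTC.
    """
    return {
        "Name": trigger_name,
        "Type": "SCHEDULED",                 # time-based trigger
        "Schedule": cron,
        "Actions": [{"JobName": job_name}],  # the Glue job to start
        "StartOnCreation": True,             # activate the trigger immediately
    }


def create_weekly_trigger(job_name: str, trigger_name: str) -> str:
    """Create the trigger in the account and region of the default credentials."""
    import boto3  # imported here so the helper above works without boto3 installed

    glue = boto3.client("glue")
    response = glue.create_trigger(**weekly_trigger_params(job_name, trigger_name))
    return response["Name"]


# Hypothetical usage (names are placeholders, not from the original question):
# create_weekly_trigger("weekly-notebook-job", "weekly-notebook-trigger")
```

The same schedule can also be set from the Glue console when creating a trigger; the code above is only one way to do it.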

AWS
answered 8 months ago
