Ability to temporarily stop Airflow planned?

Similar to RDS, where it is possible to start/stop a database (see https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_StopInstance.html), it would be great to have a similar feature for Airflow.

Use-Cases:

  1. Stop the Airflow development environment at 6 pm and restart it at 8 am
  2. Airflow runs workloads only at night and sits idle during the day --> save resources by stopping it during the day

Currently, start/stop behavior can only be approximated by deleting and re-creating the environment, which unfortunately also drops Airflow connections and logs. My question is therefore: is such a feature on the AWS MWAA roadmap?

This feature would help a lot in convincing customers to use MWAA instead of hosting Airflow themselves.

Thanks!

Edited by: capca5 on Dec 28, 2020 4:16 AM

Edited by: capca5 on Dec 28, 2020 4:17 AM

capca5
asked 2 years ago, 424 views
7 Answers

Hi!

There are no immediate plans to support pausing and resuming an MWAA environment. For development environments, however, one option is to use CloudFormation to create the environment on demand, with connections defined in AWS Secrets Manager. Logs are preserved in CloudWatch while the environment is deleted, and if the replacement environment uses the same name as the previous one, logging should resume in the same place in CloudWatch.
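
As a rough illustration of the on-demand approach, the sketch below builds a CloudFormation template for a single MWAA environment whose connections are looked up in AWS Secrets Manager. The function name `build_mwaa_template`, the logical ID `DevAirflow`, the `dags` path, the secret prefixes, and all ARNs and IDs are placeholders chosen for this example, and the secrets backend class path assumes Airflow 2.x. Treat it as a starting point rather than a complete template.

```python
import json


def build_mwaa_template(env_name, bucket_arn, role_arn, subnet_ids, security_group_ids):
    """Return a CloudFormation template (as a dict) describing one MWAA environment."""
    # Assumption: Airflow 2.x class path for the Secrets Manager backend.
    # Connections then live in Secrets Manager and survive environment deletion.
    airflow_options = {
        "secrets.backend": (
            "airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend"
        ),
        # Hypothetical prefixes; secrets would be named e.g. airflow/connections/<conn_id>.
        "secrets.backend_kwargs": json.dumps(
            {
                "connections_prefix": "airflow/connections",
                "variables_prefix": "airflow/variables",
            }
        ),
    }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            "DevAirflow": {
                "Type": "AWS::MWAA::Environment",
                "Properties": {
                    # Reusing the same name lets logging continue in the same CloudWatch location.
                    "Name": env_name,
                    "SourceBucketArn": bucket_arn,
                    "DagS3Path": "dags",
                    "ExecutionRoleArn": role_arn,
                    "NetworkConfiguration": {
                        "SubnetIds": list(subnet_ids),
                        "SecurityGroupIds": list(security_group_ids),
                    },
                    "AirflowConfigurationOptions": airflow_options,
                },
            }
        },
    }


if __name__ == "__main__":
    # All identifiers below are made-up placeholders.
    template = build_mwaa_template(
        env_name="dev-airflow",
        bucket_arn="arn:aws:s3:::example-airflow-bucket",
        role_arn="arn:aws:iam::123456789012:role/example-mwaa-execution-role",
        subnet_ids=["subnet-aaaa1111", "subnet-bbbb2222"],
        security_group_ids=["sg-cccc3333"],
    )
    print(json.dumps(template, indent=2))
```

The resulting JSON could be passed to CloudFormation when the environment is needed and the stack deleted again in the evening. The execution role would also need permission to read the relevant secrets, and since environment creation is not instant, the morning start would have to be scheduled with some lead time.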

Thanks!

John_J
answered 2 years ago

Hi,

I may have a hint as to why this service cannot be stopped and then started, it being a managed service, but ...

We are considering adopting the managed service MWAA, and what concerns me is what happens to the metadata and configuration if the service is lost or stopped in a PROD environment.
There is no access to the metadata DB and no full access to the footprint of the service, so we can't perform disaster recovery and restore the previous state.
Another question: why can't the execution metadata DB be made accessible, or be managed explicitly on a service like RDS, safekeeping everything and giving more control to operations?

Thanks for all your work. We hope to be able to adopt this service in our ecosystem too.

BR
Pedro M.

Edited by: pedromach on Jan 8, 2021 6:55 AM

answered 2 years ago

Hi Pedro,

Meta DB access was restricted both for stability of the service and to enable simpler setup. Customers have access to the meta DB through their DAGs, demonstrated by the clean up DAG sample here: https://docs.aws.amazon.com/mwaa/latest/userguide/samples-database-cleanup.html
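
To make the idea of reaching the meta DB from a DAG more concrete, here is a minimal read-only sketch. It assumes Airflow 2.x, where task code can open a session through `airflow.settings.Session`; the function names `summarise_states` and `report_dag_run_states` are made up for this example and are not part of the linked sample.

```python
from collections import Counter


def summarise_states(states):
    """Count how many runs are in each state, e.g. {'success': 2, 'failed': 1}."""
    return dict(Counter(states))


def report_dag_run_states():
    """Task callable: read DAG run states from the metadata DB and print a summary."""
    # Imported lazily so the helper above can be used without Airflow installed.
    from airflow import settings
    from airflow.models import DagRun

    session = settings.Session()
    try:
        # Loads full DagRun objects; fine for a sketch, wasteful on a large table.
        states = [run.state for run in session.query(DagRun).all()]
    finally:
        session.close()

    summary = summarise_states(states)
    print(summary)
    return summary
```

In a DAG file, `report_dag_run_states` would be passed as the `python_callable` of a `PythonOperator`, so the query runs on a worker inside the environment, which is the only place the meta DB is reachable from.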

The meta DB is a multi-AZ RDS Postgres instance unique to that particular environment, so disaster recovery should not be required. However, you are correct that for production systems we would recommend leaving the MWAA environment running rather than re-creating it on demand.

Thanks!

John_J
answered 2 years ago

Thanks, that answered my question :)

capca5
answered 2 years ago

Hello,
Unfortunately, I need to re-open this issue. We tested the start-stop (i.e. create and destroy) mechanism of MWAA, and it seems the meta database is not recovered. I can see my logs in CloudWatch, but none of the task runs are displayed in Airflow itself. Is there a way to keep the meta database entries as well?

capca5
answered 2 years ago

Hi,

MWAA does not automatically recover the meta database. Before deleting the environment, you would need to use a DAG to offload the metadata to an external DB, and then use a similar DAG to reload the metadata into the new environment.
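
A minimal sketch of the offload half could look like the following. It is not an official sample: it assumes Airflow 2.x, writes CSV files to S3 rather than to an external database, and the names `rows_to_csv`, `export_metadata`, and the `mwaa-metadata-export` prefix are invented for this example. Only two tables are exported here, and pickled or binary columns would not round-trip through CSV as written.

```python
import csv
import io


def rows_to_csv(column_names, rows):
    """Serialise a header plus an iterable of row tuples to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(column_names)
    for row in rows:
        writer.writerow(list(row))
    return buffer.getvalue()


def export_metadata(bucket, prefix="mwaa-metadata-export"):
    """Task callable: dump selected metadata tables to S3, one CSV file per table."""
    # Imported lazily so rows_to_csv can be used without Airflow or boto3 installed.
    import boto3
    from airflow import settings
    from airflow.models import DagRun, TaskInstance

    session = settings.Session()
    s3 = boto3.client("s3")
    try:
        for model in (DagRun, TaskInstance):
            # Select from the underlying table so the raw column values are
            # exported, not values computed by ORM properties.
            result = session.execute(model.__table__.select())
            body = rows_to_csv(list(result.keys()), result)
            s3.put_object(
                Bucket=bucket,
                Key=f"{prefix}/{model.__tablename__}.csv",
                Body=body.encode("utf-8"),
            )
    finally:
        session.close()
```

The callable would be wired into a DAG through a `PythonOperator` with the bucket name supplied via `op_kwargs`, and triggered before the environment is deleted; the execution role needs write access to that bucket. The reload DAG on the new environment would do the reverse: read each CSV, convert the values back to the column types, and insert the rows, which only works if both environments run the same Airflow version and therefore share the same schema.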

Thanks!

John_J
answered 2 years ago

Do you have an example of such a DAG?

averma
answered a year ago
