MWAA: How to sandbox the tasks from each other?


Hi all! Brand new MWAA user here. I'm evaluating MWAA as a solution to provide integration/workflow services for multiple tenants, and one risk area I'm trying to mitigate is the '/tmp' local disk storage. Tasks for multiple tenants will commonly write data to the /tmp dir, and it would be a really big deal if they somehow conflicted and one tenant's data was exposed to another tenant. These workflows will be written by a team of, shall we say, not super-strong developers, so conflicts are likely to happen eventually, and proper tempfile cleanup is likely to be missed. Are there any common strategies or infrastructure options to deal with this risk?

I was thinking of installing a task instance mutation cluster policy hook which would run "rm -rf /tmp/*" before every task run, but of course that's sketchy and would very possibly break other tasks running concurrently on the same worker.

To automate cleanup (so worker disks don't eventually fill up), the only other thing I can think of is to provide a library with a get_temp_file() method which generates timestamped temp file names, so we can automatically delete files more than a day old (or whatever). But of course this relies on the team diligently using our library method rather than the standard Python method or (god forbid) hardcoding their own filenames.
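For illustration, here is a minimal sketch of what such a helper might look like, using only the Python standard library. The root directory name, the tenant_id argument, and purge_old_files() are hypothetical names invented for this sketch, not anything MWAA provides. It also uses file modification time rather than parsing the timestamp in the name to decide what is stale. Note that tasks on a worker typically run as the same OS user, so this guards against accidental collisions, not deliberate access.

```python
import os
import re
import tempfile
import time
from pathlib import Path
from typing import List, Optional

# Hypothetical root directory for all tenant temp files (an assumption of this sketch).
TEMP_ROOT = Path(tempfile.gettempdir()) / "tenant_tmp"

# Restrict tenant IDs so they cannot escape the root (e.g. "../other").
_TENANT_ID = re.compile(r"^[A-Za-z0-9_-]+$")


def get_temp_file(tenant_id: str, suffix: str = "", root: Path = TEMP_ROOT) -> Path:
    """Create a uniquely named temp file under a per-tenant directory."""
    if not _TENANT_ID.match(tenant_id):
        raise ValueError(f"invalid tenant id: {tenant_id!r}")
    tenant_dir = root / tenant_id
    tenant_dir.mkdir(parents=True, exist_ok=True)
    os.chmod(tenant_dir, 0o700)  # owner-only access to the tenant directory
    # mkstemp guarantees a unique name; the timestamp prefix is for humans.
    fd, path = tempfile.mkstemp(
        prefix=f"{int(time.time())}_", suffix=suffix, dir=tenant_dir
    )
    os.close(fd)
    return Path(path)


def purge_old_files(
    max_age_seconds: float = 86400,
    root: Path = TEMP_ROOT,
    now: Optional[float] = None,
) -> List[Path]:
    """Delete files under root whose mtime is older than max_age_seconds."""
    now = time.time() if now is None else now
    removed: List[Path] = []
    if not root.exists():
        return removed
    # Materialize the listing first so deletions don't disturb the walk.
    for path in list(root.rglob("*")):
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path)
    return removed
```

A task would call get_temp_file("tenant_a", suffix=".csv") instead of hardcoding a path, and a scheduled maintenance job would call purge_old_files(). This still has the weakness mentioned above: nothing forces developers to use it.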

Any thoughts or insights are appreciated. Thanks!

redec
asked 2 years ago · 578 views
1 Answer

Airflow is not multitenant. Anyone who can write a DAG can see any other DAG or environment information. See AIP-1 for how the Airflow community is working towards multitenancy and other security improvements.

As such, the only true data isolation comes from using multiple environments. A secondary alternative is a DAG factory, where you don't allow users to write DAGs directly but instead have them specify their DAGs via YAML or JSON, so you control exactly what they can do.
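As a rough sketch of the control point in a DAG factory, the snippet below validates a user-supplied JSON spec against an allow-list before anything is turned into a DAG. The spec fields (dag_id, tenant, schedule, tasks) and the task types are hypothetical and chosen only for illustration. Building the actual Airflow DAG from the validated spec is left out and would live in code the users cannot edit.

```python
import json
from typing import Any, Dict

# Hypothetical allow-lists; a real platform would define its own.
ALLOWED_TOP_LEVEL_KEYS = {"dag_id", "tenant", "schedule", "tasks"}
REQUIRED_KEYS = {"dag_id", "tenant", "tasks"}
ALLOWED_TASK_TYPES = {"sql_query", "s3_copy"}


def load_dag_spec(raw: str) -> Dict[str, Any]:
    """Parse a JSON DAG spec and reject anything outside the allow-lists."""
    spec = json.loads(raw)
    if not isinstance(spec, dict):
        raise ValueError("spec must be a JSON object")

    unknown = set(spec) - ALLOWED_TOP_LEVEL_KEYS
    if unknown:
        raise ValueError(f"unknown keys: {sorted(unknown)}")

    missing = REQUIRED_KEYS - set(spec)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")

    tasks = spec["tasks"]
    if not isinstance(tasks, list) or not tasks:
        raise ValueError("tasks must be a non-empty list")

    for task in tasks:
        # Users pick from predefined task types; they never supply code.
        if not isinstance(task, dict) or task.get("type") not in ALLOWED_TASK_TYPES:
            raise ValueError(f"task type not allowed: {task!r}")

    return spec
```

Because users only choose from predefined task types, the platform code that implements each type can decide where temp files go and clean them up itself.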

AWS
John_J
answered 2 years ago
