I want to install Python libraries in my Amazon Managed Workflows for Apache Airflow (Amazon MWAA) environment.
Short description
To install Python libraries on an Amazon MWAA environment, you can get the dependencies from the Python Package Index (PyPI). You can also use a plugins.zip file that contains the required Python wheel (.whl) packages.
To install Python dependencies on an Amazon MWAA environment with a public webserver, specify the dependencies in the requirements.txt file. When you use the requirements.txt file, pip installs the listed packages from PyPI by default. If you want to install dependencies that aren't available on PyPI, then follow the steps to install the packages on a private webserver.
To install Python dependencies on an Amazon MWAA environment with a private webserver, use Python wheels (.whl). Package the libraries and artifacts as .whl files, zip them together in plugins.zip, and then point the requirements.txt to the plugins folder.
Note: You must use the plugins.zip file when you install custom Apache Airflow operators, hooks, sensors, or interfaces.
Resolution
Make sure that you complete the prerequisites before you begin.
Set up your Amazon MWAA local environment
Complete the following steps:
- Use the Airflow Image (on the GitHub website) to set up an Amazon MWAA local environment.
- Add the Python library and dependencies to the requirements.txt file.
  Note: Make sure that the requirements.txt file contains the required constraints.
- Run the following script to test the requirements.txt file:
  aws-mwaa-local-runner % ./mwaa-local-env test-requirements
Example output:
Installing requirements.txt
Collecting aws-batch (from -r /usr/local/airflow/dags/requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/5d/11/3aedc6e150d2df6f3d422d7107ac9eba5b50261cf57ab813bb00d8299a34/aws_batch-0.6.tar.gz
Collecting awscli (from aws-batch->-r /usr/local/airflow/dags/requirements.txt (line 1))
  Downloading https://files.pythonhosted.org/packages/07/4a/d054884c2ef4eb3c237e1f4007d3ece5c46e286e4258288f0116724af009/awscli-1.19.21-py2.py3-none-any.whl (3.6MB)
    100% |████████████████████████████████| 3.6MB 365kB/s
...
...
...
Installing collected packages: botocore, docutils, pyasn1, rsa, awscli, aws-batch
Running setup.py install for aws-batch ... done
Successfully installed aws-batch-0.6 awscli-1.19.21 botocore-1.20.21 docutils-0.15.2 pyasn1-0.4.8 rsa-4.7.2
For more information, see Installing Python dependencies using PyPi.org requirements file format.
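Before you run the test-requirements script, a quick check can confirm that the requirements.txt file declares a constraints file. The following is a minimal Python sketch, not part of the local runner. The function name has_constraint_line, the sample contents, and the Airflow and Python versions in the sample URL are assumptions for illustration.

```python
def has_constraint_line(requirements_text: str) -> bool:
    """Return True if a non-comment line declares a pip constraints file."""
    for raw_line in requirements_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # pip accepts both the long option and the short "-c" form.
        if line.startswith("--constraint") or line.startswith("-c "):
            return True
    return False


# Hypothetical requirements.txt contents; the versions are placeholders.
sample = (
    '--constraint "https://raw.githubusercontent.com/apache/airflow/'
    'constraints-2.8.1/constraints-3.11.txt"\n'
    "aws-batch==0.6\n"
)
print(has_constraint_line(sample))
```

To check a real file, you might read it with open("requirements.txt").read() and pass the text to the function.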
Download and upload the constraints.txt file
Complete the following steps:
- Add your Python modules to the requirements.txt.
- Download the Apache Airflow constraints file that corresponds with the version of Apache Airflow you use in your MWAA environment.
- Rename the downloaded file to constraints.txt.
- Upload the constraints.txt file to the dags/ directory of your Amazon Simple Storage Service (Amazon S3) bucket.
Important: Make sure that the packaged_requirements.txt file points to /usr/local/airflow/dags/constraints.txt instead of the public constraints.
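The download step can be sketched in Python. Apache Airflow publishes its constraints files on GitHub under a path that includes the Airflow version and the Python version. The function names and the version numbers in the usage line are assumptions for illustration, so replace them with the versions that your environment uses.

```python
from urllib.request import urlretrieve


def constraints_url(airflow_version: str, python_version: str) -> str:
    """Build the URL of the public Apache Airflow constraints file."""
    return (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )


def download_constraints(airflow_version: str, python_version: str,
                         destination: str = "constraints.txt") -> str:
    """Download the constraints file and save it under the new name."""
    urlretrieve(constraints_url(airflow_version, python_version), destination)
    return destination


# Placeholder versions; this only prints the URL and downloads nothing.
print(constraints_url("2.8.1", "3.11"))
```

Calling download_constraints("2.8.1", "3.11") would save the file as constraints.txt in the current directory, ready for upload to the Amazon S3 bucket.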
Build the .whl files from the requirements.txt file
Complete the following steps:
- Add the Python library and dependencies to the requirements.txt file.
  Note: Make sure that the requirements.txt file contains the required constraints.
- Run the following local-runner command:
  aws-mwaa-local-runner % ./mwaa-local-env package-requirements
  Note: This command downloads all .whl files into the aws-mwaa-local-runner/plugins folder. After you run the package-requirements command, the plugins.zip and packaged_requirements.txt files are available in the application's requirements/ directory.
- Modify the new packaged_requirements.txt file so that it points to the local constraints instead of the public constraints.
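You can make the constraints change by hand, or script it. The following Python sketch replaces any existing constraint line with one that points to the local path used in this article's example. The function name use_local_constraints and the sample file contents are assumptions for illustration. The actual contents of packaged_requirements.txt depend on your packages.

```python
LOCAL_CONSTRAINTS = "/usr/local/airflow/dags/constraints.txt"


def use_local_constraints(text: str, local_path: str = LOCAL_CONSTRAINTS) -> str:
    """Point the pip constraint option at a local file instead of a URL."""
    new_line = f'--constraint "{local_path}"'
    output = []
    replaced = False
    for line in text.splitlines():
        if line.strip().startswith(("--constraint", "-c ")):
            # Keep one constraint line, in the position of the first one found.
            if not replaced:
                output.append(new_line)
                replaced = True
            continue
        output.append(line)
    if not replaced:
        # No constraint line was present, so add one at the top.
        output.insert(0, new_line)
    return "\n".join(output) + "\n"


# Hypothetical file contents with a public constraints URL.
sample = (
    "--find-links /usr/local/airflow/plugins\n"
    "--no-index\n"
    '--constraint "https://raw.githubusercontent.com/apache/airflow/'
    'constraints-2.8.1/constraints-3.11.txt"\n'
    "aws-batch==0.6\n"
)
print(use_local_constraints(sample))
```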
Create a new requirements.txt file
To create a new requirements.txt file that points to the .whl files packaged in the plugins.zip file, complete the following steps:
- Modify the new packaged_requirements.txt file:
  --find-links /usr/local/airflow/plugins
  --no-index
  --constraint "/usr/local/airflow/dags/constraints.txt"
  ....snip.....
- Upload the plugins.zip and requirements.txt files to the Amazon S3 bucket for your Amazon MWAA environment.
- Update the environment to use the latest versions of plugins.zip and packaged_requirements.txt.
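The upload and update steps above can be sketched with the AWS SDK for Python (Boto3), as an alternative to the console. This sketch assumes that boto3 is installed, that credentials are configured, and that the bucket has versioning turned on so that each upload returns a version ID. The function names, the S3 keys plugins.zip and requirements.txt, and the local file arguments are assumptions for illustration.

```python
def build_update_kwargs(env_name, plugins_key, plugins_version,
                        requirements_key, requirements_version):
    """Assemble the arguments for the MWAA UpdateEnvironment call."""
    return {
        "Name": env_name,
        "PluginsS3Path": plugins_key,
        "PluginsS3ObjectVersion": plugins_version,
        "RequirementsS3Path": requirements_key,
        "RequirementsS3ObjectVersion": requirements_version,
    }


def upload_and_update(bucket, env_name, plugins_file, requirements_file,
                      plugins_key="plugins.zip",
                      requirements_key="requirements.txt"):
    """Upload both files, then point the environment at the new versions."""
    import boto3  # third-party; imported here so the helper above has no dependency

    s3 = boto3.client("s3")
    versions = {}
    for local_path, key in ((plugins_file, plugins_key),
                            (requirements_file, requirements_key)):
        with open(local_path, "rb") as handle:
            response = s3.put_object(Bucket=bucket, Key=key, Body=handle)
        # VersionId is returned only when bucket versioning is turned on.
        versions[key] = response["VersionId"]

    mwaa = boto3.client("mwaa")
    return mwaa.update_environment(**build_update_kwargs(
        env_name, plugins_key, versions[plugins_key],
        requirements_key, versions[requirements_key]))
```

The environment update can take several minutes, and the new packages are installed only after the update completes.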
Troubleshoot package installation
Use the aws-mwaa-local-runner (on the GitHub website) to test DAGs, custom plugins, and Python dependencies before you deploy them. If a package fails to install in the environment, then view the logs in the Amazon MWAA worker or Airflow scheduler log group in Amazon CloudWatch Logs.
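To search those log groups from a script, you might use a sketch like the following. It assumes that the log groups follow the airflow-EnvironmentName-Worker and airflow-EnvironmentName-Scheduler naming pattern, that logging is turned on for both components, and that boto3 is installed with credentials configured. The function names, the ERROR filter pattern, and the environment name in the usage line are assumptions for illustration.

```python
def mwaa_log_groups(environment_name: str) -> list:
    """Return the assumed worker and scheduler log group names."""
    return [f"airflow-{environment_name}-{component}"
            for component in ("Worker", "Scheduler")]


def find_install_errors(environment_name: str, pattern: str = "ERROR",
                        limit: int = 50) -> list:
    """Collect log messages that match the pattern from both log groups."""
    import boto3  # third-party; imported here so the helper above has no dependency

    logs = boto3.client("logs")
    messages = []
    for group in mwaa_log_groups(environment_name):
        response = logs.filter_log_events(
            logGroupName=group, filterPattern=pattern, limit=limit)
        messages.extend(event["message"] for event in response.get("events", []))
    return messages


# Placeholder environment name; this only prints the log group names.
print(mwaa_log_groups("MyAirflowEnvironment"))
```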