How do I install Python libraries in my Amazon MWAA environment?

I want to install Python libraries in my Amazon Managed Workflows for Apache Airflow (Amazon MWAA) environment.

Short description

To install Python libraries on an Amazon MWAA environment, you can get the dependencies from the Python Package Index (PyPI). You can also use a plugins.zip file that contains the required Python wheel (.whl) packages.

To install Python dependencies on an Amazon MWAA environment with a public webserver, specify the dependencies in the requirements.txt file. When you use the requirements.txt file, pip installs the listed packages from PyPI by default. If you want to install dependencies that aren't available on PyPI, then follow the steps to install the packages on a private webserver.

To install Python dependencies on an Amazon MWAA environment with a private webserver, use Python wheels (.whl). Package the libraries and artifacts as .whl files, zip them together in plugins.zip, and then point the requirements.txt to the plugins folder.

Note: You must use the plugins.zip file when you install custom Amazon MWAA operators, hooks, sensors, or interfaces.

Resolution

Make sure that you complete the prerequisites before you begin.

Set up your Amazon MWAA local environment

Complete the following steps:

  1. Use the Airflow Image (on the GitHub website) to set up an Amazon MWAA local environment.

  2. Add the Python library and dependencies to the requirements.txt file.
    Note: Make sure that the requirements.txt file contains the required constraints.

  3. Run the following script to test the requirements.txt file:

    aws-mwaa-local-runner % ./mwaa-local-env test-requirements

    Example output:

    Installing requirements.txt
    Collecting aws-batch (from -r /usr/local/airflow/dags/requirements.txt (line 1))
      Downloading https://files.pythonhosted.org/packages/5d/11/3aedc6e150d2df6f3d422d7107ac9eba5b50261cf57ab813bb00d8299a34/aws_batch-0.6.tar.gz
    Collecting awscli (from aws-batch->-r /usr/local/airflow/dags/requirements.txt (line 1))
      Downloading https://files.pythonhosted.org/packages/07/4a/d054884c2ef4eb3c237e1f4007d3ece5c46e286e4258288f0116724af009/awscli-1.19.21-py2.py3-none-any.whl (3.6MB)
        100% |████████████████████████████████| 3.6MB 365kB/s
    ...
    ...
    ...
    Installing collected packages: botocore, docutils, pyasn1, rsa, awscli, aws-batch
      Running setup.py install for aws-batch ... done
    Successfully installed aws-batch-0.6 awscli-1.19.21 botocore-1.20.21 docutils-0.15.2 pyasn1-0.4.8 rsa-4.7.2

For more information, see Installing Python dependencies using PyPi.org requirements file format.
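Before you run the test-requirements command, you can check that the requirements.txt file actually declares a constraints file. The following is a minimal sketch, not part of the local runner. The helper name constraint_lines and the sample Apache Airflow and Python versions are assumptions for illustration only.

```python
from typing import List


def constraint_lines(requirements_text: str) -> List[str]:
    """Return the constraint options found in a requirements.txt body.

    Hypothetical helper: it only looks for pip's --constraint / -c options
    and does not validate that the referenced file exists.
    """
    found = []
    for raw in requirements_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        if line.startswith(("--constraint", "-c ")):
            found.append(line)
    return found


# Example requirements.txt body. The versions shown are illustrative only;
# use the ones that match your environment.
sample = """\
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.7.2/constraints-3.11.txt"
aws-batch==0.6
"""

if not constraint_lines(sample):
    print("requirements.txt has no --constraint line")
```

An empty result means that pip resolves versions without the Apache Airflow constraints, which can lead to version conflicts.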

Download and upload the constraints.txt file

Complete the following steps:

  1. Add your Python modules to the requirements.txt file.
  2. Download the Apache Airflow constraints file that corresponds with the version of Apache Airflow that you use in your Amazon MWAA environment.
  3. Rename the downloaded file to constraints.txt.
  4. Upload the constraints.txt file to the dags/ directory of your Amazon Simple Storage Service (Amazon S3) bucket.
    Important: Make sure that the packaged_requirements.txt file points to /usr/local/airflow/dags/constraints.txt instead of the public constraints.
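The download step can be scripted. The following sketch builds the constraints file URL from the pattern that the Apache Airflow project uses for its published constraints files, and saves the file under the name constraints.txt. The function names and the version numbers in the usage comment are assumptions. Replace the versions with the ones that match your environment.

```python
import urllib.request


def airflow_constraints_url(airflow_version: str, python_version: str) -> str:
    """Build the URL of the public Apache Airflow constraints file."""
    return (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )


def download_constraints(airflow_version: str, python_version: str,
                         destination: str = "constraints.txt") -> str:
    """Download the constraints file and save it as constraints.txt.

    Requires outbound internet access from the machine that runs it.
    """
    url = airflow_constraints_url(airflow_version, python_version)
    urllib.request.urlretrieve(url, destination)
    return destination


# Usage (illustrative versions only):
# download_constraints("2.7.2", "3.11")
```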

Build the .whl files from the requirements.txt file

Complete the following steps:

  1. Add the Python library and dependencies to the requirements.txt file.
    Note: Make sure that the requirements.txt file contains the required constraints.

  2. Run the following local-runner command:

    aws-mwaa-local-runner % ./mwaa-local-env package-requirements

    Note: This command downloads all .whl files into the aws-mwaa-local-runner/plugins folder. After you run the package-requirements command, the plugins.zip and packaged_requirements.txt files are available in the application's requirements/ directory.

  3. Modify the new packaged_requirements.txt file so that it points to the local constraints instead of the public constraints.
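The edit to the packaged_requirements.txt file can also be made with a short script. The following sketch replaces any existing constraint option with one that points to the local constraints file. The helper name use_local_constraints is hypothetical, and the default path assumes that you uploaded constraints.txt to the dags/ directory.

```python
LOCAL_CONSTRAINTS = "/usr/local/airflow/dags/constraints.txt"


def use_local_constraints(text: str, local_path: str = LOCAL_CONSTRAINTS) -> str:
    """Point every constraint option in a requirements body at a local file.

    Hypothetical helper: keeps all other lines unchanged and adds the
    option at the top if the file had none.
    """
    local_option = f'--constraint "{local_path}"'
    out = []
    replaced = False
    for line in text.splitlines():
        if line.strip().startswith(("--constraint", "-c ")):
            # Keep a single constraint option and drop any duplicates.
            if not replaced:
                out.append(local_option)
                replaced = True
            continue
        out.append(line)
    if not replaced:
        out.insert(0, local_option)
    return "\n".join(out) + "\n"


def rewrite_file(path: str = "packaged_requirements.txt") -> None:
    """Apply use_local_constraints to a file in place."""
    with open(path, encoding="utf-8") as handle:
        updated = use_local_constraints(handle.read())
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(updated)
```

Review the rewritten file before you upload it, because the script does not check that the local constraints file exists in the environment.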

Create a new requirements.txt file

To create a new requirements.txt file that points to the .whl files packaged in the plugins.zip file, complete the following steps:

  1. Modify the new packaged_requirements.txt file:

    --find-links /usr/local/airflow/plugins
    --no-index    
    --constraint "/usr/local/airflow/dags/constraints.txt"     
    ....snip.....

  2. Upload the plugins.zip file and the requirements.txt file to the Amazon S3 bucket for your Amazon MWAA environment.

  3. Update the environment to use the latest version of plugins.zip and packaged_requirements.txt.
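The upload and update steps can be automated with the AWS SDK for Python (Boto3). The following is a sketch under several assumptions: Boto3 is installed and has credentials, versioning is turned on for the bucket so that Amazon S3 returns a version ID, and both files are stored at the bucket root under their local file names. The function names, bucket name, and environment name are placeholders.

```python
from typing import Dict


def build_update_request(env_name: str,
                         plugins_key: str, plugins_version: str,
                         requirements_key: str, requirements_version: str) -> Dict[str, str]:
    """Build the keyword arguments for the MWAA UpdateEnvironment call."""
    return {
        "Name": env_name,
        "PluginsS3Path": plugins_key,
        "PluginsS3ObjectVersion": plugins_version,
        "RequirementsS3Path": requirements_key,
        "RequirementsS3ObjectVersion": requirements_version,
    }


def upload_and_update(bucket: str, env_name: str,
                      plugins_file: str = "plugins.zip",
                      requirements_file: str = "packaged_requirements.txt") -> None:
    """Upload both files, then point the environment at the new versions."""
    import boto3  # imported here so the helper above works without Boto3

    s3 = boto3.client("s3")
    mwaa = boto3.client("mwaa")

    versions = {}
    for path in (plugins_file, requirements_file):
        with open(path, "rb") as body:
            # Assumption: the object key equals the local file name.
            response = s3.put_object(Bucket=bucket, Key=path, Body=body)
        # VersionId is returned only when bucket versioning is turned on.
        versions[path] = response["VersionId"]

    mwaa.update_environment(**build_update_request(
        env_name,
        plugins_file, versions[plugins_file],
        requirements_file, versions[requirements_file],
    ))


# Usage (placeholder names):
# upload_and_update("my-mwaa-bucket", "my-mwaa-environment")
```

Passing the object versions explicitly makes the environment use the files that you just uploaded instead of an earlier version.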

Troubleshoot package installation

Use the aws-mwaa-local-runner (on the GitHub website) to test DAGs, custom plugins, and Python dependencies. If a package fails to install in the environment, then view the log file from the Amazon MWAA worker or the Airflow scheduler log group.
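When you review a log file, pip failures usually appear on lines that contain ERROR:. The following sketch filters those lines from log text that you already exported. The helper name pip_errors and the sample log text are illustrative, and the filter is a plain substring match instead of a full log parser.

```python
from typing import List


def pip_errors(log_text: str) -> List[str]:
    """Return the log lines that contain a pip-style ERROR: marker."""
    return [line.strip() for line in log_text.splitlines() if "ERROR:" in line]


# Illustrative log excerpt.
sample_log = """\
Collecting aws-batch
ERROR: Could not open requirements file: [Errno 2] No such file or directory: '/usr/local/airflow/plugins/constraints.txt'
"""

for error in pip_errors(sample_log):
    print(error)
```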

3 Comments

I just want to note, unless I'm missing the obvious, that these instructions do not (for a constraints file with a private MWAA instance) work.

I can see other people in other forums online reporting the same issue - and nobody seems to know the answer, all MWAA components log the following error:

ERROR: Could not open requirements file: [Errno 2] No such file or directory: '/usr/local/airflow/plugins/constraints.txt'

That's when following these instructions down the route stating:

Or, if you added the file to the plugins.zip file, then replace {OPTION} with plugins

There's also I think a mistake in the statement:

However, if you include the constraints.txt file in the plugins file, then make sure that the packaged_requirements.txt file points to /usr/local/airflow/dags/constraints.txt and not to the public constraints.

(this should be /plugins/ and not /dags/, as per later in the article?)

I also haven't experienced the mwaa-local-env tool generating a constraints.txt as stated in the article:

This command downloads all .whl files into the aws-mwaa-local-runner/plugin folder. After you run the package-requirements command**,** the plugins.zip, new packaged_requirements.txt, and constraints.txt file are available in the application's requirement/ directory.

Without the constraints.txt, I'm unable to specify any custom requirements with a private MWAA instance - as pip then states there are conflicting package versions.

replied 3 months ago

Thank you for your comment. We'll review and update the Knowledge Center article as needed.

AWS
MODERATOR
replied 3 months ago

You've still not updated it and it still doesn't work on MWAA

replied 2 months ago