1. Create a Python 2 or Python 3 library for boto3. Be sure that the AWS Glue version that you're using supports the Python version that you choose for the library. AWS Glue version 1.0 supports Python 2 and Python 3, and AWS Glue version 0.9 supports only Python 2.
Note: Libraries and extension modules for Spark jobs must be written in Python. Libraries that are written in C, such as pandas, aren't supported in AWS Glue version 0.9 or 1.0. If you need to use a library written in C, then upgrade AWS Glue to at least version 2.0 and use the --additional-python-modules option. For more information, see How do I use external Python libraries in my AWS Glue 2.0 ETL job?
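The version rules in step 1 and the note can be written down as a small lookup. This is only an illustrative sketch: the names SUPPORTED_PYTHON, supports_python, and supports_c_libraries are hypothetical, and the table encodes only the AWS Glue versions that the text names.

```python
# Python major versions per AWS Glue version, as stated in the text above.
# Only the versions the text names are listed; others are treated as unknown.
SUPPORTED_PYTHON = {
    "0.9": {2},
    "1.0": {2, 3},
}


def supports_python(glue_version, python_major):
    """Return True if the text lists python_major for glue_version."""
    return python_major in SUPPORTED_PYTHON.get(glue_version, set())


def supports_c_libraries(glue_version):
    """Return True for AWS Glue 2.0 or later, where the text says C-based
    libraries can be installed with --additional-python-modules."""
    major, minor = (int(part) for part in glue_version.split("."))
    return (major, minor) >= (2, 0)
```

For example, supports_python("0.9", 3) is False, so a Python 3 library would need AWS Glue version 1.0.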
4. Run the following commands to install Python and Boto3. For more information, see Boto3 documentation for Quickstart.
sudo yum groupinstall "Development Tools"
sudo yum -y install openssl-devel
wget https://www.python.org/ftp/python/3.6.9/Python-3.6.9.tgz
tar xvf Python-3.6.9.tgz
cd Python-3.6.9
./configure
sudo make install
sudo pip install boto3
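After the installation finishes, it can help to confirm that the interpreter you plan to package from can actually find the library. The following is a minimal sketch using only the standard library; the helper name is_installed is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the current interpreter can locate module_name."""
    return importlib.util.find_spec(module_name) is not None


# Reports whether boto3 is importable from this interpreter.
print("boto3 installed:", is_installed("boto3"))
```

If this prints False, pip might have installed the library for a different Python interpreter than the one running the check.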
5. Confirm the location of the Python site-packages directory:
python -m site
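The output lists the sys.path entries, which include the site-packages directory, followed by the USER_BASE, USER_SITE, and ENABLE_USER_SITE values.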
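The same directory can also be read from within Python rather than from the command output. This is a sketch under the assumption that you run it with the same interpreter that pip installed into; the variable names purelib and site_dirs are illustrative.

```python
import site
import sysconfig

# Directory where pip places pure-Python packages for this interpreter.
purelib = sysconfig.get_paths()["purelib"]
print(purelib)

# Global site-packages directories known to the site module.
# getsitepackages() can be missing in some older virtualenv setups,
# so fall back to an empty list in that case.
site_dirs = getattr(site, "getsitepackages", lambda: [])()
for path in site_dirs:
    print(path)
```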
6. Package the external library files in a .zip file, unless the library is contained in a single .py file. The .zip file must include an __init__.py file, and the package directory must be at the root of the archive. The __init__.py file can be empty. For more information, see Python documentation for Packages. Run the following command from inside the site-packages directory so that each package directory is at the root of the archive:
sudo zip -r -X "/home/ec2-user/site-packages.zip" *
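The packaging rules in step 6 can also be applied to a single package with Python's zipfile module. The following is a sketch, not the method the article prescribes: the function name zip_package and the example paths are hypothetical, and it assumes the output .zip file is written outside the package directory.

```python
import os
import zipfile


def zip_package(package_dir, zip_path):
    """Zip package_dir so that the package directory sits at the archive root."""
    package_dir = os.path.abspath(package_dir)
    # Step 6 requires an __init__.py file in the package (it can be empty).
    if not os.path.isfile(os.path.join(package_dir, "__init__.py")):
        raise ValueError("package directory must contain an __init__.py file")
    parent = os.path.dirname(package_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Store paths relative to the parent, e.g. mylib/__init__.py,
                # so the package directory is at the root of the archive.
                archive.write(full_path, os.path.relpath(full_path, parent))
    return zip_path
```

A call such as zip_package("/path/to/site-packages/mylib", "/home/ec2-user/mylib.zip") would produce an archive whose top-level entry is the mylib directory.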