Include additional Python module in Glue job


I am trying to include python-oracledb in my job. I have followed the instructions from here, saving various versions of the relevant .whl files from PyPI. I have set the Glue job parameter with --additional-python-modules as the key and the S3 URI as the value.

When I run my job I still get ModuleNotFoundError: No module named 'oracledb'.

Please help.

    import sys
    from awsglue.transforms import *
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext
    from awsglue.job import Job
    import boto3
    import oracledb

Job params

  • Just found the error in the log: it is not supported. Any idea if it will ever be supported?

asked 18 days ago, 26 views
2 Answers


I understand you wish to use python-oracledb in your Glue PySpark ETL job. I ran some tests in my test environment and can confirm this can be done by either of the following approaches:

  1. If your Glue job runs in a VPC subnet with public internet access (a NAT gateway is required, since Glue workers don't have public IP addresses [1]), you can specify the job parameter like this:
Key:  --additional-python-modules
Value:  oracledb
  2. If your Glue job runs in a VPC without internet access, you must create a Python repository on Amazon S3 by following this documentation [2] and include oracledb in your "modules_to_install.txt" file. You should then be able to install the package from your own Python repository on S3 by using the following parameters (make sure to replace MY-BUCKET with your real bucket name):
"--additional-python-modules" : "oracledb",
"--python-modules-installer-option" : "--no-index --find-links= --trusted-host"

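The two cases above can also be expressed as job arguments in code. This is a minimal sketch, not an official AWS helper: the function name build_module_arguments is hypothetical, and the URL and host in the second call are placeholder values that you would replace with the location of your own repository.

```python
from typing import Dict, Optional


def build_module_arguments(
    module: str = "oracledb",
    find_links_url: Optional[str] = None,
    trusted_host: Optional[str] = None,
) -> Dict[str, str]:
    """Build Glue job arguments for installing an extra Python module.

    Without find_links_url the module is installed from PyPI (case 1).
    With find_links_url, pip is pointed at a private repository (case 2).
    """
    args = {"--additional-python-modules": module}
    if find_links_url is not None:
        if trusted_host is None:
            raise ValueError("trusted_host is required with find_links_url")
        # Standard pip options: skip PyPI and look only at the given location.
        args["--python-modules-installer-option"] = (
            f"--no-index --find-links={find_links_url} --trusted-host {trusted_host}"
        )
    return args


# Case 1: job has internet access through a NAT gateway.
online_args = build_module_arguments()

# Case 2: no internet access; both values below are hypothetical placeholders.
offline_args = build_module_arguments(
    find_links_url="http://example-repo.invalid/wheelhouse",
    trusted_host="example-repo.invalid",
)

print(online_args)
print(offline_args)
```

If you manage jobs with boto3, a dictionary like this could be supplied as the job's DefaultArguments when creating or updating the job.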



answered 14 days ago


Thank you for your question. My name is Yvonne, from RDS team.

From your question, I understand that you encountered the error "ModuleNotFoundError: No module named 'oracledb'" and also noticed the message "it is not supported" in the log while trying to include python-oracledb in your Glue job, and that you want to know when it will be supported.

Unfortunately, I am not able to provide timelines, as our development team has its own schedule. However, we announce all new features when we release them in the blogs below [1] [2].

Please note that '--additional-python-modules' is applicable to Spark Glue jobs with Glue versions 2.0 and 3.0. You can include external Python libraries as described in the link [3].
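
One way to confirm whether the module was actually installed is to check for it at the start of the job script before importing it. The snippet below is only an illustrative sketch: the helper name module_available is hypothetical, and it relies solely on Python's standard importlib module.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(name) is not None


if module_available("oracledb"):
    print("oracledb is installed")
else:
    # Reporting the problem explicitly makes the missing dependency easy to spot.
    print("oracledb is NOT installed; check the job parameters and the install log")
```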

For supported versions, please refer to the documentation below:


If you require further assistance or have any questions, feel free to reply to the case and I will be happy to assist you.



answered 14 days ago
