Hello, I understand that you are unable to import a Python package into your Glue notebook after trying to install the dependency from a whl file stored in an S3 bucket. I tried this in my test environment and was able to import my module using a whl file stored in an S3 bucket. Here are the steps that I followed:
- For testing purposes, I uploaded this whl file to my S3 bucket.
- Next, I started a Glue notebook and used a magic to provide the S3 path of my wheel file. The magic command looked something like this:
%additional_python_modules <my-s3-uri>
- I executed this and my session was created. When I then tried to import the psycopg2 module, it worked fine.
Therefore, I request you to verify the following on your end:
- Check whether you have followed the above steps.
- Please note that once the session is created, the magic command can no longer be run. You will have to stop and then restart the notebook to run a magic (%additional_python_modules in your case), because these magic commands only run at the time of session creation.
- Confirm that the whl file you are trying to install is compatible with the Glue version you are using.
If you are still facing the same issue, please share the steps you followed along with the error you received.
Screenshot attached
The Django whl file that you are trying to use is not supported by Glue 3.0 because Django 4.1.7 requires Python >=3.8, whereas the Python version in Glue 3.0 is 3.7. This is why it did not work in Glue 3.0. Glue 4.0 has Python 3.10, so it worked there. Refer to https://docs.aws.amazon.com/glue/latest/dg/release-notes.html
Please try downloading this file instead - https://files.pythonhosted.org/packages/57/12/da22535f809b8c06c8d58eaf236ec8683ffd4e1dc4eced175b174e6446fa/Django-3.2.18-py3-none-any.whl and then use it in your Glue notebook; it should work fine. I have tested this on my end as well.
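As a rough way to catch this kind of mismatch before starting a session, the wheel's declared Python requirement can be read locally. The sketch below is only an illustration, not part of Glue: the helper names `requires_python` and `meets_minimum` are hypothetical, it uses only the Python standard library, and the version check is deliberately simplified to plain `>=X.Y` clauses (a full check would need a proper version-specifier parser).

```python
import zipfile
from email.parser import Parser


def requires_python(wheel_path):
    """Return the Requires-Python string declared in a wheel, or None if absent."""
    with zipfile.ZipFile(wheel_path) as whl:
        for name in whl.namelist():
            # Wheel metadata is stored in <name>-<version>.dist-info/METADATA
            if name.endswith(".dist-info/METADATA"):
                text = whl.read(name).decode("utf-8")
                # METADATA uses RFC 822 style headers, so the email parser can read it
                return Parser().parsestr(text).get("Requires-Python")
    return None


def meets_minimum(spec, runtime):
    """Simplified check (assumption): only '>=X.Y' clauses are evaluated.

    `runtime` is a tuple such as (3, 7). Other clause types are ignored.
    """
    if not spec:
        return True
    for clause in spec.split(","):
        clause = clause.strip()
        if clause.startswith(">="):
            minimum = tuple(int(part) for part in clause[2:].strip().split("."))
            if runtime < minimum:
                return False
    return True


# Example usage (the local file path is a placeholder):
# spec = requires_python("Django-4.1.7-py3-none-any.whl")
# print(spec, meets_minimum(spec, (3, 7)))
```

Running this against a downloaded wheel, with the Python version of the target Glue release passed as `runtime`, gives an early hint of whether the package can install there. Note that the `py3-none-any` part of the file name does not encode the minimum Python version, which is why two Django wheels can look identical in packaging yet behave differently.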
Hey Chaitu, thank you for the response. I appreciate it. However, as an end user, had the notebook errored out while loading the library, I would have saved the time and pain that I wasted on troubleshooting. I am sure you don't expect customers to be aware of these nuances by reading all the documentation. This could have been avoided by failing fast. Now it has opened a Pandora's box: I have to downgrade all my Glue notebooks from 4.0 to 3.0 so that we maintain the same package versions and compatibility in dev notebooks and prod Glue jobs. Please pass this feedback to the maintainers.
I will pass on the feedback as you suggested. For reference, you could have tried it with a normal Glue job set to version 3.0 and then checked the error logs for the job. They show an error message if the Python library fails to install. In fact, that is how I found the cause. Having said that, it would have been much easier and less time-consuming for you if the error had been displayed in the notebook itself. I will convey this to the relevant team.
Thanks for the response. I tried importing the psycopg2 package from the link you provided and it worked as expected. However, when I tried to import the Django package downloaded from the official website (https://files.pythonhosted.org/packages/89/86/59e237f7176cfc1544446914fa329fd560bb8fce46be52dd7af5dc7c54f9/Django-4.1.7-py3-none-any.whl), it did not work; see the attached screenshot. I am wondering if there is any difference in packaging? BTW, the same Django package works perfectly fine in Glue 4.0.