AWS Glue notebook - Kernel fails with Glue version exception

1

I am trying to run the notebooks locally. I followed the instructions provided here (for Windows): https://docs.aws.amazon.com/glue/latest/dg/interactive-sessions.html

I ran pip3 install --upgrade jupyter boto3 aws-glue-sessions, which upgraded aws-glue-sessions to version 0.35.

However, when starting the notebook, the kernel fails to launch and throws the following error:

  File "d:\python38\lib\site-packages\aws_glue_interactive_sessions_kernel\glue_pyspark\GlueKernel.py", line 100, in __init__
    self.set_glue_version(os_env_glue_version)
  File "d:\python38\lib\site-packages\aws_glue_interactive_sessions_kernel\glue_pyspark\GlueKernel.py", line 443, in set_glue_version
    raise Exception(f"Valid Glue versions are {VALID_GLUE_VERSIONS}")
Exception: Valid Glue versions are {'2.0', '3.0'}

Setting glue_version = 2.0 in .aws/config and in the environment variables does not help either. Any help on what could be causing this would be much appreciated!

3 Answers
2

I'm having this exact same issue and found the culprit: on Windows, environment variables that aren't set are not being seen as null when they are looked up.

Find the GlueKernel.py file for the Glue interactive sessions package in the site-packages folder of your Python environment, for example:

site-packages\aws_glue_interactive_sessions_kernel\glue_pyspark\GlueKernel.py

Every environment variable lookup was always being returned as '${SOME_KEY}' instead of None. Change the function at line 871 to:

def _retrieve_os_env_variable(self, key):
    _, output = subprocess.getstatusoutput(f"echo ${key}")
    # Use the shell output only if the variable was actually expanded;
    # otherwise fall back to os.environ, which returns None when unset.
    if output != '${}'.format(key):
        return output
    else:
        return os.environ.get(key)
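
To illustrate the idea behind this fix, here is a minimal standalone sketch (not the package's actual code) of a lookup that skips the shell entirely, plus a check for the unexpanded placeholder that a Windows shell echoes back. The function names and the choice to treat an empty string as unset are assumptions made for this example.

```python
import os


def retrieve_os_env_variable(key):
    """Return the value of an environment variable, or None if unset or empty."""
    value = os.environ.get(key)
    # Assumption of this sketch: an empty string counts as "not set".
    return value if value else None


def is_unexpanded_placeholder(output, key):
    """True when a shell echoed '$KEY' back verbatim instead of expanding it."""
    return output.strip() == f"${key}"


if __name__ == "__main__":
    # Prints None when GLUE_VERSION is not set, on any platform.
    print(retrieve_os_env_variable("GLUE_VERSION"))
```

Reading os.environ directly behaves the same on Windows, macOS, and Linux, which is why it avoids the placeholder problem described above.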
joxy
answered 2 years ago
AWS
EXPERT
reviewed 2 years ago
  • With the help of a colleague using a Windows laptop, we have been able to reproduce it.

  • Thanks Fabrizi. I updated the code in GlueKernel.py, but nothing changed; the terminal still throws the following exception:

    site-packages\aws_glue_interactive_sessions_kernel\glue_pyspark\GlueKernel.py", line 432, in set_glue_version
        raise Exception(f"Valid Glue versions are {VALID_GLUE_VERSIONS}")
    Exception: Valid Glue versions are {'2.0', '3.0'}

    Did you have to do anything else after updating and saving the script, such as recompiling, to make it work?
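
    One thing worth checking when an edit to GlueKernel.py seems to have no effect is whether the edited file is the copy Python actually loads, for example when several Python environments are installed. The sketch below uses the standard library's importlib to print where a package is loaded from; the package name is taken from the traceback, and the helper name module_origin is hypothetical.

```python
import importlib.util


def module_origin(name):
    """Return the file path Python would load for a module, or None if not found."""
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # Package name as it appears in the traceback above.
    print(module_origin("aws_glue_interactive_sessions_kernel"))
```

    Run it with the same interpreter that launches Jupyter. If the printed path is in a different site-packages folder than the file that was edited, the change is not being picked up.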

0

Hi,

Have you tried setting the Glue version through the magic command, as described here?

%glue_version 2.0

or

%glue_version 3.0

This needs to be set in the first cell, before you start the session.

I tested this locally and it worked. Hope this helps.

AWS
EXPERT
answered 2 years ago
  • The error message appears when running the Jupyter Notebook command, so the kernel fails to launch (as a result, Jupyter Notebook disconnects from the PySpark kernel immediately after launch).

    The magic command is only useful if the notebook is connected to the PySpark kernel, which it isn't in this case. Adding the Glue version to the config files in the .aws folder does not resolve this either.

  • @rePost-User-6963104, sorry, I may not have fully captured your context when reading the question. I did the upgrade on my Mac and did not see any issue, so I thought you were getting the error at session initialization time.

0

I had to manually hard-code the version as 3.0 in the set_glue_version method inside GlueKernel.py. This got rid of the error, but it obviously isn't a great solution. This implementation is pretty buggy.
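
A slightly less invasive variant of this workaround, sketched here under assumptions rather than taken from the package, is to fall back to a default only when the incoming value is missing or looks like an unexpanded shell placeholder, and to keep rejecting genuinely invalid versions. The valid versions come from the exception message in the question; the function name resolve_glue_version and the default of 3.0 are hypothetical.

```python
VALID_GLUE_VERSIONS = {"2.0", "3.0"}  # from the exception message in the question
DEFAULT_GLUE_VERSION = "3.0"  # assumption: the version this answer hard-coded


def resolve_glue_version(raw_value, default=DEFAULT_GLUE_VERSION):
    """Return a valid Glue version, using the default for missing or unexpanded values."""
    if raw_value is None:
        return default
    candidate = str(raw_value).strip()
    if candidate in VALID_GLUE_VERSIONS:
        return candidate
    # An empty string or something like '$GLUE_VERSION' means the variable was never set.
    if candidate == "" or candidate.startswith("$"):
        return default
    raise ValueError(f"Valid Glue versions are {sorted(VALID_GLUE_VERSIONS)}")


if __name__ == "__main__":
    print(resolve_glue_version("$GLUE_VERSION"))
    print(resolve_glue_version("2.0"))
```

This keeps an explicit, valid setting intact while avoiding the startup failure when no version was configured at all.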

answered a year ago
