
Facing issues while installing dependencies in AWS Glue Jupyter Notebook


Hello team, I am trying to install libraries through %additional_python_modules in my Glue notebook. Every library installs fine except prophet and sqlalchemy. I downloaded wheel files for both libraries, uploaded them to S3, and passed the S3 paths of those wheel files to %additional_python_modules, but it still does not work. I also tried installing lower versions of both libraries, for example prophet==1.1.5. Has anyone faced the same issue?

For reference, I am using Glue 5.0 with Python 3.

  • %additional_python_modules pandas,numpy,seaborn,path to s3 for prophet, path to s3 for sqlalchemy.

ModuleNotFoundError: No module named prophet

Asked 1 year ago, 278 views
3 Answers

It seems you're encountering difficulties installing the 'prophet' and 'sqlalchemy' libraries in your AWS Glue Jupyter Notebook. This is a common issue, especially with libraries that have complex dependencies or require compilation.

AWS Glue 5.0 and later support managing Python library dependencies with a requirements.txt file. You can try it as follows:

  1. Create a requirements.txt file with the libraries you need, including prophet and sqlalchemy.

  2. Upload this file to an S3 bucket.

  3. In your Glue job configuration, add two job parameters:

    • Key: --python-modules-installer-option Value: -r
    • Key: --additional-python-modules Value: s3://path_to_your_requirements.txt

This method should install the libraries specified in your requirements.txt file.
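
The two job parameters above could also be assembled in a script, for example when a job is created programmatically. The following is a minimal sketch that only builds the argument dictionary; the function name `build_requirements_arguments` and the bucket path are hypothetical, and nothing in this thread confirms that this resolves the notebook problem in the question.

```python
def build_requirements_arguments(requirements_s3_uri: str) -> dict[str, str]:
    """Return Glue job arguments that install modules from a requirements.txt file."""
    if not requirements_s3_uri.startswith("s3://"):
        raise ValueError("expected an s3:// URI, got: " + requirements_s3_uri)
    return {
        # Tells the installer to treat the module path as a requirements file.
        "--python-modules-installer-option": "-r",
        # S3 location of the uploaded requirements.txt.
        "--additional-python-modules": requirements_s3_uri,
    }


if __name__ == "__main__":
    # Hypothetical bucket and key; replace with your own location.
    args = build_requirements_arguments("s3://example-bucket/requirements.txt")
    for key, value in args.items():
        print(f"{key} = {value}")
```

A dictionary of this shape is what boto3's `create_job` call for Glue accepts as `DefaultArguments`. In an interactive session, the `%%configure` cell magic takes a JSON dictionary of session parameters, so the same two keys could be tried there as well; that is an assumption, not something verified in this thread.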

If you're using an earlier version of Glue, or if this method doesn't work, you can try the following:

  1. Use the --additional-python-modules parameter in your job configuration, specifying the exact versions of the libraries you need. For example: prophet==1.1.5,sqlalchemy==1.4.0

  2. If the libraries have complex dependencies, you might need to create a custom runtime environment that includes these packages and use this environment for your Glue job.

  3. For libraries with native code dependencies (which may be the case for prophet), you may need to find or build a precompiled wheel file that matches both the Python version and the Amazon Linux release used by your Glue version.

  4. Check the full CloudWatch logs for your Glue job to see if there are any specific error messages related to the installation process.
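
For the wheel compatibility point in step 3, one way to check a wheel before uploading it to S3 is to read the tags in its filename and compare them with the interpreter that will install it. The sketch below relies on the standard wheel filename layout (name, version, optional build tag, Python tag, ABI tag, platform tag); the helper names and the example filename are hypothetical.

```python
import platform
import sys


def parse_wheel_tags(filename: str) -> dict[str, str]:
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file: " + filename)
    parts = filename[: -len(".whl")].split("-")
    # A wheel filename has 5 parts, or 6 when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename: " + filename)
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def describe_runtime() -> dict[str, str]:
    """Report the interpreter details that a wheel's tags need to match."""
    return {
        "python": f"{sys.version_info.major}.{sys.version_info.minor}",
        "machine": platform.machine(),
        "system": platform.system(),
    }


if __name__ == "__main__":
    # Illustrative filename only; use the name of the wheel you uploaded.
    print(parse_wheel_tags("example_pkg-1.0.0-cp311-cp311-manylinux_2_17_x86_64.whl"))
    print(describe_runtime())
```

Running `describe_runtime()` inside the Glue session itself, rather than on a local machine, shows the values the wheel has to match.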

If these methods don't work, you might need to consider alternative libraries that are more compatible with the Glue environment, or reach out to AWS support for further assistance.

Remember, when using custom packages, ensure they are compatible with the Python version and environment used by your AWS Glue job. Always test your job thoroughly after adding custom packages to ensure everything works as expected.

Sources
Using Python libraries with AWS Glue - AWS Glue
AWS Glue Visual ETL - Install/ import packages | AWS re:Post
Unable to import/install external library 'psycopg2' in AWS Glue | AWS re:Post
AWS Glue 5.0 Not Installing Libraries from requirements.txt | AWS re:Post

Answered 1 year ago

Hello,


I tried importing the two modules you mentioned, prophet and sqlalchemy. The configuration I used is below.


%idle_timeout 2880
%glue_version 5.0
%worker_type G.1X
%number_of_workers 5
%additional_python_modules prophet,sqlalchemy

import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job

# Create (or reuse) the Spark context and wrap it in a Glue context.
sc = SparkContext.getOrCreate()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)


Once the session was created, I ran the following import statements in a new cell, and they completed without any errors.

import prophet
import sqlalchemy
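
If a plain `import` fails, it can help to see which modules the session can actually find and at what version. The sketch below uses only the Python standard library; the helper name `check_modules` is hypothetical, and the module list is taken from the question.

```python
import importlib.util
from importlib import metadata


def check_modules(names: list[str]) -> dict[str, str]:
    """Map each module name to its installed version, or a short status string."""
    report = {}
    for name in names:
        if importlib.util.find_spec(name) is None:
            report[name] = "not found"
            continue
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Importable, but no distribution is registered under this exact name.
            report[name] = "importable (version unknown)"
    return report


if __name__ == "__main__":
    for module, status in check_modules(["prophet", "sqlalchemy"]).items():
        print(f"{module}: {status}")
```

A "not found" result for prophet in a session that was started with `%additional_python_modules` would suggest the installation step itself failed, which is where the session logs become relevant.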



I also tried pinning the version of the prophet module, as follows.


%idle_timeout 2880
%glue_version 5.0
%worker_type G.1X
%number_of_workers 5
%additional_python_modules prophet==1.1.5,sqlalchemy

import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job

# Create (or reuse) the Spark context and wrap it in a Glue context.
sc = SparkContext.getOrCreate()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)


After the session was created successfully, the same import statements again ran without any errors.

However, if you are still experiencing the issue, it is difficult to pinpoint the cause or suggest an effective mitigation without a detailed investigation. A definitive answer would require access to your logs and to case tools for reviewing the resources involved.

Therefore, I kindly ask you to raise a case with the AWS Support team, so that our team can investigate thoroughly and identify the root cause of the problem.


Link to raise case: https://support.console.aws.amazon.com/support/home#/case/create

Thanks!

AWS
Answered 1 year ago

I tried the same steps, but I am still facing the same problem.

Answered 1 year ago
