Import External Libraries (Glue) Without Internet


Hi All,

We have a client requirement to run our process in AWS Glue as a Python shell job. Because the client uses a VPN, the job has no internet connectivity, yet we need to import several PyPI packages in the Glue job for further processing.

How do we import those packages, along with their dependencies, without internet access?

Please recommend the best way to import all the libraries without internet connectivity.

We tried building a wheel package for each library and hosting the wheels in S3, but that didn't work.

Asked 6 months ago, 246 views
1 Answer

A Glue Python shell job (unlike Glue ETL) doesn't allow installing packages from S3 directly, but you can create a PyPI repository in S3 containing the packages you need and tell the shell job to use it instead of the public index on the internet.
To do that:

  • Create a repository on S3 with your dependencies, for instance using https://github.com/wolever/pip2pi
  • In the --additional-python-modules parameter you can specify pip flags, so you can use -i to point to the repository index on S3
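
To make the first step concrete, here is a minimal sketch of the kind of "simple" index layout (PEP 503) that a tool like pip2pi produces from a folder of wheels. It is not pip2pi's own code: the function name build_simple_index and the folder layout are hypothetical and chosen for illustration. It also assumes the folder is later uploaded to S3 as-is and served so that directory-style URLs return index.html (for example via static website hosting reachable from the job's network), which this thread does not confirm.

```python
import html
import re
from pathlib import Path


def normalize(name: str) -> str:
    """Normalize a project name the way PEP 503 prescribes."""
    return re.sub(r"[-_.]+", "-", name).lower()


def build_simple_index(wheel_dir: str) -> dict[str, list[str]]:
    """Write simple/<project>/index.html pages for every wheel in wheel_dir.

    Hypothetical helper: returns a mapping of normalized project name
    to the wheel file names found for it.
    """
    root = Path(wheel_dir)
    projects: dict[str, list[str]] = {}
    for wheel in sorted(root.glob("*.whl")):
        # Wheel file names start with "<distribution>-<version>-...".
        project = normalize(wheel.name.split("-", 1)[0])
        projects.setdefault(project, []).append(wheel.name)

    simple = root / "simple"
    simple.mkdir(exist_ok=True)
    for project, files in projects.items():
        page = simple / project / "index.html"
        page.parent.mkdir(exist_ok=True)
        # Links point two levels up, back to the wheels in wheel_dir.
        links = "\n".join(
            f'<a href="../../{html.escape(name)}">{html.escape(name)}</a><br/>'
            for name in files
        )
        page.write_text(f"<!DOCTYPE html>\n<html><body>\n{links}\n</body></html>\n")

    # Root page listing every project, one link per project directory.
    root_links = "\n".join(
        f'<a href="{project}/">{project}</a><br/>' for project in sorted(projects)
    )
    (simple / "index.html").write_text(
        f"<!DOCTYPE html>\n<html><body>\n{root_links}\n</body></html>\n"
    )
    return projects
```

Running build_simple_index("packages") on a local folder of downloaded wheels (dependencies included) and then syncing that folder to the bucket would give pip an index URL ending in /simple/ to use with -i. The wheels themselves still have to be downloaded on a machine that has internet access.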
AWS Expert, answered 6 months ago
  • Hi Gonzalo, thanks for your reply. Could you add more detail on that step? I tried --additional-python-modules with PyPI libraries such as keras==2.15.0, and I also set "--python-modules-installer-option" : "--no-index --find-links=", but that didn't work either. All my wheel files are in a package folder in S3.

  • I think --python-modules-installer-option is for Glue ETL. Try adding the flags directly inside --additional-python-modules, then check the logs to see whether they are picked up.
