EC2 vs Glue - Internet Access problem


Hi All,

  1. I have an EC2 instance.
  2. Inside the instance I am able to install libraries via pip3.
  3. I created a Glue job that requires additional libs, with a connection on the same subnet and the same security group as the EC2 instance.
  4. The job is timing out because it is unable to download the libs.

I do not understand where the problem might be.

Thank you for any suggestions.

2 Answers

Since you have not shared the rules of the security group used by Glue, I will assume it allows all outbound traffic. Also, without seeing your route table, I assume you have both resources deployed in a public subnet routed to the internet through an Internet Gateway.
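The outbound assumption can be checked rather than guessed. Below is a minimal sketch, assuming the security group is passed in as a dict shaped like one entry of `SecurityGroups` in a boto3 `describe_security_groups` response. The function name and the sample group IDs are hypothetical and only for illustration.

```python
def allows_all_outbound(security_group: dict) -> bool:
    """Return True if any egress rule allows all protocols to 0.0.0.0/0."""
    for rule in security_group.get("IpPermissionsEgress", []):
        # "-1" means "all protocols" in EC2 security group rules.
        if rule.get("IpProtocol") != "-1":
            continue
        if any(r.get("CidrIp") == "0.0.0.0/0" for r in rule.get("IpRanges", [])):
            return True
    return False


# Hypothetical sample data; real data would come from something like
# boto3.client("ec2").describe_security_groups(GroupIds=[...])["SecurityGroups"].
open_group = {
    "GroupId": "sg-0123456789abcdef0",
    "IpPermissionsEgress": [
        {"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    ],
}
restricted_group = {
    "GroupId": "sg-0fedcba9876543210",
    "IpPermissionsEgress": [
        {
            "IpProtocol": "tcp",
            "FromPort": 5432,
            "ToPort": 5432,
            "IpRanges": [{"CidrIp": "10.0.0.0/16"}],
        },
    ],
}

print(allows_all_outbound(open_group))        # True
print(allows_all_outbound(restricted_group))  # False
```

If this returns True for the shared security group, the security group is unlikely to be the cause of the timeout, and the routing is the next thing to look at.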

You don't have to place your Glue connection in a public subnet to allow VPC traffic to reach the internet. Traffic generated from your Glue connection has to go through a NAT gateway. So you need to ensure:

If you find this helpful, please accept the answer.

AWS
AmerO
answered 6 months ago
EXPERT
reviewed 6 months ago
  • Agree totally. You will need Glue in a private subnet, not a public one.

  • Normally it's easier (and cheaper) to add an Internet Gateway rather than NAT


Hello AmerO,

  1. I am using the same security group for EC2 and for Glue, with Allow All Traffic for Outbound.
  2. Yes, my subnets are public and route traffic to an Internet Gateway.

So, as I understand it, EC2 traffic is routed to the IGW, but Glue traffic must be routed through a NAT? Is this Glue-specific? Is a NAT mandatory for Glue to access the internet?

Thank you.

answered 6 months ago
  • Yes. When you assign the VPC and subnet to your connection, you can click the little blue "info" hyperlink next to those fields. The Subnet info explains the following:

    "Subnet: Enter the subnet in the VPC that contains your data store.

    When AWS Glue connects to a JDBC data store in a VPC, AWS Glue creates an elastic network interface (with the prefix Glue_) in your account to access your VPC data. Each elastic network interface is assigned a private IP address from the IP address range in the subnets that you specify. Don't put your data store in a public subnet or in a private subnet that doesn't have internet access. Instead, attach it only to private subnets that have internet access through a network address translation (NAT) instance or an Amazon VPC NAT gateway. You can configure a NAT instance inside your VPC, or you can use the Amazon VPC NAT gateway."
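To see which case a given subnet falls into, one option is to inspect the default route of its route table. The sketch below assumes the routes are given as a list of dicts shaped like the `Routes` entries of a boto3 `describe_route_tables` response. The function name and the sample IDs are hypothetical. Based on the quoted text, the Glue network interfaces only get private IP addresses, so a default route to an Internet Gateway alone would not be expected to give the job internet access.

```python
DEFAULT_ROUTE = "0.0.0.0/0"


def default_route_target(routes: list) -> str:
    """Classify the IPv4 default route as 'nat', 'igw', 'other', or 'none'."""
    for route in routes:
        if route.get("DestinationCidrBlock") != DEFAULT_ROUTE:
            continue
        if route.get("NatGatewayId"):
            return "nat"
        if route.get("GatewayId", "").startswith("igw-"):
            return "igw"
        # Anything else, e.g. an instance or network interface target
        # (which may or may not be a NAT instance).
        return "other"
    return "none"


# Hypothetical sample data. Real data could be fetched with something like:
#   boto3.client("ec2").describe_route_tables(
#       Filters=[{"Name": "association.subnet-id", "Values": [subnet_id]}]
#   )["RouteTables"][0]["Routes"]
# A subnet without an explicit association uses the VPC's main route table,
# in which case that filter returns nothing.
public_subnet_routes = [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
    {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0123456789abcdef0"},
]
private_subnet_routes = [
    {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
    {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0123456789abcdef0"},
]

print(default_route_target(public_subnet_routes))   # igw
print(default_route_target(private_subnet_routes))  # nat
```

A result of `igw` for the subnet used by the Glue connection would match the situation described in this thread: the EC2 instance, which presumably has a public IP, can reach the internet, while the Glue job cannot.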
