How do I troubleshoot AWS Marketplace connection errors in my AWS Glue ETL jobs?
I'm using AWS Marketplace connectors in AWS Glue, but receive errors in the logs.
Resolution
Connectors aren't showing
You subscribed to a connector from AWS Marketplace, but you can't find the connector on the Connectors page in AWS Glue Studio. To resolve this issue, complete the following steps.
Note: You can repeat these steps even if you previously subscribed to the connector.
- Open the AWS Marketplace.
- Choose Discover products, and then find the connector that you want to use.
- Choose Continue to Subscribe, and then, if prompted, log in to your AWS account.
- Choose Continue to Configure. If this option is grayed out and you can't choose it, then make sure to read the terms and conditions. Choose Accept Terms, and then wait until the Continue to Configure button becomes available.
- From the dropdown list, choose Delivery Method and Software Version. If you're not sure which version to choose, then choose the latest version.
- Choose Continue to Launch, and then choose Usage Instruction.
- In the pop-up window that appears, choose Activate the Glue connector from AWS Glue Studio.
- (Optional) To install only the connector, choose Activate connector only. For more information on this option, see Using connectors and connections with AWS Glue Studio. If you're working with custom connectors instead, then see Developing custom connectors.
Issues with AWS Identity and Access Management (IAM) role
When trying to subscribe to a connector in the AWS Marketplace, you get an IAM permissions error similar to the following one:
"You do not have the right permissions to make this request. Some controls have been disabled because you are missing the correct permission(s). The missing permission(s) are: aws-marketplace:Subscribe."
To resolve this issue, attach one of the following AWS managed policies for AWS Marketplace to the IAM user that received the error:
- To grant permissions to view subscriptions but not change them, choose AWSMarketplaceRead-only.
- To grant permissions to subscribe and unsubscribe, choose AWSMarketplaceManageSubscriptions.
- To grant complete control of your subscriptions, choose AWSMarketplaceFullAccess.
For more information, see Controlling access to AWS Marketplace subscriptions.
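If you manage IAM users programmatically, the attachment can be sketched as follows. This is a minimal sketch, not a definitive implementation: the helper names and the "read", "manage", and "full" keys are hypothetical, and the ARNs assume the standard `arn:aws:iam::aws:policy/` prefix for AWS managed policies. The IAM client is passed in so that the snippet runs without credentials.

```python
# Managed policy names are taken from the list above. The short keys are
# hypothetical labels used only in this sketch.
MARKETPLACE_POLICIES = {
    "read": "AWSMarketplaceRead-only",
    "manage": "AWSMarketplaceManageSubscriptions",
    "full": "AWSMarketplaceFullAccess",
}


def marketplace_policy_arn(level):
    """Return the managed policy ARN for one of the access levels above."""
    # Assumes the standard ARN prefix for AWS managed policies.
    return f"arn:aws:iam::aws:policy/{MARKETPLACE_POLICIES[level]}"


def grant_marketplace_access(iam_client, user_name, level="manage"):
    """Attach the chosen managed policy to an IAM user and return its ARN."""
    arn = marketplace_policy_arn(level)
    iam_client.attach_user_policy(UserName=user_name, PolicyArn=arn)
    return arn


# Usage (requires boto3 and credentials that allow iam:AttachUserPolicy):
#   import boto3
#   grant_marketplace_access(boto3.client("iam"), "example-user", "manage")
```

The "manage" level is the narrowest of the three that includes the aws-marketplace:Subscribe permission named in the error.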
AccessDeniedException errors
You receive an AccessDeniedException error similar to the following one in the AWS Glue job's logs:
"An error occurred (AccessDeniedException) when calling the GetAuthorizationToken operation: User: arn:aws:sts::xxxxxxxxxxxx:assumed-role/<IamRole>/GlueJobRunnerSession is not authorized to perform: ecr:GetAuthorizationToken on resource: * because no identity-based policy allows the ecr:GetAuthorizationToken action Glue ETL Marketplace - failed to download connector, activation script exited with code 1 LAUNCH ERROR | Glue ETL Marketplace - failed to download connector. Please refer logs for details."
This error occurs when the IAM role that's associated with your AWS Glue job has insufficient permissions when it tries to perform the GetAuthorizationToken operation. To resolve this issue, give your AWS Glue job the ecr:GetAuthorizationToken permission:
- Open the IAM console.
- Choose the IAM role that you're using in the AWS Glue job.
- Choose Attach policies.
- Under Filter policies, enter AmazonEC2ContainerRegistryReadOnly, and then choose this policy.
- Choose Attach Policy.
After you attach the required policy to the IAM role, run the AWS Glue job again.
For more information, see AmazonEC2ContainerRegistryReadOnly, Adding and removing IAM identity permissions, and Setting up IAM permissions for AWS Glue.
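The console steps above can also be approximated with the IAM API. The following is a sketch under stated assumptions: `ensure_ecr_read_only` is a hypothetical helper, the role is identified by name (not ARN), and only the first page of attached policies is inspected.

```python
# ARN of the managed policy named in the steps above, assuming the standard
# prefix for AWS managed policies.
ECR_READ_ONLY_ARN = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"


def ensure_ecr_read_only(iam_client, role_name):
    """Attach AmazonEC2ContainerRegistryReadOnly to the role if it's missing.

    Returns True if the policy was attached by this call, and False if it
    was already attached.
    """
    # Only the first page of results is checked in this sketch.
    response = iam_client.list_attached_role_policies(RoleName=role_name)
    attached = {p["PolicyArn"] for p in response.get("AttachedPolicies", [])}
    if ECR_READ_ONLY_ARN in attached:
        return False
    iam_client.attach_role_policy(RoleName=role_name, PolicyArn=ECR_READ_ONLY_ARN)
    return True


# Usage (requires boto3 and credentials):
#   import boto3
#   ensure_ecr_read_only(boto3.client("iam"), "example-glue-job-role")
```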
Networking issues - No network pathway from VPC
Your networking setup might not be adequate for AWS Glue connectors to work correctly when they're used in an AWS Glue job. In this case, you receive an error similar to the following one in the AWS Glue job's logs:
botocore.exceptions.ConnectTimeoutError: Connect timeout on endpoint URL: "https://api.ecr.us-east-1.amazonaws.com/" Glue ETL Marketplace - failed to download connector, activation script exited with code 1 LAUNCH ERROR | Glue ETL Marketplace - failed to download connector.Please refer logs for details. Exception in thread "main" java.lang.Exception: Glue ETL Marketplace - failed to download connector.
In this example, there's no network pathway from the virtual private cloud (VPC) containing the job's components to the Amazon Elastic Container Registry (Amazon ECR) repository. The Amazon ECR repository contains the images for the connectors. AWS Glue stores all connectors in an Amazon ECR repository in the us-east-1 AWS Region. If the AWS Glue job wants to use a connector, then it must download it from this Region.
When a connection is added to an AWS Glue job, you must establish a network route. This network route allows traffic to flow to or from the service, in this case Amazon ECR. AWS Glue uses private IPs to communicate with the components of the job and services, such as Amazon ECR. This error can occur if your connection uses a public subnet with an internet gateway in its route table.
The job of the internet gateway is to route traffic. But it can't convert the private IPs that AWS Glue uses into public IPs that the Amazon ECR endpoint in the us-east-1 Region recognizes. You must use a NAT gateway in the connection subnet that's capable of performing these address translations (private to public IPs).
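To check which kind of gateway the connection subnet's default route uses, you can inspect the subnet's route table. The following sketch assumes the route table dictionary has the shape of one item in the `RouteTables` list that the Amazon EC2 `describe_route_tables` API returns. The function name and the return labels are hypothetical.

```python
def default_route_target(route_table):
    """Classify the 0.0.0.0/0 route of a route table.

    Returns "nat" for a NAT gateway, "igw" for an internet gateway,
    "other" for any other target, or None if there's no default route.
    """
    for route in route_table.get("Routes", []):
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue
        if route.get("NatGatewayId"):
            return "nat"
        if route.get("GatewayId", "").startswith("igw-"):
            return "igw"
        return "other"
    return None


# Usage (requires boto3 and credentials; the subnet ID is a placeholder):
#   import boto3
#   ec2 = boto3.client("ec2")
#   tables = ec2.describe_route_tables(
#       Filters=[{"Name": "association.subnet-id", "Values": ["subnet-EXAMPLE"]}]
#   )["RouteTables"]
#   print([default_route_target(t) for t in tables])
```

A result of "igw" matches the failing setup described above. Note that the `association.subnet-id` filter returns only explicit associations, so an empty result likely means that the subnet uses the VPC's main route table.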
When you create the connection, the networking information, such as the VPC, subnet, and security group, is optional. If you create the connection with only the connector and the AWS Secrets Manager key, then the AWS Glue job uses an internal NAT gateway. The job doesn't rely on a NAT gateway in your account.
To resolve this issue, choose one of the following solutions, and incorporate it into your network design.
Approach 1: Create and attach a NAT gateway to the connection subnet
Instead of using an internet gateway, create and attach a NAT gateway to the connection subnet:
- Provision an unattached Elastic IP address to your account. Make sure that you associate this IP address with the NAT gateway.
- Create a NAT gateway, and then choose a public subnet and the Elastic IP address that you provisioned for the NAT gateway. This creates the NAT gateway in a public subnet.
- Create a private subnet (without an internet gateway route) and a related route table. In the route table, add a route with destination 0.0.0.0/0 that points to the NAT gateway that you created. Or, you can edit one of the existing subnets to use the route table with the NAT gateway route. Make sure that this route table doesn't also contain an internet gateway route.
- Revise the AWS Glue connection's subnet to use the private subnet that you created in the previous step.
- Run the AWS Glue job again and confirm that the error doesn't re-occur.
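The NAT gateway steps above can be sketched with the Amazon EC2 API as follows. This is a sketch under stated assumptions, not a definitive implementation: `create_nat_route` is a hypothetical helper, the public and private subnets already exist, and the private subnet has no explicit route table association yet.

```python
def create_nat_route(ec2, public_subnet_id, vpc_id, private_subnet_id):
    """Create a NAT gateway and route a private subnet's traffic through it.

    Returns a (nat_gateway_id, route_table_id) tuple.
    """
    # Provision an Elastic IP address for the NAT gateway.
    allocation = ec2.allocate_address(Domain="vpc")

    # Create the NAT gateway in the public subnet.
    nat = ec2.create_nat_gateway(
        SubnetId=public_subnet_id,
        AllocationId=allocation["AllocationId"],
    )
    nat_id = nat["NatGateway"]["NatGatewayId"]

    # Wait until the NAT gateway is available before adding a route to it.
    ec2.get_waiter("nat_gateway_available").wait(NatGatewayIds=[nat_id])

    # Create a route table whose default route points to the NAT gateway.
    table = ec2.create_route_table(VpcId=vpc_id)
    table_id = table["RouteTable"]["RouteTableId"]
    ec2.create_route(
        RouteTableId=table_id,
        DestinationCidrBlock="0.0.0.0/0",
        NatGatewayId=nat_id,
    )

    # Associate the route table with the private (connection) subnet.
    ec2.associate_route_table(RouteTableId=table_id, SubnetId=private_subnet_id)
    return nat_id, table_id


# Usage (requires boto3 and credentials; all IDs are placeholders):
#   import boto3
#   create_nat_route(boto3.client("ec2"), "subnet-PUBLIC", "vpc-EXAMPLE", "subnet-PRIVATE")
```

The sketch doesn't update the AWS Glue connection itself. You still need to point the connection at the private subnet and rerun the job.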
Approach 2: Don't use VPC information in the connection
Don't include VPC information in the connection. Use the internal NAT gateway instead:
- Create a new connection for your connector in AWS Glue Studio.
- When you create the connection, specify only the Secrets Manager key. Don't add any VPC options. This means that AWS Glue uses the internal NAT instead of relying on the subnet.
- Edit the AWS Glue job to use the new connection, and then rerun the job.
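To confirm that a connection carries no VPC options, you can inspect the connection definition. The sketch below assumes a dictionary with the shape of the `Connection` object that the AWS Glue `get_connection` API returns. `connection_network_settings` is a hypothetical helper name.

```python
def connection_network_settings(connection):
    """Return the non-empty VPC-related settings of an AWS Glue connection.

    An empty dictionary means that the connection has no subnet, security
    group, or Availability Zone configured.
    """
    requirements = connection.get("PhysicalConnectionRequirements") or {}
    return {key: value for key, value in requirements.items() if value}


# Usage (requires boto3 and credentials; the connection name is a placeholder):
#   import boto3
#   glue = boto3.client("glue")
#   conn = glue.get_connection(Name="example-connection")["Connection"]
#   print(connection_network_settings(conn) or "No VPC options set")
```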
Approach 3: For private network setups, create a VPC endpoint
If you have a private network setup, then you can also use a VPC endpoint instead of using a NAT gateway.
Create a private network
- Log in to the Amazon Virtual Private Cloud (Amazon VPC) console.
- From the navigation pane, choose Subnet, and then choose Create subnet.
- Choose your VPC ID, and then add a Subnet name, Availability Zone, IPv4 CIDR block, and tags. Then, choose Create subnet.
- From the navigation pane, choose Route tables, and then choose Create route table. Add a name for your route table, choose your VPC, and then choose Create route table.
- Open the route table that you created. Under Subnet association, choose the tab for Explicit subnet associations.
- Choose Edit subnet association, choose the newly created subnet from the list, and then choose Save. If you check the Route tab, you can now see that there's no route to the internet (0.0.0.0/0).
- From the navigation pane, choose Security groups. Add the details for your security group, and then choose the VPC. Add an inbound rule to allow TCP protocol for port 22 with source 0.0.0.0/0.
- Add a second rule. For Protocol, choose ALL. For Source, choose the new security group that you created. If you can't find the new security group name in the dropdown list, then save the group, and edit the inbound rules again.
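The two inbound rules above can be expressed as the `IpPermissions` structure that the Amazon EC2 `authorize_security_group_ingress` API accepts. This is a minimal sketch, and `ingress_rules` is a hypothetical helper name.

```python
def ingress_rules(security_group_id):
    """Build the two inbound rules described in the steps above."""
    return [
        # Rule 1: TCP port 22 from any IPv4 address.
        {
            "IpProtocol": "tcp",
            "FromPort": 22,
            "ToPort": 22,
            "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
        },
        # Rule 2: all protocols ("-1"), with the security group itself as source.
        {
            "IpProtocol": "-1",
            "UserIdGroupPairs": [{"GroupId": security_group_id}],
        },
    ]


# Usage (requires boto3 and credentials; the group ID is a placeholder):
#   import boto3
#   sg_id = "sg-EXAMPLE"
#   boto3.client("ec2").authorize_security_group_ingress(
#       GroupId=sg_id, IpPermissions=ingress_rules(sg_id)
#   )
```

Because the second rule references the group's own ID, the security group must exist before the rule is added. This mirrors the console step of saving the group and then editing its inbound rules again.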
Create VPC endpoint
Next, create three VPC endpoints: an Amazon ECR API endpoint, a VPC endpoint for the com.amazonaws.<region>.ecr.dkr service, and an Amazon Simple Storage Service (Amazon S3) endpoint.
First, create the Amazon ECR API endpoint:
- From the navigation pane, choose Endpoints.
- Choose Create endpoint, and then add an endpoint name for your Amazon ECR API endpoint.
- For Service category, choose AWS services.
- For Services, add the ECR filter, and then choose com.amazonaws.<region>.ecr.api.
- For VPC, choose the VPC that you want to create the endpoint in. Under Additional settings, choose Enable DNS name.
- For Subnets, choose the Availability Zone that you created the new subnet in. Then, from the Subnet ID dropdown list, choose the Subnet name.
- For Security groups, choose the security group that you created.
- For Policy, choose Full access to allow all operations by all principals on all resources over the VPC endpoint.
- Add an optional tag, and then choose Create endpoint.
Using the same steps, create another VPC endpoint for the service name com.amazonaws.<region>.ecr.dkr.
Then, complete the following steps to create the Amazon S3 endpoint:
- From the navigation pane, choose Endpoints.
- Choose Create endpoint, and then add an endpoint name for your Amazon S3 endpoint.
- For Service category, choose AWS services.
- For Services, add the Type:Gateway filter, and then choose com.amazonaws.<region>.s3.
- For VPC, choose the VPC that you want to create the endpoint in.
- For Route tables, choose the route tables that you created.
- For Policy, choose Full access to allow all operations by all principals on all resources over the VPC endpoint.
- Add an optional tag, and then choose Create endpoint.
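The three endpoints above can be described as parameter sets for the Amazon EC2 `create_vpc_endpoint` API. The following is a sketch: `endpoint_requests` is a hypothetical helper, and it leaves out `PolicyDocument` on the assumption that the API then applies its default full-access policy, which corresponds to the Full access choice in the console.

```python
def endpoint_requests(region, vpc_id, subnet_ids, security_group_ids, route_table_ids):
    """Build create_vpc_endpoint parameters for the ECR API, ECR DKR, and S3 endpoints."""
    interface_common = {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # Corresponds to the Enable DNS name setting in the console.
        "PrivateDnsEnabled": True,
    }
    return [
        {**interface_common, "ServiceName": f"com.amazonaws.{region}.ecr.api"},
        {**interface_common, "ServiceName": f"com.amazonaws.{region}.ecr.dkr"},
        {
            # The Amazon S3 endpoint is a gateway endpoint, so it attaches
            # to route tables instead of subnets and security groups.
            "VpcEndpointType": "Gateway",
            "VpcId": vpc_id,
            "ServiceName": f"com.amazonaws.{region}.s3",
            "RouteTableIds": list(route_table_ids),
        },
    ]


# Usage (requires boto3 and credentials; all IDs are placeholders):
#   import boto3
#   ec2 = boto3.client("ec2")
#   for params in endpoint_requests("us-east-1", "vpc-EXAMPLE", ["subnet-EXAMPLE"],
#                                   ["sg-EXAMPLE"], ["rtb-EXAMPLE"]):
#       ec2.create_vpc_endpoint(**params)
```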
Subscribe to and configure connectors
If you already subscribed to and configured your connector in AWS Glue, then proceed to the Create AWS Glue connection section.
If you didn't subscribe to and configure your connector in AWS Glue, then follow the steps in Subscribing to AWS Marketplace connectors. In the Usage instruction pop-up window that appears, choose Activate the Glue connector from AWS Glue Studio. This takes you to the Create Glue Connection page.
Create AWS Glue Connection
If you already added your connector in the AWS Glue console, then navigate to Connections and choose your connector. Then, choose Create connection.
If you followed the previous steps to subscribe to and configure connectors, then the Create Glue Connection page is open. Complete the following steps to create your connection:
- On the Create Glue Connection page, add a Connection name.
- For Network options, choose your VPC, and the subnet and security groups that you previously created.
- Choose Create connection and activate connector.
You can now use the connection name in your AWS Glue job to control the connector.
Networking issues - too many connections in the AWS Glue job
You receive this error in the AWS Glue job's logs:
INFO - Glue ETL Marketplace - Start downloading connector jars for connection: <connection name> test connection feature: "Caused by: com.amazonaws.services.glue.exceptions.InvalidInputException: Connection: does not exist" LAUNCH ERROR | Glue ETL Marketplace - failed to download connector.Please refer logs for details.
AWS Glue supports one connection per job or development endpoint. If you specify more than one connection in a job, then AWS Glue uses only the first connection. If you must access more than one VPC, then see Connect to and run ETL jobs across multiple VPCs using a dedicated AWS Glue VPC.
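To see whether a job lists more than one connection, you can inspect the job definition. The sketch below assumes a dictionary with the shape of the `Job` object that the AWS Glue `get_job` API returns. Both function names are hypothetical.

```python
def job_connection_names(job):
    """Return the connection names attached to an AWS Glue job, in order."""
    return list(job.get("Connections", {}).get("Connections", []))


def extra_connections(job):
    """Return every connection after the first one in the job definition."""
    return job_connection_names(job)[1:]


# Usage (requires boto3 and credentials; the job name is a placeholder):
#   import boto3
#   job = boto3.client("glue").get_job(JobName="example-job")["Job"]
#   print("Connections beyond the first:", extra_connections(job))
```

If `extra_connections` returns any names, then make sure that the Marketplace connection is the one that the job actually uses.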