Glue job cannot download the Hudi connector: 403 Forbidden (IAM role has full access to EC2ContainerRegistry and Marketplace)


I followed this blog post to try the Hudi connector: Ingest streaming data to Apache Hudi tables using AWS Glue and Apache Hudi DeltaStreamer.

But whenever I start the Glue job, I get this error in the log:

2023-03-28 12:39:33,136 - __main__ - INFO - Glue ETL Marketplace - Preparing layer url and gz file path to store layer 8de5b65bd171294b1e04e0df439f4ea11ce923b642eddf3b3d76d297bfd2670c.
2023-03-28 12:39:33,136 - __main__ - INFO - Glue ETL Marketplace - Getting the layer file 8de5b65bd171294b1e04e0df439f4ea11ce923b642eddf3b3d76d297bfd2670c and store it as gz.
Traceback (most recent call last):
  File "/usr/lib64/python3.7/", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.7/", line 85, in _run_code
    exec(code, run_globals)
  File "/tmp/aws_glue_custom_connector_python/docker/", line 361, in <module>
  File "/tmp/aws_glue_custom_connector_python/docker/", line 351, in main
    res += download_jars_per_connection(conn, region, endpoint, proxy)
  File "/tmp/aws_glue_custom_connector_python/docker/", line 304, in download_jars_per_connection
    download_and_unpack_docker_layer(ecr_url, layer["digest"], dir_prefix, http_header)
  File "/tmp/aws_glue_custom_connector_python/docker/", line 168, in download_and_unpack_docker_layer
    layer = send_get_request(layer_url, header)
  File "/tmp/aws_glue_custom_connector_python/docker/", line 80, in send_get_request
  File "/home/spark/.local/lib/python3.7/site-packages/requests/", line 941, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url:
Glue ETL Marketplace - failed to download connector, activation script exited with code 1
LAUNCH ERROR | Glue ETL Marketplace - failed to download connector.Please refer logs for details.
Exception in thread "main" 
java.lang.Exception: Glue ETL Marketplace - failed to download connector.

I guess the root cause is one of the following:

  1. The Glue job cannot pull the connector image from AWS Marketplace.
  2. The connector image cannot be stored in the S3 bucket.

So I tried these fixes:

  1. Grant permissions to the job's IAM role. I attached AWSMarketplaceFullAccess, AmazonEC2ContainerRegistryFullAccess, and AmazonS3FullAccess, which I think should definitely be enough.
  2. Make the S3 bucket public. I turned off Block public access for the related S3 bucket.

But even after doing both, I still get the same error. Does anyone have any suggestions?

asked a year ago, 432 views
1 Answer
Accepted Answer

If you are using Glue 3.0 or later, the best way to add Hudi support nowadays is to add the job parameter --datalake-formats=hudi instead of depending on the Marketplace connector.
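As a rough illustration of that suggestion, here is a minimal sketch of a job definition that sets the parameter. The helper name build_hudi_job_args, the job name, role ARN, and script path are all hypothetical placeholders, and the "glueetl" command type is an assumption about the job in question. The sketch only builds the arguments; it does not call AWS.

```python
def build_hudi_job_args(job_name, role_arn, script_location, glue_version="4.0"):
    """Build keyword arguments for a Glue job that uses the built-in Hudi support.

    Hypothetical helper for illustration; the only setting taken from the
    answer above is the "--datalake-formats" job parameter.
    """
    # The answer says the parameter applies to Glue 3 or later.
    if float(glue_version) < 3.0:
        raise ValueError("--datalake-formats requires Glue 3.0 or later")

    return {
        "Name": job_name,
        "Role": role_arn,
        "GlueVersion": glue_version,
        "Command": {
            "Name": "glueetl",  # assumption: a Spark ETL job
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        "DefaultArguments": {
            # Key and value are separate fields in the job parameters,
            # so no Marketplace connection is attached to the job.
            "--datalake-formats": "hudi",
        },
    }


if __name__ == "__main__":
    args = build_hudi_job_args(
        job_name="hudi-example-job",                                 # placeholder
        role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",   # placeholder
        script_location="s3://example-bucket/scripts/job.py",        # placeholder
    )
    print(args["DefaultArguments"])
```

If you use boto3, a dictionary like this could be passed as boto3.client("glue").create_job(**args). In the console, the equivalent is adding a job parameter with key --datalake-formats and value hudi, and removing the Marketplace connection from the job.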

AWS
answered a year ago
  • Thank you for your prompt reply! This problem has been successfully resolved!
