I am not sure you would be getting the benefit of Glue's compute if the job mainly calls an API. The driver would have to handle the API requests, while the executors' compute power would sit unused rather than calling the API in parallel. Given that, I believe you would not use the power of Glue until you process data with PySpark/DynamicFrames. It may be more efficient (and less expensive) to orchestrate a Lambda function that reads from S3, calls the API, does the transformation, and writes back to S3 before a Glue job processes/transforms the data for your ETL.
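To make the Lambda-first option concrete, here is a rough sketch of what such a function could look like. It assumes an S3 event trigger, a JSON-lines input object, and an API that accepts one record per POST and replies with JSON. The names process_object, call_api, transform, and the OUTPUT_BUCKET and API_URL environment variables are all hypothetical, so treat this as a starting point rather than a finished implementation.

```python
import json
import os
import urllib.parse
import urllib.request


def transform(record, api_result):
    """Hypothetical transformation: attach the API reply to the record."""
    return {**record, "api": api_result}


def call_api(url, record):
    """POST one record as JSON and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(record).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as reply:
        return json.loads(reply.read())


def process_object(s3, bucket, key, out_bucket, api_url, call=call_api):
    """Read a JSON-lines object, enrich each record via the API, write it out."""
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    records = [json.loads(line) for line in body.splitlines() if line.strip()]
    enriched = [transform(record, call(api_url, record)) for record in records]
    payload = "\n".join(json.dumps(record) for record in enriched).encode("utf-8")
    # Write to a separate bucket so the output does not re-trigger this function.
    s3.put_object(Bucket=out_bucket, Key=key, Body=payload)
    return len(enriched)


def lambda_handler(event, context):
    import boto3  # imported here so the helpers above can be tested without it

    s3 = boto3.client("s3")
    for item in event["Records"]:
        bucket = item["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(item["s3"]["object"]["key"])
        process_object(
            s3, bucket, key, os.environ["OUTPUT_BUCKET"], os.environ["API_URL"]
        )
```

The Glue job can then read the enriched objects from the output bucket and do the heavier PySpark/DynamicFrame processing there.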
That said, there may be a use case for what you want to implement. If you want to try calling an API from Glue using Python code, you could try the following code.
```python
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
import boto3  ## Library for invoking Lambda

## @params: [JOB_NAME]
args = getResolvedOptions(sys.argv, ['JOB_NAME'])

sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

## Your ETL logic prior to invoking Lambda

## Once the ETL completes, invoke the Lambda function
lambda_client = boto3.client('lambda')
response = lambda_client.invoke(FunctionName='LambdaName')

## Your ETL code after invoking Lambda

## Mark the job run as complete
job.commit()
```
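If the Lambda function needs input from the job, or the job needs to know whether the function succeeded, a small wrapper around invoke can help. The sketch below is only an illustration: invoke_lambda is a hypothetical helper name, and it assumes the function accepts and returns JSON. It relies on the documented behaviour that a synchronous invoke reports function-level failures through the FunctionError field of the response rather than through the status code.

```python
import json


def invoke_lambda(lambda_client, function_name, payload):
    """Invoke a Lambda function synchronously and return its decoded JSON result."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the function to finish
        Payload=json.dumps(payload).encode("utf-8"),
    )
    body = response["Payload"].read().decode("utf-8")
    # FunctionError is only present when the function itself raised an error.
    if response.get("FunctionError"):
        raise RuntimeError(f"Lambda {function_name} failed: {body}")
    return json.loads(body) if body else None
```

In the job above you would call it as invoke_lambda(lambda_client, 'LambdaName', {...}) in place of the bare lambda_client.invoke(...) line. The Glue job's IAM role needs lambda:InvokeFunction permission on the target function.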
If you want to call an external API, you need to install the requests module using the --additional-python-modules job parameter and then use the code below:
```python
import requests

url = "https://example.com/api/jobs/test"
response = requests.post(url)
print(response.text)  # TEXT/HTML
print(response.status_code, response.reason)  # HTTP
```
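Since a hung or failing API call would stall the whole job on the driver, you may also want a timeout and a few retries. The following is a minimal sketch under my own assumptions: post_with_retry is a hypothetical helper, the payload is sent as JSON, and only HTTP 5xx responses are treated as worth retrying. The session argument is expected to behave like requests.Session.

```python
import time


def post_with_retry(session, url, payload, attempts=3, backoff_seconds=1.0, timeout=30):
    """POST JSON to url, retrying HTTP 5xx responses with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        response = session.post(url, json=payload, timeout=timeout)
        if response.status_code < 500:
            # Raises for 4xx (not retried); returns normally otherwise.
            response.raise_for_status()
            return response
        if attempt < attempts:
            # Wait 1x, 2x, 4x ... the base delay before the next attempt.
            time.sleep(backoff_seconds * 2 ** (attempt - 1))
    # Out of attempts: surface the last 5xx response as an error.
    response.raise_for_status()
    return response
```

In the job you could call it as post_with_retry(requests.Session(), url, {"job": "test"}), where the payload shown is just a placeholder.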