Pass a time from one Glue (Python) script to SSM parameter store and start another script at that time

I have a set of scripts that call a government hierarchy API and the rate limit is 4000/hr. I have a separate script for each function:

  1. OAuth token retrieval and store to SSM
  2. Get token and call API for Level 1 records, then store total call count and first API call time
  3. Get token and run through JSON of Level 1 records, call API for their children (Level 2), then update total call count
  4. Get token and run through JSON of Level 2 records, call API for their children (Level 3), then update total call count
  5. Get token and run through JSON of Level 3 records, call API for their children (Level 4), then update total call count

This has the potential to go 6 or 7 levels. Right now my Level 4 script simply pauses for an hour once the call count gets close to 4000. This isn't a sustainable practice, though. I'd like to store my call count and start time from each script and calculate when I need to pause for an hour, then call the next script once that hour runs out.

I'm thinking of different ways to do it and one way hinges on the ability to call any Glue script to run at a time stored in the SSM parameter store. Is that possible?

asked 11 days ago, 26 views
2 Answers
Yes, it is possible to pass a time from one Glue (Python) script to SSM Parameter Store and start another script at that specified time. Here's how you can approach this:

  1. In your current Glue script, after calculating the next run time:

    • Use the AWS SDK for Python (boto3) to store the calculated time in SSM Parameter Store.
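    • Store it as a string in ISO 8601 format with an explicit UTC designator (for example 2024-01-01T13:00:00Z), which is the timestamp form a Step Functions Wait state expects.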
  2. To start the next script at the specified time:

    • Create a Step Functions state machine that orchestrates your Glue jobs.
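    • In the state machine, use a Task state to read the time from SSM Parameter Store, followed by a Wait state that pauses until that time. (A Wait state cannot query Parameter Store itself; it only reads a timestamp from the state input.)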
    • After the Wait state, trigger the next Glue job.

Here's a more detailed breakdown:

In your current Glue script, you can store the next run time like this:

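import boto3
from datetime import datetime, timedelta, timezone

ssm = boto3.client('ssm')

# Calculate next run time (e.g., one hour from now) in UTC.
# A Step Functions Wait state needs an ISO 8601 timestamp with a timezone
# designator, so format it explicitly with a trailing "Z" instead of
# relying on a naive local datetime.now().isoformat().
next_run_time = (datetime.now(timezone.utc) + timedelta(hours=1)).strftime('%Y-%m-%dT%H:%M:%SZ')

# Store in SSM Parameter Store, overwriting the value from the previous run
ssm.put_parameter(
    Name='/my/next/run/time',
    Value=next_run_time,
    Type='String',
    Overwrite=True
)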

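Then, in your Step Functions state machine, you can use a Wait state that pauses until the stored timestamp. The states below belong inside the state machine's "States" object. The TimestampPath points at Parameter.Value because the getParameter task returns an object, not a bare string, and that object is stored under $.nextRunTime:

{
  "Wait": {
    "Type": "Wait",
    "TimestampPath": "$.nextRunTime.Parameter.Value",
    "Next": "StartNextGlueJob"
  },
  "StartNextGlueJob": {
    "Type": "Task",
    "Resource": "arn:aws:states:::glue:startJobRun",
    "Parameters": {
      "JobName": "YourNextGlueJob"
    },
    "End": true
  }
}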

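Before the Wait state, you'll need a Task state that retrieves the parameter from SSM. Make it the state machine's StartAt state. Its ResultPath places the full getParameter response at $.nextRunTime, so the stored value ends up at $.nextRunTime.Parameter.Value: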

{
  "GetNextRunTime": {
    "Type": "Task",
    "Resource": "arn:aws:states:::aws-sdk:ssm:getParameter",
    "Parameters": {
      "Name": "/my/next/run/time"
    },
    "ResultPath": "$.nextRunTime",
    "Next": "Wait"
  }
}
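
Taken together, the fragments might be assembled into one definition as sketched below. This is a minimal sketch rather than a definitive implementation: build_definition is a hypothetical helper, and the state names, parameter name, and job name are the placeholders used in the snippets above. Two details are worth checking against your own setup: a complete definition needs the top-level StartAt and States fields, and because the getParameter response is an object, the Wait state here reads the timestamp from Parameter.Value under the ResultPath.

```python
import json


def build_definition(parameter_name, job_name):
    """Return an Amazon States Language definition as a Python dict.

    Hypothetical helper: the names are placeholders, not fixed AWS values.
    """
    return {
        "Comment": "Wait until the time stored in SSM, then start a Glue job",
        "StartAt": "GetNextRunTime",
        "States": {
            # Read the stored timestamp; the whole response lands at $.nextRunTime
            "GetNextRunTime": {
                "Type": "Task",
                "Resource": "arn:aws:states:::aws-sdk:ssm:getParameter",
                "Parameters": {"Name": parameter_name},
                "ResultPath": "$.nextRunTime",
                "Next": "Wait",
            },
            # Pause until the timestamp held in the parameter's Value field
            "Wait": {
                "Type": "Wait",
                "TimestampPath": "$.nextRunTime.Parameter.Value",
                "Next": "StartNextGlueJob",
            },
            # Start the next Glue job without waiting for it to finish
            "StartNextGlueJob": {
                "Type": "Task",
                "Resource": "arn:aws:states:::glue:startJobRun",
                "Parameters": {"JobName": job_name},
                "End": True,
            },
        },
    }


definition = build_definition("/my/next/run/time", "YourNextGlueJob")
print(json.dumps(definition, indent=2))
```

The printed JSON can be pasted into the Step Functions console or passed as the definition string when creating the state machine.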

This approach allows you to dynamically control when your Glue jobs run based on the time stored in SSM Parameter Store. It provides a flexible way to manage your rate-limited API calls across multiple scripts while ensuring you don't exceed the 4000/hr limit.
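
The time to store can be derived from the call count and first-call time that the scripts already track. The sketch below is one possible way to do that, under stated assumptions: the API counts calls in a fixed one-hour window that starts at the first call (a rolling window would need different logic), and next_run_time, to_timestamp, planned_calls, and SAFETY_MARGIN are hypothetical names introduced only for illustration.

```python
from datetime import datetime, timedelta, timezone

RATE_LIMIT = 4000            # calls per hour, from the question
WINDOW = timedelta(hours=1)  # assumption: fixed window starting at the first call
SAFETY_MARGIN = 50           # hypothetical buffer below the hard limit


def next_run_time(call_count, window_start, planned_calls, now=None):
    """Return the earliest UTC time at which the next script may start."""
    now = now or datetime.now(timezone.utc)
    window_end = window_start + WINDOW
    if now >= window_end:
        # The window has already expired, so the quota is assumed to be reset.
        return now
    if call_count + planned_calls <= RATE_LIMIT - SAFETY_MARGIN:
        # Enough quota remains in the current window to run immediately.
        return now
    # Otherwise wait until the current window closes.
    return window_end


def to_timestamp(moment):
    """Format an aware datetime as an ISO 8601 UTC string with a trailing Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

The string returned by to_timestamp is what would be written to the parameter. If the result equals the current time, the Wait state should pass through without a meaningful delay.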

Remember to set up appropriate IAM roles and permissions for your Glue jobs and Step Functions state machine to interact with SSM Parameter Store and start Glue jobs.
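
As a rough illustration of those permissions, the sketch below builds a minimal policy document for the state machine's execution role. It assumes the non-blocking startJobRun integration shown above; REGION, ACCOUNT_ID, and the state_machine_policy helper are placeholders introduced here, and the policy is a starting point to adapt rather than a complete or audited one.

```python
import json

# Placeholders: replace with your own region and account ID.
REGION = "us-east-1"
ACCOUNT_ID = "123456789012"


def state_machine_policy(parameter_name, job_name):
    """Return an IAM policy dict for the state machine's execution role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Lets the GetNextRunTime state read the stored timestamp.
                "Effect": "Allow",
                "Action": "ssm:GetParameter",
                "Resource": f"arn:aws:ssm:{REGION}:{ACCOUNT_ID}:parameter{parameter_name}",
            },
            {
                # Lets the StartNextGlueJob state start the Glue job.
                "Effect": "Allow",
                "Action": "glue:StartJobRun",
                "Resource": f"arn:aws:glue:{REGION}:{ACCOUNT_ID}:job/{job_name}",
            },
        ],
    }


print(json.dumps(state_machine_policy("/my/next/run/time", "YourNextGlueJob"), indent=2))
```

The role used by the Glue job itself would separately need ssm:PutParameter on the same parameter in order to write the timestamp.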

Sources
Calling AWS Glue APIs in Python - AWS Glue
Working with parameter versions in Parameter Store - AWS Systems Manager

answered 11 days ago
Hello there,

Thank you for explaining the use case. You could use the boto3 SDK to achieve this programmatically, together with a Step Functions state machine whose Wait state pauses until the time stored in the parameter before the "StartNextGlueJob" state runs.
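
One way the programmatic part could look is sketched below: after a script stores its next run time, it starts an execution of the state machine with the Step Functions start_execution call. The helper name start_wait_and_run, the payload contents, and the ARN in the usage comment are hypothetical; the client is passed in as an argument so the function can be tried without AWS credentials.

```python
import json


def start_wait_and_run(sfn_client, state_machine_arn, payload=None):
    """Start one execution of the state machine and return its execution ARN."""
    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        # The execution input must be a JSON string; an empty object is fine
        # when the state machine reads everything it needs from SSM.
        input=json.dumps(payload or {}),
    )
    return response["executionArn"]


# Usage inside a Glue job (requires boto3 and AWS credentials):
#   import boto3
#   sfn = boto3.client("stepfunctions")
#   start_wait_and_run(sfn, "arn:aws:states:REGION:ACCOUNT_ID:stateMachine:NAME")
```

The calling role would need states:StartExecution on the state machine for this to succeed.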

I hope this helps. If you need any further insights on this, please feel free to reach out to AWS via a support case.

Thank you!

References: [1] https://docs.aws.amazon.com/systems-manager/latest/userguide/sysman-paramstore-versions.html

AWS
SUPPORT ENGINEER
answered 10 days ago
