How to get the EMR Serverless job URL when EMR Studio information is missing from the event?

I'm working with AWS EMR Serverless and need to construct the URL of an EMR Serverless job run so that I can include it in a notification message sent on each state change. The desired URL includes the associated EMR Studio information, but this information (StudioId or StudioName) is provided neither in the EMR Serverless Job Run State Change event nor by API methods such as list_applications() or describe_application(). Here's an example of the event data I receive:

{
  "version": "0",
  "id": "781da594-5643-13f4-98ad-cc4b05ce6df9",
  "detail-type": "EMR Serverless Job Run State Change",
  "source": "aws.emr-serverless",
  "account": "010045674077",
  "time": "2025-01-09T17:38:04Z",
  "region": "eu-west-3",
  "resources": [
    "arn:aws:emr-serverless:eu-west-1:010015667677:/applications/00foRXVXXenk0p/jobruns/00fpXFHFDhjag0r"
  ],
  "detail": {
    "jobRunId": "00fpGFHFHHjag0r",
    "jobRunAttempt": 2,
    "jobRunName": "eigth-clone",
    "applicationId": "00foff2FHHFGk0p",
    "arn": "arn:aws:emr-serverless:eu-west-1:01001768777:/applications/00foff2genk0p/jobruns/00fpbslgcjag0r",
    "releaseLabel": "emr-7.2.0",
    "state": "SCHEDULED",
    "previousState": "PENDING",
    "createdBy": "arn:aws:sts::017077:assumed-role/role-user-administrator/toto@toto.com",
    "updatedAt": "2025-01-09T17:38:04.179223Z",
    "createdAt": "2025-01-09T17:35:11.729700Z"
  }
}
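
For reference, the fields of this event that identify the job run can be pulled out as in the minimal sketch below. The `JobRunRef` type and the `job_run_ref_from_event` helper are hypothetical names chosen for illustration; only the event keys (`region`, `detail.applicationId`, `detail.jobRunId`, `detail.state`) come from the sample above.

```python
from typing import NamedTuple


class JobRunRef(NamedTuple):
    """Hypothetical container for the identifiers carried by the event."""
    region: str
    application_id: str
    job_run_id: str
    state: str


def job_run_ref_from_event(event: dict) -> JobRunRef:
    """Extract the job run identifiers from a Job Run State Change event."""
    detail = event["detail"]
    return JobRunRef(
        region=event["region"],
        application_id=detail["applicationId"],
        job_run_id=detail["jobRunId"],
        state=detail["state"],
    )


# Trimmed copy of the sample event, keeping only the keys used here.
sample_event = {
    "region": "eu-west-3",
    "detail": {
        "jobRunId": "00fpGFHFHHjag0r",
        "applicationId": "00foff2FHHFGk0p",
        "state": "SCHEDULED",
    },
}

ref = job_run_ref_from_event(sample_event)
print(ref.application_id, ref.job_run_id, ref.state)
```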

I know the job URL format typically looks like this:

https://<studio-info-needed>/#/serverless-applications/<application-id>/<job-run-id>

https://es-7wpgfhf1k79qyq7vxke8ais6t.emrstudio-env.eu-west-3.amazonaws.com/#/serverless-applications/00fjathdfnkg0p/00fpfhfh3703u9g0r

However, neither the StudioId nor the StudioName is included in the event. I've explored AWS API methods like list_applications() and describe_application() for the application but didn’t find any information about the associated Studio. The list_studios() API provides studio details, but it doesn't directly correlate with the application or the jobs.

list_applications: Fetches details of the application but doesn’t provide any Studio-related information.

list_studios: Fetches all studios in the account, including VpcId and StudioId, but there's no clear link to the application.

I need a robust way to correlate an EMR Serverless application to its associated Studio (if any) to construct the job URL dynamically.

1 Answer

To construct the EMR Serverless job URL with the missing EMR Studio information, you can use the GetDashboardForJobRun API operation. This method generates a URL that provides direct access to the job run dashboard, including all necessary Studio information if applicable. Here's how to do it:

  1. Call the GetDashboardForJobRun API using the application ID and job run ID from your event data:

    aws emr-serverless get-dashboard-for-job-run --application-id <application-id> --job-run-id <job-id>

  2. This operation will return a URL that you can use in your notification message. The URL includes all required information to access the job's UI, including Studio details if associated.

  3. If you need the Studio ID for other purposes, you can extract it from the returned URL. It's typically the subdomain before .emrstudio-prod in the URL.
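
The steps above could be sketched in Python with boto3 roughly as follows. This is a sketch under assumptions, not a definitive implementation: `get_dashboard_url` and `first_host_label` are hypothetical helper names, the optional `client` argument exists only so the function can be exercised without AWS credentials, and the URL shape assumed in step 3 is taken from the description above rather than verified.

```python
from urllib.parse import urlsplit


def get_dashboard_url(event: dict, client=None) -> str:
    """Return a dashboard URL for the job run described by the event."""
    if client is None:
        # Imported lazily so the helper can be tested with a stub client.
        import boto3
        client = boto3.client("emr-serverless", region_name=event["region"])
    detail = event["detail"]
    response = client.get_dashboard_for_job_run(
        applicationId=detail["applicationId"],
        jobRunId=detail["jobRunId"],
    )
    return response["url"]


def first_host_label(url: str) -> str:
    """Return the first label of the URL's hostname (step 3).

    Assumption: the identifier of interest is the leading subdomain,
    as the answer describes. Returns "" if the URL has no hostname.
    """
    host = urlsplit(url).hostname or ""
    return host.split(".")[0]
```

In a notification handler, the event received from EventBridge would be passed straight to `get_dashboard_url(event)` and the result placed in the message body.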

This approach is robust as it doesn't require maintaining a mapping between applications and Studios, works even if an application isn't associated with a Studio, and ensures the URL is correct and up-to-date.

Note that the generated URL is valid for one hour. For longer validity, you may need to implement a system to regenerate the URL periodically or on-demand.
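
One way to regenerate the URL on demand is to put a link to a small redirect endpoint in the notification instead of the dashboard URL itself, and to call GetDashboardForJobRun only when the link is opened. The sketch below assumes a Lambda function exposed over HTTP (for example through a function URL or an API Gateway proxy integration) that receives `applicationId` and `jobRunId` as query string parameters; the handler name and parameter names are hypothetical, and such an endpoint would need its own access control.

```python
def redirect_handler(event, context=None, client=None):
    """Generate a fresh dashboard URL at click time and redirect to it."""
    params = event.get("queryStringParameters") or {}
    application_id = params.get("applicationId")
    job_run_id = params.get("jobRunId")
    if not application_id or not job_run_id:
        return {
            "statusCode": 400,
            "body": "applicationId and jobRunId are required",
        }
    if client is None:
        # Imported lazily so the handler can be tested with a stub client.
        import boto3
        client = boto3.client("emr-serverless")
    response = client.get_dashboard_for_job_run(
        applicationId=application_id,
        jobRunId=job_run_id,
    )
    # An HTTP 302 sends the browser on to the newly generated URL.
    return {
        "statusCode": 302,
        "headers": {"Location": response["url"]},
        "body": "",
    }
```

Because the URL is created at the moment the link is followed, the one-hour validity window starts at click time rather than at notification time.
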
Sources
GetDashboardForJobRun - Amazon EMR Serverless
Monitoring EMR Serverless applications and jobs - Amazon EMR

answered a month ago
EXPERT reviewed 25 days ago
  • Thanks for your answer. I added a function to return the dashboard URL:

        import boto3

        def get_emr_dashboard_url(application_id, job_run_id, region):
            # Request a dashboard URL for the given job run.
            client = boto3.client('emr-serverless', region_name=region)
            response = client.get_dashboard_for_job_run(
                applicationId=application_id,
                jobRunId=job_run_id
            )
            return response.get('url')

    However, when I receive the notification message and click the URL, I get this error: "This URL has already been used the maximum number of times. Please generate a new URL." This happens even though I opened it only once and the job is still running.
