Hi jsullivan43,
Let's dig into your issue:
Clarifying the Issue
Your EMR cluster stays alive when launched from SageMaker using a CloudFormation template. Although the KeepJobFlowAliveWhenNoSteps parameter defaults to false (terminating the cluster when no steps are running), the cluster persists beyond the IdleTimeout of 1 hour (3600 seconds) set in SageMaker.
This behavior may stem from how SageMaker’s Service Catalog integration processes EMR configurations. Although KeepJobFlowAliveWhenNoSteps defaults to false, the effective value can differ from what you expect when the setting is not specified explicitly in the template.
Key Areas to Investigate
- IdleTimeout Setting in SageMaker
  Ensure that the IdleTimeout parameter in your SageMaker configuration matches your expectations. If no idle timeout is set explicitly in the EMR cluster configuration itself, the cluster may not behave the same way it does when launched from CloudFormation directly.
- CloudFormation Template Review
  Revisit your template stored in Service Catalog. Verify that KeepJobFlowAliveWhenNoSteps is explicitly set to false. In a CloudFormation template this setting lives under the Instances property (whose type is JobFlowInstancesConfig). Here’s the relevant configuration snippet for clarity:

      "Instances": {
        "KeepJobFlowAliveWhenNoSteps": false
      }

  Even if the default is false, inconsistencies during the template launch may lead to unexpected behavior.
- Shimomura Template Reference
  If you’re using the following sample template shared by Tomonori Shimomura, consider setting these values explicitly:

      {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
          "EMRCluster": {
            "Type": "AWS::EMR::Cluster",
            "Properties": {
              "Name": "EMRClusterFromServiceCatalog",
              "ReleaseLabel": "emr-6.3.0",
              "Applications": [{"Name": "Hadoop"}],
              "Instances": {
                "MasterInstanceGroup": {
                  "InstanceType": "m5.xlarge",
                  "InstanceCount": 1
                },
                "CoreInstanceGroup": {
                  "InstanceType": "m5.xlarge",
                  "InstanceCount": 2
                },
                "TerminationProtected": false,
                "KeepJobFlowAliveWhenNoSteps": false
              },
              "JobFlowRole": "EMR_EC2_DefaultRole",
              "ServiceRole": "EMR_DefaultRole",
              "VisibleToAllUsers": true,
              "AutoTerminationPolicy": {
                "IdleTimeout": 3600
              }
            }
          }
        }
      }

  This template explicitly sets KeepJobFlowAliveWhenNoSteps to false and includes an IdleTimeout of 3600 seconds (1 hour).
- SageMaker-Specific Overrides
  Service Catalog or SageMaker may override certain EMR default settings. As a workaround:
  - Explicitly set KeepJobFlowAliveWhenNoSteps in the template.
  - Test the same configuration outside SageMaker to confirm the behavior.
- Step Execution Validation
  Confirm that no steps are still queued or pending, since these can inadvertently keep the cluster alive.
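The template checks above can be sketched as a small script. This is only an illustration, assuming the Service Catalog template is available as JSON (not YAML) and that values are literals rather than parameters; the function name check_emr_template and the sample template are hypothetical and not part of any AWS tooling.

```python
import json
from typing import List


def check_emr_template(template: dict) -> List[str]:
    """Report EMR cluster resources that rely on implicit termination settings."""
    findings = []
    for name, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::EMR::Cluster":
            continue
        props = resource.get("Properties", {})

        # KeepJobFlowAliveWhenNoSteps lives under the Instances property.
        keep_alive = props.get("Instances", {}).get("KeepJobFlowAliveWhenNoSteps")
        if keep_alive is None:
            findings.append(f"{name}: KeepJobFlowAliveWhenNoSteps is not set explicitly")
        elif isinstance(keep_alive, dict):
            # A dict here means an intrinsic function such as Ref; the real
            # value is only known at launch time.
            findings.append(f"{name}: KeepJobFlowAliveWhenNoSteps is resolved at launch, check the parameter value")
        elif keep_alive is not False:
            findings.append(f"{name}: KeepJobFlowAliveWhenNoSteps is not the boolean false")

        # The idle timeout is part of AutoTerminationPolicy.
        idle_timeout = props.get("AutoTerminationPolicy", {}).get("IdleTimeout")
        if idle_timeout is None:
            findings.append(f"{name}: AutoTerminationPolicy.IdleTimeout is not set")
    return findings


# Hypothetical template fragment that omits KeepJobFlowAliveWhenNoSteps.
SAMPLE_TEMPLATE = """
{
  "Resources": {
    "EMRCluster": {
      "Type": "AWS::EMR::Cluster",
      "Properties": {
        "Instances": {"TerminationProtected": false},
        "AutoTerminationPolicy": {"IdleTimeout": 3600}
      }
    }
  }
}
"""

findings = check_emr_template(json.loads(SAMPLE_TEMPLATE))
for finding in findings:
    print(finding)
```

An empty result only means both settings are present as literals in the template; it does not prove that SageMaker or Service Catalog passes them through unchanged.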
Next Steps
By explicitly setting KeepJobFlowAliveWhenNoSteps and testing the behavior both within and outside SageMaker, you can determine whether SageMaker's Service Catalog introduces overrides or inconsistencies.
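One way to run that comparison is to collect the effective settings from each cluster and diff them, and to rule out lingering steps at the same time. The sketch below works on plain dictionaries so it runs without AWS access. It assumes the data was gathered with the EMR API (for example via boto3's list_steps, describe_cluster, and get_auto_termination_policy); the helper names and the sample values are hypothetical.

```python
# Step states that mean work is still queued or in progress.
ACTIVE_STEP_STATES = {"PENDING", "RUNNING", "CANCEL_PENDING"}


def active_steps(list_steps_response: dict) -> list:
    """Return the steps from a ListSteps-shaped response that are still active."""
    return [
        step
        for step in list_steps_response.get("Steps", [])
        if step.get("Status", {}).get("State") in ACTIVE_STEP_STATES
    ]


def diff_settings(sagemaker_launch: dict, direct_launch: dict) -> dict:
    """Return {setting: (sagemaker_value, direct_value)} for settings that differ."""
    keys = sorted(set(sagemaker_launch) | set(direct_launch))
    return {
        key: (sagemaker_launch.get(key), direct_launch.get(key))
        for key in keys
        if sagemaker_launch.get(key) != direct_launch.get(key)
    }


# Hypothetical data. With boto3 it could be gathered roughly like this:
#   emr = boto3.client("emr")
#   steps = emr.list_steps(ClusterId=cluster_id)  # paginated for long step lists
#   auto_terminate = emr.describe_cluster(ClusterId=cluster_id)["Cluster"]["AutoTerminate"]
#   policy = emr.get_auto_termination_policy(ClusterId=cluster_id)
sample_steps = {
    "Steps": [
        {"Id": "s-EXAMPLE1", "Name": "setup", "Status": {"State": "COMPLETED"}},
        {"Id": "s-EXAMPLE2", "Name": "job", "Status": {"State": "PENDING"}},
    ]
}
from_sagemaker = {"AutoTerminate": False, "IdleTimeout": None}
from_cloudformation = {"AutoTerminate": True, "IdleTimeout": 3600}

lingering = active_steps(sample_steps)
differences = diff_settings(from_sagemaker, from_cloudformation)
print([step["Id"] for step in lingering])
print(differences)
```

If the diff is empty and no steps are active, the two launch paths agree on these settings and the cause is more likely elsewhere.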
Credit goes to Tomonori Shimomura for sharing the relevant template reference! Let me know if you need further clarification or additional steps. 😊
Cheers! Aaron 🚀

@jsullivan43, could you please tell us exactly which Service Catalog template you used? Is it this template? https://github.com/aws-samples/sagemaker-studio-emr/tree/main/cloudformation/emr_servicecatalog_templates