Trouble configuring AWS DataPipeline to use Spot Instances instead of On-Demand Instances


Hi Team,

I have set up an AWS DataPipeline to run my EMR jobs on On-Demand instances. I now want to switch to Spot Instances to reduce costs. I set the spotBidPrice parameter in my pipeline settings, expecting the cluster to run on Spot Instances, but the pipeline still appears to use On-Demand instances.

Could you please help me understand how I can properly configure my DataPipeline to utilize Spot Instances? Here are my current pipeline settings:

coreInstanceCount: 40
coreInstanceType: r5.4xlarge
keyPair: #{myKeyPair}
masterInstanceType: r5.4xlarge
maximumRetries: 2
region: #{myRegion}
releaseLabel: emr-5.31.0
resourceRole: #{myResourceRole}
role: #{myRole}
subnetId: #{mySubnetId}
taskInstanceType: r5.4xlarge
terminateAfter: 240 Minutes
spotBidPrice: 10.00
useOnDemandOnLastAttempt: true

I appreciate any guidance or suggestions you can provide to help me successfully configure my AWS DataPipeline to use Spot Instances. Thank you!

Asked a year ago, 320 views
2 Answers

Hi there!

Can you try using taskInstanceBidPrice instead of spotBidPrice?
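
A minimal sketch of that substitution, assuming the cluster fields are held in a plain Python dict. The helper name use_group_bid_prices is hypothetical and not part of any AWS SDK, and setting coreInstanceBidPrice as well is a choice of this sketch rather than part of the suggestion above.

```python
import json


def use_group_bid_prices(cluster, groups=("core", "task")):
    """Return a copy of the field dict with spotBidPrice replaced by
    per-group <group>InstanceBidPrice fields (hypothetical helper)."""
    updated = dict(cluster)  # leave the caller's dict untouched
    bid = updated.pop("spotBidPrice", None)
    if bid is not None:
        for group in groups:
            # Do not overwrite a bid price that was already set explicitly.
            updated.setdefault(f"{group}InstanceBidPrice", bid)
    return updated


# A subset of the settings posted in the question.
original = {
    "coreInstanceCount": "40",
    "coreInstanceType": "r5.4xlarge",
    "taskInstanceType": "r5.4xlarge",
    "spotBidPrice": "10.00",
}

updated = use_group_bid_prices(original)
print(json.dumps(updated, indent=2))
```

The function returns a new dict, so the original settings stay available for comparison.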

I hope this helps.

AWS Expert, answered a year ago
  • I set it as follows (using coreInstanceBidPrice and taskInstanceBidPrice), but it is still not working; the cluster still runs On-Demand:

      "coreInstanceCount": "40",
      "coreInstanceType": "r5.4xlarge",
      "coreInstanceBidPrice": "10.00",
      "keyPair": "#{myKeyPair}",
      "masterInstanceType": "r5.4xlarge",
      "maximumRetries": "2",
      "region": "#{myRegion}",
      "releaseLabel": "emr-5.31.0",
      "resourceRole": "#{myResourceRole}",
      "role": "#{myRole}",
      "subnetId": "#{mySubnetId}",
      "taskInstanceType": "r5.4xlarge",
      "taskInstanceBidPrice": "10.00",
      "terminateAfter": "240 Minutes",
      "useOnDemandOnLastAttempt": "true"
    },
    

Hi,

I can see from your current pipeline settings that "useOnDemandOnLastAttempt" is set to "true", which is also the default value for this parameter. To keep the EMR cluster from using On-Demand instances when Spot Instances are not available, set this parameter to "false". Also, the maximum number of attempts for the EMR cluster resource defaults to "1". You currently have "maximumRetries: 2", and you can increase it so that the pipeline has more attempts to obtain Spot Instances.

Scenarios where Spot Instances fail to launch:

  1. The Spot bid price is lower than the minimum price required to fulfill the Spot request.
  2. A Spot Instance limit is exceeded: "EXCEEDED_SPOT_INSTANCE_COUNT_LIMIT (USER_ERROR)".
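
The points in this answer can be turned into a quick pre-flight check on the settings. The sketch below is an illustration only: spot_warnings and its current_spot_price argument are hypothetical names, the price of 0.50 is a made-up value rather than a real quote, and a missing useOnDemandOnLastAttempt is treated as true, following the default stated above.

```python
def spot_warnings(cluster, current_spot_price=None):
    """Return human-readable warnings for a dict of cluster fields
    (hypothetical helper, not part of any AWS SDK)."""
    warnings = []

    # Per the answer above, the parameter defaults to true when omitted.
    fallback = str(cluster.get("useOnDemandOnLastAttempt", "true")).lower()
    if fallback == "true":
        warnings.append(
            "useOnDemandOnLastAttempt is true: the last attempt may use On-Demand"
        )

    # Scenario 1: a bid below the price needed to fulfill the Spot request.
    if current_spot_price is not None:
        for field in ("coreInstanceBidPrice", "taskInstanceBidPrice"):
            bid = cluster.get(field)
            if bid is not None and float(bid) < current_spot_price:
                warnings.append(
                    f"{field} {bid} is below the assumed Spot price {current_spot_price}"
                )
    return warnings


# Relevant fields from the settings posted in the follow-up comment.
thread_settings = {
    "coreInstanceBidPrice": "10.00",
    "taskInstanceBidPrice": "10.00",
    "maximumRetries": "2",
    "useOnDemandOnLastAttempt": "true",
}

for warning in spot_warnings(thread_settings, current_spot_price=0.50):
    print(warning)
```

The second scenario (the Spot Instance count limit) depends on account state, so this sketch does not try to check it.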
AWS Support Engineer, answered a year ago
