Trouble configuring AWS DataPipeline to use Spot Instances instead of On-Demand Instances


Hi Team,

I have set up an AWS Data Pipeline to run my EMR jobs on On-Demand Instances. I now want to switch to Spot Instances to reduce costs, so I set the spotBidPrice parameter in my pipeline settings and expected the cluster to run on Spot Instances. The pipeline still appears to use On-Demand Instances.

Could you please help me understand how I can properly configure my DataPipeline to utilize Spot Instances? Here are my current pipeline settings:

coreInstanceCount: 40
coreInstanceType: r5.4xlarge
keyPair: #{myKeyPair}
masterInstanceType: r5.4xlarge
maximumRetries: 2
region: #{myRegion}
releaseLabel: emr-5.31.0
resourceRole: #{myResourceRole}
role: #{myRole}
subnetId: #{mySubnetId}
taskInstanceType: r5.4xlarge
terminateAfter: 240 Minutes
spotBidPrice: 10.00
useOnDemandOnLastAttempt: true

I appreciate any guidance or suggestions you can provide to help me successfully configure my AWS DataPipeline to use Spot Instances. Thank you!

Asked 1 year ago, 320 views
2 Answers

Hi there!

Can you try using taskInstanceBidPrice instead of spotBidPrice?

I hope this helps.

AWS
Expert
Answered 1 year ago
  • I set it as follows (using coreInstanceBidPrice and taskInstanceBidPrice), but it still does not work; the cluster still runs on On-Demand Instances:

      "coreInstanceCount": "40",
      "coreInstanceType": "r5.4xlarge",
      "coreInstanceBidPrice": "10.00",
      "keyPair": "#{myKeyPair}",
      "masterInstanceType": "r5.4xlarge",
      "maximumRetries": "2",
      "region": "#{myRegion}",
      "releaseLabel": "emr-5.31.0",
      "resourceRole": "#{myResourceRole}",
      "role": "#{myRole}",
      "subnetId": "#{mySubnetId}",
      "taskInstanceType": "r5.4xlarge",
      "taskInstanceBidPrice": "10.00",
      "terminateAfter": "240 Minutes",
      "useOnDemandOnLastAttempt": "true"
    },
    

Hi,

Your current pipeline settings have useOnDemandOnLastAttempt set to "true", which is also the default for this parameter. With that setting, the last attempt requests On-Demand Instances when Spot Instances are not available. To keep On-Demand Instances from being used for the EMR cluster, set this parameter to "false". The maximum number of attempts for the EMR cluster resource defaults to 1, and you can raise it with maximumRetries. You currently have maximumRetries set to 2; increasing it gives the pipeline more attempts to obtain Spot Instances.
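As a rough illustration of the change suggested above, here is a minimal Python sketch that holds the EmrCluster fields from this thread in a dict, with only useOnDemandOnLastAttempt switched to "false", and checks the two things discussed here. The helper names `on_demand_fallback_enabled` and `missing_bid_prices` are hypothetical and are not part of any AWS SDK. The fields that reference pipeline parameters (keyPair, region, roles, subnetId) are left out for brevity, and the assumed default of "true" comes from this answer rather than from a verified reference.

```python
import json

# Fields copied from the EmrCluster settings posted in this thread.
# Only useOnDemandOnLastAttempt is changed, as suggested in the answer.
emr_cluster = {
    "coreInstanceCount": "40",
    "coreInstanceType": "r5.4xlarge",
    "coreInstanceBidPrice": "10.00",
    "masterInstanceType": "r5.4xlarge",
    "maximumRetries": "2",
    "releaseLabel": "emr-5.31.0",
    "taskInstanceType": "r5.4xlarge",
    "taskInstanceBidPrice": "10.00",
    "terminateAfter": "240 Minutes",
    "useOnDemandOnLastAttempt": "false",
}


def on_demand_fallback_enabled(cluster: dict) -> bool:
    """Return True if the last attempt may fall back to On-Demand.

    Assumption (from the answer in this thread): the field behaves
    as "true" when it is absent.
    """
    value = str(cluster.get("useOnDemandOnLastAttempt", "true"))
    return value.strip().lower() == "true"


def missing_bid_prices(cluster: dict) -> list:
    """List the bid-price fields mentioned in this thread that are not set."""
    wanted = ("coreInstanceBidPrice", "taskInstanceBidPrice")
    return [name for name in wanted if name not in cluster]


if __name__ == "__main__":
    print(json.dumps(emr_cluster, indent=2))
    print("fallback enabled:", on_demand_fallback_enabled(emr_cluster))
    print("missing bid prices:", missing_bid_prices(emr_cluster))
```

A check like this only inspects the definition before it is uploaded; it does not confirm what the cluster actually launches, which still has to be verified in the EMR console.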

Scenarios where Spot Instances fail to launch:

  1. The Spot bid price is lower than the minimum price required to fulfill the Spot request.
  2. A limit is exceeded: "EXCEEDED_SPOT_INSTANCE_COUNT_LIMIT (USER_ERROR)".
AWS
Support Engineer
Answered 1 year ago
