Trouble configuring AWS DataPipeline to use Spot Instances instead of On-Demand Instances


Hi Team,

I have set up an AWS DataPipeline that runs my EMR jobs on On-Demand instances, and I now want to switch to Spot Instances to reduce costs. I configured the spotBidPrice parameter in my pipeline settings, expecting the cluster to run on Spot Instances, but the pipeline still appears to use On-Demand instances.

Could you please help me understand how to configure my DataPipeline so that it uses Spot Instances? Here are my current pipeline settings:

coreInstanceCount: 40
coreInstanceType: r5.4xlarge
keyPair: #{myKeyPair}
masterInstanceType: r5.4xlarge
maximumRetries: 2
region: #{myRegion}
releaseLabel: emr-5.31.0
resourceRole: #{myResourceRole}
role: #{myRole}
subnetId: #{mySubnetId}
taskInstanceType: r5.4xlarge
terminateAfter: 240 Minutes
spotBidPrice: 10.00
useOnDemandOnLastAttempt: true

I would appreciate any guidance or suggestions. Thank you!

asked a year ago, 320 views
2 Answers

Hi there!

Can you try using taskInstanceBidPrice instead of spotBidPrice?

I hope this helps.
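
To make the suggestion concrete, here is a minimal Python sketch of the rename applied to a settings dictionary. The helper name use_emr_bid_prices is hypothetical, and copying the same price to both coreInstanceBidPrice and taskInstanceBidPrice is an assumption for illustration; it only rewrites a local dict and does not call any AWS API.

```python
from typing import Dict


def use_emr_bid_prices(settings: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the settings with spotBidPrice replaced by
    per-instance-group bid price fields (hypothetical helper)."""
    updated = dict(settings)  # do not mutate the caller's dict
    bid = updated.pop("spotBidPrice", None)
    if bid is not None:
        # Assumption: reuse the same bid for core and task groups,
        # unless the caller has already set a value for them.
        updated.setdefault("coreInstanceBidPrice", bid)
        updated.setdefault("taskInstanceBidPrice", bid)
    return updated


if __name__ == "__main__":
    original = {
        "coreInstanceType": "r5.4xlarge",
        "taskInstanceType": "r5.4xlarge",
        "spotBidPrice": "10.00",
    }
    print(use_emr_bid_prices(original))
```

The resulting keys would then go into the EmrCluster object of the pipeline definition in place of spotBidPrice.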

AWS
EXPERT
answered a year ago
  • I set it as follows (using coreInstanceBidPrice and taskInstanceBidPrice), but it still does not work; the cluster still runs on On-Demand instances:

      "coreInstanceCount": "40",
      "coreInstanceType": "r5.4xlarge",
      "coreInstanceBidPrice": "10.00",
      "keyPair": "#{myKeyPair}",
      "masterInstanceType": "r5.4xlarge",
      "maximumRetries": "2",
      "region": "#{myRegion}",
      "releaseLabel": "emr-5.31.0",
      "resourceRole": "#{myResourceRole}",
      "role": "#{myRole}",
      "subnetId": "#{mySubnetId}",
      "taskInstanceType": "r5.4xlarge",
      "taskInstanceBidPrice": "10.00",
      "terminateAfter": "240 Minutes",
      "useOnDemandOnLastAttempt": "true"
    },
    

Hi,

I can see from the current pipeline settings that "useOnDemandOnLastAttempt" is set to "true", which is also its default value. With this setting, the last attempt to launch the EMR cluster requests On-Demand instances when Spot Instances are not available. To avoid that, set this parameter to "false".

The maximum number of attempts for the EMR cluster resource defaults to "1", and you can raise it through "maximumRetries". You currently have "maximumRetries: 2"; increasing it gives the pipeline more attempts to obtain Spot Instances.

Scenarios in which Spot Instances fail to launch:

  1. The Spot bid price is lower than the minimum price required to fulfill the Spot request.
  2. A Spot Instance limit is exceeded: "EXCEEDED_SPOT_INSTANCE_COUNT_LIMIT (USER_ERROR)".
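
As a rough illustration of the checks described above, the following Python sketch scans a settings dictionary for the conditions mentioned in this thread. The function name on_demand_fallback_warnings is hypothetical, and treating a missing useOnDemandOnLastAttempt as "true" is an assumption taken from this answer; the sketch does not contact AWS and cannot detect bid price or limit problems.

```python
from typing import Dict, List


def on_demand_fallback_warnings(settings: Dict[str, str]) -> List[str]:
    """List reasons, taken from this thread, why an EMR cluster resource
    might still run on On-Demand instances (hypothetical helper)."""
    warnings: List[str] = []

    if "spotBidPrice" in settings:
        warnings.append(
            "spotBidPrice is set; the thread suggests coreInstanceBidPrice "
            "and taskInstanceBidPrice instead"
        )

    bid_keys = ("coreInstanceBidPrice", "taskInstanceBidPrice")
    if not any(key in settings for key in bid_keys):
        warnings.append("no coreInstanceBidPrice or taskInstanceBidPrice is set")

    # Assumption from the answer: the parameter defaults to true when omitted.
    last_attempt = str(settings.get("useOnDemandOnLastAttempt", "true"))
    if last_attempt.strip().lower() != "false":
        warnings.append(
            "useOnDemandOnLastAttempt is not false, so the last attempt "
            "may request On-Demand instances"
        )
    return warnings


if __name__ == "__main__":
    current = {
        "coreInstanceBidPrice": "10.00",
        "taskInstanceBidPrice": "10.00",
        "maximumRetries": "2",
        "useOnDemandOnLastAttempt": "true",
    }
    for message in on_demand_fallback_warnings(current):
        print(message)
```

Run against the settings posted in the comment above, this reports only the useOnDemandOnLastAttempt condition, which matches the change recommended here.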
AWS
SUPPORT ENGINEER
answered a year ago