Trouble configuring AWS DataPipeline to use Spot Instances instead of On-Demand Instances


Hi Team,

I have set up an AWS DataPipeline to run my EMR jobs on On-Demand instances, and I now want to switch to Spot Instances to reduce costs. I configured the spotBidPrice parameter in my pipeline settings, expecting the cluster to run on Spot Instances, but the pipeline still appears to be using On-Demand instances.

Could you please help me understand how I can properly configure my DataPipeline to utilize Spot Instances? Here are my current pipeline settings:

coreInstanceCount: 40
coreInstanceType: r5.4xlarge
keyPair: #{myKeyPair}
masterInstanceType: r5.4xlarge
maximumRetries: 2
region: #{myRegion}
releaseLabel: emr-5.31.0
resourceRole: #{myResourceRole}
role: #{myRole}
subnetId: #{mySubnetId}
taskInstanceType: r5.4xlarge
terminateAfter: 240 Minutes
spotBidPrice: 10.00
useOnDemandOnLastAttempt: true

I appreciate any guidance or suggestions you can provide to help me successfully configure my AWS DataPipeline to use Spot Instances. Thank you!

asked a year ago · 320 views
2 Answers

Hi there!

Can you try using taskInstanceBidPrice instead of spotBidPrice?

I hope this helps.

AWS
EXPERT
answered a year ago
  • I set it as follows (using coreInstanceBidPrice and taskInstanceBidPrice), but it still does not work; the cluster is still running On-Demand:

      "coreInstanceCount": "40",
      "coreInstanceType": "r5.4xlarge",
      "coreInstanceBidPrice": "10.00",
      "keyPair": "#{myKeyPair}",
      "masterInstanceType": "r5.4xlarge",
      "maximumRetries": "2",
      "region": "#{myRegion}",
      "releaseLabel": "emr-5.31.0",
      "resourceRole": "#{myResourceRole}",
      "role": "#{myRole}",
      "subnetId": "#{mySubnetId}",
      "taskInstanceType": "r5.4xlarge",
      "taskInstanceBidPrice": "10.00",
      "terminateAfter": "240 Minutes",
      "useOnDemandOnLastAttempt": "true"
    },
    

Hi,

I can see from your current pipeline settings that useOnDemandOnLastAttempt is set to true, which is also its default value. To keep the EMR cluster from falling back to On-Demand Instances when Spot Instances are not available, set this parameter to false. Also, the maximum number of attempts for an EMR cluster resource defaults to 1, and you can raise maximumRetries above 1. You currently have maximumRetries: 2; you can increase it so that later attempts get another chance to obtain Spot Instances.

Scenarios where Spot Instances fail to launch:

  1. The Spot bid price is lower than the minimum price required to fulfill the Spot request.
  2. A Spot Instance limit is exceeded: "EXCEEDED_SPOT_INSTANCE_COUNT_LIMIT (USER_ERROR)".
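
As a rough sketch of the change suggested above, assuming the resource is defined as an EmrCluster object in the pipeline definition: the "id" value below is a made-up placeholder, and every other field is copied from the settings posted in the comment on the first answer, with only useOnDemandOnLastAttempt changed to false.

```json
{
  "id": "EmrClusterSpotSketch",
  "type": "EmrCluster",
  "coreInstanceCount": "40",
  "coreInstanceType": "r5.4xlarge",
  "coreInstanceBidPrice": "10.00",
  "keyPair": "#{myKeyPair}",
  "masterInstanceType": "r5.4xlarge",
  "maximumRetries": "2",
  "region": "#{myRegion}",
  "releaseLabel": "emr-5.31.0",
  "resourceRole": "#{myResourceRole}",
  "role": "#{myRole}",
  "subnetId": "#{mySubnetId}",
  "taskInstanceType": "r5.4xlarge",
  "taskInstanceBidPrice": "10.00",
  "terminateAfter": "240 Minutes",
  "useOnDemandOnLastAttempt": "false"
}
```

With this setting, an attempt that cannot obtain Spot capacity should fail rather than fall back to On-Demand, so it may be worth raising maximumRetries if Spot capacity is tight.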
AWS
SUPPORT ENGINEER
answered a year ago
