How do I determine whether to use a bootstrap action or a step on an Amazon EMR cluster?
What are the use cases for running a bootstrap action or running a step on an Amazon EMR cluster?
Use bootstrap actions to install additional software on an EMR cluster. Use steps to submit work to an EMR cluster, or to process data.
Bootstrap actions run after an EMR cluster transitions from the STARTING state to the BOOTSTRAPPING state. Bootstrap actions run before core services, such as Hadoop or Spark, are installed. If a bootstrap action fails, the cluster doesn't start. For more information, see Understanding the cluster lifecycle.
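Because bootstrap actions run while the cluster is starting, they are declared when the cluster is created. As a minimal sketch, the following Python helper builds one entry in the shape that the EMR RunJobFlow API accepts in its BootstrapActions list. The helper name, the S3 path, and the script arguments are hypothetical and are used only for illustration.

```python
from typing import Any, Dict, List, Optional


def build_bootstrap_action(
    name: str, script_path: str, args: Optional[List[str]] = None
) -> Dict[str, Any]:
    """Return one entry for the BootstrapActions list of a RunJobFlow request."""
    return {
        "Name": name,
        "ScriptBootstrapAction": {
            "Path": script_path,       # S3 location of the script to run on each node
            "Args": list(args or []),  # arguments passed to the script
        },
    }


# Hypothetical bucket, script name, and arguments.
action = build_bootstrap_action(
    "Install extra packages",
    "s3://amzn-s3-demo-bucket/bootstrap/install-packages.sh",
    ["--packages", "htop"],
)
```

With boto3, a list of such entries can be passed as the BootstrapActions parameter of the EMR client's run_job_flow call, together with the other cluster settings.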
Bootstrap actions run on all cluster nodes. They are scripts that run as the Hadoop user by default, but they can run commands with root privileges by using sudo. You can configure bootstrap actions to run commands conditionally, based on instance-specific values in the instance.json or job-flow.json file.
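The conditional behavior described above can be sketched as a small Python bootstrap script. It assumes that instance.json is located at /mnt/var/lib/info/instance.json and contains a Boolean isMaster field. The package names and function names are hypothetical, and the script assumes that a Python 3 interpreter is available on the node.

```python
import json
import subprocess
from pathlib import Path
from typing import List

# Assumed location of the instance metadata file on an EMR node.
INSTANCE_INFO = Path("/mnt/var/lib/info/instance.json")


def is_master_node(info_path: Path = INSTANCE_INFO) -> bool:
    """Return True when instance.json marks this node as the primary (master) node."""
    with info_path.open() as f:
        info = json.load(f)
    return bool(info.get("isMaster", False))


def commands_for_node(is_master: bool) -> List[List[str]]:
    """Choose the install commands for this node (package names are hypothetical)."""
    commands = [["sudo", "yum", "install", "-y", "htop"]]  # runs on every node
    if is_master:
        commands.append(["sudo", "yum", "install", "-y", "tmux"])  # primary node only
    return commands


def run_bootstrap(info_path: Path = INSTANCE_INFO) -> None:
    """Run the selected commands; a failing command raises and ends the script with an error."""
    for command in commands_for_node(is_master_node(info_path)):
        subprocess.run(command, check=True)
```

Calling run_bootstrap() at the end of the script applies the commands on the node. Because check=True raises an exception when a command fails, the script exits with a non-zero status, which causes the bootstrap action to fail.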
Note: On Amazon EMR 2.x and 3.x releases, bootstrap actions run after core services are installed. Most predefined bootstrap actions for Amazon EMR AMI versions 2.x and 3.x aren't supported in later Amazon EMR releases. For more information, see Create bootstrap actions to install additional software.
A step is a unit of work that contains one or more Hadoop jobs. Steps are usually used to transfer or process data. One step might submit work to a cluster. Other steps might process the submitted data and then send the processed data to a particular location.
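As an illustration of this pattern, the following Python sketch builds two step definitions in the shape that the EMR AddJobFlowSteps API accepts: one step copies input data to the cluster, and a second step processes it and writes the results to Amazon S3. It assumes that the commands run through command-runner.jar. The bucket names, paths, Spark script, and helper name are hypothetical.

```python
from typing import Any, Dict, List

VALID_ACTIONS = {"TERMINATE_JOB_FLOW", "TERMINATE_CLUSTER", "CANCEL_AND_WAIT", "CONTINUE"}


def build_command_step(
    name: str, command: List[str], action_on_failure: str = "CONTINUE"
) -> Dict[str, Any]:
    """Return one step definition that runs a command through command-runner.jar."""
    if action_on_failure not in VALID_ACTIONS:
        raise ValueError(f"Unsupported ActionOnFailure: {action_on_failure}")
    return {
        "Name": name,
        "ActionOnFailure": action_on_failure,
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": list(command),
        },
    }


# Hypothetical buckets, paths, and script name.
steps = [
    build_command_step(
        "Copy input data to HDFS",
        ["s3-dist-cp", "--src", "s3://amzn-s3-demo-bucket/input/", "--dest", "hdfs:///input/"],
        action_on_failure="CANCEL_AND_WAIT",
    ),
    build_command_step(
        "Process data and write results to S3",
        [
            "spark-submit",
            "s3://amzn-s3-demo-bucket/scripts/process.py",
            "hdfs:///input/",
            "s3://amzn-s3-demo-bucket/output/",
        ],
    ),
]
```

With boto3, a list such as this can be submitted to a running cluster through the EMR client's add_job_flow_steps call (JobFlowId and Steps parameters), or passed as the Steps parameter of run_job_flow when the cluster is created.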