I want to troubleshoot errors that I receive when I use a Jupyter notebook to run an Apache Livy application on Amazon EMR.
Resolution
You might receive one of the following errors when you use a Jupyter notebook to run an Apache Livy application on Amazon EMR:
- '404' from ######## with error payload: "session '0' not found"
- "The code failed because of a fatal error: Error sending http request and maximum retry encountered."
You receive the preceding errors when the Apache Livy session behind your Jupyter notebook stays idle long enough to reach the session timeout. To resolve these errors, increase the value of the livy.server.session.timeout property in /etc/livy/conf/livy.conf on the primary node. Then, restart livy-server.
You can modify the livy.server.session.timeout property on a running Amazon EMR cluster or when you launch a new cluster.
Modify livy.server.session.timeout on a running cluster
Complete the following steps:
- Open /etc/livy/conf/livy.conf on the active primary node.
- Modify the livy.server.session.timeout value:
sudo vim /etc/livy/conf/livy.conf
livy.server.session.timeout 2h
Note: Replace 2h with the value that fits your requirements. The default value is 1 hour.
- To restart livy-server, run the commands that match your Amazon EMR release version on the active primary node.
For Amazon EMR version 5.30.0 or later, Amazon EMR 6 series, and Amazon EMR 7 series on Amazon Linux 2, run the following commands:
sudo systemctl stop livy-server
sudo systemctl start livy-server
For Amazon EMR release version 5.29.0 or earlier, run the following commands:
sudo stop livy-server
sudo start livy-server
Note: When your livy-server restarts, you can't access your cluster. To avoid downtime, configure the Apache Livy application when you launch an Amazon EMR cluster.
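The manual edit described above can also be scripted. The following is a minimal sketch, not an official tool: the function name set_livy_timeout is hypothetical, and the script assumes GNU sed (as on Amazon Linux) and that the property, if present, is on a line of its own.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set livy.server.session.timeout in a Livy conf file.
# Usage: set_livy_timeout <conf-file> <value>
set_livy_timeout() {
  local conf="$1" value="$2"
  local key="livy.server.session.timeout"
  # Match the key only when it is followed by whitespace, "=", or end of line,
  # so that livy.server.session.timeout-check is left untouched.
  local pat='^[[:space:]]*livy\.server\.session\.timeout([[:space:]=]|$)'
  if grep -Eq "$pat" "$conf"; then
    # Replace the existing setting in place (GNU sed syntax assumed).
    sed -i -E "s#${pat}.*#${key} ${value}#" "$conf"
  else
    # The property is not set yet, so append it.
    printf '%s %s\n' "$key" "$value" >> "$conf"
  fi
}
```

For example, as root on the primary node, call set_livy_timeout /etc/livy/conf/livy.conf 2h and then restart livy-server with the commands for your release version.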
Modify livy.server.session.timeout on a new cluster
Add a configuration object when you use Amazon EMR version 4.6.0 or later to launch a cluster.
Example:
[
  {
    "Classification": "livy-conf",
    "Properties": {
      "livy.server.session.timeout-check": "true",
      "livy.server.session.timeout": "2h",
      "livy.server.yarn.app-lookup-timeout": "120s"
    }
  }
]
You can also modify the following related properties:
- When you turn on the livy.server.session.timeout-check property, Apache Livy stops idle sessions that reach the timeout threshold. The default setting is true.
- The livy.server.yarn.app-lookup-timeout property sets how long Apache Livy looks for the YARN application before Livy considers the application lost. The default setting is 60s.
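One way to supply the configuration object at launch is the AWS CLI's --configurations option. The sketch below is illustrative only: the function names, the file name livy-config.json, the cluster name, the release label emr-6.15.0, and the instance settings are all assumptions to adjust for your environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write the livy-conf classification to a local file.
write_livy_config() {
  local out="${1:-livy-config.json}"
  cat > "$out" <<'EOF'
[
  {
    "Classification": "livy-conf",
    "Properties": {
      "livy.server.session.timeout-check": "true",
      "livy.server.session.timeout": "2h",
      "livy.server.yarn.app-lookup-timeout": "120s"
    }
  }
]
EOF
}

# Hypothetical helper: launch a cluster that reads the file above.
# Name, release label, instance type, and instance count are placeholders.
launch_cluster_with_livy_config() {
  local config_file="${1:-livy-config.json}"
  aws emr create-cluster \
    --name "livy-timeout-example" \
    --release-label emr-6.15.0 \
    --applications Name=Spark Name=Livy \
    --instance-type m5.xlarge \
    --instance-count 3 \
    --use-default-roles \
    --configurations "file://${config_file}"
}
```

Run write_livy_config first, review the generated file, and then run launch_cluster_with_livy_config from the same directory.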
After you run the job, make sure to close the session in Jupyter or Zeppelin. When too many sessions are open, new jobs can't start until resources become available.
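If a notebook was closed without ending its session, you can also check for and close leftover sessions through the Livy REST API. This is a minimal sketch that assumes Livy listens on its default port 8998 and is reachable from where you run the commands. The function names and the LIVY_URL variable are hypothetical.

```shell
#!/usr/bin/env bash
# Base URL of the Livy server; 8998 is Livy's default port. The host is an
# assumption: replace localhost with the primary node's DNS name if needed.
LIVY_URL="${LIVY_URL:-http://localhost:8998}"

# List current sessions (GET /sessions) to find their IDs and states.
livy_list_sessions() {
  curl -s "${LIVY_URL}/sessions"
}

# Close one session by ID (DELETE /sessions/{id}).
livy_delete_session() {
  local session_id="$1"
  curl -s -X DELETE "${LIVY_URL}/sessions/${session_id}"
}
```

For example, run livy_list_sessions to find the ID of an idle session, and then run livy_delete_session with that ID to release its resources.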
Related information
Apache Livy
Jupyter Notebook on Amazon EMR