Why can't I run Spark applications on my EMR notebook?

I can't run my Apache Spark application on my Amazon EMR notebook.

Short description

Spark applications run from an EMR notebook might fail to start with the following exception:

The code failed because of a fatal error:
Session 4 did not start up in 60 seconds.

Resolution

The following are common troubleshooting steps for running Spark applications on your EMR notebook:

Check the resources on the cluster

Make sure that the cluster has enough available resources for Jupyter to create a Spark context. You can check available memory and vCores using Amazon CloudWatch metrics or the YARN ResourceManager.
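
One way to check available resources from the primary node is to query the YARN ResourceManager REST API. The following is a minimal sketch, not an official procedure. It assumes that the ResourceManager listens on its default port 8088 on the primary node, and yarn_metric is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print one numeric field from the JSON that the YARN ResourceManager
# returns for /ws/v1/cluster/metrics. The JSON is read from standard input.
# yarn_metric is a hypothetical helper name used only for illustration.
yarn_metric() {
  local name="$1"
  grep -oE "\"${name}\"[[:space:]]*:[[:space:]]*[0-9]+" \
    | head -n 1 \
    | grep -oE '[0-9]+$'
}

# Example on the primary node (assumes the default port 8088):
#   curl -s http://localhost:8088/ws/v1/cluster/metrics | yarn_metric availableMB
#   curl -s http://localhost:8088/ws/v1/cluster/metrics | yarn_metric availableVirtualCores
#   curl -s http://localhost:8088/ws/v1/cluster/metrics | yarn_metric appsPending
```

A low availableMB value or a nonzero appsPending value suggests that a new Spark session might wait for resources past the startup timeout.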

Make sure that the Sparkmagic libraries are configured correctly

Contact your Jupyter administrator to make sure that the Sparkmagic libraries are configured correctly.

Restart the notebook kernel

1.    Open the EMR console and then select Notebook.

2.    Select the notebook from the Notebooks list, and then choose Open in JupyterLab or Open in Jupyter. A new browser tab opens to the JupyterLab or Jupyter Notebook editor.

3.    From the Kernel menu, select Restart Kernel.

Increase the Spark session timeout period for JupyterLab

To increase the Spark session timeout period, do the following:

1.    Open the EMR console and select Notebook.

2.    Select the notebook from the Notebooks list.

3.    Access the EMR notebook's Jupyter web user interface.

4.    Open the EMR notebook terminal.

5.    Open the config.json file using the following command:

vi /home/notebook/.sparkmagic/config.json

6.    Add or update the livy_session_startup_timeout_seconds: xxx option in the config.json file, where xxx is the timeout in seconds.

7.    Restart all kernels.

Note: If the JupyterHub application is installed on the EMR primary node, then do the following to increase the Spark session timeout period:

1.    Run the following command:

vi /etc/jupyter/conf/config.json

2.    Change the livy_session_startup_timeout_seconds: 60 option to your value, and then restart the JupyterHub container.
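
If you prefer not to edit the file by hand, the timeout change can be scripted. The following is a minimal sketch that assumes the livy_session_startup_timeout_seconds key is already present in the file with a numeric value. set_livy_timeout is a hypothetical helper name, and the sketch doesn't add a missing key.

```shell
#!/usr/bin/env bash
# Replace the numeric value of an existing
# livy_session_startup_timeout_seconds key in a sparkmagic config.json.
# set_livy_timeout is a hypothetical helper name. It returns 1 and leaves
# the file unchanged if the key is not found.
set_livy_timeout() {
  local file="$1" seconds="$2"
  local tmp
  grep -q '"livy_session_startup_timeout_seconds"' "$file" || return 1
  tmp="$(mktemp)" || return 1
  if sed -E "s/(\"livy_session_startup_timeout_seconds\"[[:space:]]*:[[:space:]]*)[0-9]+/\1${seconds}/" "$file" > "$tmp"; then
    # Write back in place so that file ownership and permissions are kept.
    cat "$tmp" > "$file"
  fi
  rm -f "$tmp"
}

# Example, using the file path from the steps above and a 120-second timeout:
#   set_livy_timeout /home/notebook/.sparkmagic/config.json 120
```

Restart the kernels (or the JupyterHub container) afterward, as described in the steps above.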

Tune Spark driver memory

Tune the Spark driver memory used by the Jupyter notebook application to control the resource allocation. For more information, see How can I modify the Spark configuration in an Amazon EMR notebook?

Make sure that the Apache Livy service is healthy

Check the status of the Livy server running on the primary node instance

1.    Use the following command to check the status of the livy-server:

sudo systemctl status livy-server

2.    Use the following command to start the livy-server if it isn't running:

sudo systemctl start livy-server
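
Beyond the service status, the Livy REST API can show whether sessions are stuck while starting. The following is a minimal sketch that assumes Livy listens on its default port 8998 on the primary node. livy_session_states is a hypothetical helper name, and the parsing is a simple text match rather than a full JSON parser.

```shell
#!/usr/bin/env bash
# Summarize Livy sessions by state from the JSON returned by GET /sessions.
# The JSON is read from standard input. Prints one "state=count" line per
# state. livy_session_states is a hypothetical helper name.
livy_session_states() {
  grep -oE '"state"[[:space:]]*:[[:space:]]*"[a-z_]+"' \
    | sed -E 's/.*"([a-z_]+)"$/\1/' \
    | sort \
    | uniq -c \
    | awk '{ print $2 "=" $1 }'
}

# Example on the primary node (assumes the default Livy port 8998):
#   curl -s http://localhost:8998/sessions | livy_session_states
```

Several sessions that remain in the starting state can indicate that the cluster or the Livy server is short on resources.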

Increase the Livy Server memory

By default, the notebook client attempts to connect to the Livy server for 90 seconds. If the Livy server doesn't respond within 90 seconds, then the client generates a timeout. The most common reason that the Livy server doesn't respond is insufficient resources. To fix this, increase the memory for the Livy server:

1.    Connect to the primary node instance using SSH.

2.    Add the following property to the file /etc/livy/conf/livy-env.sh, replacing 8g with your value:

export LIVY_SERVER_JAVA_OPTS="-Xmx8g"

3.    For the changes to take effect, restart the Livy server:

sudo systemctl stop livy-server
sudo systemctl start livy-server
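
If the change is applied more than once, the property can end up duplicated in livy-env.sh. The following is a minimal sketch of an idempotent update. set_livy_heap is a hypothetical helper name, and the sketch assumes that replacing the whole LIVY_SERVER_JAVA_OPTS line is acceptable.

```shell
#!/usr/bin/env bash
# Set LIVY_SERVER_JAVA_OPTS in a livy-env.sh file without duplicating the
# line. set_livy_heap is a hypothetical helper name. An existing
# LIVY_SERVER_JAVA_OPTS line is replaced entirely, so any other JVM flags
# on that line are lost.
set_livy_heap() {
  local file="$1" heap="$2"
  local line="export LIVY_SERVER_JAVA_OPTS=\"-Xmx${heap}\""
  local tmp
  if grep -q '^export LIVY_SERVER_JAVA_OPTS=' "$file"; then
    tmp="$(mktemp)" || return 1
    if sed "s|^export LIVY_SERVER_JAVA_OPTS=.*|${line}|" "$file" > "$tmp"; then
      cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
  else
    printf '%s\n' "$line" >> "$file"
  fi
}

# Example, from a shell that has write access to the file:
#   set_livy_heap /etc/livy/conf/livy-env.sh 8g
```

The Livy server still needs a restart after the file changes, as in step 3.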

Use cluster mode instead of client mode in Livy

Spark applications are submitted from the notebook in client mode, so the Spark driver runs as a subprocess of the Livy server. Because the Livy server runs on the primary node, each driver consumes resources there, and the primary node might run short of resources. To prevent Livy from failing due to insufficient resources, change the deployment mode to cluster mode. In cluster mode, the driver runs in the application primary (the YARN ApplicationMaster) on a core or task node rather than on the primary node.

To use cluster mode, do the following:

1.    Connect to the primary node using SSH.

2.    Add the following parameter to the file /etc/livy/conf/livy.conf:

livy.spark.deploy-mode  cluster

3.    For the changes to take effect, restart the Livy server:

sudo systemctl stop livy-server
sudo systemctl start livy-server
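
To confirm which deploy mode the configuration file currently sets, you can read the value back. The following is a minimal sketch. livy_deploy_mode is a hypothetical helper name, and the sketch assumes that the key and value are separated by whitespace, an equals sign, or a colon.

```shell
#!/usr/bin/env bash
# Print the value of livy.spark.deploy-mode from a livy.conf file, or
# "unset" if no uncommented entry exists. If the key appears more than
# once, the last entry is printed. livy_deploy_mode is a hypothetical
# helper name.
livy_deploy_mode() {
  local file="$1"
  awk -F '[ \t=:]+' '
    /^[ \t]*#/ { next }                       # skip comment lines
    { sub(/^[ \t]+/, "") }                    # drop leading whitespace
    $1 == "livy.spark.deploy-mode" { mode = $2 }
    END { print (mode == "" ? "unset" : mode) }
  ' "$file"
}

# Example on the primary node:
#   livy_deploy_mode /etc/livy/conf/livy.conf
```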

AWS OFFICIAL Updated a year ago