Installing Custom Packages for PySpark

You can install custom Python packages either by manually installing them on each node in your MapR Data Platform cluster or by using Conda. Using Conda lets you perform the installation from your Zeppelin host node without directly accessing your MapR cluster. The topics in this section provide instructions for each method, for both Python 2 and Python 3.

You can run only one version of Python in your Zeppelin notebook.

IMPORTANT: MapR supports the Python libraries included in the Zeppelin container but does not support libraries from custom Python packages. Use Python versions that match the versions installed on your MapR cluster nodes. Choosing a Zeppelin Docker image whose OS matches the OS running in your MapR cluster minimizes library version differences.
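
After installing a package, it can help to confirm which Python version is running and whether the package actually imports, both in the notebook and in Spark tasks. The following is a minimal sketch, not an official MapR procedure. The helper names environment_report and executor_reports are hypothetical, num_tasks is an arbitrary illustrative value, and the sketch assumes that sc is an existing SparkContext (as is typically predefined by Zeppelin's PySpark interpreter). Spark tasks are not guaranteed to run on every node, so treat the executor result as a spot check.

```python
import importlib
import sys


def environment_report(packages):
    """Return this interpreter's major.minor version and, for each
    package name, whether it can be imported here."""
    report = {
        "python": "{}.{}".format(sys.version_info[0], sys.version_info[1]),
        "packages": {},
    }
    for name in packages:
        try:
            importlib.import_module(name)
            report["packages"][name] = True
        except ImportError:
            report["packages"][name] = False
    return report


def executor_reports(sc, packages, num_tasks=8):
    """Run environment_report inside Spark tasks and return the distinct
    results. `sc` is assumed to be an existing SparkContext; `num_tasks`
    is an arbitrary illustrative value."""
    def probe(_):
        report = environment_report(packages)
        # Convert to a hashable tuple so distinct results fit in a set.
        return (report["python"], tuple(sorted(report["packages"].items())))

    return set(sc.parallelize(range(num_tasks), num_tasks).map(probe).collect())
```

In a notebook paragraph, you might call print(environment_report(["numpy"])) to check the notebook's own interpreter, then print(executor_reports(sc, ["numpy"])) to spot-check the cluster side; numpy is used here only as an example package name. If more than one distinct result comes back from the executors, the sampled nodes likely differ in Python version or installed packages.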