Configure Data Fabric Client Node to Run Spark Applications

When Spark runs on YARN, a Data Fabric client node requires the hadoop-yarn-server-web-proxy JAR file to run Spark applications. On Windows, the client node also requires an update to SPARK_DIST_CLASSPATH. A Data Fabric client node, also known as an edge node, is a node that has the mapr-client package installed but no mapr-core packages.

The mapr-client package does not include the hadoop-yarn-server-web-proxy JAR file. You must therefore copy the following JAR file from a Data Fabric cluster node to the same location on each Data Fabric client node where you want to run Spark applications:
/opt/mapr/hadoop/hadoop-<version>/share/hadoop/yarn/hadoop-yarn-server-web-proxy-<version>.jar
For example, here is the JAR file path for Hadoop 3.3.5. Note that the version in the JAR file name can be longer than the version in the directory name:
/opt/mapr/hadoop/hadoop-3.3.5/share/hadoop/yarn/hadoop-yarn-server-web-proxy-3.3.5.100-eep-920.jar
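The copy step might look like the following minimal sketch. It builds the JAR path from the two version strings in the example above and prints it. The helper name web_proxy_jar_path, the CLUSTER_NODE variable, and the use of scp with the root user are assumptions for illustration and are not part of the product. The scp line is left commented out because it depends on SSH access between the nodes and on your own host names.

```shell
#!/bin/sh
# Hypothetical helper: print the web-proxy JAR path for a given
# Hadoop directory version ($1) and JAR file version ($2).
web_proxy_jar_path() {
    printf '/opt/mapr/hadoop/hadoop-%s/share/hadoop/yarn/hadoop-yarn-server-web-proxy-%s.jar\n' "$1" "$2"
}

# Versions taken from the Hadoop 3.3.5 example; adjust them to match your cluster.
JAR_PATH=$(web_proxy_jar_path 3.3.5 3.3.5.100-eep-920)
echo "$JAR_PATH"

# Run on the client node. CLUSTER_NODE is a placeholder for the host name of a
# cluster node; the root user and SSH access are assumptions.
# scp "root@${CLUSTER_NODE}:${JAR_PATH}" "${JAR_PATH}"
```

To find the actual versions, list the contents of /opt/mapr/hadoop on the cluster node before running the copy. The destination path is the same as the source path because the JAR file must be in the same location on the client node.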