Post-Upgrade Steps for Spark

Complete the following steps after you upgrade Spark with or without the Installer.

Post-Upgrade Steps for Spark Standalone Mode

Procedure

  1. (Optional) Migrate Custom Configurations.
    Migrate any custom configuration settings into the new default files in the conf directory (/opt/mapr/spark/spark-<version>/conf). Example command sketches for the steps in this procedure appear after the procedure.
  2. If Spark SQL is configured to work with Hive, copy the hive-site.xml file into the conf directory (/opt/mapr/spark/spark-<version>/conf).
  3. Run the following commands to configure the secondary instances:
    1. For Spark 2.x:
      Copy /opt/mapr/spark/spark-<version>/conf/slaves.template to /opt/mapr/spark/spark-<version>/conf/slaves.
      For Spark 3.x:
      Copy /opt/mapr/spark/spark-<version>/conf/workers.template to /opt/mapr/spark/spark-<version>/conf/workers.
    2. Add the hostnames of the Spark worker nodes, one hostname per line.
      For example:
      localhost
      worker-node-1
      worker-node-2
  4. Run configure.sh -R.
  5. Restart all the Spark secondary instances as the mapr user:
    For Spark 2.x:
    /opt/mapr/spark/spark-<version>/sbin/start-slaves.sh spark://<comma-separated list of Spark master hostname:port pairs>
    For Spark 3.x:
    /opt/mapr/spark/spark-<version>/sbin/start-workers.sh spark://<comma-separated list of Spark master hostname:port pairs>
  6. Delete the old Spark directory from /opt/mapr/spark. For example, if you upgraded from Spark 2.1.0 to 2.3.1, you need to delete /opt/mapr/spark/spark-2.1.0.
    Starting with the EEP 6.1.0 release, for Spark 2.2.1 and later versions, the old directory is removed automatically after an upgrade. Only the new directory and the timestamped directory are present.
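
Example for steps 1 and 2 (a minimal sketch): the commands below assume an upgrade from Spark 3.2.0 to Spark 3.3.2 and a Hive installation under /opt/mapr/hive/hive-2.3; substitute the versions and paths used on your cluster.

  # Compare the old and new conf directories to find custom settings to carry forward
  diff -r /opt/mapr/spark/spark-3.2.0/conf /opt/mapr/spark/spark-3.3.2/conf

  # Copy hive-site.xml so Spark SQL can reach Hive (the Hive path is an assumption)
  cp /opt/mapr/hive/hive-2.3/conf/hive-site.xml /opt/mapr/spark/spark-3.3.2/conf/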
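
Example for steps 3 through 5 (Spark 3.x sketch): the hostnames node-a, worker-node-1, and worker-node-2, the master port 7077, and the example version are placeholders; configure.sh is typically under /opt/mapr/server.

  # Create the workers file from the template
  cp /opt/mapr/spark/spark-3.3.2/conf/workers.template /opt/mapr/spark/spark-3.3.2/conf/workers

  # List one worker hostname per line
  echo "worker-node-1" >> /opt/mapr/spark/spark-3.3.2/conf/workers
  echo "worker-node-2" >> /opt/mapr/spark/spark-3.3.2/conf/workers

  # Re-run the configuration script to register the new Spark version
  /opt/mapr/server/configure.sh -R

  # Start the workers against the Spark master, running as the mapr user
  /opt/mapr/spark/spark-3.3.2/sbin/start-workers.sh spark://node-a:7077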
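
Example for step 6 (cleanup sketch): assumes the previous version was 3.2.0; verify that no jobs or scripts still reference the old directory before deleting it.

  # The timestamped directory left by the upgrade stays; remove only the old versioned directory
  ls -d /opt/mapr/spark/spark-*
  rm -rf /opt/mapr/spark/spark-3.2.0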

Post-Upgrade Steps for Spark on YARN

Procedure

  1. (Optional) Migrate Custom Configurations.
    Migrate any custom configuration settings into the new default files in the conf directory (/opt/mapr/spark/spark-<version>/conf). Also, if you previously configured Spark to use the Spark JAR file from a location on the file system, copy the latest JAR file to the file system and update the path to the JAR file in the spark-defaults.conf file. See Configure Spark JAR Location. An example sketch for this step appears after this procedure.
  2. If Spark SQL is configured to work with Hive, copy the hive-site.xml file into the conf directory (/opt/mapr/spark/spark-<version>/conf).
  3. Run configure.sh -R.
  4. Delete the old Spark directory from /opt/mapr/spark. For example, if you upgraded from Spark 2.1.0 to 2.3.1, you need to delete /opt/mapr/spark/spark-2.1.0.
    Starting with the EEP 6.1.0 release, for Spark 2.2.1 and later versions, the old directory is removed automatically after an upgrade. Only the new directory and the timestamped directory are present.
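
Example for step 1 (a minimal sketch): the commands below assume the Spark JAR files are served from a maprfs:// path named /apps/spark/spark-3.3.2-jars and use the spark.yarn.jars property; the path, property, and versions are illustrative only, so follow Configure Spark JAR Location for the supported settings on your cluster.

  # Upload the JAR files shipped with the new Spark version to the cluster file system (target path is an assumption)
  hadoop fs -mkdir -p /apps/spark/spark-3.3.2-jars
  hadoop fs -put /opt/mapr/spark/spark-3.3.2/jars/* /apps/spark/spark-3.3.2-jars/

  # Point spark-defaults.conf at the uploaded JARs
  echo "spark.yarn.jars maprfs:///apps/spark/spark-3.3.2-jars/*" >> /opt/mapr/spark/spark-3.3.2/conf/spark-defaults.conf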