Spark 2.3.1-1808 (EEP 6.0.0) Release Notes

This section provides reference information, including new features, patches, and known issues for Spark 2.3.1-1808.

The notes below relate specifically to the MapR Distribution for Apache Hadoop. This release of Spark includes backward-compatibility changes; see the open-source Spark 2.3.1 Release Notes for more information.

These release notes contain only MapR-specific information and are not necessarily cumulative in nature. For information about how to use the release notes, see Ecosystem Component Release Notes.

Spark Version 2.3.1
Release Date September 2018
MapR Version Interoperability See EEP Components and OS Support.
Source on GitHub https://github.com/mapr/spark/tree/2.3.1-mapr-1808
GitHub Release Tag 2.3.1-mapr-1808
Maven Artifacts https://repository.mapr.com/maven/
Package Names Navigate to https://package.ezmeral.hpe.com/releases/MEP/ and select your EEP and OS to view the list of package names.
IMPORTANT
  • Starting with EEP 6.0.0, keyStore and trustStore passwords can be removed from the spark-defaults.conf file and set in the /opt/mapr/conf/ssl-client.xml file instead.
  • Starting with EEP 6.0.0, after an upgrade, configuration files of previous versions are saved in the /opt/mapr/spark directory.
  • MapR 6.1 and EEP 6.0.0 introduce "Simplified Security". If you use these versions and enable security on your MapR cluster, MapR scripts automatically configure the Spark security features.
  • The encryption algorithms given in Configure SSL Encryption for Spark on YARN are no longer available to your web service. Remove the spark.ssl.enabledAlgorithms TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA line from your Spark configuration so that both parties can negotiate matching ciphers.
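
The last item above amounts to deleting one property line. The following Python sketch shows one way to do that on the text of a Spark properties file. It is illustrative only: the helper name drop_enabled_algorithms is hypothetical and not part of any MapR tooling, and the sketch assumes each setting sits on its own line with the key separated from the value by whitespace, "=" or ":".

```python
import re

KEY = "spark.ssl.enabledAlgorithms"


def drop_enabled_algorithms(conf_text: str) -> str:
    """Return conf_text without any active spark.ssl.enabledAlgorithms line.

    Hypothetical helper for illustration. It only transforms text; it does
    not read or write files and does not restart any service.
    """
    kept = []
    for line in conf_text.splitlines():
        # Assumption: the key ends at the first whitespace, '=' or ':'.
        first_field = re.split(r"[\s=:]", line.strip(), maxsplit=1)[0]
        # Commented-out lines start with '#', so they never equal KEY and are kept.
        if first_field == KEY:
            continue
        kept.append(line)
    result = "\n".join(kept)
    # Preserve the trailing newline when any content remains.
    return result + "\n" if kept and conf_text.endswith("\n") else result


sample = (
    "spark.ssl.enabled true\n"
    "spark.ssl.enabledAlgorithms TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA\n"
    "spark.ssl.protocol TLSv1.2\n"
)
print(drop_enabled_algorithms(sample))
```

Back up the original file before applying any such change to a real configuration.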

Hive Support

This version of Spark supports integration with Hive. However, note the following exceptions:

New in This Release

For a complete list of all new features, refer to the open-source documentation.

The following features of Spark 2.3.1 are NOT officially supported:
  • Continuous Processing in Structured Streaming
  • Stream-Stream Joins in Structured Streaming
  • Spark on Kubernetes support

Fixes

This MapR release includes the following new fixes since the latest MapR Spark 2.3.1 release. For details, refer to the commit log for this project in GitHub.

GitHub Commit Date (YYYY/MM/DD) Comment
484df64 2018/05/03 [MAPR-31305] Spark History server NOT loading applications submitted by users other than "mapr"
1542dfc 2018/05/04 MapR [SPARK-181] Kafka 0.10 Structured Streaming unit tests fixed
c08d3c7 2018/05/07 MapR [SPARK-210] Rename sprk-defaults.conf to spark-defaults.conf.template
7fe6261 2018/05/11 MapR [SPARK-227] KafkaUtils.createDirectStream fails with kafka-09
563e240 2018/05/11 MapR [SPARK-76] spark configure.sh should not restart the service every time it is run
0a9f1ca 2018/05/15 MapR [SPARK-213] Loading of data for parquet files bug fixed
f8660f2 2018/05/17 MapR [SPARK-220] SparkR fails with UDF functions bug fixed
a0f0d55 2018/05/22 [SPARK-24062][THRIFT SERVER] Fix SASL encryption cannot enabled issue in thrift server
d570fcb 2018/05/23 MapR [SPARK-209] Implement saving of user configurations during ecosystem package update
44318f1 2018/05/25 MapR [SPARK-226] Spark - pySpark Security Vulnerability
769754f 2018/05/29 MapR [SPARK-244] Provide ability to use MapR-Negotiation authentication for Spark HistoryServer
4db1e9a 2018/05/30 MapR [SPARK-216] Spark thriftserver fails when work with hive-maprdb json table
527522e 2018/05/30 MapR [SPARK-214] Hive-2.1 properties cannot be read from a hive-site.xml as Spark uses Hive-1.2
b9a5c43 2018/05/31 MapR [SPARK-248] MapRDBTableScanRDD fails to convert to Scala Dataframe when using where clause
a0524c3 2018/06/01 MapR [SPARK-255] Installer fresh install 610/600 secure fails to start "mapr-spark-thriftserver" and "mapr-spark-historyserver"
a30ec42 2018/06/05 MapR [SPARK-256] Spark does not work in Yarn mode
a0abaaa 2018/06/09 MapR [SPARK-260] fix EC option handling
ad91e9a 2018/06/12 MapR [SPARK-259] Spark application does not finish correctly
9be83a0 2018/06/13 MapR [SPARK-261] Use mapr-security-web for getting passwords
b296795 2018/06/13 [MAPR-31632] RM UI showing broken page for Spark jobs
cc5c12e 2018/06/14 MapR [SPARK-263] Add possibility to use keyPassword which is different from keyStorePassword
528e034 2018/06/18 MapR [SPARK-266] Spark jobs cannot finish correctly, when there is an error during job running
63c159b 2018/06/21 MapR [SPARK-272] Use only client passwords from ssl-client.xml
917558f 2018/06/26 MapR [SPARK-273] Update Hive-1.2 dependencies in Spark
bbb05f9 2018/06/26 MapR [SPARK-276] Update zookeeper dependency to v.3.4.11 for spark 2.3.1
ce5a14b 2018/07/02 MapR [SPARK-279] Cannot connect to spark thrift server with new spark and hive packages
e316246 2018/07/02 MapR [SPARK-278] Spark submit fails for jobs with python
8dc3e24 2018/07/09 MapR [SPARK-282] Remove maprfs and hadoop jars from mapr spark package
aeb5e3a 2018/07/10 [SPARK-212] SparkHiveExample fails when we run it twice
b444403 2018/07/13 [SPARK-130] MapR Database connector - NPE while saving Pair RDD with "null" values
3ca1f51 2018/07/16 MapR [SPARK-281] Spark configure.sh -R is ignoring custom security and overriding hive-site.xml
ac12360 2018/07/24 MapR [SPARK-277] Spark thriftserver fails when we try inserting for hive-maprdb json table
925c9fb 2018/07/27 MapR [SPARK-296] Structured Streaming memory leak
940f23d 2018/08/03 MapR [SPARK-297] Added unit test for empty value conversion
67cb089 2018/08/08 [SPARK-302] Local privilege escalation
17e3c3b 2018/08/10 [SPARK-301] Error while submitting job in Standalone cluster mode on MapR secure cluster
832411e 2018/08/14 [SPARK-306] Kafka clients 1.0.1 present in jars directory for Spark 2.3.1
c1ee416 2018/08/15 [MAPR-32014] Spark Consumer fails with java.lang.AssertionError
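
To look up the details of a fix, its abbreviated commit hash can be turned into a link to the commit log. The Python sketch below is illustrative only. It assumes the standard GitHub commit URL layout under the mapr/spark repository named in these notes, and that GitHub resolves the abbreviated hash. The Fix type and the parse_fix helper are hypothetical names.

```python
from datetime import date, datetime
from typing import NamedTuple


class Fix(NamedTuple):
    commit: str      # abbreviated commit hash from the table
    committed: date  # commit date
    comment: str     # commit comment, including the issue key

    @property
    def url(self) -> str:
        # Assumption: standard GitHub layout, https://github.com/<owner>/<repo>/commit/<hash>.
        return f"https://github.com/mapr/spark/commit/{self.commit}"


def parse_fix(row: str) -> Fix:
    """Split one 'hash date comment' row into its three fields."""
    commit, day, comment = row.split(None, 2)
    return Fix(commit, datetime.strptime(day, "%Y/%m/%d").date(), comment)


fix = parse_fix("a30ec42 2018/06/05 MapR [SPARK-256] Spark does not work in Yarn mode")
print(fix.url)
```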

Known Issues

  • You cannot connect to a Spark Thrift Server on a Kerberos-secured cluster because Kerberos and SSL are not compatible.

    Workaround: Set hive.server2.use.SSL to false in the hive-site.xml file.

  • When you install a secure (MapR-SASL) cluster using the MapR Installer, the configure.sh script configures Hive after Spark. As a result, Spark copies the wrong hive-site.xml file, so the Spark and Hive integration may not work correctly and you may have problems connecting to Spark beeline.

    Workaround: Check the hive-site.xml file in the Spark home directory and, if needed, rerun the configure.sh script or copy the hive-site.xml file from your Hive home directory, then restart services.
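
Both workarounds above come down to inspecting or editing hive-site.xml. The following Python sketch is illustrative only. It assumes the usual Hadoop-style layout of configuration, property, name, and value elements. The helper names set_hive_property and hive_site_matches are hypothetical. Treating an exact file comparison as the check for the second workaround is also an assumption, since these notes do not say what to look for. Note that ElementTree drops XML comments and the XML declaration when it rewrites the document.

```python
import filecmp
import xml.etree.ElementTree as ET


def set_hive_property(xml_text: str, name: str, value: str) -> str:
    """Set (or add) one property in Hadoop-style configuration XML text."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() == name:
            node = prop.find("value")
            if node is None:
                node = ET.SubElement(prop, "value")
            node.text = value
            break
    else:
        # No matching property found: append a new one.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


def hive_site_matches(spark_copy: str, hive_copy: str) -> bool:
    """Return True when the two files have identical contents."""
    # shallow=False forces a content comparison instead of comparing os.stat() data.
    return filecmp.cmp(spark_copy, hive_copy, shallow=False)


sample = """<configuration>
  <property>
    <name>hive.server2.use.SSL</name>
    <value>true</value>
  </property>
</configuration>"""

updated = set_hive_property(sample, "hive.server2.use.SSL", "false")
print(updated)
```

To use hive_site_matches, pass the paths of the hive-site.xml files under your Spark and Hive home directories. Those paths depend on the installed versions and are not given here.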

Resolved Issues

None.