Spark 2.4.4.0-1912 Release Notes
This section provides reference information, including new features, patches, and known issues for Spark 2.4.4.0.
The notes below relate specifically to the MapR Distribution for Apache Hadoop. For more information, you may also wish to consult the open-source Spark 2.4.4 Release Notes.
These release notes contain only MapR-specific information and are not necessarily cumulative in nature. For information about how to use the release notes, see Ecosystem Component Release Notes.
Spark Version | 2.4.4.0 |
Release Date | December 2019 |
MapR Version Interoperability | See Component Versions for Released EEPs and EEP Components and OS Support. |
Source on GitHub | https://github.com/mapr/spark |
GitHub Release Tag | 2.4.4.0-mapr-630 |
Maven Artifacts | https://repository.mapr.com/maven/ |
Package Names | Navigate to https://package.ezmeral.hpe.com/releases/MEP/ and select your EEP and OS to view the list of package names. |
- Beginning with EEP 6.0.0, the keyStore and trustStore passwords can be removed from `spark-defaults.conf` and set in `/opt/mapr/conf/ssl-client.xml`.
- Beginning with EEP 6.0.0, after an upgrade, the previous version's configuration files are saved in the `/opt/mapr/spark` directory.
- MapR 6.1.0 with EEP 6.0.0 and later support simplified security. If you enable security on your MapR cluster, MapR scripts automatically configure Spark security features.
- Beginning with EEP 6.3.0, the Spark MapRDB JSON connector supports secondary indexes.
- Beginning with EEP 6.3.0, Spark supports configurable HTTP security headers.
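As a sketch of the first note above, the keyStore and trustStore passwords can be placed in `/opt/mapr/conf/ssl-client.xml`. This assumes the standard Hadoop `ssl-client.xml` property names (not confirmed by these notes for the MapR layout); all paths and values below are placeholders:

```xml
<configuration>
  <!-- Placeholder truststore location and password; substitute your cluster's values. -->
  <property>
    <name>ssl.client.truststore.location</name>
    <value>/opt/mapr/conf/ssl_truststore</value>
  </property>
  <property>
    <name>ssl.client.truststore.password</name>
    <value>changeme</value>
  </property>
  <!-- Placeholder keystore location and password. -->
  <property>
    <name>ssl.client.keystore.location</name>
    <value>/opt/mapr/conf/ssl_keystore</value>
  </property>
  <property>
    <name>ssl.client.keystore.password</name>
    <value>changeme</value>
  </property>
</configuration>
```

With the passwords stored here, the corresponding entries can be dropped from `spark-defaults.conf`.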
Hive Support
This version of Spark supports integration with Hive. However, note the following exceptions:
- Hive-on-Spark is not supported.
- Spark-SQL is supported, but it is not fully compatible with Hive. For details, see the Apache Spark documentation and the MapR Spark documentation.
New in This Release
- MapR Spark ACLs behave like Apache Spark ACLs. For details, see the ACL Configuration for Spark documentation.
- For a complete list of new features, see the open-source Spark 2.4.4 Release Notes.
Fixes
This MapR release includes the following new fixes since the previous MapR Spark release. For complete details, refer to the commit log for this project on GitHub.
GitHub Commit | Date (YYYY-MM-DD) | Comment |
---|---|---|
47147db | 2019-09-20 | MapR [SPARK-609] Port Apache Spark-2.4.4 changes to the MapR Spark-2.4.4 branch |
b41bbe0 | 2019-09-25 | MapR [SPARK-614] Error in sparkR while reading avro and parquet file formats |
ca0c9c4 | 2019-10-07 | MapR [SPARK-618] Update hive dependencies for spark-2.4.4 to 2.3.6 version |
2827bed | 2019-10-07 | MapR [SPARK-619] Move absent commits from 2.4.3 branch to 2.4.4 |
118c6c5 | 2019-10-09 | MapR [SPARK-620] Replace core dependency in Spark-2.4.4 |
c996bb0 | 2019-10-09 | MapR [SPARK-595] Spark cannot access hs2 through zookeeper |
e758a24 | 2019-10-11 | MapR [SPARK-621] Add custom http header support. Improve work with security headers. |
4000048 | 2019-10-16 | MapR [SPARK-617] Can't use ssl via spark beeline |
d3a0ec5 | 2019-10-22 | MapR [SPARK-340] Jetty web server version at Spark should be updated to v9.4.X |
c7e076e | 2019-10-22 | MapR [SPARK-626] Update kafka dependencies for Spark 2.4.4.0 in release MEP-6.3.0 |
c5cbbcc | 2019-10-23 | MapR [MS-925] After upgrade to MEP 6.2 (Spark 2.4.0) can no longer |
5eaced8 | 2019-11-07 | MapR [SPARK-629] Spark UI for job lose CSS styles |
b0d5ee9 | 2019-11-11 | MapR [SPARK-639] Default headers are adding two times |
c99e9c9 | 2019-11-15 | MapR [SPARK-627] SparkHistoryServer-2.4 is getting 403 Unauthorized home page for users(spark.ui.view.acls) via spark-submit |
Known Issues
- MapR [SPARK-593], MapR [SPARK-558] - A Spark job can hang, and the job output can be redirected to the `/opt/mapr/logs/pam.log` file, if you use the spark-shell during login to the Spark Driver UI or if you try to open the Spark Web UI before it is initialized.
- MapR [SPARK-573] - A Spark job on a standalone node fails via the mapr-client. This happens because the `spark-defaults.conf` file cannot be configured by the Spark `configure.sh` script, since the core `configure.sh` does not call it. Workaround: Two workarounds are possible:
  - Copy the `spark-defaults.conf` file from `/opt/mapr/spark/spark-<version>/conf/` into the same folder on the client node. In some cases, copying the `spark-defaults.conf` file may not be enough.
  - Run the Spark `configure.sh` script directly. Note that the Spark `configure.sh` script is not documented for external use. It is normally run implicitly by the core `configure.sh`, and running it directly with the wrong options can break the Spark configuration.
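The copy workaround above can be sketched as a shell step. The Spark version in the path and the client hostname are assumptions; substitute the values for your cluster:

```shell
# Assumed path for an installed Spark 2.4.4; adjust to match your version.
SPARK_CONF=/opt/mapr/spark/spark-2.4.4/conf

if [ -f "$SPARK_CONF/spark-defaults.conf" ]; then
  # Push the cluster node's spark-defaults.conf into the same folder on the
  # client node ("client-node" is a placeholder hostname).
  scp "$SPARK_CONF/spark-defaults.conf" client-node:"$SPARK_CONF/"
else
  echo "spark-defaults.conf not found under $SPARK_CONF" >&2
fi
```

If copying the file is not enough, the remaining option is to run the Spark `configure.sh` script directly, with the caveats noted above.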
Resolved Issues
- None.