Spark SQL Thrift Server
The Spark SQL Thrift server (Spark Thrift) was developed from Apache Hive HiveServer2 and operates like the HiveServer2 Thrift server.
It is supported on secure clusters. You can run the Spark Thrift server and connect to Hive versions supported by Spark 2.1.0 with Business Intelligence (BI) tools or the Beeline command-line tool.
Starting in the MEP 4.0 release, the Spark Thrift server is available as a separate package. For instructions about installing this package, see Installing Spark Standalone or Installing Spark on YARN, depending on the type of cluster manager you are installing.
In MEP 3.0, MapR introduces additional security mechanisms for Spark with the Spark Thrift server. MapR-SASL and Kerberos are supported:
- For JDBC connections into Spark Thrift server
- Between Spark and Hive metastore
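A JDBC client such as Beeline reaches the Spark Thrift server through a HiveServer2-style URL. The following is a minimal sketch of how such a URL might be assembled. The helper name thrift_jdbc_url, the host name, and the default database are hypothetical. The auth=maprsasl and principal= session parameters are assumptions based on common HiveServer2 JDBC usage, so verify them against your driver documentation.

```python
from typing import Optional


def thrift_jdbc_url(host: str, port: int = 10000, database: str = "default",
                    auth: Optional[str] = None,
                    principal: Optional[str] = None) -> str:
    """Build a HiveServer2-style JDBC URL (hypothetical helper)."""
    url = f"jdbc:hive2://{host}:{port}/{database}"
    # Assumed session parameter for MapR-SASL connections.
    if auth:
        url += f";auth={auth}"
    # Assumed session parameter for Kerberos connections.
    if principal:
        url += f";principal={principal}"
    return url


# Example with an illustrative host name.
print(thrift_jdbc_url("node1.example.com", auth="maprsasl"))
```

The resulting string is what you would pass to a JDBC client, for example as the argument of Beeline's -u option.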
Starting in the MEP 4.0 release, running configure.sh -R on a secure cluster configures MapR-SASL security for the Spark Thrift server. The script modifies or creates a SPARK_HOME/conf/hive-site.xml file as follows:
- If Hive is installed in your cluster, the script copies HIVE_HOME/conf/hive-site.xml to SPARK_HOME/conf and modifies the file.
- If Hive is not installed and you are using MapR-SASL security, the script creates a new SPARK_HOME/conf/hive-site.xml file.
- Each time the script runs, if there is a pre-existing SPARK_HOME/conf/hive-site.xml file, the script saves a copy of the file in SPARK_HOME/conf/hive-site.xml.old before modifying it.
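The file handling described above can be illustrated with a short sketch. This is not the configure.sh implementation. The function name prepare_hive_site and its return values are hypothetical, and the security properties that the real script writes are not listed on this page, so that step is left as a comment.

```python
import shutil
from pathlib import Path
from typing import Optional


def prepare_hive_site(spark_home: Path, hive_home: Optional[Path] = None,
                      use_mapr_sasl: bool = True) -> str:
    """Mimic the documented copy/create/backup steps; return the action taken."""
    spark_conf = spark_home / "conf"
    spark_conf.mkdir(parents=True, exist_ok=True)
    target = spark_conf / "hive-site.xml"

    # A pre-existing file is saved as hive-site.xml.old before any change.
    if target.exists():
        shutil.copy2(target, spark_conf / "hive-site.xml.old")

    hive_site = None
    if hive_home is not None:
        hive_site = hive_home / "conf" / "hive-site.xml"

    if hive_site is not None and hive_site.is_file():
        # Hive is installed: start from Hive's own configuration file.
        shutil.copy2(hive_site, target)
        action = "copied"
    elif use_mapr_sasl and not target.exists():
        # No Hive and MapR-SASL in use: create a new, empty configuration.
        target.write_text("<configuration>\n</configuration>\n")
        action = "created"
    else:
        action = "unchanged"

    # The real script would now edit security properties in `target`;
    # those edits are not documented here and are omitted.
    return action
```

Taking the backup first means the .old file always reflects the state from before the current run, regardless of which branch follows.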
You can manually configure security by following the steps outlined in sub-topics listed on this page.
To launch Spark Thrift server, perform the procedures required to configure Spark to use Hive.
If you use the /opt/mapr/spark/<spark-version>/sbin/{start,stop}-thriftserver.sh scripts, the port number remains 10000.
Default Behavior
The default behavior of the Spark Thrift server is as follows:
- After installation, the Spark Thrift server is started in the local master mode.
- If the Spark master package is installed, then Spark Thrift server is started in the standalone master mode.
- If the spark.master property is set in the spark-defaults.conf file, then Spark Thrift server uses the master set by this property.
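The selection rules above can be read as a precedence order: an explicit spark.master setting first, then the standalone master if its package is installed, and otherwise local mode. The sketch below illustrates that reading. The function names, the local[*] value, and the spark://localhost:7077 URL are assumptions for illustration, and the parser handles only whitespace-separated "key value" lines.

```python
from typing import Optional


def read_spark_master(defaults_text: str) -> Optional[str]:
    """Return the spark.master value from spark-defaults.conf text, if any."""
    for raw in defaults_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == "spark.master":
            return parts[1].strip()
    return None


def resolve_master(defaults_text: str, master_package_installed: bool,
                   standalone_url: str = "spark://localhost:7077") -> str:
    """Pick the master following the assumed precedence order."""
    configured = read_spark_master(defaults_text)
    if configured:
        return configured            # explicit property wins
    if master_package_installed:
        return standalone_url        # standalone master mode
    return "local[*]"                # assumed value for local master mode


print(resolve_master("spark.master yarn\n", master_package_installed=True))
```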
Known Limitations
- MapR-SASL support is only implemented for Spark 2.1.0.
- The ODBC driver does not support MapR-SASL.
- Username and password authentication through PAM is not supported in MEP 3.0.
- Impersonation is supported only for SELECT statements that access data stored in MapR-FS or MapR-DB.
- Spark Thrift server supports only features and commands in Hive 1.2.
- Although Spark 2.1.0 can connect to Hive 2.1 Metastore, only Hive 1.2 features and commands are supported by Spark 2.1.0.