HPE Ezmeral Data Fabric 6.1.x is In Maintenance and transitions to "End of Maintenance" in June 2024. Please see the latest documentation.

Zeppelin

Apache Zeppelin is an open-source, web-based data-science notebook. You can use it with MapR components for data discovery, ETL, machine learning, and data visualization.

You can run the package-based Zeppelin product only on a MapR node (and not on an edge node). Out of the box, Zeppelin is integrated with open-source data-processing engines such as Apache Spark, Apache Drill, and Apache Hive, as well as with native MapR engines (MapR File System, MapR Database, and MapR Event Store For Apache Kafka). Using the notebook simply requires connecting to Zeppelin through your browser.

Zeppelin provides the following benefits for your data-engineering and data-science use cases:

- An interactive development environment for writing, testing, and sharing data-processing code snippets
- Support for a variety of interpreters for integrating with different backend components
- Support for extensible visualization libraries
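
As a rough illustration of how interpreters are selected, each Zeppelin notebook paragraph can begin with a `%<interpreter>` directive that routes the rest of the paragraph to a backend. The sketch below is not part of the MapR product: `make_paragraph` is a hypothetical helper that only assembles paragraph text, and the interpreter names `md` and `spark` are assumptions about what is configured in your installation.

```python
def make_paragraph(interpreter: str, body: str) -> str:
    """Build Zeppelin paragraph text: a %<interpreter> line followed by the body.

    Hypothetical helper for illustration only; it does not talk to Zeppelin.
    """
    name = interpreter.strip().lstrip("%")
    if not name:
        raise ValueError("interpreter name must not be empty")
    return f"%{name}\n{body.strip()}\n"


# Assumed interpreter names; check the Interpreter page of your Zeppelin UI.
paragraphs = [
    make_paragraph("md", "## Data discovery notes"),
    make_paragraph("spark", "val df = spark.range(10)\ndf.show()"),
]

for text in paragraphs:
    print(text)
```

Text produced this way can be pasted into a notebook paragraph in the browser. Which interpreters are actually available depends on how your Zeppelin instance is configured.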

For release-specific information, see Zeppelin Release Notes (Package-Based).

For additional information about Zeppelin, refer to the open-source Apache Zeppelin documentation.