HPE Ezmeral Data Fabric 6.2 is In Maintenance and transitions to "End of Maintenance" in June 2024. Please see the latest documentation.

Manual or Automatic Failover for the ResourceManager

With manual or automatic failover, one active ResourceManager process and two standby ResourceManager processes run in the cluster. The standby ResourceManager nodes run the ResourceManager process without loading the working state. When the active ResourceManager fails, one of the standby ResourceManager nodes can load the working state from ZooKeeper and continue providing services to the cluster.

ResourceManager clients (HPE Ezmeral Data Fabric client nodes, ApplicationMaster processes, and NodeManager nodes) attempt connections to the ResourceManager nodes in a round-robin fashion until they reach an active ResourceManager node. If the active ResourceManager node goes down, ResourceManager clients resume round-robin polling until they detect an active ResourceManager node.
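
The round-robin behavior described above can be sketched in a few lines of Python. This is an illustration only, not the client implementation: the function name find_active_rm, the is_active probe, the max_attempts limit, and the rm1/rm2/rm3 identifiers are all hypothetical, and the cluster state is simulated with a dictionary.

```python
from itertools import cycle, islice
from typing import Callable, Optional, Sequence


def find_active_rm(
    rm_hosts: Sequence[str],
    is_active: Callable[[str], bool],
    max_attempts: int = 10,
) -> Optional[str]:
    """Try each ResourceManager in turn, wrapping around, until one reports active."""
    if not rm_hosts:
        return None
    # cycle() restarts from the first host after the last one (round-robin);
    # islice() bounds the number of probes so the sketch always terminates.
    for host in islice(cycle(rm_hosts), max_attempts):
        if is_active(host):
            return host
    return None


# Simulated cluster state (hypothetical IDs): rm2 is currently active.
states = {"rm1": "standby", "rm2": "active", "rm3": "standby"}
probe = lambda host: states[host] == "active"
current = find_active_rm(list(states), probe)

# Simulate a failover: rm2 goes down and rm3 becomes active.
states["rm2"] = "down"
states["rm3"] = "active"
after_failover = find_active_rm(list(states), probe)
```

A real client would probe over the network and typically wait between attempts; the bounded attempt count here only keeps the example finite.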

Standby ResourceManager nodes automatically redirect web requests, including REST API requests, to the active ResourceManager node.

The difference between manual and automatic failover is how the transition from standby to active occurs for the ResourceManager process.

With manual failover, you trigger the transition of the ResourceManager from standby to active by running the yarn rmadmin command.
With automatic failover, the ResourceManager processes have an embedded ZooKeeper-based ActiveStandbyElector, which chooses the active ResourceManager. This ActiveStandbyElector also detects failures in the currently active ResourceManager and automatically transitions one of the standby ResourceManagers to an active state.
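
To make the automatic case more concrete, the following Python sketch simulates lock-based leader election of the kind an active/standby elector performs. It is a simplified in-memory model under stated assumptions, not the ActiveStandbyElector code: the InMemoryLock class stands in for a lock held in ZooKeeper, and run_election, the health map, and the rm1/rm2/rm3 identifiers are hypothetical names introduced for illustration.

```python
from typing import Dict, Optional


class InMemoryLock:
    """Stand-in for a ZooKeeper-held lock: at most one holder at a time."""

    def __init__(self) -> None:
        self.holder: Optional[str] = None

    def try_acquire(self, candidate: str) -> bool:
        # The first candidate to ask while the lock is free becomes the holder.
        if self.holder is None:
            self.holder = candidate
        return self.holder == candidate

    def release(self, candidate: str) -> None:
        if self.holder == candidate:
            self.holder = None


def run_election(lock: InMemoryLock, healthy: Dict[str, bool]) -> Optional[str]:
    """Free the lock if its holder is unhealthy, then let healthy candidates try for it."""
    if lock.holder is not None and not healthy.get(lock.holder, False):
        lock.release(lock.holder)
    for rm_id, ok in healthy.items():
        if ok and lock.try_acquire(rm_id):
            return rm_id  # this ResourceManager is (or stays) active
    return lock.holder  # None when no healthy candidate exists


lock = InMemoryLock()
health = {"rm1": True, "rm2": True, "rm3": True}
first_active = run_election(lock, health)   # initial election
health["rm1"] = False                       # the active ResourceManager fails
second_active = run_election(lock, health)  # a standby takes over
health["rm1"] = True                        # rm1 recovers but rejoins as standby
third_active = run_election(lock, health)
```

The point of the model is that failure detection and promotion need no operator action: the lock is released when its holder fails, and whichever healthy standby acquires it next becomes active.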

If you specify multiple ResourceManagers when you run configure.sh, automatic failover is configured. However, you can edit the yarn-site.xml file to enable manual failover instead.
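
As a rough illustration of the kind of yarn-site.xml change involved, the Python sketch below builds a single Hadoop-style property element. The property name yarn.resourcemanager.ha.automatic-failover.enabled comes from Apache Hadoop YARN; that setting it to false is the complete change needed on a Data Fabric cluster is an assumption, and the ha_property helper is hypothetical. Follow the product's own procedure for the authoritative steps.

```python
import xml.etree.ElementTree as ET


def ha_property(name: str, value: str) -> ET.Element:
    """Build one Hadoop-style <property> element with <name> and <value> children."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return prop


# Assumption: the Apache Hadoop YARN property name applies unchanged here.
manual_failover = ha_property(
    "yarn.resourcemanager.ha.automatic-failover.enabled", "false"
)
fragment = ET.tostring(manual_failover, encoding="unicode")
print(fragment)
```

The printed element would sit inside the configuration root element of yarn-site.xml alongside the other ResourceManager settings.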