HPE Ezmeral Data Fabric 6.2 is In Maintenance and transitions to "End of Maintenance" in June 2024. Please see the latest documentation.

Preparing to Run configure.sh

Before you run configure.sh, collect the information that you need to run the script based on your requirements.

The configure.sh script can configure a node for the first time or update an existing node configuration, so it accepts many options. Collect the following information before you run it.

Note the hostnames of the CLDB and ZooKeeper nodes. Optionally, you can specify the ports for the CLDB and ZooKeeper nodes as well. The default CLDB port is 7222. The default ZooKeeper port is 5181.
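As a minimal sketch of how the noted hostnames and ports might be assembled, the following dry run builds comma-separated host:port lists and prints a command line instead of executing it. The hostnames and cluster name are hypothetical. The -C, -Z, and -N options and the /opt/mapr/server path are not named in the text above; they are assumptions to verify against the configure.sh reference for your release.

```shell
#!/bin/sh
# Hypothetical hostnames; replace with the nodes noted for your cluster.
cldb_hosts="cldb1.example.com cldb2.example.com"
zk_hosts="zk1.example.com zk2.example.com zk3.example.com"

# Join hosts into a comma-separated host:port list.
# The function body runs in a subshell, so its variables stay local.
join_with_port() (
  port="$1"
  shift
  out=""
  for host in "$@"; do
    out="${out:+$out,}${host}:${port}"
  done
  printf '%s\n' "$out"
)

# 7222 and 5181 are the default CLDB and ZooKeeper ports.
cldb_arg="$(join_with_port 7222 $cldb_hosts)"
zk_arg="$(join_with_port 5181 $zk_hosts)"

# Dry run: print the command rather than running it.
# -C, -Z, -N and the script path are assumptions; check the configure.sh reference.
echo "/opt/mapr/server/configure.sh -N my.cluster.com -C $cldb_arg -Z $zk_arg"
```

Because the ports shown are the defaults, the port suffixes could be left out; spelling them out only matters when a cluster uses non-default ports.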
If a node in the cluster runs the HistoryServer, note the hostname for the HistoryServer. The HistoryServer node must be specified by using the -HS parameter.
If one or more nodes in the cluster run the ResourceManager, note the hostname or IP address of each ResourceManager node. Depending on the version that you install and your ResourceManager high availability requirements, you may need to specify the ResourceManager nodes by using the -RM parameter. High availability for the ResourceManager is configured by default and does not need to be specified.
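The optional -HS and -RM parameters could be appended only when the corresponding hostnames were noted, as in this dry-run sketch. The hostname is hypothetical, the comma-separated format for -RM is an assumption, and the angle-bracket placeholders stand in for values that you supply.

```shell
#!/bin/sh
# Hypothetical values; leave a variable empty if the service is not used
# or if you rely on the default ResourceManager high availability setup.
history_server="hs1.example.com"
resource_managers=""   # for example "rm1.example.com,rm2.example.com" (assumed format)

opts=""
if [ -n "$history_server" ]; then
  opts="${opts:+$opts }-HS $history_server"
fi
if [ -n "$resource_managers" ]; then
  opts="${opts:+$opts }-RM $resource_managers"
fi

# Dry run: print the command rather than running it.
echo "configure.sh -C <cldb-nodes> -Z <zk-nodes> $opts"
```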
If mapr-fileserver is installed on a node, you can use configure.sh with the -F option to format the disks and set up partitions. The -F option specifies a text file, which you create, that lists the disks and partitions for use by the filesystem on the node. configure.sh passes the file to the disksetup utility. Each line lists either a single disk or all applicable partitions on a single disk. When listing multiple partitions on a line, separate the partitions with spaces. For example:
```
/dev/sdb
/dev/sdc1 /dev/sdc2 /dev/sdc4
/dev/sdd
```
Alternatively, you can run disksetup manually after you run configure.sh. See Configuring Storage.
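A disk list file like the one shown above could be written and given a quick sanity check before it is passed to -F. This is a sketch under stated assumptions: the file location is hypothetical, and the check that every entry starts with /dev/ is an illustrative guard, not a rule that disksetup is known to enforce.

```shell
#!/bin/sh
# Hypothetical location for the disk list file.
disk_file="${TMPDIR:-/tmp}/disks.txt"

# One disk per line; partitions of the same disk share a line, separated by spaces.
cat > "$disk_file" <<'EOF'
/dev/sdb
/dev/sdc1 /dev/sdc2 /dev/sdc4
/dev/sdd
EOF

# Count the listed disks and flag any entry that is not a /dev path.
disk_count=0
bad_entries=0
while read -r line; do
  [ -z "$line" ] && continue
  disk_count=$((disk_count + 1))
  for entry in $line; do
    case "$entry" in
      /dev/*) ;;
      *) bad_entries=$((bad_entries + 1)) ;;
    esac
  done
done < "$disk_file"

echo "disks: $disk_count, invalid entries: $bad_entries"

# Dry run: print the command rather than running it.
echo "configure.sh -C <cldb-nodes> -Z <zk-nodes> -F $disk_file"
```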
For a cluster node that is on a VM, use the --isvm parameter when you run configure.sh, so that the script uses less memory.
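One way to decide whether to add --isvm is to ask the operating system whether the node is virtualized. The sketch below assumes that the systemd-detect-virt utility is available on the node, which is not something the text above states; if the utility is missing, the option is simply left out and you would set it by hand.

```shell
#!/bin/sh
# Add --isvm only when systemd-detect-virt reports hardware virtualization.
# Assumption: the node runs systemd; otherwise vm_opt stays empty.
vm_opt=""
if command -v systemd-detect-virt >/dev/null 2>&1 \
   && systemd-detect-virt --quiet --vm; then
  vm_opt="--isvm"
fi

# Dry run: print the command rather than running it.
echo "configure.sh -C <cldb-nodes> -Z <zk-nodes> $vm_opt"
```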