About Release 7.0
This site contains documentation for HPE Ezmeral Data Fabric release 7.0, including installation, configuration, administration, and reference content, as well as content for the associated ecosystem components and drivers.

Service Layout in a Cluster

Provides an overview of segregating services on different nodes.

How you assign services to nodes depends on the scale of your cluster and the data-fabric license level. For a single-node cluster – which must not be used in a production environment (see Minimum Cluster Size) – no decisions are involved. All of the services you are using run on the single node.

On medium clusters, assign the CLDB and ZooKeeper services to separate nodes to optimize performance. On large clusters, running these services on separate nodes is required for good cluster performance.

The cluster is flexible and elastic. Nodes play different roles over the lifecycle of a cluster. The basic requirements are the same for management nodes and data nodes.

As the cluster grows, it becomes advantageous to locate control services (such as ZooKeeper and CLDB) on nodes that do not run compute services. The Data Fabric Converged Community Edition does not include HA capabilities, which restricts the number of instances that certain services can run. The number of nodes and the services they run evolve over the lifecycle of the cluster.
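
The idea of moving control services off compute nodes as the cluster grows can be sketched as a small planning helper. This is only an illustration, not a data-fabric tool or API: the function `plan_layout`, its parameters `control_count` and `dedicated`, and the service labels are all hypothetical names, and the number of control nodes appropriate for a real cluster is a planning decision that this sketch does not make for you.

```python
from typing import Dict, List

# Placeholder labels for illustration only; these are not package names.
CONTROL_SERVICES = ["zookeeper", "cldb"]
COMPUTE_SERVICES = ["compute"]


def plan_layout(nodes: List[str], control_count: int, dedicated: bool) -> Dict[str, List[str]]:
    """Return a mapping of node name to the services planned for it.

    Assumptions (not taken from the product documentation):
    - the first `control_count` nodes host the control services;
    - when `dedicated` is True, those nodes run no compute services.
    """
    if not 1 <= control_count <= len(nodes):
        raise ValueError("control_count must be between 1 and the number of nodes")
    if dedicated and control_count == len(nodes):
        raise ValueError("dedicated control nodes would leave no node for compute")

    layout: Dict[str, List[str]] = {}
    for index, node in enumerate(nodes):
        is_control = index < control_count
        services: List[str] = []
        if is_control:
            services.extend(CONTROL_SERVICES)
        if not (is_control and dedicated):
            services.extend(COMPUTE_SERVICES)
        layout[node] = services
    return layout


if __name__ == "__main__":
    # Single-node demo cluster: every service runs on the one node.
    print(plan_layout(["node1"], control_count=1, dedicated=False))
    # Larger cluster: control services kept off the compute nodes.
    print(plan_layout(["node1", "node2", "node3", "node4"], control_count=1, dedicated=True))
```

In the single-node case the helper places everything on one node, which matches the "no decisions are involved" situation described earlier. With `dedicated=True` it separates control services from compute services, mirroring the layout that becomes advantageous as the cluster grows.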

To provide a high-availability, high-performance cluster, the data-fabric software architecture allows virtually any service to run on any node or nodes. The following guidelines help you to plan your cluster service layout.

NOTE It is possible to install data-fabric software on a one- or two-node demo cluster. Production clusters can harness hundreds of nodes, but five- or ten-node production clusters are appropriate for some applications.