Accessing Data with NFS v3

Describes how Data Fabric works with NFS v3.

Unlike other Hadoop distributions that only allow cluster data import or export as a batch operation, Data Fabric lets you mount the cluster itself using NFS for the HPE Ezmeral Data Fabric so that your applications can read and write data directly. Data Fabric allows direct file modification and multiple concurrent reads and writes using POSIX semantics. With an NFS-mounted cluster, you can read and write data directly with standard tools, applications, and scripts. For example, you could run a MapReduce application that outputs to a CSV file, then import the CSV file directly into SQL over the NFS mount.
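
As a minimal sketch of that last workflow, assume the cluster is mounted at /mapr/my.cluster.com, the job wrote its output to /user/alice/out/part-00000.csv, and a MySQL database named mydb with a table named results already exists with local_infile enabled (all of these names are hypothetical):

    # Inspect the job output directly through the NFS mount; no copy step is needed.
    head -1 /mapr/my.cluster.com/user/alice/out/part-00000.csv

    # Load the same file into MySQL straight from the mounted path.
    mysql --local-infile=1 -e "LOAD DATA LOCAL INFILE '/mapr/my.cluster.com/user/alice/out/part-00000.csv' INTO TABLE results FIELDS TERMINATED BY ','" mydb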

Data Fabric exports each cluster as the directory /mapr/<cluster name> (for example, /mapr/my.cluster.com). If you create a mount point with the local path /mapr, then Hadoop FS paths and NFS v3 paths to the cluster will be the same. This makes it easy to work on the same files using NFS v3 and Hadoop. In a multi-cluster setting, the clusters share a single namespace, and you can see them all by mounting the top-level /mapr directory.
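
For example, assuming a cluster named my.cluster.com mounted at the local path /mapr and a hypothetical directory /user/alice, the same files are reachable through either interface:

    # Through the Hadoop FS shell:
    hadoop fs -ls /user/alice

    # Through the NFS v3 mount, using the same relative path under the cluster directory:
    ls /mapr/my.cluster.com/user/alice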

WARNING Data Fabric uses version 3 of the NFS protocol. NFS version 4 bypasses the port mapper and attempts to connect only to the default port. If you are running NFS on a non-standard port, mounts from NFS version 4 clients time out. Use the -o nfsvers=3 option to specify NFS v3.
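
For instance, on a Linux client a mount command that forces NFS v3 might look like the following sketch; the node name nfsserver and the local mount point /mapr are placeholders for your environment:

    # Request protocol version 3 explicitly so the client does not attempt an NFS v4 mount.
    sudo mount -t nfs -o nfsvers=3,nolock nfsserver:/mapr /mapr
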
CAUTION NFS v3 clients can cache stale atime values from earlier reads, so you might observe incorrect atime values. To mitigate this, clear the client caches before checking file timestamps.
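
On a Linux client, one way to clear the cached attributes before checking timestamps is to drop the kernel caches; this is a general-purpose flush that requires root, not a Data Fabric-specific command:

    # Flush dirty pages, then drop the page, dentry, and inode caches,
    # which also discards cached NFS file attributes such as atime.
    sync
    echo 3 | sudo tee /proc/sys/vm/drop_caches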

You can mount the cluster on a Linux, Mac, or Windows client. Before you begin, make sure you know the hostname and directory of the NFS v3 share you plan to mount.
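
From a Linux client, you can verify both pieces of information by querying the server's export list; nfsserver is a placeholder for the hostname of your NFS node:

    # List the directories exported by the NFS v3 server.
    showmount -e nfsserver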