Accessing Data with NFS v3

Describes how MapR works with NFS v3.

Unlike other Hadoop distributions that only allow cluster data import or export as a batch operation, MapR lets you mount the cluster itself using NFS so that your applications can read and write data directly. MapR allows direct file modification and multiple concurrent reads and writes using POSIX semantics. With an NFS-mounted cluster, you can read and write data directly with standard tools, applications, and scripts. For example, you could run a MapReduce application that outputs to a CSV file, then import the CSV file directly into a SQL database using NFS.
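
As a minimal sketch, assuming the cluster is mounted at /mapr and named my.cluster.com (the /user/alice/reports directory and file names are hypothetical), ordinary shell commands can read and write cluster files in place:

    # Copy a local CSV into the cluster with no hadoop fs command
    cp ./sales.csv /mapr/my.cluster.com/user/alice/reports/sales.csv

    # Append to the file in place, which POSIX semantics allow
    echo "2024-01-31,totals" >> /mapr/my.cluster.com/user/alice/reports/sales.csv

    # Read MapReduce output directly, for example before loading it into a database
    head /mapr/my.cluster.com/user/alice/reports/part-00000.csv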

MapR exports each cluster as the directory /mapr/<cluster name> (for example, /mapr/my.cluster.com). If you create a mount point with the local path /mapr, then Hadoop FS paths and NFS v3 paths to the cluster will be the same. This makes it easy to work on the same files using NFS v3 and Hadoop. In a multi-cluster setting, the clusters share a single namespace, and you can see them all by mounting the top-level /mapr directory.
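
To illustrate this path equivalence, assuming a mount point at /mapr and the example cluster name my.cluster.com (the /user/alice directory is hypothetical), the same directory can be listed both ways:

    # Hadoop FS view of the directory
    hadoop fs -ls /user/alice

    # The same contents through the NFS v3 mount
    ls /mapr/my.cluster.com/user/alice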

WARNING MapR uses version 3 of the NFS protocol. NFS version 4 bypasses the port mapper and attempts to connect to the default port only. If you are running NFS on a non-standard port, mounts from NFS version 4 clients time out. Use the -o nfsvers=3 option to specify NFS v3.
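
For example, an explicit NFS v3 mount from a Linux client might look like the following sketch; the node name usa-node01 and the local mount point /mapr are placeholders for your own values:

    # Force NFS v3 so the client does not attempt an NFS v4 connection
    sudo mount -o nfsvers=3 usa-node01:/mapr /mapr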

You can mount the cluster on a Linux, Mac, or Windows client. Before you begin, make sure you know the hostname and directory of the NFS v3 share you plan to mount.
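
One way to confirm the hostname and exported directory from a Linux client is the standard showmount utility; the node name usa-node01 below is a placeholder:

    # List the directories exported by the cluster's NFS v3 server
    showmount -e usa-node01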