Setting Up Disks for MapR

This section describes how to set up disks during the normal installation process. Go to the disksetup command page for information about other uses of this command.

MapR formats and uses disks for the Lockless Storage Services layer (MapR File System), and records these disks in the disktab file. In a production environment, or when testing performance, MapR should be configured to use physical hard drives and partitions. In some cases, it is necessary to reinstall the operating system on a node so that the physical hard drives are available for direct use by MapR. Reinstalling the operating system provides an unrestricted opportunity to configure the hard drives. If the installation procedure assigns hard drives to be managed by the Linux Logical Volume Manager (LVM) by default, you should explicitly remove the drives you plan to use with MapR from the LVM configuration. It is common to let LVM manage one physical drive containing the operating system partition(s) and to leave the rest unmanaged by LVM for use with MapR.
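As an illustration of separating LVM-managed drives from those left for MapR, the following minimal Python sketch filters a list of drives against a list of LVM physical volumes. The function name `unmanaged_disks` and the device names are hypothetical, and both lists are supplied by hand here; on a real node you would gather them with your usual system tools and confirm the result before formatting anything.

```python
import re


def unmanaged_disks(all_disks, lvm_physical_volumes):
    """Return the disks that have no device or partition under LVM.

    Both arguments are plain lists of device paths (hypothetical inputs
    for this sketch; nothing is read from the system).
    """
    candidates = []
    for disk in all_disks:
        # A physical volume sits on this disk if it is the disk itself or
        # one of its partitions, e.g. /dev/sda2 or /dev/nvme0n1p2.
        on_disk = re.compile(re.escape(disk) + r"(p?\d+)?$")
        if not any(on_disk.match(pv) for pv in lvm_physical_volumes):
            candidates.append(disk)
    return candidates


# Example values, assumed for illustration only.
drives = ["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"]
lvm_pvs = ["/dev/sda2"]  # the operating system drive is managed by LVM

for disk in unmanaged_disks(drives, lvm_pvs):
    print(disk)
```

In this example the drive holding the operating system partition is excluded, and the remaining three drives are listed as candidates for MapR.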
NOTE It is not necessary to set up RAID (Redundant Array of Independent Disks) on disks used by MapR File System. MapR uses the disksetup script to set up storage pools. In most cases, you should let MapR calculate storage pools using the default stripe width of two or three disks. If you anticipate a high volume of random-access I/O, you can use the -W option with disksetup to specify larger storage pools of up to 8 disks each.
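To make the effect of the stripe width concrete, the following Python sketch shows how a set of disks might be divided into storage pools for a given width. It assumes simple sequential grouping with any leftover disks forming a smaller final pool; the grouping that disksetup actually performs may differ. The function name `plan_storage_pools`, the default of 3, and the device names are hypothetical.

```python
def plan_storage_pools(disks, stripe_width=3):
    """Split disks into consecutive groups of at most stripe_width disks.

    This is an illustrative model only, not the disksetup algorithm.
    The upper limit of 8 reflects the maximum pool size given for -W.
    """
    if not 1 <= stripe_width <= 8:
        raise ValueError("stripe width must be between 1 and 8")
    return [
        disks[i:i + stripe_width]
        for i in range(0, len(disks), stripe_width)
    ]


# Seven example disks, assumed for illustration only.
disks = ["/dev/sd" + letter for letter in "bcdefgh"]

for width in (3, 8):
    pools = plan_storage_pools(disks, stripe_width=width)
    print("width", width, "->", [len(pool) for pool in pools])
```

Under this model, a wider stripe yields fewer, larger pools: seven disks form three pools at a width of 3 and a single pool at a width of 8.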
NOTICE For more information on setting up disks, see Drive Configuration.
The following procedures are intended for use on physical clusters or Amazon EC2 instances. On EC2 instances, EBS volumes can be used as MapR storage, although performance will be slow.
NOTE If you are using MapR on Amazon EMR, you do not have to use this procedure; the disks are set up for you automatically.