Creating a Volume

Describes how to create a volume using the Control System, CLI and the REST API.

You can create a new (Standard or Mirror) volume using the Control System, the CLI, and the REST API.

Creating a Volume Using the Control System

To create a new (Standard or Mirror) volume using the Control System:

  1. Go to the Data > Volumes page and click Create Volume to display the Create New Volume page.
    Note: When running on a Kubernetes cluster, the Create Volume option is on the Volumes page.
  2. Choose the Volume Type in the Properties section. Choose:
    • Standard Volume to create a read-write volume.
    • Mirror Volume to create a volume that is a read-only copy of an existing volume.
      Tip: See also: Mirror Types.
  3. Specify the following required settings in the Properties section:
    Volume Name Enter a name for the volume.
    The name should contain only the following characters:
    A-Z a-z 0-9 _ - .
    Note:
    • The volume name should not begin with mapr. because mapr. is used for system volumes. If you use mapr. at the start of the volume name, the volume may not display in the default view of the list of volumes in the Control System; you must select the Include System Volumes checkbox in the Volumes pane to view volumes with names beginning with mapr.
    • For tiering-enabled volumes, the volume name should not exceed 98 characters.
    Accountable Entity Specifies a user or group whose use of a volume can be subject to quotas. You can set or modify quotas that limit the space used by all the volumes owned by an accountable entity.
    Source Cluster Name Enter the name of the cluster on which the source volume exists.
    The name should contain only the following characters:
    A-Z a-z 0-9 _ - .
    Mirroring only works between two secure clusters or between two non-secure clusters. Mirroring does not work when one cluster is secure and the other is non-secure.
    Note: When you set up mirroring between clusters, servers in one cluster cannot use the same IP addresses as servers in the other cluster; otherwise, the mirroring operation cannot run successfully. For example, if node A in cluster A has a private IP address of 10.10.20.29, no server in cluster B can have the same private IP address. Also, all the servers in the destination cluster must be able to reach all the servers in the source cluster and vice versa. For example, if 10.10.20.29 is the only IP address used by node A in cluster A, all servers in cluster B must be able to reach the IP address 10.10.20.29.
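The IP address requirement in the note above can be checked before you set up mirroring. The following is a minimal sketch, assuming you have already collected each cluster's server IP addresses into two text files with one address per line. The file names and the find_shared_ips helper are hypothetical and are not produced by any product tool. The sketch checks only for duplicate addresses, not for reachability between the clusters.

```shell
#!/bin/sh
# Sketch: report IP addresses that appear in both clusters' server lists.
# Assumption: each input file holds one IP address per line.

find_shared_ips() {
  # grep -F: fixed strings, -x: whole-line match, -f: read patterns from file.
  # Prints every line of "$2" that also appears in "$1".
  grep -Fx -f "$1" "$2" | sort -u
}

# Example usage with two small sample files (hypothetical data).
tmpdir="$(mktemp -d)"
printf '10.10.20.29\n10.10.20.30\n' > "$tmpdir/cluster_a_ips.txt"
printf '10.10.30.11\n10.10.20.29\n' > "$tmpdir/cluster_b_ips.txt"

shared="$(find_shared_ips "$tmpdir/cluster_a_ips.txt" "$tmpdir/cluster_b_ips.txt")"
if [ -n "$shared" ]; then
  echo "Conflicting IP addresses: $shared"
else
  echo "No shared IP addresses found."
fi
rm -rf "$tmpdir"
```

With the sample data, the script reports 10.10.20.29 as a conflict, which matches the example in the note.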
    Source Volume Name Enter the name of the source volume, from which the mirror volume pulls data (after selecting the source volume cluster).
    The name should contain only the following characters:
    A-Z a-z 0-9 _ - .
    If the source volume is on:
    • The same cluster, you create a local mirror volume, which is useful for load balancing or for providing a read-only copy of a data set. See Local Mirroring for more information.
    • Another cluster, you create a remote mirror volume, which is useful for offsite backup, for data transfer to remote facilities, and for load and latency balancing. See Creating Remote Mirrors for more information.
    Note: If you plan to enable tiering for the mirror volume, ensure that the selected source volume is also tiering-enabled. You cannot create tiering-enabled mirror volumes to mirror data in standard volumes that are not tiering-enabled.
    For information on setting up mirror cascades, see Mirror Cascades.
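The volume naming rules in this step can be checked in a script before you create a volume. The following is a minimal sketch. The is_valid_volume_name helper is hypothetical and is not part of the Control System or maprcli, and it treats the "should not begin with mapr." recommendation as a failure.

```shell
#!/bin/sh
# Sketch: check a proposed volume name against the naming rules in this step.
# is_valid_volume_name is a hypothetical helper for illustration only.

is_valid_volume_name() {
  name="$1"
  tiering="${2:-false}"   # pass "true" for tiering-enabled volumes

  # Reject empty names and names with characters outside A-Z a-z 0-9 _ - .
  case "$name" in
    ''|*[!A-Za-z0-9_.-]*) return 1 ;;
  esac
  # The "mapr." prefix is used for system volumes, so reject it here.
  case "$name" in
    mapr.*) return 1 ;;
  esac
  # Tiering-enabled volume names should not exceed 98 characters.
  if [ "$tiering" = "true" ] && [ "${#name}" -gt 98 ]; then
    return 1
  fi
  return 0
}

# Example usage with hypothetical names.
for candidate in "sales_2024.raw" "mapr.custom" "bad name"; do
  if is_valid_volume_name "$candidate"; then
    echo "$candidate: ok"
  else
    echo "$candidate: rejected"
  fi
done
```

Pass true as the second argument to also apply the 98-character limit for tiering-enabled volumes.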
The steps marked (Optional) below allow you to define optional volume properties and optional settings for auditing, replication, data tiering, volume access, and volume administration. If you do not define these settings, default values, where available, are used. You can skip to:
  • (Optional) The step for associating a snapshot schedule and/or an offload schedule with the volume.
  • The final step, Create Volume, to create the volume with basic settings.
  1. (Optional) Specify the following general settings under Properties:
    Mount Specifies whether to automatically mount (Yes) or not mount (No) the volume after creation. By default, volumes are mounted immediately after creation. If this is set to Yes, you must also specify the mount path.
    Mount Path The path to mount the volume. This is required if the value for Mount is Yes.
    Note: The path must be relative to / and cannot be in the form of a global namespace path (for example, /mapr/<cluster-name>/).
    Collect Metrics Specifies whether (Yes) or not (No) to enable metrics collection for this volume. For more information, see Collecting Volume Metrics and Enabling Volume Metric Collection.
    Volume Access Specifies whether the volume is read-only or a read/write volume. By default, a standard volume is created with read/write access. A mirror volume can only be a read-only volume.
    Last Access Interval Denotes the frequency at which the access time of a file is updated. See Tuning Last Access Time for more information.
  2. (Optional) Specify the following settings for data replication in the Replication and Storage section:
    Topology Specifies the rack path to the volume. The default topology is /data.
    Optimize Replication for Specifies the basis for the replication factor:
    • High throughput, or cascading replication, where volumes are replicated sequentially on intermediate and tail containers.
    • Low latency, or star replication, where volumes are replicated on multiple containers in parallel.
    The default value is high throughput. See Selecting a Replication Type for High Availability.
    Guarantee Min Replication Specifies whether (Yes) or not (No) to enforce the minimum number of copies. If this is enabled (Yes), writes succeed only when the minimum number of copies exists. If the minimum number of copies is not available, the client is asked to retry.

    For more information, see Understanding Replication.

    Replication Specifies the minimum (Minimum Replication) and desired (Target Replication) number of copies of the volume data. The default minimum is 2, and the default target is 3.
    Name Container Replication Specifies the minimum (Minimum Replication) and desired (Target Replication) number of copies of the name container associated with the volume. The default minimum is 2, and the default target is 3.
    Data Tiering Specifies whether to enable (Yes) or disable (No) data tiering for volume data.

    By default, data tiering is enabled and volume data is stored in the hot tier (data-fabric cluster). If you choose to enable data tiering for the volume, you can associate a tier type with the volume either now, or later by editing the volume. If you decide to associate a type of tier with the volume, proceed to the next step; otherwise, skip to the Label-Based Storage step.

  3. (Optional) Associate a type of tier with the volume by selecting a tiering type from the Tiering Type drop-down list and specifying the following settings for the tier:
    For offloading data to an erasure coded volume, specify values for the following properties. If values are not specified, default values are applied.
    Topology The topology of the erasure coded volume. Select the topology from the drop-down list.
    Storage Policy The rule for offloading data in this volume. You can click:
    • Browse to select an existing rule.
    • Create to create a new rule for offloading data. See steps 3 - 5 in the Creating a Storage Tier Policy topic for more information.
    If you do not select a storage policy, the default policy named default.ectier.rule, which is all files (p), is associated with the volume.
    Scheme The erasure coding scheme, which is the number of data chunks and the number of parity chunks. Set the required Parity Scheme. The system indicates in real time whether or not the parity scheme is valid. Some valid parity schemes include:
    • 3+2 — for 3 data chunks and 2 parity chunks. You must have 5 or more nodes on the cluster for this option. If selected, this scheme has approximately 67% storage overhead (2 parity chunks for every 3 data chunks) and can tolerate failure of up to 2 nodes.
    • 4+2 — for 4 data chunks and 2 parity chunks. You must have 6 or more nodes on the cluster for this option. If selected, this scheme has 50% storage overhead and can tolerate failure of up to 2 nodes.
    • 5+2 — for 5 data chunks and 2 parity chunks. You must have 7 or more nodes on the cluster for this option. If selected, this scheme has 40% storage overhead and can tolerate failure of up to 2 nodes.
    • 6+3 — for 6 data chunks and 3 parity chunks. You must have 9 or more nodes on the cluster for this option. If selected, this scheme has 50% storage overhead and can tolerate failure of up to 3 nodes.
    To use local parity, set the Local Parity slider to Yes. The system then displays a third slider to set the number of local parity blocks.

    As you set the parity scheme, irrespective of local or not, the system indicates the number of failures that the parity scheme can tolerate, the storage overhead required, and the number of nodes required to implement the parity scheme.

    Note: Although you can create a volume even if the required number of nodes is not present, the offload operation fails if the required number of nodes is not present.
    See Erasure Coding Scheme for Data Protection and Recovery for more information.
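In each scheme listed above, the number of nodes required equals the number of data chunks plus the number of parity chunks, and the number of node failures tolerated equals the number of parity chunks. The following is a minimal sketch of that arithmetic. The describe_ec_scheme helper is hypothetical, it ignores local parity, and it does not decide which schemes the system accepts as valid.

```shell
#!/bin/sh
# Sketch: derive node count and failure tolerance from a "data+parity" scheme.
# describe_ec_scheme is a hypothetical helper; local parity is not modeled.

describe_ec_scheme() {
  scheme="$1"
  # The scheme must contain a "+" separating data and parity chunk counts.
  case "$scheme" in
    *+*) ;;
    *) echo "invalid scheme: $scheme" >&2; return 1 ;;
  esac
  data="${scheme%%+*}"     # text before the first "+"
  parity="${scheme#*+}"    # text after the first "+"
  # Both parts must be positive integers without leading zeros.
  for part in "$data" "$parity"; do
    case "$part" in
      ''|*[!0-9]*|0*) echo "invalid scheme: $scheme" >&2; return 1 ;;
    esac
  done
  echo "$scheme: needs at least $((data + parity)) nodes, tolerates $parity node failures"
}

# Example usage with the schemes listed above.
for s in 3+2 4+2 5+2 6+3; do
  describe_ec_scheme "$s"
done
```

The Control System remains the authority on whether a scheme is valid; this sketch only reproduces the node and failure counts stated in the list.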
    For offloading data to a low cost storage alternative on the cloud, specify values for the following properties.
    Storage Policy The rule for offloading data in this volume. You can click:
    Remote Target The location to which the data is offloaded. You can click:
    • Browse to select an existing tier.
    • Create to create a new tier. See Creating a Storage Tier for more information.
    Retention Duration after Recall The number of days to retain data recalled from the tier to the MapR cluster. Once the number of days is reached, recalled data on the MapR cluster is purged (if there are no changes), or offloaded (if there are changes).
    Tier Encryption Specifies whether (Yes) or not (No) to enable encryption of data on the tier. This cannot be modified once it is set. The default value is No (disabled).
  4. (Optional) To use Label-Based Storage, enter the label and the namespace label. See Using Storage Labels for more information.
  5. (Optional) Configure security for volume data by setting values for the properties in the Security section.
    1. Enter the name of the security policy in the SECURITY POLICIES field to search for the security policy to associate with the volume.
    2. Enable (Yes) or disable (No) the following audit and encryption settings by selecting the desired option:
      Auditing Auditing of operations. You can either audit particular files or directories (By File or Directory) or audit all files and directories on the volume (All Volume Content). In either case, you can do the following:
      • Choose either Default or Custom to specify the list of directory, file, table, and streams operations to audit.
      • Specify a Coalesce Interval, which is the interval of time during which READ, WRITE, or GETATTR operations on one file from one IP address or UID are logged only once for a particular operation, if auditing is enabled. The default value is 60 minutes.
      Data on Wire Encryption Encryption of data in the volume during transmission. By default, this is enabled (Yes) for all new volumes in a secure cluster. This is not supported on non-secure clusters.
      Data at Rest Encryption Encryption of data at rest. This should be enabled only if the feature is enabled at the cluster-level. By default, this is disabled (No). This is not supported on insecure clusters.
      Coalesce Interval The interval of time (in minutes) to use when logging multiple READ, WRITE, or GETATTR operations on one file from one client IP address, if auditing is enabled. The default value is 60 minutes.
  6. (Optional) Specify the users, groups, and/or roles that have and/or do not have permissions to read and/or write in the Data Access Control section:
    1. Click Add Data Access Control to display the Add Access Permissions window.
    2. Move the slider associated with Public to Yes to grant access to all, or to No to specify a list of users, groups, and/or roles.
    3. Click Add to select permissions for the specified users, groups, and/or roles.
    4. Select the Read and/or Write checkbox in the Permissions column to grant that type of access to all (Public) or the specified users, groups, and/or roles.
    Click:
    • The corresponding icon to modify the users, groups, and/or roles.
    • Add Another to grant permissions for other users, groups, and/or roles, and then repeat steps 2 and 3.
    • The corresponding icon to create a copy of the permissions, which you can then modify.
    • The corresponding icon to remove a data access control setting.
  7. (Optional) Specify the users and groups that have volume administration permissions:
    1. Select the type of entity, user or group, and enter the entity name in the Entities field.
    2. Select the checkbox associated with any of the following permissions to grant the user or group that type of administration control:
      Dump & Backup
      Transport large amounts of data or copies of the volume on physical media to a remote cluster using backup files.
      Restore & Mirror
      Restore a volume from a dump file and create mirror volumes, which are read-only copies of the source volume.
      Edit
      Edit volume properties, create and delete snapshots.
      Delete
      Delete the volume.
      Admin (Access Control)
      View and edit access control settings (but cannot perform volume operations).
      Full Control
      Perform all volume-related administrative operations except changing access control settings.
    To define administrative access control settings for another user or group, click Add Another and repeat steps 1 and 2.
    Note: To perform this action from the command line, refer to acl set.
    By default, the root user and the volume creator have full control permissions on the volume.
  8. Set read (R), write (W), and/or execute (X) permissions on the root directory for users, groups, and others by selecting the permission.
  9. Click Create Volume to create the volume.

Creating a Volume Using the CLI and REST API

To create a volume using the CLI, run the maprcli volume create command. The basic command to create a (Standard) volume is:
/opt/mapr/bin/maprcli volume create -name <volName> -path <mountPath>
The name should contain only the following characters:
A-Z a-z 0-9 _ - .
If you are creating a:
  • Mirror volume, you must specify -type mirror and -source <sourceVolName>@<cluster> in the command.
  • Tiering-enabled volume, you must specify -tieringenable true in the command.
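The options above can be combined with the basic command as sketched below. The script only prints the commands so that you can review them before running them. The volume names, mount paths, and cluster name are placeholders, and build_volume_create_cmd is a hypothetical helper, not part of maprcli.

```shell
#!/bin/sh
# Sketch: print (not run) maprcli commands for the volume types above.
# All volume names, paths, and the cluster name are placeholders.
MAPRCLI=/opt/mapr/bin/maprcli

build_volume_create_cmd() {
  # $1 = volume name, $2 = mount path, remaining arguments = extra options
  vol_name="$1"; mount_path="$2"; shift 2
  set -- "$MAPRCLI" volume create -name "$vol_name" -path "$mount_path" "$@"
  echo "$*"
}

# Standard volume
build_volume_create_cmd sales_data /sales_data
# Mirror volume that pulls data from sales_data on cluster_a
build_volume_create_cmd sales_mirror /sales_mirror -type mirror -source sales_data@cluster_a
# Tiering-enabled volume
build_volume_create_cmd sales_tiered /sales_tiered -tieringenable true
```

Printing the command first is a design choice for this sketch: it lets you confirm the name and options before anything is created on the cluster.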
To create a volume using the REST API, send a request of type POST. For example:
curl -k -X POST 'https://<hostname>:8443/rest/volume/create?name=<volName>&path=<mountPath>' --user mapr:mapr
The name should contain only the following characters:
A-Z a-z 0-9 _ - .
If you are creating a:
  • Mirror volume, you must specify type=mirror and source=<sourceVolName>@<cluster> in the request.
  • Tiering-enabled volume, you must specify tieringenable=true in the request. The tieringenable property of a mirror volume should be the same as the source volume.
For the complete list of parameters, see volume create.
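The request parameters described above can be appended to the basic request URL as sketched below. The host name, volume names, and cluster name are placeholders, and build_volume_create_url is a hypothetical helper. The sketch assumes that the values need no URL encoding; values containing characters such as spaces or & would have to be encoded first.

```shell
#!/bin/sh
# Sketch: build the REST URL for a volume create request and print the
# matching curl command without sending it. All names are placeholders.

build_volume_create_url() {
  # $1 = host, $2 = volume name, $3 = mount path, remaining = key=value pairs
  host="$1"; vol_name="$2"; mount_path="$3"; shift 3
  url="https://${host}:8443/rest/volume/create?name=${vol_name}&path=${mount_path}"
  for param in "$@"; do
    url="${url}&${param}"
  done
  echo "$url"
}

# Mirror volume request
mirror_url="$(build_volume_create_url node1.example.com sales_mirror /sales_mirror \
  type=mirror source=sales_data@cluster_a)"
echo "curl -k -X POST '$mirror_url' --user <user>:<password>"

# Tiering-enabled volume request
tiered_url="$(build_volume_create_url node1.example.com sales_tiered /sales_tiered \
  tieringenable=true)"
echo "curl -k -X POST '$tiered_url' --user <user>:<password>"
```

Note that the -k option tells curl to skip TLS certificate verification, so it is best limited to test environments or clusters whose certificates you cannot otherwise validate.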