Using Storage Labels

Synopsis of the Storage Labels Feature

Different applications have different requirements, such as low latency, high throughput, or low variance in response times. Storage devices with various characteristics, for example SSD over SATA or SSD over NVMe, are available as off-the-shelf storage. It is therefore possible to match the storage to the requirements of each application.

The Storage Labels feature allows you to store particular kinds of data on particular classes of devices. For example, you could place data that must be fetched with low latency on Solid State Drives (SSD), while placing data that can tolerate higher latency on Hard Disk Drives (HDD). Both classes of devices can be on the same node.

Use Cases for Storage Labels

Example use cases include:

Latency-sensitive data can be placed on SSD, while data that is very rarely read can be placed on HDD.
Active (warm) data can be placed on SSD while rarely used (cold) and archived data can be placed on HDD.
Local volumes for MapReduce tasks can be placed on SSD.
Data can be securely segregated. For example, all finance data can be placed on volumes marked finance, engineering data on volumes marked engineering, and HR data on volumes marked HR.

Components of the Storage Labels Feature

Storage Labels help you confine volumes to specific storage pools, to meet your desired objectives (such as low latency).

The Storage Labels feature has three main components: Labels, Storage Pools and Volumes.

What is a Label?

The Label acts as the bridge between a Storage Pool and a Volume.

The Label is a tag that associates a Volume with a Storage Pool. For example, a volume marked SSD will always be placed on a Storage Pool marked with the name SSD.

Labels have the following characteristics:

Are a string of printable ASCII characters that should begin with a letter or a number in most cases, except when they represent device classes. For a device class, the label may contain properties that are enforced when the label is assigned to a Storage Pool or a Volume. For example, label SSD could have the following properties:
```
{
    num_disks_per_instance;  # Used in partitioning disks among multiple instances
                             # of a file server on the same node.
    max_active_io_per_disk;  # Number of concurrent I/O operations to be used on the disk.
}
```
Are case-insensitive.
Have an internal numeric value attached to each of them.
Have to be registered with the CLDB before use.
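
The characteristics above can be modeled with a small Python sketch. It is only an illustration and not the CLDB implementation: the `LabelRegistry` class, its method names, the key `"default"` for the default label, and the sequential numbering of IDs are all assumptions. The source states only that labels are case-insensitive, carry an internal numeric value, must be registered before use, and that the default label has the value 0. Device-class labels and their properties are not modeled.

```python
class LabelRegistry:
    """Hypothetical model of label registration; not the CLDB implementation."""

    def __init__(self):
        # The default label has the numeric value 0 (from the documentation).
        # Using the string "default" as its key is an assumption.
        self._ids = {"default": 0}

    def register(self, name):
        """Register a label and return its (assumed) numeric ID."""
        # Assumed validation: printable ASCII, starting with a letter or digit.
        valid = (
            bool(name)
            and name.isascii()
            and name.isprintable()
            and name[0].isalnum()
        )
        if not valid:
            raise ValueError(f"invalid label: {name!r}")
        key = name.lower()  # labels are case-insensitive
        if key not in self._ids:
            # Assumption: IDs are handed out sequentially.
            self._ids[key] = len(self._ids)
        return self._ids[key]

    def lookup(self, name):
        """Return the numeric ID of a registered label, or raise KeyError."""
        return self._ids[name.lower()]


registry = LabelRegistry()
ssd_id = registry.register("SSD")
# "ssd" and "SSD" refer to the same label, so the ID is unchanged.
same_id = registry.register("ssd")
```

Normalizing to lower case at the boundary keeps every later comparison simple; an unregistered label surfaces as a `KeyError` on lookup.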

What is a Storage Pool?

In the HPE Data Fabric platform, a storage pool (SP) constitutes a logical storage device. One SP can be composed of SSDs, while another SP can be composed of HDDs. A heterogeneous SP that is made up of different classes of devices is not permitted.

However, multiple classes of SPs can reside on the same node.

You can assign a single label to each SP to identify it. An SP that has not been labeled automatically assumes the default label (with an internal numeric value of 0).

What is a Volume?

In the HPE Data Fabric platform, a volume is the logical unit for a class of data. Therefore, storing various types of data on various classes of devices is essentially placing particular volumes on the appropriate classes of devices.

Each volume can have only one label. Volumes that do not have a label assume the default label, which has a numeric value of 0.

The CLDB servers are responsible for keeping track of these labels and moving volumes across storage pools based on these labels.

A volume is placed only on an SP matching its label. A volume labeled SSD is placed only on an SP with the label SSD. Similarly, a volume with the default label is placed only on an SP with the default label.

To override this matching placement, use the special label anywhere for a volume. A volume with the anywhere label is placed on any SP, irrespective of the SP's label. Use this label sparingly, because you might inadvertently store a volume on expensive disks that it does not need.
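
The matching rule, including the anywhere override, can be sketched as follows. This is a simplified model under stated assumptions, not the CLDB's placement logic: the function `eligible_pools`, the dictionary layout, and the use of `None` for "no label assigned" are hypothetical.

```python
DEFAULT = "default"    # assumed name for the default label (numeric value 0)
ANYWHERE = "anywhere"  # special label described in the documentation


def eligible_pools(volume_label, pools):
    """Return the names of storage pools that may hold the volume.

    pools maps an SP name to its label, or to None if the SP is unlabeled.
    """
    wanted = (volume_label or DEFAULT).lower()
    result = []
    for sp_name, sp_label in pools.items():
        label = (sp_label or DEFAULT).lower()
        # An "anywhere" volume matches every SP; otherwise labels must be equal.
        if wanted == ANYWHERE or wanted == label:
            result.append(sp_name)
    return result


pools = {"sp1": "SSD", "sp2": "HDD", "sp3": None}
ssd_only = eligible_pools("ssd", pools)        # only the SSD-labeled pool
unlabeled = eligible_pools(None, pools)        # only the default-labeled pool
everywhere = eligible_pools("anywhere", pools)  # every pool
```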

The Disk Balancer moves volumes to appropriate SPs as needed. For example, assume that a volume with the anywhere label is stored on an SP labeled SSD. A request then comes in to place volumes that are explicitly labeled SSD, but no more SPs labeled SSD are available. The Disk Balancer then moves the anywhere volume off the SSD-labeled SP onto any other SP, to accommodate the volumes explicitly labeled SSD.
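
The example above can be illustrated with a toy model. This sketch is not the Disk Balancer's algorithm: the `make_room` function, the data layout, and measuring capacity as a count of volumes are assumptions made only to show the idea of evicting an anywhere volume from a pool whose label is in demand.

```python
def make_room(pools, target_label):
    """Move one "anywhere" volume off a pool labeled target_label.

    pools maps an SP name to a dict with:
      "label":    the SP label
      "capacity": hypothetical maximum number of volumes
      "volumes":  list of (volume_name, volume_label) tuples
    Returns (volume_name, from_sp, to_sp), or None if no move is possible.
    """
    for src_name, src in pools.items():
        if src["label"] != target_label:
            continue
        for vol in list(src["volumes"]):
            if vol[1] != "anywhere":
                continue  # explicitly labeled volumes stay where they are
            for dst_name, dst in pools.items():
                has_space = len(dst["volumes"]) < dst["capacity"]
                # Only pools with a different label free up the scarce class.
                if dst["label"] != target_label and has_space:
                    src["volumes"].remove(vol)
                    dst["volumes"].append(vol)
                    return (vol[0], src_name, dst_name)
    return None


pools = {
    "sp_ssd": {"label": "ssd", "capacity": 1,
               "volumes": [("scratch", "anywhere")]},
    "sp_hdd": {"label": "hdd", "capacity": 2, "volumes": []},
}
move = make_room(pools, "ssd")  # relocates "scratch" to the HDD pool
```

After the move, the SSD-labeled pool has room for a volume that is explicitly labeled SSD.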

You can assign a label to the namespace container of a volume. If you do not explicitly assign one, the namespace container inherits the label of its data container at the time of volume creation. The namespace label is NOT changed when you move a volume from one label to another.

However, if you assign a label to a namespace container but not to a data container, the data container is assigned the default label.

Once the data container and the namespace container are labeled, their labels are independent of each other.
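
The inheritance rules for the data container and the namespace container can be summarized in a short sketch. The function names and the `"default"` placeholder for the default label are hypothetical; only the rules themselves come from the text above.

```python
DEFAULT = "default"  # assumed name for the default label


def resolve_labels(data_label=None, namespace_label=None):
    """Labels in effect right after volume creation."""
    # A data container without an explicit label gets the default label,
    # even if the namespace container was given one.
    data = data_label if data_label is not None else DEFAULT
    # A namespace container without an explicit label inherits the
    # data container's label at creation time.
    namespace = namespace_label if namespace_label is not None else data
    return {"data": data, "namespace": namespace}


def relabel_volume(labels, new_data_label):
    """Moving a volume to another label leaves the namespace label unchanged."""
    return {"data": new_data_label, "namespace": labels["namespace"]}


inherited = resolve_labels(data_label="ssd")
namespace_only = resolve_labels(namespace_label="ssd")
moved = relabel_volume(inherited, "hdd")
```

In this model, `moved` keeps the namespace container on `ssd` while the data container changes to `hdd`, which reflects the independence of the two labels after creation.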

At the time of volume creation, the system chooses the SPs with matching labels. If the volume or its namespace container is associated with a label, creation fails if there are insufficient nodes with SPs having the same label.

If only one matching SP is present for a 2-way or 3-way replicated volume, only a single replica is created for that volume.

When creating containers, the system checks for nodes/storage pools with matching labels in the topology requested. Container creation fails if there are no nodes/storage pools with matching labels in the topology requested, even if there are such nodes/storage pools in other topologies.
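
The combined topology and label check can be sketched as below. This is an illustrative model, not the product's code: the function names, the record layout, and the example topology paths are assumptions. It treats a topology as a slash-separated path and considers a pool to be inside the requested topology when its path equals that topology or is nested under it.

```python
def under_topology(node_topology, requested):
    """True if node_topology equals, or is nested under, the requested path."""
    requested = requested.rstrip("/")
    return (
        node_topology == requested
        or node_topology.startswith(requested + "/")
    )


def pools_for_container(pools, topology, label):
    """Return matching SP names, or raise LookupError if there are none.

    pools is a list of dicts with "name", "topology", and "label" keys.
    """
    matches = [
        p["name"]
        for p in pools
        if under_topology(p["topology"], topology) and p["label"] == label
    ]
    if not matches:
        # Pools with the right label in other topologies do not count.
        raise LookupError(f"no SP labeled {label!r} under {topology!r}")
    return matches


pools = [
    {"name": "sp1", "topology": "/data/rack1/node1", "label": "ssd"},
    {"name": "sp2", "topology": "/archive/rack9/node7", "label": "hdd"},
]
found = pools_for_container(pools, "/data", "ssd")
```

Requesting the label `hdd` under `/data` raises `LookupError` here, even though an `hdd` pool exists under `/archive`.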

When the label of a volume is changed, replicas cannot be migrated within the same file server from an SP with the old label to another SP with the desired label. If there are no other SPs with the desired label, the old copies are not fully migrated to it.

ATTENTION: CLDB volumes are an exception. CLDB volumes can be created on any storage pool. Labels do not apply to CLDB volumes.

Enable the Storage Labels Feature on an Upgraded Cluster

The Storage Labels feature is already enabled on a new HPE Data Fabric version 6.2 cluster.

On a cluster upgraded to version 6.2, enable the Storage Labels feature with the command:

maprcli cluster feature enable -name cldb.lbs.support

The feature takes effect immediately, without the need to fail over the CLDB. Once enabled, the feature cannot be disabled.

Usage Sequence

You must perform label creation and assignment in the following sequence:

Register a label - Registration ensures that arbitrary labels are not assigned to SPs and volumes. Once registered, a label cannot be deleted or prevented from being used.
Assign a label to a storage pool - Select any disk in the SP to assign the label. You can also set the label at the same time as you create the storage pool.
Assign a label to a volume - You can assign the label at volume creation using the volume create command.
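
The required ordering can be made concrete with a toy model in which each step refuses to run before its prerequisite. The `Cluster` class and its methods are hypothetical and do not correspond to any product API. The check that at least one SP carries the label before a volume is created is a simplification of the rule that creation fails when there are insufficient nodes with matching SPs.

```python
class Cluster:
    """Hypothetical model of the register -> label SP -> label volume order."""

    def __init__(self):
        self.registered = set()
        self.sp_labels = {}
        self.volume_labels = {}

    def register_label(self, label):
        # Step 1: labels must be registered before use.
        self.registered.add(label.lower())

    def _require_registered(self, label):
        if label.lower() not in self.registered:
            raise ValueError(f"label {label!r} is not registered")

    def label_storage_pool(self, sp_name, label):
        # Step 2: only registered labels can be assigned to an SP.
        self._require_registered(label)
        self.sp_labels[sp_name] = label.lower()

    def create_volume(self, volume_name, label):
        # Step 3: the label must be registered and present on some SP.
        self._require_registered(label)
        if label.lower() not in self.sp_labels.values():
            raise LookupError(f"no SP carries label {label!r}")
        self.volume_labels[volume_name] = label.lower()


cluster = Cluster()
cluster.register_label("SSD")
cluster.label_storage_pool("sp1", "ssd")
cluster.create_volume("vol1", "SSD")
```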

Override Topology and Storage Pool Adherences

You can override topology and SP adherences for critically under-replicated containers. The Storage Labels feature provides the following two settings:

honor.topology.for.critical.replication - default value is 1 (true). When set to 0 (false), critically under-replicated containers are replicated outside their topology, if space is not available within their topology.
```
/opt/mapr/bin/maprcli config save -values '{"honor.topology.for.critical.replication":"0"}'
```
honor.label.for.critical.replication - default value is 1 (true). When set to 0 (false), critically under-replicated containers are replicated on other SPs, even if their labels do not match the labels of the SPs.
```
/opt/mapr/bin/maprcli config save -values '{"honor.label.for.critical.replication":"0"}'
```
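
If you script these changes, a small helper can build the command line so that the JSON quoting is not typed by hand. The sketch below only constructs the argument list; it does not run anything. The helper name `config_save_command` is hypothetical, while the executable path, the `config save -values` form, and the two setting names are taken from the commands above.

```python
import json
import shlex

MAPRCLI = "/opt/mapr/bin/maprcli"  # path as shown in the commands above

SETTINGS = (
    "honor.topology.for.critical.replication",
    "honor.label.for.critical.replication",
)


def config_save_command(setting, enabled):
    """Build the argument list that saves one of the two override settings."""
    if setting not in SETTINGS:
        raise ValueError(f"unknown setting: {setting!r}")
    # The values are passed as strings, "1" (true) or "0" (false).
    values = json.dumps({setting: "1" if enabled else "0"},
                        separators=(",", ":"))
    return [MAPRCLI, "config", "save", "-values", values]


cmd = config_save_command("honor.label.for.critical.replication", False)
# shlex.join renders the list as a shell-quoted string for logging or review.
printable = shlex.join(cmd)
```

Passing the list to a process launcher as separate arguments avoids shell quoting issues altogether; the joined string is intended only for display.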

Storage Label Commands

Use the label add command to register a label.

Use the disk add or the disk setlabel command to label an SP.

Use the volume create or the volume move command to label a volume.

Use the label list command to list all registered labels.

Use the node list command to list all labels associated with a node.

Use the dashboard info command to view the list of registered labels and the number of associated volumes and storage pools.

Use the mrconfig sp list command to view all storage pool information, including the labels associated with the storage pools.

Use the disk list command to view the labels associated with disks.