Managing Storage Policies

Data offload is driven by rules, which are configured per volume. Data offload rule can be based on size of file (s), owner (u, g, or p) of the file, and/or file modification timestamp (m). You can apply one rule per volume.

When a rule is associated with a volume, the rule is first applied on the files in the tiering-enabled volume. When applied on the files in the tiering-enabled volume, the offload is triggered for all files in the snapshot chain as well when the criteria in the rule is met. If the file does not exist in the tiering-enabled volume, rule is applied on the latest state of the file in the snapshot chain. If the file exists in the tiering-enabled volume but has no latest state or if the file was deleted in the tiering-enabled volume, offload does not happen.

Rules can be defined using a combination of the following:

u Username or user ID, as configured in the OS registry (such as /etc/passwd file, LDAP, etc.), of a specific user.

Usage: u:<username or user ID>

g Group name or group ID, as configured in the OS registry (such as /etc/group file, LDAP, etc.), of a specific group.

Usage: g:<groupname or group ID>

a (atime) Time (in seconds or days) since the files were last accessed. The number of seconds can be specified by appending s to value and the number of days can be specified by appending d to the value.

Usage:

  • "a:<value>s" — specifies atime in seconds
  • "a:<value>d" — specifies atime in days
All files whose access timestamp (atime) exceeds the specified amount of time are offloaded.
Note: If the system time on CLDB and file server nodes are different, the atime rule for offloading data may not work as intended.
Example:
 maprcli tier rule create -name ecrule1 -expr "a:300s"
 maprcli volume create -name ecvol3 -path /ecvol3 -tieringenable true -tiername ectier1 -tieringrule ecrule1 -ecscheme 2+1 -json
 maprcli volume info -name test -json | grep atime
   "atimeUpdateInterval":"300",
   "atimeTrackingStartTime":"2020-09-02 23:27:49 GMT-0700",                 
In this example, you create a tier rule with atime as 300 seconds and assign it to a volume. Now, when you run the volume info command, you see that the atime interval is set as 300 seconds, along with the date and time when you enabled atime updates (atimeTrackingStartTime).

Based on this rule, all files that are read later than 300 seconds after you enabled atime (300+atimeTrackingStartTime) seconds, are offloaded.

aTimeTrackingStartTime is updated to the current time in the following cases:
  • Just started tracking atime, which means that the atimeUpdate frequency was previously zero
  • If the value of atimeUpdate frequency is decreased
  • If the value of atimeUpdate frequency is increased and atime has not been tracked for the duration of the new frequency value
For more information, see Tuning Last Access Time.
m (mtime) Time (in seconds or days) since the files were last modified. The number of seconds can be specified by appending s to value and the number of days can be specified by appending d to the value.

Usage:

  • "m:<value>s" — specifies mtime in seconds
  • "m:<value>d" — specifies mtime in days
All files whose modification timestamp (mtime) exceeds the specified amount of time are offloaded.
Note: If the system time on CLDB and file server nodes are different, the mtime rule for offloading data may not work as intended.
s The size of the file in bytes, kilobytes, megabytes, or gigabytes. The size of the file can be specified by appending one of the following to the value: b for bytes, k for kilobytes, m for megabytes, or g for gigabytes.

Usage

  • "s:<value>b" — specifies file size in bytes
  • "s:<value>k" — specifies file size in KB
  • "s:<value>m" — specifies file size in MB
  • "s:<value>g" — specifies file size in GB
All files whose size exceeds the specified size are offloaded.
Or, use the following:
p (Default) Specifies all files. Specifies that this operation is applicable to all the files without restriction. This cannot be combined with any other operator.
"" Indicates none of the files. Specifies that this operation cannot be performed on any of the files.
Use the following to string multiple criteria for offload:
& AND operation to combine multiple expressions as the criteria for the rule.
| OR operation to indicate either of the expressions as the criteria for the rule.
() Delimiters for subexpressions.

For volumes configured for erasure coding, a default storage policy, default.ectier.rule (ID 1 and expression p), is applied if one is not specified.

You can create, associate, and remove rules using the MapR Control System, the CLI, and REST API.