Replication Role Balancer

Describes the features of the replication role balancer.

The replication role balancer manages containers to optimize the following:

  • Network bandwidth during the replication process
  • Disk I/O and CPU when serving read requests

The replication role balancer switches the replication roles of name and data containers to balance them across the storage pools in a volume. To change whether the role balancer manages containers by size or by count, modify the cldb.role.balancer.strategy parameter with maprcli. To see the status of active role switches, or how container roles are balanced across each storage pool for a particular volume, run the maprcli dump rolebalancerinfo command.
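As an illustration, the two commands mentioned above could be assembled from a script as follows. This is a minimal sketch, not a documented procedure: it assumes that cluster-wide parameters are saved with maprcli config save -values and a JSON payload, and it leaves the strategy value as a placeholder because the accepted values are not listed in this topic. The helper names are hypothetical, and the sketch only prints the commands rather than running them.

```python
import json
import shlex


def strategy_command(strategy_value):
    """Build the argv for changing cldb.role.balancer.strategy.

    Assumption: the parameter is saved with `maprcli config save -values`.
    The valid values for `strategy_value` are not given in this topic.
    """
    payload = json.dumps(
        {"cldb.role.balancer.strategy": strategy_value},
        separators=(",", ":"),
    )
    return ["maprcli", "config", "save", "-values", payload]


def rolebalancerinfo_command():
    """Build the argv for inspecting role balancer status."""
    return ["maprcli", "dump", "rolebalancerinfo"]


if __name__ == "__main__":
    # Print shell-quoted commands for review; nothing is executed.
    print(shlex.join(strategy_command("<strategy-value>")))
    print(shlex.join(rolebalancerinfo_command()))
```

Printing the commands first makes it possible to review them before running anything against a live cluster.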

Replicated Containers

The name container is the first container created in every volume. Name containers can have either a primary or a replica role. Data containers can have a primary, intermediate, or tail role. By default, each name and data container is replicated across the cluster three times, with the primary being the first container written. The primary is then replicated sequentially two more times, each time into a container with either an intermediate or a tail role. If too many primary or intermediate containers exist on a storage pool, or if the primary and intermediate containers are too large, the role balancer switches some of these containers to tail containers.

By default, the role balancer compares the size of the primary and tail containers to determine if containers within a storage pool are balanced. For the best performance, the size of the primary containers in a volume should be evenly distributed across storage pools. The role balancer maintains this balance by ensuring that each type of container (primary, intermediate, and tail) accounts for 1/ReplicationFactor of the total container size in a volume.

If the role balancer is configured to manage containers by count, it compares the number of primary and tail containers and balances the roles such that each type of container accounts for 1/ReplicationFactor of the total number of containers in a volume. For example, if the replication factor is set to 3, the role balancer maintains a balance of ⅓ primary, ⅓ intermediate, and ⅓ tail containers in each volume.
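The 1/ReplicationFactor target can be checked with a few lines of arithmetic. The following is a minimal sketch under stated assumptions, not the role balancer's actual algorithm: the container list, the sizes, the function names, and the 5% tolerance are all hypothetical. It shows a set of containers that is balanced by count but not by size.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical (role, size in MB) pairs for the containers of one volume.
SAMPLE = [
    ("primary", 500), ("intermediate", 100), ("tail", 100),
    ("primary", 100), ("intermediate", 200), ("tail", 200),
]


def role_shares(containers, by="size"):
    """Return each role's share of the total, measured by size or by count."""
    totals = defaultdict(int)
    for role, size_mb in containers:
        totals[role] += size_mb if by == "size" else 1
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {role: Fraction(value, grand_total) for role, value in totals.items()}


def is_balanced(containers, replication_factor, by="size",
                tolerance=Fraction(1, 20)):
    """True if every role's share is within `tolerance` of 1/ReplicationFactor.

    The tolerance is an illustrative assumption, not a product setting.
    """
    target = Fraction(1, replication_factor)
    shares = role_shares(containers, by)
    return all(abs(share - target) <= tolerance for share in shares.values())


if __name__ == "__main__":
    for mode in ("count", "size"):
        shares = role_shares(SAMPLE, mode)
        print(mode, {role: str(share) for role, share in shares.items()},
              is_balanced(SAMPLE, 3, mode))
```

In this sample, each role holds two of the six containers, so the count-based shares are all ⅓. By size, the primary containers hold 600 MB of the 1200 MB total, so the primary share is ½ and the size-based check fails.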

HPE Ezmeral Data Fabric Database Considerations

To optimize HPE Ezmeral Data Fabric Database performance, configure the role balancer to manage containers by size. As described in HPE Ezmeral Data Fabric Database and File Store, HPE Ezmeral Data Fabric Database shards tables into tablets and stores the tablets in data containers. Only primary data containers serve reads. Therefore, configuring the role balancer by size spreads read requests evenly across the storage pools for a volume. For the best balancing of your HPE Ezmeral Data Fabric Database tables, consider storing them on dedicated volumes.
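To see why size matters more than count here, consider the primary containers on two storage pools. The following is a minimal sketch with hypothetical pool names and sizes, and it assumes that read traffic is roughly proportional to the amount of data held in primary containers.

```python
from collections import defaultdict

# Hypothetical (storage pool, role, size in MB) records for one volume.
CONTAINERS = [
    ("sp1", "primary", 900), ("sp1", "primary", 800), ("sp1", "tail", 50),
    ("sp2", "primary", 50), ("sp2", "primary", 50), ("sp2", "tail", 900),
]


def primary_load_per_pool(containers):
    """Sum the count and size of primary containers on each storage pool."""
    load = defaultdict(lambda: {"count": 0, "size_mb": 0})
    for pool, role, size_mb in containers:
        if role == "primary":
            load[pool]["count"] += 1
            load[pool]["size_mb"] += size_mb
    return {pool: dict(values) for pool, values in load.items()}


if __name__ == "__main__":
    for pool, values in sorted(primary_load_per_pool(CONTAINERS).items()):
        print(pool, values)
```

Both pools hold two primary containers, so a count-based view looks balanced, yet sp1 holds 1700 MB of primary data against 100 MB on sp2. Under the stated assumption, most reads would land on sp1, which is the imbalance that size-based balancing is meant to avoid.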

Assign Cache

The assign cache is a list of reserved containers on a particular file server node that are allocated by the CLDB (container location database). The replication role balancer neither switches the roles of containers in the assign cache nor counts them when it balances roles. See the maprcli dump rolebalancerinfo command for assign cache values and details.
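The exclusion described above amounts to filtering before any balancing arithmetic is done. The following is a minimal sketch; the Container type, its fields, and the function name are hypothetical and do not reflect how the CLDB stores this information.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    """Hypothetical container record used only for this illustration."""
    container_id: int
    role: str
    size_mb: int
    in_assign_cache: bool = False


def balancing_candidates(containers):
    """Drop containers reserved in the assign cache before balancing roles."""
    return [c for c in containers if not c.in_assign_cache]


if __name__ == "__main__":
    containers = [
        Container(1, "primary", 100),
        Container(2, "primary", 200, in_assign_cache=True),
        Container(3, "tail", 100),
    ]
    # Only the non-reserved containers are considered for role balancing.
    print([c.container_id for c in balancing_candidates(containers)])
```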