Minimum Cluster Size

All data-fabric production clusters must have a minimum of four data nodes except for HPE Ezmeral Data Fabric Edge.

More Nodes Are Better

In general, it is better to have more nodes. Larger clusters recover faster from disk failures because more nodes are available to contribute to re-replicating the lost data. To maximize fault tolerance in the design of your cluster, see Example Cluster Designs.

A data node is defined as a node running a FileServer process that is responsible for storing data on behalf of the entire cluster. Deploying additional nodes that run only control services, such as CLDB and ZooKeeper, is recommended, but these nodes do not count toward the minimum node total because they do not store cluster data.
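To illustrate the distinction, the following sketch counts only FileServer nodes toward the four-node minimum. It assumes the JSON shape commonly returned by `maprcli node list -json` (a top-level `data` array whose entries carry a comma-separated `service` field); verify the field names against the output of your release.

```python
import json

MIN_DATA_NODES = 4  # minimum for production clusters (except Data Fabric Edge)

def count_data_nodes(maprcli_json: str) -> int:
    """Count nodes running the FileServer process.

    Assumes a top-level "data" array whose entries have a
    comma-separated "service" field, as in `maprcli node list -json`.
    """
    nodes = json.loads(maprcli_json).get("data", [])
    return sum(
        1 for n in nodes
        if "fileserver" in n.get("service", "").lower().split(",")
    )

# Hypothetical five-node cluster: four data nodes plus one
# control-only node (CLDB + ZooKeeper, no FileServer).
sample = json.dumps({"data": [
    {"hostname": "dn1", "service": "fileserver,nfs"},
    {"hostname": "dn2", "service": "fileserver"},
    {"hostname": "dn3", "service": "fileserver"},
    {"hostname": "dn4", "service": "fileserver"},
    {"hostname": "ctl1", "service": "cldb,zookeeper"},
]})

data_nodes = count_data_nodes(sample)
print(data_nodes, data_nodes >= MIN_DATA_NODES)
```

Note that the control-only node is excluded from the count: the cluster above meets the four-data-node minimum even though it has five nodes in total.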

To understand how to size a data-fabric cluster, see this video.

Considerations for Clusters of 10 Nodes or Fewer

Note these special considerations for clusters of 10 nodes or fewer:
  • Erasure coding and rolling updates are not supported for clusters of four nodes or fewer.
  • Erasure coding is not recommended for five- and six-node clusters. See the Important note in Erasure Coding Scheme for Data Protection and Recovery.
  • Dedicated control nodes are not needed on clusters with fewer than 10 data nodes.
  • As cluster size decreases, each node has a larger proportional impact on cluster performance. Below 10 nodes, especially during failure recovery, performance can vary with the workload, network and storage I/O speed, and the amount of data being re-replicated.
  • For information about fault tolerance, see Priority 1 - Maximize Fault Tolerance and Cluster Design Objectives.

For hardware and configuration best practices, see Cluster Hardware.