Service Layout in a Cluster

Provides an overview of segregating services on different nodes.

How you assign services to nodes depends on the scale of your cluster and the data-fabric license level. A single-node cluster, which must not be used in a production environment (see Minimum Cluster Size), involves no layout decisions: all of the services you are using run on the single node.

On medium clusters, assigning the CLDB and ZooKeeper services to separate nodes is recommended to optimize performance. On large clusters, running these services on separate nodes is a requirement for good cluster performance.

The cluster is flexible and elastic. Nodes play different roles over the lifecycle of a cluster. The basic requirements of a node are the same for management nodes and for data nodes.

As the cluster grows, it becomes advantageous to locate control services (such as ZooKeeper and CLDB) on nodes that do not run compute services. The Data Fabric Converged Community Edition does not include HA capabilities, which restricts the number of instances of certain services that can run. The number of nodes and the services they run evolve over the lifecycle of the cluster.
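
As a planning aid, the separation of control services from compute services can be checked against a draft layout before installation. The following is a minimal sketch, not part of the data-fabric software: the layout dictionary, the node names, the lowercase service names, and the choice of "nodemanager" as a compute service are all assumptions made for illustration.

```python
# Hypothetical planning helper; not a data-fabric tool or API.
# Control services named in this section (lowercase names are an assumption).
CONTROL_SERVICES = {"cldb", "zookeeper"}


def colocated_control_nodes(layout, compute_services):
    """Return the sorted names of nodes that run both a control
    service and a compute service.

    layout: dict mapping node name -> set of service names on that node.
    compute_services: set of service names treated as compute services.
    """
    return sorted(
        node
        for node, services in layout.items()
        if services & CONTROL_SERVICES and services & compute_services
    )


# Illustrative draft layout (node and service names are made up).
draft_layout = {
    "node1": {"cldb"},
    "node2": {"zookeeper", "nodemanager"},
    "node3": {"nodemanager"},
}
compute = {"nodemanager"}

# Lists the nodes where a control service shares a node with compute work.
print(colocated_control_nodes(draft_layout, compute))
```

On a small cluster a non-empty result may be acceptable; as the cluster grows, the flagged nodes are candidates for moving control services elsewhere.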

To provide a high-availability, high-performance cluster, the data-fabric software architecture allows virtually any service to run on any node or on multiple nodes. The following guidelines help you to plan your cluster service layout.

NOTE: It is possible to install data-fabric software on a one- or two-node demo cluster. Production clusters can harness hundreds of nodes, but five- or ten-node production clusters are appropriate for some applications.