Capacity Scheduler Features

The CapacityScheduler supports these features:

  • Hierarchical Queues Hierarchical queues ensure that resources are shared among the sub-queues of an organization before other queues are allowed to use free resources, thereby providing more control and predictability.
  • Capacity Guarantees Queues are allocated a fraction of the capacity of the grid, which means that a certain capacity of resources are at their disposal. All applications submitted to a queue have access to the capacity allocated to the queue. Administrators can configure soft limits and optional hard limits on the capacity allocated to each queue.
  • Security Each queue has strict Access Control Lists (ACLs). The ACLs control which users can submit applications to individual queues. Also, safeguards ensure that users cannot view or modify applications from other users. Per-queue and system administrator roles are also supported.
  • Elasticity Free resources can be allocated to any queue beyond its capacity allocation. As tasks scheduled on these resources complete, the resources become available to be reassigned to applications on queues running below their capacity. (Note that pre-emption is not supported.) This ensures that resources are available in a predictable and elastic manner to queues, thus preventing artificial silos of resources in the cluster and improving cluster utilization.
  • Multi-tenancy A comprehensive set of limits is provided to prevent a single application, user, or queue from monopolizing resources of the queue or the cluster as a whole. This ensures that the cluster is not overwhelmed.
  • Operability
    • Runtime Configuration The queue definitions and properties, such as capacity or ACLs, can be changed in a secure manner by administrators at runtime, which minimizes disruption to users. Also, a console is provided for users and administrators to view the current allocation of resources to various queues in the system. Administrators can add queues at runtime, but queues cannot be deleted at runtime.
    • Drain applications Administrators can stop queues at runtime to ensure that while existing applications run to completion, no new applications can be submitted. If a queue is in the STOPPED state, new applications cannot be submitted to that queue or any of its child queues. Existing applications continue to completion, so the queue can be drained gracefully. Administrators can also start the stopped queues.
  • Resource-based Scheduling Support for resource-intensive applications, where an application can optionally specify higher resource requirements than the default, thereby accommodating applications with differing resource requirements. Currently, memory is the only supported resource requirement.