Advanced Resource Management

Advanced resource management is an important capability of the Moab HPC Suite that enables resources to be effectively controlled and optimized in complex or heterogeneous HPC environments. It includes capabilities that allow Moab to aggregate local resources, incorporate information from remote tools or custom fields into scheduling decisions, apply unique policies to groupings of nodes, and add fine-tuned controls over workload placement on resources. These capabilities will enhance scheduling decisions in complex environments, boost application performance through better resource matching, and improve overall system utilization.

Simple homogeneous clusters often evolve into more complex or heterogeneous environments over time as the cluster matures and organizations increasingly focus on application performance. As organizations add nodes to an existing cluster or consolidate previously independent clusters, the new processors, memory, or hard drives commonly differ from the existing hardware in speed and other attributes. It is also common, when optimizing performance, to add GPUs or other accelerators to a subset of nodes, or to add memory to a few nodes to meet the needs of memory-intensive applications. These natural upgrades lead to heterogeneity and added complexity. The advanced resource management capabilities described below help not only to manage this complexity but also to take full advantage of the different capabilities.

Node Sets: Node sets are commonly used to group homogeneous nodes within a heterogeneous cluster. The grouping can reflect differing architectures, processor speeds, or unique features, or it can simply collect the nodes associated with a specific class. Examples include allowing jobs to ask for:

  • Any node set with a processor speed of 1900 or 2200
  • A node set with a network attribute of either Myrinet or Ethernet, but not both
  • A node set called “high memory”

Some sites even use node sets to keep each job within a single rack of nodes in order to minimize network traffic.
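
The "one set or another, but not a mix" style of request described above can be illustrated with a short sketch. The Python below is not Moab code and does not reflect its API or internal algorithm; the function `pick_node_set`, the node dictionaries, and the attribute names `procspeed` and `rack` are all hypothetical and exist only to show the idea of confining a job to a single node set.

```python
from collections import defaultdict


def pick_node_set(nodes, attribute, allowed_values, count):
    """Return `count` node names that all share ONE allowed value of
    `attribute`, or None if no single set is large enough.

    Hypothetical illustration only; not Moab's implementation.
    """
    # Group candidate nodes into sets keyed by the attribute value.
    sets = defaultdict(list)
    for node in nodes:
        value = node.get(attribute)
        if value in allowed_values:
            sets[value].append(node["name"])

    # The job must fit entirely inside one set; sets are never mixed.
    for value in allowed_values:
        if len(sets[value]) >= count:
            return sets[value][:count]
    return None


# Hypothetical inventory: two processor speeds of interest, two racks.
nodes = [
    {"name": "n01", "procspeed": 1900, "rack": "r1"},
    {"name": "n02", "procspeed": 1900, "rack": "r1"},
    {"name": "n03", "procspeed": 2200, "rack": "r2"},
    {"name": "n04", "procspeed": 2200, "rack": "r2"},
    {"name": "n05", "procspeed": 2400, "rack": "r2"},
]

# Two nodes from either the 1900 set or the 2200 set.
by_speed = pick_node_set(nodes, "procspeed", [1900, 2200], 2)

# Three nodes that all sit in the same rack.
by_rack = pick_node_set(nodes, "rack", ["r1", "r2"], 3)
```

In this sketch the first allowed value whose set is large enough wins; a real scheduler would weigh further factors, such as current availability and site policy, before choosing a set.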

NUMA (Non-Uniform Memory Access): NUMA support allows organizations to direct workloads to the memory that is optimal for the processors they use. For example, an organization could ensure that a processor uses its own local memory rather than memory attached to another processor.

Multi-Resource Manager Support: Multi-RM support allows Moab to connect to more than one external tool when gathering additional information or managing additional resources. Below are a few key examples of how this helps improve the decision-making or control of an HPC cluster.

  • Local Area Grid: Moab can connect to multiple queueing tools (e.g., two TORQUE instances) and unify their resources under a single Moab administrative domain, enabling a simplified grid as long as the instances already share a user and data space
  • FlexLM: Moab can connect to FlexLM and manage workload in the context of available licenses
  • Identity Manager: Moab can be connected to a remote identity manager to incorporate dynamic changes to users, groups, accounts and queues/classes associated with compute resources as well as fairshare targets, priorities, service access constraints, and credential relationships in grids
  • Other Resource Managers: Moab can connect to hardware monitors, provisioning tools, and similar tools that dynamically import information about resources, and then either apply that information to update configured resources such as nodes or add the details dynamically to generic resource fields
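
As a rough illustration of the aggregation idea in the list above, the sketch below merges per-node reports from several sources into one view and checks a job's license needs against a free-license count. It is a conceptual Python example, not Moab's interface: `merge_reports`, `licenses_available`, the source names, and every field (`procs`, `state`, `temp_c`, the license name `solverx`) are hypothetical.

```python
def merge_reports(*reports):
    """Combine per-node attribute dicts from several sources.

    Later sources add fields to, or override fields of, earlier ones.
    Hypothetical illustration only; not Moab's implementation.
    """
    merged = {}
    for report in reports:
        for node, attrs in report.items():
            merged.setdefault(node, {}).update(attrs)
    return merged


def licenses_available(requested, free):
    """True if every requested license count is currently free."""
    return all(free.get(name, 0) >= n for name, n in requested.items())


# Two hypothetical queueing systems, each reporting its own nodes.
queue_a = {"n01": {"procs": 16, "state": "idle"}}
queue_b = {"m01": {"procs": 32, "state": "idle"}}

# A hypothetical hardware monitor adding an extra field per node.
monitor = {"n01": {"temp_c": 61}, "m01": {"temp_c": 48}}

view = merge_reports(queue_a, queue_b, monitor)

# Hypothetical license counts as a license manager might report them.
free_licenses = {"solverx": 2}
ok_small = licenses_available({"solverx": 2}, free_licenses)
ok_large = licenses_available({"solverx": 3}, free_licenses)
```

The point of the merged view is that a scheduling decision can consult one structure per node, regardless of which tool supplied each field.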

Node Allocation Policy: While job prioritization allows a site to determine which job to run, node allocation policies allow a site to specify how available resources should be allocated to each job from the nodes that meet its base requirements. There are multiple node allocation policies to choose from, allowing selection based on reservation constraints, node configuration, resource usage, and other preferred factors. You can specify these policies as a system-wide default, on a per-partition basis, or on a per-job basis. Examples include allocating nodes in a contiguous block, prioritizing nodes with the fewest configured resources that can still meet the need (to minimize resource fragmentation), and maximizing the use of nodes with the most identical or balanced processor speeds.
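
Two of the example policies above, fewest configured resources and contiguous block, can be sketched in a few lines. The Python below is a simplified illustration under stated assumptions, not Moab's algorithm: the function names, the node fields (`index`, `procs`, `idle`), and the tie-breaking rule are all hypothetical.

```python
def eligible_nodes(nodes, procs_needed):
    """Nodes that are idle and meet the job's base requirement."""
    return [n for n in nodes if n["idle"] and n["procs"] >= procs_needed]


def allocate_min_resource(nodes, procs_needed, count):
    """Prefer eligible nodes with the fewest configured processors,
    leaving larger nodes free for jobs that need them."""
    candidates = sorted(
        eligible_nodes(nodes, procs_needed),
        key=lambda n: (n["procs"], n["index"]),  # ties broken by index
    )
    if len(candidates) < count:
        return None
    return [n["name"] for n in candidates[:count]]


def allocate_contiguous(nodes, procs_needed, count):
    """Return the first run of `count` eligible nodes whose indices
    are consecutive, or None if no such run exists."""
    candidates = sorted(
        eligible_nodes(nodes, procs_needed), key=lambda n: n["index"]
    )
    run = []
    for node in candidates:
        # A gap in the index sequence breaks the current run.
        if run and node["index"] != run[-1]["index"] + 1:
            run = []
        run.append(node)
        if len(run) == count:
            return [n["name"] for n in run]
    return None


# Hypothetical cluster state.
nodes = [
    {"name": "n01", "index": 1, "procs": 16, "idle": True},
    {"name": "n02", "index": 2, "procs": 8, "idle": False},
    {"name": "n03", "index": 3, "procs": 32, "idle": True},
    {"name": "n04", "index": 4, "procs": 8, "idle": True},
    {"name": "n05", "index": 5, "procs": 16, "idle": True},
]

smallest_fit = allocate_min_resource(nodes, 8, 2)
block = allocate_contiguous(nodes, 8, 3)
```

Both functions start from the same eligible set and differ only in how they order or filter it, which mirrors the distinction drawn above between deciding which job runs and deciding which nodes it receives.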

Other Capabilities:

  • Docker Support
  • Malleable Jobs
  • Remap Classes
  • Generic Metrics
  • Generic Events