Group Sharing

Group sharing management is a core capability of the Moab HPC Suite that enables organizations to share a cluster effectively between multiple groups. As these groups begin to use a cluster, their competing needs and usage behaviors will inevitably cause conflict. If not properly managed, this conflict leaves the cluster used in ways that do not match business objectives, and it creates new inefficiencies as users or groups try to game the system to get the resources they want.

With this set of capabilities, organizations get the controls they need to efficiently share a cluster between multiple groups and the ability to align resource usage to business objectives. Example capabilities include Account and QoS credentials, Fairshare, Advanced Prioritization, Preemption, and Administrative Reservations.

Added Credential Objects: Each of the following additional credential objects can be used to apply specific rights and policies:

  • Account: The account credential is often used to help apply resource usage rights and policies to departments or other top-level organizational groups.
  • Group: The group credential is often used to help organize groups of users associated with a project, or to unite organizational groups inside a department or account.
  • QoS (Quality of Service): The QoS credential is often used to give special consideration to various classes of users, groups, jobs, etc. The object can be considered a container that associates a unique priority, special resource access rights, or other policies that either raise or lower the committed level of service. Each QoS is custom defined to meet the needs of the organization, but below are a few examples:

      • Deadline QoS: Highest priority to run next in the queue, rights to preempt low priority workloads, least restrictive rights to resources, optional extra high charge rate
      • High Priority QoS: High priority, rights to use many resources, optional high charge rate
      • Low Priority QoS: Lowest priority, restricted to run only short duration jobs, restricted to run only jobs with small resource requests, required to be preemptible if higher priority workloads require resources, optional low charge rate
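To make the "container" idea concrete, the three example QoS levels can be modeled as a small data structure. This is only an illustrative sketch: the class, the field names, and every numeric value are assumptions made for this example and are not Moab parameters or defaults.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QoS:
    """Hypothetical QoS container; field names are illustrative only."""
    name: str
    priority: int                          # larger value runs sooner
    can_preempt: bool = False              # may displace preemptible work
    preemptible: bool = False              # may be displaced by preemptors
    max_walltime_s: Optional[int] = None   # None means no limit
    max_procs: Optional[int] = None        # None means no limit
    charge_rate: float = 1.0               # multiplier applied to usage charges


# Assumed values chosen only to mirror the three examples in the text.
DEADLINE = QoS("deadline", priority=10_000, can_preempt=True, charge_rate=3.0)
HIGH = QoS("high", priority=1_000, charge_rate=2.0)
LOW = QoS("low", priority=0, preemptible=True,
          max_walltime_s=3600, max_procs=16, charge_rate=0.5)


def admits(qos: QoS, walltime_s: int, procs: int) -> bool:
    """Return True if a job request fits within the QoS resource limits."""
    if qos.max_walltime_s is not None and walltime_s > qos.max_walltime_s:
        return False
    if qos.max_procs is not None and procs > qos.max_procs:
        return False
    return True
```

For instance, `admits(LOW, 7200, 8)` is false because the request exceeds the assumed one-hour limit of the low priority QoS, while the deadline QoS accepts the same request.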

Fairshare: This policy allows organizations to dictate how resources are shared based on the historical resource usage of the requester relative to others. It then modifies the prioritization of the requester's workload in order to achieve the per-entity resource usage targets.

Specifically, this feature allows site administrators to set system utilization targets for users, groups, accounts, classes, and QoS levels. Administrators can also specify the time frame over which resource utilization is evaluated in determining whether the goal is being reached. Parameters allow sites to specify the utilization metric, how historical information is aggregated, and the effect of fairshare state on scheduling behavior.
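The interaction between a usage target, an evaluation time frame, and historical aggregation can be sketched as follows. This is a simplified model, not Moab's implementation: it assumes usage is recorded per fixed-length window, that older windows are discounted by a constant decay factor, and that the fairshare effect is simply the gap between target and measured share. The function names and the default decay value are hypothetical.

```python
def decayed_usage_share(entity_usage, total_usage, decay=0.8):
    """Percentage of decayed historical usage attributable to one entity.

    entity_usage[i] and total_usage[i] hold usage in window i, where
    window 0 is the most recent. Window i is weighted by decay**i.
    """
    if len(entity_usage) != len(total_usage):
        raise ValueError("usage histories must cover the same windows")
    weighted_entity = sum(u * decay**i for i, u in enumerate(entity_usage))
    weighted_total = sum(t * decay**i for i, t in enumerate(total_usage))
    if weighted_total == 0:
        return 0.0  # no recorded usage in the evaluated time frame
    return 100.0 * weighted_entity / weighted_total


def fairshare_adjustment(target_pct, usage_pct):
    """Positive when the entity is under its target, negative when over."""
    return target_pct - usage_pct
```

Under these assumptions, a group with a 25% target that has consumed 40% of recent decayed usage gets an adjustment of -15, which would lower the priority of its queued jobs until its share falls back toward the target.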

Advanced Prioritization: This allows an administrator to apply prioritization modifiers to groups, accounts, and QoS levels, as well as to specify how important fairshare, resources requested, and usage are in influencing the overall priority of a workload request.
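One way to picture this is as a weighted sum: each factor contributes a value, the administrator assigns each factor a weight, and a per-credential modifier is added on top. The sketch below assumes exactly that linear form; the function name, the component names, and the weights are hypothetical and do not correspond to actual Moab parameters.

```python
def job_priority(components, weights, credential_bonus=0.0):
    """Weighted sum of priority components plus a per-credential modifier.

    components: factor name -> value for this job (e.g. fairshare adjustment)
    weights:    factor name -> administrator-chosen importance
    Factors without a configured weight contribute nothing.
    """
    weighted = sum(weights.get(name, 0.0) * value
                   for name, value in components.items())
    return credential_bonus + weighted
```

With assumed weights of 100 for fairshare and 1 for resources, a job with a fairshare value of 10, a resource value of 4, and a credential modifier of 1000 scores 1000 + 100 * 10 + 1 * 4 = 2004.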

Preemption: This policy allows higher priority workloads to displace lower priority workloads that are already running. This may be done to enforce an owner’s right to use their own resources when they want them or to deliver differing qualities of service to important workloads or groups. To avoid excessive disruption, only workloads designated as having preemption rights can preempt, and only workloads designated as preemptible can be preempted. Administrators can then determine whether to configure the system to checkpoint, cancel, requeue, or suspend the preempted job. Moab can also apply rules that avoid preempting jobs that are well underway, so that the workload is allowed to complete.
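The eligibility rules described here can be expressed as a short decision function. This is a sketch under stated assumptions, not Moab's logic: the `Job` fields are hypothetical, and "well underway" is modeled as a job having used at least a configurable fraction of its requested walltime (90% by default here, an arbitrary choice).

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job record used only for this illustration."""
    name: str
    priority: int
    can_preempt: bool = False   # designated as having preemption rights
    preemptible: bool = False   # designated as preemptible
    elapsed_s: int = 0          # time already run
    walltime_s: int = 0         # requested run time; 0 means unknown


def may_preempt(incoming: Job, running: Job, protect_fraction: float = 0.9) -> bool:
    """Return True if `incoming` is allowed to displace `running`."""
    # Both designations are required before preemption is considered.
    if not incoming.can_preempt or not running.preemptible:
        return False
    # Only higher priority work displaces lower priority work.
    if incoming.priority <= running.priority:
        return False
    # Protect jobs that are close to finishing.
    if running.walltime_s > 0:
        if running.elapsed_s / running.walltime_s >= protect_fraction:
            return False
    return True
```

What happens to the displaced job (checkpoint, cancel, requeue, or suspend) is a separate configuration decision and is deliberately left out of this sketch.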

Administrative Reservations: This policy is a mechanism that allows administrators to reserve resources for one-time events. For example, when performing a software upgrade, rather than sending an email asking everyone to get off the system at a particular time, the admin can simply set an administrative reservation for the future maintenance window. All newly submitted workloads are then forced to either complete before this window or start after it.
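The "complete before or start after" rule can be illustrated with a small placement function. This is a minimal sketch that assumes a single reservation covering all the resources the job needs, times expressed as plain seconds, and a job whose requested walltime is known; the function name and parameters are hypothetical.

```python
def earliest_start(now, walltime, res_start, res_end):
    """Earliest time a job may start without overlapping the reservation.

    The reservation occupies the half-open interval [res_start, res_end).
    """
    if now >= res_end:
        return now          # the window has already passed
    if now + walltime <= res_start:
        return now          # the job finishes before the window opens
    return res_end          # otherwise wait until the window closes
```

A 100-second job submitted at time 0 against a window of 200 to 300 starts immediately, while the same job submitted at time 150 cannot finish in time and is held until 300.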

Other Features Include:

  • Job Deadlines
  • Personal Reservations