An Introduction to Scheduling Workload with Moab and Torque

When I was a teenager in the 1970s, they held a huge rock concert at the Ontario Motor Speedway in Ontario, California. All the great bands came to perform: Black Sabbath, Rare Earth, Emerson Lake & Palmer, Earth Wind & Fire, Seals & Crofts, Deep Purple and others. While I was too young to attend, I remember seeing pictures of the event taken from airplanes and news helicopters that flew overhead. What struck me more than anything else was the unbelievable number of people queuing to get into the speedway. More frightening, they all entered through one gate.

I have since become familiar with the concept of queues, and how important they are. What I witnessed as a teenager at the rock concert, with its simple, inefficient method for allowing access to the venue, is basically first-in, first-out (FIFO) scheduling. The earlier you showed up at the speedway, the sooner you got in to see the show.

As I have grown, so has my taste in music. While I still attend an occasional rock concert, mostly for nostalgia’s sake, my preference is now musical theater, ballet, and the symphony. The method for attending my favorite shows today could be considered much more “Civilized.” It usually involves calling a ticket agency, giving them my credit card, and having the theater hold my seats for the time I wish to go. There is very little pushing or shoving to get into the theater early; it’s usually quite an orderly affair.

The same contrast between simple FIFO scheduling and intelligent, sophisticated, and “Civilized” scheduling can be seen between Torque’s built-in pbs_sched and Adaptive Computing’s Moab HPC Suite Workload Manager. While Torque plays a key role in carrying out requests from its job scheduler, it is far less efficient on its own. With intelligent job management handled through Moab, the workload on HPC and Cloud systems ultimately runs more efficiently.

Moab is designed as a 3-tiered architecture. It sits at the top, receiving status updates from resource managers such as Torque and SLURM, which monitor the consumable resources in a cluster. Moab, in turn, makes policy-based decisions on where and when workload should be scheduled and hands that mandate to the resource managers, which in turn get the workload onto the cluster.
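To make the three tiers concrete, here is a minimal sketch in Python. It is a toy model, not Moab’s or Torque’s actual interface: the `ResourceManager` class, the `schedule` function, the node names, and the "most free cores" policy are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Hypothetical stand-in for a resource manager such as Torque or SLURM."""
    name: str
    free_cores: dict  # node name -> free cores (illustrative numbers)
    started: list = field(default_factory=list)

    def report_status(self):
        # Middle tier -> top tier: report the consumable resources currently free.
        return dict(self.free_cores)

    def start(self, job_name, node, cores):
        # Middle tier -> cluster: place the workload on a node.
        self.free_cores[node] -= cores
        self.started.append((job_name, node))


def schedule(job_name, cores, managers):
    """Top tier, toy policy: choose the fitting node with the most free cores."""
    best = None
    for rm in managers:
        for node, free in rm.report_status().items():
            if free >= cores and (best is None or free > best[2]):
                best = (rm, node, free)
    if best is None:
        return None  # nothing fits right now; the job would stay queued
    rm, node, _ = best
    rm.start(job_name, node, cores)  # hand the decision back down
    return rm.name, node


torque = ResourceManager("torque", {"node01": 4, "node02": 16})
slurm = ResourceManager("slurm", {"nodeA": 8})
placement = schedule("sim-1", 8, [torque, slurm])
print(placement)
```

The point of the sketch is the direction of the traffic: status flows up, decisions flow down, and the scheduler never touches a node directly.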

In its simplest sense, Moab schedules workload as jobs. To an end user who wants to run computational work on a cluster, the job is the vehicle for getting it there. To Moab, a job is the fundamental object of resource consumption.

A job usually contains the following components: consumable resources, resource and job constraints, an execution environment, and the credentials of the job owner. In turn, Moab uses its “Civilized” method to schedule that job and others on the cluster, resolving any resource contention and sending workload to the appropriate hardware at the appropriate time.
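As a rough illustration of those components, and of how a policy-driven order can differ from the speedway’s single gate, consider the sketch below. The `Job` fields, the `priority` value, and both ordering functions are hypothetical; they are not Moab’s data model or its prioritization scheme.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    submit_order: int                 # position in the arrival line
    cores: int                        # consumable resources
    walltime_minutes: int             # a job constraint
    environment: dict = field(default_factory=dict)  # execution environment
    owner: str = "unknown"            # credentials of the job owner
    priority: int = 0                 # hypothetical policy-assigned value


def fifo_order(jobs):
    # One gate at the speedway: strictly first come, first served.
    return sorted(jobs, key=lambda j: j.submit_order)


def policy_order(jobs):
    # Toy policy: higher priority first, arrival order breaks ties.
    return sorted(jobs, key=lambda j: (-j.priority, j.submit_order))


jobs = [
    Job("render", 1, cores=64, walltime_minutes=600, owner="alice", priority=10),
    Job("quick-test", 2, cores=2, walltime_minutes=5, owner="bob", priority=50),
    Job("nightly", 3, cores=16, walltime_minutes=120, owner="carol", priority=50),
]
fifo_names = [j.name for j in fifo_order(jobs)]
policy_names = [j.name for j in policy_order(jobs)]
print(fifo_names)
print(policy_names)
```

Under FIFO the five-minute test waits behind the ten-hour render; under the toy policy it goes first. A real scheduler weighs far more than a single number, but the contrast is the same.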

Since Moab is detached from the resource managers that physically control the consumable resources, it is free to be almost infinitely customizable. This is done by way of directives, or policies. Those policies draw on mechanisms such as reservation management, flexible node allocation, workload preemption, budget enforcement, and service-level agreements (SLAs) delivered through automated prioritization schemes. With them, Moab is able to elegantly and efficiently schedule workload on many of the fastest supercomputers and high-performance clusters around the world. Almost like reserving a ticket to the ballet, it’s an elegant, civilized process that everyone can enjoy.
