TORQUE Resource Manager
TORQUE provides control over batch jobs and distributed computing resources. It is an advanced open-source product based on the original PBS project* and incorporates the best of both community and professional development, with significant advances in scalability, reliability, and functionality. It is currently in use at tens of thousands of leading government, academic, and commercial sites throughout the world. TORQUE may be freely used, modified, and distributed under the constraints of the included license.
*TORQUE can integrate with Moab, a workload manager that intelligently places workloads and adapts resources to optimize application performance, increase system utilization, and achieve organizational objectives. Moab can be tailored to each system’s specific situation:
- Ease-of-Use Job Submission and Management: Simplify the workload submission process for end-users with an easy-to-use job submission portal, which includes features like application templates, script builders, job details, and web-based file management.
- Multiple Groups or Heterogeneous Hardware: Meet the needs of multiple groups, optimize resources in complex or heterogeneous environments, guarantee SLAs, and achieve business objectives.
- Modular Add-ons: Obtain additional controls through powerful add-on capabilities like portal-based job submission, accounting, grid and power management, high throughput submission, and more.
**TORQUE customers who purchase Moab family products also receive free support for TORQUE.
TORQUE is under active development by the community and Adaptive Computing, with continuing advancements in high availability, advanced diagnostics, job arrays, advanced GPGPU scheduling, high-throughput support, and other functionality. TORQUE incorporates the best contributions from both community and professional development teams and offers both community and professional support options. Because of this, TORQUE is an industry-standard resource manager solution with higher adoption than any other resource management offering. TORQUE provides enhancements over other resource managers in the following areas:
- Additional failure conditions checked/handled
- Node health check script support
- Extended query interface providing the scheduler with additional and more accurate information
- Extended control interface allowing the scheduler increased control over job behavior and attributes
- Collection of statistics for completed jobs
- Significantly improved server-to-MOM communication model
- Ability to handle larger clusters with tens of thousands of nodes and jobs
- Ability to handle larger jobs that span hundreds of thousands of processors
- High responsiveness and reliability with multi-threading and TCP-based communication
- Extensive logging additions
- More human-readable logging (e.g., no more ‘error 15038 on command 42’)
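As an illustration of the node health check item in the list above, here is a minimal sketch of what such a script might look like. It assumes the commonly documented convention in which the MOM configuration points at a script (typically through a `$node_check_script` entry) and treats output beginning with the keyword ERROR as a failure report; verify both against the documentation for your TORQUE version. The function name `health_message`, the checked path, and the 1 GiB threshold are hypothetical choices for this example only.

```python
import shutil


def health_message(path="/", min_free_bytes=1 << 30, usage=shutil.disk_usage):
    """Return an 'ERROR ...' line if free space under `path` is too low.

    Returns an empty string when the check passes. The `usage` parameter
    exists only so the check can be exercised without touching a real disk.
    """
    free = usage(path).free
    if free < min_free_bytes:
        # Assumption: a line starting with ERROR is read as a failure report.
        return f"ERROR low disk space on {path}: {free} bytes free"
    return ""


if __name__ == "__main__":
    message = health_message()
    if message:
        print(message)
```

Keeping the check in a function that returns a string, rather than printing directly, makes it straightforward to add further checks (memory, mounted file systems, and so on) and to test each one in isolation before deploying the script to compute nodes.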
We are continuing to collect community patches to incorporate into the TORQUE distribution. We are also focusing on further functionality, scalability, and fault-tolerance enhancements to extend the numerous changes already made.
TORQUE is freely available for download from www.adaptivecomputing.com. Adaptive Computing is the custodian of the TORQUE open-source project and is actively developing the code base in cooperation with the TORQUE community to provide state-of-the-art resource and job management. TORQUE users may subscribe to TORQUE’s mailing list or view the archive for questions, comments, or patches. If you have any patches to contribute or are aware of any issues in the distribution, please send mail to this list or directly to [email protected].