Solutions to Enterprise High Performance Computing Challenges
While all high performance computing (HPC) systems face challenges in workload demand, resource complexity, and scale, enterprise HPC systems face more stringent challenges and expectations. Enterprise HPC systems must meet mission-critical and priority HPC workload demands for commercial businesses and for business-oriented research and academic organizations. They have complex SLAs and priorities to balance, and their HPC workloads directly impact the revenue, product delivery, and objectives of their organizations.
Enterprise HPC systems must eliminate job delays and failures. Their operators also seek to improve resource utilization and management efficiency across multiple heterogeneous systems. To maximize user productivity, they must make high performance computing resources easier to access and use, and in some cases expand to other clusters or HPC cloud to better handle surges in workload demand.
Enterprise-Ready HPC Workload Management
Moab® HPC Suite – Enterprise Edition provides enterprise-ready HPC workload management that accelerates productivity, automates workload uptime, and consistently meets SLAs and business priorities for high performance computing systems and HPC cloud. It uses the battle-tested and patented Moab intelligence engine to automatically balance the complex, mission-critical workload priorities of enterprise HPC systems. Enterprise customers benefit from a single integrated product that brings together key enterprise HPC capabilities, implementation services, and 24×7 support. This speeds the realization of business benefits from the HPC system, including:
- Higher job throughput
- Massive scalability for faster response and system expansion
- Optimum utilization of 90-99% on a consistent basis
- Fast, simple job submission and management to increase productivity
- Reduced cluster management complexity and support costs across heterogeneous systems
- Reduced job failures and auto-recovery from failures
- SLAs consistently met for improved user satisfaction
- Power usage and costs reduced by 10-30%
Moab HPC Suite – Enterprise Edition gets more results delivered faster from HPC resources at lower costs by accelerating overall system, user and administrator productivity. Moab provides the scalability, 90-99 percent utilization, and simple job submission required to maximize the productivity of enterprise HPC systems. Enterprise use cases and capabilities include:
- Massive multi-point scalability to accelerate job response and throughput, including high throughput computing
- Workload-optimized allocation policies and provisioning to get more results out of existing heterogeneous resources and reduce costs, including tailored topology-based allocation that speeds job processing by up to 200%
- Unified workload management across heterogeneous clusters to maximize resource availability and administration efficiency by managing them as one cluster
- Optimized, intelligent scheduling packs workloads and backfills around priority jobs and reservations while balancing SLAs to efficiently use all available resources
- Optimized scheduling and management of accelerators, both Intel® Xeon Phi™ and GPGPUs, to maximize their utilization and effectiveness for jobs
- Simplified job submission and management with advanced job arrays, self-service portal, and templates
- Administrator dashboards and reporting tools reduce management complexity, time and costs
- Workload-aware auto-power management reduces energy use and costs by 10-30 percent
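The backfill behavior named in the list above can be illustrated with a small sketch. This is not Moab's scheduler; it is a simplified, conservative backfill pass written for illustration only, and the names `Job` and `backfill_now` are hypothetical. It assumes identical nodes, accurate requested run times, and a single reservation for the first job that cannot start.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    nodes: int    # nodes requested
    runtime: int  # requested wall time in minutes


def backfill_now(total_nodes, running, queue):
    """Return the names of queued jobs that may start immediately.

    running: list of (nodes, minutes_remaining) for executing jobs.
    queue:   list of Job in priority order, highest priority first.
    """
    running = list(running)
    free = total_nodes - sum(nodes for nodes, _ in running)
    started = []
    reserved_at = None  # start time reserved for the first blocked job
    spare = 0           # nodes still idle once the reserved job starts

    for job in queue:
        if reserved_at is None:
            if job.nodes <= free:
                # Fits right now: start it in priority order.
                free -= job.nodes
                running.append((job.nodes, job.runtime))
                started.append(job.name)
                continue
            # First job that does not fit: find when enough nodes free up.
            # (A job larger than the whole cluster gets no reservation.)
            available = free
            for nodes, remaining in sorted(running, key=lambda r: r[1]):
                available += nodes
                if available >= job.nodes:
                    reserved_at = remaining
                    spare = available - job.nodes
                    break
            continue
        # Backfill candidates must not delay the reserved job.
        if job.nodes > free:
            continue
        if job.runtime <= reserved_at:
            # Finishes before the reservation begins.
            free -= job.nodes
            started.append(job.name)
        elif job.nodes <= spare:
            # Runs past the reservation, but only on nodes it leaves idle.
            free -= job.nodes
            spare -= job.nodes
            started.append(job.name)
    return started


queue = [Job("A", 8, 120), Job("C", 4, 90), Job("D", 2, 200), Job("B", 2, 30)]
print(backfill_now(10, [(6, 60)], queue))
```

In this example a 10-node cluster has 6 nodes busy for another 60 minutes. Job A needs 8 nodes, so it is reserved to start at minute 60. Job C would delay A and is held back, while D (which fits on the 2 nodes A leaves idle) and B (which finishes before minute 60) are backfilled.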
Job and resource failures in enterprise high performance computing systems lead to delayed results, missed organizational opportunities, and missed objectives. Moab HPC Suite – Enterprise Edition intelligently automates workload and resource uptime in the HPC system to ensure that workload completes successfully and reliably, avoiding these failures. Enterprises can benefit from:
- Intelligent resource placement to prevent job failures with granular resource modeling to meet workload requirements and avoid at-risk resources
- Auto-response to failures and events with configurable actions for pre-failure conditions, amber alerts, or other metrics and monitors
- Workload-aware future maintenance scheduling that helps maintain a stable HPC system without disrupting workload productivity
- Real-world expertise for fast time-to-value and system uptime with included implementation, training, and 24×7 remote support services
Moab HPC Suite – Enterprise Edition uses the powerful Moab intelligence engine to optimally schedule and dynamically adjust workload to consistently meet service level agreements (SLAs), guarantees, or business priorities. This automatically ensures that the right workloads are completed at the optimal times, taking into account the many departments, priorities, and SLAs that must be balanced. Moab provides:
- Usage accounting and budget enforcement that schedules resources and reports in line with resource sharing agreements and precise budgets (e.g. usage limits, usage reports, and dynamic fairshare policies)
- SLA and priority policies that make sure the highest priority workloads are processed first (e.g. Quality of Service, hierarchical priority weighting)
- Continuous plus future scheduling that ensures priorities and guarantees are proactively met as conditions and workload change (e.g. future reservations, pre-emption)
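As a rough illustration of what hierarchical priority weighting can look like, the sketch below computes a job priority as a two-level weighted sum: each component has a weight, and each component is itself a weighted sum of subcomponent values. The function name, component names, and weights are all hypothetical choices made for this example and are not taken from Moab's configuration.

```python
def job_priority(weights, factors):
    """Two-level weighted sum.

    weights: {component: (component_weight, {subcomponent: weight})}
    factors: {component: {subcomponent: value}} measured for one job.
    Missing values count as zero.
    """
    total = 0.0
    for component, (comp_weight, sub_weights) in weights.items():
        values = factors.get(component, {})
        sub_total = sum(w * values.get(name, 0.0)
                        for name, w in sub_weights.items())
        total += comp_weight * sub_total
    return total


# Hypothetical site policy: QoS dominates, fairshare can push a job
# down when its owner is over target, and queue time slowly raises it.
weights = {
    "service":    (10,  {"queue_minutes": 1}),
    "fairshare":  (100, {"user_delta": 1}),
    "credential": (1,   {"qos": 1000}),
}

job = {
    "service":    {"queue_minutes": 30},
    "fairshare":  {"user_delta": -2},   # owner is 2 points over target
    "credential": {"qos": 1},
}

print(job_priority(weights, job))  # 10*30 + 100*(-2) + 1*1000 = 1100.0
```

Keeping the weights in two levels lets an administrator rebalance whole categories (for example, fairshare versus queue time) without retuning every individual factor.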
The benefits of a traditional HPC environment can be extended to more efficiently manage and meet workload demand with the multi-cluster grid and HPC cloud management capabilities in Moab HPC Suite – Enterprise Edition. Capabilities include:
- Self-service job submission portal that enables users to easily submit and track HPC cloud jobs at any time from any location with little training required
- Pay-for-use showback or chargeback, so that actual resource usage is tracked with flexible chargeback rates and reported by user, department, cost center, or cluster
- On-demand workload-optimized node provisioning that dynamically provisions nodes based on policies to better meet workload needs
- Workload management and sharing across multiple remote clusters to meet growing workload demand or surges by adding the Moab HPC Suite – Grid Option
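The showback or chargeback idea in the list above amounts to multiplying tracked usage by a rate and grouping the result. The sketch below shows that arithmetic under simple assumptions: usage is measured in node-hours, each cluster has one flat rate, and the record fields, rates, and the `chargeback` function are hypothetical, not a Moab interface.

```python
from collections import defaultdict


def chargeback(records, rates, group_by="department"):
    """Sum charges per group.

    records:  iterable of dicts with "nodes", "hours", "cluster",
              and the field named by group_by.
    rates:    {cluster: currency units per node-hour}.
    group_by: "user", "department", or "cluster" in this sketch.
    """
    totals = defaultdict(float)
    for rec in records:
        node_hours = rec["nodes"] * rec["hours"]
        totals[rec[group_by]] += node_hours * rates[rec["cluster"]]
    return dict(totals)


# Illustrative usage records and per-cluster rates.
records = [
    {"user": "ana", "department": "cfd", "cluster": "alpha", "nodes": 4, "hours": 2.5},
    {"user": "bo",  "department": "cfd", "cluster": "beta",  "nodes": 2, "hours": 1.0},
    {"user": "cy",  "department": "bio", "cluster": "alpha", "nodes": 1, "hours": 8.0},
]
rates = {"alpha": 0.5, "beta": 2.0}

print(chargeback(records, rates))                   # per department
print(chargeback(records, rates, group_by="user"))  # per user
```

For showback the same totals are only reported; for chargeback they would be billed against each group's budget.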
Managing the World’s Top Systems, Ready to Manage Yours
Moab manages the world’s largest, most scale-intensive, and most complex high performance computing environments, including many of the most capable systems on the Top500 list, the most innovative HPC research sites, and commercial HPC systems. Adaptive Computing is also the largest supplier of HPC job/workload management software to HPC sites today, with over 36% of HPC sites* using our commercial and open source job/workload management software. So you know it is battle-tested and ready to efficiently and intelligently manage the complexities of your environment.
* According to the 2011 HPC Site Census Survey by Intersect360 Research