Runtime Systems: Taming the High Performance Computing Beast

Speaker: Cal Ribbens, Department of Computer Science at Virginia Tech
Date: Friday, February 3, 2012
Time: 11:15am-12:15pm
Location: 2150 Torgersen

High performance computing (HPC) is an area of computer science and engineering that has always evolved rapidly, sometimes leading and sometimes riding succeeding waves of technical innovation. While HPC application developers and users have continued to benefit from the increasing power of these high-end resources, the increasing complexity of HPC execution environments will require more and more reliance on runtime systems. Parallelism, load-balancing, power, fault-tolerance, and hardware heterogeneity are just a few of the emerging dominant issues that require runtime solutions. In this talk I will briefly describe some of the motivations and trends in runtime systems for HPC. I will then describe two recent projects we have worked on at Virginia Tech. The first, ReSHAPE, is a runtime system that allows the number of nodes assigned to a job running on a cluster to be changed at run time. Experimental results from a prototype implementation of ReSHAPE illustrate the potential of "malleable" jobs for improving overall cluster utilization and reducing turn-around time for individual jobs. The second project, Samhita, is a distributed shared memory (DSM) execution environment, which allows programs based on the widely used Pthreads library for shared memory thread parallelism to be easily ported to a distributed memory (cluster) platform. Samhita not only allows a wide range of parallel codes to be ported to a new context, but its design reduces the problem of DSM to a cache management problem, with corresponding opportunities for exploiting locality at runtime.

Cal Ribbens is Associate Professor and Associate Department Head for Undergraduate Studies in the Department of Computer Science at Virginia Tech. He received a B.S. in Mathematics from Calvin College (1981) and a Ph.D. in Computer Sciences from Purdue University (1986). His research interests include parallel computation, numerical algorithms, mathematical software, and tools and environments for high performance computing.