Runtime Systems: Taming the High Performance Computing Beast

Speaker: Cal Ribbens, Deptartment of Computer Science at Virginia Tech
Date: Friday, February 3, 2012
Time: 11:15am-12:15pm
Location: 2150 Torgersen

Abstract:
High performance computing (HPC) is an area of computer science and engineering that has always evolved rapidly---sometimes leading and sometimes riding succeeding waves of technical innovation. While HPC application developers and users have continued to benefit from the increasing power of these high-end resources, the increasing complexity of HPC execution environments will require more and more reliance on runtime systems. Parallelism, load-balancing, power, fault-tolerance, and hardware heterogeneity are just a few of the emerging dominant issues that require runtime solutions. In this talk I will briefly describe some of the motivations and trends in runtime systems for HPC. I will then describe two recent projects we have worked on at Virginia Tech. The first, ReSHAPE, is a runtime system that allows the number of nodes assigned to job running on a cluster to be changed at run time. Experimental results from a prototype implementation of ReSHAPE illustrate the potential of "malleable" jobs for improving overall cluster utilization and reducing turn-around time for individual jobs. The second project, Samhita, is a distributed shared memory (DSM) execution environment, which allow programs based on the widely used Pthreads library for shared memory thread parallelism to be easily ported to a distributed memory (cluster) platform. Samhita not only allows a wide range of parallel codes to be ported to a new context, but its design reduces the problem of DSM to a cache management problem, with corresponding opportunities for exploiting locality at runtime.

Bio:
Cal Ribbens is Associate Professor and Associate Department Head for Undergraduate Studies in the Department of Computer Science at Virginia Tech. He received a B.S. in Mathematics from Calvin College (1981) and a Ph.D. in Computer Sciences from Purdue University (1986). His research interests include parallel computation, numerical algorithms, mathematical software, and tools and environments for high performance computing.