CS PhD Candidate Rajesh Sudarsan Awarded the Paul E. Torgersen Graduate Student Research Excellence Award

Publish Date: 04/15/2009

Rajesh Sudarsan, a PhD candidate in Computer Science, has been awarded the Paul E. Torgersen Graduate Student Research Excellence Award. The annual award is presented by the College of Engineering Graduate Student Committee to showcase top research performed by graduating master's and doctoral students.

In his winning presentation, “A Scheduling Framework for Resizable Parallel Applications,” Sudarsan described his research on dynamic resizing of parallel applications.

As terascale supercomputers become common and petascale machines emerge, the challenge of providing effective resource management for high-end machines (e.g., System X at Virginia Tech) grows in both importance and difficulty. The most powerful HPC resources are extremely expensive to build and operate, so the cost of underutilization is high. A fundamental problem is that conventional parallel job schedulers are static: once a job is allocated a set of processors, it continues to use that same number of processors until it finishes execution. Consequently, jobs commonly sit in the queue because they require just a few more processors than are currently available, which leads to long queue wait times for applications and low overall system utilization.
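
The effect of static allocation can be illustrated with a toy example. The following is a minimal Python sketch, not part of ReSHAPE or any real scheduler: the machine size, job names, and processor counts are hypothetical, and the strict first-come, first-served policy is an assumption made for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    procs: int  # processors requested; fixed for the job's lifetime (static model)


def schedule_static(total_procs, running, queue):
    """Start queued jobs first-come, first-served with fixed allocations.

    A job starts only if its full request fits in the free processors.
    Returns (started_jobs, idle_procs).
    """
    free = total_procs - sum(job.procs for job in running)
    started = []
    for job in queue:
        if job.procs > free:
            break  # the job at the head of the queue blocks everything behind it
        started.append(job)
        free -= job.procs
    return started, free


# Hypothetical 64-processor cluster with two running jobs and one waiting job.
running = [Job("A", 40), Job("B", 20)]
queue = [Job("C", 6)]

started, idle = schedule_static(64, running, queue)
print([job.name for job in started], idle)  # prints: [] 4
```

In this sketch job C needs only two more processors than the four that are free, so it waits while those four sit idle. A scheduler that could shrink a running job by two processors would let C start immediately, which is the kind of opportunity dynamic resizing is meant to exploit.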

Sudarsan’s research aims for a more flexible and effective approach, in which the set of processors allocated to a job can be expanded or contracted at runtime. He has developed a framework called ReSHAPE to explore the potential benefits and challenges of “dynamic resizing” of parallel applications. The framework includes a programming model, a runtime library for application resizing and data redistribution, and a parallel scheduling and resource management system. In his talk before a panel of faculty judges, Sudarsan described the potential of ReSHAPE for supporting interesting and effective scheduling techniques, and reported experimental results that demonstrate a significant improvement in cluster utilization and turnaround time for applications executed using ReSHAPE.