CS Graduate Student Guanying Wang wins “Best Paper Award” at MASCOTS ’09 symposium.

Publish Date: 11/06/2009

Computer science graduate student Guanying Wang’s paper, “A Simulation Approach to Evaluating Design Decisions in MapReduce Setups,” was recently named “Best Paper” at the annual IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS ’09).

Guanying is a third-year Ph.D. candidate whose current research focuses on designing efficient computer systems to support the emerging cloud computing paradigm. The paper provides insights into designing better clusters for running Hadoop, a publicly available implementation of the MapReduce programming model that is used to build compute clouds. Collaborators on the paper include Wang’s advisor, Dr. Ali Butt, as well as Prashant Pandey and Karan Gupta of the IBM Almaden Research Center in San Jose, California.

 
Abstract

MapReduce has emerged as a model of choice for supporting modern data-intensive applications. The model is easy-to-use and promising in reducing time-to-solution. It is also a key enabler for cloud computing, which provides transparent and flexible access to a large number of compute, storage and networking resources.

Setting up and operating a large MapReduce cluster entails careful evaluation of various design choices and run-time parameters to achieve high efficiency. However, this design space has not been explored in detail. In this paper, we adopt a simulation approach to systematically understanding the performance of MapReduce setups. The resulting simulator, MRPerf, captures such aspects of these setups as node, rack and network configurations, disk parameters and performance, data layout and application I/O characteristics, among others, and uses this information to predict expected application performance.

Specifically, we use MRPerf to explore the effect of several component inter-connect topologies, data locality, and software and hardware failures on overall application performance. MRPerf allows us to quantify the effect of these factors, and thus can serve as a tool for optimizing existing MapReduce setups as well as designing new ones.

Additional Information:

MASCOTS ’09 webpage

Full Version of Paper