Date: Friday, January 22, 2010, 10 – 11 am, SB 113
Title: Many-Task Computing on Grids, Clouds, and Supercomputers
Speaker: Ioan Raicu
NSF/CRA Computation Innovation Fellow
Department of Electrical Engineering and Computer Science
Many-task computing aims to bridge the gap between two computing paradigms: high-throughput computing and high-performance computing. Many-task computing includes loosely coupled applications that are generally communication-intensive but not naturally expressed using the message passing interface (MPI) commonly found in high-performance computing, drawing attention to the many computations that are heterogeneous but not "happily" parallel. My talk explores fundamental issues in defining the many-task computing paradigm, as well as theoretical and practical issues in supporting both compute-intensive and data-intensive many-task computing on large-scale systems. In particular, my talk describes the abstract model I defined for data diffusion and the Falkon middleware I designed and implemented to support many-task computing on clusters, grids, clouds, and supercomputers. Micro-benchmarks have shown Falkon to achieve throughputs of over 15K tasks/sec, scale to millions of queued tasks, execute billions of tasks per day, and achieve I/O rates of hundreds of Gb/s. Falkon has shown orders-of-magnitude improvements in performance and scalability across many diverse workloads and applications (e.g., astronomy, medicine, chemistry, molecular dynamics, economic modeling, and data analytics) at scales of billions of tasks on hundreds of thousands of processors across grids and supercomputers. Over the past several years, Falkon has processed over 173 million tasks totaling over 2M CPU hours. My talk will also outline my future work to 1) develop the theoretical and practical aspects of building efficient and scalable support for many-task computing on a wide variety of architectures at the extreme scales of tomorrow's exascale systems, and 2) develop scalable distributed storage systems that integrate storage resources throughout the compute resources and leverage the data locality information commonly found in many-task computing.
The shift toward co-locating storage and compute resources in high-end computing systems will improve application performance and scalability for the most demanding data-intensive applications as system scales continue to increase according to Moore's Law.
Biography: Dr. Ioan Raicu is an NSF/CRA Computation Innovation Fellow at Northwestern University, in the Department of Electrical Engineering and Computer Science. Ioan holds a Ph.D. in Computer Science from the University of Chicago, earned under the guidance of Dr. Ian Foster. His research focuses on resource management in distributed systems to support large-scale, loosely coupled, and data-intensive applications. He has defined a new paradigm, Many-Task Computing (MTC), and has architected and implemented Falkon, a fast and light-weight task execution framework that provides the middleware necessary to support MTC across a wide range of systems, from clusters, grids, and clouds to supercomputers. The impact of his research can be measured through his 50+ peer-reviewed publications and proposals, which have received over 800 citations and yield an H-index of 14. His work has been funded by the NASA Ames Research Center GSRP Fellowship Program, the DOE Office of Advanced Scientific Computing Research, and most recently by the NSF/CRA CIFellows Program. Ioan has served the broader community through involvement in over 50 events and venues (workshops, conferences, journals, book chapters) in various capacities, including reviewer, program committee member, organizing committee member, chair, and editor. His most significant service contributions have been the workshops he established and chaired, namely the ACM Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS08, MTAGS09), co-located with the IEEE/ACM Supercomputing (SC) conference, and the ACM Workshop on Scientific Cloud Computing (ScienceCloud2010), co-located with the ACM HPDC conference. He is also the guest editor for the special issue on Many-Task Computing in the IEEE Transactions on Parallel and Distributed Systems (TPDS), to appear in November 2010. For more information, please see http://www.ece.northwestern.edu/~iraicu/.