Kalyan Perumalla

Oak Ridge National Laboratory, Computational Sciences and Engineering Division


Lecture Information:
  • April 24, 2024
  • 11:09 AM
  • PG5: 134

Speaker Bio

Kalyan Perumalla, Ph.D., is a senior R&D staff member and manager in the Computational Sciences and Engineering Division at the Oak Ridge National Laboratory, and an adjunct professor in the School of Computational Sciences and Engineering at the Georgia Institute of Technology. Dr. Perumalla founded and currently leads the High Performance Discrete Computing Systems team at the Oak Ridge National Laboratory. He earned his Ph.D. in Computer Science from the Georgia Institute of Technology in 1999. His areas of interest include reversible computing, high performance computing, parallel discrete event simulation, and parallel combinatorial optimization.

Dr. Perumalla is a winner of the prestigious US Department of Energy Career Award in Advanced Scientific Computing Research, 2010-2015. His primary research contributions are in the application of reversible computation to high performance computing and in advancing the vision of a new class of supercomputing applications using real-time, parallel discrete event simulations. His recent book “Introduction to Reversible Computing” is among the first few in its area. He has co-authored another book, three book chapters, and over 100 articles in peer-reviewed conferences and journals. Four of his co-authored papers received best paper awards, in 1999, 2002, 2005 and 2008, and two were finalists in 2010.

Dr. Perumalla has been actively serving the research community as a program committee member and reviewer for several international conferences and journals. He serves on the editorial board of the ACM Transactions on Modeling and Computer Simulation and the SCS Transactions of the Society for Modeling and Simulation International. His research prototype tools in parallel and distributed computing have been disseminated to research institutions worldwide. He has performed research as an investigator on several research programs sponsored by US federal agencies including the Department of Energy, Department of Defense, Department of Homeland Security, and the National Science Foundation.

Abstract

The next leap in parallel computing is characterized by formidable challenges in achieving feasibility, efficiency, and usability, as evidenced in the ongoing evolution to exascale computing. One of the exciting directions being explored by our research group in meeting these challenges is the paradigm of reversible computing applied at the software level to large-scale parallel execution. In this paradigm, execution is relaxed from the conventional (forward-only) mode to a reversible mode that can change direction dynamically on demand. Such reversible execution directly addresses concerns of fault tolerance for feasibility, synchronization for efficiency, and debugging for usability, at scale. Additionally, the theoretical relation of reversible execution to adiabatic computing and quantum computing positions it to capitalize on such hardware in the future.
Rendering a program reversible is an extremely challenging endeavor; automation is difficult, and naive methods incur unacceptably high memory and runtime overheads. We are developing new techniques that carefully minimize (or sometimes eliminate) the overheads at the software level via reversible compilers, reversible libraries, and physical system models. In this talk, we will present two recent results in this research: (a) a novel reversible model of n-particle elastic collisions in d dimensions that is essentially memory-less, and (b) a prototype implementation of reversible computing for rollback-based fault tolerance that is very efficient on heterogeneous computing platforms containing thousands of processors and accelerators.
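
As a rough illustration of the memory overhead mentioned in the abstract, the following minimal sketch contrasts two ways of rolling back a toy computation: saving a copy of the state before every step, and undoing each step by applying its inverse. The class names, the toy state, and the events are all hypothetical and chosen only for illustration; this is not the speaker's implementation, and it assumes integer arithmetic so that each forward step can be inverted exactly.

```python
import copy


class CheckpointingModel:
    """Naive rollback: store a full copy of the state before every step."""

    def __init__(self):
        self.state = {"count": 0, "total": 0}
        self.saved = []  # grows by one snapshot per forward step

    def forward(self, x):
        self.saved.append(copy.deepcopy(self.state))
        self.state["count"] += 1
        self.state["total"] += x

    def reverse(self):
        # Restore the snapshot taken just before the last forward step.
        self.state = self.saved.pop()


class ReversibleModel:
    """Reverse computation: undo a step by running its inverse operations."""

    def __init__(self):
        self.state = {"count": 0, "total": 0}

    def forward(self, x):
        self.state["count"] += 1
        self.state["total"] += x

    def reverse(self, x):
        # Inverse of forward(x), applied in the opposite order.
        # No saved state is needed; the event value x is supplied again.
        self.state["total"] -= x
        self.state["count"] -= 1


def demo():
    events = [3, 5, 7]
    naive, rev = CheckpointingModel(), ReversibleModel()
    for x in events:
        naive.forward(x)
        rev.forward(x)
    # Roll back the last two events, most recent first.
    for x in reversed(events[1:]):
        naive.reverse()
        rev.reverse(x)
    return naive.state, rev.state, len(naive.saved)


if __name__ == "__main__":
    print(demo())
```

In this sketch both models return to the same state after rollback, but the checkpointing version holds one snapshot per executed step, whereas the reversible version keeps no history and relies on the event values being available again at rollback time.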