Michael Perrone

IBM T.J. Watson Program Director


Lecture Information:
  • April 7, 2016
  • 11:00 AM
  • ECS: 349

Speaker Bio

Michael P. Perrone is IBM Research Program Director for Technical Vitality. Dr. Perrone is responsible for driving research strategy in large-scale machine learning and high-performance system design, university outreach and collaboration, business model development, client collaborations, identifying new research opportunities, managing the development of new technologies, and establishing and growing industry collaborations and alliances with key business partners.

Dr. Perrone's research has focused on advances in the upstream petroleum industry, including seismic imaging, reservoir modeling, carbon sequestration, and related high-performance computing, cognitive, business analytics, and statistical machine learning issues. He led the project that won the 2009 Platts Global Energy Award for Commercial Technology of the Year (with Repsol), and he serves on the board of the Mission-Oriented Seismic Research Program (M-OSRP) at the University of Houston, an industry consortium addressing high-priority seismic exploration and production challenges.

Previously, Dr. Perrone performed research in a wide range of other areas, including graph algorithms, network intrusion detection, financial data stream processing (OPRA), high-speed text indexing, computational fluid dynamics, image processing, and bioinformatics. His work has led to deep insight into algorithmic optimization techniques and to the design of novel supercomputing algorithms. He led the team that won the Graph500 competition five times in a row.

Dr. Perrone has received numerous awards, including being named an IBM Master Inventor. He is the author of 26 issued patents and over 70 technical publications, and he has given numerous invited and keynote lectures at conferences, workshops, and universities. He received his Ph.D. in Physics from Brown University.

Abstract

Deep Neural Networks (DNNs) have recently received tremendous publicity due to significant advances in training, accuracy, and novel applications. In many cases, DNNs outperform currently deployed, traditional machine learning techniques, and many researchers expect this trend to continue. Due to the large data sets and model sizes of typical DNNs, training can take days or longer on single-node systems. This long turn-around time makes efficient exploration of the model and algorithm space very difficult. In this talk, we investigate multi-node scale-out to accelerate turn-around time and enable efficient exploration. The presentation proposes a solution to the problem of optimal, high-level system design for scale-out DNN machine learning systems. By developing a methodology that goes beyond traditional bottleneck analysis, it is possible to select optimally among a set of designs, each of which balances compute and communication. The methodology determines the optimal performance, which is then used to assess system design trade-offs under constraints such as fixed price.
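The compute/communication balance the abstract describes can be illustrated with a toy analytical model. The sketch below is not the speaker's actual methodology; it is a minimal, hypothetical model in which per-step time is ideal strong-scaling compute plus a ring-allreduce-style communication term (bandwidth plus per-hop latency), and the "design selection" is simply the node count minimizing step time. All parameter values are invented for illustration.

```python
# Toy model of scale-out DNN training time: compute shrinks with node
# count while allreduce communication cost grows, producing an interior
# optimum. All constants are hypothetical, for illustration only.

def step_time(nodes, flops_per_step=1e12, node_flops=1e11,
              grad_bytes=1e9, link_bw=1e10, hop_latency=0.01):
    """Estimated seconds per training step on `nodes` nodes."""
    # Ideal strong scaling: total work divided evenly across nodes.
    compute = flops_per_step / (nodes * node_flops)
    # Ring allreduce: ~2*(n-1)/n of the gradient crosses each link,
    # plus 2*(n-1) latency-bound steps around the ring.
    comm_bw = 2 * (nodes - 1) / nodes * grad_bytes / link_bw
    comm_lat = 2 * (nodes - 1) * hop_latency
    return compute + comm_bw + comm_lat

# Sweep candidate system sizes and pick the fastest design point.
candidates = [1, 2, 4, 8, 16, 32, 64]
best = min(candidates, key=step_time)
print(best, step_time(best))
```

With these (made-up) constants the optimum falls at an intermediate node count: beyond it, the latency term grows faster than compute shrinks. A real design-space analysis of the kind the talk proposes would replace this toy cost model with measured compute and interconnect characteristics, and would sweep richer design variables (topology, link speed, accelerator count) under constraints such as fixed price.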