Illinois Institute of Technology (IIT)
Dr. Xian-He Sun is a University Distinguished Professor in the Department of Computer Science at the Illinois Institute of Technology (IIT). He is the director of the Scalable Computing Software Laboratory at IIT and a guest faculty member in the Mathematics and Computer Science Division at Argonne National Laboratory. Before joining IIT, he worked at DoE Ames National Laboratory; at ICASE, NASA Langley Research Center; and at Louisiana State University, Baton Rouge; he was also an ASEE fellow at Navy Research Laboratories. Dr. Sun is an IEEE Fellow and is known for his memory-bounded speedup model for scalable computing, also called Sun-Ni's Law. His research interests include data-intensive high-performance computing, memory and I/O systems, software systems for big data applications, and performance evaluation and optimization. He has over 250 publications and 6 patents in these areas. He is the Associate Editor-in-Chief of the IEEE Transactions on Parallel and Distributed Systems, a Golden Core member of the IEEE Computer Society, a former vice chair of the IEEE Technical Committee on Scalable Computing, and the past chair of the Computer Science Department at IIT. He serves, and has served, on the editorial boards of leading professional journals in the field of parallel processing.
Computing has changed from compute-centric to data-centric. From deep learning to visualization, data access has become the main performance concern of computing. In this talk, based on a series of fundamental results and their supporting mechanisms, we introduce a new approach to memory system design. We first present the Concurrent-AMAT (C-AMAT) data access model, which quantifies the combined impact of data locality, concurrency, and overlapping. Then, we introduce the pace-matching data-transfer design methodology for optimizing memory system performance. Based on the pace-matching design, a memory-computing hierarchy is built to generate and transfer the final results and to mask the performance gap between computing and data transfer. C-AMAT is used to optimize performance at each memory layer, and a global management algorithm, named Layered Performance Matching (LPM), is developed to optimize the overall performance of the memory system. This holistic pace-matching optimization is very different from conventional locality-based system optimization and can minimize memory-wall effects. Experimental testing confirms the theoretical findings, with a 150x reduction in memory stall time. We will present the concept of the pace-matching data-transfer design, the design of C-AMAT and LPM, experimental case studies, and results on DoE and NASA applications. We will also discuss optimization and research issues related to pace-matching data access and to memory systems in general.
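As a rough illustration of how C-AMAT differs from the classic average memory access time (AMAT), the sketch below evaluates both. It assumes the commonly cited form C-AMAT = H/C_H + pMR * pAMP/C_M, which is not spelled out in this abstract; the function names, parameter names, and all numbers are hypothetical and chosen only for illustration.

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Classic AMAT: every access is assumed to be served one at a time."""
    return hit_time + miss_rate * miss_penalty


def c_amat(hit_time, hit_concurrency, pure_miss_rate,
           pure_miss_penalty, pure_miss_concurrency):
    """C-AMAT in its commonly cited form (an assumption, not from the abstract).

    hit_concurrency       -- average number of overlapping hits (C_H)
    pure_miss_rate        -- fraction of accesses that are misses with at
                             least one cycle not hidden by hit activity (pMR)
    pure_miss_penalty     -- average penalty of such a pure miss (pAMP)
    pure_miss_concurrency -- average number of overlapping pure misses (C_M)
    """
    return (hit_time / hit_concurrency
            + pure_miss_rate * pure_miss_penalty / pure_miss_concurrency)


# Illustrative (made-up) numbers, in cycles per access.
sequential = amat(hit_time=2, miss_rate=0.1, miss_penalty=100)
concurrent = c_amat(hit_time=2, hit_concurrency=4, pure_miss_rate=0.05,
                    pure_miss_penalty=80, pure_miss_concurrency=2)

print(f"AMAT   = {sequential:.2f} cycles per access")
print(f"C-AMAT = {concurrent:.2f} cycles per access")
```

With no concurrency and no overlap (C_H = C_M = 1, pMR equal to the miss rate, pAMP equal to the miss penalty), the C-AMAT expression in this sketch reduces to AMAT, which is why it can be read as an extension of the locality-only view.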