School of Computing and Information Sciences
Bin Shi is a Ph.D. candidate in School of Computing and Information Sciences at FIU under the supervision of Professor Sitharama S. Iyengar. His preliminary research focus on the theory of machine learning, especially on optimization. Bin Shi received his B.S. of Applied Math from Ocean University of China, China in 2006, Master of Pure Mathematics from Fudan University, China in 2011 and Master of Theoretical Physics from University of Massachusetts Dartmouth in 2015. His research interests focus on statistical machine learning and optimization, some theoretical computer science. His research contributions have been published in NIPS OPT-2017 workshop and a debut joint work with Michael I. Jordan, Weijie Su and Simon S. Du —— Understanding the acceleration phenomena via high-resolution differential equations —— preprint on https://arxiv.org/abs/1810.08907 (Submitted to Mathematical Programming A, Full Length Paper) The preprint was recommended by the founder of accelerated gradient descent algorithm, Yurii Nesterov. The work has been submitted to Stephen Boyd (Stanford), Emmanuel Candes (Stanford), Ya-xiang Yuan (CAS), George Guanghui Lan (Gatech), Zhihua Zhang (PKU), Xiaoping Yuan (Fudan) and obtained positive feedback.
The dissertation addresses the research topics of machine learning outlined below. We developed the theory about traditional first-order algorithms from convex optimization and provide new insights in nonconvex objective functions from machine learning. Based on the theory analysis, we designed and developed new algorithms to overcome the difficulty of nonconvex objective and to accelerate the speed to obtain the desired result. In this thesis, we answer the two questions: (1) How to design a step size for gradient descent with random initialization? (2) Can we accelerate the current convex optimization algorithms and improve them into nonconvex objective? For application, we apply the optimization algorithms in sparse subspace clustering. A new algorithm, CoCoSSC, is proposed to improve the current sample complexity under the condition of the existence of noise and missing entries.
Gradient-based optimization methods have been increasingly modeled and interpreted by ordinary differential equations (ODEs). Existing ODEs in the literature are, however, inadequate to distinguish between two fundamentally different methods, Nesterov’s acceleration gradient method for strongly convex functions (NAG-SC) and Polyak’s heavy-ball method. In this paper, we derive high-resolution ODEs as more accurate surrogates for the two methods in addition to Nesterov’s acceleration gradient method for general convex functions (NAG-C), respectively. These novel ODEs can be integrated into a general framework that allows for a fine-grained analysis of the discrete optimization algorithms through translating properties of the amenable ODEs into those of their discrete counterparts. As a first application of this framework, we identify the effect of a term referred to as gradient correction in NAG-SC but not in the heavy-ball method, shedding deep insight into why the former achieves acceleration while the latter does not. Moreover, in this high-resolution ODE framework, NAG-C is shown to boost the squared gradient norm minimization at the inverse cubic rate, which is the sharpest known rate concerning NAG-C itself. Finally, by modifying the high-resolution ODE of NAG-C, we obtain a family of new optimization methods that are shown to maintain the accelerated convergence rates as NAG-C for minimizing convex functions.