Florida International University Knight Foundation School of Computing and Information Sciences
Roger Boza is a Ph.D. candidate at the Knight Foundation School of Computing and Information Sciences (KFSCIS) at Florida International University (FIU) working on Machine Learning (ML) and Deep Learning (DL). His research interests lie in computer vision, image classification, anomaly detection, and Convolutional Neural Network (CNN) design. Roger received his B.Sc. in Computer Science from FIU in 2018 and was accepted into the Ph.D. program in 2019. As a Ph.D. student, he was awarded a fellowship from the Department of Energy – Environmental Management (DOE-EM) and inducted as a DOE-EM fellow to perform Structural Health Monitoring (SHM) using state-of-the-art (SOTA) ML and DL algorithms. He has been awarded three consecutive summer internships at Idaho National Laboratory (INL) to collaborate on multiple computer vision tasks, such as object detection and obstacle avoidance for the Route Operable Unmanned Navigation of DroneS (ROUNDS) and image classification for Firewatch. He has published three research papers at American Nuclear Society (ANS) conferences and one journal paper in Progress in Nuclear Science.
A Convolutional Neural Network (CNN) is a type of Artificial Neural Network (ANN) designed explicitly for processing unstructured data, such as the pixels in images, and is widely used for image recognition. A CNN is mainly composed of three types of layers. The convolutional layers extract meaningful features that help the network classify images according to labeled classes. The pooling layers gradually reduce the spatial dimensions of their input, which lowers the number of trainable parameters and the computational complexity of the model. The fully connected layers form a set of dependent non-linear functions that are directly responsible for the final output and predictions. There is no predefined method for formulating a CNN architecture, even though one can be assembled with a relatively small number of layers. Simply trying many combinations of hyperparameters, such as the number of convolutional layers, the number of kernels per layer, and the number of layers before pooling, is computationally expensive and time-consuming.
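To make the effect of these hyperparameters concrete, the following minimal sketch counts the trainable parameters of a simple convolution/pooling stack. It is an illustration only, not the architecture studied in the dissertation: the function name `conv_stack_parameters` is hypothetical, and it assumes 3x3 kernels with stride 1 and "same" padding, 2x2 pooling that halves height and width, and a single fully connected output layer.

```python
def conv_stack_parameters(input_shape, kernels_per_layer, layers_before_pool,
                          kernel_size=3, num_classes=10):
    """Count trainable parameters of a plain conv/pool/dense stack.

    Assumptions (illustrative, not taken from the source): stride-1 "same"
    convolutions, 2x2 pooling after every `layers_before_pool` conv layers,
    and one fully connected layer mapping the flattened features to classes.
    """
    height, width, channels = input_shape
    total = 0
    for index, kernels in enumerate(kernels_per_layer, start=1):
        # Each kernel holds kernel_size * kernel_size * channels weights plus one bias.
        total += (kernel_size * kernel_size * channels + 1) * kernels
        channels = kernels
        if index % layers_before_pool == 0:
            # Pooling has no trainable parameters; it only shrinks the feature maps.
            height, width = height // 2, width // 2
    flattened = height * width * channels
    # Fully connected layer: one weight per flattened feature plus one bias, per class.
    total += (flattened + 1) * num_classes
    return total


if __name__ == "__main__":
    shape = (28, 28, 1)  # e.g. one grayscale 28x28 image
    print(conv_stack_parameters(shape, [32, 64], layers_before_pool=1))
    print(conv_stack_parameters(shape, [32, 64], layers_before_pool=3))  # never pools
```

Under these assumptions, pooling after every convolutional layer shrinks the flattened feature vector, so the fully connected layer needs far fewer weights than in the variant that never pools.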
This dissertation aims to formulate a mathematical equation that recommends a convolutional architecture by suggesting the number of layers and the number of kernels per layer. To achieve this goal, we propose using information entropy to compute how many bits of information a given image dataset contains and to correlate that value with the structure of a CNN model. Entropy is calculated for both grayscale and RGB images by creating a histogram for each color channel, normalizing it into a probability mass function (PMF), and applying Shannon’s entropy equation. The recommended CNN models will be trained and tested on open-source benchmark datasets such as CIFAR100, MNIST-Digits, MNIST-Letters, and Fashion-MNIST. Preliminary results show that there are recipes, that is, common convolution/pooling layer structures, among the top-performing models. We hypothesize that these recipes will transfer from one image dataset domain to another.
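The entropy calculation described above can be sketched as follows. This is a minimal illustration, assuming 8-bit images stored as NumPy arrays and a 256-bin histogram per channel. The function names `channel_entropy` and `image_entropy` are hypothetical, and how per-image values are aggregated over a whole dataset is not specified here.

```python
import numpy as np


def channel_entropy(channel, bins=256):
    """Shannon entropy, in bits, of one 8-bit color channel."""
    # Histogram of pixel intensities (one bin per possible 8-bit value).
    hist = np.bincount(channel.ravel(), minlength=bins).astype(np.float64)
    # Normalize the histogram so it sums to 1, i.e. a probability mass function.
    pmf = hist / hist.sum()
    # Skip empty bins: the term p * log2(p) is taken as 0 when p == 0.
    nonzero = pmf[pmf > 0]
    # Shannon's equation: H = -sum(p * log2(p)).
    return float(-(nonzero * np.log2(nonzero)).sum())


def image_entropy(image):
    """Per-channel entropy for a grayscale (H, W) or RGB (H, W, 3) image."""
    image = np.asarray(image, dtype=np.uint8)
    if image.ndim == 2:
        # Treat a grayscale image as having a single channel.
        image = image[..., np.newaxis]
    return [channel_entropy(image[..., c]) for c in range(image.shape[-1])]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gray = rng.integers(0, 256, size=(28, 28), dtype=np.uint8)
    rgb = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
    print(image_entropy(gray))  # one value for the single gray channel
    print(image_entropy(rgb))   # three values, one per color channel
```

With 256 bins the entropy of a channel is bounded between 0 bits (every pixel has the same intensity) and 8 bits (all intensities equally likely).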
View on Zoom: https://fiu.zoom.us/j/3053482033?pwd=TkNtbXREK0ZsMHB2TFVLOU1QNU9Fdz09