Information-Theoretic Approaches to Neural Network Compression, Clustering and Concept Learning
Deep neural networks have shown incredible performance for inference tasks in a variety of domains, but unfortunately are often enormous cloud-based structures, require significant training data, and are hard for people to interpret. To work towards addressing these challenges, we present three lines of information-theoretic investigation. We discuss optimal quantization of synaptic weights and universal lossless compressed representations of feedforward neural networks, taking inferential purpose into account. The basic insight yielding considerable rate reduction is a kind of permutation invariance to the labeling of nodes. We also discuss optimal energy allocation in specialized hardware implementations of neural networks. In considering unsupervised learning, we present asymptotically optimal algorithms for universal clustering and joint registration/clustering that involve optimization of novel empirical multivariate information measures. Finally, we present algorithms for hierarchical, human-interpretable concept learning that iteratively optimize empirical information measures. Such concept learning may act in support of computational creativity.
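The rate reduction from node-labeling invariance can be illustrated with a rough count: hidden nodes within a layer can be relabeled (carrying their weights along) without changing the network's input-output function, so a compressor need not distinguish the n! orderings of an n-node layer. A minimal sketch of this back-of-the-envelope bound (the function name and the assumption that every within-layer permutation is function-preserving are illustrative, not taken from the talk):

```python
import math

def permutation_savings_bits(hidden_layer_widths):
    """Rough upper bound on the rate reduction (in bits) from hidden-node
    relabeling invariance in a feedforward network.

    Assumption (illustrative): every permutation of the n nodes within a
    hidden layer yields the same function, so a lossless code can save up
    to log2(n!) bits per layer by not encoding the node ordering.
    math.lgamma(n + 1) = ln(n!), computed without forming n! explicitly.
    """
    return sum(math.lgamma(n + 1) / math.log(2) for n in hidden_layer_widths)

# Example: two hidden layers of width 256 each.
savings = permutation_savings_bits([256, 256])
```

For wide layers these savings are substantial relative to a naive weight-by-weight encoding, which is the sense in which the abstract's "considerable rate reduction" can be read.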
Lav Varshney is an assistant professor of electrical and computer engineering, computer science, and neuroscience at the University of Illinois at Urbana-Champaign. He is also leading curriculum initiatives for the new B.S. degree in Innovation, Leadership, and Engineering Entrepreneurship in the College of Engineering.
He received the B.S. degree (magna cum laude) in electrical and computer engineering with honors from Cornell University in 2004. He received the S.M., E.E., and Ph.D. degrees, all in electrical engineering and computer science, from the Massachusetts Institute of Technology in 2006, 2008, and 2010, respectively, where his theses received the E. A. Guillemin Thesis Award and the J.-A. Kong Award Honorable Mention. He was a research staff member at the IBM Thomas J. Watson Research Center from 2010 until 2013, where he led the design and development of the Chef Watson computational creativity system. His research interests include information and coding theory; data science; limits of nanoscale, social, and neural computing; human decision making and collective intelligence; and creativity.
Dr. Varshney is a founding member of the IEEE Special Interest Group on Big Data in Signal Processing and served on the Shannon Centenary Committee of the IEEE Information Theory Society. He also serves on the advisory board of the AI XPRIZE. He received the IBM Faculty Award in 2014 and was a finalist for the Bell Labs Prize in 2014 and 2016. He and his students have won several best paper awards, including a 2015 Data for Good Exchange Paper Award. His work appears in the anthology, The Best Writing on Mathematics 2014, and he was selected to present at the 2017 World Science Festival. He appears on the List of Teachers Ranked as Excellent and has been named a Center for Advanced Study Fellow at the University of Illinois.