Can Yaras awarded Predoctoral Fellowship to support research on deep learning and generative AI models

Can Yaras, a Ph.D. student in Electrical and Computer Engineering, is working to better understand the inner workings of deep learning and to broaden access to foundation models in AI.
Can Yaras

Can Yaras, a Ph.D. student in Electrical and Computer Engineering, was awarded a Rackham Predoctoral Fellowship to support his research on efficient training, inference, and architectures for large language models (LLMs). His work aims to deepen our understanding of the inner workings of deep learning while addressing the computational challenges of deploying foundation models.

His proposed dissertation title is “Low-Dimensional Structures of Learning and Computation in Deep Learning.”

“Foundation models, especially large-scale generative AI models, are revolutionizing modern machine learning by enabling systems that can generate human-like text, images, and even complex decision-making,” explains Yaras. “These models act as versatile building blocks, pre-trained on vast datasets, and fine-tuned for various applications, driving significant advances in fields ranging from natural language processing to computer vision. However, the mechanisms underlying their success remain opaque, posing challenges in interpretability and model understanding. Additionally, the training and deployment of these models demand substantial computational resources.”

“This dissertation aims to simultaneously tackle the issues of interpretability and efficiency in foundation models through a unifying framework of identifying intrinsic low-dimensional structure in deep learning models and algorithms as well as the processes through which they learn.”

A foundation model in AI is trained on massive amounts of data, typically through self-supervised learning, and can then be adapted to a wide range of downstream tasks. Yaras explained why it is important to make foundation models more efficient:

“The development of foundation models is prohibitively expensive. For instance, state-of-the-art models such as GPT-4 are rumored to have around 1.8 trillion parameters and cost an estimated $63 million to train, with daily inference costs reaching hundreds of thousands of dollars. As a result, training and deploying these models present significant computational challenges, both in academia and industry. By characterizing and leveraging low-dimensional structures in the learning dynamics of deep learning models, we can design memory- and compute-efficient algorithms for training and inference with little to no loss in model quality.”
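To give a rough sense of the savings this line of work targets, the sketch below illustrates the general idea of low-rank factorization: if a weight matrix is (approximately) low-rank, storing two thin factors in place of the full matrix cuts the parameter count dramatically. This is a generic illustration of the principle, not Yaras's actual method; the dimensions and rank are hypothetical.

```python
import numpy as np

# Hypothetical sizes for illustration only (not any real model's dimensions).
d = 4096   # hidden dimension of one square weight matrix
r = 16     # assumed intrinsic rank

dense_params = d * d          # parameters stored for a full d x d matrix
factored_params = 2 * d * r   # parameters for W ~= U @ V.T, with U, V of shape (d, r)

print(f"dense:    {dense_params:,}")                      # 16,777,216
print(f"factored: {factored_params:,}")                   # 131,072
print(f"ratio:    {factored_params / dense_params:.4f}")  # 0.0078

# Sanity check on a small example: a genuinely rank-r matrix is
# represented exactly by its top-r singular value decomposition.
rng = np.random.default_rng(0)
U = rng.standard_normal((256, 8))
V = rng.standard_normal((256, 8))
W = U @ V.T                                  # a rank-8 matrix
Uh, s, Vh = np.linalg.svd(W)
W_r = (Uh[:, :8] * s[:8]) @ Vh[:8]           # rank-8 reconstruction
assert np.allclose(W, W_r)
```

In practice, real weight matrices are only approximately low-rank, so the research challenge is identifying when and where such structure emerges during training and exploiting it without degrading model quality.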

He says he hopes his work will lower the carbon footprint of deep learning, “and democratize the practice, enabling the adoption of state-of-the-art models by the general public.”

Yaras received his bachelor’s degree in Electrical and Computer Engineering from Duke University. He received a Best Poster Award at the 2024 Midwest Machine Learning Symposium and a Scholar Award at NeurIPS (Neural Information Processing Systems). He has served as a Graduate Student Instructor for EECS 598 (Machine Learning Theory), mentored two graduate students in his research group, and helped develop a new summer camp program on AI for high school students. Yaras is co-advised by Prof. Qing Qu and Prof. Laura Balzano.


The Rackham Predoctoral Fellowship supports students who are working on dissertations that are “unusually creative, ambitious, and impactful,” and who expect to complete their dissertations during the three-term fellowship period. Fellows receive a stipend of more than $40K, as well as tuition, fees, and insurance coverage.
