Communications and Signal Processing Seminar
Breaking the Sample Size Barrier in Reinforcement Learning via Model-Based Algorithms
This event is free and open to the public.
Abstract: Emerging reinforcement learning (RL) applications necessitate the design of sample-efficient solutions in order to accommodate the explosive growth of problem dimensionality. Despite the empirical success, however, our understanding of the statistical limits of RL remains highly incomplete. In this talk, I will present some recent progress towards settling the sample complexity limits in RL. The first scenario is concerned with RL with a generative model, which allows one to query arbitrary state-action pairs to draw independent samples. We prove that a model-based algorithm (a.k.a. the plug-in approach) achieves minimax-optimal sample complexity without any burn-in cost. The second scenario is concerned with online RL, where an agent learns via real-time interactions with an unknown environment. We develop the first algorithm, an optimistic model-based algorithm, that achieves minimax-optimal regret for the entire range of sample sizes. Time permitting, we will also discuss the effectiveness of model-based paradigms in offline RL and multi-agent RL. Our results emphasize the prolific interplay between high-dimensional statistics, online learning, and game theory.
The first part is based on joint work with Gen Li, Yuting Wei and Yuejie Chi, and the second part is based on joint work with Zihan Zhang, Jason Lee and Simon Du.
Bio: Yuxin Chen is currently an associate professor of statistics and data science and of electrical and systems engineering at the University of Pennsylvania. Before joining UPenn, he was an assistant professor of electrical and computer engineering at Princeton University. He completed his Ph.D. in Electrical Engineering at Stanford University and was also a postdoctoral scholar at Stanford Statistics. His current research interests include high-dimensional statistics, nonconvex optimization, and machine learning theory. He has received the Alfred P. Sloan Research Fellowship and the ICCM Best Paper Award (gold medal), and was selected as a finalist for the Best Paper Prize for Young Researchers in Continuous Optimization. He has also received the Princeton Graduate Mentoring Award.
*** This event will take place in a hybrid format. The location for in-person attendance will be room 3427 EECS. Attendance will also be available via Zoom.
Join Zoom Meeting: https://umich.zoom.us/j/99102451525
Meeting ID: 991 0245 1525
Passcode: XXXXXX (will be sent via e-mail to attendees)
Zoom Passcode information is also available upon request to Sher Nickrand (email@example.com).
This seminar will be recorded and posted to the CSP Seminar website.