Electrical and Computer Engineering

Communications and Signal Processing Seminar

Approximate planning and learning in partially observed systems

Aditya Mahajan, Associate Professor, Electrical and Computer Engineering, McGill University

Abstract: Reinforcement learning (RL) provides a conceptual framework for designing agents that learn to act optimally in unknown environments. RL has been used successfully in applications ranging from robotics, industrial automation, finance, and healthcare to natural language processing. The success of RL rests on a solid foundation that combines the theory of exact and approximate Markov decision processes (MDPs) with iterative algorithms that are guaranteed to learn an exact or approximate action-value function and/or an approximately optimal policy. However, for the most part, research on RL theory has focused on systems with full state observations.

In many applications, including robotics, finance, and healthcare, the agent only gets a partial observation of the state of the environment. In this talk, I will describe a new framework for approximate planning and learning in partially observed systems based on the notion of an approximate information state. The talk will highlight the strong theoretical foundations of this framework, illustrate how many existing approximation results can be viewed as special cases of approximate information state, and provide empirical evidence suggesting that this approach works well in practice.

Joint work with Jayakumar Subramanian, Amit Sinha, and Raihan Seraj.

Bio: Aditya Mahajan is Associate Professor of Electrical and Computer Engineering at McGill University, Montreal, Canada. He is affiliated with the McGill Center of Intelligent Machines (CIM), Montreal Institute of Learning Algorithms (MILA), and Group for Research in Decision Analysis (GERAD). He received the B.Tech degree in Electrical Engineering from the Indian Institute of Technology, Kanpur, India in 2003 and the MS and PhD degrees in Electrical Engineering and Computer Science from the University of Michigan, Ann Arbor, USA in 2006 and 2008, respectively. From 2008 to 2010, he was a postdoctoral researcher in the Department of Electrical Engineering at Yale University, New Haven, CT, USA. From 2016 to 2017, he was a visiting scholar at the University of California, Berkeley.

He is the recipient of the 2015 George Axelby Outstanding Paper Award, the 2016 NSERC Discovery Accelerator Award, the 2014 CDC Best Student Paper Award (as supervisor), and the 2016 NecSys Best Student Paper Award (as supervisor). His principal research interests include decentralized stochastic control, team theory, reinforcement learning, multi-armed bandits, and information theory.

Join Zoom Meeting https://umich.zoom.us/j/97598571292

Meeting ID: 975 9857 1292

Passcode: XXXXXX (Will be sent via email to attendees)

NOTE:  This seminar will be recorded.  The video will be posted to this website shortly after the seminar.