Other Seminar

Scaling up reinforcement learning theories of cognition: Towards human-like deep reinforcement learning

Wilka Carvalho
Research Fellow, Kempner Institute for the Study of Natural and Artificial Intelligence
Harvard
WHERE:
1303 EECS Building

Abstract: Deep reinforcement learning has made impressive progress on developing intelligent agents for real-world domains. However, RL agents still lag behind humans in robustness, learning efficiency, and social inference. In contrast, cognitive science has made significant progress in understanding the mechanisms that support these abilities, but in relatively domain-specific ways. In this talk, I propose that we can make progress towards human-like deep reinforcement learning by scaling up reinforcement learning theories of cognition so that they work across increasingly naturalistic domains, enabling AI models that are both human-level and human-like. To demonstrate this research strategy, I will present “Multitask Preplay,” a research project that develops a novel reinforcement learning theory of human behavior and tests it across increasingly naturalistic experimental conditions. We observe that human tasks of interest are commonly co-located: for example, stove and fridge tasks commonly appear in kitchens, and coffee shops and restaurants are commonly co-located in city centers. We hypothesize that humans exploit this structure with Multitask Preplay, leveraging experience with one task to preemptively learn solutions to other tasks that were accessible but not achieved. I will first present behavioral predictions that scale from small artificial grid-worlds to a 2D Minecraft environment. Afterwards, I will show that Multitask Preplay enables AI agents to learn long-horizon behavioral skills that transfer better to novel open-world environments sharing subtask co-occurrence structure.

Bio: Wilka Carvalho is a research fellow at Harvard’s Kempner Institute for the Study of Natural and Artificial Intelligence. He earned his PhD in computer science with Satinder Singh, Honglak Lee, and Richard Lewis at the University of Michigan, where he was supported by an NSF Fellowship and the Rackham Merit Fellowship. He has spent time at DeepMind, Microsoft Research, and IBM Research. His long-term research goal is to develop generalizable models of human cognition that scale to the full spectrum of conditions in which natural behavior manifests. Towards this end, he hopes to integrate the hypothesis-driven experimental design of cognitive science with the generalization-oriented model development and engineering practices of machine learning.