Communications and Signal Processing Seminar

Social Learning in Multi-Agent, Multi-Armed Bandits

Ayalvadi Ganesh
Associate Professor, School of Mathematics
University of Bristol, United Kingdom

WHERE:

Remote/Virtual

WHEN:

Friday, October 15, 2021 @ 9:30 am - 10:30 am
This event is free and open to the public.

ABSTRACT: Consider a large number of agents, N, faced with the problem of choosing amongst a large number of options, K. The problem occurs repeatedly, and every time an agent chooses an option, it receives a random reward or payoff whose distribution depends on the option but not on the agent. The goal is to maximize the long-run payoff. The problem involves a trade-off between exploitation – choosing the option currently believed to be the best – and exploration – choosing possibly sub-optimal options in order to gain more information about their payoffs. The challenge is to optimize this trade-off.
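As a rough illustration of the exploration-exploitation trade-off described above, the following sketch simulates a single agent choosing among K options. It uses epsilon-greedy, a standard textbook heuristic picked here for brevity; it is not the algorithm presented in the talk. The Bernoulli reward model, the function name `epsilon_greedy`, and the parameters `means`, `rounds`, `epsilon` and `seed` are all assumptions made for this example.

```python
import random


def epsilon_greedy(means, rounds, epsilon=0.1, seed=0):
    """Simulate one agent on a K-armed bandit with Bernoulli rewards.

    `means` holds the (hypothetical) success probability of each option.
    Epsilon-greedy is an illustrative choice, not the method from the talk.
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k        # how often each option has been chosen
    estimates = [0.0] * k  # running average reward of each option
    total = 0.0
    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: try an option uniformly at random.
            arm = rng.randrange(k)
        else:
            # Exploit: pick the option currently believed to be best.
            arm = max(range(k), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        # Incremental update of the sample mean for the chosen option.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]
        total += reward
    return pulls, estimates, total


if __name__ == "__main__":
    pulls, estimates, total = epsilon_greedy([0.1, 0.5, 0.9], rounds=5000)
    print("pulls per option:", pulls)
    print("estimated means:", [round(e, 2) for e in estimates])
    print("total reward:", total)
```

A larger `epsilon` spends more rounds gathering information about every option, while a smaller one commits sooner to the option that currently looks best; the trade-off in the abstract is about balancing these two effects.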

If there were a single agent, this would be an instance of the multi-armed bandit problem with K arms, which has been studied extensively for decades. If no communication is allowed between agents, then there are N parallel instances of the multi-armed bandit problem. If there are no communication constraints, then the agents act in aggregate as if they were a single agent. We are interested in the intermediate case where limited communication is allowed. We show that, even with limited communication, in the long run the system behaves in aggregate as if there were a single agent, i.e., as if there were no communication constraints.

Joint work with Abhishek Sankararaman, Ronshee Chawla, Sanjay Shakkottai, Conor Newton and Henry Reeve.

Faculty Host

Vijay Subramanian
Associate Professor of Electrical Engineering and Computer Science
University of Michigan
