Communications and Signal Processing Seminar

Social Learning in Multi-Agent, Multi-Armed Bandits

Ayalvadi GaneshAssociate Professor, School of MathematicsUniversity of Bristol, United Kingdom

ABSTRACT: Consider a large number of agents, N, faced with the problem of choosing amongst a large number of options, K. The problem occurs repeatedly, and every time an agent chooses an option, it receives a random reward or payoff whose distribution depends on the option but not on the agent. The goal is to maximize the long-run payoff. The problem involves a trade-off between exploitation – choosing the option currently believed to be the best – and exploration – choosing possibly sub-optimal options in order to gain more information about their payoffs. The challenge is to optimize this trade-off.

If there were a single agent, then this is an instance of the multi-armed bandit problem with K arms., which has been studied extensively for decades. If no communication is allowed between agents, then it is N parallel instances of the multi-armed bandit problem.  If there are no communication constraints, then the agents act in aggregate as if they were a single agent. We are interested in the intermediate case where limited communication is allowed. We show that, even with limited communication, in the long run the system behaves in aggregate as if there were a single agent, i.e., as if there were no communication constraints.

Joint work with Abhishek Sankararaman, Ronshee Chawla, Sanjay Shakkottai, Conor Newton and Henry Reeve.

Related Papers:

BIO: Ayalvadi Ganesh is an Associate Professor at the School of Mathematics, University of Bristol. His research interests include large deviations, queueing theory, random graph dynamics, and decentralised algorithms. He won the INFORMS Best Publication Award in 2005 and the ACM Sigmetrics Best Paper Prize in 2010.

Join Zoom Meeting

Meeting ID: 922 1113 6360

Passcode: XXXXXX (Will be sent via e-mail to attendees)

Zoom Passcode information is also available upon request to Katherine Godwin ([email protected]).

See Full seminar by Professor Ganesh

Faculty Host

Vijay SubramanianAssociate Professor of Electrical Engineering and Computer ScienceUniversity of Michigan