Communications and Signal Processing Seminar
Social Learning in Multi-Agent, Multi-Armed Bandits
This event is free and open to the publicAdd to Google Calendar
ABSTRACT: Consider a large number of agents, N, faced with the problem of choosing amongst a large number of options, K. The problem occurs repeatedly, and every time an agent chooses an option, it receives a random reward or payoff whose distribution depends on the option but not on the agent. The goal is to maximize the long-run payoff. The problem involves a trade-off between exploitation – choosing the option currently believed to be the best – and exploration – choosing possibly sub-optimal options in order to gain more information about their payoffs. The challenge is to optimize this trade-off.
If there were a single agent, then this is an instance of the multi-armed bandit problem with K arms., which has been studied extensively for decades. If no communication is allowed between agents, then it is N parallel instances of the multi-armed bandit problem. If there are no communication constraints, then the agents act in aggregate as if they were a single agent. We are interested in the intermediate case where limited communication is allowed. We show that, even with limited communication, in the long run the system behaves in aggregate as if there were a single agent, i.e., as if there were no communication constraints.
Joint work with Abhishek Sankararaman, Ronshee Chawla, Sanjay Shakkottai, Conor Newton and Henry Reeve.
- Social learning in multi-agent multi-armed bandits: https://dl.acm.org/doi/pdf/10.1145/3366701
- The gossiping-insert-eliminate algorithm for multi-agent bandits: http://proceedings.mlr.press/v108/chawla20a/chawla20a.pdf
- Asymptotic optimality for decentralised bandits: https://arxiv.org/pdf/2109.09427.pdf
BIO: Ayalvadi Ganesh is an Associate Professor at the School of Mathematics, University of Bristol. His research interests include large deviations, queueing theory, random graph dynamics, and decentralised algorithms. He won the INFORMS Best Publication Award in 2005 and the ACM Sigmetrics Best Paper Prize in 2010.
Join Zoom Meeting https://umich.zoom.us/j/92211136360
Meeting ID: 922 1113 6360
Passcode: XXXXXX (Will be sent via e-mail to attendees)
Zoom Passcode information is also available upon request to Katherine Godwin ([email protected]).