Decentralized online learning: decision making in an unknown and changing world
In this talk I will discuss a type of learning, referred to as regret learning, that is commonly used for decision making in an uncertain environment, one that may also include other, similar users. Learning algorithms of this type share a common feature: they combine "exploration", where the user samples different options to find good actions, with "exploitation", where the user takes the actions he/she believes to be good in order to maximize his/her gain. Designing a good algorithm boils down to determining when to explore, when to exploit, and how. This becomes more complex when multiple uncoordinated users of this kind are present in the system. I will go through a few example algorithms and use a number of applications to motivate and illustrate this learning process, including opportunistic channel access decisions in a wireless network and selling products in an unknown market.