REVEAL@RecSys2020: Workshop on Bandit and Reinforcement Learning from User Interactions
March 24, 2020
State-of-the-art recommender systems are notoriously hard to design and improve upon because of their interactive and dynamic nature. They involve a multi-step decision-making process in which a stream of interactions unfolds between the user and the system. A key challenge is to leverage the reward signals from these interactions while keeping the recommendation model scalable and performant. Traditionally, interactions are treated as independent to make the problem tractable; to improve recommender systems further, however, models will need to account for the delayed effects of each recommendation and begin reasoning and planning for longer-term user satisfaction. To this end, our workshop invites contributions that enable recommender systems to adapt effectively to diverse forms of user feedback and to optimize the quality of each user's long-term experience: in particular, approaches that leverage bandit and reinforcement learning from user interactions.
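To make the bandit framing concrete, here is a minimal sketch of an epsilon-greedy policy learning from simulated user clicks. All names and the click probabilities are hypothetical; this is an illustration of the learning-from-interactions loop, not a method endorsed by the workshop.

```python
import random

def epsilon_greedy(counts, rewards, epsilon=0.1):
    """Pick an arm: explore with probability epsilon, else exploit the best empirical mean."""
    if random.random() < epsilon:
        return random.randrange(len(counts))
    means = [r / c if c > 0 else 0.0 for r, c in zip(rewards, counts)]
    return max(range(len(means)), key=means.__getitem__)

# Simulated interaction stream: 3 items with unknown click-through rates.
true_ctr = [0.1, 0.5, 0.3]      # hypothetical ground truth, hidden from the policy
counts = [0, 0, 0]              # times each item was recommended
rewards = [0.0, 0.0, 0.0]       # accumulated clicks per item

random.seed(0)
for _ in range(5000):
    arm = epsilon_greedy(counts, rewards)
    click = 1.0 if random.random() < true_ctr[arm] else 0.0
    counts[arm] += 1
    rewards[arm] += click

best = max(range(3), key=lambda a: rewards[a] / max(counts[a], 1))
```

Each simulated interaction supplies a single immediate reward; the delayed, longer-term effects discussed above are exactly what this simple bandit loop does not capture.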
Potential contributions include (but are not limited to):
Reinforcement learning and bandits for recommendation
Robust estimators, counterfactual and off-policy evaluation
Causal recommender systems
Using simulation for recommender systems evaluation
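As one illustration of the off-policy evaluation topic above, the following is a minimal inverse propensity scoring (IPS) sketch on simulated logs. The policies, propensities, and reward rates are all invented for the example.

```python
import random

def ips_estimate(logs, target_prob):
    """Estimate the target policy's mean reward from logs gathered
    under a different logging policy, via inverse propensity scoring."""
    total = 0.0
    for item, reward, logging_prob in logs:
        total += reward * target_prob(item) / logging_prob
    return total / len(logs)

# Hypothetical logged data: (recommended item, observed reward, propensity).
random.seed(1)
log_probs = {0: 0.8, 1: 0.2}       # logging policy strongly favours item 0
true_reward = {0: 0.2, 1: 0.6}     # expected rewards, unknown to the estimator
logs = []
for _ in range(20000):
    item = 0 if random.random() < log_probs[0] else 1
    r = 1.0 if random.random() < true_reward[item] else 0.0
    logs.append((item, r, log_probs[item]))

# Counterfactual question: how would "always recommend item 1" have performed?
est = ips_estimate(logs, lambda i: 1.0 if i == 1 else 0.0)
```

The estimate should land near item 1's true expected reward (0.6) even though the logging policy rarely chose it, which is the core idea behind counterfactual evaluation from logged interactions.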