Dagstuhl Seminar 24211: Evaluation Perspectives of Recommender Systems — Driving Research and Education

Recommender-systems evaluation is, and maybe always has been, a challenge. Dagstuhl Seminar 24211 set out to improve the current state of recommender-systems evaluation, and as a participant, I can report that it succeeded ;-).

Dagstuhl Seminar 24211, held from May 20 to May 24, 2024, focused on “Evaluation Perspectives of Recommender Systems: Driving Research and Education.” The seminar aimed to critically examine and reflect on the state of evaluating recommender systems by bringing together academic and industry professionals. Building on the discussions from the PERSPECTIVES workshop series at ACM RecSys 2021–2023, the seminar sought to understand the diverse and potentially contradictory perspectives on evaluation in this field. The goal was to foster a setting for developing and advancing evaluation methodologies for recommender systems, which are crucial for their progress and deployment.

Recommender systems, while a largely applied field, rely heavily on theories from information retrieval, machine learning, and human-computer interaction. Each field offers different theories and evaluation approaches, making the thorough evaluation of recommender systems a complex task that necessitates integrating these diverse perspectives. The seminar provided a platform for experts from these areas to collaborate and discuss state-of-the-art practices. Emphasizing the importance of considering both technical performance and human factors, the seminar aimed to develop comprehensive evaluation metrics, methods, and practices through interdisciplinary collaboration.

The seminar opened with eight invited talks covering the following topics:

  1. Theory of Evaluation; Neil Hurley
  2. Evaluation in Practice; Bart Goethals
  3. Multistakeholder Evaluation; Robin Burke
  4. Multi-method Evaluation; Jürgen Ziegler
  5. Evaluation of Fairness; Michael Ekstrand
  6. Evaluating the Long-Term Impact of Recommender Systems; Joseph Konstan
  7. Optimizing and evaluating for short- or long-term preferences?; Martijn C. Willemsen
  8. Proposal for Evidence-based Best-Practices for Recommender Systems Evaluation; Joeran Beel

Participants voted on the topics and then worked in the following five groups:

  1. Theory of Evaluation; Neil Hurley, Vito Walter Anelli, Alejandro Bellogin, Oliver Jeunen, Lien Michiels, Denis Parra, Rodrygo Santos, Alexander Tuzhilin
  2. Fairness Evaluation; Christine Bauer, Michael Ekstrand, Andrés Ferraro, Maria Maistro, Manel Slokom, Robin Verachtert
  3. Best-Practices for Offline Evaluations of Recommender Systems; Joeran Beel, Dietmar Jannach, Alan Said, Guy Shani, Tobias Vente, Lukas Wegmeth
  4. Multistakeholder and Multimethod Evaluation; Robin Burke, Gediminas Adomavicius, Toine Bogers, Tommaso Di Noia, Dominik Kowald, Julia Neidhardt, Özlem Özgöbek, Maria Soledad Pera, Jürgen Ziegler
  5. Evaluating the Long-Term Impact of Recommender Systems; Andrea Barraza-Urbina, Peter Brusilovsky, Wanling Cai, Kim Falk, Bart Goethals, Joseph A. Konstan, Lorenzo Porcaro, Annelien Smets, Barry Smyth, Marko Tkalčič, Helma Torkamaan, Martijn C. Willemsen

The seminar also highlighted the need to prepare the next generation of researchers to comprehensively evaluate and advance recommender systems. By bringing together participants from various backgrounds, including academic and industry researchers and practitioners, the seminar facilitated a holistic understanding of a recommender system’s performance in its context of use. The organizers, Christine Bauer, Alan Said, and Eva Zangerle, emphasized creating a foundation for future research that integrates technical rigour with practical usability and user interaction considerations, thus driving forward the field of recommender systems.

The participant list included many well-known figures from the RecSys community, and together we produced a report that will be published soon.
