Dagstuhl Seminar 24211: Evaluation Perspectives of Recommender Systems — Driving Research and Education

Recommender-systems evaluation is, and maybe always has been, a challenge. Dagstuhl Seminar 24211 set out to improve the current state of recommender-systems evaluation, and as a participant, I can report that it succeeded ;-).

Dagstuhl Seminar 24211, held from May 20 to May 24, 2024, focused on “Evaluation Perspectives of Recommender Systems: Driving Research and Education.” The seminar aimed to critically examine and reflect on the state of evaluating recommender systems by bringing together academic and industry professionals. Building on the discussions from the PERSPECTIVES workshop series at ACM RecSys 2021–2023, the seminar sought to understand the diverse and potentially contradictory perspectives on evaluation in this field. The goal was to foster a setting for developing and advancing evaluation methodologies for recommender systems, which are crucial for their progress and deployment.

Recommender systems, while a largely applied field, rely heavily on theories from information retrieval, machine learning, and human-computer interaction. Each field offers different theories and evaluation approaches, making the thorough evaluation of recommender systems a complex task that necessitates integrating these diverse perspectives. The seminar provided a platform for experts from these areas to collaborate and discuss state-of-the-art practices. Emphasizing the importance of considering both technical performance and human factors, the seminar aimed to develop comprehensive evaluation metrics, methods, and practices through interdisciplinary collaboration.

The seminar opened with eight invited talks covering the following topics:

  1. Theory of Evaluation; Neil Hurley
  2. Evaluation in Practice; Bart Goethals
  3. Multistakeholder Evaluation; Robin Burke
  4. Multi-method Evaluation; Jürgen Ziegler
  5. Evaluation of Fairness; Michael Ekstrand
  6. Evaluating the Long-Term Impact of Recommender Systems; Joseph Konstan
  7. Optimizing and evaluating for short- or long-term preferences?; Martijn C. Willemsen
  8. Proposal for Evidence-based Best-Practices for Recommender Systems Evaluation; Joeran Beel

Participants voted on the topics and then worked in the following five groups:

  1. Theory of Evaluation; Neil Hurley, Vito Walter Anelli, Alejandro Bellogin, Oliver Jeunen, Lien Michiels, Denis Parra, Rodrygo Santos, Alexander Tuzhilin
  2. Fairness Evaluation; Christine Bauer, Michael Ekstrand, Andrés Ferraro, Maria Maistro, Manel Slokom, Robin Verachtert
  3. Best-Practices for Offline Evaluations of Recommender Systems; Joeran Beel, Dietmar Jannach, Alan Said, Guy Shani, Tobias Vente, Lukas Wegmeth
  4. Multistakeholder and Multimethod Evaluation; Robin Burke, Gediminas Adomavicius, Toine Bogers, Tommaso Di Noia, Dominik Kowald, Julia Neidhardt, Özlem Özgöbek, Maria Soledad Pera, Jürgen Ziegler
  5. Evaluating the Long-Term Impact of Recommender Systems; Andrea Barraza-Urbina, Peter Brusilovsky, Wanling Cai, Kim Falk, Bart Goethals, Joseph A. Konstan, Lorenzo Porcaro, Annelien Smets, Barry Smyth, Marko Tkalčič, Helma Torkamaan, Martijn C. Willemsen

The seminar also highlighted the need to prepare the next generation of researchers to comprehensively evaluate and advance recommender systems. By bringing together participants from various backgrounds, including academic and industry researchers and practitioners, the seminar facilitated a holistic understanding of a recommender system’s performance in its context of use. The organizers, Christine Bauer, Alan Said, and Eva Zangerle, emphasized creating a foundation for future research that integrates technical rigour with practical usability and user interaction considerations, thus driving forward the field of recommender systems.

The participant list included many well-known figures from the RecSys community, and together we produced a report that will be published soon.
