‘Hot’ Research Topics

While ‘Recommender Systems’ is an established research discipline, the questions of how to generate good recommendations, what good recommendations are, and how to measure ‘goodness’ are far from being answered.

To get an idea of ‘hot’ and promising research topics, have a look at the recent workshops of the ACM Recommender Systems conference, and other related conferences.

Our personal list of promising research topics includes the following and is, obviously, biased towards our own research (we wouldn’t do this research if we did not think it was in a promising area).

Multi-Stakeholder Environments

Research on recommender systems normally focuses on “the” user. However, the user is not the only one who matters. The business operating the system has its own interests (probably generating revenue). Similarly, society or government may have an interest, especially if a recommender system is run by a large company. And sometimes, maybe most interestingly from a research perspective, there is not just one kind of user of the system. Think of Spotify, for example. Spotify needs to satisfy the needs of its ‘normal’ users, particularly its paying subscribers. However, there are also the artists and music labels that want their music to be prominently featured. Without happy paying users, Spotify will go bankrupt. Without happy artists, Spotify has no content to generate money from (and would go bankrupt, too). So, how can Spotify use its recommender system to satisfy both groups?

A good starting point to learn more about multi-stakeholder environments is the Workshop on Recommendation in Multistakeholder Environments.

Group Recommendations

Giving recommendations to groups of people is a kind of multi-stakeholder environment. Think of a group of people who want to watch a movie or travel together. The people probably have different preferences and budgets. Giving a recommendation for a movie or hotel to just one person is relatively simple. But how to give recommendations that satisfy the needs of everyone in the group has not yet been researched extensively, and is a promising area of future research.
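To make the problem concrete, here is a minimal sketch of two aggregation strategies that are commonly discussed for group recommendations: taking the average rating of the group, and “least misery” (scoring an item by its unhappiest member). The names, movies, and ratings are made up for illustration, and the function names are our own assumptions, not part of any particular library.

```python
from statistics import mean

# Hypothetical ratings (1-5) that three group members gave to three movies.
ratings = {
    "Movie A": {"alice": 5, "bob": 4, "carol": 1},
    "Movie B": {"alice": 3, "bob": 4, "carol": 3},
    "Movie C": {"alice": 2, "bob": 5, "carol": 4},
}

def average_strategy(item_ratings):
    """Score an item by the mean rating across all group members."""
    return mean(item_ratings.values())

def least_misery_strategy(item_ratings):
    """Score an item by the rating of its least satisfied group member."""
    return min(item_ratings.values())

def recommend_for_group(ratings, strategy):
    """Return the item with the highest group score under the given strategy."""
    return max(ratings, key=lambda item: strategy(ratings[item]))

print(recommend_for_group(ratings, average_strategy))       # Movie C
print(recommend_for_group(ratings, least_misery_strategy))  # Movie B
```

Even in this tiny example the two strategies recommend different movies, which illustrates why “the best recommendation for a group” is not a well-defined target without further assumptions.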

User Interfaces for Recommender Systems

There is surprisingly little research on exactly how to design a user interface for recommender systems. This includes simple questions like:

How many recommendations to display? 1, 2, 3, …?
Where to display them? On the top/bottom/side of a page?
How often? Hourly? Daily? Weekly?
What information to display about the recommended items? Thumbnails? Price? …?
How to label a set of recommendations (“Recommendations”, “Suggestions”, “Sponsored”, …)?

Evaluation

Impact

Surprisingly, it is relatively unknown how much impact recommender systems actually have on a business’s success, its customers, and society. People often cite increases in sales and site activity of up to 30%. But as Sharma et al. (2015) point out, “these estimates likely overstate the true causal estimate, possibly by a large amount”. Consequently, there is still a lot of work to do to identify the true impact, and to discuss what ‘impact’ actually means in the context of recommender systems. Ultimately, recommender systems are about far more than simply maximizing revenue; their consequences, e.g. for society (think of filter bubbles), should also be considered.

A good starting point to learn more about this is the Workshop on the Impact of Recommender Systems.

Reproducibility (Crisis)

The recommender-system community faces a reproducibility crisis, which makes it almost impossible to say which algorithms are truly state-of-the-art. In a recent paper (Are we really making much progress? A worrying analysis of recent neural recommendation approaches), the authors found that of 18 algorithms, 11 (61%) could not be reproduced at all. For the remaining 7 algorithms (39%), the authors managed to reproduce the results, but:

“6 of [the algorithms could] often be outperformed with comparably simple heuristic methods, e.g., based on nearest-neighbor or graph-based techniques. The remaining one clearly outperformed the baselines but did not consistently outperform a well-tuned nonneural linear ranking method.”
Maurizio Ferrari Dacrema, Paolo Cremonesi, Dietmar Jannach

In short: none of the 18 novel algorithms could be shown to deliver a real improvement over a relatively simple, well-tuned baseline.

(Automated) Algorithm Selection / AutoRecSys

Identifying a good recommendation algorithm for a new system is nearly impossible. In one of our own studies, Towards reproducibility in recommender-systems research, we showed that recommendation algorithms perform vastly differently in slightly different scenarios (figure below).

https://link.springer.com/article/10.1007/s11257-016-9174-x

We tested five recommendation algorithms on six German news websites. On almost every website a different algorithm was best (and worst). ‘User-based CF’ performed best on ksta.de, the ‘most popular sequence’ algorithm performed best on sport1.de, the standard ‘most popular’ algorithm performed best on ciao.de, and content-based filtering performed best on motor-talk.de.

Even worse, algorithms performed differently at different times of day, for different genders, etc. For instance, between 18:00 and 4:00, the ‘most popular sequence’ algorithm performed best (see below). Between 4:01 and 17:59, the standard ‘most popular’ algorithm performed best.

Given these results, it is almost impossible to predict which recommendation algorithm will perform well in which scenario.

Automated Recommender Systems (AutoRecSys) attempt to automate the selection and tuning process. However, while interest in this field is growing, the field is still young.
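As a rough illustration of what the simplest form of algorithm selection could look like, the sketch below picks, for each context (e.g. a website or a time slot), the algorithm with the highest click-through rate in a log of past recommendations. This is only a minimal sketch under our own assumptions: the log format, the site names, and all numbers are hypothetical, and real AutoRecSys approaches are considerably more sophisticated.

```python
from collections import defaultdict

# Hypothetical log entries: (context, algorithm, recommendations shown, clicks).
# The contexts and numbers are invented for illustration only.
log = [
    ("site_a", "user_based_cf", 1000, 30),
    ("site_a", "most_popular", 1000, 12),
    ("site_b", "user_based_cf", 1000, 8),
    ("site_b", "most_popular_sequence", 1000, 25),
]

def best_algorithm_per_context(log):
    """Return, for each context, the algorithm with the highest click-through rate."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    # Aggregate impressions and clicks per (context, algorithm) pair.
    for context, algorithm, impressions, clicks in log:
        shown[(context, algorithm)] += impressions
        clicked[(context, algorithm)] += clicks

    best = {}
    for (context, algorithm), impressions in shown.items():
        if impressions == 0:
            continue  # no data, so no click-through rate can be computed
        ctr = clicked[(context, algorithm)] / impressions
        if context not in best or ctr > best[context][1]:
            best[context] = (algorithm, ctr)
    return {context: algorithm for context, (algorithm, _) in best.items()}

print(best_algorithm_per_context(log))
```

Such a lookup only works for contexts that have already been observed. Predicting the best algorithm for a new website or user group is the much harder problem described above.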

References

MULPURU, S. 2006. What you need to know about Third-Party Recommendation Engines. Forrester Research.

GRAU, J. 2009. Personalized product recommendations: Predicting shoppers’ needs.

SHARMA, A. and YAN, B. 2013. Pairwise learning in recommendation: Experiments with community recommendation on linkedin. In ACM Conf. on Recommender Sys.