In recent years, two-sided marketplaces have emerged as viable business models in many real-world applications (e.g. Amazon, Airbnb, Spotify, YouTube), wherein the platforms have customers not only on the demand side (e.g. users) but also on the supply side (e.g. retailers, artists). Such multi-sided marketplaces involve interactions among multiple stakeholders, each with different needs. While traditional recommender systems have focused on increasing consumer satisfaction by providing relevant content to consumers, two-sided marketplaces face the interesting problem of also optimizing their models for supplier preferences and visibility.
In this talk, we begin by describing a contextual bandit model developed for serving explainable music recommendations to users and showcase the need for explicitly considering supplier-centric objectives during optimization. To jointly optimize the objectives of the different marketplace constituents, we present a multi-objective contextual bandit model aimed at maximizing long-term vectorial rewards across different competing objectives. Finally, we discuss theoretical performance guarantees as well as experimental results with historical log data and tests with live production traffic in a large-scale music recommendation service.
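To make the idea of a contextual bandit with vector-valued rewards concrete, here is a minimal sketch. It is not the model presented in the talk: the class name `MultiObjectiveBandit`, the epsilon-greedy exploration rule, the discrete contexts, and the weighted-sum scalarization are all assumptions chosen for brevity.

```python
import random
from collections import defaultdict


class MultiObjectiveBandit:
    """Epsilon-greedy contextual bandit with vector rewards (illustrative only)."""

    def __init__(self, arms, weights, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.weights = list(weights)  # hypothetical trade-off between objectives
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        n = len(self.weights)
        # Running mean reward vector and pull count per (context, arm) pair.
        self.means = defaultdict(lambda: [0.0] * n)
        self.counts = defaultdict(int)

    def scalarize(self, reward_vector):
        # Weighted sum of objectives: one simple way to compare vector rewards.
        return sum(w * r for w, r in zip(self.weights, reward_vector))

    def select(self, context):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(
            self.arms,
            key=lambda arm: self.scalarize(self.means[(context, arm)]),
        )

    def update(self, context, arm, reward_vector):
        # Incremental update of the mean reward vector for this (context, arm).
        key = (context, arm)
        self.counts[key] += 1
        step = 1.0 / self.counts[key]
        self.means[key] = [
            m + step * (r - m) for m, r in zip(self.means[key], reward_vector)
        ]
```

In a marketplace setting, the reward vector might hold one component for a user-side signal and one for a supplier-side signal; changing `weights` then shifts which arm the policy prefers for the same logged feedback.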