{"year":"2023","author":[{"first_name":"Haya","id":"87673","last_name":"Halimeh","full_name":"Halimeh, Haya"},{"last_name":"Freese","full_name":"Freese, Florian","first_name":"Florian"},{"full_name":"Müller, Oliver","last_name":"Müller","id":"72849","first_name":"Oliver"}],"user_id":"87673","main_file_link":[{"open_access":"1","url":"https://scholar.google.com/citations?view_op=view_citation&hl=en&user=zBlrdP4AAAAJ&citation_for_view=zBlrdP4AAAAJ:UeHWp8X0CEIC"}],"date_created":"2024-01-10T14:20:12Z","_id":"50431","status":"public","department":[{"_id":"195"},{"_id":"196"}],"oa":"1","type":"conference","citation":{"mla":"Halimeh, Haya, et al. “Event Recommendations through the Lens of Vision and Language Foundation Models.” Workshop on Recommenders in Tourism, Co-Located with the 17th ACM Conference on Recommender Systems, 2023.","short":"H. Halimeh, F. Freese, O. Müller, in: Workshop on Recommenders in Tourism, Co-Located with the 17th ACM Conference on Recommender Systems, 2023.","chicago":"Halimeh, Haya, Florian Freese, and Oliver Müller. “Event Recommendations through the Lens of Vision and Language Foundation Models.” In Workshop on Recommenders in Tourism, Co-Located with the 17th ACM Conference on Recommender Systems, 2023.","ama":"Halimeh H, Freese F, Müller O. Event Recommendations through the Lens of Vision and Language Foundation Models. In: Workshop on Recommenders in Tourism, Co-Located with the 17th ACM Conference on Recommender Systems. ; 2023.","bibtex":"@inproceedings{Halimeh_Freese_Müller_2023, title={Event Recommendations through the Lens of Vision and Language Foundation Models}, booktitle={Workshop on Recommenders in Tourism, co-located with the 17th ACM Conference on Recommender Systems}, author={Halimeh, Haya and Freese, Florian and Müller, Oliver}, year={2023} }","apa":"Halimeh, H., Freese, F., & Müller, O. (2023). Event Recommendations through the Lens of Vision and Language Foundation Models. 
Workshop on Recommenders in Tourism, Co-Located with the 17th ACM Conference on Recommender Systems. Workshop on Recommenders in Tourism, co-located with the 17th ACM Conference on Recommender Systems.","ieee":"H. Halimeh, F. Freese, and O. Müller, “Event Recommendations through the Lens of Vision and Language Foundation Models,” presented at the Workshop on Recommenders in Tourism, co-located with the 17th ACM Conference on Recommender Systems, 2023."},"date_updated":"2024-01-10T16:10:04Z","language":[{"iso":"eng"}],"conference":{"name":"Workshop on Recommenders in Tourism, co-located with the 17th ACM Conference on Recommender Systems","end_date":"2023-09-22","start_date":"2023-09-18"},"publication":"Workshop on Recommenders in Tourism, co-located with the 17th ACM Conference on Recommender Systems","abstract":[{"text":"Recommender systems now span the entire customer journey. Amid the multitude of diversified experiences, immersing in cultural events has become a key aspect of tourism. Cultural events, however, suffer from fleeting lifecycles, evade exact replication, and invariably lie in the future. In addition, their low standardization makes harnessing historical data regarding event content or past patron evaluations intricate. The distinctive traits of events thereby compound the challenge of the cold-start dilemma in event recommenders. Content-based recommendations stand as a viable avenue to alleviate this issue, functioning even in scenarios where item-user information is scarce. Still, the effectiveness of content-based recommendations often hinges on the quality of the data representation they build upon. In this study, we explore an array of cutting-edge uni- and multimodal vision and language foundation models (VL-FMs) for this purpose. 
Next, we derive content-based recommendations through a straightforward clustering approach that groups akin events together, and evaluate the efficacy of the models through a series of online user experiments across three dimensions: similarity-based evaluation, comparison-based evaluation, and clustering assignment evaluation. Our experiments generated four major findings. First, we found that all VL-FMs consistently outperformed a naive baseline of recommending randomly drawn events. Second, unimodal text-based embeddings were surprisingly on par or in some cases even superior to multimodal embeddings. Third, multimodal embeddings yielded arguably more fine-grained and diverse clusters in comparison to their unimodal counterparts. Finally, we could confirm that cross event interest is indeed reliant on the perceived similarity of events, resonating with the notion of similarity in content-based recommendations. All in all, we believe that leveraging the potential of contemporary FMs for content-based event recommendations would help address the cold-start problem and propel this field of research forward in new and exciting ways.","lang":"eng"}],"title":"Event Recommendations through the Lens of Vision and Language Foundation Models"}