Exploiting the Web of Data for cross-domain information retrieval and recommendation




Exploiting the Web of Data for cross-domain information retrieval and recommendation
Ignacio Fernández-Tobías, under the supervision of Iván Cantador
Grupo de Recuperación de Información, Universidad Autónoma de Madrid
i.fernandez@uam.es
VII Jornadas MAVIR: Avances en Tecnologías de la Lengua y Acceso a la Información Multimedia
Escuela Politécnica Superior, Universidad Carlos III de Madrid, 26-27 November 2012

Contents
- Introduction: Cross-domain item recommendation
- Case study: Linking music with places of interest
- A semantic-based framework for linking domains
- Cross-domain semantic networks from Wikipedia
- Cross-domain semantic networks from Open Information Extraction
- A social tag-based emotion-oriented approach for linking domains

Introduction: Cross-domain item recommendation
- Recommender systems help users make choices by proactively finding relevant items or services, taking into account or predicting the users' tastes, priorities and goals
- The vast majority of currently available recommender systems predict the relevance of items to the user within a single, limited domain

Introduction: Cross-domain item recommendation
In some applications, it could be useful to offer the user joint personalized recommendations of items belonging to multiple domains
- In an e-commerce site, we may suggest movies or videogames based on a particular book bought by a customer
- In a travel application, we may suggest cultural events that may interest a person who has booked a hotel in a particular place
- In an e-learning system, we may suggest educational websites with topics related to a video documentary a student has seen
Potential benefits
- Offering diversity and serendipity
- Addressing the cold-start problem (on a target domain)
- Mitigating the sparsity problem
Fernández-Tobías, I., Cantador, I., Kaminskas, M., Ricci, F. 2012. Cross-domain Recommender Systems: A Survey of the State of the Art. 2nd Spanish Conference on Information Retrieval.

Introduction: Cross-domain item recommendation
Some real applications (e.g. Amazon) already recommend items from different domains, but their recommendations either rely on statistical analysis of popular items, without any personalization strategy, or, in most cases, exploit only information about the user's preferences in the target domain

Introduction: Cross-domain item recommendation
Context
- User and item profiles are distributed across multiple systems
- There are no, or only a few, user profiles with preferences on items in different domains
Goal
- Automatically establishing links or transferring knowledge between domains


Case study: Linking music with places of interest
Case study: Suggesting music / musicians highly related to a particular point of interest (POI)
Relations between music and places
- Based on common emotions caused by listening to music and visiting POIs (from social tags)
  Kaminskas, M., Ricci, F. 2011. Location-Adapted Music Recommendation Using Tags. 19th Intl. Conference on User Modeling, Adaptation and Personalization, 183-194.
- Based on explicit semantic associations between musicians and POIs (from information available in the (Semantic) Web)
[Figure: the Vienna State Opera linked to Gustav Mahler, Arnold Schoenberg and Wolfgang Amadeus Mozart through concepts such as Austrian musicians, Romanticism, Classical music, Opera composers and 19th century]

Case study: Linking music with places of interest
Semantic relations between musicians and POIs
- Location relations: Arnold Schoenberg was born in Vienna, which is the city where the Vienna State Opera is located
- Time relations: Gustav Mahler was born in 1860, a year in the decade (the 1860s) when the Vienna State Opera was built
- Architecture-History/Art-Music category relations: Wolfgang A. Mozart was a classical music composer, and classical compositions are played in opera houses, which is the building type of the Vienna State Opera
- Arbitrary relations: Gustav Mahler was the director of the Vienna State Opera; Ana Belén (a famous Spanish singer) composed a song about La Puerta de Alcalá (a well-known POI in Madrid)
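The relations above can be read as paths in a small graph of (subject, relation, object) triples. The sketch below is only an illustration under that assumption: the triples are hand-written from the examples on this slide, the relation names `director_of` and `played_in` are hypothetical labels, and breadth-first search stands in for whatever path extraction the framework actually uses.

```python
from collections import deque

# Hand-written triples mirroring the slide's examples (illustrative only).
TRIPLES = [
    ("Vienna State Opera", "located_in", "Vienna"),
    ("Vienna", "birth_place_of", "Arnold Schoenberg"),
    ("Gustav Mahler", "director_of", "Vienna State Opera"),  # hypothetical label
    ("Opera houses", "type_of", "Vienna State Opera"),
    ("Classical music", "played_in", "Opera houses"),        # hypothetical label
    ("Classical music", "genre_of", "Wolfgang Amadeus Mozart"),
]


def build_graph(triples):
    """Index triples by both endpoints so paths can be followed in either direction."""
    graph = {}
    for triple in triples:
        subject, _, obj = triple
        graph.setdefault(subject, []).append((obj, triple))
        graph.setdefault(obj, []).append((subject, triple))
    return graph


def find_path(graph, source, target):
    """Breadth-first search; returns the shortest chain of triples or None."""
    came_from = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbour, triple in graph.get(node, []):
            if neighbour not in came_from:
                came_from[neighbour] = (node, triple)
                queue.append(neighbour)
    if target not in came_from:
        return None
    path = []
    node = target
    while came_from[node] is not None:
        node, triple = came_from[node]
        path.append(triple)
    path.reverse()
    return path


graph = build_graph(TRIPLES)
schoenberg_path = find_path(graph, "Vienna State Opera", "Arnold Schoenberg")
mozart_path = find_path(graph, "Vienna State Opera", "Wolfgang Amadeus Mozart")
```

Here `schoenberg_path` is the location relation (POI, city, musician) and `mozart_path` is the longer category relation through the building type and the music genre.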


Cross-domain semantic networks from Wikipedia
[Figure: information taken from Wikipedia: building type, date, city, (architecture) categories]

Cross-domain semantic networks from Wikipedia
Linking Wikipedia's architecture and music categories
[Figure: category graph linking architecture categories (Architectural styles, 19th century architecture, Visitor attractions, Arts venues, Music venues, Opera houses) through history and art categories (Historical eras, Modern history, 18th century, 19th century, Romanticism) to music categories (Music people, Musicians, Composers, Classical composers, Opera composers, Romantic composers, 19th century musicians, 19th century composers, 19th century in music, Music genres, Opera, Classical music)]
Kaminskas, M., Fernández-Tobías, I., Ricci, F., Cantador, I. 2013. Ontology-based Identification of Music for Places. 13th Intl. Conference on Information and Communication Technologies in Tourism.


Cross-domain semantic networks from Wikipedia
Cross-domain taxonomies from Wikipedia
- Architecture: Architectural styles, Visitor attractions, Centuries in architecture, 19th century architecture, Arts venues, Music venues, Opera houses
- History / Art: Historical eras, Centuries, Modern history, 18th century, 19th century, Romanticism
- Music: Music people, Musicians, Composers, Classical composers, Opera composers, Romantic composers, 19th century musicians, 19th century composers, Music genres, Opera, Classical music

[Figure: classes and relations of the semantic framework]
- Classes: POI, Musician, City, Date, Year, Decade, Century, Architectural style, Architectural era, Historical era, Musical era, Building type, Music genre, Musician type
- Relations: located_in, building_start_date_of, building_end_date_of, opening_date_of, birth_place_of, death_place_of, residence_place_of, birth_date_of, death_date_of, activity_date_of, has_style, has_type, type_of, genre_of, subcategory_of

[Figure: example semantic network linking the Vienna State Opera and Gustav Mahler]
- City: Vienna, Austria (death_place_of)
- Date: 1869; 1860s (birth_decade_of); 19th century (activity_century_of)
- Architectural styles / eras: 1869 architecture, 19th century architecture
- Historical eras: 19th century, Romanticism
- Musical eras: Romantic music, 19th century in music
- Building types: Opera houses in Vienna, Opera houses in Austria, Opera houses
- Music genres: Opera, Classical music
- Musician types: Classical composers, Romantic composers, 19th century composers

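Cross-domain semantic networks from Wikipedia
Ranking algorithms (score of node i)
- Weight Spreading Activation: $S(i) = (1 - d)\, rel(i) + d \sum_{j \to i} w_{ji}\, S(j)$
- PageRank: $PR(i) = (1 - d)\, \frac{1}{N} + d \sum_{j \to i} \frac{PR(j)}{L(j)}$
- HITS: the score is the authority value $A(i)$, with $A(i) = \sum_{j \to i} H(j)$ and $H(i) = \sum_{i \to j} A(j)$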
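As a rough illustration, the three scores can be computed by fixed-point iteration on a toy graph. Everything concrete here is an assumption made for the sketch: the node names, the edge weights, the damping values, the fixed number of iterations, and the L2 normalization inside HITS (which is standard practice but is not stated on the slide).

```python
def nodes_of(edges):
    """Collect every node that appears as a source or target of an edge."""
    return {n for edge in edges for n in edge}


def spreading_activation(edges, rel, d=0.5, iters=50):
    """S(i) = (1 - d) * rel(i) + d * sum over j->i of w_ji * S(j)."""
    nodes = nodes_of(edges) | set(rel)
    s = {n: rel.get(n, 0.0) for n in nodes}
    for _ in range(iters):
        s = {i: (1 - d) * rel.get(i, 0.0)
                + d * sum(w * s[j] for (j, k), w in edges.items() if k == i)
             for i in nodes}
    return s


def pagerank(edges, d=0.85, iters=50):
    """PR(i) = (1 - d) / N + d * sum over j->i of PR(j) / L(j)."""
    nodes = nodes_of(edges)
    n = len(nodes)
    out_links = {j: sum(1 for (src, _) in edges if src == j) for j in nodes}
    pr = {i: 1.0 / n for i in nodes}
    for _ in range(iters):
        # Nodes without out-links simply leak rank in this simplified version.
        pr = {i: (1 - d) / n
                 + d * sum(pr[j] / out_links[j] for (j, k) in edges if k == i)
              for i in nodes}
    return pr


def hits(edges, iters=50):
    """A(i) = sum over j->i of H(j); H(i) = sum over i->j of A(j)."""
    nodes = nodes_of(edges)
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iters):
        auth = {i: sum(hub[j] for (j, k) in edges if k == i) for i in nodes}
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {i: v / norm for i, v in auth.items()}  # assumed normalization
        hub = {i: sum(auth[k] for (j, k) in edges if j == i) for i in nodes}
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {i: v / norm for i, v in hub.items()}
    return auth, hub


# Toy network: a POI linked to two musicians through a city and a genre.
EDGES = {
    ("poi", "city"): 0.5,
    ("poi", "genre"): 0.5,
    ("city", "mahler"): 1.0,
    ("genre", "mahler"): 0.5,
    ("genre", "mozart"): 0.5,
}

spreading = spreading_activation(EDGES, rel={"poi": 1.0})
ranks = pagerank(EDGES)
authority, _ = hits(EDGES)
```

In this toy network the musician reached through two paths (`mahler`) outranks the one reached through a single path (`mozart`) under all three algorithms; only the spreading variant uses the edge weights and the initial relevance of the POI.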

Cross-domain semantic networks from Wikipedia
Evaluation data: 97 users, 17 cities, 25 POIs, 356 POI-musician pairs, 1155 assessments

Cross-domain semantic networks from Wikipedia
Average precision values for the top 5 ranked musicians for each POI

            P@1     P@2     P@3     P@4     P@5
Random      0.355*  0.391*  0.363*  0.435*  0.413*
HITS        0.688   0.706   0.711*  0.700*  0.694
PageRank    0.753   0.728   0.707*  0.660*  0.646*
Spreading   0.810   0.804   0.828   0.847   0.837

Values marked with * differ significantly from those of the Spreading algorithm (Wilcoxon signed-rank test, p<0.05)
Fernández-Tobías, I., Kaminskas, M., Cantador, I., Ricci, F. 2013. A semantic framework for supporting cross-domain recommendation: Suggesting music for places of interest. Submitted.
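For reference, P@k is the fraction of the top k ranked musicians that were judged relevant, averaged over POIs. The sketch below uses made-up rankings and judgments (the names `rankings` and `judgments` are hypothetical) and is not the evaluation script behind the table.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the first k ranked items that appear in the relevant set."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def mean_precision_at_k(rankings, judgments, k):
    """Average P@k over all POIs that have a ranking."""
    values = [precision_at_k(ranked, judgments.get(poi, set()), k)
              for poi, ranked in rankings.items()]
    return sum(values) / len(values)


# Made-up data: ranked musicians per POI and the ones assessed as relevant.
rankings = {
    "poi_a": ["m1", "m2", "m3", "m4", "m5"],
    "poi_b": ["m6", "m7", "m8", "m9", "m10"],
}
judgments = {
    "poi_a": {"m1", "m3"},
    "poi_b": {"m6", "m7"},
}

p_at_1 = mean_precision_at_k(rankings, judgments, 1)
p_at_2 = mean_precision_at_k(rankings, judgments, 2)
```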

Cross-domain semantic networks from Wikipedia
[Figure: average number of semantic paths per POI]
Percentages of interesting and obvious musicians recommended by the Spreading algorithm

              Interesting   Non interesting
Related       78.3%         21.7%
Non-related   8.2%          91.8%

              Non obvious   Obvious
Related       58.9%         41.1%
Non-related   84.2%         15.8%

Fernández-Tobías, I., Kaminskas, M., Cantador, I., Ricci, F. 2013. A semantic framework for supporting cross-domain recommendation: Suggesting music for places of interest. Submitted.


Cross-domain semantic networks from Open Information Extraction
- TextRunner (openie.cs.washington.edu) and ReVerb (reverb.cs.washington.edu): automatic identification and extraction of binary relationships from English sentences
- Linked to Freebase
Etzioni, O., Fader, A., Christensen, J., Soderland, S., Mausam. 2011. Open Information Extraction: The Second Generation. 22nd International Joint Conference on Artificial Intelligence, pp. 3-10.

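Cross-domain semantic networks from Open Information Extraction
Filtering relations based on a TF-IDF heuristic
$w(e_1, r, e_2) = \lambda\, \hat{c}(e_1, e_2) + (1 - \lambda)\, \mathrm{tfidf}(r)$, where $\hat{c}(e_1, e_2)$ is $c(e_1, e_2)$ normalized over the values $c(e_i, e_j)$ of all entity pairs
$\mathrm{tfidf}(r) = \frac{|\{(e_i, r, e_j) \in G\}|}{\max_s |\{(e_i, s, e_j) \in G\}|} \cdot \log \frac{N}{|\{(e_i, r, e_j) \in C\}|}$
Ranking entities according to node categories and graph structure
$w(e) = \alpha_1 w_T(e) + \alpha_2 w_P(e) + \alpha_3 w_D(e)$
- $w_T(e)$: computed from the categories $T(e)$ of the entity and $D$
- $w_P(e)$: computed from the paths $s \to e$
- $w_D(e)$: computed from $dist(s, e)$
Fernández-Tobías, I., Cantador, I. 2013. Open Cross-domain Semantic Networks: Application to Item-to-item Recommendation. To be submitted.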
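One possible reading of the TF-IDF heuristic is sketched below. It assumes that G is the set of triples in the network being built, that C is the whole collection of extracted triples, and that N is the size of C; none of these definitions is confirmed by the slide. The triples and the function names `relation_tfidf` and `edge_weight` are hypothetical.

```python
import math
from collections import Counter


def relation_tfidf(graph_triples, corpus_triples):
    """Score each relation in the graph: frequent locally, rare in the collection."""
    tf = Counter(rel for _, rel, _ in graph_triples)
    df = Counter(rel for _, rel, _ in corpus_triples)
    max_tf = max(tf.values())
    n = len(corpus_triples)  # assumption: N is the number of triples in C
    return {rel: (count / max_tf) * math.log(n / df[rel]) if df[rel] else 0.0
            for rel, count in tf.items()}


def edge_weight(confidence, tfidf, lam=0.5):
    """Linear mix of a normalized confidence in [0, 1] and the relation's tf-idf."""
    return lam * confidence + (1 - lam) * tfidf


# Hypothetical extractions: "is" is generic, "was born in" is informative.
corpus = [
    ("a", "is", "b"),
    ("c", "is", "d"),
    ("e", "is", "f"),
    ("e", "was born in", "g"),
]
network = [
    ("e", "is", "f"),
    ("e", "was born in", "g"),
]

scores = relation_tfidf(network, corpus)
weight = edge_weight(confidence=0.8, tfidf=scores["was born in"])
```

Under these assumptions the generic relation receives the lower score, so a threshold on the resulting edge weight would filter it out first.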



A social tag-based emotion-oriented approach for linking domains
Mining social tagging systems to create linked emotion-oriented folksonomies

A social tag-based emotion-oriented approach for linking domains
Generic emotion lexicon
- Automatically created by mining online thesauri (e.g. thesaurus.com)
- 16 main emotions: alert, excited, elated, happy, content, serene, relaxed, calm, fatigued, bored, depressed, sad, upset, stressed, nervous, tense
- Emotion = synonym & antonym vector
  - Synonyms: positive weights
  - Antonyms: negative weights
- Example: happy = {happy: +66, cheerful: +21, merry: +19, felicitous: +17, unhappy: -11, sad: -10, depressed: -6, serious: -4, ...}
Fernández-Tobías, I., Plaza, L., Cantador, I. 2013. Cross-domain Emotion Folksonomies. To be submitted.
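A minimal sketch of how such vectors could be used: each emotion is a sparse term-to-weight mapping, and a tagged item is compared to every emotion with cosine similarity. The `happy` vector is the one shown above; the `sad` vector, the item tags and the choice of cosine similarity are assumptions made for the illustration.

```python
import math

LEXICON = {
    # Weights taken from the example above.
    "happy": {"happy": 66, "cheerful": 21, "merry": 19, "felicitous": 17,
              "unhappy": -11, "sad": -10, "depressed": -6, "serious": -4},
    # Hypothetical vector, invented for this sketch.
    "sad": {"sad": 50, "unhappy": 20, "depressed": 15,
            "happy": -12, "cheerful": -8},
}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def closest_emotion(tag_counts, lexicon):
    """Return the emotion whose vector is most similar to the item's tag profile."""
    return max(lexicon, key=lambda emotion: cosine(tag_counts, lexicon[emotion]))


# Hypothetical item tagged three times "cheerful" and once "merry".
item_tags = {"cheerful": 3, "merry": 1}
best = closest_emotion(item_tags, LEXICON)
```

Because antonyms carry negative weights, an item tagged with synonyms of one emotion gets a negative similarity to the opposite emotion rather than just a low one.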

A social tag-based emotion-oriented approach for linking domains
Generic emotion lexicon
- In accordance with Russell's emotion model (1980)
- Emotion representation in 2 dimensions: pleasure & arousal
[Figure: Russell's emotion model (axes: pleasure and arousal; regions: excitement, distress, misery, depression, sleepiness, contentment) next to the obtained emotion vectors for the 16 emotions projected into 2 dimensions (PCA)]
Russell, J. A. 1980. A Circumplex Model of Affect. Journal of Personality and Social Psychology 39(6), pp. 1161-1178.

A social tag-based emotion-oriented approach for linking domains
Domain-dependent emotion folksonomies
- Particular emotional categories in each domain
- Each category is composed of a set of co-occurring tags in the domain folksonomy
  - Movies (MovieLens, Jinni, IMDb): bittersweet, emotional, feel good, scary, ...
  - Music (Last.fm, GEMS): wonder, tenderness, nostalgia, peacefulness, ...
  - Books (BookCrossing, LibraryThing, Whichbook): funny, unpredictable, disgusting, violent, ...


Case study: Linking music with places of interest
Vienna State Opera and Arnold Schoenberg
- Arnold Schoenberg was born in Vienna, where the Vienna State Opera is located
- Arnold Schoenberg was born in the 19th century, when the Vienna State Opera was built
- Arnold Schoenberg was a Classical music composer; the Classical music genre is related to Opera houses, which is the building type of the Vienna State Opera
Las Ventas and Antonio Flores
- Antonio Flores was born in Madrid, where Las Ventas is located
- Antonio Flores died in the 20th century, when Las Ventas was built
- Antonio Flores was a Flamenco singer; Flamenco is a Romani music genre and is related to Moorish architecture, and Moorish Revival architecture is the architectural style of Las Ventas

Cross-domain semantic networks from Wikipedia
Average precision values obtained by the Spreading algorithm for the top 5 ranked musicians for each POI type

                          P@1     P@2     P@3     P@4     P@5
Music venues (4)          0.838   0.688   0.838   0.829   0.870
Religious buildings (8)   0.721   0.965   0.844   0.795   0.781
Castles and palaces (6)   0.794   0.704   0.792   0.900   0.825
Other POIs (7)            0.908   0.772   0.836   0.872   0.893

Cross-domain semantic networks from Wikipedia
Evaluating if tracks of the retrieved musicians are relevant to POIs