Exploring nmf for Word Sense Discrimination

University of Groningen
clin, December 7, 2007, Nijmegen
Semantic similarity

Most work on semantic similarity relies on the distributional hypothesis (Harris 1954).
Take a word and its contexts:
- tasty smoutebol
- greasy smoutebol
- a portion of smoutebol
- smoutebol with loads of powdered sugar
→ FOOD
By looking at a word's context, one can infer its meaning.
Two kinds of context

1. Bag-of-words context
- a window around the word is used as context, e.g. a fixed number of words, the paragraph in which the word appears, ...
- often used with some form of dimensionality reduction
2. Syntactic context
- a corpus is parsed and dependency triples are extracted, e.g. <apple, obj, eat>, <apple, adj, red>
- typically does not use any form of dimensionality reduction
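As a minimal illustration of the first kind of context, bag-of-words co-occurrences can be gathered by counting the words inside a fixed window around each occurrence of a target word; the function name and toy sentence below are illustrative, not from the talk:

```python
from collections import Counter

def bow_context(tokens, target, window=2):
    """Count the words that occur within a fixed window
    around every occurrence of the target word."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo, hi = max(0, i - window), i + window + 1
            counts.update(t for t in tokens[lo:hi] if t != target)
    return counts

# Toy sentence; 'apple' is the target word.
tokens = "we eat a red apple and a green apple".split()
print(bow_context(tokens, "apple", window=2))
```

Syntactic context would instead collect dependency triples such as <apple, obj, eat> from a parsed corpus.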
Ambiguity

Problem: ambiguity. Compare:
- a trendy bar
- an iron bar
- today's air pressure: 1.013 bar
Different meanings, but they are considered the same entity by a naive algorithm.

Main research question: can bag-of-words context and syntactic context be combined to differentiate between the various senses of a word?
Technique: Non-Negative Matrix Factorization

Given a non-negative matrix V, find non-negative matrix factors W and H such that:

    V_{n×m} ≈ W_{n×r} H_{r×m}    (1)

Choosing r ≪ n, m reduces the data.
Constraint on the factorization: all values in the three matrices need to be non-negative (≥ 0).
The constraint brings about a parts-based representation: only additive, no subtractive relations are allowed.
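Equation (1) can be computed with the classic multiplicative update rules of Lee and Seung; the sketch below is a bare-bones version on a toy matrix, not the talk's actual implementation:

```python
import numpy as np

def nmf(V, r, n_iter=200, eps=1e-9, seed=0):
    """Factorize non-negative V (n x m) into W (n x r) and H (r x m)
    with multiplicative updates; all values stay non-negative."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W, H = rng.random((n, r)), rng.random((r, m))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)  # update H; ratios keep it >= 0
        W *= (V @ H.T) / (W @ H @ H.T + eps)  # update W; ratios keep it >= 0
    return W, H

# Toy co-occurrence matrix with two obvious "dimensions".
V = np.array([[5.0, 4.0, 0.0],
              [4.0, 5.0, 0.0],
              [0.0, 0.0, 6.0]])
W, H = nmf(V, r=2)
print(np.round(W @ H, 1))  # approximately reconstructs V
```

Because the updates are purely multiplicative, negative values never appear, which is what yields the parts-based (additive) representation.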
Non-Negative Matrix Factorization

Context vectors (5k nouns × 2k co-occurring nouns) extracted from the clef corpus.
nmf is able to capture semantic dimensions. Examples:
- bus 'bus', taxi 'taxi', trein 'train', halte 'stop', reiziger 'traveler', perron 'platform', tram 'tram', station 'station', chauffeur 'driver', passagier 'passenger'
- bouillon 'broth', slagroom 'cream', ui 'onion', eierdooier 'egg yolk', laurierblad 'bay leaf', zout 'salt', deciliter 'decilitre', boter 'butter', bleekselderij 'celery', saus 'sauce'
Methodology

Goal: classification of nouns according to both bag-of-words context and syntactic context.
Construct three matrices capturing co-occurrence frequencies, one for each mode:
- nouns cross-classified by dependency relations
- nouns cross-classified by (bag-of-words) context words
- dependency relations cross-classified by context words
Apply nmf to the matrices, but interleave the process: the result of the former factorization is used to initialize the factorization of the next one.
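The interleaving could be sketched as follows; the exact initialization scheme is not spelled out on the slide, so treat the order below (noun factor carried from the first matrix into the second, dependency and context-word factors seeding the third) as one plausible reading rather than the talk's actual algorithm:

```python
import numpy as np

def nmf_step(V, W, H, eps=1e-9):
    """One multiplicative-update pass; W and H stay non-negative."""
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def interleaved_nmf(A, B, C, r, n_iter=100, seed=0):
    """A: nouns x dependency relations, B: nouns x context words,
    C: dependency relations x context words."""
    rng = np.random.default_rng(seed)
    Wn, Hd = rng.random((A.shape[0], r)), rng.random((r, A.shape[1]))
    for _ in range(n_iter):                    # 1) nouns x deps
        Wn, Hd = nmf_step(A, Wn, Hd)
    Hw = rng.random((r, B.shape[1]))
    for _ in range(n_iter):                    # 2) noun factor carried over
        Wn, Hw = nmf_step(B, Wn, Hw)
    Wd, Hw2 = Hd.T.copy(), Hw.copy()           # 3) seeded from steps 1 and 2
    for _ in range(n_iter):
        Wd, Hw2 = nmf_step(C, Wd, Hw2)
    return Wn, Hd, Hw, Wd
```

The point of sharing factors across the three factorizations is that the same r latent dimensions are forced to describe nouns, dependency relations, and context words jointly.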
Graphical Representation

[figure not preserved in the text extraction]
Sense subtraction

Idea: switch off one dimension of an ambiguous word to reveal other possible senses.
- From matrix W, we know which dimensions are the most important for a certain word.
- Matrix H gives the importance of each dependency relation given a dimension.
- Subtract the dependency relations that are responsible for a given dimension from the original noun vector:

    v_new = v_orig ⊙ (1 − h_dim)

- Each dependency relation is multiplied by a scaling factor, according to the load of the feature on the subtracted dimension.
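The subtraction step itself is a one-liner; in the sketch below the loads h are normalized so the feature with the highest load on the subtracted dimension is removed entirely (the exact scaling factor is an assumption, and the toy numbers are made up):

```python
import numpy as np

def subtract_sense(v_orig, H, dim):
    """Switch off one semantic dimension: scale each dependency-relation
    feature by (1 - h), where h is its load on the subtracted dimension."""
    h = H[dim] / H[dim].max()   # normalize loads to [0, 1] (assumption)
    return v_orig * (1.0 - h)

# 4 dependency-relation features, 2 dimensions (toy numbers).
H = np.array([[0.9, 0.8, 0.0, 0.1],   # loads on dimension 0
              [0.0, 0.1, 0.7, 0.9]])  # loads on dimension 1
v = np.array([10.0, 8.0, 6.0, 4.0])
print(subtract_sense(v, H, dim=0))    # features loading on dim 0 are suppressed
```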
Combination with clustering

- A simple clustering algorithm (e.g. K-means) assigns an ambiguous noun to its predominant sense.
- The centroid of that cluster is folded into the topic model.
- The dimensions that define the centroid are subtracted from the ambiguous noun vector.
- The adapted noun vector is fed to the clustering algorithm again.
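Put together, the loop could look like this; `W_centroids` (each centroid's loads on the nmf dimensions, standing in for the result of folding the centroid into the topic model) and the plain nearest-centroid assignment are simplifying assumptions:

```python
import numpy as np

def nearest_centroid(v, centroids):
    """Cluster assignment by Euclidean distance (stand-in for K-means)."""
    return int(np.argmin(np.linalg.norm(centroids - v, axis=1)))

def discriminate(v, centroids, H, W_centroids, n_senses=2):
    """Repeatedly assign the noun vector to its predominant cluster,
    then subtract that cluster's dominant dimension from the vector."""
    senses = []
    for _ in range(n_senses):
        c = nearest_centroid(v, centroids)
        senses.append(c)
        dim = int(np.argmax(W_centroids[c]))   # centroid's dominant dimension
        h = H[dim] / (H[dim].max() + 1e-9)
        v = v * (1.0 - h)                      # sense subtraction
    return senses

# Toy setup: 4 features, 2 clusters, 2 dimensions.
v = np.array([10.0, 8.0, 1.0, 2.0])
centroids = np.array([[9.0, 9.0, 0.0, 0.0], [0.0, 0.0, 9.0, 9.0]])
H = np.array([[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]])
W_centroids = np.array([[1.0, 0.0], [0.0, 1.0]])
print(discriminate(v, centroids, H, W_centroids))  # → [0, 1]
```

After the dominant sense's dimension is suppressed, the residual vector falls into the second cluster, which is exactly the intended effect.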
Experimental Design

- Approach applied to Dutch, using the clef corpus (Dutch newspaper texts, 1994–1995)
- Corpus parsed with the Dutch dependency parser alpino
- Three matrices constructed:
  - 5k nouns × 40k dependency relations
  - 5k nouns × 2k context words
  - 40k dependency relations × 2k context words
- Factorization to 100 dimensions
Example dimension: transport

1. nouns: auto 'car', wagen 'car', tram 'tram', motor 'motorbike', bus 'bus', metro 'subway', automobilist 'driver', trein 'train', stuur 'steering wheel', chauffeur 'driver'
2. context words: auto 'car', trein 'train', motor 'motorbike', bus 'bus', rij 'drive', chauffeur 'driver', fiets 'bike', reiziger 'traveler', passagier 'passenger', vervoer 'transport'
3. dependency relations: viertraps_adj 'four-stage', verplaats met_obj 'move with', toeter_adj 'honk', tank inhoud_obj [parsing error], tank_subj 'refuel', tank_obj 'refuel', rij voorbij_subj 'pass by', rij voorbij_adj 'pass by', rij af_subj 'drive off', peperduur_adj 'very expensive'
Pop: most similar words (pop 'music' vs. pop 'doll')

1. pop, rock, jazz, meubilair 'furniture', popmuziek 'pop music', heks 'witch', speelgoed 'toy', kast 'cupboard', servies '[tea] service', vraagteken 'question mark'
2. pop, meubilair 'furniture', speelgoed 'toy', kast 'cupboard', servies '[tea] service', heks 'witch', vraagteken 'question mark', sieraad 'jewel', sculptuur 'sculpture', schoen 'shoe'
3. pop, rock, jazz, popmuziek 'pop music', heks 'witch', danseres 'dancer', servies '[tea] service', kopje 'cup', house 'house music', aap 'monkey'
Pop: clusters

1. house, jazz, pop, rock
2. elektronica 'electronics', horloge 'watch', juweel 'jewel', kleding 'clothes', parfum 'perfume', pop 'doll', schoen 'shoe', sieraad 'jewel', speelgoed 'toy', textiel 'textile'
Barcelona: most similar words (Spanish city vs. Spanish football club)

1. Barcelona, Arsenal, Inter, Juventus, Vitesse, Milaan 'Milan', Madrid, Parijs 'Paris', Wenen 'Vienna', München 'Munich'
2. Barcelona, Milaan 'Milan', München 'Munich', Wenen 'Vienna', Madrid, Parijs 'Paris', Bonn, Praag 'Prague', Berlijn 'Berlin', Londen 'London'
3. Barcelona, Arsenal, Inter, Juventus, Vitesse, Parma, Anderlecht, PSV, Feyenoord, Ajax
Barcelona: clusters

1. Anderlecht, Arsenal, Barcelona, Bayern, Inter, Juventus, Milan, Parma
2. Antwerpen 'Antwerp', Barcelona, Berlijn 'Berlin', Bonn, Bremen, Brussel 'Brussels', Frankfurt, Hamburg, Londen 'London', Milaan 'Milan', München 'Munich', Parijs 'Paris', Rome, Wenen 'Vienna'
Conclusion

- Combining bag-of-words data and syntactic data is useful:
  - bag-of-words data (factorized with nmf) puts its finger on topical dimensions
  - syntactic data is particularly good at finding similar words
- A clustering approach allows one to determine which topical dimension(s) are responsible for a certain sense, and to adapt the (syntactic) feature vector of the noun accordingly
- Subtracting the more dominant sense reveals less dominant senses
Future Work

- Proper evaluation, comparison with other wsd algorithms
- Extend the clustering part:
  - fully automatic clustering
  - use of tight clusters instead of broad K-means results
- Scale the method to large data sets (20k nouns × 100k dependency relations, larger corpus)