Tracking change in word meaning
Transcription
1 Tracking change in word meaning: A dynamic visualization of diachronic distributional semantics. Kris Heylen, Thomas Wielfaert & Dirk Speelman, KU Leuven, Quantitative Lexicology and Variational Linguistics
2 Purpose of the talk A lexicological study of how a set of near-synonymous adjectives has changed meaning through time, using a statistical, distributional approach to model lexical semantics in large corpora and a dynamic visualization to assist in interpreting these statistical patterns, with the ultimate goal of creating an exploratory tool for lexical semantic analysis.
3 Overview 1. Background: Lexical variation 2. Distributional semantics 3. Previous Visualisations 4. Case Study: positive evaluative adjectives 5. Dynamic Visualisation of semantic change 6. Conclusion
5 Background: Lexical Variation LEXICOLOGY:
7 Background: Lexical Variation LEXICOLOGY: SEMASIOLOGICAL PERSPECTIVE
8 Background: Lexical Variation LEXICOLOGY: ONOMASIOLOGICAL PERSPECTIVE
9 Background: Lexical Variation LEXICOLOGY: FINER-GRAINED ANALYSIS OF SEMANTIC FEATURES
11 Background: Lexical Variation LEXICOLOGY: LECTAL VARIATION
12 Background: Lexical Variation LEXICOLOGY: CHRONO-LECTAL (DIACHRONIC) VARIATION
13 Background: Lexical Variation LEXICOLOGY: QUANTITATIVE CORPUS ANALYSIS
15 Distributional models of lexical semantics
Linguistic origin: Distributional Hypothesis
- "You shall know a word by the company it keeps" (Firth)
- a word's meaning can be induced from its co-occurring words
- long tradition of collocation studies in corpus linguistics
Semantic Vector Spaces in Computational Linguistics
- standard technique in statistical NLP for the large-scale automatic modeling of (lexical) semantics
- also known as Vector Space Models, Distributional Semantic Models, Word Spaces, ... (cf. Turney & Pantel 2010 for an overview)
- generalised, large-scale collocation analysis
- mainly used for automatic thesaurus extraction: words occurring in the same contexts have similar meanings
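16 Semantic Vector Spaces as models of word meaning
Practical: Which two words out of a set of three have the same meaning? ongeval, koffie, accident
Occurrences in context from a corpus (Dutch, with English glosses):
- Op de Brusselse ring deed zich een ongeval met een vrachtwagen voor ("An accident involving a truck occurred on the Brussels ring road")
- 's Morgens drinkt hij een kop koffie met melk en suiker ("In the morning he drinks a cup of coffee with milk and sugar")
- 2 bestuurders raakten gekwetst bij een ongeval met een vrachtwagen ("2 drivers were injured in an accident involving a truck")
- in de avondspits veroorzaakte een accident een kilometerslange file ("during the evening rush hour an accident caused a traffic jam kilometres long")
- als vieruurtje serveert het hotel koffie en gebak voor de gasten ("as an afternoon snack the hotel serves coffee and pastries to the guests")
- de auto was betrokken in een accident met een dodelijke afloop ("the car was involved in an accident with a fatal outcome")
- Met winterbanden is het risico op een ongeval bij vriesweer veel kleiner ("With winter tyres the risk of an accident in freezing weather is much smaller")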
17-35 [Figure sequence: a word-by-context co-occurrence table built up step by step. Context words: auto "car", slachtoffer "victim", vrachtwagen "truck", file "traffic jam", gekwetst "injured", suiker "sugar", melk "milk", kop "cup". Target words, added one at a time: ongeval, accident, koffie. Example occurrences shown for ongeval: "... vader raakte gekwetst bij een ongeval met een vrachtwagen op de ...", "... voor zeven uur veroorzaakte een ongeval een kilometerslange file richting Antwerpen ...", "... vrachtwagens waren betrokken bij het ongeval, dat meer dan tien slachtoffers ..."] Which words are similar?
36 Distributional models of lexical semantics: word-by-word similarity matrix over ongeval, accident and koffie
37 Distributional models of lexical semantics
Geometrical metaphor: Semantic distance
- frequencies weighted by collocational strength (PMI)
- vectors projected in context feature space: Word Space
- cosine of the angle between vectors as semantic similarity measure
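The weighting and similarity steps on this slide can be illustrated with a minimal sketch. The co-occurrence counts below are invented for illustration and are not the figures from the talk; the function names `ppmi_vectors` and `cosine` are hypothetical. The sketch also assumes the positive variant of PMI (negative values dropped), which the slides do not specify.

```python
import math
from collections import Counter

# Hypothetical toy co-occurrence counts (target word -> context word -> count).
# The numbers are invented for illustration only.
counts = {
    "ongeval":  Counter({"auto": 4, "vrachtwagen": 5, "file": 3, "gekwetst": 4}),
    "accident": Counter({"auto": 3, "vrachtwagen": 2, "file": 4, "gekwetst": 2}),
    "koffie":   Counter({"suiker": 5, "melk": 4, "kop": 6}),
}

def ppmi_vectors(counts):
    """Weight raw counts by positive pointwise mutual information (assumed variant)."""
    total = sum(sum(c.values()) for c in counts.values())
    word_total = {w: sum(c.values()) for w, c in counts.items()}
    context_total = Counter()
    for c in counts.values():
        context_total.update(c)
    vectors = {}
    for word, ctx_counts in counts.items():
        vec = {}
        for ctx, n in ctx_counts.items():
            # PMI = log2( P(word, ctx) / (P(word) * P(ctx)) )
            pmi = math.log2((n * total) / (word_total[word] * context_total[ctx]))
            if pmi > 0:
                vec[ctx] = pmi
        vectors[word] = vec
    return vectors

def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

vectors = ppmi_vectors(counts)
for a, b in [("ongeval", "accident"), ("ongeval", "koffie")]:
    print(a, b, round(cosine(vectors[a], vectors[b]), 3))
```

With these toy counts, ongeval and accident share context words and receive a positive cosine, while ongeval and koffie share none and receive a cosine of zero.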
38 Distributional semantics: lexical variation
Bilectal Word Spaces
- extend the Word Space from one corpus to two corpora representative of different lects/varieties
- 2 context vectors for each word, one for each variety
- most words will have themselves as most similar word...
- BUT words with diverging semantic structure will not
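The bilectal idea can be sketched as a nearest-neighbour lookup across two sets of vectors. Everything below is a hypothetical illustration: the words `alpha`, `beta`, `gamma`, the context features `c1` to `c4`, the weights, and the helper names `nearest` and `diverging_words` are all assumptions, not data or code from the talk.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical weighted context vectors for the same three words in two lects.
lect_a = {
    "alpha": {"c1": 2.0, "c2": 1.0},
    "beta":  {"c2": 1.0, "c3": 2.0},
    "gamma": {"c3": 0.5, "c4": 3.0},
}
lect_b = {
    "alpha": {"c1": 2.0, "c2": 1.2},
    "beta":  {"c2": 1.0, "c3": 2.1, "c4": 1.5},
    "gamma": {"c1": 1.0, "c2": 2.0},  # contexts differ from lect A
}

def nearest(vec, space):
    """Return the word in `space` whose vector is most similar to `vec`."""
    return max(space, key=lambda w: cosine(vec, space[w]))

def diverging_words(space_a, space_b):
    """Words whose lect-A vector is NOT closest to their own lect-B vector."""
    return sorted(
        w for w in space_a
        if w in space_b and nearest(space_a[w], space_b) != w
    )

print(diverging_words(lect_a, lect_b))
```

In this toy setup only `gamma` fails to find itself as its nearest neighbour in the other lect, which is the signal the slide describes for words with diverging semantic structure.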
40 Sagi, Kaufmann & Clark 2009
41 Rohrdantz, Hautli, Mayer et al. 2011
42 Hilpert 2011
44 Case study: positive evaluative adjectives
45 Case study: positive evaluative adjectives
Table: positive evaluative adjectives
brilliant, cool, delightful, excellent, fabulous, fantastic, good, great, impressive, lovely, magnificent, marvelous, perfect, splendid, superb, terrific, wonderful
46 Case study
Corpus
- Corpus of Historical American English (COHA, Davies 2012)
- period from 1810 to 2009, 400M words, POS-tagged
Concept: Positive evaluative adjectives
- 1 vector per adjective, per decade
- modelled by a window of 5 words left & right
- 5000 most frequent context words (minus the top 100)
- PMI weighting, cosine similarity
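The vector construction described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions: the two example sentences are invented (they are not COHA data), and the helpers `select_vocabulary` and `context_counts` are hypothetical names. The resulting raw counts would still need PMI weighting before similarities are computed.

```python
from collections import Counter, defaultdict

def select_vocabulary(tokens, size=5000, skip=100):
    """Context vocabulary: the `size` most frequent words after skipping the top `skip`."""
    ranked = Counter(tokens).most_common(skip + size)
    return {word for word, _ in ranked[skip:]}

def context_counts(tokens_by_decade, targets, window=5, vocabulary=None):
    """Count context words within `window` tokens left and right of each target.

    Returns a dict mapping (target, decade) to a Counter of context words,
    i.e. one raw co-occurrence vector per adjective per decade.
    """
    vectors = defaultdict(Counter)
    for decade, tokens in tokens_by_decade.items():
        for i, token in enumerate(tokens):
            if token not in targets:
                continue
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            for ctx in left + right:
                if vocabulary is None or ctx in vocabulary:
                    vectors[(token, decade)][ctx] += 1
    return vectors

# Invented example sentences, one per decade (not taken from COHA).
toy_corpus = {
    1860: "a terrific explosion shook the town".split(),
    2000: "she gave a terrific performance last night".split(),
}
vectors = context_counts(toy_corpus, targets={"terrific"}, window=5)
for key, vec in sorted(vectors.items()):
    print(key, dict(vec))
```

Keeping the decade in the key is one simple way to obtain a separate vector for each adjective in each decade; a real pipeline would also restrict targets by part-of-speech tag, which this sketch omits.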
48 HighD to 2D
Visualisation
- word-decade by context matrix is high dimensional
- first aim is NOT to find latent structure (as with LSA/LDA) but a general picture of distributional semantic structuring
- faithful rendering of the similarity matrix in 2D: Kruskal's non-metric Multidimensional Scaling
- interpret dimensions with context-labeled clusters
Dynamic and interactive chart
- Motion Charts from Google Chart Tools
- panchronic view to interpret semantic space
- diachronic view to see meaning changes
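What "faithful rendering" means for non-metric MDS can be made concrete with the stress measure that the method minimises. The sketch below does not run the MDS optimisation itself; it only evaluates Kruskal's stress-1 for a given 2D layout, using monotone regression (pool adjacent violators) to compare the rank order of the input dissimilarities with the plotted distances. The items, dissimilarities and coordinates are invented, the names `pava` and `kruskal_stress` are hypothetical, and ties between dissimilarities are not treated specially.

```python
import math

def pava(values):
    """Pool adjacent violators: least-squares non-decreasing fit to a sequence."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = []
    for s, n in blocks:
        fitted.extend([s / n] * n)
    return fitted

def kruskal_stress(dissim, coords):
    """Stress-1 of a 2D configuration against pairwise dissimilarities.

    `dissim` maps (item, item) pairs to dissimilarities (e.g. 1 - cosine);
    `coords` maps each item to an (x, y) position in the plot.
    """
    pairs = sorted(dissim, key=dissim.get)  # increasing dissimilarity
    dists = [math.dist(coords[a], coords[b]) for a, b in pairs]
    disparities = pava(dists)  # best monotone approximation of the distances
    num = sum((d - dh) ** 2 for d, dh in zip(dists, disparities))
    den = sum(d ** 2 for d in dists)
    return math.sqrt(num / den)

# Invented example: a and b are similar, c is far from both.
dissim = {("a", "b"): 0.1, ("b", "c"): 0.8, ("a", "c"): 0.9}
good_layout = {"a": (0.0, 0.0), "b": (0.1, 0.0), "c": (1.0, 0.0)}
bad_layout = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.1, 0.0)}
print(kruskal_stress(dissim, good_layout), kruskal_stress(dissim, bad_layout))
```

A layout whose distances preserve the rank order of the dissimilarities has zero stress; the second layout reverses that order and is penalised accordingly. Only ranks matter here, which is why the method suits cosine-based similarities whose absolute values are hard to interpret.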
49 Panchronic view for interpretation of semantic space. Clusters with the most typical context words of the adjectives: cluster 2 (centre, light blue): positively evaluated things (colors, spectacle, performance), at the centre of the plot, expressing the core meaning of the adjectives; cluster 8 (red, lower left): loud and frightening things (explosion, thunder, crash), at the periphery of the plot, expressing an unrelated meaning.
50 Diachronic motion chart to see meaning change. Trajectory of terrific from 1860 to 2000, moving from the peripheral cluster of frightening things to the central cluster of positively evaluated things, indicative of its meaning change.
52 Conclusion and future work
Summary
- Lexicological perspective: tool for exploring lexical semantics and variation in large amounts of corpus data
- Dynamic visualisation of evolving semantic structuring for a set of near-synonymous adjectives
Desiderata
- integrate with latent dimension finding techniques (cf. Rohrdantz et al.) for easier interpretation of semantic space
- show individual occurrences of lexemes (tokens) to explore the semasiological structure of adjectives in each decade
- show interpretative beacons in the dynamic plot
- other types of context features (e.g. dependency relations)
53 For more information:
54 References I
Davies, Mark (2012). Corpus of Historical American English (COHA): 400+ million words.
Heylen, Kris, Speelman, Dirk, & Geeraerts, Dirk (2012). Looking at word meaning. An interactive visualization of Semantic Vector Spaces for Dutch synsets. In: Proceedings of the EACL-2012 joint workshop of LINGVIS & UNCLH: Visualization of Language Patterns and Uncovering Language History from Multilingual Resources.
Hilpert, Martin (2011). Dynamic visualizations of language change: Motion charts on the basis of bivariate and multivariate data from diachronic corpora. International Journal of Corpus Linguistics, 16(4).
55 References II
Rohrdantz, Christian, Hautli, Annette, Mayer, Thomas, Butt, Miriam, Keim, Daniel A., & Plank, Frans (2011). Towards Tracking Semantic Change by Visual Analytics. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Portland, Oregon, USA: Association for Computational Linguistics.
Sagi, Eyal, Kaufmann, Stefan, & Clark, Brady (2009). Semantic Density Analysis: Comparing Word Meaning across Time and Phonetic Space. In: Proceedings of the Workshop on Geometrical Models of Natural Language Semantics. Athens, Greece: Association for Computational Linguistics.
Turney, Peter D., & Pantel, Patrick (2010). From Frequency to Meaning: Vector Space Models of Semantics. Journal of Artificial Intelligence Research, 37(1).
More informationdm106 TEXT MINING FOR CUSTOMER RELATIONSHIP MANAGEMENT: AN APPROACH BASED ON LATENT SEMANTIC ANALYSIS AND FUZZY CLUSTERING
dm106 TEXT MINING FOR CUSTOMER RELATIONSHIP MANAGEMENT: AN APPROACH BASED ON LATENT SEMANTIC ANALYSIS AND FUZZY CLUSTERING ABSTRACT In most CRM (Customer Relationship Management) systems, information on
More informationData visualization in political and social sciences
Data visualization in political and social sciences Andrei Zinovyev Institut Curie, Paris, France zinovyev@gmail.com The basic objective of data visualization is to provide an efficient graphical display
More informationA Statistical Text Mining Method for Patent Analysis
A Statistical Text Mining Method for Patent Analysis Department of Statistics Cheongju University, shjun@cju.ac.kr Abstract Most text data from diverse document databases are unsuitable for analytical
More information2014/02/13 Sphinx Lunch
2014/02/13 Sphinx Lunch Best Student Paper Award @ 2013 IEEE Workshop on Automatic Speech Recognition and Understanding Dec. 9-12, 2013 Unsupervised Induction and Filling of Semantic Slot for Spoken Dialogue
More informationCAPTURING THE VALUE OF UNSTRUCTURED DATA: INTRODUCTION TO TEXT MINING
CAPTURING THE VALUE OF UNSTRUCTURED DATA: INTRODUCTION TO TEXT MINING Mary-Elizabeth ( M-E ) Eddlestone Principal Systems Engineer, Analytics SAS Customer Loyalty, SAS Institute, Inc. Is there valuable
More informationExploratory Spatial Data Analysis
Exploratory Spatial Data Analysis Part II Dynamically Linked Views 1 Contents Introduction: why to use non-cartographic data displays Display linking by object highlighting Dynamic Query Object classification
More informationMonitoring chemical processes for early fault detection using multivariate data analysis methods
Bring data to life Monitoring chemical processes for early fault detection using multivariate data analysis methods by Dr Frank Westad, Chief Scientific Officer, CAMO Software Makers of CAMO 02 Monitoring
More informationWhat is Visualization? Information Visualization An Overview. Information Visualization. Definitions
What is Visualization? Information Visualization An Overview Jonathan I. Maletic, Ph.D. Computer Science Kent State University Visualize/Visualization: To form a mental image or vision of [some
More informationTowards a Visually Enhanced Medical Search Engine
Towards a Visually Enhanced Medical Search Engine Lavish Lalwani 1,2, Guido Zuccon 1, Mohamed Sharaf 2, Anthony Nguyen 1 1 The Australian e-health Research Centre, Brisbane, Queensland, Australia; 2 The
More informationKnowledge Discovery from patents using KMX Text Analytics
Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs anton.heijs@treparel.com Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers
More informationVisual Analytics and Data Mining
Visual Analytics and Data Mining in S-T-applicationsS Gennady Andrienko & Natalia Andrienko Fraunhofer Institute AIS Sankt Augustin Germany http://www.ais.fraunhofer.de/and Mining Spatio-Temporal Data
More informationThe First Online 3D Epigraphic Library: The University of Florida Digital Epigraphy and Archaeology Project
Seminar on Dec 19 th Abstracts & speaker information The First Online 3D Epigraphic Library: The University of Florida Digital Epigraphy and Archaeology Project Eleni Bozia (USA) Angelos Barmpoutis (USA)
More informationHybrid Strategies. for better products and shorter time-to-market
Hybrid Strategies for better products and shorter time-to-market Background Manufacturer of language technology software & services Spin-off of the research center of Germany/Heidelberg Founded in 1999,
More informationThe Forgotten JMP Visualizations (Plus Some New Views in JMP 9) Sam Gardner, SAS Institute, Lafayette, IN, USA
Paper 156-2010 The Forgotten JMP Visualizations (Plus Some New Views in JMP 9) Sam Gardner, SAS Institute, Lafayette, IN, USA Abstract JMP has a rich set of visual displays that can help you see the information
More informationExploratory Data Analysis with R
Exploratory Data Analysis with R Roger D. Peng This book is for sale at http://leanpub.com/exdata This version was published on 2015-11-12 This is a Leanpub book. Leanpub empowers authors and publishers
More informationText Analytics. A business guide
Text Analytics A business guide February 2014 Contents 3 The Business Value of Text Analytics 4 What is Text Analytics? 6 Text Analytics Methods 8 Unstructured Meets Structured Data 9 Business Application
More informationEighth Annual Student Research Forum
Eighth Annual Student Research Forum February 18, 2011 COMPUTER SCIENCE AND COMPUTATIONAL SCIENCE PRESENTATION SCHEDULE Session Chair: Dr. George Miminis Head, Computer Science: Dr. Edward Brown Director,
More informationHow To Rank Term And Collocation In A Newspaper
You Can t Beat Frequency (Unless You Use Linguistic Knowledge) A Qualitative Evaluation of Association Measures for Collocation and Term Extraction Joachim Wermter Udo Hahn Jena University Language & Information
More information