Tracking change in word meaning
1 Tracking change in word meaning: A dynamic visualization of diachronic distributional semantics. Kris Heylen, Thomas Wielfaert & Dirk Speelman, KU Leuven, Quantitative Lexicology and Variational Linguistics
2 Purpose of the talk: a lexicological study of how a set of near-synonymous adjectives has changed meaning through time, using a statistical, distributional approach to model lexical semantics in large corpora and a dynamic visualization to help interpret the resulting statistical patterns, with the ultimate goal of creating an exploratory tool for lexical semantic analysis.
3 Overview 1. Background: Lexical variation 2. Distributional semantics 3. Previous Visualisations 4. Case Study: positive evaluative adjectives 5. Dynamic Visualisation of semantic change 6. Conclusion
5 Background: Lexical Variation LEXICOLOGY
7 Background: Lexical Variation LEXICOLOGY: SEMASIOLOGICAL PERSPECTIVE
8 Background: Lexical Variation LEXICOLOGY: ONOMASIOLOGICAL PERSPECTIVE
9 Background: Lexical Variation LEXICOLOGY: FINER-GRAINED ANALYSIS OF SEMANTIC FEATURES
11 Background: Lexical Variation LEXICOLOGY: LECTAL VARIATION
12 Background: Lexical Variation LEXICOLOGY: CHRONO-LECTAL (DIACHRONIC) VARIATION
13 Background: Lexical Variation LEXICOLOGY: QUANTITATIVE CORPUS ANALYSIS
15 Distributional models of lexical semantics. Linguistic origin: the Distributional Hypothesis, "You shall know a word by the company it keeps" (Firth): a word's meaning can be induced from its co-occurring words; there is a long tradition of collocation studies in corpus linguistics. Semantic Vector Spaces in Computational Linguistics: a standard technique in statistical NLP for the large-scale automatic modelling of (lexical) semantics, also known as Vector Space Models, Distributional Semantic Models or Word Spaces (cf. Turney & Pantel 2010 for an overview); in effect a generalised, large-scale collocation analysis, mainly used for automatic thesaurus extraction, on the assumption that words occurring in the same contexts have similar meanings.
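16 Semantic Vector Spaces as models of word meaning. Practical: which two words out of a set of three have the same meaning? ongeval (accident), koffie (coffee), accident (accident). Occurrences in context from a corpus (Dutch, with English glosses):
"Op de Brusselse ring deed zich een ongeval met een vrachtwagen voor" (An accident involving a truck occurred on the Brussels ring road)
"'s Morgens drinkt hij een kop koffie met melk en suiker" (In the morning he drinks a cup of coffee with milk and sugar)
"2 bestuurders raakten gekwetst bij een ongeval met een vrachtwagen" (2 drivers were injured in an accident involving a truck)
"in de avondspits veroorzaakte een accident een kilometerslange file" (during the evening rush hour an accident caused a kilometres-long traffic jam)
"als vieruurtje serveert het hotel koffie en gebak voor de gasten" (as an afternoon snack the hotel serves coffee and pastries for the guests)
"de auto was betrokken in een accident met een dodelijke afloop" (the car was involved in an accident with a fatal outcome)
"Met winterbanden is het risico op een ongeval bij vriesweer veel kleiner" (With winter tyres the risk of an accident in freezing weather is much smaller)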
17-35 [Animated build-up of a word-by-context co-occurrence matrix; cell counts not recoverable from the transcription. Columns (context words): auto (car), slachtoffer (victim), vrachtwagen (truck), file (traffic jam), gekwetst (injured), suiker (sugar), melk (milk), kop (cup). Rows (target words): ongeval, accident, koffie. Example contexts counted for ongeval: "vader raakte gekwetst bij een ongeval met een vrachtwagen op de"; "voor zeven uur veroorzaakte een ongeval een kilometerslange file richting Antwerpen"; "vrachtwagens waren betrokken bij het ongeval, dat meer dan tien slachtoffers".] Which words are similar?
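The counting step behind such a matrix can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it takes the example sentences from the Practical slide and counts, for each target word, how often each context word occurs in the same sentence. The lower-casing, the whitespace tokenisation, the sentence-level context and the names `SENTENCES`, `TARGETS`, `CONTEXT_WORDS` and `cooccurrence_counts` are all assumptions made here.

```python
from collections import Counter

# Example sentences from the Practical slide, lower-cased (assumed preprocessing).
SENTENCES = [
    "op de brusselse ring deed zich een ongeval met een vrachtwagen voor",
    "'s morgens drinkt hij een kop koffie met melk en suiker",
    "2 bestuurders raakten gekwetst bij een ongeval met een vrachtwagen",
    "in de avondspits veroorzaakte een accident een kilometerslange file",
    "als vieruurtje serveert het hotel koffie en gebak voor de gasten",
    "de auto was betrokken in een accident met een dodelijke afloop",
    "met winterbanden is het risico op een ongeval bij vriesweer veel kleiner",
]
TARGETS = ["ongeval", "accident", "koffie"]
CONTEXT_WORDS = ["auto", "slachtoffer", "vrachtwagen", "file",
                 "gekwetst", "suiker", "melk", "kop"]


def cooccurrence_counts(sentences, targets, context_words):
    """Count, per target word, the context words sharing a sentence with it."""
    counts = {t: Counter() for t in targets}
    for sentence in sentences:
        tokens = sentence.split()  # naive whitespace tokenisation
        for target in targets:
            if target in tokens:
                for word in tokens:
                    if word in context_words:
                        counts[target][word] += 1
    return counts


counts = cooccurrence_counts(SENTENCES, TARGETS, CONTEXT_WORDS)
for target in TARGETS:
    # One matrix row per target word, columns in CONTEXT_WORDS order.
    print(target, [counts[target][w] for w in CONTEXT_WORDS])
```

With only seven sentences the rows are very sparse; the slides draw on many more corpus occurrences per word, which is what makes the rows comparable.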
36 Distributional models of lexical semantics: word-by-word similarity matrix with rows and columns ongeval, accident, koffie (cell values not recoverable from the transcription).
37 Distributional models of lexical semantics. Geometrical metaphor: semantic distance. Frequencies are weighted by collocational strength (PMI); vectors are projected in context feature space (Word Space); the cosine of the angle between vectors serves as the semantic similarity measure.
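As a rough sketch of the weighting and similarity steps named on this slide, the Python below turns raw co-occurrence frequencies into PMI weights and compares the resulting vectors by cosine. The counts in `COUNTS` are invented for illustration, the function names are hypothetical, and clipping negative PMI values to zero (positive PMI) is an assumption; the slide only says "pmi".

```python
import math

# Hypothetical co-occurrence counts (target word -> context word -> frequency);
# the numbers are invented for illustration, not taken from the talk.
COUNTS = {
    "ongeval":  {"auto": 4, "vrachtwagen": 6, "file": 3, "gekwetst": 5},
    "accident": {"auto": 3, "vrachtwagen": 2, "file": 4, "gekwetst": 2},
    "koffie":   {"suiker": 5, "melk": 6, "kop": 7, "auto": 1},
}


def ppmi_vectors(counts):
    """Weight raw frequencies by positive pointwise mutual information."""
    total = sum(sum(row.values()) for row in counts.values())
    target_totals = {t: sum(row.values()) for t, row in counts.items()}
    context_totals = {}
    for row in counts.values():
        for c, n in row.items():
            context_totals[c] = context_totals.get(c, 0) + n
    vectors = {}
    for t, row in counts.items():
        vectors[t] = {}
        for c, n in row.items():
            # PMI = log2( p(t, c) / (p(t) * p(c)) )
            pmi = math.log2(n * total / (target_totals[t] * context_totals[c]))
            vectors[t][c] = max(pmi, 0.0)  # assumption: clip negative values
    return vectors


def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


vectors = ppmi_vectors(COUNTS)
sim_synonyms = cosine(vectors["ongeval"], vectors["accident"])
sim_unrelated = cosine(vectors["ongeval"], vectors["koffie"])
print(f"ongeval ~ accident: {sim_synonyms:.3f}")
print(f"ongeval ~ koffie:   {sim_unrelated:.3f}")
```

Storing vectors as dicts keeps the sketch sparse-friendly: only observed context words take up space, which matters once the context vocabulary runs to thousands of words.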
38 Distributional semantics: lexical variation. Bilectal Word Spaces: extend the Word Space from one corpus to two corpora representative of different lects/varieties, giving 2 context vectors for each word, one for each variety. Most words will have themselves as most similar word, BUT words with diverging semantic structure will not.
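The bilectal check can be illustrated with a small Python sketch, assuming context vectors have already been built separately for each variety over the same context dimensions. The vectors below are invented, `word_x` is a placeholder rather than a real lexeme, and `diverging_words` is a hypothetical helper name.

```python
import math


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Invented context vectors over the same three context dimensions,
# one vector per word and per variety.
LECT_A = {
    "koffie":  [5.0, 0.2, 0.1],
    "ongeval": [0.1, 4.0, 0.3],
    "word_x":  [3.0, 0.1, 1.0],
}
LECT_B = {
    "koffie":  [4.5, 0.3, 0.2],
    "ongeval": [0.2, 3.8, 0.1],
    "word_x":  [0.2, 3.0, 1.0],
}


def diverging_words(lect_a, lect_b):
    """Return (word, nearest) pairs where a word's nearest neighbour in the
    other lect is not the word itself."""
    diverging = []
    for word, vec in lect_a.items():
        nearest = max(lect_b, key=lambda other: cosine(vec, lect_b[other]))
        if nearest != word:
            diverging.append((word, nearest))
    return diverging


print(diverging_words(LECT_A, LECT_B))
```

In this toy setup `koffie` and `ongeval` find themselves across the two varieties, while `word_x` lands closest to `koffie`, which is the pattern the slide flags as a sign of diverging semantic structure.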
40 Sagi, Kaufmann & Clark 2009
41 Rohrdantz, Hautli, Mayer et al. 2011
42 Hilpert 2011
44 Case study: positive evaluative adjectives
45 Case study: positive evaluative adjectives. Table: positive evaluative adjectives: brilliant, cool, delightful, excellent, fabulous, fantastic, good, great, impressive, lovely, magnificent, marvelous, perfect, splendid, superb, terrific, wonderful.
46 Case study. Corpus: Corpus of Historical American English (COHA, Davies 2012); period from 1810 to 2009, 400M words, POS-tagged. Concept: positive evaluative adjectives; 1 vector per adjective, per decade ( ), modelled by a window of 5 words left & right; 5000 most frequent context words (minus top 100); PMI-weighting, cosine similarity.
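The vector-building set-up on this slide can be sketched as follows. This is a hedged illustration only: the corpus format (a dict from decade to token list), the two invented sentences and the function names are assumptions, and "minus top 100" is read here as skipping the 100 highest-frequency ranks before taking the next 5000 words. The toy call shrinks the window and vocabulary sizes to fit the tiny example.

```python
from collections import Counter, defaultdict


def context_vocabulary(tokens, size=5000, skip_top=100):
    """Frequent words used as context features, skipping the very top ranks
    (mostly function words). Rank handling is an assumption of this sketch."""
    ranked = [w for w, _ in Counter(tokens).most_common(skip_top + size)]
    return set(ranked[skip_top:])


def decade_vectors(corpus, targets, vocabulary, window=5):
    """Build one raw co-occurrence vector per (target, decade) pair.

    `corpus` maps a decade label to a list of tokens; this input format is
    hypothetical and not how COHA itself is distributed.
    """
    vectors = defaultdict(Counter)
    for decade, tokens in corpus.items():
        for i, token in enumerate(tokens):
            if token in targets:
                left = tokens[max(0, i - window):i]
                right = tokens[i + 1:i + 1 + window]
                for word in left + right:
                    if word in vocabulary:
                        vectors[(token, decade)][word] += 1
    return vectors


# Tiny invented corpus, one sentence per decade.
corpus = {
    1860: "the terrific explosion shook the whole town".split(),
    2000: "we had a terrific dinner with the family".split(),
}
all_tokens = [t for tokens in corpus.values() for t in tokens]
vocab = context_vocabulary(all_tokens, size=20, skip_top=1)
vectors = decade_vectors(corpus, {"terrific"}, vocab, window=2)
for key, vec in vectors.items():
    print(key, dict(vec))
```

The raw counts produced here would then be PMI-weighted and compared by cosine, as listed on the slide.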
48 HighD to 2D visualisation. The word-decade by context matrix is high-dimensional. The first aim is NOT to find latent structure (as with LSA/LDA) but to get a general picture of the distributional semantic structuring: a faithful rendering of the similarity matrix in 2D with Kruskal's non-metric Multidimensional Scaling, with dimensions interpreted through context-labeled clusters. Dynamic and interactive chart: Motion Charts from Google Chart Tools, with a panchronic view to interpret the semantic space and a diachronic view to see meaning changes.
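To give a feel for the high-dimensional-to-2D step, the sketch below implements classical (Torgerson) metric MDS in plain Python. Note that this is a simpler stand-in, not Kruskal's non-metric MDS named on the slide: it reproduces distances rather than only their rank order. The labels and the distance matrix are invented, and `classical_mds` is a hypothetical helper. Distances derived from cosine similarity are not guaranteed to be Euclidean, in which case this method only approximates them.

```python
import math


def classical_mds(dist, dims=2, n_iter=200):
    """Embed items in `dims` dimensions from a symmetric distance matrix."""
    n = len(dist)
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        # Power iteration for the dominant remaining eigenpair of B.
        v = [float(i + 1) for i in range(n)]  # deterministic start vector
        for _ in range(n_iter):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(v[i] * bv[i] for i in range(n))  # Rayleigh quotient
        scale = math.sqrt(max(eigenvalue, 0.0))  # negative eigenvalues give 0
        for i in range(n):
            coords[i][k] = scale * v[i]
        # Deflate so the next pass finds the next eigenpair.
        b = [[b[i][j] - eigenvalue * v[i] * v[j] for j in range(n)]
             for i in range(n)]
    return coords


# Invented distances between three word-decade items.
labels = ["terrific_1860", "terrific_2000", "great_2000"]
dist = [
    [0.0, 0.8, 0.9],
    [0.8, 0.0, 0.2],
    [0.9, 0.2, 0.0],
]
coords = classical_mds(dist)
for label, (x, y) in zip(labels, coords):
    print(f"{label}: ({x:.3f}, {y:.3f})")
```

In practice a linear-algebra or statistics library would replace the hand-written power iteration, and a non-metric MDS routine would be the closer match to the method used in the talk.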
49 Panchronic view for interpretation of semantic space. Clusters with the most typical context words of the adjectives: cluster 2 (centre, light blue): positively evaluated things (colors, spectacle, performance), at the centre of the plot, expressing the core meaning of the adjectives; cluster 8 (red, lower left): loud and frightening things (explosion, thunder, crash), at the periphery of the plot, expressing a non-related meaning.
50 Diachronic motion chart to see meaning change. Trajectory of terrific from 1860 to 2000, moving from the peripheral cluster of frightening things to the central cluster of positively evaluated things, indicative of its meaning change.
52 Conclusion and future work. Summary, lexicological perspective: a tool for exploring lexical semantics and variation in large amounts of corpus data; dynamic visualisation of evolving semantic structuring for a set of near-synonymous adjectives. Desiderata: integrate with latent-dimension-finding techniques (cf. Rohrdantz et al.) for easier interpretation of the semantic space; show individual occurrences of lexemes (tokens) to explore the semasiological structure of the adjectives in each decade; show interpretative beacons in the dynamic plot; add other types of context features (e.g. dependency relations).
53 For more information:
54 References I
Davies, Mark. 2012. Corpus of Historical American English (COHA): 400+ million words.
Heylen, Kris, Speelman, Dirk, & Geeraerts, Dirk. 2012. Looking at word meaning. An interactive visualization of Semantic Vector Spaces for Dutch synsets. In: Proceedings of the EACL-2012 joint workshop of LINGVIS & UNCLH: Visualization of Language Patterns and Uncovering Language History from Multilingual Resources.
Hilpert, Martin. 2011. Dynamic visualizations of language change: Motion charts on the basis of bivariate and multivariate data from diachronic corpora. International Journal of Corpus Linguistics, 16(4).
55 References II
Rohrdantz, Christian, Hautli, Annette, Mayer, Thomas, Butt, Miriam, Keim, Daniel A., & Plank, Frans. 2011. Towards Tracking Semantic Change by Visual Analytics. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. Portland, Oregon, USA: Association for Computational Linguistics.
Sagi, Eyal, Kaufmann, Stefan, & Clark, Brady. 2009. Semantic Density Analysis: Comparing Word Meaning across Time and Phonetic Space. In: Proceedings of the Workshop on Geometrical Models of Natural Language Semantics. Athens, Greece: Association for Computational Linguistics.
Turney, Peter D., & Pantel, Patrick. 2010. From Frequency to Meaning: Vector Space Models of Semantics. Journal of Artificial Intelligence Research, 37(1).
