Domain Adaptive Relation Extraction for Big Text Data Analytics. Feiyu Xu
|
|
|
- Kenneth Johns
- 10 years ago
- Views:
Transcription
1 Domain Adaptive Relation Extraction for Big Text Data Analytics Feiyu Xu
2 Outline! Introduction to relation extraction and its applications! Motivation of domain adaptation in big text data analytics! Solutions! Conclusion and future work
3 What is Relation Extraction! Given an unstructured text, a relation extraction (RE) tool should be able to automatically recognize and extract relations among the relevant entities or concepts that are salient to the user's needs Nobel Peace Prize Barack Obama 2009 RE Walter Kohn, Nobel, Chemistry,1998 J.M. Coetzee, Nobel, Literature, 2003 Barack Obama, Nobel, Peace, 2009 Linguistic Patterns: <prize> be awarded to <person> <person> win <prize> in <year>...
4 Example in Opinion Mining Mitten in der Euro-Krise geht Altkanzler Helmut Kohl mit Angela Merkel äußerst hart ins Gericht -- Welt online, Opinion Holder Opinion Target Polarity Feiyu Xu
5 General application task 1:! Information access for information finder mapping unstructured textual queries of users to more structured formal query for search and answer engines
6 General application task 2:! Information acquisition for information provider extract structured information from big amount free texts to construct knowledge bases RE!
7 Acquisition of Social Network of Pop Stars from Web Social Network of Madonna (Depth = 1)
8 Social Network of Madonna (Depth = 3)
9 General application task 3: Big Data Analytics!! Enabling the linking between structured and unstructured data Large-scale information monitoring Analytics: analyses of areas, markets, trends Watch: Scanning for relevant new developments
10 Example: Network of Innovation Keyplayers
11 Outline! Introduction to relation extraction and its applications! Motivation of domain adaptation in big text data analytics! Solutions! Conclusion and future work
12 Text Analytics for Big Textual Data! Three main features of big data Volume: large-scale in volume Variety: with respect to heterogeneous domains and formats Velocity: because of its rapid and steady growing.! Requirements of text analytics technologies for big data efficient robust scalable domain-adaptive
13 Domain Adaptation is Essential for Big Data!! Among the three big data features, variety and velocity are even more challenging than the sheer size volume
14 Reasons:! New domains have been constantly emerging, rapidly growing in size.! Domains can differ in topics (e.g., medicine, chemistry or mechanics) genres (e.g., news, novels, blogs, scientific publications or patents) targets (e.g., different relations such as marriage, person-parent relation, disease-symptom relation) data internal properties (e.g., size or redundancy or connectivity).! Systems, methods or strategies developed or trained for so-called general purpose or one specific domain can often not be directly taken over by other domains, because each domain needs its own domain knowledge and each application data has its own special properties.
15 Relevant Strategies for Domain Adaptation! Minimally dependent on the labeled training data Minimally or weakly supervised machine learning methods! Strategies for confidence estimation of automatically learned information and knowledge filtering of irrelevant and wrong information! Domain adaptation of generic systems for specific applications
16 Outline! Introduction to relation extraction and its applications! Motivation of domain adaptation in big text data analytics! Solutions! Conclusion and future work
17 Our solutions (1)! minimally supervised and distantly supervised automatic learning of domain-specific grammar-based pattern rules for n-ary RE: DARE and Web-DARE Systems Feiyu Xu, Hans Uszkoreit, Hong Li, A Seed-driven Bottom-up Machine Learning Framework for Extracting Relations of Various Complexity (2007). In ACL Hans Uszkoreit, Feiyu Xu, Hong Li. Analysis and Improvement of Minimally Supervised Machine Learning for Relation Extraction. In NLDB Sebastian Krause, Hong Li, Hans Uszkoreit, Feiyu Xu, Large-Scale Learning of Relation-Extraction Rules with Distant Supervision from the Web. In Proceedings of the 11th International Semantic Web Conference (ISWC 2012).
18 Our solutions (2)! Various filtering and confidence estimation methods for highperformance and large-scale relation extraction Sebastian Krause, Hong Li, Hans Uszkoreit, Feiyu Xu, Large-Scale Learning of Relation-Extraction Rules with Distant Supervision from the Web. In Proceedings of the 11th International Semantic Web Conference (ISWC 2012) Andrea Moro, Hong Li, Sebastian Krause, Feiyu Xu, Roberto Navigli, Hans Uszkoreit, Semantic rule filtering for web-scale relation extraction. In Proceeding of International Semantic Web Conference (ISWC 2013). Feiyu Xu, Hans Uszkoreit, Sebastian Krause, Hong Li. Boosting Relation Extraction with Limited Closed-World Knowledge. COLING 2009, Poster.
19 Our solutions (3)! Automatic adaptation and improvement of generic parsing results for specific domains Peter Adolphs, Feiyu Xu, Hans Uszkoreit, Hong Li, Dependency Graphs as a Generic Interface between Parsers and Relation Extraction Rule Learning. In Proceedings of KI 2011, pp , Feiyu Xu, Hong Li, Yi Zhang, Hans Uszkoreit, Sebastian Krause, Parse reranking for domain-adaptive relation extraction. Journal of Logic and Computation, doi: /logcom/exs055, Oxford University Press, 2012.
20 Our solutions (4)! Automatic generation of domain-specific linguistic knowledge resources Hans Uszkoreit and Feiyu Xu, From Strings to Things, SAR-Graphs: A New Type of Resource for Connecting Knowledge and Language. In Proceedings of 1st International Workshop on NLP and DBpedia volume 1064, Sydney, NSW, Australia, CEUR Workshop Proceedings, 10/2013 Open source: sargraph.dfki.de
21 Web-DARE Distant-supervised Web-scale RE
22 Web-DARE: Distant Supervision based RE! Large number of RE rules are automatically learned by using Freebase as seed knowledge and Web as training corpus! Goal: " covering most linguistic variants for expressing a relation " thus solving the notorious long-tail problem of real-world NL applications
23 Data Set!! rules learned for 39 relations n-ary relations n>=2! three domains: business, awards and people! 2.8 million relation instances retrieved from Freebase as seed! 20 million web documents as training corpus
24 Example in Nobel Prize Award Domain Seed example <Mohamed ElBaradei/Person, Nobel/Prize, Peace/Area, 2005/Year> Sentence matched with the seed Mohamed ElBaradei won the 2005 Nobel Prize for Peace on Friday...
25 Dependency Parse Result
26 Bottom Up Rule Learning Rule (1)
27 Bottom Up Rule Learning Rule (2) Rule (1)
28 Bottom Up Rule Learning Rule (3) Rule (2) Rule (1)
29 Web-DARE Architecture Facts Queries Web pages Filtered Patterns Patterns Sentence Mentions of Facts
30 Some Statistics of Web-DARE Rules!
31 Problems of Large-Scale Approach! Very low precision a lot of noisy rules many rules are learned from more than one relation
32 Euler Diagram for Four People-Relations rules&of& marriage' rules&of& person'parent' 160,853( 107,381( 3,358( 798( 996( 6,153( 1,450( 866( 109( 2,708( 64,515( 483( 61,176( rules&of& romantic'relationship' rules&of& sibling'relationship'
33 Various Filtering Strategies for High-Performance Web-Scale RE
34 Frequency-Driven Rule Filters! Merged Filter: 1) absolute frequency filtering: a threshold to exclude rules with low occurrency
35 Rule Frequency Driven Filters! Merged Filter: 1) absolute frequency filtering: a threshold to exclude rules with low occurrency 2) inter-relation filter (Overlap Filter FO Filter): " based on mutual exclusiveness of relations with similar entitytype signatures. " a rule is only valid for a relation, if its relative frequency is higher than any other relations with similar entity type signatures.
36 Weakness of Filtering with Frequency! Undetected low-quality patterns: high frequency in target relation, low frequency in coupled relations
37 Weakness of Filtering with Rule Frequency! Undetected low-quality patterns: high frequency in target relation, low frequency in coupled relations nsubj dobj PERSON Verb: meet PERSON??
38 Weakness of Filtering with Rule Frequency! Undetected low-quality patterns: high frequency in target relation, low frequency in coupled relations nsubj dobj PERSON Verb: meet PERSON?! Erroneously-deleted good patterns: infrequent patterns poss appos PERSON Noun: widower PERSON?
39 Lexical Semantics can help! marriage person-death acquisition World Wide Web Relation-specific lexical semantic graphs?? Candidate RE Patterns Unsupervised Classification!!! High-quality RE Patterns
40 Automatic learning of relation-specific lexical semantic network Generic Lexical Semantic Network (BabelNet) automatically learned unfiltered RE rules and their mentions
41 Automatic learning of relation-specific lexical semantic network Generic Lexical Semantic Network (BabelNet) automatically learned unfiltered RE rules and their mentions Word Sense Disambiguation
42 The Relation-Specific Semantic Graph An excerpt of the semantic graph for the relation marriage 27/10/14 42
43 Extrinsic Eval. Web-DARE FO-Filter S-Filter (BabelNet) S-Filter (WordNet) Baseline Relative Recall
44 Parse-Reranking for Domain-adaptive RE
45 Error Types of Extracted Wrong Instances Content Modality Named Entity Recognition (NER) Parsing NER & Parsing DARE Rules 11.8% 17.6% 5.9% 38.2% 11.8% 14.7%
46 HPSG Parses: PP Attachment Egyptian scientist Ahmed Zewail won the 1999 Nobel Prize for chemistry
47 HPSG Parses: PP Attachment Egyptian scientist Ahmed Zewail won the 1999 Nobel Prize for chemistry
48 Reranking Architecture
49 Baseline: before Re-ranking! Best reading: high precision, low recall, low F-measure! 500 readings: lower precision, higher recall, higher F-measure
50 After Re-Ranking:! Re-ranked top readings match more sentence mentions containing RE instances! Improvements of Recall and F-Measure
51 Conclusion! The performance of large-scale RE for each application is dependent on the performance of domain-adaptation methods! Three original contributions (among others): Extension of relation extraction to n-ary relations Semantic filtering with large lexical knowledge bases Parser improvement for the specific RE task by reranking! For our work we received a Google Focused Research Award
52 Planned Future Work! Immediate next step of big text data analytics is to integrate the existing NLP and IE components into big data analytics platforms! Entity linking and RE will play an essential role for semantic interoperability between structured and unstructured data! Extension and Application of our IE technologies to the new Smart Data projects Smart Data Web: Industry 4.0 Smart Data for Mobility: Mobility
Semantic Rule Filtering for Web-Scale Relation Extraction
Semantic Rule Filtering for Web-Scale Relation Extraction Andrea Moro 1, Hong Li 2, Sebastian Krause 2, Feiyu Xu 2, Roberto Navigli 1, and Hans Uszkoreit 2 1 Dipartimento di Informatica, Sapienza Università
Interactive Dynamic Information Extraction
Interactive Dynamic Information Extraction Kathrin Eichler, Holmer Hemsen, Markus Löckelt, Günter Neumann, and Norbert Reithinger Deutsches Forschungszentrum für Künstliche Intelligenz - DFKI, 66123 Saarbrücken
Building the Multilingual Web of Data: A Hands-on tutorial (ISWC 2014, Riva del Garda - Italy)
Building the Multilingual Web of Data: A Hands-on tutorial (ISWC 2014, Riva del Garda - Italy) Multilingual Word Sense Disambiguation and Entity Linking on the Web based on BabelNet Roberto Navigli, Tiziano
Open Domain Information Extraction. Günter Neumann, DFKI, 2012
Open Domain Information Extraction Günter Neumann, DFKI, 2012 Improving TextRunner Wu and Weld (2010) Open Information Extraction using Wikipedia, ACL 2010 Fader et al. (2011) Identifying Relations for
Automatic Knowledge Base Construction Systems. Dr. Daisy Zhe Wang CISE Department University of Florida September 3th 2014
Automatic Knowledge Base Construction Systems Dr. Daisy Zhe Wang CISE Department University of Florida September 3th 2014 1 Text Contains Knowledge 2 Text Contains Automatically Extractable Knowledge 3
ALIAS: A Tool for Disambiguating Authors in Microsoft Academic Search
Project for Michael Pitts Course TCSS 702A University of Washington Tacoma Institute of Technology ALIAS: A Tool for Disambiguating Authors in Microsoft Academic Search Under supervision of : Dr. Senjuti
CIRGIRDISCO at RepLab2014 Reputation Dimension Task: Using Wikipedia Graph Structure for Classifying the Reputation Dimension of a Tweet
CIRGIRDISCO at RepLab2014 Reputation Dimension Task: Using Wikipedia Graph Structure for Classifying the Reputation Dimension of a Tweet Muhammad Atif Qureshi 1,2, Arjumand Younus 1,2, Colm O Riordan 1,
A Survey on Product Aspect Ranking
A Survey on Product Aspect Ranking Charushila Patil 1, Prof. P. M. Chawan 2, Priyamvada Chauhan 3, Sonali Wankhede 4 M. Tech Student, Department of Computer Engineering and IT, VJTI College, Mumbai, Maharashtra,
Sentiment analysis for news articles
Prashant Raina Sentiment analysis for news articles Wide range of applications in business and public policy Especially relevant given the popularity of online media Previous work Machine learning based
Text Analytics with Ambiverse. Text to Knowledge. www.ambiverse.com
Text Analytics with Ambiverse Text to Knowledge www.ambiverse.com Version 1.0, February 2016 WWW.AMBIVERSE.COM Contents 1 Ambiverse: Text to Knowledge............................... 5 1.1 Text is all Around
Survey Results: Requirements and Use Cases for Linguistic Linked Data
Survey Results: Requirements and Use Cases for Linguistic Linked Data 1 Introduction This survey was conducted by the FP7 Project LIDER (http://www.lider-project.eu/) as input into the W3C Community Group
An Introduction to Machine Learning and Natural Language Processing Tools
An Introduction to Machine Learning and Natural Language Processing Tools Presented by: Mark Sammons, Vivek Srikumar (Many slides courtesy of Nick Rizzolo) 8/24/2010-8/26/2010 Some reasonably reliable
Domain Independent Knowledge Base Population From Structured and Unstructured Data Sources
Proceedings of the Twenty-Fourth International Florida Artificial Intelligence Research Society Conference Domain Independent Knowledge Base Population From Structured and Unstructured Data Sources Michelle
Effective Data Retrieval Mechanism Using AML within the Web Based Join Framework
Effective Data Retrieval Mechanism Using AML within the Web Based Join Framework Usha Nandini D 1, Anish Gracias J 2 1 [email protected] 2 [email protected] Abstract A vast amount of assorted
Web Information Mining and Decision Support Platform for the Modern Service Industry
Web Information Mining and Decision Support Platform for the Modern Service Industry Binyang Li 1,2, Lanjun Zhou 2,3, Zhongyu Wei 2,3, Kam-fai Wong 2,3,4, Ruifeng Xu 5, Yunqing Xia 6 1 Dept. of Information
Identifying Focus, Techniques and Domain of Scientific Papers
Identifying Focus, Techniques and Domain of Scientific Papers Sonal Gupta Department of Computer Science Stanford University Stanford, CA 94305 [email protected] Christopher D. Manning Department of
A Statistical Text Mining Method for Patent Analysis
A Statistical Text Mining Method for Patent Analysis Department of Statistics Cheongju University, [email protected] Abstract Most text data from diverse document databases are unsuitable for analytical
How To Write A Summary Of A Review
PRODUCT REVIEW RANKING SUMMARIZATION N.P.Vadivukkarasi, Research Scholar, Department of Computer Science, Kongu Arts and Science College, Erode. Dr. B. Jayanthi M.C.A., M.Phil., Ph.D., Associate Professor,
Domain Classification of Technical Terms Using the Web
Systems and Computers in Japan, Vol. 38, No. 14, 2007 Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J89-D, No. 11, November 2006, pp. 2470 2482 Domain Classification of Technical Terms Using
Computational Linguistics and Learning from Big Data. Gabriel Doyle UCSD Linguistics
Computational Linguistics and Learning from Big Data Gabriel Doyle UCSD Linguistics From not enough data to too much Finding people: 90s, 700 datapoints, 7 years People finding you: 00s, 30000 datapoints,
2014/02/13 Sphinx Lunch
2014/02/13 Sphinx Lunch Best Student Paper Award @ 2013 IEEE Workshop on Automatic Speech Recognition and Understanding Dec. 9-12, 2013 Unsupervised Induction and Filling of Semantic Slot for Spoken Dialogue
ONLINE RESUME PARSING SYSTEM USING TEXT ANALYTICS
ONLINE RESUME PARSING SYSTEM USING TEXT ANALYTICS Divyanshu Chandola 1, Aditya Garg 2, Ankit Maurya 3, Amit Kushwaha 4 1 Student, Department of Information Technology, ABES Engineering College, Uttar Pradesh,
Search and Information Retrieval
Search and Information Retrieval Search on the Web 1 is a daily activity for many people throughout the world Search and communication are most popular uses of the computer Applications involving search
Discover more, discover faster. High performance, flexible NLP-based text mining for life sciences
Discover more, discover faster. High performance, flexible NLP-based text mining for life sciences It s not information overload, it s filter failure. Clay Shirky Life Sciences organizations face the challenge
POSBIOTM-NER: A Machine Learning Approach for. Bio-Named Entity Recognition
POSBIOTM-NER: A Machine Learning Approach for Bio-Named Entity Recognition Yu Song, Eunji Yi, Eunju Kim, Gary Geunbae Lee, Department of CSE, POSTECH, Pohang, Korea 790-784 Soo-Jun Park Bioinformatics
Generating SQL Queries Using Natural Language Syntactic Dependencies and Metadata
Generating SQL Queries Using Natural Language Syntactic Dependencies and Metadata Alessandra Giordani and Alessandro Moschitti Department of Computer Science and Engineering University of Trento Via Sommarive
Building a Question Classifier for a TREC-Style Question Answering System
Building a Question Classifier for a TREC-Style Question Answering System Richard May & Ari Steinberg Topic: Question Classification We define Question Classification (QC) here to be the task that, given
ANALYTICS IN BIG DATA ERA
ANALYTICS IN BIG DATA ERA ANALYTICS TECHNOLOGY AND ARCHITECTURE TO MANAGE VELOCITY AND VARIETY, DISCOVER RELATIONSHIPS AND CLASSIFY HUGE AMOUNT OF DATA MAURIZIO SALUSTI SAS Copyr i g ht 2012, SAS Ins titut
CENG 734 Advanced Topics in Bioinformatics
CENG 734 Advanced Topics in Bioinformatics Week 9 Text Mining for Bioinformatics: BioCreative II.5 Fall 2010-2011 Quiz #7 1. Draw the decompressed graph for the following graph summary 2. Describe the
Discovering and Querying Hybrid Linked Data
Discovering and Querying Hybrid Linked Data Zareen Syed 1, Tim Finin 1, Muhammad Rahman 1, James Kukla 2, Jeehye Yun 2 1 University of Maryland Baltimore County 1000 Hilltop Circle, MD, USA 21250 [email protected],
3 Paraphrase Acquisition. 3.1 Overview. 2 Prior Work
Unsupervised Paraphrase Acquisition via Relation Discovery Takaaki Hasegawa Cyberspace Laboratories Nippon Telegraph and Telephone Corporation 1-1 Hikarinooka, Yokosuka, Kanagawa 239-0847, Japan [email protected]
Natural Language to Relational Query by Using Parsing Compiler
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 3, March 2015,
Parsing Software Requirements with an Ontology-based Semantic Role Labeler
Parsing Software Requirements with an Ontology-based Semantic Role Labeler Michael Roth University of Edinburgh [email protected] Ewan Klein University of Edinburgh [email protected] Abstract Software
Architecture of an Ontology-Based Domain- Specific Natural Language Question Answering System
Architecture of an Ontology-Based Domain- Specific Natural Language Question Answering System Athira P. M., Sreeja M. and P. C. Reghuraj Department of Computer Science and Engineering, Government Engineering
ARABIC PERSON NAMES RECOGNITION BY USING A RULE BASED APPROACH
Journal of Computer Science 9 (7): 922-927, 2013 ISSN: 1549-3636 2013 doi:10.3844/jcssp.2013.922.927 Published Online 9 (7) 2013 (http://www.thescipub.com/jcs.toc) ARABIC PERSON NAMES RECOGNITION BY USING
Doctoral Consortium 2013 Dept. Lenguajes y Sistemas Informáticos UNED
Doctoral Consortium 2013 Dept. Lenguajes y Sistemas Informáticos UNED 17 19 June 2013 Monday 17 June Salón de Actos, Facultad de Psicología, UNED 15.00-16.30: Invited talk Eneko Agirre (Euskal Herriko
Text Mining - Scope and Applications
Journal of Computer Science and Applications. ISSN 2231-1270 Volume 5, Number 2 (2013), pp. 51-55 International Research Publication House http://www.irphouse.com Text Mining - Scope and Applications Miss
ISSN: 2278-5299 365. Sean W. M. Siqueira, Maria Helena L. B. Braz, Rubens Nascimento Melo (2003), Web Technology for Education
International Journal of Latest Research in Science and Technology Vol.1,Issue 4 :Page No.364-368,November-December (2012) http://www.mnkjournals.com/ijlrst.htm ISSN (Online):2278-5299 EDUCATION BASED
Facilitating Business Process Discovery using Email Analysis
Facilitating Business Process Discovery using Email Analysis Matin Mavaddat [email protected] Stewart Green Stewart.Green Ian Beeson Ian.Beeson Jin Sa Jin.Sa Abstract Extracting business process
Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information
Ngram Search Engine with Patterns Combining Token, POS, Chunk and NE Information Satoshi Sekine Computer Science Department New York University [email protected] Kapil Dalwani Computer Science Department
Big Data and Text Mining
Big Data and Text Mining Dr. Ian Lewin Senior NLP Resource Specialist [email protected] www.linguamatics.com About Linguamatics Boston, USA Cambridge, UK Software Consulting Hosted content Agile,
Sentiment analysis on news articles using Natural Language Processing and Machine Learning Approach.
Sentiment analysis on news articles using Natural Language Processing and Machine Learning Approach. Pranali Chilekar 1, Swati Ubale 2, Pragati Sonkambale 3, Reema Panarkar 4, Gopal Upadhye 5 1 2 3 4 5
Semantic annotation of requirements for automatic UML class diagram generation
www.ijcsi.org 259 Semantic annotation of requirements for automatic UML class diagram generation Soumaya Amdouni 1, Wahiba Ben Abdessalem Karaa 2 and Sondes Bouabid 3 1 University of tunis High Institute
AN APPROACH TO WORD SENSE DISAMBIGUATION COMBINING MODIFIED LESK AND BAG-OF-WORDS
AN APPROACH TO WORD SENSE DISAMBIGUATION COMBINING MODIFIED LESK AND BAG-OF-WORDS Alok Ranjan Pal 1, 3, Anirban Kundu 2, 3, Abhay Singh 1, Raj Shekhar 1, Kunal Sinha 1 1 College of Engineering and Management,
Big Data: Rethinking Text Visualization
Big Data: Rethinking Text Visualization Dr. Anton Heijs [email protected] Treparel April 8, 2013 Abstract In this white paper we discuss text visualization approaches and how these are important
Semantic Sentiment Analysis of Twitter
Semantic Sentiment Analysis of Twitter Hassan Saif, Yulan He & Harith Alani Knowledge Media Institute, The Open University, Milton Keynes, United Kingdom The 11 th International Semantic Web Conference
Customer Intentions Analysis of Twitter Based on Semantic Patterns
Customer Intentions Analysis of Twitter Based on Semantic Patterns Mohamed Hamroun [email protected] Mohamed Salah Gouider [email protected] Lamjed Ben Said [email protected] ABSTRACT
TREC 2003 Question Answering Track at CAS-ICT
TREC 2003 Question Answering Track at CAS-ICT Yi Chang, Hongbo Xu, Shuo Bai Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China [email protected] http://www.ict.ac.cn/
Term extraction for user profiling: evaluation by the user
Term extraction for user profiling: evaluation by the user Suzan Verberne 1, Maya Sappelli 1,2, Wessel Kraaij 1,2 1 Institute for Computing and Information Sciences, Radboud University Nijmegen 2 TNO,
The Prolog Interface to the Unstructured Information Management Architecture
The Prolog Interface to the Unstructured Information Management Architecture Paul Fodor 1, Adam Lally 2, David Ferrucci 2 1 Stony Brook University, Stony Brook, NY 11794, USA, [email protected] 2 IBM
Wikipedia and Web document based Query Translation and Expansion for Cross-language IR
Wikipedia and Web document based Query Translation and Expansion for Cross-language IR Ling-Xiang Tang 1, Andrew Trotman 2, Shlomo Geva 1, Yue Xu 1 1Faculty of Science and Technology, Queensland University
Sentiment Analysis. D. Skrepetos 1. University of Waterloo. NLP Presenation, 06/17/2015
Sentiment Analysis D. Skrepetos 1 1 Department of Computer Science University of Waterloo NLP Presenation, 06/17/2015 D. Skrepetos (University of Waterloo) Sentiment Analysis NLP Presenation, 06/17/2015
Semantic SharePoint. Technical Briefing. Helmut Nagy, Semantic Web Company Andreas Blumauer, Semantic Web Company
Semantic SharePoint Technical Briefing Helmut Nagy, Semantic Web Company Andreas Blumauer, Semantic Web Company What is Semantic SP? a joint venture between iquest and Semantic Web Company, initiated in
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 5 ISSN 2229-5518
International Journal of Scientific & Engineering Research, Volume 4, Issue 11, November-2013 5 INTELLIGENT MULTIDIMENSIONAL DATABASE INTERFACE Mona Gharib Mohamed Reda Zahraa E. Mohamed Faculty of Science,
The Battle for the Future of Data Mining Oren Etzioni, CEO Allen Institute for AI (AI2) March 13, 2014
The Battle for the Future of Data Mining Oren Etzioni, CEO Allen Institute for AI (AI2) March 13, 2014 Big Data Tidal Wave What s next? 2 3 4 Deep Learning Some Skepticism about Deep Learning Of course,
Micro blogs Oriented Word Segmentation System
Micro blogs Oriented Word Segmentation System Yijia Liu, Meishan Zhang, Wanxiang Che, Ting Liu, Yihe Deng Research Center for Social Computing and Information Retrieval Harbin Institute of Technology,
Reinventing Business Intelligence through Big Data
Reinventing Business Intelligence through Big Data Dr. Flavio Villanustre VP, Technology and lead of the Open Source HPCC Systems initiative LexisNexis Risk Solutions Reed Elsevier LEXISNEXIS From RISK
Stanford s Distantly-Supervised Slot-Filling System
Stanford s Distantly-Supervised Slot-Filling System Mihai Surdeanu, Sonal Gupta, John Bauer, David McClosky, Angel X. Chang, Valentin I. Spitkovsky, Christopher D. Manning Computer Science Department,
Bing Liu. Web Data Mining. Exploring Hyperlinks, Contents, and Usage Data. With 177 Figures. ~ Spring~r
Bing Liu Web Data Mining Exploring Hyperlinks, Contents, and Usage Data With 177 Figures ~ Spring~r Table of Contents 1. Introduction.. 1 1.1. What is the World Wide Web? 1 1.2. ABrief History of the Web
Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis
Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis Yue Dai, Ernest Arendarenko, Tuomo Kakkonen, Ding Liao School of Computing University of Eastern Finland {yvedai,
Industry 4.0 and Big Data
Industry 4.0 and Big Data Marek Obitko, [email protected] Senior Research Engineer 03/25/2015 PUBLIC PUBLIC - 5058-CO900H 2 Background Joint work with Czech Institute of Informatics, Robotics and
Text Mining and Analysis
Text Mining and Analysis Practical Methods, Examples, and Case Studies Using SAS Goutam Chakraborty, Murali Pagolu, Satish Garla From Text Mining and Analysis. Full book available for purchase here. Contents
MIRACLE at VideoCLEF 2008: Classification of Multilingual Speech Transcripts
MIRACLE at VideoCLEF 2008: Classification of Multilingual Speech Transcripts Julio Villena-Román 1,3, Sara Lana-Serrano 2,3 1 Universidad Carlos III de Madrid 2 Universidad Politécnica de Madrid 3 DAEDALUS
ezdi s semantics-enhanced linguistic, NLP, and ML approach for health informatics
ezdi s semantics-enhanced linguistic, NLP, and ML approach for health informatics Raxit Goswami*, Neil Shah* and Amit Sheth*, ** ezdi Inc, Louisville, KY and Ahmedabad, India. ** Kno.e.sis-Wright State
Clustering Technique in Data Mining for Text Documents
Clustering Technique in Data Mining for Text Documents Ms.J.Sathya Priya Assistant Professor Dept Of Information Technology. Velammal Engineering College. Chennai. Ms.S.Priyadharshini Assistant Professor
How To Make Sense Of Data With Altilia
HOW TO MAKE SENSE OF BIG DATA TO BETTER DRIVE BUSINESS PROCESSES, IMPROVE DECISION-MAKING, AND SUCCESSFULLY COMPETE IN TODAY S MARKETS. ALTILIA turns Big Data into Smart Data and enables businesses to
Big Data Analytics and Healthcare
Big Data Analytics and Healthcare Anup Kumar, Professor and Director of MINDS Lab Computer Engineering and Computer Science Department University of Louisville Road Map Introduction Data Sources Structured
Keyphrase Extraction for Scholarly Big Data
Keyphrase Extraction for Scholarly Big Data Cornelia Caragea Computer Science and Engineering University of North Texas July 10, 2015 Scholarly Big Data Large number of scholarly documents on the Web PubMed
Analysis and Synthesis of Help-desk Responses
Analysis and Synthesis of Help-desk s Yuval Marom and Ingrid Zukerman School of Computer Science and Software Engineering Monash University Clayton, VICTORIA 3800, AUSTRALIA {yuvalm,ingrid}@csse.monash.edu.au
End-to-End Sentiment Analysis of Twitter Data
End-to-End Sentiment Analysis of Twitter Data Apoor v Agarwal 1 Jasneet Singh Sabharwal 2 (1) Columbia University, NY, U.S.A. (2) Guru Gobind Singh Indraprastha University, New Delhi, India [email protected],
A SURVEY ON OPINION MINING FROM ONLINE REVIEW SENTENCES
A SURVEY ON OPINION MINING FROM ONLINE REVIEW SENTENCES Dr.P.Perumal 1,M.Kasthuri 2 1 Professor, Computer science and Engineering, Sri Ramakrishna Engineering College, TamilNadu, India 2 ME Student, Computer
Mining Opinion Features in Customer Reviews
Mining Opinion Features in Customer Reviews Minqing Hu and Bing Liu Department of Computer Science University of Illinois at Chicago 851 South Morgan Street Chicago, IL 60607-7053 {mhu1, liub}@cs.uic.edu
SENTIMENT ANALYSIS: TEXT PRE-PROCESSING, READER VIEWS AND CROSS DOMAINS EMMA HADDI BRUNEL UNIVERSITY LONDON
BRUNEL UNIVERSITY LONDON COLLEGE OF ENGINEERING, DESIGN AND PHYSICAL SCIENCES DEPARTMENT OF COMPUTER SCIENCE DOCTOR OF PHILOSOPHY DISSERTATION SENTIMENT ANALYSIS: TEXT PRE-PROCESSING, READER VIEWS AND
C o p yr i g ht 2015, S A S I nstitute Inc. A l l r i g hts r eser v ed. INTRODUCTION TO SAS TEXT MINER
INTRODUCTION TO SAS TEXT MINER TODAY S AGENDA INTRODUCTION TO SAS TEXT MINER Define data mining Overview of SAS Enterprise Miner Describe text analytics and define text data mining Text Mining Process
Introduction to Text Mining and Semantics. Seth Grimes -- President, Alta Plana
Introduction to Text Mining and Semantics Seth Grimes -- President, Alta Plana New York Times October 9, 1958 Text expresses a vast, rich range of information, but encodes this information in a form that
CLOUD ANALYTICS: Empowering the Army Intelligence Core Analytic Enterprise
CLOUD ANALYTICS: Empowering the Army Intelligence Core Analytic Enterprise 5 APR 2011 1 2005... Advanced Analytics Harnessing Data for the Warfighter I2E GIG Brigade Combat Team Data Silos DCGS LandWarNet
Ming-Wei Chang. Machine learning and its applications to natural language processing, information retrieval and data mining.
Ming-Wei Chang 201 N Goodwin Ave, Department of Computer Science University of Illinois at Urbana-Champaign, Urbana, IL 61801 +1 (917) 345-6125 [email protected] http://flake.cs.uiuc.edu/~mchang21 Research
Search Engine Based Intelligent Help Desk System: iassist
Search Engine Based Intelligent Help Desk System: iassist Sahil K. Shah, Prof. Sheetal A. Takale Information Technology Department VPCOE, Baramati, Maharashtra, India [email protected], [email protected]
Twitter Stock Bot. John Matthew Fong The University of Texas at Austin [email protected]
Twitter Stock Bot John Matthew Fong The University of Texas at Austin [email protected] Hassaan Markhiani The University of Texas at Austin [email protected] Abstract The stock market is influenced
BIG. Big Data Analysis John Domingue (STI International and The Open University) Big Data Public Private Forum
Big Data Analysis John Domingue (STI International and The Open University) Project co-funded by the European Commission within the 7th Framework Program (Grant Agreement No. 257943) 1 The Data landscape
Taxonomies for Auto-Tagging Unstructured Content. Heather Hedden Hedden Information Management Text Analytics World, Boston, MA October 1, 2013
Taxonomies for Auto-Tagging Unstructured Content Heather Hedden Hedden Information Management Text Analytics World, Boston, MA October 1, 2013 About Heather Hedden Independent taxonomy consultant, Hedden
Combining Social Data and Semantic Content Analysis for L Aquila Social Urban Network
I-CiTies 2015 2015 CINI Annual Workshop on ICT for Smart Cities and Communities Palermo (Italy) - October 29-30, 2015 Combining Social Data and Semantic Content Analysis for L Aquila Social Urban Network
A Comparative Study on Sentiment Classification and Ranking on Product Reviews
A Comparative Study on Sentiment Classification and Ranking on Product Reviews C.EMELDA Research Scholar, PG and Research Department of Computer Science, Nehru Memorial College, Putthanampatti, Bharathidasan
ONTOLOGIES A short tutorial with references to YAGO Cosmina CROITORU
ONTOLOGIES p. 1/40 ONTOLOGIES A short tutorial with references to YAGO Cosmina CROITORU Unlocking the Secrets of the Past: Text Mining for Historical Documents Blockseminar, 21.2.-11.3.2011 ONTOLOGIES
Big Data and Analytics: Challenges and Opportunities
Big Data and Analytics: Challenges and Opportunities Dr. Amin Beheshti Lecturer and Senior Research Associate University of New South Wales, Australia (Service Oriented Computing Group, CSE) Talk: Sharif
» A Hardware & Software Overview. Eli M. Dow <[email protected]:>
» A Hardware & Software Overview Eli M. Dow Overview:» Hardware» Software» Questions 2011 IBM Corporation Early implementations of Watson ran on a single processor where it took 2 hours
Context Aware Predictive Analytics: Motivation, Potential, Challenges
Context Aware Predictive Analytics: Motivation, Potential, Challenges Mykola Pechenizkiy Seminar 31 October 2011 University of Bournemouth, England http://www.win.tue.nl/~mpechen/projects/capa Outline
