A Wikipedia-based Naive Bayes Approach for Obtaining Related Phrases from A Natural Language Query
DEIM Forum 2012 D7-2

Masumi SHIRAKAWA, Kotaro NAKAYAMA, Takahiro HARA, and Shojiro NISHIO

Graduate School of Information Science and Technology, Osaka University, 1-5 Yamadaoka, Suita, Osaka, Japan
The Center for Knowledge Structuring, The University of Tokyo, Hongo, Bunkyo-ku, Tokyo, Japan
E-mail: {shirakawa.masumi,hara,nishio}@ist.osaka-u.ac.jp, nakayama@cks.u-tokyo.ac.jp

1.

[Section text not preserved in this transcription. Surviving citations: Wikipedia-based semantic relatedness [2], [16].]
2.

[Section text not preserved in this transcription. It cites, in order of appearance: [10]; [9], [12]; Strube et al. [16] (WordNet); [12]; Gabrilovich et al. [2] (Explicit Semantic Analysis, ESA); Milne et al. [8]; [10], [11]; Ito et al. [4], [10]; Meij et al. [6] and Ferragina et al. [1] (Twitter); Song et al. [14]; and the Yahoo! Content Analysis API.]

3.
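Table 1: Definition of symbols.

Symbol       Meaning
t            a term
T            the set of keyphrases
T'           a candidate set of keyphrases
E            the set of Wikipedia entities
e            an entity
c            a related term
P(t ∈ T)     probability that term t is a keyphrase
P(e|t)       probability that term t is linked to entity e
P(c|e)       probability of related term c given entity e
P(c|t)       probability of related term c given term t
P(c)         prior probability of related term c
P(c|T)       probability of related term c given the keyphrase set T
P(T = T')    probability that the keyphrase set T equals T'

Table 2: An example of the probability that a term is a keyphrase, P(t ∈ T). Terms listed: Apple, Apple Inc., Steve Jobs, Japan, China, tree, black, house. [Probability values not preserved.]

P(t ∈ T) is estimated from Wikipedia in the same way as the keyphraseness used for wikification [7]. With CountDocuments(t) the number of Wikipedia articles that contain t, and CountDocuments(t ∈ Key) the number of articles in which t appears as a keyphrase:

$$P(t \in T) \approx \frac{\mathrm{CountDocuments}(t \in Key)}{\mathrm{CountDocuments}(t)} \tag{1}$$

[The discussion of Table 2, which mentions TFIDF and the terms Apple Inc., Steve Jobs, black, and house, is not preserved.]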
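As a rough illustration of Eq. (1), the sketch below estimates P(t ∈ T) on a toy corpus. It assumes each article is available as a pair of plain text and the set of its anchor texts; the function name `keyphraseness`, the `TOY_DOCS` data, and the case-insensitive substring matching are all assumptions made for this example, not details taken from the paper.

```python
from typing import Iterable, Set, Tuple

Document = Tuple[str, Set[str]]  # (plain text, anchor texts marked in that text)


def keyphraseness(term: str, docs: Iterable[Document]) -> float:
    """Estimate P(t in T) = CountDocuments(t in Key) / CountDocuments(t)."""
    needle = term.lower()
    n_contains = 0  # CountDocuments(t): documents whose text contains the term
    n_as_key = 0    # CountDocuments(t in Key): documents where the term is an anchor
    for text, anchors in docs:
        # Simplification (assumption): substring match instead of real tokenization.
        if needle in text.lower():
            n_contains += 1
            if needle in {a.lower() for a in anchors}:
                n_as_key += 1
    return n_as_key / n_contains if n_contains else 0.0


# Hypothetical toy corpus, invented for illustration only.
TOY_DOCS = [
    ("Steve Jobs spoke in front of a small house.", {"Steve Jobs"}),
    ("The house was painted black.", set()),
    ("Steve Jobs presented a new Apple product.", {"Steve Jobs", "Apple"}),
    ("She ate an apple under the tree.", {"tree"}),
]

for t in ("Steve Jobs", "Apple", "house"):
    print(t, keyphraseness(t, TOY_DOCS))
```

On this toy data a proper noun that is always linked scores high, while a common word that is never linked scores zero, which is the behaviour the ratio in Eq. (1) is meant to capture.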
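Example text: "... and New York Times said ..." Terms: New York Times, New York, York.

P(e|t) is estimated from Wikipedia anchor texts [9]. With CountAnchortexts(t, e) the number of anchor texts t that link to entity e:

$$P(e \mid t) \approx \frac{\mathrm{CountAnchortexts}(t, e)}{\sum_{e_i \in E} \mathrm{CountAnchortexts}(t, e_i)} \tag{2}$$

where E is the set of Wikipedia entities.

Table 3: The probability that a term "Apple" is linked to an entity, P(e|t). The 8 entities listed: Apple Inc., Apple, Apple Records, Apple (album), Apple Corps, Apple Store, Apple (company), App Store. [Probability values not preserved.]

P(c|e) relates an entity e to a related term c. It is obtained either from Wikipedia links, with CountLinks(e, c) the number of links between e and c:

$$P(c \mid e) \approx \frac{\mathrm{CountLinks}(e, c)}{\sum_{c_j \in E} \mathrm{CountLinks}(e, c_j)} \tag{3}$$

or from ESA [2], with Sim(e, c) the ESA similarity between e and c:

$$P(c \mid e) \approx \frac{\mathrm{Sim}(e, c)}{\sum_{c_j \in E} \mathrm{Sim}(e, c_j)} \tag{4}$$

Table 4: Related terms of an entity Apple Inc. and their probability, P(c|e), obtained with ESA. The 8 related terms listed: AppleInsider, Apple Store, Steve Jobs, IPhone OS, IPod Touch, FairPlay, Mac OS X, Macworld. [Probability values not preserved.]

Combining Eq. (2) with P(c|e) gives the probability of a related term c given a term t:

$$P(c \mid t) = \sum_{e_i \in E} P(c \mid e_i)\, P(e_i \mid t) \tag{5}$$

The prior P(c) is estimated from the link count CountLinks(c) of c:

$$P(c) \approx \frac{\mathrm{CountLinks}(c)}{\sum_{c_j \in E} \mathrm{CountLinks}(c_j)} \tag{6}$$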
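The chain from Eq. (2) through Eq. (6) can be sketched with plain dictionaries of counts. Everything concrete here is an assumption for illustration: the helper names (`normalize_rows`, `related_given_term`, `prior`), the toy counts, and the reading of CountLinks(c) as the number of links pointing to c.

```python
from collections import defaultdict
from typing import Dict

Counts = Dict[str, Dict[str, int]]
Probs = Dict[str, Dict[str, float]]


def normalize_rows(counts: Counts) -> Probs:
    """Row-normalize counts, as in Eq. (2) for anchors and Eq. (3) for links."""
    out: Probs = {}
    for key, row in counts.items():
        total = sum(row.values())
        out[key] = {k: v / total for k, v in row.items()} if total else {}
    return out


def related_given_term(p_e_t: Probs, p_c_e: Probs, term: str) -> Dict[str, float]:
    """Eq. (5): P(c|t) = sum over entities e_i of P(c|e_i) * P(e_i|t)."""
    scores: Dict[str, float] = defaultdict(float)
    for entity, p_e in p_e_t.get(term, {}).items():
        for related, p_c in p_c_e.get(entity, {}).items():
            scores[related] += p_c * p_e
    return dict(scores)


def prior(link_counts: Counts) -> Dict[str, float]:
    """Eq. (6), assuming CountLinks(c) counts the links that point to c."""
    inlinks: Dict[str, int] = defaultdict(int)
    for row in link_counts.values():
        for related, n in row.items():
            inlinks[related] += n
    total = sum(inlinks.values())
    return {c: n / total for c, n in inlinks.items()}


# Hypothetical counts, invented for illustration only.
ANCHOR_COUNTS: Counts = {"Apple": {"Apple Inc.": 8, "Apple": 2}}
LINK_COUNTS: Counts = {
    "Apple Inc.": {"Steve Jobs": 3, "Apple Store": 1},
    "Apple": {"Fruit": 4},
}

P_E_GIVEN_T = normalize_rows(ANCHOR_COUNTS)   # Eq. (2)
P_C_GIVEN_E = normalize_rows(LINK_COUNTS)     # Eq. (3)
P_C_GIVEN_APPLE = related_given_term(P_E_GIVEN_T, P_C_GIVEN_E, "Apple")  # Eq. (5)
P_C = prior(LINK_COUNTS)                      # Eq. (6)

print(P_C_GIVEN_APPLE)
print(P_C)
```

Swapping `LINK_COUNTS` for a table of similarity scores would give the ESA-based variant of Eq. (4) with the same normalization step.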
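Given a set of keyphrases T = {t_1, ..., t_K}, P(c|T) is computed from the per-term probabilities P(c|t) with naive Bayes, following [14]:

$$P(c \mid T = \{t_1, \ldots, t_K\}) \propto P(c) \prod_{k=1}^{K} P(t_k \mid c) \propto \frac{\prod_{k=1}^{K} P(c \mid t_k)}{P(c)^{K-1}} \tag{7}$$

Eq. (7) presupposes that the members of T are known [13]. Here they are unobservable: each term t_1, t_2, t_3, ... belongs to T only with probability P(t_k ∈ T) from Eq. (1).

Fig. 1: Naive Bayes for a set of keyphrases in which members are unobservable. [Figure not preserved.]

The probability that the keyphrase set T equals a candidate set T' is

$$P(T = T') = \prod_{t_k \in T'} P(t_k \in T) \prod_{t_k \notin T'} P(t_k \notin T) = \prod_{t_k \in T'} P(t_k \in T) \prod_{t_k \notin T'} \bigl(1 - P(t_k \in T)\bigr) \tag{8}$$

From Eqs. (7) and (8), summing over all candidate sets T':

$$P(c \mid T) \propto \sum_{T'} P(T = T')\, \frac{\prod_{t_k \in T'} P(c \mid t_k)}{P(c)^{|T'| - 1}} \tag{9}$$

$$P(c \mid T) \propto \frac{\sum_{T'} \Bigl( \prod_{t_k \in T'} P(t_k \in T)\, P(c \mid t_k) \prod_{t_k \notin T'} \bigl(1 - P(t_k \in T)\bigr) P(c) \Bigr)}{P(c)^{K-1}} \tag{10}$$

Because every term t_k is either in T' or not, the sum over T' factorizes term by term (see also [13]):

$$P(c \mid T) \propto \frac{\prod_{k=1}^{K} \Bigl( P(t_k \in T)\, P(c \mid t_k) + \bigl(1 - P(t_k \in T)\bigr) P(c) \Bigr)}{P(c)^{K-1}} \tag{11}$$

Eq. (11) generalizes Eq. (7). When P(t_k ∈ T) = 1 for every k, it reduces to Eq. (7). As P(t_k ∈ T) grows, the factor for t_k approaches P(c|t_k) and the term has more influence; when P(t_k ∈ T) = 0, the factor equals P(c) and t_k has no effect on the result.

4.

[Section text not preserved in this transcription. It discusses Fig. 2, noting that tweets (a) and (b) both contain Microsoft, together with brand in (a) and Xbox Live in (b).]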
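A short numerical check can make the step from Eq. (9) to Eq. (11) concrete. The sketch below computes the closed form of Eq. (11) and compares it with a brute-force sum over all 2^K candidate sets as in Eq. (9). The function names and the probability values are hypothetical, chosen only for illustration; the scores are unnormalized, as the proportionality signs suggest.

```python
from itertools import combinations
from math import isclose, prod
from typing import Sequence


def nb_score(p_c_t: Sequence[float], p_c: float) -> float:
    """Eq. (7): prod_k P(c|t_k) / P(c)^(K-1), for a fully known keyphrase set."""
    return prod(p_c_t) / p_c ** (len(p_c_t) - 1)


def extended_nb_score(p_key: Sequence[float], p_c_t: Sequence[float], p_c: float) -> float:
    """Eq. (11): each term contributes a mixture of P(c|t_k) and the prior P(c)."""
    k = len(p_key)
    factors = (pk * pct + (1.0 - pk) * p_c for pk, pct in zip(p_key, p_c_t))
    return prod(factors) / p_c ** (k - 1)


def marginal_score(p_key: Sequence[float], p_c_t: Sequence[float], p_c: float) -> float:
    """Eq. (9): explicit sum over every candidate set T' (exponential in K)."""
    k = len(p_key)
    total = 0.0
    for size in range(k + 1):
        for subset in combinations(range(k), size):
            chosen = set(subset)
            # Eq. (8): probability that exactly the chosen terms are keyphrases.
            p_subset = prod(p_key[i] if i in chosen else 1.0 - p_key[i] for i in range(k))
            # Eq. (7) applied to the chosen terms only.
            score = prod(p_c_t[i] for i in chosen) / p_c ** (len(chosen) - 1)
            total += p_subset * score
    return total


# Hypothetical values for three terms and one related term c.
P_KEY = [0.9, 0.5, 0.1]       # P(t_k in T)
P_C_T = [0.20, 0.05, 0.01]    # P(c | t_k)
P_C = 0.02                    # P(c)

closed_form = extended_nb_score(P_KEY, P_C_T, P_C)
brute_force = marginal_score(P_KEY, P_C_T, P_C)
print(isclose(closed_form, brute_force, rel_tol=1e-9))  # True
```

The closed form needs K multiplications per related term instead of 2^K subset evaluations, which is presumably why the factorized form is the practical one for longer queries.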
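Fig. 2: Related terms obtained by our method (value means probability). [Figure not preserved; the input texts and the surviving related terms are listed below.]

(a) Did you know that Microsoft is the most influential brand in Canada? (related terms: Microsoft)
(b) Microsoft denies Xbox Live security breach (related terms: Xbox, Microsoft)
(c) Warriors beat the Heat... Happy face! (related terms: NBA)
(d) McClennan names Warriors lineup for first pre-season trial

[The discussion of (c) and (d) is not preserved. It concerns the term Warriors, which can refer to Golden State Warriors or New Zealand Warriors, with Heat and NBA mentioned for (c) and McClennan for (d).]

[The description of the evaluation is not preserved. Surviving fragments indicate that Twitter posts are clustered with K-means and that hashtags such as #Obama and #MacBook define the datasets (cf. [5]). The datasets follow [14].]

Table 5: Three datasets for evaluation and their statistics (number of tweets in parentheses).

U                      IT                      S
#Obama (779)           #MacBook (1,251)        #NFL (1,043)
#Bones (949)           #Silverlight (221)      #NHL (1,045)
#PGA (1,243)           #VMWare (890)           #NBA (1,085)
#Microsoft (1,040)     #MySQL (1,241)          #MLB (752)
#medicine (1,109)      #Ubuntu (988)           #MLS (969)
#Christ (871)          #Chrome (1,018)         #UFC (984)
                                               #NASCAR (857)

Statistic              U          IT         S
Tweets                 5,991      5,609      6,735
[label not preserved]  83,748     82,608     91,...
[label not preserved]  ...,636    16,539     18,603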
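Table 6: The result of clustering. Columns: purity, NMI, and ARI, each reported for the U, IT, and S datasets. Rows: BOW; ESA with parameter values 10, 20, 50, 100, 200, 500, 1,000, 2,000, and 5,000; the proposed method with the same nine values; and the proposed method using ESA with the same nine values. [Numerical values not preserved.]

[The preprocessing description is not preserved. It lists five steps, 1) to 5); surviving fragments show that step 3 mentions RT and URL and step 4 mentions #.]

The compared methods are bag-of-words (BOW), ESA by Gabrilovich et al. [2] (ESA), and the proposed method. The evaluation measures are purity [17], normalized mutual information (NMI) [15], and the adjusted Rand index (ARI) [3]. Clustering is performed with K-means. [The remaining text, which mentions false-positive and false-negative decisions, the range 0 to 1, and Song et al. [14], is not preserved.]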
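For readers who want to see how the three measures named above are computed, here is a minimal sketch using only the standard library. It assumes the common contingency-table definitions: purity as the fraction of items in the majority class of their cluster, NMI normalized by the geometric mean of the two entropies, and ARI in the Hubert and Arabie form. The function names, the zero-entropy convention in `nmi`, and the toy labels are choices made for this example.

```python
from collections import Counter
from math import comb, log, sqrt
from typing import Hashable, Sequence

Labels = Sequence[Hashable]


def purity(labels: Labels, clusters: Labels) -> float:
    """Fraction of items that belong to the majority class of their cluster."""
    best: Counter = Counter()
    for (cluster, _label), n in Counter(zip(clusters, labels)).items():
        best[cluster] = max(best[cluster], n)
    return sum(best.values()) / len(labels)


def nmi(labels: Labels, clusters: Labels) -> float:
    """Mutual information divided by sqrt(H(clusters) * H(labels))."""
    n = len(labels)
    a, b = Counter(clusters), Counter(labels)
    mi = sum(
        nij / n * log(n * nij / (a[k] * b[l]))
        for (k, l), nij in Counter(zip(clusters, labels)).items()
    )
    h_a = -sum(c / n * log(c / n) for c in a.values())
    h_b = -sum(c / n * log(c / n) for c in b.values())
    denom = sqrt(h_a * h_b)
    # Convention chosen here: return 0.0 when either partition has zero entropy.
    return mi / denom if denom > 0 else 0.0


def ari(labels: Labels, clusters: Labels) -> float:
    """Adjusted Rand index from pair counts; assumes at least two items."""
    n = len(labels)
    sum_ij = sum(comb(v, 2) for v in Counter(zip(clusters, labels)).values())
    sum_a = sum(comb(v, 2) for v in Counter(clusters).values())
    sum_b = sum(comb(v, 2) for v in Counter(labels).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # both partitions trivial and identical
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


# Hypothetical example: six tweets from two hashtag classes, one tweet misplaced.
GOLD = ["#NBA", "#NBA", "#NBA", "#NFL", "#NFL", "#NFL"]
PRED = [0, 0, 1, 1, 1, 1]

print(purity(GOLD, PRED), nmi(GOLD, PRED), ari(GOLD, PRED))
```

Purity alone rewards over-splitting (one cluster per item gives purity 1), which is one reason to report it alongside NMI and ARI.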
[The discussion of Table 6 is not preserved. Surviving fragments indicate that it compares BOW, ESA, and the proposed method in terms of purity, ARI, and NMI on the U, IT, and S datasets.]

6.

[Section text and acknowledgments not preserved in this transcription.]

References

[1] P. Ferragina and U. Scaiella, "TAGME: On-the-fly Annotation of Short Text Fragments (by Wikipedia Entities)," Proc. of ACM Conference on Information and Knowledge Management (CIKM), Oct.
[2] E. Gabrilovich and S. Markovitch, "Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis," Proc. of International Joint Conference on Artificial Intelligence (IJCAI), Jan.
[3] L. Hubert and P. Arabie, "Comparing Partitions," Journal of Classification, vol. 2, no. 1.
[4] M. Ito, K. Nakayama, T. Hara, and S. Nishio, "Association Thesaurus Construction Methods based on Link Co-occurrence Analysis for Wikipedia," Proc. of ACM Conference on Information and Knowledge Management (CIKM), Oct.
[5] D. Laniado and P. Mika, "Making Sense of Twitter," Proc. of International Semantic Web Conference (ISWC), Nov.
[6] E. Meij, W. Weerkamp, and M. de Rijke, "Adding Semantics to Microblog Posts," Proc. of ACM International Conference on Web Search and Data Mining (WSDM), Feb.
[7] R. Mihalcea and A. Csomai, "Wikify! Linking Documents to Encyclopedic Knowledge," Proc. of ACM Conference on Information and Knowledge Management (CIKM), Nov.
[8] D. Milne and I.H. Witten, "An Effective, Low-Cost Measure of Semantic Relatedness Obtained from Wikipedia Links," Proc. of AAAI Workshop on Wikipedia and Artificial Intelligence (WIKIAI), pp. 25-30, July.
[9] D. Milne and I.H. Witten, "Learning to Link with Wikipedia," Proc. of ACM Conference on Information and Knowledge Management (CIKM), Oct.
[10] K. Nakayama, T. Hara, and S. Nishio, "Wikipedia Mining for An Association Web Thesaurus Construction," Proc. of International Conference on Web Information Systems Engineering (WISE), Dec.
[11] Y. Ollivier and P. Senellart, "Finding Related Pages Using Green Measures: An Illustration with Wikipedia," Proc. of National Conference on Artificial Intelligence (AAAI), July.
[12] S.P. Ponzetto and M. Strube, "Exploiting Semantic Role Labeling, WordNet and Wikipedia for Coreference Resolution," Proc. of Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL), June.
[13] M. Shirakawa, H. Wang, Y. Song, Z. Wang, K. Nakayama, T. Hara, and S. Nishio, "Entity Disambiguation based on a Probabilistic Taxonomy," Tech. Rep. MSR-TR, Microsoft Research, Nov.
[14] Y. Song, H. Wang, Z. Wang, H. Li, and W. Chen, "Short Text Conceptualization Using a Probabilistic Knowledgebase," Proc. of International Joint Conference on Artificial Intelligence (IJCAI), July.
[15] A. Strehl and J. Ghosh, "Cluster Ensembles - A Knowledge Reuse Framework for Combining Multiple Partitions," Journal of Machine Learning Research, vol. 3, Dec.
[16] M. Strube and S.P. Ponzetto, "WikiRelate! Computing Semantic Relatedness using Wikipedia," Proc. of National Conference on Artificial Intelligence (AAAI), July.
[17] Y. Zhao and G. Karypis, "Criterion Functions for Document Clustering: Experiments and Analysis," Tech. Rep. #01-40, Department of Computer Science, University of Minnesota, Feb.
Probabilistic Semantic Similarity Measurements for Noisy Short Texts Using Wikipedia Entities
Probabilistic Semantic Similarity Measurements for Noisy Short Texts Using Wikipedia Entities Masumi Shirakawa Kotaro Nakayama Takahiro Hara Shojiro Nishio Graduate School of Information Science and Technology,
More informationClustering Documents with Active Learning using Wikipedia
Clustering Documents with Active Learning using Wikipedia Anna Huang David Milne Eibe Frank Ian H. Witten Department of Computer Science, University of Waikato Private Bag 3105, Hamilton, New Zealand {lh92,
More informationSemantic Relatedness Metric for Wikipedia Concepts Based on Link Analysis and its Application to Word Sense Disambiguation
Semantic Relatedness Metric for Wikipedia Concepts Based on Link Analysis and its Application to Word Sense Disambiguation Denis Turdakov, Pavel Velikhov ISP RAS turdakov@ispras.ru, pvelikhov@yahoo.com
More informationLocal and Global Algorithms for Disambiguation to Wikipedia
ACL 11 Local and Global Algorithms for Disambiguation to Wikipedia Lev Ratinov 1 Dan Roth 1 Doug Downey 2 Mike Anderson 3 1 University of Illinois at Urbana-Champaign {ratinov2 danr}@uiuc.edu 2 Northwestern
More informationLocal and Global Algorithms for Disambiguation to Wikipedia
ACL 11 Local and Global Algorithms for Disambiguation to Wikipedia Lev Ratinov 1 Dan Roth 1 Doug Downey 2 Mike Anderson 3 1 University of Illinois at Urbana-Champaign {ratinov2 danr}@uiuc.edu 2 Northwestern
More informationImproving Question Retrieval in Community Question Answering Using World Knowledge
Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence Improving Question Retrieval in Community Question Answering Using World Knowledge Guangyou Zhou, Yang Liu, Fang
More informationChapter ML:XI (continued)
Chapter ML:XI (continued) XI. Cluster Analysis Data Mining Overview Cluster Analysis Basics Hierarchical Cluster Analysis Iterative Cluster Analysis Density-Based Cluster Analysis Cluster Evaluation Constrained
More informationCIRGIRDISCO at RepLab2014 Reputation Dimension Task: Using Wikipedia Graph Structure for Classifying the Reputation Dimension of a Tweet
CIRGIRDISCO at RepLab2014 Reputation Dimension Task: Using Wikipedia Graph Structure for Classifying the Reputation Dimension of a Tweet Muhammad Atif Qureshi 1,2, Arjumand Younus 1,2, Colm O Riordan 1,
More informationSemantic Relationship Discovery with Wikipedia Structure
Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence Semantic Relationship Discovery with Wikipedia Structure Fan Bu, Yu Hao and Xiaoyan Zhu State Key Laboratory of
More informationTwitter Stock Bot. John Matthew Fong The University of Texas at Austin jmfong@cs.utexas.edu
Twitter Stock Bot John Matthew Fong The University of Texas at Austin jmfong@cs.utexas.edu Hassaan Markhiani The University of Texas at Austin hassaan@cs.utexas.edu Abstract The stock market is influenced
More informationTwitter Sentiment Analysis of Movie Reviews using Machine Learning Techniques.
Twitter Sentiment Analysis of Movie Reviews using Machine Learning Techniques. Akshay Amolik, Niketan Jivane, Mahavir Bhandari, Dr.M.Venkatesan School of Computer Science and Engineering, VIT University,
More informationSheeba J.I1, Vivekanandan K2
IMPROVED UNSUPERVISED FRAMEWORK FOR SOLVING SYNONYM, HOMONYM, HYPONYMY & POLYSEMY PROBLEMS FROM EXTRACTED KEYWORDS AND IDENTIFY TOPICS IN MEETING TRANSCRIPTS Sheeba J.I1, Vivekanandan K2 1 Assistant Professor,sheeba@pec.edu
More informationCOMPUTATION OF THE SEMANTIC RELATEDNESS BETWEEN WORDS USING CONCEPT CLOUDS
COMPUTATION OF THE SEMANTIC RELATEDNESS BETWEEN WORDS USING CONCEPT CLOUDS Swarnim Kulkarni and Doina Caragea Department of Computing and Information Sciences, Kansas State University, Manhattan, Kansas,
More informationThe Effect of Clustering in the Apriori Data Mining Algorithm: A Case Study
WCE 23, July 3-5, 23, London, U.K. The Effect of Clustering in the Apriori Data Mining Algorithm: A Case Study Nergis Yılmaz and Gülfem Işıklar Alptekin Abstract Many organizations collect and store data
More informationConcept Term Expansion Approach for Monitoring Reputation of Companies on Twitter
Concept Term Expansion Approach for Monitoring Reputation of Companies on Twitter M. Atif Qureshi 1,2, Colm O Riordan 1, and Gabriella Pasi 2 1 Computational Intelligence Research Group, National University
More informationAn Introduction to Data Mining
An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
More informationAn Open-Source Toolkit for Mining Wikipedia
An Open-Source Toolkit for Mining Wikipedia David Milne Department of Computer Science, University of Waikato Private Bag 3105, Hamilton, New Zealand +64 7 856 2889 (ext. 6038) d.n.milne@gmail.com ABSTRACT
More informationAnalysis One Code Desc. Transaction Amount. Fiscal Period
Analysis One Code Desc Transaction Amount Fiscal Period 57.63 Oct-12 12.13 Oct-12-38.90 Oct-12-773.00 Oct-12-800.00 Oct-12-187.00 Oct-12-82.00 Oct-12-82.00 Oct-12-110.00 Oct-12-1115.25 Oct-12-71.00 Oct-12-41.00
More informationComputing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis
Computing Semantic Relatedness using Wikipedia-based Explicit Semantic Analysis Evgeniy Gabrilovich and Shaul Markovitch Department of Computer Science Technion Israel Institute of Technology, 32000 Haifa,
More informationINTERNATIONAL JOURNAL FOR TRENDS IN ENGINEERING & TECHNOLOGY VOLUME 3 ISSUE
Enhancing Implicit Relations in Wikipedia Mining Using Object Relationship Technique G.Shanmugapriya 1 1 B.S Abdur Rahman University, Computer Science, sarushiya@gmail.com S.Raja shaik 2 2 B.S Abdur Rahman
More informationEmoticon Smoothed Language Models for Twitter Sentiment Analysis
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence Emoticon Smoothed Language Models for Twitter Sentiment Analysis Kun-Lin Liu, Wu-Jun Li, Minyi Guo Shanghai Key Laboratory of
More informationWikipedia-based Semantic Interpretation for Natural Language Processing
Journal of Artificial Intelligence Research 34 (2009) 443-498 Submitted 08/08; published 03/09 Wikipedia-based Semantic Interpretation for Natural Language Processing Evgeniy Gabrilovich Shaul Markovitch
More informationSentiment Analysis and Topic Classification: Case study over Spanish tweets
Sentiment Analysis and Topic Classification: Case study over Spanish tweets Fernando Batista, Ricardo Ribeiro Laboratório de Sistemas de Língua Falada, INESC- ID Lisboa R. Alves Redol, 9, 1000-029 Lisboa,
More informationCitationBase: A social tagging management portal for references
CitationBase: A social tagging management portal for references Martin Hofmann Department of Computer Science, University of Innsbruck, Austria m_ho@aon.at Ying Ding School of Library and Information Science,
More informationImpact of Feature Selection Technique on Email Classification
Impact of Feature Selection Technique on Email Classification Aakanksha Sharaff, Naresh Kumar Nagwani, and Kunal Swami Abstract Being one of the most powerful and fastest way of communication, the popularity
More informationSentiment analysis on tweets in a financial domain
Sentiment analysis on tweets in a financial domain Jasmina Smailović 1,2, Miha Grčar 1, Martin Žnidaršič 1 1 Dept of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia 2 Jožef Stefan International
More informationUnderstanding User s Query Intent with Wikipedia
Understanding User s Query Intent with Wikipedia Jian Hu 1, Gang Wang 1, Fred Lochovsky 2, Jian-Tao Sun 1, Zheng Chen 1 1 Microsoft Research Asia 2 The Hong Kong University of Science & Technology No.
More informationWIKITOLOGY: A NOVEL HYBRID KNOWLEDGE BASE DERIVED FROM WIKIPEDIA. by Zareen Saba Syed
WIKITOLOGY: A NOVEL HYBRID KNOWLEDGE BASE DERIVED FROM WIKIPEDIA by Zareen Saba Syed Thesis submitted to the Faculty of the Graduate School of the University of Maryland in partial fulfillment of the requirements
More informationIdentifying free text plagiarism based on semantic similarity
Identifying free text plagiarism based on semantic similarity George Tsatsaronis Norwegian University of Science and Technology Department of Computer and Information Science Trondheim, Norway gbt@idi.ntnu.no
More informationDBTech Pro Workshop. Knowledge Discovery from Databases (KDD) Including Data Warehousing and Data Mining. Georgios Evangelidis
DBTechNet DBTech Pro Workshop Knowledge Discovery from Databases (KDD) Including Data Warehousing and Data Mining Dimitris A. Dervos dad@it.teithe.gr http://aetos.it.teithe.gr/~dad Georgios Evangelidis
More informationSense and Reference Disambiguation in Wikipedia. Dezambiguizare de Sensuri si Referinte in Wikipedia
Sense and Reference Disambiguation in Wikipedia Hui SH EN 1 Razvan BUN ESCU 1 Rada M IHALC EA 2 (1) School of Electrical Engineering and Computer Science, Ohio University, Athens, OH (2) Department of
More informationSentiment analysis on news articles using Natural Language Processing and Machine Learning Approach.
Sentiment analysis on news articles using Natural Language Processing and Machine Learning Approach. Pranali Chilekar 1, Swati Ubale 2, Pragati Sonkambale 3, Reema Panarkar 4, Gopal Upadhye 5 1 2 3 4 5
More informationResearch on Clustering Analysis of Big Data Yuan Yuanming 1, 2, a, Wu Chanle 1, 2
Advanced Engineering Forum Vols. 6-7 (2012) pp 82-87 Online: 2012-09-26 (2012) Trans Tech Publications, Switzerland doi:10.4028/www.scientific.net/aef.6-7.82 Research on Clustering Analysis of Big Data
More informationHarvesting and Structuring Social Data in Music Information Retrieval
Harvesting and Structuring Social Data in Music Information Retrieval Sergio Oramas Music Technology Group Universitat Pompeu Fabra, Barcelona, Spain sergio.oramas@upf.edu Abstract. An exponentially growing
More informationUNED Online Reputation Monitoring Team at RepLab 2013
UNED Online Reputation Monitoring Team at RepLab 2013 Damiano Spina, Jorge Carrillo-de-Albornoz, Tamara Martín, Enrique Amigó, Julio Gonzalo, and Fernando Giner {damiano,jcalbornoz,tmartin,enrique,julio}@lsi.uned.es,
More informationDiscovering and Querying Hybrid Linked Data
Discovering and Querying Hybrid Linked Data Zareen Syed 1, Tim Finin 1, Muhammad Rahman 1, James Kukla 2, Jeehye Yun 2 1 University of Maryland Baltimore County 1000 Hilltop Circle, MD, USA 21250 zsyed@umbc.edu,
More informationInternational Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 5, Sep-Oct 2015
RESEARCH ARTICLE Multi Document Utility Presentation Using Sentiment Analysis Mayur S. Dhote [1], Prof. S. S. Sonawane [2] Department of Computer Science and Engineering PICT, Savitribai Phule Pune University
More informationEffective Mentor Suggestion System for Collaborative Learning
Effective Mentor Suggestion System for Collaborative Learning Advait Raut 1 U pasana G 2 Ramakrishna Bairi 3 Ganesh Ramakrishnan 2 (1) IBM, Bangalore, India, 560045 (2) IITB, Mumbai, India, 400076 (3)
More informationAnnotation for the Semantic Web during Website Development
Annotation for the Semantic Web during Website Development Peter Plessers, Olga De Troyer Vrije Universiteit Brussel, Department of Computer Science, WISE, Pleinlaan 2, 1050 Brussel, Belgium {Peter.Plessers,
More informationIntegrating Cyc and Wikipedia: Folksonomy Meets Rigorously Defined Common-Sense
Integrating Cyc and Wikipedia: Folksonomy Meets Rigorously Defined Common-Sense Olena Medelyan Department of Computer Science University of Waikato, New Zealand olena@cs.waikato.ac.nz Catherine Legg Department
More informationCloud Computing an introduction
Prof. Dr. Claudia Müller-Birn Institute for Computer Science, Networked Information Systems Cloud Computing an introduction January 30, 2012 Netzprogrammierung (Algorithmen und Programmierung V) Our topics
More informationAN APPROACH TO WORD SENSE DISAMBIGUATION COMBINING MODIFIED LESK AND BAG-OF-WORDS
AN APPROACH TO WORD SENSE DISAMBIGUATION COMBINING MODIFIED LESK AND BAG-OF-WORDS Alok Ranjan Pal 1, 3, Anirban Kundu 2, 3, Abhay Singh 1, Raj Shekhar 1, Kunal Sinha 1 1 College of Engineering and Management,
More informationDiscovering Filter Keywords for Company Name Disambiguation in Twitter
Discovering Filter Keywords for Company Name Disambiguation in Twitter Damiano Spina, Julio Gonzalo, Enrique Amigó UNED NLP & IR Group Juan del Rosal, 16 28040 Madrid, Spain http: // nlp. uned. es Abstract
More informationExperiments in Web Page Classification for Semantic Web
Experiments in Web Page Classification for Semantic Web Asad Satti, Nick Cercone, Vlado Kešelj Faculty of Computer Science, Dalhousie University E-mail: {rashid,nick,vlado}@cs.dal.ca Abstract We address
More informationHorizontal Aggregations in SQL to Prepare Data Sets for Data Mining Analysis
IOSR Journal of Computer Engineering (IOSRJCE) ISSN: 2278-0661, ISBN: 2278-8727 Volume 6, Issue 5 (Nov. - Dec. 2012), PP 36-41 Horizontal Aggregations in SQL to Prepare Data Sets for Data Mining Analysis
More informationWord Sense Disambiguation as an Integer Linear Programming Problem
Word Sense Disambiguation as an Integer Linear Programming Problem Vicky Panagiotopoulou 1, Iraklis Varlamis 2, Ion Androutsopoulos 1, and George Tsatsaronis 3 1 Department of Informatics, Athens University
More informationBT Lancashire Services
In confidence BT Lancashire Services Remote Access to Corporate Desktop (RACD) Getting Started Guide Working in partnership Confidentiality Statement BT Lancashire Services Certain information given to
More informationSustaining Privacy Protection in Personalized Web Search with Temporal Behavior
Sustaining Privacy Protection in Personalized Web Search with Temporal Behavior N.Jagatheshwaran 1 R.Menaka 2 1 Final B.Tech (IT), jagatheshwaran.n@gmail.com, Velalar College of Engineering and Technology,
More informationYifan Chen, Guirong Xue and Yong Yu Apex Data & Knowledge Management LabShanghai Jiao Tong University
Yifan Chen, Guirong Xue and Yong Yu Apex Data & Knowledge Management LabShanghai Jiao Tong University Presented by Qiang Yang, Hong Kong Univ. of Science and Technology 1 In a Search Engine Company Advertisers
More informationOn the Evolution of Wikipedia: Dynamics of Categories and Articles
Wikipedia, a Social Pedia: Research Challenges and Opportunities: Papers from the 2015 ICWSM Workshop On the Evolution of Wikipedia: Dynamics of Categories and Articles Ramakrishna B. Bairi IITB-Monash
More informationBuilding Semantic Kernels for Text Classification using Wikipedia
Building Semantic Kernels for Text Classification using Wikipedia Pu Wang and Carlotta Domeniconi Department of Computer Science George Mason University pwang7@gmuedu, carlotta@csgmuedu ABSTRACT Document
More informationFacilitating Business Process Discovery using Email Analysis
Facilitating Business Process Discovery using Email Analysis Matin Mavaddat Matin.Mavaddat@live.uwe.ac.uk Stewart Green Stewart.Green Ian Beeson Ian.Beeson Jin Sa Jin.Sa Abstract Extracting business process
More informationREUSING DISCUSSION FORUMS AS LEARNING RESOURCES IN WBT SYSTEMS
REUSING DISCUSSION FORUMS AS LEARNING RESOURCES IN WBT SYSTEMS Denis Helic, Hermann Maurer, Nick Scerbakov IICM, University of Technology Graz Austria ABSTRACT Discussion forums are highly popular and
More informationOn Analyzing Hashtags in Twitter
Proceedings of the Ninth International AAAI Conference on Web and Social Media On Analyzing Hashtags in Twitter Paolo Ferragina Francesco Piccinno Roberto Santoro Dipartimento di Informatica University
More informationAccess Your Cisco Smart Storage Remotely Via WebDAV
Application Note Access Your Cisco Smart Storage Remotely Via WebDAV WebDAV (Web-based Distributed Authoring and Versioning), is a set of extensions to the HTTP(S) protocol that allows a web server to
More informationSoftware Defect Prediction for Quality Improvement Using Hybrid Approach
Software Defect Prediction for Quality Improvement Using Hybrid Approach 1 Pooja Paramshetti, 2 D. A. Phalke D.Y. Patil College of Engineering, Akurdi, Pune. Savitribai Phule Pune University ABSTRACT In
More informationEfficient Integration of Data Mining Techniques in Database Management Systems
Efficient Integration of Data Mining Techniques in Database Management Systems Fadila Bentayeb Jérôme Darmont Cédric Udréa ERIC, University of Lyon 2 5 avenue Pierre Mendès-France 69676 Bron Cedex France
More informationImproving Classification of Multi-Lingual Web Documents using Domain Ontologies
Improving Classification of Multi-Lingual Web Documents using Domain Ontologies Marina Litvak, Mark Last, and Slava Kisilevich Department of Information Systems Engineering, Ben-Gurion University of the
More informationCOLINDA - Conference Linked Data
Undefined 1 (0) 1 5 1 IOS Press COLINDA - Conference Linked Data Editor(s): Name Surname, University, Country Solicited review(s): Name Surname, University, Country Open review(s): Name Surname, University,
More informationQUANTIFYING THE EFFECTS OF ONLINE BULLISHNESS ON INTERNATIONAL FINANCIAL MARKETS
QUANTIFYING THE EFFECTS OF ONLINE BULLISHNESS ON INTERNATIONAL FINANCIAL MARKETS Huina Mao School of Informatics and Computing Indiana University, Bloomington, USA ECB Workshop on Using Big Data for Forecasting
More informationComputer-Based Text- and Data Analysis Technologies and Applications. Mark Cieliebak 9.6.2015
Computer-Based Text- and Data Analysis Technologies and Applications Mark Cieliebak 9.6.2015 Data Scientist analyze Data Library use 2 About Me Mark Cieliebak + Software Engineer & Data Scientist + PhD
More informationUniversity of Glasgow Terrier Team / Project Abacá at RepLab 2014: Reputation Dimensions Task
University of Glasgow Terrier Team / Project Abacá at RepLab 2014: Reputation Dimensions Task Graham McDonald, Romain Deveaud, Richard McCreadie, Timothy Gollins, Craig Macdonald and Iadh Ounis School
More informationFiltering Noisy Contents in Online Social Network by using Rule Based Filtering System
Filtering Noisy Contents in Online Social Network by using Rule Based Filtering System Bala Kumari P 1, Bercelin Rose Mary W 2 and Devi Mareeswari M 3 1, 2, 3 M.TECH / IT, Dr.Sivanthi Aditanar College
More informationKeyword Optimization in Sponsored Search via Feature Selection
JMLR: Workshop and Conference Proceedings 4: 122-134 New challenges for feature selection Keyword Optimization in Sponsored Search via Feature Selection Svetlana Kiritchenko Institute for Information Technology
More informationHow To Make Sense Of Data With Altilia
HOW TO MAKE SENSE OF BIG DATA TO BETTER DRIVE BUSINESS PROCESSES, IMPROVE DECISION-MAKING, AND SUCCESSFULLY COMPETE IN TODAY S MARKETS. ALTILIA turns Big Data into Smart Data and enables businesses to
More informationUsing Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
More informationThe 2006 IEEE / WIC / ACM International Conference on Web Intelligence Hong Kong, China
WISE: Hierarchical Soft Clustering of Web Page Search based on Web Content Mining Techniques Ricardo Campos 1, 2 Gaël Dias 2 Célia Nunes 2 1 Instituto Politécnico de Tomar Tomar, Portugal 2 Centre of Human
More informationAutomatic Annotation Wrapper Generation and Mining Web Database Search Result
Automatic Annotation Wrapper Generation and Mining Web Database Search Result V.Yogam 1, K.Umamaheswari 2 1 PG student, ME Software Engineering, Anna University (BIT campus), Trichy, Tamil nadu, India
More informationUsing Semantic Data Mining for Classification Improvement and Knowledge Extraction
Using Semantic Data Mining for Classification Improvement and Knowledge Extraction Fernando Benites and Elena Sapozhnikova University of Konstanz, 78464 Konstanz, Germany. Abstract. The objective of this
More informationRule based Classification of BSE Stock Data with Data Mining
International Journal of Information Sciences and Application. ISSN 0974-2255 Volume 4, Number 1 (2012), pp. 1-9 International Research Publication House http://www.irphouse.com Rule based Classification
More informationProfile Based Personalized Web Search and Download Blocker
Profile Based Personalized Web Search and Download Blocker 1 K.Sheeba, 2 G.Kalaiarasi Dhanalakshmi Srinivasan College of Engineering and Technology, Mamallapuram, Chennai, Tamil nadu, India Email: 1 sheebaoec@gmail.com,
More informationEfficient Query Optimizing System for Searching Using Data Mining Technique
Vol.1, Issue.2, pp-347-351 ISSN: 2249-6645 Efficient Query Optimizing System for Searching Using Data Mining Technique Velmurugan.N Vijayaraj.A Assistant Professor, Department of MCA, Associate Professor,
More informationSearch Result Optimization using Annotators
Search Result Optimization using Annotators Vishal A. Kamble 1, Amit B. Chougule 2 1 Department of Computer Science and Engineering, D Y Patil College of engineering, Kolhapur, Maharashtra, India 2 Professor,
More informationCREATING MINIMIZED DATA SETS BY USING HORIZONTAL AGGREGATIONS IN SQL FOR DATA MINING ANALYSIS
CREATING MINIMIZED DATA SETS BY USING HORIZONTAL AGGREGATIONS IN SQL FOR DATA MINING ANALYSIS Subbarao Jasti #1, Dr.D.Vasumathi *2 1 Student & Department of CS & JNTU, AP, India 2 Professor & Department
More informationA Novel Framework for Personalized Web Search
A Novel Framework for Personalized Web Search Aditi Sharan a, * Mayank Saini a a School of Computer and Systems Sciences, Jawaharlal Nehru University, New Delhi-67, India Abstract One hundred users, one
More informationAn Adaptive Method for Organization Name Disambiguation. with Feature Reinforcing
An Adaptive Method for Organization Name Disambiguation with Feature Reinforcing Shu Zhang 1, Jianwei Wu 2, Dequan Zheng 2, Yao Meng 1 and Hao Yu 1 1 Fujitsu Research and Development Center Dong Si Huan
More informationAdditional details >>> HERE <<<
Additional details >>> HERE http://dbvir.com/winningtip/pdx/nasl3500/