Summarizing microblog stream


SIG-SWO-A

Hiroya Takamura, Hikaru Yokono, Manabu Okumura
Precision and Intelligence Laboratory, Tokyo Institute of Technology
takamura@pi.titech.ac.jp

Abstract: We address the task of summarizing numerous short documents on microblogs such as Twitter. On microblogs, users post thousands of short documents on a given topic, such as a sports game or a TV drama. Two characteristics of microblog data stand out: the documents are often highly redundant, and they are aligned on a timeline. There can be dozens of documents about a single event within a topic, while two very similar documents may refer to two distinct events when they are temporally distant. We examine microblog data to better understand these characteristics and propose a summarization model for numerous short documents on a timeline, along with a fast approximate algorithm for generating a summary. We show empirically that our model generates good summaries on a dataset of microblog documents about sports games.

1

[Section text not preserved in this transcription. Surviving terms and citations: Twitter, tweet, Ustream, [6], [12], Twitter Streaming API, characteristic (i).]

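[Characteristic (ii); the accompanying text and the caption numbered 2 are not preserved in this transcription.]

3

Following Takamura and Okumura [12], summarization is formulated as a p-median problem, a kind of facility location problem [2]. Let $e_{ij}$ denote the degree to which document $d_i$ covers document $d_j$. The variable $z_{ij}$ is 1 if $d_j$ is assigned to $d_i$ and 0 otherwise, so the total coverage is $\sum_{i,j} e_{ij} z_{ij}$. The variable $x_i$ is 1 if $d_i$ is selected and 0 otherwise, so $\sum_i x_i$ is the number of selected documents, which is bounded by $p$. The p-median problem is:

\begin{align}
\max. \quad & \sum_{i,j} e_{ij} z_{ij} \notag \\
\text{s.t.} \quad & z_{ij} \le x_i; \quad \forall i, j, \tag{1} \\
& \sum_i x_i \le p, \tag{2} \\
& \sum_i z_{ij} = 1; \quad \forall j, \tag{3} \\
& z_{ii} = x_i; \quad \forall i, \tag{4} \\
& x_i \in \{0, 1\}; \quad \forall i, \tag{5} \\
& z_{ij} \in \{0, 1\}; \quad \forall i, j. \tag{6}
\end{align}

Constraint (1) allows a document to be assigned only to a selected document, (2) limits the number of selected documents, (3) assigns every document to exactly one document, and (4) assigns every selected document to itself. The problem is NP-hard [3].

The coefficient $e_{ij}$ is computed from $d_i$ and $d_j$ as in Takamura and Okumura [12] (Eq. 7; the operators in the formula are not preserved in this transcription).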
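To make the formulation in (1) to (6) concrete, the following is a minimal sketch that solves a tiny instance by exhaustive search rather than with an integer programming solver. The function name `solve_p_median` and the coverage matrix `E` are hypothetical and chosen only for illustration; the values in `E` are made up and do not come from the paper's data.

```python
from itertools import combinations


def solve_p_median(e, p):
    """Exhaustively solve a small p-median instance.

    e[i][j] is an assumed coverage score of document j by document i.
    Returns (best_score, selected_indices).
    """
    n = len(e)
    best_score, best_set = float("-inf"), ()
    # Constraint (2): at most p documents may be selected.
    for size in range(1, p + 1):
        for selected in combinations(range(n), size):
            score = 0.0
            for j in range(n):
                if j in selected:
                    # Constraint (4): a selected document is assigned to itself.
                    score += e[j][j]
                else:
                    # Constraints (1) and (3): assign j to exactly one selected
                    # document; the best choice maximizes the objective.
                    score += max(e[i][j] for i in selected)
            if score > best_score:
                best_score, best_set = score, selected
    return best_score, best_set


# Hypothetical coverage matrix for four documents (two near-duplicate pairs).
E = [
    [1.0, 0.9, 0.1, 0.0],
    [0.6, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.9],
    [0.0, 0.1, 0.5, 1.0],
]
score, chosen = solve_p_median(E, p=2)
print(chosen, round(score, 2))  # (0, 2) 3.8
```

Exhaustive search is only feasible for a handful of documents, since the number of candidate subsets grows combinatorially; it is used here purely to show what the objective and constraints select.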

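4

[Introductory text on the two characteristics (i) and (ii) and on modifying $e_{ij}$ is not preserved in this transcription.]

4.1

To account for temporal distance, $e_{ij}$ is multiplied by the decay factor

\[ 0.5^{|t(d_i) - t(d_j)| / \beta}, \tag{8} \]

where $t(d)$ is the time at which document $d$ was posted and $\beta$ is a parameter: the factor halves each time the time difference grows by $\beta$. The coefficient is further scaled by $1/c_i$, where $c_i$ is a quantity associated with $d_i$ (its definition is not preserved in this transcription). The resulting coefficient $e^{time}_{ij}$ is

\[ e^{time}_{ij} = \frac{e_{ij}}{c_i} \, 0.5^{|t(d_i) - t(d_j)| / \beta}. \tag{9} \]

4.2

The p-median problem of Section 3 is extended as follows:

\begin{align}
\max. \quad & \sum_{i,j} e_{ij} z_{ij} \notag \\
\text{s.t.} \quad & z_{ij} \le x_i; \quad \forall i, j, \notag \\
& \sum_i c_i x_i \le p, \notag \\
& \sum_i z_{ij} = 1; \quad \forall j, \notag \\
& z_{ii} = x_i; \quad \forall i, \notag \\
& z_{ij} \le z_{ik}; \quad \forall i, j, k \ (j \le k \le i), \tag{10} \\
& z_{ij} \le z_{ik}; \quad \forall i, j, k \ (i \le k \le j), \tag{11} \\
& x_i \in \{0, 1\}; \quad \forall i, \notag \\
& z_{ij} \in \{0, 1\}; \quad \forall i, j. \notag
\end{align}

Constraints (10) and (11) state that if $d_j$ is assigned to $d_i$, every document $d_k$ lying between them on the timeline is also assigned to $d_i$, so each selected document covers a contiguous span of the timeline.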
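The temporal decay in (8) and (9) can be sketched as follows. This is a minimal illustration, assuming timestamps are numbers in the same unit as β (seconds here) and reading (9) as dividing by c_i; the function names and the sample values are hypothetical.

```python
def time_decay(t_i, t_j, beta):
    """Decay factor of Eq. (8): halves for every beta units of time difference."""
    return 0.5 ** (abs(t_i - t_j) / beta)


def time_aware_coverage(e_ij, t_i, t_j, beta, c_i=1.0):
    """Time-aware coefficient in the spirit of Eq. (9).

    Dividing by c_i is an assumed reading of the scaling by 1/c_i;
    c_i defaults to 1.0 so the scaling can be ignored.
    """
    return e_ij / c_i * time_decay(t_i, t_j, beta)


# Two documents with coverage 0.8, posted 0, 300, and 600 seconds apart.
same_time = time_aware_coverage(0.8, 0, 0, beta=300)
one_half_life = time_aware_coverage(0.8, 0, 300, beta=300)
two_half_lives = time_aware_coverage(0.8, 0, 600, beta=300)
print(same_time, one_half_life, two_half_lives)  # 0.8 0.4 0.2
```

A small β makes the coefficient fall off quickly, so near-identical documents that are far apart in time contribute little coverage to each other; a large β approaches the time-agnostic coefficient.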

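With constraints (10) and (11), the p-median problem can be solved approximately by an iterative algorithm similar to k-means clustering. The p medians $d_{m_1}, \dots, d_{m_p}$ are kept in temporal order, that is, $t(d_{m_i}) \le t(d_{m_j})$ whenever $i \le j$:

    initialize the p medians d_{m_1}, ..., d_{m_p}
    while not converged
        for l = 1 to p
            assign each of d_{m_l + 1}, ..., d_{m_{l+1} - 1} to either d_{m_l} or d_{m_{l+1}}
        end for
        for l = 1 to p
            update the median d_{m_l} among the documents assigned to it
        end for
    end while

The assignment step finds the boundary $h_{\max}$ between two adjacent medians:

\[ h_{\max} = \operatorname*{argmax}_{h:\, m_l \le h < m_{l+1}} \left( \sum_{j=m_l}^{h} e_{m_l j} + \sum_{j=h+1}^{m_{l+1}} e_{m_{l+1} j} \right). \]

Documents up to $d_{h_{\max}}$ are assigned to $d_{m_l}$ and the remaining ones to $d_{m_{l+1}}$. The search is cheap because moving the boundary from $h$ to $h + 1$ changes the value only by $e_{m_l, h+1} - e_{m_{l+1}, h+1}$.

5

[Section text not preserved in this transcription. Surviving citations: Sharifi et al. [10] and O'Connor et al. [7] on Twitter and tweets; Swan and Jensen [11]; Topic Detection and Tracking (TDT) [1]; [5]; [8, 9].]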
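The alternating procedure can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation (which is not shown in this transcription): documents are indexed in temporal order, `e[i][j]` is any coverage matrix (plain or time-aware), the initial medians are distinct indices, and the names `segment_summary`, `medians`, and `bounds` are hypothetical.

```python
def segment_summary(e, medians, max_iter=100):
    """Alternate between boundary search and median update.

    e[i][j]: assumed coverage of document j by document i (time-ordered).
    medians: distinct initial median indices.
    Returns (medians, bounds); segment l is range(bounds[l], bounds[l + 1]).
    """
    n, p = len(e), len(medians)
    medians = sorted(medians)
    bounds = [0, n]
    for _ in range(max_iter):
        # Step 1: place the boundary between each pair of adjacent medians.
        bounds = [0]
        for l in range(p - 1):
            a, b = medians[l], medians[l + 1]
            best_h, best_gain, gain = a, 0.0, 0.0
            for h in range(a + 1, b):
                # Moving the boundary past h changes the objective by this amount.
                gain += e[a][h] - e[b][h]
                if gain > best_gain:
                    best_gain, best_h = gain, h
            bounds.append(best_h + 1)
        bounds.append(n)
        # Step 2: within each segment, pick the document with the best coverage.
        new_medians = []
        for l in range(p):
            seg = range(bounds[l], bounds[l + 1])
            new_medians.append(max(seg, key=lambda i: sum(e[i][j] for j in seg)))
        if new_medians == medians:
            break  # converged: medians and boundaries no longer change
        medians = new_medians
    return medians, bounds


# Hypothetical data: six time-ordered documents in two bursts (0-2 and 3-5);
# documents 1 and 4 cover their own burst best.
def toy_coverage(i, j):
    if i == j:
        return 1.0
    if i // 3 != j // 3:
        return 0.0
    return 0.9 if i in (1, 4) else 0.5


E = [[toy_coverage(i, j) for j in range(6)] for i in range(6)]
result_medians, result_bounds = segment_summary(E, medians=[0, 1])
print(result_medians, result_bounds)  # [1, 4] [0, 3, 6]
```

Like k-means, the procedure can stop at a local optimum that depends on the initial medians, so trying several initializations is a common precaution.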

6

6.1

[Section text not preserved in this transcription. Surviving details: evaluation uses ROUGE [4]; tweets were collected through the Twitter Streaming API for the 2010 FIFA event using the hashtags #soccer, #jfa, #wc2010, #jfa2010, #daihyo, and #2010wc; footnotes mention the Streaming API methods statuses/sample and statuses/filter and a figure of 5%.]
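For readers unfamiliar with ROUGE [4], the following minimal sketch computes ROUGE-N recall against a single reference: the clipped count of reference n-grams that also occur in the candidate, divided by the number of n-grams in the reference. The whitespace tokenization, the single-reference setting, the function name `rouge_n_recall`, and the example sentences are all assumptions for illustration; the experimental setup in this paper is not recoverable from the transcription.

```python
from collections import Counter


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall of a candidate summary against one reference summary."""

    def ngrams(tokens):
        # Multiset of all n-grams in the token sequence.
        return Counter(tuple(tokens[k:k + n]) for k in range(len(tokens) - n + 1))

    cand = ngrams(candidate.split())
    ref = ngrams(reference.split())
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


reference = "honda scores the first goal"
candidate = "goal by honda"
unigram_recall = rouge_n_recall(candidate, reference, n=1)
bigram_recall = rouge_n_recall(candidate, reference, n=2)
print(unigram_recall, bigram_recall)  # 0.4 0.0
```

For Japanese text, a morphological analyzer would be needed in place of `split()` to obtain tokens.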

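[Table 1 (FIFA data): contents not preserved in this transcription.]

[Text not preserved in this transcription. Surviving details: MeCab is used for processing the documents; the p-median problems are solved with ILOG CPLEX version 11.1 (Section 6.4); the coefficients $e_{ij}$ and $e^{time}_{ij}$ are compared, with several values of $\beta$ (Section 4.1); $e_{ij}$ follows Takamura and Okumura [12]. Two baselines are named: "random", which is evaluated with ROUGE for a given $p$, and "equal", which also depends on $p$.]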

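[Table 3: ROUGE of the p-median model with $e_{ij}$ and with $e^{time}_{ij}$ for $\beta = 300$, $\beta = 600$, and two further values of $\beta$; the scores are not preserved in this transcription.]

[Two lists of timestamps survive without their labels:]
19:42:24, 19:58:18, 20:19:27, 20:39:30, 20:52:00, 20:58:55, 21:00:56
19:19:39, 19:39:56, 19:49:18, 20:03:41, 20:26:27, 20:32:02, 20:36:29, 21:08:58

[Remaining text not preserved in this transcription. Surviving details: a comparison involving the approximate algorithm of Section 4.3, a table numbered 4, $e^{time}_{ij}$, the factor $1/c_i$, the "equal" baseline, and the value 0.92; mentions of Twitter, Ustream, and Perl.]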

[Table 4 (FIFA data): ROUGE for the p-median methods and the "random" and "equal" baselines; the scores are not preserved in this transcription.]

References

[1] James Allan, Jaime Carbonell, George Doddington, Jonathan Yamron, and Yiming Yang. Topic detection and tracking pilot study. In Proceedings of the DARPA Broadcast News Transcription and Understanding Workshop.
[2] Zvi Drezner and Horst W. Hamacher, editors. Facility Location: Applications and Theory. Springer.
[3] Juraj Hromkovič. Algorithmics for Hard Problems. Springer.
[4] Chin-Yew Lin. ROUGE: a package for automatic evaluation of summaries. In Proceedings of the Workshop on Text Summarization Branches Out, pages 74-81.
[5] Alireza Rezaei Mahdiraji. Clustering data stream: A survey of algorithms. International Journal of Knowledge-based and Intelligent Engineering Systems, 13:39-44.
[6] Inderjeet Mani. Automatic Summarization. John Benjamins Publisher.
[7] Brendan O'Connor, Michel Krieger, and David Ahn. TweetMotif: Exploratory search and topic summarization for Twitter. In Proceedings of the Fourth International AAAI Conference on Weblogs and Social Media.
[8] Sasa Petrovic, Miles Osborne, and Victor Lavrenko. Streaming first story detection with application to Twitter. In Proceedings of the 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2010), Los Angeles, California, June 2010. Association for Computational Linguistics.
[9] Alan Ritter, Colin Cherry, and Bill Dolan. Unsupervised modeling of Twitter conversations. In Proceedings of the 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2010), Los Angeles, California, June 2010. Association for Computational Linguistics.
[10] Beaux Sharifi, Mark-Anthony Hutton, and Jugal Kalita. Summarizing microblogs automatically. In Proceedings of the 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2010), Los Angeles, California, June 2010. Association for Computational Linguistics.
[11] Russell Swan and David Jensen. TimeMines: Constructing timelines with statistical models of word usage. In Proceedings of the ACM SIGKDD 2000 Workshop on Text Mining, pages 73-80.
[12] Hiroya Takamura and Manabu Okumura. Text summarization model based on the budgeted median problem. In Proceedings of the 18th ACM Conference on Information and Knowledge Management (CIKM 2009), short paper, November 2009.


More information

Twitter Analytics: Architecture, Tools and Analysis

Twitter Analytics: Architecture, Tools and Analysis Twitter Analytics: Architecture, Tools and Analysis Rohan D.W Perera CERDEC Ft Monmouth, NJ 07703-5113 S. Anand, K. P. Subbalakshmi and R. Chandramouli Department of ECE, Stevens Institute Of Technology

More information

Management Decision Making. Hadi Hosseini CS 330 David R. Cheriton School of Computer Science University of Waterloo July 14, 2011

Management Decision Making. Hadi Hosseini CS 330 David R. Cheriton School of Computer Science University of Waterloo July 14, 2011 Management Decision Making Hadi Hosseini CS 330 David R. Cheriton School of Computer Science University of Waterloo July 14, 2011 Management decision making Decision making Spreadsheet exercise Data visualization,

More information

EXPLOITING TWITTER IN MARKET RESEARCH FOR UNIVERSITY DEGREE COURSES

EXPLOITING TWITTER IN MARKET RESEARCH FOR UNIVERSITY DEGREE COURSES EXPLOITING TWITTER IN MARKET RESEARCH FOR UNIVERSITY DEGREE COURSES Zhenar Shaho Faeq 1,Kayhan Ghafoor 2, Bawar Abdalla 3 and Omar Al-rassam 4 1 Department of Software Engineering, Koya University, Koya,

More information

TEMPER : A Temporal Relevance Feedback Method

TEMPER : A Temporal Relevance Feedback Method TEMPER : A Temporal Relevance Feedback Method Mostafa Keikha, Shima Gerani and Fabio Crestani {mostafa.keikha, shima.gerani, fabio.crestani}@usi.ch University of Lugano, Lugano, Switzerland Abstract. The

More information

Italian Journal of Accounting and Economia Aziendale. International Area. Year CXIV - 2014 - n. 1, 2 e 3

Italian Journal of Accounting and Economia Aziendale. International Area. Year CXIV - 2014 - n. 1, 2 e 3 Italian Journal of Accounting and Economia Aziendale International Area Year CXIV - 2014 - n. 1, 2 e 3 Could we make better prediction of stock market indicators through Twitter sentiment analysis? ALEXANDER

More information

Twitter Stock Bot. John Matthew Fong The University of Texas at Austin jmfong@cs.utexas.edu

Twitter Stock Bot. John Matthew Fong The University of Texas at Austin jmfong@cs.utexas.edu Twitter Stock Bot John Matthew Fong The University of Texas at Austin jmfong@cs.utexas.edu Hassaan Markhiani The University of Texas at Austin hassaan@cs.utexas.edu Abstract The stock market is influenced

More information

MONIC and Followups on Modeling and Monitoring Cluster Transitions

MONIC and Followups on Modeling and Monitoring Cluster Transitions MONIC and Followups on Modeling and Monitoring Cluster Transitions Myra Spiliopoulou 1, Eirini Ntoutsi 2, Yannis Theodoridis 3, and Rene Schult 4 1 Otto-von-Guericke University of Magdeburg, Germany, myra@iti.cs.uni-magdeburg.de,

More information

Big Data Analytics. Lucas Rego Drumond

Big Data Analytics. Lucas Rego Drumond Big Data Analytics Lucas Rego Drumond Information Systems and Machine Learning Lab (ISMLL) Institute of Computer Science University of Hildesheim, Germany Big Data Analytics Big Data Analytics 1 / 36 Outline

More information

Twitter sentiment vs. Stock price!

Twitter sentiment vs. Stock price! Twitter sentiment vs. Stock price! Background! On April 24 th 2013, the Twitter account belonging to Associated Press was hacked. Fake posts about the Whitehouse being bombed and the President being injured

More information

Introducing diversity among the models of multi-label classification ensemble

Introducing diversity among the models of multi-label classification ensemble Introducing diversity among the models of multi-label classification ensemble Lena Chekina, Lior Rokach and Bracha Shapira Ben-Gurion University of the Negev Dept. of Information Systems Engineering and

More information

Marketing and Outreach Efforts to Promote FAFSA Completion and Financial Aid Awareness!

Marketing and Outreach Efforts to Promote FAFSA Completion and Financial Aid Awareness! Marketing and Outreach Efforts to Promote FAFSA Completion and Financial Aid Awareness! West Virginia Association of Student Financial Aid Administrators Annual Conference October 31, 2014 Overview! Statewide

More information

Current state of learning analytics and educational data mining

Current state of learning analytics and educational data mining Current state of learning analytics and educational data mining George Siemens Ryan S.J.d. Baker August 2013 Poll #1 How far along is your institution in using LA/ EDM at institutional level? We re thinking

More information

AN EFFICIENT SELECTIVE DATA MINING ALGORITHM FOR BIG DATA ANALYTICS THROUGH HADOOP

AN EFFICIENT SELECTIVE DATA MINING ALGORITHM FOR BIG DATA ANALYTICS THROUGH HADOOP AN EFFICIENT SELECTIVE DATA MINING ALGORITHM FOR BIG DATA ANALYTICS THROUGH HADOOP Asst.Prof Mr. M.I Peter Shiyam,M.E * Department of Computer Science and Engineering, DMI Engineering college, Aralvaimozhi.

More information

Is it Really About Me? Message Content in Social Awareness Streams

Is it Really About Me? Message Content in Social Awareness Streams Is it Really About Me? Message Content in Social Awareness Streams Mor Naaman, Jeffrey Boase, Chih-Hui Lai Rutgers University, School of Communication and Information 4 Huntington St., New Brunswick, NJ

More information

Network Analysis in the Big Data Age: Mining Graphs and Social Streams

Network Analysis in the Big Data Age: Mining Graphs and Social Streams Charu C. Aggarwal IBM T J Watson Research Center Yorktown Heights, NY 10598 Network Analysis in the Big Data Age: Mining Graphs and Social Streams Keynote Talk, ECML/PKDD, 2014 Introduction Large networks

More information

Speeding up GPU-based password cracking

Speeding up GPU-based password cracking Speeding up GPU-based password cracking SHARCS 2012 Martijn Sprengers 1,2 Lejla Batina 2,3 Sprengers.Martijn@kpmg.nl KPMG IT Advisory 1 Radboud University Nijmegen 2 K.U. Leuven 3 March 17-18, 2012 Who

More information

Keyphrase Extraction for Scholarly Big Data

Keyphrase Extraction for Scholarly Big Data Keyphrase Extraction for Scholarly Big Data Cornelia Caragea Computer Science and Engineering University of North Texas July 10, 2015 Scholarly Big Data Large number of scholarly documents on the Web PubMed

More information

Data Mining: Opportunities and Challenges

Data Mining: Opportunities and Challenges Data Mining: Opportunities and Challenges Xindong Wu University of Vermont, USA; Hefei University of Technology, China ( 合 肥 工 业 大 学 计 算 机 应 用 长 江 学 者 讲 座 教 授 ) 1 Deduction Induction: My Research Background

More information

Active Learning SVM for Blogs recommendation

Active Learning SVM for Blogs recommendation Active Learning SVM for Blogs recommendation Xin Guan Computer Science, George Mason University Ⅰ.Introduction In the DH Now website, they try to review a big amount of blogs and articles and find the

More information