
Research Statement

Feida ZHU
School of Information Systems, Singapore Management University
Tel: (65)
Date: 30 April 2013

Introduction

The past decade has seen an unprecedented explosion of data in almost all areas of our lives, from online social networks drawing hundreds of millions of users to highly accurate GPS systems tracking every move of the mobile devices they are attached to. The concept of Big Data has never attracted more attention from the research community, as its importance grows more palpable each day. Yet, for all the wonders it could make happen, Big Data also poses serious research challenges for mining and analysis tasks. My central research theme has therefore been Big Data Mining and Analytics.

The challenge of Big Data, in my understanding, is best characterized by the 4 V's: Volume, Velocity, Variety and Value, as shown in Figure 1. These 4 V's also serve as a good map for my current and near-future research, which I present one by one below. The settings are centered on network and social media data, since social networks have been the main data source for my research for the past few years. However, all the results apply equally to other data settings of a similar nature.

[Figure 1. Diagram labels: Variety, Volume, Big Data, Velocity, Value]

The Four Dimensions of the Big Data Challenge

(1) Volume: taming data of societal scale

The most noticeable feature of big data is its sheer volume, which is often of societal scale. Mining and analysis on such data become extremely difficult even for simple tasks like frequent pattern discovery. My research along this dimension has focused on a fundamental problem in data mining, the constrained frequent pattern mining problem, particularly on graph/network data, which is the main data representation for social networks and also the most challenging setting compared with itemsets and sequences. Frequent patterns have proved extremely powerful in a wide range of network analysis tasks, including network clustering, classification, community detection and evolution. To add to the complexity, the mining task often comes with user-specified constraints on the pattern result. My research in this dimension can be further grouped into the following three topics.

1.1 Constrained Frequent Pattern Mining For Large Graph/Networks

To use frequent patterns for knowledge discovery tasks, one must first be able to find the set of frequent patterns in the given data. My research on constrained frequent pattern mining started with my two Best Student Paper Awards [ICDE 07][PAKDD 07] during my PhD study, in which I proposed a novel randomized mining framework to find colossal frequent patterns in transaction data and a comprehensive constraint-pushing mining framework for graph data.

It is well known that frequent pattern mining in the graph setting is notoriously hard, especially at today's network scale. Most work on graph mining has focused on the graph transaction setting, where the input data is a large collection of small graphs. However, the social network applications of today present us with large single graphs. It has been shown that frequent pattern mining in the single-network setting is a much more challenging problem than its counterpart in the transaction setting, due to the existence of overlapping embeddings and the correspondingly trickier support computation. My VLDB 2011 paper, "Mining top-k large structural patterns in massive networks" [VLDB 11], presented the first work able to find large patterns in massive graph data. We developed a novel concept called the r-spider and a corresponding algorithm called SpiderMine, which uses small spider-shaped frequent patterns to find the top-k large patterns probabilistically within any user-specified error bound. This work gives users, for the first time, the capacity to reach and study the largest frequent patterns in big graph data within a reasonable amount of time.

With the boom of mobile social data and research on information diffusion, another kind of constrained pattern has found important applications: skinny patterns, which are graph patterns with a long backbone from which short twigs branch out. The long backbone has the descriptive power to represent spatial and temporal trajectories in heterogeneous information networks, and the short twigs represent the various kinds of associated information. My work in [SIGMOD 13] proposed a new direct mining paradigm for efficient constrained frequent graph mining, in which frequent patterns with certain structural constraints can be generated directly with minimum redundancy. This is impossible with traditional mining methodology, in which patterns are grown in order of increasing size.

The research agenda in this direction is to systematically explore and tackle the challenges posed by the constrained pattern mining problem for large networks such as those ubiquitous in our daily life. I have a forthcoming book chapter, "Mining Constrained Graph Patterns", to be published by Springer later this year, which will be a good summary of my work in this direction.

1.2 Collaborative Pattern Mining In Distributed Environment

Because of their remarkable size, many networks are not stored in a centralized fashion. Different parts of a network may be stored in different data centers around the world, or across a machine farm. All existing mining algorithms assume centralized storage of the entire graph and are therefore powerless in such a distributed environment. Moreover, one way to handle a huge single network is to first partition the data carefully and then mine the partitions collaboratively. Under this new setting, even the most classic problems in graph mining become fresh and interestingly challenging. This is a whole new direction with little published research. Much foundational work remains to be laid and many directions remain to be charted. My research agenda is to develop efficient algorithms for the fundamental mining problems in this setting and make them work on the societal-scale social network data we have here.

1.3 Sampling and Summarization For Large Networks

The size of today's social networks has made them impossible to comprehend visually as a whole by human examination. Some summarization of the original network becomes necessary for visualizing mining results or navigating the network. On the other hand, sampling of the network is also essential, as it is often unrealistic to obtain the whole network. My research agenda here is to examine the principles and algorithms of effective and efficient sampling methods to facilitate our data acquisition, and to find intuitive, informative and interesting ways to summarize large network data such as our Twitter data set.
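To make the sampling idea in Section 1.3 concrete, here is a minimal sketch of one common generic technique, random-walk sampling with restart, which only needs access to the neighbors of the node currently visited. It is an illustration under stated assumptions, not a method from the work described in this statement: the adjacency-dictionary representation, the function names `random_walk_sample` and `induced_subgraph`, and the parameters `restart_prob` and `max_steps` are all hypothetical choices.

```python
import random
from typing import Dict, Hashable, List, Set

# Assumed representation: adjacency lists keyed by node id.
Graph = Dict[Hashable, List[Hashable]]


def random_walk_sample(graph: Graph, start: Hashable, target_size: int,
                       restart_prob: float = 0.15,
                       max_steps: int = 100_000,
                       seed: int = 0) -> Set[Hashable]:
    """Collect up to target_size nodes by a random walk with restart.

    The walk only looks at the neighbors of the current node, so it
    never needs the whole network in memory at once.
    """
    rng = random.Random(seed)
    visited = {start}
    current = start
    for _ in range(max_steps):
        if len(visited) >= target_size:
            break
        neighbors = graph.get(current, [])
        # Jump back to the start at dead ends or with probability restart_prob.
        if not neighbors or rng.random() < restart_prob:
            current = start
            continue
        current = rng.choice(neighbors)
        visited.add(current)
    return visited


def induced_subgraph(graph: Graph, nodes: Set[Hashable]) -> Graph:
    """Keep only the edges whose endpoints were both sampled."""
    return {u: [v for v in graph.get(u, []) if v in nodes] for u in nodes}


if __name__ == "__main__":
    # Toy network: a ring of 20 nodes with extra chords to the second neighbor.
    toy = {i: [(i + 1) % 20, (i - 1) % 20, (i + 2) % 20, (i - 2) % 20]
           for i in range(20)}
    sample = random_walk_sample(toy, start=0, target_size=8, seed=1)
    print(sorted(sample))
    print(induced_subgraph(toy, sample))
```

A walk with restart tends to stay near the start node, so the sample reflects local structure; if the walk cannot reach `target_size` nodes within `max_steps`, the function simply returns the nodes found so far.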

(2) Velocity: conducting real-time analysis in huge-volume data flow

Perhaps the most important and unique feature of social media, compared with traditional news media, is the real-time responsiveness of the data. For example, it has been observed that, in life-critical disasters of societal scale, Twitter is the most important and timely source from which people learn of and track breaking news, before any mainstream media pick it up and rebroadcast the footage. Consequently, it is essential that we be able to conduct mining and analysis on the huge-volume data flow in real time.

One important topic in social media study is bursty topics, which capture social events attracting population-wide attention. Our work in [ACL 12] proposed the first algorithm to find such topics from Twitter in an offline fashion. To achieve real-time responsiveness, our work published at KDD 13 proposed a novel mining framework called TopicSketch, which is able to detect bursty topics earlier than traditional news media and can potentially handle hundreds of millions of tweets per day, which is close to the total number of daily tweets on Twitter. One example of bursty topics detected from our data is illustrated in the following figure. To the best of our knowledge, this is the first work that achieves real-time detection on social media at the scale of Twitter.

Future work includes incorporating community awareness and information diffusion structure into the detection algorithm, so that bursty events of different kinds can be distinguished and their potential virality predicted. Other real-time mining and analysis tasks, such as frequent pattern mining and outlier detection, will also be studied as part of the research in this dimension to handle the velocity of big data.

(3) Variety: understanding data of high heterogeneity

The challenge of big data also comes from the fact that the data is usually highly heterogeneous, i.e., it is of different formats and types and comes from different sources. For example, even for the same user, we have text data from his tweets and reviews, multimedia data such as images from his Instagram account and videos from YouTube, trajectory and location data from his mobile devices, and so on. The analytical capacity to integrate, understand and leverage such highly heterogeneous data is immensely important. The key is to find a connecting ingredient or a unifying model that achieves effective integration. My approach in this dimension so far is to use what I deem the most characteristic feature of social media data, user behavior, as the glue that ties things together. Our tutorial at DASFAA 13, titled "Behavior Driven Social Network Mining and Analysis", gives a selected summary of our recent research along this line. In particular, we pushed the user behavior element into the following three mining tasks and produced interesting results that are otherwise unobtainable.

(1) Behavior-driven Topic Modeling. We proposed in [SDM 13] a B-LDA model that incorporates user behavior into LDA topic modeling to better capture user interactions, which are critically important for topic analysis, user clustering and followee recommendation on social micro-blogging services such as Twitter.

(2) Behavior-driven Anomaly Detection. We used group-level user behavior to characterize anomaly collections and identified spammer groups that are hard to catch with the traditional point anomaly framework [SDM 12, CIKM 12]. We also used collective user rating behavior to model anomalous users and products in online review settings and proposed a unifying framework based on mutual dependency principles [ICDM 12]. Extensions of these pieces of work have been submitted to DMKD and TKDE.

(3) Behavior-driven Relationship Mining. We studied the user follow links in the Twitter network and developed a novel algorithm which, based on this piece of information alone, is able to identify with high accuracy the offline real-life friends of the target user [WebSci 12]. This work has profound potential impact, as we elaborate further in the next part. We also studied user follow linkage to dynamically propagate user attribute/relationship labels with user input [DASFAA 13]. In another work published at [SocInfo 13], we revisited the user ranking problem on social networks and examined it from the user interaction perspective. We provided a new angle on the problem based on the interplay between information and interaction.

(4) Value: translating data analytical results into real-world impact

This dimension of the Big Data challenge has not yet been well explored. In the online social media setting, the central question to ask is: how would all the analytical results about online social data impact our offline real life? For example, all the research findings on social influence would remain inconsequential if we were not able to establish the linkage between the online and offline worlds. My research agenda here is to fill this gap and establish the connection.

As the first effort toward this Holy Grail, we proposed [WebSci 12] a novel algorithm to distinguish a user's online and offline friends from her Twitter follow network, as illustrated in the figure on the right. This work provides the foundation for many exciting applications and future work, including robust user modeling, business competitive analysis, user profile matching and spammer detection. Building on this work, our next work [DASFAA 13] dynamically propagates user attribute labels in the relationship network. The corresponding demo system won the Best Demo Award (Runner-Up) at DASFAA 13.

A fundamental task in bridging the online and offline worlds is to integrate various aspects of information about the same user across different platforms. The problem has profound impact on user modeling and business intelligence and has begun to attract a great deal of research interest from the community. We provide the first solution that uses the whole range of user data, and the result will be published in SIGMOD 14.
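The idea of propagating user attribute labels over a relationship network, mentioned above, can be illustrated with a deliberately simple majority-vote scheme. This is a generic sketch and not the algorithm of [DASFAA 13]: the function name `propagate_labels`, the synchronous update rule, the tie-breaking rule and the example labels are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Hashable, List

# Assumed representation: adjacency lists keyed by user id.
Graph = Dict[Hashable, List[Hashable]]


def propagate_labels(graph: Graph, seed_labels: Dict[Hashable, str],
                     max_rounds: int = 10) -> Dict[Hashable, str]:
    """Spread labels from seed users to unlabeled neighbors by majority vote.

    Each round, every unlabeled user takes the most common label among
    neighbors that were labeled before the round began (synchronous update).
    Seed labels are never overwritten.
    """
    labels = dict(seed_labels)
    for _ in range(max_rounds):
        updates = {}
        for node, neighbors in graph.items():
            if node in labels:
                continue
            votes = Counter(labels[n] for n in neighbors if n in labels)
            if votes:
                # Highest count wins; ties are broken by label name so the
                # result does not depend on iteration order.
                updates[node] = max(votes.items(),
                                    key=lambda kv: (kv[1], kv[0]))[0]
        if not updates:
            break  # nothing left to label
        labels.update(updates)
    return labels


if __name__ == "__main__":
    # Two small components, each with one user-provided seed label.
    network = {
        "a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"],
        "x": ["y"], "y": ["x"],
    }
    seeds = {"a": "offline_friend", "x": "online_only"}
    print(propagate_labels(network, seeds))
```

In an interactive setting, new user input would simply extend `seed_labels` and the propagation would be rerun; a practical system would also weight votes and handle conflicting evidence, which this sketch does not attempt.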

Conclusion

My research agenda in the past few years and in the near future is focused on the Big Data challenge, in particular along the four dimensions of Volume, Velocity, Variety and Value, with an emphasis on graph/network data. Besides this main theme, I have also been working on other data mining applications, including program parameter tuning [CoCoMile'12, LION'13], churn prediction [ASONAM'12], game strategy mining [CIG'12] and network experimentation [ICWSM 13].

References

1. "A Direct Mining Approach To Efficient Constrained Graph Pattern Discovery", by Feida ZHU, Zequn ZHANG, Qiang QU, 2013 ACM SIGMOD International Conference on Management of Data (SIGMOD'13), New York, USA, June.
2. "Reviving Dormant Ties in an Online Social Network Experiment", by Ee-Peng LIM, Denzil CORRERA, David LO, Michael FINEGOLD, Feida ZHU, The 7th International AAAI Conference on Weblogs and Social Media (ICWSM'13), Boston, USA, July.
3. "It Is Not Just What We Say, But How We Say Them: LDA-based Behavior-Topic Model", by Minghui QIU, Feida ZHU, and Jing JIANG, 05/2013, 2013 SIAM International Conference on Data Mining (SDM'13), Austin, Texas, USA, May.
4. "TwiCube: A Real-time Twitter Online Community Analysis Tool", by Juan DU, Wei XIE, Cheng LI, Feida ZHU, and Ee Peng LIM, 04/2013, The 18th International Conference on Database Systems for Advanced Applications (DASFAA'13), Wuhan, China, April.
5. "Dynamic Label Propagation in Social Networks", by Juan DU, Feida ZHU, and Ee Peng LIM, 04/2013, The 18th International Conference on Database Systems for Advanced Applications (DASFAA'13), Wuhan, China, April.
6. "Automated Parameter Tuning Framework for Heterogeneous and Large Instances: Case study in Quadratic Assignment Problem", by LINDAWATI, Zhi YUAN, Hoong Chuin LAU, and Feida ZHU, 01/2013, Learning and Intelligent OptimizatioN Conference (LION 13), Catania, Italy.
7. "A Survey of Recommender Systems in Twitter", by Su Mon KYWE, Ee Peng LIM, and Feida ZHU, 12/2012, International Conference on Social Informatics (SocInfo 12), Lausanne, Switzerland.
8. "On Recommending Hashtags in Twitter Networks", by Su Mon KYWE, Tuan Anh HOANG, Ee Peng LIM, and Feida ZHU, 12/2012, International Conference on Social Informatics (SocInfo 12), Lausanne, Switzerland.
9. "Detecting Anomalies in Bipartite Graphs with Mutual Dependency Principles", by Hanbo DAI, Feida ZHU, Ee Peng LIM, and Hwee Hwa PANG, 12/2012, The 12th IEEE International Conference on Data Mining (ICDM'12), Brussels, Belgium.
10. "Impact of Multimedia in Sina Weibo: Popularity and Life Span", by Xun ZHAO, Feida ZHU, Weining QIAN, and Aoying ZHOU, 11/2012, The Joint Conference of the Sixth Chinese Semantic Web Symposium and the First Chinese Web Science Conference (CSWS & CWSC '12), Shenzhen, China.
11. "Mining Coherent Anomaly Collections On Web Data", by Hanbo DAI, Feida ZHU, Ee Peng LIM, and Hwee Hwa PANG, 10/2012, the 21st Int. Conf. on Information and Knowledge Management (CIKM'12), Hawaii, USA.
12. "In-Game Action List Segmentation and Labeling in Real-Time Strategy Games", by Wei GONG, Ee Peng LIM, Feida ZHU, Achananuparp PALAKORN, David LO, and Chong Tat Freddy CHUA, 09/2012, the 8th IEEE Conference on Computational Intelligence and Games (CIG'12), Granada, Spain.
13. "Follow Link Seeking Strategy: A Pattern Based Approach", by Agus Trisnajaya KWEE, Ee Peng LIM, Achananuparp PALAKORN, and Feida ZHU, 08/2012, the 6th ACM Workshop on Social Network Mining and Analysis (SNAKDD'12), Beijing, China.
14. "Collective Churn Prediction in Social Network", by Jayadi Oentaryo RICHARD, Ee Peng LIM, David LO, Feida ZHU, and Philips Kokoh PRASETYO, 08/2012, Proc. of the 4th Int. Conf. on Advances in Social Networks Analysis and Mining (ASONAM'12), Istanbul, Turkey.
15. "Instance-specific Parameter Tuning via Constraint-based Clustering", by Lindawati LINDAWATI, Hoong Chuin LAU, and Feida ZHU, 08/2012, Proc. of the 1st Int. Workshop on Combining COnstraint solving with MIning and LEarning (CoCoMile'12), joint with ECAI 2012, Montpellier, France.
16. "Finding Bursty Topics From Microblogs", by Qiming DIAO, Jing JIANG, Feida ZHU, and Ee Peng LIM, 07/2012, 50th Annual Meeting of the Association for Computational Linguistics (ACL 12), Jeju Island, Korea.
17. "Detecting Anomalous Twitter Users by Extreme Group Behaviors", by Hanbo DAI, Ee Peng LIM, Feida ZHU, and Hwee Hwa PANG, 07/2012, Proc. of the 2012 ACM Int. Conf. on Net Science (NetSci'12), Chicago, Illinois, USA.
18. "Detecting Extreme Rank Anomalous Collections", by Hanbo DAI, Feida ZHU, Ee Peng LIM, and Hwee Hwa PANG, 04/2012, SIAM International Conference on Data Mining (SDM 12), Anaheim, California, USA.
19. "When a Friend in Twitter is a Friend in Life", by Wei XIE, Cheng LI, Feida ZHU, Ee Peng LIM, and Xueqing GONG, 04/2012, the 4th ACM Int. Conf. on Web Science (WebSci'12), Chicago, Illinois, USA.
20. "Mining Top-K Large Structural Patterns In Massive Networks", by Feida Zhu, Qiang Qu, David Lo, Xifeng Yan, Jiawei Han and Philip Yu, in Proc. 2011 Int. Conf. on Very Large Data Base (VLDB 11), USA, August.
21. "Mining Diversity On Networks", by Liu Lu, Feida Zhu, Chen Chen, Xifeng Yan, Jiawei Han, Philip S. Yu, and Shiqiang Yang, in Proc. 2010 Int. Conf. on Database Systems for Advanced Applications (DASFAA'10), Japan, April.
22. "Efficient Topological OLAP on Information Networks", by Qiang Qu, Feida Zhu, Xifeng Yan, Jiawei Han, Philip Yu and Hongyan Li, in Proc. 2011 Int. Conf. on Database Systems for Advanced Applications (DASFAA'11), Hong Kong, April.
23. "Top-K Aggregation Queries over Large Networks", by Xifeng Yan, Bin He, Feida Zhu, and Jiawei Han, in Proc. 2010 International Conference on Data Engineering (ICDE'10), USA, March.
24. "gprune: A Constraint Pushing Framework for Graph Pattern Mining", by Feida Zhu, Xifeng Yan, Jiawei Han, and Philip S. Yu, Proc. of the 11th Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD'07), Nanjing, China, May.
25. "Mining Colossal Frequent Patterns by Core Pattern Fusion", by Feida Zhu, Xifeng Yan, Jiawei Han, Philip S. Yu, and Hong Cheng, Proc. of the 23rd Int. Conf. on Data Engineering (ICDE'07), Istanbul, Turkey, April.


Florida International University - University of Miami TRECVID 2014 Florida International University - University of Miami TRECVID 2014 Miguel Gavidia 3, Tarek Sayed 1, Yilin Yan 1, Quisha Zhu 1, Mei-Ling Shyu 1, Shu-Ching Chen 2, Hsin-Yu Ha 2, Ming Ma 1, Winnie Chen 4,

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:

More information

Morteza Zihayat Curriculum Vitae October 2015

Morteza Zihayat Curriculum Vitae October 2015 Morteza Zihayat Curriculum Vitae October 2015 Contact Information Ph.D Candidate Phone: (+1) 647-831-6167 E-mail: zihayatm@cse.yorku.ca 4700 Keele St. Room LS2057 Website: http://www.cse.yorku.ca/~zihayatm/

More information

Project Participants

Project Participants Annual Report for Period:10/2006-09/2007 Submitted on: 08/15/2007 Principal Investigator: Yang, Li. Award ID: 0414857 Organization: Western Michigan Univ Title: Projection and Interactive Exploration of

More information

Workshop on Internet and BigData Finance (WIBF)

Workshop on Internet and BigData Finance (WIBF) Workshop on Internet and BigData Finance (WIBF) Central University of Finance and Economics June 11-12, 2015 In a 2013 study, IBM found that 71 percent of the banking and financial firms report that the

More information

An Overview of Knowledge Discovery Database and Data mining Techniques

An Overview of Knowledge Discovery Database and Data mining Techniques An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,

More information

Machine Learning Department, School of Computer Science, Carnegie Mellon University, PA

Machine Learning Department, School of Computer Science, Carnegie Mellon University, PA Pengtao Xie Carnegie Mellon University Machine Learning Department School of Computer Science 5000 Forbes Ave Pittsburgh, PA 15213 Tel: (412) 916-9798 Email: pengtaox@cs.cmu.edu Web: http://www.cs.cmu.edu/

More information

Microblog Sentiment Analysis with Emoticon Space Model

Microblog Sentiment Analysis with Emoticon Space Model Microblog Sentiment Analysis with Emoticon Space Model Fei Jiang, Yiqun Liu, Huanbo Luan, Min Zhang, and Shaoping Ma State Key Laboratory of Intelligent Technology and Systems, Tsinghua National Laboratory

More information

A Comparative Study on Sentiment Classification and Ranking on Product Reviews

A Comparative Study on Sentiment Classification and Ranking on Product Reviews A Comparative Study on Sentiment Classification and Ranking on Product Reviews C.EMELDA Research Scholar, PG and Research Department of Computer Science, Nehru Memorial College, Putthanampatti, Bharathidasan

More information

RESEARCH ON THE FRAMEWORK OF SPATIO-TEMPORAL DATA WAREHOUSE

RESEARCH ON THE FRAMEWORK OF SPATIO-TEMPORAL DATA WAREHOUSE RESEARCH ON THE FRAMEWORK OF SPATIO-TEMPORAL DATA WAREHOUSE WANG Jizhou, LI Chengming Institute of GIS, Chinese Academy of Surveying and Mapping No.16, Road Beitaiping, District Haidian, Beijing, P.R.China,

More information

A Clustering Model for Mining Evolving Web User Patterns in Data Stream Environment

A Clustering Model for Mining Evolving Web User Patterns in Data Stream Environment A Clustering Model for Mining Evolving Web User Patterns in Data Stream Environment Edmond H. Wu,MichaelK.Ng, Andy M. Yip,andTonyF.Chan Department of Mathematics, The University of Hong Kong Pokfulam Road,

More information

Data Mining: Concepts and Techniques

Data Mining: Concepts and Techniques Data Mining: Concepts and Techniques Chapter 1 Introduction SURESH BABU M ASST PROF IT DEPT VJIT 1 Chapter 1. Introduction Motivation: Why data mining? What is data mining? Data Mining: On what kind of

More information

Research of Postal Data mining system based on big data

Research of Postal Data mining system based on big data 3rd International Conference on Mechatronics, Robotics and Automation (ICMRA 2015) Research of Postal Data mining system based on big data Xia Hu 1, Yanfeng Jin 1, Fan Wang 1 1 Shi Jiazhuang Post & Telecommunication

More information

Big Data in Pictures: Data Visualization

Big Data in Pictures: Data Visualization Big Data in Pictures: Data Visualization Huamin Qu Hong Kong University of Science and Technology What is data visualization? Data visualization is the creation and study of the visual representation of

More information

Web Database Integration

Web Database Integration Web Database Integration Wei Liu School of Information Renmin University of China Beijing, 100872, China gue2@ruc.edu.cn Xiaofeng Meng School of Information Renmin University of China Beijing, 100872,

More information

Automatic Mining of Internet Translation Reference Knowledge Based on Multiple Search Engines

Automatic Mining of Internet Translation Reference Knowledge Based on Multiple Search Engines , 22-24 October, 2014, San Francisco, USA Automatic Mining of Internet Translation Reference Knowledge Based on Multiple Search Engines Baosheng Yin, Wei Wang, Ruixue Lu, Yang Yang Abstract With the increasing

More information

How To Use Data Mining For Knowledge Management In Technology Enhanced Learning

How To Use Data Mining For Knowledge Management In Technology Enhanced Learning Proceedings of the 6th WSEAS International Conference on Applications of Electrical Engineering, Istanbul, Turkey, May 27-29, 2007 115 Data Mining for Knowledge Management in Technology Enhanced Learning

More information

A Way to Understand Various Patterns of Data Mining Techniques for Selected Domains

A Way to Understand Various Patterns of Data Mining Techniques for Selected Domains A Way to Understand Various Patterns of Data Mining Techniques for Selected Domains Dr. Kanak Saxena Professor & Head, Computer Application SATI, Vidisha, kanak.saxena@gmail.com D.S. Rajpoot Registrar,

More information

The 2006 IEEE / WIC / ACM International Conference on Web Intelligence Hong Kong, China

The 2006 IEEE / WIC / ACM International Conference on Web Intelligence Hong Kong, China WISE: Hierarchical Soft Clustering of Web Page Search based on Web Content Mining Techniques Ricardo Campos 1, 2 Gaël Dias 2 Célia Nunes 2 1 Instituto Politécnico de Tomar Tomar, Portugal 2 Centre of Human

More information

A Hybrid Data Mining Approach for Analysis of Patient Behaviors in RFID Environments

A Hybrid Data Mining Approach for Analysis of Patient Behaviors in RFID Environments A Hybrid Data Mining Approach for Analysis of Patient Behaviors in RFID Environments incent S. Tseng 1, Eric Hsueh-Chan Lu 1, Chia-Ming Tsai 1, and Chun-Hung Wang 1 Department of Computer Science and Information

More information

MINING CLICKSTREAM-BASED DATA CUBES

MINING CLICKSTREAM-BASED DATA CUBES MINING CLICKSTREAM-BASED DATA CUBES Ronnie Alves and Orlando Belo Departament of Informatics,School of Engineering, University of Minho Campus de Gualtar, 4710-057 Braga, Portugal Email: {alvesrco,obelo}@di.uminho.pt

More information

Data Mining: Concepts and Techniques. (3 rd ed.) Chapter 1

Data Mining: Concepts and Techniques. (3 rd ed.) Chapter 1 Data Mining: Concepts and Techniques (3 rd ed.) Chapter 1 Jiawei Han, Micheline Kamber, and Jian Pei University of Illinois at Urbana-Champaign & Simon Fraser University 2013 Han, Kamber & Pei. All rights

More information

Static Data Mining Algorithm with Progressive Approach for Mining Knowledge

Static Data Mining Algorithm with Progressive Approach for Mining Knowledge Global Journal of Business Management and Information Technology. Volume 1, Number 2 (2011), pp. 85-93 Research India Publications http://www.ripublication.com Static Data Mining Algorithm with Progressive

More information

IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH

IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH IMPROVING DATA INTEGRATION FOR DATA WAREHOUSE: A DATA MINING APPROACH Kalinka Mihaylova Kaloyanova St. Kliment Ohridski University of Sofia, Faculty of Mathematics and Informatics Sofia 1164, Bulgaria

More information

Jiliang Tang. 701 First Avenue Yahoo!, Voice: (408) 744-2053 E-mail: jlt@yahoo-inc.com Sunnyvale, CA, 94089 US. Contact Information

Jiliang Tang. 701 First Avenue Yahoo!, Voice: (408) 744-2053 E-mail: jlt@yahoo-inc.com Sunnyvale, CA, 94089 US. Contact Information Jiliang Tang Contact Information Research Interests 701 First Avenue Yahoo!, Voice: (408) 744-2053 Yahoo Labs E-mail: jlt@yahoo-inc.com Sunnyvale, CA, 94089 US URL: http://www.public.asu.edu/~jtang20 Data

More information

Predicting Information Popularity Degree in Microblogging Diffusion Networks

Predicting Information Popularity Degree in Microblogging Diffusion Networks Vol.9, No.3 (2014), pp.21-30 http://dx.doi.org/10.14257/ijmue.2014.9.3.03 Predicting Information Popularity Degree in Microblogging Diffusion Networks Wang Jiang, Wang Li * and Wu Weili College of Computer

More information

Research Statement: Human-Powered Information Management Aditya Parameswaran (www.stanford.edu/ adityagp)

Research Statement: Human-Powered Information Management Aditya Parameswaran (www.stanford.edu/ adityagp) Research Statement: Human-Powered Information Management Aditya Parameswaran (www.stanford.edu/ adityagp) My research broadly revolves around information management, with special emphasis on incorporating

More information

SPATIAL DATA CLASSIFICATION AND DATA MINING

SPATIAL DATA CLASSIFICATION AND DATA MINING , pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal

More information

Research Statement Immanuel Trummer www.itrummer.org

Research Statement Immanuel Trummer www.itrummer.org Research Statement Immanuel Trummer www.itrummer.org We are collecting data at unprecedented rates. This data contains valuable insights, but we need complex analytics to extract them. My research focuses

More information

Kaiquan Xu, Associate Professor, Nanjing University. Kaiquan Xu

Kaiquan Xu, Associate Professor, Nanjing University. Kaiquan Xu Kaiquan Xu Marketing & ebusiness Department, Business School, Nanjing University Email: xukaiquan@nju.edu.cn Tel: +86-25-83592129 Employment Associate Professor, Marketing & ebusiness Department, Nanjing

More information

Mining Mobile Group Patterns: A Trajectory-Based Approach

Mining Mobile Group Patterns: A Trajectory-Based Approach Mining Mobile Group Patterns: A Trajectory-Based Approach San-Yih Hwang, Ying-Han Liu, Jeng-Kuen Chiu, and Ee-Peng Lim Department of Information Management National Sun Yat-Sen University, Kaohsiung, Taiwan

More information

Content-Based Discovery of Twitter Influencers

Content-Based Discovery of Twitter Influencers Content-Based Discovery of Twitter Influencers Chiara Francalanci, Irma Metra Department of Electronics, Information and Bioengineering Polytechnic of Milan, Italy irma.metra@mail.polimi.it chiara.francalanci@polimi.it

More information

Mining Association Rules: A Database Perspective

Mining Association Rules: A Database Perspective IJCSNS International Journal of Computer Science and Network Security, VOL.8 No.12, December 2008 69 Mining Association Rules: A Database Perspective Dr. Abdallah Alashqur Faculty of Information Technology

More information

The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2

The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) The multilayer sentiment analysis model based on Random forest Wei Liu1, Jie Zhang2 1 School of

More information

Curriculum Vitae. Summer internship in a financial company that is active in quantitative analysis or development of quantitative

Curriculum Vitae. Summer internship in a financial company that is active in quantitative analysis or development of quantitative Curriculum Vitae XIAOXIAO SHI Department of Computer Science University of Illinois at Chicago Office: 851 S. Morgan St., Rm 1336 SEO, Chicago, IL 60607 xshi9@uic.edu, xiao.x.shi@gmail.com (preferred)

More information

Data Mining in the Application of Criminal Cases Based on Decision Tree

Data Mining in the Application of Criminal Cases Based on Decision Tree 8 Journal of Computer Science and Information Technology, Vol. 1 No. 2, December 2013 Data Mining in the Application of Criminal Cases Based on Decision Tree Ruijuan Hu 1 Abstract A briefing on data mining

More information

RESEARCH INTERESTS Modeling and Simulation, Complex Systems, Biofabrication, Bioinformatics

RESEARCH INTERESTS Modeling and Simulation, Complex Systems, Biofabrication, Bioinformatics FENG GU Assistant Professor of Computer Science College of Staten Island, City University of New York 2800 Victory Boulevard, Staten Island, NY 10314 Doctoral Faculty of Computer Science Graduate Center

More information

Mining Signatures in Healthcare Data Based on Event Sequences and its Applications

Mining Signatures in Healthcare Data Based on Event Sequences and its Applications Mining Signatures in Healthcare Data Based on Event Sequences and its Applications Siddhanth Gokarapu 1, J. Laxmi Narayana 2 1 Student, Computer Science & Engineering-Department, JNTU Hyderabad India 1

More information

Available online at www.sciencedirect.com Available online at www.sciencedirect.com. Advanced in Control Engineering and Information Science

Available online at www.sciencedirect.com Available online at www.sciencedirect.com. Advanced in Control Engineering and Information Science Available online at www.sciencedirect.com Available online at www.sciencedirect.com Procedia Procedia Engineering Engineering 00 (2011) 15 (2011) 000 000 1822 1826 Procedia Engineering www.elsevier.com/locate/procedia

More information

SEARCH ENGINE OPTIMIZATION USING D-DICTIONARY

SEARCH ENGINE OPTIMIZATION USING D-DICTIONARY SEARCH ENGINE OPTIMIZATION USING D-DICTIONARY G.Evangelin Jenifer #1, Mrs.J.Jaya Sherin *2 # PG Scholar, Department of Electronics and Communication Engineering(Communication and Networking), CSI Institute

More information

Management of Human Resource Information Using Streaming Model

Management of Human Resource Information Using Streaming Model , pp.75-80 http://dx.doi.org/10.14257/astl.2014.45.15 Management of Human Resource Information Using Streaming Model Chen Wei Chongqing University of Posts and Telecommunications, Chongqing 400065, China

More information

A Framework for Data Warehouse Using Data Mining and Knowledge Discovery for a Network of Hospitals in Pakistan

A Framework for Data Warehouse Using Data Mining and Knowledge Discovery for a Network of Hospitals in Pakistan , pp.217-222 http://dx.doi.org/10.14257/ijbsbt.2015.7.3.23 A Framework for Data Warehouse Using Data Mining and Knowledge Discovery for a Network of Hospitals in Pakistan Muhammad Arif 1,2, Asad Khatak

More information

The Design Study of High-Quality Resource Shared Classes in China: A Case Study of the Abnormal Psychology Course

The Design Study of High-Quality Resource Shared Classes in China: A Case Study of the Abnormal Psychology Course The Design Study of High-Quality Resource Shared Classes in China: A Case Study of the Abnormal Psychology Course Juan WANG College of Educational Science, JiangSu Normal University, Jiangsu, Xuzhou, China

More information

Mobile Storage and Search Engine of Information Oriented to Food Cloud

Mobile Storage and Search Engine of Information Oriented to Food Cloud Advance Journal of Food Science and Technology 5(10): 1331-1336, 2013 ISSN: 2042-4868; e-issn: 2042-4876 Maxwell Scientific Organization, 2013 Submitted: May 29, 2013 Accepted: July 04, 2013 Published:

More information

DATA PREPARATION FOR DATA MINING

DATA PREPARATION FOR DATA MINING Applied Artificial Intelligence, 17:375 381, 2003 Copyright # 2003 Taylor & Francis 0883-9514/03 $12.00 +.00 DOI: 10.1080/08839510390219264 u DATA PREPARATION FOR DATA MINING SHICHAO ZHANG and CHENGQI

More information

Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information

Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information Continuous Fastest Path Planning in Road Networks by Mining Real-Time Traffic Event Information Eric Hsueh-Chan Lu Chi-Wei Huang Vincent S. Tseng Institute of Computer Science and Information Engineering

More information

Dynamic Data in terms of Data Mining Streams

Dynamic Data in terms of Data Mining Streams International Journal of Computer Science and Software Engineering Volume 2, Number 1 (2015), pp. 1-6 International Research Publication House http://www.irphouse.com Dynamic Data in terms of Data Mining

More information

How To Create A Text Classification System For Spam Filtering

How To Create A Text Classification System For Spam Filtering Term Discrimination Based Robust Text Classification with Application to Email Spam Filtering PhD Thesis Khurum Nazir Junejo 2004-03-0018 Advisor: Dr. Asim Karim Department of Computer Science Syed Babar

More information

BPOE Research Highlights

BPOE Research Highlights BPOE Research Highlights Jianfeng Zhan ICT, Chinese Academy of Sciences 2013-10- 9 http://prof.ict.ac.cn/jfzhan INSTITUTE OF COMPUTING TECHNOLOGY What is BPOE workshop? B: Big Data Benchmarks PO: Performance

More information

Jiexun Li, Ph.D. College of Information Science and Technology, Drexel University, Philadelphia, PA

Jiexun Li, Ph.D. College of Information Science and Technology, Drexel University, Philadelphia, PA EDUCATION Jiexun Li, Ph.D. Assistant Professor College of Information Science and Technology Drexel University, Philadelphia, PA 19104 Phone: (215) 895-1459 Fax: (215) 895-2494 Email: jiexun.li@ischool.drexel.edu

More information

Emoticon Smoothed Language Models for Twitter Sentiment Analysis

Emoticon Smoothed Language Models for Twitter Sentiment Analysis Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence Emoticon Smoothed Language Models for Twitter Sentiment Analysis Kun-Lin Liu, Wu-Jun Li, Minyi Guo Shanghai Key Laboratory of

More information

Text Analytics with Ambiverse. Text to Knowledge. www.ambiverse.com

Text Analytics with Ambiverse. Text to Knowledge. www.ambiverse.com Text Analytics with Ambiverse Text to Knowledge www.ambiverse.com Version 1.0, February 2016 WWW.AMBIVERSE.COM Contents 1 Ambiverse: Text to Knowledge............................... 5 1.1 Text is all Around

More information

民 國 九 十 七 年 四 月 第 38 卷 第 2 期

民 國 九 十 七 年 四 月 第 38 卷 第 2 期 民 國 九 十 七 年 四 月 第 38 卷 第 2 期 1============================================================ Inside of Internet Data Nien-Yi Jan Ming-Tsung Chen Wan-Ting Chang Wei Shen Chow Along with the Internet technology

More information

Robust Outlier Detection Technique in Data Mining: A Univariate Approach

Robust Outlier Detection Technique in Data Mining: A Univariate Approach Robust Outlier Detection Technique in Data Mining: A Univariate Approach Singh Vijendra and Pathak Shivani Faculty of Engineering and Technology Mody Institute of Technology and Science Lakshmangarh, Sikar,

More information

IMPROVING BUSINESS PROCESS MODELING USING RECOMMENDATION METHOD

IMPROVING BUSINESS PROCESS MODELING USING RECOMMENDATION METHOD Journal homepage: www.mjret.in ISSN:2348-6953 IMPROVING BUSINESS PROCESS MODELING USING RECOMMENDATION METHOD Deepak Ramchandara Lad 1, Soumitra S. Das 2 Computer Dept. 12 Dr. D. Y. Patil School of Engineering,(Affiliated

More information

ConTag: Conceptual Tag Clouds Video Browsing in e-learning

ConTag: Conceptual Tag Clouds Video Browsing in e-learning ConTag: Conceptual Tag Clouds Video Browsing in e-learning 1 Ahmad Nurzid Rosli, 2 Kee-Sung Lee, 3 Ivan A. Supandi, 4 Geun-Sik Jo 1, First Author Department of Information Technology, Inha University,

More information

Search Result Optimization using Annotators

Search Result Optimization using Annotators Search Result Optimization using Annotators Vishal A. Kamble 1, Amit B. Chougule 2 1 Department of Computer Science and Engineering, D Y Patil College of engineering, Kolhapur, Maharashtra, India 2 Professor,

More information

Statistical Analysis and Visualization for Cyber Security

Statistical Analysis and Visualization for Cyber Security Statistical Analysis and Visualization for Cyber Security Joanne Wendelberger, Scott Vander Wiel Statistical Sciences Group, CCS-6 Los Alamos National Laboratory Quality and Productivity Research Conference

More information

IncSpan: Incremental Mining of Sequential Patterns in Large Database

IncSpan: Incremental Mining of Sequential Patterns in Large Database IncSpan: Incremental Mining of Sequential Patterns in Large Database Hong Cheng Department of Computer Science University of Illinois at Urbana-Champaign Urbana, Illinois 61801 hcheng3@uiuc.edu Xifeng

More information

Some Research Challenges for Big Data Analytics of Intelligent Security

Some Research Challenges for Big Data Analytics of Intelligent Security Some Research Challenges for Big Data Analytics of Intelligent Security Yuh-Jong Hu hu at cs.nccu.edu.tw Emerging Network Technology (ENT) Lab. Department of Computer Science National Chengchi University,

More information