Graph-Based User Behavior Modeling: From Prediction to Fraud Detection
|
|
|
- Cory Marshall
- 10 years ago
- Views:
Transcription
1 Graph-Based User Behavior Modeling: From Prediction to Fraud Detection Alex Beutel Carnegie Mellon University Christos Faloutsos Carnegie Mellon University March 2015 Leman Akoglu Stony Brook University Abstract How can we model users preferences? How do anomalies, fraud, and spam effect our models of normal users? How can we modify our models to catch fraudsters? In this tutorial we will answer these questions - connecting graph analysis tools for user behavior modeling to anomaly and fraud detection. In particular, we will focus on the application of subgraph analysis, label propagation, and latent factor models to static, evolving, and attributed graphs. For each of these techniques we will give a brief explanation of the algorithms and the intuition behind them. We will then give examples of recent research using the techniques to model, understand and predict normal behavior. With this intuition for how these methods are applied to graphs and user behavior, we will focus on state-of-the-art research showing how the outcomes of these methods are effected by fraud, and how they have been used to catch fraudsters. Perspective and Target Audience Perspective: In this tutorial we focus on understanding anomaly and fraud detection through the lens of normal user behavior modeling. The data mining and machine learning communities have developed a plethora of models and methods for understanding user behavior. However, these methods generally assume that the behavior is that of real, honest people. On the other hand, fraud detection systems frequently use similar techniques as those used in modeling normal behavior, but are often framed as an independent problem. However, by focusing on the relations and intersections of the two perspectives we can gain a more complete understanding of the methods and hopefully inspire new research joining these two communities. Target Audience: This tutorial is aimed at anyone interested in modeling and understanding user behavior, from data mining and machine learning researchers to practitioners from industry and government. For those new to the field, the tutorial will cover the necessary background material to understand these systems and will offer a concise, intuitive overview of the state-of-the-art. Additionally, the tutorial aims to offer a new perspective that will be valuable and interesting even for researchers with more experience in these domains. For those having worked in classic user behavior modeling, we will demonstrate how fraud can effect commonly-used models that expect normal behavior, with the hope that future models will directly account for fraud. For those having worked in fraud detection systems, we hope to inspire new research directions through connecting with recent developments in modeling normal behavior. 1
2 Outline I. Introduction (10 min) A. Graphs are a useful abstraction for a wide variety of domains: social networks, movie or product ratings and reviews, text in articles, medical diagnoses, financial transactions, etc. B. How can we make sense of the data? What does normal behavior look like? For example, we can predict ratings on Netflix or friends on Twitter and fill in missing information on Facebook. C. Fraud is rampant in nearly all of these applications - fake reviews on Yelp, purchased followers on Twitter, inflated trust on ebay, medical fraud, bank fraud. This activity deceives users and manipulates our prediction algorithms. Therefore it is important to understand how fraud influences our models and how we can isolate and catch anomalous behavior. II. Subgraph Analysis and Patterns (40 min) A. Background: Graph clustering and partitioning (5 min) i. Local search and graph queries [4, 19] ii. Co-clustering and cross associations [23, 13] B. Normal Behavior: Subgraph Patterns (10 min) i. Ego-nets: [29, 5] ii. Subgraph patterns in social networks: [40] iii. Influence of subgraphs on recommendation: [41] iv. Co-clustering for recommendation: ACCAMS [6] C. Abnormal Behavior: What are anomalous or fraudulent subgraphs? (25 min) i. Ego-net analysis: COI [14], OddBall [3] ii. Attributed subgraphs: FocusCO [35], SODA [20], CODA [16] iii. Temporal lockstep behavior: CopyCatch [8]; extended in [11] iv. Graph queries: volcanoes and blackholes on static graphs [30], attributed subgraph [46] v. Detecting fraud with co-clustering [34] vi. Graph cut for intrusion detection [15] III. Label Propagation Methods (1 hour) A. Random Walks and Eigenvectors (10 min) i. Background: Why do random walks find important parts of a graph? ii. Normal Behavior: Power method, HITS [27], and PageRank [9] B. Belief and Label Propagation (10 min) i. Background: What is semi-supervised learning? What are belief and label propagation? ii. Normal Behavior: Predict attributes on nodes or why certain people are friends [12] C. Abnormal Behavior: How can random walks help us find fraud? (40 min) i. Surprising patterns in HITS: CatchSync [25] ii. Modifications of PageRank: TrustRank [22], SybilRank[10], CollusionRank[17], BadRank [43] iii. Use belief propagation for guilt-by-association: Binary graphs: NetProbe [31] Attributed graphs: FraudEagle [2], [26] Guilt-by-constellation [42] 2
3 IV. Latent Factor Models (1 hour) A. Background: What is the singular vector decomposition? (5 min) i. Generalization from Eigenvectors and HITS ii. Why do latent factor models, like SVD, work well for relational data? iii. What do the factors typically represent? Why? For example, the factorization of a user by movie ratings matrix gives genres; the decomposition of a word by document matrix gives topics. B. Normal Behavior: Modifications for different settings: (15 min) i. Finding communities (binary matrices): MMSB [1], overlapping communities [44] ii. Missing data and prediction: SVD++ [28], BPMF [37], CoBaFi [7] iii. Multi-modal data: PARAFAC, Tensor factorization [33, 32] iv. Coupled factorization [39, 21] C. Abnormal Behavior: What happens if there is fraud in the data? (40 min) i. Surprising patterns in latent factors (binary matrices): EigenSpokes [36], Get-the-scoop [24], fbox [38] ii. Surprising group patterns in ratings data: CoBaFi [7] iii. Surprising pattern in coupled factorization for heterogeneous graphs [18] iv. Group anomalies: GLAD [45] V. Looking forward (10 min) A. How can we handle multiple data sources and complex data? B. With more complex data and methods, how can maintain the interpretability of discovered fraud? C. Can we provide an adversarial analysis of our methods and offer provable guarantees? Similar tutorials What is Strange in Large Networks? Graph-based Irregularity and Fraud Detection ( cs.stonybrook.edu/~leman/icdm12/, ICDM 2012, Brussels, Belgium) - A subset of this tutorial focused on fraud detection systems, many of which are included in this tutorial. However, the tutorial was generally framed as finding strange structures and patterns in large graphs, some of which could be fraudulent patterns. This differs significantly from the perspective of this tutorial, which focuses on (1) the models and methods used in fraud detection systems and (2) their relation to modeling normal behavior. Presenters Christos Faloutsos is a Professor at Carnegie Mellon University. He has received the Presidential Young Investigator Award by the National Science Foundation (1989), the Research Contributions Award in ICDM 2006, the Innovations award in KDD10, 20 best paper awards, and several teaching awards. He has served as a member of the executive committee of SIGKDD; he has published over 200 refereed articles, 11 book chapters and one monograph. He holds five patents and he has given over 30 tutorials and over 10 invited distinguished lectures. His research interests include data mining for graphs and streams, fractals, database performance, and indexing for multimedia and bio-informatics data. More details can be found at 3
4 Leman Akoglu is an Assistant Professor in the Department of Computer Science at Stony Brook University since August She received her Ph.D. from the Computer Science Department at Carnegie Mellon University. Her research interests span a wide range of data mining and machine learning topics with a focus on algorithmic problems arising in graph mining, pattern discovery, social and information networks, and especially anomaly, fraud, and event detection. Dr. Akoglu s research has won 3 publication awards; one Best Knowledge Discovery Paper at ECML/PKDD 2009, a Best Paper at PAKDD 2010, and a Best Paper at ADC She also holds 3 U.S. patents filed by IBM T. J. Watson Research Labs. Dr. Akoglu is a recipient of the NSF CAREER award (2015) and an Army Research Office Young Investigator award (2013). Her research is supported by the National Science Foundation, the US Army Research Office, Office of Naval Research, and a gift from Northrop Grumman Aerospace Systems. More details can be found at edu/~leman. Alex Beutel is a fourth-year Ph.D. student at Carnegie Mellon University in the Computer Science Department. He previously received his B.S. from Duke University. His Ph.D. research focuses on large scale user behavior modeling, covering both recommendation sytems and fraud detection systems. Alex s research is supported by the National Science Foundation Graduate Research Fellowship Program and a Facebook Fellowship. More details can be found at References [1] Edoardo M Airoldi, David M Blei, Stephen E Fienberg, and Eric P Xing. Mixed membership stochastic blockmodels. In Advances in Neural Information Processing Systems, pages 33 40, [2] Leman Akoglu, Rishi Chandy, and Christos Faloutsos. Opinion fraud detection in online reviews by network effects. In ICWSM, [3] Leman Akoglu, Mary McGlohon, and Christos Faloutsos. Oddball: Spotting anomalies in weighted graphs. PAKDD 2010, June [4] Reid Andersen, Fan Chung, and Kevin Lang. Local graph partitioning using pagerank vectors. In Foundations of Computer Science, FOCS th Annual IEEE Symposium on, pages IEEE, [5] Lars Backstrom and Jon Kleinberg. Romantic partnerships and the dispersion of social ties: a network analysis of relationship status on facebook. In Proceedings of the 17th ACM conference on Computer supported cooperative work & social computing, pages ACM, [6] Alex Beutel, Amr Ahmed, and Alexander J Smola. ACCAMS: Additive Co-Clustering to Approximate Matrices Succinctly. In Proceedings of the 24th international conference on World wide web. International World Wide Web Conferences Steering Committee, [7] Alex Beutel, Kenton Murray, Christos Faloutsos, and Alexander J Smola. CoBaFi: Collaborative Bayesian Filtering. In Proceedings of the 23rd international conference on World wide web, pages International World Wide Web Conferences Steering Committee, [8] Alex Beutel, Wanhong Xu, Venkatesan Guruswami, Christopher Palow, and Christos Faloutsos. Copycatch: stopping group attacks by spotting lockstep behavior in social networks. In Proceedings of the 22nd international conference on World Wide Web, pages International World Wide Web Conferences Steering Committee, [9] Sergey Brin and Larry Page. The anatomy of a large-scale hypertextual web search engine. In Proceedings of the Seventh International World Wide Web Conference,
5 [10] Qiang Cao, Michael Sirivianos, Xiaowei Yang, and Tiago Pregueiro. Aiding the detection of fake accounts in large scale social online services. In NSDI, [11] Qiang Cao, Xiaowei Yang, Jieqi Yu, and Christopher Palow. Uncovering large groups of active malicious accounts in online social networks. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pages ACM, [12] Deepayan Chakrabarti, Stanislav Funiak, Jonathan Chang, and Sofus A Macskassy. Joint inference of multiple label types in large networks. arxiv preprint arxiv: , [13] Deepayan Chakrabarti, Spiros Papadimitriou, Dharmendra S Modha, and Christos Faloutsos. Fully automatic cross-associations. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, pages ACM, [14] Corinna Cortes, Daryl Pregibon, and Chris Volinsky. Communities of interest. Springer, [15] Qi Ding, Natallia Katenka, Paul Barford, Eric Kolaczyk, and Mark Crovella. Intrusion as (anti) social communication: characterization and detection. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, pages ACM, [16] Jing Gao, Feng Liang, Wei Fan, Chi Wang, Yizhou Sun, and Jiawei Han. On community outliers and their efficient detection in information networks. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, pages ACM, [17] Saptarshi Ghosh, Bimal Viswanath, Farshad Kooti, Naveen Kumar Sharma, Gautam Korlam, Fabricio Benevenuto, Niloy Ganguly, and Krishna Phani Gummadi. Understanding and combating link farming in the twitter social network. In Proceedings of the 21st international conference on World Wide Web, pages ACM, [18] Manish Gupta, Jing Gao, and Jiawei Han. Community distribution outlier detection in heterogeneous information networks. In Machine Learning and Knowledge Discovery in Databases, pages Springer, [19] Manish Gupta, Jing Gao, Xifeng Yan, Hasan Cam, and Jiawei Han. On detecting association-based clique outliers in heterogeneous information networks. In Advances in Social Networks Analysis and Mining (ASONAM), 2013 IEEE/ACM International Conference on, pages IEEE, [20] Manish Gupta, Arun Mallya, Subhro Roy, Jason HD Cho, and Jiawei Han. Local learning for mining outlier subgraphs from network datasets. In Proceedings of the 2014 SIAM International Conference on Data Mining, pages 73 81, [21] Nitish Gupta and Sameer Singh. Collective factorization for relational data: An evaluation on the yelp datasets. [22] Zoltán Gyöngyi, Hector Garcia-Molina, and Jan Pedersen. Combating web spam with trustrank. In VLDB Endowment, pages , [23] J. A. Hartigan. Direct clustering of a data matrix. Journal of the american statistical association, 67(337): , [24] Meng Jiang. Peng cui, alex beutel, christos faloutsos and shiqiang yang. inferring strange behavior from connectivity pattern in social networks. In The 18th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD),
6 [25] Meng Jiang, Peng Cui, Alex Beutel, Christos Faloutsos, and Shiqiang Yang. Catchsync: catching synchronized behavior in large directed graphs. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages ACM, [26] Enric Junqué de Fortuny, Marija Stankova, Julie Moeyersoms, Bart Minnaert, Foster Provost, and David Martens. Corporate residence fraud detection. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 14, pages , New York, NY, USA, ACM. [27] J.M. Kleinberg. Authoritative sources in a hyperlinked environment. Journal of the ACM (JACM), 46(5): , [28] Yehuda Koren. Factorization meets the neighborhood: a multifaceted collaborative filtering model. In Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, pages ACM, [29] Jure Leskovec and Julian J Mcauley. Learning to discover social circles in ego networks. In Advances in neural information processing systems, pages , [30] Zhongmou Li, Hui Xiong, and Yanchi Liu. Detecting blackholes and volcanoes in directed networks. arxiv preprint arxiv: , [31] S. Pandit, D.H. Chau, S. Wang, and C. Faloutsos. Netprobe: a fast and scalable system for fraud detection in online auction networks. In Proceedings of the 16th international conference on World Wide Web, pages ACM, [32] Evangelos Papalexakis, Konstantinos Pelechrinis, and Christos Faloutsos. Spotting misbehaviors in location-based social networks using tensors. In Proceedings of the companion publication of the 23rd international conference on World wide web companion, pages International World Wide Web Conferences Steering Committee, [33] Evangelos E Papalexakis, Leman Akoglu, and D Ience. Do more views of a graph help? community detection and clustering in multi-graphs. In Information Fusion (FUSION), th International Conference on, pages IEEE, [34] Evangelos E Papalexakis, Alex Beutel, and Peter Steenkiste. Network anomaly detection using coclustering. In Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012), pages IEEE Computer Society, [35] Bryan Perozzi, Leman Akoglu, Patricia Iglesias Sánchez, and Emmanuel Müller. Focused clustering and outlier detection in large attributed graphs. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages ACM, [36] B. Aditya Prakash, Mukund Seshadri, Ashwin Sridharan, Sridhar Machiraju, and Christos Faloutsos. Eigenspokes: Surprising patterns and scalable community chipping in large graphs. PAKDD 2010, June [37] Ruslan Salakhutdinov and Andriy Mnih. Bayesian probabilistic matrix factorization using markov chain monte carlo. In Proceedings of the 25th international conference on Machine learning, pages ACM, [38] Neil Shah, Alex Beutel, Brian Gallagher, and Christos Faloutsos. Spotting suspicious link behavior with fbox: An adversarial perspective. In ICDM,
7 [39] Ajit P Singh and Geoffrey J Gordon. A unified view of matrix factorization models. In Machine Learning and Knowledge Discovery in Databases, pages Springer, [40] Johan Ugander, Lars Backstrom, and Jon Kleinberg. Subgraph frequencies: Mapping the empirical and extremal geography of large graph collections. In Proceedings of the 22nd international conference on World Wide Web, pages International World Wide Web Conferences Steering Committee, [41] Johan Ugander, Lars Backstrom, Cameron Marlow, and Jon Kleinberg. Structural diversity in social contagion. Proceedings of the National Academy of Sciences, page , [42] Véronique Van Vlasselaer, Veronique VanVlasselaer, Leman Akoglu, Tina Eliassi-Rad, Monique Snoeck, and Bart Baesens. Guilt-by-constellation: Fraud detection by suspicious clique memberships. In Proceedings of 48 Annual Hawaii International Conference on System Sciences, [43] Baoning Wu, Vinay Goel, and Brian D Davison. Propagating trust and distrust to demote web spam. MTW, 190, [44] Jaewon Yang and Jure Leskovec. Structure and overlaps of ground-truth communities in networks. ACM Transactions on Intelligent Systems and Technology (TIST), 5(2):26, [45] Rose Yu, Xinran He, and Yan Liu. Glad: group anomaly detection in social media analysis. In SIGKDD, pages , [46] Honglei Zhuang, Jing Zhang, George Brova, Jie Tang, Hasan Cam, Xifeng Yan, and Jiawei Han. Mining query-based subnetwork outliers in heterogeneous information networks. 7
CMU SCS Large Graph Mining Patterns, Tools and Cascade analysis
Large Graph Mining Patterns, Tools and Cascade analysis Christos Faloutsos CMU Roadmap Introduction Motivation Why big data Why (big) graphs? Patterns in graphs Tools: fraud detection on e-bay Conclusions
SOCIAL NETWORK DATA ANALYTICS
SOCIAL NETWORK DATA ANALYTICS SOCIAL NETWORK DATA ANALYTICS Edited by CHARU C. AGGARWAL IBM T. J. Watson Research Center, Yorktown Heights, NY 10598, USA Kluwer Academic Publishers Boston/Dordrecht/London
Bayesian Factorization Machines
Bayesian Factorization Machines Christoph Freudenthaler, Lars Schmidt-Thieme Information Systems & Machine Learning Lab University of Hildesheim 31141 Hildesheim {freudenthaler, schmidt-thieme}@ismll.de
Top Top 10 Algorithms in Data Mining
ICDM 06 Panel on Top Top 10 Algorithms in Data Mining 1. The 3-step identification process 2. The 18 identified candidates 3. Algorithm presentations 4. Top 10 algorithms: summary 5. Open discussions ICDM
Part III: Graph-based spam/fraud detection algorithms and apps
Part III: Graph-based spam/fraud detection algorithms and apps L. Akoglu & C. Faloutsos Anomaly detection in graph data (ICDM'12) 159 Part III: Outline Algorithms: relational learning Collective classification
Social Prediction in Mobile Networks: Can we infer users emotions and social ties?
Social Prediction in Mobile Networks: Can we infer users emotions and social ties? Jie Tang Tsinghua University, China 1 Collaborate with John Hopcroft, Jon Kleinberg (Cornell) Jinghai Rao (Nokia), Jimeng
Top 10 Algorithms in Data Mining
Top 10 Algorithms in Data Mining Xindong Wu ( 吴 信 东 ) Department of Computer Science University of Vermont, USA; 合 肥 工 业 大 学 计 算 机 与 信 息 学 院 1 Top 10 Algorithms in Data Mining by the IEEE ICDM Conference
Big Data Analytics Process & Building Blocks
Big Data Analytics Process & Building Blocks Duen Horng (Polo) Chau Georgia Tech CSE 6242 A / CS 4803 DVA Jan 10, 2013 Partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John Stasko, Christos
Evaluating Online Payment Transaction Reliability using Rules Set Technique and Graph Model
Evaluating Online Payment Transaction Reliability using Rules Set Technique and Graph Model Trung Le 1, Ba Quy Tran 2, Hanh Dang Thi My 3, Thanh Hung Ngo 4 1 GSR, Information System Lab., University of
Curriculum Vitae. Summer internship in a financial company that is active in quantitative analysis or development of quantitative
Curriculum Vitae XIAOXIAO SHI Department of Computer Science University of Illinois at Chicago Office: 851 S. Morgan St., Rm 1336 SEO, Chicago, IL 60607 [email protected], [email protected] (preferred)
AFRAID: Fraud Detection via Active Inference in Time-evolving Social Networks
AFRAID: Fraud Detection via Active Inference in Time-evolving Social Networks Véronique Van Vlasselaer Tina Eliassi-Rad Leman Akoglu Monique Snoeck Bart Baesens Department of Decision Sciences and Information
Graph Processing and Social Networks
Graph Processing and Social Networks Presented by Shu Jiayu, Yang Ji Department of Computer Science and Engineering The Hong Kong University of Science and Technology 2015/4/20 1 Outline Background Graph
Community Mining from Multi-relational Networks
Community Mining from Multi-relational Networks Deng Cai 1, Zheng Shao 1, Xiaofei He 2, Xifeng Yan 1, and Jiawei Han 1 1 Computer Science Department, University of Illinois at Urbana Champaign (dengcai2,
Mimicking human fake review detection on Trustpilot
Mimicking human fake review detection on Trustpilot [DTU Compute, special course, 2015] Ulf Aslak Jensen Master student, DTU Copenhagen, Denmark Ole Winther Associate professor, DTU Copenhagen, Denmark
Ahmed Metwally Google Inc. 1600 Amphitheatre Pkwy Mountain View, CA 94043 (805) 403-9725 [email protected]
Education: Ahmed Metwally Google Inc. 1600 Amphitheatre Pkwy Mountain View, CA 94043 (805) 403-9725 [email protected] 2002-2007: PhD; UC Santa Barbara, Department of Computer Science. Advisors: Prof.
DÉFIS STATISTIQUES ET COMPUTATIONNELS DANS LES RÉSEAUX ET LA CYBERSÉCURITÉ
HORAIRE / PROGRAM ATELIER DÉFIS STATISTIQUES ET COMPUTATIONNELS DANS LES RÉSEAUX ET LA CYBERSÉCURITÉ 4 au 8 mai 2015 WORKSHOP STATISTICAL AND COMPUTATIONAL CHALLENGES IN NETWORKS AND CYBERSECURITY May
Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network
, pp.273-284 http://dx.doi.org/10.14257/ijdta.2015.8.5.24 Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network Gengxin Sun 1, Sheng Bin 2 and
Information Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli ([email protected])
Parallel Data Mining. Team 2 Flash Coders Team Research Investigation Presentation 2. Foundations of Parallel Computing Oct 2014
Parallel Data Mining Team 2 Flash Coders Team Research Investigation Presentation 2 Foundations of Parallel Computing Oct 2014 Agenda Overview of topic Analysis of research papers Software design Overview
Graph Mining Techniques for Social Media Analysis
Graph Mining Techniques for Social Media Analysis Mary McGlohon Christos Faloutsos 1 1-1 What is graph mining? Extracting useful knowledge (patterns, outliers, etc.) from structured data that can be represented
Cost effective Outbreak Detection in Networks
Cost effective Outbreak Detection in Networks Jure Leskovec Joint work with Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, and Natalie Glance Diffusion in Social Networks One of
Introduction to Graph Mining
Introduction to Graph Mining What is a graph? A graph G = (V,E) is a set of vertices V and a set (possibly empty) E of pairs of vertices e 1 = (v 1, v 2 ), where e 1 E and v 1, v 2 V. Edges may contain
A Survey on Outlier Detection Techniques for Credit Card Fraud Detection
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661, p- ISSN: 2278-8727Volume 16, Issue 2, Ver. VI (Mar-Apr. 2014), PP 44-48 A Survey on Outlier Detection Techniques for Credit Card Fraud
Social Network Discovery based on Sensitivity Analysis
Social Network Discovery based on Sensitivity Analysis Tarik Crnovrsanin, Carlos D. Correa and Kwan-Liu Ma Department of Computer Science University of California, Davis [email protected], {correac,ma}@cs.ucdavis.edu
MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph
MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph Janani K 1, Narmatha S 2 Assistant Professor, Department of Computer Science and Engineering, Sri Shakthi Institute of
Detecting Spam Bots in Online Social Networking Sites: A Machine Learning Approach
Detecting Spam Bots in Online Social Networking Sites: A Machine Learning Approach Alex Hai Wang College of Information Sciences and Technology, The Pennsylvania State University, Dunmore, PA 18512, USA
Jiliang Tang. 701 First Avenue Yahoo!, Voice: (408) 744-2053 E-mail: [email protected] Sunnyvale, CA, 94089 US. Contact Information
Jiliang Tang Contact Information Research Interests 701 First Avenue Yahoo!, Voice: (408) 744-2053 Yahoo Labs E-mail: [email protected] Sunnyvale, CA, 94089 US URL: http://www.public.asu.edu/~jtang20 Data
AN INTRODUCTION TO SOCIAL NETWORK DATA ANALYTICS
Chapter 1 AN INTRODUCTION TO SOCIAL NETWORK DATA ANALYTICS Charu C. Aggarwal IBM T. J. Watson Research Center Hawthorne, NY 10532 [email protected] Abstract The advent of online social networks has been
International Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015
RESEARCH ARTICLE OPEN ACCESS Data Mining Technology for Efficient Network Security Management Ankit Naik [1], S.W. Ahmad [2] Student [1], Assistant Professor [2] Department of Computer Science and Engineering
KNOWLEDGE DISCOVERY and SAMPLING TECHNIQUES with DATA MINING for IDENTIFYING TRENDS in DATA SETS
KNOWLEDGE DISCOVERY and SAMPLING TECHNIQUES with DATA MINING for IDENTIFYING TRENDS in DATA SETS Prof. Punam V. Khandar, *2 Prof. Sugandha V. Dani Dept. of M.C.A., Priyadarshini College of Engg., Nagpur,
Big Data Analytics in Mobile Environments
1 Big Data Analytics in Mobile Environments 熊 辉 教 授 罗 格 斯 - 新 泽 西 州 立 大 学 2012-10-2 Rutgers, the State University of New Jersey Why big data: historical view? Productivity versus Complexity (interrelatedness,
Data Mining System, Functionalities and Applications: A Radical Review
Data Mining System, Functionalities and Applications: A Radical Review Dr. Poonam Chaudhary System Programmer, Kurukshetra University, Kurukshetra Abstract: Data Mining is the process of locating potentially
Introduction to Data Mining
Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:
Community Detection Proseminar - Elementary Data Mining Techniques by Simon Grätzer
Community Detection Proseminar - Elementary Data Mining Techniques by Simon Grätzer 1 Content What is Community Detection? Motivation Defining a community Methods to find communities Overlapping communities
Detecting Anomalies in Mobile Telecommunication Networks Using a Graph Based Approach
Proceedings of the Twenty-Eighth International Florida Artificial Intelligence Research Society Conference Detecting Anomalies in Mobile Telecommunication Networks Using a Graph Based Approach Cameron
Online Fraud Detection Model Based on Social Network Analysis
Journal of Information & Computational Science :7 (05) 553 5 May, 05 Available at http://www.joics.com Online Fraud Detection Model Based on Social Network Analysis Peng Wang a,b,, Ji Li a,b, Bigui Ji
Example application (1) Telecommunication. Lecture 1: Data Mining Overview and Process. Example application (2) Health
Lecture 1: Data Mining Overview and Process What is data mining? Example applications Definitions Multi disciplinary Techniques Major challenges The data mining process History of data mining Data mining
JIAN WANG. Mountain View, CA 94043 (650) 868-6572 [email protected]
JIAN WANG Mountain View, CA 94043 (650) 868-6572 [email protected] RESEARCH INTEREST Big Data Analysis, Recommender Systems, Data Mining, Personalization, Information Retrieval, Machine Learning, Large-scale
DualIso: Scalable Subgraph Pattern Matching On Large Labeled Graphs SUPPLEMENT. Computer Science Department
DualIso: Scalable Subgraph Pattern Matching On Large Labeled Graphs SUPPLEMENT Matthew Saltz, Ayushi Jain, Abhishek Kothari, Arash Fard, John A. Miller and Lakshmish Ramaswamy Computer Science Department
Statistical Analysis of Network Data
Statistical Analysis of Network Data A Brief Overview Eric D. Kolaczyk Dept of Mathematics and Statistics, Boston University [email protected] Introduction Focus of this Talk In this talk I will present
Data Mining Yelp Data - Predicting rating stars from review text
Data Mining Yelp Data - Predicting rating stars from review text Rakesh Chada Stony Brook University [email protected] Chetan Naik Stony Brook University [email protected] ABSTRACT The majority
Factorization Machines
Factorization Machines Factorized Polynomial Regression Models Christoph Freudenthaler, Lars Schmidt-Thieme and Steffen Rendle 2 Information Systems and Machine Learning Lab (ISMLL), University of Hildesheim,
Data Mining: Opportunities and Challenges
Data Mining: Opportunities and Challenges Xindong Wu University of Vermont, USA; Hefei University of Technology, China ( 合 肥 工 业 大 学 计 算 机 应 用 长 江 学 者 讲 座 教 授 ) 1 Deduction Induction: My Research Background
User Modeling in Big Data. Qiang Yang, Huawei Noah s Ark Lab and Hong Kong University of Science and Technology 杨 强, 华 为 诺 亚 方 舟 实 验 室, 香 港 科 大
User Modeling in Big Data Qiang Yang, Huawei Noah s Ark Lab and Hong Kong University of Science and Technology 杨 强, 华 为 诺 亚 方 舟 实 验 室, 香 港 科 大 Who we are: Noah s Ark LAB Have you watched the movie 2012?
Learning Gaussian process models from big data. Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu
Learning Gaussian process models from big data Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu Machine learning seminar at University of Cambridge, July 4 2012 Data A lot of
Insider Threat Detection Using Graph-Based Approaches
Cybersecurity Applications & Technology Conference For Homeland Security Insider Threat Detection Using Graph-Based Approaches William Eberle Tennessee Technological University [email protected] Lawrence
A Clustering Model for Mining Evolving Web User Patterns in Data Stream Environment
A Clustering Model for Mining Evolving Web User Patterns in Data Stream Environment Edmond H. Wu,MichaelK.Ng, Andy M. Yip,andTonyF.Chan Department of Mathematics, The University of Hong Kong Pokfulam Road,
Adaptive Framework for Network Traffic Classification using Dimensionality Reduction and Clustering
IV International Congress on Ultra Modern Telecommunications and Control Systems 22 Adaptive Framework for Network Traffic Classification using Dimensionality Reduction and Clustering Antti Juvonen, Tuomo
Statistical and computational challenges in networks and cybersecurity
Statistical and computational challenges in networks and cybersecurity Hugh Chipman Acadia University June 12, 2015 Statistical and computational challenges in networks and cybersecurity May 4-8, 2015,
A Game Theoretical Framework for Adversarial Learning
A Game Theoretical Framework for Adversarial Learning Murat Kantarcioglu University of Texas at Dallas Richardson, TX 75083, USA muratk@utdallas Chris Clifton Purdue University West Lafayette, IN 47907,
Consensus Extraction from Heterogeneous Detectors to Improve Performance over Network Traffic Anomaly Detection
Consensus Extraction from Heterogeneous Detectors to Improve Performance over Network Traffic Anomaly Detection Jing Gao Wei Fan Deepak Turaga Olivier Verscheure Xiaoqiao Meng Lu Su Jiawei Han University
Customer Relationship Management using Adaptive Resonance Theory
Customer Relationship Management using Adaptive Resonance Theory Manjari Anand M.Tech.Scholar Zubair Khan Associate Professor Ravi S. Shukla Associate Professor ABSTRACT CRM is a kind of implemented model
Introduction to Data Mining. Lijun Zhang [email protected] http://cs.nju.edu.cn/zlj
Introduction to Data Mining Lijun Zhang [email protected] http://cs.nju.edu.cn/zlj Outline Overview Introduction The Data Mining Process The Basic Data Types The Major Building Blocks Scalability and Streaming
International Journal of World Research, Vol: I Issue XIII, December 2008, Print ISSN: 2347-937X DATA MINING TECHNIQUES AND STOCK MARKET
DATA MINING TECHNIQUES AND STOCK MARKET Mr. Rahul Thakkar, Lecturer and HOD, Naran Lala College of Professional & Applied Sciences, Navsari ABSTRACT Without trading in a stock market we can t understand
Mining the Temporal Dimension of the Information Propagation
Mining the Temporal Dimension of the Information Propagation Michele Berlingerio, Michele Coscia 2, and Fosca Giannotti 3 IMT-Lucca, Lucca, Italy 2 Dipartimento di Informatica, Pisa, Italy {name.surname}@isti.cnr.it
AN EFFICIENT SELECTIVE DATA MINING ALGORITHM FOR BIG DATA ANALYTICS THROUGH HADOOP
AN EFFICIENT SELECTIVE DATA MINING ALGORITHM FOR BIG DATA ANALYTICS THROUGH HADOOP Asst.Prof Mr. M.I Peter Shiyam,M.E * Department of Computer Science and Engineering, DMI Engineering college, Aralvaimozhi.
A Platform for Supporting Data Analytics on Twitter: Challenges and Objectives 1
A Platform for Supporting Data Analytics on Twitter: Challenges and Objectives 1 Yannis Stavrakas Vassilis Plachouras IMIS / RC ATHENA Athens, Greece {yannis, vplachouras}@imis.athena-innovation.gr Abstract.
Web Mining Techniques in E-Commerce Applications
Web Mining Techniques in E-Commerce Applications Ahmad Tasnim Siddiqui College of Computers and Information Technology Taif University Taif, Kingdom of Saudi Arabia Sultan Aljahdali College of Computers
Enhancing the Ranking of a Web Page in the Ocean of Data
Database Systems Journal vol. IV, no. 3/2013 3 Enhancing the Ranking of a Web Page in the Ocean of Data Hitesh KUMAR SHARMA University of Petroleum and Energy Studies, India [email protected] In today
Data Mining & Data Stream Mining Open Source Tools
Data Mining & Data Stream Mining Open Source Tools Darshana Parikh, Priyanka Tirkha Student M.Tech, Dept. of CSE, Sri Balaji College Of Engg. & Tech, Jaipur, Rajasthan, India Assistant Professor, Dept.
Big Data Analytics Building Blocks. Simple Data Storage (SQLite)
http://poloclub.gatech.edu/cse6242 CSE6242 / CX4242: Data & Visual Analytics Big Data Analytics Building Blocks. Simple Data Storage (SQLite) Duen Horng (Polo) Chau Georgia Tech Partly based on materials
Big Data Analytics Building Blocks; Simple Data Storage (SQLite)
Big Data Analytics Building Blocks; Simple Data Storage (SQLite) Duen Horng (Polo) Chau Georgia Tech CSE6242 / CX4242 Jan 9, 2014 Partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John
Ensemble Learning Better Predictions Through Diversity. Todd Holloway ETech 2008
Ensemble Learning Better Predictions Through Diversity Todd Holloway ETech 2008 Outline Building a classifier (a tutorial example) Neighbor method Major ideas and challenges in classification Ensembles
Data Mining for Security Applications
2008 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing Data Mining for Security Applications Bhavani Thuraisingham, Latifur Khan, Mohammad M. Masud, Kevin W. Hamlen The University
Xi Chen Curriculum Vitae
Xi Chen Curriculum Vitae IOMS Dept., NYU Stern B-School, 44 West 4th St., 8-50 New York, NY 10012 phone: 212-998-4017 email: [email protected] Education 2008 2013 Carnegie Mellon University, School of Computer
Factorization Machines
Factorization Machines Steffen Rendle Department of Reasoning for Intelligence The Institute of Scientific and Industrial Research Osaka University, Japan [email protected] Abstract In this
LINEAR-ALGEBRAIC GRAPH MINING
UCSF QB3 Seminar 5/28/215 Linear Algebraic Graph Mining, [email protected] 1/22 LLNL-PRES-671587 New Applications of Computer Analysis to Biomedical Data Sets QB3 Seminar, UCSF Medical School, May 28
Anomaly Detection Using Unsupervised Profiling Method in Time Series Data
Anomaly Detection Using Unsupervised Profiling Method in Time Series Data Zakia Ferdousi 1 and Akira Maeda 2 1 Graduate School of Science and Engineering, Ritsumeikan University, 1-1-1, Noji-Higashi, Kusatsu,
Detection of Collusion Behaviors in Online Reputation Systems
Detection of Collusion Behaviors in Online Reputation Systems Yuhong Liu, Yafei Yang, and Yan Lindsay Sun University of Rhode Island, Kingston, RI Email: {yuhong, yansun}@ele.uri.edu Qualcomm Incorporated,
MULTI-DOMAIN CLOUD SOCIAL NETWORK SERVICE PLATFORM SUPPORTING ONLINE COLLABORATIONS ON CAMPUS
MULTI-DOMAIN CLOUD SOCIAL NETWORK SERVICE PLATFORM SUPPORTING ONLINE COLLABORATIONS ON CAMPUS Zhao Du, Xiaolong Fu, Can Zhao, Ting Liu, Qifeng Liu, Qixin Liu Information Technology Center, Tsinghua University,
An Introduction to Data Mining
An Introduction to Intel Beijing [email protected] January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
PULLING OUT OPINION TARGETS AND OPINION WORDS FROM REVIEWS BASED ON THE WORD ALIGNMENT MODEL AND USING TOPICAL WORD TRIGGER MODEL
Journal homepage: www.mjret.in ISSN:2348-6953 PULLING OUT OPINION TARGETS AND OPINION WORDS FROM REVIEWS BASED ON THE WORD ALIGNMENT MODEL AND USING TOPICAL WORD TRIGGER MODEL Utkarsha Vibhute, Prof. Soumitra
Machine Learning Department, School of Computer Science, Carnegie Mellon University, PA
Pengtao Xie Carnegie Mellon University Machine Learning Department School of Computer Science 5000 Forbes Ave Pittsburgh, PA 15213 Tel: (412) 916-9798 Email: [email protected] Web: http://www.cs.cmu.edu/
Morteza Zihayat Curriculum Vitae October 2015
Morteza Zihayat Curriculum Vitae October 2015 Contact Information Ph.D Candidate Phone: (+1) 647-831-6167 E-mail: [email protected] 4700 Keele St. Room LS2057 Website: http://www.cse.yorku.ca/~zihayatm/
