Crawling and Detecting Community Structure in Online Social Networks using Local Information
- Annis Hall
- 10 years ago
Transcription
1 Crawling and Detecting Community Structure in Online Social Networks using Local Information. TU Delft - Network Architectures and Services (NAS)
2 Outline. In order to find communities in a graph, one needs the full graph. Crawling large datasets such as Online Social Networks takes very long. Facebook: 901 million users (active, April 2012); Twitter: over 140 million users (active, March 2012). Ideal crawling with one PC at 1 s per request: Facebook about 29 years, Twitter about 4.5 years. 1. Crawling: BFS/DFS/RFS; Mutual Friend Crawling (MFC); the Reference Score; performance. 2. Community Detection: the Reference Score; comparison with well-known methods. 3. Conclusion.
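The crawl-time estimate on this slide can be reproduced with a few lines of arithmetic. This is a minimal sketch; the function name `crawl_years` and the 365.25-day year are assumptions made for illustration, not part of the slides.

```python
# Back-of-the-envelope crawl time: one request per user, one request per second.
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed year length

def crawl_years(num_users, seconds_per_request=1.0):
    """Years needed to request every user profile sequentially from one machine."""
    return num_users * seconds_per_request / SECONDS_PER_YEAR

facebook_years = crawl_years(901e6)  # 901 million active users (April 2012)
twitter_years = crawl_years(140e6)   # 140 million active users (March 2012)

print(f"Facebook: {facebook_years:.1f} years")
print(f"Twitter:  {twitter_years:.1f} years")
```

Under these assumptions the estimate comes out at roughly 28.6 and 4.4 years, in line with the rounded figures of 29 and 4.5 years quoted on the slide.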
3 Crawling Online Social Networks via Breadth/Depth First Search. Standard Breadth First Search and standard Depth First Search work well on trees, but unfortunately social networks are not tree-like. What most people do is Random First Search (RFS). Using BFS, DFS, or RFS leads to a sampling bias, and with any of these methods one has to wait until the full graph is crawled before communities can be detected.
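The three baseline strategies differ only in which node is taken from the frontier next. The sketch below is illustrative: the `graph` dictionary stands in for the friend-list requests a real crawler would issue, and the function name `crawl` and its `strategy` parameter are hypothetical.

```python
import random

def crawl(graph, seed, strategy="bfs", rng=None):
    """Return the visiting order for a BFS, DFS (stack-based), or RFS crawl.

    graph: dict mapping node -> list of neighbors (stand-in for API requests).
    """
    rng = rng or random.Random(0)
    frontier = [seed]   # discovered but not yet visited nodes
    seen = {seed}       # nodes already discovered
    order = []
    while frontier:
        if strategy == "bfs":
            node = frontier.pop(0)                         # oldest discovered node
        elif strategy == "dfs":
            node = frontier.pop()                          # newest discovered node
        elif strategy == "rfs":
            node = frontier.pop(rng.randrange(len(frontier)))  # random node
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append(neighbor)
    return order

toy = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
print(crawl(toy, 0, "bfs"))
print(crawl(toy, 0, "dfs"))
print(crawl(toy, 0, "rfs"))
```

None of the three orders takes community membership into account, which is the gap the slides set out to address.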
4 Crawling Online Social Networks via Mutual Friend Crawling. Our proposed method, Mutual Friend Crawling (MFC), overcomes this situation by crawling a graph community-wise from any given seed point. MFC is based on BFS/DFS plus one assumption: the degree of neighboring nodes is known. It keeps a Reference Score S_R for each discovered node; in the search trajectory, the next node to visit is the one with the highest S_R.
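A minimal sketch of the selection rule follows. The slides do not spell out the formula for S_R, so the code assumes S_R = (number of already crawled nodes that link to the candidate) / (degree of the candidate), which agrees with the example values 0.2 and 0.25 quoted on the next slide. The function name `mutual_friend_crawl`, the tie-breaking rule, and the toy graph are hypothetical.

```python
def mutual_friend_crawl(graph, seed):
    """Crawl `graph` from `seed`, always visiting the candidate with the highest S_R.

    Assumption: S_R(v) = references(v) / degree(v), where references(v) counts
    the already crawled nodes that link to v. Degrees are assumed to be known.
    Returns the visiting order and the S_R with which each node was selected.
    """
    degree = {node: len(neighbors) for node, neighbors in graph.items()}
    order, scores = [seed], []
    visited = {seed}
    references = {}  # candidate -> number of crawled nodes linking to it

    def add_references(node):
        for neighbor in graph[node]:
            if neighbor not in visited:
                references[neighbor] = references.get(neighbor, 0) + 1

    add_references(seed)
    while references:
        # Ties are broken by the smallest node id (an arbitrary choice).
        best = max(sorted(references), key=lambda n: references[n] / degree[n])
        scores.append(references.pop(best) / degree[best])
        visited.add(best)
        order.append(best)
        add_references(best)
    return order, scores

# Two triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
                 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
order, scores = mutual_friend_crawl(two_triangles, 0)
print(order)
print([round(s, 3) for s in scores])
```

On this toy graph the crawl finishes the first triangle before crossing the bridge, and the score recorded for the first node of the second triangle is lower than the one before it.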
5 Crawling Online Social Networks via Mutual Friend Crawling. Example: starting with node 2, its neighbors are 0, 1, 3, and 4, each with a known degree. Let's take 4; the Reference Scores are: 0: 0.2, 1: 0.2, 3: 0.25, 4: 0.2.
6 Crawling Online Social Networks via Mutual Friend Crawling - Performance. Figure: BFS (blue), DFS (green), and MFC (red) on the American Football network (Newman et al.).
7 Community Detection in OSNs via Mutual Friend Crawling. How does the Reference Score behave while MFC traverses the graph? As MFC stays within a community, the Reference Score keeps increasing, indicating that the community is tightly connected. As soon as there is a drop in S_R, a new community has been found. This drop is largest if a pronounced community structure exists; otherwise it is small.
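The drop rule can be sketched as a single pass over the sequence of scores recorded during the crawl. This is an illustration only: the function name `split_on_drops` and the `min_drop` threshold are hypothetical, and the input values below are made up for a toy graph of two triangles joined by one edge.

```python
def split_on_drops(order, scores, min_drop=0.0):
    """Split a crawl order into communities wherever S_R drops.

    order[0] is the seed; scores[i] is the S_R with which order[i + 1] was
    selected. A new community starts when the score falls by more than
    `min_drop` (hypothetical parameter) relative to the previous score.
    """
    communities = [[order[0]]]
    previous = None
    for node, score in zip(order[1:], scores):
        if previous is not None and previous - score > min_drop:
            communities.append([])  # drop detected: open a new community
        communities[-1].append(node)
        previous = score
    return communities

# Illustrative crawl trace (made-up values for two triangles joined by an edge).
trace_order = [0, 1, 2, 3, 4, 5]
trace_scores = [0.5, 2 / 3, 1 / 3, 0.5, 1.0]
communities = split_on_drops(trace_order, trace_scores)
print(communities)
```

Raising `min_drop` would suppress splits on small fluctuations, at the cost of missing weakly separated communities; the slides do not say which threshold, if any, is used.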
8 Community Detection in Online Social Networks via Mutual Friend Crawling. Problem of misclassification: if the crawl starts with a hub (node 11), nodes 10 and 21 are classified as being in the same community as node 11 (the first community). Solution: after finishing a community, check whether each node in it really belongs to this community.
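The slides do not state how the membership check is carried out. One possible criterion, shown here purely as an assumption, is to flag nodes that have fewer than half of their neighbors inside the finished community; the function name `weakly_attached` and the `min_fraction` parameter are hypothetical.

```python
def weakly_attached(graph, community, min_fraction=0.5):
    """Return members whose share of neighbors inside `community` is below
    `min_fraction` (an assumed criterion, not taken from the slides)."""
    members = set(community)
    flagged = []
    for node in community:
        neighbors = graph[node]
        inside = sum(1 for neighbor in neighbors if neighbor in members)
        if neighbors and inside / len(neighbors) < min_fraction:
            flagged.append(node)
    return flagged

# Node 4 belongs to the triangle {4, 5, 6} but was pulled in through node 1.
example_graph = {1: [2, 3, 4], 2: [1, 3], 3: [1, 2],
                 4: [1, 5, 6], 5: [4, 6], 6: [4, 5]}
flagged = weakly_attached(example_graph, [1, 2, 3, 4])
print(flagged)
```

Flagged nodes could then be returned to the set of unassigned nodes so that a later community can claim them.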
9 Conclusion & Future Work. We proposed an algorithm to crawl online social networks community-wise, in order to minimize sampling bias in communities and to be able to analyze data while still crawling the network. The algorithm detects communities, even for directed and weighted graphs. Future work: overlapping communities; a formalism to understand the drop in the Reference Score, in order to capture how structured a graph is (compared to modularity).
10 Thank you for your attention. Questions? Delft University of Technology, Faculty of Electr. Engineering, Dept. of Telecommunication, Mekelweg, CD Delft, The Netherlands. Room: EWI
11 Crawling Online Social Networks via Mutual Friend Crawling - Performance. In order to measure the performance, we were looking for ground-truth datasets. As it is very hard to find real-world datasets where the community partition is known, we came up with a Cluster Graph Generator: 1. node generation and slot assignment; 2. assigning nodes to clusters; 3. creating the links; 4. forcing the generation of a giant connected component (GCC). The generator can produce arbitrary (predefined) community size distributions. Multiple community detection algorithms were tested on the ground truth.
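The four generator steps can be sketched as follows. This is a simplified stand-in, not the authors' generator: slot assignment is omitted, links are drawn with assumed probabilities `p_in` and `p_out`, and the last step connects every component to the first one rather than only enforcing a giant component. All names and parameters are hypothetical.

```python
import random

def connected_components(adj):
    """Return the connected components of an undirected graph as a list of sets."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, component = [start], set()
        seen.add(start)
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components

def cluster_graph(community_sizes, p_in=0.6, p_out=0.02, seed=0):
    """Generate a graph with a known community partition (simplified sketch)."""
    rng = random.Random(seed)
    # Steps 1 and 2: generate nodes and assign each one to a cluster.
    membership, node = {}, 0
    for cluster, size in enumerate(community_sizes):
        for _ in range(size):
            membership[node] = cluster
            node += 1
    adj = {n: set() for n in membership}
    # Step 3: create links, denser inside clusters than between them.
    nodes = list(membership)
    for i in nodes:
        for j in nodes:
            if i < j:
                p = p_in if membership[i] == membership[j] else p_out
                if rng.random() < p:
                    adj[i].add(j)
                    adj[j].add(i)
    # Step 4: join all remaining components to the first one.
    components = connected_components(adj)
    for component in components[1:]:
        a = rng.choice(sorted(components[0]))
        b = rng.choice(sorted(component))
        adj[a].add(b)
        adj[b].add(a)
    return adj, membership

adj, membership = cluster_graph([5, 5, 10])
print(len(adj), len(connected_components(adj)))
```

Passing a list such as `[5, 5, 10]` fixes the community size distribution in advance, and `membership` provides the ground truth against which a detected partition can be compared.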
