Exploring Different Aspects of Social Network Analysis Using Web Mining Techniques
|
|
- Terence Gardner
- 8 years ago
- Views:
Transcription
1 Exploring Different Aspects of Social Network Analysis Using Web Mining Techniques 1. Hilal Ahmad Khanday, Dr. Rana Hashmy 2 1, 2 Department of Computer Sciences, University of Kashmir Abstract A social network is a set of people connected by a set of social relations. Thanks to the availability of increasing real-world social network data, social networks are receiving increasing attention by scientific community. The purpose of this paper is to explore multiple aspects of Social network Analysis using data mining techniques to the World Wide Web, referred to as Web mining. In particular we will inquire whether there is any other way of matrix representation of social networks other than Adjacency Matrix. Light will be shed on properties of social networks and there will also be a guided tourof different techniques of web mining. The main goal is to provide a roadmap for researchers who are trying to use data mining techniques for discovering different trends in social network data. Keywords-SocialNetwork, Social Network Analysis, Web Mining 1. INTRODUCTION With the advent of Web 2.0, we have been able to move from Closed, Individual Publishing, One-Way Communication, Passive Involvement, Read-Only Content & Personal Websites to Collaborative, Group Participation, Two-Way Communication, Active Involvement, User-Generated Content & Blogging. In reality we have moved from Double click to Google AdSense, from Britannica Online to Wikipedia, from publishing to participation, from personal websites to Blogging, from page views to cost per click etc. etc. Social Networks Analysis has acquired huge popularity and signify one of the most important social and Computer Science phenomena of these years. This has happened because of many factors including the popularity of online social networks (OSNs), availability of large volumes of OSN log data, representation and analysis of social networks as graphs, and the market interests of social networks. Social networking sites have skyrocketed in popularity in a very short span of time as is depicted by Fig 1. [1] Facebook, Twitter, LinkedIn, Wikipedia, YouTube have been able to make it to the top 15 global websites. Fig 1- Global rank of some major social networking sites. Image was generated dynamically at 2. REPRESENTATION Social network analysts use two kinds of tools from mathematics to represent information about patterns of ties among social actors: graphs and matrices. [2]. 2.1Using Graphs Problems in almost every conceivable discipline can be solved using graphic models [3].Network analysis uses (primarily) one kind of graphic display that consists of points (or nodes) to represent actors and lines (or edges) to represent ties or relations. Formally a Graph G = (V, E) consists of V, a nonempty set of vertices (or nodes) and E, a set of edges. Each edge has either one or two vertices associated with it, called its endpoints. An edge is said to connect its endpoints. Statistically, a graph can be characterized by derived values such as the average degree of the nodes and the average path length between nodes. Additional characteristics are the graphs diameter, the number of triangles, the number of isomorphism s and the clustering coefficient, among others. We can use graph models to represent various relationships between people. We can use simple graph to represent whether two people know each other. Each person in a particular group of people is represented by a vertex. An undirected edge is used to connect two people when these two people know each other. No multiple edges and usually no loops are used. (If we want to include self-knowledge, we would include loops.) An undirected graph can be used in some social networking sites like Facebook etc., while as directed edges can replace the connections between two nodes in case of websites like Twitter, where there is concept of following. Volume 4, Issue 2, March April 2015 Page 121
2 The acquaintanceship graph of all the people in the world has seven billion vertices and probably more than One trillion edges!many social scientists have conjectured that almost every pair of people in the world are linked by a small chain of people, perhaps containing just six or fewer people. This would mean that almost every pair of vertices in the acquaintanceship graph containing all the people in the world is linked by a path of length not exceeding four. The play Six Degrees of Separation is based on this notion 2.2Using Matrices The most common form of representation of data in social network analysis is Matrix form as it is most convenient and suitable for many mathematical operations. We can convert graphs into matrices and vice versa. The information contained in a graph G = (V, E) can be stored in several ways, for example, using the matrix form. The most common way to store a graph is using the adjacency matrix. The adjacency matrix, denoted by A, contains entries which indicate whether two vertices are adjacent or not. An adjacency matrix is also called a Sociomatrix. The matrix A of size n x n can efficiently describe an unweighted undirected graph G = (V, E) containing n vertices. Rows and columns of the sociomatrix both represent the index of each vertex in the graph, and are labelled as 1, 2 n. Each entry a ij in the sociomatrix represents if the indicated pair of vertices a i and a j are adjacent or not. Usually, there is a 1 in the (i, j)th cell if there is an edge connecting a i and a j in the graph, or a 0 otherwise. Thus, if vertices a i and a j are adjacent, a ij = 1, otherwise a ij = 0. Because the graph is undirected, the matrix is symmetric respect to its diagonal, thus a ij = a ji, v i j. Sociomatrices are widely adopted for storing undirected network structures because of some particular properties; for example, social networks arise to sparse sociomatrices, thus it is convenient to adopt techniques of compact matrix decomposition for efficiently storing data[4]. Another possible representation of an undirected graph G = (V, E) through a matrix is called incidence matrix, usually denoted by I. It stores which edges are incident with which vertices, indexing the former on the columns and the latters on the rows, thus the dimension of the matrix I is VxE. A matrix entry I ij contains 1 if the vertex v i is incident with the edge e j, or 0 otherwise. Both the incidence and the adjacent matricescontains all the information required to describe the represented graph. Since Incidence matrix does not necessary has to be a square matrix unlike adjacency matrix which is always a square matrix, therebyproviding enough flexibility. Besides adjacency matrix of simple graphs is always symmetric which may not be the case with incidence matrices. In addition when a simple graph consists of relatively few edges, i.e. when it is sparse, it is preferable to use adjacency list rather than adjacency matrix. For example, if there is a graph with n vertices and each vertex has degree less than or equal to c where c is a constant much smaller than n, then each adjacency list contains vertices less than or equal to c. Hence there are no more than cn items in the adjacency list while as the adjacent matrix for the same graph will consist of n 2 entries which is bigger than cn, thereby using more memory space. Keeping all the above points in mind, considerable effort needs to be done towards analyzing Adjacency lists and Incidence matrices as the possible candidates of data structures for representing graphs of social networks. 3. PROPERTIES OF SOCIAL NETWORKS There are some properties of social networks that are very important. Thefirst three properties are deliberated from [5]. 3.1 Diameter There are short chains of friends that connect a large fraction of pairs of people in a social network. It is wellknown that most real-world graphs exhibit a relatively small diameter. A graph has diameter D if every pair of nodes can be connected by a path of length of at most D edges. Unfortunately, the literature on social networks tends to use the word diameter ambiguously in reference to at least four different quantities: 1) The longest shortest-path length, which is the true graph theoretic diameter but which is infinite in disconnected networks. 2) The longest shortest-path length between connected nodes, which is always finite but which cannot distinguish the complete graph from a graph with a solitary edge. 3) The average shortest-path length, and 4) The average shortest-path length between connected nodes. 3.2 Navigability Social networks exhibit small-world phenomenon based on small-world model defined by Watts, Dodds, and Newman [6] in which navigation is also possible. Their model is based upon multiple hierarchies (geography, occupation, hobbies, etc.) into which people fall, and a greedy algorithm that attempts to get closer to the target in any dimension at every step.simulations have shown this algorithm and model to allow navigation, but no theoretical results have been established. Social Networks are navigable small worlds: not only do there exist short paths connecting most pairs of people, but using only local information and some knowledge of global structure, the people in the network are able to construct short paths to the target. 3.3 Clustering Coefficient Informally, the clustering coefficient of a network is a measure of the probability that two people who have a common friend will themselves be friends. The clustering coefficient for a node u V of a graph is the fraction of edges that exist within the neighbourhood of u, i.e., between two nodes adjacent to u. The clustering coefficient of the entire network is the average clustering coefficient taken over all nodes in the graph. Volume 4, Issue 2, March April 2015 Page 122
3 3.4 Centrality and Power All sociologists would agree that power is a fundamental property of social structures. There is much less agreement about what power is, and how we can describe and analyse its causes and consequences. Below we summarise some of the main approaches that social network analysis has developed to study power, and the closely related concept of centrality. A) Degree: Degree is the number of ties for an actor, where a tie connects two or more nodes in a graph. Ties can be direct or indirect. Many human behaviours such as advice seeking, information sharing are direct ties while comemberships are examples of undirected ties [7]. B) Closeness: The degree an individual is near all other individuals in a network (directly or indirectly). It reflects the ability to access information through the grapevine of network members. Thus closeness is the inverse of the sum of the shortest distances between each individual and every other person in the network. C) Betweenness: The extent to which a node lies between other nodes in a network. This measure takes into account the connectivity of the node s neighbours, giving a higher value for nodes which bridge clusters. D) Density: It is the measure of the closeness of a network. Given a number of nodes, the more links between them, the larger the density. If the number of nodes in a network is n, and the number of links is l, then its density is given by: for directed graphs and for undirected graphs 4. WEB MINING As a large and dynamic information source that is structurally complex and ever growing, social networks are a fertile ground for data mining principles, or Web mining. The Web mining field encompasses a wide array of issues, primarily aimed at deriving actionable knowledge from the Web, and includes researchers frominformation retrieval,database technologies, and artificial intelligence [8]. Since Oren Etzioni [9], among others, formally introduced the term, authors have used Web mining to mean slightly different things. For example, Jaideep Srivastava and colleagues [10] define it as: The application of data-mining techniques to extract knowledge from Web data, in which at least one of structure or usage (Web log) data is used in the mining process (with or without other types of Web data). Web Mining consists of three main categories according to the web data used as input in Web Data Mining. Web Content Mining;Web Structure Mining andweb Usage Mining 4.1 Web Content Mining Web content mining is the application of data mining techniques to content published on the Internet, usually as HTML (semi structured), plaintext (unstructured), or XML (structured) documents. In other words it may be defined as the procedure of retrieving the information from the web into more structured forms and indexing the information to retrieve it quickly [11]. Table 1 summarizes the concept of web content mining. Table 1 Web Content Mining 4.2 Web Structure Mining Web structure mining operates on the Web s hyperlink structure. This graph structure can provide information about a page s ranking [12] or authoritativeness [13] and enhance search results through filtering. In other words Web structure mining may be defined as the process by which we discover the model of link structure of the web pages. Its aim is to generate structured abstract about the website and web page. Table 2summarises web structure mining. Table 2 WEB STRUCTURE MINING Volume 4, Issue 2, March April 2015 Page 123
4 4.3 Web Usage Mining Web usage mining analyses results of user interactions with a Web server, including Web logs, clickstreams, and database transactions at a Web site or a group of related sites. It is used to identify the browsing patterns by analysing the navigational behaviour of user. Web usage mining tries to make sense of the data generated by the Web surfer's sessions or behaviours whereas Web-content mining and Web-structure mining utilize real or primary data on the Web. Web usage mining introduces privacy concerns and is currently the topic of extensive debate. Table 3 WEB USAGE MINING 5. WEB MINING TECHNIQUES FOR SOCIAL NETWORK ANALYSIS 5.1 Association Rules Association rules are if/then statements that help uncover relationships between seemingly unrelated data in a transactional database, relational database or other information repository. Association rule mining was first introduced in [14]. Association rule mining represents a data mining technique, the goal of which is to find interesting relationships among a large set of data items. It targets to extract fascinating correlations, recurrent patterns, associations or unpremeditated structures among sets of items in the transaction databases or other data repository. A survey of Association Rule mining is presented in [15]. The simple marginal and conditional probabilities are insufficient to tell us about causal relationships, more sophisticated techniques are required. Association rule mining is easy to use and implement. The patterns discovered with this data mining technique can be represented in the form of association rules. Rule support and confidence are two measures of rule interestingness. Typically, association rules are considered interesting if they satisfy both a minimum support threshold and a minimum confidence threshold. Such thresholds can be set by users or domain experts. In social network analysis, association rule mining can help discover the hidden relationships between the different nodes of a network 5.2 Classification Classification is to build (automatically) a model that can classify a class of objects so as to predict the classification or missing attribute value of future objects (whose class may not be known) [16]. It is a two-step process. In the first step, based on the collection of training data set, a model is constructed to describe the characteristics of a set of data classes or concepts. Since data classes or concepts are predefined, this step is also known as supervised learning (i.e., which class the training sample belongs to is provided). In the second step, the model is used to predict the classes of future objects or data There are many techniques for classification. Classification by decision tree has been wellresearched and lot of algorithms have been developed. A complete survey of Classification using decision trees is given in [17]. Bayesian classification is another technique that can be found in [18].Nearest neighbour methods are also discussed in many statistical texts on classification, such as [19].Many other machine learning and neural network techniquesare used for constructing the classification models. 5.3 Clustering Clustering is similar to classification but it is an unsupervised learning process while as Classification is a supervised learning process. Clustering is the process of grouping a set of physical or abstract objects into classes of similar objects, so that objects within the same cluster must be similar to some extent, also they should be dissimilar to those objects in other clusters [16]. In classification which record belongs to which class is predefined, while as in clustering there is no predefined classes. In clustering, objects are grouped together based on their similarities. Similarity between objects are defined by similarity functions, usually similarities are quantitatively specified as distance or other measures by corresponding domain experts. A survey of clustering techniques and algorithms can be found in [20]. In social network analysis, discovering the closest people in the network is usually the main mission, and is generally achieved by using a visualization technique in a small social network. Thus clustering may emerge a potential technique for identifying more clusters and groups in large social networks. Besides it can offer more meticulous information than visualisation [21], including the closeness of a group, detailed information of members in a group and the relationship between groups in a social network. 6. CONCLUSION AND FUTURE RESEARCH As an approach to social network research, social network analysis displays four features: structural intuition, systematic relational data, graphic images, and mathematical or computational models [22]. Here we tried to present a holistic approach by considering all of the Volume 4, Issue 2, March April 2015 Page 124
5 above features. This paper studies the formal methods to represent social networks, and the various properties of these networks. In particular representation of social networks using Incidence matrices was considered from mathematical perspective. In order to come to a formal conclusion, lotof research needs to be done by comparingadjacent matrices and lists with incidence matricesusing graphs obtained from online social networks as the input data, which will be among our future work.besides that the computational cost of association rule mining represents a disadvantage and future work will focus on reducing it. Finally we hope that this study should help one to understand social networks and should enrich the studies of applications of web mining techniques on these networks. REFERENCES [1] Alexa, Alexa the Web Information Company, [OnlineAccessed on Dec. 25, 2014]. Available: [2] A. Hanneman and M. Riddle, Introduction to Social Network Methods, [Online].Available: [3] Kenneth H. Rosen, Discrete Mathematics & its Applications with Combinatorics and Graph Theory, TMH, [4] V. Snasel, Z. Horak, J. Kocibova, A. Abraham, Reducing social network dimensions using matrix factorization methods, International Conference on Advances in Social Network Analysis and Mining, pp , IEEE, [5] D. Liben-Nowell, An Algorithmic Approach to Social Networks, Ph. D Thesis, Massachusetts Institute of Technology, June [6] D. J. Watts, P. Sheridan Dodds, and M. E. J. Newman, Identity and Search in Social Networks, Science, 296: , 17 May [7] G. Plickert, R. Cote, B. Wellman, It s Not Who You Know. It s How You Know Them: Who Exchanges What with Whom?, Social Networks, Vol. 29, No. 3, pp , 2007 [8] P. Kolari, A. Joshi, Web Mining: Research and Practice, IEE CS, July- August, 2004 [9] O. Etzioni, The World Wide Web: Quagmire or Gold Mine?,Comm. ACM, vol. 39, no.11, pp , [10] J. Srivastava, P. Desikan, and V. Kumar, Web Mining: Accomplishments and Future Directions, Proc. US National Science Foundation Workshop on Next-Generation Data Mining (NGDM), National Science Foundation, [11] Z. S. Zubi, Ranking Web Pages Using Web Structure Mining Concepts, Recent Advances in Telecommunications, Signals and Systems, [12] L. Page et al., The PageRank Citation Ranking: Bring Order to the Web, Tech. report, Stanford Digital Library Technologies, Jan [13] J. Kleinberg, Authoritative Sources in a Hyperlinked Environment, Proc. 9th Ann. ACM SIAM Symp. Discrete Algorithms, ACM Press, pp [14] R.Agrawal, T. Imielinski, and A.N. Swami, Mining Association Rules between Sets of Items in large Databases, In Proceedings of the ACM SIGMOD International Conference on Management of Data,. Washington, D.C., [15] Q. Zhao, S. Bhowmick, Association Rule Mining: A Survey, No , Technical Report, CAIS, Nanyang Technological University, Singapore, [16] J. Han, M. Kamber, Data Mining Concepts and Techniques, Morgan Kaufmann, [17] S. Murthy, Automatic construction of decision trees from data: A multi-disciplinary Survey, Data Mining and Knowledge Discovery 2, 4, , [18] R. Duda, T. Hart, Pattern Classification and Scene Analysis., Wiley & Sons, Inc., [19] M. James, Classification Algorithms, Wiley & Sons, Inc., [20] P. Berkhin, Survey of clustering data mining techniques Technical Report, Accrue Software, San Jose, CA, [21] B. Tatemura, Y.Wu, Tomographic Clustering to Visualize Blog Communities as Mountain Views, In Proc. Of WWW Conference, Japan, May, [22] F. Borko, Handbook of Social Network Technologies and Applications, Springer Publications, AUTHOR Hilal Ahmad Khanday received his Bachelors and Masters in Computers from University of Kashmir and IUST respectively. He qualified many national examinations conducted by UGC, MHRD and is currently a research scholar and faculty member at the University of Kashmir. Dr. Rana Hashmy is Scientist C at University of Kashmir. She received a Ph.D. (Computer Science) from JawaharlalNehru University, Delhi (INDIA). Her areas of research interest include Software Engineering., Data warehousing, and Data mining.she has many research publications related to her field of interest. Volume 4, Issue 2, March April 2015 Page 125
An Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
More informationBig Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network
, pp.273-284 http://dx.doi.org/10.14257/ijdta.2015.8.5.24 Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network Gengxin Sun 1, Sheng Bin 2 and
More information131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
More informationPractical Graph Mining with R. 5. Link Analysis
Practical Graph Mining with R 5. Link Analysis Outline Link Analysis Concepts Metrics for Analyzing Networks PageRank HITS Link Prediction 2 Link Analysis Concepts Link A relationship between two entities
More informationInternational Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
More informationMy work provides a distinction between the national inputoutput model and three spatial models: regional, interregional y multiregional
Mexico, D. F. 25 y 26 de Julio, 2013 My work provides a distinction between the national inputoutput model and three spatial models: regional, interregional y multiregional Walter Isard (1951). Outline
More informationFUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM
International Journal of Innovative Computing, Information and Control ICIC International c 0 ISSN 34-48 Volume 8, Number 8, August 0 pp. 4 FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT
More informationData Mining Solutions for the Business Environment
Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over
More informationSocial Media Mining. Network Measures
Klout Measures and Metrics 22 Why Do We Need Measures? Who are the central figures (influential individuals) in the network? What interaction patterns are common in friends? Who are the like-minded users
More informationMALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph
MALLET-Privacy Preserving Influencer Mining in Social Media Networks via Hypergraph Janani K 1, Narmatha S 2 Assistant Professor, Department of Computer Science and Engineering, Sri Shakthi Institute of
More informationTechnology, Kolkata, INDIA, pal.sanjaykumar@gmail.com. sssarma2001@yahoo.com
Sanjay Kumar Pal 1 and Samar Sen Sarma 2 1 Department of Computer Science & Applications, NSHM College of Management & Technology, Kolkata, INDIA, pal.sanjaykumar@gmail.com 2 Department of Computer Science
More informationComparison of K-means and Backpropagation Data Mining Algorithms
Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and
More informationComplex Network Visualization based on Voronoi Diagram and Smoothed-particle Hydrodynamics
Complex Network Visualization based on Voronoi Diagram and Smoothed-particle Hydrodynamics Zhao Wenbin 1, Zhao Zhengxu 2 1 School of Instrument Science and Engineering, Southeast University, Nanjing, Jiangsu
More informationSocial Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
More informationViral Marketing in Social Network Using Data Mining
Viral Marketing in Social Network Using Data Mining Shalini Sharma*,Vishal Shrivastava** *M.Tech. Scholar, Arya College of Engg. & I.T, Jaipur (Raj.) **Associate Proffessor(Dept. of CSE), Arya College
More informationMapReduce Approach to Collective Classification for Networks
MapReduce Approach to Collective Classification for Networks Wojciech Indyk 1, Tomasz Kajdanowicz 1, Przemyslaw Kazienko 1, and Slawomir Plamowski 1 Wroclaw University of Technology, Wroclaw, Poland Faculty
More informationDATA ANALYSIS II. Matrix Algorithms
DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where
More informationANALYSIS OF WEBSITE USAGE WITH USER DETAILS USING DATA MINING PATTERN RECOGNITION
ANALYSIS OF WEBSITE USAGE WITH USER DETAILS USING DATA MINING PATTERN RECOGNITION K.Vinodkumar 1, Kathiresan.V 2, Divya.K 3 1 MPhil scholar, RVS College of Arts and Science, Coimbatore, India. 2 HOD, Dr.SNS
More informationRole of Social Networking in Marketing using Data Mining
Role of Social Networking in Marketing using Data Mining Mrs. Saroj Junghare Astt. Professor, Department of Computer Science and Application St. Aloysius College, Jabalpur, Madhya Pradesh, India Abstract:
More informationA SOCIAL NETWORK ANALYSIS APPROACH TO ANALYZE ROAD NETWORKS INTRODUCTION
A SOCIAL NETWORK ANALYSIS APPROACH TO ANALYZE ROAD NETWORKS Kyoungjin Park Alper Yilmaz Photogrammetric and Computer Vision Lab Ohio State University park.764@osu.edu yilmaz.15@osu.edu ABSTRACT Depending
More informationAsking Hard Graph Questions. Paul Burkhardt. February 3, 2014
Beyond Watson: Predictive Analytics and Big Data U.S. National Security Agency Research Directorate - R6 Technical Report February 3, 2014 300 years before Watson there was Euler! The first (Jeopardy!)
More informationDMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support
DMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support Rok Rupnik, Matjaž Kukar, Marko Bajec, Marjan Krisper University of Ljubljana, Faculty of Computer and Information
More informationThe Open University s repository of research publications and other research outputs
Open Research Online The Open University s repository of research publications and other research outputs The degree-diameter problem for circulant graphs of degree 8 and 9 Journal Article How to cite:
More informationGraph Mining and Social Network Analysis
Graph Mining and Social Network Analysis Data Mining and Text Mining (UIC 583 @ Politecnico di Milano) References Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", The Morgan Kaufmann
More informationExample application (1) Telecommunication. Lecture 1: Data Mining Overview and Process. Example application (2) Health
Lecture 1: Data Mining Overview and Process What is data mining? Example applications Definitions Multi disciplinary Techniques Major challenges The data mining process History of data mining Data mining
More informationIE 680 Special Topics in Production Systems: Networks, Routing and Logistics*
IE 680 Special Topics in Production Systems: Networks, Routing and Logistics* Rakesh Nagi Department of Industrial Engineering University at Buffalo (SUNY) *Lecture notes from Network Flows by Ahuja, Magnanti
More informationSPECIAL PERTURBATIONS UNCORRELATED TRACK PROCESSING
AAS 07-228 SPECIAL PERTURBATIONS UNCORRELATED TRACK PROCESSING INTRODUCTION James G. Miller * Two historical uncorrelated track (UCT) processing approaches have been employed using general perturbations
More informationSocial Network Discovery based on Sensitivity Analysis
Social Network Discovery based on Sensitivity Analysis Tarik Crnovrsanin, Carlos D. Correa and Kwan-Liu Ma Department of Computer Science University of California, Davis tecrnovrsanin@ucdavis.edu, {correac,ma}@cs.ucdavis.edu
More informationHow To Use Data Mining For Knowledge Management In Technology Enhanced Learning
Proceedings of the 6th WSEAS International Conference on Applications of Electrical Engineering, Istanbul, Turkey, May 27-29, 2007 115 Data Mining for Knowledge Management in Technology Enhanced Learning
More informationSanjeev Kumar. contribute
RESEARCH ISSUES IN DATAA MINING Sanjeev Kumar I.A.S.R.I., Library Avenue, Pusa, New Delhi-110012 sanjeevk@iasri.res.in 1. Introduction The field of data mining and knowledgee discovery is emerging as a
More informationUSE OF EIGENVALUES AND EIGENVECTORS TO ANALYZE BIPARTIVITY OF NETWORK GRAPHS
USE OF EIGENVALUES AND EIGENVECTORS TO ANALYZE BIPARTIVITY OF NETWORK GRAPHS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA natarajan.meghanathan@jsums.edu ABSTRACT This
More informationWeb Mining using Artificial Ant Colonies : A Survey
Web Mining using Artificial Ant Colonies : A Survey Richa Gupta Department of Computer Science University of Delhi ABSTRACT : Web mining has been very crucial to any organization as it provides useful
More information203.4770: Introduction to Machine Learning Dr. Rita Osadchy
203.4770: Introduction to Machine Learning Dr. Rita Osadchy 1 Outline 1. About the Course 2. What is Machine Learning? 3. Types of problems and Situations 4. ML Example 2 About the course Course Homepage:
More informationUSING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS
USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA natarajan.meghanathan@jsums.edu
More informationSGL: Stata graph library for network analysis
SGL: Stata graph library for network analysis Hirotaka Miura Federal Reserve Bank of San Francisco Stata Conference Chicago 2011 The views presented here are my own and do not necessarily represent the
More informationAn Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
More informationExploring Big Data in Social Networks
Exploring Big Data in Social Networks virgilio@dcc.ufmg.br (meira@dcc.ufmg.br) INWEB National Science and Technology Institute for Web Federal University of Minas Gerais - UFMG May 2013 Some thoughts about
More informationInformation Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli (alberto.ceselli@unimi.it)
More informationData Mining and Analysis of Online Social Networks
Data Mining and Analysis of Online Social Networks R.Sathya 1, A.Aruna devi 2, S.Divya 2 Assistant Professor 1, M.Tech I Year 2, Department of Information Technology, Ganadipathy Tulsi s Jain Engineering
More informationInternational Journal of Advanced Research in Computer Science and Software Engineering
Volume 2, Issue 10, October 2012 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Analysis of
More informationUnderstanding Web personalization with Web Usage Mining and its Application: Recommender System
Understanding Web personalization with Web Usage Mining and its Application: Recommender System Manoj Swami 1, Prof. Manasi Kulkarni 2 1 M.Tech (Computer-NIMS), VJTI, Mumbai. 2 Department of Computer Technology,
More informationSTATISTICA. Clustering Techniques. Case Study: Defining Clusters of Shopping Center Patrons. and
Clustering Techniques and STATISTICA Case Study: Defining Clusters of Shopping Center Patrons STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table
More informationMobile Phone APP Software Browsing Behavior using Clustering Analysis
Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7 9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis
More informationMachine Learning using MapReduce
Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous
More informationAPPLICATION OF DATA MINING TECHNIQUES FOR BUILDING SIMULATION PERFORMANCE PREDICTION ANALYSIS. email paul@esru.strath.ac.uk
Eighth International IBPSA Conference Eindhoven, Netherlands August -4, 2003 APPLICATION OF DATA MINING TECHNIQUES FOR BUILDING SIMULATION PERFORMANCE PREDICTION Christoph Morbitzer, Paul Strachan 2 and
More informationSpecific Usage of Visual Data Analysis Techniques
Specific Usage of Visual Data Analysis Techniques Snezana Savoska 1 and Suzana Loskovska 2 1 Faculty of Administration and Management of Information systems, Partizanska bb, 7000, Bitola, Republic of Macedonia
More informationZachary Monaco Georgia College Olympic Coloring: Go For The Gold
Zachary Monaco Georgia College Olympic Coloring: Go For The Gold Coloring the vertices or edges of a graph leads to a variety of interesting applications in graph theory These applications include various
More informationWeb Mining Seminar CSE 450. Spring 2008 MWF 11:10 12:00pm Maginnes 113
CSE 450 Web Mining Seminar Spring 2008 MWF 11:10 12:00pm Maginnes 113 Instructor: Dr. Brian D. Davison Dept. of Computer Science & Engineering Lehigh University davison@cse.lehigh.edu http://www.cse.lehigh.edu/~brian/course/webmining/
More informationFREQUENT PATTERN MINING FOR EFFICIENT LIBRARY MANAGEMENT
FREQUENT PATTERN MINING FOR EFFICIENT LIBRARY MANAGEMENT ANURADHA.T Assoc.prof, atadiparty@yahoo.co.in SRI SAI KRISHNA.A saikrishna.gjc@gmail.com SATYATEJ.K satyatej.koganti@gmail.com NAGA ANIL KUMAR.G
More informationA New Marketing Channel Management Strategy Based on Frequent Subtree Mining
A New Marketing Channel Management Strategy Based on Frequent Subtree Mining Daoping Wang Peng Gao School of Economics and Management University of Science and Technology Beijing ABSTRACT For most manufacturers,
More informationSentiment Analysis on Big Data
SPAN White Paper!? Sentiment Analysis on Big Data Machine Learning Approach Several sources on the web provide deep insight about people s opinions on the products and services of various companies. Social
More informationUsing Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
More informationTOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam
More informationAn Empirical Study of Application of Data Mining Techniques in Library System
An Empirical Study of Application of Data Mining Techniques in Library System Veepu Uppal Department of Computer Science and Engineering, Manav Rachna College of Engineering, Faridabad, India Gunjan Chindwani
More informationPrediction of Heart Disease Using Naïve Bayes Algorithm
Prediction of Heart Disease Using Naïve Bayes Algorithm R.Karthiyayini 1, S.Chithaara 2 Assistant Professor, Department of computer Applications, Anna University, BIT campus, Tiruchirapalli, Tamilnadu,
More informationSocial Media Mining. Graph Essentials
Graph Essentials Graph Basics Measures Graph and Essentials Metrics 2 2 Nodes and Edges A network is a graph nodes, actors, or vertices (plural of vertex) Connections, edges or ties Edge Node Measures
More informationApplication of Markov chain analysis to trend prediction of stock indices Milan Svoboda 1, Ladislav Lukáš 2
Proceedings of 3th International Conference Mathematical Methods in Economics 1 Introduction Application of Markov chain analysis to trend prediction of stock indices Milan Svoboda 1, Ladislav Lukáš 2
More informationA Survey on Web Mining From Web Server Log
A Survey on Web Mining From Web Server Log Ripal Patel 1, Mr. Krunal Panchal 2, Mr. Dushyantsinh Rathod 3 1 M.E., 2,3 Assistant Professor, 1,2,3 computer Engineering Department, 1,2 L J Institute of Engineering
More informationMethodology for Emulating Self Organizing Maps for Visualization of Large Datasets
Methodology for Emulating Self Organizing Maps for Visualization of Large Datasets Macario O. Cordel II and Arnulfo P. Azcarraga College of Computer Studies *Corresponding Author: macario.cordel@dlsu.edu.ph
More informationBig Data: Rethinking Text Visualization
Big Data: Rethinking Text Visualization Dr. Anton Heijs anton.heijs@treparel.com Treparel April 8, 2013 Abstract In this white paper we discuss text visualization approaches and how these are important
More information2015 Workshops for Professors
SAS Education Grow with us Offered by the SAS Global Academic Program Supporting teaching, learning and research in higher education 2015 Workshops for Professors 1 Workshops for Professors As the market
More informationStatistical Models in Data Mining
Statistical Models in Data Mining Sargur N. Srihari University at Buffalo The State University of New York Department of Computer Science and Engineering Department of Biostatistics 1 Srihari Flood of
More informationSelf Organizing Maps for Visualization of Categories
Self Organizing Maps for Visualization of Categories Julian Szymański 1 and Włodzisław Duch 2,3 1 Department of Computer Systems Architecture, Gdańsk University of Technology, Poland, julian.szymanski@eti.pg.gda.pl
More informationSingle Level Drill Down Interactive Visualization Technique for Descriptive Data Mining Results
, pp.33-40 http://dx.doi.org/10.14257/ijgdc.2014.7.4.04 Single Level Drill Down Interactive Visualization Technique for Descriptive Data Mining Results Muzammil Khan, Fida Hussain and Imran Khan Department
More informationTop Top 10 Algorithms in Data Mining
ICDM 06 Panel on Top Top 10 Algorithms in Data Mining 1. The 3-step identification process 2. The 18 identified candidates 3. Algorithm presentations 4. Top 10 algorithms: summary 5. Open discussions ICDM
More informationHISTORICAL DEVELOPMENTS AND THEORETICAL APPROACHES IN SOCIOLOGY Vol. I - Social Network Analysis - Wouter de Nooy
SOCIAL NETWORK ANALYSIS University of Amsterdam, Netherlands Keywords: Social networks, structuralism, cohesion, brokerage, stratification, network analysis, methods, graph theory, statistical models Contents
More informationNew Matrix Approach to Improve Apriori Algorithm
New Matrix Approach to Improve Apriori Algorithm A. Rehab H. Alwa, B. Anasuya V Patil Associate Prof., IT Faculty, Majan College-University College Muscat, Oman, rehab.alwan@majancolleg.edu.om Associate
More informationVisualizing e-government Portal and Its Performance in WEBVS
Visualizing e-government Portal and Its Performance in WEBVS Ho Si Meng, Simon Fong Department of Computer and Information Science University of Macau, Macau SAR ccfong@umac.mo Abstract An e-government
More informationA STUDY REGARDING INTER DOMAIN LINKED DOCUMENTS SIMILARITY AND THEIR CONSEQUENT BOUNCE RATE
STUDIA UNIV. BABEŞ BOLYAI, INFORMATICA, Volume LIX, Number 1, 2014 A STUDY REGARDING INTER DOMAIN LINKED DOCUMENTS SIMILARITY AND THEIR CONSEQUENT BOUNCE RATE DIANA HALIŢĂ AND DARIUS BUFNEA Abstract. Then
More informationVisualization methods for patent data
Visualization methods for patent data Treparel 2013 Dr. Anton Heijs (CTO & Founder) Delft, The Netherlands Introduction Treparel can provide advanced visualizations for patent data. This document describes
More informationA STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS
A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM S Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant
More informationA New Approach for Evaluation of Data Mining Techniques
181 A New Approach for Evaluation of Data Mining s Moawia Elfaki Yahia 1, Murtada El-mukashfi El-taher 2 1 College of Computer Science and IT King Faisal University Saudi Arabia, Alhasa 31982 2 Faculty
More informationHow To Find Influence Between Two Concepts In A Network
2014 UKSim-AMSS 16th International Conference on Computer Modelling and Simulation Influence Discovery in Semantic Networks: An Initial Approach Marcello Trovati and Ovidiu Bagdasar School of Computing
More informationCOURSE RECOMMENDER SYSTEM IN E-LEARNING
International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand
More informationSTATISTICA. Financial Institutions. Case Study: Credit Scoring. and
Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT
More informationChapter 29 Scale-Free Network Topologies with Clustering Similar to Online Social Networks
Chapter 29 Scale-Free Network Topologies with Clustering Similar to Online Social Networks Imre Varga Abstract In this paper I propose a novel method to model real online social networks where the growing
More informationTutorial, IEEE SERVICE 2014 Anchorage, Alaska
Tutorial, IEEE SERVICE 2014 Anchorage, Alaska Big Data Science: Fundamental, Techniques, and Challenges (Data Mining on Big Data) 2014. 6. 27. By Neil Y. Yen Presented by Incheon Paik University of Aizu
More informationDATA MINING CONCEPTS AND TECHNIQUES. Marek Maurizio E-commerce, winter 2011
DATA MINING CONCEPTS AND TECHNIQUES Marek Maurizio E-commerce, winter 2011 INTRODUCTION Overview of data mining Emphasis is placed on basic data mining concepts Techniques for uncovering interesting data
More informationArtificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing E-mail Classifier
International Journal of Recent Technology and Engineering (IJRTE) ISSN: 2277-3878, Volume-1, Issue-6, January 2013 Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing
More informationPart 2: Community Detection
Chapter 8: Graph Data Part 2: Community Detection Based on Leskovec, Rajaraman, Ullman 2014: Mining of Massive Datasets Big Data Management and Analytics Outline Community Detection - Social networks -
More informationAN EFFICIENT APPROACH TO PERFORM PRE-PROCESSING
AN EFFIIENT APPROAH TO PERFORM PRE-PROESSING S. Prince Mary Research Scholar, Sathyabama University, hennai- 119 princemary26@gmail.com E. Baburaj Department of omputer Science & Engineering, Sun Engineering
More informationHow To Use Neural Networks In Data Mining
International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and
More informationHigh-dimensional labeled data analysis with Gabriel graphs
High-dimensional labeled data analysis with Gabriel graphs Michaël Aupetit CEA - DAM Département Analyse Surveillance Environnement BP 12-91680 - Bruyères-Le-Châtel, France Abstract. We propose the use
More informationA Survey on Outlier Detection Techniques for Credit Card Fraud Detection
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661, p- ISSN: 2278-8727Volume 16, Issue 2, Ver. VI (Mar-Apr. 2014), PP 44-48 A Survey on Outlier Detection Techniques for Credit Card Fraud
More informationData quality in Accounting Information Systems
Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania
More informationAssociation rules for improving website effectiveness: case analysis
Association rules for improving website effectiveness: case analysis Maja Dimitrijević, The Higher Technical School of Professional Studies, Novi Sad, Serbia, dimitrijevic@vtsns.edu.rs Tanja Krunić, The
More informationAn Introduction to APGL
An Introduction to APGL Charanpal Dhanjal February 2012 Abstract Another Python Graph Library (APGL) is a graph library written using pure Python, NumPy and SciPy. Users new to the library can gain an
More informationRANKING WEB PAGES RELEVANT TO SEARCH KEYWORDS
ISBN: 978-972-8924-93-5 2009 IADIS RANKING WEB PAGES RELEVANT TO SEARCH KEYWORDS Ben Choi & Sumit Tyagi Computer Science, Louisiana Tech University, USA ABSTRACT In this paper we propose new methods for
More informationData mining in the e-learning domain
Data mining in the e-learning domain The author is Education Liaison Officer for e-learning, Knowsley Council and University of Liverpool, Wigan, UK. Keywords Higher education, Classification, Data encapsulation,
More informationA RESEARCH STUDY ON DATA MINING TECHNIQUES AND ALGORTHMS
A RESEARCH STUDY ON DATA MINING TECHNIQUES AND ALGORTHMS Nitin Trivedi, Research Scholar, Manav Bharti University, Solan HP ABSTRACT The purpose of this study is not to delve deeply into the technical
More informationASSOCIATION RULE MINING ON WEB LOGS FOR EXTRACTING INTERESTING PATTERNS THROUGH WEKA TOOL
International Journal Of Advanced Technology In Engineering And Science Www.Ijates.Com Volume No 03, Special Issue No. 01, February 2015 ISSN (Online): 2348 7550 ASSOCIATION RULE MINING ON WEB LOGS FOR
More informationSOCIAL NETWORK ANALYSIS EVALUATING THE CUSTOMER S INFLUENCE FACTOR OVER BUSINESS EVENTS
SOCIAL NETWORK ANALYSIS EVALUATING THE CUSTOMER S INFLUENCE FACTOR OVER BUSINESS EVENTS Carlos Andre Reis Pinheiro 1 and Markus Helfert 2 1 School of Computing, Dublin City University, Dublin, Ireland
More informationA SURVEY ON WEB MINING TOOLS
IMPACT: International Journal of Research in Engineering & Technology (IMPACT: IJRET) ISSN(E): 2321-8843; ISSN(P): 2347-4599 Vol. 3, Issue 10, Oct 2015, 27-34 Impact Journals A SURVEY ON WEB MINING TOOLS
More informationMining for Web Engineering
Mining for Engineering A. Venkata Krishna Prasad 1, Prof. S.Ramakrishna 2 1 Associate Professor, Department of Computer Science, MIPGS, Hyderabad 2 Professor, Department of Computer Science, Sri Venkateswara
More informationSociology and CS. Small World. Sociology Problems. Degree of Separation. Milgram s Experiment. How close are people connected? (Problem Understanding)
Sociology Problems Sociology and CS Problem 1 How close are people connected? Small World Philip Chan Problem 2 Connector How close are people connected? (Problem Understanding) Small World Are people
More informationHow To Solve The Kd Cup 2010 Challenge
A Lightweight Solution to the Educational Data Mining Challenge Kun Liu Yan Xing Faculty of Automation Guangdong University of Technology Guangzhou, 510090, China catch0327@yahoo.com yanxing@gdut.edu.cn
More information15.062 Data Mining: Algorithms and Applications Matrix Math Review
.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop
More informationQuality Control of National Genetic Evaluation Results Using Data-Mining Techniques; A Progress Report
Quality Control of National Genetic Evaluation Results Using Data-Mining Techniques; A Progress Report G. Banos 1, P.A. Mitkas 2, Z. Abas 3, A.L. Symeonidis 2, G. Milis 2 and U. Emanuelson 4 1 Faculty
More informationNetwork-Based Tools for the Visualization and Analysis of Domain Models
Network-Based Tools for the Visualization and Analysis of Domain Models Paper presented as the annual meeting of the American Educational Research Association, Philadelphia, PA Hua Wei April 2014 Visualizing
More informationMining Social-Network Graphs
342 Chapter 10 Mining Social-Network Graphs There is much information to be gained by analyzing the large-scale data that is derived from social networks. The best-known example of a social network is
More information