Data Mining and Analysis of Online Social Networks
- Susan Pitts
- 10 years ago
R. Sathya 1, A. Aruna Devi 2, S. Divya 2
Assistant Professor 1, M.Tech I Year 2, Department of Information Technology, Ganadipathy Tulsi's Jain Engineering College, Kaniyambadi, Vellore

ABSTRACT- Social media and, in particular, Online Social Networks (OSNs) have acquired huge popularity and represent one of the most important social and Computer Science phenomena of recent years. Social networks allow users to collaborate with others. People of similar backgrounds and interests meet and cooperate using these social networks, which enables them to share information across the world. Social networks contain millions of items of unprocessed raw data. By analyzing this data, new knowledge can be gained. Since this data is dynamic and unstructured, traditional data mining techniques are not appropriate. Web data mining is an interesting field with a vast number of applications. The growth of online social networks has significantly increased the data content available, because profile holders have become more active producers and distributors of such data. This paper identifies and analyzes existing web mining techniques used to mine social network data. This dissertation presents a comprehensive study of the process of mining information from Online Social Networks and analyzing the structure of the networks themselves. To this purpose, several methods are adopted, ranging from Web Mining techniques, to graph-theoretical models and finally statistical analysis of network features, from a quantitative and qualitative perspective. The origin, distribution and sheer size of the data involved make each of them either moot or inapplicable at the required scale.
KEYWORDS: Social Networks, Web Data Mining, Data mining techniques, Social Network Analysis

INTRODUCTION

The content of the present dissertation can be organized in three main parts:

(i) The first part of this research work explains the problem of mining Web sources from an algorithmic perspective. Different techniques, largely adopted in Web data extraction tasks, are discussed, and a novel approach to refine the automatic extraction of information from Web pages is presented. This approach is the core of a platform for sampling data from OSNs.

(ii) The second part of this work discusses the analysis of a large dataset acquired from the most representative (and largest) OSN to date. This platform gathers hundreds of millions of users, and its modeling and analysis is possible by means of Social Network Analysis techniques. The investigation of topological features is extended to other OSN datasets available online to the scientific community. Several features of these networks, such as the well-known small world effect, scale-free distributions and community structure, are characterized and analyzed. At the same time, our analysis provides quantitative clues to verify the validity of different sociological theories on large-scale social networks (for example, the six degrees of separation or the strength of weak ties). In particular, the problem of community detection on massive OSNs is redefined and solved by means of a new algorithm. This result highlights the need to define computationally efficient, even if heuristic, measures to assess the importance of individuals in the network.

(iii) The last part of the work is devoted to presenting a novel, efficient measure of centrality for social networks, whose rationale is grounded in random walk theory.
Its validity is assessed against massive OSN datasets; it becomes the basis for a novel community detection algorithm which is shown to work surprisingly well in different contexts, such as social and biological network analysis.

2. Data Mining

2.1. Overview of data mining

Data mining is the process of discovering patterns in large data sets, involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems [5]. Data warehouses are used to store large amounts of data. The growth of commercial databases has had a huge impact on the need for data mining in organizations. Data mining allows organizations to respond proactively to problems that may arise in the future by forecasting specific occurrences [6].

As illustrated in Figure 1, data mining proceeds in three steps, beginning with preprocessing. The first step is data preparation: data is selected and processed under the guidance of a domain expert. Second, a data mining algorithm is used to process the prepared data. The third step is to analyze whether important facts were generated by the data mining algorithm [5].

Integrated Intelligent Research (IIR) 528
Figure 1: Data Mining Steps

Data mining is becoming increasingly common in both the private and public sectors. Industries such as banking, insurance, medicine, and retailing commonly use data mining to reduce costs, enhance research, and increase sales. In the public sector, data mining applications were initially used as a means to detect fraud and waste, but have grown to be used for purposes such as measuring and improving program performance [5].

Data Mining Algorithms

Data mining techniques are divided into two approaches. The direct approach is used in prediction, where it tries to predict the state of a new value by looking at known values. The second, non-direct approach is used to identify new patterns by looking at past values. Before mining models are created, the data should be cleaned and prepared. Mining models can be created with the following algorithms [8].

Association Rules - This algorithm can be used in market basket analysis, for example to identify cross-selling opportunities. It takes multiple items in a single transaction, scans the data, and counts the number of times the items appear together in transactions, so it can be used to identify relationships in large data sets.

Clustering - This algorithm groups data according to similar characteristics. It can be used to identify the relationships among the characteristics of a group. When new data is introduced, its characteristics can be mapped to these relationships to predict the behavior of the new data. Clustering can also be used to find anomalies in the data. It is commonly used in fraud detection systems and Customer Relationship Management.

Decision Trees - This is a simple algorithm and one of the most commonly used. It is used to predict discrete and continuous variables.

Linear Regression - This predicts only continuous variables, using a single multiple linear regression formula.
Logistic Regression - This algorithm uses a neural network without hidden layers.

Naïve Bayes - This can be used to calculate probabilities for each possible state of the input attribute when a state of a predictive attribute is given. It can be used as the starting algorithm of the prediction process.

Neural Networks - This algorithm has been adopted from artificial intelligence. It can be used to search for nonlinear functional dependencies. It performs nonlinear transformations on the data in layers, from the input layer through the hidden layers and finally to the output layer.

Sequence Clustering - This looks for clusters based on a model rather than on the similarity of the data. The model uses sequences of events by means of hidden Markov chains. The states are modeled as a matrix, with the probabilities of transitioning from one state to another in the cells of the matrix. With these probabilities, the probability of a sequence of transitions can be calculated by multiplying the probabilities of the state transitions in the sequence. The chains of highest probability can be used to model the clusters.

Time Series - This is used to forecast continuous variables. It is a combination of two algorithms, autoregression trees and autoregressive integrated moving average.

3. Web Data mining

3.1. Overview of Web Data Mining

Web mining is the process of analyzing and discovering patterns in web data. It can be defined as searching data automatically from various online resources [9]. Because of the heterogeneous nature of data on the web, mining it is very hard [10]. Applying the data mining algorithms mentioned above directly is not feasible. The web data mining architecture is composed of the following elements (see Figure 2): i) a server running the mining agent(s); ii) a cross-platform Java application, which implements the logic of the agent; and iii) an Apache interface, which manages the information transfer through the Web.
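Returning briefly to the Sequence Clustering description in Section 2, the multiplication of transition probabilities along a sequence can be sketched in a few lines of Python. This is only an illustrative sketch: the state names and the probabilities in `TRANSITIONS` are invented for the example and do not come from any dataset discussed in this paper.

```python
from math import prod

# Hypothetical transition matrix (illustrative values only):
# TRANSITIONS[a][b] is the probability of moving from state a to state b.
TRANSITIONS = {
    "home":    {"home": 0.1, "profile": 0.6,  "photos": 0.3},
    "profile": {"home": 0.2, "profile": 0.3,  "photos": 0.5},
    "photos":  {"home": 0.5, "profile": 0.25, "photos": 0.25},
}


def sequence_probability(states, transitions):
    """Multiply the transition probabilities of consecutive state pairs.

    A sequence with a single state has no transitions, so the empty
    product (1) is returned.
    """
    return prod(transitions[a][b] for a, b in zip(states, states[1:]))


# Probability of the chain home -> profile -> photos.
p = sequence_probability(["home", "profile", "photos"], TRANSITIONS)
print(p)
```

Chains can then be ranked by this value, which is the sense in which the chains of highest probability are used to model the clusters.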
Special methodologies are needed to structure web data and to mine it. Web information retrieval tools such as web crawlers do not perform web mining in the strict sense: they extract only text, and they do not extract information or knowledge from web data. Web mining can be categorized into three aspects: web usage mining, web content mining and web structure mining [11].

Web usage mining identifies user browsing patterns on web sites by recording the URLs visited or by accessing web server logs [12].
Figure 2: Architecture of the data mining platform

Web usage mining can also be used to identify users, figure out patterns in session creation, detect robots, perform filtering, and derive the sites that a user visits together.

Web structure mining identifies the structure of a particular web site. It focuses on hyperlink information. By analyzing the structure, interesting connectivity information can be recognized [9]. This is ideal when applying web data mining to social network sites. Starting with one user's profile page, the friends' network or friend circles can be identified. This data can be very useful for clustering the profile page data and identifying relationships and interesting details about connectivity.

Web content mining is the mining of the content of web pages. There are two approaches to web content mining, as described in [11]. The first is agent-based web mining systems, which have three variations: intelligent search agents, information filtering/categorization, and personalized web agents. The second is the database approach, with multilevel databases or web query systems. Content mining does not simply mean searching for keywords on web pages; it means extracting information and discovering patterns by analyzing web documents. This leads to the discovery of new knowledge. It is more difficult than mining data in data warehouses, since web data is semi-structured [10].

4. Social network data mining Algorithms

4.1. Overview of the Algorithms

Mining social network data should combine web structure mining and web content mining [1]. Analyzing the structure of a social network is known as Social Network Analysis. Social Network Analysis has been a hot topic among researchers since 1994, especially in the fields of psychology, anthropology, economics, geography, biology and epidemiology.
Several tools have been introduced in the area of social network analysis, such as Graph Characterization Toolkit, TweetHood, Meerkat, NetDriller, HiTS/ISAC Social Network Analysis Tool, D-Dupe, and X-RIME, a cloud-based library for large-scale social network analysis. Web content mining has been more popular in marketing and advertising research [9, 10].

Analysis of the Algorithms

In social network data mining, existing data mining algorithms cannot be used directly because of the dynamic behavior of the data [16, 17]. An analysis of the literature on social network data mining techniques shows that each algorithm has strengths and weaknesses. The following sections explain the existing algorithms in detail.

Graph mining algorithms

The most popular data mining technique in Social Network Analysis is the use of graph mining algorithms. The World Wide Web, including social networks, is a collection of hypertext documents interconnected by hyperlinks. The web can therefore be considered a directed graph, where the nodes are the hypertext documents and the edges are the hyperlinks. Web structure analysis based on graph algorithms has been the subject of much research in past years.

Lahiri and Berger-Wolf (2008) created and tested methods combining network, quantitative, semantic, data processing, conversion and visualization-based components. They introduced a new graph mining algorithm, periodic subgraph mining, that is, the discovery of all interaction patterns that occur at regular time intervals, taking into consideration the dynamic behavior of social networks. The algorithm combines frequent pattern mining in transactional and graph databases with periodic pattern mining in unidimensional and multidimensional sequences.

Bourqui et al. (2009) presented a framework based on dynamic graph discretization and graph clustering.
This framework is capable of detecting dynamic changes in the social network structure, identifies events by analyzing the temporal dimension, and exposes command hierarchies in social networks. These algorithms treat the network as a graph, but they minimize the clustering and graph partitioning problems. As a solution, minimum spanning trees can be used to identify users having similar profile pages and strong relationships.

Zhang et al. (2010) conducted an experiment on the applicability of general greedy, hill-climbing and centrality-based algorithms to dynamic social network data, in order to identify key users for target marketing by mapping the network to a graph. They proposed a new approximation searching algorithm based on heuristic information from the above algorithms.
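To make the idea of mapping a network to a directed graph and ranking key users by centrality more concrete, the following Python sketch computes normalized in-degree centrality over an edge list. It is a deliberately simple stand-in and not the approximation algorithm of Zhang et al.; the function names `in_degree_centrality` and `top_k_users`, the user names, and the edges are all hypothetical.

```python
from collections import defaultdict


def in_degree_centrality(edges):
    """Return {node: in-degree / (n - 1)} for a directed edge list.

    Each edge (src, dst) is read as "src links to dst", for example a
    profile page linking to a friend's profile page.
    """
    nodes = set()
    in_degree = defaultdict(int)
    for src, dst in edges:
        nodes.update((src, dst))
        in_degree[dst] += 1
    n = len(nodes)
    if n < 2:
        return {node: 0.0 for node in nodes}
    return {node: in_degree[node] / (n - 1) for node in nodes}


def top_k_users(edges, k):
    """Return the k nodes with the highest in-degree centrality.

    Ties are broken alphabetically so the result is deterministic.
    """
    scores = in_degree_centrality(edges)
    return sorted(scores, key=lambda user: (-scores[user], user))[:k]


# Hypothetical toy network of four users.
EDGES = [
    ("ann", "bob"), ("cat", "bob"), ("dan", "bob"),
    ("bob", "cat"), ("dan", "cat"), ("ann", "dan"),
]

print(top_k_users(EDGES, 2))
```

In-degree is only one of several centrality measures; it is used here because it needs a single pass over the edges, which matters at the scale of the networks discussed above.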
Even though a graph maps the connections or relationships between nodes, it does not show the strength of those relationships. One interesting tool, called SocialViz, has been developed to provide frequency information on social relationships among multiple entities in a network by using a Frequent Pattern Visualization Approach.

Classification

Classification is the method of assigning data to one of many categories. It can be applied in web data mining to classify user profiles based on profile characteristics. The most popular classification algorithms in data mining are decision trees, the naïve Bayesian classifier and neural networks. Surma and Furmanek (2010) introduced an interesting algorithm called C&RT, combining classification and regression tree algorithms, to determine rules that identify target groups for marketing. It can be used on real social network data.

Clustering

Clustering is grouping a set of items in such a way that items in the same group are more similar to each other than to those in other groups. These groups are known as clusters. Clustering is mainly used for information retrieval in web mining. Past research indicates that clustering increases the efficiency of information retrieval. Graph-based clustering is commonly used in web structure mining, as explained in the earlier section. Text-based clustering is most commonly used in web content mining, where clusters are created based on the content of the web documents. Bartal et al. (2009) introduced an interesting method combining social network analysis and text-based clustering to predict which nodes of a social network would be linked next.

Associations

Association rule mining is used to find frequent patterns and correlations in a data set. Nancy et al. (2013) used association rules to mine social network data from 100 Facebook university pages.
The research focused on the formulation of association rules with which decisions can be made, and used the Apriori algorithm to derive the association rules.

Semantic Web and Ontology

The Semantic Web is a new research area that aims to give meaning to Web data. This enables machines and humans to interact intelligently and exchange information. Much research has been carried out in this field, such as the use of semantic geo-catalogues and the recovery of mental health information. Zhou et al. (2008) explain the application of statistical learning methods to semantic web data. They used an extended FOAF (friend-of-a-friend) ontology as a mediation schema to integrate social networks, and a hybrid entity reconciliation method to resolve entities from different data sources. Tushar et al. (2008) explain the use of Semantic Web technology to detect associations between multiple domains in a social network. Opuszko and Ruhland (2012) introduced a novel approach that uses a semantic similarity measure based on pre-defined ontologies to classify social network data. Ostrowski (2012) developed an algorithm to retrieve information in social networks in order to identify trends. The algorithm uses semantics to determine the relevancy of networks using unstructured data. The algorithm was tested on Twitter messages.

Markov models

A Markov chain is a mathematical system that undergoes transitions from one state to another, among a finite or countable number of possible states. It is a random process in which the next state depends only on the current state and not on the sequence of events that preceded it. Markov models can be used in web mining to predict a user's next action. The social network can be mapped to a graph whose nodes are the user's previous visits. Based on the node information, a Markov model can then be used to predict the user's next visit.
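The next-visit prediction described above can be illustrated with a first-order Markov model estimated from browsing sessions. The Python sketch below is a minimal example under stated assumptions: the session data, the page names, and the helper names `fit_transitions` and `predict_next` are hypothetical, and transition probabilities are estimated simply as relative frequencies of observed page-to-page moves.

```python
from collections import Counter, defaultdict


def fit_transitions(sessions):
    """Count observed page-to-page moves across all sessions.

    Returns a mapping {current_page: Counter({next_page: count})}.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            counts[current][nxt] += 1
    return counts


def predict_next(counts, current):
    """Return (most likely next page, estimated probability), or None.

    Only the current page is used, which is the first-order Markov
    assumption. When counts tie, the page seen first wins.
    """
    followers = counts.get(current)
    if not followers:
        return None
    page, hits = followers.most_common(1)[0]
    return page, hits / sum(followers.values())


# Hypothetical browsing sessions on a social network site.
SESSIONS = [
    ["home", "profile", "photos"],
    ["home", "profile", "messages"],
    ["home", "profile", "photos"],
    ["photos", "home"],
]

counts = fit_transitions(SESSIONS)
print(predict_next(counts, "profile"))
```

A page that never appears as a starting point of a move, such as "messages" in this toy data, yields no prediction, which is one reason practical systems often combine such models with a fallback.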
The literature shows that most research has been carried out on web structure mining and less on web content mining. Most studies are tested on static networks and do not consider dynamic behavior.

5. Conclusion

Data mining is an interesting field which can be used to produce new knowledge by analyzing large collections of data. In order to apply traditional data mining techniques, the data should be stored in data warehouses in a structured manner. Web 2.0 has led the web to store massive collections of data. Even though interesting patterns can be identified by mining web data, applying traditional data mining techniques is not practical because of the unstructured and dynamic behavior of web data. This has led many researchers to look for special algorithms to mine web data. This paper has focused specifically on the techniques used to mine social network data. Most of the algorithms were developed to mine the structure of the social network by mapping the network to a graph. Less research has been conducted on content mining, and even less on web usage mining.

For future research it would be beneficial to focus more on content mining, where many human behavior patterns can be identified by analyzing social network profile pages. Research on an efficient hybrid approach that combines social network analysis (web structure mining) with content mining would be more useful. Statistical methods such as Markov models can be adopted to handle the temporal behavior of web data as well as to introduce personalization. This would be really valuable to the marketing and advertising fields, since social network marketing is an emerging technique in the business world.
REFERENCES

[1] E.K. Clemons, The Future of Advertising and the Value of Social Network Websites: Some Preliminary Examinations. Minneapolis, Minnesota, USA, 2007: ng_preliminary_exam.pdf.
[2] D. Boyd, Social Network Sites: Definition, History, and Scholarship, Journal of Computer-Mediated Communication 13 (1),
[3] Social Network Marketing: The Basics. Available: g_the_basics.pdf [Aug 01, 2013].
[4] J. Rennie, G. Zorpette, The Social Era of the Web Starts Now, IEEE Spectrum, June 2011. Available: /internet/the-social-era-of-the-web-starts-now [Aug 01, 2013].
[5] J.W. Seifert, Data Mining: An Overview, CRS Report for Congress, 2004 [Online] Available: [Sep ]
[6] H. Koh, G. Tan, Data Mining Applications in Healthcare [Online] Available: a%20mining%20in%20healthcare_jhim_.pdf [Nov ]
[7] M. Hart, Progress of organisational data mining in South Africa, Department of Information Systems, University of Cape Town, South Africa, [Online] Available: f/arima00601.pdf [Jun 5, 2011]
[8] E. Veerman, T. Lachev, D. Sarka, Microsoft SQL Server Business Intelligence Development and Maintenance, Microsoft, PHI Learning Private Limited, pp ,
[9] C. Yu, X. Ying, Application of Data Mining Technology in E-Commerce, in IEEE Int. Conf. on International Forum on Computer Science-Technology and Applications, pp ,
[10] G. Lappas, From Web Mining to Social Multimedia Mining, IEEE International Conference on Advances in Social Networks Analysis and Mining, pp ,
[11] J. Srivastava, R. Cooley, M. Deshpande, P.N. Tan, "Web Usage Mining: Discovery and Applications of Usage Patterns from Web Data," SIGKDD Explorations,
[12] S. Madria, S. Bhowmick, Research issues in web data mining, Data Warehousing and Knowledge Discovery, Lecture Notes in Computer Science, Volume 1676, pp ,
Foundations of Business Intelligence: Databases and Information Management
Foundations of Business Intelligence: Databases and Information Management Problem: HP s numerous systems unable to deliver the information needed for a complete picture of business operations, lack of
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY
INTERNATIONAL JOURNAL OF PURE AND APPLIED RESEARCH IN ENGINEERING AND TECHNOLOGY A PATH FOR HORIZING YOUR INNOVATIVE WORK A SURVEY ON BIG DATA ISSUES AMRINDER KAUR Assistant Professor, Department of Computer
MULTI AGENT-BASED DISTRIBUTED DATA MINING
MULTI AGENT-BASED DISTRIBUTED DATA MINING REECHA B. PRAJAPATI 1, SUMITRA MENARIA 2 Department of Computer Science and Engineering, Parul Institute of Technology, Gujarat Technology University Abstract:
So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we
Hexaware E-book on Predictive Analytics
Hexaware E-book on Predictive Analytics Business Intelligence & Analytics Actionable Intelligence Enabled Published on : Feb 7, 2012 Hexaware E-book on Predictive Analytics What is Data mining? Data mining,
Learning is a very general term denoting the way in which agents:
What is learning? Learning is a very general term denoting the way in which agents: Acquire and organize knowledge (by building, modifying and organizing internal representations of some external reality);
Fluency With Information Technology CSE100/IMT100
Fluency With Information Technology CSE100/IMT100 ),7 Larry Snyder & Mel Oyler, Instructors Ariel Kemp, Isaac Kunen, Gerome Miklau & Sean Squires, Teaching Assistants University of Washington, Autumn 1999
Advanced In-Database Analytics
Advanced In-Database Analytics Tallinn, Sept. 25th, 2012 Mikko-Pekka Bertling, BDM Greenplum EMEA 1 That sounds complicated? 2 Who can tell me how best to solve this 3 What are the main mathematical functions??
Exploring Big Data in Social Networks
Exploring Big Data in Social Networks [email protected] ([email protected]) INWEB National Science and Technology Institute for Web Federal University of Minas Gerais - UFMG May 2013 Some thoughts about
ANALYTICS IN BIG DATA ERA
ANALYTICS IN BIG DATA ERA ANALYTICS TECHNOLOGY AND ARCHITECTURE TO MANAGE VELOCITY AND VARIETY, DISCOVER RELATIONSHIPS AND CLASSIFY HUGE AMOUNT OF DATA MAURIZIO SALUSTI SAS Copyr i g ht 2012, SAS Ins titut
Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
Turban, Aronson, and Liang Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization
5.5 Copyright 2011 Pearson Education, Inc. publishing as Prentice Hall. Figure 5-2
Class Announcements TIM 50 - Business Information Systems Lecture 15 Database Assignment 2 posted Due Tuesday 5/26 UC Santa Cruz May 19, 2015 Database: Collection of related files containing records on
ABSTRACT The World MINING 1.2.1 1.2.2. R. Vasudevan. Trichy. Page 9. usage mining. basic. processing. Web usage mining. Web. useful information
SSRG International Journal of Electronics and Communication Engineering (SSRG IJECE) volume 1 Issue 1 Feb Neural Networks and Web Mining R. Vasudevan Dept of ECE, M. A.M Engineering College Trichy. ABSTRACT
Data Mining. 1 Introduction 2 Data Mining methods. Alfred Holl Data Mining 1
Data Mining 1 Introduction 2 Data Mining methods Alfred Holl Data Mining 1 1 Introduction 1.1 Motivation 1.2 Goals and problems 1.3 Definitions 1.4 Roots 1.5 Data Mining process 1.6 Epistemological constraints
Data Quality Mining: Employing Classifiers for Assuring consistent Datasets
Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, [email protected] Abstract: Independent
Web Mining as a Tool for Understanding Online Learning
Web Mining as a Tool for Understanding Online Learning Jiye Ai University of Missouri Columbia Columbia, MO USA [email protected] James Laffey University of Missouri Columbia Columbia, MO USA [email protected]
Learning outcomes. Knowledge and understanding. Competence and skills
Syllabus Master s Programme in Statistics and Data Mining 120 ECTS Credits Aim The rapid growth of databases provides scientists and business people with vast new resources. This programme meets the challenges
Chapter ML:XI. XI. Cluster Analysis
Chapter ML:XI XI. Cluster Analysis Data Mining Overview Cluster Analysis Basics Hierarchical Cluster Analysis Iterative Cluster Analysis Density-Based Cluster Analysis Cluster Evaluation Constrained Cluster
not possible or was possible at a high cost for collecting the data.
Data Mining and Knowledge Discovery Generating knowledge from data Knowledge Discovery Data Mining White Paper Organizations collect a vast amount of data in the process of carrying out their day-to-day
Protein Protein Interaction Networks
Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University it My Definition of Bioinformatics
An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
Formal Methods for Preserving Privacy for Big Data Extraction Software
Formal Methods for Preserving Privacy for Big Data Extraction Software M. Brian Blake and Iman Saleh Abstract University of Miami, Coral Gables, FL Given the inexpensive nature and increasing availability
Introduction to Data Mining
Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:
DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.
DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
Neural Networks in Data Mining
IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 03 (March. 2014), V6 PP 01-06 www.iosrjen.org Neural Networks in Data Mining Ripundeep Singh Gill, Ashima Department
Sunnie Chung. Cleveland State University
Sunnie Chung Cleveland State University Data Scientist Big Data Processing Data Mining 2 INTERSECT of Computer Scientists and Statisticians with Knowledge of Data Mining AND Big data Processing Skills:
Quality Control of National Genetic Evaluation Results Using Data-Mining Techniques; A Progress Report
Quality Control of National Genetic Evaluation Results Using Data-Mining Techniques; A Progress Report G. Banos 1, P.A. Mitkas 2, Z. Abas 3, A.L. Symeonidis 2, G. Milis 2 and U. Emanuelson 4 1 Faculty
Comparison of Data Mining Techniques used for Financial Data Analysis
Comparison of Data Mining Techniques used for Financial Data Analysis Abhijit A. Sawant 1, P. M. Chawan 2 1 Student, 2 Associate Professor, Department of Computer Technology, VJTI, Mumbai, INDIA Abstract
Index Contents Page No. Introduction . Data Mining & Knowledge Discovery
Index Contents Page No. 1. Introduction 1 1.1 Related Research 2 1.2 Objective of Research Work 3 1.3 Why Data Mining is Important 3 1.4 Research Methodology 4 1.5 Research Hypothesis 4 1.6 Scope 5 2.
A Secured Approach to Credit Card Fraud Detection Using Hidden Markov Model
A Secured Approach to Credit Card Fraud Detection Using Hidden Markov Model Twinkle Patel, Ms. Ompriya Kale Abstract: - As the usage of credit card has increased the credit card fraud has also increased
Grid Density Clustering Algorithm
Grid Density Clustering Algorithm Amandeep Kaur Mann 1, Navneet Kaur 2, Scholar, M.Tech (CSE), RIMT, Mandi Gobindgarh, Punjab, India 1 Assistant Professor (CSE), RIMT, Mandi Gobindgarh, Punjab, India 2
Inner Classification of Clusters for Online News
Inner Classification of Clusters for Online News Harmandeep Kaur 1, Sheenam Malhotra 2 1 (Computer Science and Engineering Department, Shri Guru Granth Sahib World University Fatehgarh Sahib) 2 (Assistant
Data Mining of Web Access Logs
Data Mining of Web Access Logs A minor thesis submitted in partial fulfilment of the requirements for the degree of Master of Applied Science in Information Technology Anand S. Lalani School of Computer
Chapter 5. Warehousing, Data Acquisition, Data. Visualization
Decision Support Systems and Intelligent Systems, Seventh Edition Chapter 5 Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization 5-1 Learning Objectives
EFFICIENT DATA PRE-PROCESSING FOR DATA MINING
EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College
The University of Jordan
The University of Jordan Master in Web Intelligence Non Thesis Department of Business Information Technology King Abdullah II School for Information Technology The University of Jordan 1 STUDY PLAN MASTER'S
Master of Science in Health Information Technology Degree Curriculum
Master of Science in Health Information Technology Degree Curriculum Core courses: 8 courses Total Credit from Core Courses = 24 Core Courses Course Name HRS Pre-Req Choose MIS 525 or CIS 564: 1 MIS 525
Web Mining. Margherita Berardi LACAM. Dipartimento di Informatica Università degli Studi di Bari [email protected]
Web Mining Margherita Berardi LACAM Dipartimento di Informatica Università degli Studi di Bari [email protected] Bari, 24 Aprile 2003 Overview Introduction Knowledge discovery from text (Web Content
A Survey on Preprocessing of Web Log File in Web Usage Mining to Improve the Quality of Data
A Survey on Preprocessing of Web Log File in Web Usage Mining to Improve the Quality of Data R. Lokeshkumar 1, R. Sindhuja 2, Dr. P. Sengottuvelan 3 1 Assistant Professor - (Sr.G), 2 PG Scholar, 3Associate
Dynamic Data in terms of Data Mining Streams
International Journal of Computer Science and Software Engineering Volume 2, Number 1 (2015), pp. 1-6 International Research Publication House http://www.irphouse.com Dynamic Data in terms of Data Mining
2.1. Data Mining for Biomedical and DNA data analysis
Applications of Data Mining Simmi Bagga Assistant Professor Sant Hira Dass Kanya Maha Vidyalaya, Kala Sanghian, Distt Kpt, India (Email: [email protected]) Dr. G.N. Singh Department of Physics and
Nine Common Types of Data Mining Techniques Used in Predictive Analytics
1 Nine Common Types of Data Mining Techniques Used in Predictive Analytics By Laura Patterson, President, VisionEdge Marketing Predictive analytics enable you to develop mathematical models to help better
8. Machine Learning Applied Artificial Intelligence
8. Machine Learning Applied Artificial Intelligence Prof. Dr. Bernhard Humm Faculty of Computer Science Hochschule Darmstadt University of Applied Sciences 1 Retrospective Natural Language Processing Name
DATA PREPARATION FOR DATA MINING
Applied Artificial Intelligence, 17:375 381, 2003 Copyright # 2003 Taylor & Francis 0883-9514/03 $12.00 +.00 DOI: 10.1080/08839510390219264 u DATA PREPARATION FOR DATA MINING SHICHAO ZHANG and CHENGQI
Abdullah Mohammed Abdullah Khamis
Abdullah Mohammed Abdullah Khamis Jeddah, Saudi Arabia Email: [email protected] Mobile: +966 567243182 Tel: +966 2 6340699 (Yemeni) Research and Professional Objective To Complete my Ph.D. in Pattern
Business Intelligence. Data Mining and Optimization for Decision Making
Brochure More information from http://www.researchandmarkets.com/reports/2325743/ Business Intelligence. Data Mining and Optimization for Decision Making Description: Business intelligence is a broad category
