Adding New Level in KDD to Make the Web Usage Mining More Efficient
Mohammad Ala'a AL_Hamami, PhD Student, Lecturer, m_ah_1@yahoo.com
Soukaena Hassan Hashem, PhD Student, Lecturer, soukaena_hassan@yahoo.com

Abstract

The application of data mining techniques to the World Wide Web is referred to as Web mining. However, there is no established vocabulary, which leads to confusion when comparing research efforts. The term Web mining has been used in two distinct ways. The first, Web content mining, is the process of information discovery from sources across the World Wide Web. The second, Web usage mining, is the process of mining for user browsing and access patterns. This research concentrates on one particular aspect: how to make Web usage mining more efficient. This can be done by adding a new level to the knowledge discovery process for Web usage. The new level is located before the data mining level and classifies the web log files according to some of the attributes these files contain. Web usage mining is then applied to each class independently. Because the volume of each class is limited, mining each class is efficient. The visualization level then becomes clear and understandable to users, without the analysis step that normally follows data mining to make the visualized mined patterns easier to understand.

Keywords: Web mining, web usage, decision tree classification

1 Introduction [1]

With the explosive growth of information sources available on the World Wide Web, it has become increasingly necessary for users to utilize automated tools to find the desired information resources, and to track and analyze their usage patterns. These factors give rise to the necessity of creating server-side and client-side intelligent systems that can effectively mine for knowledge. Web mining can be broadly defined as the discovery and analysis of useful information from the World Wide Web. This describes the automatic search of information resources available on-line, i.e., Web content mining, and the discovery of user access patterns from Web servers, i.e., Web usage mining. We present a taxonomy of Web mining and place various aspects of Web mining in their proper context.

There are several important issues, unique to the Web paradigm, that come into play if sophisticated types of analyses are to be done on server-side data collections. These include integrating various data sources such as server access logs, referrer logs, and user registration or profile information; resolving difficulties in the identification of users due to missing unique key attributes in collected data; and the importance of identifying user sessions or transactions from usage data, site topologies, and models of user behavior.

2 A Taxonomy of Web Mining [1,2,3]

In this section we present a taxonomy of Web mining, i.e., Web content mining and Web usage mining. This taxonomy is depicted in Figure 1.

Web Mining
    Web Content Mining
        Agent-Based Approach
            * intelligent search agent
            * information filtering/categorization
            * personalized web agent
        Database Approach
            * multilevel database
            * query web database
    Web Usage Mining
        * preprocessing
        * transaction identification
        * pattern discovery tools
        * pattern analysis tools

Figure 1. Web mining taxonomy

2.1 Web Content Mining

The lack of structure that permeates the information sources on the World Wide Web makes automated discovery of Web-based information difficult. Traditional search engines such as Lycos, Alta Vista, WebCrawler, MetaCrawler, and others provide some comfort to users, but do not generally provide structural information, nor do they categorize, filter, or interpret documents. A recent study provides a comprehensive and statistically thorough comparative evaluation of the most popular search engines. In recent years these factors have prompted researchers to develop more intelligent tools for information retrieval, such as intelligent Web agents, and to extend data mining techniques to provide a higher level of organization for semi-structured data available on the Web.

2.2 Web Usage Mining

Web usage mining is the automatic discovery of user access patterns from Web servers. Organizations collect large volumes of data in their daily operations, generated automatically by Web servers and collected in server access logs. Other sources of user information include referrer logs, which contain information about the referring pages for each page reference, and user registration or survey data gathered via CGI scripts.

Analyzing such data can help organizations determine the lifetime value of customers, cross-marketing strategies across products, and the effectiveness of promotional campaigns, among other things. It can also provide information on how to restructure a Web site to create a more effective organizational presence, and shed light on more effective management of workgroup communication and organizational infrastructure. For selling advertisements on the World Wide Web, analyzing user access patterns helps in targeting ads to specific groups of users.

Most existing Web analysis tools provide mechanisms for reporting user activity in the servers and various forms of data filtering. Using such tools it is possible to determine the number of accesses to the server and to individual files, the times of visits, and the domain names and URLs of users. However, these tools are designed to handle low to moderate traffic servers, and usually provide little or no analysis of data relationships among the accessed files and directories within the Web space. More sophisticated systems and techniques for discovery and analysis of patterns are now emerging. These tools can be placed into two main categories, as discussed below.

2.2.1 Pattern Discovery Tools

The emerging tools for user pattern discovery use sophisticated techniques from AI, data mining, psychology, and information theory to mine for knowledge from collected data. For example, the WEBMINER system introduces a general architecture for Web usage mining. WEBMINER automatically discovers association rules and sequential patterns from server access logs. Algorithms have also been introduced for finding maximal forward references and large reference sequences. These can, in turn, be used to perform various types of user traversal path analysis, such as identifying the most traversed paths through a Web locality. Other work uses information foraging theory to combine path traversal patterns, Web page typing, and site topology information to categorize pages for easier access by users.
2.2.2 Pattern Analysis Tools

Once access patterns have been discovered, analysts need the appropriate tools and techniques to understand, visualize, and interpret these patterns, e.g., the WebViz system. Others have proposed using OLAP techniques such as data cubes for the purpose of simplifying the analysis of usage statistics from server access logs. The WEBMINER system proposes an SQL-like query mechanism for querying the discovered knowledge (in the form of association rules and sequential patterns). These techniques and others are further discussed in the subsequent sections.

3 What Can Be Discovered [4]

The kinds of patterns that can be discovered depend upon the data mining tasks employed. By and large, there are two types of data mining tasks: descriptive data mining tasks, which describe the general properties of the existing data, and predictive data mining tasks, which attempt to make predictions based on inference from the available data. The data mining functionalities and the variety of knowledge they discover are briefly presented in the following list:

* Characterization: Data characterization is a summarization of the general features of objects in a target class, and produces what are called characteristic rules.
* Discrimination: Data discrimination produces what are called discriminant rules and is basically the comparison of the general features of objects between two classes, referred to as the target class and the contrasting class.
* Association analysis: Association analysis is the discovery of what are commonly called association rules. It studies the frequency of items occurring together in transactional databases and, based on a threshold called support, identifies the frequent item sets. Another threshold, confidence, which is the conditional probability that an item appears in a transaction when another item appears, is used to pinpoint association rules.
* Classification: Classification analysis is the organization of data in given classes. Also known as supervised classification, it uses given class labels to order the objects in the data collection. Classification approaches normally use a training set where all objects are already associated with known class labels. The classification algorithm learns from the training set and builds a model. The model is then used to classify new objects.
* Clustering: Similar to classification, clustering is the organization of data in classes. However, unlike classification, in clustering the class labels are unknown and it is up to the clustering algorithm to discover acceptable classes.
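The support and confidence thresholds described under association analysis above can be illustrated with a few lines of Python. This is only a minimal sketch: the function names `support` and `confidence`, the page names, and the sample transactions are hypothetical and do not come from the paper or from any particular mining tool.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    wanted = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if wanted <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Conditional probability of consequent given antecedent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base


# Hypothetical transactions: each one is the set of pages seen in one visit.
TRANSACTIONS = [
    {"/home", "/products"},
    {"/home", "/products", "/order"},
    {"/home"},
    {"/order"},
    {"/home", "/order"},
]

if __name__ == "__main__":
    print(support(TRANSACTIONS, {"/home", "/products"}))       # 0.4
    print(confidence(TRANSACTIONS, {"/home"}, {"/products"}))  # 0.5
```

An itemset counts as frequent when its support reaches a chosen minimum, and a rule is reported only when its confidence also reaches a chosen minimum. Both thresholds are left to the analyst.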
4 Decision Tree Classification [5]

Decision Tree Classifiers (DTCs) are used successfully in many diverse areas such as radar signal classification, character recognition, remote sensing, medical diagnosis, expert systems, and speech recognition, to name only a few. Perhaps the most important feature of DTCs is their capability to break down a complex decision-making process into a collection of simpler decisions, thus providing a solution which is often easier to interpret.

The decision tree classifier is one of the possible approaches to multistage decision making; table look-up rules, decision table conversion to optimal decision trees, and sequential approaches are others. The basic idea involved in any multistage approach is to break up a complex decision into a union of several simpler decisions, hoping that the final solution obtained this way resembles the intended desired solution. We briefly describe some necessary terminology for describing trees.

Definitions:

1) A graph $G = (V, E)$ consists of a finite, nonempty set of nodes (or vertices) $V$ and a set of edges $E$. If the edges are ordered pairs $(v, w)$ of vertices, then the graph is said to be directed.
2) A path in a graph is a sequence of edges of the form $(v_1, v_2), (v_2, v_3), \ldots, (v_{n-1}, v_n)$. We say the path is from $v_1$ to $v_n$ and is of length $n - 1$, the number of edges it contains.
3) A directed graph with no cycles is called a directed acyclic graph. A directed (or rooted) tree is a directed acyclic graph satisfying the following properties:
   i) There is exactly one node, called the root, which no edges enter. The root node contains all the class labels.
   ii) Every node except the root has exactly one entering edge.
   iii) There is a unique path from the root to each node.
4) If $(v, w)$ is an edge in a tree, then $v$ is called the father of $w$, and $w$ is a son of $v$. If there is a path from $v$ to $w$ ($v \neq w$), then $v$ is a proper ancestor of $w$ and $w$ is a proper descendant of $v$.
5) A node with no proper descendant is called a leaf (or a terminal). All other nodes (except the root) are called internal nodes.

The main objectives of decision tree classifiers are:

1) to classify correctly as much of the training sample as possible;
2) to generalize beyond the training sample so that unseen samples can be classified with as high an accuracy as possible;
3) to be easy to update as more training samples become available (i.e., be incremental);
4) to have as simple a structure as possible.

The design of a DTC can then be decomposed into the following tasks:

1) The appropriate choice of the tree structure.
2) The choice of feature subsets to be used at each internal node.
3) The choice of the decision rule or strategy to be used at each internal node.
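The tree terminology above (root, father, son, leaf, internal node) can be made concrete with a short Python sketch. The function name `classify_nodes` and the node labels are hypothetical and chosen only for illustration. The function checks properties i) to iii) of a rooted tree and then labels every node.

```python
def classify_nodes(edges):
    """Label each node of a rooted tree as 'root', 'internal', or 'leaf'.

    edges is a list of (father, son) pairs.
    """
    father_of = {}   # son -> father
    sons_of = {}     # father -> list of sons
    nodes = set()
    for father, son in edges:
        if son in father_of:
            # Property ii): a node has exactly one entering edge.
            raise ValueError(f"node {son!r} has more than one entering edge")
        father_of[son] = father
        sons_of.setdefault(father, []).append(son)
        nodes.update((father, son))

    # Property i): exactly one node has no entering edge.
    roots = [n for n in nodes if n not in father_of]
    if len(roots) != 1:
        raise ValueError("a rooted tree needs exactly one root")
    root = roots[0]

    # Property iii): every node must be reachable from the root.
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        seen.add(node)
        stack.extend(sons_of.get(node, []))
    if seen != nodes:
        raise ValueError("some nodes are not reachable from the root")

    roles = {}
    for node in nodes:
        if node == root:
            roles[node] = "root"
        elif node in sons_of:
            roles[node] = "internal"
        else:
            roles[node] = "leaf"
    return roles


# Hypothetical tree: r is the root, a is internal, and b, c, d are leaves.
EDGES = [("r", "a"), ("r", "b"), ("a", "c"), ("a", "d")]

if __name__ == "__main__":
    print(classify_nodes(EDGES))
```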
5 The Proposed System

In this research we describe an efficient web usage mining framework. The key ideas are to preprocess the web log file and then classify it into a number of files, each representing one class. This classification is done by a decision tree classifier. Each class is then submitted to web usage mining separately. This makes web usage mining more efficient, because each application, with all its services and visitors, is studied separately. The general algorithm and its details are explained briefly in the following sections.

5.1 Preprocessing Tasks

As discussed in the previous sections, analysis of how users are accessing a site is critical for determining effective marketing strategies and optimizing the logical structure of the Web site. In this research in particular, there are a number of issues in preprocessing data for mining that must be addressed before the mining algorithms can be run.

The first major preprocessing task is transaction identification. Before any mining is done on Web usage data, sequences of page references must be grouped into logical units representing Web transactions or user sessions. This is done by converting the web log file (see Figure 2) to a relational database (see Figure 3), which gives each user entry a transaction identifier. This preprocessing is the basic first step of web usage mining.

Figure 2. Application specialized to collect the web log information
TID | Local IP (A) | Local port (B) | Remote IP (C) | Remote port (D) | State (E) | Type (F) | Time stamp (G)
    |              |                |               |                 | Listen    | Tcp      | 2:
    |              |                |               |                 | Listen    | Udp      | 5:50

Figure 3. The relational database (D) for the web login/out file

The second preprocessing task is data cleaning. Techniques to clean a server log to eliminate irrelevant items are of importance for any type of Web log analysis, not just data mining. The discovered associations or reported statistics are only useful if the data represented in the server log gives an accurate picture of the user accesses to the Web site. In this research, cleaning means eliminating the transactions of intruders, which is done using the decision tree classifier. The same classifier then classifies the web log file into a number of classes so that each class can be mined separately. The decision tree classifier is explained in detail in the following section.

5.2 Decision Tree Classifier

We used the sibairwall program to handle the web log data, which was then converted to a relational database. This database has one global schema that includes the following attributes: local IP address, remote IP address, local port, remote port, state, type of protocol, and time stamp. We now build the decision tree classifier for two purposes: the first is to delete all intruder transactions, and the second is to classify the web log file into previously known classes.

A DTC is a tree, as described in the previous sections. The most important step is therefore how to choose the attribute that forms the root node, and then how to choose each of the internal nodes to complete the splitting and build the classifier. In our research we build this classifier without measuring the entropy of each attribute to decide which one is the root node and which one is the next most powerful attribute to serve as an internal node, because the power of each attribute is very clear in the web log files. The root node is the remote IP address. By this attribute the tree is split into two classes: the first is the intruder visitors class and the second is the normal visitors class (see Figure 4).

Remote IP address
    If IP in set of normal visitors: another best attribute
    If IP in set of intruder visitors: Intruder visitor class

Figure 4. First step in building the DTC: choosing the best attribute (root node)

After splitting the decision tree into two classes, we eliminate the intruder visitors class and continue with the normal visitors class, selecting the next most powerful attribute to split this class into the final classes. This attribute is the local port number, so the final classes are web log files, each of which represents the visitors of a specific application (see Figure 5).

Remote IP address
    If IP in set of normal visitors: Local port number
        Application 1
        Application 2
        ...
        Application s
    If IP in set of intruder visitors: Intruder visitor class

Figure 5. The final DTC, which classifies the web log files into classes according to the local port number attribute
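The two-level tree of Figure 5 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the record field names (`remote_ip`, `local_port`, `tid`), the sample addresses, and the rule that any address outside the known set of normal visitors is treated as an intruder are all assumptions made here.

```python
from collections import defaultdict

INTRUDER = "intruder"  # label for the class that is discarded


def classify(record, normal_ips):
    """Walk the two-level tree for one log record and return its class label."""
    # Root node: remote IP address.
    # Assumption: any address outside the known normal set counts as an intruder.
    if record["remote_ip"] not in normal_ips:
        return INTRUDER
    # Internal node: local port number, one class per application.
    return f"port_{record['local_port']}"


def split_log(records, normal_ips):
    """Group the normal visitors' records into one list per application class."""
    classes = defaultdict(list)
    for record in records:
        label = classify(record, normal_ips)
        if label != INTRUDER:  # intruder transactions are eliminated
            classes[label].append(record)
    return dict(classes)


# Hypothetical sample data (documentation-range IP addresses).
NORMAL_IPS = {"192.0.2.10", "192.0.2.11"}
SAMPLE_LOG = [
    {"tid": 1, "remote_ip": "192.0.2.10", "local_port": 80},
    {"tid": 2, "remote_ip": "203.0.113.5", "local_port": 80},
    {"tid": 3, "remote_ip": "192.0.2.11", "local_port": 21},
    {"tid": 4, "remote_ip": "192.0.2.10", "local_port": 80},
]

if __name__ == "__main__":
    for label, rows in split_log(SAMPLE_LOG, NORMAL_IPS).items():
        print(label, [row["tid"] for row in rows])
```

Each list returned by `split_log` plays the role of one per-application web log file, which can then be mined on its own.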
5.3 Discovery Techniques on Web Transactions in Each Class

Once user transactions or sessions have been identified, there are several kinds of access pattern mining that can be performed depending on the needs of the analyst, such as path analysis, discovery of association rules and sequential patterns, and clustering and classification. In this research we rely on association rule analysis as the basic tool for web usage mining.

Let $A$ be a set of attributes and $I$ be a set of values on $A$, called items. Any subset of $I$ is called an itemset. The number of items in an itemset is called its length. Let $D$ be a database with $n$ attributes (columns). Define $\mathrm{support}(X)$ as the percentage of transactions (records) in $D$ that contain itemset $X$. An association rule is the expression $X \Rightarrow Y,\ c,\ s$. Here $X$ and $Y$ are itemsets, and $X \cap Y = \emptyset$. $s = \mathrm{support}(X \cup Y)$ is the support of the rule, and $c = \mathrm{support}(X \cup Y) / \mathrm{support}(X)$ is the confidence.

Association rule discovery techniques are generally applied to databases of transactions where each transaction consists of a set of items. In such a framework the problem is to discover all associations and correlations among data items where the presence of one set of items in a transaction implies (with a certain degree of confidence) the presence of other items. In the context of Web usage mining, this problem amounts to discovering the correlations among references to various files available on the server by a given client. By classifying the huge web log file into a smaller web log file for each application, association rule discovery becomes much more efficient for web usage mining, because this reduces the number of associations and correlations among the itemsets of the files that have to be discovered. Each transaction comprises a set of URLs accessed by a client in one visit to the server. For example, using association rule discovery techniques we can find correlations such as the following:

* 40% of clients who accessed the Web page with application /company/product1 also accessed /company/product2; or
* 30% of clients who accessed /company/special with the gopher application placed an online order in /company/product1.

After the web mining process has been applied to each of the classified files and the hidden patterns have been extracted, we do not need to analyze the discovered patterns, because they are clear and understandable at the visualization level. This clarity results from applying web usage mining to specialized, well-delimited files instead of huge web data files, as in most previous research.

6 Conclusion

The term Web mining has been used to refer to techniques that encompass a broad range of issues. However, while meaningful and attractive, this very broadness has caused Web mining to mean different things to different people, and there is a need to develop a common vocabulary. Towards this goal we presented a definition of Web mining and developed a taxonomy of the various ongoing efforts related to it. Next, we presented a survey of the research in this area and concentrated on Web usage mining. We proposed the system with the following aims:

1. Web usage mining is applied to the web log file after classifying it into a number of files, each of which represents the web log data for one application. This minimizes the number of associations and correlations that must be examined, so web usage mining is optimized in time, storage space, and performance, and the discovered patterns need no further analysis to give a high-quality visualization.
2. The DTC is used to classify the web log file into per-application web log files because it is a powerful classification tool: it classifies quickly and needs little training data.
3. The remote IP address is selected as the root node in order to eliminate the intruder transactions before the web log file is classified into per-application files.

References

1. M. S. Chen, J. Han, and P. S. Yu. Data mining: An overview from a database perspective. IEEE Trans. Knowledge and Data Engineering, 8, 1996.
2. U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy. Advances in Knowledge Discovery and Data Mining. AAAI/MIT Press.
3. Raymond Kosala and Hendrik Blockeel. Web Mining Research: A Survey.
4. J. Han and M. Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann.
5. S. Rasoul Safavian and David Landgrebe. A Survey of Decision Tree Classifier Methodology. landgreb@ecn.purdue.edu, 1999.
A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING
A COGNITIVE APPROACH IN PATTERN ANALYSIS TOOLS AND TECHNIQUES USING WEB USAGE MINING M.Gnanavel 1 & Dr.E.R.Naganathan 2 1. Research Scholar, SCSVMV University, Kanchipuram,Tamil Nadu,India. 2. Professor
ANALYSIS OF WEBSITE USAGE WITH USER DETAILS USING DATA MINING PATTERN RECOGNITION
ANALYSIS OF WEBSITE USAGE WITH USER DETAILS USING DATA MINING PATTERN RECOGNITION K.Vinodkumar 1, Kathiresan.V 2, Divya.K 3 1 MPhil scholar, RVS College of Arts and Science, Coimbatore, India. 2 HOD, Dr.SNS
How To Use Data Mining For Knowledge Management In Technology Enhanced Learning
Proceedings of the 6th WSEAS International Conference on Applications of Electrical Engineering, Istanbul, Turkey, May 27-29, 2007 115 Data Mining for Knowledge Management in Technology Enhanced Learning
Data Mining Solutions for the Business Environment
Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania [email protected] Over
International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH
205 A STUDY OF DATA MINING ACTIVITIES FOR MARKET RESEARCH ABSTRACT MR. HEMANT KUMAR*; DR. SARMISTHA SARMA** *Assistant Professor, Department of Information Technology (IT), Institute of Innovation in Technology
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam
Introduction. A. Bellaachia Page: 1
Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.
Database Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski
Arti Tyagi Sunita Choudhary
Volume 5, Issue 3, March 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Web Usage Mining
Enhance Preprocessing Technique Distinct User Identification using Web Log Usage data
Enhance Preprocessing Technique Distinct User Identification using Web Log Usage data Sheetal A. Raiyani 1, Shailendra Jain 2 Dept. of CSE(SS),TIT,Bhopal 1, Dept. of CSE,TIT,Bhopal 2 [email protected]
ANALYSIS OF WEB LOGS AND WEB USER IN WEB MINING
ANALYSIS OF WEB LOGS AND WEB USER IN WEB MINING L.K. Joshila Grace 1, V.Maheswari 2, Dhinaharan Nagamalai 3, 1 Research Scholar, Department of Computer Science and Engineering [email protected]
Association rules for improving website effectiveness: case analysis
Association rules for improving website effectiveness: case analysis Maja Dimitrijević, The Higher Technical School of Professional Studies, Novi Sad, Serbia, [email protected] Tanja Krunić, The
An Enhanced Framework For Performing Pre- Processing On Web Server Logs
An Enhanced Framework For Performing Pre- Processing On Web Server Logs T.Subha Mastan Rao #1, P.Siva Durga Bhavani #2, M.Revathi #3, N.Kiran Kumar #4,V.Sara #5 # Department of information science and
Visualizing e-government Portal and Its Performance in WEBVS
Visualizing e-government Portal and Its Performance in WEBVS Ho Si Meng, Simon Fong Department of Computer and Information Science University of Macau, Macau SAR [email protected] Abstract An e-government
Data Warehousing and Data Mining in Business Applications
133 Data Warehousing and Data Mining in Business Applications Eesha Goel CSE Deptt. GZS-PTU Campus, Bathinda. Abstract Information technology is now required in all aspect of our lives that helps in business
SPATIAL DATA CLASSIFICATION AND DATA MINING
, pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal
PREPROCESSING OF WEB LOGS
PREPROCESSING OF WEB LOGS Ms. Dipa Dixit Lecturer Fr.CRIT, Vashi Abstract-Today s real world databases are highly susceptible to noisy, missing and inconsistent data due to their typically huge size data
DATA MINING TECHNIQUES AND APPLICATIONS
DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,
Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing
www.ijcsi.org 198 Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing Lilian Sing oei 1 and Jiayang Wang 2 1 School of Information Science and Engineering, Central South University
Data Mining. 1 Introduction 2 Data Mining methods. Alfred Holl Data Mining 1
Data Mining 1 Introduction 2 Data Mining methods Alfred Holl Data Mining 1 1 Introduction 1.1 Motivation 1.2 Goals and problems 1.3 Definitions 1.4 Roots 1.5 Data Mining process 1.6 Epistemological constraints
WEB SITE OPTIMIZATION THROUGH MINING USER NAVIGATIONAL PATTERNS
WEB SITE OPTIMIZATION THROUGH MINING USER NAVIGATIONAL PATTERNS Biswajit Biswal Oracle Corporation [email protected] ABSTRACT With the World Wide Web (www) s ubiquity increase and the rapid development
So today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we
A Time Efficient Algorithm for Web Log Analysis
A Time Efficient Algorithm for Web Log Analysis Santosh Shakya Anju Singh Divakar Singh Student [M.Tech.6 th sem (CSE)] Asst.Proff, Dept. of CSE BU HOD (CSE), BUIT, BUIT,BU Bhopal Barkatullah University,
Dynamic Data in terms of Data Mining Streams
International Journal of Computer Science and Software Engineering Volume 2, Number 1 (2015), pp. 1-6 International Research Publication House http://www.irphouse.com Dynamic Data in terms of Data Mining
Healthcare Measurement Analysis Using Data mining Techniques
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 03 Issue 07 July, 2014 Page No. 7058-7064 Healthcare Measurement Analysis Using Data mining Techniques 1 Dr.A.Shaik
Understanding Web personalization with Web Usage Mining and its Application: Recommender System
Understanding Web personalization with Web Usage Mining and its Application: Recommender System Manoj Swami 1, Prof. Manasi Kulkarni 2 1 M.Tech (Computer-NIMS), VJTI, Mumbai. 2 Department of Computer Technology,
Introduction to Data Mining
Introduction to Data Mining 1 Why Data Mining? Explosive Growth of Data Data collection and data availability Automated data collection tools, Internet, smartphones, Major sources of abundant data Business:
Web Mining as a Tool for Understanding Online Learning
Web Mining as a Tool for Understanding Online Learning Jiye Ai University of Missouri Columbia Columbia, MO USA [email protected] James Laffey University of Missouri Columbia Columbia, MO USA [email protected]
MINING CLICKSTREAM-BASED DATA CUBES
MINING CLICKSTREAM-BASED DATA CUBES Ronnie Alves and Orlando Belo Departament of Informatics,School of Engineering, University of Minho Campus de Gualtar, 4710-057 Braga, Portugal Email: {alvesrco,obelo}@di.uminho.pt
Mobile Phone APP Software Browsing Behavior using Clustering Analysis
Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7 9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis
Mining Online GIS for Crime Rate and Models based on Frequent Pattern Analysis
, 23-25 October, 2013, San Francisco, USA Mining Online GIS for Crime Rate and Models based on Frequent Pattern Analysis John David Elijah Sandig, Ruby Mae Somoba, Ma. Beth Concepcion and Bobby D. Gerardo,
A Review of Anomaly Detection Techniques in Network Intrusion Detection System
A Review of Anomaly Detection Techniques in Network Intrusion Detection System Dr.D.V.S.S.Subrahmanyam Professor, Dept. of CSE, Sreyas Institute of Engineering & Technology, Hyderabad, India ABSTRACT:In
DMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support
DMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support Rok Rupnik, Matjaž Kukar, Marko Bajec, Marjan Krisper University of Ljubljana, Faculty of Computer and Information
Data Mining Algorithms Part 1. Dejan Sarka
Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka ([email protected]) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses
Prediction of Heart Disease Using Naïve Bayes Algorithm
Prediction of Heart Disease Using Naïve Bayes Algorithm R.Karthiyayini 1, S.Chithaara 2 Assistant Professor, Department of computer Applications, Anna University, BIT campus, Tiruchirapalli, Tamilnadu,
Data Mining and Database Systems: Where is the Intersection?
Data Mining and Database Systems: Where is the Intersection? Surajit Chaudhuri Microsoft Research Email: [email protected] 1 Introduction The promise of decision support systems is to exploit enterprise
DATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.
DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,
COURSE RECOMMENDER SYSTEM IN E-LEARNING
International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand
Advanced Preprocessing using Distinct User Identification in web log usage data
Advanced Preprocessing using Distinct User Identification in web log usage data Sheetal A. Raiyani 1, Shailendra Jain 2, Ashwin G. Raiyani 3 Department of CSE (Software System), Technocrats Institute of
WebAdaptor: Designing Adaptive Web Sites Using Data Mining Techniques
From: FLAIRS-01 Proceedings. Copyright 2001, AAAI (www.aaai.org). All rights reserved. WebAdaptor: Designing Adaptive Web Sites Using Data Mining Techniques Howard J. Hamilton, Xuewei Wang, and Y.Y. Yao
Introduction to Data Mining
Introduction to Data Mining Jay Urbain Credits: Nazli Goharian & David Grossman @ IIT Outline Introduction Data Pre-processing Data Mining Algorithms Naïve Bayes Decision Tree Neural Network Association
CYBER SCIENCE 2015 AN ANALYSIS OF NETWORK TRAFFIC CLASSIFICATION FOR BOTNET DETECTION
CYBER SCIENCE 2015 AN ANALYSIS OF NETWORK TRAFFIC CLASSIFICATION FOR BOTNET DETECTION MATIJA STEVANOVIC PhD Student JENS MYRUP PEDERSEN Associate Professor Department of Electronic Systems Aalborg University,
Interactive Exploration of Decision Tree Results
Interactive Exploration of Decision Tree Results 1 IRISA Campus de Beaulieu F35042 Rennes Cedex, France (email: pnguyenk,[email protected]) 2 INRIA Futurs L.R.I., University Paris-Sud F91405 ORSAY Cedex,
Customer Classification And Prediction Based On Data Mining Technique
Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor
ASSOCIATION RULE MINING ON WEB LOGS FOR EXTRACTING INTERESTING PATTERNS THROUGH WEKA TOOL
International Journal Of Advanced Technology In Engineering And Science Www.Ijates.Com Volume No 03, Special Issue No. 01, February 2015 ISSN (Online): 2348 7550 ASSOCIATION RULE MINING ON WEB LOGS FOR
Web Mining Functions in an Academic Search Application
Informatica Economică vol. 13, no. 3/2009 Web Mining Functions in an Academic Search Application Jeyalatha SIVARAMAKRISHNAN, Vijayakumar BALAKRISHNAN Faculty of Computer Science and Engineering, BITS
Building a Database to Predict Customer Needs
INFORMATION TECHNOLOGY TopicalNet, Inc (formerly Continuum Software, Inc.) Building a Database to Predict Customer Needs Since the early 1990s, organizations have used data warehouses and data-mining tools
Numerical Algorithms Group
Title: Summary: Using the Component Approach to Craft Customized Data Mining Solutions One definition of data mining is the non-trivial extraction of implicit, previously unknown and potentially useful
Data Mining Approach in Security Information and Event Management
Data Mining Approach in Security Information and Event Management Anita Rajendra Zope, Amarsinh Vidhate, and Naresh Harale Abstract This paper gives an overview of data mining field & security information
A Survey on Web Mining From Web Server Log
A Survey on Web Mining From Web Server Log Ripal Patel 1, Mr. Krunal Panchal 2, Mr. Dushyantsinh Rathod 3 1 M.E., 2,3 Assistant Professor, 1,2,3 computer Engineering Department, 1,2 L J Institute of Engineering
Data Mining for Fun and Profit
Data Mining for Fun and Profit Data mining is the extraction of implicit, previously unknown, and potentially useful information from data. - Ian H. Witten, Data Mining: Practical Machine Learning Tools
An Empirical Study of Application of Data Mining Techniques in Library System
An Empirical Study of Application of Data Mining Techniques in Library System Veepu Uppal Department of Computer Science and Engineering, Manav Rachna College of Engineering, Faridabad, India Gunjan Chindwani
Graph Mining and Social Network Analysis
Graph Mining and Social Network Analysis Data Mining and Text Mining (UIC 583 @ Politecnico di Milano) References Jiawei Han and Micheline Kamber, "Data Mining: Concepts and Techniques", The Morgan Kaufmann
E-CRM and Web Mining. Objectives, Application Fields and Process of Web Usage Mining for Online Customer Relationship Management.
University of Fribourg, Switzerland Department of Computer Science Information Systems Research Group Seminar Online CRM, 2005 Prof. Dr. Andreas Meier E-CRM and Web Mining. Objectives, Application Fields
Mining for Web Engineering
Mining for Web Engineering A. Venkata Krishna Prasad 1, Prof. S.Ramakrishna 2 1 Associate Professor, Department of Computer Science, MIPGS, Hyderabad 2 Professor, Department of Computer Science, Sri Venkateswara
OLAP and Data Mining. Data Warehousing and End-User Access Tools. Introducing OLAP. Introducing OLAP
Data Warehousing and End-User Access Tools OLAP and Data Mining Accompanying growth in data warehouses is increasing demands for more powerful access tools providing advanced analytical capabilities. Key
Identifying the Number of Visitors to improve Website Usability from Educational Institution Web Log Data
Identifying the Number of Visitors to improve Website Usability from Educational Institution Web Log Data Arvind K. Sharma Dept. of CSE Jaipur National University, Jaipur, Rajasthan,India P.C. Gupta Dept. of CSI
ISSN: 2348 9510. A Review: Image Retrieval Using Web Multimedia Mining
A Review: Image Retrieval Using Web Multimedia Mining Satish Bansal*, K K Yadav** *, **Assistant Professor Prestige Institute Of Management, Gwalior (MP), India Abstract Multimedia object include audio, video,
Protein Protein Interaction Networks
Functional Pattern Mining from Genome Scale Protein Protein Interaction Networks Young-Rae Cho, Ph.D. Assistant Professor Department of Computer Science Baylor University My Definition of Bioinformatics
2.1. Data Mining for Biomedical and DNA data analysis
Applications of Data Mining Simmi Bagga Assistant Professor Sant Hira Dass Kanya Maha Vidyalaya, Kala Sanghian, Distt Kpt, India (Email: [email protected]) Dr. G.N. Singh Department of Physics and
Analysis of Data Mining Concepts in Higher Education with Needs to Najran University
Analysis of Data Mining Concepts in Higher Education with Needs to Najran University Mohamed Hussain Tawarish 1, Farooqui Waseemuddin 2 Department of Computer Science, Najran Community College. Najran
Data Mining. Knowledge Discovery, Data Warehousing and Machine Learning Final remarks. Lecturer: JERZY STEFANOWSKI
Data Mining Knowledge Discovery, Data Warehousing and Machine Learning Final remarks Lecturer: JERZY STEFANOWSKI Email: [email protected] Data Mining a step in A KDD Process Data mining:
Data Mining Analytics for Business Intelligence and Decision Support
Data Mining Analytics for Business Intelligence and Decision Support Chid Apte, T.J. Watson Research Center, IBM Research Division Knowledge Discovery and Data Mining (KDD) techniques are used for analyzing
Single Level Drill Down Interactive Visualization Technique for Descriptive Data Mining Results
pp.33-40 http://dx.doi.org/10.14257/ijgdc.2014.7.4.04 Single Level Drill Down Interactive Visualization Technique for Descriptive Data Mining Results Muzammil Khan, Fida Hussain and Imran Khan Department
Exploitation of Server Log Files of User Behavior in Order to Inform Administrator
Exploitation of Server Log Files of User Behavior in Order to Inform Administrator Hamed Jelodar Computer Department, Islamic Azad University, Science and Research Branch, Bushehr, Iran ABSTRACT All requests
Data Mining and Exploration. Data Mining and Exploration: Introduction. Relationships between courses. Overview. Course Introduction
Data Mining and Exploration Data Mining and Exploration: Introduction Amos Storkey, School of Informatics January 10, 2006 http://www.inf.ed.ac.uk/teaching/courses/dme/ Course Introduction Welcome Administration
Mining an Online Auctions Data Warehouse
Proceedings of MASPLAS'02 The Mid-Atlantic Student Workshop on Programming Languages and Systems Pace University, April 19, 2002 Mining an Online Auctions Data Warehouse David Ulmer Under the guidance
How To Solve The KDD Cup 2010 Challenge
A Lightweight Solution to the Educational Data Mining Challenge Kun Liu Yan Xing Faculty of Automation Guangdong University of Technology Guangzhou, 510090, China [email protected] [email protected]
The Scientific Data Mining Process
Chapter 4 The Scientific Data Mining Process When I use a word, Humpty Dumpty said, in rather a scornful tone, it means just what I choose it to mean neither more nor less. Lewis Carroll [87, p. 214] In
Chapter 2 Literature Review
Chapter 2 Literature Review 2.1 Data Mining The amount of data continues to grow at an enormous rate even though the data stores are already vast. The primary challenge is how to make the database a competitive
AN EFFICIENT APPROACH TO PERFORM PRE-PROCESSING
AN EFFICIENT APPROACH TO PERFORM PRE-PROCESSING S. Prince Mary Research Scholar, Sathyabama University, Chennai- 119 [email protected] E. Baburaj Department of Computer Science & Engineering, Sun Engineering
Web Mining. Margherita Berardi LACAM. Dipartimento di Informatica Università degli Studi di Bari [email protected]
Web Mining Margherita Berardi LACAM Dipartimento di Informatica Università degli Studi di Bari [email protected] Bari, 24 Aprile 2003 Overview Introduction Knowledge discovery from text (Web Content
SIP Service Providers and The Spam Problem
SIP Service Providers and The Spam Problem Y. Rebahi, D. Sisalem Fraunhofer Institut Fokus Kaiserin-Augusta-Allee 1 10589 Berlin, Germany {rebahi, sisalem}@fokus.fraunhofer.de Abstract The Session Initiation
Information Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli ([email protected])
STATISTICA. Financial Institutions. Case Study: Credit Scoring. and
Financial Institutions and STATISTICA Case Study: Credit Scoring STATISTICA Solutions for Business Intelligence, Data Mining, Quality Control, and Web-based Analytics Table of Contents INTRODUCTION: WHAT
Data Mining: Overview. What is Data Mining?
Data Mining: Overview What is Data Mining? Recently * coined term for confluence of ideas from statistics and computer science (machine learning and database methods) applied to large databases in science,
Foundations of Business Intelligence: Databases and Information Management
Foundations of Business Intelligence: Databases and Information Management Problem: HP's numerous systems unable to deliver the information needed for a complete picture of business operations, lack of
Static Data Mining Algorithm with Progressive Approach for Mining Knowledge
Global Journal of Business Management and Information Technology. Volume 1, Number 2 (2011), pp. 85-93 Research India Publications http://www.ripublication.com Static Data Mining Algorithm with Progressive
Data Mining of Web Access Logs
Data Mining of Web Access Logs A minor thesis submitted in partial fulfilment of the requirements for the degree of Master of Applied Science in Information Technology Anand S. Lalani School of Computer
IT and CRM A basic CRM model Data source & gathering system Database system Data warehouse Information delivery system Information users
1 IT and CRM A basic CRM model Data source & gathering system Database system Data warehouse Information delivery system Information users 2 IT and CRM Markets have always recognized the importance of gathering detailed data
Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case Study: Qom Payame Noor University)
IJCSNS International Journal of Computer Science and Network Security, VOL.11 No.6, June 2011 Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case
Foundations of Business Intelligence: Databases and Information Management
Chapter 5 Foundations of Business Intelligence: Databases and Information Management 5.1 Copyright 2011 Pearson Education, Inc. Student Learning Objectives How does a relational database organize data,
Chapter ML:XI. XI. Cluster Analysis
Chapter ML:XI XI. Cluster Analysis Data Mining Overview Cluster Analysis Basics Hierarchical Cluster Analysis Iterative Cluster Analysis Density-Based Cluster Analysis Cluster Evaluation Constrained Cluster
Data Mining System, Functionalities and Applications: A Radical Review
Data Mining System, Functionalities and Applications: A Radical Review Dr. Poonam Chaudhary System Programmer, Kurukshetra University, Kurukshetra Abstract: Data Mining is the process of locating potentially
Quality Control of National Genetic Evaluation Results Using Data-Mining Techniques; A Progress Report
Quality Control of National Genetic Evaluation Results Using Data-Mining Techniques; A Progress Report G. Banos 1, P.A. Mitkas 2, Z. Abas 3, A.L. Symeonidis 2, G. Milis 2 and U. Emanuelson 4 1 Faculty
Selection of Optimal Discount of Retail Assortments with Data Mining Approach
Available online at www.interscience.in Selection of Optimal Discount of Retail Assortments with Data Mining Approach Padmalatha Eddla, Ravinder Reddy, Mamatha Computer Science Department,CBIT, Gandipet,Hyderabad,A.P,India.
An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction
COMP3420: Advanced Databases and Data Mining Classification and prediction: Introduction and Decision Tree Induction Lecture outline Classification versus prediction Classification A two step process Supervised
Binary Coded Web Access Pattern Tree in Education Domain
Binary Coded Web Access Pattern Tree in Education Domain C. Gomathi P.G. Department of Computer Science Kongu Arts and Science College Erode-638-107, Tamil Nadu, India E-mail: [email protected] M. Moorthi
A Comparative Study of Different Log Analyzer Tools to Analyze User Behaviors
A Comparative Study of Different Log Analyzer Tools to Analyze User Behaviors S. Bhuvaneswari P.G Student, Department of CSE, A.V.C College of Engineering, Mayiladuthurai, TN, India. [email protected]
