Towards a Practical Approach to Discover Internal Dependencies in Rule-Based Knowledge Bases
|
|
- Christopher Charles
- 7 years ago
- Views:
Transcription
1 Towards a Practical Approach to Discover Internal Dependencies in Rule-Based Knowledge Bases Roman Simiński, Agnieszka Nowak-Brzezińska, Tomasz Jach, and Tomasz Xiȩski University of Silesia, Institute of Computer Science ul. Bȩdzińska 29, Sosnowiec, Poland {roman.siminski,agnieszka.nowak, tomasz.jach,tomasz.xieski}@us.edu.pl Abstract. In this paper, we intend to introduce the conception of discovering the knowledge about rules saved in large rule-based knowledge bases, both generated automatically and acquired from human experts in the classical way. This paper presents a preliminary study of a new project in which we are going to join the two approaches: the hierarchical decomposition of large rule bases using cluster analysis and the decision units conception. Our goal is to discover useful, potentially implicit and directly unreadable information from large rule sets. Keywords: rule knowledge bases, inference, decision unit, cluster analysis, data mining. 1 Introduction The last decade resulted in significant development of data mining methods, tools and applications. Data mining and knowledge-discovery in databases is the process of automatically searching large volumes of data for useful patterns (typically rules) [1, 2]. We will assume that the result of data mining is a set of rules, herein after referred as a knowledge base. From our point of view, the resulting knowledge base is the most interesting feature, rather than the process of acquiring rules from data, but usually we will have in mind the automatic rules generation method based on the rough set theory [3 5]. When the rule sets have been induced from data they can be applied in several ways, for example to classify the unseen cases in classifiers or to build the knowledge base of decision support systems. However, in many applications the process of rule induction is in fact discovering previously unknown knowledge, discovered rules are previously unknown and potentially surprising. Several hundred and sometimes thousands of rules is a frequent result of a data mining process applied on the real-world data sets. The dependencies between rules from different large rule sets are typically not simple and clear enough. An interesting paradox arises: data mining, which was to bring out knowledge hidden in large databases, often discovers and provides large sets of rules, J.T. Yao et al. (Eds): RSKT 2011, LNCS 6954, pp , c Springer-Verlag Berlin Heidelberg 2011
2 Towards a Practical Approach to Discover Internal Dependencies 233 which contain knowledge which is hard to understand and are directly useless for domain experts. Thus, in many cases the methods of the nontrivial extraction of implicit... and potentially useful information from......large data sets... produce large rule sets with nontrivial, probably useful, but unreadable knowledge for human experts who attempt to verify the discovered knowledge. The main goal of this paper is to introduce the conception of discovering the knowledge about rules saved in large rule-based knowledge bases, both generated automatically and acquired from human experts in the classical way. This paper presents a preliminary study of a new project in which we are going to join the two approaches: the hierarchical decomposition of large rule bases using cluster analysis and the decision units conception. The proposed approach assumes that we reorganize any attributive rule knowledge base from the set of not related rules to groups of similar rules (using cluster analysis) [8] or decision units [6]. These structurs will be a source for extracting relevant information about the rules, this information is useful for the understanding of the internal structure of the knowledge base, as it describes the knowledge about knowledge saved in the rule sets. We can use this information to extract the current decision model implicitly stored in the rule sets, as well as for formal verifying and validating the rules against the expert knowledge. We plan to build a modular, hierarchically organized rule base using the cluster analysis method and decision units. These methods have been successfully used in the optimization of the inference task [6] due to the analysis of the internal properties discovered in the rule sets [8, 12]. In this work we want to introduce an extension of the clustered rules and the decision units oriented for extraction of additional knowledge from rule-based knowledge bases. The practical goal of the project is the implementation of visual-oriented software tool for knowledge engineering: a new, second version of kbbuilder system [12] and HKB_Builder system [8]. 2 Preliminaries and Basic Notation The proposed approach is dedicated to the rule knowledge bases containing the Horn clause rules, where literals are coded using attribute-value pairs. Such form of rules representation is very popular and widely used in data mining. Let R be the rule knowledge base containing m rules: R = {r 1,r 2,...,r m },where r : l 1 l 2... l n c and n is the number of conditional literals in the rule r, l i denotes the i-th conditional literal of r rule and c denotes the conclusion literal of r rule. Let A = {a 1,a 2,...a n } is a non-empty finite set of conditional and decision attributes and for every a A the set V a is called the value set of a: A = {a 1,a 2,...a n }, V a = {v1 a,va 2,...,va k }. Each attribute in a A may be a conditional and/or a decision attribute. Let literals of the rules from R become (a, vi a)wherea A and va i V a and let notations (a, vi a)anda = va i be equivalent. We make no assumption about rules in the set R. This set can be a result of a data mining process as well as it may be created by human expert or knowledge engineer, still it is possible
3 234 R. Simiński et al. that R contains errors and anomalies, we will use such a rule set for forward and backward inference, which probably will create a deep inference path. 3 Decision Units as a Decision Model for Rule Base The concept of decision units originally came into existence as a tool of rule base verification, the decision units can be also considered as a simple tool for rule knowledge base modeling. The concept of a decision unit is described in the [10 12] and due to the limited space in this paper can not be presented in detail. The idea of decision units allows us to divide a set of rules into subsets according to a simple criterion: dividing by the elementary decision (the rules contained in the subset have exactly this same attribute-value pair (a, v) in the conclusion), dividing by the decision (the rules contained in the subset have exactly this same attribute a in the each attribute-value pair (a, v) in the conclusion) or dividing by the general decision (like above, but we do not take into account the information about values appearing in the conclusions of the rules). The decision unit is the triple (I,O,R)whereR is the subset of rules (R R) created according to one of the three criteria described above. O denotes the set of the output entries of the decision unit, output entries are conclusions of rules from R, I denotes the set of the input entries of decision unit, input entries are the conditions of rules from R [12]. According to the selected rules dividing criterion, we obtain three kinds of decision units [11]: elementary decision units denoted as triple U e =(I e,o e,r e ), ordinal decision units denoted as triple U =(I,O,R), general decision units, related to ordinal decision units, but only information about attributes are considered, denoted as triple U g =(I g,o g,r g ). In the case of the two kinds of decision units elementary and ordinal sets I e, I and O e, O contain attribute-value pairs (a, v). For the general decision units we consider only the attributes, therefore I g and O g are the sets of attributes. An attribute usually represents in knowledge base particular concept or property from the real word. The value of an attribute express the determined meaning of the concept or state of the property. If we consider a specified (a, v) pair,we think about a particular kind of concept or about a particular property state. Because in the considered case the pair (a, v) is a conclusion literal, attribute a is a decision attribute. On the high level of abstraction, an elementary decision units D e is similar to i-th decision class XA i tn the decision table [4]. O e represent concrete, ther e describes reasoning about such concrete: we can confirm (or not) hypothesis about concrete using goal-driven backward inference or we can check the availability of concrete based on currently know fact using data-driven forward inference. We can say that an elementary decision units is a model of elementary decision about concrete from real-word concepts. An ordinary decision can be considered as the model of decision making about particular concept or property from the real word. We consider decision about concepts to be on higher level of abstraction than decision about meaning of concepts. Thus therulesetr can be constructed as the composition of elementary decision units D e. A decision unit is similar to decision table DT =(U, A {d}) with single decision attribute d.
4 Towards a Practical Approach to Discover Internal Dependencies 235 Sometimes it is interesting which relations the particular attribute a depends on. It is especially useful when we attempt to discover hidden decision model in an extensive data set which is not known at the early stage. Connections between attributes can express the relations between concepts. For this reason, we introduce general decision units U g. The connections between conclusion and conditional literals are usually hidden in rule bases and only during an inference we discover how sometimes deep the inference chains are. Such knowledge bases generate a few decision units with connections between input and output entries in this way we obtain a decision units net. An example of the usage of the properties of decision units in the inference optimization task is presented in [6, 7], same issues concerning rule base modeling contains [12]. 4 Rules Clusters as a Decision Model for Rule Base The clustering techniques have to be used whenever the size of input data exceeds capabilities of the traditional methods of data analysis. This is the case especially when we deal with a problem of large number of rules in knowledge bases in modern typical decision support system, we should cluster rules in order to increase the overall efficiency. Effective inference in such structures can be a tremendous task usually overwhelming classical systems. This is why the authors try to find some previously unknown connections in the rules which can aid to create more effective systems. At first, the rules are organized into groups of similar rules. Clustering is considered optimal if each cluster consists of very similar rules and if different clusters are easily distinguished with every another cluster. There is a vast number of methods to achieve this goal, among them there are the hierarchical and the non-hierarchical clustering algorithms. Both try to divide the datasets into the clusters, but the first one produces the hierarchical structure of the rules. By organizing data into the clusters, several additional benefits are achieved. Primo, additional information about analyzed data is acquired. The groups of rules can be further analyzed in order to observe the additional patterns which can be used to simplify the structure of the knowledge base either by simplifying the rules or by reducing the number of them. This data would not be found if the division into groups was not established. Secundo, clustering rules is the faster method of finding relevant rules in the knowledge base. It is an undeniable fact that the cluster analysis methods allow to look for the relevant rule much faster than browsing through the entire knowledge base. In this case, the hierarchical methods are better because of the possibility of finding the relevant rule by traversing through the tree of rules made by the hierarchical algorithm: AHC as well as mahc [6 8]. Every level of a tree produces more accurate answer and, additionally, the number of comparisons needed is much less than in the classical systems. The concept of new hierarchical model of the knowledge base is described in [6 8] and due to the limits of this paper can not be presented in detail. Let us just remind that it is represented by Tree = {w 1,,w 2n 1 } or Tree = {w 1,,w k } where
5 236 R. Simiński et al. k 2n 1 (if we apply the mahc algorithm which creates a smaller tree). It is essential to note that each node in that tree: w i = {d i,c i,f,i,j} (where i, j are numbers of clustered groups) is the representative of the conditional parts c i as well as the decisions d i of the rules belonging to it. Also the similarity values f : R R R [0 1] of rules forming the cluster w i to each other is stored. Clustering the rules from the knowledge base allows us to agglomerate a set of the rules into the subsets according to a simple criterion of their inner similarity. Apparently, each representant c i consists of all (or the most frequent) literals c i : l 1 l 2... l k d j of the conditional parts of the rules clustered in a cluster i (k is the number of conditional literals). At the first step of the algorithm, having the set of rules from knowledge base (R), each rule r i (r j R) represents a node w i in created Tree structure. In the second step, two most similar rules (on the basis of comparison the set of conditional attributes and their values) are found and linked in a cluster, which creates a new node at the higher level of the Tree. We may say that all clusters from nodes that are not leaves represent some kind of concepts. That is why the description of the rules forming a cluster is just a connection of all (or the most frequent) pairs (a, v) from conditional part of those rules. The process ends when there is only one cluster with every rule from the system in it (AHC) or at the desired, previously set level (mahc)[6, 8]. Having rules organized into clusters shortens the time needed to find relevant ones. Because the created structure is a kind of binary tree, we can state that the time efficiency of searching such trees is O(log 2 n). It means that the rule interpreter in a given decision support system does not have to search the whole knowledge base in the time efficiency of O(n) (like it is for the classical knowledge bases). However, in order to have hierarchical structure computed (needed for fast lookup of queries), firstly the clustering algorithm has to be executed. Time complexity of typical hierarchical algorithms varies, but is often a member of O(n 2 ) class (noticeable fact: one must execute this algorithm only once). 5 Conclusions The main purpose of this study was to introduce a foundations of practical approach to the internal dependencies discovery in rule-based knowledge bases. We have presented the motivation and objectives of the project, next we have introduced the concept of the representation of knowledge using the clusters analysis and the decision unit. We plan to build a modular, hierarchically organized rule base using these methods. We paraphrase the classical definition of data mining and we shall to introduce the the term knowledge mining the clusters of rules can be analyzed in order to discover the hidden patterns in the knowledge base, the decision units allow us to discover, verify and improve the current decision model. These methods have been successfully used in the optimization of the inference task. We believe that the techniques described in this paper will allow us to extract some kind of the new, metaknowledge from the large knowledge bases. This information will be useful both in knowledge engineering and in the
6 Towards a Practical Approach to Discover Internal Dependencies 237 utilisation of rule bases within the decision support systems. In the first case we expect that the proposed approach will result in improved the quality of knowledge base, in the second case the groving of efficiency of inference is expected. The practical goal of the project is the implementation of visual-oriented software tool for knowledge engineering: a new, second version of kbbuilder system and HKB_Builder system. References 1. Frawley, W., Piatetsky-Shapiro, G., Matheus, C.: Knowledge Discovery in Databases: An Overview. AI Magazine, (Fall 1992) 2. Hand, D., Mannila, H., Smyth, P.: Principles of Data Mining. MIT Press, Cambridge (2001) 3. Nguyen, H.S., Skowron, A.: Rough Set Approach to KDD (Extended Abstract). In: Wang, G., Li, T., Grzymala-Busse, J.W., Miao, D., Skowron, A., Yao, Y. (eds.) RSKT LNCS (LNAI), vol. 5009, pp Springer, Heidelberg (2008) 4. Skowron, A., Komorowski, H.J., Pawlak, Z., Polkowski, L.T.: A rough set perspective on data and knowledge. In: Handbook of Data Mining and Knowledge Discovery, pp Oxford University Press, Oxford (2002) 5. Moshkov, M., Skowron, A., Suraj, Z.: On Minimal Rule Sets for Almost All Binary Information Systems. Fundamenta Informaticae 80 (2007) 6. Nowak, A., Simiński, R., Wakulicz-Deja, A.: Towards modular representation of knowledge base. In: Advances in Soft Computing, pp Physica-Verlag, Springer Verlag Company, Heidelberg (2006) 7. Nowak, A., Simiński, R., Wakulicz-Deja, A.: Two-way optimizations of inference for rule knowledge bases. In: Proceedings of International Conference CSP 2008, Concurrency, Specification And Programming, September 29-October 1, vol. 3, pp Humboldt-Universität, Berlin (2008) 8. Nowak, A., Wakulicz-Deja, A.: The way of rules representation in composited knowledge bases. In: Advanced In Intelligent and Soft Computing, Man - Machine Interactions, pp Springer, Heidelberg (2009) 9. Nowak-Brzezińska, A., Jach, T., Xiȩski, T.: Wybór algorytmu grupowania a efektywność wyszukiwania dokumentów. Studia Informatica, Zeszyty Naukowe Politechniki Śląskiej 31(2A(89)), (2010) 10. Simiński, R., Wakulicz-Deja, A.: Verification of Rule Knowledge Bases Using Decision Units. In: Advances in Soft Computing, Intelligent Information Systems, pp Physica Verlag, Springer Verlag Company, Heidelberg (2000) 11. Simiński, R., Wakulicz-Deja, A.: Decision units as a tool for rule base modeling and verification. In: Advances in Soft Computing, Information Processing and Web Mining, pp Springer Verlag Company, Heidelberg (2003) 12. Simiński, R.: Decision units approach in knowledge base modeling. In: Recent Advances in Intelligent Information Systems, pp Academic Publishing House EXIT, New Jersey (2009)
Rule based Classification of BSE Stock Data with Data Mining
International Journal of Information Sciences and Application. ISSN 0974-2255 Volume 4, Number 1 (2012), pp. 1-9 International Research Publication House http://www.irphouse.com Rule based Classification
More informationData Mining for Manufacturing: Preventive Maintenance, Failure Prediction, Quality Control
Data Mining for Manufacturing: Preventive Maintenance, Failure Prediction, Quality Control Andre BERGMANN Salzgitter Mannesmann Forschung GmbH; Duisburg, Germany Phone: +49 203 9993154, Fax: +49 203 9993234;
More informationA Review of Data Mining Techniques
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,
More informationThe Theory of Concept Analysis and Customer Relationship Mining
The Application of Association Rule Mining in CRM Based on Formal Concept Analysis HongSheng Xu * and Lan Wang College of Information Technology, Luoyang Normal University, Luoyang, 471022, China xhs_ls@sina.com
More informationInclusion degree: a perspective on measures for rough set data analysis
Information Sciences 141 (2002) 227 236 www.elsevier.com/locate/ins Inclusion degree: a perspective on measures for rough set data analysis Z.B. Xu a, *, J.Y. Liang a,1, C.Y. Dang b, K.S. Chin b a Faculty
More informationModelling Complex Patterns by Information Systems
Fundamenta Informaticae 67 (2005) 203 217 203 IOS Press Modelling Complex Patterns by Information Systems Jarosław Stepaniuk Department of Computer Science, Białystok University of Technology Wiejska 45a,
More information131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
More informationMarcin Szczuka The University of Warsaw Poland
Marcin Szczuka The University of Warsaw Poland Why KDD? We are drowning in the sea of data, but what we really want is knowledge PROBLEM: How to retrieve useful information (knowledge) from massive data
More informationBig Data with Rough Set Using Map- Reduce
Big Data with Rough Set Using Map- Reduce Mr.G.Lenin 1, Mr. A. Raj Ganesh 2, Mr. S. Vanarasan 3 Assistant Professor, Department of CSE, Podhigai College of Engineering & Technology, Tirupattur, Tamilnadu,
More informationEnforcing Data Quality Rules for a Synchronized VM Log Audit Environment Using Transformation Mapping Techniques
Enforcing Data Quality Rules for a Synchronized VM Log Audit Environment Using Transformation Mapping Techniques Sean Thorpe 1, Indrajit Ray 2, and Tyrone Grandison 3 1 Faculty of Engineering and Computing,
More informationA ROUGH SET-BASED KNOWLEDGE DISCOVERY PROCESS
Int. J. Appl. Math. Comput. Sci., 2001, Vol.11, No.3, 603 619 A ROUGH SET-BASED KNOWLEDGE DISCOVERY PROCESS Ning ZHONG, Andrzej SKOWRON The knowledge discovery from real-life databases is a multi-phase
More informationROUGH SETS AND DATA MINING. Zdzisław Pawlak
ROUGH SETS AND DATA MINING Zdzisław Pawlak Institute of Theoretical and Applied Informatics, Polish Academy of Sciences, ul. altycka 5, 44 100 Gliwice, Poland ASTRACT The paper gives basic ideas of rough
More informationData Mining Techniques and Opportunities for Taxation Agencies
Data Mining Techniques and Opportunities for Taxation Agencies Florida Consultant In This Session... You will learn the data mining techniques below and their application for Tax Agencies ABC Analysis
More informationA STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS
A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM S Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant
More informationMULTIDIMENSIONAL CLUSTERING DATA VISUALIZATION USING k-medoids ALGORITHM
JOURNAL OF MEDICAL INFORMATICS & TECHNOLOGIES Vol. 22/2013, ISSN 1642-6037 cluster analysis, data mining, k-medoids, BUPA Jacek KOCYBA 1, Tomasz JACH 2, Agnieszka NOWAK-BRZEZIŃSKA 3 MULTIDIMENSIONAL CLUSTERING
More informationAnalysis of Software Process Metrics Using Data Mining Tool -A Rough Set Theory Approach
Analysis of Software Process Metrics Using Data Mining Tool -A Rough Set Theory Approach V.Jeyabalaraja, T.Edwin prabakaran Abstract In the software development industries tasks are optimized based on
More informationTOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam
More informationHealthcare Measurement Analysis Using Data mining Techniques
www.ijecs.in International Journal Of Engineering And Computer Science ISSN:2319-7242 Volume 03 Issue 07 July, 2014 Page No. 7058-7064 Healthcare Measurement Analysis Using Data mining Techniques 1 Dr.A.Shaik
More informationData Mining Solutions for the Business Environment
Database Systems Journal vol. IV, no. 4/2013 21 Data Mining Solutions for the Business Environment Ruxandra PETRE University of Economic Studies, Bucharest, Romania ruxandra_stefania.petre@yahoo.com Over
More informationData Mining in Telecommunication
Data Mining in Telecommunication Mohsin Nadaf & Vidya Kadam Department of IT, Trinity College of Engineering & Research, Pune, India E-mail : mohsinanadaf@gmail.com Abstract Telecommunication is one of
More informationData Mining Framework for Direct Marketing: A Case Study of Bank Marketing
www.ijcsi.org 198 Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing Lilian Sing oei 1 and Jiayang Wang 2 1 School of Information Science and Engineering, Central South University
More informationUsing Semantic Data Mining for Classification Improvement and Knowledge Extraction
Using Semantic Data Mining for Classification Improvement and Knowledge Extraction Fernando Benites and Elena Sapozhnikova University of Konstanz, 78464 Konstanz, Germany. Abstract. The objective of this
More informationClassification and Prediction
Classification and Prediction Slides for Data Mining: Concepts and Techniques Chapter 7 Jiawei Han and Micheline Kamber Intelligent Database Systems Research Lab School of Computing Science Simon Fraser
More informationEnhanced data mining analysis in higher educational system using rough set theory
African Journal of Mathematics and Computer Science Research Vol. 2(9), pp. 184-188, October, 2009 Available online at http://www.academicjournals.org/ajmcsr ISSN 2006-9731 2009 Academic Journals Review
More informationInteractive Exploration of Decision Tree Results
Interactive Exploration of Decision Tree Results 1 IRISA Campus de Beaulieu F35042 Rennes Cedex, France (email: pnguyenk,amorin@irisa.fr) 2 INRIA Futurs L.R.I., University Paris-Sud F91405 ORSAY Cedex,
More informationData Quality Mining: Employing Classifiers for Assuring consistent Datasets
Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, fabian.gruening@informatik.uni-oldenburg.de Abstract: Independent
More informationAssessing Data Mining: The State of the Practice
Assessing Data Mining: The State of the Practice 2003 Herbert A. Edelstein Two Crows Corporation 10500 Falls Road Potomac, Maryland 20854 www.twocrows.com (301) 983-3555 Objectives Separate myth from reality
More informationDATA MINING PROCESS FOR UNBIASED PERSONNEL ASSESSMENT IN POWER ENGINEERING COMPANY
Medunarodna naucna konferencija MENADŽMENT 2012 International Scientific Conference MANAGEMENT 2012 Mladenovac, Srbija, 20-21. april 2012 Mladenovac, Serbia, 20-21 April, 2012 DATA MINING PROCESS FOR UNBIASED
More informationIntroduction to Data Mining and Machine Learning Techniques. Iza Moise, Evangelos Pournaras, Dirk Helbing
Introduction to Data Mining and Machine Learning Techniques Iza Moise, Evangelos Pournaras, Dirk Helbing Iza Moise, Evangelos Pournaras, Dirk Helbing 1 Overview Main principles of data mining Definition
More informationEFFICIENT DATA PRE-PROCESSING FOR DATA MINING
EFFICIENT DATA PRE-PROCESSING FOR DATA MINING USING NEURAL NETWORKS JothiKumar.R 1, Sivabalan.R.V 2 1 Research scholar, Noorul Islam University, Nagercoil, India Assistant Professor, Adhiparasakthi College
More informationFull and Complete Binary Trees
Full and Complete Binary Trees Binary Tree Theorems 1 Here are two important types of binary trees. Note that the definitions, while similar, are logically independent. Definition: a binary tree T is full
More informationAn Analysis of Missing Data Treatment Methods and Their Application to Health Care Dataset
P P P Health An Analysis of Missing Data Treatment Methods and Their Application to Health Care Dataset Peng Liu 1, Elia El-Darzi 2, Lei Lei 1, Christos Vasilakis 2, Panagiotis Chountas 2, and Wei Huang
More informationFREQUENT PATTERN MINING FOR EFFICIENT LIBRARY MANAGEMENT
FREQUENT PATTERN MINING FOR EFFICIENT LIBRARY MANAGEMENT ANURADHA.T Assoc.prof, atadiparty@yahoo.co.in SRI SAI KRISHNA.A saikrishna.gjc@gmail.com SATYATEJ.K satyatej.koganti@gmail.com NAGA ANIL KUMAR.G
More informationPredicting the Risk of Heart Attacks using Neural Network and Decision Tree
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
More informationManjeet Kaur Bhullar, Kiranbir Kaur Department of CSE, GNDU, Amritsar, Punjab, India
Volume 5, Issue 6, June 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Multiple Pheromone
More informationHow To Improve Cloud Computing With An Ontology System For An Optimal Decision Making
International Journal of Computational Engineering Research Vol, 04 Issue, 1 An Ontology System for Ability Optimization & Enhancement in Cloud Broker Pradeep Kumar M.Sc. Computer Science (AI) Central
More informationIntroduction to Data Mining Techniques
Introduction to Data Mining Techniques Dr. Rajni Jain 1 Introduction The last decade has experienced a revolution in information availability and exchange via the internet. In the same spirit, more and
More informationDatabase Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski
More informationUsing Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
More informationData quality in Accounting Information Systems
Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania
More informationThe Data Mining Process
Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data
More informationBOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL
The Fifth International Conference on e-learning (elearning-2014), 22-23 September 2014, Belgrade, Serbia BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL SNJEŽANA MILINKOVIĆ University
More informationKnowledge Based Descriptive Neural Networks
Knowledge Based Descriptive Neural Networks J. T. Yao Department of Computer Science, University or Regina Regina, Saskachewan, CANADA S4S 0A2 Email: jtyao@cs.uregina.ca Abstract This paper presents a
More informationUse of Data Mining in the field of Library and Information Science : An Overview
512 Use of Data Mining in the field of Library and Information Science : An Overview Roopesh K Dwivedi R P Bajpai Abstract Data Mining refers to the extraction or Mining knowledge from large amount of
More informationNEW TECHNIQUE TO DEAL WITH DYNAMIC DATA MINING IN THE DATABASE
www.arpapress.com/volumes/vol13issue3/ijrras_13_3_18.pdf NEW TECHNIQUE TO DEAL WITH DYNAMIC DATA MINING IN THE DATABASE Hebah H. O. Nasereddin Middle East University, P.O. Box: 144378, Code 11814, Amman-Jordan
More informationBig Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network
, pp.273-284 http://dx.doi.org/10.14257/ijdta.2015.8.5.24 Big Data Analytics of Multi-Relationship Online Social Network Based on Multi-Subnet Composited Complex Network Gengxin Sun 1, Sheng Bin 2 and
More informationSPATIAL DATA CLASSIFICATION AND DATA MINING
, pp.-40-44. Available online at http://www. bioinfo. in/contents. php?id=42 SPATIAL DATA CLASSIFICATION AND DATA MINING RATHI J.B. * AND PATIL A.D. Department of Computer Science & Engineering, Jawaharlal
More informationAn Outline of a Theory of Three-way Decisions
An Outline of a Theory of Three-way Decisions Yiyu Yao Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada S4S 0A2 E-mail: yyao@cs.uregina.ca Abstract. A theory of three-way
More informationA NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE Kasra Madadipouya 1 1 Department of Computing and Science, Asia Pacific University of Technology & Innovation ABSTRACT Today, enormous amount of data
More informationEFFECTIVE CONSTRUCTIVE MODELS OF IMPLICIT SELECTION IN BUSINESS PROCESSES. Nataliya Golyan, Vera Golyan, Olga Kalynychenko
380 International Journal Information Theories and Applications, Vol. 18, Number 4, 2011 EFFECTIVE CONSTRUCTIVE MODELS OF IMPLICIT SELECTION IN BUSINESS PROCESSES Nataliya Golyan, Vera Golyan, Olga Kalynychenko
More informationModeling BPMN Diagrams within XTT2 Framework. A Critical Analysis**
AUTOMATYKA 2011 Tom 15 Zeszyt 2 Antoni Ligêza*, Tomasz Maœlanka*, Krzysztof Kluza*, Grzegorz Jacek Nalepa* Modeling BPMN Diagrams within XTT2 Framework. A Critical Analysis** 1. Introduction Design, analysis
More informationA Rough Set View on Bayes Theorem
A Rough Set View on Bayes Theorem Zdzisław Pawlak* University of Information Technology and Management, ul. Newelska 6, 01 447 Warsaw, Poland Rough set theory offers new perspective on Bayes theorem. The
More informationData Mining and KDD: A Shifting Mosaic. Joseph M. Firestone, Ph.D. White Paper No. Two. March 12, 1997
1 of 11 5/24/02 3:50 PM Data Mining and KDD: A Shifting Mosaic By Joseph M. Firestone, Ph.D. White Paper No. Two March 12, 1997 The Idea of Data Mining Data Mining is an idea based on a simple analogy.
More informationData Mining. 1 Introduction 2 Data Mining methods. Alfred Holl Data Mining 1
Data Mining 1 Introduction 2 Data Mining methods Alfred Holl Data Mining 1 1 Introduction 1.1 Motivation 1.2 Goals and problems 1.3 Definitions 1.4 Roots 1.5 Data Mining process 1.6 Epistemological constraints
More informationDATA MINING TECHNIQUES AND APPLICATIONS
DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,
More informationStatistics 215b 11/20/03 D.R. Brillinger. A field in search of a definition a vague concept
Statistics 215b 11/20/03 D.R. Brillinger Data mining A field in search of a definition a vague concept D. Hand, H. Mannila and P. Smyth (2001). Principles of Data Mining. MIT Press, Cambridge. Some definitions/descriptions
More informationPrediction of Stock Performance Using Analytical Techniques
136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University
More informationImproving Knowledge-Based System Performance by Reordering Rule Sequences
Improving Knowledge-Based System Performance by Reordering Rule Sequences Neli P. Zlatareva Department of Computer Science Central Connecticut State University 1615 Stanley Street New Britain, CT 06050
More informationWeb Data Extraction: 1 o Semestre 2007/2008
Web Data : Given Slides baseados nos slides oficiais do livro Web Data Mining c Bing Liu, Springer, December, 2006. Departamento de Engenharia Informática Instituto Superior Técnico 1 o Semestre 2007/2008
More informationAn Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
More informationDMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support
DMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support Rok Rupnik, Matjaž Kukar, Marko Bajec, Marjan Krisper University of Ljubljana, Faculty of Computer and Information
More informationData Mining System, Functionalities and Applications: A Radical Review
Data Mining System, Functionalities and Applications: A Radical Review Dr. Poonam Chaudhary System Programmer, Kurukshetra University, Kurukshetra Abstract: Data Mining is the process of locating potentially
More informationA ROUGH CLUSTER ANALYSIS OF SHOPPING ORIENTATION DATA. Kevin Voges University of Canterbury. Nigel Pope Griffith University
Abstract A ROUGH CLUSTER ANALYSIS OF SHOPPING ORIENTATION DATA Kevin Voges University of Canterbury Nigel Pope Griffith University Mark Brown University of Queensland Track: Marketing Research and Research
More informationDynamic Data in terms of Data Mining Streams
International Journal of Computer Science and Software Engineering Volume 2, Number 1 (2015), pp. 1-6 International Research Publication House http://www.irphouse.com Dynamic Data in terms of Data Mining
More informationInternational Journal of Advance Research in Computer Science and Management Studies
Volume 2, Issue 12, December 2014 ISSN: 2321 7782 (Online) International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online
More informationFundations of Data Mining
A Step Towards the Foundations of Data Mining Y.Y. Yao Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 ABSTRACT This paper addresses some fundamental issues related
More informationData mining and complex telecommunications problems modeling
Paper Data mining and complex telecommunications problems modeling Janusz Granat Abstract The telecommunications operators have to manage one of the most complex systems developed by human beings. Moreover,
More informationWeb-Based Support Systems with Rough Set Analysis
Web-Based Support Systems with Rough Set Analysis JingTao Yao and Joseph P. Herbert Department of Computer Science University of Regina, Regina Canada S4S 0A2 {jtyao,herbertj}@cs.uregina.ca Abstract. Rough
More informationSocial Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
More informationISSUES IN RULE BASED KNOWLEDGE DISCOVERING PROCESS
Advances and Applications in Statistical Sciences Proceedings of The IV Meeting on Dynamics of Social and Economic Systems Volume 2, Issue 2, 2010, Pages 303-314 2010 Mili Publications ISSUES IN RULE BASED
More informationArtificial Neural Network Approach for Classification of Heart Disease Dataset
Artificial Neural Network Approach for Classification of Heart Disease Dataset Manjusha B. Wadhonkar 1, Prof. P.A. Tijare 2 and Prof. S.N.Sawalkar 3 1 M.E Computer Engineering (Second Year)., Computer
More informationIntroduction. Introduction. Spatial Data Mining: Definition WHAT S THE DIFFERENCE?
Introduction Spatial Data Mining: Progress and Challenges Survey Paper Krzysztof Koperski, Junas Adhikary, and Jiawei Han (1996) Review by Brad Danielson CMPUT 695 01/11/2007 Authors objectives: Describe
More informationCritical Success Factors for Implementing CRM Using Data Mining*
Interscience Management Review, Vol.I/1, 2008 Critical Success Factors for Implementing CRM Using Data Mining* Jayanti Ranjan 1 Abstract: Vishal Bhatnagar 2 The paper presents the Critical success factors
More informationMobile Phone APP Software Browsing Behavior using Clustering Analysis
Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7 9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis
More informationSelf Organizing Maps for Visualization of Categories
Self Organizing Maps for Visualization of Categories Julian Szymański 1 and Włodzisław Duch 2,3 1 Department of Computer Systems Architecture, Gdańsk University of Technology, Poland, julian.szymanski@eti.pg.gda.pl
More informationSECURITY AND DATA MINING
SECURITY AND DATA MINING T. Y. Lin Mathematics and Computer Science Department, San Jose State University San Jose, California 95120, USA 408-924-512(voice), 408-924-5080 (fax), tylin@cs.sjsu.edu T. H.
More informationIntroduction. A. Bellaachia Page: 1
Introduction 1. Objectives... 3 2. What is Data Mining?... 4 3. Knowledge Discovery Process... 5 4. KD Process Example... 7 5. Typical Data Mining Architecture... 8 6. Database vs. Data Mining... 9 7.
More informationTesting LTL Formula Translation into Büchi Automata
Testing LTL Formula Translation into Büchi Automata Heikki Tauriainen and Keijo Heljanko Helsinki University of Technology, Laboratory for Theoretical Computer Science, P. O. Box 5400, FIN-02015 HUT, Finland
More informationWhy? A central concept in Computer Science. Algorithms are ubiquitous.
Analysis of Algorithms: A Brief Introduction Why? A central concept in Computer Science. Algorithms are ubiquitous. Using the Internet (sending email, transferring files, use of search engines, online
More informationCOMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction
COMP3420: Advanced Databases and Data Mining Classification and prediction: Introduction and Decision Tree Induction Lecture outline Classification versus prediction Classification A two step process Supervised
More informationData Mining Fundamentals
Part I Data Mining Fundamentals Data Mining: A First View Chapter 1 1.11 Data Mining: A Definition Data Mining The process of employing one or more computer learning techniques to automatically analyze
More informationA Data Mining Study of Weld Quality Models Constructed with MLP Neural Networks from Stratified Sampled Data
A Data Mining Study of Weld Quality Models Constructed with MLP Neural Networks from Stratified Sampled Data T. W. Liao, G. Wang, and E. Triantaphyllou Department of Industrial and Manufacturing Systems
More informationRecognition. Sanja Fidler CSC420: Intro to Image Understanding 1 / 28
Recognition Topics that we will try to cover: Indexing for fast retrieval (we still owe this one) History of recognition techniques Object classification Bag-of-words Spatial pyramids Neural Networks Object
More informationMeta-learning. Synonyms. Definition. Characteristics
Meta-learning Włodzisław Duch, Department of Informatics, Nicolaus Copernicus University, Poland, School of Computer Engineering, Nanyang Technological University, Singapore wduch@is.umk.pl (or search
More informationNuggets and Data Mining
Nuggets and Data Mining White Paper Michael Gilman, PhD Data Mining Technologies Inc. Melville, NY 11714 (631) 692-4400 e-mail mgilman@data-mine.com June 2004 Management Summary - The Bottom Line Data
More informationComparison of K-means and Backpropagation Data Mining Algorithms
Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and
More informationProcess Mining by Measuring Process Block Similarity
Process Mining by Measuring Process Block Similarity Joonsoo Bae, James Caverlee 2, Ling Liu 2, Bill Rouse 2, Hua Yan 2 Dept of Industrial & Sys Eng, Chonbuk National Univ, South Korea jsbae@chonbukackr
More informationSome Research Challenges for Big Data Analytics of Intelligent Security
Some Research Challenges for Big Data Analytics of Intelligent Security Yuh-Jong Hu hu at cs.nccu.edu.tw Emerging Network Technology (ENT) Lab. Department of Computer Science National Chengchi University,
More informationIMPLEMENTATION OF MS ACCESS SOFTWARE FOR CASING-CLASS MANUFACTURING FEATURES SAVING
constructional data, database, casing-class part, MS Access Arkadiusz GOLA *, Łukasz SOBASZEK ** IMPLEMENTATION OF MS ACCESS SOFTWARE FOR CASING-CLASS MANUFACTURING FEATURES SAVING Abstract Manufacturing
More informationVisualization of Breast Cancer Data by SOM Component Planes
International Journal of Science and Technology Volume 3 No. 2, February, 2014 Visualization of Breast Cancer Data by SOM Component Planes P.Venkatesan. 1, M.Mullai 2 1 Department of Statistics,NIRT(Indian
More informationClustering through Decision Tree Construction in Geology
Nonlinear Analysis: Modelling and Control, 2001, v. 6, No. 2, 29-41 Clustering through Decision Tree Construction in Geology Received: 22.10.2001 Accepted: 31.10.2001 A. Juozapavičius, V. Rapševičius Faculty
More informationStandardization of Components, Products and Processes with Data Mining
B. Agard and A. Kusiak, Standardization of Components, Products and Processes with Data Mining, International Conference on Production Research Americas 2004, Santiago, Chile, August 1-4, 2004. Standardization
More informationEnsemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05
Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 2015-03-05 Roman Kern (KTI, TU Graz) Ensemble Methods 2015-03-05 1 / 38 Outline 1 Introduction 2 Classification
More informationComputational Intelligence in Data Mining and Prospects in Telecommunication Industry
Journal of Emerging Trends in Engineering and Applied Sciences (JETEAS) 2 (4): 601-605 Scholarlink Research Institute Journals, 2011 (ISSN: 2141-7016) jeteas.scholarlinkresearch.org Journal of Emerging
More informationInner Classification of Clusters for Online News
Inner Classification of Clusters for Online News Harmandeep Kaur 1, Sheenam Malhotra 2 1 (Computer Science and Engineering Department, Shri Guru Granth Sahib World University Fatehgarh Sahib) 2 (Assistant
More informationWhy Semantic Analysis is Better than Sentiment Analysis. A White Paper by T.R. Fitz-Gibbon, Chief Scientist, Networked Insights
Why Semantic Analysis is Better than Sentiment Analysis A White Paper by T.R. Fitz-Gibbon, Chief Scientist, Networked Insights Why semantic analysis is better than sentiment analysis I like it, I don t
More informationInteractive Dynamic Information Extraction
Interactive Dynamic Information Extraction Kathrin Eichler, Holmer Hemsen, Markus Löckelt, Günter Neumann, and Norbert Reithinger Deutsches Forschungszentrum für Künstliche Intelligenz - DFKI, 66123 Saarbrücken
More informationData Mining Applications in Fund Raising
Data Mining Applications in Fund Raising Nafisseh Heiat Data mining tools make it possible to apply mathematical models to the historical data to manipulate and discover new information. In this study,
More informationInternational Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
More informationA Fast Host-Based Intrusion Detection System Using Rough Set Theory
A Fast Host-Based Intrusion Detection System Using Rough Set Theory Sanjay Rawat 1,2, V P Gulati 2, and Arun K Pujari 1 1 AI Lab, Dept. of Computer and Information Sciences University of Hyderabad, Hyderabad-500046,
More information