Qualitative Modelling
|
|
|
- Vanessa Burns
- 9 years ago
- Views:
Transcription
1 Qualitative Modelling Ivan Bratko Faculty of Computer and Information Sc., University of Ljubljana Abstract. Traditional, quantitative simulation based on quantitative models aims at producing precise numerical results as answers to user s questions about the problem domain. Such precise numerical answers are often overly elaborate, and they contain much more information than it is actually needed. In every day life, humans use common sense to reason about problems qualitatively, without numbers. In the area of Artificial Intelligence, methods exist for qualitative modelling and simulation. In this paper, we review some ideas of qualitative reasoning and modelling. We also discuss qualitative data mining as an approach to the analysis of numerical data, and Q2 learning which combines qualitative and quantitative approaches to modelling from data. Keywords: modelling, qualitative, quantitative, simulation, data mining, Q2 learning Quantitative vs. qualitative modelling Traditional, quantitative modelling and simulation give precise numerical answers. For everyday use, such answers are often overly elaborate. For example, consider the bath tub in Figure 1. Assume the tub is initially empty, there is a constant flow from the tap, and that the drain is closed, so there is no out flow. What will happen? To answer this question, the physicist's solution would be to write down a differential equation model of this system, and run this model by numerical simulation. The numerical simulation would produce a table with, say, 1000 rows, giving the exact values of the level 8
2 of water at consecutive tabulated time points. The table would show, for example, that the level will reach the top of the tub at 65.5 cm in sec. For everyday use, such an elaborate answer is overkill. A common sense answer that completely suffices for everyday purposes, and is actually much more appropriate, is instead something like this: "The water level will keep increasing and will eventually reach the top. After this, water will be overflowing and cause a flood in the bathroom." This gives just a useful summary of a possibly large amount of quantitative information. The physicist's answer was quantitative, giving precise numerical information. The common sense answer was qualitative, just giving a useful summary of the large amount of quantitative information. top Level zero Figure 1: Bath tub with some input flow and closed drain. Qualitative modelling and reasoning in AI The humans are good at common sense, qualitative reasoning. Traditionally, computerbased methods are on the other hand mostly numerical -- just the opposite to common sense. Can the common sense, qualitative approach also be computerised? The area of qualitative reasoning and modelling in Artificial Intelligence aims at this (e.g. Weld and de Kleer 1990). It is concerned with the formalisation of and algorithms for qualitative reasoning about the world, producing qualitative, non-numerical answers to questions that are typically answered numerically by "proper" physics. To emphasise the contrast between the "proper" physics as taught in schools, and the qualitative, common sense reasoning about the physical world, the qualitative physics is sometimes also called naive physics. 9
3 Principles of qualitative abstraction Qualitative reasoning is usually viewed as an abstraction of quantitative reasoning. Accordingly, in qualitative reasoning some numerical details are discarded; instead, a rather simpler qualitative summary of these numerical details is retained. There are many ways of abstracting away from detailed numerical information. Some abstraction principles are explained through examples in the following paragraphs. Numbers are abstracted into symbolic values and intervals. For example, consider the quantitative statement: Level at 3.2 sec is 2.6 cm. A qualitative abstraction of this is: Level at time t1 is between the bottom and the top of the bath tab. Notice here that 3.2 sec has been replaced by a symbolic time point t1. So instead of giving exact time, this just says that there is a time point, referred to as t1, at which Level has the given qualitative value. Regarding this qualitative value, the whole set of numbers between 0 and 65.5 has been collapsed into a symbolic interval zero..top. A further abstraction would be to disregard the top of the tub as an important value, and simply state: Level at time t1 is positive, written as: Level(t1) = pos. Another qualitative abstraction principle is to simplify the time derivatives into just directions of change. Such a qualitative statement can be: Level at time t1 is increasing. Another very useful qualitative abstraction principle is to simplify functions into monotonic relations. Consider for example the quantitative statement that states the relation between the amount of water and the level of water: Amount = f( Level) = 25 * Level *Level. A qualitative abstraction of this can be: For Level 0, Amount is a monotonically increasing function of Level, written formally as: Amount = M + ( Level). That is: if Level increases then Amount increases as well, and vice versa. Notice that this statement is very simple, it is based on common everyday experience. The qualitative statement is also very general because it is true for every bathtub, regardless if its shape, it is actually true for every container. 10
4 Qualitative abstraction is related to qualitative modelling. Numerical models are an abstraction of the real world. Qualitative models are often viewed as a further abstraction of numerical models. In this abstraction some quantitative information is abstracted away. For example, a quantitative model of the water flow in a river may state that the flow Flow depends on the level Level of water in the river in some complicated way which also takes into account the shape of the river bed. In a qualitative model this may be abstracted into a monotonically increasing relation: M + ( Level, Flow) This just says that the greater the level the greater the flow, without specifying this in any more concrete and detailed way. Obviously, it is much easier to design such coarse qualitative models than precise quantitative models. Why qualitative modelling? Here we discuss some advantages of qualitative modelling with respect to the traditional, quantitative modelling. Of course there are many situations where a qualitative model, due to lack of precise numerical information, is not sufficient. However, there are many situations in which a qualitative model has advantages. First, qualitative modelling is easier than quantitative modelling. Precise relations among the variables in the system to be modelled may be hard or impossible to determine, but it is usually still possible to state some qualitative relations among the variables. Also, even if a complete quantitative model is known, such a model still requires the knowledge of all the, possibly many, numerical parameters in the model. For example, a numerical physiological model may require the precise electrical conductance of a neuron, its length and width etc. These parameters may be hard or impossible to measure. Yet, to run such a numerical model, a numerical simulator will require the values of all these parameters to be specified by the user before the simulation can start. Usually the user will then make some guesses at these parameters and hope that they are not too far off their real values. But then the user will not know how far the simulation results are from the truth. The user will typically not know even if the obtained results are qualitatively correct. With a qualitative model, much 11
5 of such guesswork can be avoided, and in the end the user will at least be sure about the qualitative correctness of the simulations. So, paradoxically, quantitative results, although more precise than qualitative results, are in greater danger of being incorrect and completely useless, because the accumulated error may become too gross. For example, in an ecological model, even without knowing the precise parameters of growth and mortality rates etc. for the species in the model, a qualitative model may answer the question whether certain species will eventually become extinct, or possibly different species will interchange their temporal domination in time cycles. A qualitative simulator may find such an answer by finding all the possible qualitative behaviours that correspond to all possible combinations of the values of the parameters in the model. Another point is that for many tasks, numerical precision is not required. Often it only obscures the essential properties of the system. Generic tasks in which qualitative modelling is often more appropriate include functional reasoning, diagnosis and structural synthesis. Functional reasoning is concerned with questions like: How does a device or a system work? In a diagnostic task we are interested in defects that caused the observed abnormal behaviour of the system. Usually, we are only interested in those deviations from the normal state that caused a behaviour that is qualitatively different from normal. The problem of structural synthesis is: Given some basic building blocks, find their combination which achieves a given function. For example, put the available components together to achieve the effect of cooling. In other words, invent the refrigerator from "first principles". The basic building blocks can be available technical components, or just the laws of physics, or materials with certain properties. In such design from first principles, the goal is to synthesise a structure capable of achieving some given function through some mechanism. In the early, most innovative stage of design, this mechanism is described qualitatively. Only at a later stage of design when the structure is already known, quantitative synthesis also becomes important. The use of qualitative models requires qualitative reasoning. Some techniques of qualitative reasoning are presented e.g. in (Kuipers, 1993) and (Bratko 2001, chapter 20). Bratko et al. (1989) present a complex application of qualitative modelling in the diagnosis of cardiac arrhythmias. 12
6 Qualitative data mining Consider the mining of numerical data. Usually, in machine learning used for numerical data mining, the goal is to construct a numerical function of the form y = f(x 1, x 2,...) where y is a distinguished variable called the dependent variable, and x 1, x 2,... are attributes (or independent variables). Examples of this kind of numerical learning are regression trees (CART, Breiman et al. 84), Retis (Karalič 92), M5 (Quinlan 1993). In contrast to this, we may consider qualitative data mining which aims at finding qualitative patterns, or qualitative relationships in numerical data. An obvious motivation for qualitative data mining comes from the fact that for some tasks, qualitative models are more suitable than classical quantitative, numerical models, as discussed above. This kind of reasoning enables the solution of certain types of problems without resorting to numerical computation, and without the use of a quantitative model of the system in question. When a problem can be solved at the qualitative level of abstraction, there is no need for building a quantitative model often a demanding or unrealistic task. Building qualitative models for complex systems should be easier. While there are many machine learning or data mining tools that support the building of numerical models from data, there are few tools to support the building of qualitative models from data. Below we present one approach to qualitative data mining, realized in the program QUIN in which the target concepts are expressed by so-called qualitative decision trees. The QUIN approach to qualitative data mining QUIN (Qualitative Induction) is a learning program that looks for qualitative patterns in numerical data (Šuc 2001; Šuc and Bratko 2001). These patterns are then combined into a so-called qualitative tree. Induction of qualitative trees is similar to the well-known induction of decision trees. The difference is that in decision trees the leaves are labelled with class values, whereas in qualitative trees the leaves are labelled with what we call monotonic qualitative constraints (MQC). MQCs are a generalisation of monotonic 13
7 relationships explained earlier, that are widely used in the field of qualitative reasoning, for example Y = M + (X). This says that Y is a monotonically increasing function of X. In general, MQCs can have more than one argument. For example, Z = M +,- (X,Y) says that Z monotonically increases in X and decreases in Y. Monotonic constraints can be combined into if-then rules to express piece-wise monotonic functional relationships. For example: if X < 0 then Y = M - (X) else Y = M + (X) Nested if-then expressions can be represented as trees, called qualitative trees (Šuc 2001). Qualitative trees are similar to regression trees (Breiman et al. 1984). Both regression and qualitative trees describe how a numerical variable depends on other variables. The difference between the two types of trees only occurs in the leaves. A leaf of a regression tree specifies a numerical regression function that tells how the class variable numerically depends on the attributes within the scope of the leaf. On the other hand, a leaf in a qualitative tree only specifies the relation between the class and the attributes qualitatively, in terms of monotonic qualitative constraints. QUIN takes as input a set of numerical examples and looks for regions in the data space where monotonicity constraints hold. Such a set of qualitative patterns are represented in terms of a qualitative tree. As in decision trees, the internal nodes in a qualitative tree specify conditions that split the attribute space into subspaces. In a qualitative tree, however, each leaf specifies a MQC that holds among the input data that fall into that leaf. As a simple example consider a data set with three variables X, Y and Z where data triples (X, Y, Z) correspond to the function Z = X 2 Y 2, possibly with some Gaussian noise added. When QUIN is asked to find in these data qualitative constraints on Z as a function of X and Y, QUIN generates the qualitative tree that can be represented by the following nested if-then-else expression: if X < 0 then if Y < 0 then Z = M -,+ (X,Y) else Z = M -,- (X,Y) else if Y < 0 then Z = M +,+ (X,Y) else Z = M +,- (X,Y) 14
8 This tree partitions the data space into four regions that correspond to the four leaves of the tree. A different MQC applies in each of the leaves. The tree describes how Z qualitatively depends on X and Y. QUIN can tolerate some noise in the data and would induce in this example the same tree (with thresholds for X and Y slightly distorted) even when moderate noise is added to the data. Technical details of the QUIN algorithm can be found in (Šuc 2001) or (Šuc and Bratko 2001) where QUIN s performance on noisy data is also studied. Q 2 learning Q 2 learning (Šuc et al. 2004) stands for qualitatively faithful quantitative learning. This approach combines qualitative and numerical learning from numerical data. The idea is to first find qualitative relationships in the data, and then use these qualitative properties to guide the numerical learning from the same data. This second stage can be carried out by a numerical regression method so that this method is forced to respect the observed qualitative relationships in the data. The Q 2 method as an approach to problem solving is illustrated in Fig. 2. Quantitative problem Qualitative abstraction Quantitative solution Quantitative concretion Qualitative problem Qualitative problem solving Qualitative solution Fig. 2 Solving quantitative problems by means of qualitative abstraction Advantages of Q 2 learning can be two fold. The induced qualitative model enables nice causal interpretation of the relations among the variables in the system, as one would expect from a qualitative model. More surprisingly, however, it was also shown in applications of Q 2 that the final quantitative model may yield numerical predictions that are significantly more accurate than those provided by state-of-the-art numerical modelling methods on their own. 15
9 In (Šuc et al. 2004) Q 2 was applied to induce a model of a complex, industrially relevant mechanical system (a car wheel suspension system). Thus the main message of this case study is that a combination of methods for qualitative and quantitative system identification methods has good chances to attain significant improvements over numerical system identification techniques, including techniques of numerical machine learning methods, such as regression trees, model trees, and locally weighted regression. Examples of application of Q 2 learning in ecological modelling are described by Vladušič et al. (2005; growth of plankton in Lake Glumsoe) and Žabkar et al. (2005; prediction of harmful ozone concentrations in Ljubljana and Nova Gorica). Summary and conclusions In this paper we reviewed some concepts in the qualitative approach to modelling, and some ideas of qualitative data mining or qualitative machine learning, and their combination with numerical machine learning exemplified by the Q 2 approach. We also reviewed some applications in which this approach was applied. These case studies demonstrate, as one would expect, that induced qualitative patterns are useful to facilitate the user s understanding of the domain of application, and to enhance the knowledge about the domain. However, less expected, we also found another very common application scenario of qualitative learning, when we are not just interested in qualitative relations, but in a concrete quantitative solution (e.g. making quantitative predictions, synthesising a numerical controller for a dynamic system, numerical system identification). This is facilitated by the Q2 method which combines qualitative and numerical learning. Acknowledgement The work described in this paper was in part done within the research programme Artificial Intelligence and Intelligent Systems supported by the Slovenian research agency ARRS, and in the European 5th framework project Clockwork, supported by the European Commission. 16
10 References Bratko, I. (2001) Prolog Programming for Artificial Intelligence, 3rd edition. Pearson Education. Bratko, I., Mozetič, I., Lavrač, N. (1989) KARDIO: a Study in Deep and Qualitative Knowledge for Expert Systems. Cambridge, MA: MIT Press. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J. Classification and Regression Trees. Monterey, CA: Wadsford, Karalič, A. Employing linear regression in regression tree leaves. Proc. ECAI 92 (10th European Conference on Artificial Intelligence), pp Vienna Wiley & Sons. de Kleer, J., Williams, B.C. (1991, eds.) Artificial Intelligence Journal, Vol. 51 (Special Issue on Qualitative Reasoning about Physical Systems II). Kuipers, B.J. (1994) Qualitative Reasoning: Modeling and Simulation with Incomplete Knowledge. Cambridge, MA: MIT Press. Šuc, D. (2001) Machine Reconstruction of Human Control Strategies. PhD Thesis, University of Ljubljana, Faculty of Computer and Information Sc., Ljubljana Also appeared as book in series Frontiers in Artificial Intelligence and Applications. IOS Press, Amsterdam, The Netherlands (2003). Šuc, D., Bratko, I. (2001) Induction of qualitative trees. Proc. Machine learning: ECML 2001: (Lecture notes in artificial intelligence, Lecture notes in computer science, 2167) pp Berlin: Springer, 2001, Šuc, D., Vladušič, D., Bratko, I., Qualitatively faithful quantitative prediction. Artificial Intelligence Journal 158,
11 Vladušič, D., Kompare, B., Bratko, I. (2005) Modelling phytoplankton in Lake Glumsoe with Q2 learning. Ecological Modelling (in press). Weld, D.S., de Kleer, J. (1990) Readings in Qualitative Reasoning about Physical Systems, San Mateo, CA: Morgan Kaufmann. Jure Žabkar, J., Rahela Žabkar, R., Vladušič, D., Čemas D., Šuc, D. and Bratko, I. (2005) Q 2 prediction of ozone concentrations. Ecological Modelling (in press). 18
Benchmarking Open-Source Tree Learners in R/RWeka
Benchmarking Open-Source Tree Learners in R/RWeka Michael Schauerhuber 1, Achim Zeileis 1, David Meyer 2, Kurt Hornik 1 Department of Statistics and Mathematics 1 Institute for Management Information Systems
Decision Tree Learning on Very Large Data Sets
Decision Tree Learning on Very Large Data Sets Lawrence O. Hall Nitesh Chawla and Kevin W. Bowyer Department of Computer Science and Engineering ENB 8 University of South Florida 4202 E. Fowler Ave. Tampa
Efficiency in Software Development Projects
Efficiency in Software Development Projects Aneesh Chinubhai Dharmsinh Desai University [email protected] Abstract A number of different factors are thought to influence the efficiency of the software
Data Mining: A Preprocessing Engine
Journal of Computer Science 2 (9): 735-739, 2006 ISSN 1549-3636 2005 Science Publications Data Mining: A Preprocessing Engine Luai Al Shalabi, Zyad Shaaban and Basel Kasasbeh Applied Science University,
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
Weather forecast prediction: a Data Mining application
Weather forecast prediction: a Data Mining application Ms. Ashwini Mandale, Mrs. Jadhawar B.A. Assistant professor, Dr.Daulatrao Aher College of engg,karad,[email protected],8407974457 Abstract
Data quality in Accounting Information Systems
Data quality in Accounting Information Systems Comparing Several Data Mining Techniques Erjon Zoto Department of Statistics and Applied Informatics Faculty of Economy, University of Tirana Tirana, Albania
Experiments in Web Page Classification for Semantic Web
Experiments in Web Page Classification for Semantic Web Asad Satti, Nick Cercone, Vlado Kešelj Faculty of Computer Science, Dalhousie University E-mail: {rashid,nick,vlado}@cs.dal.ca Abstract We address
The Optimality of Naive Bayes
The Optimality of Naive Bayes Harry Zhang Faculty of Computer Science University of New Brunswick Fredericton, New Brunswick, Canada email: hzhang@unbca E3B 5A3 Abstract Naive Bayes is one of the most
5.5. Solving linear systems by the elimination method
55 Solving linear systems by the elimination method Equivalent systems The major technique of solving systems of equations is changing the original problem into another one which is of an easier to solve
A Decision Theoretic Approach to Targeted Advertising
82 UNCERTAINTY IN ARTIFICIAL INTELLIGENCE PROCEEDINGS 2000 A Decision Theoretic Approach to Targeted Advertising David Maxwell Chickering and David Heckerman Microsoft Research Redmond WA, 98052-6399 [email protected]
Healthcare Data Mining: Prediction Inpatient Length of Stay
3rd International IEEE Conference Intelligent Systems, September 2006 Healthcare Data Mining: Prediction Inpatient Length of Peng Liu, Lei Lei, Junjie Yin, Wei Zhang, Wu Naijun, Elia El-Darzi 1 Abstract
Chapter 6. The stacking ensemble approach
82 This chapter proposes the stacking ensemble approach for combining different data mining classifiers to get better performance. Other combination techniques like voting, bagging etc are also described
Chapter 10. Key Ideas Correlation, Correlation Coefficient (r),
Chapter 0 Key Ideas Correlation, Correlation Coefficient (r), Section 0-: Overview We have already explored the basics of describing single variable data sets. However, when two quantitative variables
D A T A M I N I N G C L A S S I F I C A T I O N
D A T A M I N I N G C L A S S I F I C A T I O N FABRICIO VOZNIKA LEO NARDO VIA NA INTRODUCTION Nowadays there is huge amount of data being collected and stored in databases everywhere across the globe.
Feature vs. Classifier Fusion for Predictive Data Mining a Case Study in Pesticide Classification
Feature vs. Classifier Fusion for Predictive Data Mining a Case Study in Pesticide Classification Henrik Boström School of Humanities and Informatics University of Skövde P.O. Box 408, SE-541 28 Skövde
Preference Models in Data Storage - PDQ PDP
Preference Learning from Qualitative Partial Derivatives Jure Žabkar, Martin Možina, Tadej Janež, Ivan Bratko, and Janez Demšar University of Ljubljana, Faculty of Computer and Information Science Tržaška
DATA MINING, DIRTY DATA, AND COSTS (Research-in-Progress)
DATA MINING, DIRTY DATA, AND COSTS (Research-in-Progress) Leo Pipino University of Massachusetts Lowell [email protected] David Kopcso Babson College [email protected] Abstract: A series of simulations
Blog Post Extraction Using Title Finding
Blog Post Extraction Using Title Finding Linhai Song 1, 2, Xueqi Cheng 1, Yan Guo 1, Bo Wu 1, 2, Yu Wang 1, 2 1 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 2 Graduate School
Lecture 10: Regression Trees
Lecture 10: Regression Trees 36-350: Data Mining October 11, 2006 Reading: Textbook, sections 5.2 and 10.5. The next three lectures are going to be about a particular kind of nonlinear predictive model,
Consolidated Tree Classifier Learning in a Car Insurance Fraud Detection Domain with Class Imbalance
Consolidated Tree Classifier Learning in a Car Insurance Fraud Detection Domain with Class Imbalance Jesús M. Pérez, Javier Muguerza, Olatz Arbelaitz, Ibai Gurrutxaga, and José I. Martín Dept. of Computer
Event driven trading new studies on innovative way. of trading in Forex market. Michał Osmoła INIME live 23 February 2016
Event driven trading new studies on innovative way of trading in Forex market Michał Osmoła INIME live 23 February 2016 Forex market From Wikipedia: The foreign exchange market (Forex, FX, or currency
Data mining techniques: decision trees
Data mining techniques: decision trees 1/39 Agenda Rule systems Building rule systems vs rule systems Quick reference 2/39 1 Agenda Rule systems Building rule systems vs rule systems Quick reference 3/39
Data Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
An innovative application of a constrained-syntax genetic programming system to the problem of predicting survival of patients
An innovative application of a constrained-syntax genetic programming system to the problem of predicting survival of patients Celia C. Bojarczuk 1, Heitor S. Lopes 2 and Alex A. Freitas 3 1 Departamento
Gerry Hobbs, Department of Statistics, West Virginia University
Decision Trees as a Predictive Modeling Method Gerry Hobbs, Department of Statistics, West Virginia University Abstract Predictive modeling has become an important area of interest in tasks such as credit
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
IDENTIFYING BANK FRAUDS USING CRISP-DM AND DECISION TREES
IDENTIFYING BANK FRAUDS USING CRISP-DM AND DECISION TREES Bruno Carneiro da Rocha 1,2 and Rafael Timóteo de Sousa Júnior 2 1 Bank of Brazil, Brasília-DF, Brazil [email protected] 2 Network Engineering
Machine Learning for Medical Image Analysis. A. Criminisi & the InnerEye team @ MSRC
Machine Learning for Medical Image Analysis A. Criminisi & the InnerEye team @ MSRC Medical image analysis the goal Automatic, semantic analysis and quantification of what observed in medical scans Brain
Technical Trading Rules as a Prior Knowledge to a Neural Networks Prediction System for the S&P 500 Index
Technical Trading Rules as a Prior Knowledge to a Neural Networks Prediction System for the S&P 5 ndex Tim Cheno~ethl-~t~ Zoran ObradoviC Steven Lee4 School of Electrical Engineering and Computer Science
HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION
HYBRID PROBABILITY BASED ENSEMBLES FOR BANKRUPTCY PREDICTION Chihli Hung 1, Jing Hong Chen 2, Stefan Wermter 3, 1,2 Department of Management Information Systems, Chung Yuan Christian University, Taiwan
Predictive Analytics Tools and Techniques
Global Journal of Finance and Management. ISSN 0975-6477 Volume 6, Number 1 (2014), pp. 59-66 Research India Publications http://www.ripublication.com Predictive Analytics Tools and Techniques Mr. Chandrashekar
Fine Particulate Matter Concentration Level Prediction by using Tree-based Ensemble Classification Algorithms
Fine Particulate Matter Concentration Level Prediction by using Tree-based Ensemble Classification Algorithms Yin Zhao School of Mathematical Sciences Universiti Sains Malaysia (USM) Penang, Malaysia Yahya
Chapter 12 Discovering New Knowledge Data Mining
Chapter 12 Discovering New Knowledge Data Mining Becerra-Fernandez, et al. -- Knowledge Management 1/e -- 2004 Prentice Hall Additional material 2007 Dekai Wu Chapter Objectives Introduce the student to
15.062 Data Mining: Algorithms and Applications Matrix Math Review
.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop
FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM
International Journal of Innovative Computing, Information and Control ICIC International c 0 ISSN 34-48 Volume 8, Number 8, August 0 pp. 4 FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT
CS311 Lecture: Sequential Circuits
CS311 Lecture: Sequential Circuits Last revised 8/15/2007 Objectives: 1. To introduce asynchronous and synchronous flip-flops (latches and pulsetriggered, plus asynchronous preset/clear) 2. To introduce
Model Trees for Classification of Hybrid Data Types
Model Trees for Classification of Hybrid Data Types Hsing-Kuo Pao, Shou-Chih Chang, and Yuh-Jye Lee Dept. of Computer Science & Information Engineering, National Taiwan University of Science & Technology,
ANALYSIS OF FEATURE SELECTION WITH CLASSFICATION: BREAST CANCER DATASETS
ANALYSIS OF FEATURE SELECTION WITH CLASSFICATION: BREAST CANCER DATASETS Abstract D.Lavanya * Department of Computer Science, Sri Padmavathi Mahila University Tirupati, Andhra Pradesh, 517501, India [email protected]
1.2 Solving a System of Linear Equations
1.. SOLVING A SYSTEM OF LINEAR EQUATIONS 1. Solving a System of Linear Equations 1..1 Simple Systems - Basic De nitions As noticed above, the general form of a linear system of m equations in n variables
Independent samples t-test. Dr. Tom Pierce Radford University
Independent samples t-test Dr. Tom Pierce Radford University The logic behind drawing causal conclusions from experiments The sampling distribution of the difference between means The standard error of
A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier
A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier G.T. Prasanna Kumari Associate Professor, Dept of Computer Science and Engineering, Gokula Krishna College of Engg, Sullurpet-524121,
REPORT DOCUMENTATION PAGE
REPORT DOCUMENTATION PAGE Form Approved OMB NO. 0704-0188 Public Reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions,
Mining Direct Marketing Data by Ensembles of Weak Learners and Rough Set Methods
Mining Direct Marketing Data by Ensembles of Weak Learners and Rough Set Methods Jerzy B laszczyński 1, Krzysztof Dembczyński 1, Wojciech Kot lowski 1, and Mariusz Paw lowski 2 1 Institute of Computing
SYSTEMS OF EQUATIONS AND MATRICES WITH THE TI-89. by Joseph Collison
SYSTEMS OF EQUATIONS AND MATRICES WITH THE TI-89 by Joseph Collison Copyright 2000 by Joseph Collison All rights reserved Reproduction or translation of any part of this work beyond that permitted by Sections
Fuzzy Cognitive Map for Software Testing Using Artificial Intelligence Techniques
Fuzzy ognitive Map for Software Testing Using Artificial Intelligence Techniques Deane Larkman 1, Masoud Mohammadian 1, Bala Balachandran 1, Ric Jentzsch 2 1 Faculty of Information Science and Engineering,
Credit Card Fraud Detection Using Meta-Learning: Issues 1 and Initial Results
From: AAAI Technical Report WS-97-07. Compilation copyright 1997, AAAI (www.aaai.org). All rights reserved. Credit Card Fraud Detection Using Meta-Learning: Issues 1 and Initial Results Salvatore 2 J.
Integer Operations. Overview. Grade 7 Mathematics, Quarter 1, Unit 1.1. Number of Instructional Days: 15 (1 day = 45 minutes) Essential Questions
Grade 7 Mathematics, Quarter 1, Unit 1.1 Integer Operations Overview Number of Instructional Days: 15 (1 day = 45 minutes) Content to Be Learned Describe situations in which opposites combine to make zero.
arxiv:1112.0829v1 [math.pr] 5 Dec 2011
How Not to Win a Million Dollars: A Counterexample to a Conjecture of L. Breiman Thomas P. Hayes arxiv:1112.0829v1 [math.pr] 5 Dec 2011 Abstract Consider a gambling game in which we are allowed to repeatedly
[Refer Slide Time: 05:10]
Principles of Programming Languages Prof: S. Arun Kumar Department of Computer Science and Engineering Indian Institute of Technology Delhi Lecture no 7 Lecture Title: Syntactic Classes Welcome to lecture
Mining the Software Change Repository of a Legacy Telephony System
Mining the Software Change Repository of a Legacy Telephony System Jelber Sayyad Shirabad, Timothy C. Lethbridge, Stan Matwin School of Information Technology and Engineering University of Ottawa, Ottawa,
On the effect of data set size on bias and variance in classification learning
On the effect of data set size on bias and variance in classification learning Abstract Damien Brain Geoffrey I Webb School of Computing and Mathematics Deakin University Geelong Vic 3217 With the advent
Data Mining for Manufacturing: Preventive Maintenance, Failure Prediction, Quality Control
Data Mining for Manufacturing: Preventive Maintenance, Failure Prediction, Quality Control Andre BERGMANN Salzgitter Mannesmann Forschung GmbH; Duisburg, Germany Phone: +49 203 9993154, Fax: +49 203 9993234;
Evaluating an Integrated Time-Series Data Mining Environment - A Case Study on a Chronic Hepatitis Data Mining -
Evaluating an Integrated Time-Series Data Mining Environment - A Case Study on a Chronic Hepatitis Data Mining - Hidenao Abe, Miho Ohsaki, Hideto Yokoi, and Takahira Yamaguchi Department of Medical Informatics,
Case-Based Reasoning for General Electric Appliance Customer Support
Case-Based Reasoning for General Electric Appliance Customer Support William Cheetham General Electric Global Research, One Research Circle, Niskayuna, NY 12309 [email protected] (Deployed Application)
DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES
DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES Vijayalakshmi Mahanra Rao 1, Yashwant Prasad Singh 2 Multimedia University, Cyberjaya, MALAYSIA 1 [email protected]
1 Cubic Hermite Spline Interpolation
cs412: introduction to numerical analysis 10/26/10 Lecture 13: Cubic Hermite Spline Interpolation II Instructor: Professor Amos Ron Scribes: Yunpeng Li, Mark Cowlishaw, Nathanael Fillmore 1 Cubic Hermite
Comparison of Data Mining Techniques used for Financial Data Analysis
Comparison of Data Mining Techniques used for Financial Data Analysis Abhijit A. Sawant 1, P. M. Chawan 2 1 Student, 2 Associate Professor, Department of Computer Technology, VJTI, Mumbai, INDIA Abstract
EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH
EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH SANGITA GUPTA 1, SUMA. V. 2 1 Jain University, Bangalore 2 Dayanada Sagar Institute, Bangalore, India Abstract- One
Another Look at Sensitivity of Bayesian Networks to Imprecise Probabilities
Another Look at Sensitivity of Bayesian Networks to Imprecise Probabilities Oscar Kipersztok Mathematics and Computing Technology Phantom Works, The Boeing Company P.O.Box 3707, MC: 7L-44 Seattle, WA 98124
Selection of Optimal Discount of Retail Assortments with Data Mining Approach
Available online at www.interscience.in Selection of Optimal Discount of Retail Assortments with Data Mining Approach Padmalatha Eddla, Ravinder Reddy, Mamatha Computer Science Department,CBIT, Gandipet,Hyderabad,A.P,India.
The Application Method of CRM as Big Data: Focused on the Car Maintenance Industry
, pp.93-97 http://dx.doi.org/10.14257/astl.2015.84.19 The Application Method of CRM as Big Data: Focused on the Car Maintenance Industry Dae-Hyun Jung 1, Lee-Sang Jung 2 {San 30, Jangjeon-dong, Geumjeonggu,
CALCULATIONS & STATISTICS
CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents
Gerard Mc Nulty Systems Optimisation Ltd [email protected]/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I
Gerard Mc Nulty Systems Optimisation Ltd [email protected]/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I Data is Important because it: Helps in Corporate Aims Basis of Business Decisions Engineering Decisions Energy
Specific Usage of Visual Data Analysis Techniques
Specific Usage of Visual Data Analysis Techniques Snezana Savoska 1 and Suzana Loskovska 2 1 Faculty of Administration and Management of Information systems, Partizanska bb, 7000, Bitola, Republic of Macedonia
An Analysis of Four Missing Data Treatment Methods for Supervised Learning
An Analysis of Four Missing Data Treatment Methods for Supervised Learning Gustavo E. A. P. A. Batista and Maria Carolina Monard University of São Paulo - USP Institute of Mathematics and Computer Science
2.4 Real Zeros of Polynomial Functions
SECTION 2.4 Real Zeros of Polynomial Functions 197 What you ll learn about Long Division and the Division Algorithm Remainder and Factor Theorems Synthetic Division Rational Zeros Theorem Upper and Lower
Introduction to Learning & Decision Trees
Artificial Intelligence: Representation and Problem Solving 5-38 April 0, 2007 Introduction to Learning & Decision Trees Learning and Decision Trees to learning What is learning? - more than just memorizing
Method of Fault Detection in Cloud Computing Systems
, pp.205-212 http://dx.doi.org/10.14257/ijgdc.2014.7.3.21 Method of Fault Detection in Cloud Computing Systems Ying Jiang, Jie Huang, Jiaman Ding and Yingli Liu Yunnan Key Lab of Computer Technology Application,
A Learning Algorithm For Neural Network Ensembles
A Learning Algorithm For Neural Network Ensembles H. D. Navone, P. M. Granitto, P. F. Verdes and H. A. Ceccatto Instituto de Física Rosario (CONICET-UNR) Blvd. 27 de Febrero 210 Bis, 2000 Rosario. República
Non-random/non-probability sampling designs in quantitative research
206 RESEARCH MET HODOLOGY Non-random/non-probability sampling designs in quantitative research N on-probability sampling designs do not follow the theory of probability in the choice of elements from the
Data Quality Mining: Employing Classifiers for Assuring consistent Datasets
Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, [email protected] Abstract: Independent
APPLICATION OF DATA MINING TECHNIQUES FOR BUILDING SIMULATION PERFORMANCE PREDICTION ANALYSIS. email [email protected]
Eighth International IBPSA Conference Eindhoven, Netherlands August -4, 2003 APPLICATION OF DATA MINING TECHNIQUES FOR BUILDING SIMULATION PERFORMANCE PREDICTION Christoph Morbitzer, Paul Strachan 2 and
Cluster Description Formats, Problems and Algorithms
Cluster Description Formats, Problems and Algorithms Byron J. Gao Martin Ester School of Computing Science, Simon Fraser University, Canada, V5A 1S6 [email protected] [email protected] Abstract Clustering is
Financial Trading System using Combination of Textual and Numerical Data
Financial Trading System using Combination of Textual and Numerical Data Shital N. Dange Computer Science Department, Walchand Institute of Rajesh V. Argiddi Assistant Prof. Computer Science Department,
ARTIFICIAL INTELLIGENCE: DEFINITION, TRENDS, TECHNIQUES, AND CASES
ARTIFICIAL INTELLIGENCE: DEFINITION, TRENDS, TECHNIQUES, AND CASES Joost N. Kok, Egbert J. W. Boers, Walter A. Kosters, and Peter van der Putten Leiden Institute of Advanced Computer Science, Leiden University,
Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case Study: Qom Payame Noor University)
260 IJCSNS International Journal of Computer Science and Network Security, VOL.11 No.6, June 2011 Predicting required bandwidth for educational institutes using prediction techniques in data mining (Case
Problem of the Month: Fair Games
Problem of the Month: The Problems of the Month (POM) are used in a variety of ways to promote problem solving and to foster the first standard of mathematical practice from the Common Core State Standards:
ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA
ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA D.Lavanya 1 and Dr.K.Usha Rani 2 1 Research Scholar, Department of Computer Science, Sree Padmavathi Mahila Visvavidyalayam, Tirupati, Andhra Pradesh,
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining
Extend Table Lens for High-Dimensional Data Visualization and Classification Mining CPSC 533c, Information Visualization Course Project, Term 2 2003 Fengdong Du [email protected] University of British Columbia
Ensemble Methods. Knowledge Discovery and Data Mining 2 (VU) (707.004) Roman Kern. KTI, TU Graz 2015-03-05
Ensemble Methods Knowledge Discovery and Data Mining 2 (VU) (707004) Roman Kern KTI, TU Graz 2015-03-05 Roman Kern (KTI, TU Graz) Ensemble Methods 2015-03-05 1 / 38 Outline 1 Introduction 2 Classification
COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared. [email protected]
COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared [email protected] Relationships between variables So far we have looked at ways of characterizing the distribution
December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B. KITCHENS
December 4, 2013 MATH 171 BASIC LINEAR ALGEBRA B KITCHENS The equation 1 Lines in two-dimensional space (1) 2x y = 3 describes a line in two-dimensional space The coefficients of x and y in the equation
Ira J. Haimowitz Henry Schwarz
From: AAAI Technical Report WS-97-07. Compilation copyright 1997, AAAI (www.aaai.org). All rights reserved. Clustering and Prediction for Credit Line Optimization Ira J. Haimowitz Henry Schwarz General
COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction
COMP3420: Advanced Databases and Data Mining Classification and prediction: Introduction and Decision Tree Induction Lecture outline Classification versus prediction Classification A two step process Supervised
FrIDA Free Intelligent Data Analysis Toolbox.
FrIDA A Free Intelligent Data Analysis Toolbox Christian Borgelt and Gil González Rodríguez Abstract This paper describes a Java-based graphical user interface to a large number of data analysis programs
Chapter 14 Managing Operational Risks with Bayesian Networks
Chapter 14 Managing Operational Risks with Bayesian Networks Carol Alexander This chapter introduces Bayesian belief and decision networks as quantitative management tools for operational risks. Bayesian
Adaptive Business Intelligence
Adaptive Business Intelligence Zbigniew Michalewicz 1 Business Intelligence What is Business Intelligence? Business Intelligence is a collection of tools, methods, technologies, and processes needed to
