Preference Learning from Qualitative Partial Derivatives

Jure Žabkar, Martin Možina, Tadej Janež, Ivan Bratko, and Janez Demšar
University of Ljubljana, Faculty of Computer and Information Science
Tržaška 25, SI-1000 Ljubljana, Slovenia

Abstract. We address the problem of learning preference models from data containing implicit preference information. The information about preferences is expressed as the result of a decision or action taken in a given state, in the context of other attributes. We propose an algorithm that observes how the probability of a target class (the desired result) changes with the values of the attribute describing the decision or action. We generalize the notion of a partial derivative by defining the probabilistic discrete qualitative partial derivative (PDQ PD). The PDQ PD is a qualitative relation between the target class c and a discrete attribute, given as a sequence of attribute values a_i ordered by P(c | a_i) in a local neighbourhood of the reference point. In a two-stage learning process we first compute the PDQ PD for all examples in the training data, and then generalize over the entire data set using an appropriate machine learning algorithm (e.g. a decision tree). The induced preference model explains the influence of the attribute values on the target class in different subspaces of the attribute space.

1 Introduction

The typical task of preference learning is to induce models for predicting preferences from learning data that includes descriptions of examples and their preferences. Consider, for example, a flight carrier that wants to predict the food preferences of its passengers. John, a white middle-aged male from Wales, might prefer non-vegetarian food over vegetarian over snacks over drink-only service. Preference learning generalizes such cases into models.

Actual data often does not look like this. Passengers, for instance, are not asked which food they prefer but rather whether they were satisfied with what they were given. We may know that John, a white male from Wales, age 56, who was given veal and rice topped with a delicate mustard sauce, called it a good meal, while Jack, a white middle-aged male from Scotland, did not appreciate the pretzels and sparkling water he was served on his flight.

In this paper we present an algorithm for generalizing preference relations from this kind of data. We assume that the learning examples are described by discrete attributes (gender, age, type of food) and classified into ordered classes (disliked, OK'd, liked the food). The algorithm can estimate the preferences for each example with regard to each attribute. For instance, it can tell us that John, the white middle-aged man from Wales, prefers non-vegetarian over sweets over vegetarian over drink-only: drink-only ≺ vegetarian ≺ sweets ≺ non-vegetarian.
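To make the setting concrete, the sketch below shows one way such implicitly labelled examples and the algorithm's per-example output could be represented. The field names and records are purely illustrative and are not taken from any real data set.

```python
# Illustrative only: hypothetical records with implicit preference information.
# Each example records the decision that was made (food) and whether the
# outcome was satisfactory (satisfied), not an explicit ranking of options.
examples = [
    {"gender": "male", "age": "middle-aged", "region": "Wales",
     "food": "non-vegetarian", "satisfied": "yes"},
    {"gender": "male", "age": "middle-aged", "region": "Scotland",
     "food": "snacks", "satisfied": "no"},
]

# The algorithm turns such data into a per-example ordering of the values of
# one chosen attribute (here: food), from least to most preferred, e.g.
preference_for_john = ["drink-only", "vegetarian", "sweets", "non-vegetarian"]
```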
Other attributes can also be used to derive preferences. For instance, we may learn that white middle-aged inhabitants of Wales who are given a vegetarian lunch are more likely to be satisfied with it if they are female. This is a sex-preference relation for a certain combination of attributes. Or the model can tell us that the satisfaction of white middle-aged men with pretzels does not depend on whether they are Scottish or Welsh, which can be seen as an ethnic preference (or indifference) for a certain kind of food.

Our starting point in the development of the presented algorithm is qualitative modelling. A branch of qualitative modelling considers learning qualitative models from numerical data [4,25,26]. It is a technique similar to regression modelling, except that instead of numerical predictions it predicts the direction of change (increase, decrease, no change) of the target variable. Such models are preferred over regression models when exact numerical models are difficult to obtain or difficult to use due to unreliable measurements, and, in particular, when the task is to predict the effect that a change in the input variables will have on the observed variable. For instance, a qualitative model may describe the conditions under which a decrease of interest rates will stimulate investments and how this will affect the unemployment rate. While exact numerical models for this problem are elusive, qualitative relations between these variables may be easier to describe.

In this paper we extend this type of reasoning to categorical domains. We describe an algorithm, Qube [24], for calculating qualitative preferences, a type of qualitative model describing the influence of a categorical (e.g. nominal) variable on the probability of a target class. Instead of dealing with the direction of change, we rank the attribute values so that the probability of the target class increases. By ranking, we drop all numerical information (probabilities). Finally, as we generalize over the entire data set, the induced qualitative model describes the conditions under which certain attribute values have a greater or smaller influence on the target class probability.

2 Methods

We first provide a definition of probabilistic discrete qualitative partial derivatives based on conditional probabilities of the target class given the attribute values. The computation of these probabilities from data requires selecting proper subsets of examples, which we describe next. Finally, we show how to combine the computed probabilities into partial derivatives and use them to induce qualitative models.

2.1 Probabilistic discrete qualitative partial derivative

The derivative of a function f(x) at a certain point x_0, f'(x_0), tells us, informally, the change of the function value corresponding to a certain (small) change of the value of the function's argument, i.e.

\[ f(x_0 + \Delta x) - f(x_0) \approx f'(x_0)\,\Delta x. \tag{1} \]

For functions of multiple arguments, e.g. f(x_1, x_2, …, x_n), we compute the derivative w.r.t. each argument x_i separately and denote it by ∂f/∂x_i.

Qualitative derivatives are similar to ordinary derivatives, except that they give only the direction of change, that is, whether the function will increase or decrease when its argument increases. The qualitative derivative of a function is positive (negative, zero) if the continuous derivative is positive (negative, zero),

\[ \frac{\partial_Q f}{\partial_Q x} = \operatorname{sgn} \frac{\partial f}{\partial x}. \tag{2} \]
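As a minimal illustration of this sign-only view, the following sketch estimates the qualitative partial derivative of an ordinary numeric function as the sign of a finite difference; the example function and step size are arbitrary choices, not part of the method.

```python
def qualitative_partial_derivative(f, x, i, h=1e-6):
    """Sign of the partial derivative of f w.r.t. argument i at point x.

    Returns '+', '-' or '0', mirroring the sign-only information that a
    qualitative derivative keeps (cf. Eq. (2)).
    """
    x_plus = list(x)
    x_plus[i] += h
    diff = f(*x_plus) - f(*x)
    if diff > 0:
        return "+"
    if diff < 0:
        return "-"
    return "0"

# Example: f(x1, x2) = x1**2 - 3*x2 at (1, 5) -> '+' w.r.t. x1, '-' w.r.t. x2.
f = lambda x1, x2: x1 ** 2 - 3 * x2
print(qualitative_partial_derivative(f, (1.0, 5.0), 0),
      qualitative_partial_derivative(f, (1.0, 5.0), 1))
```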
Now consider a multivariate distribution that assigns a probability y to each element of the Cartesian product A_1 × A_2 × … × A_n. In machine learning, (a_1, a_2, …, a_n) ∈ A_1 × A_2 × … × A_n can be the values of discrete attributes A_1, A_2, …, A_n describing an example (we will call this example the reference example), and y is the probability that such an example belongs to some target class c,

\[ y = p(c \mid a_1, a_2, \ldots, a_n). \tag{3} \]

Recall that the qualitative derivative of f w.r.t. x_i computed at (x_1, x_2, …, x_n) tells whether a certain change of x_i will increase or decrease the function value if the other arguments remain constant. Let the probabilistic discrete qualitative partial derivative at (a_1, …, a_n) with respect to A_i tell whether a change of the value of attribute A_i from a_i to a'_i will increase or decrease the probability of the target class:

\[
\frac{\partial_Q f}{\partial_Q A_i : a_i \to a'_i}(a_1, \ldots, a_n) =
\begin{cases}
+, & p(c \mid a_1, \ldots, a_i, \ldots, a_n) < p(c \mid a_1, \ldots, a'_i, \ldots, a_n), \\
0, & p(c \mid a_1, \ldots, a_i, \ldots, a_n) = p(c \mid a_1, \ldots, a'_i, \ldots, a_n), \\
-, & p(c \mid a_1, \ldots, a_i, \ldots, a_n) > p(c \mid a_1, \ldots, a'_i, \ldots, a_n).
\end{cases} \tag{4}
\]

Let us define a partial order on the set A_i, with respect to fixed values of a_j for all j ≠ i:

\[ a_i \preceq a'_i \iff p(c \mid a_1, \ldots, a_i, \ldots, a_n) \le p(c \mid a_1, \ldots, a'_i, \ldots, a_n). \tag{5} \]

This allows us to rewrite (4) as

\[
\frac{\partial_Q f}{\partial_Q A_i : a_i \to a'_i}(a_1, \ldots, a_n) =
\begin{cases}
+, & a_i \prec a'_i, \\
0, & a_i = a'_i, \\
-, & a_i \succ a'_i.
\end{cases} \tag{6}
\]

The derivative ∂_Q f / ∂_Q A_i : a_i → a'_i for any pair a_i and a'_i can thus be described by a partial ordering of the values of attribute A_i.
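Concretely, computing one PDQ PD at a reference example amounts to estimating p(c | …) once per value of A_i and sorting the values by that probability. The sketch below is an assumed representation, not code from the paper: it takes the probability estimates as given (how they are obtained from data is the subject of the next two subsections) and groups only exactly equal probabilities, ignoring the threshold-based merging introduced in Sect. 2.3.

```python
from itertools import groupby

def pdq_pd(prob_of_c):
    """Partial ordering of the values of A_i by p(c | ..., value, ...).

    prob_of_c maps each value of attribute A_i to its estimated class
    probability at the reference example.  Values with equal probability are
    grouped together; groups are listed from least to most preferred.
    """
    ranked = sorted(prob_of_c, key=prob_of_c.get)
    groups = [list(g) for _, g in groupby(ranked, key=prob_of_c.get)]
    return " < ".join("=".join(sorted(g)) for g in groups)

# Example: p(c | status=...) for one hypothetical reference passenger.
print(pdq_pd({"third": 0.25, "crew": 0.6, "first": 0.6, "second": 0.6}))
# -> "third < crew=first=second"
```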
2.2 Computation of conditional probabilities

To compute the derivative we have to estimate the conditional probability p(c | a_1, …, a_n) at a single point (a_1, a_2, …, a_n) from a data sample. This probability cannot be estimated directly, for instance by relative frequencies, since there may be only a few or even no examples in the data which match the condition part. We also cannot use a naive Bayesian method, since it would reduce the PDQ PD to a comparison of p(c | a_i) and p(c | a'_i), cancelling out all the terms corresponding to the values of the other attributes. This is easily explained: the naive Bayesian assumption of conditional independence of attributes given the class implies that the derivative ∂_Q f / ∂_Q A_i : a_i → a'_i is constant over the entire attribute space.

The problem requires a semi-naive Bayesian approach. We replace the condition in p(c | a_1, …, a_n) with a relaxed condition D ⊆ {a_1, a_2, …, a_n}, where D includes only the attribute values which are conditionally dependent on a_i given the class. Ignoring the other, conditionally independent values does not change the computed derivative (see the proof in the Appendix).

To construct the set of conditions D we use a greedy approach: we start with an empty set D and iteratively add the most dependent value. Let e_j represent the event that attribute A_j has the value a_j on a certain example. Let the event v represent the values of attribute A_i on an example (v ∈ A_i). We need to test the hypothesis that e_j and v are conditionally independent given the class and the set of existing conditions D (that is, knowing the class of an example and knowing that the example satisfies D, the probability distribution over the values of A_i is independent of whether the value of the j-th attribute equals a_j or not). In each step of the algorithm we test the dependence between v and e_j, find the e_j which most strongly violates the independence, and add the corresponding a_j to D.

The independence is tested using the standard χ² test. Separate tables with 2|A_i| cells are constructed for class c and its complement. We compute the expected absolute frequencies for the first table as n(c, D) p(a_j | c, D) p(v | c, D) and n(c, D)(1 − p(a_j | c, D)) p(v | c, D), where v ∈ A_i and n(c, D) is the number of examples in class c which satisfy the conditions D. Frequencies for the complement of c are computed analogously. The sum of the χ² statistics for both tables is distributed according to the χ² distribution with 2(|A_i| − 1) degrees of freedom. For each a_j we compute its corresponding p-value and select the one with the lowest value. We stop the selection procedure when the lowest p-value is above the specified threshold or when the number of examples matching the conditions D falls below a given minimum. This is needed to ensure the reliability of the χ² statistics and of the estimated conditional probabilities. Our use of p-values does not require adjustments for multiple hypothesis testing, since the p-value is used only as a stopping criterion and not to claim the significance of the alternative hypothesis.

After choosing a set of conditions D, we compute p(c | v, D) for all v ∈ A_i using the relative frequency, the Laplace estimate or the m-estimate [6] on the examples matching D. Note that the χ² test is performed over all values of A_i and not only over a_i and a'_i. This lets us use the same set of conditions D for all derivatives with respect to A_i at a certain reference example and ensures that the probabilities p(c | a_1, …, a'_i, …, a_n) for all a'_i ∈ A_i are comparable and thus useful for defining a partial ordering of A_i.

For another interpretation of this procedure, consider the definition of the partial derivative of a continuous function:

\[
\frac{\partial f}{\partial x_1}(x_1, \ldots, x_n) = \lim_{h \to 0} \frac{f(x_1 + h, x_2, \ldots, x_n) - f(x_1, x_2, \ldots, x_n)}{h}. \tag{7}
\]

When computing the derivative w.r.t. x_1, we subtract the function values at two points where all arguments except x_1 are the same. If the function value at the point (x_1, …, x_n) does not depend on, say, x_2, we could use a different value of x_2 in the two terms. Omitting values from (4) is similar to that: the χ² test is used to determine whether ignoring the difference between a_j and the other values of attribute A_j will affect the computation or not. The proposed greedy procedure may seem naive, yet it works well in practice. We must also keep in mind that the selection of attributes needs to be fast, since we have to run it for each point at which we compute the derivative, which rules out any advanced search for sets of dependent values.
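A minimal sketch of the greedy condition selection described above, under several simplifying assumptions: examples are plain dicts, the χ² statistic for each class is computed with scipy on the 2 × |A_i| contingency table and the two statistics are summed as described, degenerate tables are treated as independent, and the thresholds are placeholder values rather than settings from the paper.

```python
from scipy.stats import chi2, chi2_contingency

def chi2_pvalue(examples, A_i, A_j, a_j, target, c):
    """p-value for: 'the values of A_i and the event A_j == a_j are
    conditionally independent, given the class and the current conditions'.

    One 2 x |A_i| contingency table is built for class c and one for its
    complement; the two chi-square statistics are summed and compared against
    a chi-square distribution with 2(|A_i| - 1) degrees of freedom.
    """
    values = sorted({e[A_i] for e in examples})
    if len(values) < 2:
        return 1.0
    total_stat = 0.0
    for in_c in (True, False):                  # class c and its complement
        subset = [e for e in examples if (e[target] == c) == in_c]
        table = [[sum(1 for e in subset if e[A_i] == v and (e[A_j] == a_j) == row)
                  for v in values]
                 for row in (True, False)]
        try:
            stat, _, _, _ = chi2_contingency(table, correction=False)
        except ValueError:                      # empty row/column: treat as independent
            return 1.0
        total_stat += stat
    return chi2.sf(total_stat, 2 * (len(values) - 1))

def select_conditions(examples, reference, A_i, target, c,
                      p_threshold=0.05, min_examples=20):
    """Greedy construction of the condition set D for one reference example.

    `reference` maps attribute names to the reference example's values;
    p_threshold and min_examples are illustrative placeholders.
    """
    D = {}
    while True:
        matching = [e for e in examples if all(e[a] == v for a, v in D.items())]
        candidates = [a for a in reference if a not in D and a not in (A_i, target)]
        if not candidates:
            break
        scored = [(chi2_pvalue(matching, A_i, a, reference[a], target, c), a)
                  for a in candidates]
        p, best = min(scored)
        still_matching = sum(1 for e in matching if e[best] == reference[best])
        if p > p_threshold or still_matching < min_examples:
            break
        D[best] = reference[best]
    return D
```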
2.3 Computation of derivatives

The partial order of A_i is determined by the order of the corresponding probabilities, as defined in (4). To handle noisy data, however, we treat two probabilities (and the corresponding values of A_i) as equal if they differ by less than a user-provided threshold. For this we use hierarchical clustering of the values with average linkage [23], using the differences between probabilities as distances. The clustering is stopped when the distance between the closest clusters exceeds the threshold.

For example, let A_i be a five-valued attribute with values v_1 to v_5, and let the probabilities p(c | v_i, D) for these values equal 0.1, 0.2, 0.3, 0.5 and 0.6, respectively. Let the merging threshold be 0.2. We recognize v_1 and v_2 as equivalent and assign them the average probability of (0.1 + 0.2)/2 = 0.15. Next we merge v_4 and v_5; the average probability is (0.5 + 0.6)/2 = 0.55. Finally, we merge v_1 and v_2 with v_3; the average probability is (0.1 + 0.2 + 0.3)/3 = 0.2. We then stop, since the difference between p = 0.2 (v_1 to v_3) and p = 0.55 (v_4 to v_5) exceeds the threshold of 0.2. The resulting partial ordering of A_i is v_1 = v_2 = v_3 ≺ v_4 = v_5.

2.4 Induction of preference models

To induce a preference model with respect to a certain attribute A_i, we first compute the PDQ PD for the entire learning set: for each example, we compute the set of dependent values D and find the total ordering of the values of attribute A_i, as explained in the previous two sections. We then replace the original class labels with the partial derivatives (that is, the total orderings) and induce a model for predicting the ordering. In principle, any learning algorithm can be used for this task.
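A compact sketch of the value-merging step and of how the resulting orderings become class labels, assuming the per-value probabilities p(c | v, D) are already available. The data structures and the scikit-learn usage are illustrative choices, not the paper's implementation (which uses a reimplementation of C4.5).

```python
def merge_values(prob_of_c, threshold=0.2):
    """Merge attribute values whose class probabilities differ by less than
    `threshold`, using average-linkage clustering, and return the ordering.

    For one-dimensional probabilities, average linkage reduces to the distance
    between cluster means, so only adjacent clusters (in sorted order) need to
    be checked.
    """
    clusters = sorted(([[v], p] for v, p in prob_of_c.items()), key=lambda c: c[1])
    while len(clusters) > 1:
        gaps = [clusters[i + 1][1] - clusters[i][1] for i in range(len(clusters) - 1)]
        i = min(range(len(gaps)), key=gaps.__getitem__)
        if gaps[i] > threshold:
            break
        members = clusters[i][0] + clusters[i + 1][0]
        mean = sum(prob_of_c[v] for v in members) / len(members)
        clusters[i:i + 2] = [[members, mean]]
    return " < ".join("=".join(sorted(c[0])) for c in clusters)

# The worked example from Sect. 2.3:
probs = {"v1": 0.1, "v2": 0.2, "v3": 0.3, "v4": 0.5, "v5": 0.6}
print(merge_values(probs))   # -> v1=v2=v3 < v4=v5

# The ordering strings then act as class labels for any standard learner, e.g.:
# from sklearn.tree import DecisionTreeClassifier
# tree = DecisionTreeClassifier().fit(X_encoded, ordering_labels)
```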
3 Experiments

We first show a toy experiment on the Titanic data set from the UCI repository [13]. The second, more complex experiment is conducted on a data set obtained from a billiards simulator. Conditional probabilities for both are estimated using the m-estimate [6] with m = 2, and we use a threshold of 0.2 for merging attribute values. Our reimplementation of the C4.5 algorithm [9] is used to obtain qualitative models describing the relation between the target class and each attribute separately.

3.1 Titanic

The Titanic data set consists of 2201 examples described by three attributes, status (first, second, third, crew), age (child, adult) and sex (male, female), and a class survived (yes, no). We selected the target class survived = yes and, for each passenger, computed the preference (with respect to survival) regarding the status. Finally, we used C4.5 to learn the preference model over the entire data set.

Each passenger's status preferences express the ordering of the attribute values in which the probability of surviving increases. For adult males, the probability of survival does not depend upon their status; therefore there is no status preference with respect to survival. For the others, the probability is smallest in the third class, which is thus the least preferred. For women and boys there is no difference between the remaining classes, while girls were more likely to survive in the second class than in the first, and should thus prefer the former. Based on the model, if the goal is to survive the sinking, the passenger's preference should be not to travel in the third class, and girls should, specifically, prefer the second class.

sex = female:
    age = adult: third < crew = first = second
    age = child: third < first < second
sex = male:
    age = adult: crew = first = second = third
    age = child: third < first = second

Fig. 1: Preference model for surviving, induced from the Titanic data set.
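The per-passenger computation behind this model can be approximated with a few lines of code. The sketch below assumes a local CSV copy of the Titanic data with columns status, age, sex and survived (the file name and column names are assumptions), and ranks status values by plain relative frequencies within each (sex, age) subgroup rather than by the m-estimate and condition-selection procedure the paper actually uses.

```python
import csv
from collections import defaultdict

# Assumed local copy of the Titanic data set with a header row containing
# status, age, sex, survived -- the file name and column names are assumptions.
rows = list(csv.DictReader(open("titanic.csv")))

# (sex, age) -> status -> [number survived, total]
counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
for r in rows:
    entry = counts[(r["sex"], r["age"])][r["status"]]
    entry[0] += r["survived"] == "yes"
    entry[1] += 1

# Rank the status values by survival frequency within each subgroup,
# from least to most preferred.
for group, per_status in sorted(counts.items()):
    ranked = sorted(per_status, key=lambda s: per_status[s][0] / per_status[s][1])
    print(group, " < ".join(ranked))
```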
3.2 Billiards

Billiards is a common name for table games played with a stick and a set of balls, such as snooker or pool and their variants. The goals of the games vary, but the main idea is common to all: the player uses the stick to stroke the cue ball, aiming at another ball to achieve the desired effect. The friction between the table and the balls, the spin of the cue ball and the collisions of the balls combine into a very complex physical system [1]. However, despite its complexity, an amateur player can still learn the basic principles of how to stroke the cue ball without knowing much about the physics behind it. In this case study we use our method to induce a simple model, usable by a human player, for ranking shot types in different billiards positions.

Fig. 2: Blocking the strokes. Ball S blocks a direct shot, while balls L and R block the left and right shots, respectively.

Our goal is to learn shot preferences in different circumstances from simulated data. Although it is possible to pocket the black ball with several different types of shots, some shots are more appropriate than others and thus increase the probability of a successful shot. For example, if a hole, the black ball and the cue ball are collinear, a direct shot is preferred over hitting a rail first, because it is much more difficult to estimate the reflection angles correctly so that the black ball is pocketed. However, in the presence of other balls which may obstruct a direct shot, a rail-first shot may be preferred. We consider billiards positions with a cue ball (white), a black ball and three optional balls (L, S and R) that may present an obstacle. The balls' positions are fixed as shown in Figure 2.

There are several different types of shots in the game of billiards [1]. For the sake of simplicity, we only consider direct shots and rail-first shots. In the former, the cue ball directly hits the black ball, without hitting anything else in between. In the latter, the cue ball first hits the rail (also called the cushion) and only then the black ball. In our study we defined four possible actions a player can make: strong direct shot, weak direct shot, left rail-first shot and right rail-first shot. Strong and weak direct shots (Figure 4a) differ in the force applied by the player, which affects the initial velocity of the cue ball. The difference between left and right rail-first shots is shown in Figures 4b and 4c. In the left rail-first shot, the white ball passes the black ball on the left, hits the rail, and then hits the black ball. Similarly, in the right rail-first shot, the white ball first passes the black ball on the right, hits the rail, and then hits the black ball.

We created a data set of 1000 shots (i.e. learning examples), where each example is described by the following four attributes:
- S: whether ball S, blocking a straight shot, was present (values: yes/no),
- L: whether ball L, blocking a left shot, was present (values: yes/no),
- R: whether ball R, blocking a right shot, was present (values: yes/no),
- ShotType: the shot type (values: sdir, wdir, lrail, rrail),

and the class variable black in pocket, with values yes and no. The goal (target class) in our method is thus black in pocket = yes, namely whether the shot forced the black ball into a pocket.

The attribute values of each example were randomly selected, while the value of the class was determined using the billiards simulator [22]. Each shot in the simulator is defined by its shot direction, stick elevation, shot velocity and shot follow. The most important property of a shot is its direction, as it has the dominant effect on the direction of the black ball after it is hit by the white ball. First, we computed the optimal stroke direction opang that results in pocketing the black ball. However, to account for human imprecision, we added uniform noise in the interval [−0.2°, 0.2°]; the value of the stroke direction in the simulator is therefore set to a random value selected from the interval [opang − 0.2, opang + 0.2]. We would expect more difficult shots to be more sensitive to such deviations from the optimal setting. Similarly, the other properties of the shot were randomly chosen from specified intervals: the stick elevation from [0, 5], the shot follow from [−0.1, 0.1], and the shot velocity from [0.1, 2] for weak shots and from [3, 4] for strong shots.
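The sampling of one learning example can be sketched as follows. The attribute values and parameter ranges are those listed above; the helper `optimal_direction`, the simulator interface `simulate_shot`, and the velocity range used for rail-first shots are assumptions introduced purely for illustration, since the actual simulator [22] is an external program.

```python
import random

def random_example(simulate_shot, optimal_direction, noise=0.2):
    """Draw one training example in the spirit of Sect. 3.2.

    `simulate_shot` and `optimal_direction` stand in for the external billiards
    simulator [22]; their interfaces here are assumptions made for illustration.
    """
    example = {
        "S": random.choice(["yes", "no"]),
        "L": random.choice(["yes", "no"]),
        "R": random.choice(["yes", "no"]),
        "ShotType": random.choice(["sdir", "wdir", "lrail", "rrail"]),
    }
    opang = optimal_direction(example)        # hypothetical helper
    shot = {
        # Optimal direction perturbed by uniform noise of +/- 0.2 degrees.
        "direction": random.uniform(opang - noise, opang + noise),
        "elevation": random.uniform(0, 5),
        "follow": random.uniform(-0.1, 0.1),
        # Weak direct shots use the low velocity range; all other shot types
        # use the strong range here (an assumption, not stated in the paper).
        "velocity": (random.uniform(0.1, 2) if example["ShotType"] == "wdir"
                     else random.uniform(3, 4)),
    }
    example["black_in_pocket"] = simulate_shot(example, shot)
    return example
```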
We again used the C4.5 algorithm to obtain a qualitative model. The induced preference model is shown in Fig. 3.

S = no:
    L = no:  rrail < lrail < wdir < sdir
    L = yes: lrail = rrail < wdir < sdir
S = yes:
    L = no:
        R = no:  sdir = wdir < rrail < lrail
        R = yes: rrail = sdir = wdir < lrail
    L = yes: rrail = lrail = wdir = sdir

Fig. 3: The induced preference model, suggesting to the player the best action given the situation of the balls on the table.

The interpretations of the leaves of the tree are as follows:

S = no, L = no: Balls S and L are not on the table. In this case, a strong direct shot is preferred over a weak direct shot, since with a weak shot the velocity of the black ball may not be sufficient for it to reach the pocket. A weak shot is preferred over a left rail-first shot, which is preferred over a right rail-first shot. In general, rail-first shots are less successful than direct shots due to the distortions caused by the cue ball hitting the rail. However, the left rail-first shot is usually better, since the path of the ball is shorter than in the right rail-first shot. It is irrelevant whether ball R is present or not, because it can only block the right rail-first shot, which has the lowest preference in any case.

S = no, L = yes: Ball L is blocking the left rail-first shot, which makes both direct shots preferred over the rail-first shots. The preference among direct shots is the same as above, while both rail-first shots are equally preferred.

S = yes, L = no, R = no: Ball S is blocking the direct shots, which makes both rail-first shots preferable and both direct shots equally bad. The left rail-first shot is preferred over the right one for the same reason as above.

S = yes, L = no, R = yes: Balls S and R are blocking the shots. The left rail-first shot is preferred over all other shots, since it is the only open shot.

S = yes, L = yes: Balls S and L are blocking both the direct and the left rail-first shots. There are no shot preferences, since all shots are equally difficult.

The induced model conforms with expert opinion and common sense.

4 Related work

Preference learning [10,16,3] considers the problem of learning a preference model given learning examples as well as their preferences [2,7,5]. Our approach starts earlier and calculates the preferences for each learning example from the data. While one could continue by using standard preference learning approaches [7,5], we use simple machine learning algorithms and treat the preferences as values of a new class variable. In theory, the number of class values could grow unreasonably large, but in practice it is well controlled by setting the threshold parameter (for joining the values) and the size of the neighbourhood of the reference example (a kind of smoothing of the data).

The motivation for our work comes from the field of qualitative reasoning, where Qube can be used for learning qualitative models from categorical data. Qualitative reasoning has mostly been concerned with qualitative physics [8,19,20,12,11]. In these works the model was provided by an expert and then used in qualitative simulations. There are only a few algorithms for the automated induction of such models [17], and even these are limited to learning from numerical data [4,25]. To the best of our knowledge, there are no algorithms for learning qualitative models from categorical data.

An important part of our method deals with relaxing the strong independence assumption of the naive Bayesian approach. A number of methods exist for this purpose, yet none fits our context. Kononenko introduced the semi-naive Bayesian classifier [18], and Langley and Sage [21] proposed the Selective Bayesian Classifier, a variant of the naive method that uses only a subset of the attributes in making predictions. Since their algorithm only searches for the subset of attributes that yields the highest classification accuracy, it cannot reveal attribute dependencies. Friedman and Goldszmidt introduced tree augmented naive Bayes (TAN) [15], which allows an attribute to have another attribute as a parent in the Bayesian network representation. A lazy algorithm, named Locally Weighted Naive Bayes (LWNB), is proposed in [14]. LWNB relaxes the independence assumption by learning local models at prediction time. The models are learned on a weighted set of training instances in the neighbourhood of the test instance; in LWNB, the test example's neighbourhood is chosen using the k-nearest neighbours algorithm. A step further is the Lazy Bayesian Rules (LBR) algorithm [27]. The LBR search of the local neighbourhood is not based on a global metric.
Instead, for each test example, LBR uses a greedy search to generate a Bayesian rule with an antecedent that matches the test example. The basic difference between these approaches and ours is that they are concerned with optimizing the accuracy of predictions, not with estimating the influence of the chosen attribute on the target class probability.

5 Conclusion

We have presented a novel approach to learning preference models. Its specific feature is that it learns the preferences among the values of an attribute that lead to the preferred class, in the context of the other attributes. That is, in a given situation (as described by the values of the attributes), it tells which value of a particular attribute should increase the probability of the preferred class, where the preference among the classes is known, e.g. to survive or to reach the goal of the game. We demonstrated the method on the standard Titanic toy data set and on a larger problem of choosing the optimal shot in the game of billiards. While not offering (and not expected to offer) any great new insights into the game, the method resulted in a model which is intuitively obvious and correct.

This work was supported by the Slovenian research agency ARRS (J2-2194, P2-0209).

Appendix

We need to prove that the ordering of the probabilities p(c | a_1, …, a_i, …, a_n) does not change if we omit from the conditional part the values which are conditionally independent of a_i given the class c. Let us first redefine the PDQ PD using conditional log odds ratios,

\[
\frac{\partial_Q f}{\partial_Q A_i : a_i \to a'_i}(a_1, \ldots, a_n) =
\operatorname{sgn}\left( \ln \frac{p(c \mid a_1, \ldots, a'_i, \ldots, a_n) \,/\, p(\bar{c} \mid a_1, \ldots, a'_i, \ldots, a_n)}{p(c \mid a_1, \ldots, a_i, \ldots, a_n) \,/\, p(\bar{c} \mid a_1, \ldots, a_i, \ldots, a_n)} \right), \tag{8}
\]

where \bar{c} is the complement of the target class c. It is easy to see that (8) is equivalent to (4). Let us, without loss of generality, assume that the values a_1 to a_k, k < i, are conditionally independent of the values a_{k+1} to a_n, given the class. Applying the Bayes rule, using the independence assumption, cancelling the identical terms and reapplying the Bayes rule turns (8) into

\[
\frac{\partial_Q f}{\partial_Q A_i : a_i \to a'_i}(a_1, \ldots, a_n) =
\operatorname{sgn}\left( \ln \frac{p(c \mid a_{k+1}, \ldots, a'_i, \ldots, a_n) \,/\, p(\bar{c} \mid a_{k+1}, \ldots, a'_i, \ldots, a_n)}{p(c \mid a_{k+1}, \ldots, a_i, \ldots, a_n) \,/\, p(\bar{c} \mid a_{k+1}, \ldots, a_i, \ldots, a_n)} \right). \tag{9}
\]

This is equivalent to (4) without the values a_1 to a_k. Therefore,

\[
p(c \mid a_1, \ldots, a_i, \ldots, a_n) \le p(c \mid a_1, \ldots, a'_i, \ldots, a_n)
\;\iff\;
p(c \mid a_{k+1}, \ldots, a_i, \ldots, a_n) \le p(c \mid a_{k+1}, \ldots, a'_i, \ldots, a_n).
\]
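For readers who want the cancellation between (8) and (9) spelled out, the following is a sketch of that step, under exactly the independence assumption stated above.

```latex
% Sketch of the step from (8) to (9), assuming p(a_1,...,a_k | c) factors out
% of the class-conditional joint, i.e. the stated conditional independence.
\begin{align*}
\frac{p(c \mid a_1,\ldots,a'_i,\ldots,a_n)}{p(\bar c \mid a_1,\ldots,a'_i,\ldots,a_n)}
  &= \frac{p(a_1,\ldots,a'_i,\ldots,a_n \mid c)\, p(c)}
          {p(a_1,\ldots,a'_i,\ldots,a_n \mid \bar c)\, p(\bar c)}
     && \text{(Bayes rule)} \\
  &= \frac{p(a_1,\ldots,a_k \mid c)\, p(a_{k+1},\ldots,a'_i,\ldots,a_n \mid c)\, p(c)}
          {p(a_1,\ldots,a_k \mid \bar c)\, p(a_{k+1},\ldots,a'_i,\ldots,a_n \mid \bar c)\, p(\bar c)}
     && \text{(independence)}
\end{align*}
% The same expansion holds with a_i in place of a'_i.  In the ratio of the two
% odds, the factors p(a_1,...,a_k | c), p(a_1,...,a_k | \bar c), p(c) and
% p(\bar c) do not involve the value of A_i (recall k < i), so they cancel;
% reapplying the Bayes rule to what remains gives the odds conditioned on
% a_{k+1},...,a_n only, i.e. (9).
```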
References

1. Alciatore, D.G.: The Illustrated Principles of Pool and Billiards. Sterling, 1st edn. (2004)
2. Boutilier, C., Brafman, R.I., Hoos, H.H., Poole, D.: CP-nets: A tool for representing and reasoning with conditional ceteris paribus preference statements. Journal of Artificial Intelligence Research 21 (2004)
3. Brafman, R.I.: Preferences, planning and control. In: KR, pp. 2-5 (2008)
4. Bratko, I., Šuc, D.: Learning qualitative models. AI Magazine 24(4) (2003)
5. Brochu, E., de Freitas, N., Ghosh, A.: Active preference learning with discrete choice data. In: Advances in Neural Information Processing Systems (2007)
6. Cestnik, B.: Estimating probabilities: A crucial task in machine learning. In: ECAI (1990)
7. Chu, W., Ghahramani, Z.: Preference learning with Gaussian processes. In: ICML '05: Proceedings of the 22nd International Conference on Machine Learning. ACM, New York, NY, USA (2005)
8. de Kleer, J., Brown, J.: A qualitative physics based on confluences. Artificial Intelligence 24, 7-83 (1984)
9. Demšar, J., Zupan, B., Leban, G., Curk, T.: Orange: From experimental machine learning to interactive data mining. In: PKDD (2004)
10. Doyle, J.: Prospects for preferences. Computational Intelligence 20(2) (2004)
11. Forbus, K.: Qualitative process theory. Artificial Intelligence 24 (1984)
12. Forbus, K.: Qualitative reasoning. CRC Press (1997)
13. Frank, A., Asuncion, A.: UCI machine learning repository (2010)
14. Frank, E., Hall, M., Pfahringer, B.: Locally weighted naive Bayes. In: Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI 2003) (2003)
15. Friedman, N., Goldszmidt, M.: Building classifiers using Bayesian networks. In: Proceedings of the Thirteenth National Conference on Artificial Intelligence. AAAI Press (1996)
16. Fürnkranz, J., Hüllermeier, E.: Preference learning. KI 19(1) (2005)
17. Klenk, M., Forbus, K.: Analogical model formulation for transfer learning in AP Physics. Artificial Intelligence 173(18) (2009)
18. Kononenko, I.: Semi-naive Bayesian classifier. In: EWSL (1991)
19. Kuipers, B.: Qualitative simulation. Artificial Intelligence 29 (1986)
20. Kuipers, B.: Qualitative Reasoning: Modeling and Simulation with Incomplete Knowledge. MIT Press, Massachusetts (1994)
21. Langley, P., Sage, S.: Induction of selective Bayesian classifiers. In: Proceedings of the Tenth Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann (1994)
22. Papavasiliou, D.: Billiards manual. Tech. rep. (2009)
23. Sokal, R.R., Michener, C.D.: A statistical method for evaluating systematic relationships. University of Kansas Scientific Bulletin 28 (1958)
24. Žabkar, J., Možina, M., Bratko, I., Demšar, J.: Learning qualitative relations from categorical data. In: Proc. of the 24th International Workshop on Qualitative Reasoning. Portland, USA (2010)
25. Žabkar, J., Bratko, I., Demšar, J.: Learning qualitative models through partial derivatives by Padé. In: Proc. of the 21st International Workshop on Qualitative Reasoning. Aberystwyth, U.K. (2007)
26. Žabkar, J., Možina, M., Bratko, I., Demšar, J.: Discovering monotone relations with Padé. In: ECML 2009 Workshop: Learning Monotone Models from Data (2009)
27. Zheng, Z., Webb, G.I., Ting, K.M.: Lazy Bayesian rules: a lazy semi-naive Bayesian learning technique competitive to boosting decision trees. In: Proc. 16th International Conference on Machine Learning. Morgan Kaufmann, San Francisco, CA (1999)
Fig. 4: Examples of the different types of shots used in our case study: (a) direct shot, (b) left rail-first shot, (c) right rail-first shot.