# Probabilistic Rough Set Approximations

## Transcription

Yiyu (Y.Y.) Yao

Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada S4S 0A2

Preprint submitted to Elsevier Science

### Abstract

Probabilistic approaches have been applied to the theory of rough sets in several forms, including decision-theoretic analysis, variable precision analysis, and information-theoretic analysis. Based on rough membership functions and rough inclusion functions, we revisit probabilistic rough set approximation operators and present a critical review of existing studies. Intuitively, they are defined based on a pair of thresholds representing the desired levels of precision. Formally, the Bayesian decision-theoretic analysis is adopted to provide a systematic method for determining the precision parameters by using more familiar notions of costs and risks. Results from existing studies are reviewed, synthesized and critically analyzed, and new results on the decision-theoretic rough set model are reported.

Key words: rough sets, approximation operators, decision-theoretic model, variable precision and parameterized models, probabilistic rough set models

### 1 Introduction

In the standard rough set model proposed by Pawlak [26,27], the lower and upper approximations are defined based on the two extreme cases regarding the relationships between an equivalence class and a set. The lower approximation requires that the equivalence class is a subset of the set. For the upper approximation, the equivalence class must have a non-empty overlap with the set. A lack of consideration for the degree of their overlap unnecessarily limits the applications of rough sets and has motivated many researchers to investigate probabilistic generalizations of the theory [11,14,25,28-30,32,33,42-46,52,54,56-58,60,66,69].

Probabilistic approaches to rough sets have appeared in many forms, such as the decision-theoretic rough set model [52,54,56-58], the variable precision rough set model [14,66,69], the Bayesian rough set model [11,34,35], information-theoretic analysis [3,41], probabilistic rule induction [6,8,13,19-21,31,37-39,64,65,67,68], and many related studies [44,45]. The extensive results increase our understanding of the theory. At the same time, it seems necessary to provide a unified and comprehensive framework so that those results can be put together into an integrated whole, rather than remaining separate studies [54]. Most of the papers in this special issue aim at such a goal. The current paper focuses specifically on the issues of probabilistic approximations. The existing results are revisited and critically reviewed, and new results are provided.

Probabilistic rough set approximations can be formulated based on the notions of rough membership functions [28] and rough inclusion [30]. Both notions can be interpreted in terms of conditional probabilities or a posteriori probabilities. Threshold values, known as parameters, are applied to a rough membership function or a rough inclusion to obtain probabilistic or parameterized approximations. Three probabilistic models have been proposed and studied intensively. They are the decision-theoretic rough set model [52,54,56,58], the variable precision rough set model [14,66], and the Bayesian rough set model [11,34,35]. The main differences among those models are their different, but equivalent, formulations of probabilistic approximations and their interpretations of the required parameters.

The variable precision rough set model treats the required parameters as a primitive notion. The interpretation and the process of determining the parameters are based on rather intuitive arguments and left to empirical studies. There is a lack of theoretical and systematic studies and justifications for the choice of the threshold parameters.
In fact, a solution to this problem was reported earlier in a decision-theoretic framework for probabilistic rough set approximations [52,56,58], based on the well established Bayesian decision procedure for classification [7]. Within the decision-theoretic framework, the required threshold values can be easily interpreted and calculated based on more concrete notions, such as costs and risks. Unfortunately, many researchers are still unaware of the decision-theoretic model and tend to estimate the parameters based on tedious trial-and-error approaches. By explicitly showing the connections between the two models in this paper, we hope to further the understanding of the theoretical foundations of probabilistic approximations.

The Bayesian rough set model [11,34,35] attempts to provide an alternative interpretation of the required parameters. The model is based on the Bayes rule that expresses the change from the a priori probability to the a posteriori probability, and a connection between classification and hypothesis verification. Under specific interpretations, the required parameters can be expressed in terms of various probabilities. It is not difficult to establish connections between the probabilities used in the Bayesian rough set model and the costs used in the decision-theoretic model. Additional parameters are introduced in the Bayesian rough set model. There remains the question of how to interpret and determine the required parameters systematically.

With the objective of bringing together existing studies on probabilistic rough set approximations in a unified and comprehensive framework, the rest of the paper is organized into four parts. In Section 2, we review the basic concepts of the standard rough set approximations. This establishes a basis and guidelines for various probabilistic generalizations. In Section 3, we examine the two fundamental notions of rough membership functions and rough inclusions. They serve as a foundation on which probabilistic rough set approximations can be developed. In Section 4, we critically review different formulations of probabilistic rough set approximations. In Section 5, we explicitly show the conditions on a loss function so that many specific classes of probabilistic rough set approximations introduced in Section 4 can be derived in the decision-theoretic model.

### 2 Standard Rough Set Approximations

Suppose $U$ is a finite and nonempty set called the universe. Let $E \subseteq U \times U$ be an equivalence relation on $U$, i.e., $E$ is reflexive, symmetric, and transitive. The basic building blocks of rough set theory are the equivalence classes of $E$. For an element $x \in U$, the equivalence class containing $x$ is given by:

$$[x]_E = \{y \in U \mid xEy\}. \tag{1}$$

When no confusion arises, we also simply write $[x]$. The family of all equivalence classes is also known as the quotient set of $U$, and is denoted by $U/E = \{[x] \mid x \in U\}$. It defines a partition of the universe, namely, a family of pairwise disjoint subsets whose union is the universe.

For an equivalence relation, the pair $apr = (U, E)$ is called an approximation space [26,27]. In the approximation space, we only have a coarsened view of the universe. Each equivalence class is considered as a whole granule instead of many individuals [51].
Equivalence classes are the elementary definable, measurable, or observable sets in the approximation space [26,27,50]. By taking unions of elementary definable sets, one can derive larger definable sets. The family of all definable sets contains the empty set and the whole set $U$, and is closed with respect to set complement, intersection, and union. It is a $\sigma$-algebra, $\sigma(U/E)$, over $U$. Furthermore, $\sigma(U/E)$ uniquely defines a topological space $(U, \sigma(U/E))$, in which $\sigma(U/E)$ is the family of all open and closed sets [26].

In general, the $\sigma$-algebra $\sigma(U/E)$ is only a subset of the power set $2^U$. An interesting issue is therefore the representation of undefinable sets in $2^U - \sigma(U/E)$ in terms of definable sets, in order to infer knowledge about undefinable sets. Similar to the interior and closure operators in topological spaces, one can define rough set approximation operators [26,27]. For a subset $A \subseteq U$, its lower approximation is the greatest definable set contained in $A$, and its upper approximation is the least definable set containing $A$. That is, for $A \subseteq U$,

$$\begin{aligned}
\underline{apr}(A) &= \bigcup \{X \mid X \in \sigma(U/E),\ X \subseteq A\}, \\
\overline{apr}(A) &= \bigcap \{X \mid X \in \sigma(U/E),\ A \subseteq X\}.
\end{aligned} \tag{2}$$

In the study of rough set theory, one often uses the following equivalent definitions [49]:

$$\begin{aligned}
\underline{apr}(A) &= \{x \in U \mid \forall y \in U\,[xEy \Longrightarrow y \in A]\}, \\
\overline{apr}(A) &= \{x \in U \mid \exists y \in U\,[xEy,\ y \in A]\};
\end{aligned} \tag{3}$$

and

$$\begin{aligned}
\underline{apr}(A) &= \{x \in U \mid [x] \subseteq A\}, \\
\overline{apr}(A) &= \{x \in U \mid [x] \cap A \neq \emptyset\}.
\end{aligned} \tag{4}$$

An element is in the lower approximation of $A$ if all of its equivalent elements are in $A$, and an element is in the upper approximation of $A$ if at least one of its equivalent elements is in $A$. The definition given by equation (2) is referred to as the subsystem based definition. The definitions given by equations (3) and (4) are referred to as the element based definitions [53]. Equivalently, from equation (4), one can also have a granule based definition:

$$\begin{aligned}
\underline{apr}(A) &= \bigcup \{[x] \in U/E \mid [x] \subseteq A\}, \\
\overline{apr}(A) &= \bigcup \{[x] \in U/E \mid [x] \cap A \neq \emptyset\}.
\end{aligned} \tag{5}$$

It provides a new interpretation of rough set approximations. The lower approximation is the union of equivalence classes that are subsets of $A$, and the upper approximation is the union of equivalence classes that have a non-empty intersection with $A$.

Let $A^c$ denote the complement of the set $A$. Some of the useful properties satisfied by the pair of approximation operators are summarized below [26,27,49]: for $A, B \subseteq U$,

$$\begin{array}{ll}
(\mathrm{L0}) & \underline{apr}(A) = (\overline{apr}(A^c))^c, \\
(\mathrm{U0}) & \overline{apr}(A) = (\underline{apr}(A^c))^c; \\
(\mathrm{L1}) & \underline{apr}(A \cap B) = \underline{apr}(A) \cap \underline{apr}(B), \\
(\mathrm{U1}) & \overline{apr}(A \cup B) = \overline{apr}(A) \cup \overline{apr}(B); \\
(\mathrm{L2}) & \underline{apr}(A \cup B) \supseteq \underline{apr}(A) \cup \underline{apr}(B), \\
(\mathrm{U2}) & \overline{apr}(A \cap B) \subseteq \overline{apr}(A) \cap \overline{apr}(B); \\
(\mathrm{L3}) & A \subseteq B \Longrightarrow \underline{apr}(A) \subseteq \underline{apr}(B), \\
(\mathrm{U3}) & A \subseteq B \Longrightarrow \overline{apr}(A) \subseteq \overline{apr}(B); \\
(\mathrm{L4}) & \underline{apr}(A) \subseteq A, \\
(\mathrm{U4}) & A \subseteq \overline{apr}(A); \\
(\mathrm{L5}) & \underline{apr}(A) = \overline{apr}(\underline{apr}(A)), \\
(\mathrm{U5}) & \overline{apr}(A) = \underline{apr}(\overline{apr}(A)); \\
(\mathrm{L6}) & \underline{apr}(A) = \underline{apr}(\underline{apr}(A)), \\
(\mathrm{U6}) & \overline{apr}(A) = \overline{apr}(\overline{apr}(A)); \\
(\mathrm{L7}) & \underline{apr}(A) = A \Longleftrightarrow A \in \sigma(U/E), \\
(\mathrm{U7}) & \overline{apr}(A) = A \Longleftrightarrow A \in \sigma(U/E).
\end{array}$$

Properties (L0) and (U0) state that the lower and upper approximations are a pair of dual operators. Properties (L1) and (U1) show the distributivity of $\underline{apr}$ over set intersection, and of $\overline{apr}$ over set union. Properties (L2) and (U2) state that the lower approximation operator is not necessarily distributive over set union, and the upper approximation operator is not necessarily distributive over set intersection. According to properties (L3) and (U3), both operators are monotonic with respect to set inclusion. By properties (L4) and (U4), a set lies within its lower and upper approximations. The next two pairs of properties state that the result of applying consecutive approximation operators is the same as the result of the operator closest to $A$. Properties (L7) and (U7) state that a set and its approximations are the same if and only if the set is a definable set in $\sigma(U/E)$.

Given a subset $A \subseteq U$, the universe can be divided into three disjoint regions, namely, the positive, the negative, and the boundary regions [26]:

$$\begin{aligned}
POS(A) &= \underline{apr}(A), \\
NEG(A) &= POS(A^c) = (\overline{apr}(A))^c, \\
BND(A) &= \overline{apr}(A) - \underline{apr}(A).
\end{aligned} \tag{6}$$

An element of the positive region $POS(A)$ definitely belongs to $A$, an element of the negative region $NEG(A)$ definitely does not belong to $A$, and an element of the boundary region $BND(A)$ only possibly belongs to $A$.
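The granule based definition of equation (5) and the three regions of equation (6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`partition`, `lower_approximation`, `upper_approximation`, `regions`), the eight-element universe, and the way equivalence is induced through a `key` function are all hypothetical and do not come from the paper.

```python
def partition(universe, key):
    """Group elements into equivalence classes: x E y iff key(x) == key(y)."""
    classes = {}
    for x in universe:
        classes.setdefault(key(x), set()).add(x)
    return list(classes.values())


def lower_approximation(classes, A):
    """Union of the equivalence classes that are subsets of A (equation (5))."""
    return set().union(*(c for c in classes if c <= A))


def upper_approximation(classes, A):
    """Union of the equivalence classes that overlap A (equation (5))."""
    return set().union(*(c for c in classes if c & A))


def regions(universe, classes, A):
    """Return the triple (POS(A), NEG(A), BND(A)) of equation (6)."""
    lower = lower_approximation(classes, A)
    upper = upper_approximation(classes, A)
    return lower, universe - upper, upper - lower


# Hypothetical data: eight objects, equivalent when they share a block index.
U = set(range(1, 9))
classes = partition(U, key=lambda x: (x - 1) // 2)  # {1,2}, {3,4}, {5,6}, {7,8}
A = {1, 2, 3, 7}
pos, neg, bnd = regions(U, classes, A)
```

In this toy example the class {1, 2} lies inside A, the class {5, 6} misses A entirely, and the two remaining classes only overlap A, so they form the boundary. The same helpers can be used to check the duality property (L0) on small inputs.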

The three regions and the approximation operators uniquely determine each other. One may therefore use any of the three pairs to represent a subset $A \subseteq U$:

$$\begin{aligned}
(POS(A),\ POS(A) \cup BND(A)) &= (\underline{apr}(A),\ \overline{apr}(A)), \\
(POS(A),\ BND(A)) &= (\underline{apr}(A),\ \overline{apr}(A) - \underline{apr}(A)), \\
(POS(A),\ NEG(A)) &= (\underline{apr}(A),\ (\overline{apr}(A))^c).
\end{aligned}$$

Each of them makes explicit a particular aspect of the approximations. The first pair is the most commonly used one, defining the lower and upper bounds within which the set $A$ lies. It is related to the notions of the core and support of a fuzzy set. The second pair explicitly gives the boundary elements under the approximations. The third pair focuses on what is definitely in $A$, in contrast to what is definitely not in $A$.

### 3 Rough Membership Functions and Rough Inclusion

Since the lower and upper approximations are dual operators, it is sufficient to consider one of them. According to equations (3) and (4), generalized approximation operators can be introduced by relaxing the conditions:

$$\begin{array}{ll}
(\mathrm{LC}) & \forall y \in U\,[xEy \Longrightarrow y \in A], \\
(\mathrm{SC}) & [x] \subseteq A.
\end{array}$$

For the logic condition (LC), one can use probabilistic versions. The results from graded modal logic [10,24], variable precision logic [22], and probabilistic logic [16] may be adopted. For the set-theoretic condition (SC), we may adopt the notion of the degree of set inclusion from many studies, such as approximate reasoning [46,61] and rough mereology [30].

#### 3.1 Rough membership functions

The concept of rough membership functions is based on the generalization of the strict logic condition (LC) into a probabilistic version. More specifically, the rough membership value of an element $x$, with respect to a set $A \subseteq U$, is typically defined in terms of a measure of the degree to which the logic condition (LC) is true.

The notion of a rough membership function was explicitly introduced by Pawlak and Skowron [28], although it had been used and studied earlier by Wong and Ziarko [43], Pawlak, Wong, and Ziarko [29], Yao, Wong and Lingras [58], Yao and Wong [56], and many other authors. Let $P : 2^U \longrightarrow [0, 1]$ be a probability function defined on the power set $2^U$, and $E$ an equivalence relation on $U$. The triplet $apr = (U, E, P)$ is called a probabilistic approximation space [29,43]. For a subset $A \subseteq U$, its rough membership function is given by the conditional probability as follows:

$$\mu_A(x) = P(A \mid [x]). \tag{7}$$

The rough membership value of $x$ with respect to $A$ is the probability that an element is in $A$ given that the element is in $[x]$. With the probabilistic interpretation of rough membership functions, we will use $\mu_A(x)$ and $P(A \mid [x])$ interchangeably in subsequent discussions. For a finite universe, the rough membership function is typically computed by [28]:

$$\mu_A(x) = \frac{|A \cap [x]|}{|[x]|}, \tag{8}$$

where $|\cdot|$ denotes the cardinality of a set. Rough membership functions satisfy the following properties [28,55]:

$$\begin{array}{ll}
(\mathrm{m1}) & \mu_U(x) = 1, \\
(\mathrm{m2}) & \mu_\emptyset(x) = 0, \\
(\mathrm{m3}) & xEy \Longrightarrow \mu_A(x) = \mu_A(y), \\
(\mathrm{m4}) & x \in A \Longrightarrow \mu_A(x) \neq 0, \\
(\mathrm{m5}) & x \notin A \Longrightarrow \mu_A(x) \neq 1, \\
(\mathrm{m6}) & \mu_A(x) = 1 \Longleftrightarrow [x] \subseteq A, \\
(\mathrm{m7}) & \mu_A(x) > 0 \Longleftrightarrow [x] \cap A \neq \emptyset, \\
(\mathrm{m8}) & A \subseteq B \Longrightarrow \mu_A(x) \le \mu_B(x), \\
(\mathrm{m9}) & \mu_{A^c}(x) = 1 - \mu_A(x), \\
(\mathrm{m10}) & \mu_{A \cup B}(x) = \mu_A(x) + \mu_B(x) - \mu_{A \cap B}(x), \\
(\mathrm{m11}) & A \cap B = \emptyset \Longrightarrow \mu_{A \cup B}(x) = \mu_A(x) + \mu_B(x), \\
(\mathrm{m12}) & \max(0, \mu_A(x) + \mu_B(x) - 1) \le \mu_{A \cap B}(x) \le \min(\mu_A(x), \mu_B(x)), \\
(\mathrm{m13}) & \max(\mu_A(x), \mu_B(x)) \le \mu_{A \cup B}(x) \le \min(1, \mu_A(x) + \mu_B(x)).
\end{array}$$

Those properties easily follow from the properties of a probability function. While (m1)-(m8) show the properties of rough membership functions, (m9)-(m13) show the properties of set-theoretic operations with rough membership functions. Property (m3) is particularly interesting: it shows that elements in the same equivalence class must have the same degree of membership. In other words, equivalent elements must have the same membership value.

#### 3.2 Rough inclusion

The concept of rough inclusion generalizes the set-theoretic condition (SC) in order to capture graded inclusion. The degree to which $[x]$ is included in $A$ depends on both the overlap and non-overlap parts of $[x]$ and $A$. In the rough set theory literature, the notion of rough inclusion, introduced explicitly by Polkowski and Skowron [30], has been studied using other names, including relative degree of misclassification [66], majority inclusion relation [66], vague inclusion [33], inclusion degrees [46,60,61], and so on.

Recall that the notion of rough membership functions is a generalization of the logic condition (LC). By the equivalence of the two conditions (LC) and (SC), we can extend the notion of a rough membership function to rough inclusion [30,33]. For the maximum membership value 1, we have $[x] \subseteq A$, namely, $[x]$ is a subset of $A$. For the minimum membership value 0, we have $[x] \cap A = \emptyset$, or equivalently $[x] \subseteq A^c$, namely, $[x]$ is totally not a subset of $A$. A value between 0 and 1 may be interpreted as the degree to which $[x]$ is a subset of $A$. Thus, one obtains a measure of graded inclusion of two sets [30,32,33,61]:

$$v(B \mid A) = \frac{|A \cap B|}{|A|}. \tag{9}$$

For the case where $A = \emptyset$, we define $v(B \mid \emptyset) = 1$, namely, the empty set is a subset of any set. Accordingly, the degree to which the equivalence class $[x]$ is included in a set $A$ is given by:

$$v(A \mid [x]) = \frac{|[x] \cap A|}{|[x]|}. \tag{10}$$

It follows that $v(A \mid [x]) = \mu_A(x)$. From the properties of a rough membership function, one can easily obtain the corresponding properties of rough inclusion. Similar to a rough membership function, the value of $v(B \mid A)$ can be interpreted as the conditional probability $P(B \mid A)$ that a randomly selected element from $A$ belongs to $B$.
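For a finite universe, equations (8) to (10) reduce to counting. The following Python sketch computes the graded inclusion $v(B \mid A)$ and derives the rough membership value from it; the function names, the nine-element universe, and the partition are assumptions made for illustration only. Exact fractions are used so that the properties can be checked without rounding error.

```python
from fractions import Fraction


def inclusion_degree(B, A):
    """v(B | A) = |A ∩ B| / |A|, with v(B | ∅) = 1 by convention (equation (9))."""
    if not A:
        return Fraction(1)
    return Fraction(len(A & B), len(A))


def rough_membership(classes, A, x):
    """mu_A(x) = v(A | [x]), where [x] is the equivalence class containing x."""
    block = next(c for c in classes if x in c)
    return inclusion_degree(A, block)


# Hypothetical partition of U = {1, ..., 9} and a target set A.
U = set(range(1, 10))
classes = [{1, 2, 3, 4}, {5, 6}, {7, 8, 9}]
A = {1, 2, 3, 5, 9}
mu = {x: rough_membership(classes, A, x) for x in U}
```

Here `mu[1]` is 3/4 because three of the four elements equivalent to 1 are in A, and every element of the same class receives the same value, as property (m3) requires.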
There is a close connection between graded inclusion and fuzzy set inclusion [33,59].

In the development of the variable precision rough set model, Ziarko [66] used an inverse measure of $v$ called the relative degree of misclassification:

$$c(B \mid A) = 1 - v(B \mid A) = 1 - \frac{|A \cap B|}{|A|}. \tag{11}$$

Bryniarski and Wybraniec-Skardowska [4] proposed to use a family of inclusion relations called context relations, indexed by a bounded and partially ordered set called a rank set. The unit interval $[0, 1]$ can be treated as a rank set. From a measure of graded inclusion, a context relation with respect to a value $\alpha \in [0, 1]$ can be defined by:

$$\subseteq_\alpha\ = \{(A, B) \mid v(B \mid A) \ge \alpha\}. \tag{12}$$

If one interprets $v$ as a fuzzy relation on $2^U$, the relation $\subseteq_\alpha$ may be interpreted as an $\alpha$-cut of the fuzzy relation. The use of a complete lattice, or a rank set, corresponds to the study of L-fuzzy sets and L-fuzzy relations in the theory of fuzzy sets [15].

Many proposals have been made to generalize and characterize the notion of graded inclusion. Skowron and Stepaniuk [33] suggested that graded (vague) inclusion of sets may be measured by a function,

$$v : 2^U \times 2^U \longrightarrow [0, 1],$$

with monotonicity regarding the first argument, namely, for $A, B, C \subseteq U$, $v(B \mid A) \le v(C \mid A)$ for any $B \subseteq C$. In this case, the function defined by equation (9) is an example of such a measure. Skowron and Polkowski [32] suggested new properties for rough inclusion, in addition to the monotonicity. The unit interval $[0, 1]$ can also be generalized to a complete lattice in the definition of rough inclusion [30]. Rough inclusion is only an example for measuring degrees of inclusion in rough mereology. A more detailed discussion on rough mereology and related concepts can be found in [30,32].

Zhang and Leung [61], and Xu et al. [46] proposed a generalized notion of inclusion degree in the context of a partially ordered set. Let $(L, \preceq)$ be a partially ordered set. A function $D : L \times L \longrightarrow [0, 1]$ is called a measure of inclusion degree if it satisfies the following properties [46]: for $a, b, c \in L$,

$$\begin{array}{ll}
(\mathrm{i}) & 0 \le D(b \mid a) \le 1, \\
(\mathrm{ii}) & a \preceq b \Longrightarrow D(b \mid a) = 1, \\
(\mathrm{iii}) & a \preceq b \preceq c \Longrightarrow D(a \mid b) \ge D(a \mid c), \\
(\mathrm{iv}) & a \preceq b \Longrightarrow D(a \mid c) \le D(b \mid c).
\end{array}$$

Property (i) is the normalization condition.
Property (ii) ensures that the degree of inclusion reaches the maximum value for the standard inclusion. Properties (iii) and (iv) state two types of monotonicity. When the concept of inclusion degree is applied to the partially ordered set $(2^U, \subseteq)$, we immediately obtain a rough inclusion [60].

### 4 Probabilistic Rough Set Approximations

The standard approximation operators ignore the detailed statistical information of the overlap of an equivalence class and a set [29,43]. By exploring such information, probabilistic approximation operators can be introduced.

#### 4.1 Standard approximations as the core and the support of a fuzzy set

According to properties (m6) and (m7) of rough membership functions, $\mu_A(x) = 1$ if and only if for all $y \in U$, $xEy$ implies $y \in A$, and $\mu_A(x) > 0$ if and only if there exists a $y \in U$ such that $xEy$ and $y \in A$. A rough membership function $\mu_A$ may be interpreted as a special kind of fuzzy membership function. Under this interpretation, it is possible to re-express the standard rough set approximations [28,29,43], and to establish their connection to the core and support of a fuzzy set [55], as follows:

$$\begin{aligned}
\underline{apr}(A) &= \{x \in U \mid \mu_A(x) = 1\} = \mathrm{core}(\mu_A), \\
\overline{apr}(A) &= \{x \in U \mid \mu_A(x) > 0\} = \mathrm{support}(\mu_A).
\end{aligned} \tag{13}$$

That is, the lower and upper approximations of a set $A$ are in fact the core and support of the fuzzy set $\mu_A$, respectively.

In the theory of fuzzy sets, fuzzy set intersection and union are commonly defined in terms of a pair of t-norm and t-conorm [15]. Suppose fuzzy set complement is defined by $1 - (\cdot)$. For a pair of t-norm and t-conorm, the core and support of fuzzy sets satisfy the same properties as (L0)-(L6) and (U0)-(U6), with the lower approximation replaced by the core and the upper approximation by the support [55]. If the core and support are interpreted as a qualitative representation of a fuzzy set, one may conclude that the theories of fuzzy sets and rough sets share the same qualitative properties.

#### 4.2 The 0.5 probabilistic approximations

An attempt to use probabilistic information for approximations was suggested by Pawlak, Wong, and Ziarko [29]. Their model is based essentially on the majority rule. An element $x$ is put into the lower approximation of $A$ if the majority of its equivalent elements $[x]$ are in $A$. That is,

$$\begin{aligned}
\underline{apr}_{0.5}(A) &= \{x \in U \mid P(A \mid [x]) > 0.5\}, \\
\overline{apr}_{0.5}(A) &= \{x \in U \mid P(A \mid [x]) \ge 0.5\}.
\end{aligned} \tag{14}$$

The lower and upper 0.5 probabilistic approximation operators are dual to each other. The boundary region consists of those elements whose conditional probabilities are exactly 0.5, which represents maximal uncertainty.

#### 4.3 Probabilistic approximations as the α-cuts of a fuzzy set

The standard approximations and 0.5 probabilistic approximations use special points of the probability, namely, the two extreme points 0 and 1, and the middle point 0.5. By considering other values, Yao and Wong [56] introduced more general probabilistic approximations in the decision-theoretic model.

In the theory of fuzzy sets, the $\alpha$-cut and strong $\alpha$-cut are important notions [15]. For $\alpha \in [0, 1]$, the $\alpha$-cut and strong $\alpha$-cut are defined, respectively, by:

$$\begin{aligned}
(\mu_A)_\alpha &= \{x \in U \mid \mu_A(x) \ge \alpha\}, \\
(\mu_A)_{\alpha^+} &= \{x \in U \mid \mu_A(x) > \alpha\}.
\end{aligned} \tag{15}$$

Using $\alpha$-cuts, the standard rough set approximations can be expressed as $\underline{apr}(A) = (\mu_A)_1$ and $\overline{apr}(A) = (\mu_A)_{0^+}$. The 0.5 probabilistic approximations can be expressed as $\underline{apr}_{0.5}(A) = (\mu_A)_{0.5^+}$ and $\overline{apr}_{0.5}(A) = (\mu_A)_{0.5}$.

For generalized probabilistic approximations, a pair of parameters $\alpha, \beta \in [0, 1]$ with $\alpha \ge \beta$ is used. The condition $\alpha \ge \beta$ ensures that the lower approximation is smaller than the upper approximation, in order to be consistent with existing approximation operators. Yao and Wong [56] considered two separate cases, $0 \le \beta < \alpha \le 1$ and $0 \neq \beta = \alpha$. For $0 \le \beta < \alpha \le 1$, the standard rough approximations are extended by the definition [56]:

$$\begin{aligned}
\underline{apr}_\alpha(A) &= \{x \in U \mid P(A \mid [x]) \ge \alpha\}, \\
\overline{apr}_\beta(A) &= \{x \in U \mid P(A \mid [x]) > \beta\}.
\end{aligned} \tag{16}$$
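Equation (16) amounts to thresholding the conditional probability of each equivalence class. A minimal Python sketch, assuming a finite universe with $P(A \mid [x])$ computed by counting as in equation (8), is given below. The function names, the partition, and the threshold values are hypothetical and are chosen only to make the three regions visible.

```python
from fractions import Fraction


def conditional_probability(A, block):
    """P(A | [x]) computed by counting, as in equation (8)."""
    return Fraction(len(A & block), len(block))


def probabilistic_lower(classes, A, alpha):
    """Lower approximation of equation (16): classes with P(A | [x]) >= alpha."""
    return set().union(
        *(c for c in classes if conditional_probability(A, c) >= alpha)
    )


def probabilistic_upper(classes, A, beta):
    """Upper approximation of equation (16): classes with P(A | [x]) > beta."""
    return set().union(
        *(c for c in classes if conditional_probability(A, c) > beta)
    )


# Hypothetical data: the three classes have P(A | [x]) = 3/4, 1, and 1/3.
U = set(range(1, 10))
classes = [{1, 2, 3, 4}, {5, 6}, {7, 8, 9}]
A = {1, 2, 3, 5, 6, 9}

lower = probabilistic_lower(classes, A, Fraction(7, 10))
upper = probabilistic_upper(classes, A, Fraction(3, 10))
boundary = upper - lower
```

With $\alpha = 0.7$ and $\beta = 0.3$ the first two classes enter the lower approximation and the third class forms the boundary. Setting $\alpha = 1$ and $\beta = 0$ in the same functions gives back the standard approximations.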

For $\alpha = \beta \neq 0$, the 0.5 probabilistic approximations are extended by the definition [56]:

$$\begin{aligned}
\underline{apr}_\alpha(A) &= \{x \in U \mid P(A \mid [x]) > \alpha\}, \\
\overline{apr}_\alpha(A) &= \{x \in U \mid P(A \mid [x]) \ge \alpha\}.
\end{aligned} \tag{17}$$

For $0 < \beta \le \alpha < 1$, Wei and Zhang [42] suggested another version in which the lower approximation is defined by $>$ and the upper approximation by $\ge$ instead. One advantage of their definition is that we do not need to have two separate cases. However, their definition cannot produce the standard rough set approximations. As will be shown in the next section, both versions are derivable from the decision-theoretic model if different tie-breaking criteria are used.

With a pair of arbitrary $\alpha$ and $\beta$, the probabilistic approximation operators are not necessarily dual to each other. In order to obtain a pair of dual operators, we set $\beta = 1 - \alpha$. Then, the lower and upper probabilistic approximation operators are dual operators.

Although the above formulation is motivated by the notion of $\alpha$-cuts in fuzzy set theory, similar notions have in fact been considered in many fields, such as variable precision (probabilistic) logic [22], probabilistic modal logic [10], graded/fuzzy modal logic [24], and many others. The use of thresholds on probability values for making a practical decision is in fact a common method in many fields, such as pattern recognition and classification [7], machine learning [23], data mining [2], and information retrieval [40], to name just a few.

Consider the condition $\alpha > \beta$.
Based on the definition of equation (16), the probabilistic rough set operators satisfy the following properties [56,66]: for $0 \le \beta < \alpha \le 1$, $\alpha \in (0, 1]$ and $\beta \in [0, 1)$,

$$\begin{array}{ll}
(\mathrm{PL0}) & \underline{apr}_\alpha(A) = (\overline{apr}_{1-\alpha}(A^c))^c, \\
(\mathrm{PU0}) & \overline{apr}_\alpha(A) = (\underline{apr}_{1-\alpha}(A^c))^c; \\
(\mathrm{PL1}) & \underline{apr}_\alpha(A \cap B) \subseteq \underline{apr}_\alpha(A) \cap \underline{apr}_\alpha(B), \\
(\mathrm{PU1}) & \overline{apr}_\beta(A \cup B) \supseteq \overline{apr}_\beta(A) \cup \overline{apr}_\beta(B); \\
(\mathrm{PL2}) & \underline{apr}_\alpha(A \cup B) \supseteq \underline{apr}_\alpha(A) \cup \underline{apr}_\alpha(B), \\
(\mathrm{PU2}) & \overline{apr}_\beta(A \cap B) \subseteq \overline{apr}_\beta(A) \cap \overline{apr}_\beta(B); \\
(\mathrm{PL3}) & A \subseteq B \Longrightarrow \underline{apr}_\alpha(A) \subseteq \underline{apr}_\alpha(B), \\
(\mathrm{PU3}) & A \subseteq B \Longrightarrow \overline{apr}_\beta(A) \subseteq \overline{apr}_\beta(B); \\
(\mathrm{PLU4}) & \underline{apr}_\alpha(A) \subseteq \overline{apr}_\beta(A); \\
(\mathrm{PL5}) & \underline{apr}_\alpha(A) = \overline{apr}_\beta(\underline{apr}_\alpha(A)), \\
(\mathrm{PU5}) & \overline{apr}_\beta(A) = \underline{apr}_\alpha(\overline{apr}_\beta(A)); \\
(\mathrm{PL6}) & \underline{apr}_\alpha(A) = \underline{apr}_\alpha(\underline{apr}_\alpha(A)), \\
(\mathrm{PU6}) & \overline{apr}_\beta(A) = \overline{apr}_\beta(\overline{apr}_\beta(A)); \\
(\mathrm{PL7}) & \underline{apr}_\alpha(A) = A \Longleftrightarrow A \in \sigma(U/E), \\
(\mathrm{PU7}) & \overline{apr}_\beta(A) = A \Longleftrightarrow A \in \sigma(U/E).
\end{array}$$

They are counterparts of the properties (L0)-(L7) and (U0)-(U7) of the standard rough set approximation operators. As stated earlier, $\underline{apr}_\alpha$ and $\overline{apr}_\beta$ are defined differently for the case when $\alpha = \beta$, where similar properties can be obtained.

For probabilistic approximation operators, one can have additional properties: for $\alpha, \alpha' \in (0, 1]$ and $\beta, \beta' \in [0, 1)$,

$$\begin{array}{ll}
(\mathrm{PL8}) & \alpha \ge \alpha' \Longrightarrow \underline{apr}_\alpha(A) \subseteq \underline{apr}_{\alpha'}(A), \\
(\mathrm{PU8}) & \beta \ge \beta' \Longrightarrow \overline{apr}_\beta(A) \subseteq \overline{apr}_{\beta'}(A); \\
(\mathrm{PL9}) & \underline{apr}(A) = \underline{apr}_1(A), \\
(\mathrm{PU9}) & \overline{apr}(A) = \overline{apr}_0(A).
\end{array}$$

Properties (PL8) and (PU8) show that both probabilistic approximation operators are monotonic decreasing with respect to the parameters $\alpha$ and $\beta$. Properties (PL9) and (PU9) establish the connection between probabilistic approximation operators and the standard approximation operators.

#### 4.4 The variable precision and parameterized rough set models

With the introduction of rough inclusion, the standard approximation space can be generalized to $apr = (U, I, v)$, where $I : U \longrightarrow 2^U$ is an information function and $v$ is a measure of rough inclusion [25,33]. The mapping $[\cdot]$ that maps an element to its equivalence class is an instance of the information function $I$. By applying threshold values on a rough inclusion $v$, it is possible to derive variable precision or parameterized approximations [11,33,66] by generalizing equation (4) of the element based definition or equation (5) of the granule based definition. The variable precision rough set model is one of the best known of such formulations [66].

In formulating the variable precision rough set model, Ziarko [66] used the relative degree of misclassification function $c$ and the granule based definition of approximations. To be consistent with the previous and subsequent discussions, we present a slightly different, but equivalent, formulation based on the rough inclusion,

$$v(A \mid [x]) = \frac{|A \cap [x]|}{|[x]|} = P(A \mid [x]), \tag{18}$$

and the element based definition. As mentioned in the earlier discussion, based on the rough inclusion $v$, we can define different levels of set inclusion [4,66]:

$$[x] \subseteq_\alpha A \Longleftrightarrow v(A \mid [x]) \ge \alpha \Longleftrightarrow P(A \mid [x]) \ge \alpha, \tag{19}$$

where $\alpha \in (0, 1]$. When defining the lower approximation, the majority requirement of the variable precision rough set model suggests that more than 50% of the elements in an equivalence class $[x]$ must be in $A$ in order for $x$ to be in the lower approximation. In other words, the set-theoretic condition (SC) must hold to a degree greater than 0.5. We need to choose the threshold value $\alpha$ in the range $(0.5, 1]$. By generalizing equation (4), the $\alpha$-level lower approximation is given by: for $\alpha \in (0.5, 1]$,

$$\underline{apr}_\alpha(A) = \{x \in U \mid [x] \subseteq_\alpha A\} = \{x \in U \mid P(A \mid [x]) \ge \alpha\}. \tag{20}$$

The corresponding upper approximation is defined based on the dual of the lower approximation:

$$\overline{apr}_{1-\alpha}(A) = (\underline{apr}_\alpha(A^c))^c = \{x \in U \mid P(A \mid [x]) > 1 - \alpha\}. \tag{21}$$

The condition $0.5 < \alpha \le 1$ implies $0 \le 1 - \alpha < 0.5$. It follows that the lower approximation is a subset of the upper approximation. The pair of parameters $(\alpha, 1 - \alpha)$ is referred to as the symmetric bounds, as it produces a pair of dual approximation operators $(\underline{apr}_\alpha, \overline{apr}_{1-\alpha})$.

Variable precision rough sets with asymmetric bounds were examined by Katzberg and Ziarko [14]. It is only required that $0 \le \beta < \alpha \le 1$, where $\beta$ is used to define the upper approximation and $\alpha$ is used to define the lower approximation, as defined in equation (16). Although $\underline{apr}_\alpha$ and $\overline{apr}_\beta$ are not necessarily dual to each other, Yao and Wong [56] showed that the two pairs of operators, $(\underline{apr}_\alpha, \overline{apr}_\beta)$ and $(\underline{apr}_{1-\beta}, \overline{apr}_{1-\alpha})$, are complements of each other.

For the special rough inclusion $v(A \mid [x]) = P(A \mid [x])$, the probabilistic approximations from the decision-theoretic model and the variable precision model are equivalent. The main differences are their formulations. The decision-theoretic model systematically derives many types of approximation operators and provides theoretical guidelines for the estimation of required parameters, while the variable precision model relies much on intuitive arguments.
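The duality under symmetric bounds in equation (21), and the complement relation between the pairs $(\underline{apr}_\alpha, \overline{apr}_\beta)$ and $(\underline{apr}_{1-\beta}, \overline{apr}_{1-\alpha})$, can be checked numerically on a small example. The Python sketch below does so under the assumption that $P(A \mid [x])$ is obtained by counting; the helper names, the data, and the chosen thresholds are hypothetical.

```python
from fractions import Fraction


def prob(A, block):
    """P(A | [x]) by counting elements of the class that fall in A."""
    return Fraction(len(A & block), len(block))


def lower(classes, A, alpha):
    """alpha-level lower approximation, equation (20)."""
    return set().union(*(c for c in classes if prob(A, c) >= alpha))


def upper(classes, A, beta):
    """beta-level upper approximation, as in equation (16)."""
    return set().union(*(c for c in classes if prob(A, c) > beta))


# Hypothetical data: P(A | [x]) is 3/4, 1, and 1/3 on the three classes.
U = set(range(1, 10))
classes = [{1, 2, 3, 4}, {5, 6}, {7, 8, 9}]
A = {1, 2, 3, 5, 6, 9}
alpha, beta = Fraction(3, 4), Fraction(3, 10)

# Symmetric bounds: the upper approximation as the dual of the lower one.
dual_upper = U - lower(classes, U - A, alpha)
symmetric_ok = dual_upper == upper(classes, A, 1 - alpha)

# Asymmetric bounds: complements of the approximations of A are the
# approximations of the complement of A under the thresholds 1 - beta, 1 - alpha.
complement_ok = (
    U - lower(classes, A, alpha) == upper(classes, U - A, 1 - alpha)
    and U - upper(classes, A, beta) == lower(classes, U - A, 1 - beta)
)
```

Both flags evaluate to true for this data. A check on one example is of course not a proof; it only illustrates what the stated relations mean operationally.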

Ślęzak and Ziarko [35] and Ślęzak [34] introduced the Bayesian rough set model in an attempt to provide an alternative interpretation of the required parameters in the variable precision rough set model. By setting the parameters as the a priori probabilities, a pair of probabilistic approximations is defined by [35]: for $A \subseteq U$,

$$\begin{aligned}
\underline{apr}_{P(A)}(A) &= \{x \in U \mid P(A \mid [x]) > P(A)\}, \\
\overline{apr}_{P(A)}(A) &= \{x \in U \mid P(A \mid [x]) \ge P(A)\}.
\end{aligned} \tag{22}$$

They correspond to the case where $\alpha = \beta = P(A)$. Ślęzak [34] examined a more complicated version of Bayesian rough set approximations by comparing the probabilities $P([x] \mid A)$ and $P([x] \mid A^c)$:

$$\begin{aligned}
\underline{bapr}_{\delta_1}(A) &= \{x \in U \mid P([x] \mid A) \ge \delta_1 P([x] \mid A^c)\}, \\
\overline{bapr}_{\delta_2}(A) &= \{x \in U \mid P([x] \mid A) \ge \delta_2 P([x] \mid A^c)\},
\end{aligned} \tag{23}$$

where $\delta_1$ and $\delta_2$ are parameters. Based on the Bayes rule, one can easily find their corresponding variable precision approximations [34]. The corresponding parameters of the variable precision approximations are expressed in terms of the probability $P(A)$, $\delta_1$ and $\delta_2$. In comparison with the variable precision model, the new parameters of the Bayesian rough set model are less intuitive and their estimation becomes a challenge.

Greco, Matarazzo and Słowiński [11,12] observed that rough membership functions and rough inclusions, as defined by the conditional probabilities $P(A \mid [x])$, consider the overlap of $A$ and $[x]$ and do not explicitly consider the overlap of $A$ and $[x]^c$. By considering both overlaps, they introduced a relative rough membership function:

$$\hat{\mu}_A(x) = \frac{|A \cap [x]|}{|[x]|} - \frac{|A \cap [x]^c|}{|[x]^c|} = P(A \mid [x]) - P(A \mid [x]^c). \tag{24}$$

The relative rough membership function is an instance of a class of measures known as the Bayesian confirmation measures [9]. By incorporating a confirmation measure into the existing parameterized models, they proposed two-parameterized approximations [11,12]:

$$\begin{aligned}
\underline{apr}_{\alpha,a}(A) &= \{x \in U \mid P(A \mid [x]) \ge \alpha \text{ and } bc([x], A) \ge a\}, \\
\overline{apr}_{\beta,b}(A) &= \{x \in U \mid P(A \mid [x]) > \beta \text{ or } bc([x], A) > b\},
\end{aligned} \tag{25}$$

where $bc(\cdot)$ is a Bayesian confirmation measure, and $a$ and $b$ are parameters in the range of $bc(\cdot)$. They have shown that the variable precision and Bayesian rough set models are special cases. More details of the two-parameterized approximation models can be found in their paper in this issue [11]. The extra parameters may make the model more effective, but at the same time they lead to more difficulties in estimating those parameters.

### 5 The Decision-Theoretic Rough Set Model

A fundamental difficulty with the probabilistic, variable precision, and parameterized approximations introduced in the last section is the physical interpretation of the required threshold parameters, as well as the lack of systematic methods for setting the parameters. This difficulty has in fact been resolved in the decision-theoretic model of rough sets proposed earlier [52,56,58]. This section reviews and summarizes the main results of the decision-theoretic framework and its connections to other studies. It draws extensive results from two previous papers [52,56] on the one hand and re-interprets these results on the other. For clarity, we only consider the element based definition of probabilistic approximation operators. The same argument can be easily applied to other cases.

#### 5.1 An overview of the Bayesian decision procedure

The Bayesian decision procedure deals mainly with making decisions with minimum risk or cost under probabilistic uncertainty. We present an overview by following the discussion in the textbook by Duda and Hart [7], in which more detailed information can be found.

Let $\Omega = \{w_1, \ldots, w_s\}$ be a finite set of $s$ states, and let $\mathcal{A} = \{a_1, \ldots, a_m\}$ be a finite set of $m$ possible actions. Let $P(w_j \mid \mathbf{x})$ be the conditional probability of an object $x$ being in state $w_j$ given that the object is described by $\mathbf{x}$. Without loss of generality, we simply assume that these conditional probabilities $P(w_j \mid \mathbf{x})$ are known.

Let $\lambda(a_i \mid w_j)$ denote the loss, or cost, for taking action $a_i$ when the state is $w_j$. For an object with description $\mathbf{x}$, suppose action $a_i$ is taken.
Since $P(w_j \mid \mathbf{x})$ is the probability that the true state is $w_j$ given $\mathbf{x}$, the expected loss associated with taking action $a_i$ is given by:

$$R(a_i \mid \mathbf{x}) = \sum_{j=1}^{s} \lambda(a_i \mid w_j) P(w_j \mid \mathbf{x}). \tag{26}$$
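
The expected loss in equation (26) is a weighted sum and can be sketched in a few lines. The following is only an illustrative sketch: the function names, the 3-by-2 loss table, and the posterior values are assumptions made for this example and do not come from the paper.

```python
from typing import List, Sequence


def conditional_risk(loss_row: Sequence[float], posterior: Sequence[float]) -> float:
    """Expected loss R(a_i | x) = sum_j lambda(a_i | w_j) * P(w_j | x)."""
    if len(loss_row) != len(posterior):
        raise ValueError("need exactly one loss value per state")
    return sum(loss * prob for loss, prob in zip(loss_row, posterior))


def conditional_risks(loss: Sequence[Sequence[float]],
                      posterior: Sequence[float]) -> List[float]:
    """Conditional risk of every action; loss[i][j] stands for lambda(a_i | w_j)."""
    return [conditional_risk(row, posterior) for row in loss]


# Hypothetical numbers, chosen only for illustration: m = 3 actions, s = 2 states.
loss = [
    [0.0, 4.0],  # lambda(a_1 | w_1), lambda(a_1 | w_2)
    [4.0, 0.0],  # lambda(a_2 | w_1), lambda(a_2 | w_2)
    [1.0, 1.0],  # lambda(a_3 | w_1), lambda(a_3 | w_2)
]
posterior = [0.7, 0.3]  # P(w_1 | x), P(w_2 | x), assumed to be known

risks = conditional_risks(loss, posterior)
for i, risk in enumerate(risks, start=1):
    print(f"R(a_{i} | x) = {risk:.2f}")
```

With these assumed numbers the three conditional risks work out to 1.2, 2.8 and 1.0, so the third action carries the smallest expected loss for this description.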

The quantity $R(a_i \mid \mathbf{x})$ is also called the conditional risk. Given description $\mathbf{x}$, a decision rule is a function $\tau(\mathbf{x})$ that specifies which action to take. That is, for every $\mathbf{x}$, $\tau(\mathbf{x})$ assumes one of the actions $a_1, \ldots, a_m$. The overall risk $\mathbf{R}$ is the expected loss associated with a given decision rule. Since $R(\tau(\mathbf{x}) \mid \mathbf{x})$ is the conditional risk associated with action $\tau(\mathbf{x})$, the overall risk is defined by:

$$\mathbf{R} = \sum_{\mathbf{x}} R(\tau(\mathbf{x}) \mid \mathbf{x})\, p(\mathbf{x}), \tag{27}$$

where the summation is over the set of all possible descriptions of objects, i.e., the knowledge representation space. If $\tau(\mathbf{x})$ is chosen so that $R(\tau(\mathbf{x}) \mid \mathbf{x})$ is as small as possible for every $\mathbf{x}$, the overall risk $\mathbf{R}$ is minimized.

The Bayesian decision procedure can be formally stated as follows. For every $\mathbf{x}$, compute the conditional risk $R(a_i \mid \mathbf{x})$ for $i = 1, \ldots, m$ defined by equation (26), and then select the action for which the conditional risk is minimum. If more than one action minimizes $R(a_i \mid \mathbf{x})$, any tie-breaking rule can be used.

5.2 Probabilistic rough set approximations

In an approximation space $apr = (U, E)$, all elements in the equivalence class $[x]$ share the same description [27,43]. For a given subset $A \subseteq U$, the approximation operators partition the universe into three disjoint classes $\mathrm{POS}(A)$, $\mathrm{NEG}(A)$, and $\mathrm{BND}(A)$. Furthermore, one decides how to assign $x$ into the three regions based on the conditional probability $P(A \mid [x])$. It follows that the Bayesian decision procedure can be immediately applied to solve this problem [52,56,58].

For deriving the probabilistic approximation operators, we have the following problem. The set of states is given by $\Omega = \{A, A^c\}$, indicating that an element is in $A$ and not in $A$, respectively. We use the same symbol to denote both a subset $A$ and the corresponding state. With respect to the three regions, the set of actions is given by $\mathcal{A} = \{a_1, a_2, a_3\}$, where $a_1$, $a_2$, and $a_3$ represent the three actions in classifying an object, namely, deciding $\mathrm{POS}(A)$, deciding $\mathrm{NEG}(A)$, and deciding $\mathrm{BND}(A)$, respectively.
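
The conditional probability $P(A \mid [x])$ on which the three-way assignment is based can be illustrated on a small finite universe. The sketch below assumes that the probability is estimated by counting, $|A \cap [x]| / |[x]|$, as in the first term of equation (24). The partition, the set `A`, and the function name are hypothetical and serve only as an example.

```python
from fractions import Fraction
from typing import List, Set


def rough_membership(block: Set[int], target: Set[int]) -> Fraction:
    """Estimate P(A | [x]) as |A ∩ [x]| / |[x]| for one equivalence class [x]."""
    if not block:
        raise ValueError("an equivalence class cannot be empty")
    return Fraction(len(block & target), len(block))


# Hypothetical universe U = {1, ..., 9}, split into three equivalence classes.
partition: List[Set[int]] = [{1, 2, 3, 4}, {5, 6}, {7, 8, 9}]
A: Set[int] = {1, 2, 3, 5, 9}

memberships = [rough_membership(block, A) for block in partition]
for block, value in zip(partition, memberships):
    print(sorted(block), value)
```

Every element of an equivalence class receives the same value, here 3/4, 1/2 and 1/3 for the three classes, which is why the decision can be made once per class rather than once per object.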
Let $\lambda(a_i \mid A)$ denote the loss incurred for taking action $a_i$ when an object in fact belongs to $A$, and let $\lambda(a_i \mid A^c)$ denote the loss incurred for taking the same action when the object does not belong to $A$. The rough membership values $\mu_A(x) = P(A \mid [x])$ and $\mu_{A^c}(x) = P(A^c \mid [x]) = 1 - P(A \mid [x])$ are in fact the probabilities that an object in the equivalence class $[x]$ belongs to $A$ and $A^c$, respectively. The expected loss $R(a_i \mid [x])$ associated with taking the individual actions can be expressed as:

$$\begin{aligned}
R(a_1 \mid [x]) &= \lambda_{11} P(A \mid [x]) + \lambda_{12} P(A^c \mid [x]), \\
R(a_2 \mid [x]) &= \lambda_{21} P(A \mid [x]) + \lambda_{22} P(A^c \mid [x]), \\
R(a_3 \mid [x]) &= \lambda_{31} P(A \mid [x]) + \lambda_{32} P(A^c \mid [x]),
\end{aligned} \tag{28}$$

where $\lambda_{i1} = \lambda(a_i \mid A)$, $\lambda_{i2} = \lambda(a_i \mid A^c)$, and $i = 1, 2, 3$. The Bayesian decision procedure leads to the following minimum-risk decision rules:

(P) If $R(a_1 \mid [x]) \leq R(a_2 \mid [x])$ and $R(a_1 \mid [x]) \leq R(a_3 \mid [x])$, decide $\mathrm{POS}(A)$;

(N) If $R(a_2 \mid [x]) \leq R(a_1 \mid [x])$ and $R(a_2 \mid [x]) \leq R(a_3 \mid [x])$, decide $\mathrm{NEG}(A)$;

(B) If $R(a_3 \mid [x]) \leq R(a_1 \mid [x])$ and $R(a_3 \mid [x]) \leq R(a_2 \mid [x])$, decide $\mathrm{BND}(A)$.

Tie-breaking rules should be added so that each element is classified into only one region. Since $P(A \mid [x]) + P(A^c \mid [x]) = 1$, the above decision rules can be simplified so that only the probabilities $P(A \mid [x])$ are involved. We can classify any object in the equivalence class $[x]$ based only on the probabilities $P(A \mid [x])$, i.e., the rough membership values, and the given loss function $\lambda_{ij}$, $i = 1, 2, 3$ and $j = 1, 2$.

Consider a special kind of loss functions with $\lambda_{11} \leq \lambda_{31} < \lambda_{21}$ and $\lambda_{22} \leq \lambda_{32} < \lambda_{12}$. That is, the loss of classifying an object $x$ belonging to $A$ into the positive region $\mathrm{POS}(A)$ is less than or equal to the loss of classifying $x$ into the boundary region $\mathrm{BND}(A)$, and both of these losses are strictly less than the loss of classifying $x$ into the negative region $\mathrm{NEG}(A)$. The reverse order of losses is used for classifying an object that does not belong to $A$. For this type of loss functions, the minimum-risk decision rules (P)-(B) can be written as:

(P) If $P(A \mid [x]) \geq \gamma$ and $P(A \mid [x]) \geq \alpha$, decide $\mathrm{POS}(A)$;

(N) If $P(A \mid [x]) \leq \beta$ and $P(A \mid [x]) \leq \gamma$, decide $\mathrm{NEG}(A)$;

(B) If $\beta \leq P(A \mid [x]) \leq \alpha$, decide $\mathrm{BND}(A)$;

where

$$\alpha = \frac{\lambda_{12} - \lambda_{32}}{(\lambda_{31} - \lambda_{32}) - (\lambda_{11} - \lambda_{12})}, \qquad \gamma = \frac{\lambda_{12} - \lambda_{22}}{(\lambda_{21} - \lambda_{22}) - (\lambda_{11} - \lambda_{12})},$$

$$\beta = \frac{\lambda_{32} - \lambda_{22}}{(\lambda_{21} - \lambda_{22}) - (\lambda_{31} - \lambda_{32})}. \tag{29}$$

By the assumptions $\lambda_{11} \leq \lambda_{31} < \lambda_{21}$ and $\lambda_{22} \leq \lambda_{32} < \lambda_{12}$, it follows that $\alpha \in (0, 1]$, $\gamma \in (0, 1)$, and $\beta \in [0, 1)$.

If a loss function with $\lambda_{11} \leq \lambda_{31} < \lambda_{21}$ and $\lambda_{22} \leq \lambda_{32} < \lambda_{12}$ further satisfies the condition:

$$(\lambda_{12} - \lambda_{32})(\lambda_{21} - \lambda_{31}) \geq (\lambda_{31} - \lambda_{11})(\lambda_{32} - \lambda_{22}), \tag{30}$$

then $\alpha \geq \gamma \geq \beta$. The condition ensures that probabilistic rough set approximations are consistent with the standard rough set approximations. In other words, the lower approximation is a subset of the upper approximation, and the boundary region may be non-empty.

The physical meaning of condition (30) may be interpreted as follows. Let $l = (\lambda_{12} - \lambda_{32})(\lambda_{21} - \lambda_{31})$ and $r = (\lambda_{31} - \lambda_{11})(\lambda_{32} - \lambda_{22})$. While $l$ is the product of the differences between the cost of making an incorrect classification and the cost of classifying an element into the boundary region, $r$ is the product of the differences between the cost of classifying an element into the boundary region and the cost of a correct classification. A larger value of $l$, or equivalently a smaller value of $r$, can be obtained if we move $\lambda_{32}$ away from $\lambda_{12}$, or move $\lambda_{31}$ away from $\lambda_{21}$. In fact, the condition can be intuitively interpreted as saying that the cost of classifying an element into the boundary region is closer to the cost of a correct classification than to the cost of an incorrect classification. Such a condition seems to be reasonable.

When $\alpha > \beta$, we have $\alpha > \gamma > \beta$. After tie-breaking, we obtain the decision rules:

(P1) If $P(A \mid [x]) \geq \alpha$, decide $\mathrm{POS}(A)$;

(N1) If $P(A \mid [x]) \leq \beta$, decide $\mathrm{NEG}(A)$;

(B1) If $\beta < P(A \mid [x]) < \alpha$, decide $\mathrm{BND}(A)$.

Based on the relationship between approximations and the three regions, we obtain the probabilistic approximations:

$$\underline{apr}_{\alpha}(A) = \{x \in U \mid P(A \mid [x]) \geq \alpha\}, \qquad \overline{apr}_{\beta}(A) = \{x \in U \mid P(A \mid [x]) > \beta\}.$$

When $\alpha = \beta$, we have $\alpha = \gamma = \beta$. In this case, we use the decision rules:

(P2) If $P(A \mid [x]) > \alpha$, decide $\mathrm{POS}(A)$;

(N2) If $P(A \mid [x]) < \alpha$, decide $\mathrm{NEG}(A)$;

(B2) If $P(A \mid [x]) = \alpha$, decide $\mathrm{BND}(A)$.

For the second set of decision rules, we use a tie-breaking criterion so that the boundary region may be non-empty. Probabilistic approximations can be obtained that are similar to the 0.5 probabilistic approximations introduced by Pawlak, Wong, and Ziarko [29].

As an example to illustrate the probabilistic approximations, consider a loss function:

$$\lambda_{12} = \lambda_{21} = 4, \qquad \lambda_{31} = \lambda_{32} = 1, \qquad \lambda_{11} = \lambda_{22} = 0. \tag{31}$$

It states that there is no cost for a correct classification, 4 units of cost for an incorrect classification, and 1 unit of cost for classifying an object into the boundary region. From equation (29), we have $\alpha = 0.75$, $\beta = 0.25$ and $\gamma = 0.5$. By decision rules (P1)-(B1), we have a pair of dual approximation operators $\underline{apr}_{0.75}$ and $\overline{apr}_{0.25}$.

In general, the relationships between a loss function $\lambda$ and the pair of parameters $(\alpha, \beta)$ can be established. For a loss function with $\lambda_{11} \leq \lambda_{31} < \lambda_{21}$ and $\lambda_{22} \leq \lambda_{32} < \lambda_{12}$, we have [52]:

- $\alpha$ is monotonic non-decreasing with respect to $\lambda_{12}$ and monotonic non-increasing with respect to $\lambda_{32}$. If $\lambda_{11} < \lambda_{31}$, $\alpha$ is strictly monotonic increasing with respect to $\lambda_{12}$ and strictly monotonic decreasing with respect to $\lambda_{32}$.
- $\alpha$ is strictly monotonic decreasing with respect to $\lambda_{31}$ and strictly monotonic increasing with respect to $\lambda_{11}$.
- $\beta$ is monotonic non-increasing with respect to $\lambda_{21}$ and monotonic non-decreasing with respect to $\lambda_{31}$. If $\lambda_{22} < \lambda_{32}$, $\beta$ is strictly monotonic decreasing with respect to $\lambda_{21}$ and strictly monotonic increasing with respect to $\lambda_{31}$.
- $\beta$ is strictly monotonic increasing with respect to $\lambda_{32}$ and strictly monotonic decreasing with respect to $\lambda_{22}$.

Such connections between the required parameters of probabilistic rough set approximations and loss functions have significant implications in applying the decision-theoretic model of rough sets.
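
The step from a loss function to the thresholds of equation (29), and from the thresholds to the rules (P1)-(B1), can be checked numerically. The sketch below is one possible rendering under stated assumptions: the `Loss` container, the function names, and the region labels are hypothetical, and only the formulas and the loss values of equation (31) are taken from the text.

```python
from typing import Dict, NamedTuple, Tuple


class Loss(NamedTuple):
    """Loss values lambda_ij; i indexes the action (POS, NEG, BND), j the state (A, A^c)."""
    l11: float
    l12: float
    l21: float
    l22: float
    l31: float
    l32: float


def thresholds(loss: Loss) -> Tuple[float, float, float]:
    """Return (alpha, beta, gamma) as given by equation (29)."""
    if not (loss.l11 <= loss.l31 < loss.l21 and loss.l22 <= loss.l32 < loss.l12):
        raise ValueError("loss function violates the ordering assumed in the text")
    alpha = (loss.l12 - loss.l32) / ((loss.l31 - loss.l32) - (loss.l11 - loss.l12))
    gamma = (loss.l12 - loss.l22) / ((loss.l21 - loss.l22) - (loss.l11 - loss.l12))
    beta = (loss.l32 - loss.l22) / ((loss.l21 - loss.l22) - (loss.l31 - loss.l32))
    return alpha, beta, gamma


def decide(p: float, alpha: float, beta: float) -> str:
    """Rules (P1)-(B1); they apply only when alpha > beta."""
    if alpha <= beta:
        raise ValueError("rules (P1)-(B1) require alpha > beta")
    if p >= alpha:
        return "POS"
    if p <= beta:
        return "NEG"
    return "BND"


def expected_losses(p: float, loss: Loss) -> Dict[str, float]:
    """Equation (28), with p = P(A | [x]) and 1 - p = P(A^c | [x])."""
    q = 1.0 - p
    return {
        "POS": loss.l11 * p + loss.l12 * q,
        "NEG": loss.l21 * p + loss.l22 * q,
        "BND": loss.l31 * p + loss.l32 * q,
    }


example = Loss(l11=0, l12=4, l21=4, l22=0, l31=1, l32=1)  # equation (31)
alpha, beta, gamma = thresholds(example)
print(alpha, beta, gamma)

for p in (0.1, 0.5, 0.9):  # arbitrary membership values away from the thresholds
    risks = expected_losses(p, example)
    print(p, decide(p, alpha, beta), min(risks, key=risks.get))
```

For the loss function (31) this reproduces $\alpha = 0.75$, $\beta = 0.25$ and $\gamma = 0.5$, and for the three sample probabilities the threshold rule and the direct minimum-risk choice agree. Replacing `l12=4` by `l12=9` raises $\alpha$ to $8/9$, in line with the monotonicity properties listed above.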
For example, if we increase the cost of an incorrect classification $\lambda_{12}$ and keep other costs unchanged, the value $\alpha$ would not be decreased.

Parameters $\alpha$ and $\beta$ are determined from a loss function. One may argue that a loss function may be considered as a set of parameters. However, in contrast to the standard threshold values, they are not abstract notions, but have an intuitive interpretation. One can easily interpret and measure loss or cost in a real application. In fact, the results

### Information Granulation and Approximation in a Decision-theoretic Model of Rough Sets

Information Granulation and Approximation in a Decision-theoretic Model of Rough Sets Y.Y. Yao Department of Computer Science University of Regina Regina, Saskatchewan Canada S4S 0A2 E-mail: yyao@cs.uregina.ca

### Inclusion degree: a perspective on measures for rough set data analysis

Information Sciences 141 (2002) 227 236 www.elsevier.com/locate/ins Inclusion degree: a perspective on measures for rough set data analysis Z.B. Xu a, *, J.Y. Liang a,1, C.Y. Dang b, K.S. Chin b a Faculty

### A Three-Way Decision Approach to Email Spam Filtering

A Three-Way Decision Approach to Email Spam Filtering Bing Zhou, Yiyu Yao, and Jigang Luo Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 {zhou200b,yyao,luo226}@cs.uregina.ca

### An Outline of a Theory of Three-way Decisions

An Outline of a Theory of Three-way Decisions Yiyu Yao Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada S4S 0A2 E-mail: yyao@cs.uregina.ca Abstract. A theory of three-way

### Web-Based Support Systems with Rough Set Analysis

Web-Based Support Systems with Rough Set Analysis JingTao Yao and Joseph P. Herbert Department of Computer Science University of Regina, Regina Canada S4S 0A2 {jtyao,herbertj}@cs.uregina.ca Abstract. Rough

### Modelling Complex Patterns by Information Systems

Fundamenta Informaticae 67 (2005) 203 217 203 IOS Press Modelling Complex Patterns by Information Systems Jarosław Stepaniuk Department of Computer Science, Białystok University of Technology Wiejska 45a,

### ROUGH SETS AND DATA MINING. Zdzisław Pawlak

ROUGH SETS AND DATA MINING Zdzisław Pawlak Institute of Theoretical and Applied Informatics, Polish Academy of Sciences, ul. altycka 5, 44 100 Gliwice, Poland ASTRACT The paper gives basic ideas of rough

### Explanation-Oriented Association Mining Using a Combination of Unsupervised and Supervised Learning Algorithms

Explanation-Oriented Association Mining Using a Combination of Unsupervised and Supervised Learning Algorithms Y.Y. Yao, Y. Zhao, R.B. Maguire Department of Computer Science, University of Regina Regina,

### A Multiview Approach for Intelligent Data Analysis based on Data Operators

A Multiview Approach for Intelligent Data Analysis based on Data Operators Yaohua Chen and Yiyu Yao Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 E-mail: {chen115y,

### Rough Sets and Fuzzy Rough Sets: Models and Applications

Rough Sets and Fuzzy Rough Sets: Models and Applications Chris Cornelis Department of Applied Mathematics and Computer Science, Ghent University, Belgium XV Congreso Español sobre Tecnologías y Lógica

### A ROUGH SET-BASED KNOWLEDGE DISCOVERY PROCESS

Int. J. Appl. Math. Comput. Sci., 2001, Vol.11, No.3, 603 619 A ROUGH SET-BASED KNOWLEDGE DISCOVERY PROCESS Ning ZHONG, Andrzej SKOWRON The knowledge discovery from real-life databases is a multi-phase

### Information Retrieval Support Systems

1 Information Retrieval Support Systems Y.Y. Yao Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 E-mail: yyao@cs.uregina.ca Abstract - Information retrieval support

### Discernibility Thresholds and Approximate Dependency in Analysis of Decision Tables

Discernibility Thresholds and Approximate Dependency in Analysis of Decision Tables Yu-Ru Syau¹, En-Bing Lin²*, Lixing Jia³ ¹Department of Information Management, National Formosa University, Yunlin, 63201,

### Action Reducts. Computer Science Department, University of Pittsburgh at Johnstown, Johnstown, PA 15904, USA 2

Action Reducts Seunghyun Im 1, Zbigniew Ras 2,3, Li-Shiang Tsay 4 1 Computer Science Department, University of Pittsburgh at Johnstown, Johnstown, PA 15904, USA sim@pitt.edu 2 Computer Science Department,

### A Step Towards the Foundations of Data Mining

A Step Towards the Foundations of Data Mining Y.Y. Yao Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 ABSTRACT This paper addresses some fundamental issues related

### A Rough Set View on Bayes Theorem

A Rough Set View on Bayes Theorem Zdzisław Pawlak* University of Information Technology and Management, ul. Newelska 6, 01 447 Warsaw, Poland Rough set theory offers new perspective on Bayes theorem. The

### Three-Way Decisions Solution to Filter Spam Email: An Empirical Study

Three-Way Decisions Solution to Filter Spam Email: An Empirical Study Xiuyi Jia 1,4, Kan Zheng 2,WeiweiLi 3, Tingting Liu 2, and Lin Shang 4 1 School of Computer Science and Technology, Nanjing University

### A FUZZY LOGIC APPROACH FOR SALES FORECASTING

A FUZZY LOGIC APPROACH FOR SALES FORECASTING ABSTRACT Sales forecasting proved to be very important in marketing where managers need to learn from historical data. Many methods have become available for

### Optimization under fuzzy if-then rules

Optimization under fuzzy if-then rules Christer Carlsson christer.carlsson@abo.fi Robert Fullér rfuller@abo.fi Abstract The aim of this paper is to introduce a novel statement of fuzzy mathematical programming

### A ROUGH CLUSTER ANALYSIS OF SHOPPING ORIENTATION DATA. Kevin Voges University of Canterbury. Nigel Pope Griffith University

Abstract A ROUGH CLUSTER ANALYSIS OF SHOPPING ORIENTATION DATA Kevin Voges University of Canterbury Nigel Pope Griffith University Mark Brown University of Queensland Track: Marketing Research and Research

### Towards a Practical Approach to Discover Internal Dependencies in Rule-Based Knowledge Bases

Towards a Practical Approach to Discover Internal Dependencies in Rule-Based Knowledge Bases Roman Simiński, Agnieszka Nowak-Brzezińska, Tomasz Jach, and Tomasz Xiȩski University of Silesia, Institute

### Linguistic Preference Modeling: Foundation Models and New Trends. Extended Abstract

Linguistic Preference Modeling: Foundation Models and New Trends F. Herrera, E. Herrera-Viedma Dept. of Computer Science and Artificial Intelligence University of Granada, 18071 - Granada, Spain e-mail:

### HANDLING MISSING ATTRIBUTE VALUES

Chapter 1 HANDLING MISSING ATTRIBUTE VALUES Jerzy W. Grzymala-Busse University of Kansas Witold J. Grzymala-Busse FilterLogix Abstract In this chapter methods of handling missing attribute values in data

### Removing Partial Inconsistency in Valuation- Based Systems*

Removing Partial Inconsistency in Valuation- Based Systems* Luis M. de Campos and Serafín Moral Departamento de Ciencias de la Computación e I.A., Universidad de Granada, 18071 Granada, Spain This paper

### Chapter 1. Sigma-Algebras. 1.1 Definition

Chapter 1 Sigma-Algebras 1.1 Definition Consider a set X. A σ algebra F of subsets of X is a collection F of subsets of X satisfying the following conditions: (a) F (b) if B F then its complement B c is

### FS I: Fuzzy Sets and Fuzzy Logic. L.A.Zadeh, Fuzzy Sets, Information and Control, 8(1965)

FS I: Fuzzy Sets and Fuzzy Logic Fuzzy sets were introduced by Zadeh in 1965 to represent/manipulate data and information possessing nonstatistical uncertainties. L.A.Zadeh, Fuzzy Sets, Information and

### Using Rough Sets to predict insolvency of Spanish non-life insurance companies

Using Rough Sets to predict insolvency of Spanish non-life insurance companies M.J. Segovia-Vargas a, J.A. Gil-Fana a, A. Heras-Martínez a, J.L. Vilar-Zanón a, A. Sanchis-Arellano b a Departamento de Economía

### Fuzzy Numbers in the Credit Rating of Enterprise Financial Condition

C Review of Quantitative Finance and Accounting, 17: 351 360, 2001 2001 Kluwer Academic Publishers. Manufactured in The Netherlands. Fuzzy Numbers in the Credit Rating of Enterprise Financial Condition

### DATA MINING PROCESS FOR UNBIASED PERSONNEL ASSESSMENT IN POWER ENGINEERING COMPANY

Medunarodna naucna konferencija MENADŽMENT 2012 International Scientific Conference MANAGEMENT 2012 Mladenovac, Srbija, 20-21. april 2012 Mladenovac, Serbia, 20-21 April, 2012 DATA MINING PROCESS FOR UNBIASED

### CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2016. Tom M. Mitchell. All rights reserved. *DRAFT OF January 24, 2016* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is a

### Any Improved Trapezoidal Approximation to Intersection (Fusion) of Trapezoidal Fuzzy Numbers Leads to Non-Associativity: A Theorem

Any Improved Trapezoidal Approximation to Intersection (Fusion) of Trapezoidal Fuzzy Numbers Leads to Non-Associativity: A Theorem Gang Xiang and Vladik Kreinovich, Senior Member, IEEE Abstract In some

### 1 Introduction 2. 2 Granular Computing 4 2.1 Basic Issues... 4 2.2 Simple Granulations... 5 2.3 Hierarchical Granulations... 7

In: Wu, W., Xiong, H. and Shekhar, S. (Eds.), Information Retrieval and Clustering Kluwer Academic Publishers, pp. 299-329, 2003. Granular Computing for the Design of Information Retrieval Support Systems

### 15.062 Data Mining: Algorithms and Applications Matrix Math Review

.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

### Types of Degrees in Bipolar Fuzzy Graphs

pplied Mathematical Sciences, Vol. 7, 2013, no. 98, 4857-4866 HIKRI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013.37389 Types of Degrees in Bipolar Fuzzy Graphs Basheer hamed Mohideen Department

### Incorporating Evidence in Bayesian networks with the Select Operator

Incorporating Evidence in Bayesian networks with the Select Operator C.J. Butz and F. Fang Department of Computer Science, University of Regina Regina, Saskatchewan, Canada SAS 0A2 {butz, fang11fa}@cs.uregina.ca

### Enhanced data mining analysis in higher educational system using rough set theory

African Journal of Mathematics and Computer Science Research Vol. 2(9), pp. 184-188, October, 2009 Available online at http://www.academicjournals.org/ajmcsr ISSN 2006-9731 2009 Academic Journals Review

### Knowledge Based Descriptive Neural Networks

Knowledge Based Descriptive Neural Networks J. T. Yao Department of Computer Science, University or Regina Regina, Saskachewan, CANADA S4S 0A2 Email: jtyao@cs.uregina.ca Abstract This paper presents a

### Sets and functions. {x R : x > 0}.

Sets and functions 1 Sets The language of sets and functions pervades mathematics, and most of the important operations in mathematics turn out to be functions or to be expressible in terms of functions.

### Fuzzy Probability Distributions in Bayesian Analysis

Fuzzy Probability Distributions in Bayesian Analysis Reinhard Viertl and Owat Sunanta Department of Statistics and Probability Theory Vienna University of Technology, Vienna, Austria Corresponding author:

### FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM

International Journal of Innovative Computing, Information and Control ICIC International c 0 ISSN 34-48 Volume 8, Number 8, August 0 pp. 4 FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT

### Mathematics for Computer Science/Software Engineering. Notes for the course MSM1F3 Dr. R. A. Wilson

Mathematics for Computer Science/Software Engineering Notes for the course MSM1F3 Dr. R. A. Wilson October 1996 Chapter 1 Logic Lecture no. 1. We introduce the concept of a proposition, which is a statement

### The Optimality of Naive Bayes

The Optimality of Naive Bayes Harry Zhang Faculty of Computer Science University of New Brunswick Fredericton, New Brunswick, Canada email: hzhang@unbca E3B 5A3 Abstract Naive Bayes is one of the most

### Structure of Measurable Sets

Structure of Measurable Sets In these notes we discuss the structure of Lebesgue measurable subsets of R from several different points of view. Along the way, we will see several alternative characterizations

### Integrating Pattern Mining in Relational Databases

Integrating Pattern Mining in Relational Databases Toon Calders, Bart Goethals, and Adriana Prado University of Antwerp, Belgium {toon.calders, bart.goethals, adriana.prado}@ua.ac.be Abstract. Almost a

### Big Data with Rough Set Using Map- Reduce

Big Data with Rough Set Using Map- Reduce Mr.G.Lenin 1, Mr. A. Raj Ganesh 2, Mr. S. Vanarasan 3 Assistant Professor, Department of CSE, Podhigai College of Engineering & Technology, Tirupattur, Tamilnadu,

### The Basics of Graphical Models

The Basics of Graphical Models David M. Blei Columbia University October 3, 2015 Introduction These notes follow Chapter 2 of An Introduction to Probabilistic Graphical Models by Michael Jordan. Many figures

### Dominance-based Rough Set Approach Data Analysis Framework. User's guide

Dominance-based Rough Set Approach Data Analysis Framework User's guide jmaf - Dominance-based Rough Set Data Analysis Framework http://www.cs.put.poznan.pl/jblaszczynski/site/jrs.html Jerzy Bªaszczy«ski,

### Categorical Data Visualization and Clustering Using Subjective Factors

Categorical Data Visualization and Clustering Using Subjective Factors Chia-Hui Chang and Zhi-Kai Ding Department of Computer Science and Information Engineering, National Central University, Chung-Li,

### On Development of Fuzzy Relational Database Applications

On Development of Fuzzy Relational Database Applications Srdjan Skrbic Faculty of Science Trg Dositeja Obradovica 3 21000 Novi Sad Serbia shkrba@uns.ns.ac.yu Aleksandar Takači Faculty of Technology Bulevar

### OPRE 6201 : 2. Simplex Method

OPRE 6201 : 2. Simplex Method 1 The Graphical Method: An Example Consider the following linear program: Max 4x 1 +3x 2 Subject to: 2x 1 +3x 2 6 (1) 3x 1 +2x 2 3 (2) 2x 2 5 (3) 2x 1 +x 2 4 (4) x 1, x 2

### SOLUTIONS TO ASSIGNMENT 1 MATH 576

SOLUTIONS TO ASSIGNMENT 1 MATH 576 SOLUTIONS BY OLIVIER MARTIN 13 #5. Let T be the topology generated by A on X. We want to show T = J B J where B is the set of all topologies J on X with A J. This amounts

### Chapter 3. Cartesian Products and Relations. 3.1 Cartesian Products

Chapter 3 Cartesian Products and Relations The material in this chapter is the first real encounter with abstraction. Relations are very general thing they are a special type of subset. After introducing

### The Art of Granular Computing

The Art of Granular Computing Yiyu Yao Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 E-mail: yyao@cs.uregina.ca Abstract. The current research in granular computing

### Classical Analysis I

Classical Analysis I 1 Sets, relations, functions A set is considered to be a collection of objects. The objects of a set A are called elements of A. If x is an element of a set A, we write x A, and if

### DATA MINING IN FINANCE

DATA MINING IN FINANCE Advances in Relational and Hybrid Methods by BORIS KOVALERCHUK Central Washington University, USA and EVGENII VITYAEV Institute of Mathematics Russian Academy of Sciences, Russia

### This chapter describes set theory, a mathematical theory that underlies all of modern mathematics.

Appendix A Set Theory This chapter describes set theory, a mathematical theory that underlies all of modern mathematics. A.1 Basic Definitions Definition A.1.1. A set is an unordered collection of elements.

### Concept Formation and Learning: A Cognitive Informatics Perspective

Concept Formation and Learning: A Cognitive Informatics Perspective Y.Y. Yao Department of Computer Science University of Regina Regina, Saskatchewan, Canada S4S 0A2 E-mail: yyao@cs.uregina.ca URL: http://www.cs.uregina/

### Applications of Methods of Proof

CHAPTER 4 Applications of Methods of Proof 1. Set Operations 1.1. Set Operations. The set-theoretic operations, intersection, union, and complementation, defined in Chapter 1.1 Introduction to Sets are

### POWER SETS AND RELATIONS

POWER SETS AND RELATIONS L. MARIZZA A. BAILEY 1. The Power Set Now that we have defined sets as best we can, we can consider a sets of sets. If we were to assume nothing, except the existence of the empty

### INTRODUCTORY SET THEORY

M.Sc. program in mathematics INTRODUCTORY SET THEORY Katalin Károlyi Department of Applied Analysis, Eötvös Loránd University H-1088 Budapest, Múzeum krt. 6-8. CONTENTS 1. SETS Set, equal sets, subset,

### Classification of Fuzzy Data in Database Management System

Classification of Fuzzy Data in Database Management System Deval Popat, Hema Sharda, and David Taniar 2 School of Electrical and Computer Engineering, RMIT University, Melbourne, Australia Phone: +6 3

### Lecture 8 The Subjective Theory of Betting on Theories

Lecture 8 The Subjective Theory of Betting on Theories Patrick Maher Philosophy 517 Spring 2007 Introduction The subjective theory of probability holds that the laws of probability are laws that rational

### Nonparametric adaptive age replacement with a one-cycle criterion

Nonparametric adaptive age replacement with a one-cycle criterion P. Coolen-Schrijner, F.P.A. Coolen Department of Mathematical Sciences University of Durham, Durham, DH1 3LE, UK e-mail: Pauline.Schrijner@durham.ac.uk

### REAL ANALYSIS LECTURE NOTES: 1.4 OUTER MEASURE

REAL ANALYSIS LECTURE NOTES: 1.4 OUTER MEASURE CHRISTOPHER HEIL 1.4.1 Introduction We will expand on Section 1.4 of Folland s text, which covers abstract outer measures also called exterior measures).

### Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note 11

CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note Conditional Probability A pharmaceutical company is marketing a new test for a certain medical condition. According

### Introduction to Statistical Machine Learning

CHAPTER Introduction to Statistical Machine Learning We start with a gentle introduction to statistical machine learning. Readers familiar with machine learning may wish to skip directly to Section 2,

### Set theory, and set operations

1 Set theory, and set operations Sayan Mukherjee Motivation It goes without saying that a Bayesian statistician should know probability theory in depth to model. Measure theory is not needed unless we

### In mathematics you don t understand things. You just get used to them. (Attributed to John von Neumann)

Chapter 1 Sets and Functions We understand a set to be any collection M of certain distinct objects of our thought or intuition (called the elements of M) into a whole. (Georg Cantor, 1895) In mathematics

### The Application of Association Rule Mining in CRM Based on Formal Concept Analysis

The Application of Association Rule Mining in CRM Based on Formal Concept Analysis HongSheng Xu * and Lan Wang College of Information Technology, Luoyang Normal University, Luoyang, 471022, China xhs_ls@sina.com

### International Journal of Electronics and Computer Science Engineering 1449

International Journal of Electronics and Computer Science Engineering 1449 Available Online at www.ijecse.org ISSN- 2277-1956 Neural Networks in Data Mining Priyanka Gaur Department of Information and

### Degrees of Truth: the formal logic of classical and quantum probabilities as well as fuzzy sets.

Degrees of Truth: the formal logic of classical and quantum probabilities as well as fuzzy sets. Logic is the study of reasoning. A language of propositions is fundamental to this study as well as true

### A Multi-granular Linguistic Model to Evaluate the Suitability of Installing an ERP System

Mathware & Soft Computing 12 (2005) 217-233 A Multi-granular Linguistic Model to Evaluate the Suitability of Installing an ERP System P.J. Sánchez, L.G. Pérez, F. Mata and A.G. López-Herrera, Dept. of

### Models for Inexact Reasoning. Fuzzy Logic Lesson 1 Crisp and Fuzzy Sets. Master in Computational Logic Department of Artificial Intelligence

Models for Inexact Reasoning Fuzzy Logic Lesson 1 Crisp and Fuzzy Sets Master in Computational Logic Department of Artificial Intelligence Origins and Evolution of Fuzzy Logic Origin: Fuzzy Sets Theory

### 6.231 Dynamic Programming and Stochastic Control Fall 2008

MIT OpenCourseWare http://ocw.mit.edu 6.231 Dynamic Programming and Stochastic Control Fall 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms. 6.231

### Determination of the normalization level of database schemas through equivalence classes of attributes

Computer Science Journal of Moldova, vol.17, no.2(50), 2009 Determination of the normalization level of database schemas through equivalence classes of attributes Cotelea Vitalie Abstract In this paper,

### Unraveling versus Unraveling: A Memo on Competitive Equilibriums and Trade in Insurance Markets

Unraveling versus Unraveling: A Memo on Competitive Equilibriums and Trade in Insurance Markets Nathaniel Hendren January, 2014 Abstract Both Akerlof (1970) and Rothschild and Stiglitz (1976) show that

### Machine Learning. Chapter 18, 21. Some material adopted from notes by Chuck Dyer

Machine Learning Chapter 18, 21 Some material adopted from notes by Chuck Dyer What is learning? Learning denotes changes in a system that... enable a system to do the same task more efficiently the next

### Factoring & Primality

Factoring & Primality Lecturer: Dimitris Papadopoulos In this lecture we will discuss the problem of integer factorization and primality testing, two problems that have been the focus of a great amount

### Fuzzy decision trees

Fuzzy decision trees Catalin Pol Abstract Decision trees are arguably one of the most popular choices for learning and reasoning systems, especially when it comes to learning from discrete valued (feature

### Credit Risk Comprehensive Evaluation Method for Online Trading

Credit Risk Comprehensive Evaluation Method for Online Trading Company 1 *1, Corresponding Author School of Economics and Management, Beijing Forestry University, fankun@bjfu.edu.cn Abstract A new comprehensive

### Domain Classification of Technical Terms Using the Web

Systems and Computers in Japan, Vol. 38, No. 14, 2007 Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J89-D, No. 11, November 2006, pp. 2470 2482 Domain Classification of Technical Terms Using

### NOTES ON MEASURE THEORY. M. Papadimitrakis Department of Mathematics University of Crete. Autumn of 2004

NOTES ON MEASURE THEORY M. Papadimitrakis Department of Mathematics University of Crete Autumn of 2004 2 Contents 1 σ-algebras 7 1.1 σ-algebras............................... 7 1.2 Generated σ-algebras.........................

### CHAPTER 5. Product Measures

54 CHAPTER 5 Product Measures Given two measure spaces, we may construct a natural measure on their Cartesian product; the prototype is the construction of Lebesgue measure on R 2 as the product of Lebesgue

### Randomized Sorting as a Big Data Search Algorithm

Int'l Conf. on Advances in Big Data Analytics ABDA' 7 Randomized Sorting as a Big Data Search Algorithm I. Avramovic and D. S. Richards Department of Computer Science, George Mason University, Fairfax,

### Feature Selection with Rough Sets for Web Page Classification

Feature Selection with Rough Sets for Web Page Classification Aijun An 1, Yanhui Huang 2, Xiangji Huang 1, and Nick Cercone 3 1 York University, Toronto, Ontario, M3J 1P3, Canada 2 University of Waterloo,

### PartJoin: An Efficient Storage and Query Execution for Data Warehouses

PartJoin: An Efficient Storage and Query Execution for Data Warehouses Ladjel Bellatreche 1, Michel Schneider 2, Mukesh Mohania 3, and Bharat Bhargava 4 1 IMERIR, Perpignan, FRANCE ladjel@imerir.com 2

### A set is a Many that allows itself to be thought of as a One. (Georg Cantor)

Chapter 4 Set Theory A set is a Many that allows itself to be thought of as a One. (Georg Cantor) In the previous chapters, we have often encountered sets, for example, prime numbers form a set, domains

### S(A) ∈ X_α for all α ∈ Λ. Consequently, S(A) ∈ X, by the definition of intersection. Therefore, X is inductive.

MA 274: Exam 2 Study Guide (1) Know the precise definitions of the terms requested for your journal. (2) Review proofs by induction. (3) Be able to prove that something is or isn't an equivalence relation.

### Two classes of ternary codes and their weight distributions

Two classes of ternary codes and their weight distributions Cunsheng Ding, Torleiv Kløve, and Francesco Sica Abstract In this paper we describe two classes of ternary codes, determine their minimum weight

### WHAT ARE MATHEMATICAL PROOFS AND WHY THEY ARE IMPORTANT?

WHAT ARE MATHEMATICAL PROOFS AND WHY THEY ARE IMPORTANT? introduction Many students seem to have trouble with the notion of a mathematical proof. People that come to a course like Math 216, who certainly

### Section 5 shows comparison between CRSA and DRSA. Finally, Section 6 concludes the paper. <ε then STOP, otherwise return to step 2.

The Online Journal on Computer Science Information Technology (OJCSIT) Vol. () No. () Dominance-based rough set approach in business intelligence S.M Aboelnaga, H.M Abdalkader R.Hussein Information System

### An Introduction to Web-based Support Systems

An Introduction to Web-based Support Systems JingTao Yao Department of Computer Science, University of Regina Regina, Saskatchewan, Canada S4S 0A2 ABSTRACT We view Web-based Support Systems (WSS) as a

### NEURAL NETWORKS IN DATA MINING

NEURAL NETWORKS IN DATA MINING 1 DR. YASHPAL SINGH, 2 ALOK SINGH CHAUHAN 1 Reader, Bundelkhand Institute of Engineering & Technology, Jhansi, India 2 Lecturer, United Institute of Management, Allahabad,

### The Compound Operations of Uncertain Cloud Concepts

Journal of Computational Information Systems 11: 13 (2015) 4881-4888 Available at http://www.jofcis.com The Compound Operations of Uncertain Cloud Concepts Longjun YIN 1, Junjie XU 1, Guansheng ZHANG

### A New Method for Multi-Objective Linear Programming Models with Fuzzy Random Variables

Journal of Uncertain Systems Vol.6, No.1, pp.38-50, 2012 Online at: www.jus.org.uk A New Method for Multi-Objective Linear Programming Models with Fuzzy Random Variables Javad Nematian * Department

### Random vs. Structure-Based Testing of Answer-Set Programs: An Experimental Comparison

Random vs. Structure-Based Testing of Answer-Set Programs: An Experimental Comparison Tomi Janhunen 1, Ilkka Niemelä 1, Johannes Oetsch 2, Jörg Pührer 2, and Hans Tompits 2 1 Aalto University, Department

### Problem Set I: Preferences, W.A.R.P., consumer choice

Problem Set I: Preferences, W.A.R.P., consumer choice Paolo Crosetto paolo.crosetto@unimi.it Exercises solved in class on 18th January 2009 Recap:,, Definition 1. The strict preference relation is x y

### CHARACTERISTICS IN FLIGHT DATA ESTIMATION WITH LOGISTIC REGRESSION AND SUPPORT VECTOR MACHINES

CHARACTERISTICS IN FLIGHT DATA ESTIMATION WITH LOGISTIC REGRESSION AND SUPPORT VECTOR MACHINES Claus Gwiggner, Ecole Polytechnique, LIX, Palaiseau, France Gert Lanckriet, University of Berkeley, EECS,