Relational Methodology for Data Mining and Knowledge Discovery


Vityaev E.E.*1, Kovalerchuk B.Y.2

1 Sobolev Institute of Mathematics SB RAS, Novosibirsk State University, Novosibirsk, Russia.
2 Computer Science Department, Central Washington University, Ellensburg, WA, USA.

* Corresponding author. vityaev@math.nsc.ru

Abstract. Knowledge discovery and data mining (KDD&DM) methods have been successful in many domains. However, their ability to build or discover a domain theory remains unclear. This is largely because many fundamental KDD&DM methodological questions are still unexplored, such as (1) the nature of the information contained in input data relative to the domain theory, and (2) the nature of the knowledge that these methods discover. The goal of this paper is to clarify these methodological questions. This is done by using the concepts of Relational Data Mining (RDM), representative measurement theory, an ontology of a subject domain, a many-sorted empirical system (an algebraic structure in first-order logic), and an ontology of a KDD&DM method. The paper concludes with a review of our RDM approach and the Discovery system built on this methodology, which can analyze any hypotheses represented in first-order logic and accept any input by representing it as a many-sorted empirical system.

1. Introduction

A variety of KDD&DM methods have been developed; however, their ability to build or discover a domain theory remains unclear. The goal of this paper is to clarify this and associated questions. We start with questions about the nature of the information contained in input data relative to the domain theory.

Question 1. What is the nature of the information contained in input data relative to the domain theory, and what is the empirical content of input data?

First, we note that quantities in data are not numbers themselves, but numbers with an interpretation. For example, the abstract numbers 20, 340, 30, 50 have the different interpretations shown in Table 1.

Table 1. Interpretations of the same numeric values.

| Interpretation | Values | Meaningful operations |
|---|---|---|
| Abstract numbers | 20, 340, 30, 50 | The meaning of 20 > 30 is not clear; these numbers can be just labels. |
| Abstract angles | 20°, 340°, 30°, 50° | 340° is meaningfully greater than 20°. |
| Azimuth angles | 20°, 340°, 30°, 50° | Azimuth operations. |
| Rotational angles | 20°, 340°, 30°, 50° | Rotational angle operations. |

For every quantity there are relations and operations that are meaningful for this quantity. This interpretation of quantities is the core of the Representational Measurement Theory (RMT) [26], [37]. The RMT interprets quantities as empirical systems: algebraic structures defined on the objects of the subject domain with a set of empirically interpretable relations and operations that define the meaning of the quantity (see the next section for definitions). The empirical content of quantities, laws and data is expressed by their (many-sorted) empirical systems.
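To make this distinction concrete, here is a minimal Python sketch of our own (not from the paper); all function names are illustrative assumptions. It shows that the very same numbers support different meaningful operations under different interpretations:

```python
# Minimal sketch (ours, not from the paper): the same numeric values
# support different meaningful operations under different interpretations.
values = [20, 340, 30, 50]

def nominal_equal(a, b):
    # Abstract numbers as labels: only equality/inequality is meaningful.
    return a == b

def angle_greater(a, b):
    # Rotational angles: comparison by magnitude is meaningful.
    return a > b

def azimuth_distance(a, b):
    # Azimuths (directions mod 360): magnitude comparison is not
    # meaningful, but the angular distance between directions is.
    d = abs(a - b) % 360
    return min(d, 360 - d)

print(nominal_equal(20, 340))     # False: two distinct labels
print(angle_greater(340, 20))     # True: 340 is the larger rotation
print(azimuth_distance(20, 340))  # 40: the two directions are close
```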

More specifically, the main statements of the measurement theory relevant to data mining are as follows [26], [37]:
- numerical representations (scales) of quantities, laws and data are determined by the corresponding empirical systems;
- scales are unique up to a certain set of permissible transformations, such as changing measurement units from meters to kilometers for ratio scales;
- laws and KDD&DM methods need to be invariant relative to the sets of permissible transformations of the quantities in data.

To represent the empirical content of data in accordance with the measurement theory, we need to transform the data into many-sorted empirical systems. These transformations are described in [23] for such data types as pair comparisons, binary matrices, matrices of orderings, matrices of proximity and attribute-based matrices. The problem of extracting the empirical content of data and transforming it into many-sorted empirical systems in the general case is considered in section 3.

This transformation depends on the subject domain. For example, for many physical quantities there exists an interpretable operation ∘ that possesses all formal properties of the additive operation +. However, medicine and other areas may have no empirical interpretation for the ∘ operation. For example, the empirical system of temperature, measured by a thermometer, may include an operation ∘ that produces temperature t3 from temperatures t1 and t2, such that t3 = t1 ∘ t2. But this operation may be non-interpretable in medicine. In that case, the ∘ operation should be removed from the empirical system of temperature in medicine and the corresponding scale should be reconsidered. The order relation for temperature is obviously interpretable in medicine, and it may be included in the empirical system of temperature in medicine. Physical temperature measured by a thermometer is, in the case of medicine, for example an indirect measure of the metabolic rate of the patient. Thus, the empirical content of data and scales depends on the interpretation.

Question 2. What determines the interpretation of data, and what information may be extracted from data?

The interpretation depends on an ontology, which determines the view of the real world. The subject domain and the ontology are closely connected. A patient in medicine may be considered from many points of view: psychology, physiology, anatomy, sociology and so on. Thus, we cannot properly extract information from data about the patient without setting up an ontology. Consider another example from the area of finance. What is the empirical content of a financial time series? It also can be viewed from many points of view, including the points of view of (1) a trader-expert, (2) one of the mathematical disciplines, (3) technical analysis, (4) trade indexes, etc. We need to specify the ontology to extract the empirical content and interpret all relations and operations to define the meaning of all quantities contained in the data. The information extracted from data is represented by a many-sorted empirical system with that set of relations and operations.

Question 3. What is the nature of the knowledge that a particular KDD&DM method extracts from input data?

As was pointed out above, any KDD&DM method contains a view of the real world (the ontology of the method).
More precisely, from our viewpoint, any KDD&DM method explicitly or implicitly assumes:
(1) some quantities for the input data;
(2) an ontology of the particular KDD&DM method (including a language) to manipulate and interpret data and results;
(3) a class of hypotheses in terms of that language to be tested on data (the knowledge space of the KDD&DM method; see the definition in section 5);
(4) an interpretation of the ontology of the particular KDD&DM method in the ontology of the subject domain.

For example, to apply a classification method that uses spherical shapes for classes of data, we need first to interpret spherical shapes in the ontology of the subject domain. Otherwise, we cannot interpret the results of classification. The knowledge extracted by a KDD&DM method is the set of confirmed hypotheses that are interpretable in the ontology of the KDD&DM method and in the ontology of the subject domain.

The results of KDD&DM methods should not depend on the subjective choice of the measurement units of the quantities contained in data. Thus, KDD&DM methods need to be invariant relative to the sets of permissible transformations of quantities. In section 4, we define the invariance of KDD&DM methods relative to permissible transformations of scales. But, as we pointed out above, the scales of quantities depend on the interpretation of the relations and operations on data. Hence, the invariance of a method cannot be established before we revise the scales of all quantities based on the interpretation of their relations and operations. For example, we need to determine a new scale for temperature relevant to medical data before we can establish the invariance of a KDD&DM method for such data. We discuss this issue in section 3.

The invariance of a KDD&DM method is closely related to the interpretability of its results in the ontology of the subject domain. If a KDD&DM method uses operations or relations that are not interpretable in the language of the method (and hence in the subject domain ontology), then it may obtain non-interpretable results. For example, the average or the sum of patients' temperatures in a hospital has no medical interpretation. If all relations and operations used in the algorithm are included in the empirical systems of the quantities, then the algorithm is obviously invariant relative to the permissible transformations of the scales of these quantities. Thus, to avoid non-invariance of the method and non-interpretability of its results, we need to use in the algorithms only relations and operations that are interpretable in the many-sorted empirical system. It means that the hypotheses tested by the method include only relations and operations interpretable in the ontology of the subject domain. Then the knowledge space of a particular KDD&DM method is a set of hypotheses formulated in terms of the relations and operations interpretable in the ontology. According to the measurement theory, any numerical data type can be transformed into a relational form that preserves all relevant information of that numerical data type.

Now we can formulate our answer to the question about the nature of the knowledge that KDD&DM methods extract: any KDD&DM method extracts a confirmed hypothesis from its knowledge space. In this context, the main characteristics of each KDD&DM method are the method's ontology and knowledge space.

Discoveries of many KDD&DM methods are not invariant to the permissible transformations of the scales of input data. As a result, these discoveries are not fully interpretable in the subject domain ontology. The interpretation usually is not explicitly defined and may be subjective. An expert in the subject domain can obtain a correct conclusion using a non-invariant method by informal, intuitive extraction of interpretable results from the confirmed hypotheses, but this is rather an art of Data Mining.
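The temperature example above can be made concrete. The following hedged Python sketch (our own construction with illustrative names, not the paper's) shows that an order-based conclusion survives the permissible Celsius-to-Fahrenheit rescaling of the interval scale, while a conclusion based on an averaged numeric threshold does not:

```python
# Hedged sketch (ours): order is interpretable for medical temperatures,
# sums/averages are not. Celsius -> Fahrenheit is a permissible
# (positive affine) transformation of the interval scale.
temps_c = [36.6, 38.2, 37.1, 39.0]  # patients' temperatures, Celsius

def to_f(t):
    return 9.0 / 5.0 * t + 32.0     # positive affine map: permissible

# Order-based conclusion: "patient 1 is warmer than patient 0".
assert (temps_c[1] > temps_c[0]) == (to_f(temps_c[1]) > to_f(temps_c[0]))
# The conclusion is invariant under the rescaling.

# Threshold-on-average conclusion: "mean temperature exceeds 38.0".
mean_c = sum(temps_c) / len(temps_c)
mean_f = sum(map(to_f, temps_c)) / len(temps_c)
print(mean_c > 38.0, mean_f > 38.0)  # False True: the verdict flips
```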
The Relational Data Mining (RDM) approach [24-25, 51-52] and the Discovery system described below allow us to overcome these limitations. This approach is based on first-order logic for knowledge extraction from many-sorted empirical systems for various classes of hypotheses. The relational approach includes:
a) extending the data type notion;
b) using first-order logic and the measurement theory for presenting various data types as many-sorted empirical systems;
c) using any background knowledge expressed in first-order logic for learning and forecasting;
d) extracting various hypothesis classes formulated in first-order logic.

This approach allows us to answer the following question.

Question 4. Can a subject domain theory be discovered by using KDD&DM methods?

As was pointed out above, any KDD&DM method discovers some class of hypotheses. In the relational data mining approach, classes of hypotheses are extended to classes of hypotheses presented in first-order logic. Several KDD&DM methods discover wide classes of hypotheses in First-Order Logic (FOL). We compare some FOL methods with our relational approach and the Discovery system in section 6.

In sections 7-14, we present a series of definitions and prove that a subject domain theory can be discovered by using the relational data mining approach. We define the notions of a law and a probabilistic law (section 10), a subject domain theory T as the set of all laws, and the theory TP as the set of all probabilistic laws. Next, a theorem is proved that the set SPL of Strongest Probabilistic Laws (those with the maximum values of conditional probability) contains the subject domain theory T, and T ⊆ SPL ⊆ TP. We define the notion of a most specific rule (section 13) and prove that an inductive-statistical inference based on these rules avoids the problem of statistical ambiguity. Next, in sections 12 and 13, a special semantic probabilistic inference is defined that infers the theories T, TP and the set SPL and generalizes the logical inference of logic programming. Finally, we describe the Discovery system, which implements semantic probabilistic inference and discovers the theories T, TP and the sets SPL, MSR. As a result, it is proved that the subject domain theory T, its probabilistic approximation TP and its consistent probabilistic approximation MSR can be discovered within the framework of the relational approach using the Discovery system. The Discovery system has been successfully applied to the solution of many practical tasks (see the website [42]).

The original challenge for the RDM Discovery system was the simulation of discovering scientific laws from empirical data in chemistry and physics. There is a well-known difference between the black box models and basic models (laws) in modern physics. The lifetime of the latter models is much longer, their scope is wider, and their background is sound. There is reason to believe that the RDM Discovery system captures certain important features of the discovery of laws.

2. Representative Measurement Theory

In accordance with the measurement theory, numerical representations of quantities, laws and data are determined by the corresponding empirical systems. In this section we present the required definitions from the measurement theory [26], [37].

An empirical system is a relational structure that consists of a set of objects A, k(i)-ary relations P1,…,Pn and k(j)-ary operations ρ1,…,ρm defined on A:

A = ⟨A, P1,…,Pn, ρ1,…,ρm⟩.

Every relation Pi is a Boolean function (a predicate) with k(i) arguments from A, and ρj is a k(j)-argument operation on A.

A system

R = ⟨R, T1,…,Tn, σ1,…,σm⟩

is called a numerical system of the same type as system A if R ⊆ Re^m, m ≥ 1, where Re^m is the set of m-tuples of real numbers, every relation Ti has the same arity k(i) as the corresponding relation Pi, and every real-valued function σj has the same arity k(j) as the corresponding operation ρj. A numerical system R is called a numerical representation of the empirical system A if a (strong) homomorphism ϕ: A → R exists such that:

Pi(a1,…,ak(i)) ⇒ Ti(ϕ(a1),…,ϕ(ak(i))), i = 1,…,n;
ϕ(ρj(a1,…,ak(j))) = σj(ϕ(a1),…,ϕ(ak(j))), j = 1,…,m.

The strong homomorphism means that if the predicate Ti(ϕ(a1),…,ϕ(ak(i))) is true on ϕ(a1),…,ϕ(ak(i)), then there exists a tuple b1,…,bk(i) in A such that Pi(b1,…,bk(i)) is true and ϕ(b1) = ϕ(a1),…,ϕ(bk(i)) = ϕ(ak(i)). We will denote such a homomorphism between the empirical system A and the numerical system R as ϕ: A → R. Thus, the numerical system R represents a relational structure in a computationally tractable form with complete retention of all the properties of the relational structure.

In the measurement theory, the following process is in place: (1) finding a numerical representation R for the empirical system A; (2) proving a theorem that a homomorphism ϕ: A → R exists; and (3) defining the set of all possible transformations f: R → R (the uniqueness theorems) of the homomorphism ϕ such that fϕ is also a homomorphism fϕ: A → R.

Example: A relational structure A = ⟨A, P⟩ is called a semi-ordering if for all a, b, c ∈ A the following axiom is satisfied: P(a,b) & P(b,c) ⇒ ∀d ∈ A (P(a,d) ∨ P(d,c)).

Theorem [10]: If A = ⟨A, P⟩ is a semi-ordering, then there exists a function U: A → Re such that:

P(a,b) ⇔ U(a) + 1 < U(b).

There are hundreds of numerical representations known in the measurement theory, with a few most commonly used. The strongest one is called the absolute data type (absolute scale). The weakest numerical data type is the nominal data type (nominal scale). There is a spectrum of data types between them. They allow us to compare, order, add, multiply, divide values and so on. The classification of these data types is presented in Table 2. The basis of this classification is the transformation group. The strongest, absolute data type does not permit transforming data at all, and the weakest, nominal data type permits any one-to-one transformation. Intermediate data types permit different transformations such as positive affine, linear and others (see Table 2).

Table 2. Numerical data types.

| Transformation | Transformation group | Data type (scale) |
|---|---|---|
| x → f(x), f: Re → (onto) Re, 1-1 | 1-1 transformation group | Nominal |
| x → f(x), f: Re → (onto) Re, strictly increasing | Homeomorphism group | Order |
| x → rx + s, r > 0 | Positive affine group | Interval |
| x → tx^r, t, r > 0 | Power group | Log-interval |
| x → x + s | Translation group | Difference |
| x → tx, t > 0 | Similarity group | Ratio |
| x → x | Identity group | Absolute |

The transformation groups are used to determine the invariance of a law. The law expression must be invariant to the transformation group; otherwise it will depend not only on nature, but also on the subjective choice of measurement units.
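As a rough illustration of Table 2, the following Python sketch (our own assumption-laden example, not part of the measurement theory literature) draws a random permissible transformation for a given scale type and checks which statements keep their truth value under it:

```python
# Rough illustration of Table 2 (our sketch): draw a random permissible
# transformation for a scale type; invariant statements keep their truth.
import math
import random

def permissible(scale):
    """Return a random transformation from the scale's group."""
    if scale == "ratio":       # x -> t*x, t > 0 (similarity group)
        t = random.uniform(0.1, 10.0)
        return lambda x: t * x
    if scale == "interval":    # x -> r*x + s, r > 0 (positive affine)
        r, s = random.uniform(0.1, 10.0), random.uniform(-5.0, 5.0)
        return lambda x: r * x + s
    if scale == "difference":  # x -> x + s (translation group)
        s = random.uniform(-5.0, 5.0)
        return lambda x: x + s
    raise ValueError(scale)

a, b, c = 3.0, 1.5, 7.2

# An order statement is invariant for all three groups above.
f = permissible("interval")
assert (c > b) == (f(c) > f(b))

# "a is twice b" is meaningful only on a ratio scale.
g, h = permissible("ratio"), permissible("interval")
print(math.isclose(g(a), 2 * g(b)))  # True: ratios survive similarity maps
print(math.isclose(h(a), 2 * h(b)))  # almost surely False under affine maps
```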

3. Data type problems

Below we consider the problem of the empirical content of data. A data type, in object-oriented programming languages as well as in the measurement theory, is a relational structure A with sets of relations and operations interpretable in the domain theory. For instance, a stock price data type can be represented as a relational structure A = ⟨A, {<, =, >}⟩, where the elements of A are individual stock prices and {<, =, >} are their relations.

Implicitly, every attribute in the data represents a data type, which can take a number of possible values. These values are elements of A. For instance, the attribute date has 365 (366) elements, ranging from January 1 to December 31. There are several meaningful relations and operations with dates, such as <, =, >, and middle(a,b). For instance, the operation middle(a,b) produces the middle date c for the input dates a and b. Conventionally, in attribute-value languages (AVLs), this data type, as well as many other data types, is given implicitly, i.e., the relations and operations are not explicitly presented.

Below we consider this implicit situation in more detail for six cases:
1. Physical data types in physical domains.
2. Physical data types in non-physical domains.
3. Non-physical data types in non-physical domains.
4. Nominal discrete data types.
5. Non-quantitative and non-discrete data types.
6. Mix of data types.

1. Physical data types in physical domains. Data contain only physical quantities, and the ontology and background knowledge of the learning task belong to physics. This is the realm of physics with well-developed data types and measurement procedures. In this case, the measurement theory [26] provides formalized relational structures for all the quantities, and KDD&DM methods can be correctly applied.

2. Physical data types in non-physical domains. The data contain physical quantities, but the ontology and domain background knowledge of the learning task do not refer to physics. The background knowledge may refer to finance, geology, medicine, and other areas. In such cases, the real data types are not known, even when they represent physical quantities (as we pointed out above for temperature in medicine). If the quantity is physical, then we can take the relational structure from the structures available in the measurement theory. However, the physically interpretable relations of the relational structure are not necessarily interpretable in the ontologies of other subject domains. An interpretation of the relations and operations should be provided for the new domain. If relations are not interpretable, they should be removed from the relational structure. The invariance of the KDD&DM results is not guaranteed if a non-interpretable relation is not removed and a data mining method uses it (see the next section for definitions).

3. Non-physical data types in non-physical domains. For non-physical quantities, data types are virtually unknown. There are two sub-cases:

a. Non-numerical data types. It has been demonstrated in [23] that several data types, such as pair-wise and multiple comparison data types, attribute-based data types, order, and coherence matrix data types, can be represented as many-sorted empirical systems in a rather natural way. Without such a representation, the invariance of KDD&DM methods cannot be rigorously established.

b. Numerical data types. Here, we have a measurer x(a), which produces a number as the result of a measurement procedure applied to an object a.
Examples of measurers are psychological tests, stock market indicators, questionnaires, and physical measuring instruments used in non-physical areas.

Let us define a set of empirically interpretable relations and operations for the measurer x(a). For any numerical relation R(y1,…,yk) ⊆ Re^k and operation σ(x1,…,xm): Re^m → Re, where Re is the set of real numbers, an empirical relation P_R on A^k and an empirical operation ρ_σ: A^m → A can be defined as follows:

P_R(a1,…,ak) ⇔ R(x(a1),…,x(ak)),
x(ρ_σ(a1,…,am)) = σ(x(a1),…,x(am)).

The values x(a) provided by the measurer obviously have an empirical interpretation, but the relation P_R and operation ρ_σ may not. We should find relations R and operations σ that have an empirical interpretation in the subject domain ontology. The set of derived interpretable relations is not empty, because at least one relation (P=) has an empirical interpretation: P=(a1,a2) ⇔ x(a1) = x(a2).

In the measurement theory, many sets of axioms that establish strong data types are based only on ordering and equivalence relations. Some strong data types can be constructed from interactions of quantities with weak data types, such as ordering and equivalence. For instance, given a weak order relation ≤y (for the attribute y) and n equivalence relations ≈x1,…,≈xn for the attributes x1,…,xn, one can construct a complex relation G(y,x1,…,xn) ⇔ (y = f(x1,…,xn)) (defined by an axiomatic system) between y and x1,…,xn, such that f(x1,…,xn) is a polynomial [26]. The polynomial requires multiplication, power and sum operations. However, these operations can be defined for y, x1,…,xn using the relation G, if a certain set of axioms in terms of the relations ≤y, ≈x1,…,≈xn is true for A. Ordering and equivalence relations are usually empirically interpretable in the ontologies of various subject domains. The invariance of KDD&DM methods is not guaranteed for the initial numerical data types to which they are usually applied, but it is guaranteed for revised scales based on the relations P_R and operations ρ_σ.

4. Nominal discrete data types. Here, data are interpretable in the corresponding relational structures, because there is no difference between the numerical and empirical systems beyond the possible use of different symbols. All numbers can be considered as names and can easily be represented as predicates with a single variable. The KDD&DM methods and their results are invariant if discrete data types are used as names.

5. Non-quantitative and non-discrete data types. Data contain no quantities or discrete variables, but do contain ranks, orders and other non-numerical data types. This case is similar to item 3a above. The only difference is that such data are usually made discrete by various calibrations, with a loss of useful information.

6. Mix of data types. All the difficulties mentioned above arise in this case. Working with such a mix requires a new approach. Our Relational Data Mining approach provides it, using a relational representation of the data types.

4. Invariance of the KDD&DM methods

The results of KDD&DM methods must not depend on the subjective choice of measurement units, but usually this is not the case. Let us define the notion of invariance of a KDD&DM method. To that end, we will use the common (attribute-based) representation of supervised learning [54] (Figure 1), where:

W = {w} is a training sample;
X(w) = (x1,…,xn) is the tuple of values of n variables (attributes) for a training example w;
Y(w) is the target function assigning the target value to each training example w.

The result of a KDD&DM method M learning on the training set {X(w)}, w ∈ W, is a rule J = M({X(w)}).

[Figure 1. Schematic supervised attribute-based data mining model: training examples w from the sample W = {w} are represented by descriptors X(w) = (x1,…,xn) and assigned target values Y(w); the KDD&DM method M learns a rule/classifier J = M({X(w)}) with J(X(w)) = Y(w).]

The rule J predicts values of the target function Y(w). For example, consider w with an unknown value Y(w) but with the values of all its attributes X(w) known; then J(X(w)) = Y(w), where J(X(w)) is the value generated by the rule J. The resulting rule J can be an algebraic expression, a logic expression, a decision tree, a neural network, a complex algorithm, or a combination of these models.

If the attributes (x1,…,xn, Y) are determined by the empirical systems A1,…,An, and B, having the transformation groups g1,…,gn, and g, respectively, then the transformation group G for all attributes is the product G = g1 × … × gn × g. The KDD&DM method M is invariant relative to the transformation group G iff for any g ∈ G the rules

J = [M({X(w)})], J_g = [M({g(X(w))})],

produced by the method M generate the same results, i.e., for any X(w), w ∈ W:

gJ(X(w)) = J_g(g(X(w))).

If the method is not invariant (which is the case for the majority of methods), then the predictions generated by the method depend on the subjective choice of measurement units.

The invariance of a method is closely connected to the interpretability of its results. Numerical KDD&DM methods assume that standard mathematical operations such as +, -, *, / can be used in an algorithm despite their possible non-interpretability. In this case, the method can be non-invariant and can deliver non-interpretable results. In contrast, a KDD&DM method M is invariant if it uses only information from the empirical systems A1,…,An, and B as data and produces rules J that are logical expressions in terms of these empirical systems. This approach is proposed in the relational approach to Data Mining [23], [25], [47], [51].
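The invariance condition gJ(X(w)) = J_g(g(X(w))) can be pictured with a toy experiment. In this hedged Python sketch, both "methods" are simplistic stand-ins of our own, not the paper's algorithms: a classifier thresholding at the arithmetic mean fails the condition under a strictly increasing transformation, while a classifier using only the order relation passes it:

```python
# Toy check of the invariance condition (hedged; our stand-in methods).
import math

train = [(1.0, 0), (2.0, 0), (8.0, 1), (9.0, 1)]  # pairs (X(w), Y(w))

def learn_mean_threshold(sample):
    """Non-invariant: uses '+', thresholds at the arithmetic mean."""
    cut = sum(x for x, _ in sample) / len(sample)
    return lambda x: int(x > cut)

def learn_order_threshold(sample):
    """Invariant: uses only the order relation between examples."""
    cut = max(x for x, y in sample if y == 0)
    return lambda x: int(x > cut)

g = math.exp   # strictly increasing: permissible for the order scale
probe = 6.0

for learn in (learn_mean_threshold, learn_order_threshold):
    J = learn(train)                              # J = M({X(w)})
    J_g = learn([(g(x), y) for x, y in train])    # J_g = M({g(X(w))})
    # Invariance requires J(x) == J_g(g(x)) for every x.
    print(learn.__name__, J(probe) == J_g(g(probe)))
# prints: learn_mean_threshold False / learn_order_threshold True
```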

5. Relational methodology for the analysis of KDD&DM methods

A non-invariant KDD&DM method M: {X(w)} → J can be analyzed, and another, invariant method can be extracted from M. Let us define a many-sorted empirical system A(W) that is a product of the empirical systems A1,…,An, B bound on the set W. The empirical system A(W) contains all interpretable information from the learning sample W. Next, we define a transformation W → A(W) of all data W into the many-sorted empirical system A(W), and replace the representation W → {X(w)} by the transformation W → A(W) → {X(w)}. Based on the method M: {X(w)} → J, we define a new method ML: A(W) → J such that ML(A(W)) = M({X(w)}) = J, using the transformation W → A(W) → {X(w)}. Thus, the method ML uses only interpretable information from the data A(W) and produces the rule J using the method M.

Let us analyze the transformation of the interpretable information A(W) into the rule J through the method M. If we apply only interpretable operations to the interpretable information A(W) in the method M, we may extract some logical rule JL from the rule J. This rule JL will contain only the interpretable information of the rule J, expressed in terms of the empirical system A(W). Let us define the next method MLogic: A(W) → JL, where JL is a set of logical rules of form (1) for the rule J produced by the method M and the interpretable information A(W), A(W) → {X(w)}, M: {X(w)} → J. The method MLogic is obviously invariant.

If we consider all possible data for the method M and all rules JL that may be produced by the MLogic method, then we obtain a class of rules (hypotheses) {JL}, the knowledge space of the KDD&DM method M. As a result we obtain: (1) an ontology of the particular KDD&DM method M as an empirical system A(W), and (2) a knowledge space {JL} of the method M. There are First-Order Logic (FOL) methods that are capable of discovering some sets of hypotheses (knowledge spaces) presented in first-order logic. Thus, those FOL methods that are invariant simulate (model) traditional KDD&DM methods to some extent.

6. First-order logic approaches

Let us consider the existing FOL methods and compare them with the proposed relational approach and the Discovery system. A variety of relational KDD&DM systems have been developed in recent years [29]. Theoretically, they have many advantages. However, in practice, the complexity of the language is greatly restricted, reducing its applicability. For example, some systems require that the concept definition be expressed in terms of attribute-value pairs [27], [6] or only in terms of unary predicates [17], [30], [19], [43], [41]. The systems that allow workable relational concept definitions (e.g., OCCAM [33], [11], [2]) place strong restrictions on the form of induction and the initial knowledge that is provided to the system [35].

The major successful applications of FOL have been described in [3], [4], [7], [8], [21], [32], [36]; they are in the domains of chemistry, physics, medicine and others. Tasks such as mesh design, mutagenicity, and river water quality exemplify successful applications. Domain specialists appreciate that the learned regularities are understandable directly in domain terms. Fu [13] has justly observed: "Lack of comprehension causes concern about the credibility of the result when neural networks are applied to risky domains, such as patient care and financial investment."

Advantages of the first-order logic (FOL) methods: comprehensive predicate invention and a human-readable, comprehensible form of rules. Logical relations (predicates) should be developed to exploit the advantages of human-readable forecasting rules. In this way, FOL methods can produce valuable comprehensible rules in addition to the forecast. Using this technique, a user can evaluate the performance of a forecast as well as the forecasting rule. Obviously, comprehensible rules have advantages over a forecast without explanations. The problems of inventing predicates were considered in the previous section and in [23].

Advantages versus disadvantages of the attribute-value language (AVL) methods and the first-order logic methods (Table 3). Bratko and Muggleton [4] have indicated that current FOL systems are relatively inefficient and have rather limited facilities for handling numerical data. The purpose of Relational Data Mining (RDM) is to overcome these limitations of the current FOL methods. There are two types of numerical data in data mining: (a) a numerical target variable and (b) numerical attributes used to describe objects and discover patterns. Traditionally, the FOL methods solve only classification tasks without direct operations on numerical data. The Discovery system handles interval forecasts of continuous numerical variables, like prices, along with classification tasks. The Discovery system also handles numerical time series using first-order logic techniques, which is not typical of ILP and FOL applications.

Table 3. Comparison of the AVL-based and first-order logic methods.

| Method | Advantages for the learning process | Disadvantages for the learning process |
|---|---|---|
| Methods based on attribute-value languages | Simple, efficient, and handle noisy data. Appropriate learning time with a high number of training examples. | Limited form of background knowledge. Lack of relations in the concept description language. |
| Methods based on first-order logic | Sound theoretical basis (first-order logic, logic programming). Flexible form of background knowledge, problem representation, and problem-specific constraints. Comprehensive representation of background knowledge and relations among examples. | Inappropriate learning time with a high number of arguments in the relations. Poor facilities for processing numerical data without using the measurement theory. |

Background knowledge and ontology. Knowledge representation is an important and informal first step in Relational Data Mining. In attribute-based methods, the attribute form of data actually dictates the form of knowledge representation. Relational data mining has more options for this purpose. For example, for RDM, attribute-based stock market information, such as stock prices, indices, and volume of trading, should be transformed into a first-order logic form. This knowledge includes much more than just attribute values.
There are many ways to represent knowledge in a first-order logic language. Data mining algorithms may work too long to dig out relevant information or can even produce inappropriate rules. Introducing data types [12] and concepts of the representative measurement theory [26], [37] into the knowledge representation process helps to address this representation problem. In fact, the measurement theory has developed a wide set of data types, which cover the data types used in [12].

The FOL systems have a mechanism to represent background knowledge in a human-readable and comprehensive form.

Hybridizing the logical data mining methods with a probabilistic approach. This is done by introducing probabilities over logical formulas [5], [9], [14], [21], [23], [25], [31], [45], [47], [48]. For financial data mining this was done in [22], [23], [45], [47], [48] using the Discovery system, which has been applied to predict the SP500C time series and to develop a trading strategy. This RDM method outperformed several other strategies in simulated trading [22], [23].

Statistical significance. Traditionally, the FOL methods were purely deterministic, which originated from logic programming. The deterministic methods have a well-known problem of handling data with a significant level of noise. This is especially important for financial data, which are very noisy. In contrast, RDM can handle noisy and imperfect data, including numerical data. Statistical significance is another challenge for deterministic methods. Statistically significant rules are advantageous in comparison with rules tested only for their performance on training and testing data [29]. Training and testing data can be too limited and/or not representative. There are more chances that rules will fail to give a correct forecast on other data if we rely only on them. This is a difficult problem for any data mining method, especially for deterministic methods, including ILP. Intensive studies have been conducted on incorporating a probabilistic mechanism into ILP [31].

Hypothesis space. It is well known that the general problem of rule generation and testing is NP-complete [18]. Therefore, the above discussion is closely related to the following questions. What determines the number of rules? When to stop generating rules? The number of hypotheses is another important parameter. It has already been mentioned that RDM with first-order rules allows expressing naturally a wide variety of general hypotheses. These more general rules can be used in solving classification problems as well as in interval forecasting of continuous variables. The algorithmic complexity of FOL algorithms grows exponentially with the number of combinations of predicates to be tested. A restraint halting this exponential growth is required in order to reduce the set of combinations. To address this issue, we propose an approach based on data types and the measurement theory. This approach provides better means for generating only meaningful hypotheses using syntactic information. A probabilistic approach also naturally addresses knowledge discovery in situations with incomplete or incorrect domain knowledge. In this way, the properties of single examples are not generalized beyond the limits of statistically significant rules.

FOIL, FOCL and Discovery algorithms. The algorithm FOIL [38], [39] learns constant-free Horn clauses, a useful subset of first-order predicate calculus. Subsequently, FOIL was extended to use a variety of types of background knowledge to increase the class of problems solvable, to decrease the hypothesis space explored, and to improve the accuracy of learned rules. The FOCL (First Order Combined Learner) algorithm [35] extends FOIL. FOCL uses first-order logic and combines its information-based optimality metric with background knowledge. FOCL has been tested on various problems [36], including a domain describing when a student loan is to be repaid [34].
As indicated above, the general problem of rule generation and testing is NP-complete. Therefore, we face the problem of designing algorithms for an NP-complete task. Several related questions arise. What determines the number of rules to be tested? When to stop generating rules? What is the justification for specifying particular expressions instead of any others? The FOCL, FOIL and Discovery systems use different stop criteria and different mechanisms to generate rules for testing. The RDM Discovery system selects rules that are probabilistic laws (see section 10) and consistent with measurement scales [26] for a particular task. The algorithm stops generating new rules when they become too complicated (i.e., statistically insignificant for the data) despite the possibly high accuracy of the rules on training data.

FOIL and FOCL are based on the information gain criterion.

The Discovery system contains several extensions over other FOL algorithms. It enables various forms of background knowledge to be exploited. The goal of this system is to create probabilistic laws in terms of the relations (predicates and literals) defined by a collection of examples and other forms of background knowledge. The Discovery system, as well as FOCL, has several advantages over FOIL:
- it improves the search for hypotheses by using background knowledge with predicates defined by a rule, in addition to predicates defined by a collection of examples;
- it limits the search space by posing constraints;
- it improves the search for hypotheses by accepting as input a partial, possibly incorrect rule that is an initial approximation to the predicate to be learned.

There are also advantages of this RDM system over FOCL. It:
- limits the search space by using the statistical significance of hypotheses;
- limits the search space by using the strength of the data type scales.

The above advantages are ways of generalization used in this system. Generalization is the critical issue in applying data-driven forecasting systems. The Discovery system generalizes data through probabilistic laws (see below). The approach is somewhat similar to the hint approach [1]. The main source of hints for the first-order logic rules is the representative measurement theory [26]. Note that the class of general propositional and first-order logic rules covered by the system is wider than the class of decision trees.

7. Subject domain theory

In this section, we introduce the notion of a subject domain theory and define the property of an experiment which necessitates the universal axiomatizability of this theory. Let us introduce the first-order logic L of signature I = ⟨P1,…,Pk⟩, k > 0, where P1,…,Pk are predicate symbols of arities n1,…,nk. An empirical system [37], [26] is a finite model M = ⟨B, W⟩ of the signature I, where B is the basic set of the empirical system and W = ⟨P1,…,Pk⟩ is the tuple of predicates of the signature I defined on B. Every predicate Pj can also be defined as the subset P_j ⊆ B^nj on which it is true.

Let us represent a subject domain by the many-sorted empirical system M = ⟨A, W⟩. A Subject Domain Theory (SDT) T = ⟨A, W, Obs, S_I⟩ consists of:
- the tuple of predicates W of the signature I;
- the measuring procedure Obs: B → ⟨B, W⟩, mapping any finite subset of objects B ⊆ A into a protocol of measurements, represented by a finite (many-sorted) empirical system (data) D = ⟨B, W⟩;
- the axiom system S_I, which should be true on any protocol of measurements.

The definition of the truth of an axiom in an empirical system D is the standard definition of the truth of an expression on a model (empirical system).

Task: the task of discovering a subject domain theory is to determine a system of axioms S_I that is true on the data (presented by a many-sorted empirical system D = ⟨B, W⟩).

All observation results Obs: B → ⟨B, W⟩ are parts of empirical systems: any observation result is a finite submodel of the subject domain M. In this case, we can prove that the axiomatic system S_I is universally axiomatizable. Let PR_M = {Obs(B), B ⊆ A} be the set of all experimental results that can be obtained as protocols (submodels) of observations. Using [28], it is not difficult to prove the following theorem.
Theorem 1. If PR_M is the set of all finite submodels of the subject domain M, then the axiomatic system S_I is logically equivalent to a set of universal formulas.

We will assume that the condition of the theorem is true for our subject domain and, hence, the axiomatic system S_I is universally axiomatizable.

8. What is a law?

It is known that a set of universal formulas S_I can be reduced, by logically equivalent transformations, to a set of rules of the form (1) with literals A0, A1,…,Ak:

C = (A1 & … & Ak → A0), k ≥ 0. (1)

Therefore, we can assume that the axiomatic system S_I is a set of rules (1). Thus, the task of discovering a subject domain theory is reduced to discovering rules (1). Let us analyze this task. What can we say about the truth of the axiomatic system S_I on a set of experimental results PR_M by using a logical analysis of the axioms?
- The rule C = (A1 & … & Ak → A0) is true on PR_M if its premise is always false on PR_M. We prove in the theorem below that in this case some logically stronger subrule, linking atoms of the premise, is true on PR_M;
- the rule C is true on PR_M if some logically stronger subrule, containing only a part of the premise and the same conclusion, is true on PR_M.

Let us clarify the logically stronger subrules from which the truth of the rule follows.

Theorem 2 [48], [49]. The rule C = (A1 & … & Ak → A0) logically follows from any rule of the form:
(1) (Ai1 & … & Aih → ¬Ai0), where {Ai1,…,Aih,Ai0} ⊆ {A1,…,Ak}, 0 ≤ h < k, and (Ai1 & … & Aih → ¬Ai0) ⊢ ¬(A1 & … & Ak) ⊢ (A1 & … & Ak → A0);
(2) (Ai1 & … & Aih → A0), where {Ai1,…,Aih} ⊆ {A1,…,Ak}, 0 ≤ h < k, and (Ai1 & … & Aih → A0) ⊢ (A1 & … & Ak → A0),
where ⊢ denotes provability in the propositional calculus.

Definition 1. A subrule of a rule C is a logically stronger rule of form (1) or (2), defined in Theorem 2 for the rule C. It is easy to see that any subrule also has form (1).

Corollary 1. If a subrule of a rule C is true on PR_M, then the rule C is also true on PR_M.

Definition 2. A law on a set of experimental results PR_M is a rule C which is true on PR_M and none of whose subrules is true on PR_M.

Let L be the set of all laws on PR_M. It follows from the logic and methodology of science that hypotheses that are most falsifiable, simplest and contain the smallest number of parameters can be viewed as laws. In our case, all these properties, which are usually difficult to define, follow from the logical strength of the laws. The subrules are:
- logically stronger than the rules and more prone to becoming false (more falsifiable), because they contain weaker premises and are therefore applicable to larger datasets;
- simpler, because they contain a smaller number of atomic expressions than the rule;
- carriers of a smaller number of "parameters" (the number of atomic expressions may be regarded as the parameters "tuning" rules to data).

Why should laws be most falsifiable, simplest and contain the smallest number of parameters? In general, views differ from one author to another. In our case, for the hypotheses of form (1), we can give a specific answer to this question. Discovery of laws is in fact solving a more relevant task: identifying the logically strongest theory describing our data and providing a probable mechanism of data generation. It can be proved that the subject domain theory S_I is inferable from the set of laws L.

Theorem 3 [51]. L ⊢ S_I.

Therefore, the task of discovering the subject domain theory S_I is reduced to discovering the set of laws L.
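Definition 2 suggests a direct, brute-force test. The sketch below is our illustration (for brevity it checks only subrules of form (2), those with a shortened premise and the same conclusion) and verifies whether a rule is a law on a small set of protocols encoded as Boolean rows:

```python
# Brute-force test of Definition 2 (our illustration; only subrules of
# form (2), shortened premise with the same conclusion, are checked).
from itertools import combinations

def rule_true(data, premise, conclusion):
    """The implication holds on every row (a row maps atom -> bool)."""
    return all(row[conclusion] for row in data
               if all(row[a] for a in premise))

def is_law(data, premise, conclusion):
    if not rule_true(data, premise, conclusion):
        return False
    for h in range(len(premise)):            # all proper sub-premises
        for sub in combinations(premise, h):
            if rule_true(data, sub, conclusion):
                return False                  # a subrule is already true
    return True

rows = [
    {"A1": True,  "A2": True,  "A0": True},
    {"A1": True,  "A2": False, "A0": False},
    {"A1": False, "A2": True,  "A0": False},
]
print(is_law(rows, ("A1", "A2"), "A0"))  # True: no subrule suffices
print(is_law(rows, ("A1",), "A0"))       # False: fails on the second row
```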

9. Events and their probability

Let us generalize the notion of a law to the probabilistic case. We define a probability on the set of experimental results PR_M and on logical expressions. We assume that objects for an experiment are selected randomly from the set A. For the sake of simplicity, we introduce a discrete probability function on A as a mapping µ: A → [0,1] such that [14]:

∑_{a∈A} µ(a) = 1 and µ(a) ≥ 0, a ∈ A; µ(B) = ∑_{b∈B} µ(b), B ⊆ A. (2)

The discrete probability function µ^n on the product A^n is thereby defined by

µ^n(a1,…,an) = µ(a1)·…·µ(an).

More general definitions of the probability function µ are considered in [14]. Let us define an interpretation of the language L on the empirical system M = ⟨A, W⟩ as the mapping I: I → W that associates with every signature symbol Pj ∈ I, j = 1,…,k, the predicate Pj from W of the same arity. Let X = {x1, x2, x3,…} be the variables of the language L. A valuation ν is defined as a function ν: X → A.

Let us define the probability for sentences of the language L. Let U(I) be the set of all atomic formulas of the language L of the form P(x1,…,xn), and let R(I) be the set of all sentences of the language L obtained by the closure of the set U(I) under the logical operations &, ∨, ¬. For ϕ ∈ R(I), the formula ϕ̂ = νIϕ is obtained by substituting the predicate symbols from I by the predicates from W (by the interpretation I) and the variables of the formula ϕ by objects from A (by the valuation ν). The probability η of a sentence ϕ(x1,…,xn) ∈ R(I) on M is defined as follows:

η(ϕ) = µ^n({(a1,…,an) | M ⊨ ϕ̂, ν(x1) = a1,…,ν(xn) = an}), (3)

where ⊨ is the truth relation on M.

10. General notion of law and probabilistic laws on PR_M

Now we revise the concept of a law on PR_M in terms of probability. Let us do it in such a way that the concept of a law on PR_M is a particular case of this definition. A law is a rule that is true on PR_M and whose subrules are all false on PR_M. In other words, a law is a rule, true on PR_M, which cannot be made simpler or logically stronger without losing its truth. This property of the law, "not to be simplified", allows stating the law not only in terms of truth but also in terms of probability.

Theorem 4 [48], [49]. For any rule C = (A1 & … & Ak → A0), the following two conditions are equivalent:
1. the rule C is a law on PR_M;
2. (a) η(A0/A1 & … & Ak) = 1 and η(A1 & … & Ak) > 0; (b) the conditional probability η(A0/A1 & … & Ak) of the rule is greater than the conditional probability of each of its subrules.

This theorem gives an equivalent definition of the law on PR_M in probabilistic terms.

Definition 3. A probabilistic law on PR_M with conditional probability 1 is a rule C = (A1 & … & Ak → A0) such that:
a) η(A0/A1 & … & Ak) = 1 and η(A1 & … & Ak) > 0;
b) the conditional probability η(A0/A1 & … & Ak) of the rule is greater than the conditional probability of each of its subrules.

We denote the set of all probabilistic laws on PR_M with conditional probability 1 as LP1.

Corollary 2. A probabilistic law on PR_M with conditional probability 1 is a law on PR_M.

Therefore, the task of discovering the subject domain theory S_I is reduced to the task of discovering all probabilistic laws on PR_M with conditional probability equal to 1.

The definition of probability (3) is based on the random choice of objects for the experiment. Experiments are parts (submodels) of the empirical system M = ⟨A, W⟩, and it is not assumed that the truth values of predicates can change during the experiments. As we noted above, a more general definition of the probability function µ is given in [14]. That definition includes cases with noise, when the truth values of predicates can change. We cannot require the complete logical truth of laws on PR_M for experiments with noise, so the definition of a law on PR_M should be changed. Let us consider items 1 and 2 of Theorem 4 from the standpoint of the "not to be simplified" property:
- a law is a rule which is true on PR_M and cannot be simplified (made logically stronger) without losing its truth;
- a probabilistic law on PR_M with conditional probability 1 cannot be simplified (made logically stronger) without losing (decreasing) the value 1 of the conditional probability, so that it becomes less than 1.

This makes the following general definition of a law feasible.

Definition 4. A law is a rule of the form (1), based on truth, conditional probability, or other estimates, which cannot be made logically stronger without reducing these estimates.

Therefore, we may generalize the definition of the probabilistic law with conditional probability 1 by omitting the condition η(A0/A1 & … & Ak) = 1 from point (a) of Definition 3. The remaining condition (b) expresses the property of the law in the sense of Definition 4.

Definition 5. A probabilistic law on PR_M is a rule C = (A1 & … & Ak → A0), k ≥ 0, with η(A1 & … & Ak) > 0, such that the conditional probability η(A0/A1 & … & Ak) of the rule is greater than the conditional probability of each of its subrules. We denote the set of all probabilistic laws on PR_M as LP.

Corollary 3. L ⊆ LP.

Definition 6. A Strongest Probabilistic Law (SPL-rule) on PR_M is a probabilistic law C = (A1 & … & Ak → A0) which is not a subrule of any other probabilistic law. We define SPL as the set of all SPL-rules.

Proposition 3. L ⊆ SPL ⊆ LP.

Consider the task of discovering the subject domain theory S_I under noise conditions.

Definition 7. We say that the noise is "saving" if the sets of laws LP1 and LP are equal.

The task of discovering the subject domain theory S_I in the presence of noise is thus reduced to two tasks: (1) evaluating whether the noise is "saving"; (2) discovering the set LP. It follows from Theorems 3 and 4, Corollary 2 and Definition 7 that if the noise is "saving", then LP ⊢ S_I, and the task of discovering the subject domain theory S_I is reduced to discovering the set LP. In the paper [50] we describe two examples of "saving" noise that satisfy Definition 7.
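Definition 5 admits a direct frequency-based check. In the following sketch (ours; sample frequencies stand in for the probability η of (3)), a rule is accepted as a probabilistic law only if its conditional probability strictly exceeds that of every premise-shortened subrule:

```python
# Frequency-based check of Definition 5 (our sketch; sample frequencies
# stand in for the probability eta of (3)).
from itertools import combinations

def cond_prob(data, premise, conclusion):
    covered = [row for row in data if all(row[a] for a in premise)]
    if not covered:
        return None   # eta(premise) = 0: the rule is not admissible
    return sum(row[conclusion] for row in covered) / len(covered)

def is_probabilistic_law(data, premise, conclusion):
    p = cond_prob(data, premise, conclusion)
    if p is None:
        return False
    for h in range(len(premise)):        # premise-shortened subrules
        for sub in combinations(premise, h):
            q = cond_prob(data, sub, conclusion)
            if q is not None and q >= p:
                return False             # a stronger subrule is as good
    return True

rows = [
    {"A1": 1, "A2": 1, "A0": 1}, {"A1": 1, "A2": 1, "A0": 1},
    {"A1": 1, "A2": 0, "A0": 0}, {"A1": 0, "A2": 1, "A0": 0},
    {"A1": 0, "A2": 0, "A0": 1},
]
print(is_probabilistic_law(rows, ("A1", "A2"), "A0"))  # True: 1.0 > 2/3, 0.6
```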

11. The models of prediction and the statistical ambiguity problem

The next question about the discovery of the subject domain theory S_I is how to use the theory. Its main use is prediction. Now we consider the models of prediction and the statistical ambiguity problem for predictions.

One of the major results of the Philosophy of Science is the so-called covering law model that was introduced by Hempel in the early sixties in his famous article "Aspects of scientific explanation" [15], [16]. The basic idea of this covering law model is that a fact is explained/predicted by subsumption under a so-called covering law, i.e., the task of an explanation/prediction is to show that a fact can be considered as an instantiation of a law. In the covering law model, two types of explanations/predictions are distinguished: Deductive-Nomological (D-N) explanations/predictions and Inductive-Statistical (I-S) explanations/predictions. In D-N explanations/predictions a law is deterministic, whereas in I-S explanations a law is statistical.

Right from the beginning, it was clear to Hempel that two I-S explanations can yield contradictory conclusions. He called this phenomenon the statistical ambiguity of I-S explanations [15], [16]. Let us consider the following example of statistical ambiguity. As opposed to classical deduction, in statistical inference it is possible to infer contradictory conclusions from consistent premises. Suppose that the theory LP makes the following statements:
- (L1): Almost all cases of streptococcus infection clear up quickly after the administration of penicillin;
- (L2): Almost no cases of penicillin-resistant streptococcus infection clear up quickly after the administration of penicillin;
- (C1): Jane Jones had a streptococcus infection;
- (C2): Jane Jones received treatment with penicillin;
- (C3): Jane Jones had a penicillin-resistant streptococcus infection.

It is possible to construct two contradictory arguments from this theory, one explaining why Jane Jones recovered quickly (E), and the other one explaining its negation, why Jane Jones did not recover quickly (¬E):

Argument 1: from L1 and C1, C2 infer E with probability [r];
Argument 2: from L2 and C2, C3 infer ¬E with probability [r].

The premises of both arguments are consistent with each other; they could all be true. However, their conclusions contradict each other, making these arguments rivals. So, the set of rules LP may be inconsistent. Hempel hoped to solve this problem by forcing all statistical laws in an argument to be maximally specific, i.e., they should contain all relevant information with respect to the domain in question. In our example, premise C3 of the second argument invalidates the first argument, since law L1 is not maximally specific with respect to all information about Jane in LP. Thus, the theory LP can only explain ¬E, but not E. We will return to this example below.

The Deductive-Nomological explanations/predictions of some observed phenomenon G are inferred by the rule:

L1,…,Lm
C1,…,Cn
---------
G

It satisfies the following conditions:
i. L1,…,Lm are universally quantified sentences (having at least one universally quantified formula), and C1,…,Cn have no quantifiers or variables;
ii. L1,…,Lm, C1,…,Cn ⊢ G;
iii. L1,…,Lm, C1,…,Cn is consistent;
iv. L1,…,Lm ⊬ G; C1,…,Cn ⊬ G.

We assume that for the deductive-nomological inference of predictions we will use laws from L. Therefore, due to Theorem 3, we may infer any prediction that follows from the subject domain theory S_I [58].

Hempelian inductive-statistical explanations/predictions of some observed phenomenon G are inferred by the analogous rule:

L1,…,Lm
C1,…,Cn
--------- [r]
G

where [r] is the probability of the inference. In addition to points i-iv, it satisfies the following Requirement of Maximal Specificity (RMS):
v. RMS: all laws L1,…,Lm are maximally specific.

In Hempel [15], [16] the RMS is defined as follows. An I-S argument of the form

p(G;F) = r
F(a)
--------- [r]
G(a)

is an acceptable I-S prediction with respect to a knowledge state K if the following requirement of maximal specificity is satisfied: for any class H for which the two sentences

∀x(H(x) → F(x)), H(a) (4)

are contained in K, there exists a statistical law p(G;H) = r' in K such that r' = r. The basic idea of the RMS is that if F and H both contain the object a, and H is a subset of F, then H provides more specific information about the object a than F, and therefore the law p(G;H) should be preferred over the law p(G;F).

For the inductive-statistical inference of predictions we may use the probabilistic laws LP. However, in the next sections we present a new definition of the maximal specificity requirement and a corresponding definition of maximum specific rules that solve the problem of statistical ambiguity.

12. Semantic Probabilistic Inference of the Sets of Laws L and LP

In this section, we define a semantic probabilistic inference of the sets of laws L, probabilistic laws LP and SPL. This inference also gives us the possibility to define maximum specific rules.

Definition 8 [46]. A semantic probabilistic inference (SP-inference) of some SPL-rule is a sequence of probabilistic laws, designated as C1 ⊏ C2 ⊏ … ⊏ Cn, such that:

C1, C2,…,Cn ∈ LP, Cn is an SPL-rule,
Ci = (A^i_1 & … & A^i_ki → G), i = 1,2,…,n, n ≥ 1,
the rule Ci is a subrule of the rule Ci+1, η(Ci+1) > η(Ci), i = 1,2,…,n-1. (5)

Proposition 4. Any probabilistic law belongs to some SP-inference.

Proposition 5. There is an SP-inference for any SPL-rule.

Corollary 4. For any law from L there is an SP-inference of that law.
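A minimal way to picture an SP-inference is a greedy chain of premise extensions, each step strictly increasing the conditional probability as required by (5). The sketch below is our simplification: it explores a single branch of the inference tree rather than all of them, stopping when no extension improves the rule, which yields a candidate SPL-rule:

```python
# Greedy single-branch sketch of an SP-inference (our simplification of
# Definition 8: the full construction explores a tree of extensions).
def cond_prob(data, premise, target):
    covered = [r for r in data if all(r[a] for a in premise)]
    return sum(r[target] for r in covered) / len(covered) if covered else 0.0

def sp_inference(data, atoms, target):
    premise, chain = (), []
    while True:
        p = cond_prob(data, premise, target)
        # Try one-atom premise extensions; keep the best strict improvement.
        candidates = [(cond_prob(data, premise + (a,), target), a)
                      for a in atoms if a not in premise]
        if not candidates:
            return chain
        q, best = max(candidates)
        if q <= p:           # condition (5) fails: no strict increase
            return chain     # the last rule is a candidate SPL-rule
        premise += (best,)
        chain.append((premise, q))

rows = [
    {"A1": 1, "A2": 1, "G": 1}, {"A1": 1, "A2": 1, "G": 1},
    {"A1": 1, "A2": 0, "G": 0}, {"A1": 0, "A2": 1, "G": 1},
    {"A1": 0, "A2": 0, "G": 0},
]
for premise, prob in sp_inference(rows, ("A1", "A2"), "G"):
    print(premise, prob)   # ('A2',) 1.0 -- eta strictly increased from 0.6
```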

[Fig. 2. Semantic probabilistic inference tree of a sentence G: each branch extends the premise of a probabilistic law for G with additional atoms, increasing its conditional probability.]

Let us consider the set of all SP-inferences of some sentence G. This set constitutes the semantic probabilistic inference tree of the sentence G (see Fig. 2).

Definition 9. A maximum specific rule MS(G) of the sentence G is an SPL-rule of the semantic probabilistic inference tree of the sentence G that has the maximum value of conditional probability. We define the set of all maximum specific rules for all atoms G as MSR.

Proposition 6. T ⊆ MSR ⊆ SPL ⊆ TP.

13. Requirement of Maximal Specificity and the solution of the statistical ambiguity problem

Now we define the RMS for the probabilistic case. We suppose that the class H of objects in (4) is defined by some sentence H ∈ R(I) of the language L. Therefore, according to the RMS, p(G;H) = p(G;F) = r for this sentence. In terms of probability, it means that η(G/H) = η(G/F) = r for any H ∈ R(I) that satisfies (4).

Definition 10. The rule C = (F → G) satisfies the Probabilistic Requirement of Maximal Specificity (PRMS) iff the equations η(G/F&H) = η(G/F) = r for the rules C = (F → G) and C' = (F&H → G) follow from H ∈ R(I) and F(a)&H(a) (in this case the sentence (4) ∀x(F(x)&H(x) → F(x)) is true and η(F&H) > 0 due to (2)).

In other words, PRMS means that there is no other sentence H ∈ R(I) that increases or decreases the conditional probability η(G/F) = r of the rule C when added to the premise (see Lemma 1 below).

Lemma 1 [52]. If a sentence H ∈ R(I) decreases the probability, η(G/F&H) < η(G/F), then the sentence ¬H increases it: η(G/F&¬H) > η(G/F).

Lemma 2 [52]. For any rule C = (B1 & … & Bt → A0) of the form (1) with η(B1 & … & Bt) > 0, there is a probabilistic law C' = (A1 & … & Ak → A0) on M which is a subrule of the rule C and η(C') ≥ η(C).

Theorem 5 [52]. Any MS(G) rule satisfies PRMS.

Corollary 5 [52]. Any law on M satisfies the PRMS requirement.

Theorem 6 [52]. The I-S inference is consistent for any laws L1,…,Lm ∈ MSR.

It follows from the theorem that, after discovering the set of all maximum specific rules MSR, we can predict without contradictions by using I-S inference.

14. Relational Data Mining and the Discovery System

This paper reviewed the theory behind the Relational Data Mining (RDM) approach to knowledge discovery and the Discovery system [23, 47] based on this approach. The novel part of the paper is in blending the RDM approach, representative measurement theory, and current studies on ontology. In this approach, the initial rule/hypothesis generation is task-dependent. More detailed examples of such domain- and task-specific sets of rules/hypotheses are presented in [23], including an initial set of hypotheses for financial time series. For a particular task and subject domain, the RDM system selects rules that are simplest and consistent with the measurement scales (data types). It implements semantic probabilistic inference and discovers all the sets of rules L, LP, SLP, and MSR using data represented as a many-sorted empirical system. In this way a complete and consistent set of rules can be discovered. The system has been successfully applied to many practical tasks, from cancer diagnostics and time series forecasting to psychophysics and bioinformatics (see the Scientific Discovery website [42] for more information).
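To indicate what "data represented as a many-sorted empirical system" can look like operationally, here is a minimal sketch; the `Sort` class, the `holds` guard, and the example sorts are hypothetical names of ours, not the Discovery system's API. The point is only that each quantity carries exactly the relations declared empirically interpretable for it, so discovered rules cannot mention a relation that has no empirical meaning for that data type.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class Sort:
    """One quantity (sort) of a many-sorted empirical system: a value set
    together with only its empirically interpretable relations."""
    name: str
    relations: Dict[str, Callable] = field(default_factory=dict)

# An ordinal quantity: only the order relation is interpretable, '+' is not.
severity = Sort("severity", {">": lambda x, y: x > y})
# A ratio-scale quantity: both order and addition are interpretable.
duration = Sort("duration", {">": lambda x, y: x > y, "+": lambda x, y: x + y})

def holds(sort, rel, *args):
    """Evaluate a relation on a sort, refusing anything not declared
    interpretable, so rule discovery only queries meaningful relations."""
    if rel not in sort.relations:
        raise ValueError(f"'{rel}' is not empirically interpretable for {sort.name}")
    return sort.relations[rel](*args)

print(holds(severity, ">", 3, 2))   # True: order is meaningful for an ordinal sort
print(holds(duration, "+", 2, 5))   # 7: addition is meaningful on a ratio scale
# holds(severity, "+", 3, 2)        # would raise: '+' is not interpretable here
```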
ACKNOWLEDGEMENTS

The work has been supported by Russian Federation grants (the Scientific Schools grant of the President of the Russian Federation and the Siberian Branch of the Russian Academy of Sciences, Integration project No. 1, 115).

References

[1] Abu-Mostafa, Y.S. (1990). Learning from hints in neural networks. J. Complexity, 6.
[2] Bergadano, F., Giordana, A., Ponsero, S. (1989). Deduction in top-down inductive learning. In: Proceedings of the Sixth International Workshop on Machine Learning. Ithaca, NY: Morgan Kaufmann.
[3] Bratko, I., Muggleton, S., Varsek, A. (1992). Learning qualitative models of dynamic systems. In: Inductive Logic Programming, S. Muggleton, Ed. London: Academic Press.
[4] Bratko, I., Muggleton, S. (1995). Applications of inductive logic programming. Communications of the ACM, 38(11).
[5] Carnap, R. (1950). Logical Foundations of Probability. Chicago: University of Chicago Press.
[6] Danyluk, A. (1989). Finding new rules for incomplete theories: Explicit biases for induction with contextual information. In: Proceedings of the Sixth International Workshop on Machine Learning. Ithaca, NY: Morgan Kaufmann.
[7] Dzeroski, S., DeHaspe, L., Ruck, B.M., Walley, W.J. (1994). Classification of river water quality data using machine learning. In: Proceedings of the Fifth International Conference on the Development and Application of Computer Techniques to Environmental Studies (ENVIROSOFT 94).
[8] Dzeroski, S. (1996). Inductive Logic Programming and Knowledge Discovery in Databases. In: Advances in Knowledge Discovery and Data Mining, Eds. U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, R. Uthurusamy. AAAI Press / The MIT Press.
[9] Fenstad, J.E. Representation of probabilities defined on first order languages. In: J.N. Crossley, ed., Sets, Models and Recursion Theory: Proceedings of the Summer School in Mathematical Logic and Tenth Logic Colloquium (1967).
[10] Fishburn, P.C. (1970). Utility Theory for Decision Making. NY-London: J. Wiley & Sons.
[11] Flann, N., Dietterich, T. (1989). A study of explanation-based methods for inductive learning. Machine Learning, 4.
[12] Flach, P., Giraud-Carrier, C., Lloyd, J.W. (1998). Strongly Typed Inductive Concept Learning. In: Proceedings of the Eighth International Conference on Inductive Logic Programming (ILP'98).

[13] Fu, LiMin (1999). Knowledge Discovery Based on Neural Networks. Communications of the ACM, 42(11).
Goodrich, J.A., Cutler, G., Tjian, R. (1996). Contacts in context: promoter specificity and macromolecular interactions in transcription. Cell, 84(6).
[14] Halpern, J.Y. (1990). An analysis of first-order logics of probability. Artificial Intelligence, 46.
[15] Hempel, C.G. (1965). Aspects of Scientific Explanation. In: C.G. Hempel, Aspects of Scientific Explanation and Other Essays in the Philosophy of Science. New York: The Free Press.
[16] Hempel, C.G. (1968). Maximal Specificity and Lawlikeness in Probabilistic Explanation. Philosophy of Science, 35.
[17] Hirsh, H. (1989). Combining empirical and analytical learning with version spaces.
[18] Hyafil, L., Rivest, R.L. (1976). Constructing optimal binary decision trees is NP-complete. Information Processing Letters, 5(1).
[19] Katz, B. (1989). Integrating learning in a neural network. In: Proceedings of the Sixth International Workshop on Machine Learning. Ithaca, NY: Morgan Kaufmann.
[20] Kovalerchuk, B. (1973). Classification invariant to coding of objects. Comp. Syst. 55:90-97, Institute of Mathematics, Novosibirsk (in Russian).
[21] Kovalerchuk, B., Vityaev, E., Ruiz, J.F. (1997). Design of consistent system for radiologists to support breast cancer diagnosis. In: Proc. Joint Conf. Information Sciences, Durham, NC, vol. 2.
[22] Kovalerchuk, B., Vityaev, E. (1998). Discovering Lawlike Regularities in Financial Time Series. Journal of Computational Intelligence in Finance, 6(3).
[23] Kovalerchuk, B., Vityaev, E. (2000). Data Mining in Finance: Advances in Relational and Hybrid Methods. Kluwer Academic Publishers, 308 p.
[24] Kovalerchuk, B., Vityaev, E., Ruiz, J. (2000). Consistent Knowledge Discovery in Medical Diagnosis. IEEE Engineering in Medicine and Biology Magazine, special issue "Medical Data Mining", July/August 2000.
[25] Kovalerchuk, B., Vityaev, E., Ruiz, J.F. (2001). Consistent and Complete Data and "Expert" Mining in Medicine. In: Medical Data Mining and Knowledge Discovery. Springer.
[26] Krantz, D.H., Luce, R.D., Suppes, P., Tversky, A. (1971, 1989, 1990). Foundations of Measurement, Vol. 1-3. NY, London: Academic Press.
[27] Lebowitz, M. (1986). Integrated learning: Controlling explanation. Cognitive Science, 10.
[28] Mal'tsev, A.I. (1973). Algebraic Systems. Springer.
[29] Mitchell, T. (1997). Machine Learning. New York: McGraw Hill.
[30] Mooney, R., Ourston, D. (1989). Induction over the unexplained: Integrated learning of concepts with both explainable and conventional aspects. In: Proceedings of the Sixth International Workshop on Machine Learning (pp. 5-7). Ithaca, NY: Morgan Kaufmann.
[31] Muggleton, S. (1994). Bayesian inductive logic programming. In: Proceedings of the Eleventh International Conference on Machine Learning, W. Cohen and H. Hirsh, Eds.
[32] Muggleton, S. (1999). Scientific Knowledge Discovery Using Inductive Logic Programming. Communications of the ACM, 42(11).
[33] Pazzani, M.J. (1990). Creating a memory of causal relationships: An integration of empirical and explanation-based learning methods. Hillsdale, NJ: Lawrence Erlbaum Associates.
[34] Pazzani, M., Brunk, C. (1990). Detecting and correcting errors in rule-based expert systems: An integration of empirical and explanation-based learning. In: Proceedings of the Workshop on Knowledge Acquisition for Knowledge-Based Systems. Banff, Canada.
[35] Pazzani, M., Kibler, D. (1992). The utility of prior knowledge in inductive learning. Machine Learning, 9.
[36] Pazzani, M. (1997). Comprehensible Knowledge Discovery: Gaining Insight from Data. In: First Federal Data Mining Conference and Exposition, Washington, DC.
[37] Pfanzagl, J. (1971). Theory of Measurement (in cooperation with V. Baumann, H. Huber), 2nd ed. Physica-Verlag.
[38] Quinlan, J.R. (1989). Learning relations: Comparison of a symbolic and a connectionist approach (Technical Report). Sydney, Australia: University of Sydney.
[39] Quinlan, J.R. (1990). Learning logical definitions from relations. Machine Learning, 5.
[40] Samokhvalov, K. (1973). On theory of empirical prediction. Comp. Syst., #55 (in Russian).
[41] Sarrett, W., Pazzani, M. (1989). One-sided algorithms for integrating empirical and explanation-based learning. In: Proceedings of the Sixth International Workshop on Machine Learning. Ithaca, NY: Morgan Kaufmann.
[42] Scientific Discovery website.
