Comment on "On Discriminative vs. Generative Classifiers: A Comparison of Logistic Regression and Naive Bayes"


Jing-Hao Xue (jinghao@stats.gla.ac.uk) and D. Michael Titterington (mike@stats.gla.ac.uk)

Department of Statistics, University of Glasgow, Glasgow G12 8QQ, UK

Abstract. Comparison of generative and discriminative classifiers is an everlasting topic. As an important contribution to this topic, based on their theoretical and empirical comparisons between the naïve Bayes classifier and linear logistic regression, Ref. [6] claimed that there exist two distinct regimes of performance between the generative and discriminative classifiers with regard to the training-set size. In this paper, our empirical and simulation studies, as a complement of their work, suggest that the existence of the two distinct regimes may not be so reliable. In addition, for real-world datasets, so far there is no theoretically correct, general criterion for choosing between the discriminative and the generative approaches to the classification of an observation x into a class y; the choice depends on the relative confidence we have in the correctness of the specification of either p(y|x) or p(x, y) for the data. This demonstrates to some extent why Ref. [3] and [7] prefer normal-based linear discriminant analysis (LDA) when no model mis-specification occurs, whereas other empirical studies may prefer linear logistic regression instead. Furthermore, we suggest that the pairing of either LDA assuming a common diagonal covariance matrix (LDA-Λ) or the naïve Bayes classifier with linear logistic regression may not be perfect, and hence it may not be reliable to generalise to all generative and discriminative classifiers any claim derived from the comparison between LDA-Λ or the naïve Bayes classifier and linear logistic regression.
Keywords: Asymptotic relative efficiency; Discriminative classifiers; Generative classifiers; Logistic regression; Normal-based discriminant analysis; Naïve Bayes classifier

© 2008 Kluwer Academic Publishers. Printed in the Netherlands.

Xue-Titterington-SCH2008.tex; 5/08/2008; 22:19; p.1

Abbreviations: LDA/QDA Normal-based Linear/Quadratic Discriminant Analysis; AIC Akaike Information Criterion; GAM Generalised Additive Model

1. Introduction

Classification is a ubiquitous problem tackled in statistics, machine learning, pattern recognition and data mining [4]. Generative classifiers, also termed the sampling paradigm [2], such as normal-based discriminant analysis and the naïve Bayes classifier, model the joint distribution p(x, y) of the measured features x and the class labels y, factorised in the form p(x|y)p(y), and learn the model parameters through maximisation of the likelihood given by p(x|y)p(y). Discriminative classifiers, also termed the diagnostic paradigm [2], such as logistic regression, model the conditional distribution p(y|x) of the class labels given the features, and learn the model parameters through maximisation of the conditional likelihood based on p(y|x). Comparison of generative and discriminative classifiers is an everlasting topic [3, 6, 7, 10, 12]. Results from such comparisons, in particular in terms of misclassification rates, can not only guide the selection of an appropriate classifier, either generative or discriminative, but also shed light on how to exploit the best of both worlds, and the topic has thus been attracting long-standing interest from both researchers and practitioners.
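The contrast between the two estimation principles can be sketched side by side. The following is a minimal Python illustration (the paper's experiments use R; the one-dimensional data, the learning rate and the iteration count here are arbitrary choices for the sketch, not the paper's settings): the generative fit maximises the joint likelihood p(x|y)p(y) in closed form, while the discriminative fit maximises the conditional likelihood p(y|x) by gradient ascent.

```python
import math, random

random.seed(0)

# Two classes in 1-D: class 1 ~ N(1, 1), class 2 ~ N(-1, 1), equal priors.
data = [(random.gauss(1.0, 1.0), 1) for _ in range(200)] + \
       [(random.gauss(-1.0, 1.0), 2) for _ in range(200)]
random.shuffle(data)

def fit_generative(data):
    # Maximise the joint likelihood p(x|y)p(y): closed-form estimates of
    # the class priors, class means and a pooled variance.
    xs = {k: [x for x, y in data if y == k] for k in (1, 2)}
    pi = {k: len(xs[k]) / len(data) for k in (1, 2)}
    mu = {k: sum(xs[k]) / len(xs[k]) for k in (1, 2)}
    var = sum((x - mu[y]) ** 2 for x, y in data) / len(data)  # pooled
    # The implied log-odds log{p(y=1|x)/p(y=2|x)} is linear: beta0 + beta*x.
    beta = (mu[1] - mu[2]) / var
    beta0 = math.log(pi[1] / pi[2]) - (mu[1] ** 2 - mu[2] ** 2) / (2 * var)
    return beta0, beta

def fit_discriminative(data, steps=2000, lr=0.05):
    # Maximise the conditional likelihood p(y|x) directly:
    # gradient ascent on the logistic log-likelihood.
    beta0 = beta = 0.0
    n = len(data)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in data:
            t = 1.0 if y == 1 else 0.0
            p = 1.0 / (1.0 + math.exp(-(beta0 + beta * x)))
            g0 += (t - p)
            g1 += (t - p) * x
        beta0 += lr * g0 / n
        beta += lr * g1 / n
    return beta0, beta

def error_rate(params, data):
    beta0, beta = params
    wrong = sum(1 for x, y in data if (1 if beta0 + beta * x > 0 else 2) != y)
    return wrong / len(data)

gen = fit_generative(data)
dis = fit_discriminative(data)
print(error_rate(gen, data), error_rate(dis, data))
```

On this well-specified toy problem both fits approach the same linear decision boundary; the two paradigms differ in which likelihood they optimise, not in the form of the resulting classifier.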

An important contribution to this topic is from Ref. [6], which presented some theoretical and empirical comparisons between linear logistic regression and the naïve Bayes classifier. The results in [6] suggested that, between the two classifiers, there were two distinct regimes of discriminant performance with respect to the training-set size. More precisely, they proposed that the discriminative classifier has a lower asymptotic error rate, while the generative classifier may approach its (higher) asymptotic error rate much faster. In other words, the discriminative classifier performs better with larger training sets, while the generative classifier does better with smaller training sets. However, Ref. [3] and [7] presented some theoretical and simulation studies showing that normal-based linear discriminant analysis (LDA), a generative classifier, has better asymptotic efficiency (i.e., performs better with larger training sets) when no model mis-specification occurs. Our empirical and simulation studies, as presented in this paper, suggest that it may not be so reliable to claim such an existence of the two distinct regimes. Furthermore, we suggest that the pairing of either LDA assuming a common diagonal covariance matrix Λ (denoted by LDA-Λ hereafter) or the naïve Bayes classifier with linear logistic regression may not be perfect, and hence it may not be reliable to generalise to all generative and discriminative classifiers any claim derived from the comparison between LDA-Λ or the naïve Bayes classifier and linear logistic regression.

2. Setting for Comparison

2.1. Setting used by Ref. [6]

The setting for the theoretical proof and empirical evidence in [6] includes a binary class label y, e.g., y ∈ {1, 2}, a p-dimensional feature vector x and the assumption of conditional independence amongst x|y, the features within a class. The naïve Bayes classifier, a generative classifier defined as in equation (4) in Section 2.2, assumes statistically independent features x within classes y and thus diagonal covariance matrices within classes. By contrast, linear logistic regression, a discriminative classifier defined as in equation (1), need not assume such conditional independence of the components of x. Both classifiers can be applied to discrete, continuous or mixed-valued features x. In the case of discrete features x, each feature x_i, i = 1, ..., p, independently of the other features within x, is assumed within a class to be a binomial variable such that its value x_i ∈ {0, 1}. However, this may not guarantee that the discriminant function λ(α) = log{p(y = 1|x)/p(y = 2|x)} of the naïve Bayes classifier, where α is a parameter vector, is linear. As linear logistic regression uses a linear discriminant function, the naïve Bayes classifier may not be a partner of linear logistic regression as a generative–discriminative pair (see Section 2.2 for more discussion of this pairing). In the case of continuous features x, x|y is assumed to follow Gaussian distributions with equal covariance matrices across the two classes, i.e., Σ₁ = Σ₂, and, in view of the conditional independence assumption,

both covariance matrices are equal to a diagonal matrix Λ. Some algebra shows that, under the assumption of a common diagonal covariance matrix Λ for normally distributed data, the naïve Bayes method is equivalent to LDA-Λ (defined as equation (2)), and, under the assumption of unequal diagonal within-class covariance matrices, it is equivalent to quadratic discriminant analysis. For the experiments in [6], all of the observed values of the features are rescaled so that x_i ∈ [0, 1]. Based on such a setting, Ref. [6] compared two so-called generative–discriminative pairs: one for the continuous case, comparing LDA assuming a common diagonal covariance matrix (LDA-Λ) vs. linear logistic regression, and the other for the discrete case, comparing the naïve Bayes classifier vs. linear logistic regression. We shall next make some comments on these pairings.

2.2. On the Pairing of LDA-Λ/Naïve Bayes and Linear Logistic Regression/GAM

As mentioned in Section 2.1, first, the naïve Bayes classifier cannot guarantee the linear form of the discriminant function λ(α) = log{p(y = 1|x)/p(y = 2|x)}, and, secondly, the conditional independence amongst the multiple features within a class is a necessary condition for the validity of the naïve Bayes classifier and LDA-Λ but not for linear logistic regression, although in the latter the discriminant function λ(α) is modelled as a linear combination of separate features. Therefore, the comparison between a generative–discriminative pair of LDA-Λ/naïve Bayes classifier vs. linear logistic regression should be interpreted with caution, in particular when the data do not support the assumption of conditional

independence of x|y, which may shed unfavourable light on the simplified generative version, LDA-Λ and the naïve Bayes classifier. In this section, we illustrate two such generative–discriminative pairs: one is LDA-Λ vs. linear logistic regression [6], and the other is the naïve Bayes classifier vs. the generalised additive model (GAM) [10].

2.2.1. LDA-Λ vs. Linear Logistic Regression

Consider a feature vector x = (x₁, ..., x_p)ᵀ and a binary class label y = 1, 2. Linear logistic regression, one of the discriminative classifiers that do not assume any distribution p(x|y) for the data, is modelled directly with a linear discriminant function

λ_dis(α) = log{p(y = 1|x)/p(y = 2|x)} = log{π₁ p(x|y = 1)} − log{π₂ p(x|y = 2)} = β₀ + βᵀx,   (1)

where p(y = k) = π_k, αᵀ = (β₀, βᵀ) and β is a parameter vector of p elements. By linear, we mean a scalar-valued function of a linear combination of the features x₁, ..., x_p of an observed feature vector x. By contrast, LDA-Λ, one of the generative classifiers, assumes that the data arise from two p-variate normal distributions with different means but the same diagonal covariance matrix, such that (x|y = k; θ) ~ N(μ_k, Λ), k = 1, 2, where θ = (μ_k, Λ); this implies an assumption of conditional independence between any two features x_i|y and x_j|y, i ≠ j, within a class. The density function of (x|y = k; θ) can be written as

p(x|y = k; θ) = {e^{μ_kᵀΛ⁻¹x}} {(1/√((2π)^p |Λ|)) e^{−½ μ_kᵀΛ⁻¹μ_k}} {e^{−½ xᵀΛ⁻¹x}},

which leads to a linear discriminant function,

λ_gen(α) = log{p(y = 1|x)/p(y = 2|x)} = log(π₁/π₂) + log{A(θ₁, η)/A(θ₂, η)} + (θ₁ − θ₂)ᵀx,   (2)

where θ_kᵀ = μ_kᵀΛ⁻¹, η = Λ⁻¹ and A(θ_k, η) = (1/√((2π)^p |Λ|)) e^{−½ μ_kᵀΛ⁻¹μ_k}. Similarly, by assuming that the data arise from two p-variate normal distributions with different means but the same full covariance matrix, such that (x|y = k; θ) ~ N(μ_k, Σ), k = 1, 2, we can obtain the same formula as λ_gen(α) but with θ_kᵀ = μ_kᵀΣ⁻¹, η = Σ⁻¹ and A(θ_k, η) = (1/√((2π)^p |Σ|)) e^{−½ μ_kᵀΣ⁻¹μ_k}, which leads to the linear discriminant function of LDA with a common full covariance matrix Σ (denoted by LDA-Σ hereafter). Therefore, we could rewrite θ as θ = (θ_k, η), where θ_k is a class-dependent parameter vector while η is a common parameter vector across the classes. It is clear that the assumption of conditional independence amongst the features within a class is not a necessary condition for a generative classifier to attain a linear λ_gen(α). In fact, as pointed out by [7], if the feature vector x follows a multivariate exponential family distribution with the density or probability mass function within a class being

p(x|y = k; θ_k) = e^{θ_kᵀx} A(θ_k, η) h(x, η),   k = 1, 2,

the generative classifier will attain a linear λ_gen(α).

2.2.2. Naïve Bayes vs. Generalised Additive Model (GAM)

As with logistic regression, a GAM does not assume any distribution p(x|y) for the data; it is modelled directly with a discriminant function that is a sum of p functions f(x_i), i = 1, ..., p, of the p features x_i

separately [10]; that is,

λ_dis(α) = log{p(y = 1|x)/p(y = 2|x)} = log(π₁/π₂) + Σᵢ₌₁ᵖ f(x_i).   (3)

Meanwhile, along with the assumption of the distribution of (x|y), a fundamental assumption underlying the naïve Bayes classifier is that of conditional independence amongst the p features within a class, so that the joint probability is p(x|y) = ∏ᵢ₌₁ᵖ p(x_i|y). It follows that the discriminant function λ(α) is

λ_gen(α) = log{p(y = 1|x)/p(y = 2|x)} = log(π₁/π₂) + Σᵢ₌₁ᵖ log{p(x_i|y = 1)/p(x_i|y = 2)}.   (4)

It is clear, as pointed out by [10], that the naïve Bayes classifier is a specialised case of a GAM, with f(x_i) = log{p(x_i|y = 1)/p(x_i|y = 2)}. Furthermore, GAMs may not necessarily assume conditional independence. One sufficient condition that leads to another specialised case of a GAM (we call it Q-GAM) is that p(x|y) = q(x) ∏ᵢ₌₁ᵖ q(x_i|y), where q(x) is common across the classes but cannot be further factorised into a product of functions of individual features as ∏ᵢ₌₁ᵖ q(x_i). In such a case, the assumption of conditional independence between x_i|y and x_j|y, i ≠ j, is invalid but we still have f(x_i) = log{q(x_i|y = 1)/q(x_i|y = 2)}, where q(x_i|y) is different from the marginal probability p(x_i|y) that is used by the naïve Bayes classifier.
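The additive structure in equation (4), namely that the naïve Bayes log-odds decompose into a sum of per-feature terms f(x_i), can be verified directly. A small Python check with hypothetical within-class Bernoulli parameters (illustrative values, not from the paper):

```python
import math

# Hypothetical within-class Bernoulli parameters for p = 3 binary features:
# theta[k][i] = p(x_i = 1 | y = k); equal class priors pi_1 = pi_2.
theta = {1: [0.8, 0.3, 0.6], 2: [0.4, 0.7, 0.5]}
pi = {1: 0.5, 2: 0.5}

def p_xi(k, i, xi):
    return theta[k][i] if xi == 1 else 1.0 - theta[k][i]

def joint_log_odds(x):
    # log{pi_1 p(x|y=1) / (pi_2 p(x|y=2))} with p(x|y) = prod_i p(x_i|y).
    def logjoint(k):
        return math.log(pi[k]) + sum(math.log(p_xi(k, i, xi))
                                     for i, xi in enumerate(x))
    return logjoint(1) - logjoint(2)

def additive_log_odds(x):
    # Equation (4): log(pi_1/pi_2) + sum_i f(x_i),
    # with f(x_i) = log{p(x_i|y=1)/p(x_i|y=2)}.
    f = [math.log(p_xi(1, i, xi) / p_xi(2, i, xi)) for i, xi in enumerate(x)]
    return math.log(pi[1] / pi[2]) + sum(f)

for x in [(0, 0, 0), (1, 0, 1), (1, 1, 1)]:
    assert abs(joint_log_odds(x) - additive_log_odds(x)) < 1e-12
```

The check confirms that, under conditional independence, the joint log-odds and the GAM-style additive form coincide exactly, which is the sense in which the naïve Bayes classifier is a special case of a GAM.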

2.2.3. Summary

First, the conditional independence amongst the features within a class is a necessary condition for the naïve Bayes classifier and LDA-Λ, but it is not a necessary condition for linear logistic regression. Therefore, the generative–discriminative pair of LDA with a common full covariance matrix Σ (LDA-Σ) vs. linear logistic regression also merits investigation. Secondly, given the parity between λ_gen(α) and λ_dis(α), and thus between the two pairs LDA-Σ vs. linear logistic regression and Q-GAM vs. GAM in terms of classification: neither classifier in these pairs assumes conditional independence of x|y amongst the features within a class, which is an elementary assumption underlying LDA-Λ and the naïve Bayes classifier. Therefore, it may not be reliable to generalise to all generative and discriminative classifiers any claim that is derived from the comparison between LDA-Λ or the naïve Bayes classifier and linear logistic regression. Thirdly, a comparison of quadratic normal discriminant analysis (QDA) with unequal diagonal matrices Λ₁ and Λ₂ (denoted by QDA-Λ_g hereafter) or with unequal full covariance matrices Σ₁ and Σ₂ (denoted by QDA-Σ_g hereafter) against quadratic logistic regression may provide an interesting extension of the work of [6].

2.3. Our implementation

Ref. [6] reported experimental results on 15 real-world datasets, 8 with only continuous and binary features and 7 with only discrete features, from the UCI machine learning repository [1]; this repository stores

more than 100 datasets contributed and widely used by the machine learning community as a benchmark for empirical studies of machine learning approaches. As pointed out in [6], there were a few cases (2 out of the 8 continuous cases and 4 out of the 7 discrete cases) that did not support the better asymptotic performance of the discriminative classifier, primarily because of the lack of sufficiently large training sets. However, it is known that the performance of a classifier varies to some extent with the features selected, and a generally valid empirical evaluation of classifiers is always an important but difficult problem [4]. In this context, we first replicate experiments on these 15 datasets, with and without stepwise variable selection being performed on the full linear logistic regression model using all the observations of each dataset. In the stepwise variable selection process, the decision to include or exclude a variable is based on the Akaike information criterion (AIC). Furthermore, in the 8 continuous cases, both LDA-Λ and LDA-Σ are compared with linear logistic regression. We then extend the comparison to QDA vs. quadratic logistic regression for the 8 continuous UCI datasets, and finally to simulated continuous datasets. The implementations in R of LDA and QDA are rewritten from a Matlab function cda for classical linear and quadratic discriminant analysis [13]. Logistic regression is implemented by the R function glm from the standard R package stats, and the naïve Bayes classifier is implemented by the R function naiveBayes from the contributed R package e1071. In addition, similarly to what was done by [6], for each sampled training-set size m, we perform 1000 random splits of each dataset into

a training set of size m and a test set of size N − m, where N is the number of observations in the whole dataset, and report the average of the misclassification rates over these 1000 test sets. The training set is required to have at least 1 observation from each of the two classes and, for discrete datasets, to have all the levels of the features presented by the training observations; otherwise the prediction for the test set may be asked to predict on some new levels for which no information has been provided in the training process. In order to have all the coefficients of the predictor variables in the model estimated in our implementation of logistic regression by glm, the number m of training observations should be larger than the number p′ of predictor variables, where p′ = p for the continuous cases if all p features are used for the linear model. More attention should be paid to the discrete cases with multinomial features in the model, where more dummy variables have to be used as the predictor variables, with the consequence that p′ could be much larger than p, e.g., p′ = 3p for the linear model if all the features have 4 levels. In other words, although we may report misclassification rates for logistic regression with small m, it is not reliable for us to base any general claim on those for m smaller than p′, the actual number of predictor variables used by the logistic regression model.
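The split-and-average evaluation protocol described above can be sketched as follows. This is an illustrative Python version (the paper's implementation is in R; the toy data, the stand-in nearest-class-mean classifier, and the reduced number of splits are hypothetical choices for the sketch):

```python
import random

random.seed(1)

# Toy dataset standing in for a UCI dataset: (feature, class) pairs.
dataset = [(random.gauss(1, 1), 1) for _ in range(60)] + \
          [(random.gauss(-1, 1), 2) for _ in range(60)]

def train_and_classify(train, x):
    # Stand-in classifier: assign x to the nearer class mean.
    mu = {k: sum(f for f, y in train if y == k) /
             sum(1 for _, y in train if y == k) for k in (1, 2)}
    return 1 if abs(x - mu[1]) < abs(x - mu[2]) else 2

def average_error(dataset, m, n_splits=200):
    # Repeated random splits into a training set of size m and a test set
    # of size N - m; splits lacking an observation from each class are
    # redrawn, mirroring the constraint described in the text.
    errors = []
    while len(errors) < n_splits:
        random.shuffle(dataset)
        train, test = dataset[:m], dataset[m:]
        if len({y for _, y in train}) < 2:
            continue
        wrong = sum(1 for x, y in test if train_and_classify(train, x) != y)
        errors.append(wrong / len(test))
    return sum(errors) / n_splits

e4 = average_error(dataset, m=4)
e40 = average_error(dataset, m=40)
print(e4, e40)
```

Plotting such averaged error rates against m yields curves of the kind compared throughout Sections 3 and 4.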

3. Linear/Quadratic Discrimination for Empirical Datasets

3.1. Linear Discrimination for Continuous Datasets

For the continuous datasets, as was done by [6], all the multinomial features are removed, so that only continuous and binary features x_i are kept, and their values x_i are rescaled into [0, 1]. Any observation with missing features is removed from the datasets, as is any feature with only a single value for all the observations. In addition, as Gaussian distributions and equal within-class covariance matrices are assumed for x|y by LDA-Λ and LDA-Σ, testing such assumptions can help the interpretation of the classification performance of the relevant classifiers. Therefore, before carrying out the classification, we perform the Shapiro–Wilk test of within-class normality for each feature x_i|y [11] and Levene's test of homogeneity of variance across the two classes [5]. For the datasets discussed below, the significance level is set at 0.05, and we observe that the null hypotheses of normality and homogeneity of variance are mostly rejected by the tests at that significance level. A brief description of the continuous datasets can be found in Table I, which lists, for each dataset, the total number N₀ of observations, the number N of observations that we use after the pre-processing mentioned above, the total number p of continuous or binary features, the number p_AIC of features selected by AIC, the number p_SW of features for which the null hypotheses were rejected by the Shapiro–Wilk test and the corresponding number p_L for Levene's test, the indicator 1_{2R-Λ} ∈ {1, 0} of whether or not the two regimes

are observed between LDA-Λ and linear logistic regression, and the indicator 1_{2R-Σ} ∈ {1, 0} with regard to LDA-Σ.

Table I. Description of continuous datasets (columns N₀, N, p, p_AIC, p_SW, p_L, 1_{2R-Λ} and 1_{2R-Σ} for the datasets Pima, Adult, Boston, Optdigits 0 vs. 1, Optdigits 2 vs. 3, Ionosphere, Liver disorders and Sonar).

Note that, for some large datasets such as Adult (and Sick in Section 3.3), in order to reduce the computational complexity without degrading the validity of the comparison between the classifiers, we randomly sample observations with the class prior probabilities kept unchanged. Our results are shown in Figure 1. Since with variable selection by AIC the results conform more to the claim of two regimes made by [6], we show such results only if they are different from those without variable selection. Meanwhile, in the figures hereafter we use the same annotation of the vertical and horizontal axes and the same line types as those in [6]. For the reason given at the end of Section 2.3, Figure 1 is only drawn for m > p′, with the intercept in λ(α) taken into account. In general, our study of these continuous datasets suggests the following conclusions. First, in the comparison of LDA-Λ vs. linear logistic regression, the pattern of our results can be said to be similar to that of [6].
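The linear form of the LDA-Λ discriminant in equation (2), with β = Λ⁻¹(μ₁ − μ₂), can be checked numerically. A minimal Python sketch (the paper's experiments use R; all parameter values here are hypothetical):

```python
import math

# Hypothetical 2-D Gaussian class-conditionals with a common diagonal
# covariance Lambda, as assumed by LDA-Lambda.
mu1, mu2 = (1.0, 0.5), (-1.0, 0.0)
lam = (1.0, 2.0)          # diagonal entries of Lambda
pi1, pi2 = 0.6, 0.4

def log_density(x, mu):
    # log N(x; mu, diag(lam)) evaluated coordinate-wise.
    return sum(-0.5 * math.log(2 * math.pi * l) - (xi - mi) ** 2 / (2 * l)
               for xi, mi, l in zip(x, mu, lam))

def log_odds(x):
    # log{p(y=1|x)/p(y=2|x)} computed from the generative model.
    return math.log(pi1 / pi2) + log_density(x, mu1) - log_density(x, mu2)

# Closed form from equation (2): beta0 + beta^T x,
# with beta = Lambda^{-1}(mu1 - mu2).
beta = [(m1 - m2) / l for m1, m2, l in zip(mu1, mu2, lam)]
beta0 = math.log(pi1 / pi2) - sum((m1 ** 2 - m2 ** 2) / (2 * l)
                                  for m1, m2, l in zip(mu1, mu2, lam))

for x in [(0.0, 0.0), (1.5, -2.0), (-0.3, 4.0)]:
    linear = beta0 + sum(b * xi for b, xi in zip(beta, x))
    assert abs(log_odds(x) - linear) < 1e-12
```

The quadratic terms in the two log-densities cancel because the covariance matrix is common across classes, which is exactly what makes the generative discriminant linear and comparable to linear logistic regression.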

Figure 1. Plots of misclassification rate vs. training-set size m (averaged over 1000 random training/test set splits) for the continuous UCI datasets, with regard to linear discrimination. Panels: pima; adult; boston; optdigits 0 vs. 1 (AIC); optdigits 2 vs. 3 (AIC); ionosphere (AIC); liver disorders; sonar (AIC). Each panel compares LDA-Λ, LDA-Σ and linear logistic regression.

Secondly, the performance of LDA-Σ is worse than that of LDA-Λ when the training-set size m is small, but better than that of the latter when m is large. Thirdly, the performance of LDA-Σ is better than that of linear logistic regression when m is small, but is more or less comparable with that of the latter when m is large. Fourthly, pre-processing with variable selection can reveal the distinction in performance between generative and discriminative classifiers with fewer training observations. Therefore, considering LDA-Λ vs. linear logistic regression, there is strong evidence to support the claim that the discriminative classifier has a lower asymptotic error rate while the generative classifier may approach its (higher) asymptotic error rate much faster. However, considering LDA-Σ vs. linear logistic regression, the evidence is not so strong, although the claim may still be made.

3.2. Quadratic Discrimination on Continuous Datasets

As a natural extension of the comparison between LDA-Λ (with a common diagonal covariance matrix Λ across the two classes), LDA-Σ (with a common full covariance matrix Σ) and linear logistic regression presented in Section 3.1, this section presents the comparison between QDA-Λ_g (with two unequal diagonal covariance matrices Λ₁ and Λ₂), QDA-Σ_g (with two unequal full covariance matrices Σ₁ and Σ₂) and quadratic logistic regression. Using the 8 continuous UCI datasets, all the settings are the same as those in Section 3.1 except for the following aspects.

First, considering that in the quadratic logistic regression model there are p(p − 1)/2 interaction terms between the features in a p-dimensional feature space (a large number of interactions when the dimensionality p is high), the model is constrained to contain only the intercept, the p features and their p squared terms, so as to make the estimation of the model more feasible and interpretable. Secondly, for the same reason as explained at the end of Section 2.3, in the reported plots of misclassification rate vs. m without variable selection, only the results for m > 2p are reliable for comparison, since there are 2p predictor variables in the quadratic logistic regression model. Hence, only the results for m > 2p are shown in Figure 2. Thirdly, the datasets are randomly split into training sets and test sets 100 times rather than 1000 times for each sampled training-set size m, because of the higher computational complexity of the quadratic models compared with that of the linear models. In general, our study of these continuous datasets, as shown in Figure 2, suggests quite similar conclusions to those in Section 3.1, through substituting QDA-Λ_g for LDA-Λ, QDA-Σ_g for LDA-Σ, and quadratic logistic regression for linear logistic regression.

3.3. Linear Discrimination on Discrete Datasets

For the discrete datasets, as was done by [6], all the continuous features are removed and only the discrete features are used. The results are entitled multinomial in the following figures if a dataset includes multinomial features, and otherwise are entitled binomial. Meanwhile,

Figure 2. Plots of misclassification rate vs. training-set size m (averaged over 100 random training/test set splits) for the continuous UCI datasets, with regard to quadratic discrimination. Panels: pima (AIC); adult; boston (AIC); optdigits 0 vs. 1 (AIC); optdigits 2 vs. 3 (AIC); ionosphere (AIC); liver disorders; sonar (AIC). Each panel compares QDA with Σ₁ = Λ₁, Σ₂ = Λ₂, QDA with full Σ₁, Σ₂, and quadratic logistic regression.
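The constrained quadratic logistic model of Section 3.2 amounts to a simple design-matrix expansion: each observation contributes its p features plus their p squared terms, giving 2p predictors (hence the m > 2p requirement). A minimal sketch (Python for illustration; the paper's implementation is in R):

```python
def quadratic_design(x):
    # x: list of p feature values -> [x_1, ..., x_p, x_1^2, ..., x_p^2].
    # Cross-terms x_i * x_j (i != j) are deliberately omitted, as in the
    # constrained model of Section 3.2.
    return list(x) + [xi ** 2 for xi in x]

def n_full_quadratic_terms(p):
    # For comparison: an unconstrained quadratic model would add
    # p(p - 1)/2 interaction terms on top of the 2p kept here.
    return 2 * p + p * (p - 1) // 2

row = quadratic_design([0.2, 0.5, 1.0])
assert len(row) == 6
assert row[3:] == [0.2 ** 2, 0.5 ** 2, 1.0 ** 2]
assert n_full_quadratic_terms(3) == 9
```

Fitting ordinary linear logistic regression on the expanded design then yields the quadratic decision boundary, while keeping the parameter count at 2p + 1 rather than growing quadratically in p.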

any observation with missing features is removed from the datasets, as is any feature with only a single value for all the observations.

Table II. Description of discrete datasets (columns N₀, N, p, p′, p_AIC, p′_AIC and 1_{2R-NB} for the datasets Promoters, Lymphography, Breast cancer, Voting records, Lenses, Sick and Adult).

A brief description of the discrete datasets can be found in Table II, which includes the indicator 1_{2R-NB} ∈ {1, 0} of whether or not the two regimes are observed between the naïve Bayes classifier and linear logistic regression. Our results are shown in Figure 3 for some m > p′ or m > p′_AIC, with dummy variables taken into account for the multinomial features. In general, our study of these discrete datasets suggests that, in the comparison of the naïve Bayes classifier vs. linear logistic regression, the pattern of our results can be said to be similar to that of [6].

4. Linear Discrimination on Simulated Datasets

In this section, 16 simulated datasets are used to compare the performance of LDA-Λ, LDA-Σ and linear logistic regression. The samples are simulated from bivariate normal distributions, bivariate Student's t-distributions, bivariate log-normal distributions and mixtures of two

Figure 3. Plots of misclassification rate vs. training-set size m (averaged over 1000 random training/test set splits) for the discrete UCI datasets, with regard to linear discrimination. Panels: promoters (multinomial, AIC); lymphography (multinomial, AIC); breast cancer (multinomial); voting records (binomial, AIC); lenses hard+soft vs. no (multinomial); sick (multinomial, AIC); adult (multinomial). Each panel compares the naïve Bayes classifier and linear logistic regression.
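The dummy-variable accounting behind p′ is mechanical: a multinomial feature with L levels requires L − 1 indicator variables in the logistic model, so p four-level features yield p′ = 3p predictors, as noted in Section 2.3. A minimal encoding sketch (Python for illustration; the level names are hypothetical):

```python
def dummies(value, levels):
    # Indicator coding of one multinomial value against the first level
    # as the baseline: L levels -> L - 1 dummy variables.
    return [1 if value == lev else 0 for lev in levels[1:]]

levels = ["a", "b", "c", "d"]            # a hypothetical 4-level feature
assert dummies("a", levels) == [0, 0, 0]  # baseline level: all zeros
assert dummies("c", levels) == [0, 1, 0]
assert len(dummies("d", levels)) == len(levels) - 1

# p four-level features therefore contribute 3p predictor variables.
p = 7
p_prime = p * (len(levels) - 1)
assert p_prime == 3 * p
```

This is why, for the discrete datasets, misclassification rates reported at small m are only trustworthy once m exceeds p′ rather than p.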

bivariate normal distributions, with 4 datasets for each of these 4 types of distribution. Within each dataset there are 1000 simulated samples, which are divided equally into 2 classes. The simulations from the bivariate log-normal distributions and normal mixtures are based on the R function mvrnorm for simulating from a multivariate normal distribution, from the contributed R package MASS, and the simulation from the bivariate Student's t-distribution is implemented by the R function rmvt from the contributed R package mvtnorm. Differently from the UCI datasets, the simulated data are not rescaled into the range [0, 1], and no variable selection is used since the feature space is only of dimension two.

4.1. Normally Distributed Data

Four simulated datasets are randomly generated from two bivariate normal distributions, N(μ₁, Σ₁) and N(μ₂, Σ₂), where μ₁ = (1, 0)ᵀ, μ₂ = (−1, 0)ᵀ and Σ₁ and Σ₂ are subject to four different types of constraint: equal diagonal or equal full covariance matrices (Σ₁ = Σ₂), and unequal diagonal or unequal full covariance matrices (Σ₁ ≠ Σ₂). Similarly to what was done for the UCI datasets, for each sampled training-set size m, we perform 1000 random splits of the 1000 samples of each simulated dataset into a training set of size m and a test set of size 1000 − m, and report the average misclassification rates over these 1000 test sets. The training set is required to have at least 1 sample from each of the two classes. In such a way, LDA-Λ and LDA-Σ are

compared with linear logistic regression, in terms of misclassification rate, with the results shown in Figure 4.

Figure 4. Plots of misclassification rate vs. training-set size m (averaged over 1000 random training/test set splits) for simulated bivariate normally distributed data for two classes. Panels: Normal: Σ₁ = Λ, Σ₂ = Λ; Normal: Σ₁ = Σ, Σ₂ = Σ, Σ ≠ Λ; Normal: Σ₁ = Λ₁, Σ₂ = Λ₂, Σ₁ ≠ Σ₂; Normal: Σ₁ ≠ Λ₁, Σ₂ ≠ Λ₂, Σ₁ ≠ Σ₂. Each panel compares LDA-Λ, LDA-Σ and linear logistic regression.

The dataset for the top-left panel of Figure 4 has Σ₁ = Σ₂ = Λ with the diagonal matrix Λ = Diag(1, 1), such that the data satisfy the assumptions underlying LDA-Λ. The dataset for the top-right panel has Σ₁ = Σ₂ = Σ with the full matrix Σ = (1, 0.5; 0.5, 1), such that the data satisfy the assumptions underlying LDA-Σ. The dataset for the bottom-left panel has Σ₁ = Λ₁, Σ₂ = Λ₂ with the diagonal matrices Λ₁ = Diag(1, 1) and Λ₂ = Diag(0.25, 0.75), such that the homogeneity of the covariance matrices is violated. The dataset for the bottom-right panel has

unequal full matrices Σ₁ and Σ₂, such that both the homogeneity of the covariance matrices and the conditional independence (uncorrelatedness) of the features within a class are violated.

4.2. Student's t-distributed Data

Four simulated datasets are randomly generated from two bivariate Student's t-distributions, both with ν = 3 degrees of freedom. The values of the class means μ₁ and μ₂, the four types of constraint on Σ₁ and Σ₂, and the other settings of the experiments are all the same as those in Section 4.1. The results are shown in Figure 5, where for each panel the constraint on Σ₁ and Σ₂ is the same as the corresponding one in Figure 4, except for a scalar multiplier ν/(ν − 2).

4.3. Log-normally Distributed Data

Four simulated datasets are randomly generated from two bivariate log-normal distributions, whose logarithms are normally distributed as N(μ₁, Σ₁) and N(μ₂, Σ₂), respectively. The values of μ₁ and μ₂, the four types of constraint on Σ₁ and Σ₂, and the other settings of the experiments are all the same as those in Section 4.1. By definition, if a p-variate random vector x ~ N(μ(x), Σ(x)), then the p-variate vector x̃ of the exponentials of the components of x follows a p-variate log-normal distribution, i.e., x̃ = exp(x) ~ log-N(μ(x̃), Σ(x̃)), where the i-th element μ^(i)(x̃) of the mean vector and the (i, j)-th

Figure 5. Plots of misclassification rate vs. training-set size m (averaged over 1000 random training/test set splits) for simulated bivariate Student's t-distributed data for two classes. Panels: Student's t: Σ₁ = Λ, Σ₂ = Λ; Student's t: Σ₁ = Σ, Σ₂ = Σ, Σ ≠ Λ; Student's t: Σ₁ = Λ₁, Σ₂ = Λ₂, Σ₁ ≠ Σ₂; Student's t: Σ₁ ≠ Λ₁, Σ₂ ≠ Λ₂, Σ₁ ≠ Σ₂. Each panel compares LDA-Λ, LDA-Σ and linear logistic regression.

element Σ^(i,j)(x̃) of the covariance matrix, i, j = 1, ..., p, are

μ^(i)(x̃) = e^{μ^(i)(x) + Σ^(i,i)(x)/2},
Σ^(i,j)(x̃) = (e^{Σ^(i,j)(x)} − 1) e^{μ^(i)(x) + μ^(j)(x) + (Σ^(i,i)(x) + Σ^(j,j)(x))/2}.

It follows that, if the components of its logarithm x are independent and normally distributed, the components of the log-normally distributed multivariate random variable x̃ are uncorrelated. In other words, if x ~ N(μ(x), Λ(x)), then x̃ = exp(x) ~ log-N(μ(x̃), Λ(x̃)). However, as shown by the equations above, Λ(x̃) is determined by both μ(x) and Λ(x), so that Σ₁(x) = Σ₂(x) may not mean Σ₁(x̃) = Σ₂(x̃).
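The two consequences just drawn from the moment formulas can be checked directly: a diagonal Λ(x) produces uncorrelated log-normal components, whereas a common Σ(x) with unequal class means produces unequal Σ(x̃). A small Python check of the displayed formulas (illustrative parameter values, not the paper's):

```python
import math

def lognormal_moments(mu, Sigma):
    # Mean vector and covariance matrix of exp(x) for x ~ N(mu, Sigma),
    # following the displayed formulas.
    p = len(mu)
    mean = [math.exp(mu[i] + Sigma[i][i] / 2) for i in range(p)]
    cov = [[(math.exp(Sigma[i][j]) - 1)
            * math.exp(mu[i] + mu[j] + (Sigma[i][i] + Sigma[j][j]) / 2)
            for j in range(p)] for i in range(p)]
    return mean, cov

# Diagonal Lambda(x) gives uncorrelated log-normal components
# (exp(0) - 1 = 0 kills every off-diagonal entry):
_, cov = lognormal_moments([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
assert cov[0][1] == 0.0

# Equal Sigma(x) across classes but unequal means mu_1 != mu_2
# gives unequal covariance matrices for the log-normal data:
S = [[1.0, 0.5], [0.5, 1.0]]
_, cov1 = lognormal_moments([1.0, 0.0], S)
_, cov2 = lognormal_moments([-1.0, 0.0], S)
assert cov1 != cov2
```

This is exactly why, for the log-normal datasets, the equal-vs-unequal covariance distinction in the underlying normal distributions is less meaningful than the diagonal-vs-full distinction.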

Therefore, if we consider that in our cases µ1 ≠ µ2, it can be expected that the pattern of performance of the classifiers for the datasets with equal covariance matrices Σ1 = Σ2 in the underlying normal distributions will be similar to that for the datasets with unequal covariance matrices Σ1 ≠ Σ2, since in both cases the covariance matrices of the log-normally distributed variables are in fact unequal. In this context, it makes more sense to compare the classifiers in situations with diagonal and full covariance matrices of the underlying normally distributed data, respectively, rather than in situations with equal and unequal covariance matrices.

[Figure 6: four panels, one per covariance constraint for the log-normal data (Σ1 = Λ, Σ2 = Λ; Σ1 = Σ2 = Σ, Σ ≠ Λ; Σ1 = Λ1, Σ2 = Λ2, Σ1 ≠ Σ2; Σ1 ≠ Λ1, Σ2 ≠ Λ2, Σ1 ≠ Σ2), each comparing the LDAs with linear logistic regression.]

Figure 6. Plots of misclassification rate vs. training-set size (averaged over 1000 random training/test set splits) for simulated bivariate log-normally distributed data for two classes.

The results are shown in Figure 6, where for each panel the constraint with regard to Σ1 and Σ2 is the same as the corresponding one in Figure 4.

4.4. Normal Mixture Data

Compared with the normal, Student's t- and log-normal distributions used in Sections 4.1, 4.2 and 4.3 for the comparison of the classifiers, a mixture of normal distributions is a better approximation to real data in a variety of situations. In this section, four simulated datasets, each consisting of 1000 samples, are randomly generated from two mixtures, each of two bivariate normal distributions, with 250 samples from each mixture component. The two components, A and B, of the mixture for Class 1 are normally distributed as N(µ1A, Σ1) and N(µ1B, Σ1), respectively, where µ1A = (1, 0)T and µ1B = (3, 0)T; the two components, C and D, of the mixture for Class 2 are normally distributed as N(µ2C, Σ2) and N(µ2D, Σ2), respectively, where µ2C = (−1, 0)T and µ2D = (−3, 0)T. In this way, when Σ1 and Σ2 are subject to the four different types of constraint discussed previously, the covariance matrices of the two mixtures will be subject to the same constraints. The other settings of the experiments are all the same as those in Section 4.1. The results are shown in Figure 7, where for each panel the constraint with regard to Σ1 and Σ2 is the same as the corresponding one in Figure 4.
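The sampling scheme just described can be sketched as follows (our own illustration, assuming NumPy; the function and variable names are ours, and a common diagonal covariance Λ is used, i.e., the first of the four constraints):

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_mixture_class(means, Sigma, n_per_component, rng):
    """Draw n_per_component points from each N(mean, Sigma) component."""
    parts = [rng.multivariate_normal(m, Sigma, size=n_per_component) for m in means]
    return np.vstack(parts)

# Constraint of the first panel: Sigma1 = Sigma2 = Lambda (common, diagonal).
Lambda = np.diag([1.0, 1.0])

X1 = sample_mixture_class([(1.0, 0.0), (3.0, 0.0)], Lambda, 250, rng)    # Class 1: A, B
X2 = sample_mixture_class([(-1.0, 0.0), (-3.0, 0.0)], Lambda, 250, rng)  # Class 2: C, D

X = np.vstack([X1, X2])        # 1000 samples in total
y = np.repeat([1, 2], 500)
print(X.shape, y.shape)
```

Since each class is an equal-weight mixture of two components, the overall class means sit at (2, 0)T and (−2, 0)T, and the class covariance matrices inherit the constraint imposed on Σ1 and Σ2.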

[Figure 7: four panels, one per covariance constraint for the normal mixture data (Σ1 = Λ, Σ2 = Λ; Σ1 = Σ2 = Σ, Σ ≠ Λ; Σ1 = Λ1, Σ2 = Λ2, Σ1 ≠ Σ2; Σ1 ≠ Λ1, Σ2 ≠ Λ2, Σ1 ≠ Σ2), each comparing the LDAs with linear logistic regression.]

Figure 7. Plots of misclassification rate vs. training-set size (averaged over 1000 random training/test set splits) for simulated bivariate 2-component normal mixture data for two classes.

4.5. Summary of Linear Discrimination for Simulated Datasets

In general, our study of these simulated continuous datasets suggests the following conclusions.

First, when the data are consistent with the assumptions underlying LDA-Λ or LDA-Σ, as shown in the top-left and top-right panels of Figure 4, both methods can perform best among the LDAs and linear logistic regression, throughout the range of training-set sizes in our study. In these cases, there is no evidence to support the claim that the discriminative classifier has a lower asymptotic rate while the

generative classifier may approach its (higher) asymptotic rate much faster.

Secondly, when the data violate the assumptions underlying the LDAs, linear logistic regression generally performs better than the LDAs, in particular when m is large. This pattern is especially clear in Figure 6 for the log-normally distributed data, whose distributions are heavy-tailed, asymmetric and thus in some sense less Gaussian than the Student's t- and normal-mixture distributions in our experiments. In this case, there is strong evidence to support the claim that the discriminative classifier has a lower asymptotic rate, but no convincing evidence to support the claim that the generative classifier may approach its (higher) asymptotic rate much faster.

Finally, when the covariance matrices are non-diagonal, LDA-Σ performs remarkably better than LDA-Λ, and the more so when m is large; when the covariance matrices are diagonal, LDA-Λ generally performs better than LDA-Σ, and again the more so when m is large.

5. Comments on the Two Regimes of Performance regarding Training-Set Size

Based on their theoretical analysis and empirical comparison between LDA-Λ or the naïve Bayes classifier and linear logistic regression, Ref. [6] claim that there are two distinct regimes of performance with regard to the training-set size. However, our empirical results, as shown in Tables I and II, could not convincingly support this claim. Furthermore, our simulation studies, as presented in Section 4, failed to

find the two regimes when the data either conformed to the assumptions underlying the generative classifiers, as shown in Figure 4, or heavily violated those assumptions, as shown in Figure 6. Therefore, besides commenting on the pairing of the compared classifiers in Section 2.2, we shall clarify the claim further by commenting on the reliability of the two regimes.

Suppose we have a training set {(y(i)tr, x(i)tr)}m i=1 of m independent observations and a test set {(y(i)te, x(i)te)}N i=1 of N independent observations, where x(i) = (x(i)1, ..., x(i)p)T is the i-th observed p-variate feature vector and y(i) ∈ {1, 2} is its observed univariate class label. Let us also assume that each observation (y(i), x(i)) follows an identical distribution, so that testing based on the training results makes sense. In order to simplify the notation, let xtr denote {x(i)tr}m i=1, and similarly define xte, ytr and yte. Meanwhile, a discriminant function λ(α) = log{p(y = 1|x)/p(y = 2|x)}, which is equivalent to a Bayes classifier ŷ(x) = argmaxy p(y|x), is used for the 2-class classification.

5.1. For Discriminative Classifiers

Discriminative classifiers estimate the parameter α of the discriminant function λ(α) through α̂ = argmaxα p(ytr|xtr, α), the maximisation of a conditional probability; such an estimation procedure can be regarded as a kind of maximum likelihood estimation with p(ytr|xtr, α) as the likelihood function. It is well known that, if the 0-1 loss function is used, so that the misclassification rate is the total risk, the Bayes classifiers will attain the minimum rate [9]. This implies that, under such a loss function, the discriminative classifiers are in fact

using the same criterion to optimise both the estimation of the parameter α and the performance of classification. In this context, the following claims, supported by the simulation study in Section 4, can be proposed.

First, if the same dataset is used to train and to test, i.e., with xtr as xte and ytr as yte, then the discriminative classifiers should always provide the best performance, no matter how large the training-set size m is, provided that the 0-1 loss function is used and the modelling of p(y|x, α), such as the linearity of λ(α), is correctly specified for all the observations; the only work that then remains is to estimate the parameter α accurately.

Secondly, if m is large enough to make (ytr, xtr) representative of all the observations, including (yte, xte), then the discriminative classifiers should also provide the best prediction performance on (yte, xte), i.e., the best asymptotic performance, provided that the modelling of p(y|x, α) is correctly specified for all the observations.

Finally, if m is not large enough to make (ytr, xtr) representative of all the observations, and (yte, xte) is not exactly the same as (ytr, xtr), then the discriminative classifiers may not necessarily provide the best prediction performance on (yte, xte), even though the modelling of p(y|x, α) may be correct.

5.2. For Generative Classifiers

Generative classifiers estimate the parameter α of the discriminant function λ(α) by first maximising a joint probability function, i.e., θ̂ = argmaxθ p(ytr, xtr|θ), to obtain a maximum likelihood estimate

(MLE) θ̂ of θ, the parameter of the joint distribution of (y, x), and then calculating α̂ as the function α(θ) evaluated at θ̂. Under some regularity conditions, such as the existence of the first and second derivatives of the log-likelihood function and of the inverse of the Fisher information matrix I(θ), the MLE θ̂ is asymptotically unbiased, efficient and normally distributed. Accordingly, by the delta method, α̂ is also asymptotically normally distributed, unbiased and efficient, given the existence of the first derivative of the function α(θ). Therefore, the following claims, supported by the simulation study in Section 4, can be proposed.

First, asymptotically, the generative classifiers will provide the best prediction performance on (yte, xte), dependent on the premise that the modelling of p(y, x|θ), instead of p(y|x, α), is correctly specified for all the observations.

Secondly, if m is large enough to make (ytr, xtr) representative of all the observations, including (yte, xte), then the generative classifiers should also provide the best prediction performance on (yte, xte), i.e., the best asymptotic performance, given that the modelling of p(y, x|θ), instead of p(y|x, α), is correctly specified for all the observations.

Finally, if m is not large enough to make (ytr, xtr) representative of all the observations, then the generative classifiers may not necessarily provide the best prediction performance on (yte, xte).
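The two estimation routes can be made concrete with a small sketch (our own toy illustration, assuming NumPy; the data, step size and iteration count are ours, not the paper's). Both procedures end in the same form of linear discriminant λ(x) = α0 + α1x1 + α2x2: the discriminative route maximises the conditional likelihood p(ytr|xtr, α) directly by gradient ascent, while the generative route first obtains the MLE θ̂ = (π̂, µ̂1, µ̂2, Σ̂) of the joint normal model and then plugs it into α(θ).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class Gaussian data with a common covariance (m = 400 in total).
mu1, mu2, Sigma = np.array([1.0, 0.0]), np.array([-1.0, 0.0]), np.eye(2)
X = np.vstack([rng.multivariate_normal(mu1, Sigma, 200),
               rng.multivariate_normal(mu2, Sigma, 200)])
y = np.repeat([1.0, 0.0], 200)              # 1 encodes class 1, 0 encodes class 2

# Discriminative route: maximise p(y_tr | x_tr, alpha) by gradient ascent
# on the logistic log-likelihood (a plain linear logistic regression fit).
Z = np.hstack([np.ones((len(y), 1)), X])    # prepend an intercept column
alpha_disc = np.zeros(3)
for _ in range(4000):
    p = 1.0 / (1.0 + np.exp(-np.clip(Z @ alpha_disc, -30, 30)))
    alpha_disc += 0.1 * Z.T @ (y - p) / len(y)

# Generative route: MLE of theta = (pi, mu1, mu2, Sigma) for the joint model,
# then alpha(theta): the normal-based LDA plug-in discriminant coefficients.
pi1 = y.mean()
m1_hat, m2_hat = X[y == 1].mean(axis=0), X[y == 0].mean(axis=0)
R1, R2 = X[y == 1] - m1_hat, X[y == 0] - m2_hat
S = (R1.T @ R1 + R2.T @ R2) / len(y)        # pooled MLE covariance
w = np.linalg.solve(S, m1_hat - m2_hat)
b = -0.5 * (m1_hat + m2_hat) @ w + np.log(pi1 / (1.0 - pi1))
alpha_gen = np.concatenate([[b], w])

print(alpha_disc, alpha_gen)                # similar, but not identical, estimates
```

On correctly specified normal data the two estimates of α agree asymptotically; on finite samples they differ, which is exactly where the efficiency and robustness arguments of Sections 5.1 and 5.2 apply.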

5.3. Summary

In summary, it may not be so reliable to claim the existence of two distinct regimes of performance between the generative and discriminative classifiers with regard to the training-set size. For real-world datasets such as those considered in Sections 3.1 and 3.3, so far there is no theoretically correct, general criterion for choosing between the discriminative and the generative classifiers; the choice depends on the relative confidence we have in the correctness of the specification of either p(y|x) or p(y, x) for the data. This can to some extent explain why Refs. [3] and [7] prefer LDA when no model mis-specification occurs but other empirical studies may prefer linear logistic regression instead.

Ref. [6] provided a theoretical proof that the discriminative classifiers need Ω(p) (i.e., M1 p, where M1 > 0) training observations to approach their asymptotic rate with high probability, whereas the generative classifiers need only Ω(log p) (i.e., M2 log p, where M2 > 0) training observations. We observe the following. First, for two distinct regimes to occur, it is necessary that M1 p ≫ M2 log p. Secondly, such higher efficiency of the generative classifiers might also be attained because of the bias induced by model mis-specification, such as using LDA-Λ or the naïve Bayes classifier in cases for which it would be better to adopt LDA-Σ or QDA-Σg. For real-world data, application of such a mis-specified model is likely; the bias-variance tradeoff may then play a role in determining the occurrence of the two distinct regimes.
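This bias-variance interaction is easy to exhibit in a small simulation in the spirit of Section 4 (our own toy setting, assuming NumPy; it is not one of the experiments above). The within-class features are correlated, so the diagonal model behind LDA-Λ is mis-specified: its plug-in discriminant is low-variance but biased, while logistic regression fitted by gradient ascent carries no such bias.

```python
import numpy as np

rng = np.random.default_rng(1)

Sigma = np.array([[1.0, 0.8], [0.8, 1.0]])          # correlated within-class features
mu1, mu2 = np.array([1.0, 0.0]), np.array([-1.0, 0.0])

def sample(m):
    X = np.vstack([rng.multivariate_normal(mu1, Sigma, m // 2),
                   rng.multivariate_normal(mu2, Sigma, m // 2)])
    return X, np.repeat([1.0, 0.0], m // 2)

def fit_logreg(X, y, iters=3000, lr=0.5):           # discriminative fit
    Z = np.hstack([np.ones((len(y), 1)), X])
    a = np.zeros(Z.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-np.clip(Z @ a, -30, 30)))
        a += lr * Z.T @ (y - p) / len(y)
    return a

def fit_lda_diag(X, y):                             # generative fit, diagonal model
    m1, m2 = X[y == 1].mean(axis=0), X[y == 0].mean(axis=0)
    v = (((X[y == 1] - m1) ** 2).sum(axis=0)
         + ((X[y == 0] - m2) ** 2).sum(axis=0)) / len(y)
    w = (m1 - m2) / v
    return np.concatenate([[-0.5 * (m1 + m2) @ w], w])

def error(a, X, y):
    Z = np.hstack([np.ones((len(y), 1)), X])
    return float(((Z @ a > 0) != (y == 1)).mean())

results = {}
for m in (20, 400):
    e_log, e_diag = [], []
    for _ in range(30):
        Xtr, ytr = sample(m)
        Xte, yte = sample(2000)
        e_log.append(error(fit_logreg(Xtr, ytr), Xte, yte))
        e_diag.append(error(fit_lda_diag(Xtr, ytr), Xte, yte))
    results[m] = (float(np.mean(e_log)), float(np.mean(e_diag)))
print(results)   # {m: (logistic error, LDA-Lambda error)}
```

At m = 400 the mis-specification bias dominates and logistic regression wins clearly; at m = 20 the gap is governed by the tradeoff between LDA-Λ's low variance and its bias, which is the effect suggested above.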

In addition, a similar pattern of two distinct regimes with regard to m was also reported in Ref. [8], based on the performance of logistic regression and tree induction; they found that logistic regression performs better with smaller m and tree induction with larger m. Therefore, although tree induction and logistic regression are not a pair of generative and discriminative classifiers, it could be interesting to explore such a pattern for other pairs of classifiers.

Acknowledgements

The authors thank Andrew Y. Ng for communication about the implementation of the empirical studies in this paper. We are grateful to the associate editor, the anonymous referees and David J. Hand for advice about the structure, references, experimental illustration and interpretation of this manuscript. The work also benefited from our participation in the Research Programme on Statistical Theory and Methods for Complex, High-Dimensional Data at the Isaac Newton Institute for Mathematical Sciences in Cambridge.

References

1. Asuncion, A. and D. J. Newman: 2007, UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science. mlearn/MLRepository.html.
2. Dawid, A. P.: 1976, Properties of diagnostic data distributions. Biometrics 32(3).

3. Efron, B.: 1975, The Efficiency of Logistic Regression Compared to Normal Discriminant Analysis. Journal of the American Statistical Association 70(352).
4. Hand, D. J.: 2006, Classifier technology and the illusion of progress (with discussion). Statistical Science 21.
5. Lim, T.-S. and W.-Y. Loh: 1996, A comparison of tests of equality of variances. Computational Statistics & Data Analysis 22(3).
6. Ng, A. Y. and M. I. Jordan: 2001, On discriminative vs. generative classifiers: a comparison of logistic regression and naive Bayes. In: NIPS.
7. O'Neill, T. J.: 1980, The general distribution of the rate of a classification procedure with application to logistic regression discrimination. Journal of the American Statistical Association 75(369).
8. Perlich, C., F. Provost, and J. S. Simonoff: 2003, Tree induction vs. logistic regression: a learning-curve analysis. Journal of Machine Learning Research 4.
9. Ripley, B. D.: 1996, Pattern Recognition and Neural Networks. New York: Cambridge University Press.
10. Rubinstein, Y. D. and T. Hastie: 1997, Discriminative vs. informative learning. In: KDD.
11. Shapiro, S. S. and M. B. Wilk: 1965, An analysis of variance test for normality (complete samples). Biometrika 52(3-4).
12. Titterington, D. M., G. D. Murray, L. S. Murray, D. J. Spiegelhalter, A. M. Skene, J. D. F. Habbema, and G. J. Gelpke: 1981, Comparison of Discrimination Techniques Applied to a Complex Data Set of Head Injured Patients (with discussion). Journal of the Royal Statistical Society, Series A (General) 144(2).
13. Verboven, S. and M. Hubert: 2005, LIBRA: a MATLAB Library for Robust Analysis. Chemometrics and Intelligent Laboratory Systems 75(2).


The AGA Evaluating Model of Customer Loyalty Based on E-commerce Environment 6 JOURNAL OF SOFTWARE, VOL. 4, NO. 3, MAY 009 The AGA Evaluating Model of Custoer Loyalty Based on E-coerce Environent Shaoei Yang Econoics and Manageent Departent, North China Electric Power University,

More information

ASIC Design Project Management Supported by Multi Agent Simulation

ASIC Design Project Management Supported by Multi Agent Simulation ASIC Design Project Manageent Supported by Multi Agent Siulation Jana Blaschke, Christian Sebeke, Wolfgang Rosenstiel Abstract The coplexity of Application Specific Integrated Circuits (ASICs) is continuously

More information

An Approach to Combating Free-riding in Peer-to-Peer Networks

An Approach to Combating Free-riding in Peer-to-Peer Networks An Approach to Cobating Free-riding in Peer-to-Peer Networks Victor Ponce, Jie Wu, and Xiuqi Li Departent of Coputer Science and Engineering Florida Atlantic University Boca Raton, FL 33431 April 7, 2008

More information

Evaluating Inventory Management Performance: a Preliminary Desk-Simulation Study Based on IOC Model

Evaluating Inventory Management Performance: a Preliminary Desk-Simulation Study Based on IOC Model Evaluating Inventory Manageent Perforance: a Preliinary Desk-Siulation Study Based on IOC Model Flora Bernardel, Roberto Panizzolo, and Davide Martinazzo Abstract The focus of this study is on preliinary

More information

A magnetic Rotor to convert vacuum-energy into mechanical energy

A magnetic Rotor to convert vacuum-energy into mechanical energy A agnetic Rotor to convert vacuu-energy into echanical energy Claus W. Turtur, University of Applied Sciences Braunschweig-Wolfenbüttel Abstract Wolfenbüttel, Mai 21 2008 In previous work it was deonstrated,

More information

The Virtual Spring Mass System

The Virtual Spring Mass System The Virtual Spring Mass Syste J. S. Freudenberg EECS 6 Ebedded Control Systes Huan Coputer Interaction A force feedbac syste, such as the haptic heel used in the EECS 6 lab, is capable of exhibiting a

More information

Exploiting Hardware Heterogeneity within the Same Instance Type of Amazon EC2

Exploiting Hardware Heterogeneity within the Same Instance Type of Amazon EC2 Exploiting Hardware Heterogeneity within the Sae Instance Type of Aazon EC2 Zhonghong Ou, Hao Zhuang, Jukka K. Nurinen, Antti Ylä-Jääski, Pan Hui Aalto University, Finland; Deutsch Teleko Laboratories,

More information

Research Article Performance Evaluation of Human Resource Outsourcing in Food Processing Enterprises

Research Article Performance Evaluation of Human Resource Outsourcing in Food Processing Enterprises Advance Journal of Food Science and Technology 9(2): 964-969, 205 ISSN: 2042-4868; e-issn: 2042-4876 205 Maxwell Scientific Publication Corp. Subitted: August 0, 205 Accepted: Septeber 3, 205 Published:

More information

Online Appendix I: A Model of Household Bargaining with Violence. In this appendix I develop a simple model of household bargaining that

Online Appendix I: A Model of Household Bargaining with Violence. In this appendix I develop a simple model of household bargaining that Online Appendix I: A Model of Household Bargaining ith Violence In this appendix I develop a siple odel of household bargaining that incorporates violence and shos under hat assuptions an increase in oen

More information

COMBINING CRASH RECORDER AND PAIRED COMPARISON TECHNIQUE: INJURY RISK FUNCTIONS IN FRONTAL AND REAR IMPACTS WITH SPECIAL REFERENCE TO NECK INJURIES

COMBINING CRASH RECORDER AND PAIRED COMPARISON TECHNIQUE: INJURY RISK FUNCTIONS IN FRONTAL AND REAR IMPACTS WITH SPECIAL REFERENCE TO NECK INJURIES COMBINING CRASH RECORDER AND AIRED COMARISON TECHNIQUE: INJURY RISK FUNCTIONS IN FRONTAL AND REAR IMACTS WITH SECIAL REFERENCE TO NECK INJURIES Anders Kullgren, Maria Krafft Folksa Research, 66 Stockhol,

More information

Partitioning Data on Features or Samples in Communication-Efficient Distributed Optimization?

Partitioning Data on Features or Samples in Communication-Efficient Distributed Optimization? Partitioning Data on Features or Saples in Counication-Efficient Distributed Optiization? Chenxin Ma Industrial and Systes Engineering Lehigh University, USA ch54@lehigh.edu Martin Taáč Industrial and

More information

Construction Economics & Finance. Module 3 Lecture-1

Construction Economics & Finance. Module 3 Lecture-1 Depreciation:- Construction Econoics & Finance Module 3 Lecture- It represents the reduction in arket value of an asset due to age, wear and tear and obsolescence. The physical deterioration of the asset

More information

A CHAOS MODEL OF SUBHARMONIC OSCILLATIONS IN CURRENT MODE PWM BOOST CONVERTERS

A CHAOS MODEL OF SUBHARMONIC OSCILLATIONS IN CURRENT MODE PWM BOOST CONVERTERS A CHAOS MODEL OF SUBHARMONIC OSCILLATIONS IN CURRENT MODE PWM BOOST CONVERTERS Isaac Zafrany and Sa BenYaakov Departent of Electrical and Coputer Engineering BenGurion University of the Negev P. O. Box

More information

Factored Models for Probabilistic Modal Logic

Factored Models for Probabilistic Modal Logic Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence (2008 Factored Models for Probabilistic Modal Logic Afsaneh Shirazi and Eyal Air Coputer Science Departent, University of Illinois

More information

International Journal of Management & Information Systems First Quarter 2012 Volume 16, Number 1

International Journal of Management & Information Systems First Quarter 2012 Volume 16, Number 1 International Journal of Manageent & Inforation Systes First Quarter 2012 Volue 16, Nuber 1 Proposal And Effectiveness Of A Highly Copelling Direct Mail Method - Establishent And Deployent Of PMOS-DM Hisatoshi

More information

MINIMUM VERTEX DEGREE THRESHOLD FOR LOOSE HAMILTON CYCLES IN 3-UNIFORM HYPERGRAPHS

MINIMUM VERTEX DEGREE THRESHOLD FOR LOOSE HAMILTON CYCLES IN 3-UNIFORM HYPERGRAPHS MINIMUM VERTEX DEGREE THRESHOLD FOR LOOSE HAMILTON CYCLES IN 3-UNIFORM HYPERGRAPHS JIE HAN AND YI ZHAO Abstract. We show that for sufficiently large n, every 3-unifor hypergraph on n vertices with iniu

More information

Software Quality Characteristics Tested For Mobile Application Development

Software Quality Characteristics Tested For Mobile Application Development Thesis no: MGSE-2015-02 Software Quality Characteristics Tested For Mobile Application Developent Literature Review and Epirical Survey WALEED ANWAR Faculty of Coputing Blekinge Institute of Technology

More information

On Computing Nearest Neighbors with Applications to Decoding of Binary Linear Codes

On Computing Nearest Neighbors with Applications to Decoding of Binary Linear Codes On Coputing Nearest Neighbors with Applications to Decoding of Binary Linear Codes Alexander May and Ilya Ozerov Horst Görtz Institute for IT-Security Ruhr-University Bochu, Gerany Faculty of Matheatics

More information

Real Time Target Tracking with Binary Sensor Networks and Parallel Computing

Real Time Target Tracking with Binary Sensor Networks and Parallel Computing Real Tie Target Tracking with Binary Sensor Networks and Parallel Coputing Hong Lin, John Rushing, Sara J. Graves, Steve Tanner, and Evans Criswell Abstract A parallel real tie data fusion and target tracking

More information

An Optimal Task Allocation Model for System Cost Analysis in Heterogeneous Distributed Computing Systems: A Heuristic Approach

An Optimal Task Allocation Model for System Cost Analysis in Heterogeneous Distributed Computing Systems: A Heuristic Approach An Optial Tas Allocation Model for Syste Cost Analysis in Heterogeneous Distributed Coputing Systes: A Heuristic Approach P. K. Yadav Central Building Research Institute, Rooree- 247667, Uttarahand (INDIA)

More information

Modeling Parallel Applications Performance on Heterogeneous Systems

Modeling Parallel Applications Performance on Heterogeneous Systems Modeling Parallel Applications Perforance on Heterogeneous Systes Jaeela Al-Jaroodi, Nader Mohaed, Hong Jiang and David Swanson Departent of Coputer Science and Engineering University of Nebraska Lincoln

More information

Implementation of Active Queue Management in a Combined Input and Output Queued Switch

Implementation of Active Queue Management in a Combined Input and Output Queued Switch pleentation of Active Queue Manageent in a obined nput and Output Queued Switch Bartek Wydrowski and Moshe Zukeran AR Special Research entre for Ultra-Broadband nforation Networks, EEE Departent, The University

More information

ADJUSTING FOR QUALITY CHANGE

ADJUSTING FOR QUALITY CHANGE ADJUSTING FOR QUALITY CHANGE 7 Introduction 7.1 The easureent of changes in the level of consuer prices is coplicated by the appearance and disappearance of new and old goods and services, as well as changes

More information

The Model of Lines for Option Pricing with Jumps

The Model of Lines for Option Pricing with Jumps The Model of Lines for Option Pricing with Jups Claudio Albanese, Sebastian Jaiungal and Ditri H Rubisov January 17, 21 Departent of Matheatics, University of Toronto Abstract This article reviews a pricing

More information

Evaluating Software Quality of Vendors using Fuzzy Analytic Hierarchy Process

Evaluating Software Quality of Vendors using Fuzzy Analytic Hierarchy Process IMECS 2008 9-2 March 2008 Hong Kong Evaluating Software Quality of Vendors using Fuzzy Analytic Hierarchy Process Kevin K.F. Yuen* Henry C.W. au Abstract This paper proposes a fuzzy Analytic Hierarchy

More information

Information Processing Letters

Information Processing Letters Inforation Processing Letters 111 2011) 178 183 Contents lists available at ScienceDirect Inforation Processing Letters www.elsevier.co/locate/ipl Offline file assignents for online load balancing Paul

More information

Binary Embedding: Fundamental Limits and Fast Algorithm

Binary Embedding: Fundamental Limits and Fast Algorithm Binary Ebedding: Fundaental Liits and Fast Algorith Xinyang Yi The University of Texas at Austin yixy@utexas.edu Eric Price The University of Texas at Austin ecprice@cs.utexas.edu Constantine Caraanis

More information

Markovian inventory policy with application to the paper industry

Markovian inventory policy with application to the paper industry Coputers and Cheical Engineering 26 (2002) 1399 1413 www.elsevier.co/locate/copcheeng Markovian inventory policy with application to the paper industry K. Karen Yin a, *, Hu Liu a,1, Neil E. Johnson b,2

More information

Impact of Processing Costs on Service Chain Placement in Network Functions Virtualization

Impact of Processing Costs on Service Chain Placement in Network Functions Virtualization Ipact of Processing Costs on Service Chain Placeent in Network Functions Virtualization Marco Savi, Massio Tornatore, Giacoo Verticale Dipartiento di Elettronica, Inforazione e Bioingegneria, Politecnico

More information

Design of Model Reference Self Tuning Mechanism for PID like Fuzzy Controller

Design of Model Reference Self Tuning Mechanism for PID like Fuzzy Controller Research Article International Journal of Current Engineering and Technology EISSN 77 46, PISSN 347 56 4 INPRESSCO, All Rights Reserved Available at http://inpressco.co/category/ijcet Design of Model Reference

More information

REQUIREMENTS FOR A COMPUTER SCIENCE CURRICULUM EMPHASIZING INFORMATION TECHNOLOGY SUBJECT AREA: CURRICULUM ISSUES

REQUIREMENTS FOR A COMPUTER SCIENCE CURRICULUM EMPHASIZING INFORMATION TECHNOLOGY SUBJECT AREA: CURRICULUM ISSUES REQUIREMENTS FOR A COMPUTER SCIENCE CURRICULUM EMPHASIZING INFORMATION TECHNOLOGY SUBJECT AREA: CURRICULUM ISSUES Charles Reynolds Christopher Fox reynolds @cs.ju.edu fox@cs.ju.edu Departent of Coputer

More information

Capacity of Multiple-Antenna Systems With Both Receiver and Transmitter Channel State Information

Capacity of Multiple-Antenna Systems With Both Receiver and Transmitter Channel State Information IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 49, NO., OCTOBER 23 2697 Capacity of Multiple-Antenna Systes With Both Receiver and Transitter Channel State Inforation Sudharan K. Jayaweera, Student Meber,

More information

Towards Change Management Capability Assessment Model for Contractors in Building Project

Towards Change Management Capability Assessment Model for Contractors in Building Project Middle-East Journal of Scientific Research 23 (7): 1327-1333, 2015 ISSN 1990-9233 IDOSI Publications, 2015 DOI: 10.5829/idosi.ejsr.2015.23.07.120 Towards Change Manageent Capability Assessent Model for

More information

Preference-based Search and Multi-criteria Optimization

Preference-based Search and Multi-criteria Optimization Fro: AAAI-02 Proceedings. Copyright 2002, AAAI (www.aaai.org). All rights reserved. Preference-based Search and Multi-criteria Optiization Ulrich Junker ILOG 1681, route des Dolines F-06560 Valbonne ujunker@ilog.fr

More information

The Fundamentals of Modal Testing

The Fundamentals of Modal Testing The Fundaentals of Modal Testing Application Note 243-3 Η(ω) = Σ n r=1 φ φ i j / 2 2 2 2 ( ω n - ω ) + (2ξωωn) Preface Modal analysis is defined as the study of the dynaic characteristics of a echanical

More information

Dynamic Placement for Clustered Web Applications

Dynamic Placement for Clustered Web Applications Dynaic laceent for Clustered Web Applications A. Karve, T. Kibrel, G. acifici, M. Spreitzer, M. Steinder, M. Sviridenko, and A. Tantawi IBM T.J. Watson Research Center {karve,kibrel,giovanni,spreitz,steinder,sviri,tantawi}@us.ib.co

More information

Cross-Domain Metric Learning Based on Information Theory

Cross-Domain Metric Learning Based on Information Theory Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence Cross-Doain Metric Learning Based on Inforation Theory Hao Wang,2, Wei Wang 2,3, Chen Zhang 2, Fanjiang Xu 2. State Key Laboratory

More information

Note on a generalized wage rigidity result. Abstract

Note on a generalized wage rigidity result. Abstract Note on a generalized age rigidity result Ariit Mukheree University of Nottingha Abstract Considering Cournot copetition, this note shos that, if the firs differ in labor productivities, the equilibriu

More information

A quantum secret ballot. Abstract

A quantum secret ballot. Abstract A quantu secret ballot Shahar Dolev and Itaar Pitowsky The Edelstein Center, Levi Building, The Hebrerw University, Givat Ra, Jerusale, Israel Boaz Tair arxiv:quant-ph/060087v 8 Mar 006 Departent of Philosophy

More information

Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris

Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris Class #6: Non-linear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines

More information

Comparison of Gamma and Passive Neutron Non-Destructive Assay Total Measurement Uncertainty Using the High Efficiency Neutron Counter

Comparison of Gamma and Passive Neutron Non-Destructive Assay Total Measurement Uncertainty Using the High Efficiency Neutron Counter Coparison o Gaa and Passive Neutron Non-Destructive Assay Total Measureent Uncertainty Using the High Eiciency Neutron Counter John M. Veilleux Los Alaos National Laboratory Los Alaos, NM 87545, USA 505/667-7434

More information

CLOSED-LOOP SUPPLY CHAIN NETWORK OPTIMIZATION FOR HONG KONG CARTRIDGE RECYCLING INDUSTRY

CLOSED-LOOP SUPPLY CHAIN NETWORK OPTIMIZATION FOR HONG KONG CARTRIDGE RECYCLING INDUSTRY CLOSED-LOOP SUPPLY CHAIN NETWORK OPTIMIZATION FOR HONG KONG CARTRIDGE RECYCLING INDUSTRY Y. T. Chen Departent of Industrial and Systes Engineering Hong Kong Polytechnic University, Hong Kong yongtong.chen@connect.polyu.hk

More information

Calculation Method for evaluating Solar Assisted Heat Pump Systems in SAP 2009. 15 July 2013

Calculation Method for evaluating Solar Assisted Heat Pump Systems in SAP 2009. 15 July 2013 Calculation Method for evaluating Solar Assisted Heat Pup Systes in SAP 2009 15 July 2013 Page 1 of 17 1 Introduction This docuent describes how Solar Assisted Heat Pup Systes are recognised in the National

More information

No. 2004/12. Daniel Schmidt

No. 2004/12. Daniel Schmidt No. 2004/12 Private equity-, stock- and ixed asset-portfolios: A bootstrap approach to deterine perforance characteristics, diversification benefits and optial portfolio allocations Daniel Schidt Center

More information

Salty Waters. Instructions for the activity 3. Results Worksheet 5. Class Results Sheet 7. Teacher Notes 8. Sample results. 12

Salty Waters. Instructions for the activity 3. Results Worksheet 5. Class Results Sheet 7. Teacher Notes 8. Sample results. 12 1 Salty Waters Alost all of the water on Earth is in the for of a solution containing dissolved salts. In this activity students are invited to easure the salinity of a saple of salt water. While carrying

More information

Insurance Spirals and the Lloyd s Market

Insurance Spirals and the Lloyd s Market Insurance Spirals and the Lloyd s Market Andrew Bain University of Glasgow Abstract This paper presents a odel of reinsurance arket spirals, and applies it to the situation that existed in the Lloyd s

More information

Incorporating Complex Substitution Patterns and Variance Scaling in Long Distance Travel Choice Behavior

Incorporating Complex Substitution Patterns and Variance Scaling in Long Distance Travel Choice Behavior Incorporating Coplex Substitution Patterns and Variance Scaling in Long Distance Travel Choice Behavior Frank S. Koppelan Professor of Civil Engineering and Transportation Northwestern University Evanston,

More information

Calculating the Return on Investment (ROI) for DMSMS Management. The Problem with Cost Avoidance

Calculating the Return on Investment (ROI) for DMSMS Management. The Problem with Cost Avoidance Calculating the Return on nvestent () for DMSMS Manageent Peter Sandborn CALCE, Departent of Mechanical Engineering (31) 45-3167 sandborn@calce.ud.edu www.ene.ud.edu/escml/obsolescence.ht October 28, 21

More information