An Analysis of Reliable Classifiers through ROC Isometrics

Stijn Vanderlooy, Ida G. Sprinkhuizen-Kuyper, Evgueni N. Smirnov
MICC-IKAT, Universiteit Maastricht, PO Box 616, 6200 MD Maastricht, The Netherlands.

Abstract

Reliable classifiers abstain from uncertain instance classifications. In this paper we extend our previous approach to construct reliable classifiers which is based on isometrics in Receiver Operator Characteristic (ROC) space. We analyze the conditions to obtain a reliable classifier with higher performance than previously possible. Our results show that the approach is generally applicable to boost performance on each class simultaneously. Moreover, the approach is able to construct a classifier with at least a desired performance per class.

1. Introduction

Machine learning classifiers have been applied to various classification problems. Nevertheless, only few classifiers are employed in domains with high misclassification costs, e.g., medical diagnosis and legal practice. In these domains it is desired to have classifiers that abstain from uncertain instance classifications such that a desired level of reliability is obtained. These classifiers are called reliable classifiers.

Recently, we proposed an easy-to-visualize approach to reliable instance classification (Vanderlooy et al., 2006). Classification performance is visualized as an ROC curve and a reliable classifier is constructed by skipping the part of the curve that represents instances difficult to classify. The transformation to the ROC curve of the reliable classifier was provided. An analysis showed when and where this new curve dominates the original one. If the underlying data of both curves have approximately equal class distributions, then dominance immediately results in performance increase.

Appearing in Proceedings of the ICML 2006 workshop on ROC Analysis in Machine Learning, Pittsburgh, USA. Copyright 2006 by the author(s)/owner(s).
However, in case of different class distributions and a performance measure that is class-distribution dependent, dominance of an ROC curve does not always guarantee an increase in performance. In this paper we analyze for which performance metrics the approach boosts performance on each class simultaneously. We restrict ourselves to widely used metrics characterized by rotating linear isometrics (Fürnkranz & Flach, 2005). Furthermore, skew sensitive metrics are used to generalize the approach to each possible scenario of error costs and class distributions.

This paper is organized as follows. Section 2 provides terminology and notation. Section 3 gives a brief background on ROC curves. Sections 4 and 5 introduce skew sensitive evaluation and isometrics, respectively. Section 6 shows how isometrics are used to design classifiers. Section 7 defines reliable classifiers and their visualization in ROC space. In Section 8 we provide our main contribution. Section 9 concludes the paper.

2. Terminology and Notation

We consider classification problems with two classes: positive (p) and negative (n). A discrete classifier is a mapping from instances to classes. Counts of true positives, false positives, true negatives, and false negatives are denoted by TP, FP, TN, and FN, respectively. The number of positive instances is P = TP + FN. Similarly, N = TN + FP is the number of negative instances. From these counts the following statistics are derived:

tpr = TP / (TP + FN)        fpr = FP / (FP + TN)
tnr = TN / (TN + FP)        fnr = FN / (TP + FN)

True positive rate is denoted by tpr and true negative rate by tnr. False positive rate and false negative rate are denoted by fpr and fnr, respectively. Note that tnr = 1 − fpr and fnr = 1 − tpr.

Most classifiers are rankers or scoring classifiers. They
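The definitions above translate directly into code. The following is a minimal sketch (the function name and example counts are ours, for illustration only):

```python
def rates(tp, fp, tn, fn):
    """Compute tpr, fpr, tnr, fnr from confusion-matrix counts (Section 2)."""
    p = tp + fn   # number of positive instances P
    n = tn + fp   # number of negative instances N
    return tp / p, fp / n, tn / n, fn / p

# tnr and fnr are the complements of fpr and tpr, respectively
tpr, fpr, tnr, fnr = rates(tp=40, fp=10, tn=90, fn=10)
assert abs(tnr - (1 - fpr)) < 1e-12 and abs(fnr - (1 - tpr)) < 1e-12
```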
output two positive values l(x|p) and l(x|n) that indicate the likelihood that an instance x is positive and negative, respectively. The score of an instance combines these values as follows:

l(x) = l(x|p) / l(x|n)    (1)

and can be used to rank instances from most likely positive to most likely negative (Lachiche & Flach, 2003).

3. ROC Curves

The performance of a discrete classifier can be represented by a point (fpr, tpr) in ROC space. Optimal performance is obtained in (0, 1). Points (0, 0) and (1, 1) represent classifiers that always predict the negative and positive class, respectively. The ascending diagonal connects these points and represents the strategy of random classification.

A threshold on the score l(x) transforms a scoring classifier into a discrete one. Instances with a score higher than or equal to this threshold are classified as positive. The remaining instances are classified as negative. An ROC curve shows what happens with the corresponding confusion matrix for each possible threshold (Fawcett, 2003). The convex hull of the ROC curve (ROCCH) removes concavities.

Theorem 1 For any point (fpr, tpr) on an ROCCH a classifier can be constructed that has the performance represented by that point.

Provost and Fawcett (2001) prove this theorem. For simplicity of presentation, in the following we will assume that ROC curves are convex and all points can be obtained by a threshold.

4. Skew Sensitive Evaluation

The metrics tpr, fpr, tnr, and fnr evaluate performance on a single class. This follows from the confusion matrix since values are used from a single column. In most cases a metric is desired that indicates performance on both classes simultaneously. Unfortunately, such a metric assumes that the class distribution of the application domain is known and used in the test set. Accuracy is a well-known example. Provost et al. (1998) showed that classifier selection with this metric has two severe shortcomings with regard to class and error costs distributions.
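The threshold sweep that produces an ROC curve can be sketched in a few lines of Python (the function name is ours; score ties are ignored for simplicity):

```python
def roc_points(scores, labels):
    """ROC points (fpr, tpr) obtained by sweeping a threshold over the
    scores l(x); labels are 'p' or 'n'. A sketch, not an optimized version."""
    p = sum(1 for y in labels if y == 'p')
    n = len(labels) - p
    pts = [(0.0, 0.0)]
    tp = fp = 0
    # descending scores: each prefix is the set classified as positive
    for s, y in sorted(zip(scores, labels), key=lambda t: -t[0]):
        if y == 'p':
            tp += 1
        else:
            fp += 1
        pts.append((fp / n, tp / p))
    return pts
```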
To overcome these problems, Flach (2003) considers class and error costs distributions as a parameter of performance metrics. Evaluation with these metrics is called skew sensitive evaluation. The parameter is called the skew ratio and expresses the relative importance of the negative versus the positive class:

c = (c(p, n) / c(n, p)) · (P(n) / P(p))    (2)

Here, c(p, n) and c(n, p) denote the costs of a false positive and a false negative, respectively.¹ The probabilities of a positive and a negative instance are denoted by P(p) = P/(P + N) and P(n) = N/(P + N), respectively. The class ratio is then P(n) / P(p) = N / P.

From Eq. 2 it is clear that we can cover all possible scenarios of class and cost distributions by a single value of c used as parameter in the performance metric. If c < 1 (c > 1), then the positive (negative) class is most important. In the following we assume without restriction that c is the ratio of negative to positive instances in the test set, i.e., c = N / P. The reader should keep in mind that our results are also valid for c = (c(p, n) / c(n, p)) · (N / P).

5. ROC Isometrics

Classifier performance is evaluated on both classes. We define a positive (negative) performance metric as a metric that measures performance on the positive (negative) classifications. The skew sensitive metrics used in this paper are summarized in Table 1. An explanation of these metrics follows.

ROC isometrics are collections of points in ROC space with the same value for a performance metric. Flach (2003) and Fürnkranz and Flach (2005) investigate isometrics to understand metrics. However, isometrics can also be used for the task of classifier selection and to construct reliable classifiers (see Section 6). Table 1 also shows the isometrics for the performance metrics. They are obtained by fixing the performance metric and rewriting its equation to that of a line in ROC space.
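Eq. 2 is a one-line computation; a sketch (the function name and arguments are ours):

```python
def skew_ratio(cost_fp, cost_fn, n_neg, n_pos):
    """Skew ratio c = (c(p,n)/c(n,p)) * (N/P) of Eq. 2: the relative
    importance of the negative versus the positive class."""
    return (cost_fp / cost_fn) * (n_neg / n_pos)
```

With equal error costs the skew ratio reduces to the class ratio N/P, the simplifying assumption made in the text.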
Varying the value of the metric results in lines that rotate around a single point in which the metric is undefined.

5.1. Precision

Positive precision, prec_p^c, is defined as the proportion of true positives to the total number of positive classifications. The isometrics are lines that rotate around the origin (0, 0).

¹ Benefits of true positives and true negatives are incorporated by adding them to the corresponding errors. This operation normalizes the cost matrix such that the two values on the main diagonal are zero.
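A positive-precision isometric is simply a line tpr(fpr) through the origin whose slope follows from the isometric equation in Table 1; a sketch (the function name is ours):

```python
def prec_p_isometric(prec, c):
    """Line tpr(fpr) of constant positive precision prec_p^c = prec:
    it passes through the origin (0, 0) with slope prec * c / (1 - prec)."""
    slope = prec * c / (1 - prec)
    return lambda fpr: slope * fpr
```

Every point on the returned line has positive precision tpr / (tpr + c·fpr) equal to prec, as can be checked by substitution.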
Table 1. Performance metrics and corresponding isometrics, defined in terms of fpr, tpr, c = N/P, α ∈ R+, and ˆm = m/(P + N).

Pos. precision:   prec_p^c = tpr / (tpr + c·fpr)
  Isometric: tpr = (prec_p^c / (1 − prec_p^c)) · c·fpr

Neg. precision:   prec_n^c = tnr / (tnr + (1/c)·fnr)
  Isometric: tpr = 1 − ((1 − prec_n^c) / prec_n^c) · c · (1 − fpr)

Pos. F-measure:   F_p^{c,α} = (1 + α²)·tpr / (α² + tpr + c·fpr)
  Isometric: tpr = (F_p^{c,α} / (1 + α² − F_p^{c,α})) · (c·fpr + α²)

Neg. F-measure:   F_n^{c,α} = (1 + α²)·tnr / (α² + tnr + (1/c)·fnr)
  Isometric: tpr = 1 − (c / F_n^{c,α}) · ((1 + α² − F_n^{c,α})·(1 − fpr) − α²·F_n^{c,α})

Pos. gm-estimate: gm_p^{c,ˆm} = (tpr + ˆm) / (tpr + c·fpr + ˆm·(1 + c))
  Isometric: tpr = (gm_p^{c,ˆm} / (1 − gm_p^{c,ˆm})) · c·fpr + ˆm·(gm_p^{c,ˆm}·(1 + c) − 1) / (1 − gm_p^{c,ˆm})

Neg. gm-estimate: gm_n^{c,ˆm} = (tnr + ˆm) / (tnr + (1/c)·fnr + ˆm·(1 + c)/c)
  Isometric: tpr = 1 − (c / gm_n^{c,ˆm}) · ((1 − gm_n^{c,ˆm})·(1 − fpr) + ˆm·(1 − gm_n^{c,ˆm}·(1 + c)/c))

Figure 1. Precision isometrics in ROC space: solid lines are prec_p^1-isometrics and dashed lines are prec_n^1-isometrics.

Figure 2. F-measure isometrics in ROC space: solid lines are F_p^{1,1}-isometrics and dashed lines are F_n^{1,1}-isometrics.

The case of negative precision, prec_n^c, is similar. Corresponding isometrics rotate around the point (1, 1). Figure 1 shows prec_p^c-isometrics and prec_n^c-isometrics for c = 1. In this and subsequent figures the value of the performance metric is varied starting from 0.1.

5.2. F-measure

Positive precision is maximized when all positive classifications are correct. To know if prec_p^c uses enough positive instances to be considered as reliable, it is combined with tpr. Note that prec_p^c and tpr are antagonistic, i.e., if prec_p^c goes up, then tpr usually goes down (and vice versa). Rijsbergen (1979) introduced the positive F-measure for the trade-off between these metrics:

F_p^{c,α} = (1 + α²) · prec_p^c · tpr / (α² · prec_p^c + tpr) = (1 + α²) · tpr / (α² + tpr + c·fpr)    (3)

where the parameter α indicates the importance given to prec_p^c relative to tpr. If α < 1 (α > 1) then tpr is less (more) important than prec_p^c. If α = 1, then both terms are equally important. The isometrics of F_p^{c,α} are lines rotating around (−α²/c, 0).
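The metric formulas of Table 1 can be evaluated at any ROC point (fpr, tpr); a Python sketch using our reconstruction of the formulas above (the function name and defaults are ours):

```python
def table1_metrics(fpr, tpr, c, alpha=1.0, m_hat=0.1):
    """Skew sensitive metrics of Table 1 at the ROC point (fpr, tpr)."""
    tnr, fnr = 1 - fpr, 1 - tpr          # complements (Section 2)
    a2 = alpha ** 2
    return {
        'prec_p': tpr / (tpr + c * fpr),
        'prec_n': tnr / (tnr + fnr / c),
        'F_p': (1 + a2) * tpr / (a2 + tpr + c * fpr),
        'F_n': (1 + a2) * tnr / (a2 + tnr + fnr / c),
        'gm_p': (tpr + m_hat) / (tpr + c * fpr + m_hat * (1 + c)),
        'gm_n': (tnr + m_hat) / (tnr + fnr / c + m_hat * (1 + c) / c),
    }
```

For α = 1, F_p reduces to the harmonic mean of prec_p^c and tpr, consistent with Eq. 3.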
Therefore, they can be seen as a shifted version of the positive precision isometrics. The larger c and/or the smaller α, the smaller the difference with prec_p^c-isometrics.

Similar to F_p^{c,α}, the negative F-measure is a metric for the trade-off between prec_n^c and tnr. Its isometrics are a shifted version of the prec_n^c-isometrics and rotate around (1, 1 + α²·c). Figure 2 shows F_p^{c,α}-isometrics and F_n^{c,α}-isometrics for c = 1 and α = 1 in the relevant region (0, 1) × (0, 1) of ROC space.

5.3. Generalized m-estimate

The m-estimate computes a precision estimate assuming that m instances are a priori classified. One of the
main reasons why it is favored over precision is that it is less sensitive to noise and more effective in avoiding overfitting (Fürnkranz & Flach, 2005; Lavrac & Dzeroski, 1994, Chapters 8-10). This is especially true if the metric is used for the minority class when the class distribution is very skewed.

The positive m-estimate assumes that m instances are a priori classified as positive. These instances are distributed according to the class distribution in the training set:

(TP + m · P/(P + N)) / (TP + FP + m)    (4)

or equivalently:

gm_p^{c,ˆm} = (tpr + m/(P + N)) / (tpr + c·fpr + m/P)    (5)

To eliminate the absolute numbers P and N we define ˆm = m/(P + N) and obtain the formula in Table 1. Fürnkranz and Flach (2005) call this metric the positive gm-estimate (generalized m-estimate) since ˆm defines the rotation point of the isometrics (see below).²

The isometrics of the gm_p-estimate rotate around (−ˆm, −ˆm). If ˆm = 0, then we obtain prec_p^c-isometrics. For ˆm → ∞ the performance metric converges to 1/(1 + c) = P(p) and the corresponding isometric is the ascending diagonal. The case of the negative gm-estimate is similar. The rotation point of its isometrics is (1 + ˆm, 1 + ˆm).

Figure 3 shows gm_p-isometrics and gm_n-isometrics for c = 1 and ˆm = 0.1.

Figure 3. Generalized m-estimate isometrics in ROC space: solid lines are gm_p^{1,0.1}-isometrics and dashed lines are gm_n^{1,0.1}-isometrics.

For simplicity of presentation, in the following the isometric of a positive (negative) performance metric is simply called a positive (negative) isometric.

6. Classifier Design through Isometrics

In Vanderlooy et al. (2006) we used precision isometrics as a tool to design classifiers. We generalize this approach to include all isometrics defined in Section 5.

For a specific skew ratio, a positive isometric is built with a desired positive performance. By definition, the intersection point (fpr_a, tpr_a) with an ROCCH represents a classifier with this performance.
Similarly, the intersection point (fpr_b, tpr_b) of a negative isometric and the ROCCH represents a classifier with negative performance defined by that isometric. If we assume that the positive and negative isometrics intersect each other in the relevant region of ROC space, then three cases can be distinguished to construct the desired classifier (see Figure 4).

² The gm-estimate of Fürnkranz and Flach (2005) is more general than ours since they also vary a = 1 in Eq. 5.

Figure 4. Location of intersection between a positive and a negative isometric: (a) Case 1, (b) Case 2, and (c) Case 3.

Case 1: the isometrics intersect on the ROCCH. The discrete classifier corresponding to this point has the performance defined by both isometrics. Theorem 1 guarantees that we can construct it. Therefore, the isometrics provide an approach to construct a classifier with a desired performance per class.

Case 2: the isometrics intersect below the ROCCH. This classifier can also be constructed. However, the classifiers corresponding to any point on the ROCCH between (fpr_b, tpr_b) and (fpr_a, tpr_a) have better performance.

Case 3: the isometrics intersect above the ROCCH. There is no classifier with the desired performances. To increase performance, instances between (fpr_a, tpr_a) and (fpr_b, tpr_b) are not classified. In case
of more than one intersection point for the positive (negative) isometric and the ROCCH, the intersection point with highest tpr (lowest fpr) is chosen such that fpr_a < fpr_b. Then, the number of unclassified instances is minimized. The resulting classifier is called a reliable classifier.

7. Reliable Instance Classification

A scoring classifier is almost never optimal: there exist negative instances with a higher score than some positive instances. A reliable classifier abstains from these uncertain instance classifications. It simulates the behavior of a human expert in fields with high error costs. For example, in medical diagnosis an expert does not state a possibly incorrect diagnosis, but she says "I do not know" and performs more tests.

Similar to Ferri and Hernández-Orallo (2004), we define a reliable classifier as a filtering mechanism with two thresholds a > b. An instance x is classified as positive if l(x) ≥ a. If l(x) ≤ b, then x is classified as negative. Otherwise, the instance is left unclassified. Unclassified instances can be rejected, passed to a human, or passed to another classifier (Ferri et al., 2004). Pietraszek (2005) chooses a and b to minimize expected cost, also considering the abstention costs. Here, we focus on performance on the classified instances.

Counts of unclassified positives and unclassified negatives are denoted by UP and UN, respectively. Unclassified positive rate and unclassified negative rate are then defined as follows:

upr = UP / (TP + FN + UP)    (6)
unr = UN / (FP + TN + UN)    (7)

We define thresholds a and b to correspond with points (fpr_a, tpr_a) and (fpr_b, tpr_b), respectively. The ROC curve of the reliable classifier is obtained by skipping the part between (fpr_a, tpr_a) and (fpr_b, tpr_b). By definition we have:

upr = tpr_b − tpr_a    (8)
unr = fpr_b − fpr_a    (9)

The transformation from the original ROC curve to that of the reliable classifier is given in Theorem 2.
Theorem 2 If the part between points (fpr_a, tpr_a) and (fpr_b, tpr_b) of an ROC curve is skipped with 0 < upr < 1 and 0 < unr < 1, then points (fpr_x, tpr_x) on this curve between (0, 0) and (fpr_a, tpr_a) are transformed into points (fpr'_x, tpr'_x) such that:

fpr'_x = fpr_x / (1 − unr),    tpr'_x = tpr_x / (1 − upr)    (10)

Points (fpr_x, tpr_x) between (fpr_b, tpr_b) and (1, 1) are transformed into points (fpr'_x, tpr'_x) such that:

fpr'_x = 1 − (1 − fpr_x) / (1 − unr),    tpr'_x = 1 − (1 − tpr_x) / (1 − upr)    (11)

The proof is in Vanderlooy et al. (2006). Note that the transformations of (fpr_a, tpr_a) and (fpr_b, tpr_b) are the same point on the new ROC curve.

Figure 5. ROCCH2 is obtained by not covering the part between (fpr_a, tpr_a) and (fpr_b, tpr_b) of ROCCH1. The length of the horizontal (vertical) line below ROCCH1 equals unr (upr).

Figure 5 shows an example of a transformation. The intersection points are obtained with precision isometrics for c = 1, prec_p^c = 0.93, and a prec_n^c-isometric.

Theorem 3 If the original ROC curve is convex, then the ROC curve obtained by not considering the points between (fpr_a, tpr_a) and (fpr_b, tpr_b) is also convex.

We proved this theorem in Vanderlooy et al. (2006). There, we also analyzed when and where the original ROCCH is dominated by that of the reliable classifier. Note that the underlying data of both ROCCHs can have a different class distribution when upr ≠ unr. For skew insensitive metrics, or when upr = unr, dominance of an ROCCH will immediately result in performance increase. In Section 8 we analyze when the skew sensitive performance metrics in Table 1 can be boosted by abstention.

8. Effect on Performance

We defined (fpr_a, tpr_a) and (fpr_b, tpr_b) as intersection points of an ROCCH and a positive and negative isometric, respectively. The type of isometrics defines the effect on the performance of the reliable classifier corresponding to (fpr'_a, tpr'_a) as defined in Theorem 2.
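The point-wise transformation of Theorem 2 (with upr and unr from Eqs. 8 and 9) can be sketched as follows; the function name is ours, and points strictly between the two skipped endpoints are simply dropped:

```python
def skip_part(points, a, b):
    """Transform ROC points when the part between a = (fpr_a, tpr_a)
    and b = (fpr_b, tpr_b) is skipped (Theorem 2). Points between a
    and b are removed; the rest is rescaled by 1 - unr and 1 - upr."""
    (fpr_a, tpr_a), (fpr_b, tpr_b) = a, b
    upr, unr = tpr_b - tpr_a, fpr_b - fpr_a       # Eqs. 8 and 9
    out = []
    for fpr, tpr in points:
        if fpr <= fpr_a and tpr <= tpr_a:         # before a: Eq. 10
            out.append((fpr / (1 - unr), tpr / (1 - upr)))
        elif fpr >= fpr_b and tpr >= tpr_b:       # after b: Eq. 11
            out.append((1 - (1 - fpr) / (1 - unr),
                        1 - (1 - tpr) / (1 - upr)))
    return out
```

As the theorem notes, the images of a and b coincide: both map to the same point on the new curve.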
8.1. Precision

Theorem 4 provides an easy and computationally efficient approach to construct a classifier with a desired precision per class.

Theorem 4 If points (fpr_a, tpr_a) and (fpr_b, tpr_b) are defined by a prec_p^c-isometric and a prec_n^c-isometric, respectively, then the point (fpr'_a, tpr'_a) has the precisions of both isometrics.

The proof of this theorem, and also of the following theorems, are included in the appendix. Since isometrics of skew sensitive performance metrics are used, the approach does not commit to costs and class distributions.³ Thus, when the application domain changes, a new reliable classifier can be constructed from the original ROC curve only.

Theorem 4 together with the next Theorem 5 provides an approach to construct a classifier with a desired accuracy. This approach overcomes the problems with accuracy explained in Section 4. From the proof it follows that if the precisions are not equal, then the accuracy is bounded by the smallest and largest precision.

Theorem 5 If point (fpr'_a, tpr'_a) has prec_p^c = prec_n^c, then the accuracy in this point equals the precisions.

8.2. F-measure

Theorem 6 shows that also the F-measure can be boosted on both classes if a part of an ROC curve is not covered. In this case, the resulting classifier has higher performance than defined by both isometrics. Figure 6 gives an example where positive (negative) performance is increased by approximately 5% (10%).

Theorem 6 If points (fpr_a, tpr_a) and (fpr_b, tpr_b) are defined by an F_p^{c,α}-isometric and an F_n^{c,α}-isometric, respectively, then the point (fpr'_a, tpr'_a) has higher performance than defined by both isometrics.

8.3. Generalized m-estimate

To analyze the effect of abstention on the gm-estimate, we can consider the number of a priori classified instances m to be fixed or the parameter ˆm to be fixed. Consider the case when m is not changed after the transformation. In this case upr and unr can change the distribution of a priori instances over the classes.
If upr < unr, then the distribution of these instances in the positive gm-estimate moves to the true positives, resulting in higher performance. For the negative gm-estimate, the distribution moves to the false negatives, resulting in lower performance. The case of upr > unr is the other way around. Therefore, an increase in performance on both classes is only possible iff upr = unr.

³ Remember that, although our proofs use the simplest case c = N/P, the results are also valid for c = (c(p, n)/c(n, p)) · (N/P).

Figure 6. Designing with F-measure isometrics: F_p^{2,1} = 0.72 in (fpr_a, tpr_a) and F_n^{2,1} = 0.75 in (fpr_b, tpr_b). The reliable classifier represented by (fpr'_a, tpr'_a) has F_p^{1.84,1} = … and F_n^{1.84,1} = …. The abstention is represented by upr = … and unr = ….

For the case when ˆm is not changed after the transformation, a similar reasoning results in improvement of the positive gm-estimate if upr ≤ unr and tpr_a ≥ fpr_a. The latter condition holds for all points on the ROCCH. Similarly, improvement in the negative gm-estimate occurs if upr ≥ unr and tpr_b ≥ fpr_b. Thus, we find the following theorems for the gm-estimate.

Theorem 7 If point (fpr_a, tpr_a) is defined by a gm_p-estimate isometric with m > 0 and if upr ≤ unr, then the point (fpr'_a, tpr'_a) has at least the positive performance defined by that isometric.

Theorem 8 If point (fpr_b, tpr_b) is defined by a gm_n-estimate isometric with m > 0 and if upr ≥ unr, then the point (fpr'_a, tpr'_a) has at least the negative performance defined by that isometric.

Corollary 1 If points (fpr_a, tpr_a) and (fpr_b, tpr_b) are defined by a gm_p-estimate isometric and a gm_n-estimate isometric, respectively, with m > 0 and if upr = unr, then the point (fpr'_a, tpr'_a) has at least the gm-estimates of both isometrics.

We suggest to use the gm-estimate for the minority class only and to use a normal precision for the majority class. From Theorems 7 and 8, if the minority
class is the positive (negative) class, then we need an abstention characterized by upr ≤ unr (upr ≥ unr).

Figure 7 shows an example with fixed m and the negative class as minority class. Therefore, we want the gm_n-estimate isometric to cover a large part of ROC space, and consequently the condition upr ≥ unr is easily satisfied.

Figure 7. Designing with precision and gm-estimate isometrics: prec_p^{0.3} = 0.97 in (fpr_a, tpr_a) and gm_n^{0.3,0.1} = 0.55 in (fpr_b, tpr_b). The reliable classifier represented by (fpr'_a, tpr'_a) has prec_p^{0.3} = 0.97 and gm_n^{0.34,0.18} = …. The abstention is represented by upr = … and unr = ….

9. Conclusions

A reliable classifier abstains from uncertain instance classifications. Benefits are significant in application domains with high error costs, e.g., medical diagnosis and legal practice. A classifier is transformed into a reliable one by not covering a part of its ROC curve. This part is defined by two isometrics, each indicating performance on a different class.

In case of a classifier and corresponding reliable classifier, dominance of an ROC curve immediately represents an increase in performance if the underlying data of both curves have approximately equal class distributions. Since this assumption is too strong, we analyzed when performance can be boosted by abstention. We showed how to construct a (reliable) classifier with a desired precision per class. We did the same for accuracy. For the F-measure a classifier is obtained with at least the desired performance per class. To prevent a possible performance decrease with the gm-estimate, we propose to use it for the minority class and to use a normal precision for the majority class.

We may conclude that the proposed approach is able to boost performance on each class simultaneously. Benefits of the approach are numerous: it guarantees a classifier with an acceptable performance in domains with high error costs, it is efficient in terms of time and space, it is classifier independent, and it incorporates changing error costs and class distributions easily.

Acknowledgments

We thank the reviewers for useful comments and suggestions. This work was partially supported by the Dutch Organization for Scientific Research (NWO), grant nr. ….

References

Fawcett, T. (2003). ROC graphs: Notes and practical considerations for researchers (Technical Report HPL-2003-4). HP Laboratories.

Ferri, C., Flach, P., & Hernández-Orallo, J. (2004). Delegating classifiers. Proceedings of the 1st International Workshop on ROC Analysis in Artificial Intelligence (ROCAI-2004).

Ferri, C., & Hernández-Orallo, J. (2004). Cautious classifiers. Proceedings of the 1st International Workshop on ROC Analysis in Artificial Intelligence (ROCAI-2004).

Flach, P. (2003). The geometry of ROC space: Understanding machine learning metrics through ROC isometrics. Proceedings of the 20th International Conference on Machine Learning (ICML-2003).

Fürnkranz, J., & Flach, P. (2005). ROC 'n' rule learning: Towards a better understanding of covering algorithms. Machine Learning, 58, 39-77.

Lachiche, N., & Flach, P. (2003). Improving accuracy and cost of two-class and multi-class probabilistic classifiers using ROC curves. Proceedings of the 20th International Conference on Machine Learning (ICML-2003).

Lavrac, N., & Dzeroski, S. (1994). Inductive logic programming: Techniques and applications. Ellis Horwood, New York.

Pietraszek, T. (2005). Optimizing abstaining classifiers using ROC analysis. Proceedings of the 22nd International Conference on Machine Learning (ICML-2005).
Provost, F., & Fawcett, T. (2001). Robust classification for imprecise environments. Machine Learning, 42, 203-231.

Provost, F., Fawcett, T., & Kohavi, R. (1998). The case against accuracy estimation for comparing induction algorithms. Proceedings of the 15th International Conference on Machine Learning (ICML-1998).

Rijsbergen, C. V. (1979). Information retrieval. 2nd edition. Department of Computer Science, University of Glasgow.

Vanderlooy, S., Sprinkhuizen-Kuyper, I., & Smirnov, E. (2006). Reliable classifiers in ROC space. Proceedings of the 15th Annual Machine Learning Conference of Belgium and the Netherlands (BENELEARN-2006).

A. Proofs

Proof of Theorem 4 The positive precisions in (fpr_a, tpr_a) and (fpr'_a, tpr'_a) are defined as follows:

prec_p^c (fpr_a, tpr_a) = tpr_a / (tpr_a + c · fpr_a)    (12)

prec_p^{c'} (fpr'_a, tpr'_a) = tpr'_a / (tpr'_a + c' · fpr'_a)    (13)

with c' = c · (1 − unr)/(1 − upr). Substitution of Eq. 10 in Eq. 13 results in Eq. 12. In a similar way, Eq. 11 is used to show that the negative precisions in (fpr_b, tpr_b) and (fpr'_b, tpr'_b) are the same. The theorem follows since (fpr'_b, tpr'_b) = (fpr'_a, tpr'_a).

Proof of Theorem 5 Since the positive precision and negative precision in (fpr'_a, tpr'_a) are equal, we can write:

tpr'_a = a · (tpr'_a + c' · fpr'_a)    (14)

tnr'_a = a · (tnr'_a + (1/c') · fnr'_a)    (15)

with a = prec_p^{c'} = prec_n^{c'}. It follows that:

tpr'_a + c' · tnr'_a = a · (tpr'_a + c' · fpr'_a + c' · tnr'_a + fnr'_a)    (16)

or equivalently:

a = (tpr'_a + c' · tnr'_a) / (tpr'_a + c' · fpr'_a + c' · tnr'_a + fnr'_a)    (17)

and this is the accuracy with skew ratio c'.

Proof of Theorem 6 The positive F-measures in (fpr_a, tpr_a) and (fpr'_a, tpr'_a) are defined as follows:

F_p^{c,α} (fpr_a, tpr_a) = (1 + α²) · tpr_a / (α² + tpr_a + c · fpr_a)    (18)

F_p^{c',α} (fpr'_a, tpr'_a) = (1 + α²) · tpr'_a / (α² + tpr'_a + c' · fpr'_a)    (19)

Using Eq. 10 and c' = c · (1 − unr)/(1 − upr), the right-hand side of Eq. 19 becomes:

(1 + α²) · tpr_a / (α² · (1 − upr) + tpr_a + c · fpr_a)    (20)

It follows that F_p^{c',α}(fpr'_a, tpr'_a) > F_p^{c,α}(fpr_a, tpr_a) since 0 < upr < 1. The case of the negative F-measure is similar.
Proof of Theorem 7 The positive gm-estimates in (fpr_a, tpr_a) and (fpr'_a, tpr'_a) are defined as follows:

gm_p^{c,ˆm} (fpr_a, tpr_a) = (tpr_a + ˆm) / (tpr_a + c · fpr_a + ˆm · (1 + c))    (21)

gm_p^{c',ˆm'} (fpr'_a, tpr'_a) = (tpr'_a + ˆm') / (tpr'_a + c' · fpr'_a + ˆm' · (1 + c'))    (22)

with ˆm' = m/(P' + N') and c' = c · (1 − unr)/(1 − upr).

Case 1: m is not changed after the transformation. In this case we can write ˆm' = m / (P(1 − upr) + N(1 − unr)). Substitution of Eq. 10 in Eq. 22 results in the following right-hand side:

(tpr_a + m · (1 − upr)/(P(1 − upr) + N(1 − unr))) / (tpr_a + c · fpr_a + ˆm · (1 + c))    (23)

Clearly, gm_p^{c',ˆm'}(fpr'_a, tpr'_a) ≥ gm_p^{c,ˆm}(fpr_a, tpr_a) iff:

(1 − upr) / (P(1 − upr) + N(1 − unr)) ≥ 1 / (P + N)    (24)

This holds iff upr ≤ unr.

Case 2: ˆm is not changed after the transformation. Substitution of Eq. 10 in Eq. 22 with fixed ˆm results in the following right-hand side:

(tpr_a + ˆm · (1 − upr)) / (tpr_a + c · fpr_a + ˆm · ((1 − upr) + c · (1 − unr)))    (25)

Straightforward computation results in gm_p^{c',ˆm}(fpr'_a, tpr'_a) ≥ gm_p^{c,ˆm}(fpr_a, tpr_a) iff:

ˆm · (unr − upr) + (tpr_a · unr − fpr_a · upr) ≥ 0    (26)

This holds if upr ≤ unr and tpr_a ≥ fpr_a.

Proof of Theorem 8 The proof is similar to that of Theorem 7.