3 Supervised Learning


Supervised learning has been a great success in real-world applications. It is used in almost every domain, including the text and Web domains. Supervised learning is also called classification or inductive learning in machine learning. This type of learning is analogous to human learning from past experiences to gain new knowledge in order to improve our ability to perform real-world tasks. However, since computers do not have experiences, machine learning learns from data, which are collected in the past and represent past experiences in some real-world applications.

There are several types of supervised learning tasks. In this chapter, we focus on one particular type, namely, learning a target function that can be used to predict the values of a discrete class attribute. This type of learning has been the focus of machine learning research and is perhaps also the most widely used learning paradigm in practice. This chapter introduces a number of such supervised learning techniques. They are used in almost every Web mining application. We will see their uses in later chapters.

3.1 Basic Concepts

A data set used in the learning task consists of a set of data records, which are described by a set of attributes A = {A_1, A_2, ..., A_|A|}, where |A| denotes the number of attributes or the size of the set A. The data set also has a special target attribute C, which is called the class attribute. In our subsequent discussions, we consider C separately from the attributes in A due to its special status, i.e., we assume that C is not in A. The class attribute C has a set of discrete values, i.e., C = {c_1, c_2, ..., c_|C|}, where |C| is the number of classes and |C| >= 2. A class value is also called a class label. A data set for learning is simply a relational table. Each data record describes a piece of past experience. In the machine learning and data mining literature, a data record is also called an example, an instance, a case or a vector. A data set basically consists of a set of examples or instances.

Given a data set D, the objective of learning is to produce a classification/prediction function to relate values of attributes in A and classes in C. The function can be used to predict the class values/labels of the future data.

The function is also called a classification model, a predictive model or simply a classifier. We will use these terms interchangeably in this book. It should be noted that the function/model can be in any form, e.g., a decision tree, a set of rules, a Bayesian model or a hyperplane.

Example 1: Table 3.1 shows a small loan application data set. It has four attributes. The first attribute is Age, which has three possible values, young, middle and old. The second attribute is Has_job, which indicates whether an applicant has a job. Its possible values are true (has a job) and false (does not have a job). The third attribute is Own_house, which shows whether an applicant owns a house. The fourth attribute is Credit_rating, which has three possible values, fair, good and excellent. The last column is the Class attribute, which shows whether each loan application was approved (denoted by Yes) or not (denoted by No) in the past.

Table 3.1. A loan application data set

    ID  Age     Has_job  Own_house  Credit_rating  Class
    1   young   false    false      fair           No
    2   young   false    false      good           No
    3   young   true     false      good           Yes
    4   young   true     true       fair           Yes
    5   young   false    false      fair           No
    6   middle  false    false      fair           No
    7   middle  false    false      good           No
    8   middle  true     true       good           Yes
    9   middle  false    true       excellent      Yes
    10  middle  false    true       excellent      Yes
    11  old     false    true       excellent      Yes
    12  old     false    true       good           Yes
    13  old     true     false      good           Yes
    14  old     true     false      excellent      Yes
    15  old     false    false      fair           No

We want to learn a classification model from this data set that can be used to classify future loan applications. That is, when a new customer comes into the bank to apply for a loan, after inputting his/her age, whether he/she has a job, whether he/she owns a house, and his/her credit rating, the classification model should predict whether his/her loan application should be approved.

Our learning task is called supervised learning because the class labels (e.g., the Yes and No values of the class attribute in Table 3.1) are provided in the data.

It is as if some teacher tells us the classes. This is in contrast to unsupervised learning, where the classes are not known and the learning algorithm needs to automatically generate classes. Unsupervised learning is the topic of the next chapter.

The data set used for learning is called the training data (or the training set). After a model is learned or built from the training data by a learning algorithm, it is evaluated using a set of test data (or unseen data) to assess the model accuracy. It is important to note that the test data is not used in learning the classification model. The examples in the test data usually also have class labels. That is why the test data can be used to assess the accuracy of the learned model: we can check whether the class predicted for each test case by the model is the same as the actual class of the test case. In order to learn and also to test, the available data (which has classes) for learning is usually split into two disjoint subsets, the training set (for learning) and the test set (for testing). We will discuss this further in Sect. 3.3.

The accuracy of a classification model on a test set is defined as:

    Accuracy = (Number of correct classifications) / (Total number of test cases),    (1)

where a correct classification means that the learned model predicts the same class as the original class of the test case. There are also other measures that can be used. We will discuss them in Sect. 3.3.

We pause here to raise two important questions:
1. What do we mean by learning by a computer system?
2. What is the relationship between the training and the test data?

We answer the first question first. Given a data set D representing past experiences, a task T and a performance measure M, a computer system is said to learn from the data to perform the task T if after learning the system's performance on the task T improves as measured by M. In other words, the learned model or knowledge helps the system to perform the task better as compared to no learning. Learning is the process of building the model or extracting the knowledge.

We use the data set in Example 1 to explain the idea. The task is to predict whether a loan application should be approved. The performance measure M is the accuracy in Equation (1). With the data set in Table 3.1, if there is no learning, all we can do is to guess randomly or to simply take the majority class (which is the Yes class). Suppose we use the majority class and announce that every future instance or case belongs to the class Yes. If the future data are drawn from the same distribution as the existing training data in Table 3.1, the estimated classification/prediction accuracy on the future data is 9/15 = 0.6, as there are 9 Yes class examples out of the total of 15 examples in Table 3.1. The question is: can we do better with learning? If the learned model can indeed improve the accuracy, then the learning is said to be effective.
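To make Equation (1) concrete, here is a minimal Python sketch (the function name and data layout are ours, not from the book) that computes the accuracy of the majority-class guesser on the class labels of Table 3.1:

```python
def accuracy(actual, predicted):
    # Equation (1): number of correct classifications / total number of test cases.
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)

# Class labels of the 15 examples in Table 3.1.
labels = ["No", "No", "Yes", "Yes", "No", "No", "No", "Yes",
          "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]

# The majority-class "classifier" predicts Yes for every case.
print(accuracy(labels, ["Yes"] * len(labels)))  # 9/15 = 0.6
```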

The second question in fact touches the fundamental assumption of machine learning, especially the theoretical study of machine learning. The assumption is that the distribution of training examples is identical to the distribution of test examples (including future unseen examples). In practical applications, this assumption is often violated to a certain degree. Strong violations will clearly result in poor classification accuracy, which is quite intuitive: if the test data behave very differently from the training data, then the learned model will not perform well on the test data. To achieve good accuracy on the test data, training examples must be sufficiently representative of the test data.

We now illustrate the steps of learning in Fig. 3.1 based on the preceding discussions. In step 1, a learning algorithm uses the training data to generate a classification model. This step is also called the training step or training phase. In step 2, the learned model is tested using the test set to obtain the classification accuracy. This step is called the testing step or testing phase. If the accuracy of the learned model on the test data is satisfactory, the model can be used in real-world tasks to predict classes of new cases (which do not have classes). If the accuracy is not satisfactory, we need to go back and choose a different learning algorithm and/or do some further processing of the data (this step is called data pre-processing, not shown in the figure). A practical learning task typically involves many iterations of these steps before a satisfactory model is built. It is also possible that we are unable to build a satisfactory model due to a high degree of randomness in the data or limitations of current learning algorithms.

[Fig. 3.1. The basic learning process: training and testing. Step 1 (training): training data + learning algorithm -> model. Step 2 (testing): model + test data -> accuracy.]
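The two steps of Fig. 3.1 can be expressed as a short generic sketch. Here `learn` and `predict` are placeholders for any learning algorithm (e.g., the decision tree learner of Sect. 3.2), the records are assumed to be dictionaries with a "Class" key, and the random holdout split follows the first approach of Sect. 3.3.1:

```python
import random

def holdout_evaluate(data, learn, predict, test_fraction=1/3, seed=0):
    """Step 1: learn a model from the training data.
    Step 2: test it on the held-out data to estimate accuracy."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    training, test = shuffled[:cut], shuffled[cut:]
    model = learn(training)                                    # Step 1: training
    correct = sum(predict(model, ex) == ex["Class"] for ex in test)
    return correct / len(test)                                 # Step 2: testing
```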

From the next section onward, we study several supervised learning algorithms, except Sect. 3.3, which focuses on model/classifier evaluation. We note that throughout the chapter we assume that the training and test data are available for learning. However, in many text and Web page related learning tasks, this is not true. Usually, we need to collect raw data, design attributes and compute attribute values from the raw data. The reason is that the raw data in text and Web applications are often not suitable for learning, either because their formats are not right or because there are no obvious attributes in the raw text documents or Web pages.

3.2 Decision Tree Induction

Decision tree learning is one of the most widely used techniques for classification. Its classification accuracy is competitive with other learning methods, and it is very efficient. The learned classification model is represented as a tree, called a decision tree. The techniques presented in this section are based on the C4.5 system from Quinlan [49].

Example 2: Fig. 3.2 shows a possible decision tree learnt from the data in Table 3.1. The tree has two types of nodes, decision nodes (which are internal nodes) and leaf nodes. A decision node specifies some test (i.e., asks a question) on a single attribute. A leaf node indicates a class.

    Age?
    |- young:  Has_job?        true -> Yes (2/2);  false -> No (3/3)
    |- middle: Own_house?      true -> Yes (3/3);  false -> No (2/2)
    |- old:    Credit_rating?  fair -> No (1/1);  good -> Yes (2/2);  excellent -> Yes (2/2)

Fig. 3.2. A decision tree for the data in Table 3.1

The root node of the decision tree in Fig. 3.2 is Age, which basically asks the question: what is the age of the applicant? It has three possible answers or outcomes, which are the three possible values of Age. These three values form three tree branches/edges. The other internal nodes have the same meaning. Each leaf node gives a class value (Yes or No). (x/y) below each class means that x out of the y training examples that reach this leaf node have the class of the leaf. For instance, the class of the leftmost leaf node is Yes. Two training examples (examples 3 and 4 in Table 3.1) reach here and both of them are of class Yes.

To use the decision tree in testing, we traverse the tree top-down according to the attribute values of the given test instance until we reach a leaf node. The class of the leaf is the predicted class of the test instance.
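A decision tree like the one in Fig. 3.2 can be represented and traversed in a few lines of Python. The nested-dictionary representation below is our own choice for illustration, not the representation used by C4.5; the test instance is the new applicant of Example 3 below:

```python
# An internal node maps an attribute name to {value: subtree}; a leaf is a class label.
tree = {"Age": {
    "young":  {"Has_job": {"true": "Yes", "false": "No"}},
    "middle": {"Own_house": {"true": "Yes", "false": "No"}},
    "old":    {"Credit_rating": {"fair": "No", "good": "Yes", "excellent": "Yes"}},
}}

def predict(tree, instance):
    """Traverse the tree top-down following the instance's attribute values."""
    while isinstance(tree, dict):
        attribute = next(iter(tree))             # the attribute tested at this node
        tree = tree[attribute][instance[attribute]]
    return tree                                  # a leaf, i.e., the predicted class

print(predict(tree, {"Age": "young", "Has_job": "false",
                     "Own_house": "false", "Credit_rating": "good"}))  # "No"
```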

Example 3: We use the tree to predict the class of the following new instance, which describes a new loan applicant.

    Age    Has_job  Own_house  Credit_rating  Class
    young  false    false      good           ?

Going through the decision tree, we find that the predicted class is No, as we reach the second leaf node from the left.

A decision tree is constructed by partitioning the training data so that the resulting subsets are as pure as possible. A pure subset is one that contains only training examples of a single class. If we apply all the training data in Table 3.1 to the tree in Fig. 3.2, we will see that the training examples reaching each leaf node form a subset of examples that have the same class as the class of the leaf. In fact, we can see that from the x and y values in (x/y). We will discuss the decision tree building algorithm in Sect. 3.2.1.

An interesting question is: is the tree in Fig. 3.2 unique for the data in Table 3.1? The answer is no. In fact, there are many possible trees that can be learned from the data. For example, Fig. 3.3 gives another decision tree, which is much smaller and is also able to partition the training data perfectly according to their classes.

    Own_house?
    |- true  -> Yes (6/6)
    |- false: Has_job?  true -> Yes (3/3);  false -> No (6/6)

Fig. 3.3. A smaller tree for the data set in Table 3.1

In practice, one wants to have a small and accurate tree for many reasons. A smaller tree is more general and also tends to be more accurate (we will discuss this later). It is also easier for human users to understand. In many applications, user understanding of the classifier is important. For example, in some medical applications, doctors want to understand the model that classifies whether a person has a particular disease. It is not satisfactory to simply produce a classification: without understanding why the decision is made, the doctor may not trust the system and/or does not gain useful knowledge.

It is useful to note that in both Fig. 3.2 and Fig. 3.3, the training examples that reach each leaf node all have the same class (see the values of (x/y) at each leaf node). However, for most real-life data sets, this is usually not the case. That is, the examples that reach a particular leaf node are not of the same class, i.e., x < y. The value of x/y is, in fact, the confidence (conf) value used in association rule mining, and x is the support count. This suggests that a decision tree can be converted to a set of if-then rules. Yes, indeed. The conversion is done as follows: each path from the root to a leaf forms a rule, all the decision nodes along the path form the conditions of the rule, and the leaf node or the class forms the consequent. For each rule, a support and confidence can be attached. Note that in most classification systems, these two values are not provided. We add them here to see the connection between association rules and decision trees.

Example 4: The tree in Fig. 3.3 generates three rules ("," means "and"):

    Own_house = true -> Class = Yes [sup=6/15, conf=6/6]
    Own_house = false, Has_job = true -> Class = Yes [sup=3/15, conf=3/3]
    Own_house = false, Has_job = false -> Class = No [sup=6/15, conf=6/6]

We can see that these rules are of the same format as association rules. However, the rules above are only a small subset of the rules that can be found in the data of Table 3.1. For instance, the decision tree in Fig. 3.3 does not find the following rule:

    Age = young, Has_job = false -> Class = No [sup=3/15, conf=3/3]

Thus, we say that a decision tree only finds a subset of the rules that exist in the data, which is sufficient for classification. The objective of association rule mining is to find all rules subject to some minimum support and minimum confidence constraints. Thus, the two methods have different objectives. We will discuss these issues again in Sect. 3.5, where we show that association rules can be used for classification as well.

An interesting and important property of a decision tree and its resulting set of rules is that the tree paths or the rules are mutually exclusive and exhaustive. This means that every data instance is covered by a single rule (a tree path) and a single rule only. By covering a data instance, we mean that the instance satisfies the conditions of the rule.
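Converting a tree to rules is a simple recursive walk over the root-to-leaf paths. The sketch below works on the nested-dictionary trees used in the earlier traversal sketch (support and confidence counting is omitted):

```python
def tree_to_rules(tree, conditions=()):
    """Each path from the root to a leaf forms one if-then rule: the
    decision nodes along the path are the conditions, the leaf is the class."""
    if not isinstance(tree, dict):                   # leaf: emit the accumulated rule
        return [(list(conditions), tree)]
    attribute = next(iter(tree))
    rules = []
    for value, subtree in tree[attribute].items():
        rules += tree_to_rules(subtree, conditions + ((attribute, value),))
    return rules

# Applied to the smaller tree of Fig. 3.3, this yields the three rules of Example 4.
```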

We also say that a decision tree generalizes the data, as a tree is a smaller (more compact) description of the data, i.e., it captures the key regularities in the data. Then, the problem becomes building the best tree that is small and accurate. It turns out that finding the best tree that models the data is an NP-complete problem [6]. All existing algorithms use heuristic methods for tree building. Below, we study one of the most successful techniques.

Algorithm decisionTree(D, A, T)
1   if D contains only training examples of the same class c_j in C then
2       make T a leaf node labeled with class c_j;
3   elseif A = {} then
4       make T a leaf node labeled with c_j, which is the most frequent class in D
5   else  // D contains examples belonging to a mixture of classes. We select a single
6         // attribute to partition D into subsets so that each subset is purer
7       p0 = impurityEval-1(D);
8       for each attribute A_i in A (= {A_1, A_2, ..., A_k}) do
9           p_i = impurityEval-2(A_i, D)
10      endfor
11      Select A_g in {A_1, A_2, ..., A_k} that gives the biggest impurity reduction, computed using p0 - p_i;
12      if p0 - p_g < threshold then  // A_g does not significantly reduce impurity p0
13          make T a leaf node labeled with c_j, the most frequent class in D.
14      else  // A_g is able to reduce impurity p0
15          Make T a decision node on A_g;
16          Let the possible values of A_g be v_1, v_2, ..., v_m. Partition D into m disjoint subsets D_1, D_2, ..., D_m based on the m values of A_g.
17          for each D_j in {D_1, D_2, ..., D_m} do
18              if D_j is not empty then
19                  create a branch (edge) node T_j for v_j as a child node of T;
20                  decisionTree(D_j, A - {A_g}, T_j)  // A_g is removed
21              endif
22          endfor
23      endif
24  endif

Fig. 3.4. A decision tree learning algorithm

3.2.1 Learning Algorithm

As indicated earlier, a decision tree T simply partitions the training data set D into disjoint subsets so that each subset is as pure as possible (of the same class). The learning of a tree is typically done using the divide-and-conquer strategy that recursively partitions the data to produce the tree. At the beginning, all the examples are at the root. As the tree grows, the examples are sub-divided recursively. A decision tree learning algorithm is given in Fig. 3.4. For now, we assume that every attribute in D takes discrete values. This assumption is not necessary, as we will see later.

The stopping criteria of the recursion are in lines 1-4 in Fig. 3.4. The algorithm stops when all the training examples in the current data are of the same class, or when every attribute has been used along the current tree path. In tree learning, each successive recursion chooses the best attribute to partition the data at the current node according to the values of the attribute. The best attribute is selected based on a function that aims to minimize the impurity after the partitioning (lines 7-11). In other words, it maximizes the purity. The key in decision tree learning is thus the choice of the impurity function, which is used in lines 7, 9 and 11 in Fig. 3.4. The recursive call of the algorithm is in line 20, which takes the subset of training examples at the node for further partitioning to extend the tree. This is a greedy algorithm with no backtracking. Once a node is created, it will not be revised or revisited no matter what happens subsequently.
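The following Python sketch is one possible reading of Fig. 3.4 for discrete attributes. It instantiates impurityEval-1 and impurityEval-2 with the entropy-based information gain described in Sect. 3.2.2, returns nested dictionaries (as in the earlier traversal sketch) instead of mutating a node T, and omits C4.5 features such as gain ratio, continuous attributes and pruning:

```python
import math
from collections import Counter, defaultdict

def entropy(examples):
    # impurityEval-1: Equation (2) applied to the class distribution of D.
    total = len(examples)
    counts = Counter(ex["Class"] for ex in examples)
    return -sum(n / total * math.log2(n / total) for n in counts.values())

def partition(examples, attribute):
    # Split D into disjoint subsets D_1, ..., D_m, one per attribute value (line 16).
    groups = defaultdict(list)
    for ex in examples:
        groups[ex[attribute]].append(ex)
    return groups

def entropy_after(examples, attribute):
    # impurityEval-2: Equation (3), the weighted entropy after partitioning.
    total = len(examples)
    return sum(len(sub) / total * entropy(sub)
               for sub in partition(examples, attribute).values())

def decision_tree(examples, attributes, threshold=1e-6):
    classes = Counter(ex["Class"] for ex in examples)
    majority = classes.most_common(1)[0][0]
    if len(classes) == 1 or not attributes:       # lines 1-4: pure data or no attribute left
        return majority
    p0 = entropy(examples)                        # line 7
    gains = {a: p0 - entropy_after(examples, a) for a in attributes}  # lines 8-10
    best = max(gains, key=gains.get)              # line 11
    if gains[best] < threshold:                   # lines 12-13: no significant reduction
        return majority
    remaining = [a for a in attributes if a != best]      # line 20: A_g is removed
    return {best: {value: decision_tree(subset, remaining, threshold)
                   for value, subset in partition(examples, best).items()}}
```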

3.2.2 Impurity Function

Before presenting the impurity function, we use an example to show intuitively what the impurity function aims to do.

Example 5: Fig. 3.5 shows two possible root nodes for the data in Table 3.1.

    (A)  Age?        young: No: 3, Yes: 2    middle: No: 2, Yes: 3    old: No: 1, Yes: 4
    (B)  Own_house?  true:  No: 0, Yes: 6    false:  No: 6, Yes: 3

Fig. 3.5. Two possible root nodes or two possible attributes for the root node

Fig. 3.5(A) uses Age as the root node, and Fig. 3.5(B) uses Own_house as the root node. Their possible values (or outcomes) are the branches. At each branch, we list the number of training examples of each class (No or Yes) that land or reach there. Fig. 3.5(B) is obviously a better choice for the root. From a prediction or classification point of view, Fig. 3.5(B) makes fewer mistakes than Fig. 3.5(A). In Fig. 3.5(B), when Own_house = true, every example has the class Yes. When Own_house = false, if we take the majority class (the most frequent class), which is No, we make three mistakes/errors. If we look at Fig. 3.5(A), the situation is worse: if we take the majority class for each branch, we make five mistakes. Thus, we say that the impurity of the tree in Fig. 3.5(A) is higher than that of the tree in Fig. 3.5(B). To learn a decision tree, we prefer Own_house to Age as the root node. Instead of counting the number of mistakes or errors, C4.5 uses a more principled approach to perform this evaluation on every attribute in order to choose the best attribute to build the tree.

The most popular impurity functions used for decision tree learning are information gain and information gain ratio, which are used in C4.5 as two options. Let us first discuss information gain, which can be extended slightly to produce information gain ratio.

The information gain measure is based on the entropy function from information theory [55]:

    entropy(D) = - \sum_{j=1}^{|C|} Pr(c_j) \log_2 Pr(c_j),    (2)

where \sum_{j=1}^{|C|} Pr(c_j) = 1, and Pr(c_j) is the probability of class c_j in data set D, i.e., the number of examples of class c_j in D divided by the total number of examples in D. In the entropy computation, we define 0 \log_2 0 = 0. The unit of entropy is the bit. Let us use an example to get a feeling of what this function does.

Example 6: Assume we have a data set D with only two classes, positive and negative. Let us see the entropy values for three different compositions of positive and negative examples:

1. The data set D has 50% positive examples (Pr(positive) = 0.5) and 50% negative examples (Pr(negative) = 0.5):

    entropy(D) = -0.5 \log_2 0.5 - 0.5 \log_2 0.5 = 1.

2. The data set D has 20% positive examples (Pr(positive) = 0.2) and 80% negative examples (Pr(negative) = 0.8):

    entropy(D) = -0.2 \log_2 0.2 - 0.8 \log_2 0.8 = 0.722.

3. The data set D has 100% positive examples (Pr(positive) = 1) and no negative examples (Pr(negative) = 0):

    entropy(D) = -1 \log_2 1 - 0 \log_2 0 = 0.

We can see a trend: as the data becomes purer and purer, the entropy value becomes smaller and smaller. In fact, it can be shown that for this binary case (two classes), when Pr(positive) = 0.5 and Pr(negative) = 0.5 the entropy has the maximum value, i.e., 1 bit. When all the data in D belong to one class, the entropy has the minimum value, 0 bit.

It is clear that the entropy measures the amount of impurity or disorder in the data. That is exactly what we need in decision tree learning. We now describe the information gain measure, which uses the entropy function.
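The three entropy values of Example 6 can be checked directly with a small sketch (the function name is ours; it applies the convention 0 log2 0 = 0):

```python
import math

def entropy_of(probabilities):
    # Equation (2), with the convention 0 * log2(0) = 0.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(entropy_of([0.5, 0.5]))  # 1.0     bit  (maximum impurity for two classes)
print(entropy_of([0.2, 0.8]))  # ~0.722  bits
print(entropy_of([1.0, 0.0]))  # 0.0     bits (pure data)
```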

Information Gain

The idea is the following:

1. Given a data set D, we first use the entropy function (Equation (2)) to compute the impurity value of D, which is entropy(D). The impurityEval-1 function in line 7 of Fig. 3.4 performs this task.

2. Then, we want to know which attribute can reduce the impurity most if it is used to partition D. To find out, every attribute is evaluated (lines 8-10 in Fig. 3.4). Let the number of possible values of the attribute A_i be v. If we are going to use A_i to partition the data D, we will divide D into v disjoint subsets D_1, D_2, ..., D_v. The entropy after the partition is

    entropy_{A_i}(D) = \sum_{j=1}^{v} (|D_j| / |D|) entropy(D_j).    (3)

The impurityEval-2 function in line 9 of Fig. 3.4 performs this task.

3. The information gain of attribute A_i is computed with:

    gain(D, A_i) = entropy(D) - entropy_{A_i}(D).    (4)

Clearly, the gain criterion measures the reduction in impurity or disorder. The gain measure is used in line 11 of Fig. 3.4, which chooses attribute A_g resulting in the largest reduction in impurity. If the gain of A_g is too small, the algorithm stops for the branch (line 12); normally a threshold is used here. If choosing A_g is able to reduce impurity significantly, A_g is employed to partition the data to extend the tree further, and so on (lines 15-21 in Fig. 3.4). The process goes on recursively by building sub-trees using D_1, D_2, ..., D_m (line 20). For subsequent tree extensions, we do not need A_g any more, as all training examples in each branch have the same A_g value.

Example 7: Let us compute the gain values for the attributes Age, Own_house, Has_job and Credit_rating using the whole data set D in Table 3.1, i.e., we evaluate for the root node of a decision tree.

First, we compute the entropy of D. Since D has 6 No class training examples and 9 Yes class training examples, we have

    entropy(D) = -(6/15) \log_2 (6/15) - (9/15) \log_2 (9/15) = 0.971.

We then try Age, which partitions the data into 3 subsets (as Age has three possible values): D_1 (with Age = young), D_2 (with Age = middle), and D_3 (with Age = old). Each subset has five training examples. In Fig. 3.5, we also see the number of No class examples and the number of Yes class examples in each subset (or in each branch).

    entropy_Age(D) = (5/15) entropy(D_1) + (5/15) entropy(D_2) + (5/15) entropy(D_3)
                   = (5/15)(0.971) + (5/15)(0.971) + (5/15)(0.722) = 0.888.

Likewise, we compute for Own_house, which partitions D into two subsets, D_1 (with Own_house = true) and D_2 (with Own_house = false):

    entropy_Own_house(D) = (6/15) entropy(D_1) + (9/15) entropy(D_2)
                         = (6/15)(0) + (9/15)(0.918) = 0.551.

Similarly, we obtain entropy_Has_job(D) = 0.647 and entropy_Credit_rating(D) = 0.608. The gains for the attributes are:

    gain(D, Age) = 0.971 - 0.888 = 0.083
    gain(D, Own_house) = 0.971 - 0.551 = 0.420
    gain(D, Has_job) = 0.971 - 0.647 = 0.324
    gain(D, Credit_rating) = 0.971 - 0.608 = 0.363

Own_house is the best attribute for the root node. Fig. 3.5(B) shows the root node using Own_house. Since the left branch has only one class (Yes) of data, it results in a leaf node (line 1 in Fig. 3.4). For Own_house = false, further extension is needed. The process is the same as above, but we only use the subset of the data with Own_house = false, i.e., D_2.
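Reusing the `entropy` and `entropy_after` helpers from the decision-tree sketch in Sect. 3.2.1, the numbers of Example 7 can be reproduced from Table 3.1:

```python
rows = [  # Table 3.1: (Age, Has_job, Own_house, Credit_rating, Class)
    ("young", "false", "false", "fair", "No"),
    ("young", "false", "false", "good", "No"),
    ("young", "true", "false", "good", "Yes"),
    ("young", "true", "true", "fair", "Yes"),
    ("young", "false", "false", "fair", "No"),
    ("middle", "false", "false", "fair", "No"),
    ("middle", "false", "false", "good", "No"),
    ("middle", "true", "true", "good", "Yes"),
    ("middle", "false", "true", "excellent", "Yes"),
    ("middle", "false", "true", "excellent", "Yes"),
    ("old", "false", "true", "excellent", "Yes"),
    ("old", "false", "true", "good", "Yes"),
    ("old", "true", "false", "good", "Yes"),
    ("old", "true", "false", "excellent", "Yes"),
    ("old", "false", "false", "fair", "No"),
]
attributes = ["Age", "Has_job", "Own_house", "Credit_rating"]
data = [dict(zip(attributes + ["Class"], r)) for r in rows]

p0 = entropy(data)                         # 0.971
for a in attributes:
    print(a, round(p0 - entropy_after(data, a), 3))
# Age 0.083, Has_job 0.324, Own_house 0.42, Credit_rating 0.363
```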

Information Gain Ratio

The gain criterion tends to favor attributes with many possible values. An extreme situation is that the data contain an ID attribute that is an identification of each example. If we consider using this ID attribute to partition the data, each training example will form a subset and has only one class, which results in entropy_ID(D) = 0. So the gain by using this attribute is maximal. From a prediction point of view, however, such a partition is useless.

Gain ratio (Equation (5)) remedies this bias by normalizing the gain using the entropy of the data with respect to the values of the attribute (our previous entropy computations were done with respect to the class attribute):

    gainRatio(D, A_i) = gain(D, A_i) / ( - \sum_{j=1}^{s} (|D_j| / |D|) \log_2 (|D_j| / |D|) ),    (5)

where s is the number of possible values of A_i, and D_j is the subset of data that has the j-th value of A_i. |D_j| / |D| plays the role of the probability in Equation (2). Using Equation (5), we simply choose the attribute with the highest gainRatio value to extend the tree. This method works because if A_i has too many values, the denominator will be large. For instance, in our above example of the ID attribute, the denominator will be \log_2 |D|. The denominator is called the split info in C4.5. One note is that the split info can be 0 or very small. Some heuristic solutions can be devised to deal with it (see [49]).

3.2.3 Handling of Continuous Attributes

It may seem that the decision tree algorithm can only handle discrete attributes. In fact, continuous attributes can be dealt with easily as well. In a real-life data set, there are often both discrete attributes and continuous attributes. Handling both types in an algorithm is an important advantage.

To apply the decision tree building method, we can divide the value range of attribute A_i into intervals at a particular tree node. Each interval can then be considered a discrete value. Based on the intervals, gain or gainRatio is evaluated in the same way as in the discrete case. Clearly, we can divide A_i into any number of intervals at a tree node. However, two intervals are usually sufficient. This binary split is used in C4.5. We need to find a threshold value for the division.

Clearly, we should choose the threshold that maximizes the gain (or gainRatio). We need to examine all possible thresholds. This is not a problem because, although for a continuous attribute A_i the number of possible values that it can take is infinite, the number of actual values that appear in the data is always finite. Let the set of distinct values of attribute A_i that occur in the data be {v_1, v_2, ..., v_r}, sorted in ascending order. Clearly, any threshold value lying between v_i and v_{i+1} will have the same effect of dividing the training examples into those whose value of attribute A_i lies in {v_1, v_2, ..., v_i} and those whose value lies in {v_{i+1}, v_{i+2}, ..., v_r}. There are thus only r-1 possible splits on A_i, which can all be evaluated. The threshold value can be the middle point between v_i and v_{i+1}, or just on the right side of value v_i, which results in the two intervals A_i <= v_i and A_i > v_i. This latter approach is used in C4.5. The advantage of this approach is that the values appearing in the tree actually occur in the data. The threshold value that maximizes the gain (gainRatio) value is selected.

We can modify the algorithm in Fig. 3.4 (lines 8-11) easily to accommodate this computation so that both discrete and continuous attributes are considered. A change to line 20 of the algorithm in Fig. 3.4 is also needed: for a continuous attribute, we do not remove attribute A_g, because an interval can be further split recursively in subsequent tree extensions. Thus, the same continuous attribute may appear multiple times in a tree path (see Example 9), which does not happen for a discrete attribute.
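A sketch of the binary-split threshold search for a continuous attribute, reusing the `entropy` helper from the earlier sketch (C4.5's actual implementation differs in details such as the gain-ratio correction):

```python
def best_threshold(examples, attribute):
    """Evaluate the r-1 candidate thresholds between consecutive distinct
    values and return the one maximizing the information gain, using splits
    of the form A <= v_i and A > v_i as in C4.5."""
    values = sorted({ex[attribute] for ex in examples})
    p0 = entropy(examples)
    best_v, best_gain = None, -1.0
    for v in values[:-1]:                          # the r-1 possible splits
        left = [ex for ex in examples if ex[attribute] <= v]
        right = [ex for ex in examples if ex[attribute] > v]
        after = (len(left) * entropy(left) +
                 len(right) * entropy(right)) / len(examples)
        if p0 - after > best_gain:
            best_v, best_gain = v, p0 - after
    return best_v, best_gain
```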

From a geometric point of view, a decision tree built with only continuous attributes represents a partitioning of the data space. A series of splits from the root node to a leaf node represents a hyper-rectangle. Each side of the hyper-rectangle is an axis-parallel hyperplane.

Example 8: The hyper-rectangular regions in Fig. 3.6(A), which partition the space, are produced by the decision tree in Fig. 3.6(B). There are two classes in the data, represented by empty circles and filled rectangles.

[Fig. 3.6. A partitioning of the data space and its corresponding decision tree: (A) a partition of the data space by axis-parallel splits at X = 2, 3, 4 and Y = 2.5, 2.6; (B) the decision tree.]

Handling of continuous (numeric) attributes has an impact on the efficiency of the decision tree algorithm. With only discrete attributes the algorithm grows linearly with the size of the data set D. However, sorting a continuous attribute takes |D| log |D| time, which can dominate the tree learning process. Sorting is important as it ensures that gain or gainRatio can be computed in one pass of the data.

3.2.4 Some Other Issues

We now discuss several other issues in decision tree learning.

Tree Pruning and Overfitting: A decision tree algorithm recursively partitions the data until there is no impurity or there is no attribute left. This process may result in trees that are very deep, where many tree leaves cover very few training examples. If we use such a tree to predict the training set, the accuracy will be very high. However, when it is used to classify the unseen test set, the accuracy may be very low. The learning is thus not effective, i.e., the decision tree does not generalize the data well. This phenomenon is called overfitting.

More specifically, we say that a classifier f_1 overfits the data if there is another classifier f_2 such that f_1 achieves a higher accuracy on the training data than f_2, but a lower accuracy on the unseen test data than f_2 [45].

Overfitting is usually caused by noise in the data, i.e., wrong class values/labels and/or wrong values of attributes, but it may also be due to the complexity and randomness of the application domain. These problems cause the decision tree algorithm to refine the tree by extending it very deeply using many attributes.

To reduce overfitting in the context of decision tree learning, we perform pruning of the tree, i.e., we delete some branches or sub-trees and replace them with leaves of majority classes. There are two main methods to do this: stopping early in tree building (which is also called pre-pruning) and pruning the tree after it is built (which is called post-pruning). Post-pruning has been shown to be more effective. Early stopping can be dangerous because it is not clear what will happen if the tree is extended further (without stopping). Post-pruning is more effective because after we have extended the tree to the fullest, it becomes clearer which branches/sub-trees may not be useful (overfit the data). The general idea of post-pruning is to estimate the error of each tree node. If the estimated error for a node is less than the estimated error of its extended sub-tree, then the sub-tree is pruned. Most existing tree learning algorithms take this approach. See [49] for a technique called pessimistic error based pruning.

Example 9: In Fig. 3.6(B), the sub-tree representing the rectangular region

    X <= 2, Y > 2.5, Y <= 2.6

in Fig. 3.6(A) is very likely to be overfitting. The region is very small and contains only a single data point, which may be an error (or noise) in the data collection. If it is pruned, we obtain Fig. 3.7(A) and (B).

[Fig. 3.7. The data space partition and the decision tree after pruning: (A) a partition of the data space; (B) the decision tree.]

Another common approach to pruning is to use a separate set of data called the validation set, which is used neither in training nor in testing. After a tree is built, it is used to classify the validation set. Then, we can find the error at each node on the validation set. This enables us to know what to prune based on the errors at each node.

Rule Pruning: We noted earlier that a decision tree can be converted to a set of rules. In fact, C4.5 also prunes the rules to simplify them and to reduce overfitting. First, the tree (C4.5 uses the unpruned tree) is converted to a set of rules in the way discussed in Example 4. Rule pruning is then performed by removing some conditions to make the rules shorter and fewer (after pruning, some rules may become redundant). In most cases, pruning results in a more accurate rule set, as shorter rules are less likely to overfit the training data. Pruning is also called generalization, as it makes the rules more general (with fewer conditions). A rule with more conditions is more specific than a rule with fewer conditions.

Example 10: The sub-tree below X <= 2 in Fig. 3.6(B) produces these rules (O denotes the circle class and ■ the rectangle class):

    Rule 1: X <= 2, Y > 2.5, Y > 2.6 -> O
    Rule 2: X <= 2, Y > 2.5, Y <= 2.6 -> ■
    Rule 3: X <= 2, Y <= 2.5 -> O

Note that Y > 2.5 in Rule 1 is not useful because of Y > 2.6, and thus Rule 1 should be:

    Rule 1: X <= 2, Y > 2.6 -> O

In pruning, we may be able to delete the condition Y > 2.6 from Rule 1 to produce:

    X <= 2 -> O

Then Rule 2 and Rule 3 become redundant and can be removed.

A useful point to note is that after pruning, the resulting set of rules may no longer be mutually exclusive and exhaustive. There may be data points that satisfy the conditions of more than one rule and, if inaccurate rules are discarded, of no rules. An ordering of the rules is thus needed to ensure that when classifying a test case only one rule will be applied to determine the class of the test case. To deal with the situation where a test case does not satisfy the conditions of any rule, a default class is used, which is usually the majority class.

Handling Missing Attribute Values: In many practical data sets, some attribute values are missing or not available due to various reasons. There are many ways to deal with the problem.

For example, we can fill each missing value with the special value "unknown" or with the most frequent value of the attribute if the attribute is discrete. If the attribute is continuous, we can use the mean of the attribute for each missing value. The decision tree algorithm in C4.5 takes another approach: at a tree node, it distributes the training example with a missing value for the attribute to each branch of the tree proportionally, according to the distribution of the training examples that have values for the attribute.

Handling Skewed Class Distribution: In many applications, the proportions of data for different classes can be very different. For instance, in a data set for intrusion detection in computer networks, the proportion of intrusion cases is extremely small (< 1%) compared with the normal cases. Directly applying the decision tree algorithm for classification or prediction of intrusions is usually not effective. The resulting decision tree often consists of a single leaf node, "normal", which is useless for intrusion detection. One way to deal with the problem is to over-sample the intrusion examples to increase their proportion. Another solution is to rank the new cases according to how likely they are to be intrusions. The human users can then investigate the top-ranked cases.

3.3 Classifier Evaluation

After a classifier is constructed, it needs to be evaluated for accuracy. Effective evaluation is crucial because without knowing the approximate accuracy of a classifier, it cannot be used in real-world tasks.

There are many ways to evaluate a classifier, and there are also many measures. The main measure is the classification accuracy (Equation (1)), which is the number of correctly classified instances in the test set divided by the total number of instances in the test set. Some researchers also use the error rate, which is 1 - accuracy. Clearly, if we have several classifiers, the one with the highest accuracy is preferred. Statistical significance tests may be used to check whether one classifier's accuracy is significantly better than that of another, given the same training and test data sets. Below, we first present several common methods for classifier evaluation, and then introduce some other evaluation measures.

3.3.1 Evaluation Methods

Holdout Set: The available data D is divided into two disjoint subsets, the training set D_train and the test set D_test, where D = D_train U D_test and D_train and D_test do not overlap. The test set is also called the holdout set. This method is mainly used when the data set D is large.

Note that the examples in the original data set D are all labeled with classes. As we discussed earlier, the training set is used for learning a classifier and the test set is used for evaluating the classifier. The training set should not be used in the evaluation, as the classifier is biased toward the training set. That is, the classifier may overfit the training data, which results in very high accuracy on the training set but low accuracy on the test set. Using the unseen test set gives an unbiased estimate of the classification accuracy. As for what percentage of the data should be used for training and what percentage for testing, it depends on the data set size; two thirds for training and one third for testing are commonly used.

To partition D into training and test sets, we can use a few approaches:

1. We randomly sample a set of training examples from D for learning and use the rest for testing.
2. If the data is collected over time, then we can use the earlier part of the data for training/learning and the later part of the data for testing. In many applications, this is a more suitable approach because when the classifier is used in the real world the data are from the future. This approach thus better reflects the dynamic aspects of applications.

Multiple Random Sampling: When the available data set is small, using the above methods can be unreliable because the test set would be too small to be representative. One approach to deal with the problem is to perform the above random sampling n times. Each time a different training set and a different test set are produced. This produces n accuracies. The final estimated accuracy on the data is the average of the n accuracies.

Cross-Validation: When the data set is small, the n-fold cross-validation method is very commonly used. In this method, the available data is partitioned into n equal-size disjoint subsets. Each subset is then used as the test set and the remaining n-1 subsets are combined as the training set to learn a classifier. This procedure is then run n times, which gives n accuracies. The final estimated accuracy of learning from this data set is the average of the n accuracies. 10-fold and 5-fold cross-validations are often used.

A special case of cross-validation is leave-one-out cross-validation. In this method, each fold of the cross-validation has only a single test example and all the rest of the data is used in training. That is, if the original data has m examples, then this is m-fold cross-validation. This method is normally used when the available data is very small. It is not efficient for a large data set, as m classifiers need to be built.
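An n-fold cross-validation loop in sketch form; as before, `learn` and `predict` are placeholders for any learning algorithm and the records are assumed to carry a "Class" key:

```python
import random

def cross_validation_accuracy(data, learn, predict, n=10, seed=0):
    """Partition the data into n disjoint folds; each fold serves once as the
    test set while the remaining n-1 folds form the training set; return the
    average of the n accuracies."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::n] for i in range(n)]
    accuracies = []
    for i in range(n):
        test = folds[i]
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        model = learn(train)
        correct = sum(predict(model, ex) == ex["Class"] for ex in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / n
```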

In Sect. 3.2.4, we mentioned that a validation set can be used to prune a decision tree or a set of rules. If a validation set is employed for that purpose, it should not be used in testing. In that case, the available data is divided into three subsets: a training set, a validation set and a test set. Apart from using a validation set to help tree or rule pruning, a validation set is also frequently used to estimate parameters in learning algorithms. In such cases, the values that give the best accuracy on the validation set are used as the final values of the parameters. Cross-validation can be used for parameter estimation as well. Then a separate validation set is not needed; instead, the whole training set is used in cross-validation.

3.3.2 Precision, Recall, F-score and Breakeven Point

In some applications, we are only interested in one class. This is particularly true for text and Web applications. For example, we may be interested in only the documents or Web pages of a particular topic. Also, in classification involving skewed or highly imbalanced data, e.g., network intrusion and financial fraud detection, we are typically interested in only the minority class. The class that the user is interested in is commonly called the positive class, and the rest are negative classes (the negative classes may be combined into one negative class). Accuracy is not a suitable measure in such cases because we may achieve a very high accuracy, but may not identify a single intrusion. For instance, if 99% of the cases are normal in an intrusion detection data set, a classifier can achieve 99% accuracy (without doing anything) by simply classifying every test case as "not intrusion". This is, however, useless.

Precision and recall are more suitable in such applications because they measure how precise and how complete the classification is on the positive class. It is convenient to introduce these measures using a confusion matrix (Table 3.2). A confusion matrix contains information about the actual and predicted results given by a classifier.

Table 3.2. Confusion matrix of a classifier

                     Classified positive    Classified negative
    Actual positive  TP                     FN
    Actual negative  FP                     TN

where
    TP: the number of correct classifications of positive examples (true positives)
    FN: the number of incorrect classifications of positive examples (false negatives)
    FP: the number of incorrect classifications of negative examples (false positives)
    TN: the number of correct classifications of negative examples (true negatives)

Based on the confusion matrix, the precision (p) and recall (r) of the positive class are defined as follows:

    p = TP / (TP + FP),    r = TP / (TP + FN).    (6)

In words, precision p is the number of correctly classified positive examples divided by the total number of examples that are classified as positive. Recall r is the number of correctly classified positive examples divided by the total number of actual positive examples in the test set. The intuitive meanings of these two measures are quite obvious. However, it is hard to compare classifiers based on two measures, which are not functionally related. For a test set, the precision may be very high but the recall can be very low, and vice versa.

Example 11: A test data set has 100 positive examples and 1000 negative examples. After classification using a classifier, we have the following confusion matrix (Table 3.3):

Table 3.3. Confusion matrix of a classifier

                     Classified positive    Classified negative
    Actual positive  1                      99
    Actual negative  0                      1000

This confusion matrix gives the precision p = 100% and the recall r = 1%, because we classified only one positive example correctly and classified no negative examples wrongly.

Although in theory precision and recall are not related, in practice high precision is achieved almost always at the expense of recall, and high recall is achieved at the expense of precision. In an application, which measure is more important depends on the nature of the application. If we need a single measure to compare different classifiers, the F-score is often used:

    F = 2pr / (p + r).    (7)

The F-score (also called the F_1-score) is the harmonic mean of precision and recall:

    F = 2 / (1/p + 1/r).    (8)

The harmonic mean of two numbers tends to be closer to the smaller of the two. Thus, for the F-score to be high, both p and r must be high.
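The quantities of Table 3.2 and Equations (6)-(8) fit in a few lines of Python (a sketch; the names are ours). On the counts of Example 11 it would return p = 1.0 and r = 0.01:

```python
def precision_recall_f(actual, predicted, positive="+"):
    # TP, FP, FN as defined in the confusion matrix of Table 3.2.
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    prec = tp / (tp + fp) if tp + fp else 0.0                 # Equation (6)
    rec = tp / (tp + fn) if tp + fn else 0.0                  # Equation (6)
    f = 2 * prec * rec / (prec + rec) if prec + rec else 0.0  # Equation (7)
    return prec, rec, f
```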

There is also another measure, called the precision and recall breakeven point, which is used in the information retrieval community. The breakeven point is where the precision and the recall are equal. This measure assumes that the test cases can be ranked by the classifier based on their likelihoods of being positive. For instance, in decision tree classification, we can use the confidence of each leaf node as the value to rank test cases.

Example 12: We have a ranking of 20 test documents, where 1 represents the highest rank, 20 represents the lowest rank, and "+" ("-") marks an actual positive (negative) document. Assume that the test set has 10 positive examples.

    At rank 1:  p = 1/1 = 100%    r = 1/10 = 10%
    At rank 2:  p = 2/2 = 100%    r = 2/10 = 20%
    ...
    At rank 9:  p = 6/9 = 66.7%   r = 6/10 = 60%
    At rank 10: p = 7/10 = 70%    r = 7/10 = 70%

The breakeven point is p = r = 70%. Note that interpolation is needed if such a point cannot be found.

3.3.3 Receiver Operating Characteristic Curve

A receiver operating characteristic (ROC) curve is a plot of the true positive rate against the false positive rate. It is also commonly used to evaluate classification results on the positive class in two-class classification problems. The classifier needs to rank the test cases according to their likelihoods of belonging to the positive class, with the most likely positive case ranked at the top. The true positive rate (TPR) is defined as the fraction of actual positive cases that are correctly classified:

    TPR = TP / (TP + FN).    (9)

The false positive rate (FPR) is defined as the fraction of actual negative cases that are classified to the positive class:

    FPR = FP / (TN + FP).    (10)

TPR is basically the recall of the positive class and is also called sensitivity in statistics. There is also another measure in statistics called specificity, which is the true negative rate (TNR), or the recall of the negative class. TNR is defined as follows:

    TNR = TN / (TN + FP).    (11)

From Equations (10) and (11), we can see the following relationship:

    FPR = 1 - specificity.    (12)

Fig. 3.8 shows the ROC curves of two example classifiers (C_1 and C_2) on the same test data. Each curve starts from (0, 0) and ends at (1, 1): (0, 0) represents the situation where every test case is classified as negative, and (1, 1) represents the situation where every test case is classified as positive. This is the case because we can treat the classification result as a ranking of the test cases by likelihood of membership in the positive class, and we can partition the ranked list at any point into two parts, with the upper part assigned to the positive class and the lower part assigned to the negative class. We will see shortly that an ROC curve is drawn based on such partitions. In Fig. 3.8, we also see the main diagonal line, which represents random guessing, i.e., predicting each case to be positive with a fixed probability. In this case, for every FPR value, TPR has the same value, i.e., TPR = FPR.

[Fig. 3.8. ROC curves for two classifiers (C_1 and C_2) on the same data]

For classifier evaluation using the ROC curves in Fig. 3.8, we want to know which classifier is better. The answer is that when FPR is less than 0.43, C_1 is better, and when FPR is greater than 0.43, C_2 is better. However, sometimes this is not a satisfactory answer because we cannot say that one of the classifiers is strictly better than the other. For an overall comparison, researchers often use the area under the ROC curve (AUC). If the AUC value of a classifier C_1 is greater than that of another classifier C_2, C_1 is said to be better than C_2. If a classifier is perfect, its AUC value is 1. If a classifier makes all random guesses, its AUC value is 0.5.

Let us now describe how to draw an ROC curve given the classification result as a ranking of test cases. The ranking is obtained by sorting the test cases in decreasing order of the classifier's output values (e.g., posterior probabilities). We then partition the ranked list into two subsets (or parts) at every test case, and regard every test case in the upper part (with higher classifier output value) as a positive case and every test case in the lower part as a negative case. For each such partition, we compute a pair of TPR and FPR values. When the upper part is empty, we obtain the point (0, 0) on the ROC curve, and when the lower part is empty, we obtain the point (1, 1). Finally, we simply connect the adjacent points.

Example 13: We have 10 test cases. A classifier has been built, and it has ranked the 10 test cases as shown in Table 3.4 (the numbers in row 1 are the rank positions, with 1 being the highest rank and 10 the lowest). The second row shows the actual class of each test case: "+" means that the test case is from the positive class, and "-" means that it is from the negative class. All the results needed for drawing the ROC curve are shown in rows 3-8 of Table 3.4. The ROC curve is given in Fig. 3.9.

[Table 3.4. Computations for drawing an ROC curve: for each rank position (1-10), the actual class (+/-) and the TP, FP, TN, FN, TPR and FPR values of the corresponding partition]

[Fig. 3.9. ROC curve for the data shown in Table 3.4]
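The procedure just described can be sketched in Python: walk down the ranking, updating the TP and FP counts at every partition point, and compute the AUC by the trapezoid rule over adjacent points (one common way to compute it):

```python
def roc_points(ranked_labels):
    """Given test cases ranked by decreasing classifier output, with actual
    labels '+'/'-', return the (FPR, TPR) point of every partition of the
    ranked list, from (0, 0) (all negative) to (1, 1) (all positive).
    Assumes both classes occur in the ranking."""
    pos = ranked_labels.count("+")
    neg = ranked_labels.count("-")
    points, tp, fp = [(0.0, 0.0)], 0, 0
    for label in ranked_labels:
        if label == "+":
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    # Area under the ROC curve: trapezoid rule over adjacent points.
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))
```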

3.3.4 Lift Curve

The lift curve (also called the lift chart) is similar to the ROC curve. It is also used for the evaluation of two-class classification tasks, where the positive class is the target of interest and is usually the rare class. It is often used in direct marketing applications to link classification results to costs and profits. For example, a mail order company wants to send promotional materials to potential customers to sell an expensive watch. Since printing and postage cost money, the company needs to build a classifier to identify likely buyers, and only send the promotional materials to them. The question is how many should be sent. To make the decision, the company needs to balance the cost and profit (if a watch is sold, the company makes a certain profit, but each letter sent has a fixed cost). The lift curve provides a nice tool to enable the marketer to make the decision.

Like an ROC curve, to draw a lift curve, the classifier needs to produce a ranking of the test cases according to their likelihoods of belonging to the positive class, with the most likely positive case ranked at the top. After the ranking, the test cases are divided into N equal-sized bins (N is usually 10-20). The actual positive cases in each bin are then counted. A lift curve is drawn with the x-axis being the percentage of test data (or bins) and the y-axis being the percentage of cumulative positive cases from the first bin to the current bin. A lift curve usually also includes a line (called the baseline) along the main diagonal [from (0, 0) to (100, 100)], which represents the situation where the positive cases in the test set are uniformly (or randomly) distributed in the N bins (no learning), i.e., each bin contains 100/N percent of the positive cases. If the lift curve is above this baseline, learning is said to be effective. The greater the area between the lift curve and the baseline, the better the classifier.

Example 14: A company wants to send promotional materials to potential buyers to sell an expensive brand of watches. It builds a classification model and tests it on test data of 10,000 people (test cases) that it collected in the past. After classification and ranking, it decides to divide the test data into 10 bins, with each bin containing 10% of the test cases, or 1,000 cases. Out of the 1,000 cases in each bin, there are a certain number of positive cases (e.g., past buyers). The detailed results are listed in Table 3.5, which includes the number (#) of positive cases and the percentage (%) of positive cases in each bin, and the cumulative percentage for that bin. The cumulative percentages are used in drawing the lift curve, which is given in Fig. 3.10. We can see that the lift curve is way above the baseline, which means that the learning is highly effective.

Suppose printing and postage cost $1.00 for each letter, and the sale of each watch makes $100 (assuming that each buyer only buys one watch).
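A sketch of the cumulative-percentage computation behind a lift curve, assuming the test cases have already been ranked and labeled '+'/'-' and that the number of cases is divisible by the number of bins:

```python
def lift_curve(ranked_labels, n_bins=10):
    """Divide the ranked test cases into n_bins equal-sized bins and return
    the cumulative percentage of positive cases captured by bins 1..i."""
    size = len(ranked_labels) // n_bins
    total_pos = ranked_labels.count("+")
    cumulative, points = 0, []
    for i in range(n_bins):
        bin_labels = ranked_labels[i * size:(i + 1) * size]
        cumulative += bin_labels.count("+")
        points.append(100.0 * cumulative / total_pos)
    return points  # compare against the baseline 10%, 20%, ..., 100%
```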


Section 5.4 Annuities, Present Value, and Amortization Secton 5.4 Annutes, Present Value, and Amortzaton Present Value In Secton 5.2, we saw that the present value of A dollars at nterest rate per perod for n perods s the amount that must be deposted today

More information

Single and multiple stage classifiers implementing logistic discrimination

Single and multiple stage classifiers implementing logistic discrimination Sngle and multple stage classfers mplementng logstc dscrmnaton Hélo Radke Bttencourt 1 Dens Alter de Olvera Moraes 2 Vctor Haertel 2 1 Pontfíca Unversdade Católca do Ro Grande do Sul - PUCRS Av. Ipranga,

More information

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic Lagrange Multplers as Quanttatve Indcators n Economcs Ivan Mezník Insttute of Informatcs, Faculty of Busness and Management, Brno Unversty of TechnologCzech Republc Abstract The quanttatve role of Lagrange

More information

Forecasting the Direction and Strength of Stock Market Movement

Forecasting the Direction and Strength of Stock Market Movement Forecastng the Drecton and Strength of Stock Market Movement Jngwe Chen Mng Chen Nan Ye cjngwe@stanford.edu mchen5@stanford.edu nanye@stanford.edu Abstract - Stock market s one of the most complcated systems

More information

Traffic-light a stress test for life insurance provisions

Traffic-light a stress test for life insurance provisions MEMORANDUM Date 006-09-7 Authors Bengt von Bahr, Göran Ronge Traffc-lght a stress test for lfe nsurance provsons Fnansnspetonen P.O. Box 6750 SE-113 85 Stocholm [Sveavägen 167] Tel +46 8 787 80 00 Fax

More information

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting

Causal, Explanatory Forecasting. Analysis. Regression Analysis. Simple Linear Regression. Which is Independent? Forecasting Causal, Explanatory Forecastng Assumes cause-and-effect relatonshp between system nputs and ts output Forecastng wth Regresson Analyss Rchard S. Barr Inputs System Cause + Effect Relatonshp The job of

More information

CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol

CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK Sample Stablty Protocol Background The Cholesterol Reference Method Laboratory Network (CRMLN) developed certfcaton protocols for total cholesterol, HDL

More information

Texas Instruments 30X IIS Calculator

Texas Instruments 30X IIS Calculator Texas Instruments 30X IIS Calculator Keystrokes for the TI-30X IIS are shown for a few topcs n whch keystrokes are unque. Start by readng the Quk Start secton. Then, before begnnng a specfc unt of the

More information

Planning for Marketing Campaigns

Planning for Marketing Campaigns Plannng for Marketng Campagns Qang Yang and Hong Cheng Department of Computer Scence Hong Kong Unversty of Scence and Technology Clearwater Bay, Kowloon, Hong Kong, Chna (qyang, csch)@cs.ust.hk Abstract

More information

Statistical Methods to Develop Rating Models

Statistical Methods to Develop Rating Models Statstcal Methods to Develop Ratng Models [Evelyn Hayden and Danel Porath, Österrechsche Natonalbank and Unversty of Appled Scences at Manz] Source: The Basel II Rsk Parameters Estmaton, Valdaton, and

More information

THE METHOD OF LEAST SQUARES THE METHOD OF LEAST SQUARES

THE METHOD OF LEAST SQUARES THE METHOD OF LEAST SQUARES The goal: to measure (determne) an unknown quantty x (the value of a RV X) Realsaton: n results: y 1, y 2,..., y j,..., y n, (the measured values of Y 1, Y 2,..., Y j,..., Y n ) every result s encumbered

More information

Formulating & Solving Integer Problems Chapter 11 289

Formulating & Solving Integer Problems Chapter 11 289 Formulatng & Solvng Integer Problems Chapter 11 289 The Optonal Stop TSP If we drop the requrement that every stop must be vsted, we then get the optonal stop TSP. Ths mght correspond to a ob sequencng

More information

Answer: A). There is a flatter IS curve in the high MPC economy. Original LM LM after increase in M. IS curve for low MPC economy

Answer: A). There is a flatter IS curve in the high MPC economy. Original LM LM after increase in M. IS curve for low MPC economy 4.02 Quz Solutons Fall 2004 Multple-Choce Questons (30/00 ponts) Please, crcle the correct answer for each of the followng 0 multple-choce questons. For each queston, only one of the answers s correct.

More information

Detecting Credit Card Fraud using Periodic Features

Detecting Credit Card Fraud using Periodic Features Detectng Credt Card Fraud usng Perodc Features Alejandro Correa Bahnsen, Djamla Aouada, Aleksandar Stojanovc and Björn Ottersten Interdscplnary Centre for Securty, Relablty and Trust Unversty of Luxembourg,

More information

STATISTICAL DATA ANALYSIS IN EXCEL

STATISTICAL DATA ANALYSIS IN EXCEL Mcroarray Center STATISTICAL DATA ANALYSIS IN EXCEL Lecture 6 Some Advanced Topcs Dr. Petr Nazarov 14-01-013 petr.nazarov@crp-sante.lu Statstcal data analyss n Ecel. 6. Some advanced topcs Correcton for

More information

7.5. Present Value of an Annuity. Investigate

7.5. Present Value of an Annuity. Investigate 7.5 Present Value of an Annuty Owen and Anna are approachng retrement and are puttng ther fnances n order. They have worked hard and nvested ther earnngs so that they now have a large amount of money on

More information

IMPACT ANALYSIS OF A CELLULAR PHONE

IMPACT ANALYSIS OF A CELLULAR PHONE 4 th ASA & μeta Internatonal Conference IMPACT AALYSIS OF A CELLULAR PHOE We Lu, 2 Hongy L Bejng FEAonlne Engneerng Co.,Ltd. Bejng, Chna ABSTRACT Drop test smulaton plays an mportant role n nvestgatng

More information

Study on Model of Risks Assessment of Standard Operation in Rural Power Network

Study on Model of Risks Assessment of Standard Operation in Rural Power Network Study on Model of Rsks Assessment of Standard Operaton n Rural Power Network Qngj L 1, Tao Yang 2 1 Qngj L, College of Informaton and Electrcal Engneerng, Shenyang Agrculture Unversty, Shenyang 110866,

More information

Web Spam Detection Using Machine Learning in Specific Domain Features

Web Spam Detection Using Machine Learning in Specific Domain Features Journal of Informaton Assurance and Securty 3 (2008) 220-229 Web Spam Detecton Usng Machne Learnng n Specfc Doman Features Hassan Najadat 1, Ismal Hmed 2 Department of Computer Informaton Systems Faculty

More information

How Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence

How Sets of Coherent Probabilities May Serve as Models for Degrees of Incoherence 1 st Internatonal Symposum on Imprecse Probabltes and Ther Applcatons, Ghent, Belgum, 29 June 2 July 1999 How Sets of Coherent Probabltes May Serve as Models for Degrees of Incoherence Mar J. Schervsh

More information

Linear Circuits Analysis. Superposition, Thevenin /Norton Equivalent circuits

Linear Circuits Analysis. Superposition, Thevenin /Norton Equivalent circuits Lnear Crcuts Analyss. Superposton, Theenn /Norton Equalent crcuts So far we hae explored tmendependent (resste) elements that are also lnear. A tmendependent elements s one for whch we can plot an / cure.

More information

An Interest-Oriented Network Evolution Mechanism for Online Communities

An Interest-Oriented Network Evolution Mechanism for Online Communities An Interest-Orented Network Evoluton Mechansm for Onlne Communtes Cahong Sun and Xaopng Yang School of Informaton, Renmn Unversty of Chna, Bejng 100872, P.R. Chna {chsun,yang}@ruc.edu.cn Abstract. Onlne

More information

THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION

THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION Internatonal Journal of Electronc Busness Management, Vol. 3, No. 4, pp. 30-30 (2005) 30 THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION Yu-Mn Chang *, Yu-Cheh

More information

Mining Multiple Large Data Sources

Mining Multiple Large Data Sources The Internatonal Arab Journal of Informaton Technology, Vol. 7, No. 3, July 2 24 Mnng Multple Large Data Sources Anmesh Adhkar, Pralhad Ramachandrarao 2, Bhanu Prasad 3, and Jhml Adhkar 4 Department of

More information

+ + + - - This circuit than can be reduced to a planar circuit

+ + + - - This circuit than can be reduced to a planar circuit MeshCurrent Method The meshcurrent s analog of the nodeoltage method. We sole for a new set of arables, mesh currents, that automatcally satsfy KCLs. As such, meshcurrent method reduces crcut soluton to

More information

Using Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council

Using Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council Usng Supervsed Clusterng Technque to Classfy Receved Messages n 137 Call Center of Tehran Cty Councl Mahdyeh Haghr 1*, Hamd Hassanpour 2 (1) Informaton Technology engneerng/e-commerce, Shraz Unversty (2)

More information

A Dynamic Load Balancing for Massive Multiplayer Online Game Server

A Dynamic Load Balancing for Massive Multiplayer Online Game Server A Dynamc Load Balancng for Massve Multplayer Onlne Game Server Jungyoul Lm, Jaeyong Chung, Jnryong Km and Kwanghyun Shm Dgtal Content Research Dvson Electroncs and Telecommuncatons Research Insttute Daejeon,

More information

Using Series to Analyze Financial Situations: Present Value

Using Series to Analyze Financial Situations: Present Value 2.8 Usng Seres to Analyze Fnancal Stuatons: Present Value In the prevous secton, you learned how to calculate the amount, or future value, of an ordnary smple annuty. The amount s the sum of the accumulated

More information

Time Value of Money Module

Time Value of Money Module Tme Value of Money Module O BJECTIVES After readng ths Module, you wll be able to: Understand smple nterest and compound nterest. 2 Compute and use the future value of a sngle sum. 3 Compute and use the

More information

Can Auto Liability Insurance Purchases Signal Risk Attitude?

Can Auto Liability Insurance Purchases Signal Risk Attitude? Internatonal Journal of Busness and Economcs, 2011, Vol. 10, No. 2, 159-164 Can Auto Lablty Insurance Purchases Sgnal Rsk Atttude? Chu-Shu L Department of Internatonal Busness, Asa Unversty, Tawan Sheng-Chang

More information

SIMPLE LINEAR CORRELATION

SIMPLE LINEAR CORRELATION SIMPLE LINEAR CORRELATION Smple lnear correlaton s a measure of the degree to whch two varables vary together, or a measure of the ntensty of the assocaton between two varables. Correlaton often s abused.

More information

How To Know The Components Of Mean Squared Error Of Herarchcal Estmator S

How To Know The Components Of Mean Squared Error Of Herarchcal Estmator S S C H E D A E I N F O R M A T I C A E VOLUME 0 0 On Mean Squared Error of Herarchcal Estmator Stans law Brodowsk Faculty of Physcs, Astronomy, and Appled Computer Scence, Jagellonan Unversty, Reymonta

More information

An Inductive Fuzzy Classification Approach applied to Individual Marketing

An Inductive Fuzzy Classification Approach applied to Individual Marketing An Inductve Fuzzy Classfcaton Approach appled to Indvdual Marketng Mchael Kaufmann, Andreas Meer Abstract A data mnng methodology for an nductve fuzzy classfcaton s ntroduced. The nducton step s based

More information

J. Parallel Distrib. Comput.

J. Parallel Distrib. Comput. J. Parallel Dstrb. Comput. 71 (2011) 62 76 Contents lsts avalable at ScenceDrect J. Parallel Dstrb. Comput. journal homepage: www.elsever.com/locate/jpdc Optmzng server placement n dstrbuted systems n

More information

Vision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION

Vision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION Vson Mouse Saurabh Sarkar a* a Unversty of Cncnnat, Cncnnat, USA ABSTRACT The report dscusses a vson based approach towards trackng of eyes and fngers. The report descrbes the process of locatng the possble

More information

Improved SVM in Cloud Computing Information Mining

Improved SVM in Cloud Computing Information Mining Internatonal Journal of Grd Dstrbuton Computng Vol.8, No.1 (015), pp.33-40 http://dx.do.org/10.1457/jgdc.015.8.1.04 Improved n Cloud Computng Informaton Mnng Lvshuhong (ZhengDe polytechnc college JangSu

More information

Descriptive Models. Cluster Analysis. Example. General Applications of Clustering. Examples of Clustering Applications

Descriptive Models. Cluster Analysis. Example. General Applications of Clustering. Examples of Clustering Applications CMSC828G Prncples of Data Mnng Lecture #9 Today s Readng: HMS, chapter 9 Today s Lecture: Descrptve Modelng Clusterng Algorthms Descrptve Models model presents the man features of the data, a global summary

More information

Searching for Interacting Features for Spam Filtering

Searching for Interacting Features for Spam Filtering Searchng for Interactng Features for Spam Flterng Chuanlang Chen 1, Yun-Chao Gong 2, Rongfang Be 1,, and X. Z. Gao 3 1 Department of Computer Scence, Bejng Normal Unversty, Bejng 100875, Chna 2 Software

More information

Learning from Multiple Outlooks

Learning from Multiple Outlooks Learnng from Multple Outlooks Maayan Harel Department of Electrcal Engneerng, Technon, Hafa, Israel She Mannor Department of Electrcal Engneerng, Technon, Hafa, Israel maayanga@tx.technon.ac.l she@ee.technon.ac.l

More information

Enterprise Master Patient Index

Enterprise Master Patient Index Enterprse Master Patent Index Healthcare data are captured n many dfferent settngs such as hosptals, clncs, labs, and physcan offces. Accordng to a report by the CDC, patents n the Unted States made an

More information

When do data mining results violate privacy? Individual Privacy: Protect the record

When do data mining results violate privacy? Individual Privacy: Protect the record When do data mnng results volate prvacy? Chrs Clfton March 17, 2004 Ths s jont work wth Jashun Jn and Murat Kantarcıoğlu Indvdual Prvacy: Protect the record Indvdual tem n database must not be dsclosed

More information

Project Networks With Mixed-Time Constraints

Project Networks With Mixed-Time Constraints Project Networs Wth Mxed-Tme Constrants L Caccetta and B Wattananon Western Australan Centre of Excellence n Industral Optmsaton (WACEIO) Curtn Unversty of Technology GPO Box U1987 Perth Western Australa

More information

Abstract. Clustering ensembles have emerged as a powerful method for improving both the

Abstract. Clustering ensembles have emerged as a powerful method for improving both the Clusterng Ensembles: {topchyal, Models jan, of punch}@cse.msu.edu Consensus and Weak Parttons * Alexander Topchy, Anl K. Jan, and Wllam Punch Department of Computer Scence and Engneerng, Mchgan State Unversty

More information

Luby s Alg. for Maximal Independent Sets using Pairwise Independence

Luby s Alg. for Maximal Independent Sets using Pairwise Independence Lecture Notes for Randomzed Algorthms Luby s Alg. for Maxmal Independent Sets usng Parwse Independence Last Updated by Erc Vgoda on February, 006 8. Maxmal Independent Sets For a graph G = (V, E), an ndependent

More information

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson Statstcs for Psychosocal Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson (LCR) What s t and when do we use t? Recall the standard latent class model

More information

Implementation of Deutsch's Algorithm Using Mathcad

Implementation of Deutsch's Algorithm Using Mathcad Implementaton of Deutsch's Algorthm Usng Mathcad Frank Roux The followng s a Mathcad mplementaton of Davd Deutsch's quantum computer prototype as presented on pages - n "Machnes, Logc and Quantum Physcs"

More information

Credit Limit Optimization (CLO) for Credit Cards

Credit Limit Optimization (CLO) for Credit Cards Credt Lmt Optmzaton (CLO) for Credt Cards Vay S. Desa CSCC IX, Ednburgh September 8, 2005 Copyrght 2003, SAS Insttute Inc. All rghts reserved. SAS Propretary Agenda Background Tradtonal approaches to credt

More information

Types of Injuries. (20 minutes) LEARNING OBJECTIVES MATERIALS NEEDED

Types of Injuries. (20 minutes) LEARNING OBJECTIVES MATERIALS NEEDED U N I T 3 Types of Injures (20 mnutes) PURPOSE: To help coaches learn how to recognze the man types of acute and chronc njures. LEARNING OBJECTIVES In ths unt, coaches wll learn how most njures occur,

More information

Improved Mining of Software Complexity Data on Evolutionary Filtered Training Sets

Improved Mining of Software Complexity Data on Evolutionary Filtered Training Sets Improved Mnng of Software Complexty Data on Evolutonary Fltered Tranng Sets VILI PODGORELEC Insttute of Informatcs, FERI Unversty of Marbor Smetanova ulca 17, SI-2000 Marbor SLOVENIA vl.podgorelec@un-mb.s

More information

RequIn, a tool for fast web traffic inference

RequIn, a tool for fast web traffic inference RequIn, a tool for fast web traffc nference Olver aul, Jean Etenne Kba GET/INT, LOR Department 9 rue Charles Fourer 90 Evry, France Olver.aul@nt-evry.fr, Jean-Etenne.Kba@nt-evry.fr Abstract As networked

More information

Problem Set 3. a) We are asked how people will react, if the interest rate i on bonds is negative.

Problem Set 3. a) We are asked how people will react, if the interest rate i on bonds is negative. Queston roblem Set 3 a) We are asked how people wll react, f the nterest rate on bonds s negatve. When

More information

1. Fundamentals of probability theory 2. Emergence of communication traffic 3. Stochastic & Markovian Processes (SP & MP)

1. Fundamentals of probability theory 2. Emergence of communication traffic 3. Stochastic & Markovian Processes (SP & MP) 6.3 / -- Communcaton Networks II (Görg) SS20 -- www.comnets.un-bremen.de Communcaton Networks II Contents. Fundamentals of probablty theory 2. Emergence of communcaton traffc 3. Stochastc & Markovan Processes

More information

"Research Note" APPLICATION OF CHARGE SIMULATION METHOD TO ELECTRIC FIELD CALCULATION IN THE POWER CABLES *

Research Note APPLICATION OF CHARGE SIMULATION METHOD TO ELECTRIC FIELD CALCULATION IN THE POWER CABLES * Iranan Journal of Scence & Technology, Transacton B, Engneerng, ol. 30, No. B6, 789-794 rnted n The Islamc Republc of Iran, 006 Shraz Unversty "Research Note" ALICATION OF CHARGE SIMULATION METHOD TO ELECTRIC

More information

Financial Mathemetics

Financial Mathemetics Fnancal Mathemetcs 15 Mathematcs Grade 12 Teacher Gude Fnancal Maths Seres Overvew In ths seres we am to show how Mathematcs can be used to support personal fnancal decsons. In ths seres we jon Tebogo,

More information

A Hierarchical Anomaly Network Intrusion Detection System using Neural Network Classification

A Hierarchical Anomaly Network Intrusion Detection System using Neural Network Classification IDC IDC A Herarchcal Anomaly Network Intruson Detecton System usng Neural Network Classfcaton ZHENG ZHANG, JUN LI, C. N. MANIKOPOULOS, JAY JORGENSON and JOSE UCLES ECE Department, New Jersey Inst. of Tech.,

More information

CHAPTER 14 MORE ABOUT REGRESSION

CHAPTER 14 MORE ABOUT REGRESSION CHAPTER 14 MORE ABOUT REGRESSION We learned n Chapter 5 that often a straght lne descrbes the pattern of a relatonshp between two quanttatve varables. For nstance, n Example 5.1 we explored the relatonshp

More information

How To Calculate The Accountng Perod Of Nequalty

How To Calculate The Accountng Perod Of Nequalty Inequalty and The Accountng Perod Quentn Wodon and Shlomo Ytzha World Ban and Hebrew Unversty September Abstract Income nequalty typcally declnes wth the length of tme taen nto account for measurement.

More information

An Empirical Study of Search Engine Advertising Effectiveness

An Empirical Study of Search Engine Advertising Effectiveness An Emprcal Study of Search Engne Advertsng Effectveness Sanjog Msra, Smon School of Busness Unversty of Rochester Edeal Pnker, Smon School of Busness Unversty of Rochester Alan Rmm-Kaufman, Rmm-Kaufman

More information

Traffic State Estimation in the Traffic Management Center of Berlin

Traffic State Estimation in the Traffic Management Center of Berlin Traffc State Estmaton n the Traffc Management Center of Berln Authors: Peter Vortsch, PTV AG, Stumpfstrasse, D-763 Karlsruhe, Germany phone ++49/72/965/35, emal peter.vortsch@ptv.de Peter Möhl, PTV AG,

More information

L10: Linear discriminants analysis

L10: Linear discriminants analysis L0: Lnear dscrmnants analyss Lnear dscrmnant analyss, two classes Lnear dscrmnant analyss, C classes LDA vs. PCA Lmtatons of LDA Varants of LDA Other dmensonalty reducton methods CSCE 666 Pattern Analyss

More information

Forecasting the Demand of Emergency Supplies: Based on the CBR Theory and BP Neural Network

Forecasting the Demand of Emergency Supplies: Based on the CBR Theory and BP Neural Network 700 Proceedngs of the 8th Internatonal Conference on Innovaton & Management Forecastng the Demand of Emergency Supples: Based on the CBR Theory and BP Neural Network Fu Deqang, Lu Yun, L Changbng School

More information

Trivial lump sum R5.0

Trivial lump sum R5.0 Optons form Once you have flled n ths form, please return t wth your orgnal brth certfcate to: Premer PO Box 2067 Croydon CR90 9ND. Fll n ths form usng BLOCK CAPITALS and black nk. Mark all answers wth

More information

14.74 Lecture 5: Health (2)

14.74 Lecture 5: Health (2) 14.74 Lecture 5: Health (2) Esther Duflo February 17, 2004 1 Possble Interventons Last tme we dscussed possble nterventons. Let s take one: provdng ron supplements to people, for example. From the data,

More information

Realistic Image Synthesis

Realistic Image Synthesis Realstc Image Synthess - Combned Samplng and Path Tracng - Phlpp Slusallek Karol Myszkowsk Vncent Pegoraro Overvew: Today Combned Samplng (Multple Importance Samplng) Renderng and Measurng Equaton Random

More information

A Replication-Based and Fault Tolerant Allocation Algorithm for Cloud Computing

A Replication-Based and Fault Tolerant Allocation Algorithm for Cloud Computing A Replcaton-Based and Fault Tolerant Allocaton Algorthm for Cloud Computng Tork Altameem Dept of Computer Scence, RCC, Kng Saud Unversty, PO Box: 28095 11437 Ryadh-Saud Araba Abstract The very large nfrastructure

More information

Efficient Project Portfolio as a tool for Enterprise Risk Management

Efficient Project Portfolio as a tool for Enterprise Risk Management Effcent Proect Portfolo as a tool for Enterprse Rsk Management Valentn O. Nkonov Ural State Techncal Unversty Growth Traectory Consultng Company January 5, 27 Effcent Proect Portfolo as a tool for Enterprse

More information

Number of Levels Cumulative Annual operating Income per year construction costs costs ($) ($) ($) 1 600,000 35,000 100,000 2 2,200,000 60,000 350,000

Number of Levels Cumulative Annual operating Income per year construction costs costs ($) ($) ($) 1 600,000 35,000 100,000 2 2,200,000 60,000 350,000 Problem Set 5 Solutons 1 MIT s consderng buldng a new car park near Kendall Square. o unversty funds are avalable (overhead rates are under pressure and the new faclty would have to pay for tself from

More information

v a 1 b 1 i, a 2 b 2 i,..., a n b n i.

v a 1 b 1 i, a 2 b 2 i,..., a n b n i. SECTION 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS 455 8.4 COMPLEX VECTOR SPACES AND INNER PRODUCTS All the vector spaces we have studed thus far n the text are real vector spaces snce the scalars are

More information

Application of Improved Decision Tree Method based on Rough Set in Building Smart Medical Analysis CRM System

Application of Improved Decision Tree Method based on Rough Set in Building Smart Medical Analysis CRM System , pp. 51-66 http://dx.do.org/10.1457/jsh.016.10.1.3 Applcaton of Improved Decson Tree Method based on Rough Set n Buldng Smart Medcal Analyss CRM System Hongsheng Xu *, Lan Wang and Wenl Gan Luoyang Normal

More information

Title Language Model for Information Retrieval

Title Language Model for Information Retrieval Ttle Language Model for Informaton Retreval Rong Jn Language Technologes Insttute School of Computer Scence Carnege Mellon Unversty Alex G. Hauptmann Computer Scence Department School of Computer Scence

More information

On the Optimal Control of a Cascade of Hydro-Electric Power Stations

On the Optimal Control of a Cascade of Hydro-Electric Power Stations On the Optmal Control of a Cascade of Hydro-Electrc Power Statons M.C.M. Guedes a, A.F. Rbero a, G.V. Smrnov b and S. Vlela c a Department of Mathematcs, School of Scences, Unversty of Porto, Portugal;

More information