MARLINE: Multi-Source Mapping Transfer Learning for Non-Stationary Environments

Size: px
Start display at page:

Download "MARLINE: Multi-Source Mapping Transfer Learning for Non-Stationary Environments"

Transcription

1 MARLINE: Mult-Source Mappng Transfer Learnng for Non-Statonary Envronments Honghu Du School of Informatcs Unversty of Lecester Lecester, Unted Kngdom Leandro L. Mnku School of Computer Scence Unversty of Brmngham Brmngham, Unted Kngdom Huyu Zhou School of Informatcs Unversty of Lecester Lecester, Unted Kngdom Abstract Concept drft s a major problem n onlne learnng due to ts mpact on the predctve performance of data stream mnng systems. Recent studes have started explorng data streams from dfferent sources as a strategy to tackle concept drft n a gven target doman. These approaches make the assumpton that at least one of the source models represents a concept smlar to the target concept, whch may not hold n many real-world scenaros. In ths paper, we propose a novel approach called Mult-source mappng wth transfer LearnIng for Nonstatonary Envronments (MARLINE). MARLINE can beneft from knowledge from multple data sources n non-statonary envronments even when source and target concepts do not match. Ths s acheved by projectng the target concept to the space of each source concept, enablng multple source sub-classfers to contrbute towards the predcton of the target concept as part of an ensemble. Experments on several synthetc and real-world datasets show that MARLINE was more accurate than several state-of-the-art data stream learnng approaches. Index Terms concept drfts, non-statonary envronment, mult-sources, transfer learnng. I. INTRODUCTION The need for effcent streamng data analytcs has rapdly grown n recent years [1]. A data stream can be defned as a sequence of observatons that contnuously arrve over tme, occurrng n many applcatons, such as credt card approval, fraud detecton, and software defect predcton [2]. A key challenge n data stream learnng s that the jont probablty dstrbuton of an applcaton may change over tme,.e., there may be concept drft [3]. Learnng from data streams that may suffer from concept drfts s frequently referred to as learnng n non-statonary envronments [4], [5], whereas a gven jont probablty dstrbuton can be treated as a concept [2], [6]. Data stream learnng algorthms must be able to adapt and swftly react to concept drfts to avod poor predctons [1]. Usng nformaton learned from dfferent data sources s a feasble way to speed up the learnng of a new target concept and mprove the accuracy of the predctons. Ths can be consdered as transfer learnng [7]. However, transfer learnng has been usually used off-lne, requrng the entre tranng set to be avalable before tranng commences. Whle a few Authors n order of contrbuton. L. Mnku was supported by EPSRC Grant No. EP/R006660/2 and H. Zhou was supported by EU Horzon 2020 DOMINOES Project (grant number: ). recent studes appled transfer learnng n non-statonary data streamng envronments [5], most of the approaches presume a smlarty exstng between the source and target concepts [2], [8], [9]. Ths assumpton often fals to hold n practce. For example, the concept underlyng the predcton for bke rental demands n Washngton D.C. and London are dfferent due to dfferent weather patterns and consumer behavours n these two ctes. However, data streams of these two locatons are avalable [10], [11] and could potentally be used to mprove the predctve performance of data stream learnng approaches. Another example s software effort estmaton, where data streams descrbng software projects developed by dfferent companes may be used to mprove software effort estmaton n a gven company, despte havng dfferent underlyng dstrbutons [12]. Therefore, ths paper ams to answer the followng research questons: Can mult-source transfer learnng help us to mprove the predctve performance n non-statonary envronments where source and target data streams do not share the same concept? If so, how? To answer these questons, we hereby propose a novel approach, namely Mult-source mappng transfer LearnIng for Non-statonary Envronments (MARLINE). MARLINE s the frst approach desgned to beneft from multple source data streams even when sources and target could have consderably dfferent concepts. It acheves that by projectng the target concept to the space of each source concept through a novel mappng mechansm, enablng multple source sub-classfers to contrbute towards the predcton of the target concept as part of an ensemble. Our experments show that MARLINE can mprove the predctve performance of the exstng approaches over tme and quckly obtan good performance at the early learnng stage or after the concept drft occurs, even though there are only a few tranng examples avalable. II. RELATED WORK Several approaches have been proposed for data stream learnng n non-statonary envronments [4], [6]. Among these, approaches able to learn example-by-example (onlne) rather than chunk-by-chunk [13] are partcularly relevant to our paper. They have the potental to adapt to concept drfts faster than chunk-based approaches. Such approaches can be further

2 dvded nto actve and passve approaches [4], [13]. Actve approaches trgger the adaptaton mechansms when concept drft detecton methods alert [4], [14]. Examples of drft detecton methods nclude the Drft Detecton Method (DDM) [15] and Drft Detecton Methods based on the Hoeffdng s nequalty (HDDM) (HDDM A and HDDM W ) [16]. Passve approaches contnuously adapt to concept drft wthout relyng on explct concept drft detecton [4], [13]. A popular passve approach s Dynamc Weghted Majorty (DWM) [17]. In spte of ther learnng capacty, none of these approaches uses transfer learnng or operates n mult-source scenaros. Very few approaches have used transfer learnng n nonstatonary envronments [5]. Onlne nductve parameter transfer learnng approaches nclude Dynamc Cross-company Mapped Model Learnng (Dycom) [12], Dversty for Dealng wth Drfts (DDD) [18] and Onlne Wndow Adjustment Algorthm (OWA) [19]. Dycom treats source data offlne, whereas DDD and OWA do not use source data, transferrng knowledge only from the mmedate prevous target concept to the current target concept. A new chunk-based nductve parameter transfer approach called Dversty and Transferbased Ensemble Learnng (DTEL) was proposed n [20]. Smlar to DDD, DTEL does not consder dfferent sources but uses hstorcal target concepts. In addton, ths s a chunkbased approach. Recently, two transductve transfer learnng approaches called MultStream Classfcaton usng Relatve Densty Rato (MSCRDR) [8] and Cross-doman Multstream Classfcaton (COMC) [9] have been proposed to handle multple sources, wth the assumpton that the target stream s unlabelled and the source s labelled. However, these two algorthms requre both the source and the target to share the same task, even after a concept drft happens. Mult-source transfer learnng for non-statonary envronments (Melane) [2] s another onlne transfer learnng method that can learn from multple sources. However, Melane only benefts from source concepts that are smlar to the target. Overall, there s no exstng approach to perform transfer learnng usng multple non-statonary data sources that may have dfferent concepts from those of the target. III. PROBLEM STATEMENT Let {x, y } denote an example receved at a gven pont n tme at a data stream wth doman D = {X, p (x)} and task T = {Y, p (y x)}, where {S 1, S 2,, S n, T }, S n s the nth source data stream, T s the target data stream, x X, X s a d-dmensonal feature space, y { 1, +1} s the class label, p (x) s the margnal probablty dstrbuton and p (y x) s the posteror probablty dstrbuton. All sources and target streams may suffer concept drft. We wll enumerate the concepts p j seen n a data stream usng a sequental dentfer j. Whenever a concept drft occurs, we ncrement j. We use J to denote the number of concepts observed so far n data stream. Note that a gven tranng example, doman and task are all assocated wth the gven concept. Therefore, they are actually all ndexed by j as shown n {x j, yj }, Dj, and T j, but we wll leave ths ndex mplct. S n T M x y d J p j x j H j H K λ s TABLE I: Key Notaton and Descrpton. The data stream from source n The target data stream Set of sources and target data streams seen so far Feature vector from data stream The correspondng label of x Dmensonalty of the feature space Number of concepts from data stream observed so far The jth concept of data stream, 0 < j J The projecton of x T on the jth concept of stream The base learnng ensemble classfer traned by the jth concept of data stream Pool of base learnng ensembles for data stream Base learnng ensemble sze The kth sub-classfer of H j Sum of s probablstc predctons for target examples that t correctly (s = sc) or ncorrectly (s = sw) classfes s predctve performance on current target concept At the begnnng of the data stream or after a concept drft, due to the lack of the target data representng the new concept, the performance of the predctve models s usually poor. Therefore, the am of mult-source transfer s to mprove predctve performance n non-statonary envronments and speed up learnng especally n the begnnng of the data stream or after the occurrence of concept drfts by usng the data from multple sources. We wll nvestgate nductve transfer learnng (.e. T S T T whle D S D T or D S = D T ), as concept drfts may cause changes n T T and T S over tme. IV. PROPOSED METHOD Ths secton ntroduces MARLINE. An overvew of MAR- LINE s tranng framework s gven n Fgure 1, and ts Java mplementaton s avalable n [21]. MARLINE consders that we have multple nput data streams (represented n grey n the fgure), where the concepts of the sources and target streams may be dfferent. However, classfers learned from the source concepts may stll be used to mprove the predctve performance n the target doman. For that, each concept observed from each data stream s learnt by an ndependent onlne learnng ensemble, whch we refer to as base learnng ensemble (shown n yellow). To dentfy dfferent concepts, a concept drft detecton method s used. Whenever a new target example needs to be predcted, MARLINE uses a mappng between the current target concept and each source concept, so that the target example s geometrcally projected onto the space of the source concept (the projectons are shown n purple). Through ths mappng, the target example can be predcted by dfferent sub-classfers (shown n orange) traned by dfferent base learnng ensembles. MARLINE weghts each sub-classfer dependng on how useful t s for predctng the projected target (shown n cyan-blue). The predcton by MARLINE s based on the weghted majorty vote of the predctons of all the sub-classfers that compose all the source and target ensembles. The set of all weghted sub-classfers s referred to as the MARLINE ensemble. Secton IV-A ntroduces MARLINE s tranng procedure. Secton IV-B presents the mappng procedure for concept projectons, whch s undertaken wth respect to the centrods of

3 Fg. 1: Overvew of the proposed MARLINE tranng framework. each concept. The centrods calculaton s explaned n Secton IV-C. Secton IV-D dscusses the weghtng of sub-classfers that compose the MARLINE ensemble used for predctons. Secton IV-E explans the votng procedure used for makng predctons. Secton IV-F shows the tme complexty. A. Tranng The pseudo-code of MARLINE s tranng process s shown n Algorthm 1. The set M {S 1,, S n, T } s the set of all the sources and target for whch an onlne base learnng ensemble has already been generated. When a new example (x, y ) s receved from a source or target data stream for the frst tme, the proposed method creates one onlne base learnng ensemble H 1 for ths source or target (lnes 3 to 6). Any onlne learnng ensemble method can be used, e.g., onlne boostng or baggng [22]. Ensembles are used here because ther dversty ncreases the chances that at least some of the sub-classfers become useful for predctng the target [2]. As the concept of each data stream may change due to concept drfts, each source/target s assocated wth a pool of base learnng ensembles H, where each ensemble may represent a dfferent concept observed from that source/target. Each ensemble H j wthn the pool contans K sub-classfers, where 1 k K. The newly created ensemble H 1 s added to ts correspondng ensemble pool H (lne 7). Ths pool receves addtonal ensembles when suffers from concept drfts, as explaned later n ths secton. After havng added H 1 to H, the method dscussed n Secton IV-C s used to calculate the centrod assocated wth the concept p 1 (lne 8). Ths centrod s used to create the mappng from the target to the source concepts, as explaned n Secton IV-B. Lne 9 s used to ntalse the weghts assocated wth each sub-classfer. These weghts are used to dentfy whch sub-classfers are more mportant for predctng the projected target, and ther calculaton s explaned n Secton IV-D. Whenever a new tranng example (x, y ) s receved from stream, MARLINE runs a drft detecton method on. Any drft detecton method could be used, e.g., HDDM [16]. If the drft detecton method requres montorng a predctve model representng, the most recent ensemble H J s used. If a concept drft s detected, the proposed method creates a new onlne base learnng ensemble H J+1 wth the ntalsaton of ts weghts and connecton wth the pool of ensembles H (lnes 11 to 15). If the new example belongs to the target doman, all the values of λ sc, λ sw,, M, j J, k K for all the sub-classfers are reset (lnes 16 to 18) so that all the sub-classfers weghts can start adaptng to the new target dstrbuton. After checkng for concept drft, the most recent ensemble created for the source or target s traned on the current example (x, y ) (lne 20). The centrod of concept p J s H J updated (lne 21), as explaned n Secton IV-C. If the new tranng example belongs to the target stream, the sub-classfers weghtng scheme dscussed n Secton IV-D s appled (lne 23). The mappng procedure shown n Secton IV-B s used n the weghtng scheme. B. Mappng Procedure The man purpose of the mappng procedure s to create the projecton (x j ) of the current target example (x T ) on the source concept p j. Therefore, gven a current target example, the classfers traned wth a certan source concept can make a predcton on the projecton of ths target example based on the knowledge learned from the source concept. After a concept drft has been detected n the target data stream, any prevous concept n that stream s also regarded as a source concept. From ths pont onward, we wll use the term source+ nstead of source when the past target concepts are ncluded as sources. Therefore, and j n the source+ concept p j are defned as follows: {, j {S 1,, S n }, 0 < j J } {, j = T, 0 < j < J }. To seek the projecton (x j ) of target example (x T ) on p j, a mappng functon between p j and pj T T s requred. However, for onlne learnng we only store the latest example n memory

4 Algorthm 1 Learnng Procedure of MARLINE. Input: Data streams D, {S 1, S 2,, S n, T } Parameter: Onlne Base Learnng Ensemble Algorthm; Base Learnng Ensemble Sze K; Drft Detecton Method; Tme Forgettng Factor 0 < θ 1; Performance Index 0 σ 1 1: Set H =, J = 0, M = ; {S 1, S 2,, S n, T } 2: whle Receve a new example (x, y ), {S 1, S 2,, S n, T } do 3: f / M then 4: M M 5: J 1 (Number of onlne base learnng ensembles assocated to s 1) 6: Intalse onlne base learnng ensemble H J 7: H H H J 8: Calculate centrods c,j as shown n Secton IV-C, used to create the mappng as shown n Secton IV-B 9: λ sc h J,k 0, λ sw h J,k 0, J,k 1, k K 10: end f 11: f DrftDetecton (x, y ) = true then 12: Intalse new onlne base learnng ensemble H J+1 13: J J : H H H J 15: λ sc 0, λ sw 1, k K h J,k 16: f = T then 17: λ sc h J,k 0, λ sw 0, J,k 0, 1, M, j J, k K 18: end f 19: end f 20: OnlneBaseLearnerAlgorthm{H J, (x, y )} 21: Update centrods c,j as shown n Secton IV-C, used to create the mappng as shown n Secton IV-B 22: f = T then 23: Update each sub-classfer s weght as shown n Secton IV-D 24: end f 25: end whle over tme. To buld the mappng functon between the source+ and target concepts wthout retrevng the overall hstorcal data, we propose the followng procedure. Consder a par of reference ponts n a gven source+ concept, and another par n the target concept. For nstance, for a sngle concept p j, the par of reference ponts can be the centrods of the dstrbutons p j (x y), y [ 1, +1],.e.: c y,j = [c1, c 2,, c d ]; y [ 1, +1] (1) The calculaton of c y,j s shown n Secton IV-C. We connect the pars of reference ponts of a gven concept usng a vector V,j = c y=1,j c y= 1,j. The transformaton matrx R between V,j and V T,JT can be consdered as the mappng functon of any two vectors between p j and pj T T : V,j = R V T,JT (2) Therefore, the source+ vector V,j can be seen as a projecton of the target vector V T,JT on p j. The transformaton matrx R can be calculated by Eqs. (3)- (5), where I d s the dentty matrx wth d dmensons V,j u = V -1, V v = T,JT,j V -1 (3) T,JT A = I d ( u + ( u + v ) -1 I d v ) 2 ( u + v ) -1 ( u + (4) v ) R = (A v -1 A v 2 v -1 v ) V,j (5) V T,JT When a new example (x T ) from the target doman s receved, the vector between the new example and one centrod (c y=1 T,J T ) of the target concept can be computed as follows: V T = x T c y=1 T,J T (6) The projecton of V T on p j can be calculated as follows: V I j = R V T (7) The projecton (x j ) s the sum of cy=1,j and V I j : x j = c y=1,j + V I j (8) C. Calculatng and Updatng the Centrods For savng computatonal tme, we update the centrods of each concept nstead of updatng transformaton matrx R at each tme step. The transformaton matrx R wll be calculated whenever a target example needs to be predcted. The centrods of the concept p j (x, y) are dynamcally updated based on examples (x, y ) receved from data stream durng the tme wndow snce the concept j has become actve. Durng ths tme wndow, f no example wth label y has been seen before, the centrod c y=y,j s set as x. Otherwse, t s updated as follows: sumc y=y,j c y=y = θsumc y=y,j + x (9),J = sumcy=y,j L t =1 θ(t 1) (10) where L s the number of the tranng examples receved n the data stream snce the concept p j has become actve, and θ, 0 < θ 1, s a pre-defned forgettng factor used to reduce the weght gven to the hstorcal examples. It helps to deal wth non-statonary envronments. The summaton n the denomnator s a normalsaton factor, whch can be updated n an onlne manner. D. Sub-Classfers Weghtng When an example (x T, y T ) from the target doman s receved, all the sub-classfers weghts ω h are updated. We assgn larger weghts to the sub-classfers whch focus on harder classfed examples. The sub-classfer s weght ω h depends on the correspondng sub-classfer s performance

5 on the current target concept p J T T. All the sub-classfers performances, M, 0 < j J, 0 < k K are updated based on the correspondng projecton (x j, y T ) of the current target example (x T, y T ), whch has been obtaned by the mappng procedure explaned n Secton IV-B. When = T, j = J T, the projecton (x j, y T ) s set to the current target example (x T, y T ). The sub-classfers performance s ntalsed wth = 1, M, 0 < j J, 0 < k K. To update each subclassfer s performance, the current example s weght needs to be calculated. We get the predcton of each sub-classfer on the correspondng target example projecton. The example s weght SW SC can be calculated as follows: SC J K P ( (x j ) = y T ) (11) SW M j=1 k=1 J K M j=1 k=1 P ( (x j ) y T ) (12) where P (h(x) = y) s the probablty of class y, estmated by the sub-classfer h(x), and s the performance computed based on the target examples receved before recevng (x T, y T ). The example s weght SW SC s used to ndcate how confdent the MARLINE ensemble s for the current example. We expect that the more confdent the MARLINE ensemble s, the smaller weght the example receves, and vce versa. Consder that λ sc s the sum of the probablstc predctons of sub-classfer for the target examples that t correctly classfes and λ sw s the sum of the probablstc predctons for msclassfed target examples. The performance of each sub-classfer can be computed ncrementally by: λ sc λ sw θλ sc + SW SC θλ sw + SW SC λ sc P ( (x j ) = y T ) SC P ( λ sc SW + λ sw (x j ) y T ) where 0 < θ 1 s a forgettng factor and α (13) (14), (15) P ( (x j )=y T ) represents how much contrbuton sub-classfer makes n the MARLINE ensemble to vote for the projecton (x j ) of (x T ) wth label y T. Thus, s the current performance percentage of each sub-classfer, gvng more focus to more recently arrvng target examples. The weghts of all the sub-classfers assocated wth > σ are assgned to ther predctve performance α h normalsed by the sum of the predctve performances of all the sub-classfers assocated wth can be formulated as follows: ω h = 1 J K M j =1 k=1 ( j,k >σ? α j,k h SC > σ. The weght ω h α :0) h, > σ 0, otherwse (16) where σ s a pre-defned parameter, (testcondton? v1 : v2) retreves v1 f testcondton s true, and v2 otherwse. E. Votng Procedure for Makng Predctons When a predcton s needed for a target nstance (x T ), we multply the correspondng weghts of the sub-classfers wth the probablstc predcton made by each sub-classfer on ther correspondng projecton (x j ) of the current target example. All sub-classfers, M, j {1,, J }, k {1,, K} are consdered for ths purpose. Afterwards, we obtan the sum of the weghted predcton probabltes of all classes and use majorty vote to decde the predcted class. F. Tme Complexty Analyss When learnng a target tranng example, MARLINE s tranng tme complexty s O(f DD +f H +(J S1 +J S2 + +J Sn + J T )d 2 +(J S1 +J S2 + +J Sn +J T )K f h ), where f H, f DD and f h are the tme complextes for tranng the base learnng ensemble wth the example, runnng the drft detecton method and gettng the predcton from a sub-classfer. When learnng a source tranng example, MARLINE s tranng tme complexty s O(f DD + f H + d). MARLINE s tme complexty for predcton s O((J S1 + J S2 + + J Sn + J T )d 2 + (J S1 + J S2 + + J Sn )K f h Detals on the complexty estmaton can be found n the Supplementary Materal of ths paper [23]. V. EXPERIMENTS SETUP We evaluate MARLINE under several dfferent condtons, ncludng statonary envronments, non-statonary envronments wth dfferent types of concept drfts, and dfferent target data stream szes. Artfcal datasets enable us to better understand when and how MARLINE can be helpful. Real world datasets enable us to check whether MARLINE can work well n practce. A. Datasets We use the same three artfcal datasets of smlar target and sources as those of [2] and generated addtonal datasets where the source was non-smlar to the targets. These datasets have two numerc features and one bnary output. The examples belongng to each output class were generated by a Gaussan dstrbuton as shown n Table II, where each dataset s composed of several (target and sources) data streams. The three datasets wth smlar sources use only the target and smlar source data streams from Table II, whereas the three datasets wth non-smlar source use only the target and nonsmlar source data stream from Table II. The datasets smulate a statonary envronment (no drft) and two types of nonstatonary envronments (abrupt drft and ncremental drft) on the target stream. Each dataset also has three dfferent versons based on dfferent class sze scenaros (small, medum, large) by varyng the number of target tranng examples of each class n 50, 500, Each source stream has 5000 tranng examples of each class wthout any concept drft. The real-world datasets are acqured from London bke sharng dataset [10] and Bke Sharng n Washngton D.C.

6 dataset [11]. The task s to classfy whether rental bkes are n low or hgh demand. We use the medan of the total count of the rental bkes of a gven dataset to ndcate low and hgh demands n ths dataset. We select the features shared by the two datasets (actual temperature, feelng temperature, humdty and wnd speeds) to unfy the feature dmenson. Each dataset s dvded nto three sub-datasets based on holday, weekend and weekday and compose the followng three scenaros: We choose weekdays from Washngton D.C. as the source, and (1) holdays and (2) weekends n London are the targets. We make weekends n Washngton D.C. as the source and (3) weekdays n London are the target. The am of these three dfferent target sub-datasets s to create small, medum and large stream szes (Holday: 384, Weekend: 4970, Weekday: 12060). B. Benchmark Methods and Evaluaton Measures MARLINE was compared aganst Melane [2], Adaptve Random Forest [14], Dynamc Weghted Majorty (DWM) [17], Onlne Baggng [22], Onlne Boostng [22], Onlne Baggng wth Drft Detecton and Onlne Boostng wth Drft Detecton. Melane was chosen for the comparson because t s the state-of-the-art mult-source transfer learnng approach for non-statonary data streams. As MSCRDR [8] and COMC [9], Melane s only able beneft from source concepts when they share the same task as the target concept. However, dfferent from these approaches, Melane has the advantage of beng able to detect when source tasks are dssmlar to the target, avodng to hnder predctve performance on the target when that s the case. Comparng MARLINE aganst Melane wll reveal whether MARLINE s mappng functon s helpful to mprove predctve performance aganst an approach that s only able to beneft from source concepts when they are smlar to the target concept. Adaptve Random Forest and DWM were chosen because they are popular data stream learnng approaches avalable n the Massve Onlne Analyss (MOA) tool [24]. Comparng aganst them shows whether MARLINE can outperform the popular approaches. Onlne baggng and onlne boostng [22] were ncluded as baselne ensemble approaches wth no strat- TABLE II: The parameters of the Gaussans of Artfcal Datasets. Datasets Doman Type Data Stream Class 0 Centre Class 1 Centre Target Target (2,3) (7,8) No Drft Datasets Smlar Source (2,1) (7,8) Non-Smlar Source (-2,-3) (-7,2) Before Drft (2,3) (7,8) Target Target After Drft (2,9) (5,4) Abrupt Drft Datasets Smlar Source (2,9) (5,4) Non-Smlar Source (-2,-3) (-7,2) Target Target Before Drft (2,3) (7,8) After Drft (2,3) (7,8) Source 1 (2,3) (7,8) Source 2 (3,4) (6,7) Incremental Drft Datasets Source 3 (4,5) (5,6) Smlar Source 4 (5,6) (4,5) Source 5 (6,7) (3,4) Source 6 (7,8) (2,3) Non-Smlar Source (-2,-3) (-7,2) ( ) 1 0 The covarance matrx for each class of each doman s except for 0 ( 2 ) 2 0 target n the no drft and non-smlar datasets, whch uses for both 0 2 classes. For the ncremental drft, the centres of the Gaussan of class 0 and 1 move towards each other by one unt at each 100, 1000 and tme steps for the datasets wth class szes of 50, 500, 5000, respectvely, untl the Gaussans of class 0 and 1 swap locaton. Ths leads to ntermedate concepts that are equvalent to each of the smlar sources 1 to 6. egy to deal wth concept drft. They provde a desred lower bound for the predctve performance acheved by MARLINE and any other approaches for non-statonary envronments. Addtonally, they were also appled n combnaton wth a drft detecton method, whch resets the models upon drft alarm, to enable these approaches to cope wth drfts. Comparng aganst the combnaton shows whether or not MARLINE s able to beneft from sources n general. MARLINE wthout source data streams was also used n the comparson because mappng s also performed between the old target concepts and the current target concept. Includng ths approach shows whether or not t s benefcal to use dfferent sources especally for the ntal learnng stage, rather than only mappng between old and new target concepts. Both MARLINE and Melane have been nvestgated wth onlne baggng and onlne boostng as base learnng ensemble methods [22]. All ensemble approaches used Hoeffdng trees [25] as the basc unts of learnng, except for one of the compared approaches (ARF), whch s based on ARFHoeffdng Tree [14]. Two drft detecton methods (DDM [15] and HDDM A [16]) were used for all the approaches that requre drft detecton. DDM s a well known method. HDDM A has been recently shown to perform well compared to other drft detecton methods when confgurng ensembles [26]. Thrty runs were performed for all the compared approaches, except for DWM [17], whch s determnstc and requres a sngle run. The average accuracy across the 30 runs s reported. Grd search was used to tune the hyperparameters of each approach on each dataset based on a prelmnary run. For Onlne Baggng and Boostng, the szes of the sub-classfers vared n 1:1:30. For DWM, β vared n 0:0.1:1, perod p = 1, and the weght threshold for removng sub-classfers was For ARF, the number of trees vared n 10:1:30 (MOA restrcts the mnmum ARF ensemble sze to 10). For Melane, the forgettng factor vared n 0.9 : 0.01 : 1. For MARLINE, θ = 0.9 : 0.01 : 1 and σ = 0.1 : 0.1 : 1. The grd searches results are n the Supplementary Materal of ths paper [23]. For the artfcal data streams, the predctve performance s calculated prequentally and reset upon the real locaton of the drfts [18]. In the artfcal datasets, we know exactly when the concept drfts happen. Ths evaluaton framework wll reset the accuracy to zero when the concept drfts (or the ncrements of an ncremental drft) occur. Ths enables us to measure the performance on each concept separately wthout beng affected by the prevous concepts. For real world data streams, the predctve performance of all the approaches s evaluated usng sldng wndows [27] wth the sze of 10% of the target data stream. Fredman and ther Nemeny post-hoc tests were used to compare the predctve performance of all approaches, on each dataset. VI. EXPERIMENT RESULTS A. Comparson on Artfcal Datasets 1) Experments wth Non-Smlar Source: The experment ams to nvestgate whether or not the use of very dfferent

7 Dataset TABLE III: Fredman Ranks on Each Dataset. Non-Smlar Source Smlar Source Real-World Data No Drft Abrupt Incremental No Drft Abrupt Incremental Holday Weekend Weekday Class Sze or Target Stream Sze Marlne(DDM(Onlne Baggng)) wth source Marlne(DDM(Onlne Boostng)) wth source Marlne(HDDMA(Onlne Baggng)) wth source Marlne(HDDMA(Onlne Boostng)) wth source Marlne(DDM(Onlne Baggng)) wthout source Marlne(DDM(Onlne Boostng)) wthout source Marlne(HDDMA(Onlne Baggng)) wthout source Marlne(HDDMA(Onlne Boostng)) wthout source Melane(DDM(Onlne Baggng)) wth source Melane(DDM(Onlne Boostng)) wth source Melane(HDDMA(Onlne Baggng)) wth source Melane(HDDMA(Onlne Boostng)) wth source Melane(DDM(Onlne Baggng)) wthout source Melane(DDM(Onlne Boostng)) wthout source Melane(HDDMA(Onlne Baggng)) wthout source Melane(HDDMA(Onlne Boostng)) wthout source DDM(Onlne Baggng) DDM(Onlne Boostng) HDDMA(Onlne Baggng) HDDMA(Onlne Boostng) Onlne Baggng Onlne Boostng Adaptve Random Forest(DDM) Adaptve Random Forest(HDDMA) Dynamc Weghted Majorty Fredman s p-values were always < The best approach has ts rankng n red wth grey background and the approaches not sgnfcantly dfferent from t accordng to the Nemeny test are n bold wth grey background. Mean accuracy and standard devatons are n the Supplementary Materal [23]. (a) No Drft; class sze of 50 (b) No Drft; class sze of 5000 (a) Holday (b) Weekend (c) Abrupt; class sze of 50 (d) Abrupt; class sze of 5000 (e) Incremental; class sze of 50 (f) Incremental; class sze of 5000 Fg. 2: Average Accuracy wth Non-Smlar Source. concept sources by MARLINE can help us to mprove the predctve performance. The Fredman rankng of the approaches on each dataset s shown n Table III. We can see that MARLINE wth source s amongst the best performers under dfferent amounts of the tranng data and dfferent types of drfts, as shown by the table cells hghlghted n grey, except the ncremental drfts wth the class sze of 500. MARLINE wthout source s sometmes amongst the best, demonstratng that mappng the new target concept to the space of the hstorcal target concepts s also benefcal. Fgure 2 shows (c) Weekday (d) Weekday Fg. 3: Accuracy on Real World Datasets. some representatve results across tme. Other fgures were omtted due to space restrctons. 2) Experments wth Smlar Source: Melane was desgned to transfer knowledge wth smlar sources and target concepts, beng thus expected to acheve the best performance for these data streams. Based on Fredman and Nemeny tests shown n Table III, Melane outperforms the other approaches n most scenaros. However, MARLINE wth source also acheves compettve results, smlar to those of Melane. In some cases, the performance of MARLINE wth source s better than that of Melane, e.g., MARLINE(HDDM A (Onlne Boostng)) wth source and Melane (HDDM A (Onlne Boostng)) wth source wth the class sze of 500 on the abrupt drft dataset. B. Comparson on Real World Datasets London and Washngton D.C. bke sharng data were collected from dfferent sources, so ther nput and output spaces are qute dfferent (see Table IV). Therefore, the concepts (both n terms of domans and tasks) of the two datasets are supposed to be very dfferent. From Table III,

8 TABLE IV: Input Space and Rental Count for Real World Dataset. Feature AT FT HD WS RC London 1.5 : 34 6 : : : : 7860 Washngton D.C 0.02 : 1 0 : 1 0 : 1 0 : : 977 AT: Actual Temperature; FT Feelng Temperature; HD: Humdty; WS: Wnd Speed; RC: Rental Count. MARLINE(HDDM A (Onlne Baggng)) wth source has the best performance on all the real world datasets. MARLINE wthout source s the 2nd best. From Fgure 3, we can see that the accuracy of MARLINE wth source s qute smlar to that of MARLINE wthout source. Ths may be due to the adaptve mechansm of MARLINE. It s worth notng that when the concept of the stream was easer to learn (as for the artfcal datasets), then MARLINE was most helpful n the begnnng of the stream and rght after drfts. Ths s because, wth tme ncreasng, every method can learn the concept well (Fgure 2). However, when the concept was more complex (as n the real world datasets), then MARLINE provded great help throughout tme (Fgure 3). Addtonal related analyses are n the Supplementary Materal [23]. C. Contrbuton of Source+ Sub-classfers To further nvestgate the mportance of ndvdual subclassfers n MARLINE, we select two datasets (Abrupt wth non-smlar source and class sze of 5000; and Weekday) and plot the average weght ratos of all the source+ sub-classfers over 30 runs n Fgure 4. The total weght rato of the source+ sub-classfers s calculated as: W eghtrato = J K M, T j=1 k=1 ω h J T 1 + j=1 K k=1 ω h T Ths s the sum of the total weghts assgned to the source and hstorcal target sub-classfers n the MARLINE ensemble. From Fgure 4a, before the concept drft occurs, the mean total weght of the source sub-classfers durng ths perod s 25.86%. After concept drft, due to past target subclassfers jonng the MARLINE ensemble, the mportance of the source+ sub-classfers ncreases and the mean total weght of the source+ sub-classfers s 48.24%. The average total weght s even larger for the real world dataset (see Fgure 4b). From Fgure 4b, we notce that the source+ classfers can sgnfcantly contrbute towards the predctons throughout tme (the mean of the total weghts over the whole data stream s 94.88%). Ths may be due to the fact that the real world dataset poses more challenges to the target sub-classfers, whch struggle to mantan ther performance on the artfcal datasets. We also notce some spkes and sudden drops n the total weghts over tme. Ths suggests that the weghtng mechansm s affected by nose. In our future work, we wll nvestgate whether or not other weghtng mechansms can mprove the predctve performance of MARLINE further. VII. SENSITIVITY ANALYSIS As MARLINE has a few hyperparameters, t s mportant to conduct a study to understand (Q1) how dfferent MARLINE (a) Non-smlar; Abrupt; Sze 5000 (b) Weekday Fg. 4: Sources+ Sub-classfers Average Total Weght (Over 30 Runs) by MARLINE (HDDM A(Onlne Baggng)) wth source. ensemble compostons (dfferent types of base learnng ensemble wth dfferent szes K and performance ndex σ) affect the predctve performance, and (Q2) the nfluence of dfferent drft detecton methods and forgettng factors θ used to handle dfferent types of concept drft. Secton VII-B1 answers these questons based on artfcal datasets. The real world datasets are also used to support the analyss n Secton VII-B2. A. Expermental Desgn To nvestgate (Q1) and (Q2), Analyss of Varance (ANOVA) [28] was performed to analyse the nfluence of each hyperparameter as well as ts nteractons wth others on the average prequental accuracy. The step-wse changes of each hyperparameter are defned to cover the range of the best hyperparameter values selected by the grd search for dfferent datasets n Secton V. For (Q1), the followng factors are nvestgated: base learner wth two levels (BLM: Onlne Baggng and Boostng), base learnng ensemble sze K {10, 20, 30} and performance ndex σ {0.0, 0.2, 0.4, 0.6}. As these are all subject-based factors, a Repeated Measures ANOVA desgn s used. For (Q2), the followng factors are nvestgated: drft detecton method (DD: DDM and HDDM A ), forgettng factor θ {0.9, 0.92, 0.94, 0.96, 0.98, 1} and drft type (DT: No Drft, Abrupt, Incremental). The last s only consdered when we apply the artfcal datasets. As the frst two factors are wthnsubject factors and the last s a between-subjects factor, a splt plot (mxed) ANOVA desgn s adopted for the artfcal datasets and a Repeated Measures ANOVA desgn s adopted for the real world datasets. Thrty runs for each combnaton of the factors are carred out on each dataset. Mauchly s sphercty test [29] s used wth a level of sgnfcance of 0.05 to evaluate whether or not the sphercty assumpton s volated. If volated, the ANOVA s p-values are corrected to take that nto account. If the epslon estmate s below 0.75, the Greenhouse Gesser correcton [30] s adopted to correct the degrees of freedom of the F-dstrbuton. Otherwse, the Huynh Feldt correcton [31] s adopted to make t less conservatve [32]. B. Results Table V and VI present the ANOVA results for the artfcal and real world datasets, respectvely. The Sum of squares (SS), degrees of freedom (DF), mean squares (MS), test F statstcs (F) and partal eta-squared (η 2 p) are reported.

9 TABLE V: ANOVA Results for Artfcal Datasets. Factor/Int. SS DF MS F ηp 2 Test of Wthn-Subjects Effects (Q1) σ K K * σ BLM * K BLM BLM * K * σ BLM * σ Test of Wthn-Subjects Effects (Q2) DD * DT θ θ * DT DD * θ * DT DD DD * θ Test of Between-Subjects Effects (Q2) DT BLM: Base Learner Method; DD: Drft Detector Method; DT: Drft Type. The p-value s always less than 0.001, except for DD * θ, whch s (a) Effect of K*σ wth Onlne Baggng (c) Effect of θ*dt wth DDM (b) Effect of K*σ wth Onlne Boostng (d) Effect of θ*dt wth HDDM A Fg. 5: Plots of margnal means on Artfcal Datasets. 1) Analyss Usng Artfcal Datasets: As t can be observed from Table V, performance ndex σ, base learnng ensemble sze K and nteracton K σ have a large mpact (η 2 p 0.131), whereas the other factors and nteractons have a small mpact (η 2 p 0.017). Therefore, the MARLINE ensemble composton factors σ and K have more nfluence on the accuracy. Fgures 5a and 5b llustrate the mpact of factors σ, K, base ensemble learner method and ther nteracton. The two plots are farly smlar to each other, confrmng that the nteracton BLM * K * σ has a small mpact. A large σ = 0.06 s detrmental to the accuracy wth worse accuracy obtaned especally wth a smaller ensemble sze K. Ths s because ths performance ndex s dffcult to be reached by the subclassfers. Therefore, most sub-classfers wll have weght zero n the MARLINE ensemble, effectvely decreasng ts sze and dversty. When σ 0.4, dfferent ensemble szes K and σ values lead to smlar accuracy, where a smaller ensemble sze, e.g. K = 10, leads to slghtly worse accuracy and σ = 0.4 leads to slghtly better accuracy when we use Onlne Baggng. Table V also shows that the forgettng factor θ, the nteracton between the drft detectng method and the drft type DD * DT and the nteracton θ * DT have a medum mpact (0.071 η 2 p 0.076). So, the drft detecton method and θ play mportant and probably dverse roles when handlng dfferent types of concept drfts. Fgures 5c and 5d llustrate the effect of the factors θ, DD and DT and ther nteracton. The two plots show farly smlar patterns, verfyng that the nteracton DD * θ * DT has a small mpact. When the drft appears abruptly, ndependent of the drft detecton method (DT), θ = 0.9 result n the best accuracy. As θ ncreases, the accuracy slghtly decreases. When the drft type s ncremental, the accuracy has larger drops wth θ 0.96, compared wth the abrupt concept drft. As the concept drfts n the ncremental datasets are more dffcult to be detected than the concept drfts n the abrupt datasets, t s reasonable that θ wll take more responsbltes to cope wth concept drfts when drft detecton does not perform well. When the dataset has no drft, there s no sgnfcant dfference between the accuracy obtaned by dfferent drft detecton methods, whch we confrmed by addtonal pared T tests wth Bonferron correctons. Furthermore, the accuracy changes very slghtly when we change θ. Therefore, we summarse that: Q1: Large performance ndex σ and small ensemble szes K can be detrmental to the predctve performance of MARLINE, whereas n general σ = 0.4 assocated wth K 20 led to better results. Q2: If there are concept drfts n the data steam, when the drft detecton method cannot detect the concept drfts accurately, a small value for the forgettng factor (0.9 θ 0.94) normally can help MARLINE to ncrease predctve performance on handlng concept drfts. When there s no concept drft, a small value for the forgettng factor wll not hurt the performance ether. 2) Analyss Usng Real World Datasets: Table VI shows the tests of wthn-subjects performed on the real world datasets. The plots of margnal means are shown n Fgure 6. The results are n general smlar to the results on artfcal datasets, though certan effects and dfferences n the magntude of the predctve performance were larger. We can see that σ, K and nteracton σ K have large effect sze (η 2 p 0.161). Meanwhle, θ has a very large effect sze (η 2 p = 0.495) and the choce of drft detecton method also has a large effect sze (η 2 p = 0.111). Ths could be because n the real world datasets, the concept drfts are a mx of dfferent types of concept drft, whch makes them more dffcult to be detected. Therefore, MARLINE reles more on θ to cope wth the concept drfts. Fgures 6a and 6b show smlar trends to the artfcal datasets. However, when σ 0.4, the mprovement n accuracy wth a greater σ s more sgnfcant, confrmng that both the sze and the qualty (performance ndex) of the subclassfers are mportant. Fgure 6c also shows that HDDM A performs better on the real world datasets, n lne wth the experments shown n Secton V. Also, we fnd that smaller θ values beneft the accuracy more.

10 TABLE VI: ANOVA Results for Real World Datasets. Factor/Int. SS DF MS F ηp 2 Test of Wthn-Subjects Effects (Q1) σ K * σ K BLM BLM * σ BLM * K * σ BLM * K Test of Wthn-Subjects Effects (Q2) θ DD DD * θ BLM: Base Learner Method; DD: Drft Detector Method; DT: Drft Type. The p-value s always less than (a) Effect of K*σ wth Onlne Baggng (c) Effect of DD*θ (b) Effect of K*σ wth Onlne Boostng Fg. 6: Plots of margnal means on Real World Datasets. VIII. CONCLUSION In ths paper, we focus on a general and challengng problem learnng from very dfferent concepts n data stream mnng. By mappng the target concept to the space of each source+ concept, the sub-classfers that closely match the part of the projecton of the target concept are gven hgher weghts n the MARLINE ensemble, beng able to acheve better performance n non-statonary envronments. We carred out extensve experments and the results demonstrate that our proposed MARLINE s effectve. A senstvty analyss s also presented. Future work ncludes the nvestgaton of strateges to reduce the sze of MARLINE s classfer pool; nvestgaton of dfferent weghtng schemes to further mprove accuracy; analyss of the computatonal tme taken to run the approach, complementng ts tme complexty analyss; experments wth more data streams, base learners and drft detecton methods; and an nvestgaton of senstvty to nose. REFERENCES [1] J. Lu, A. Lu, F. Dong, F. Gu, J. Gama, and G. Zhang, Learnng under concept drft: A revew, IEEE TKDE, [2] H. Du, L. L. Mnku, and H. Zhou, Mult-source transfer learnng for non-statonary envronments, n IJCNN, 2019, pp [3] A. C. Pocock, P. Yapans, J. Snger, M. Luján, and G. Brown, Onlne non-statonary boostng. n MCS, 2010, pp [4] G. Dtzler, M. Rover, C. Alpp, and R. Polkar, Learnng n nonstatonary envronments: A survey, IEEE CIM, vol. 10, no. 4, pp , [5] L. L. Mnku, Transfer learnng n non-statonary envronments, n Learnng from Data Streams n Evolvng Envronments, 2019, pp [6] J. Gama, I. Žlobatė, A. Bfet, M. Pechenzky, and A. Bouchacha, A survey on concept drft adaptaton, ACM CSUR, vol. 46, no. 4, p. 44, [7] S. J. Pan and Q. Yang, A survey on transfer learnng, IEEE TKDE, vol. 22, no. 10, pp , [8] B. Dong, Y. Gao, S. Chandra, and L. Khan, Multstream classfcaton wth relatve densty rato estmaton, n AAAI, vol. 33, 2019, pp [9] H. Tao, Z. Wang, Y. L, M. Zaman, and L. Khan, Comc: A framework for onlne cross-doman multstream classfcaton, n IJCNN, 2019, pp [10] London bke sharng dataset, london-bke-sharng-dataset, [11] H. Fanaee-T and J. Gama, Event labelng combnng ensemble detectors and background knowledge, PRAI, vol. 2, no. 2-3, pp , [12] L. L. Mnku and X. Yao, How to make best use of cross-company data n software effort estmaton? n ICSE, 2014, pp [13] B. Krawczyk, L. L. Mnku, J. Gama, J. Stefanowsk, and M. Woźnak, Ensemble learnng for data stream analyss: A survey, Informaton Fuson, vol. 37, pp , [14] H. M. Gomes, A. Bfet, J. Read, J. P. Barddal, F. Enembreck, B. Pfharnger, G. Holmes, and T. Abdessalem, Adaptve random forests for evolvng data stream classfcaton, Machne Learnng, vol. 106, no. 9-10, pp , [15] J. Gama, P. Medas, G. Castllo, and P. Rodrgues, Learnng wth drft detecton, n SBIA, 2004, pp [16] I. Frías-Blanco, J. del Campo-Ávla, G. Ramos-Jmenez, R. Morales- Bueno, A. Ortz-Díaz, and Y. Caballero-Mota, Onlne and nonparametrc drft detecton methods based on hoeffdng s bounds, IEEE TKDE, vol. 27, no. 3, pp , [17] J. Z. Kolter and M. A. Maloof, Dynamc weghted majorty: An ensemble method for drftng concepts, JMLR, vol. 8, no. Dec, pp , [18] L. L. Mnku and X. Yao, DDD: a new ensemble approach for dealng wth concept drft, IEEE TKDE, vol. 24, no. 4, pp , [19] P. Zhao, S. C. Ho, J. Wang, and B. L, Onlne transfer learnng, Artfcal Intellgence, vol. 216, pp , [20] Y. Sun, K. Tang, Z. Zhu, and X. Yao, Concept drft adaptaton by explotng hstorcal knowledge, IEEE TNNLS, [21] H. Du, Code repostory for MARLINE: Mult-source mappng transferlearnng for non-statonary envronments, MARLINE, [22] N. C. Oza, Onlne baggng and boostng, n SMC, vol. 3, 2005, pp [23] H. Du, L. L. Mnku, and H. Zhou, Supplementary materal for MARLINE: Mult-source mappng transferlearnng for non-statonary envronments, [24] A. Bfet, G. Holmes, R. Krkby, and B. Pfahrnger, Moa: Massve onlne analyss, JMLR, vol. 11, no. May, pp , [25] P. Domngos and G. Hulten, Mnng hgh-speed data streams, n KDD, 2000, pp [26] R. S. M. de Barros and S. G. T. de Carvalho Santos, An overvew and comprehensve comparson of ensembles for concept drft, Informaton Fuson, vol. 52, pp , [27] J. Gama, R. Sebastão, and P. P. Rodrgues, Issues n evaluaton of stream learnng algorthms, n KDD, 2009, pp [28] D. C. Montgomery, Desgn and analyss of experments. John wley & sons, [29] J. W. Mauchly, Sgnfcance test for sphercty of a normal n-varate dstrbuton, The Annals of Mathematcal Statstcs, vol. 11, no. 2, pp , [30] S. W. Greenhouse and S. Gesser, On methods n the analyss of profle data, Psychometrka, vol. 24, no. 2, pp , [31] H. Huynh and L. S. Feldt, Estmaton of the box correcton for degrees of freedom from sample data n randomzed block and spltplot desgns, Journal of educatonal statstcs, vol. 1, no. 1, pp , [32] J. Verma, Repeated measures desgn for emprcal researchers. John Wley & Sons, 2015.

Forecasting the Direction and Strength of Stock Market Movement

Forecasting the Direction and Strength of Stock Market Movement Forecastng the Drecton and Strength of Stock Market Movement Jngwe Chen Mng Chen Nan Ye cjngwe@stanford.edu mchen5@stanford.edu nanye@stanford.edu Abstract - Stock market s one of the most complcated systems

More information

An Interest-Oriented Network Evolution Mechanism for Online Communities

An Interest-Oriented Network Evolution Mechanism for Online Communities An Interest-Orented Network Evoluton Mechansm for Onlne Communtes Cahong Sun and Xaopng Yang School of Informaton, Renmn Unversty of Chna, Bejng 100872, P.R. Chna {chsun,yang}@ruc.edu.cn Abstract. Onlne

More information

benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ).

benefit is 2, paid if the policyholder dies within the year, and probability of death within the year is ). REVIEW OF RISK MANAGEMENT CONCEPTS LOSS DISTRIBUTIONS AND INSURANCE Loss and nsurance: When someone s subject to the rsk of ncurrng a fnancal loss, the loss s generally modeled usng a random varable or

More information

INVESTIGATION OF VEHICULAR USERS FAIRNESS IN CDMA-HDR NETWORKS

INVESTIGATION OF VEHICULAR USERS FAIRNESS IN CDMA-HDR NETWORKS 21 22 September 2007, BULGARIA 119 Proceedngs of the Internatonal Conference on Informaton Technologes (InfoTech-2007) 21 st 22 nd September 2007, Bulgara vol. 2 INVESTIGATION OF VEHICULAR USERS FAIRNESS

More information

What is Candidate Sampling

What is Candidate Sampling What s Canddate Samplng Say we have a multclass or mult label problem where each tranng example ( x, T ) conssts of a context x a small (mult)set of target classes T out of a large unverse L of possble

More information

An Alternative Way to Measure Private Equity Performance

An Alternative Way to Measure Private Equity Performance An Alternatve Way to Measure Prvate Equty Performance Peter Todd Parlux Investment Technology LLC Summary Internal Rate of Return (IRR) s probably the most common way to measure the performance of prvate

More information

The Development of Web Log Mining Based on Improve-K-Means Clustering Analysis

The Development of Web Log Mining Based on Improve-K-Means Clustering Analysis The Development of Web Log Mnng Based on Improve-K-Means Clusterng Analyss TngZhong Wang * College of Informaton Technology, Luoyang Normal Unversty, Luoyang, 471022, Chna wangtngzhong2@sna.cn Abstract.

More information

Can Auto Liability Insurance Purchases Signal Risk Attitude?

Can Auto Liability Insurance Purchases Signal Risk Attitude? Internatonal Journal of Busness and Economcs, 2011, Vol. 10, No. 2, 159-164 Can Auto Lablty Insurance Purchases Sgnal Rsk Atttude? Chu-Shu L Department of Internatonal Busness, Asa Unversty, Tawan Sheng-Chang

More information

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008

Risk-based Fatigue Estimate of Deep Water Risers -- Course Project for EM388F: Fracture Mechanics, Spring 2008 Rsk-based Fatgue Estmate of Deep Water Rsers -- Course Project for EM388F: Fracture Mechancs, Sprng 2008 Chen Sh Department of Cvl, Archtectural, and Envronmental Engneerng The Unversty of Texas at Austn

More information

Calculation of Sampling Weights

Calculation of Sampling Weights Perre Foy Statstcs Canada 4 Calculaton of Samplng Weghts 4.1 OVERVIEW The basc sample desgn used n TIMSS Populatons 1 and 2 was a two-stage stratfed cluster desgn. 1 The frst stage conssted of a sample

More information

How To Understand The Results Of The German Meris Cloud And Water Vapour Product

How To Understand The Results Of The German Meris Cloud And Water Vapour Product Ttel: Project: Doc. No.: MERIS level 3 cloud and water vapour products MAPP MAPP-ATBD-ClWVL3 Issue: 1 Revson: 0 Date: 9.12.1998 Functon Name Organsaton Sgnature Date Author: Bennartz FUB Preusker FUB Schüller

More information

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 12 14 The Ch-squared dstrbuton PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 1 If a normal varable X, havng mean µ and varance σ, s standardsed, the new varable Z has a mean 0 and varance 1. When ths standardsed

More information

Forecasting the Demand of Emergency Supplies: Based on the CBR Theory and BP Neural Network

Forecasting the Demand of Emergency Supplies: Based on the CBR Theory and BP Neural Network 700 Proceedngs of the 8th Internatonal Conference on Innovaton & Management Forecastng the Demand of Emergency Supples: Based on the CBR Theory and BP Neural Network Fu Deqang, Lu Yun, L Changbng School

More information

DEFINING %COMPLETE IN MICROSOFT PROJECT

DEFINING %COMPLETE IN MICROSOFT PROJECT CelersSystems DEFINING %COMPLETE IN MICROSOFT PROJECT PREPARED BY James E Aksel, PMP, PMI-SP, MVP For Addtonal Informaton about Earned Value Management Systems and reportng, please contact: CelersSystems,

More information

Traffic State Estimation in the Traffic Management Center of Berlin

Traffic State Estimation in the Traffic Management Center of Berlin Traffc State Estmaton n the Traffc Management Center of Berln Authors: Peter Vortsch, PTV AG, Stumpfstrasse, D-763 Karlsruhe, Germany phone ++49/72/965/35, emal peter.vortsch@ptv.de Peter Möhl, PTV AG,

More information

Disagreement-Based Multi-System Tracking

Disagreement-Based Multi-System Tracking Dsagreement-Based Mult-System Trackng Quannan L 1, Xnggang Wang 2, We Wang 3, Yuan Jang 3, Zh-Hua Zhou 3, Zhuowen Tu 1 1 Lab of Neuro Imagng, Unversty of Calforna, Los Angeles 2 Huazhong Unversty of Scence

More information

Vision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION

Vision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION Vson Mouse Saurabh Sarkar a* a Unversty of Cncnnat, Cncnnat, USA ABSTRACT The report dscusses a vson based approach towards trackng of eyes and fngers. The report descrbes the process of locatng the possble

More information

THE DISTRIBUTION OF LOAN PORTFOLIO VALUE * Oldrich Alfons Vasicek

THE DISTRIBUTION OF LOAN PORTFOLIO VALUE * Oldrich Alfons Vasicek HE DISRIBUION OF LOAN PORFOLIO VALUE * Oldrch Alfons Vascek he amount of captal necessary to support a portfolo of debt securtes depends on the probablty dstrbuton of the portfolo loss. Consder a portfolo

More information

Portfolio Loss Distribution

Portfolio Loss Distribution Portfolo Loss Dstrbuton Rsky assets n loan ortfolo hghly llqud assets hold-to-maturty n the bank s balance sheet Outstandngs The orton of the bank asset that has already been extended to borrowers. Commtment

More information

Luby s Alg. for Maximal Independent Sets using Pairwise Independence

Luby s Alg. for Maximal Independent Sets using Pairwise Independence Lecture Notes for Randomzed Algorthms Luby s Alg. for Maxmal Independent Sets usng Parwse Independence Last Updated by Erc Vgoda on February, 006 8. Maxmal Independent Sets For a graph G = (V, E), an ndependent

More information

A Fast Incremental Spectral Clustering for Large Data Sets

A Fast Incremental Spectral Clustering for Large Data Sets 2011 12th Internatonal Conference on Parallel and Dstrbuted Computng, Applcatons and Technologes A Fast Incremental Spectral Clusterng for Large Data Sets Tengteng Kong 1,YeTan 1, Hong Shen 1,2 1 School

More information

SPEE Recommended Evaluation Practice #6 Definition of Decline Curve Parameters Background:

SPEE Recommended Evaluation Practice #6 Definition of Decline Curve Parameters Background: SPEE Recommended Evaluaton Practce #6 efnton of eclne Curve Parameters Background: The producton hstores of ol and gas wells can be analyzed to estmate reserves and future ol and gas producton rates and

More information

The Application of Fractional Brownian Motion in Option Pricing

The Application of Fractional Brownian Motion in Option Pricing Vol. 0, No. (05), pp. 73-8 http://dx.do.org/0.457/jmue.05.0..6 The Applcaton of Fractonal Brownan Moton n Opton Prcng Qng-xn Zhou School of Basc Scence,arbn Unversty of Commerce,arbn zhouqngxn98@6.com

More information

1. Measuring association using correlation and regression

1. Measuring association using correlation and regression How to measure assocaton I: Correlaton. 1. Measurng assocaton usng correlaton and regresson We often would lke to know how one varable, such as a mother's weght, s related to another varable, such as a

More information

Face Verification Problem. Face Recognition Problem. Application: Access Control. Biometric Authentication. Face Verification (1:1 matching)

Face Verification Problem. Face Recognition Problem. Application: Access Control. Biometric Authentication. Face Verification (1:1 matching) Face Recognton Problem Face Verfcaton Problem Face Verfcaton (1:1 matchng) Querymage face query Face Recognton (1:N matchng) database Applcaton: Access Control www.vsage.com www.vsoncs.com Bometrc Authentcaton

More information

Power-of-Two Policies for Single- Warehouse Multi-Retailer Inventory Systems with Order Frequency Discounts

Power-of-Two Policies for Single- Warehouse Multi-Retailer Inventory Systems with Order Frequency Discounts Power-of-wo Polces for Sngle- Warehouse Mult-Retaler Inventory Systems wth Order Frequency Dscounts José A. Ventura Pennsylvana State Unversty (USA) Yale. Herer echnon Israel Insttute of echnology (Israel)

More information

Statistical Methods to Develop Rating Models

Statistical Methods to Develop Rating Models Statstcal Methods to Develop Ratng Models [Evelyn Hayden and Danel Porath, Österrechsche Natonalbank and Unversty of Appled Scences at Manz] Source: The Basel II Rsk Parameters Estmaton, Valdaton, and

More information

How To Calculate The Accountng Perod Of Nequalty

How To Calculate The Accountng Perod Of Nequalty Inequalty and The Accountng Perod Quentn Wodon and Shlomo Ytzha World Ban and Hebrew Unversty September Abstract Income nequalty typcally declnes wth the length of tme taen nto account for measurement.

More information

8.5 UNITARY AND HERMITIAN MATRICES. The conjugate transpose of a complex matrix A, denoted by A*, is given by

8.5 UNITARY AND HERMITIAN MATRICES. The conjugate transpose of a complex matrix A, denoted by A*, is given by 6 CHAPTER 8 COMPLEX VECTOR SPACES 5. Fnd the kernel of the lnear transformaton gven n Exercse 5. In Exercses 55 and 56, fnd the mage of v, for the ndcated composton, where and are gven by the followng

More information

CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol

CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK Sample Stablty Protocol Background The Cholesterol Reference Method Laboratory Network (CRMLN) developed certfcaton protocols for total cholesterol, HDL

More information

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module LOSSLESS IMAGE COMPRESSION SYSTEMS Lesson 3 Lossless Compresson: Huffman Codng Instructonal Objectves At the end of ths lesson, the students should be able to:. Defne and measure source entropy..

More information

A Hierarchical Anomaly Network Intrusion Detection System using Neural Network Classification

A Hierarchical Anomaly Network Intrusion Detection System using Neural Network Classification IDC IDC A Herarchcal Anomaly Network Intruson Detecton System usng Neural Network Classfcaton ZHENG ZHANG, JUN LI, C. N. MANIKOPOULOS, JAY JORGENSON and JOSE UCLES ECE Department, New Jersey Inst. of Tech.,

More information

The OC Curve of Attribute Acceptance Plans

The OC Curve of Attribute Acceptance Plans The OC Curve of Attrbute Acceptance Plans The Operatng Characterstc (OC) curve descrbes the probablty of acceptng a lot as a functon of the lot s qualty. Fgure 1 shows a typcal OC Curve. 10 8 6 4 1 3 4

More information

Feature selection for intrusion detection. Slobodan Petrović NISlab, Gjøvik University College

Feature selection for intrusion detection. Slobodan Petrović NISlab, Gjøvik University College Feature selecton for ntruson detecton Slobodan Petrovć NISlab, Gjøvk Unversty College Contents The feature selecton problem Intruson detecton Traffc features relevant for IDS The CFS measure The mrmr measure

More information

Implementation of Deutsch's Algorithm Using Mathcad

Implementation of Deutsch's Algorithm Using Mathcad Implementaton of Deutsch's Algorthm Usng Mathcad Frank Roux The followng s a Mathcad mplementaton of Davd Deutsch's quantum computer prototype as presented on pages - n "Machnes, Logc and Quantum Physcs"

More information

ECE544NA Final Project: Robust Machine Learning Hardware via Classifier Ensemble

ECE544NA Final Project: Robust Machine Learning Hardware via Classifier Ensemble 1 ECE544NA Fnal Project: Robust Machne Learnng Hardware va Classfer Ensemble Sa Zhang, szhang12@llnos.edu Dept. of Electr. & Comput. Eng., Unv. of Illnos at Urbana-Champagn, Urbana, IL, USA Abstract In

More information

Improved SVM in Cloud Computing Information Mining

Improved SVM in Cloud Computing Information Mining Internatonal Journal of Grd Dstrbuton Computng Vol.8, No.1 (015), pp.33-40 http://dx.do.org/10.1457/jgdc.015.8.1.04 Improved n Cloud Computng Informaton Mnng Lvshuhong (ZhengDe polytechnc college JangSu

More information

Logistic Regression. Lecture 4: More classifiers and classes. Logistic regression. Adaboost. Optimization. Multiple class classification

Logistic Regression. Lecture 4: More classifiers and classes. Logistic regression. Adaboost. Optimization. Multiple class classification Lecture 4: More classfers and classes C4B Machne Learnng Hlary 20 A. Zsserman Logstc regresson Loss functons revsted Adaboost Loss functons revsted Optmzaton Multple class classfcaton Logstc Regresson

More information

"Research Note" APPLICATION OF CHARGE SIMULATION METHOD TO ELECTRIC FIELD CALCULATION IN THE POWER CABLES *

Research Note APPLICATION OF CHARGE SIMULATION METHOD TO ELECTRIC FIELD CALCULATION IN THE POWER CABLES * Iranan Journal of Scence & Technology, Transacton B, Engneerng, ol. 30, No. B6, 789-794 rnted n The Islamc Republc of Iran, 006 Shraz Unversty "Research Note" ALICATION OF CHARGE SIMULATION METHOD TO ELECTRIC

More information

A Hierarchical Reliability Model of Service-Based Software System

A Hierarchical Reliability Model of Service-Based Software System 2009 33rd Annual IEEE Internatonal Computer Software and Applcatons Conference A Herarchcal Relablty Model of Servce-Based Software System Lun Wang, Xaoyng Ba, Lzhu Zhou Department of Computer Scence and

More information

Single and multiple stage classifiers implementing logistic discrimination

Single and multiple stage classifiers implementing logistic discrimination Sngle and multple stage classfers mplementng logstc dscrmnaton Hélo Radke Bttencourt 1 Dens Alter de Olvera Moraes 2 Vctor Haertel 2 1 Pontfíca Unversdade Católca do Ro Grande do Sul - PUCRS Av. Ipranga,

More information

CS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements

CS 2750 Machine Learning. Lecture 3. Density estimation. CS 2750 Machine Learning. Announcements Lecture 3 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 5329 Sennott Square Next lecture: Matlab tutoral Announcements Rules for attendng the class: Regstered for credt Regstered for audt (only f there

More information

Sketching Sampled Data Streams

Sketching Sampled Data Streams Sketchng Sampled Data Streams Florn Rusu, Aln Dobra CISE Department Unversty of Florda Ganesvlle, FL, USA frusu@cse.ufl.edu adobra@cse.ufl.edu Abstract Samplng s used as a unversal method to reduce the

More information

A Dynamic Load Balancing for Massive Multiplayer Online Game Server

A Dynamic Load Balancing for Massive Multiplayer Online Game Server A Dynamc Load Balancng for Massve Multplayer Onlne Game Server Jungyoul Lm, Jaeyong Chung, Jnryong Km and Kwanghyun Shm Dgtal Content Research Dvson Electroncs and Telecommuncatons Research Insttute Daejeon,

More information

1 Example 1: Axis-aligned rectangles

1 Example 1: Axis-aligned rectangles COS 511: Theoretcal Machne Learnng Lecturer: Rob Schapre Lecture # 6 Scrbe: Aaron Schld February 21, 2013 Last class, we dscussed an analogue for Occam s Razor for nfnte hypothess spaces that, n conjuncton

More information

IMPACT ANALYSIS OF A CELLULAR PHONE

IMPACT ANALYSIS OF A CELLULAR PHONE 4 th ASA & μeta Internatonal Conference IMPACT AALYSIS OF A CELLULAR PHOE We Lu, 2 Hongy L Bejng FEAonlne Engneerng Co.,Ltd. Bejng, Chna ABSTRACT Drop test smulaton plays an mportant role n nvestgatng

More information

Reliable State Monitoring in Cloud Datacenters

Reliable State Monitoring in Cloud Datacenters Relable State Montorng n Cloud Datacenters Shcong Meng Arun K. Iyengar Isabelle M. Rouvellou Lng Lu Ksung Lee Balaj Palansamy Yuzhe Tang College of Computng, Georga Insttute of Technology, Atlanta, GA

More information

Extending Probabilistic Dynamic Epistemic Logic

Extending Probabilistic Dynamic Epistemic Logic Extendng Probablstc Dynamc Epstemc Logc Joshua Sack May 29, 2008 Probablty Space Defnton A probablty space s a tuple (S, A, µ), where 1 S s a set called the sample space. 2 A P(S) s a σ-algebra: a set

More information

A Secure Password-Authenticated Key Agreement Using Smart Cards

A Secure Password-Authenticated Key Agreement Using Smart Cards A Secure Password-Authentcated Key Agreement Usng Smart Cards Ka Chan 1, Wen-Chung Kuo 2 and Jn-Chou Cheng 3 1 Department of Computer and Informaton Scence, R.O.C. Mltary Academy, Kaohsung 83059, Tawan,

More information

Lecture 3: Annuity. Study annuities whose payments form a geometric progression or a arithmetic progression.

Lecture 3: Annuity. Study annuities whose payments form a geometric progression or a arithmetic progression. Lecture 3: Annuty Goals: Learn contnuous annuty and perpetuty. Study annutes whose payments form a geometrc progresson or a arthmetc progresson. Dscuss yeld rates. Introduce Amortzaton Suggested Textbook

More information

The Use of Analytics for Claim Fraud Detection Roosevelt C. Mosley, Jr., FCAS, MAAA Nick Kucera Pinnacle Actuarial Resources Inc.

The Use of Analytics for Claim Fraud Detection Roosevelt C. Mosley, Jr., FCAS, MAAA Nick Kucera Pinnacle Actuarial Resources Inc. Paper 1837-2014 The Use of Analytcs for Clam Fraud Detecton Roosevelt C. Mosley, Jr., FCAS, MAAA Nck Kucera Pnnacle Actuaral Resources Inc., Bloomngton, IL ABSTRACT As t has been wdely reported n the nsurance

More information

VoIP Playout Buffer Adjustment using Adaptive Estimation of Network Delays

VoIP Playout Buffer Adjustment using Adaptive Estimation of Network Delays VoIP Playout Buffer Adjustment usng Adaptve Estmaton of Network Delays Mroslaw Narbutt and Lam Murphy* Department of Computer Scence Unversty College Dubln, Belfeld, Dubln, IRELAND Abstract The poor qualty

More information

Lecture 2: Single Layer Perceptrons Kevin Swingler

Lecture 2: Single Layer Perceptrons Kevin Swingler Lecture 2: Sngle Layer Perceptrons Kevn Sngler kms@cs.str.ac.uk Recap: McCulloch-Ptts Neuron Ths vastly smplfed model of real neurons s also knon as a Threshold Logc Unt: W 2 A Y 3 n W n. A set of synapses

More information

Learning from Multiple Outlooks

Learning from Multiple Outlooks Learnng from Multple Outlooks Maayan Harel Department of Electrcal Engneerng, Technon, Hafa, Israel She Mannor Department of Electrcal Engneerng, Technon, Hafa, Israel maayanga@tx.technon.ac.l she@ee.technon.ac.l

More information

Enterprise Master Patient Index

Enterprise Master Patient Index Enterprse Master Patent Index Healthcare data are captured n many dfferent settngs such as hosptals, clncs, labs, and physcan offces. Accordng to a report by the CDC, patents n the Unted States made an

More information

Credit Limit Optimization (CLO) for Credit Cards

Credit Limit Optimization (CLO) for Credit Cards Credt Lmt Optmzaton (CLO) for Credt Cards Vay S. Desa CSCC IX, Ednburgh September 8, 2005 Copyrght 2003, SAS Insttute Inc. All rghts reserved. SAS Propretary Agenda Background Tradtonal approaches to credt

More information

Design and Development of a Security Evaluation Platform Based on International Standards

Design and Development of a Security Evaluation Platform Based on International Standards Internatonal Journal of Informatcs Socety, VOL.5, NO.2 (203) 7-80 7 Desgn and Development of a Securty Evaluaton Platform Based on Internatonal Standards Yuj Takahash and Yoshm Teshgawara Graduate School

More information

NPAR TESTS. One-Sample Chi-Square Test. Cell Specification. Observed Frequencies 1O i 6. Expected Frequencies 1EXP i 6

NPAR TESTS. One-Sample Chi-Square Test. Cell Specification. Observed Frequencies 1O i 6. Expected Frequencies 1EXP i 6 PAR TESTS If a WEIGHT varable s specfed, t s used to replcate a case as many tmes as ndcated by the weght value rounded to the nearest nteger. If the workspace requrements are exceeded and samplng has

More information

Realistic Image Synthesis

Realistic Image Synthesis Realstc Image Synthess - Combned Samplng and Path Tracng - Phlpp Slusallek Karol Myszkowsk Vncent Pegoraro Overvew: Today Combned Samplng (Multple Importance Samplng) Renderng and Measurng Equaton Random

More information

Analysis of Premium Liabilities for Australian Lines of Business

Analysis of Premium Liabilities for Australian Lines of Business Summary of Analyss of Premum Labltes for Australan Lnes of Busness Emly Tao Honours Research Paper, The Unversty of Melbourne Emly Tao Acknowledgements I am grateful to the Australan Prudental Regulaton

More information

PEER REVIEWER RECOMMENDATION IN ONLINE SOCIAL LEARNING CONTEXT: INTEGRATING INFORMATION OF LEARNERS AND SUBMISSIONS

PEER REVIEWER RECOMMENDATION IN ONLINE SOCIAL LEARNING CONTEXT: INTEGRATING INFORMATION OF LEARNERS AND SUBMISSIONS PEER REVIEWER RECOMMENDATION IN ONLINE SOCIAL LEARNING CONTEXT: INTEGRATING INFORMATION OF LEARNERS AND SUBMISSIONS Yunhong Xu, Faculty of Management and Economcs, Kunmng Unversty of Scence and Technology,

More information

Small pots lump sum payment instruction

Small pots lump sum payment instruction For customers Small pots lump sum payment nstructon Please read these notes before completng ths nstructon About ths nstructon Use ths nstructon f you re an ndvdual wth Aegon Retrement Choces Self Invested

More information

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic Lagrange Multplers as Quanttatve Indcators n Economcs Ivan Mezník Insttute of Informatcs, Faculty of Busness and Management, Brno Unversty of TechnologCzech Republc Abstract The quanttatve role of Lagrange

More information

Lecture 3: Force of Interest, Real Interest Rate, Annuity

Lecture 3: Force of Interest, Real Interest Rate, Annuity Lecture 3: Force of Interest, Real Interest Rate, Annuty Goals: Study contnuous compoundng and force of nterest Dscuss real nterest rate Learn annuty-mmedate, and ts present value Study annuty-due, and

More information

Detecting Credit Card Fraud using Periodic Features

Detecting Credit Card Fraud using Periodic Features Detectng Credt Card Fraud usng Perodc Features Alejandro Correa Bahnsen, Djamla Aouada, Aleksandar Stojanovc and Björn Ottersten Interdscplnary Centre for Securty, Relablty and Trust Unversty of Luxembourg,

More information

FREQUENCY OF OCCURRENCE OF CERTAIN CHEMICAL CLASSES OF GSR FROM VARIOUS AMMUNITION TYPES

FREQUENCY OF OCCURRENCE OF CERTAIN CHEMICAL CLASSES OF GSR FROM VARIOUS AMMUNITION TYPES FREQUENCY OF OCCURRENCE OF CERTAIN CHEMICAL CLASSES OF GSR FROM VARIOUS AMMUNITION TYPES Zuzanna BRO EK-MUCHA, Grzegorz ZADORA, 2 Insttute of Forensc Research, Cracow, Poland 2 Faculty of Chemstry, Jagellonan

More information

APPLICATION OF PROBE DATA COLLECTED VIA INFRARED BEACONS TO TRAFFIC MANEGEMENT

APPLICATION OF PROBE DATA COLLECTED VIA INFRARED BEACONS TO TRAFFIC MANEGEMENT APPLICATION OF PROBE DATA COLLECTED VIA INFRARED BEACONS TO TRAFFIC MANEGEMENT Toshhko Oda (1), Kochro Iwaoka (2) (1), (2) Infrastructure Systems Busness Unt, Panasonc System Networks Co., Ltd. Saedo-cho

More information

Multi-sensor Data Fusion for Cyber Security Situation Awareness

Multi-sensor Data Fusion for Cyber Security Situation Awareness Avalable onlne at www.scencedrect.com Proceda Envronmental Scences 0 (20 ) 029 034 20 3rd Internatonal Conference on Envronmental 3rd Internatonal Conference on Envronmental Scence and Informaton Applcaton

More information

THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION

THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION Internatonal Journal of Electronc Busness Management, Vol. 3, No. 4, pp. 30-30 (2005) 30 THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION Yu-Mn Chang *, Yu-Cheh

More information

Staff Paper. Farm Savings Accounts: Examining Income Variability, Eligibility, and Benefits. Brent Gloy, Eddy LaDue, and Charles Cuykendall

Staff Paper. Farm Savings Accounts: Examining Income Variability, Eligibility, and Benefits. Brent Gloy, Eddy LaDue, and Charles Cuykendall SP 2005-02 August 2005 Staff Paper Department of Appled Economcs and Management Cornell Unversty, Ithaca, New York 14853-7801 USA Farm Savngs Accounts: Examnng Income Varablty, Elgblty, and Benefts Brent

More information

L10: Linear discriminants analysis

L10: Linear discriminants analysis L0: Lnear dscrmnants analyss Lnear dscrmnant analyss, two classes Lnear dscrmnant analyss, C classes LDA vs. PCA Lmtatons of LDA Varants of LDA Other dmensonalty reducton methods CSCE 666 Pattern Analyss

More information

Open Access A Load Balancing Strategy with Bandwidth Constraint in Cloud Computing. Jing Deng 1,*, Ping Guo 2, Qi Li 3, Haizhu Chen 1

Open Access A Load Balancing Strategy with Bandwidth Constraint in Cloud Computing. Jing Deng 1,*, Ping Guo 2, Qi Li 3, Haizhu Chen 1 Send Orders for Reprnts to reprnts@benthamscence.ae The Open Cybernetcs & Systemcs Journal, 2014, 8, 115-121 115 Open Access A Load Balancng Strategy wth Bandwdth Constrant n Cloud Computng Jng Deng 1,*,

More information

How To Detect An 802.11 Traffc From A Network With A Network Onlne Onlnet

How To Detect An 802.11 Traffc From A Network With A Network Onlne Onlnet IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. X, NO. X, XXX 2008 1 Passve Onlne Detecton of 802.11 Traffc Usng Sequental Hypothess Testng wth TCP ACK-Pars We We, Member, IEEE, Kyoungwon Suh, Member, IEEE,

More information

Performance Analysis and Coding Strategy of ECOC SVMs

Performance Analysis and Coding Strategy of ECOC SVMs Internatonal Journal of Grd and Dstrbuted Computng Vol.7, No. (04), pp.67-76 http://dx.do.org/0.457/jgdc.04.7..07 Performance Analyss and Codng Strategy of ECOC SVMs Zhgang Yan, and Yuanxuan Yang, School

More information

On-Line Fault Detection in Wind Turbine Transmission System using Adaptive Filter and Robust Statistical Features

On-Line Fault Detection in Wind Turbine Transmission System using Adaptive Filter and Robust Statistical Features On-Lne Fault Detecton n Wnd Turbne Transmsson System usng Adaptve Flter and Robust Statstcal Features Ruoyu L Remote Dagnostcs Center SKF USA Inc. 3443 N. Sam Houston Pkwy., Houston TX 77086 Emal: ruoyu.l@skf.com

More information

Network Aware Load-Balancing via Parallel VM Migration for Data Centers

Network Aware Load-Balancing via Parallel VM Migration for Data Centers Network Aware Load-Balancng va Parallel VM Mgraton for Data Centers Kun-Tng Chen 2, Chen Chen 12, Po-Hsang Wang 2 1 Informaton Technology Servce Center, 2 Department of Computer Scence Natonal Chao Tung

More information

Web Spam Detection Using Machine Learning in Specific Domain Features

Web Spam Detection Using Machine Learning in Specific Domain Features Journal of Informaton Assurance and Securty 3 (2008) 220-229 Web Spam Detecton Usng Machne Learnng n Specfc Doman Features Hassan Najadat 1, Ismal Hmed 2 Department of Computer Informaton Systems Faculty

More information

Trust Formation in a C2C Market: Effect of Reputation Management System

Trust Formation in a C2C Market: Effect of Reputation Management System Trust Formaton n a C2C Market: Effect of Reputaton Management System Htosh Yamamoto Unversty of Electro-Communcatons htosh@s.uec.ac.jp Kazunar Ishda Tokyo Unversty of Agrculture k-shda@noda.ac.jp Toshzum

More information

SIMPLE LINEAR CORRELATION

SIMPLE LINEAR CORRELATION SIMPLE LINEAR CORRELATION Smple lnear correlaton s a measure of the degree to whch two varables vary together, or a measure of the ntensty of the assocaton between two varables. Correlaton often s abused.

More information

Cloud-based Social Application Deployment using Local Processing and Global Distribution

Cloud-based Social Application Deployment using Local Processing and Global Distribution Cloud-based Socal Applcaton Deployment usng Local Processng and Global Dstrbuton Zh Wang *, Baochun L, Lfeng Sun *, and Shqang Yang * * Bejng Key Laboratory of Networked Multmeda Department of Computer

More information

Marginal Benefit Incidence Analysis Using a Single Cross-section of Data. Mohamed Ihsan Ajwad and Quentin Wodon 1. World Bank.

Marginal Benefit Incidence Analysis Using a Single Cross-section of Data. Mohamed Ihsan Ajwad and Quentin Wodon 1. World Bank. Margnal Beneft Incdence Analyss Usng a Sngle Cross-secton of Data Mohamed Ihsan Ajwad and uentn Wodon World Bank August 200 Abstract In a recent paper, Lanjouw and Ravallon proposed an attractve and smple

More information

Brigid Mullany, Ph.D University of North Carolina, Charlotte

Brigid Mullany, Ph.D University of North Carolina, Charlotte Evaluaton And Comparson Of The Dfferent Standards Used To Defne The Postonal Accuracy And Repeatablty Of Numercally Controlled Machnng Center Axes Brgd Mullany, Ph.D Unversty of North Carolna, Charlotte

More information

How To Know The Components Of Mean Squared Error Of Herarchcal Estmator S

How To Know The Components Of Mean Squared Error Of Herarchcal Estmator S S C H E D A E I N F O R M A T I C A E VOLUME 0 0 On Mean Squared Error of Herarchcal Estmator Stans law Brodowsk Faculty of Physcs, Astronomy, and Appled Computer Scence, Jagellonan Unversty, Reymonta

More information

Enabling P2P One-view Multi-party Video Conferencing

Enabling P2P One-view Multi-party Video Conferencing Enablng P2P One-vew Mult-party Vdeo Conferencng Yongxang Zhao, Yong Lu, Changja Chen, and JanYn Zhang Abstract Mult-Party Vdeo Conferencng (MPVC) facltates realtme group nteracton between users. Whle P2P

More information

Planning for Marketing Campaigns

Planning for Marketing Campaigns Plannng for Marketng Campagns Qang Yang and Hong Cheng Department of Computer Scence Hong Kong Unversty of Scence and Technology Clearwater Bay, Kowloon, Hong Kong, Chna (qyang, csch)@cs.ust.hk Abstract

More information

Data Mining from the Information Systems: Performance Indicators at Masaryk University in Brno

Data Mining from the Information Systems: Performance Indicators at Masaryk University in Brno Data Mnng from the Informaton Systems: Performance Indcators at Masaryk Unversty n Brno Mkuláš Bek EUA Workshop Strasbourg, 1-2 December 2006 1 Locaton of Brno Brno EUA Workshop Strasbourg, 1-2 December

More information

A system for real-time calculation and monitoring of energy performance and carbon emissions of RET systems and buildings

A system for real-time calculation and monitoring of energy performance and carbon emissions of RET systems and buildings A system for real-tme calculaton and montorng of energy performance and carbon emssons of RET systems and buldngs Dr PAAIOTIS PHILIMIS Dr ALESSADRO GIUSTI Dr STEPHE GARVI CE Technology Center Democratas

More information

Data Broadcast on a Multi-System Heterogeneous Overlayed Wireless Network *

Data Broadcast on a Multi-System Heterogeneous Overlayed Wireless Network * JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 24, 819-840 (2008) Data Broadcast on a Mult-System Heterogeneous Overlayed Wreless Network * Department of Computer Scence Natonal Chao Tung Unversty Hsnchu,

More information

Conversion between the vector and raster data structures using Fuzzy Geographical Entities

Conversion between the vector and raster data structures using Fuzzy Geographical Entities Converson between the vector and raster data structures usng Fuzzy Geographcal Enttes Cdála Fonte Department of Mathematcs Faculty of Scences and Technology Unversty of Combra, Apartado 38, 3 454 Combra,

More information

Trivial lump sum R5.0

Trivial lump sum R5.0 Optons form Once you have flled n ths form, please return t wth your orgnal brth certfcate to: Premer PO Box 2067 Croydon CR90 9ND. Fll n ths form usng BLOCK CAPITALS and black nk. Mark all answers wth

More information

ivoip: an Intelligent Bandwidth Management Scheme for VoIP in WLANs

ivoip: an Intelligent Bandwidth Management Scheme for VoIP in WLANs VoIP: an Intellgent Bandwdth Management Scheme for VoIP n WLANs Zhenhu Yuan and Gabrel-Mro Muntean Abstract Voce over Internet Protocol (VoIP) has been wdely used by many moble consumer devces n IEEE 802.11

More information

A DATA MINING APPLICATION IN A STUDENT DATABASE

A DATA MINING APPLICATION IN A STUDENT DATABASE JOURNAL OF AERONAUTICS AND SPACE TECHNOLOGIES JULY 005 VOLUME NUMBER (53-57) A DATA MINING APPLICATION IN A STUDENT DATABASE Şenol Zafer ERDOĞAN Maltepe Ünversty Faculty of Engneerng Büyükbakkalköy-Istanbul

More information

320 The Internatonal Arab Journal of Informaton Technology, Vol. 5, No. 3, July 2008 Comparsons Between Data Clusterng Algorthms Osama Abu Abbas Computer Scence Department, Yarmouk Unversty, Jordan Abstract:

More information

A Model of Private Equity Fund Compensation

A Model of Private Equity Fund Compensation A Model of Prvate Equty Fund Compensaton Wonho Wlson Cho Andrew Metrck Ayako Yasuda KAIST Yale School of Management Unversty of Calforna at Davs June 26, 2011 Abstract: Ths paper analyzes the economcs

More information

RequIn, a tool for fast web traffic inference

RequIn, a tool for fast web traffic inference RequIn, a tool for fast web traffc nference Olver aul, Jean Etenne Kba GET/INT, LOR Department 9 rue Charles Fourer 90 Evry, France Olver.aul@nt-evry.fr, Jean-Etenne.Kba@nt-evry.fr Abstract As networked

More information

Performance Analysis of Energy Consumption of Smartphone Running Mobile Hotspot Application

Performance Analysis of Energy Consumption of Smartphone Running Mobile Hotspot Application Internatonal Journal of mart Grd and lean Energy Performance Analyss of Energy onsumpton of martphone Runnng Moble Hotspot Applcaton Yun on hung a chool of Electronc Engneerng, oongsl Unversty, 511 angdo-dong,

More information

Dynamic Pricing for Smart Grid with Reinforcement Learning

Dynamic Pricing for Smart Grid with Reinforcement Learning Dynamc Prcng for Smart Grd wth Renforcement Learnng Byung-Gook Km, Yu Zhang, Mhaela van der Schaar, and Jang-Won Lee Samsung Electroncs, Suwon, Korea Department of Electrcal Engneerng, UCLA, Los Angeles,

More information

Financial Mathemetics

Financial Mathemetics Fnancal Mathemetcs 15 Mathematcs Grade 12 Teacher Gude Fnancal Maths Seres Overvew In ths seres we am to show how Mathematcs can be used to support personal fnancal decsons. In ths seres we jon Tebogo,

More information

How To Solve An Onlne Control Polcy On A Vrtualzed Data Center

How To Solve An Onlne Control Polcy On A Vrtualzed Data Center Dynamc Resource Allocaton and Power Management n Vrtualzed Data Centers Rahul Urgaonkar, Ulas C. Kozat, Ken Igarash, Mchael J. Neely urgaonka@usc.edu, {kozat, garash}@docomolabs-usa.com, mjneely@usc.edu

More information

Abstract. Clustering ensembles have emerged as a powerful method for improving both the

Abstract. Clustering ensembles have emerged as a powerful method for improving both the Clusterng Ensembles: {topchyal, Models jan, of punch}@cse.msu.edu Consensus and Weak Parttons * Alexander Topchy, Anl K. Jan, and Wllam Punch Department of Computer Scence and Engneerng, Mchgan State Unversty

More information