Comparison of Support Vector Machine and Artificial Neural Network Systems for Drug/Nondrug Classification


J. Chem. Inf. Comput. Sci. 2003, 43, 1882

Evgeny Byvatov, Uli Fechner, Jens Sadowski, and Gisbert Schneider*

Institut für Organische Chemie und Chemische Biologie, Johann Wolfgang Goethe-Universität, Marie-Curie-Strasse 11, D Frankfurt, Germany, and AstraZeneca R&D Mölndal, SC 264, S Mölndal, Sweden

Received June 13, 2003

Support vector machine (SVM) and artificial neural network (ANN) systems were applied to a drug/nondrug classification problem as an example of binary decision problems in early-phase virtual compound filtering and screening. The results indicate that solutions obtained by SVM training seem to be more robust with a smaller standard error compared to ANN training. Generally, the SVM classifier yielded slightly higher prediction accuracy than ANN, irrespective of the type of descriptors used for molecule encoding, the size of the training data sets, and the algorithm employed for neural network training. The performance was compared using various different descriptor sets and descriptor combinations based on the 120 standard Ghose-Crippen fragment descriptors, a wide range of 180 different properties and physicochemical descriptors from the Molecular Operating Environment (MOE) package, and 225 topological pharmacophore (CATS) descriptors. For the complete set of 525 descriptors, cross-validated classification by SVM yielded 82% correct predictions (Matthews cc = 0.63), whereas ANN reached 80% correct predictions (Matthews cc = 0.58). Although SVM outperformed the ANN classifiers with regard to overall prediction accuracy, both methods were shown to complement each other, as the sets of true positives, false positives (overprediction), true negatives, and false negatives (underprediction) produced by the two classifiers were not identical. The theory of SVM and ANN training is briefly reviewed.
* Corresponding author.

INTRODUCTION

Early-phase virtual screening and compound library design often employs filtering routines which are based on binary classifiers and are meant to eliminate potentially unwanted molecules from a compound library.[1,2] Currently two classifier systems are most often used in these applications: PLS-based classifiers [3,4] and various types of artificial neural networks (ANN).[5-9] Typically, these systems yield an average overall accuracy of 80% correct predictions for binary decision tasks following the likeness concept in virtual screening.[2,10]

The support vector machine (SVM) approach was first introduced by Vapnik as a potential alternative to conventional artificial neural networks.[11,12] Its popularity has grown ever since in various areas of research, and first applications in molecular informatics and pharmaceutical research have been described. Although SVM can be applied to multiclass separation problems, its original implementation solves binary class/nonclass separation problems. Here we describe application of SVM to the drug/nondrug classification problem, which employs a class/nonclass implementation of SVM.

Both SVM and ANN algorithms can be formulated in terms of learning machines. The standard scenario for classifier development consists of two stages: training and testing. During the first stage the learning machine is presented with labeled samples, which are basically n-dimensional vectors with a class membership label attached. The learning machine generates a classifier for prediction of the class label of the input coordinates. During the second stage, the generalization ability of the model is tested.

Currently various sets of molecular descriptors are available.[16] For application to drug/nondrug classification of compounds, the molecules are typically represented by n-dimensional vectors.
[6,7] In this work, we focused on the fragment-based Ghose-Crippen (GC) descriptors which were used in the original work of Sadowski and Kubinyi for drug/nondrug classification,[7] descriptors provided by the MOE software package (Molecular Operating Environment. Chemical Computing Group Inc., Montreal, Canada), and CATS topological pharmacophores.[20] Having defined this molecular representation, the task of the present study was to compare the classification ability of standard SVM and feed-forward ANN on the drug/nondrug data. A www-based interface for calculating the drug-likeness score of a molecule using our SVM solution based on the CATS descriptor was developed and can be found at URL: gecco.org.chemie.uni-frankfurt.de/gecco.html.

(American Chemical Society. Published on Web 09/27/2003.)

DATA AND METHODS

Data Sets. For SVM and ANN training we used the sets of drug and nondrug molecules prepared by Kubinyi and Sadowski.[7] From the original data set 9208 molecules could be processed by our descriptor generation software. The final working set contained 4998 drugs and 4210 nondrug molecules. Three sets of descriptors were calculated: counts of the standard 120 Ghose-Crippen descriptors,
180 descriptors from MOE (Molecular Operating Environment. Chemical Computing Group Inc., Montreal, Canada), and 225 topological pharmacophore (CATS) descriptors.[20] MOE descriptors include various 2D and 3D descriptors such as volume and shape descriptors, atom and bond counts, Kier-Hall connectivity and kappa shape indices, adjacency and distance matrix descriptors, pharmacophore feature descriptors, partial charges, potential energy descriptors, and conformation-dependent charge descriptors. Before calculating MOE descriptors, single 3D conformers were generated by CORINA.[21] CATS descriptors were calculated using our own software, taking into consideration pairs of atom types separated by up to 15 bonds (URL: gecco.org.chemie.uni-frankfurt.de/gecco.html).[20] All 225 descriptor columns were individually autoscaled. An alternative would have been block-scaling, where each descriptor class is autoscaled as a whole, which was not applied here.

Support Vector Machine. SVM classifiers are generated by a two-step procedure: First, the sample data vectors are mapped ("projected") to a very high-dimensional space. The dimension of this space is significantly larger than the dimension of the original data space. Then, the algorithm finds a hyperplane in this space with the largest margin separating classes of data. It was shown that classification accuracy usually depends only weakly on the specific projection, provided that the target space is sufficiently high dimensional.[11] Sometimes it is not possible to find the separating hyperplane even in a very high-dimensional space. In this case a tradeoff is introduced between the size of the separating margin and penalties for every vector which is within the margin.[11] The basic theory of SVM will be briefly reviewed in the following.

The separating hyperplane is defined as

$$D(x) = (w \cdot x) + w_0$$

Here $x$ is a sample vector mapped to a high-dimensional space, and $w$ and $w_0$ are parameters of the hyperplane that SVM will estimate.
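As a small illustration of the hyperplane definition above, the following Python sketch evaluates D(x) for a given hyperplane and computes the smallest normalized distance y·D(x)/‖w‖ over a labeled sample set. The function names and the toy numbers are hypothetical and are not taken from the original software; the sketch assumes the vectors are already in the mapped space.

```python
import math

def decision_value(w, w0, x):
    """D(x) = (w . x) + w0 for a hyperplane with normal w and offset w0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w0

def geometric_margin(w, w0, samples, labels):
    """Smallest value of y_k * D(x_k) / ||w|| over all labeled samples.

    Labels are assumed to be +1 (class) or -1 (nonclass). A positive
    result means every sample lies on the correct side of the hyperplane.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * decision_value(w, w0, x) / norm
        for x, y in zip(samples, labels)
    )

# Toy example: two points placed symmetrically around the hyperplane.
margin = geometric_margin([3.0, 4.0], 0.0, [[3.0, 4.0], [-3.0, -4.0]], [1, -1])
```

The SVM training described in the text searches for the hyperplane parameters that make this smallest distance as large as possible.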
Then the margin can be expressed as the largest $\tau$ for which the following holds for every training vector:

$$\frac{y_k D(x_k)}{\lVert w \rVert} \ge \tau, \quad k = 1, \ldots, n$$

Without loss of generality we can apply the constraint $\tau \lVert w \rVert = 1$ to $w$. In this case maximizing $\tau$ is equivalent to minimizing $\lVert w \rVert$, and SVM training becomes the problem of finding the minimum of a function under the following constraints:

$$\text{minimize } \eta(w) = \tfrac{1}{2}(w \cdot w) \quad \text{subject to } y_i[(w \cdot x_i) + w_0] \ge 1, \quad i = 1, \ldots, n$$

This problem is solved by introduction of Lagrange multipliers and minimization of the function

$$Q(w, w_0, \alpha) = \tfrac{1}{2}(w \cdot w) - \sum_{i=1}^{n} \alpha_i \{ y_i[(w \cdot x_i) + w_0] - 1 \}$$

Here $\alpha_i$ are Lagrange multipliers. Differentiating over $w$ and $w_0$ and substituting, we obtain

$$\max_{\alpha} Q(\alpha) = \sum_{i=1}^{n} \alpha_i - \tfrac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) \quad \text{subject to } \sum_{i=1}^{n} y_i \alpha_i = 0; \; \alpha_i \ge 0, \; i = 1, \ldots, n$$

Figure 1. Principle of SVM classification. The task was to separate two classes of objects indicated by squares and circles. Squares represent nonclass samples ("negative examples", e.g. nondrugs) and circles are class members ("positive examples", e.g. drugs). $D(x)$ is the decision function defining class membership according to the SVM classifier, which is represented by the separating line ($D(x) = 0$). The margin is indicated by dotted lines. Support vectors are indicated by filled objects ($x_1$, $x_2$, $x_3$, $x_4$). $\xi_i$ are slack variables for support vectors that are not lying on the margin border. $y_i$ are label variables equal to $+1$ for positive examples (class membership) and $-1$ for negative examples (nonclass membership). See text for details.

When perfect separation is not possible, slack variables are introduced for sample vectors which are within the margin, and the optimization problem can be reformulated:

$$\text{minimize } \eta(w) = \tfrac{1}{2}(w \cdot w) + C \sum_{i=1}^{n} \xi_i \quad \text{subject to } y_i[(w \cdot x_i) + w_0] \ge 1 - \xi_i, \; \xi_i \ge 0$$

Here $\xi_i$ are slack variables. These variables are not equal to zero only for those vectors which are within the margin. Introducing Lagrange multipliers again we finally obtain

$$\max_{\alpha} Q(\alpha) = \sum_{i=1}^{n} \alpha_i - \tfrac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j) \quad \text{subject to } \sum_{i=1}^{n} y_i \alpha_i = 0, \; C \ge \alpha_i \ge 0, \; i = 1, \ldots, n$$

This is a quadratic programming (QP) problem for which several efficient standard methods are known.[22] Due to the very high dimensionality of the QP problem, which typically arises during SVM training, an extension of the algorithm for solving QP is used in SVM applications.[23]

A geometrical illustration of the meaning of slack variables and Lagrange multipliers is given in Figure 1. Points classified by SVM can be divided into two groups, support vectors and nonsupport vectors. Nonsupport vectors are classified correctly by the hyperplane and are located outside
the separating margin. Slack variables and Lagrange multipliers for them are equal to zero. Parameters of the hyperplane do not depend on them, and even if their position is changed the separating hyperplane and margin will remain unchanged, provided that these points stay outside the margin. Other points are support vectors, and they are the points which determine the exact position of the hyperplane. For all support vectors the absolute values of the slack variables are equal to the distances from these points to the edge of the separating margin. These distances are defined in units of half of the width of the separating margin. For correctly classified points within the separating margin, slack variable values are between zero and one. For misclassified points within the margin the values of the slack variables are between one and two. For other misclassified points they are greater than two. For points that are lying on the edge of the margin, Lagrange multipliers are between zero and $C$, and slack variables for these points are still equal to zero. For all other points, for which the values of slack variables are larger than zero, Lagrange multipliers assume the value of $C$.

Explicit mapping to a very high-dimensional space is not required if calculation of the scalar product in this high-dimensional space is feasible for every two vectors. This scalar product can be defined by introducing a kernel function $(x \cdot x') = K(x, x')$,[24] where $x$ and $x'$ are vectors in a low-dimensional space for which a kernel function that corresponds to a scalar product in a high-dimensional space is defined. Various kernels may be applied.[25] In our case, we used a kernel function of a fifth-order polynomial:

$$K(x, x') = ((x \cdot x')s + r)^5$$

This kernel corresponds to the decision function

$$f(x) = \operatorname{sign}\Big( \sum_i \alpha_i K(x_i^{sv}, x) + b \Big)$$

where $\alpha_i$ are Lagrange multipliers determined during training of SVM. The sum is only over support vectors $x^{sv}$. Lagrange multipliers for all other points are equal to zero.
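The kernel and decision function above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficients passed as `alphas` are taken to be signed (that is, already multiplied by the class label of each support vector), matching the form of the decision function given in the text, and the function names and toy support vectors are hypothetical.

```python
def poly_kernel(x, z, s=1.0, r=1.0, degree=5):
    """Polynomial kernel K(x, z) = ((x . z) * s + r) ** degree."""
    dot = sum(a * b for a, b in zip(x, z))
    return (dot * s + r) ** degree

def svm_decision(x, support_vectors, alphas, b, s=1.0, r=1.0):
    """Return +1 or -1 from sign(sum_i alpha_i * K(sv_i, x) + b).

    Assumption: each alpha_i carries the sign of its support vector's
    class label, so no separate label term appears in the sum.
    """
    total = sum(
        alpha * poly_kernel(sv, x, s=s, r=r)
        for sv, alpha in zip(support_vectors, alphas)
    )
    return 1 if total + b >= 0 else -1

# Toy model with one positive and one negative support vector.
svs = [[1.0, 0.0], [0.0, 1.0]]
coeffs = [1.0, -1.0]
label = svm_decision([1.0, 0.0], svs, coeffs, b=0.0)
```

Only the support vectors enter the sum, which is why prediction cost scales with their number rather than with the size of the training set.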
Parameter $b$ determines the shift of the hyperplane, and it is also found during SVM training. Simultaneous scaling of the $s$, $r$, and $b$ parameters does not change the decision function. Thus, we can simplify the kernel by setting $r$ equal to one:

$$K(x, x') = ((x \cdot x')s + 1)^5$$

In this case only the kernel parameter $s$ and the error tradeoff $C$ must be tuned. Parameter $C$ is not present explicitly in this equation; it is set up as a penalty for the misclassification error before the training of SVM is performed. For tuning parameters $s$ and $C$, four-fold cross-validation of training data was applied, and the values for $s$ and $C$ that maximize accuracy were then chosen. Accuracy maximization was performed by heuristic gradient descent.[26]

Basically, the following procedure was applied. The data set was divided into two parts, a training and a validation set. The validation subset was put aside and used only for estimation of the performance of the trained classifier. Training data were divided into four nonoverlapping subsets. The SVM parameters to be determined were set to reasonable initial values. Then, the SVM was trained on the training data excluding one of the four subsets, and the performance of the obtained SVM classifier was estimated with the excluded subset. This procedure was repeated for each subset, and an average performance of the SVM classifier was obtained.

Figure 2. Architecture of artificial neural networks. Formal neurons are drawn as circles, weights are represented by lines connecting the neuron layers. Fan-out neurons are drawn in white, sigmoidal units in black, and linear units in gray. (a) Conventional three-layered feed-forward system ("architecture I"); (b) network architecture used by Ajay and co-workers for drug-likeness prediction ("architecture II").[6]

For SVM training we used freely available SVM software (SVMLight package; URL: org/).[26,27] A Linux-based LSF (Load Sharing Facility; Platform Computing GmbH, D Ratingen, Germany) cluster was used for determination of the cross-validation error to reduce calculation time.
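The four-subset procedure described above can be sketched as follows. Note one deliberate substitution: the paper tuned s and C by a heuristic gradient descent, whereas this sketch scans a small grid of candidate values, which is simpler to show. The `make_trainer` callback, the interleaved fold assignment (which assumes the data were shuffled beforehand), and all names are hypothetical.

```python
def k_fold_indices(n_samples, k=4):
    """Split sample indices into k nonoverlapping, interleaved folds."""
    return [list(range(i, n_samples, k)) for i in range(k)]

def cross_validated_accuracy(samples, labels, train_fn, k=4):
    """Train on k-1 folds, score on the excluded fold, and average."""
    accuracies = []
    for held_out in k_fold_indices(len(samples), k):
        excluded = set(held_out)
        train_idx = [i for i in range(len(samples)) if i not in excluded]
        model = train_fn(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
        )
        correct = sum(1 for i in held_out if model(samples[i]) == labels[i])
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k

def grid_search(samples, labels, make_trainer, s_values, c_values, k=4):
    """Return (accuracy, s, C) for the best-scoring parameter pair.

    make_trainer(s, c) is a hypothetical factory that returns a function
    train_fn(samples, labels) -> model, where model(x) predicts a label.
    """
    best = None
    for s in s_values:
        for c in c_values:
            acc = cross_validated_accuracy(
                samples, labels, make_trainer(s, c), k
            )
            if best is None or acc > best[0]:
                best = (acc, s, c)
    return best
```

In practice `make_trainer` would wrap a call to an SVM package; the validation subset that was put aside is never passed to these functions.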
All calculations were performed using the MATLAB package (MATLAB 2002, The mathematical laboratory. The MathWorks GmbH, D Aachen, Germany).

ARTIFICIAL NEURAL NETWORK

Conventional two-layered neural networks with a single output neuron were used for ANN model development (Figure 2a).[26] As a result of network training a decision function is chosen from the family of functions represented by the network architecture. This function family is defined by the complexity of the neural network: number of hidden layers, number of neurons in these layers, and topology of the network. The decision function is determined by choosing appropriate weights for the neural network. Optimal weights usually minimize an error function for the particular network architecture. The error function describes the deviation of predicted target values from observed or desired values. For our class/nonclass classification problem the target values were $+1$ for class (drugs) and $-1$ for nonclass (nondrugs). A standard two-layered neural network with a single output neuron can be represented by the following equation

$$y = \tilde{g}\Big( \sum_{j=1}^{M} w_{1j}^{(2)} \, g\Big( \sum_{i=1}^{d} w_{ji}^{(1)} x_i + w_{j0}^{(1)} \Big) + w_{10}^{(2)} \Big)$$

with the error function

$$E = \sum_{k=1}^{n} (y(x_k) - y_k)^2$$

In this work, $\tilde{g}$ is a linear function and $g$ is a tan-sigmoid transfer function. A second type of network architecture containing additional connections from the input layer to the output layer was trained to reimplement the original drug/nondrug ANN developed by Ajay and co-workers (Figure 2b).[6]

Training of neural networks is typically performed on variations of gradient descent based algorithms,[26] trying to
minimize an error function. To avoid overfitting, cross-validation can be used for finding an earlier point at which to stop training.[28] In this work the neural network toolbox from MATLAB was used. Data were preprocessed identically to SVM-based learning. We applied the following training algorithms to ANN optimization in their default versions provided by MATLAB: gradient descent with variable learning rate,[29,30] conjugated gradient descent,[30,31] scaled conjugated gradient descent,[32] quasi-Newton algorithm,[33] Levenberg-Marquardt (LM),[34,35] and automated regularization.[36] For each optimization ten-times cross-validation was performed (80+20 splits into training and test data), where the ANN weights and biases were optimized using the training data, and prediction accuracy was measured using test data to determine the number of training epochs, i.e., the endpoint of the training process. This was performed to reduce the risk of overfitting. It should be noted that the validation data were left untouched.

Table 1. Cross-Validated Results of Machine Learning.(a) Columns: % correct (ANN, SVM) and Matthews cc (ANN, SVM). Rows: GC, MOE, CATS, all (GC+MOE+CATS). The numerical entries are missing from this transcription.
(a) Average values and standard deviations are given. The Levenberg-Marquardt training method was used for ANN training.

MODEL VALIDATION

The SVM model for drug/nondrug classification of a pattern $x$ was

$$\mathrm{SVM}(x) = \sum_i \alpha_i K(x_i^{SV}, x) + b$$

Here, $i$ runs only over support vectors (SV). The value of SVM($x$) is either positive ("drug") or negative ("nondrug"). The ANN model for drug/nondrug classification produced values in $]-1, 1[$, where a positive value meant "drug" and a negative value "nondrug". Classification accuracy was evaluated based on prediction accuracy, i.e., percent of test compounds correctly classified, and the correlation coefficient according to Matthews:[37]

$$cc = \frac{NP - OU}{\sqrt{(N + O)(N + U)(P + O)(P + U)}}$$

where $P$, $N$, $O$, and $U$ are the number of true positive, true negative, false positive, and false negative predictions, respectively.
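The Matthews correlation coefficient defined above is straightforward to compute from the four counts. The following Python sketch uses the same symbols as the text (P, N, O, U); returning 0.0 when the denominator vanishes is a convention assumed here, not something stated in the paper, and the function names are hypothetical.

```python
import math

def matthews_cc(p, n, o, u):
    """Matthews correlation coefficient.

    p: true positives, n: true negatives,
    o: false positives (overprediction),
    u: false negatives (underprediction).
    """
    denominator = math.sqrt((n + o) * (n + u) * (p + o) * (p + u))
    if denominator == 0:
        # Assumed convention: undefined cases are reported as 0.
        return 0.0
    return (n * p - o * u) / denominator

def fraction_correct(p, n, o, u):
    """Prediction accuracy: share of all predictions that are correct."""
    return (p + n) / (p + n + o + u)
```

Unlike the fraction correct, the coefficient stays near zero for a classifier that assigns every compound to the larger class.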
Drugs were considered as the positive set; the nondrug molecules formed the negative set. The values of cc can range from $-1$ to $1$. Perfect prediction gives a correlation coefficient of 1.

SVM and ANN models were developed using various sizes of training data to measure the influence of the size of the training set on the quality of the classification model. The number of training samples was iteratively diminished: Starting with a random split of all available samples into training and validation subsets, at each of the following iterations we diminished the size of the training set to only 80% of the number of samples of the previous iteration. This allowed us to obtain better sampling for small training sets. Ten-times cross-validation was performed, and average values of prediction accuracy and cc were calculated.

RESULTS AND DISCUSSION

The main aim of this study was to compare SVM and ANN classifiers in their ability to distinguish between sets of drugs and nondrugs. We trained different neural network topologies, and the performance of the best network was compared to the SVM classifier. Two types of ANN architecture were considered: standard feed-forward networks with one hidden layer ("architecture I") and a feed-forward network with one hidden layer with additional direct connections from input neurons to the output ("architecture II") (Figure 2). The first type of ANN was used by Sadowski and Kubinyi in their original work on drug-likeness prediction;[7] the second architecture was employed by Ajay and co-workers serving the same purpose.[6] Using these networks and the GC descriptors in combination with the Levenberg-Marquardt training method, classification accuracy was identical to the original results (on average 80% correct) despite the use of a different training technique and different training data (Table 1). This observation substantiates the original findings. Both network types performed identically considering the error margin (approximately 80% correct classification).
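The shrinking schedule for the training set described above (each step keeps 80% of the previous size) can be sketched as follows. Integer arithmetic is used to avoid rounding surprises. Whether the original study drew each smaller set from the previous one or afresh from the pool is not stated, so the fresh random draw shown here is an assumption, and the function names are hypothetical.

```python
import random

def training_set_sizes(n_start, min_size):
    """Sizes n, 0.8n, 0.64n, ... (rounded down) that are >= min_size."""
    if min_size < 1:
        raise ValueError("min_size must be at least 1")
    sizes = []
    size = n_start
    while size >= min_size:
        sizes.append(size)
        size = size * 4 // 5  # 80% of the previous size, rounded down
    return sizes

def shrinking_subsets(pool, min_size, seed=0):
    """Draw one random training subset per size in the schedule.

    Assumption: each subset is sampled independently from the pool.
    """
    rng = random.Random(seed)
    return [rng.sample(pool, size)
            for size in training_set_sizes(len(pool), min_size)]
```

Because the sizes shrink geometrically, the small-sample end of the learning curve is covered by many closely spaced points.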
For some of the training algorithms a slightly lower standard deviation of the prediction accuracy was observed for architecture I (data not shown). Since the additional connections in network architecture II did not contribute to a greater accuracy of the model, we used only the standard feed-forward network with one hidden layer containing two neurons (architecture I) for further analysis. For each training method and combination of input variables (descriptors), networks with different numbers of hidden neurons (2-10 neurons) were trained. We did not observe an overall best training algorithm. The Levenberg-Marquardt method was used for the development of the final ANN model. Also, we did not observe an improved classification result when the number of hidden neurons was larger than two (data not shown). ANN architecture I with two hidden neurons yielded the overall best cross-validated prediction result for all descriptors (GC+MOE+CATS), 80% correct predictions (cc = 0.58). The rank order of descriptor sets with regard to the overall classification accuracy was as follows: All > GC > MOE > CATS (Table 1). It should be stressed that the differences in classification accuracy are minute for the descriptors "All", MOE, and GC and should be regarded as comparable considering a standard deviation of 1%. The CATS descriptor led to approximately 5% lower accuracy.
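For orientation, a forward pass through a network of the "architecture I" type with a small hidden layer can be sketched in Python as below. The sketch assumes tan-sigmoid (tanh) hidden units and a linear output unit, which is one reading of the network equation given earlier; the weights shown are arbitrary placeholders, not trained values, and the function names are hypothetical.

```python
import math

def forward(x, w1, b1, w2, b2):
    """Output of a one-hidden-layer network with a single output neuron.

    w1: one weight row per hidden neuron, b1: hidden biases,
    w2: output weights (one per hidden neuron), b2: output bias.
    Assumption: tanh hidden units and a linear output unit.
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + bias)
        for row, bias in zip(w1, b1)
    ]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

def classify(x, w1, b1, w2, b2):
    """Positive output is read as drug (+1), negative as nondrug (-1)."""
    return 1 if forward(x, w1, b1, w2, b2) >= 0 else -1

# Two hidden neurons and three input descriptors, placeholder weights.
w1 = [[0.5, -0.2, 0.1], [-0.3, 0.4, 0.0]]
b1 = [0.0, 0.1]
w2 = [1.0, -1.0]
b2 = 0.05
prediction = classify([1.0, 2.0, 3.0], w1, b1, w2, b2)
```

With only two hidden neurons the number of free parameters grows roughly as twice the number of descriptors, which keeps the model small even for the full 525-dimensional input.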
Figure 3. Average cross-validated prediction accuracy (fraction correct) of SVM and ANN classifiers optimized by various training schemes for GC descriptors (upper graph: logarithmic scale; lower graph: linear scale).

SVM training resulted in models showing slightly higher prediction accuracy than the ANN systems (Table 1). A 1-2% gain was observed, independent of the number of training samples and the method used for neural network training. Figures 3 and 4 illustrate the dependency of the classification accuracy on the number of sample molecules used for training. In one experiment only GC descriptors were used (Figure 3); in a second study the combination of GC, MOE, and CATS descriptors was employed (Figure 4). With the GC descriptor the SVM estimator only slightly outperforms the neural networks (Figure 3). Similar results were obtained if only MOE or CATS descriptors were used for training (data not shown). The situation changed when all descriptors were used. With the complete descriptor set (525-dimensional) SVM clearly outperforms the neural network system (Figure 4). These results substantiate earlier findings that SVM performs better than ANN when large numbers of features or descriptors are used.[12]

A general observation was that classification accuracy significantly improved with an increasing number of training samples, reaching a plateau in performance between 2000 and 3000 samples (Figures 3 and 4). The accuracy curves represent almost ideal learning behavior. It should be mentioned that the performance plateau observed does not reflect an inherent clustering of the data set, as training data subsets were randomly selected from the pool. The fraction correctly predicted grows from approximately 65% to 80% when the training set is increased by a factor of 250. The combination of MOE, GC, and CATS descriptors improved classification accuracy by approximately two percent for SVM and by one percent for ANN compared to models based on individual descriptors.
These results demonstrate that optimal ANN training depends to a large extent on the number of training patterns available and the type of molecular descriptors used. For instance, for GC descriptors the best learning algorithm was training with automated regularization, but for the combination of GC, MOE, and CATS descriptors this algorithm was extremely slow and converged relatively unstably. In contrast, SVM generally performed more stably compared to ANN, with only a small increase in computation time for both sets of descriptors (Figures 3 and 4).

Figure 4. Average cross-validated prediction accuracy (fraction correct) of SVM and ANN classifiers optimized by various training schemes for the combination of GC, MOE, and CATS descriptors (upper graph: logarithmic scale; lower graph: linear scale).

In a previous comparison of SVM to several machine learning methods by Holden and co-workers it was shown that an SVM classifier outperformed other standard methods, but a specially designed and structurally optimized neural network was again superior to the SVM model in a benchmark test.[13] This is in line with the finding that in the present study the set of molecules which were correctly classified by both SVM and ANN (mutual true positives) was 72% on average, and the fraction incorrectly classified by both systems (mutual false negatives) was 11%. 10% of the test data were correctly predicted by SVM but failed by ANN, and 6% were correctly classified by ANN but not by SVM using the full set of descriptors (GC+MOE+CATS). Examples of the latter two sets of molecules are shown in Figure 5. Clearly, the ANN classifier and the SVM classifier complement each other, and both methods could be further optimized, for example, by changing the SVM kernel or by exploring more sophisticated ANN architectures and concepts.

Fast classifier systems are mainly developed for first-pass virtual screening, in particular for identification ("flagging") of potentially undesired molecules in very large compound collections.
[2] Due to robust convergence behavior SVM seems to be well-suited for solving binary decision problems in molecular informatics, especially when a large number of descriptors is available for characterization of molecules. In this study we have shown that two drug-likeness estimators can produce complementary predictions. We recommend the parallel application of both predictive systems for virtual screening applications. One possibility to combine several estimators for drug-likeness or any other classification task is to employ a jury decision, e.g. calculate an ensemble average.[38,39] As more and more different predictors become available for virtual screening, a meaningful combination of prediction systems that exploits the individual strengths of the different methods will be pivotal for reliable compound library filtering.

Figure 5. Examples of drugs correctly classified by ANN but not by SVM (structures 1-5), and drugs correctly classified by SVM but not by ANN (structures 6-10).

CONCLUSION

It was demonstrated that the SVM system used in this study has the capacity to produce higher overall prediction accuracy than a particular ANN architecture. Based on this observation we conclude that SVM represents a useful method for classification tasks in QSAR modeling and virtual screening, especially when large numbers of input variables are used. The SVM classifier was shown to complement the predictions obtained by ANN. The SVM and ANN classifiers obtained for drug-likeness prediction are comparable in overall accuracy and produce overlapping, yet not identical sets of correctly classified and misclassified compounds. A similar observation can be made when two ANN models are compared. Different ANN architectures and training algorithms were shown to lead to different classification results. Therefore, it might be wise to apply several predictive models in parallel, irrespective of their nature, i.e., being SVM- or ANN-based.

We wish to stress that our study does not justify the conclusion that SVM outperforms ANN in general. In the present work only a standard feed-forward network with a fixed number of hidden neurons was compared to a standard SVM implementation. Nevertheless, our results indicate that solutions obtained by SVM training seem to be more robust with a smaller standard error compared to standard ANN training. Irrespective of the outcome of this study, it is the appropriate choice of training data and descriptors, and reasonable scaling of input variables, that determines the success or failure of machine learning systems. Both methods are suited to assess the usefulness of different descriptor sets for a given classification task, and they are methods of choice for rapid first-pass filtering of compound libraries.[40]

A particular advantage of SVM is sparseness of the solution. This means that an SVM classifier depends only on the support vectors, and the classifier function is not influenced by the whole data set, as is the case for many neural network systems. Another characteristic of SVM is the possibility to efficiently deal with a very large number of features due to the exploitation of kernel functions, which makes it an attractive technique, e.g., for gene chip analysis or high-dimensional chemical spaces. The combination of SVM with a feature selection routine might provide an efficient tool for extracting chemically relevant information.

ACKNOWLEDGMENT

The authors are grateful to Norbert Dichter and Ralf Tomczak for setting up the LSF Linux cluster. Alireza Givehchi is thanked for assistance in installing the gecco! Web interface. This work was supported by the Beilstein-Institut zur Förderung der Chemischen Wissenschaften, Frankfurt.

REFERENCES AND NOTES

(1) Clark, D. E.; Pickett, S. D. Computational methods for the prediction of drug-likeness. Drug Discov. Today 2000, 5,
(2) Schneider, G.; Böhm, H.-J. Virtual screening and fast automated docking methods. Drug Discov. Today 2002, 7,
(3) Wold, S. Exponentially weighted moving principal component analysis and projections to latent structures. Chemomet. Intell. Lab. Syst. 1994, 23,
(4) Forina, M.; Casolino, M. C.; de la Pezuela Martinez, C. Multivariate calibration: applications to pharmaceutical analysis. J. Pharm. Biomed. Anal. 1998, 18,
(5) Neural Networks in QSAR and Drug Design; Devillers, J., Ed.; Academic Press: London,
(6) Ajay; Walters, W. P.; Murcko, M. A. Can we learn to distinguish between drug-like and nondrug-like molecules? J. Med. Chem. 1998, 41,
(7) Sadowski, J.; Kubinyi, H. A scoring scheme for discriminating between drugs and nondrugs. J. Med. Chem. 1998, 41,
(8) Sadowski, J. Optimization of chemical libraries by neural networks. Curr. Opin. Chem. Biol.
2000, 4,
(9) Schneider, G. Neural networks are useful tools for drug design. Neural Networks 2000, 13,
(10) Sadowski, J. In Virtual Screening for Bioactive Molecules; Böhm, H.-J., Schneider, G., Eds.; Wiley-VCH: Weinheim, 2000; pp
(11) Cortes, C.; Vapnik, V. Support-vector networks. Machine Learning 1995, 20,
(12) Vapnik, V. The Nature of Statistical Learning Theory; Springer: Berlin,
(13) Burbidge, R.; Trotter, M.; Buxton, B.; Holden, S. Drug design by machine learning: support vector machines for pharmaceutical data analysis. Comput. Chem. 2001, 26,
(14) Warmuth, M. K.; Liao, J.; Ratsch, G.; Mathieson, M.; Putta, S.; Lemmen, C. Active learning with Support Vector Machines in the drug discovery process. J. Chem. Inf. Comput. Sci. 2003, 43,
(15) Wilton, D.; Willett, P.; Lawson, K.; Mullier, G. Comparison of ranking methods for virtual screening in lead-discovery programs. J. Chem. Inf. Comput. Sci. 2003, 43,
(16) Todeschini, R.; Consonni, V. Handbook of Molecular Descriptors; Wiley-VCH: Weinheim,
(17) Ghose, A. K.; Crippen, G. M. Atomic physicochemical parameters for three-dimensional structure-directed quantitative structure-activity relationships 1. Partition coefficients as a measure of hydrophobicity. J. Comput. Chem. 1986, 7,
(18) Ghose, A. K.; Crippen, G. M. Atomic physicochemical parameters for three-dimensional structure-directed quantitative structure-activity relationships 2. Modeling dispersive and hydrophobic interactions. J. Comput. Chem. 1987, 27,
(19) Ghose, A. K.; Pritchett, A.; Crippen, G. M. Atomic physicochemical parameters for three-dimensional structure-directed quantitative structure-activity relationships 3. J. Comput. Chem. 1988, 9,
(20) Schneider, G.; Neidhart, W.; Giller, T.; Schmid, G. Scaffold-hopping by topological pharmacophore search: a contribution to virtual screening. Angew. Chem., Int. Ed. Engl. 1999, 38,
(21) Gasteiger, J.; Rudolph, C.; Sadowski, J. Automatic generation of 3D-atomic coordinates for organic molecules. Tetrahedron Comput. Methods 1990, 3,
(22) Coleman, T. F.; Li, Y. A reflective Newton method for minimizing a quadratic function subject to bounds on some of the variables. SIAM J. Optimization 1996, 6,
(23) Joachims, T. Making large-scale SVM learning practical. In Advances in Kernel Methods - Support Vector Learning; Schölkopf, B., Burges, C., Smola, A., Eds.; MIT Press: Cambridge, MA, 1999; pp
(24) Cristianini, N.; Shawe-Taylor, J. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods; Cambridge University Press: Cambridge,
(25) Burges, C. J. C. A tutorial on support vector machines for pattern recognition. Data Mining Knowledge Discovery 1998, 2,
(26) Bishop, C. M. Neural Networks for Pattern Recognition; Oxford University Press: Oxford,
(27) Joachims, T. Learning to classify text using Support Vector Machines. Kluwer International Series in Engineering and Computer Science 668; Kluwer Academic Publishers: Boston,
(28) Duda, R. O.; Hart, P. E.; Stork, D. G. Pattern Classification; Wiley-Interscience: New York,
(29) Rumelhart, D. E.; McClelland, J. L.; The PDP Research Group. Parallel Distributed Processing; MIT Press: Cambridge, MA,
(30) Hagan, M. T.; Demuth, H. B.; Beale, M. H. Neural Network Design; PWS Publishing: Boston,
(31) Fletcher, R.; Reeves, C. M. Function minimization by conjugate gradients. Comput. J. 1964, 7,
(32) Moller, M. F. A scaled conjugate gradient algorithm for fast supervised learning. Neural Networks 1993, 6,
(33) Dennis, J. E.; Schnabel, R. B. Numerical Methods for Unconstrained Optimization and Nonlinear Equations; Prentice-Hall: Englewood Cliffs,
(34) Hagan, M. T.; Menhaj, M. Training feedforward networks with the Marquardt algorithm. IEEE Trans. Neural Networks 1994, 5,
(35) Foresee, F. D.; Hagan, M. T. Gauss-Newton approximation to Bayesian regularization. Proceedings of the 1997 International Joint Conference on Neural Networks; pp
(36) MacKay, D. J. C. Bayesian interpolation. Neural Comput. 1992, 4,
(37) Matthews, B. W. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim. Biophys. Acta 1975, 405,
(38) Krogh, A.; Sollich, P. Statistical mechanics of ensemble learning. Phys. Rev. E 1997, 55,
(39) Baldi, P.; Brunak, S. Bioinformatics - The Machine Learning Approach; MIT Press: Cambridge,
(40) Byvatov, E.; Schneider, G. Support vector machine applications in bioinformatics. Appl. Bioinf. 2003, 2,