Comparison of Support Vector Machine and Artificial Neural Network Systems for Drug/Nondrug Classification
J. Chem. Inf. Comput. Sci. 2003, 43, 1882

Evgeny Byvatov, Uli Fechner, Jens Sadowski, and Gisbert Schneider*

Institut für Organische Chemie und Chemische Biologie, Johann Wolfgang Goethe-Universität, Marie-Curie-Strasse 11, D Frankfurt, Germany, and AstraZeneca R&D Mölndal, SC 264, S Mölndal, Sweden

Received June 13, 2003

Support vector machine (SVM) and artificial neural network (ANN) systems were applied to a drug/nondrug classification problem as an example of binary decision problems in early-phase virtual compound filtering and screening. The results indicate that solutions obtained by SVM training seem to be more robust with a smaller standard error compared to ANN training. Generally, the SVM classifier yielded slightly higher prediction accuracy than ANN, irrespective of the type of descriptors used for molecule encoding, the size of the training data sets, and the algorithm employed for neural network training. The performance was compared using various different descriptor sets and descriptor combinations based on the 120 standard Ghose-Crippen fragment descriptors, a wide range of 180 different properties and physicochemical descriptors from the Molecular Operating Environment (MOE) package, and 225 topological pharmacophore (CATS) descriptors. For the complete set of 525 descriptors cross-validated classification by SVM yielded 82% correct predictions (Matthews cc = 0.63), whereas ANN reached 80% correct predictions (Matthews cc = 0.58). Although SVM outperformed the ANN classifiers with regard to overall prediction accuracy, both methods were shown to complement each other, as the sets of true positives, false positives (overprediction), true negatives, and false negatives (underprediction) produced by the two classifiers were not identical. The theory of SVM and ANN training is briefly reviewed.
INTRODUCTION

Early-phase virtual screening and compound library design often employ filtering routines which are based on binary classifiers and are meant to eliminate potentially unwanted molecules from a compound library [1,2]. Currently two classifier systems are most often used in these applications: PLS-based classifiers [3,4] and various types of artificial neural networks (ANN) [5-9]. Typically, these systems yield an average overall accuracy of 80% correct predictions for binary decision tasks following the "likeness" concept in virtual screening [2,10]. The support vector machine (SVM) approach was first introduced by Vapnik as a potential alternative to conventional artificial neural networks [11,12]. Its popularity has grown ever since in various areas of research, and first applications in molecular informatics and pharmaceutical research have been described. Although SVM can be applied to multiclass separation problems, its original implementation solves binary class/nonclass separation problems. Here we describe the application of SVM to the drug/nondrug classification problem, which employs a class/nonclass implementation of SVM.

Both SVM and ANN algorithms can be formulated in terms of learning machines. The standard scenario for classifier development consists of two stages: training and testing. During the first stage the learning machine is presented with labeled samples, which are basically n-dimensional vectors with a class membership label attached. The learning machine generates a classifier for prediction of the class label of the input coordinates. During the second stage, the generalization ability of the model is tested.

Currently various sets of molecular descriptors are available [16]. For application to drug/nondrug classification of compounds, the molecules are typically represented by n-dimensional vectors.
[6,7] In this work, we focused on the fragment-based Ghose-Crippen (GC) descriptors, which were used in the original work of Sadowski and Kubinyi for drug/nondrug classification [7], descriptors provided by the MOE software package (Molecular Operating Environment. Chemical Computing Group Inc., Montreal, Canada), and CATS topological pharmacophores [20]. Having defined this molecular representation, the task of the present study was to compare the classification ability of standard SVM and feed-forward ANN on the drug/nondrug data. A web-based interface for calculating the drug-likeness score of a molecule using our SVM solution based on the CATS descriptor was developed and can be found at URL: gecco.org.chemie.uni-frankfurt.de/gecco.html.

American Chemical Society. Published on Web 09/27/2003.

DATA AND METHODS

Data Sets. For SVM and ANN training we used the sets of drug and nondrug molecules prepared by Kubinyi and Sadowski [7]. From the original data set 9208 molecules could be processed by our descriptor generation software. The final working set contained 4998 drugs and 4210 nondrug molecules. Three sets of descriptors were calculated: counts of the standard 120 Ghose-Crippen descriptors,
180 descriptors from MOE (Molecular Operating Environment. Chemical Computing Group Inc., Montreal, Canada), and 225 topological pharmacophore (CATS) descriptors [20]. MOE descriptors include various 2D and 3D descriptors such as volume and shape descriptors, atom and bond counts, Kier-Hall connectivity and kappa shape indices, adjacency and distance matrix descriptors, pharmacophore feature descriptors, partial charges, potential energy descriptors, and conformation-dependent charge descriptors. Before calculating MOE descriptors, single 3D conformers were generated by CORINA. CATS descriptors were calculated using our own software, taking into consideration pairs of atom types separated by up to 15 bonds (URL: gecco.org.chemie.uni-frankfurt.de/gecco.html) [20]. All 225 descriptor columns were individually autoscaled. An alternative would have been block-scaling, where each descriptor class is autoscaled as a whole, which was not applied here.

Support Vector Machine. SVM classifiers are generated by a two-step procedure: First, the sample data vectors are mapped ("projected") to a very high-dimensional space. The dimension of this space is significantly larger than the dimension of the original data space. Then, the algorithm finds a hyperplane in this space with the largest margin separating the classes of data. It was shown that classification accuracy usually depends only weakly on the specific projection, provided that the target space is sufficiently high-dimensional [11]. Sometimes it is not possible to find the separating hyperplane even in a very high-dimensional space. In this case a tradeoff is introduced between the size of the separating margin and penalties for every vector which is within the margin [11]. The basic theory of SVM will be briefly reviewed in the following.

The separating hyperplane is defined as

$$D(\mathbf{x}) = (\mathbf{w} \cdot \mathbf{x}) + w_0$$

Here $\mathbf{x}$ is a sample vector mapped to a high-dimensional space, and $\mathbf{w}$ and $w_0$ are parameters of the hyperplane that SVM will estimate.
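As a small numerical illustration of the hyperplane definition D(x) = (w · x) + w_0, the following sketch evaluates the decision value and the signed distance D(x)/||w|| for a few points. The weight vector, offset, and points are hypothetical values chosen only for illustration, and the function names are not taken from the paper.

```python
import math

def decision_value(w, x, w0):
    """Return D(x) = (w . x) + w0 for plain Python sequences."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w0

def signed_distance(w, x, w0):
    """Return D(x) / ||w||: the distance of x from the hyperplane D(x) = 0,
    positive on one side and negative on the other."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return decision_value(w, x, w0) / norm

# Hypothetical 2-D hyperplane and sample points (illustrative values only).
w, w0 = [3.0, 4.0], -5.0
points = [[3.0, 4.0], [0.0, 0.0], [1.0, 1.0]]

values = [decision_value(w, p, w0) for p in points]
distances = [signed_distance(w, p, w0) for p in points]
# Class assignment by the sign of D(x): +1 for class, -1 for nonclass.
labels = [1 if v > 0 else -1 for v in values]
```

The sign of D(x) gives the predicted class, while dividing by ||w|| turns the decision value into a geometric distance, which is the quantity the margin is built from.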
Then the margin can be expressed as a minimal $\tau$ for which holds

$$\frac{y_k D(\mathbf{x}_k)}{\|\mathbf{w}\|} \ge \tau$$

Without loss of generality we can apply the constraint $\tau \|\mathbf{w}\| = 1$ to $\mathbf{w}$. In this case maximizing $\tau$ is equivalent to minimizing $\|\mathbf{w}\|$, and SVM training becomes the problem of finding the minimum of a function with the following constraints:

$$\text{minimize } \eta(\mathbf{w}) = \frac{1}{2} (\mathbf{w} \cdot \mathbf{w}) \quad \text{subject to constraints } y_i [(\mathbf{w} \cdot \mathbf{x}_i) + w_0] \ge 1$$

This problem is solved by introduction of Lagrange multipliers and minimization of the function

$$Q(\mathbf{w}, w_0, \alpha) = \frac{1}{2} (\mathbf{w} \cdot \mathbf{w}) - \sum_{i=1}^{n} \alpha_i \{ y_i [(\mathbf{w} \cdot \mathbf{x}_i) + w_0] - 1 \}$$

Here $\alpha_i$ are Lagrange multipliers. Differentiating over $\mathbf{w}$ and $w_0$ and substituting we obtain

$$\max\; Q(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j (\mathbf{x}_i \cdot \mathbf{x}_j)$$

subject to constraints

$$\sum_{i=1}^{n} y_i \alpha_i = 0; \quad \alpha_i \ge 0, \; i = 1, \ldots, n$$

Figure 1. Principle of SVM classification. The task was to separate two classes of objects indicated by squares and circles. Squares represent nonclass samples ("negative examples", e.g. nondrugs) and circles are class members ("positive examples", e.g. drugs). $D(\mathbf{x})$ is the decision function defining class membership according to the SVM classifier which is represented by the separating line ($D(\mathbf{x}) = 0$). The margin is indicated by dotted lines. Support vectors are indicated by filled objects ($\mathbf{x}_1$, $\mathbf{x}_2$, $\mathbf{x}_3$, $\mathbf{x}_4$). $\xi_i$ are slack variables for support vectors that are not lying on the margin border. $y_i$ are label variables equal to 1 for positive examples (class membership) and -1 for negative examples (nonclass membership). See text for details.

When perfect separation is not possible, slack variables are introduced for sample vectors which are within the margin, and the optimization problem can be reformulated:

$$\text{minimize } \eta(\mathbf{w}) = \frac{1}{2} (\mathbf{w} \cdot \mathbf{w}) + C \sum_{i} \xi_i \quad \text{subject to constraints } y_i [(\mathbf{w} \cdot \mathbf{x}_i) + w_0] \ge 1 - \xi_i$$

Here $\xi_i$ are slack variables. These variables are not equal to zero only for those vectors which are within the margin. Introducing Lagrange multipliers again we finally obtain

$$\max\; Q(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j (\mathbf{x}_i \cdot \mathbf{x}_j)$$

subject to constraints

$$\sum_{i=1}^{n} y_i \alpha_i = 0, \quad C \ge \alpha_i \ge 0, \; i = 1, \ldots, n$$

This is a quadratic programming (QP) problem for which several efficient standard methods are known [22]. Due to the very high dimensionality of the QP problem, which typically arises during SVM training, an extension of the algorithm for solving QP is used in SVM applications [23].

A geometrical illustration of the meaning of slack variables and Lagrange multipliers is given in Figure 1. Points classified by SVM can be divided into two groups, support vectors and nonsupport vectors. Nonsupport vectors are classified correctly by the hyperplane and are located outside
the separating margin. Slack variables and Lagrange multipliers for them are equal to zero. Parameters of the hyperplane do not depend on them, and even if their position is changed the separating hyperplane and margin will remain unchanged, provided that these points stay outside the margin. Other points are support vectors, and they are the points which determine the exact position of the hyperplane. For all support vectors the absolute values of the slack variables are equal to the distances from these points to the edge of the separating margin. These distances are defined in units of half of the width of the separating margin. For correctly classified points within the separating margin, slack variable values are between zero and one. For misclassified points within the margin the values of the slack variables are between one and two. For other misclassified points they are greater than two. For points that are lying on the edge of the margin, Lagrange multipliers are between zero and C, and slack variables for these points are still equal to zero. For all other points, for which the values of slack variables are larger than zero, Lagrange multipliers assume the value of C.

Explicit mapping to a very high-dimensional space is not required if calculation of the scalar product in this high-dimensional space of every two vectors is feasible. This scalar product can be defined by introducing a kernel function $(\mathbf{x} \cdot \mathbf{x}') = K(x, x')$ [24], where $x$ and $x'$ are vectors in a low-dimensional space for which a kernel function that corresponds to a scalar product in a high-dimensional space is defined. Various kernels may be applied [25]. In our case, we used a kernel function of a fifth-order polynomial:

$$K(x, x') = ((x \cdot x') s + r)^5$$

This kernel corresponds to the decision function

$$f(x) = \operatorname{sign}\left( \sum_{i} \alpha_i K(x_i^{sv}, x) + b \right)$$

where $\alpha_i$ are Lagrange multipliers determined during training of SVM. The sum is only over support vectors $x^{sv}$. Lagrange multipliers for all other points are equal to zero.
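The fifth-order polynomial kernel and the resulting decision function can be sketched in a few lines. This is a minimal illustration, not the SVM-Light implementation: the support vectors, coefficients, and offset below are hypothetical, and the coefficients are assumed to already carry the sign of the class label, since the decision function in the text has no explicit y_i factor.

```python
def poly_kernel(x, x_prime, s=1.0, r=1.0, degree=5):
    """K(x, x') = ((x . x') * s + r) ** degree; degree 5 as used in the text."""
    dot = sum(a * b for a, b in zip(x, x_prime))
    return (dot * s + r) ** degree

def svm_decision(x, support_vectors, alphas, b, s=1.0, r=1.0):
    """f(x) = sign(sum_i alpha_i * K(x_i_sv, x) + b), summed over support vectors.

    Assumption: each alpha_i is a signed coefficient (label folded in).
    A total of exactly zero is mapped to +1 here by convention.
    """
    total = sum(a * poly_kernel(sv, x, s, r) for sv, a in zip(support_vectors, alphas))
    return 1 if total + b >= 0 else -1

# Hypothetical support vectors and coefficients (illustrative values only).
support_vectors = [[1.0, 0.0], [0.0, 1.0]]
alphas = [0.5, -0.5]
b = 0.0

k_same = poly_kernel([1.0, 0.0], [1.0, 0.0])       # ((1 * 1) + 1) ** 5
k_orthogonal = poly_kernel([1.0, 0.0], [0.0, 1.0])  # ((0 * 1) + 1) ** 5
label_a = svm_decision([1.0, 0.0], support_vectors, alphas, b)
label_b = svm_decision([0.0, 1.0], support_vectors, alphas, b)
```

Only kernel evaluations against the support vectors are needed at prediction time, so the high-dimensional mapping never has to be computed explicitly.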
Parameter $b$ determines the shift of the hyperplane, and it is also found during SVM training. Simultaneous scaling of the $s$, $r$, and $b$ parameters does not change the decision function. Thus, we can simplify the kernel by setting $r$ equal to one:

$$K(x, x') = ((x \cdot x') s + 1)^5$$

In this case only the kernel parameter $s$ and the error tradeoff $C$ must be tuned. Parameter $C$ is not present explicitly in this equation; it is set up as a penalty for the misclassification error before the training of SVM is performed. For tuning parameters $s$ and $C$, four-fold cross-validation of training data was applied, and values for $s$ and $C$ that maximize accuracy were then chosen. Accuracy maximization was performed by heuristics-based gradient descent [26]. Basically, the following procedure was applied. The data set was divided into two parts, a training and a validation set. The validation subset was put aside and used only for estimation of the performance of the trained classifier. Training data were divided into four nonoverlapping subsets. The SVM parameters to be determined were set to reasonable initial values. Then, the SVM was trained on the training data excluding one of the four subsets, and the performance of the obtained SVM classifier was estimated with the excluded subset. This procedure was repeated for each subset, and an average performance of the SVM classifier was obtained. For SVM training we used freely available SVM software (SVM-Light package; URL: org/) [26,27]. A Linux-based LSF (Load Sharing Facility; Platform Computing GmbH, D Ratingen, Germany) cluster was used for determination of the cross-validation error to reduce calculation time.

Figure 2. Architecture of artificial neural networks. Formal neurons are drawn as circles; weights are represented by lines connecting the neuron layers. Fan-out neurons are drawn in white, sigmoidal units in black, and linear units in gray. (a) Conventional three-layered feed-forward system ("architecture I"); (b) network architecture used by Ajay and co-workers for drug-likeness prediction ("architecture II") [6].
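The four-subset cross-validation procedure described above can be sketched as follows. Two points are assumptions rather than the authors' method: the search over s and C is a plain grid search (the paper used a heuristics-based gradient descent), and `make_toy_trainer` is a hypothetical stand-in classifier, not an SVM, used only so that the example runs without external libraries.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k non-overlapping subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validated_accuracy(X, y, train_fn, k=4, seed=0):
    """Train on k-1 subsets, score on the excluded one, and average over subsets."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        excluded = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in excluded]
        model = train_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(1 for i in held_out if model(X[i]) == y[i])
        scores.append(hits / len(held_out))
    return sum(scores) / k

def select_parameters(X, y, make_trainer, s_values, c_values, k=4):
    """Grid search (an assumed substitute for the paper's heuristic search)."""
    best = None
    for s in s_values:
        for c in c_values:
            acc = cross_validated_accuracy(X, y, make_trainer(s, c), k)
            if best is None or acc > best[0]:
                best = (acc, s, c)
    return best

def make_toy_trainer(s, c):
    """Hypothetical stand-in for an SVM trainer: a 1-D threshold rule whose
    threshold is distorted by s and c so that the two parameters matter."""
    def train(X_train, y_train):
        pos = [x[0] for x, label in zip(X_train, y_train) if label == 1]
        neg = [x[0] for x, label in zip(X_train, y_train) if label == -1]
        threshold = s * (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2 + c
        return lambda x: 1 if x[0] >= threshold else -1
    return train

# Toy data: ten nonclass samples near 0..9 and ten class samples near 100..109.
X = [[float(v)] for v in list(range(10)) + list(range(100, 110))]
y = [-1] * 10 + [1] * 10
folds = k_fold_indices(len(X), 4)
best_accuracy, best_s, best_c = select_parameters(
    X, y, make_toy_trainer, s_values=[0.1, 1.0], c_values=[0.0, 200.0])
```

In a real setting `make_trainer(s, C)` would wrap a call to an SVM package, and the separate validation subset would stay untouched until the parameters are fixed.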
All calculations were performed using the MATLAB package (MATLAB 2002, The mathematical laboratory. The MathWorks GmbH, D Aachen, Germany).

ARTIFICIAL NEURAL NETWORK

Conventional two-layered neural networks with a single output neuron were used for ANN model development (Figure 2a) [26]. As a result of network training a decision function is chosen from the family of functions represented by the network architecture. This function family is defined by the complexity of the neural network: number of hidden layers, number of neurons in these layers, and topology of the network. The decision function is determined by choosing appropriate weights for the neural network. Optimal weights usually minimize an error function for the particular network architecture. The error function describes the deviation of predicted target values from observed or desired values. For our class/nonclass classification problem the target values were 1 for class (drugs) and -1 for nonclass (nondrugs). A standard two-layered neural network with a single output neuron can be represented by the following equation

$$y = \tilde{g}\left( \sum_{j=1}^{M} w_{1j}^{(2)} \, g\left( \sum_{i=1}^{d} w_{ji}^{(1)} x_i + w_{j0}^{(1)} \right) + w_{10}^{(2)} \right)$$

with the error function $E = \sum_{k=1}^{n} (y(x_k) - y_k)^2$. In this work, $\tilde{g}$ is a linear function and $g$ is a tan-sigmoid transfer function. A second type of network architecture containing additional connections from the input layer to the output layer was trained to reimplement the original drug/nondrug ANN developed by Ajay and co-workers (Figure 2b) [6]. Training of neural networks is typically performed with variations of gradient-descent-based algorithms [26], trying to
minimize an error function. To avoid overfitting, cross-validation can be used for finding an earlier point of training [28]. In this work the neural network toolbox from MATLAB was used. Data were preprocessed identically to SVM-based learning. We applied the following training algorithms to ANN optimization in their default versions provided by MATLAB: gradient descent with variable learning rate [29,30], conjugated gradient descent [30,31], scaled conjugated gradient descent [32], quasi-Newton algorithm [33], Levenberg-Marquardt (LM) [34,35], and automated regularization [36]. For each optimization ten-times cross-validation was performed (80+20 splits into training and test data), where the ANN weights and biases were optimized using the training data, and prediction accuracy was measured using test data to determine the number of training epochs, i.e., the endpoint of the training process. This was performed to reduce the risk of overfitting. It should be noted that the validation data were left untouched.

Table 1. Cross-Validated Results of Machine Learning (a)
Rows: GC; MOE; CATS; all (GC+MOE+CATS). Columns: % correct (ANN, SVM); Matthews cc (ANN, SVM).
(a) Average values and standard deviations are given. The Levenberg-Marquardt training method was used for ANN training.

MODEL VALIDATION

The SVM model for drug/nondrug classification of a pattern $x$ was

$$\mathrm{SVM}(x) = \sum_{i} a_i K(x_i^{SV}, x) + b$$

Here, $i$ runs only over support vectors (SV). The value of SVM(x) is either positive ("drug") or negative ("nondrug"). The ANN model for drug/nondrug classification produced values in ]-1,1[, where a positive value meant "drug" and a negative value "nondrug". Classification accuracy was evaluated based on prediction accuracy, i.e., percent of test compounds correctly classified, and the correlation coefficient according to Matthews [37]:

$$cc = \frac{NP - OU}{\sqrt{(N + O)(N + U)(P + O)(P + U)}}$$

where P, N, O, and U are the number of true positive, true negative, false positive, and false negative predictions, respectively.
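The two accuracy measures can be computed directly from the four counts. The sketch below uses the symbols of the text (P, N, O, U); the function names and the convention of returning 0.0 when the denominator vanishes are assumptions made for this illustration.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count true positives (P), true negatives (N), false positives (O, overprediction),
    and false negatives (U, underprediction) for labels coded as +1 / -1."""
    P = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    N = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    O = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == 1)
    U = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == -1)
    return P, N, O, U

def matthews_cc(P, N, O, U):
    """cc = (N*P - O*U) / sqrt((N+O)(N+U)(P+O)(P+U))."""
    denominator = math.sqrt((N + O) * (N + U) * (P + O) * (P + U))
    if denominator == 0:
        return 0.0  # assumed convention when a row or column of the table is empty
    return (N * P - O * U) / denominator

def percent_correct(P, N, O, U):
    """Prediction accuracy as percent of compounds correctly classified."""
    return 100.0 * (P + N) / (P + N + O + U)

# Small hypothetical example: two drugs and two nondrugs, half of each predicted correctly.
counts = confusion_counts([1, 1, -1, -1], [1, -1, -1, 1])
cc_example = matthews_cc(*counts)
```

Unlike percent correct, cc penalizes a classifier that achieves high accuracy merely by favoring the larger class, which is why both numbers are reported side by side.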
Drugs were considered as the positive set; the nondrug molecules formed the negative set. The values of cc can range from -1 to 1. Perfect prediction gives a correlation coefficient of 1.

SVM and ANN models were developed using various sizes of training data to measure the influence of the size of the training set on the quality of the classification model. The number of training samples was iteratively diminished: Starting with a random split of all available samples into training and validation subsets, at each of the following iterations we diminished the size of the training set to only 80% of the number of samples of the previous iteration. This allowed us to obtain better sampling for small training sets. 10-times cross-validation was performed, and average values of prediction accuracy and cc were calculated.

RESULTS AND DISCUSSION

The main aim of this study was to compare SVM and ANN classifiers in their ability to distinguish between sets of drugs and nondrugs. We trained different neural network topologies, and the performance of the best network was compared to the SVM classifier. Two types of ANN architecture were considered: standard feed-forward networks with one hidden layer ("architecture I") and a feed-forward network with one hidden layer with additional direct connections from input neurons to the output ("architecture II") (Figure 2). The first type of ANN was used by Sadowski and Kubinyi in their original work on drug-likeness prediction [7]; the second architecture was employed by Ajay and co-workers serving the same purpose [6]. Using these networks and the GC descriptors in combination with the Levenberg-Marquardt training method, classification accuracy was identical to the original results (on average 80% correct) despite the use of a different training technique and different training data (Table 1). This observation substantiates the original findings. Both network types performed identically considering the error margin (approximately 80% correct classification).
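The training-set reduction scheme from the Model Validation section (each iteration keeps 80% of the previous training-set size) can be sketched as below. The starting size, stopping size, and function names are hypothetical, and drawing each subset independently from the pool is an assumption about a detail the text does not spell out.

```python
import random

def training_set_sizes(n_start, keep_percent=80, n_min=1):
    """Return the sequence of training-set sizes, each keep_percent of the previous one.

    Integer arithmetic is used so the sizes are exact; n_min is clamped to 1
    so that the loop always terminates.
    """
    n_min = max(1, n_min)
    sizes = []
    n = n_start
    while n >= n_min:
        sizes.append(n)
        n = n * keep_percent // 100
    return sizes

def draw_subsets(pool, sizes, seed=0):
    """Draw one random training subset of each size from the pool (assumed independent draws)."""
    rng = random.Random(seed)
    return [rng.sample(pool, size) for size in sizes]

# Hypothetical pool of 1000 sample indices, shrunk until fewer than 300 would remain.
sizes = training_set_sizes(1000, keep_percent=80, n_min=300)
subsets = draw_subsets(list(range(1000)), sizes)
```

Because the sizes shrink geometrically, small training sets are sampled at many closely spaced sizes, which matches the stated aim of obtaining better sampling for small training sets.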
For some of the training algorithms we observed a slightly lower standard deviation of the prediction accuracy for architecture I (data not shown). Since the additional connections in network architecture II did not contribute to a greater accuracy of the model, we used only the standard feed-forward network with one hidden layer containing two neurons (architecture I) for further analysis. For each training method and combination of input variables (descriptors), networks with different numbers of hidden neurons (2-10 neurons) were trained. We did not observe an overall best training algorithm. The Levenberg-Marquardt method was used for the development of the final ANN model. Also, we did not observe an improved classification result when the number of hidden neurons was larger than two (data not shown). ANN architecture I with two hidden neurons yielded the overall best cross-validated prediction result for all descriptors (GC+MOE+CATS), 80% correct predictions (cc = 0.58). The rank order of descriptor sets with regard to the overall classification accuracy was as follows: All > GC > MOE > CATS (Table 1). It should be stressed that the differences in classification accuracy are minute for the descriptors All, MOE, and GC and should be regarded as comparable considering a standard deviation of 1%. The CATS descriptor led to approximately 5% lower accuracy.
Figure 3. Average cross-validated prediction accuracy (fraction correct) of SVM and ANN classifiers optimized by various training schemes for GC descriptors (upper graph: logarithmic scale; lower graph: linear scale).

SVM training resulted in models showing slightly higher prediction accuracy than the ANN systems (Table 1). A 1-2% gain was observed, independent of the number of training samples and the method used for neural network training. Figures 3 and 4 illustrate the dependency of the classification accuracy on the number of sample molecules used for training. In one experiment only GC descriptors were used (Figure 3); in a second study the combination of GC, MOE, and CATS descriptors was employed (Figure 4). With the GC descriptor the SVM estimator only slightly outperforms the neural networks (Figure 3). Similar results were obtained if only MOE or CATS descriptors were used for training (data not shown). The situation changed when all descriptors were used. With the complete descriptor set (525-dimensional) SVM clearly outperforms the neural network system (Figure 4). These results substantiate earlier findings that SVM performs better than ANN when large numbers of features or descriptors are used [12].

A general observation was the fact that classification accuracy significantly improved with an increasing number of training samples, reaching a plateau in performance between 2000 and 3000 samples (Figures 3 and 4). The accuracy curves represent almost ideal learning behavior. It should be mentioned that the performance plateau observed does not reflect an inherent clustering of the data set, as training data subsets were randomly selected from the pool. The fraction correctly predicted grows from approximately 65% to 80% when the training set is increased by a factor of 250. The combination of MOE, GC, and CATS descriptors improved classification accuracy by approximately two percent for SVM and by one percent for ANN compared to models based on individual descriptors.
These results demonstrate that optimal ANN training depends to a large extent on the number of training patterns available and the type of molecular descriptors used. For instance, for GC descriptors the best learning algorithm was training with
automated regularization, but for the combination of GC, MOE, and CATS descriptors this algorithm was extremely slow and converged relatively unstably. In contrast, SVM generally performed more stably compared to ANN, with only a small increase in computation time for both sets of descriptors (Figures 3 and 4).

Figure 4. Average cross-validated prediction accuracy (fraction correct) of SVM and ANN classifiers optimized by various training schemes for the combination of GC, MOE, and CATS descriptors (upper graph: logarithmic scale; lower graph: linear scale).

In a previous comparison of SVM to several machine learning methods by Holden and co-workers it was shown that an SVM classifier outperformed other standard methods, but a specially designed and structurally optimized neural network was again superior to the SVM model in a benchmark test [13]. This observation is supported by our finding that in the present study the set of molecules which were correctly classified by both SVM and ANN ("mutual true positives") was 72% on average, and the fraction incorrectly classified by both systems ("mutual false negatives") was 11%. 10% of the test data were correctly predicted by SVM but failed by ANN, and 6% were correctly classified by ANN but not by SVM using the full set of descriptors (GC+MOE+CATS). Examples of the latter two sets of molecules are shown in Figure 5. Clearly, the ANN classifier and the SVM classifier complement each other, and both methods could be further optimized, for example, by changing the SVM kernel or by exploring more sophisticated ANN architectures and concepts. Fast classifier systems are mainly developed for first-pass virtual screening, in particular for identification ("flagging") of potentially undesired molecules in very large compound collections.
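The overlap analysis between the two classifiers (both correct, both wrong, only one correct) amounts to a simple tally over the test compounds. The sketch below shows one way to compute these fractions; the label and prediction lists are small hypothetical examples and do not reproduce the paper's data.

```python
def agreement_fractions(y_true, pred_a, pred_b):
    """Return the fraction of samples that both classifiers get right, both get wrong,
    only classifier A gets right, and only classifier B gets right."""
    counts = {"both_correct": 0, "both_wrong": 0, "only_a": 0, "only_b": 0}
    for truth, a, b in zip(y_true, pred_a, pred_b):
        a_ok, b_ok = (a == truth), (b == truth)
        if a_ok and b_ok:
            counts["both_correct"] += 1
        elif a_ok:
            counts["only_a"] += 1
        elif b_ok:
            counts["only_b"] += 1
        else:
            counts["both_wrong"] += 1
    n = len(y_true)
    return {key: value / n for key, value in counts.items()}

# Hypothetical labels (+1 drug, -1 nondrug) and predictions from two classifiers.
y_true = [1, 1, 1, -1, -1, -1, 1, -1]
svm_pred = [1, 1, -1, -1, -1, 1, 1, -1]
ann_pred = [1, -1, -1, -1, 1, -1, 1, -1]
overlap = agreement_fractions(y_true, svm_pred, ann_pred)
```

A large "only_a" or "only_b" fraction indicates that the two classifiers err on different compounds, which is the situation in which combining them can pay off.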
[2] Due to robust convergence behavior SVM seems to be well-suited for solving binary decision problems in molecular informatics, especially when a large number of descriptors is available for characterization of molecules. In this study we have shown that two drug-likeness estimators can produce complementary predictions. We recommend the parallel application of both predictive systems for virtual screening applications. One possibility to combine several estimators for drug-likeness or any other classification task is to employ a "jury decision", e.g. calculate an ensemble average [38,39]. As more and more different predictors become available for virtual screening, a meaningful combination of prediction systems that exploits the individual strengths of the different methods will be pivotal for reliable compound library filtering.

Figure 5. Examples of drugs correctly classified by ANN but not by SVM (structures 1-5), and drugs correctly classified by SVM but not by ANN (structures 6-10).

CONCLUSION

It was demonstrated that the SVM system used in this study has the capacity to produce higher overall prediction accuracy than a particular ANN architecture. Based on this observation we conclude that SVM represents a useful method for classification tasks in QSAR modeling and virtual screening, especially when large numbers of input variables are used. The SVM classifier was shown to complement the predictions obtained by ANN. The SVM and ANN classifiers obtained for drug-likeness prediction are comparable in overall accuracy and produce overlapping, yet not identical sets of correctly and misclassified compounds. A similar observation can be made when two ANN models are compared. Different ANN architectures and training algorithms were shown to lead to different classification results. Therefore, it might be wise to apply several predictive models in parallel, irrespective of their nature, i.e., being SVM- or ANN-based.

We wish to stress that our study does not justify the conclusion that SVM outperforms ANN in general. In the present work only a standard feed-forward network with a fixed number of hidden neurons was compared to a standard SVM implementation. Nevertheless, our results indicate that solutions obtained by SVM training seem to be more robust with a smaller standard error compared to standard ANN training. Irrespective of the outcome of this study, it is the appropriate choice of training data and descriptors, and reasonable scaling of input variables that determines the success or failure of machine learning systems. Both methods are suited to assess the usefulness of different descriptor sets for a given classification task, and they are methods of choice for rapid first-pass filtering of compound libraries [40]. A particular advantage of SVM is sparseness of the solution. This means that an SVM classifier depends only on the support vectors, and the classifier function is not influenced by the whole data set, as is the case for many neural network systems. Another characteristic of SVM is the possibility to efficiently deal with a very large number of features due to the exploitation of kernel functions, which makes it an attractive technique, e.g., for gene chip analysis or high-dimensional chemical spaces. The combination of SVM with a feature selection routine might provide an efficient tool for extracting chemically relevant information.

ACKNOWLEDGMENT

The authors are grateful to Norbert Dichter and Ralf Tomczak for setting up the LSF Linux cluster. Alireza Givehchi is thanked for assistance in installing the gecco! Web interface. This work was supported by the Beilstein-Institut zur Förderung der Chemischen Wissenschaften, Frankfurt.

REFERENCES AND NOTES

(1) Clark, D. E.; Pickett, S. D. Computational methods for the prediction of drug-likeness. Drug Discov. Today 2000, 5.
(2) Schneider, G.; Böhm, H.-J. Virtual screening and fast automated docking methods. Drug Discov. Today 2002, 7.
(3) Wold, S. Exponentially weighted moving principal component analysis and projections to latent structures. Chemomet. Intell. Lab. Syst. 1994, 23.
(4) Forina, M.; Casolino, M. C.; de la Pezuela Martinez, C. Multivariate calibration: applications to pharmaceutical analysis. J. Pharm. Biomed. Anal. 1998, 18.
(5) Neural Networks in QSAR and Drug Design; Devillers, J., Ed.; Academic Press: London.
(6) Ajay; Walters, W. P.; Murcko, M. A. Can we learn to distinguish between drug-like and nondrug-like molecules? J. Med. Chem. 1998, 41.
(7) Sadowski, J.; Kubinyi, H. A scoring scheme for discriminating between drugs and nondrugs. J. Med. Chem. 1998, 41.
(8) Sadowski, J. Optimization of chemical libraries by neural networks. Curr. Opin. Chem.
Biol. 2000, 4.
(9) Schneider, G. Neural networks are useful tools for drug design. Neural Networks 2000, 13.
(10) Sadowski, J. In Virtual Screening for Bioactive Molecules; Böhm, H.-J., Schneider, G., Eds.; Weinheim: Wiley-VCH: 2000.
(11) Cortes, C.; Vapnik, V. Support-vector networks. Machine Learning 1995, 20.
(12) Vapnik, V. The Nature of Statistical Learning Theory; Berlin: Springer.
(13) Burbidge, R.; Trotter, M.; Buxton, B.; Holden, S. Drug design by machine learning: support vector machines for pharmaceutical data analysis. Comput. Chem. 2001, 26.
(14) Warmuth, M. K.; Liao, J.; Ratsch, G.; Mathieson, M.; Putta, S.; Lemmen, C. Active learning with Support Vector Machines in the drug discovery process. J. Chem. Inf. Comput. Sci. 2003, 43.
(15) Wilton, D.; Willett, P.; Lawson, K.; Mullier, G. Comparison of ranking methods for virtual screening in lead-discovery programs. J. Chem. Inf. Comput. Sci. 2003, 43.
(16) Todeschini, R.; Consonni, V. Handbook of Molecular Descriptors; Weinheim: Wiley-VCH.
(17) Ghose, A. K.; Crippen, G. M. Atomic physicochemical parameters for three-dimensional structure-directed quantitative structure-activity relationships 1. Partition coefficients as a measure of hydrophobicity. J. Comput. Chem. 1986, 7.
(18) Ghose, A. K.; Crippen, G. M. Atomic physicochemical parameters for three-dimensional structure-directed quantitative structure-activity relationships 2. Modeling dispersive and hydrophobic interactions. J. Comput. Chem. 1987, 27.
(19) Ghose, A. K.; Pritchett, A.; Crippen, G. M. Atomic physicochemical parameters for three-dimensional structure-directed quantitative structure-activity relationships 3. J. Comput. Chem. 1988, 9.
(20) Schneider, G.; Neidhart, W.; Giller, T.; Schmid, G. Scaffold-hopping by topological pharmacophore search: a contribution to virtual screening. Angew. Chem., Int. Ed. Engl. 1999, 38.
(21) Gasteiger, J.; Rudolph, C.; Sadowski, J. Automatic generation of 3D-atomic coordinates for organic molecules. Tetrahedron Comput. Methods 1990, 3.
(22) Coleman, T. F.; Li, Y. A reflective Newton method for minimizing a quadratic function subject to bounds on some of the variables. SIAM J. Optimization 1996, 6.
(23) Joachims, T. In Making large-scale SVM learning practical. Advances in Kernel Methods - Support Vector Learning; Schölkopf, B., Burges, C., Smola, A., Eds.; MIT-Press: Cambridge, MA, 1999.
(24) Cristianini, N.; Shawe-Taylor, J. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods; Cambridge University Press: Cambridge.
(25) Burges, C. J. C. A tutorial on support vector machines for pattern recognition. Data Mining Knowledge Discovery 1998, 2.
(26) Bishop, C. M. Neural Networks for Pattern Recognition; Oxford: Oxford University Press.
(27) Joachims, T. Learning to classify text using Support Vector Machines. Kluwer International Series in Engineering and Computer Science 668; Kluwer Academic Publishers: Boston.
(28) Duda, R. O.; Hart, P. E.; Stork, D. G. Pattern Classification; Wiley-Interscience: New York.
(29) Rumelhart, D. E.; McClelland, J. L.; The PDP Research Group. Parallel Distributed Processing; MIT Press: Cambridge, MA.
(30) Hagan, M. T.; Demuth, H. B.; Beale, M. H. Neural Network Design; PWS Publishing: Boston.
(31) Fletcher, R.; Reeves, C. M. Function minimization by conjugate gradients. Comput. J. 1964, 7.
(32) Moller, M. F. A scaled conjugate gradient algorithm for fast supervised learning. Neural Networks 1993, 6.
(33) Dennis, J. E.; Schnabel, R. B. Numerical Methods for Unconstrained Optimization and Nonlinear Equations; Prentice-Hall: Englewood Cliffs.
(34) Hagan, M. T.; Menhaj, M. Training feedforward networks with the Marquardt algorithm. IEEE Trans. Neural Networks 1994, 5.
(35) Foresee, F. D.; Hagan, M. T. Gauss-Newton approximation to Bayesian regularization. Proceedings of the 1997 International Joint Conference on Neural Networks.
(36) MacKay, D. J. C. Bayesian interpolation. Neural Comput. 1992, 4.
(37) Matthews, B. W. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim. Biophys. Acta 1975, 405.
(38) Krogh, A.; Sollich, P. Statistical mechanics of ensemble learning. Phys. Rev. E 1997, 55.
(39) Baldi, P.; Brunak, S. Bioinformatics - The Machine Learning Approach; MIT Press: Cambridge.
(40) Byvatov, E.; Schneider, G. Support vector machine applications in bioinformatics. Appl. Bioinf. 2003, 2.
