Mining Feature Importance: Applying Evolutionary Algorithms within a Web-based Educational System

Behrouz MINAEI-BIDGOLI 1, Gerd KORTEMEYER 2, and William F. PUNCH 1
1 Genetic Algorithms Research and Applications Group (GARAGe), Department of Computer Science & Engineering, Michigan State University, East Lansing, MI 48824, USA, {mnaeb,
2 Division of Science and Math Education, Michigan State University, LITE lab, East Lansing, MI 48824, USA

ABSTRACT

A key objective of data mining is to uncover the hidden relationships among the objects in a given data set. Classification has emerged as a popular task to find groups of similar objects in order to predict unseen test data points. Classification fusion combines multiple classifications of data into a single classification solution of higher accuracy. Feature extraction aims to reduce the computational cost of feature measurement, increase classifier efficiency, and allow greater classification accuracy based on the process of deriving new features from the original features. Web-based educational systems now collect vast amounts of data on user patterns, and data mining methods can be applied to these databases to discover interesting associations based on students' features, problems' attributes, and the actions taken by students in solving homework and exam problems. This paper presents an approach for classifying students in order to predict their final grades based on features extracted from logged data in an educational web-based system. By weighting feature vectors representing feature importance using a Genetic Algorithm, we can optimize the prediction accuracy and obtain a significant improvement over raw classification. This work represents a rigorous application of known classifiers as a means of analyzing and comparing use and performance of students who have taken a technical course that was partially/completely administered via the web.

Keywords: Web-based Educational System, Data Mining, Classification Fusion, Prediction, Genetic Algorithm, Feature Extraction, Feature Importance.

1. INTRODUCTION

The ever-increasing progress of network-distributed computing, and particularly the rapid expansion of the web, has had a broad impact on society in a relatively short period of time. Education is on the brink of a new era based on these changes. Online delivery of educational instruction provides the opportunity to bring colleges and universities new energy, students, and revenues. Many leading educational institutions are working to establish an online teaching and learning presence. Several web-based educational systems with different capabilities and approaches have been developed to deliver online education in an academic setting. In particular, Michigan State University (MSU) has pioneered some of these systems to provide an infrastructure for online instruction. The research presented here was performed on a part of the latest online educational system developed at MSU, the Learning Online Network with Computer-Assisted Personalized Approach (LON-CAPA).

LON-CAPA is involved with three kinds of large data sets: 1) educational resources such as web pages, demonstrations, simulations, and individualized problems designed for use on homework assignments, quizzes, and examinations; 2) information about users who create, modify, assess, or use these resources; and 3) activity log databases, which log actions taken by students in solving homework and exam problems. In other words, we have three ever-growing pools of data. This paper investigates methods for extracting useful and interesting patterns from these large databases using online educational resources and their recorded paths within the system. We aim to answer the following research questions: Can we find classes of students? In other words, do there exist groups of students who use these online resources in a similar way? If so, can we predict a class for any individual student? With this information, can we then help a student use the resources better, based on the usage of the resource by other students in their groups?

We find similar patterns of use in the data gathered from LON-CAPA, and eventually make predictions as to the most beneficial course of studies for each learner based on their past and present usage. The system could then make suggestions to the learner as to how best to proceed.

2. BACKGROUND

Genetic Algorithms (GAs) have been shown to be an effective tool to use in data analysis and pattern recognition [1], [2], [3]. An important aspect of GAs in a learning context is their use in pattern recognition. There are two different approaches to applying a GA in pattern recognition:

1. Apply a GA directly as a classifier. Bandyopadhyay and Murthy in [4] applied a GA to find the decision boundary in N-dimensional feature space.
2. Use a GA as an optimization tool for resetting the parameters in other classifiers. Most applications of GAs in pattern recognition optimize some parameters in the classification process.

Many researchers have used GAs in feature selection [5], [6], [7], [8]. GAs have also been applied to find an optimal set of feature weights that improve classification accuracy. First, a traditional feature extraction method such as Principal Component Analysis (PCA) is applied, and then a classifier such as k-NN is used to calculate the fitness function for the GA [9], [10]. Combination of classifiers is another area that GAs have been used to optimize. Kuncheva and Jain in [11] used a GA for selecting the features as well as selecting the types of individual classifiers in their design of a Classifier Fusion System. GAs are also used in selecting the prototypes in case-based classification [12].

In this paper we focus on the second approach and use a GA to optimize a combination of classifiers.
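To make the second approach concrete, the following is a minimal sketch of how a candidate vector of feature weights could be scored by a nearest-neighbor classifier, with the resulting accuracy serving as a GA fitness value. It is an illustration under stated assumptions, not the authors' implementation: the function names (`weighted_distance`, `knn_predict`, `fitness`), the weighted Euclidean distance, and the train/test split passed in by the caller are all hypothetical choices.

```python
import math


def weighted_distance(a, b, weights):
    """Euclidean distance after scaling each feature difference by its weight."""
    return math.sqrt(sum((w * (x - y)) ** 2 for w, x, y in zip(weights, a, b)))


def knn_predict(train_x, train_y, query, weights, k=3):
    """Majority label among the k training points nearest to `query`."""
    ranked = sorted(
        range(len(train_x)),
        key=lambda i: weighted_distance(train_x[i], query, weights),
    )
    votes = {}
    for i in ranked[:k]:
        votes[train_y[i]] = votes.get(train_y[i], 0) + 1
    # Ties go to the label that appeared first among the nearest points.
    return max(votes, key=votes.get)


def fitness(weights, train_x, train_y, test_x, test_y, k=3):
    """Accuracy of the weighted k-NN on held-out data (assumed fitness measure)."""
    correct = sum(
        knn_predict(train_x, train_y, q, weights, k) == label
        for q, label in zip(test_x, test_y)
    )
    return correct / len(test_x)
```

A weight of zero removes a feature from the distance entirely, so feature selection is the special case in which every weight is either 0 or 1; real-valued weights let the search express graded importance instead.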
Our objective is to predict the students' final grades based on their web-use features, which are extracted from the homework data. We design, implement, and evaluate a series of pattern classifiers with various parameters in order to compare their performance on a dataset from LON-CAPA. Error rates for the individual classifiers, their combination, and the GA-optimized combination are presented.

Two approaches have been proposed for the problem of feature extraction and selection. The filter model chooses features by heuristically determined goodness/relevance or knowledge, while the wrapper model does this by the feedback of classifier evaluation, or experiment. Research has shown that the wrapper model outperforms the filter model when comparing the predictive power on unseen data [13]. We propose a wrapper model for feature extraction through setting different weights for features and getting feedback from ensembles of classifiers.

3. DATASETS AND FEATURES

We selected 14 student/course data sets of LON-CAPA courses, which were held at MSU in fall semester 2002 (FS02), spring semester 2003 (SS03), and fall semester 2003 (FS03), as shown in Table 1.

Table 1. MSU courses held with LON-CAPA

Course | Title | Term
ADV 205 | Principles of Advertising | SS03
BS 111 | Biological Science: Cells and Molecules | SS02
BS 111 | Biological Science: Cells and Molecules | SS03
CE 280 | Civil Engineering: Intro Environment Eng. | SS03
FI 414 | Advanced Business Finance (w) | SS03
LBS 271 | Lyman Briggs School: Physics I | FS02
LBS 272 | Lyman Briggs School: Physics II | SS03
MT 204 | Medical Tech.: Mechanisms of Disease | SS03
MT 432 | Clinic Immun. & Immunohematology | SS03
PHY 183 | Physics Scientists & Engineers I | SS02
PHY 183 | Physics Scientists & Engineers I | SS03
PHY 231c | Introductory Physics I | SS03
PHY 232c | Introductory Physics II | FS03
PHY 232 | Introductory Physics II | FS03

Table 2. Characteristics of the courses held with LON-CAPA (rows in the same order as Table 1)

Course | # of Students | # of Problems | Size of Activity log | Size of useful data | # of Transactions
ADV 205 | | | | 12.1 MB | 424,481
BS 111 | | | | 34.1 MB | 1,112,394
BS 111 | 402 | 229 | | 50.2 MB | 1,689,656
CE 280 | | | | 3.5 MB | 127,779
FI 414 | | | | 2.2 MB | 83,715
LBS 271 | | | | 18.7 MB | 706,700
LBS 272 | | | | 15.3 MB | 585,524
MT 204 | | | | 0.7 MB | 23,741
MT 432 | | | | 2.4 MB | 90,120
PHY 183 | | | | 21.3 MB | 452,342
PHY 183 | | | | 26.8 MB | 889,775
PHY 231c | | | | 14.1 MB | 536,691
PHY 232c | | | | 10.9 MB | 412,646
PHY 232 | | | | 19.7 MB | 981,568

Table 2 shows the characteristics of these 14 courses; for example, the third row of the table shows that BS 111 (Biological Science: Cells and Molecules), held in spring semester 2003, integrated 229 online homework problems, and 402 students used LON-CAPA for this course. The BS 111 course had an activity log of approximately 368 MB. Using some Perl script modules for cleansing the data, we found 48 MB of useful data in the BS 111 SS03 course. We then pulled out from these logged data 1,689,656 transactions (interactions between students and homework/exam/quiz problems), from which we extracted the following nine features:

1. Total number of attempts before the correct answer is derived
2. Total number of correct answers (success rate)
3. Success on the first try
4. Success on the second try
5. Success after 3 to 9 attempts
6. Success after 10 or more attempts
7. Total time from first attempt until the correct answer
8. Total time spent on the problem, regardless of success
9. Participation in the provided online communication mechanisms with other students and instructional staff

Based on the above extracted features in each course, we classify the students and try to predict, for every individual student in the test data, the class to which he/she belongs. We categorize the students with one of two class labels: Passed for grades higher than 2.0, and Failed for grades less than or equal to 2.0, where the MSU grading system is based on grades from 0.0 to 4.0.

4. CLASSIFICATION FUSION

Pattern recognition has a wide variety of applications in many different fields, such that it is not possible to come up with a single classifier that can give good results in all cases. The optimal classifier in every case is highly dependent upon the problem domain. In practice, one might come across a case where no single classifier can achieve an acceptable level of accuracy. In such cases it would be better to pool the results of different classifiers to achieve the optimal accuracy. Every classifier operates well on different aspects of the training or test feature vector. As a result, assuming appropriate conditions, combining multiple classifiers may improve classification performance when compared with any single classifier.

The scope of this study is restricted to comparing some popular non-parametric pattern classifiers and a single parametric pattern classifier according to the error estimate. Four different classifiers using the LON-CAPA dataset are compared in this study.
The classifiers used in this study include the Quadratic Bayesian classifier, 1-nearest neighbor (1-NN), k-nearest neighbor (k-NN), and Parzen window (footnote 1). These are some of the common classifiers used in most practical classification problems. After some preprocessing operations, the optimal k=3 is chosen for the k-NN algorithm. To improve classification performance, a fusion of classifiers is performed.

Footnote 1: The classifiers are coded in MATLAB 6.5.

Normalization. Having assumed in the Bayesian and Parzen-window classifiers that the features are normally distributed, it is necessary that the data for each feature be normalized. This ensures that each feature has the same weight in the decision process. Assuming that the given data is Gaussian, this normalization is performed using the mean and standard deviation of the training data. In order to normalize the training data, it is necessary first to calculate the sample mean µ and the standard deviation σ of each feature in this dataset, and then normalize the data using Eq. (1):

$$x' = \frac{x - \mu}{\sigma} \qquad (1)$$

Under the Gaussian assumption, each feature of the normalized training dataset then follows a normal distribution with a mean of zero and a standard deviation of unity. In addition, the k-NN method requires normalization of all features into the same range.

Combination of Multiple Classifiers. By combining multiple classifiers we aim to improve classifier performance. There are different ways one can think of combining classifiers:

The simplest way is to find the overall error rate of the classifiers and choose the one which has the least error rate on the given dataset. This is called offline classification fusion. This may not look like a fusion at all; however, in general, it has a better performance than individual classifiers.

The second method, which is called online classification fusion, uses all the classifiers followed by a vote. The class getting the maximum votes from the individual classifiers will be assigned to the test sample.

Using the second method we show that classification fusion can achieve a significant accuracy improvement in all given data sets.
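The normalization of Eq. (1) and the online (voting) fusion can be sketched as follows. This is a minimal illustration, assuming each sample is a plain list of feature values and each classifier has already produced a label; the function names are hypothetical and the paper's own code was written in MATLAB. The sketch also assumes every feature has non-zero spread in the training data, since a constant feature would make σ equal to zero.

```python
import statistics
from collections import Counter


def fit_zscore(train_rows):
    """Per-feature mean and sample standard deviation, from training data only."""
    columns = list(zip(*train_rows))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]
    return means, stds


def apply_zscore(rows, means, stds):
    """Apply Eq. (1), x' = (x - mu) / sigma, to every feature of every row."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def majority_vote(predictions):
    """Online fusion: the label predicted by the most classifiers wins."""
    return Counter(predictions).most_common(1)[0][0]
```

The statistics are estimated on the training split and then reused unchanged for the test split, so no information from the test data leaks into the normalization. With four classifiers and two classes a 2-2 tie is possible; in this sketch the tie goes to whichever label appears first in the list of predictions, which is an arbitrary choice.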
A GA is employed to determine whether classification fusion performance can be maximized.

5. OPTIMIZING CLASSIFICATION FUSION WITH GENETIC ALGORITHMS

Our goal is to find a population of best weights for every feature vector, which minimize the classification error rate. The feature vector for our predictors is the set of nine variables for every student: number of attempts before the correct answer is derived, success rate, success at the first try, success at the second try, success with a number of tries between three and nine, success with a high number of tries, the time at which the student got the problem correct relative to the due date, and total time spent on the problem. We randomly initialized a population of nine-dimensional weight vectors with values between 0 and 1, corresponding to the feature vector, and experimented with different population sizes. We found good results using a population with 200 individuals.

Real-valued populations may be initialized using the GA Toolbox for MATLAB function crtrp. For example, to create a random population of 200 individuals with nine variables each, we define boundaries on the variables in FieldD, a matrix containing the boundaries of each variable of an individual:

    FieldD = [0 0 0 0 0 0 0 0 0;    % lower bound
              1 1 1 1 1 1 1 1 1];   % upper bound

We create an initial population with

    Chrom = crtrp(200, FieldD)

so that Chrom holds one nine-dimensional weight vector per individual.

We used the simple genetic algorithm (SGA), which is described by Goldberg in [14]. The SGA uses common GA operators to find a population of solutions which optimize the fitness values. We used Stochastic Universal Sampling [14] as our selection method. A form of stochastic universal sampling is implemented by obtaining a cumulative sum of the fitness vector, FitnV, and generating N equally spaced numbers between 0 and sum(FitnV). Thus, only one random number is generated, all the others used being equally spaced from that point. The index of the individuals selected is determined by comparing the generated numbers with the cumulative sum vector. The probability of an individual being selected is then given by

$$F(x_i) = \frac{f(x_i)}{\sum_{j=1}^{N_{ind}} f(x_j)} \qquad (2)$$

where f(x_i) is the fitness of individual x_i and F(x_i) is the probability of that individual being selected.

The operation of crossover is not necessarily performed on all strings in the population. Instead, it is applied with a probability Px when the pairs are chosen for breeding. We selected Px = 0.7. Intermediate recombination combines parent values using the following formula [15]:

$$\text{Offspring} = \text{parent1} + \text{Alpha} \times (\text{parent2} - \text{parent1}) \qquad (3)$$

where Alpha is a scaling factor chosen uniformly in the interval [-0.25, 1.25].

A further genetic operator, mutation, is applied to the new chromosomes with a set probability Pm. Mutation causes the individual genetic representation to be changed according to some probabilistic rule. Mutation is generally considered to be a background operator that ensures that the probability of searching a particular subspace of the problem space is never zero. This has the effect of tending to inhibit the possibility of converging to a local optimum, rather than the global optimum. We considered 1/800 as our mutation rate. The mutation of each variable is calculated as follows:

$$\text{Mutated Var} = \text{Var} + \text{MutMx} \times \text{range} \times \text{MutOpt}(2) \times \text{delta} \qquad (4)$$

where delta is an internal matrix which specifies the normalized mutation step size; MutMx is an internal mask table; and MutOpt specifies the mutation rate and its shrinkage during the run.

During the reproduction phase, each individual is assigned a fitness value derived from its raw performance measure given by the objective function. This value is used in the selection to bias towards more fit individuals. Highly fit individuals, relative to the whole population, have a high probability of being selected for mating, whereas less fit individuals have a correspondingly low probability of being selected. The error rate is measured in each round of cross validation by dividing the total number of misclassified examples by the total number of test examples. Therefore, our fitness function measures the accuracy rate achieved by classification fusion, and our objective is to maximize this performance (minimize the error rate).

6. EXPERIMENTS

Without using the GA, the overall results of classification performance on our datasets for the four classifiers and classification fusion are shown in Table 3. Regarding individual classifiers, 1-NN and k-NN mostly have the best performance. However, classification fusion improved the classification accuracy significantly in all data sets. That is, it achieved on average 79% accuracy over the given data sets.

Table 3.
Comparing the average performance (%) of ten runs of classifiers on the given datasets using 10-fold cross validation, without GA

Data set | Bayes | 1-NN | k-NN | Parzen Window | Classification Fusion
ADV 205 | | | | |
BS 111 | | | | |
BS 111 | | | | |
CE 280 | | | | |
FI 414 | | | | |
LBS 271 | | | | |
LBS 272 | | | | |
MT 204 | | | | |
MT 432 | | | | |
PHY 183 | | | | |
PHY 183 | | | | |
PHY 231c | | | | |
PHY 232c | | | | |
PHY 232 | | | | |

For GA optimization, we used 200 individuals (weight vectors) in our population, running the GA over 500 generations. We ran the program 10 times and took the averages, which are shown in Table 4. The results in Table 4 represent the mean performance with a two-tailed t-test with a 95% confidence interval for every data set. For the improvement of the GA over the non-GA result, a P-value indicating the probability of the null hypothesis (there is no improvement) is also given, showing the significance of the GA optimization. All have p<0.001, indicating significant improvement. Therefore, using the GA, in all cases we obtained more than a 10% improvement in mean individual performance and about a 10 to 17% improvement in best individual performance. Fig. 1 shows the results of one of the ten runs in the case of 2 classes (Passed and Failed). The dotted line represents the population mean, and the solid line shows the best individual at each generation and the best value yielded by the run. (Due to space limitations, only a graph for the BS 111 GA optimization is shown.)

Table 4. Comparing the classification fusion performance on the given datasets, without GA, using GA (mean individual), and improvement, 95% confidence interval

Data set | Without GA | GA optimized | Improvement
ADV 205 | ± | ± | ± 0.94
BS 111 | ± | ± | ± 1.65
BS 111 | ± | ± | ± 1.33
CE 280 | ± | ± | ± 1.41
FI 414 | ± | ± | ± 1.76
LBS 271 | ± | ± | ± 0.64
LBS 272 | ± | ± | ± 0.62
MT 204 | ± | ± | ± 1.32
MT 432 | ± | ± | ± 1.28
PHY 183 | ± | ± | ± 1.92
PHY 183 | ± | ± | ± 1.14
PHY 231c | ± | ± | ± 1.34
PHY 232c | ± | ± | ± 1.53
PHY 232 | ± | ± | ± 2.23
Total Average | ± | ± | ± 56

Finally, we can examine the individuals (weights) for features by which we obtained the improved results. This feature weighting indicates the importance of each feature for making the required classification. In most cases the results are similar to Multiple Linear Regressions or some tree-based software (like CART) that use statistical methods to measure feature importance. The GA feature weighting results, as shown in Table 5, state that success with a high number of tries is the most important feature. The total number of correct answers is also the most important feature in some cases.

Fig. 1. GA-optimized Combination of Multiple Classifiers (CMC) performance in the case of 2 class labels (Passed and Failed) for BS 111, 200 weight vectors (individuals), 500 generations

Table 5. Relative feature importance (%), using GA weighting, for the BS 111 course

Feature | Importance %
Average Number of Tries | 18.9
Total number of Correct Answers | 84.7
# of Success at the First Try | 24.4
# of Success at the Second Try | 26.5
Got Correct with 3-9 Tries | 21.2
Got Correct with # of Tries |
Time Spent to Solve the Problems | 32.1
Total Time Spent on the Problems | 36.5
# of communication | 3.6

Table 5 (CART) shows the importance of the nine features in the BS 111 SS03 course, applying the Gini splitting criterion. Based on Gini, a statistical property called information gain measures how well a given feature separates the training examples in relation to their target classes. Gini characterizes the impurity of an arbitrary collection of examples S at a specific node N. In [16] the impurity of a node N is denoted by i(N) such that:

$$\text{Gini}(S) = i(N) = \sum_{i \neq j} P(\omega_i)\,P(\omega_j) = 1 - \sum_{j} P^2(\omega_j) \qquad (5)$$

where P(ω_j) is the fraction of examples at node N that go to category ω_j. Gini attempts to separate classes by focusing on one class at a time. It will always favor working on the largest or, if you use costs or weights, the most important class in a node.
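As a worked illustration of Eq. (5), the sketch below computes the Gini impurity of a set of class labels and the impurity decrease produced by a candidate split. The function names and the size-weighted decrease are assumptions chosen for illustration (the weighted decrease is the usual way a tree-based tool scores a split); this is not the CART software used in the paper.

```python
from collections import Counter


def gini_impurity(labels):
    """Eq. (5): 1 minus the sum of squared class fractions at a node."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    children = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - children
```

For a node holding two Passed and two Failed students the impurity is 1 - (0.5² + 0.5²) = 0.5, the maximum for two classes; a split that separates the two classes perfectly leaves both children pure, so the decrease is the full 0.5. A feature whose best split yields a larger decrease is ranked as more important.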

Table 5 (CART). Feature importance for BS 111, using the decision-tree software CART, applying the Gini criterion

Variable | Importance
Total number of Correct Answers |
Got Correct with # of Tries |
Average number of tries |
# of Success at the First Try |
Got Correct with 3-9 Tries |
# of Success at the Second Try |
Time Spent to Solve the Problems |
Total Time Spent on the Problems |
# of communication | 2.21

Comparing the results in Table 5 (GA weighting) and Table 5 (CART, Gini index criterion) shows very similar output, which demonstrates the merits of the proposed method for detecting feature importance.

7. CONCLUSIONS

We proposed a new approach to classifying student usage of web-based instruction. Four classifiers are used in grouping the students. A combination of multiple classifiers leads to a significant accuracy improvement in the given data sets. Weighting the features and using a genetic algorithm to minimize the error rate improves the prediction accuracy by at least 10% in all three test cases. In cases where the number of features is low, feature weighting worked much better than feature selection. The successful optimization of student classification in all three cases demonstrates the merits of using the LON-CAPA data to predict the students' final grades based on their features, which are extracted from the homework data.

This approach is easily adaptable to different types of courses and different population sizes, and allows for different features to be analyzed. This work represents a rigorous application of known classifiers as a means of analyzing and comparing use and performance of students who have taken a technical course that was partially/completely administered via the web.

For future work, we plan to implement such an optimized assessment tool for every student on any particular problem. We can then track students' behaviors on a particular problem over several semesters in order to achieve more reliable prediction.

8. ACKNOWLEDGMENT

This work was partially supported by the National Science Foundation under ITR

REFERENCES

1. M.L. Raymer, W.F. Punch, E.D. Goodman, L.A. Kuhn, and A.K. Jain, Dimensionality Reduction Using Genetic Algorithms. IEEE Transactions on Evolutionary Computation, Vol. 4 (2000)
2. A.K. Jain and D. Zongker, Feature Selection: Evaluation, Application, and Small Sample Performance. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 19, No. 2, February (1997)
3. K.A. De Jong, W.M. Spears, and D.F. Gordon, Using Genetic Algorithms for Concept Learning. Machine Learning 13 (1993)
4. S. Bandyopadhyay and C.A. Murthy, Pattern Classification Using Genetic Algorithms. Pattern Recognition Letters, Vol. 16 (1995)
5. J. Bala, K. De Jong, J. Huang, H. Vafaie, and H. Wechsler, Using Learning to Facilitate the Evolution of Features for Recognizing Visual Concepts. Evolutionary Computation 4(3), Special Issue on Evolution, Learning, and Instinct: 100 Years of the Baldwin Effect (1997)
6. C. Guerra-Salcedo and D. Whitley, Feature Selection Mechanisms for Ensemble Creation: A Genetic Search Perspective. In: Freitas AA (Ed.) Data Mining with Evolutionary Algorithms: Research Directions, Papers from the AAAI Workshop, Technical Report WS, AAAI Press (1999)
7. H. Vafaie and K. De Jong, Robust Feature Selection Algorithms. Proceedings of the IEEE International Conference on Tools with AI, Boston, Mass., USA, Nov. (1993)
8. M.J. Martin-Bautista and M.A. Vila, A Survey of Genetic Feature Selection in Mining Issues. Proceedings of the Congress on Evolutionary Computation (CEC-99), Washington D.C., July (1999)
9. M. Pei, E.D. Goodman, and W.F. Punch, Pattern Discovery from Data Using Genetic Algorithms. Proceedings of the 1st Pacific-Asia Conference on Knowledge Discovery & Data Mining (PAKDD-97) (1997)
10. W.F. Punch, M. Pei, L. Chia-Shun, E.D. Goodman, P. Hovland, and R. Enbody, Further Research on Feature Selection and Classification Using Genetic Algorithms. In 5th International Conference on Genetic Algorithms, Champaign IL (1993)
11. L.I. Kuncheva and L.C. Jain, Designing Classifier Fusion Systems by Genetic Algorithms. IEEE Transactions on Evolutionary Computation, Vol. 33 (2000)
12. D.B. Skalak, Using a Genetic Algorithm to Learn Prototypes for Case Retrieval and Classification. Proc. of the AAAI-93 Case-Based Reasoning Workshop, Washington, D.C., American Association for Artificial Intelligence, Menlo Park, CA (1994)
13. G.H. John, R. Kohavi, and K. Pfleger, Irrelevant Features and the Subset Selection Problem. Proceedings of the Eleventh International Conference on Machine Learning, Morgan Kaufmann Publishers, San Francisco, CA (1994)
14. D.E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning. MA, Addison-Wesley (1989)
15. Muhlenbein and D. Schlierkamp-Voosen, Predictive Models for the Breeder Genetic Algorithm, Continuous Parameter Optimization. Evolutionary Computation, Vol. 1, No. 1 (1993)
16. R.O. Duda, P.E. Hart, and D.G. Stork, Pattern Classification. 2nd Edition, John Wiley & Sons, Inc., New York NY


Calculation of Sampling Weights Perre Foy Statstcs Canada 4 Calculaton of Samplng Weghts 4.1 OVERVIEW The basc sample desgn used n TIMSS Populatons 1 and 2 was a two-stage stratfed cluster desgn. 1 The frst stage conssted of a sample

More information

A hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm

A hybrid global optimization algorithm based on parallel chaos optimization and outlook algorithm Avalable onlne www.ocpr.com Journal of Chemcal and Pharmaceutcal Research, 2014, 6(7):1884-1889 Research Artcle ISSN : 0975-7384 CODEN(USA) : JCPRC5 A hybrd global optmzaton algorthm based on parallel

More information

Lecture 2: Single Layer Perceptrons Kevin Swingler

Lecture 2: Single Layer Perceptrons Kevin Swingler Lecture 2: Sngle Layer Perceptrons Kevn Sngler kms@cs.str.ac.uk Recap: McCulloch-Ptts Neuron Ths vastly smplfed model of real neurons s also knon as a Threshold Logc Unt: W 2 A Y 3 n W n. A set of synapses

More information

Gender Classification for Real-Time Audience Analysis System

Gender Classification for Real-Time Audience Analysis System Gender Classfcaton for Real-Tme Audence Analyss System Vladmr Khryashchev, Lev Shmaglt, Andrey Shemyakov, Anton Lebedev Yaroslavl State Unversty Yaroslavl, Russa vhr@yandex.ru, shmaglt_lev@yahoo.com, andrey.shemakov@gmal.com,

More information

Can Auto Liability Insurance Purchases Signal Risk Attitude?

Can Auto Liability Insurance Purchases Signal Risk Attitude? Internatonal Journal of Busness and Economcs, 2011, Vol. 10, No. 2, 159-164 Can Auto Lablty Insurance Purchases Sgnal Rsk Atttude? Chu-Shu L Department of Internatonal Busness, Asa Unversty, Tawan Sheng-Chang

More information

Communication Networks II Contents

Communication Networks II Contents 8 / 1 -- Communcaton Networs II (Görg) -- www.comnets.un-bremen.de Communcaton Networs II Contents 1 Fundamentals of probablty theory 2 Traffc n communcaton networs 3 Stochastc & Marovan Processes (SP

More information

Vision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION

Vision Mouse. Saurabh Sarkar a* University of Cincinnati, Cincinnati, USA ABSTRACT 1. INTRODUCTION Vson Mouse Saurabh Sarkar a* a Unversty of Cncnnat, Cncnnat, USA ABSTRACT The report dscusses a vson based approach towards trackng of eyes and fngers. The report descrbes the process of locatng the possble

More information

SCHEDULING OF CONSTRUCTION PROJECTS BY MEANS OF EVOLUTIONARY ALGORITHMS

SCHEDULING OF CONSTRUCTION PROJECTS BY MEANS OF EVOLUTIONARY ALGORITHMS SCHEDULING OF CONSTRUCTION PROJECTS BY MEANS OF EVOLUTIONARY ALGORITHMS Magdalena Rogalska 1, Wocech Bożeko 2,Zdzsław Heduck 3, 1 Lubln Unversty of Technology, 2- Lubln, Nadbystrzycka 4., Poland. E-mal:rogalska@akropols.pol.lubln.pl

More information

Performance Management and Evaluation Research to University Students

Performance Management and Evaluation Research to University Students 631 A publcaton of CHEMICAL ENGINEERING TRANSACTIONS VOL. 46, 2015 Guest Edtors: Peyu Ren, Yancang L, Hupng Song Copyrght 2015, AIDIC Servz S.r.l., ISBN 978-88-95608-37-2; ISSN 2283-9216 The Italan Assocaton

More information

Linear Regression, Regularization Bias-Variance Tradeoff

Linear Regression, Regularization Bias-Variance Tradeoff HTF: Ch3, 7 B: Ch3 Lnear Regresson, Regularzaton Bas-Varance Tradeoff Thanks to C Guestrn, T Detterch, R Parr, N Ray 1 Outlne Lnear Regresson MLE = Least Squares! Bass functons Evaluatng Predctors Tranng

More information

9.1 The Cumulative Sum Control Chart

9.1 The Cumulative Sum Control Chart Learnng Objectves 9.1 The Cumulatve Sum Control Chart 9.1.1 Basc Prncples: Cusum Control Chart for Montorng the Process Mean If s the target for the process mean, then the cumulatve sum control chart s

More information

Quality Adjustment of Second-hand Motor Vehicle Application of Hedonic Approach in Hong Kong s Consumer Price Index

Quality Adjustment of Second-hand Motor Vehicle Application of Hedonic Approach in Hong Kong s Consumer Price Index Qualty Adustment of Second-hand Motor Vehcle Applcaton of Hedonc Approach n Hong Kong s Consumer Prce Index Prepared for the 14 th Meetng of the Ottawa Group on Prce Indces 20 22 May 2015, Tokyo, Japan

More information

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 2 LOSSLESS IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module LOSSLESS IMAGE COMPRESSION SYSTEMS Lesson 3 Lossless Compresson: Huffman Codng Instructonal Objectves At the end of ths lesson, the students should be able to:. Defne and measure source entropy..

More information

Data Broadcast on a Multi-System Heterogeneous Overlayed Wireless Network *

Data Broadcast on a Multi-System Heterogeneous Overlayed Wireless Network * JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 24, 819-840 (2008) Data Broadcast on a Mult-System Heterogeneous Overlayed Wreless Network * Department of Computer Scence Natonal Chao Tung Unversty Hsnchu,

More information

Identifying Workloads in Mixed Applications

Identifying Workloads in Mixed Applications , pp.395-400 http://dx.do.org/0.4257/astl.203.29.8 Identfyng Workloads n Mxed Applcatons Jeong Seok Oh, Hyo Jung Bang, Yong Do Cho, Insttute of Gas Safety R&D, Korea Gas Safety Corporaton, Shghung-Sh,

More information

Questions that we may have about the variables

Questions that we may have about the variables Antono Olmos, 01 Multple Regresson Problem: we want to determne the effect of Desre for control, Famly support, Number of frends, and Score on the BDI test on Perceved Support of Latno women. Dependent

More information

An Analysis of Factors Influencing the Self-Rated Health of Elderly Chinese People

An Analysis of Factors Influencing the Self-Rated Health of Elderly Chinese People Open Journal of Socal Scences, 205, 3, 5-20 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/ss http://dx.do.org/0.4236/ss.205.35003 An Analyss of Factors Influencng the Self-Rated Health of

More information

A New Task Scheduling Algorithm Based on Improved Genetic Algorithm

A New Task Scheduling Algorithm Based on Improved Genetic Algorithm A New Task Schedulng Algorthm Based on Improved Genetc Algorthm n Cloud Computng Envronment Congcong Xong, Long Feng, Lxan Chen A New Task Schedulng Algorthm Based on Improved Genetc Algorthm n Cloud Computng

More information

CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol

CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK. Sample Stability Protocol CHOLESTEROL REFERENCE METHOD LABORATORY NETWORK Sample Stablty Protocol Background The Cholesterol Reference Method Laboratory Network (CRMLN) developed certfcaton protocols for total cholesterol, HDL

More information

Fault tolerance in cloud technologies presented as a service

Fault tolerance in cloud technologies presented as a service Internatonal Scentfc Conference Computer Scence 2015 Pavel Dzhunev, PhD student Fault tolerance n cloud technologes presented as a servce INTRODUCTION Improvements n technques for vrtualzaton and performance

More information

Document Clustering Analysis Based on Hybrid PSO+K-means Algorithm

Document Clustering Analysis Based on Hybrid PSO+K-means Algorithm Document Clusterng Analyss Based on Hybrd PSO+K-means Algorthm Xaohu Cu, Thomas E. Potok Appled Software Engneerng Research Group, Computatonal Scences and Engneerng Dvson, Oak Rdge Natonal Laboratory,

More information

Optimal Choice of Random Variables in D-ITG Traffic Generating Tool using Evolutionary Algorithms

Optimal Choice of Random Variables in D-ITG Traffic Generating Tool using Evolutionary Algorithms Optmal Choce of Random Varables n D-ITG Traffc Generatng Tool usng Evolutonary Algorthms M. R. Mosav* (C.A.), F. Farab* and S. Karam* Abstract: Impressve development of computer networks has been requred

More information

Implementation of Deutsch's Algorithm Using Mathcad

Implementation of Deutsch's Algorithm Using Mathcad Implementaton of Deutsch's Algorthm Usng Mathcad Frank Roux The followng s a Mathcad mplementaton of Davd Deutsch's quantum computer prototype as presented on pages - n "Machnes, Logc and Quantum Physcs"

More information

Ants Can Schedule Software Projects

Ants Can Schedule Software Projects Ants Can Schedule Software Proects Broderck Crawford 1,2, Rcardo Soto 1,3, Frankln Johnson 4, and Erc Monfroy 5 1 Pontfca Unversdad Católca de Valparaíso, Chle FrstName.Name@ucv.cl 2 Unversdad Fns Terrae,

More information

Bayesian Network Based Causal Relationship Identification and Funding Success Prediction in P2P Lending

Bayesian Network Based Causal Relationship Identification and Funding Success Prediction in P2P Lending Proceedngs of 2012 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 25 (2012) (2012) IACSIT Press, Sngapore Bayesan Network Based Causal Relatonshp Identfcaton and Fundng Success

More information

Predicting Software Development Project Outcomes *

Predicting Software Development Project Outcomes * Predctng Software Development Project Outcomes * Rosna Weber, Mchael Waller, June Verner, Wllam Evanco College of Informaton Scence & Technology, Drexel Unversty 3141 Chestnut Street Phladelpha, PA 19104

More information

Financial Mathemetics

Financial Mathemetics Fnancal Mathemetcs 15 Mathematcs Grade 12 Teacher Gude Fnancal Maths Seres Overvew In ths seres we am to show how Mathematcs can be used to support personal fnancal decsons. In ths seres we jon Tebogo,

More information

THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION

THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION Internatonal Journal of Electronc Busness Management, Vol. 3, No. 4, pp. 30-30 (2005) 30 THE APPLICATION OF DATA MINING TECHNIQUES AND MULTIPLE CLASSIFIERS TO MARKETING DECISION Yu-Mn Chang *, Yu-Cheh

More information

Exhaustive Regression. An Exploration of Regression-Based Data Mining Techniques Using Super Computation

Exhaustive Regression. An Exploration of Regression-Based Data Mining Techniques Using Super Computation Exhaustve Regresson An Exploraton of Regresson-Based Data Mnng Technques Usng Super Computaton Antony Daves, Ph.D. Assocate Professor of Economcs Duquesne Unversty Pttsburgh, PA 58 Research Fellow The

More information

Realistic Image Synthesis

Realistic Image Synthesis Realstc Image Synthess - Combned Samplng and Path Tracng - Phlpp Slusallek Karol Myszkowsk Vncent Pegoraro Overvew: Today Combned Samplng (Multple Importance Samplng) Renderng and Measurng Equaton Random

More information

Planning for Marketing Campaigns

Planning for Marketing Campaigns Plannng for Marketng Campagns Qang Yang and Hong Cheng Department of Computer Scence Hong Kong Unversty of Scence and Technology Clearwater Bay, Kowloon, Hong Kong, Chna (qyang, csch)@cs.ust.hk Abstract

More information

The covariance is the two variable analog to the variance. The formula for the covariance between two variables is

The covariance is the two variable analog to the variance. The formula for the covariance between two variables is Regresson Lectures So far we have talked only about statstcs that descrbe one varable. What we are gong to be dscussng for much of the remander of the course s relatonshps between two or more varables.

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Max Wellng Department of Computer Scence Unversty of Toronto 10 Kng s College Road Toronto, M5S 3G5 Canada wellng@cs.toronto.edu Abstract Ths s a note to explan support vector machnes.

More information

AN APPOINTMENT ORDER OUTPATIENT SCHEDULING SYSTEM THAT IMPROVES OUTPATIENT EXPERIENCE

AN APPOINTMENT ORDER OUTPATIENT SCHEDULING SYSTEM THAT IMPROVES OUTPATIENT EXPERIENCE AN APPOINTMENT ORDER OUTPATIENT SCHEDULING SYSTEM THAT IMPROVES OUTPATIENT EXPERIENCE Yu-L Huang Industral Engneerng Department New Mexco State Unversty Las Cruces, New Mexco 88003, U.S.A. Abstract Patent

More information

Project Networks With Mixed-Time Constraints

Project Networks With Mixed-Time Constraints Project Networs Wth Mxed-Tme Constrants L Caccetta and B Wattananon Western Australan Centre of Excellence n Industral Optmsaton (WACEIO) Curtn Unversty of Technology GPO Box U1987 Perth Western Australa

More information

A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION. Michael E. Kuhl Radhamés A. Tolentino-Peña

A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION. Michael E. Kuhl Radhamés A. Tolentino-Peña Proceedngs of the 2008 Wnter Smulaton Conference S. J. Mason, R. R. Hll, L. Mönch, O. Rose, T. Jefferson, J. W. Fowler eds. A DYNAMIC CRASHING METHOD FOR PROJECT MANAGEMENT USING SIMULATION-BASED OPTIMIZATION

More information

An Alternative Way to Measure Private Equity Performance

An Alternative Way to Measure Private Equity Performance An Alternatve Way to Measure Prvate Equty Performance Peter Todd Parlux Investment Technology LLC Summary Internal Rate of Return (IRR) s probably the most common way to measure the performance of prvate

More information

Luby s Alg. for Maximal Independent Sets using Pairwise Independence

Luby s Alg. for Maximal Independent Sets using Pairwise Independence Lecture Notes for Randomzed Algorthms Luby s Alg. for Maxmal Independent Sets usng Parwse Independence Last Updated by Erc Vgoda on February, 006 8. Maxmal Independent Sets For a graph G = (V, E), an ndependent

More information

Heuristic Static Load-Balancing Algorithm Applied to CESM

Heuristic Static Load-Balancing Algorithm Applied to CESM Heurstc Statc Load-Balancng Algorthm Appled to CESM 1 Yur Alexeev, 1 Sher Mckelson, 1 Sven Leyffer, 1 Robert Jacob, 2 Anthony Crag 1 Argonne Natonal Laboratory, 9700 S. Cass Avenue, Argonne, IL 60439,

More information

SIX WAYS TO SOLVE A SIMPLE PROBLEM: FITTING A STRAIGHT LINE TO MEASUREMENT DATA

SIX WAYS TO SOLVE A SIMPLE PROBLEM: FITTING A STRAIGHT LINE TO MEASUREMENT DATA SIX WAYS TO SOLVE A SIMPLE PROBLEM: FITTING A STRAIGHT LINE TO MEASUREMENT DATA E. LAGENDIJK Department of Appled Physcs, Delft Unversty of Technology Lorentzweg 1, 68 CJ, The Netherlands E-mal: e.lagendjk@tnw.tudelft.nl

More information

The OC Curve of Attribute Acceptance Plans

The OC Curve of Attribute Acceptance Plans The OC Curve of Attrbute Acceptance Plans The Operatng Characterstc (OC) curve descrbes the probablty of acceptng a lot as a functon of the lot s qualty. Fgure 1 shows a typcal OC Curve. 10 8 6 4 1 3 4

More information

Multivariate EWMA Control Chart

Multivariate EWMA Control Chart Multvarate EWMA Control Chart Summary The Multvarate EWMA Control Chart procedure creates control charts for two or more numerc varables. Examnng the varables n a multvarate sense s extremely mportant

More information

Research Article Integrated Model of Multiple Kernel Learning and Differential Evolution for EUR/USD Trading

Research Article Integrated Model of Multiple Kernel Learning and Differential Evolution for EUR/USD Trading Hndaw Publshng Corporaton e Scentfc World Journal, Artcle ID 914641, 12 pages http://dx.do.org/10.1155/2014/914641 Research Artcle Integrated Model of Multple Kernel Learnng and Dfferental Evoluton for

More information

The Analysis of Outliers in Statistical Data

The Analysis of Outliers in Statistical Data THALES Project No. xxxx The Analyss of Outlers n Statstcal Data Research Team Chrysses Caron, Assocate Professor (P.I.) Vaslk Karot, Doctoral canddate Polychrons Economou, Chrstna Perrakou, Postgraduate

More information

Genetic Algorithms applied to Clustering Problem and Data Mining

Genetic Algorithms applied to Clustering Problem and Data Mining Proceedngs of the 7th WSEAS Internatonal Conference on Smulaton, Modellng and Optmzaton, Beng, Chna, September 5-7, 007 9 Genetc Algorthms appled to Clusterng Problem and Data Mnng JF JIMENEZ a, FJ CUEVAS

More information

ERP Software Selection Using The Rough Set And TPOSIS Methods

ERP Software Selection Using The Rough Set And TPOSIS Methods ERP Software Selecton Usng The Rough Set And TPOSIS Methods Under Fuzzy Envronment Informaton Management Department, Hunan Unversty of Fnance and Economcs, No. 139, Fengln 2nd Road, Changsha, 410205, Chna

More information

Learning with Imperfections A Multi-Agent Neural-Genetic Trading System. with Differing Levels of Social Learning

Learning with Imperfections A Multi-Agent Neural-Genetic Trading System. with Differing Levels of Social Learning Proceedngs of the 4 IEEE Conference on Cybernetcs and Intellgent Systems Sngapore, 1-3 December, 4 Learnng wth Imperfectons A Mult-Agent Neural-Genetc Tradng System wth Dfferng Levels of Socal Learnng

More information

LITERATURE REVIEW: VARIOUS PRIORITY BASED TASK SCHEDULING ALGORITHMS IN CLOUD COMPUTING

LITERATURE REVIEW: VARIOUS PRIORITY BASED TASK SCHEDULING ALGORITHMS IN CLOUD COMPUTING LITERATURE REVIEW: VARIOUS PRIORITY BASED TASK SCHEDULING ALGORITHMS IN CLOUD COMPUTING 1 MS. POOJA.P.VASANI, 2 MR. NISHANT.S. SANGHANI 1 M.Tech. [Software Systems] Student, Patel College of Scence and

More information

Detecting Credit Card Fraud using Periodic Features

Detecting Credit Card Fraud using Periodic Features Detectng Credt Card Fraud usng Perodc Features Alejandro Correa Bahnsen, Djamla Aouada, Aleksandar Stojanovc and Björn Ottersten Interdscplnary Centre for Securty, Relablty and Trust Unversty of Luxembourg,

More information

Genetic algorithm for searching for critical slip surface in gravity dams based on stress fields CHEN Jianyun 1, WANG Shu 2, XU Qiang 3, LI Jing 4

Genetic algorithm for searching for critical slip surface in gravity dams based on stress fields CHEN Jianyun 1, WANG Shu 2, XU Qiang 3, LI Jing 4 Advanced Materals Research Onlne: 203-09-04 ISSN: 662-8985, Vol. 790, pp 46-49 do:0.4028/www.scentfc.net/amr.790.46 203 Trans Tech Publcatons, Swtzerland Genetc algorthm for searchng for crtcal slp surface

More information

Sciences Shenyang, Shenyang, China.

Sciences Shenyang, Shenyang, China. Advanced Materals Research Vols. 314-316 (2011) pp 1315-1320 (2011) Trans Tech Publcatons, Swtzerland do:10.4028/www.scentfc.net/amr.314-316.1315 Solvng the Two-Obectve Shop Schedulng Problem n MTO Manufacturng

More information

An Inductive Fuzzy Classification Approach applied to Individual Marketing

An Inductive Fuzzy Classification Approach applied to Individual Marketing An Inductve Fuzzy Classfcaton Approach appled to Indvdual Marketng Mchael Kaufmann, Andreas Meer Abstract A data mnng methodology for an nductve fuzzy classfcaton s ntroduced. The nducton step s based

More information

Multiple-Period Attribution: Residuals and Compounding

Multiple-Period Attribution: Residuals and Compounding Multple-Perod Attrbuton: Resduals and Compoundng Our revewer gave these authors full marks for dealng wth an ssue that performance measurers and vendors often regard as propretary nformaton. In 1994, Dens

More information

The Journal of Systems and Software

The Journal of Systems and Software The Journal of Systems and Software 82 (2009) 241 252 Contents lsts avalable at ScenceDrect The Journal of Systems and Software journal homepage: www. elsever. com/ locate/ jss A study of project selecton

More information

Performance Analysis and Coding Strategy of ECOC SVMs

Performance Analysis and Coding Strategy of ECOC SVMs Internatonal Journal of Grd and Dstrbuted Computng Vol.7, No. (04), pp.67-76 http://dx.do.org/0.457/jgdc.04.7..07 Performance Analyss and Codng Strategy of ECOC SVMs Zhgang Yan, and Yuanxuan Yang, School

More information

Time Series Analysis in Studies of AGN Variability. Bradley M. Peterson The Ohio State University

Time Series Analysis in Studies of AGN Variability. Bradley M. Peterson The Ohio State University Tme Seres Analyss n Studes of AGN Varablty Bradley M. Peterson The Oho State Unversty 1 Lnear Correlaton Degree to whch two parameters are lnearly correlated can be expressed n terms of the lnear correlaton

More information

IMPACT ANALYSIS OF A CELLULAR PHONE

IMPACT ANALYSIS OF A CELLULAR PHONE 4 th ASA & μeta Internatonal Conference IMPACT AALYSIS OF A CELLULAR PHOE We Lu, 2 Hongy L Bejng FEAonlne Engneerng Co.,Ltd. Bejng, Chna ABSTRACT Drop test smulaton plays an mportant role n nvestgatng

More information

A novel Method for Data Mining and Classification based on

A novel Method for Data Mining and Classification based on A novel Method for Data Mnng and Classfcaton based on Ensemble Learnng 1 1, Frst Author Nejang Normal Unversty;Schuan Nejang 641112,Chna, E-mal: lhan-gege@126.com Abstract Data mnng has been attached great

More information

Design and Development of a Security Evaluation Platform Based on International Standards

Design and Development of a Security Evaluation Platform Based on International Standards Internatonal Journal of Informatcs Socety, VOL.5, NO.2 (203) 7-80 7 Desgn and Development of a Securty Evaluaton Platform Based on Internatonal Standards Yuj Takahash and Yoshm Teshgawara Graduate School

More information

Methodology to Determine Relationships between Performance Factors in Hadoop Cloud Computing Applications

Methodology to Determine Relationships between Performance Factors in Hadoop Cloud Computing Applications Methodology to Determne Relatonshps between Performance Factors n Hadoop Cloud Computng Applcatons Lus Eduardo Bautsta Vllalpando 1,2, Alan Aprl 1 and Alan Abran 1 1 Department of Software Engneerng and

More information

I. SCOPE, APPLICABILITY AND PARAMETERS Scope

I. SCOPE, APPLICABILITY AND PARAMETERS Scope D Executve Board Annex 9 Page A/R ethodologcal Tool alculaton of the number of sample plots for measurements wthn A/R D project actvtes (Verson 0) I. SOPE, PIABIITY AD PARAETERS Scope. Ths tool s applcable

More information

1 Approximation Algorithms

1 Approximation Algorithms CME 305: Dscrete Mathematcs and Algorthms 1 Approxmaton Algorthms In lght of the apparent ntractablty of the problems we beleve not to le n P, t makes sense to pursue deas other than complete solutons

More information

A study on the ability of Support Vector Regression and Neural Networks to Forecast Basic Time Series Patterns

A study on the ability of Support Vector Regression and Neural Networks to Forecast Basic Time Series Patterns A study on the ablty of Support Vector Regresson and Neural Networks to Forecast Basc Tme Seres Patterns Sven F. Crone, Jose Guajardo 2, and Rchard Weber 2 Lancaster Unversty, Department of Management

More information

Grid Resource Selection Optimization with Guarantee Quality of Service by GAPSO

Grid Resource Selection Optimization with Guarantee Quality of Service by GAPSO Australan Journal of Basc and Appled Scences, 5(11): 2139-2145, 2011 ISSN 1991-8178 Grd Resource Selecton Optmzaton wth Guarantee Qualty of Servce by GAPSO 1 Hossen Shrgah and 2 Nameh Danesh 1 Islamc Azad

More information

On Mean Squared Error of Hierarchical Estimator

On Mean Squared Error of Hierarchical Estimator S C H E D A E I N F O R M A T I C A E VOLUME 0 0 On Mean Squared Error of Herarchcal Estmator Stans law Brodowsk Faculty of Physcs, Astronomy, and Appled Computer Scence, Jagellonan Unversty, Reymonta

More information

Implementation and Evaluation of a Random Forest Machine Learning Algorithm

Implementation and Evaluation of a Random Forest Machine Learning Algorithm Implementaton and Evaluaton of a Random Forest Machne Learnng Algorthm Vachaslau Sazonau Unversty of Manchester, Oxford Road, Manchester, M13 9PL,UK sazonauv@cs.manchester.ac.uk Abstract hs work s amed

More information

A Dynamic Load Balancing for Massive Multiplayer Online Game Server

A Dynamic Load Balancing for Massive Multiplayer Online Game Server A Dynamc Load Balancng for Massve Multplayer Onlne Game Server Jungyoul Lm, Jaeyong Chung, Jnryong Km and Kwanghyun Shm Dgtal Content Research Dvson Electroncs and Telecommuncatons Research Insttute Daejeon,

More information

NPAR TESTS. One-Sample Chi-Square Test. Cell Specification. Observed Frequencies 1O i 6. Expected Frequencies 1EXP i 6

NPAR TESTS. One-Sample Chi-Square Test. Cell Specification. Observed Frequencies 1O i 6. Expected Frequencies 1EXP i 6 PAR TESTS If a WEIGHT varable s specfed, t s used to replcate a case as many tmes as ndcated by the weght value rounded to the nearest nteger. If the workspace requrements are exceeded and samplng has

More information

Mooring Pattern Optimization using Genetic Algorithms

Mooring Pattern Optimization using Genetic Algorithms 6th World Congresses of Structural and Multdscplnary Optmzaton Ro de Janero, 30 May - 03 June 005, Brazl Moorng Pattern Optmzaton usng Genetc Algorthms Alonso J. Juvnao Carbono, Ivan F. M. Menezes Luz

More information

METHODOLOGY TO DETERMINE RELATIONSHIPS BETWEEN PERFORMANCE FACTORS IN HADOOP CLOUD COMPUTING APPLICATIONS

METHODOLOGY TO DETERMINE RELATIONSHIPS BETWEEN PERFORMANCE FACTORS IN HADOOP CLOUD COMPUTING APPLICATIONS METHODOLOGY TO DETERMINE RELATIONSHIPS BETWEEN PERFORMANCE FACTORS IN HADOOP CLOUD COMPUTING APPLICATIONS Lus Eduardo Bautsta Vllalpando 1,2, Alan Aprl 1 and Alan Abran 1 1 Department of Software Engneerng

More information

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic

Institute of Informatics, Faculty of Business and Management, Brno University of Technology,Czech Republic Lagrange Multplers as Quanttatve Indcators n Economcs Ivan Mezník Insttute of Informatcs, Faculty of Busness and Management, Brno Unversty of TechnologCzech Republc Abstract The quanttatve role of Lagrange

More information

Mining Multiple Large Data Sources

Mining Multiple Large Data Sources The Internatonal Arab Journal of Informaton Technology, Vol. 7, No. 3, July 2 24 Mnng Multple Large Data Sources Anmesh Adhkar, Pralhad Ramachandrarao 2, Bhanu Prasad 3, and Jhml Adhkar 4 Department of

More information

Active Learning for Interactive Visualization

Active Learning for Interactive Visualization Actve Learnng for Interactve Vsualzaton Tomoharu Iwata Nel Houlsby Zoubn Ghahraman Unversty of Cambrdge Unversty of Cambrdge Unversty of Cambrdge Abstract Many automatc vsualzaton methods have been. However,

More information

A Novel Methodology of Working Capital Management for Large. Public Constructions by Using Fuzzy S-curve Regression

A Novel Methodology of Working Capital Management for Large. Public Constructions by Using Fuzzy S-curve Regression Novel Methodology of Workng Captal Management for Large Publc Constructons by Usng Fuzzy S-curve Regresson Cheng-Wu Chen, Morrs H. L. Wang and Tng-Ya Hseh Department of Cvl Engneerng, Natonal Central Unversty,

More information

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006

Latent Class Regression. Statistics for Psychosocial Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson Statstcs for Psychosocal Research II: Structural Models December 4 and 6, 2006 Latent Class Regresson (LCR) What s t and when do we use t? Recall the standard latent class model

More information

Robust Design of Public Storage Warehouses. Yeming (Yale) Gong EMLYON Business School

Robust Design of Public Storage Warehouses. Yeming (Yale) Gong EMLYON Business School Robust Desgn of Publc Storage Warehouses Yemng (Yale) Gong EMLYON Busness School Rene de Koster Rotterdam school of management, Erasmus Unversty Abstract We apply robust optmzaton and revenue management

More information