CARD FRAUD DETECTION USING LEARNING MACHINES


BULETINUL INSTITUTULUI POLITEHNIC DIN IAŞI
Publicat de Universitatea Tehnică „Gheorghe Asachi" din Iaşi
Tomul LX (LXIV), Fasc. 2, 2014
Secţia AUTOMATICĂ şi CALCULATOARE

CARD FRAUD DETECTION USING LEARNING MACHINES

BY

ARMAND EUGEN PĂSĂRICĂ

Faculty of Cybernetics, Statistics and Economic Informatics, Bucharest

Received: September 2014
Accepted for publication: October 8, 2014

Abstract. Searching for "card fraud" on the Internet returns approximately 80 million results. According to the ECB, total card fraud in Europe reached 1.26 billion euro in 2010. The ingenuity of thieves has reached highly sophisticated forms. Modelling this behaviour mathematically requires a classification method derived from a supervised learning algorithm that can separate the fraudulent class with a high degree of accuracy. By its very definition, the Support Vector Machine technique is characterized by two strong hypotheses: margin optimization and kernel representation. I therefore chose the SVM technique with non-linear kernels. We propose the Gaussian kernel function, which measures the similarities between features in a new linear space, as the best approach to detecting fraud patterns.

Key words: card fraud behaviour; SVM; non-linear kernels; Cover's theorem; LIBSVM.

2010 Mathematics Subject Classification: 68T05, 62H30, 93E35.

1. Introduction

Payment cards are the most commonly used electronic payment instrument. A payment card allows its holder to access funds and to make payments either on credit (credit card) or by debiting the account (debit

Corresponding author; e-mail: armand.pasarica@yahoo.com

card). A total of 42 million cards were issued in Romania by the end of 2013, according to the statistics on cards and terminals published by the National Bank of Romania. With the development of information technology and globalization, up to 85 percent of retail transactions are made by card. The most widely used techniques for reading card information in order to forge or abuse it are skimming, where the magnetic stripe of a traditional card is replicated, and digital pickpocketing, when the card uses RFID (radio frequency identification). Unfortunately, current solutions do not satisfy very well the demand for a best approach to card fraud detection, because no single best approach exists. Some encouraging results come from Stolfo, Fan and Lee in "Credit Card Fraud Detection Using Meta-Learning: Issues and Initial Results" and from Soheila Ehramikar's work "The Enhancement of Credit Card Fraud Detection Systems Using Machine Learning".

2. Theoretical Foundation of a Learning Machine and Its Applications in the Economic and Industrial Area

The following set of definitions represents the foundation of a learning machine:

Definition 1: Tom Mitchell stated in 1988 the condition for a learning problem to be well defined: a software program has learned from experience E to perform job J with measured performance P if

∫_0^∞ P(J, E)(t) dt > 0

Definition 2: The job of a supervised learning machine is to maximize the quality of the output, either by maximizing P(Y|X) or by minimizing the empirical risk at the end of a finite number of iterative training epochs (Huang & Kecman).

Definition 3: Every algorithm has both strengths and weaknesses; no algorithm is good for all classification or regression problems, but for a given problem there is only one asymptotically convergent algorithm which maximizes the quality of the output Y given a set of input features X: P(Y|X).
Below is shown the general flow-chart of a general supervised machine learning system, which is a cybernetic system with a negative feedback loop, because its purpose is to reduce the magnitude of changes with the help of control variables such as learning rates, the regularization parameter, etc.

Lemma 1: as a consequence of the previous statements, a well-defined learning machine must produce a stable system, ready to be used for prediction in practice across a huge range of problems.

The field of intelligent machines, adaptable or designed to learn, is one of the most interesting and complex areas today. It is a branch of artificial intelligence that deals with the study and construction of systems that can learn from data. For example, whenever a

check is issued in favor of a person who has an account at a bank, algorithms in the software that reads the payment instrument identify the correct signature and payment amount without you even knowing it. When you use a credit card for online shopping, the bank's risk department runs fraud-prevention programs that know the consumer's behaviour and react when the card is stolen or suspicious. The core of machine learning is the study of the representation and generalization of data. Below, a machine learning problem is presented as a cybernetic system (Fig. 1):

Fig. 1 The logic behind machine learning.

Currently, the area of ML is one of the biggest challenges in modern history. A Google search on the Internet returns an impressive number of books, research studies, articles and forums on this study area, which is also

called machine learning or statistical learning (Vapnik, 1998) in many other papers. Machine learning is an area interconnected with many other disciplines, having an impact in many industries and sectors, including: financial institutions: card fraud detection, analysis and prediction of indexes, stock markets; marketing: customer segmentation, analysis of consumer preferences, clustering; voice and face detection, OCR, biotechnology and medicine; robotics and nanotechnology, self-driving systems; IT: software engineering, sequence pattern mining, information retrieval, adaptive websites.

3. Support Vector Machines and Their Advantages

In the machine learning area, the Support Vector Machine is a supervised learning model that analyzes data and recognizes patterns, mainly used for solving linear and non-linear classification problems. The main goal of an SVM model is to perform non-linear classification with high performance, and for this purpose it uses kernel functions. The theoretical basis of the SVM was founded by Vladimir N. Vapnik, and the idea of the soft margin optimizer was developed together with Corinna Cortes. Other implementations use the KKT conditions or Least Squares Support Vector Machines; Suykens and Vandewalle proposed the LS-SVM method. The SVM method is one of the most widely used machine learning techniques in academia as well as in industry. The best applications of SVM are in image processing and voice recognition, and benchmark tests have shown that the probability of correct classification exceeds 95%, making it superior to neural network algorithms such as back-propagation. The major inconveniences are: a) the very large set of training data required to achieve this accuracy; b) the correct identification of the kernel function. These issues lead to huge computational effort and, as a consequence, to very large IT hardware requirements.
It is important to explain why support vector classification appears to be more accurate than other existing classification methods. Strictly linear separation in the input space X is practically impossible to achieve with classical discriminative techniques (like Bayes or logistic regression). On the other hand, finding an appropriate non-linear classifier directly from the data is extremely difficult, and such a classifier will generalize poorly.

Support Vector Machines map the initial feature data into another space and then select an optimal linear classifier in that space. It is well demonstrated that posing the classification problem in terms of minimizing the empirical risk impacts the generalization capacity. The support vector technique puts the accent on finding the best geometrical boundary between the data clouds (via the support vectors) of each group; practically, the border will be placed as far as possible from the worst training examples of each group. Therefore, the risk has an element of empirical risk minimization and also an element of regularization (DasGupta, 2011):

R(h) = E[L(h(x), y)] = ∫ L(h(x), y) dP(x, y),    R_emp(h) = (1/m) Σ_{i=1}^{m} L(h(x^(i)), y^(i))

Choosing the most appropriate kernel strongly depends on the problem, and, most importantly, fine-tuning its parameters can become extremely difficult. No technique for automatically selecting the best kernel has been mathematically established yet. The motivation behind choosing a particular kernel may be very intuitive and depends directly on the kind of problem we have to learn. It is recommended to first apply a dimensionality-reduction technique (e.g. the PCA method) on the attributes and to normalize the data, in order to plot them where technically feasible. This will at least offer guidance on the kernel structure. If that is not possible, then accuracy, or an indicator like the F1 score, will have the most weight in choosing the right form of the kernel.

4. Research Methodology

SVM is based on a convex optimization problem and, as a consequence, will always find the global minimum if the training set is separable by either a linear or a non-linear hyper-plane. A necessary condition is that the kernel function be properly chosen.
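The empirical-risk term above can be computed directly. A minimal Python sketch follows; the threshold classifier, the 0-1 loss and the four (amount, label) pairs are hypothetical illustrations, not data from the paper:

```python
# Empirical risk: the average loss of a hypothesis h over m training examples.

def zero_one_loss(prediction, label):
    """0-1 loss: 1 for a misclassification, 0 for a correct prediction."""
    return 0 if prediction == label else 1

def empirical_risk(h, examples):
    """R_emp(h) = (1/m) * sum_i L(h(x_i), y_i)."""
    m = len(examples)
    return sum(zero_one_loss(h(x), y) for x, y in examples) / m

# Hypothetical rule: flag a transaction as fraudulent (1) if its amount > 500.
h = lambda amount: 1 if amount > 500 else 0
data = [(100, 0), (700, 1), (650, 0), (90, 0)]  # (amount, label) pairs
print(empirical_risk(h, data))  # one of four examples is misclassified -> 0.25
```

The regularization element is added on top of this data-dependent term, as shown in the SVM cost function later in the paper.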
By contrast, for neural networks the BP algorithm can in some situations face a non-convex optimization even when the training set is separable, and as a consequence it can get stuck in a local minimum. To motivate this robust generalization behaviour, I will present an interesting point of view on the SVM, starting from the logistic regression model. Firstly, we define the following variables in order to build a valid and robust ML model:

So, for a fixed observation k ∈ {1, 2, ..., m}, we have the training pair (x^(k), y^(k)) of vectors:

x^(k) = (x_1^(k), x_2^(k), ..., x_n^(k)),    y^(k) ∈ {0, 1}

So x is the input matrix with m observations and n features (e.g. amount, transaction status, etc.), and y is the outcome vector, which must be binary: the positive example 1 reflects a fraudulent transaction, and 0 means the transaction is fine. The vector of parameters θ must be learned such that my hypothesis h_θ, which is a continuous function of the variable x and the parameter θ, predicts/classifies whether a transaction is fraudulent or not. One of the constraints of the SVM is that the cost function J(θ) must be continuous, and because ψ(θ) is built on the basis of the hypothesis h_θ, I want to predict the output as a continuous variable, so I will apply the relaxation y ∈ {0, 1} → [0, 1].

Now let us define the hypothesis h_θ, the function that will classify as accurately as possible the outcome for a card holder: fraudulent or not. The hypothesis should be continuous and must output values between 0 and 1. A convenient choice is the logistic function, which is continuous and outputs values between 0 and 1. So I will construct the hypothesis h_θ, which simulates the fraudulent output, as h_θ : X → Y, where the purpose is to find a function such that h_θ(x_1, x_2, ..., x_n) ≈ y^(i) for each training pair.

Let us take g : R → R continuous, where:

g(z) = 1 / (1 + e^(−z)),    z = θ^T x = Σ_{j=0}^{n} θ_j x_j,  with x_0 = 1

h_θ(x) = g(θ^T x) = 1 / (1 + e^(−Σ_{j=0}^{n} θ_j x_j))

We notice that h_θ(x) converges rapidly to 0 and 1.

Fig. 2 The plot of the logistic function. Source: MATLAB: fplot(@(k) 1/(1+exp(-k)),[-10 10]).

if y = 1, we want h_θ(x) ≥ 1/2, i.e. θ^T x ≥ 0
if y = 0, we want h_θ(x) < 1/2, i.e. θ^T x < 0

The previous relations reflect how the positive and negative examples are geometrically separated by the same separation hyperplane. On the other side:

P(y = 1 | x; θ) = h_θ(x)
P(y = 0 | x; θ) = 1 − h_θ(x)

The probability that the output is classified as positive, conditioned on the feature vector x and parameterized by θ, is written as h_θ(x), and the probability that the output is classified as negative, conditioned on x and parameterized by θ, is 1 − h_θ(x).
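The logistic hypothesis can be sketched in a few lines of Python; the parameter values below are hypothetical, chosen only to show how the output approaches 0 and 1:

```python
import math

def h(theta, x):
    """Logistic hypothesis h_theta(x) = 1 / (1 + exp(-theta^T x)).
    theta and x are same-length lists; x[0] is the intercept term (= 1)."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

theta = [-1.0, 2.0]          # hypothetical parameters
print(h(theta, [1.0, 0.0]))  # z = -1 -> about 0.269
print(h(theta, [1.0, 5.0]))  # z = 9  -> very close to 1
```

Any output above 1/2 (equivalently, θ^T x ≥ 0) is classified as positive, matching the separation rule above.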

These two relations can be written as a single expression, according to a Bernoulli distribution:

P(y | x; θ) = h_θ(x)^y · (1 − h_θ(x))^(1−y)

Supposing that the m training examples were generated independently, then according to the method of maximum-likelihood estimation (MLE), the likelihood of the whole training set is maximized if and only if the following expression is maximized:

L(θ) = P(Y | X; θ) = Π_{i=1}^{m} P(y^(i) | x^(i); θ) = Π_{i=1}^{m} h_θ(x^(i))^(y^(i)) · (1 − h_θ(x^(i)))^(1−y^(i))

ℓ(θ) = log L(θ) = Σ_{i=1}^{m} [ y^(i) log h_θ(x^(i)) + (1 − y^(i)) log(1 − h_θ(x^(i))) ]

Maximizing ℓ(θ) over θ = (θ_1, θ_2, ..., θ_n) is equivalent to:

ψ(θ) = min_θ Σ_{i=1}^{m} −[ y^(i) log h_θ(x^(i)) + (1 − y^(i)) log(1 − h_θ(x^(i))) ]

The cost function for one training example (x, y) is written as:

ψ(θ) = −y log h_θ(x) − (1 − y) log(1 − h_θ(x))

or:

ψ(θ) = −y log(1 / (1 + e^(−θ^T x))) − (1 − y) log(1 − 1 / (1 + e^(−θ^T x)))

if y = 0:  ψ(θ) = log(1 + e^(θ^T x))
if y = 1:  ψ(θ) = log(1 + e^(−θ^T x))

So the cost function is continuous and convex over R^n and, as a consequence, admits a global minimum (Fig. 3).
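The two branches of the per-example cost can be checked numerically. A short Python sketch (z = θ^T x is taken as a hypothetical scalar here):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost_one_example(z, y):
    """psi = -y*log(h) - (1-y)*log(1-h), with h = sigmoid(z), z = theta^T x."""
    h = sigmoid(z)
    return -y * math.log(h) - (1 - y) * math.log(1 - h)

# The two branches derived above:
#   y = 1:  psi = log(1 + exp(-z))
#   y = 0:  psi = log(1 + exp(+z))
z = 2.0
print(abs(cost_one_example(z, 1) - math.log(1 + math.exp(-z))) < 1e-9)  # True
print(abs(cost_one_example(z, 0) - math.log(1 + math.exp(z))) < 1e-9)   # True
```

Both branches are convex in z, which is what guarantees the global minimum claimed above.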

Fig. 3 The plot of the cost function for one training example. Source: MATLAB output.

For the general case, where m ≥ 2 and m > n, the overall cost function is given by the following relation (Flemming, 2011). The regularization is practically a compromise between how perfectly I want to fit the data and how much importance I wish to allocate to each feature x_j through its parameter θ_j:

ψ(θ) = min_θ { −(1/m) Σ_{i=1}^{m} [ y^(i) log h_θ(x^(i)) + (1 − y^(i)) log(1 − h_θ(x^(i))) ] + (λ/(2m)) Σ_{j=1}^{n} θ_j^2 }

We notice that finding the parameters (θ_1, θ_2, ..., θ_n) which minimize J(θ) is not affected by the factor 1/m. With the following notations, the cost function ψ(θ) is simplified:

C = 1/λ,    c_1^(i) = −log h_θ(x^(i)),    c_2^(i) = −log(1 − h_θ(x^(i)))

ψ(θ) = min_θ C Σ_{i=1}^{m} [ y^(i) c_1^(i) + (1 − y^(i)) c_2^(i) ] + (1/2) Σ_{j=1}^{n} θ_j^2

But the idea of the SVM is to add an extra safety condition for a more robust decision boundary. So we require:

if y = 1:  θ^T x^(i) ≥ 1  instead of  θ^T x^(i) ≥ 0
if y = 0:  θ^T x^(i) ≤ −1  instead of  θ^T x^(i) < 0

Further, I will analyse the following two regimes of the regularization parameter: C very large vs. C small.

I) If the regularization parameter C is very large, then in order for ψ(θ) to be minimal the data term t must vanish:

t = Σ_{i=1}^{m} [ y^(i) c_1^(i) + (1 − y^(i)) c_2^(i) ] ≈ 0

So the optimization model can be written as:

ψ(θ) = min_θ (1/2) Σ_{j=1}^{n} θ_j^2 = min_θ (1/2) ‖θ‖^2

subject to:
θ^T x^(i) = p^(i) ‖θ‖ ≥ 1  if y^(i) = 1
θ^T x^(i) = p^(i) ‖θ‖ ≤ −1  if y^(i) = 0

where p^(i) is the projection of x^(i) onto the vector θ. Here one can see the logic behind the SVM, i.e. why the Support Vector Machine is a large-margin classifier. So we proved that the SVM model is a convex optimization problem, and according to the KKT theorem it admits a global minimum. Below is presented an example of linear classification with a very large regularization parameter C: I want to give importance even to a bad example, so I chose a very high value for C. As a consequence, the decision boundary fits the training set perfectly, but this model is not so robust in prediction.

Fig. 4 SVM linear classification model with high variance (overfitting). Source: MATLAB output.
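The simplified cost ψ(θ) = C·Σ[y c₁ + (1−y) c₂] + ½‖θ‖² can be sketched numerically. The sketch below uses hinge-style costs for c₁ and c₂ (the standard SVM surrogate, which is zero exactly when the safety margins above are met) rather than the logistic costs, and a hypothetical one-dimensional data set:

```python
def svm_cost(theta, examples, C):
    """psi(theta) = C * sum_i [y*c1 + (1-y)*c2] + 0.5 * sum_j theta_j^2,
    with hinge-style costs: zero only when theta^T x >= 1 for y = 1
    and theta^T x <= -1 for y = 0."""
    def dot(a, b):
        return sum(ai * bi for ai, bi in zip(a, b))
    data_term = 0.0
    for x, y in examples:
        z = dot(theta, x)
        c1 = max(0.0, 1 - z)   # cost of a positive example (wants z >= 1)
        c2 = max(0.0, 1 + z)   # cost of a negative example (wants z <= -1)
        data_term += y * c1 + (1 - y) * c2
    reg_term = 0.5 * sum(t * t for t in theta)
    return C * data_term + reg_term

# Hypothetical 1-D data (x[0] = 1 is the intercept): positives at x >= 3,
# negatives at x <= 1.
data = [([1.0, 3.0], 1), ([1.0, 4.0], 1), ([1.0, 1.0], 0), ([1.0, 0.0], 0)]
theta = [-2.0, 1.0]  # theta^T x = x - 2: all four margins are satisfied
print(svm_cost(theta, data, C=1000.0))  # data term is 0; only 0.5*(4+1) = 2.5 remains
```

With a very large C, any margin violation dominates the cost, so the minimizer is forced to satisfy the constraints and only ½‖θ‖² is left to minimize, exactly the large-margin objective above.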

II) If the regularization parameter C is small, then the decision boundary is much more robust in the presence of outliers, and the optimisation model is practically the general one:

ψ(θ) = min_θ C Σ_{i=1}^{m} [ y^(i) c_1^(i) + (1 − y^(i)) c_2^(i) ] + (1/2) Σ_{j=1}^{n} θ_j^2

subject to θ^T x^(i) ≥ 1 if y^(i) = 1 and θ^T x^(i) ≤ −1 if y^(i) = 0.

This model is still a quadratic convex optimization and admits a global minimum. Even if the data are not linearly separable, this optimization model still yields a global minimum, and this is one of the greatest strengths of the algorithm. The model with a small C is much more robust, because its decision boundary has the largest margin, and it should be used in prediction despite the fact that its accuracy in testing is 91.66% (11/12).

Fig. 5 SVM linear classification model with high bias (underfitting). Source: MATLAB output.

Now I have all the necessary conditions to solve this optimization problem for the parameters θ. The logic behind Cover's theorem is that data which are not linearly separable in 2D become linearly separable in 3D after applying a function transformation, which is called a kernel function. But not all functions create valid kernels. In order to be a valid kernel, a function must satisfy the technical condition named Mercer's theorem. This theorem actually ensures the necessary conditions for the SVM package optimizations to run correctly and not diverge (Cover, 1965).
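The 2D-to-3D intuition behind Cover's theorem can be made concrete with the classic XOR example, which is not linearly separable in the plane but becomes separable after adding the product feature x₁x₂ (the separating plane coefficients below are a hand-picked illustration, not from the paper):

```python
# XOR labels cannot be separated by a line in 2-D, but the non-linear map
# phi: (x1, x2) -> (x1, x2, x1*x2) lifts the points into 3-D, where a
# single plane separates the two classes.

def lift(x1, x2):
    """Non-linear feature map phi."""
    return (x1, x2, x1 * x2)

points = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def predict(x1, x2):
    """Linear classifier in the lifted space: w = (1, 1, -2), b = -0.5."""
    w, b = (1, 1, -2), -0.5
    z = sum(wi * fi for wi, fi in zip(w, lift(x1, x2))) + b
    return 1 if z > 0 else 0

for (x1, x2), label in points:
    print((x1, x2), label, predict(x1, x2))  # the prediction always equals the label
```

The SVM never computes φ explicitly: the kernel function returns the inner product φ(xᵢ)ᵀφ(xⱼ) directly, which is what makes high-dimensional lifts tractable.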

Cover's theorem is a statement in computational learning theory and is one of the primary theoretical motivations for the use of non-linear kernel methods in machine learning applications. The theorem states that, given a set of training data that is not linearly separable, with high probability it can be transformed into a linearly separable set by projecting it into a higher-dimensional space via some non-linear transformation. A complex pattern-classification problem cast non-linearly in a low-dimensional space is more likely to be linearly separable in a high-dimensional space (Fig. 6). The theoretical foundations can be found in the theory of Hilbert spaces (Paulsen, 2009).

Fig. 6 Illustration of Cover's theorem. Source: VISIO output.

Cortes and Vapnik proposed in 1995 the following soft margin optimiser model:

ψ(θ) = min_{θ, ξ_i, b} (1/2) θ^T θ + C Σ_{i=1}^{l} ξ_i

subject to:
y^(i) (θ^T φ(x^(i)) + b) ≥ 1 − ξ_i,    ξ_i ≥ 0,  i = 1, 2, ..., l

Furthermore, k(x_i, x_j) ≡ φ(x_i)^T φ(x_j) is called the kernel function. The following four types of kernel functions are the subject of our paper about the card fraud study:

linear: k(x_i, x_j) = x_i^T x_j
polynomial: k(x_i, x_j) = (γ x_i^T x_j + r)^d,  γ > 0
Gaussian (RBF): k(x_i, x_j) = e^(−γ ‖x_i − x_j‖^2),  γ > 0
sigmoid: k(x_i, x_j) = tanh(γ x_i^T x_j + r)
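The four kernel functions above can be sketched directly; the vectors and parameter values below are hypothetical, chosen only to illustrate the formulas:

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def linear_kernel(x, y):
    return dot(x, y)

def polynomial_kernel(x, y, gamma=1.0, r=1.0, d=3):
    return (gamma * dot(x, y) + r) ** d

def gaussian_kernel(x, y, gamma=1.0):
    """RBF kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)

def sigmoid_kernel(x, y, gamma=1.0, r=0.0):
    return math.tanh(gamma * dot(x, y) + r)

x, y = [1.0, 2.0], [2.0, 0.0]
print(linear_kernel(x, y))      # 2.0
print(polynomial_kernel(x, y))  # (2 + 1)^3 = 27.0
print(gaussian_kernel(x, y))    # exp(-5), about 0.0067
print(gaussian_kernel(x, x))    # a point is maximally similar to itself: 1.0
```

Each kernel returns the inner product of the two inputs in some implicit feature space, so it can be read as a similarity measure between feature vectors.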

We must first decide on the best form of the kernel and then on the selection of the parameters (C, γ, r, d).

5. Machine Learning Design and Implementation

Typically, in a commercial bank there are two types of back-office applications available to prevent card fraud: Online Fraud Monitoring (OLMA) and Off-line Fraud Monitoring (OFMA). The first one gets data through a pipeline process from the live database, and in case one or more triggers are activated, a pop-up message shows up on the screen: possible card fraud attempt. The card holder must be notified immediately, and the card can be blocked at the customer's request. The second application is part of a process included in End of Day and is actually a program that runs a MATLAB routine called svmlab.exe and generates a separate report with all potentially fraudulent cards. The report must then be analyzed by the Risk Department during the next working day. The current paper refers only to OFMA. After the requirements received from the Card Fraud Department were issued, the following phases were rolled out to implement the action plan against fraudulent cards (Fig. 7).

Fig. 7 The flow-chart of implementing the Card Fraud Detection system at XXXBank. Source: VISIO output.

Phase 1: Firstly, the requirement comes from the Cards Department of one of the top 5 Romanian banks. They would like to build an intelligent

system that should be able to learn and correctly classify whether a transaction is going to be fraudulent or not. After a long period of monitoring fraudulent transactions, the cards specialists of XXX BANK noticed that the attributes in Table 1 influence the behavior of a fraudulent card and could provide a pattern for recognizing frauds. The records are indexed by the unique key given by the PAN number, which consists of 16 digits.

Phase 2: When the trigger called txn_in is activated, which means that at least one PIN code has been entered into the card system, the record with the attributes of Table 1 is picked up from the primary table all_tran.dbn and inserted into a new table named cards.dbn. This is a massive table, to which a large number of rows is added daily.

Phase 3: Normalizing the data improves the accuracy of the classification; the data attributes were adjusted using the following formula:

x̃_j^(i) = (x_j^(i) − μ_j) / (x_j^(max) − x_j^(min))

Phase 4: The normalized data are further split into 3 sets: a training set, a cross-validation set and a testing set:
– the training set has 70% of the total data and is used to estimate the parameters of the model;
– the cross-validation set has 15% of the total data and is used to fine-calibrate and confirm the parameters estimated on the training set;
– the testing set has the remaining 15% and is designated to test and validate the final model before implementing it in production.
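Phases 3 and 4 can be sketched in a few lines of Python; the four amounts and the fixed shuffle seed are hypothetical, used only to make the example reproducible:

```python
import random

def normalize(column):
    """Mean normalization, as in Phase 3: (x - mean) / (max - min)."""
    mean = sum(column) / len(column)
    spread = max(column) - min(column)
    return [(x - mean) / spread for x in column]

def split_70_15_15(rows, seed=0):
    """Shuffle and split into training / cross-validation / testing sets (Phase 4)."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n = len(rows)
    a, b = int(0.70 * n), int(0.85 * n)
    return rows[:a], rows[a:b], rows[b:]

amounts = [100.0, 200.0, 300.0, 400.0]
print(normalize(amounts))           # [-0.5, -0.1666..., 0.1666..., 0.5]
train, cv, test_set = split_70_15_15(list(range(100)))
print(len(train), len(cv), len(test_set))  # 70 15 15
```

Normalizing before training matters for the Gaussian kernel in particular, since it depends on squared distances and would otherwise be dominated by the largest-scale attribute.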
Table 1
The Attributes/Variables of the SVM Model

Field  Name of the field                                Type         Codification
1      The amount of the txn                            numerical
2      Number of txn per window                         numerical    {3, 4, 5}
3      The country where the txn was made               qualitative  1..100
       (country code)
4      The time when the txn was made                   numerical    HH:MM:SS
5      The channel where the txn was processed          qualitative  1 = ATM, 2 = POS,
                                                                     3 = Internet
6      The type of the card: smart card or not          qualitative  {0, 1}
7      The type of operation                            qualitative  1 = purchase at POS,
                                                                     2 = withdrawal at ATM,
                                                                     3 = Internet txn
8      The number of unsuccessful txn authorizations    qualitative  {1, 2, 3}
9      The number of consecutive low-value txn          qualitative  {1, 2, 3, 4}
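As an illustration of how a raw transaction record maps onto the nine attributes of Table 1, here is a hypothetical Python encoding; the dictionary keys and sample values are invented for the sketch and are not the bank's actual schema:

```python
# Hypothetical encoding of one transaction record into a 9-element numeric
# feature vector following the codification of Table 1.

CHANNEL = {"ATM": 1, "POS": 2, "Internet": 3}
OPERATION = {"purchase_pos": 1, "withdraw_atm": 2, "internet_txn": 3}

def encode(txn):
    return [
        txn["amount"],                  # 1: amount of the txn (numerical)
        txn["txn_per_window"],          # 2: number of txn per window {3,4,5}
        txn["country_code"],            # 3: country code (qualitative)
        txn["seconds_since_midnight"],  # 4: time of the txn as a number
        CHANNEL[txn["channel"]],        # 5: channel (1=ATM, 2=POS, 3=Internet)
        1 if txn["smart_card"] else 0,  # 6: smart card or not {0,1}
        OPERATION[txn["operation"]],    # 7: type of operation
        txn["failed_authorizations"],   # 8: unsuccessful authorizations {1,2,3}
        txn["low_value_streak"],        # 9: consecutive low-value txn {1,2,3,4}
    ]

txn = {"amount": 250.0, "txn_per_window": 3, "country_code": 40,
       "seconds_since_midnight": 47130, "channel": "Internet",
       "smart_card": True, "operation": "internet_txn",
       "failed_authorizations": 1, "low_value_streak": 2}
print(encode(txn))  # [250.0, 3, 40, 47130, 3, 1, 3, 1, 2]
```

Each record is then normalized (Phase 3) before being fed to the SVM.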

The input file was further imported and processed in MATLAB using the LIBSVM library. Table 2 reflects the results of applying the LIBSVM module in MATLAB with different kernel functions. The number of observations was set to 10,000. In trial 1, all the variables were generated randomly from the uniform distribution; in trial 2, the output variable was generated from a Weibull distribution and the input variables from a uniform distribution.

Table 2
The Accuracy in Card Fraud Detection Using Various Types of Kernel Function. Source: MATLAB output

Type of the kernel     Accuracy (Trial 1)     Accuracy (Trial 2)     Iterations
Linear                 51.56% (5156/10000)    ≈50%                   –
Polynomial (d = 3)     53.39% (5339/10000)    54.66% (5466/10000)    –
Logistic (sigmoid)     50.12% (5012/10000)    49.34% (4934/10000)    3288
Gaussian (γ = 1)       68.08% (6808/10000)    91.00% (9100/10000)    10546
Gaussian (γ = 10)      92.38% (9238/10000)    99.99% (9999/10000)    3436
Gaussian (γ = 100)     99.96% (9996/10000)    –                      475
Gaussian (γ = 1000)    100% (10000/10000)     100% (10000/10000)     4920

LIBSVM is an open-source machine learning library, developed at National Taiwan University and written in C++. LIBSVM implements the SMO algorithm for various types of kernel, producing a very accurate classification on massive data structures. LIBLINEAR implements linear SVMs and logistic regression models trained using a coordinate descent algorithm (Chang & Lin, 2011).

The MATLAB source code which uses a radial kernel to train and then classify the frauds is the following:

label=data(:,1);
feature=data(:,2:end);
gama=10;
model=svmtrain(label,feature,sprintf('-s 0 -t 2 -g %g', gama));
[predicted_label, accuracy, decision_values]=svmpredict(label,feature,model);
...*...*
optimization finished, #iter = 3436
nSV = 10000, nBSV = 4950
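The γ pattern in Table 2 (training accuracy climbing to 100% as γ grows) can be explained directly from the kernel formula: as γ increases, the Gaussian similarity between distinct points collapses toward zero, so each training point ends up similar only to itself and the model memorizes the training set. A stdlib-only Python sketch with two hypothetical nearby points:

```python
import math

def rbf(x, y, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

x, y = [0.0, 0.0], [0.5, 0.5]   # two nearby hypothetical points
for gamma in (1, 10, 100, 1000):
    print(gamma, rbf(x, y, gamma))
# gamma = 1    -> exp(-0.5), about 0.61 (the points look similar)
# gamma = 1000 -> essentially 0 (each point looks similar only to itself)
```

This is why the 100% figure at γ = 1000 should be read as overfitting rather than as the best model, consistent with the recommendation of γ = 100 in the conclusions.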

Total nSV = 10000
Accuracy = 99.99% (9999/10000) (classification)

Fig. 8 Screen containing the SVM run using LIBSVM. Source: MATLAB output.

6. Conclusions

1. Using the SVM technique with the proposed Gaussian kernel with γ = 100 on a set of normalized data reveals an accuracy in screening potentially fraudulent cards of around 99.96%. This accuracy is clearly superior to that of other non-linear kernels like the sigmoid, polynomial or exponential ones. On a different data set, the optimal parameter γ might vary.

2. Due to its exceptional performance, we further propose implementing the SVM technique with a Gaussian kernel, it being superior to other standard classification methods such as neural networks, logistic regression or Bayesian analysis.

3. Based on this study, we recommend that this model be further applied in the card fraud monitoring activity of any retail business.

4. In addition to this technique, we propose similar procedures based on Gaussian kernels for other outlier detection problems, such as fake checks, counterfeit banknotes and so on.

REFERENCES

Chang C.-C., Lin C.-J., LIBSVM: A Library for Support Vector Machines. ACM Transactions on Intelligent Systems and Technology, 2(3) (2011).
Cover T.M., Geometrical and Statistical Properties of Systems of Linear Inequalities with Applications in Pattern Recognition. IEEE Transactions on Electronic Computers (1965).
DasGupta A., Probability for Statistics and Machine Learning: Fundamentals and Advanced Topics. Springer, 736 pp. (2011).
Flemming J., Generalized Tikhonov Regularization: Basic Theory and Comprehensive Results on Convergence Rates. Dissertation, Chemnitz, 28 October (2011).
Huang T.-M., Kecman V., Kopriva I., Kernel Based Algorithms for Mining Huge Data Sets: Supervised, Semi-Supervised, and Unsupervised Learning. Springer-Verlag, Berlin (2006).
Paulsen V., An Introduction to the Theory of Reproducing Kernel Hilbert Spaces (2009).
Vapnik V., Statistical Learning Theory. Wiley-Interscience (1998).

DETECTING BANK CARD FRAUD USING SVM

(Abstract)

The importance of detecting bank card fraud is high: first of all, it avoids significant financial losses and can save the reputation of both banks and merchants; numerous studies, articles and books have been written, yet the subject is still open and has no accepted stable solution. The paper presents the significance of supervised learning, which is practically a cybernetic process with a negative feedback loop, self-regulation being achieved through the continuous minimization of errors (of training and of testing). The decision to use the Support Vector Machine technique in outlier classification problems is based on the numerous benchmark tests which have shown its superiority over other classes of algorithms: neural or Bayesian networks. The paper motivates why the idea of minimizing the empirical risk is equivalent to minimizing the geometrical margin distance, which is in fact the central idea of the SVM (Vapnik), while also representing an element of regularization. This is why it is shown how generalized Tikhonov regularization represents a good compromise between bias and variance. The methodological approach involves constructing the cost function starting from the sigmoid function and maximum-likelihood estimation, and proving that it is convex.
Of real help was the experience of the Card Fraud Monitoring department of ING BANK regarding the selection of the variables truly useful in building the model. Finally, the libsvm package was used in MATLAB for the various numerical simulations, due to its ease in handling large volumes of data as well as its remarkable flexibility in customizing the kernel function parameters. It was demonstrated numerically that, using the SVM technique with a Gaussian kernel function, the accuracy in the learning phase exceeds 90% correct classification on randomly generated samples. In conclusion, we confidently propose the use of this model in the fight against bank card fraud.


More information

Detecting Corporate Fraud: An Application of Machine Learning

Detecting Corporate Fraud: An Application of Machine Learning Detecting Corporate Fraud: An Application of Machine Learning Ophir Gottlieb, Curt Salisbury, Howard Shek, Vishal Vaidyanathan December 15, 2006 ABSTRACT This paper explores the application of several

More information

Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence

Artificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network? - Perceptron learners - Multi-layer networks What is a Support

More information

Machine Learning. CUNY Graduate Center, Spring 2013. Professor Liang Huang. huang@cs.qc.cuny.edu

Machine Learning. CUNY Graduate Center, Spring 2013. Professor Liang Huang. huang@cs.qc.cuny.edu Machine Learning CUNY Graduate Center, Spring 2013 Professor Liang Huang huang@cs.qc.cuny.edu http://acl.cs.qc.edu/~lhuang/teaching/machine-learning Logistics Lectures M 9:30-11:30 am Room 4419 Personnel

More information

Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris

Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris Class #6: Non-linear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines

More information

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R

Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be

More information

Supervised Learning (Big Data Analytics)

Supervised Learning (Big Data Analytics) Supervised Learning (Big Data Analytics) Vibhav Gogate Department of Computer Science The University of Texas at Dallas Practical advice Goal of Big Data Analytics Uncover patterns in Data. Can be used

More information

Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu

Introduction to Machine Learning Lecture 1. Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Introduction to Machine Learning Lecture 1 Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Introduction Logistics Prerequisites: basics concepts needed in probability and statistics

More information

Semi-Supervised Support Vector Machines and Application to Spam Filtering

Semi-Supervised Support Vector Machines and Application to Spam Filtering Semi-Supervised Support Vector Machines and Application to Spam Filtering Alexander Zien Empirical Inference Department, Bernhard Schölkopf Max Planck Institute for Biological Cybernetics ECML 2006 Discovery

More information

Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

Introduction to Logistic Regression

Introduction to Logistic Regression OpenStax-CNX module: m42090 1 Introduction to Logistic Regression Dan Calderon This work is produced by OpenStax-CNX and licensed under the Creative Commons Attribution License 3.0 Abstract Gives introduction

More information

Logistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld.

Logistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld. Logistic Regression Vibhav Gogate The University of Texas at Dallas Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld. Generative vs. Discriminative Classifiers Want to Learn: h:x Y X features

More information

Machine Learning and Pattern Recognition Logistic Regression

Machine Learning and Pattern Recognition Logistic Regression Machine Learning and Pattern Recognition Logistic Regression Course Lecturer:Amos J Storkey Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh Crichton Street,

More information

Support Vector Machine. Tutorial. (and Statistical Learning Theory)

Support Vector Machine. Tutorial. (and Statistical Learning Theory) Support Vector Machine (and Statistical Learning Theory) Tutorial Jason Weston NEC Labs America 4 Independence Way, Princeton, USA. jasonw@nec-labs.com 1 Support Vector Machines: history SVMs introduced

More information

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376

BIOINF 585 Fall 2015 Machine Learning for Systems Biology & Clinical Informatics http://www.ccmb.med.umich.edu/node/1376 Course Director: Dr. Kayvan Najarian (DCM&B, kayvan@umich.edu) Lectures: Labs: Mondays and Wednesdays 9:00 AM -10:30 AM Rm. 2065 Palmer Commons Bldg. Wednesdays 10:30 AM 11:30 AM (alternate weeks) Rm.

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct

More information

6.2.8 Neural networks for data mining

6.2.8 Neural networks for data mining 6.2.8 Neural networks for data mining Walter Kosters 1 In many application areas neural networks are known to be valuable tools. This also holds for data mining. In this chapter we discuss the use of neural

More information

Machine learning for algo trading

Machine learning for algo trading Machine learning for algo trading An introduction for nonmathematicians Dr. Aly Kassam Overview High level introduction to machine learning A machine learning bestiary What has all this got to do with

More information

Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier

Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier Feature Selection using Integer and Binary coded Genetic Algorithm to improve the performance of SVM Classifier D.Nithya a, *, V.Suganya b,1, R.Saranya Irudaya Mary c,1 Abstract - This paper presents,

More information

Support Vector Machines

Support Vector Machines Support Vector Machines Charlie Frogner 1 MIT 2011 1 Slides mostly stolen from Ryan Rifkin (Google). Plan Regularization derivation of SVMs. Analyzing the SVM problem: optimization, duality. Geometric

More information

Simple and efficient online algorithms for real world applications

Simple and efficient online algorithms for real world applications Simple and efficient online algorithms for real world applications Università degli Studi di Milano Milano, Italy Talk @ Centro de Visión por Computador Something about me PhD in Robotics at LIRA-Lab,

More information

Active Learning SVM for Blogs recommendation

Active Learning SVM for Blogs recommendation Active Learning SVM for Blogs recommendation Xin Guan Computer Science, George Mason University Ⅰ.Introduction In the DH Now website, they try to review a big amount of blogs and articles and find the

More information

Knowledge Discovery from patents using KMX Text Analytics

Knowledge Discovery from patents using KMX Text Analytics Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs anton.heijs@treparel.com Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers

More information

Classification of Bad Accounts in Credit Card Industry

Classification of Bad Accounts in Credit Card Industry Classification of Bad Accounts in Credit Card Industry Chengwei Yuan December 12, 2014 Introduction Risk management is critical for a credit card company to survive in such competing industry. In addition

More information

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches

Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic

More information

UNSING GEOGRAPHIC INFORMATION SYSTEM VISUALISATION FOR THE SEISMIC RISK ASSESMENT OF THE ROMANIAN INFRASTRUCTURE

UNSING GEOGRAPHIC INFORMATION SYSTEM VISUALISATION FOR THE SEISMIC RISK ASSESMENT OF THE ROMANIAN INFRASTRUCTURE BULETINUL INSTITUTULUI POLITEHNIC DIN IAŞI Publicat de Universitatea Tehnică Gheorghe Asachi din Iaşi Tomul LVI (LX), Fasc. 3, 2010 Secţia CONSTRUCŢII. ĂRHITECTURĂ UNSING GEOGRAPHIC INFORMATION SYSTEM

More information

Support Vector Machines with Clustering for Training with Very Large Datasets

Support Vector Machines with Clustering for Training with Very Large Datasets Support Vector Machines with Clustering for Training with Very Large Datasets Theodoros Evgeniou Technology Management INSEAD Bd de Constance, Fontainebleau 77300, France theodoros.evgeniou@insead.fr Massimiliano

More information

SURVIVABILITY OF COMPLEX SYSTEM SUPPORT VECTOR MACHINE BASED APPROACH

SURVIVABILITY OF COMPLEX SYSTEM SUPPORT VECTOR MACHINE BASED APPROACH 1 SURVIVABILITY OF COMPLEX SYSTEM SUPPORT VECTOR MACHINE BASED APPROACH Y, HONG, N. GAUTAM, S. R. T. KUMARA, A. SURANA, H. GUPTA, S. LEE, V. NARAYANAN, H. THADAKAMALLA The Dept. of Industrial Engineering,

More information

Defending Networks with Incomplete Information: A Machine Learning Approach. Alexandre Pinto alexcp@mlsecproject.org @alexcpsec @MLSecProject

Defending Networks with Incomplete Information: A Machine Learning Approach. Alexandre Pinto alexcp@mlsecproject.org @alexcpsec @MLSecProject Defending Networks with Incomplete Information: A Machine Learning Approach Alexandre Pinto alexcp@mlsecproject.org @alexcpsec @MLSecProject Agenda Security Monitoring: We are doing it wrong Machine Learning

More information

Equity forecast: Predicting long term stock price movement using machine learning

Equity forecast: Predicting long term stock price movement using machine learning Equity forecast: Predicting long term stock price movement using machine learning Nikola Milosevic School of Computer Science, University of Manchester, UK Nikola.milosevic@manchester.ac.uk Abstract Long

More information

Machine Learning in Spam Filtering

Machine Learning in Spam Filtering Machine Learning in Spam Filtering A Crash Course in ML Konstantin Tretyakov kt@ut.ee Institute of Computer Science, University of Tartu Overview Spam is Evil ML for Spam Filtering: General Idea, Problems.

More information

Acknowledgments. Data Mining with Regression. Data Mining Context. Overview. Colleagues

Acknowledgments. Data Mining with Regression. Data Mining Context. Overview. Colleagues Data Mining with Regression Teaching an old dog some new tricks Acknowledgments Colleagues Dean Foster in Statistics Lyle Ungar in Computer Science Bob Stine Department of Statistics The School of the

More information

Classifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang

Classifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang Classifying Large Data Sets Using SVMs with Hierarchical Clusters Presented by :Limou Wang Overview SVM Overview Motivation Hierarchical micro-clustering algorithm Clustering-Based SVM (CB-SVM) Experimental

More information

Machine Learning in FX Carry Basket Prediction

Machine Learning in FX Carry Basket Prediction Machine Learning in FX Carry Basket Prediction Tristan Fletcher, Fabian Redpath and Joe D Alessandro Abstract Artificial Neural Networks ANN), Support Vector Machines SVM) and Relevance Vector Machines

More information

Beating the NCAA Football Point Spread

Beating the NCAA Football Point Spread Beating the NCAA Football Point Spread Brian Liu Mathematical & Computational Sciences Stanford University Patrick Lai Computer Science Department Stanford University December 10, 2010 1 Introduction Over

More information

Lecture 6. Artificial Neural Networks

Lecture 6. Artificial Neural Networks Lecture 6 Artificial Neural Networks 1 1 Artificial Neural Networks In this note we provide an overview of the key concepts that have led to the emergence of Artificial Neural Networks as a major paradigm

More information

FUTURE INTERNET AND ITIL FOR INTELLIGENT MANAGEMENT IN INDUSTRIAL ROBOTICS SYSTEMS

FUTURE INTERNET AND ITIL FOR INTELLIGENT MANAGEMENT IN INDUSTRIAL ROBOTICS SYSTEMS BULETINUL INSTITUTULUI POLITEHNIC DIN IAŞI Publicat de Universitatea Tehnică Gheorghe Asachi din Iaşi Tomul LVIII (LXII), Fasc. 1, 2012 SecŃia AUTOMATICĂ şi CALCULATOARE FUTURE INTERNET AND ITIL FOR INTELLIGENT

More information

Support Vector Machines for Classification and Regression

Support Vector Machines for Classification and Regression UNIVERSITY OF SOUTHAMPTON Support Vector Machines for Classification and Regression by Steve R. Gunn Technical Report Faculty of Engineering, Science and Mathematics School of Electronics and Computer

More information

Machine Learning and Financial Advice

Machine Learning and Financial Advice Faculty of Science Machine Learning and Financial Advice Christian Igel Department of Computer Science igel@diku.dk Slide 1/24 Outline 1 Machine Learning at DIKU 2 Example Applications in Finance 3 Risks

More information

E-commerce Transaction Anomaly Classification

E-commerce Transaction Anomaly Classification E-commerce Transaction Anomaly Classification Minyong Lee minyong@stanford.edu Seunghee Ham sham12@stanford.edu Qiyi Jiang qjiang@stanford.edu I. INTRODUCTION Due to the increasing popularity of e-commerce

More information

Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering

Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering Department of Industrial Engineering and Management Sciences Northwestern University September 15th, 2014

More information

Introduction to Online Learning Theory

Introduction to Online Learning Theory Introduction to Online Learning Theory Wojciech Kot lowski Institute of Computing Science, Poznań University of Technology IDSS, 04.06.2013 1 / 53 Outline 1 Example: Online (Stochastic) Gradient Descent

More information

Decompose Error Rate into components, some of which can be measured on unlabeled data

Decompose Error Rate into components, some of which can be measured on unlabeled data Bias-Variance Theory Decompose Error Rate into components, some of which can be measured on unlabeled data Bias-Variance Decomposition for Regression Bias-Variance Decomposition for Classification Bias-Variance

More information

Introduction to Machine Learning Using Python. Vikram Kamath

Introduction to Machine Learning Using Python. Vikram Kamath Introduction to Machine Learning Using Python Vikram Kamath Contents: 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. Introduction/Definition Where and Why ML is used Types of Learning Supervised Learning Linear Regression

More information

Trading Strategies and the Cat Tournament Protocol

Trading Strategies and the Cat Tournament Protocol M A C H I N E L E A R N I N G P R O J E C T F I N A L R E P O R T F A L L 2 7 C S 6 8 9 CLASSIFICATION OF TRADING STRATEGIES IN ADAPTIVE MARKETS MARK GRUMAN MANJUNATH NARAYANA Abstract In the CAT Tournament,

More information

A Logistic Regression Approach to Ad Click Prediction

A Logistic Regression Approach to Ad Click Prediction A Logistic Regression Approach to Ad Click Prediction Gouthami Kondakindi kondakin@usc.edu Satakshi Rana satakshr@usc.edu Aswin Rajkumar aswinraj@usc.edu Sai Kaushik Ponnekanti ponnekan@usc.edu Vinit Parakh

More information

Lecture 6: Logistic Regression

Lecture 6: Logistic Regression Lecture 6: CS 194-10, Fall 2011 Laurent El Ghaoui EECS Department UC Berkeley September 13, 2011 Outline Outline Classification task Data : X = [x 1,..., x m]: a n m matrix of data points in R n. y { 1,

More information

Lecture 9: Introduction to Pattern Analysis

Lecture 9: Introduction to Pattern Analysis Lecture 9: Introduction to Pattern Analysis g Features, patterns and classifiers g Components of a PR system g An example g Probability definitions g Bayes Theorem g Gaussian densities Features, patterns

More information

Probabilistic Linear Classification: Logistic Regression. Piyush Rai IIT Kanpur

Probabilistic Linear Classification: Logistic Regression. Piyush Rai IIT Kanpur Probabilistic Linear Classification: Logistic Regression Piyush Rai IIT Kanpur Probabilistic Machine Learning (CS772A) Jan 18, 2016 Probabilistic Machine Learning (CS772A) Probabilistic Linear Classification:

More information

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015

An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015 An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content

More information

Comparison of machine learning methods for intelligent tutoring systems

Comparison of machine learning methods for intelligent tutoring systems Comparison of machine learning methods for intelligent tutoring systems Wilhelmiina Hämäläinen 1 and Mikko Vinni 1 Department of Computer Science, University of Joensuu, P.O. Box 111, FI-80101 Joensuu

More information

MACHINE LEARNING IN HIGH ENERGY PHYSICS

MACHINE LEARNING IN HIGH ENERGY PHYSICS MACHINE LEARNING IN HIGH ENERGY PHYSICS LECTURE #1 Alex Rogozhnikov, 2015 INTRO NOTES 4 days two lectures, two practice seminars every day this is introductory track to machine learning kaggle competition!

More information

These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher Bishop

These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher Bishop Music and Machine Learning (IFT6080 Winter 08) Prof. Douglas Eck, Université de Montréal These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher

More information

Linear Threshold Units

Linear Threshold Units Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear

More information

An Introduction to Data Mining

An Introduction to Data Mining An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail

More information

Density Level Detection is Classification

Density Level Detection is Classification Density Level Detection is Classification Ingo Steinwart, Don Hush and Clint Scovel Modeling, Algorithms and Informatics Group, CCS-3 Los Alamos National Laboratory {ingo,dhush,jcs}@lanl.gov Abstract We

More information

Basics of Statistical Machine Learning

Basics of Statistical Machine Learning CS761 Spring 2013 Advanced Machine Learning Basics of Statistical Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu Modern machine learning is rooted in statistics. You will find many familiar

More information

Decision Support Systems

Decision Support Systems Decision Support Systems 50 (2011) 602 613 Contents lists available at ScienceDirect Decision Support Systems journal homepage: www.elsevier.com/locate/dss Data mining for credit card fraud: A comparative

More information

Programming Exercise 3: Multi-class Classification and Neural Networks

Programming Exercise 3: Multi-class Classification and Neural Networks Programming Exercise 3: Multi-class Classification and Neural Networks Machine Learning November 4, 2011 Introduction In this exercise, you will implement one-vs-all logistic regression and neural networks

More information

A Decision Tree for Weather Prediction

A Decision Tree for Weather Prediction BULETINUL UniversităŃii Petrol Gaze din Ploieşti Vol. LXI No. 1/2009 77-82 Seria Matematică - Informatică - Fizică A Decision Tree for Weather Prediction Elia Georgiana Petre Universitatea Petrol-Gaze

More information

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts

BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts BIDM Project Predicting the contract type for IT/ITES outsourcing contracts N a n d i n i G o v i n d a r a j a n ( 6 1 2 1 0 5 5 6 ) The authors believe that data modelling can be used to predict if an

More information

Intrusion Detection via Machine Learning for SCADA System Protection

Intrusion Detection via Machine Learning for SCADA System Protection Intrusion Detection via Machine Learning for SCADA System Protection S.L.P. Yasakethu Department of Computing, University of Surrey, Guildford, GU2 7XH, UK. s.l.yasakethu@surrey.ac.uk J. Jiang Department

More information

large-scale machine learning revisited Léon Bottou Microsoft Research (NYC)

large-scale machine learning revisited Léon Bottou Microsoft Research (NYC) large-scale machine learning revisited Léon Bottou Microsoft Research (NYC) 1 three frequent ideas in machine learning. independent and identically distributed data This experimental paradigm has driven

More information

Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j

Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j Analysis of kiva.com Microlending Service! Hoda Eydgahi Julia Ma Andy Bardagjy December 9, 2010 MAS.622j What is Kiva? An organization that allows people to lend small amounts of money via the Internet

More information

Learning is a very general term denoting the way in which agents:

Learning is a very general term denoting the way in which agents: What is learning? Learning is a very general term denoting the way in which agents: Acquire and organize knowledge (by building, modifying and organizing internal representations of some external reality);

More information

A fast multi-class SVM learning method for huge databases

A fast multi-class SVM learning method for huge databases www.ijcsi.org 544 A fast multi-class SVM learning method for huge databases Djeffal Abdelhamid 1, Babahenini Mohamed Chaouki 2 and Taleb-Ahmed Abdelmalik 3 1,2 Computer science department, LESIA Laboratory,

More information

Coding science news (intrinsic and extrinsic features)

Coding science news (intrinsic and extrinsic features) Coding science news (intrinsic and extrinsic features) M I G U E L Á N G E L Q U I N T A N I L L A, C A R L O S G. F I G U E R O L A T A M A R G R O V E S 2 Science news in Spain The corpus of digital

More information

Data Mining - Evaluation of Classifiers

Data Mining - Evaluation of Classifiers Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010

More information

Network Machine Learning Research Group. Intended status: Informational October 19, 2015 Expires: April 21, 2016

Network Machine Learning Research Group. Intended status: Informational October 19, 2015 Expires: April 21, 2016 Network Machine Learning Research Group S. Jiang Internet-Draft Huawei Technologies Co., Ltd Intended status: Informational October 19, 2015 Expires: April 21, 2016 Abstract Network Machine Learning draft-jiang-nmlrg-network-machine-learning-00

More information

CSCI567 Machine Learning (Fall 2014)

CSCI567 Machine Learning (Fall 2014) CSCI567 Machine Learning (Fall 2014) Drs. Sha & Liu {feisha,yanliu.cs}@usc.edu September 22, 2014 Drs. Sha & Liu ({feisha,yanliu.cs}@usc.edu) CSCI567 Machine Learning (Fall 2014) September 22, 2014 1 /

More information

GURLS: A Least Squares Library for Supervised Learning

GURLS: A Least Squares Library for Supervised Learning Journal of Machine Learning Research 14 (2013) 3201-3205 Submitted 1/12; Revised 2/13; Published 10/13 GURLS: A Least Squares Library for Supervised Learning Andrea Tacchetti Pavan K. Mallapragada Center

More information

DATA MINING TECHNIQUES AND APPLICATIONS

DATA MINING TECHNIQUES AND APPLICATIONS DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,

More information

Machine Learning. 01 - Introduction

Machine Learning. 01 - Introduction Machine Learning 01 - Introduction Machine learning course One lecture (Wednesday, 9:30, 346) and one exercise (Monday, 17:15, 203). Oral exam, 20 minutes, 5 credit points. Some basic mathematical knowledge

More information

Big Data Analytics CSCI 4030

Big Data Analytics CSCI 4030 High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering data streams SVM Recommen der systems Clustering Community Detection Web advertising

More information

EMPIRICAL RISK MINIMIZATION FOR CAR INSURANCE DATA

EMPIRICAL RISK MINIMIZATION FOR CAR INSURANCE DATA EMPIRICAL RISK MINIMIZATION FOR CAR INSURANCE DATA Andreas Christmann Department of Mathematics homepages.vub.ac.be/ achristm Talk: ULB, Sciences Actuarielles, 17/NOV/2006 Contents 1. Project: Motor vehicle

More information

Predict the Popularity of YouTube Videos Using Early View Data

Predict the Popularity of YouTube Videos Using Early View Data 000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050

More information

A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier

A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier A Study Of Bagging And Boosting Approaches To Develop Meta-Classifier G.T. Prasanna Kumari Associate Professor, Dept of Computer Science and Engineering, Gokula Krishna College of Engg, Sullurpet-524121,

More information

Advanced Ensemble Strategies for Polynomial Models

Advanced Ensemble Strategies for Polynomial Models Advanced Ensemble Strategies for Polynomial Models Pavel Kordík 1, Jan Černý 2 1 Dept. of Computer Science, Faculty of Information Technology, Czech Technical University in Prague, 2 Dept. of Computer

More information

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES

BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES ISSN: 2229-6956(ONLINE) ICTACT JOURNAL ON SOFT COMPUTING, JULY 2012, VOLUME: 02, ISSUE: 04 BEHAVIOR BASED CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES V. Dheepa 1 and R. Dhanapal 2 1 Research

More information

A Health Degree Evaluation Algorithm for Equipment Based on Fuzzy Sets and the Improved SVM

A Health Degree Evaluation Algorithm for Equipment Based on Fuzzy Sets and the Improved SVM Journal of Computational Information Systems 10: 17 (2014) 7629 7635 Available at http://www.jofcis.com A Health Degree Evaluation Algorithm for Equipment Based on Fuzzy Sets and the Improved SVM Tian

More information

Government of Russian Federation. Faculty of Computer Science School of Data Analysis and Artificial Intelligence

Government of Russian Federation. Faculty of Computer Science School of Data Analysis and Artificial Intelligence Government of Russian Federation Federal State Autonomous Educational Institution of High Professional Education National Research University «Higher School of Economics» Faculty of Computer Science School

More information

Predictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar

Predictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar Predictive Modeling in Workers Compensation 2008 CAS Ratemaking Seminar Prepared by Louise Francis, FCAS, MAAA Francis Analytics and Actuarial Data Mining, Inc. www.data-mines.com Louise.francis@data-mines.cm

More information