Kernel methods with Imbalanced Data and Applications to Weather Prediction
1 Kernel methods with Imbalanced Data and Applications to Weather Prediction. Theodore B. Trafalis, Laboratory of Optimization and Intelligent Systems, School of Industrial and Systems Engineering, University of Oklahoma. INNS Conference on Big Data 2015, San Francisco Airport Marriott Waterfront Congress Center, San Francisco, CA, 8-10 August 2015.
2 Part I: Kernel Methods for Imbalanced Data and Application to Tornado Prediction
3 Imbalanced Data / What is an Imbalanced Data Problem?
Outline:
1. Imbalanced Data: What is an Imbalanced Data Problem? Impact of Imbalanced Data on Learning Machines. State of the Art Techniques for Imbalanced Learning.
2. Kernel Methods: General Description of Kernel Methods. Support Vector Machines Applied to Imbalanced Data.
3. Application to Tornado Prediction: Description of the Experiments. Results for the Tornado Data Set.
4 Imbalanced Data / What is an Imbalanced Data Problem?
Imbalanced Data Problems and their Importance. The problem of learning from an imbalanced data set occurs when the number of samples in one class is significantly greater than in the other class. Imbalanced data sets arise frequently in data mining and classification. Examples include: fraudulent credit card transactions; telecommunication equipment failures; oil spills in satellite images; tornado, earthquake and landslide occurrences; cancer and health science data.
5 Imbalanced Data / What is an Imbalanced Data Problem?
Example of the Classification of Imbalanced Data. Source: Tang et al. SVMs Modeling for Highly Imbalanced Classification. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 39(1).
6 Imbalanced Data / What is an Imbalanced Data Problem?
Between Class and Within Class Imbalances. Source: He and Garcia. Learning from Imbalanced Data. IEEE Transactions on Knowledge and Data Engineering, 21(9).
7 Imbalanced Data / Impact of Imbalanced Data on Learning Machines
8 Imbalanced Data / Impact of Imbalanced Data on Learning Machines
Impact of Imbalanced Data Problems in Classification. Classifiers tend to provide an imbalanced degree of accuracy, with the majority class reaching close to 100 percent accuracy while the minority class stays near 0-10 percent. In the tornado data set, for example, a 10 percent accuracy for the minority class means that 72 tornadoes would be classified as nontornadoes.
9 Imbalanced Data / Impact of Imbalanced Data on Learning Machines
Illustration of Impact of Imbalanced Classification. On the left, the accuracy of the minority class is zero percent. On the right, the accuracy for the minority class is 80 percent.
10 Imbalanced Data / State of the Art Techniques for Imbalanced Learning
11 Imbalanced Data / State of the Art Techniques for Imbalanced Learning
Current Approaches for Imbalanced Learning I
Algorithm level: threshold method; learning only the minority class; cost-sensitive approaches.
Data level: random under-sampling and over-sampling; informed under-sampling (EasyEnsemble, BalanceCascade); synthetic sampling with data generation (SMOTE); adaptive synthetic sampling (Borderline-SMOTE, ADASYN); sampling with data cleaning (the OSS method, the CNN + Tomek links integration method, the Neighborhood Cleaning Rule, SMOTE+ENN, SMOTE+Tomek); cluster-based sampling (CBO); integration of sampling and boosting (SMOTEBoost).
Kernel-based approaches: variations of Support Vector Machines (SVM).
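To make the synthetic-sampling idea concrete, here is a minimal numpy sketch in the spirit of SMOTE (a simplification, not the published algorithm: real SMOTE generates a fixed number of synthetic samples per minority point and handles several edge cases):

```python
import numpy as np

def smote_like(X_min, n_new, k=3, rng=None):
    """Generate n_new synthetic minority samples by interpolating
    between a minority point and one of its k nearest neighbors."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    # pairwise distances within the minority class
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    neighbors = np.argsort(d, axis=1)[:, :k]   # k nearest neighbors of each point
    out = []
    for _ in range(n_new):
        i = rng.integers(n)                    # pick a minority sample
        j = rng.choice(neighbors[i])           # pick one of its neighbors
        gap = rng.random()                     # interpolation factor in [0, 1]
        out.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(out)

X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_syn = smote_like(X_min, n_new=8, k=2, rng=0)
```

Since each synthetic point lies on a segment between two minority samples, the new points stay inside the convex hull of the minority class.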
12 Imbalanced Data / State of the Art Techniques for Imbalanced Learning
Current Approaches for Imbalanced Learning II
Kernel-based approaches (continued): kernel logistic regression.
Evaluation metrics used to assess accuracy: Receiver Operating Characteristic (ROC), Precision-Recall (PR) and cost curves; singular assessment metrics based on the confusion or multi-class cost matrix (F-measure, G-mean, etc.).
Source: He and Garcia. Learning from Imbalanced Data. IEEE Transactions on Knowledge and Data Engineering, 21(9).
13 Imbalanced Data / State of the Art Techniques for Imbalanced Learning
Problems with Imbalanced Data Prediction
Improper evaluation metrics.
Lack of data (absolute rarity): the number of observations is small in an absolute sense.
Relative lack of data (relative rarity): rare relative to other events.
Data fragmentation: absolute lack of data within a single partition.
Inappropriate inductive bias, such as an assumption of linearity.
14 Kernel Methods / General Description of Kernel Methods
15 Kernel Methods / General Description of Kernel Methods
Historical Perspective. Efficient algorithms for detecting linear relations were used in the 1950s and 1960s (the perceptron algorithm). Handling nonlinear relationships was seen as a major research goal at that time, but developing nonlinear algorithms with the same efficiency and stability proved elusive. In the mid-1980s, the field of pattern analysis underwent a nonlinear revolution with backpropagation neural networks (NNs) and decision trees (based on heuristics and lacking a firm theoretical foundation; local minima problems; nonconvexity). In the mid-1990s, kernel-based methods were developed that retain the guarantees and understanding established for linear algorithms.
16 Kernel Methods / General Description of Kernel Methods
Overview I. Kernel methods are a class of machine learning algorithms that can operate on very general types of data and detect very general types of relations (e.g., the potential function method; Aizerman et al., 1964; Vapnik, 1982, 1995). Correlation, factor, cluster and discriminant analysis are some of the machine learning tasks that can be performed, using kernels, on data as diverse as sequences, text, images, graphs and vectors. Kernel methods also provide a natural way to merge and integrate different types of data.
17 Kernel Methods / General Description of Kernel Methods
Overview II. Kernel methods offer a modular framework. In a first step, a data set is processed into a kernel matrix; the data can be of various, even heterogeneous, types. In a second step, a variety of kernel algorithms can be used to analyze the data, using only the information contained in the kernel matrix.
18 Kernel Methods / General Description of Kernel Methods
Modular Framework. Source: J. Shawe-Taylor and N. Cristianini, Kernel Methods for Pattern Analysis.
19 Kernel Methods / General Description of Kernel Methods
Basic Idea of Kernel Methods. Kernel methods work by: embedding the data in a vector space, called the feature space, using a kernel function; looking for linear relations in that space. Much of the geometry of the data in the embedding space (relative positions) is contained in the pairwise inner products (information bottleneck). We can work in the feature space by specifying an inner product function k between points in it. In many cases, the inner product in the embedding (feature) space is very cheap to compute.
20 Kernel Methods / General Description of Kernel Methods
Properties of Kernels I
Definition (Mercer kernel). Let E be any set. A function k : E × E → R that is continuous, symmetric and finitely positive semi-definite is called here a Mercer kernel.
Definition (Finite positive semi-definiteness). A function k : E × E → R, where E is any set, is a finitely positive semi-definite kernel if
Σ_{i=1}^{m} Σ_{j=1}^{m} k(x_i, x_j) λ_i λ_j ≥ 0
for any m ∈ N, λ_i ∈ R and x_i ∈ E, i ∈ [1, m]. It can be seen as the generalization of a positive semi-definite matrix.
21 Kernel Methods / General Description of Kernel Methods
Properties of Kernels II
Definition (RKHS). A Reproducing Kernel Hilbert Space (RKHS) F is a Hilbert space of complex-valued functions on a set E for which there exists a function k : E × E → C (the reproducing kernel) such that k(·, x) ∈ F for any x ∈ E and ⟨f, k(·, x)⟩ = f(x) for all f ∈ F (reproducing property).
If k is a symmetric positive definite kernel then, by the Moore-Aronszajn theorem, there is a unique RKHS with k as the reproducing kernel. A symmetric positive definite kernel k can be expressed as a dot product k : (x, y) ↦ ⟨φ(x), φ(y)⟩, where φ is a map from R^n to an RKHS H (the kernel trick).
22 Kernel Methods / General Description of Kernel Methods
Properties of Kernels III
For any x_1, ..., x_l, the l × l matrix K with entries K_ij = k(x_i, x_j) is symmetric and positive semi-definite; K is called the kernel matrix. A kernel k can be expressed as k : (x, y) ↦ ⟨φ(x), φ(y)⟩, where φ is a map from R^n to a Hilbert space H (the kernel trick). The space H is called the feature space.
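These properties are easy to verify numerically; a small sketch with the Gaussian RBF kernel k(x, y) = exp(−‖x − y‖²/(2σ²)) (the σ value and the random data here are arbitrary):

```python
import numpy as np

def rbf_kernel_matrix(X, sigma=1.0):
    """Kernel matrix K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

X = np.random.default_rng(0).normal(size=(20, 3))
K = rbf_kernel_matrix(X)

symmetric = np.allclose(K, K.T)
# all eigenvalues of a valid kernel matrix are >= 0 (up to round-off)
psd = np.all(np.linalg.eigvalsh(K) >= -1e-10)
```

The diagonal entries equal 1 because k(x, x) = exp(0) for the RBF kernel, a direct consequence of the dot-product representation ⟨φ(x), φ(x)⟩ = ‖φ(x)‖².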
23 Kernel Methods / General Description of Kernel Methods
Properties of Kernels IV. The image of R^d under φ is a manifold S in H. Kernels can be interpreted as measures of distance and measures of angles on S. Simple geometric relations between S and hyperplanes of H can give complex forms in R^d.
24 Kernel Methods / Support Vector Machines Applied to Imbalanced Data
25 Kernel Methods / Support Vector Machines Applied to Imbalanced Data
Support Vector Machines
Definition (SVM). Support Vector Machines are a family of learning algorithms that use kernel methods to solve supervised learning problems. Common supervised learning tasks are classification and regression. SVMs work by solving quadratic programming problems that aim to minimize the generalization error.
If we are given a set S of l points x_i ∈ R^n, where each x_i belongs to either of two classes defined by y_i ∈ {−1, +1}, then the objective is to find a hyperplane that divides S, leaving all points of the same class on the same side while maximizing the minimum distance between either of the two classes and the hyperplane [Vapnik 1995].
26 Kernel Methods / Support Vector Machines Applied to Imbalanced Data
Optimal Separating Hyperplane. Source: Microsoft Research, Vision, Graphics, and Visualization Group.
27 Kernel Methods / Support Vector Machines Applied to Imbalanced Data
Dual Problem in the Nonlinear Case. The optimal hyperplane is obtained by solving the following quadratic programming (QP) problem:
min_{α ∈ R^l} { αᵀKα/2 − ⟨1, α⟩ : ⟨α, y⟩ = 0 and 0 ≤ α ≤ C }.
This QP problem is the dual formulation of a QP problem that maximizes the margin of separation between the sets of points in the feature space. Given a solution α, the optimal hyperplane is expressed as {x ∈ R^n : Σ_{i=1}^{l} α_i y_i k(x_i, x) + b = 0}, where b is computed using the complementary slackness conditions of the primal formulation.
28 Kernel Methods / Support Vector Machines Applied to Imbalanced Data
Binary Classification of Imbalanced Data with SVMs. Binary classification of imbalanced data needs a rewriting of the primal SVM problem, namely:
min_{w,b,ξ} (1/2)⟨w, w⟩_H + C_{+1} Σ_{y_i = +1} ξ_i + C_{−1} Σ_{y_i = −1} ξ_i
subject to y_i (⟨w, φ(x_i)⟩_H + b) ≥ 1 − ξ_i, i ∈ [1, l], and ξ_i ≥ 0, i ∈ [1, l].
C_{+1} is the trade-off coefficient for the minority class and C_{−1} is the trade-off coefficient for the majority class. For imbalanced data, we wish to have C_{−1} < C_{+1}, i.e. the penalty for outliers in the minority class is greater than the one for the majority class. This approach is strongly related to robust classification.
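In practice this two-coefficient formulation corresponds to per-class weighting of C; for example, scikit-learn's SVC (assuming it is available) exposes it through class_weight. A sketch on toy data, where the weight values and the data are illustrative assumptions, not values from the talk:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# imbalanced toy data: 200 majority samples (-1), 20 minority samples (+1)
X = np.vstack([rng.normal(0.0, 1.0, (200, 2)), rng.normal(2.0, 1.0, (20, 2))])
y = np.hstack([-np.ones(200), np.ones(20)])

# class_weight multiplies C per class: heavier penalty on minority-class errors,
# i.e. C_{-1} < C_{+1} in the notation of the primal problem above
clf = SVC(kernel="rbf", C=1.0, class_weight={-1: 1.0, 1: 10.0})
clf.fit(X, y)

minority_recall = (clf.predict(X[y == 1]) == 1).mean()
```

Without the weighting, the optimizer can achieve a small total error by sacrificing the 20 minority points, which is precisely the bias the reweighted primal problem counteracts.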
29 Kernel Methods / Support Vector Machines Applied to Imbalanced Data
Illustration of SVM Training with Imbalanced Data. Source: Tang et al. SVMs Modeling for Highly Imbalanced Classification. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 39(1).
30 Kernel Methods / Support Vector Machines Applied to Imbalanced Data
One-Class SVM for Anomaly Detection. Anomaly detection is equivalent to building an enclosure around a cloud of points encoding non-anomalous objects, in order to separate them from the outliers that represent anomalies. This problem is known as the soft minimal hypersphere problem and, for points mapped into the feature space H, it is expressed as
min_{c,r,ξ} r² + C Σ_{i=1}^{l} ξ_i
subject to ‖φ(x_i) − c‖²_H ≤ r² + ξ_i, i ∈ [1, l], and ξ_i ≥ 0, i ∈ [1, l].
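scikit-learn's OneClassSVM (assuming it is available) implements the closely related ν-parameterized one-class formulation, which for the RBF kernel is equivalent to the minimal-hypersphere problem; a sketch with illustrative parameter values:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_normal = rng.normal(0.0, 1.0, (200, 2))    # cloud of non-anomalous objects
X_query = rng.uniform(-6.0, 6.0, (10, 2))    # candidate anomalies to score

# nu bounds the fraction of training points allowed outside the enclosure,
# playing the role of the trade-off constant C in the hypersphere problem
oc = OneClassSVM(kernel="rbf", gamma=0.1, nu=0.05).fit(X_normal)
labels = oc.predict(X_query)                 # +1 inside the enclosure, -1 anomaly
```
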
31 Kernel Methods / Support Vector Machines Applied to Imbalanced Data
Example of a Soft Minimal Enclosing Hypersphere. Sublevel sets for different values of the constant γ.
32 Kernel Methods / Support Vector Machines Applied to Imbalanced Data
Drawbacks of SVMs. The soft-margin maximization paradigm minimizes the total error, which in turn introduces a bias toward the majority class. Calculations are offline, making SVMs unsuitable for processing data streams. They are inadequate for large problems (except when using heuristics such as Platt's Sequential Minimal Optimization).
33 Application to Tornado Prediction / Description of the Experiments
34 Application to Tornado Prediction / Description of the Experiments
Tornado Experiments I. The data were randomly divided into two sets: training/validation and independent testing. The complete training/validation set contains 361 tornadic observations and 5048 non-tornadic observations from 59 storm days. The independent testing set contains 360 tornadic observations and 5047 non-tornadic observations from 52 storm days. The percentage of tornadic observations in each data set is 6.7%.
35 Application to Tornado Prediction / Description of the Experiments
Tornado Experiments II. Cross-validation was applied on the training/validation set with different combinations of kernel functions (linear, polynomial and Gaussian radial basis function) and parameter values. Each classifier is tested on test observations drawn randomly with replacement using bootstrap resampling (Efron and Tibshirani, 1993) with 1000 replications on the independent testing set, to establish confidence intervals. The best support vector solution is the one for which the classifier has the highest mean Critical Success Index (Hit/(Hit + Miss + False Alarms)) on the validation set. The best classifier uses the Gaussian radial basis function kernel with an optimized radius. We apply these optimal parameters to predict the outcomes of the testing set.
36 Application to Tornado Prediction / Description of the Experiments
37 Application to Tornado Prediction / Results for the Tornado Data Set
38 Application to Tornado Prediction / Results for the Tornado Data Set
Results for the Tornado Data Set. Results computed from the binary confusion matrix with a 95% confidence interval.
Measure   Validation    Test
POD       57% ± 13%     57% ± 13%
FAR       18% ± 10%     31% ± 14%
CSI       50% ± 10%     45% ± 12%
Bias      ± 21%         83% ± 20%
HSS       62% ± 9%      60% ± 11%
POD: probability of detection (hit/(hit + miss)); FAR: false alarm ratio (false alarm/(hit + false alarm)); CSI: critical success index (hit/(hit + false alarm + miss)); Bias: (hit + false alarm)/(hit + miss); HSS: Heidke skill score.
Source: I. Adrianto, T. B. Trafalis, and V. Lakshmanan. Support vector machines for spatiotemporal tornado prediction. International Journal of General Systems, 38(7), 2009.
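These verification measures follow directly from the counts of the 2×2 contingency table; a small helper, with illustrative counts rather than the counts of the study:

```python
def verification_scores(hit, miss, false_alarm, correct_null):
    """Standard forecast verification measures from a 2x2 contingency table."""
    pod = hit / (hit + miss)                    # probability of detection
    far = false_alarm / (hit + false_alarm)     # false alarm ratio
    csi = hit / (hit + false_alarm + miss)      # critical success index
    bias = (hit + false_alarm) / (hit + miss)   # frequency bias
    n = hit + miss + false_alarm + correct_null
    # correct forecasts expected by chance, given the marginal totals
    expected = ((hit + miss) * (hit + false_alarm)
                + (correct_null + miss) * (correct_null + false_alarm)) / n
    hss = (hit + correct_null - expected) / (n - expected)  # Heidke skill score
    return pod, far, csi, bias, hss

pod, far, csi, bias, hss = verification_scores(
    hit=50, miss=30, false_alarm=20, correct_null=400)
```

With these counts, POD = 50/80 = 0.625 and CSI = 50/100 = 0.5; HSS measures skill relative to random chance, so a value near 0 means no skill.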
39 Part II: Dynamic Forecasting Using Kernel Methods
40 Filtering with Kernel Methods / Key Notions
Outline:
4. Filtering with Kernel Methods: Key Notions. Approach Outline. Assimilation with Kernel Methods.
5. Applications to Meteorology: Experimental Setup. Lorenz 96 Model. Quasi-Geostrophic Model.
41 Filtering with Kernel Methods / Key Notions
Objectives of Dynamic Forecasting Using Kernel Methods
Dynamical systems: physical systems are mathematically represented by states in some abstract space. Transitions between states are modeled using transition functions over the state space in order to simulate the system dynamics.
Objectives: to provide an alternative to Kalman filtering for predicting the future states of nonlinear dynamical systems; to use machine learning techniques and kernel methods to build nonlinear state predictors.
42 Filtering with Kernel Methods / Key Notions
Kalman Filtering
Definition (Kalman filter). Given a sequence of perturbed measurements, a Kalman filter is a process that estimates the states of a dynamical system.
We will consider only differentiable real-time nonlinear dynamical systems (to which correspond nonlinear Kalman filters). The state transition and observation models of the nonlinear dynamical system are
∂_t x(t) = f(x, u, t) + w(t) and z(t) = h(x, t) + v(t),
where x is the state of the system, z is the observation, f is the state transition function, h is the observation model, u is the control and (w, v) are the (Gaussian) noise terms.
43 Filtering with Kernel Methods / Key Notions
Example: radar tracking. From Pattern Recognition and Machine Learning by C. M. Bishop. Blue points: true positions; green points: noisy observations; red crosses: forecasts.
44 Filtering with Kernel Methods / Key Notions
Kalman Filter and Kernel Methods: Comparison of Assumptions. Unlike the linear Kalman filter, the nonlinear variants do not necessarily give an optimal state estimator; the filter may also diverge if the initial estimate is wrong or if the model is incorrect. For Kalman filters, the process must be Markovian, and the perturbations must be independent and follow a Gaussian distribution. Implementations must face problems related to matrix storage, matrix inversion and/or matrix factorization. Kernel methods need no statistical assumptions on the process noise, and they work with both Markovian and non-Markovian processes; their storage and computational requirements are modest.
45 Filtering with Kernel Methods / Approach Outline
46 Filtering with Kernel Methods / Approach Outline
Assimilation and Forecasting with Kernel Methods
1. Assimilation. The assimilation step attempts to recover the unperturbed system states from the current and past observations using kernel-based regression techniques. Kernel methods remove noise from the state trajectories and update them from the previous forecasts.
2. Forecasting. The last assimilated state can be used as an initial estimate for one iteration of a nonlinear Kalman filter. A polynomial predictive analysis on the last recorded state trajectories, using Lagrange interpolation with Chebyshev nodes, can provide reliable extrapolations. The generalization property of the SVM regression function can be used to estimate the next future state.
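As a loose illustration of the polynomial predictive step, the sketch below fits recent trajectory samples in the Chebyshev basis and evaluates the fit one step ahead; this uses a least-squares fit rather than exact Lagrange interpolation at Chebyshev nodes, and the degree and window length are arbitrary choices, not values from the talk:

```python
import numpy as np

def extrapolate_next(t, x, deg=3):
    """Fit a Chebyshev-basis polynomial to the last samples of a state
    trajectory and evaluate it one time step past the final sample."""
    # map sample times to [-1, 1], where the Chebyshev basis is well conditioned
    t0, t1 = t[0], t[-1]
    u = 2.0 * (t - t0) / (t1 - t0) - 1.0
    coef = np.polynomial.chebyshev.chebfit(u, x, deg)
    dt = t[1] - t[0]
    u_next = 2.0 * (t[-1] + dt - t0) / (t1 - t0) - 1.0
    return np.polynomial.chebyshev.chebval(u_next, coef)

t = np.linspace(0.0, 1.0, 8)
x = 2.0 * t ** 2 - t + 1.0                  # quadratic trajectory sample
x_next = extrapolate_next(t, x, deg=2)      # exact for a degree-2 signal
```
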
47 Filtering with Kernel Methods / Approach Outline
Advantages and Shortcomings of Kernel Methods
Advantages: low memory requirements (some kernels require only O(n) elements to be stored in memory); acceptable computational complexity (of the order of O(n²); data thinning can reduce the size of the input data set); massive parallelization (the method can be applied separately to each trajectory); no statistical assumptions and no state transition model necessary; can be combined with a Kalman filter if necessary.
Shortcomings: estimation of some kernel parameters.
48 Filtering with Kernel Methods / Assimilation with Kernel Methods
49 Filtering with Kernel Methods / Assimilation with Kernel Methods
Interpolating State Trajectories without a Model
Main idea: find a non-trivial function f such that for every given sample (t_i, x_i) ∈ R² we have f(t_i) = x_i (or f(t_i) lies in an interval centered at x_i and of half-width ε ≥ 0). The interpolation function is expressed as an affine combination of kernel-based functions k(t, ·). The positive semi-definite matrix K with entries K_ij = k(t_i, t_j) is called the kernel matrix.
Kalman filtering works differently: no state trajectory interpolation takes place with KFs, and KFs absolutely need a model.
50 Filtering with Kernel Methods / Assimilation with Kernel Methods
Fitting Functions. The non-trivial function f such that f(t_i) = x_i is chosen to belong to the function class
F = { t ↦ Σ_{i=1}^{l} α_i k(t_i, t) + b : αᵀKα ≤ B² }.
The Rademacher complexity of F measures the capability of the functions of F to fit random data with respect to a probability distribution generating these data. The empirical Rademacher complexity of F, denoted R̂(F), satisfies
R̂(F) ≤ 2B √(tr(K)) / l + 2|b| / √l.
51 Filtering with Kernel Methods / Assimilation with Kernel Methods
Minimizing the Generalization and the Empirical Errors. We use R̂(F) to control the upper bound of the generalization error of the interpolation function. Small empirical errors and a small R̂(F) contribute to decrease this bound, therefore:
We need to minimize the absolute value of b. Also, αᵀKα ≤ ‖K‖ ‖α‖², so minimizing αᵀα + b² contributes to a smaller R̂(F).
The empirical error, defined by (Σ_{i=1}^{l} ξ_i)/l, where the ξ_i are the differences between targets and desired outputs, should be minimized.
Aim: minimize the quantity αᵀα + b² + C ξᵀξ with C > 0.
52 Filtering with Kernel Methods / Assimilation with Kernel Methods
Optimization Problem. Introducing tolerances ρ_i > 0, the empirical errors ξ_i are equal to f(t_i) − x_i; these are the only constraints associated with the previous objective function. Hence the previous calculations lead to:
Optimization Problem for Data Assimilation (Gilbert et al., 2010)
min_{(α,ξ,b) ∈ R^{2l+1}} { αᵀα + b² + C ξᵀξ : Kα + b 1_l − x = ξ }.
The solution of this optimization problem is:
Analytical Solution
(K² + I_l/C + 1_l 1_lᵀ) d = x, α = K d, b = 1_lᵀ d.
The solutions α and b describe the regression function (belonging to the function class F) that interpolates the state trajectories during assimilation.
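For small l, this analytical solution can be computed with a direct linear solve; a numpy sketch, where the kernel width, the constant C and the noisy test signal are illustrative assumptions:

```python
import numpy as np

def assimilate(t, x, C=100.0, sigma=0.5):
    """Solve (K^2 + I/C + 1 1^T) d = x, set alpha = K d and b = 1^T d,
    and return the regression function f(s) = sum_i alpha_i k(t_i, s) + b."""
    l = len(t)
    K = np.exp(-(t[:, None] - t[None, :]) ** 2 / (2.0 * sigma ** 2))
    A = K @ K + np.eye(l) / C + np.ones((l, l))
    d = np.linalg.solve(A, x)
    alpha, b = K @ d, d.sum()
    def f(s):
        return np.exp(-(t - s) ** 2 / (2.0 * sigma ** 2)) @ alpha + b
    return f

t = np.linspace(0.0, 2.0 * np.pi, 40)
x_noisy = np.sin(t) + 0.1 * np.random.default_rng(1).normal(size=t.size)
f = assimilate(t, x_noisy)
residual = max(abs(f(s) - np.sin(s)) for s in t)   # error against the clean signal
```

The regularization term I/C keeps the system well conditioned even when K² is nearly singular, which is what filters the observational noise out of the fitted trajectory.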
53 Filtering with Kernel Methods / Assimilation with Kernel Methods
Computational Details. To compute the solution d of the linear system (K² + I_l/C + 1_l 1_lᵀ) d = x, we define σ = 1/√C, the matrix A = K + σ I_l (A is symmetric and positive definite) and the following sequences:
A ũ_0 = x, A u_0 = ũ_0, A ũ_n = u_n, A u_{n+1} = 2σ(u_n − σ ũ_n);
A ṽ_0 = 1_l, A v_0 = ṽ_0, A ṽ_n = v_n, A v_{n+1} = 2σ(v_n − σ ṽ_n).
We set u = Σ_{n≥0} u_n and v = Σ_{n≥0} v_n. Both series converge rapidly and are truncated at a step m > 0 in practical problems. Once m is determined, we have
d = u − (⟨1_l, u⟩ / (1 + ⟨1_l, v⟩)) v.
54 Filtering with Kernel Methods / Assimilation with Kernel Methods
Approach Summary. We chose a class of functions in order to interpolate state trajectories without a model. We defined a fitness measure for this class of functions and linked it to the parameters describing a function in that class. We defined an optimization problem that aims to minimize empirical errors and maximize the fitness of the function interpolating the state trajectories. An analytical solution of the optimization problem was derived, and its computation was reduced to solving a sequence of linear systems.
55 Applications to Meteorology / Experimental Setup
56 Applications to Meteorology / Experimental Setup
Experimental Setup
Machine and software: all codes were implemented in MATLAB 7.9 on a 2002 DELL Precision Workstation 530 with two 2.4 GHz Intel Xeon processors and 2 GiB of RAM. EnKF forecasts were generated with the EnKF Matlab toolbox version 0.23 by Pavel Sakov (available at Evensen's webpage at enkf.nersc.no).
Experimental models: the kernel approach was tested on the Lorenz 96 model and the Quasi-Geostrophic 1.5-layer reduced gravity model. Forecasts were obtained using a combination of polynomial predictive analysis and kernel-based extrapolation during the assimilation stage.
57 Applications to Meteorology / Lorenz 96 Model
58 Applications to Meteorology / Lorenz 96 Model
Lorenz 96 Model: Description. Introduced by Lorenz and Emanuel (1998). The state transition model represents the values of atmospheric quantities at discrete locations spaced equally on a latitude circle (a 1D problem). At a location i on the latitude circle, it reads
∂_t x_i = (x_{i+1} − x_{i−2}) x_{i−1} − x_i + F,
where the first term models advection, the second dissipation, and F is the external forcing. The states represent an unspecified scalar meteorological quantity, e.g. vorticity or temperature (Lorenz and Emanuel). This model was introduced in order to select which locations on a latitude circle are the most effective in improving weather assimilation and forecasts. Observations were generated with an error variance of 1. The external forcing was set to F = 8, for which the system shows chaotic behavior.
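The model above is straightforward to integrate numerically; a minimal sketch using RK4 with the conventional step Δt = 0.05 (the number of sites and the perturbation site are illustrative choices):

```python
import numpy as np

def lorenz96_rhs(x, F=8.0):
    """dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F, with cyclic indices."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt, F=8.0):
    """One fourth-order Runge-Kutta step of the Lorenz 96 system."""
    k1 = lorenz96_rhs(x, F)
    k2 = lorenz96_rhs(x + 0.5 * dt * k1, F)
    k3 = lorenz96_rhs(x + 0.5 * dt * k2, F)
    k4 = lorenz96_rhs(x + dt * k3, F)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

n, dt = 40, 0.05                  # 40 sites on the latitude circle
x = 8.0 * np.ones(n)              # rest state x_i = F
x[19] += 0.01                     # a tiny perturbation triggers the chaos
for _ in range(200):              # integrate to t = 10
    x = rk4_step(x, dt)
```

Because the leading Lyapunov exponent is positive at F = 8, the 0.01 perturbation grows until the trajectory is far from the uniform rest state.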
59 Applications to Meteorology / Lorenz 96 Model
Lorenz 96 Model: Assimilation. The following figures illustrate how kernel methods remove the observational noise and interpolate state trajectories during assimilation. The kernel used is a Gaussian RBF kernel with σ = 3δ_t.
60 Applications to Meteorology / Lorenz 96 Model
Lorenz 96 Model: Forecast Errors
61 Applications to Meteorology / Quasi-Geostrophic Model
62 Applications to Meteorology / Quasi-Geostrophic Model
Quasi-Geostrophic Model: Description. It is an atmospheric dynamical model involving an approximation of actual winds, used in the analysis of large-scale extratropical weather systems. System states are scalar quantities representing the air flow. Horizontal winds are replaced by their geostrophic values in the horizontal acceleration terms of the momentum equations, and horizontal advection in the thermodynamic equation is approximated by geostrophic advection; furthermore, vertical advection of momentum is neglected. It is a 2D problem: the atmosphere has a single level in the vertical and was represented by a square grid, with each state located on a node of that grid. Observations were generated with an error variance of 1.
63 Applications to Meteorology / Quasi-Geostrophic Model
Quasi-Geostrophic Model: Assimilation Example. Despite the absence of a dynamical model, the kernel interpolation obtained from the noisy observations closely matches the true states.
64 Applications to Meteorology / Quasi-Geostrophic Model
Quasi-Geostrophic Model: Forecast Example. Kernel and EnKF forecasts at time step 51. The EnKF forecast is 8 units away from the true state while the kernel forecast is 0.5 units away.
65 Applications to Meteorology / Quasi-Geostrophic Model
Quasi-Geostrophic Model: Forecast Errors
66 Applications to Meteorology / Quasi-Geostrophic Model
Conclusions I
What was achieved? A viable kernel-based approach to data assimilation and forecasting has been introduced for nonlinear dynamical systems. On meteorological models it showed predictable performance, ranging from matching an EnKF with 50 ensemble members to exceeding an EnKF with 100 ensemble members, with lower error the less chaotic the dynamical system. Encouraging results in removing observational noise and interpolating state trajectories were obtained; they represent an improvement over a standard EnKF based on fewer than 20 ensemble members.
67 Applications to Meteorology / Quasi-Geostrophic Model
Conclusions II
Future work: we are currently applying these techniques to financial and petroleum engineering problems with the same type of multi-dimensional time series. We are developing approaches that identify independent factors influencing the shape of multi-dimensional time series in a nonlinear fashion. The same tools will be used for the prediction of rare events and their magnitude.
68 Questions?
69 For Further Reading
R. C. Gilbert, M. B. Richman, L. M. Leslie, and T. B. Trafalis. Kernel methods for data driven numerical modeling. Monthly Weather Review, submitted.
E. N. Lorenz and K. A. Emanuel. Optimal sites for supplementary weather observations: simulation with a small model. Journal of the Atmospheric Sciences, 55(3).
P. Sarma and W. H. Chen. Generalization of the ensemble Kalman filter using kernels for non-Gaussian random fields. In SPE Reservoir Simulation Symposium Proceedings, Society of Petroleum Engineers.
F. Steinke and B. Schölkopf. Kernels, regularization and differential equations. Pattern Recognition, 41(11).
J. Shawe-Taylor and N. Cristianini. Kernel Methods for Pattern Analysis. Cambridge University Press, Cambridge, UK.
V. Vapnik. The Nature of Statistical Learning Theory. Springer, New York, NY, USA.