Signal processing tools for large-scale network monitoring: a state of the art
1 Signal processing tools for large-scale network monitoring: a state of the art
Kavé Salamatian, Université de Savoie
Outline:
- Introduction & methodologies
- Anomaly detection basics
- Statistical anomaly detection basis
- Parametric methods: a refresher on linear system theory, model calibration, filtering and processing, a refresher on Neyman-Pearson test theory
- Non-parametric methods: sketches and hashes, random distributions and KL distance, Sanov theorem and Stein lemma
- Distributed anomaly detection (if we have time): distributed KLT
2 Introduction and methodology
The network state results from:
- Traffic demand: the traffic matrix, not directly observable
- Capacity offer: routing matrix, link capacities, traffic engineering, etc., monitored by SNMP, etc.
The network manager's goal is to drive this equilibrium to the most beneficial point by managing the capacity offer. Traffic engineering is the art of managing offered capacity.
3 Network monitoring
Monitoring means being able to separate:
- What is predictable: expected, normal, under control
- What is not predictable: unexpected, abnormal
Interpretation framework: only what is unpredictable carries meaning; what can be predicted carries no information.
Classes of anomaly detection:
- Deterministic approaches: signature-based; complexity vs. exhaustivity trade-off
- Statistical approaches: probabilistic; false positive vs. detection rate trade-off
4 Deterministic approaches
Each observation is assumed to result from a known causality chain. Every anomaly is characterized by a causality chain that describes its signature, and anomaly detection consists of backtracking the causality sequence. This needs an exhaustive anomaly-signature database and has to check all possible signatures: not scalable.
Statistical approach
A dataset is a unique sequence of deterministic observations that is no longer perfectly reproducible. The statistical approach assumes that observations result from a random function, i.e. that they come from a random choice in an (in)finite set of "possible" observations. Is this assumption sound? It gives access to an arsenal of probabilistic methods.
Stationarity is the intrinsic hypothesis: statistical properties hold over time. This is a bet on the future, with the possibility of radical error.
Essential: the false alarm vs. misdetection trade-off.
5 Empirical modeling
Challenge: a measurement contains two components:
- A structured component that reflects essential properties of the phenomenon under study
- A random component that represents fluctuations put aside from the model
The modeling challenge is to separate structure from randomness and to characterize it; signal processing's role is to do this filtering.
About interpretation: we have measurements, but what do they mean? Interpreting means relating effects to causes, being able to predict behaviours at different timescales, and being able to react. Interpretation needs an a priori.
6 Plato's cave allegory
Socrates: "Compare our nature in respect of education and its lack to such an experience as this... picture men dwelling in a sort of cavern... Picture further the light from a fire burning higher up and at a distance behind them, and between the fire and the prisoners and above them men carrying past the wall implements of all kinds."
Glaucon: "A strange image you speak of, and strange prisoners."
Socrates: "Like to us, for, to begin with, tell me, do you think that these men would have seen anything of themselves or of one another except the shadows cast from the fire on the wall of the cave that fronted them?"
Modelling approaches in networking
Descriptive approach (top-down), the one most used in measurement papers: the network is a black box with unknown structure, and observations are described only through descriptive statistics (mean, variance, Hurst or multifractal parameters, etc.). Begin with observations and derive descriptive parameters. Drawbacks: it does not explain why, it does not answer what if, the parameters are difficult to interpret (interpretation needs an a priori), and it does not use all the available information, since we may have a priori knowledge of the process generating the observations.
Constructive approach (bottom-up), the classical approach: derive IP performance through an explicative model of the processes involved in the network. The network is constituted of queues and routers; this uses simulation (ns) or analytic tools (queuing theory, network calculus, etc.). Begin from an input scenario and the network structure and derive performance measures. Drawbacks: generalization is difficult, there are too many parameters, simulation results do not describe real measurements, and the approach is open-loop.
7 Interpretation framework
What are the hidden causes (X and θ) that have led to observing Y? The a priori model compresses all of our understanding of the process involved in the generation of the observations into Y = M(X, θ), where θ is the context and X the hidden input.
Interpretation: we have to deal with two main inverse problems:
- Modelling problem: what are the context parameters θ that best describe the environment?
- Interpretation problem: knowing the context θ, what is the hidden input X that best describes the observations?
A lot of measurement problems can be expressed in this framework: anomaly detection, flow classification and recognition, etc.
8 Statistical anomaly detection: basics
What is an anomaly? What is normal? What is likely?
Assumptions:
- Anomalies are different from the norm
- Anomalies are rare
- Anomalous activity is malicious (reverse assumption: anomalies represent attacks or malicious behavior)
Are these assumptions correct?
9 What is likely? Typical set
The typical set $A_\epsilon^{(n)}$ is the set of observed sequences $Y_k^{k+n}$ such that
$2^{-n(H(Y)+\epsilon)} \le \Pr\{Y_k^{k+n}\} \le 2^{-n(H(Y)-\epsilon)}$
From the AEP: $\Pr\{A_\epsilon^{(n)}\} \ge 1-\epsilon$ and $|A_\epsilon^{(n)}| \le 2^{n(H(Y)+\epsilon)}$.
Universal anomaly detector
A simple anomaly detector: calculate the likelihood $\Pr\{Y_k^{k+n}\}$ of an observed sequence and check whether
$2^{-n(H(Y)+\epsilon)} \le \Pr\{Y_k^{k+n}\} \le 2^{-n(H(Y)-\epsilon)}$
If not, it is an anomaly. This is a non-parametric and universal anomaly detector. But how do we calculate the likelihood? We need simplifying hypotheses.
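The typicality test above can be sketched for an i.i.d. discrete source, assuming the source distribution p is known (the alphabet, windows, and ε value here are illustrative choices, not from the slides):

```python
import math

def is_typical(seq, p, eps=0.5):
    """Check whether a sequence lies in the eps-typical set of an i.i.d. source p."""
    n = len(seq)
    H = -sum(q * math.log2(q) for q in p.values())   # source entropy H(Y), in bits
    log_prob = sum(math.log2(p[s]) for s in seq)     # log2 Pr{seq} under independence
    # Typicality: 2^{-n(H+eps)} <= Pr{seq} <= 2^{-n(H-eps)}
    return -n * (H + eps) <= log_prob <= -n * (H - eps)

p = {"a": 0.9, "b": 0.1}
normal = ["a"] * 9 + ["b"]     # matches the source statistics
anomaly = ["b"] * 10           # far too many rare symbols
print(is_typical(normal, p), is_typical(anomaly, p))   # → True False
```

The anomalous window is rejected because its likelihood falls below the lower AEP bound, exactly the detector described on the slide.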
10 Filtering
What if the observations are independent? Then $\Pr\{Y_k^{k+n}\} = \prod_{i=k}^{k+n} \Pr\{Y_i\}$, and $(\Pr\{Y_k\})^n$ for identically distributed samples, so the likelihood becomes easy to compute.
We have to transform the observations so that they become independent: a basic signal processing question.
Pipeline: input signal → pre-processing → flattened signal → anomaly detector.
Traffic dynamics (figure)
11 OD dynamics
We need to capture all correlations:
- Temporal correlation: short-scale, daily, weekly, etc.
- Spatial correlation: same-PoP traffic
Anomaly detector structure: two generic steps
1. Entropy reduction / filtering: remove dependencies in the data, compress the data (information bottleneck), remove the predictable component
2. Decision / detection: apply the typicality test on the resulting, hopefully independent, signal; a statistical test in the Neyman-Pearson framework, or another (Bayesian, etc.)
12 Taxonomy of approaches
- Parametric: assume a parametric dependency model
- Non-parametric: break the dependencies in a universal way
Parametric approaches
Assume a parametric structure for the dependencies (a normal-behavior model), calibrate the model (MLE estimation), and filter the model prediction from the observations. This results in an i.i.d. Gaussian innovation process; then apply a Neyman-Pearson statistical test with H0: the observation is compatible with a zero mean and a given variance.
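A minimal sketch of the final decision step, assuming the innovation really is i.i.d. zero-mean Gaussian with known variance under H0 (the function name, window contents, and the ~1% two-sided false-alarm level are illustrative assumptions):

```python
import math

def np_test(innovation, sigma, threshold=2.576):
    """Flag a window whose standardized mean exceeds the H0 threshold.

    Under H0 the innovation is i.i.d. N(0, sigma^2), so the standardized
    window mean is N(0, 1); threshold=2.576 gives ~1% false alarms (two-sided).
    """
    n = len(innovation)
    z = sum(innovation) / (sigma * math.sqrt(n))
    return abs(z) > threshold   # True => reject H0 => anomaly

normal_window = [0.5, -0.3, 0.1, -0.2] * 25      # mean ~ 0: compatible with H0
shifted_window = [2.0] * 100                     # mean shift: an anomaly
print(np_test(normal_window, sigma=1.0), np_test(shifted_window, sigma=1.0))
```

The threshold trades false alarms against misdetections, which is exactly the Neyman-Pearson trade-off named earlier in the talk.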
13 Non-parametric approach
- Apply a de-correlating transform: random projections, hashing, sketches, etc.
- Learn the resulting distribution of normal behavior: clustering, SVM, etc.
- Detect anomalies by checking deviation from the distribution of normal behavior, e.g. with the Kullback-Leibler distance.
Parametric techniques
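The deviation check can be sketched with an empirical Kullback-Leibler distance between a learned baseline histogram and the current window (the protocol-mix example and the helper names are hypothetical):

```python
import math
from collections import Counter

def kl_divergence(p, q, eps=1e-9):
    """D(p || q) in bits between two distributions over the same support."""
    return sum(pi * math.log2(pi / max(q.get(k, 0.0), eps))
               for k, pi in p.items() if pi > 0)

def empirical(samples):
    """Empirical distribution (normalized histogram) of a sample list."""
    counts = Counter(samples)
    n = len(samples)
    return {k: v / n for k, v in counts.items()}

baseline = empirical(["tcp"] * 80 + ["udp"] * 15 + ["icmp"] * 5)
window = empirical(["tcp"] * 40 + ["udp"] * 10 + ["icmp"] * 50)  # ICMP flood?
print(round(kl_divergence(window, baseline), 3))
```

A window matching the baseline gives a divergence near zero; a large value signals a deviation from normal behavior.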
14 Anomaly detection steps
Modeling → filtering → decision. How can we find a model?
- Physical models: a direct relation with physical parameters; high order, approximative, need complete knowledge of the process, and the physical parameters must be known. For example, TCP models or Bianchi's CSMA/CA model.
- System identification: based on measured input/output data; an inverse statistical problem: find the model that most likely generated the observed data. Simple and efficient, with relatively little knowledge required, but of limited validity (operating point, type of input, sensors, measurement noise, unknown model structure). This is the most frequent framework in practice.
15 Model structure
The first step is to define a correlation/model structure integrating temporal and spatial correlation. Traffic is a dynamic entity. Assume a state vector
$X_k = (x^1_k, x^1_{k-1}, \ldots, x^1_{k-T},\; x^2_k, \ldots, x^2_{k-T},\; \ldots,\; x^L_k, \ldots, x^L_{k-T})$
Generic dynamical model:
$X_{k+1} = f(X_k) + g(U_k)$
$Y_k = h(X_k) + i(U_k)$
Linear approximation: linearize the generic dynamic model using a Taylor series expansion,
$f(X_k) = f(0) + \left.\frac{\partial f}{\partial X_k}\right|_0 X_k + \left.\frac{\partial^2 f}{\partial X_k^2}\right|_0 \frac{X_k^2}{2} + \cdots$
One can extend the state vector to integrate powers of the state, $X_k^2, \ldots, X_k^N$: any non-linear dynamical model can be represented by a linear dynamical model with a large enough state vector.
16 Linear time-invariant systems
After linearization:
$X_{k+1} = A X_k + B U_k$
$Y_k = C X_k + D U_k$
A, B, C and D can change with time (a time-variant system). The system might be without input, or have a random input.
Linear system representations
- Time-domain representations: input/output; state-space
- Frequency-domain representations: frequency response; poles and zeros
All four are equivalent; sometimes one of these representations is more useful than the others.
17 Time-domain representations
(1) Input/output: $y(k) = H[x(k)]$. If one knows the impulse response $h(k) = H[\delta(k)]$ of a system, one can derive the response of the system to any input:
$y(k) = x(k) * h(k) = \sum_{l=0}^{k} x(k-l)\, h(l)$
(2) State-space representation:
$X_{k+1} = A X_k + B U_k$
$Y_k = C X_k + D U_k$
For a random i.i.d. input $\epsilon_k$:
$X_{k+1} = A X_k + B \epsilon_k$
$Y_k = C X_k + D \epsilon_k$
This is an autoregressive model when $B = I$, and an ARMA model in general.
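The convolution sum above can be sketched directly; feeding a unit impulse through it recovers the impulse response, as the definition promises (the smoother used for h is a made-up toy filter):

```python
def convolve(x, h):
    """Discrete convolution y(k) = sum_{l=0..k} x(k-l) * h(l)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for k in range(len(y)):
        for l in range(len(h)):
            if 0 <= k - l < len(x):
                y[k] += x[k - l] * h[l]
    return y

# Truncated impulse response of a first-order smoother: h(l) = 0.5 * 0.5^l
h = [0.5 * 0.5 ** l for l in range(4)]
x = [1.0, 0.0, 0.0, 0.0]           # unit impulse delta(k)
print(convolve(x, h)[:4])          # → [0.5, 0.25, 0.125, 0.0625]
```

Any other input is then handled by the same sum, which is the point of knowing h(k).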
18 Frequency representations
(1) Frequency response: $\mathcal{F}(h(k)) = H(\omega)$, and $Y(\omega) = H(\omega) X(\omega)$.
For random signals: $S_X(\omega) = \mathcal{F}(R_{XX}(\tau))$ and $S_Y(\omega) = |H(\omega)|^2 S_X(\omega)$.
(2) Z-transform (or Laplace transform): $X(z) = \mathcal{Z}(x(k)) = \sum_{k=0}^{\infty} x(k) z^{-k}$, and $Y(z) = H(z) X(z)$.
Poles and zeros are the values of z for which the transfer function H(z) becomes infinite or zero, respectively. For stochastic processes the mean of the z-transform gives the characteristic function, and the moments of the distribution are derivatives of the characteristic function.
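The identity $Y(\omega) = H(\omega) X(\omega)$ can be checked numerically: for finite sequences, linear convolution in time corresponds exactly to a product of discrete-time Fourier transforms (the filter taps, input, and frequency below are arbitrary illustrative values):

```python
import cmath

def dtft(x, w):
    """Discrete-time Fourier transform of a finite sequence, at frequency w."""
    return sum(xk * cmath.exp(-1j * w * k) for k, xk in enumerate(x))

def convolve(x, h):
    """Linear convolution y = x * h."""
    y = [0.0] * (len(x) + len(h) - 1)
    for k in range(len(y)):
        for l in range(len(h)):
            if 0 <= k - l < len(x):
                y[k] += x[k - l] * h[l]
    return y

h = [0.5, 0.3, 0.2]            # impulse response of a toy FIR filter
x = [1.0, -1.0, 2.0, 0.5]      # arbitrary input
w = 1.3                        # any frequency in radians/sample
lhs = dtft(convolve(x, h), w)  # Y(w): transform of the time-domain output
rhs = dtft(h, w) * dtft(x, w)  # H(w) * X(w)
print(abs(lhs - rhs) < 1e-9)   # → True
```

The same relation, applied to power rather than amplitude, gives the PSD formula $S_Y = |H|^2 S_X$ on the slide.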
19 Example
An AR(4) process $x(k) = 0.91\,x(k-1) + \cdots$ in $x(k-1), \ldots, x(k-4)$ (the remaining coefficients are not legible in the transcription).
Model calibration: several approaches
- Linear regression: $x(k) = a_1 x(k-1) + a_2 x(k-2) + a_3 x(k-3) + a_4 x(k-4)$; there is an issue with the correlation between values
- Maximum likelihood: find the model that has the largest probability of producing the output; EM algorithm
- Approximation with minimal squared error: PCA / Karhunen-Loève expansion, finding the best approximation of the normal behavior with a given dimension
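The linear-regression calibration can be sketched as a least-squares fit of AR coefficients (a sketch under stated assumptions: noiseless synthetic AR(2) data, plain normal equations with Gaussian elimination; real tools would use Levinson-Durbin or a linear-algebra library):

```python
def fit_ar(x, order):
    """Least-squares fit of a_1..a_p in x(k) = sum_i a_i x(k-i)."""
    p = order
    rows = [x[k - 1::-1][:p] for k in range(p, len(x))]   # [x(k-1), ..., x(k-p)]
    y = x[p:]
    # Normal equations: (R^T R) a = R^T y
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    b = [sum(r[i] * yk for r, yk in zip(rows, y)) for i in range(p)]
    # Gaussian elimination, then back-substitution
    for i in range(p):
        piv = A[i][i]
        for j in range(i + 1, p):
            f = A[j][i] / piv
            A[j] = [aj - f * ai for aj, ai in zip(A[j], A[i])]
            b[j] -= f * b[i]
    a = [0.0] * p
    for i in reversed(range(p)):
        a[i] = (b[i] - sum(A[i][j] * a[j] for j in range(i + 1, p))) / A[i][i]
    return a

# Synthetic AR(2): x(k) = 0.6 x(k-1) - 0.2 x(k-2), noiseless for illustration
x = [1.0, 0.5]
for _ in range(50):
    x.append(0.6 * x[-1] - 0.2 * x[-2])
print([round(c, 3) for c in fit_ar(x, 2)])   # → [0.6, -0.2]
```

With noiseless data the true coefficients are recovered exactly; with real, correlated traffic the slide's caveat about correlation between regressors applies.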
20 Biased/unbiased correlation (figure)
Principal component analysis
Suppose we have correlated, centered Gaussian RVs $(X_1, \ldots, X_K)$. PCA finds an orthonormal basis $P = (\phi_1, \ldots, \phi_K)$, with $Y = PX$, such that the variance along the principal components is maximized. After applying PCA, one can write X in the new coordinate system as $X = \sum_{j=1}^{K} Y_j \phi_j$.
How to derive the basis? Use the covariance matrix $\Sigma = E\{XX^T\}$: the basis vectors are its eigenvectors, $\Sigma \phi_i = \lambda_i \phi_i$, and $\mathrm{var}\{Y_i\} = \lambda_i$.
21 PCA as an approximation
Transform matrix $U = [\phi_1\, \phi_2 \cdots \phi_K]$, $Y = U X$. An L-dimensional approximation of X can be derived by choosing L < K columns of U: $U_L = [\phi_1\, \phi_2 \cdots \phi_L]$, $Y_L = U_L X$, and
$\hat{X} = U_L^T Y_L = U_L^T U_L X = \sum_{j=1}^{L} Y_j \phi_j$
with error
$E\{(X - \hat{X})^2\} = \sum_{j=L+1}^{K} \lambda_j$
This gives the MMSE approximation of dimension L.
Extension to stochastic processes: the Karhunen-Loève expansion theorem. One can rewrite a zero-mean stationary stochastic process X(t) as an orthogonal series expansion
$X(t) = \sum_{i=1}^{\infty} Y_i \Phi_i(t)$
where the $Y_i$ are pairwise independent RVs and the $\Phi_i(t)$ are pairwise orthogonal deterministic functions. The KLE theorem for discrete processes is
$X[k] = \sum_{i=1}^{\infty} Y_i \Phi_i[k]$
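PCA via the covariance eigen-decomposition can be sketched in the 2-D case, where the eigenvalues have a closed form (the synthetic data and function name are illustrative; real use would call a linear-algebra library for K > 2):

```python
import math

def pca_2d(samples):
    """PCA for centered 2-D data: eigen-decomposition of the 2x2 covariance."""
    n = len(samples)
    cxx = sum(x * x for x, _ in samples) / n
    cyy = sum(y * y for _, y in samples) / n
    cxy = sum(x * y for x, y in samples) / n
    # Eigenvalues of Sigma = [[cxx, cxy], [cxy, cyy]]
    tr, det = cxx + cyy, cxx * cyy - cxy * cxy
    l1 = tr / 2 + math.sqrt(tr * tr / 4 - det)
    l2 = tr / 2 - math.sqrt(tr * tr / 4 - det)
    # Leading eigenvector phi_1: (Sigma - l1*I) v = 0
    v = (cxy, l1 - cxx) if abs(cxy) > 1e-12 else (1.0, 0.0)
    norm = math.hypot(*v)
    return (l1, l2), (v[0] / norm, v[1] / norm)

# Strongly correlated centered data: nearly all variance along one direction
samples = [(t, t + 0.01 * ((-1) ** i)) for i, t in enumerate(
    [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])]
(l1, l2), phi1 = pca_2d(samples)
print(l1 > 100 * l2)   # → True: one dominant principal component
```

Keeping only $\phi_1$ here is the L = 1 MMSE approximation; the discarded error is exactly the small eigenvalue $\lambda_2$, as the error formula on the slide states.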
22 How to derive the KLE basis
For a time-continuous process, use a Galerkin approximation: the basis functions solve the integral eigen-equation
$\int \Phi_l(s)\, R(s-t)\, ds = \lambda_l \Phi_l(t)$
with R the autocovariance kernel. For discrete processes, apply PCA to the lagged-trajectory matrix built by stacking shifted windows $x(k), x(k+1), \ldots, x(k+n)$ of the observed signal, yielding $X[k] = \sum_{i=1}^{K} Y_i \Phi_i[k]$.
KLE for modeling
Assume a signal with model $X_{k+1} = A X_k + \epsilon_k$. In PCA coordinates, using the KLE approximation:
$Y_{k+1} = U^T A U\, Y_k + U^T \epsilon_k$
This has effects on the poles and zeros.
23 Normal behaviour?
What is the normal behavior characterized by the model? The time and frequency representations are equivalent. PCA model (figures)
24 Second step: filtering
A filter separates the signal space into two subspaces: signals that pass through the filter and signals that are rejected by it. The anomaly detection filter should pass everything that is compatible with the normal behavior and block everything that is not.
Simplistic approach: subtract the signal predicted by the normal-behavior model from the observation,
$e(t) = X(t) - \hat{X}(t)$
But what to do if we have measurement errors? How do we ensure that the error does not propagate in the prediction?
25 Kalman filter
Filtering what is compatible with the model, in two steps:
- Prediction: $\hat{X}^-(t+1) = A \hat{X}(t)$
- Correction: $\hat{X}(t+1) = \hat{X}^-(t+1) + K(t+1)\,\big(Y(t+1) - C \hat{X}^-(t+1)\big)$
Innovation: $\epsilon(t+1) = Y(t+1) - C \hat{X}^-(t+1)$
Kalman filter effect (figure)
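The predict/correct loop can be sketched in the scalar case, with the gain K computed from the error covariance (a minimal sketch assuming a known scalar model x(t+1) = a x(t) + w, y(t) = c x(t) + v; the noise variances q, r and the toy data are illustrative):

```python
def kalman_1d(ys, a, c, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the innovation sequence, which should
    look like white noise when the observations match the model (H0)."""
    x, p = x0, p0
    innovations = []
    for y in ys:
        # Prediction
        x_pred = a * x
        p_pred = a * a * p + q
        # Innovation and gain
        innov = y - c * x_pred
        s = c * c * p_pred + r
        k = p_pred * c / s
        # Correction
        x = x_pred + k * innov
        p = (1 - k * c) * p_pred
        innovations.append(innov)
    return innovations

# Noiseless AR(1) observations: the filter locks on and innovations vanish
ys = [1.0]
for _ in range(30):
    ys.append(0.9 * ys[-1])
inn = kalman_1d(ys, a=0.9, c=1.0, q=1e-6, r=1e-6)
print(abs(inn[-1]) < 1e-3)   # → True
```

The innovation sequence is then what the Neyman-Pearson test of the parametric pipeline is applied to.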
26 Kalman filter results (figures): innovation; recalibration needed
27 State of the art
Model + filtering + decision:
- PCA + difference + thresholding (Lakhina et al., 04)
- KLE + difference + thresholding (Brauckhoff et al., 09)
- ML + Kalman + thresholding (Soule et al., 05)
- KLE + Kalman + thresholding (Ndong et al., 11)
- KLE + Kalman + fuzzy (Ndong et al., 11)
Effect of sampling: Hohn et al., 03; Brauckhoff et al., 10
Results (figures)
28 Distributed case
How do we distribute anomaly detection over a large-scale network by exchanging information between nodes? Constraints: the amount of traffic used, selfishness, and privacy issues.
State sharing
- Each node maintains a state vector
- How to share useful information about states with the interested nodes?
- In the simplest setting, all nodes are interested in all states
- Nodes should be able to define which variables they are interested in
29 A crash course in linear estimation
Let's assume that we observe Y and want to estimate X. The MMSE estimator is known to be $\hat{X} = E\{X \mid Y\}$. For jointly Gaussian variables this reduces to a linear estimator.
A crash course in source compression
How to represent n vectors using nR bits? The optimal local compression scheme consists of two stages:
- a projection stage, where the state vector is projected linearly into an orthonormal space
- a quantization stage, which assigns the bit budget to the different projection dimensions following a water-filling argument
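For centered jointly Gaussian scalars, the linear MMSE estimator is the gain cov(X, Y)/var(Y) applied to the observation (the toy observation model y = x + v and the numbers below are illustrative assumptions):

```python
def lmmse(cov_xy, var_y):
    """Return the linear MMSE estimator y -> x_hat for centered scalars."""
    g = cov_xy / var_y
    return lambda y: g * y

# Toy model: y = x + v with var(x) = 4, var(v) = 1, x and v independent
# => cov(x, y) = 4 and var(y) = 5, so x_hat = 0.8 * y
est = lmmse(cov_xy=4.0, var_y=5.0)
print(est(2.5))   # → 2.0
```

The gain shrinks the observation toward zero in proportion to how noisy it is, which is the behavior the vector-valued estimator generalizes.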
30 Projection phase
Applying the KLT/PCA maps the correlated state vector into an uncorrelated vector. Choosing just K columns gives an estimator: the MMSE estimator of dimension K.
Quantization phase
Choose the number of components to transmit, then assign bits to each chosen component. This yields the signal that is actually sent.
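The bit-budget assignment can be sketched with a greedy allocation that repeatedly gives one bit to the component whose quantization error is currently largest (a simple stand-in for the slide's water-filling argument, not the exact scheme; the variances and the "one bit quarters the error" model are illustrative assumptions):

```python
def allocate_bits(variances, budget):
    """Greedily allocate `budget` bits across components by current error."""
    bits = [0] * len(variances)
    err = list(variances)
    for _ in range(budget):
        i = max(range(len(err)), key=lambda j: err[j])
        bits[i] += 1
        err[i] /= 4.0   # each extra bit reduces quantization error ~6 dB
    return bits

# Component variances after the KLT (eigenvalues), largest first
print(allocate_bits([8.0, 2.0, 0.5], budget=6))   # → [3, 2, 1]
```

High-variance principal components get more bits, exactly the qualitative outcome water filling produces.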
31 Single-hop case
m nodes are all connected (directly or through an overlay) to a node c. An approximation of the states of the m nodes should be derived at c, so compression is needed: sampling, local compression, or distributed compression.
Local compression
Each node locally:
- calculates its covariance
- applies a KLT
- applies the resulting projection
- does the quantization
- forwards the compressed state vector to node c
The state vector is then reconstructed at node c.
32 Distributed compression
Node i sends node c a noisy projection of its state. What are the optimal projection and the optimal quantization? From what is received at node c, the states can be estimated; a node therefore does not need to send the full state, just the estimation error.
An iterative approach
Each terminal optimizes its local encoder while all the other encoders are held fixed. The algorithm terminates when it has converged, and leads to a local minimum. It can be seen as a distributed learning phase using an EM loop.
(Figure: an iteration of the distributed KLT algorithm; C2 and C3 are kept fixed while encoder 1 is chosen optimally.)
33 Optimum projection (figure)
Multi-hop scenario
A node maintains three data structures:
- a vector of local states
- a preference list of state variables, with a weighting list and a maximal variance
- a list of received projections
34 Reception processing
- Extract the node and state-variable IDs and assign each received value to the correct variables.
- If a new projection or a new remote state variable is observed, update the data structures.
- Re-estimate the covariance matrix every 30 transmissions.
- Knowing the covariance, the estimation of the remote state variables proceeds.
Updating the covariance estimation
Remote variables are not available locally, but projections are; from them one estimates the covariance between the states in nodes i and j, and the covariance between state variables in neighbor nodes.
35 Preference-list processing
At time k = 0 the list contains the IDs of the state variables node i is interested in, with a high weight. This list is forwarded to the neighbors. After receiving a neighbor's preference list, a lower weight is assigned to the new values.
Forwarding scheme
- Forward variables that are correlated with the neighbors' interests
- An incentive/punishment mechanism
- The node implements the distributed compression: apply the optimal projection, and forward the projections when they change
36 13/06/11 Incentive or punishment?
The node adds estimated variables from its own preference list to the forwarded variables. This adds its estimation error as noise into the neighbors' estimation process, so the neighbors have an incentive to help it reduce its estimation errors: cooperation by punishment rather than by incentive.
Enforcing collaboration
The proposed punishment mechanism acts as a shadow price. If a mechanism for enforcing shadow prices exists, we can implement a Pareto-optimal cooperation mechanism: being social becomes helpful. How to enforce shadow prices? By entangling the performance of our neighbors' estimation with our own estimation; linear projections provide an elegant solution. Cooperation by punishment, not by incentives.
37 Cooperation scheme
- Cooperate: forward projections
- Learn the environment: infer the covariance with the neighbors and use it to estimate the best projections
- Benefit from it: estimate the variables in your preference list
- Want more benefits? Behave better with your neighbors
Performance (figures)
38 Convergence (figures): being social is helpful
39 Conclusion & perspectives
We defined a cooperative framework for networks, illustrated with distributed state sharing. It is applicable to a large set of scenarios: DTNs, distributed compression/transmission. The essence of the distributed setting is social intelligence; node selfishness is essential.
Conclusion
A new job is emerging: network monitor. It needs new skills, mixing computer science, networking, signal processing, statistics, etc.: a new educational challenge. Network monitoring and anomaly detection are becoming mature, but there are issues in transferring research results to practice: high sophistication, a lack of sources of real traffic to evaluate and calibrate the methods, tools that are mostly very simplistic, and issues with ground truth and the real value/cost of anomaly detection and non-detection.
More informationCS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen
CS 591.03 Introduction to Data Mining Instructor: Abdullah Mueen LECTURE 3: DATA TRANSFORMATION AND DIMENSIONALITY REDUCTION Chapter 3: Data Preprocessing Data Preprocessing: An Overview Data Quality Major
More informationSome Research Challenges for Big Data Analytics of Intelligent Security
Some Research Challenges for Big Data Analytics of Intelligent Security Yuh-Jong Hu hu at cs.nccu.edu.tw Emerging Network Technology (ENT) Lab. Department of Computer Science National Chengchi University,
More informationAn Introduction to Machine Learning
An Introduction to Machine Learning L5: Novelty Detection and Regression Alexander J. Smola Statistical Machine Learning Program Canberra, ACT 0200 Australia Alex.Smola@nicta.com.au Tata Institute, Pune,
More informationA Learning Based Method for Super-Resolution of Low Resolution Images
A Learning Based Method for Super-Resolution of Low Resolution Images Emre Ugur June 1, 2004 emre.ugur@ceng.metu.edu.tr Abstract The main objective of this project is the study of a learning based method
More informationIntrusion Detection via Machine Learning for SCADA System Protection
Intrusion Detection via Machine Learning for SCADA System Protection S.L.P. Yasakethu Department of Computing, University of Surrey, Guildford, GU2 7XH, UK. s.l.yasakethu@surrey.ac.uk J. Jiang Department
More informationHow To Understand The Theory Of Probability
Graduate Programs in Statistics Course Titles STAT 100 CALCULUS AND MATR IX ALGEBRA FOR STATISTICS. Differential and integral calculus; infinite series; matrix algebra STAT 195 INTRODUCTION TO MATHEMATICAL
More informationPredict Influencers in the Social Network
Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons
More informationEnhancing the SNR of the Fiber Optic Rotation Sensor using the LMS Algorithm
1 Enhancing the SNR of the Fiber Optic Rotation Sensor using the LMS Algorithm Hani Mehrpouyan, Student Member, IEEE, Department of Electrical and Computer Engineering Queen s University, Kingston, Ontario,
More informationComponent Ordering in Independent Component Analysis Based on Data Power
Component Ordering in Independent Component Analysis Based on Data Power Anne Hendrikse Raymond Veldhuis University of Twente University of Twente Fac. EEMCS, Signals and Systems Group Fac. EEMCS, Signals
More informationSpatial Statistics Chapter 3 Basics of areal data and areal data modeling
Spatial Statistics Chapter 3 Basics of areal data and areal data modeling Recall areal data also known as lattice data are data Y (s), s D where D is a discrete index set. This usually corresponds to data
More informationAutomated Stellar Classification for Large Surveys with EKF and RBF Neural Networks
Chin. J. Astron. Astrophys. Vol. 5 (2005), No. 2, 203 210 (http:/www.chjaa.org) Chinese Journal of Astronomy and Astrophysics Automated Stellar Classification for Large Surveys with EKF and RBF Neural
More informationPricing and calibration in local volatility models via fast quantization
Pricing and calibration in local volatility models via fast quantization Parma, 29 th January 2015. Joint work with Giorgia Callegaro and Martino Grasselli Quantization: a brief history Birth: back to
More informationCONTROL SYSTEMS, ROBOTICS, AND AUTOMATION - Vol. V - Relations Between Time Domain and Frequency Domain Prediction Error Methods - Tomas McKelvey
COTROL SYSTEMS, ROBOTICS, AD AUTOMATIO - Vol. V - Relations Between Time Domain and Frequency Domain RELATIOS BETWEE TIME DOMAI AD FREQUECY DOMAI PREDICTIO ERROR METHODS Tomas McKelvey Signal Processing,
More informationClassification of Bad Accounts in Credit Card Industry
Classification of Bad Accounts in Credit Card Industry Chengwei Yuan December 12, 2014 Introduction Risk management is critical for a credit card company to survive in such competing industry. In addition
More informationAUTOMATION OF ENERGY DEMAND FORECASTING. Sanzad Siddique, B.S.
AUTOMATION OF ENERGY DEMAND FORECASTING by Sanzad Siddique, B.S. A Thesis submitted to the Faculty of the Graduate School, Marquette University, in Partial Fulfillment of the Requirements for the Degree
More informationContent. Professur für Steuerung, Regelung und Systemdynamik. Lecture: Vehicle Dynamics Tutor: T. Wey Date: 01.01.08, 20:11:52
1 Content Overview 1. Basics on Signal Analysis 2. System Theory 3. Vehicle Dynamics Modeling 4. Active Chassis Control Systems 5. Signals & Systems 6. Statistical System Analysis 7. Filtering 8. Modeling,
More informationAdaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement
Adaptive Demand-Forecasting Approach based on Principal Components Time-series an application of data-mining technique to detection of market movement Toshio Sugihara Abstract In this study, an adaptive
More informationAPPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder
APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large
More informationMachine Learning in Statistical Arbitrage
Machine Learning in Statistical Arbitrage Xing Fu, Avinash Patra December 11, 2009 Abstract We apply machine learning methods to obtain an index arbitrage strategy. In particular, we employ linear regression
More informationJava Modules for Time Series Analysis
Java Modules for Time Series Analysis Agenda Clustering Non-normal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series
More informationNeuro-Dynamic Programming An Overview
1 Neuro-Dynamic Programming An Overview Dimitri Bertsekas Dept. of Electrical Engineering and Computer Science M.I.T. September 2006 2 BELLMAN AND THE DUAL CURSES Dynamic Programming (DP) is very broadly
More informationThe Data Mining Process
Sequence for Determining Necessary Data. Wrong: Catalog everything you have, and decide what data is important. Right: Work backward from the solution, define the problem explicitly, and map out the data
More informationMachine Learning and Pattern Recognition Logistic Regression
Machine Learning and Pattern Recognition Logistic Regression Course Lecturer:Amos J Storkey Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh Crichton Street,
More informationCopyright. Network and Protocol Simulation. What is simulation? What is simulation? What is simulation? What is simulation?
Copyright Network and Protocol Simulation Michela Meo Maurizio M. Munafò Michela.Meo@polito.it Maurizio.Munafo@polito.it Quest opera è protetta dalla licenza Creative Commons NoDerivs-NonCommercial. Per
More informationEstimating an ARMA Process
Statistics 910, #12 1 Overview Estimating an ARMA Process 1. Main ideas 2. Fitting autoregressions 3. Fitting with moving average components 4. Standard errors 5. Examples 6. Appendix: Simple estimators
More informationDetection. Perspective. Network Anomaly. Bhattacharyya. Jugal. A Machine Learning »C) Dhruba Kumar. Kumar KaKta. CRC Press J Taylor & Francis Croup
Network Anomaly Detection A Machine Learning Perspective Dhruba Kumar Bhattacharyya Jugal Kumar KaKta»C) CRC Press J Taylor & Francis Croup Boca Raton London New York CRC Press is an imprint of the Taylor
More informationNetwork Tomography Based on to-end Measurements
Network Tomography Based on end-to to-end Measurements Francesco Lo Presti Dipartimento di Informatica - Università dell Aquila The First COST-IST(EU)-NSF(USA) Workshop on EXCHANGES & TRENDS IN NETWORKING
More informationMaster s thesis tutorial: part III
for the Autonomous Compliant Research group Tinne De Laet, Wilm Decré, Diederik Verscheure Katholieke Universiteit Leuven, Department of Mechanical Engineering, PMA Division 30 oktober 2006 Outline General
More informationDefending Against Traffic Analysis Attacks with Link Padding for Bursty Traffics
Proceedings of the 4 IEEE United States Military Academy, West Point, NY - June Defending Against Traffic Analysis Attacks with Link Padding for Bursty Traffics Wei Yan, Student Member, IEEE, and Edwin
More informationADVANCED APPLICATIONS OF ELECTRICAL ENGINEERING
Development of a Software Tool for Performance Evaluation of MIMO OFDM Alamouti using a didactical Approach as a Educational and Research support in Wireless Communications JOSE CORDOVA, REBECA ESTRADA
More informationIP Forwarding Anomalies and Improving their Detection using Multiple Data Sources
IP Forwarding Anomalies and Improving their Detection using Multiple Data Sources Matthew Roughan (Univ. of Adelaide) Tim Griffin (Intel Research Labs) Z. Morley Mao (Univ. of Michigan) Albert Greenberg,
More informationUnsupervised Data Mining (Clustering)
Unsupervised Data Mining (Clustering) Javier Béjar KEMLG December 01 Javier Béjar (KEMLG) Unsupervised Data Mining (Clustering) December 01 1 / 51 Introduction Clustering in KDD One of the main tasks in
More informationIntroduction to Kalman Filtering
Introduction to Kalman Filtering A set of two lectures Maria Isabel Ribeiro Associate Professor Instituto Superior écnico / Instituto de Sistemas e Robótica June All rights reserved INRODUCION O KALMAN
More informationTwo Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering
Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering Department of Industrial Engineering and Management Sciences Northwestern University September 15th, 2014
More informationlarge-scale machine learning revisited Léon Bottou Microsoft Research (NYC)
large-scale machine learning revisited Léon Bottou Microsoft Research (NYC) 1 three frequent ideas in machine learning. independent and identically distributed data This experimental paradigm has driven
More informationCITY UNIVERSITY LONDON. BEng Degree in Computer Systems Engineering Part II BSc Degree in Computer Systems Engineering Part III PART 2 EXAMINATION
No: CITY UNIVERSITY LONDON BEng Degree in Computer Systems Engineering Part II BSc Degree in Computer Systems Engineering Part III PART 2 EXAMINATION ENGINEERING MATHEMATICS 2 (resit) EX2005 Date: August
More informationData Mining mit der JMSL Numerical Library for Java Applications
Data Mining mit der JMSL Numerical Library for Java Applications Stefan Sineux 8. Java Forum Stuttgart 07.07.2005 Agenda Visual Numerics JMSL TM Numerical Library Neuronale Netze (Hintergrund) Demos Neuronale
More informationEvaluation of Machine Learning Techniques for Green Energy Prediction
arxiv:1406.3726v1 [cs.lg] 14 Jun 2014 Evaluation of Machine Learning Techniques for Green Energy Prediction 1 Objective Ankur Sahai University of Mainz, Germany We evaluate Machine Learning techniques
More informationAdaptive Equalization of binary encoded signals Using LMS Algorithm
SSRG International Journal of Electronics and Communication Engineering (SSRG-IJECE) volume issue7 Sep Adaptive Equalization of binary encoded signals Using LMS Algorithm Dr.K.Nagi Reddy Professor of ECE,NBKR
More informationStructural Analysis of Network Traffic Flows Eric Kolaczyk
Structural Analysis of Network Traffic Flows Eric Kolaczyk Anukool Lakhina, Dina Papagiannaki, Mark Crovella, Christophe Diot, and Nina Taft Traditional Network Traffic Analysis Focus on Short stationary
More informationMultidimensional Network Monitoring for Intrusion Detection
Multidimensional Network Monitoring for Intrusion Detection Vladimir Gudkov and Joseph E. Johnson Department of Physics and Astronomy University of South Carolina Columbia, SC 29208 gudkov@sc.edu; jjohnson@sc.edu
More informationForecasting and Hedging in the Foreign Exchange Markets
Christian Ullrich Forecasting and Hedging in the Foreign Exchange Markets 4u Springer Contents Part I Introduction 1 Motivation 3 2 Analytical Outlook 7 2.1 Foreign Exchange Market Predictability 7 2.2
More informationLogistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld.
Logistic Regression Vibhav Gogate The University of Texas at Dallas Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld. Generative vs. Discriminative Classifiers Want to Learn: h:x Y X features
More informationSupervised and unsupervised learning - 1
Chapter 3 Supervised and unsupervised learning - 1 3.1 Introduction The science of learning plays a key role in the field of statistics, data mining, artificial intelligence, intersecting with areas in
More informationCHAPTER 1 INTRODUCTION
21 CHAPTER 1 INTRODUCTION 1.1 PREAMBLE Wireless ad-hoc network is an autonomous system of wireless nodes connected by wireless links. Wireless ad-hoc network provides a communication over the shared wireless
More informationInternational Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015
RESEARCH ARTICLE OPEN ACCESS Data Mining Technology for Efficient Network Security Management Ankit Naik [1], S.W. Ahmad [2] Student [1], Assistant Professor [2] Department of Computer Science and Engineering
More informationGaussian Processes in Machine Learning
Gaussian Processes in Machine Learning Carl Edward Rasmussen Max Planck Institute for Biological Cybernetics, 72076 Tübingen, Germany carl@tuebingen.mpg.de WWW home page: http://www.tuebingen.mpg.de/ carl
More informationMultisensor Data Fusion and Applications
Multisensor Data Fusion and Applications Pramod K. Varshney Department of Electrical Engineering and Computer Science Syracuse University 121 Link Hall Syracuse, New York 13244 USA E-mail: varshney@syr.edu
More information