Lowering False Alarm rates in Motion Detection Scenarios using Machine Learning TIM LENNERYD



Lowering False Alarm rates in Motion Detection Scenarios using Machine Learning TIM LENNERYD Master of Science Thesis Stockholm, Sweden 2012

Lowering False Alarm rates in Motion Detection Scenarios using Machine Learning TIM LENNERYD 2D1021, Master's Thesis in Computer Science (30 ECTS credits) Degree Progr. in Computer Science and Engineering 270 credits Royal Institute of Technology year 2012 Supervisor at CSC was Hedvig Kjellström Examiner was Danica Kragic TRITA-CSC-E 2012:024 ISRN-KTH/CSC/E--12/024--SE ISSN Royal Institute of Technology School of Computer Science and Communication KTH CSC SE Stockholm, Sweden URL:

Abstract

Camera motion detection is a form of intruder detection that may cause high false alarm rates, especially in home environments where movements from, for example, pets and windows may be the cause. This article explores the subject of reducing the frequency of such false alarms by applying machine learning techniques, for the specific scenario where only data regarding the detected motion is available, instead of the full image. This article introduces two competitive unsupervised learning algorithms: the first is a vector quantization algorithm for filtering false alarms from window sources, and the second is a self-organizing map for filtering out smaller events such as pets by way of scaling based on the distance to the camera. Initial results show that the two algorithms can provide the functionality needed, but that they need to be more robust to be used well in an unsupervised live situation. The majority of the results have been obtained using simulated data rather than live data, due to issues with obtaining live data at the time of the project; live data tests remain as future work.

Referat

Reducing False Alarms in Motion Detection through the Use of Machine Learning

Camera motion detection is a form of burglar alarm that can give rise to a high frequency of false alarms, especially in home environments, where pets and windows can be contributing causes. This article explores the possibility of reducing the false alarm frequency by using machine learning techniques. The specific situation examined is the one where only data about the detected motion is used, instead of the full image. This article introduces two algorithms based on unsupervised competitive learning. The first algorithm is a vector quantization algorithm for filtering false alarms from window sources, and the second is a self-organizing map for filtering events based on their size, where the size is scaled depending on the distance from the camera. Initial results show that the algorithms can provide the desired functionality, but that they need to be more robust in order to be used well without supervision in real situations. The majority of the results have been obtained from simulated data rather than real data, since it was difficult to obtain real data during the course of the project. Tests with real data are therefore an important item for future work on the project.

Contents

1 Introduction
    1.1 The Scenario
    1.2 Anomaly Detection
    1.3 Classification
    1.4 Related Work
2 Theory
    2.1 Preliminaries
        Deriving the Distance between the Pet and Camera
        Deriving the Diagonal Length of the Pet's Bounding Box as a Limit
    2.2 Competitive Learning
3 Method and Implementation
    3.1 Simulation
    Visualizing the Results
    Keeping track of windows using Vector Quantization
    A Self-Organizing Map as a Height map for pet size
    Thresholds
Results and Conclusions
    Window Adjustment Filter
    Pet Filtering
    Conclusions
Future Work
Bibliography
Appendices
A Other Considered Methods
    A.1 One Class and Two Class Support Vector Machines
    A.2 Clustering
        A.2.1 Computational Complexity
        A.2.2 Advantages and Disadvantages
    A.3 Nearest Neighbor
        A.3.1 Computational Complexity
        A.3.2 Advantages and Disadvantages
    A.4 Neural networks
        A.4.1 Supervised and Semi-supervised Neural Networks
        A.4.2 Computational Complexity
        A.4.3 Advantages and Disadvantages

Chapter 1 Introduction

1.1 The Scenario

A company is providing intrusion detection alarms for houses and apartments. These alarms are motion based, with cameras taking pictures and using motion detection algorithms to provide a bounding box around the detected motion. This box together with some supporting information is then sent to the algorithms developed and presented in this article. The constraint below, decided on by the author and the company in cooperation, limits the focus of the algorithms to be developed. By not using the full picture, the algorithms have to make do with less information than if the picture was available, something that influences both the choice of algorithms and the results. There are a number of reasons why this decision was made, but the most important were to get a lower-dimensional feature space, to reduce privacy concerns about machine analysis of private pictures and to reduce the computational complexity.

Constraint 1 The algorithms may not assume that they have access to any of the pictures taken by the camera; the only data available will be the surrounding data and the bounding box information.

Currently the company only uses very simplistic filtering within the routers connected to the cameras to avoid the most obvious false alarms. This filter consists of a few tests, as can be seen below:

Disregard movement if the object has very inconsistent movements, that is, when the velocity of the detected movement changes very rapidly between images.
Disregard movement if the size of the object changes rapidly and inconsistently between images.
Disregard movement if the velocity or size of the object is far too small to be anything but a false alarm.
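As a rough illustration, the three router-side tests above could be sketched as follows. This is only a sketch under stated assumptions: the `Observation` type, the function name `passes_basic_filter` and every threshold value are hypothetical, since the text does not state the limits the company actually uses.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Observation:
    """One detected bounding box in one image (hypothetical structure)."""
    size: float   # e.g. bounding box area in pixels
    speed: float  # magnitude of the velocity vector, pixels per image


def passes_basic_filter(
    track: Sequence[Observation],
    max_speed_ratio: float = 3.0,  # assumed limit for a "very rapid" velocity change
    max_size_ratio: float = 3.0,   # assumed limit for a "rapid" size change
    min_speed: float = 1.0,        # assumed lower bound on velocity
    min_size: float = 50.0,        # assumed lower bound on size
) -> bool:
    """Return True if the movement is kept, False if it is disregarded."""
    for obs in track:
        # Third test: far too small or too slow to be anything but a false alarm.
        if obs.size < min_size or obs.speed < min_speed:
            return False
    for prev, cur in zip(track, track[1:]):
        # First test: velocity changes very rapidly between consecutive images.
        if max(prev.speed, cur.speed) > max_speed_ratio * min(prev.speed, cur.speed):
            return False
        # Second test: size changes rapidly between consecutive images.
        if max(prev.size, cur.size) > max_size_ratio * min(prev.size, cur.size):
            return False
    return True
```

Expressing the rapid-change tests as ratios between consecutive images is a design choice made for this sketch; absolute differences would serve equally well if the real limits are defined that way.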

They hope with this project to include more intelligent detection of false alarms by applying learning algorithms to the different situations present in the camera environments. They currently neither use nor log the bounding box motion data on the server, but by sending the motion data on to the server together with the pictures taken they will have data for the learning algorithms to work on. Since this is a situation where misclassifying a break-in attempt as a false alarm is very damaging, the system will need to minimize the number of such misclassifications while still lowering the number of false alarms from the main factors. More formally, the task is to minimize the type I error (false positives), that is, minimize the number of false alarms, while keeping the type II errors (false negatives) at a reasonable level. Since the current filters used by the company are so basic, this condition holds for those filters. There is very little risk of misclassifying a break-in attempt as a false alarm with the above filters, since there is very little risk of humans appearing small enough to be disregarded, or of them having inconsistent movements between camera pictures. The current filters, however, only manage to catch very specific types of false alarms, such as quick light effects, camera distortions and the like. By applying more filters with more computing power the hope is that this can be vastly improved upon.

The company has noted that there are two separate sources providing high numbers of false alarms, with a third possibly providing a lower but still significant number. The first, windows within the vision of a camera, provides false alarms because any movement outside the window will register as detected movement, and there is currently no way for the camera to distinguish between these movements and the ones within the protected house; thus it has to raise an alarm.
Pets make up the second source of false alarms, since any detected movement of a pet might very well cause a false alarm, depending on its velocity and size. As such, the company cannot yet sell its product to clients with pets, since the false alarm rate would be too high. The third, lesser source is that of movements outside windows translating into movements inside the room due to shadows and light. Examples of this include shadows from trees moving in the wind, cars driving by outside that bounce light into the room and other quick light phenomena. The actual movement of the sun, clouds and such does not fit in here, since such movements are too slow to get past the threshold mentioned above.

While the prototype algorithms realistically will not completely solve the above issues, the prototype should strive to minimize false alarms from these sources while still keeping the type II errors minimized. Since some algorithms cope better with certain types of data and others have big problems, it is useful to spend some time considering the nature of the data the scenario provides. The system provides some data, such as time, which camera, start position, end position and velocity vector of the bounding box containing the movement. From this, it is possible to derive certain other values that can be of importance, such as size, by calculating the area of the bounding box using the

start and the end positions. This size value is then important when considering whether the event is a false alarm or not, since small boxes could indicate pets within the house, or something else that is just too small to be a human. All cameras will be trained individually, so there is no reason to add information about which camera sent the event to the anomaly detection algorithm; it would not give the algorithm any more information to work with. However, the rest of the information can be of importance. The data is multivariate, since one data instance holds a number of different values, both scalars and vectors.

Time: Time of detection (Scalar)
Start position: The top left corner of the bounding box (2D Vector)
End position: The bottom right corner of the bounding box (2D Vector)
Size: The size of the box, either as a diagonal or the area (Scalar)
Velocity: The current velocity of the detected object (2D Vector)

The scalars are one-dimensional and the vectors above lie in a two-dimensional plane, that of the picture taken, so the total number of dimensions used by one data instance is eight. This is regardless of how many dimensions the algorithms presented in this article actually use, since there is always the option of completely ignoring some dimensions in the feature space.

1.2 Anomaly Detection

Anomaly detection, also called outlier detection, is a heavily researched subject with many widely differing proposed algorithms for both general use and very specific situations. Hodge and Austin [21] quote a couple of notable definitions, the first presented by Grubbs (1969) and an extension presented by Barnett and Lewis (1994).

Grubbs: An outlying observation, or outlier, is one that appears to deviate markedly from other members of the sample in which it occurs.

Barnett and Lewis: An observation (or subset of observations) which appears to be inconsistent with the remainder of that set of data.

By using the few simple assumptions seen below, the task defined in section 1.1 can be framed as an anomaly detection problem.

1. The probability of a break-in is much smaller than that of a false alarm, regardless of the source of the false alarm.
2. Any false alarm due to movement seen outside of the window will by necessity be confined to the edges of the window.

3. Movements by pets generally conform to certain patterns; for example, pets generally keep to the floor or certain preferred furniture while at home alone.

If the class of normal everyday events that should be considered false alarms is based on these assumptions, then movements differing in perceivable ways can be found with algorithms that detect anomalies. This article will refer both to anomaly detection and outlier detection, but within the context of this article we do not in any way differentiate between the definition or the function of the two expressions. To help with separating the different algorithm classes, Hodge and Austin [21] define three approaches, described below, depending on what is to be modeled and what knowledge is available.

Type 1 Determine the outliers with no prior knowledge of the data. This is essentially a learning approach analogous to unsupervised clustering. The approach processes the data as a static distribution, pinpoints the most remote points and flags them as potential outliers. It is also noted that this approach requires that all data is available before processing. As part of a type 1 approach, two different techniques, diagnosis and accommodation, are commonly employed. Diagnosis detects the outlying points in the data and may remove them from future iterations, gradually pruning the data and fitting the model until no outliers are found. Accommodation incorporates outliers and employs a robust classification method that can withstand such isolated outliers [21].

Type 2 Model both normality and abnormality. This approach is analogous to supervised classification and requires pre-labeled data, tagged as normal or abnormal. Hodge and Austin [21] continue by referring to this type of approach as a normal/abnormal classification, either using one normal class or several depending on what is needed.
They also note that these classifiers are best suited to static data, unless an incremental classifier such as an evolutionary neural network is used, since the classification needs to be rebuilt if the distribution shifts.

Type 3 Model only normality or, in a few cases, model abnormality. Authors generally name this technique novelty detection or novelty recognition. It is analogous to a semi-supervised recognition or detection task and can be considered semi-supervised, as the normal class is taught but the algorithm learns to recognize abnormality. The approach needs pre-classified data but only learns data marked normal.

Figure 1.1: Example of Point Anomalies. Points noted by circles have been classified as normal, while points noted by a square are classified as point anomalies.

While type 3 approaches may seem similar to type 2 approaches, the difference lies in the fact that by only labeling the normal class, one can avoid the corner cases where it is uncertain whether a data instance belongs to the normal class or not. Instead of a normal/abnormal separation, type 3 approaches may present a separation between the normal class and those data instances the approach cannot reliably classify as normal.

While the above definition of approaches is very useful, there is also a need to define different types of anomalies. Chandola et al. [13] define three different categories of anomalies, which we will describe briefly below:

Point Anomalies If an individual data instance can be considered anomalous with respect to the rest of the data, then that instance is termed a point anomaly. Point anomalies are the simplest type of anomaly and the focus of much anomaly detection research. This is also the type of anomaly we will be focusing on in this article, with the scenario and the prototype. Figure 1.1 shows a simple point anomaly example where the squares are classified as anomalous since, when looking at the whole dataset, they are the few points that are markedly different in position from the rest.

Contextual Anomalies If a data instance is anomalous in a specific context, but not otherwise, then it is termed a contextual anomaly (or conditional anomaly).

Figure 1.2: Example of Collective Anomalies. Points noted by circles have been classified as normal, while points noted by squares are classified as anomalies. If a single point had been at the position of the squares, the point would not have been deemed a collective anomaly, but since there are several of them on a line, this is deemed anomalous.

If time is a contextual attribute (an attribute orienting the data instance within the dataset) in a dataset, then an event or occurrence at an unusual time might be a contextual anomaly, if the occurrence would be normal at other times. For example, the safe of a bank being opened in the middle of the night, when the bank is closed, is anomalous, whereas the same event at very specific times during the day, when the bank is open and procedures are followed, is normal.

Collective Anomalies If a collection of related data instances is anomalous with respect to the entire data set, it is termed a collective anomaly. The individual data instances in a collective anomaly may not be anomalies by themselves, but their occurrence together as a collection is anomalous. The easiest way to show collective anomalies is with an example. Figure 1.2 shows how a single point in the center would not be classified as anomalous, but since the concentrated distribution of points in the center differs from the rest of the dataset, the classifier classifies the collection as anomalous.

In this article, the focus lies almost exclusively on point anomalies; contextual and collective anomalies will be completely disregarded when choosing algorithms. The reason for this is that the point anomaly definitions fit well with what the scenario is looking to achieve. While collective or contextual anomaly detection could very well be used to find and distinguish between false alarms and real alarms, the algorithms easily become more complex without necessarily finding the simpler point anomalies.

1.3 Classification

The process of classification uses a model (classifier) that takes labeled data instances as training data, and adjusts the model to correctly classify as many of the training instances as possible into one of the available data classes [13]. After these adjustments have been made, data similar to that used for training is used to test how well the system generalizes to data it has not seen. Anomaly detection by classification operates similarly, by first training the model using one or several normal classes and then testing the system by asking it whether particular data instances can be classified as one of the normal classes, or are anomalous. Chandola et al. [13] present an assumption that anomaly detection algorithms based on classification operate under:

Assumption: A classifier that can distinguish between the normal and anomalous classes can be learned in the given feature space.

Within multi-class anomaly detection it is assumed that a data instance will be defined as anomalous only if it cannot be reliably placed in one of the available normal classes. In one-class anomaly detection a boundary around the normal class is formed within the given feature space, and any data instance that does not appear within that boundary is classified as an anomaly. It is essentially the same in the multi-class case, except that the data instance is deemed anomalous only if it does not appear within the boundary of any of the classes.

There exists a reduction from the outlier detection problem to that of classification [1], which allows the use of active learning techniques with outlier detection problems. While a formal reduction is in many cases not needed to apply traditional machine learning techniques, or those detailed later in this article, it is in any case useful to note its existence.
1.4 Related Work

Research on anomaly detection can be separated into a number of groups, based on focus. There are several surveys, articles and books that discuss a number of different techniques with widely differing foundations. These broader reviews consider a large number of algorithms together with a number of domains. Chandola et al. touch on Classification, Clustering, Nearest Neighbor, Statistical, Information Theoretic and Spectral algorithms, and consider them for each of the domains Cyber-Intrusion, Fraud Detection, Medical Anomaly Detection, Image Processing, Textual Anomaly Detection and Sensor Networks [13]. Hodge and Austin present a survey similar to the above, but with a slightly slimmer scope, in that it does not go through the various application domains for anomaly detection techniques, but focuses instead on the techniques themselves and their variants [21]. Hodge and Austin define three fundamental approach types to the problem of outlier detection, based on what knowledge is available as well as what is being modeled.
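The boundary idea described above can be illustrated with a deliberately simple sketch. It assumes, purely for illustration, that each normal class is enclosed by a sphere around the mean of its training data; the class name `SphereBoundary`, the helper `is_anomalous_multi` and the `margin` parameter are hypothetical and not taken from the cited work.

```python
import math
from typing import List, Sequence


class SphereBoundary:
    """One-class boundary: a sphere around the mean of the normal data."""

    def __init__(self, normal_data: Sequence[Sequence[float]], margin: float = 1.1):
        n = len(normal_data)
        dims = len(normal_data[0])
        # Center of the boundary: the mean of all normal training instances.
        self.center: List[float] = [
            sum(p[i] for p in normal_data) / n for i in range(dims)
        ]
        # Radius: largest training distance, widened by an assumed margin.
        self.radius = margin * max(math.dist(p, self.center) for p in normal_data)

    def is_anomalous(self, point: Sequence[float]) -> bool:
        # One-class case: anomalous if the point falls outside the boundary.
        return math.dist(point, self.center) > self.radius


def is_anomalous_multi(boundaries: Sequence[SphereBoundary],
                       point: Sequence[float]) -> bool:
    # Multi-class case: anomalous only if outside the boundary of every class.
    return all(b.is_anomalous(point) for b in boundaries)
```

A spherical boundary is only one possible choice; it is used here because it makes the inside/outside test a single distance comparison.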

Markou and Singh have published a very extensive review of statistical approaches that introduces a number of principles useful in novelty detection and related problems. Among the considered statistical approaches are Hidden Markov Models (HMM), k-nearest neighbor (kNN) and k-means clustering [32]. Markou and Singh have also reviewed neural networks extensively, discussing Multi-Layer Perceptrons, Support Vector Machines, Auto-Associator, Hopfield Networks and Radial Basis Function approaches, among others, to give a good outline of the available algorithms within the neural network class [33].

Naturally, there are also a number of articles focusing on the individual techniques mentioned by the broader surveys, many of them used as sources by the surveys. Stefano et al. consider adding a reject option to a one-class neural classifier, with the reject option depending on a reliability evaluator that in turn depends on the classifier's architecture [36]. This reject option allows the system to reject the sample rather than classifying it with low reliability (essentially refusing to choose rather than chancing it). Abe et al. have reviewed the idea of reducing the problem of outlier detection to a classification problem that can then be solved using active learning techniques [1]. Gwadera et al. consider the use of machine learning together with sliding windows to detect suspicious sequences of events in an event stream, where they set up dynamic thresholds for the number of suspicious events that are allowed before an alarm is raised [17]. Ma and Perkins also consider temporal sequences, presenting an on-line novelty detection framework for temporal sequences using Support Vector Machines [29] [30]. Also on the subject of SVMs, Mika et al. discuss how to use SVMs to create a boosting algorithm, and show by equivalent mathematical programs that this can be done [34].
Kohonen has presented a very extensive book focusing on Self-Organizing Maps that details the variants of the algorithm and mathematical considerations, among other things. It has been well received and is referred to by all the wider surveys considering self-organizing maps [28]. Ando has presented an information theoretic analysis on the subject of minority and outlier detection [5]. This analysis is abstract for the most part, and focuses on clustering. It also presents an algorithm that is evaluated in the analysis. Aggarwal and Yu discuss challenges specific to high-dimensional data, such as distance measures not being meaningful, and present some solutions to these problems [3].

There are a number of articles dealing with the domain of Wireless Sensor Networks (WSNs). Much of the research in these articles discusses the specific challenges of WSNs. Branch et al. [9] and Janakiram et al. [24], for example, discuss limited battery power, limited computational power and high error probability, and how such things influence the choice of algorithms.

Chapter 2 Theory

2.1 Preliminaries

The preliminary theory consists of deriving a constant that we may call $d_0$ and a formula that can be used to scale the diagonal of a bounding box depending on where in the image the box occurs. These derivations depend on the assumption that there exists a horizontal base-plane in the image that acts as a floor. By assuming that this base-plane exists, all detected movements will follow this floor plane, and the size changes in the bounding box and the diagonal will therefore be predictable.

Assumption 1 There exists a ground-plane within the image defined as the floor, on which any movement will occur.

Deriving the Distance between the Pet and Camera

Figure 2.1 shows in detail our efforts to derive the distance $d$ between the camera and the pet, or in other words, to find the ratio $h/d$ to use as a scaling factor. The values below are assumed to have been provided, either by the camera or by the user through some interface outside of the scope of this article.

$\alpha$: camera tilt angle
$\beta$: camera field of view
$h$: camera height from the floor, in cm
$l$: pet length in cm
$p$: picture height in pixels (vertical resolution)
$P_d$: height from the bottom of the picture to the detected movement box, in pixels

Figure 2.1: The defined angles used in the derivation of a bounding box scaling factor, with the variables as defined in this section. The camera image plane can be seen, as well as how the pet is projected onto the plane.

To find the distance $d$, the angle $v$ can be used, but this angle must first be derived. If $P_d = 0$, that is, if the pet is detected at the very bottom of the picture, then the angle $v$ is simply $\alpha + \beta/2$. Whenever $P_d$ is more than zero, the angle $v$ needs an appropriate value subtracted to account for the small slice that should not be counted. The angle to subtract is $\beta (P_d/p)$, where $P_d/p$ is the ratio at which the angle $\beta$ should be divided to give $v$ the correct value. This can be easily visualized if one considers $P_d$ to be $1/2$ of $p$. Then $\beta$ should be split into two pieces, exactly as is done by the focus line shown in the picture above, and $v$ would be identical to $\alpha$. By this reasoning, the angle $v$ will be:

$$v = \alpha + \frac{\beta}{2} - \beta\,\frac{P_d}{p} = \alpha + \beta\left(\frac{1}{2} - \frac{P_d}{p}\right) \tag{2.1}$$

By using the definition of the sine function, equation 2.2 can be defined as below. This is done by using equation 2.1 for $v$ as the angle and the height $h$ as the opposite side in the triangle, leaving the distance $d$ between the pet and the camera as the hypotenuse.

$$d = \frac{h}{\sin(v)} = \frac{h}{\sin\left(\alpha + \beta\left(\frac{1}{2} - \frac{P_d}{p}\right)\right)} \tag{2.2}$$
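As a numerical check of equations 2.1 and 2.2, the derivation can be sketched in a few lines of Python. The function names `viewing_angle` and `camera_distance` and the example values are hypothetical; angles are assumed to be given in radians.

```python
import math


def viewing_angle(alpha: float, beta: float, p_d: float, p: float) -> float:
    """Equation 2.1: v = alpha + beta * (1/2 - P_d / p), angles in radians."""
    return alpha + beta * (0.5 - p_d / p)


def camera_distance(h: float, alpha: float, beta: float,
                    p_d: float, p: float) -> float:
    """Equation 2.2: d = h / sin(v), in the same unit as h."""
    return h / math.sin(viewing_angle(alpha, beta, p_d, p))


# Assumed example: camera 200 cm above the floor, tilted 30 degrees,
# with a 40 degree field of view and a 480 pixel high picture.
alpha = math.radians(30)
beta = math.radians(40)

# Movement detected in the middle of the picture (P_d = p / 2), so v = alpha.
d_mid = camera_distance(200.0, alpha, beta, 240, 480)

# Movement detected at the very bottom of the picture, so v = alpha + beta / 2.
d_bottom = camera_distance(200.0, alpha, beta, 0, 480)
```

With these assumed values the mid-picture case gives $v = \alpha$, so $d = h / \sin(30°)$, which is twice the camera height. The bottom-of-picture case gives the larger angle $\alpha + \beta/2$ and therefore a shorter distance.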

The ratio between height and distance ($h/d$) then becomes:

$$f = \frac{h}{d} = \sin\left(\alpha + \beta\left(\frac{1}{2} - \frac{P_d}{p}\right)\right) \tag{2.3}$$

This scaling factor $f$ can then be used to scale the diagonals for the respective position in the image, and we do not need to perform any further calculations here.

Deriving the Diagonal Length of the Pet's Bounding Box as a Limit

The given values remain the same as in the previous section, and figure 2.1 will again be of interest. The height $h$, together with a field of view length we call $b$, allows equation 2.4 to be formulated by way of figure 2.2. The definition of the tangent function is used with $h$ being the adjacent side, $b/2$ being the opposite side and $\beta/2$ as the given angle.

$$\tan\left(\frac{\beta}{2}\right) = \frac{b}{2h} \quad\Rightarrow\quad b = 2h\tan\left(\frac{\beta}{2}\right) \tag{2.4}$$

Figure 2.2: Represents the relationship between the height of the camera and the horizontal length that can be seen with the field of view angle $\beta$. The camera is pointed straight down, placing the image plane parallel to the ground.

Under the assumption that we are using a camera based on the pinhole principle [12], the ratio between the pet length and the field of view length $b$ will be the same both within the picture and outside in the real room space. This can be used to our advantage by defining $P_h$ to be the length of the pet in pixels, giving us equation 2.5:

$$\frac{\text{Length of Pet}}{\text{Field of View Length}} = \frac{l}{b} = \frac{P_h}{p} \quad\Rightarrow\quad P_h = \frac{l\,p}{b} \tag{2.5}$$

Since we have already defined a formula for $b$ in 2.4, we can simply plug this formula in to get equation 2.6:

$$P_h = \frac{l\,p}{2h\tan\left(\frac{\beta}{2}\right)} \tag{2.6}$$

Assumption If the pet length is $l$, then the diagonal of the box can be approximated as $\sqrt{2}\,l$.

While this assumption is not a good one, it gives a starting point. The approximation can later be modified if it is deemed to be too crude. With this approximation, and by using the expression for $P_h$ instead of $l$, the diagonal of the box in pixels within the picture can be expressed as:

$$d_0 = \frac{\sqrt{2}\,l\,p}{2h\tan\left(\frac{\beta}{2}\right)} \tag{2.7}$$

This holds for a box positioned directly under the camera, regardless of whether the camera can see the box or not: if the box could be seen there, it would have a diagonal of $d_0$ pixels. This gives a constant value $d_0$ that can be used to scale the diagonal to any height in the picture, provided we know enough to use the derivation in the previous section for finding the distance between the camera and the pet. When the position $P_d > 0$, the diagonal is scaled by multiplying expression 2.7 with the ratio found in formula 2.3 in the previous section, giving equation 2.8:

$$d_r = d_0 f = d_0 \sin\left(\alpha + \beta\left(\frac{1}{2} - \frac{P_d}{p}\right)\right) \tag{2.8}$$

Any calculated diagonal $d$ can then be compared with $d_r$ to see whether it is small enough to be considered a pet and therefore be ignored, or whether it should be cause for alarm. That is:

$$f(d) = \begin{cases} 1 & \text{if } d \geq d_r \\ -1 & \text{if } d < d_r \end{cases} \tag{2.9}$$

2.2 Competitive Learning

The competitive learning paradigm is generally used with artificial neural networks. It can be used for any of the approaches described in section 1.2. In this paradigm, nodes compete for the right to represent a particular input, and whichever node is closest earns the right to learn from the input. The learning in this case usually consists of moving the winning node slightly closer to the input in terms of the feature space. In two or three dimensions this simply means that the winning node is moved closer to the x, y, z position of the input.
For non-categorical data, the Euclidean distance (equation 2.10) below is often used to measure the distance between the input $x$ and the node $y$; a winner is then identified by comparing the distances. There are also a number of other distance measures used for different
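A single winner-take-all update of the kind just described might look like the sketch below. The function name `competitive_step` and the `learning_rate` parameter are hypothetical, and the Euclidean distance is assumed as the measure of closeness.

```python
import math
from typing import List, Sequence


def competitive_step(nodes: List[List[float]], x: Sequence[float],
                     learning_rate: float = 0.1) -> int:
    """Move the node closest to x slightly towards x and return its index."""
    # Competition: the node with the smallest distance to the input wins.
    winner = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], x))
    # Learning: only the winner moves, by a fraction of the remaining gap.
    nodes[winner] = [w + learning_rate * (xi - w)
                     for w, xi in zip(nodes[winner], x)]
    return winner
```

For example, with nodes at (0, 0) and (10, 10), an input at (1, 1) and a learning rate of 0.5, the first node wins and moves halfway towards the input, while the second node is left untouched.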

19 2.2. COMPETITIVE LEARNING situations, such as the computationally expensive Mahalanobis distance (equation A.2) mentioned in section A.3 and the Manhattan distance measure. Manhattan distance is also often called taxicab geometry since it measures distance along cartesian axes, just as a taxicab would measure distance between a point x and a point y in a city. distance = n ( x i y i ) 2 (2.10) Two of the most widely used algorithms, the Vector Quantization algorithm for neural networks and the Self-Organizing Map operates unsupervised and can therefore be classified as a type 1 approach. The Self-Organizing Map was first introduced by Teuovo Kohonen using the vector quantization algorithm with unsupervised learning to produce a low-dimensional representation of the input space [28]. Hastie explains the behavior of the Self-Organizing Map as follows [20]: Constrained version of K-means clustering, in which the prototypes are encouraged to lie in a one- or two-dimensional manifold in the feature space Since Self-Organizing maps uses a neighborhood function, that is, allows the nodes close to the winning nodes to learn a little from the input as well, the map created by the SOM algorithm preserves the topology of the input data. If this neighborhood function is not used, that is, if the winner takes it all strategy is used, then the system according to Hastie will be analogous to a k-means clustering system [20]. The look of the neighborhood function determines how the topology is preserved and which nodes get updates. In many cases, the neighborhood function will return a wide neighborhood to start with, to give the whole map the general shape. By starting with a wide neighborhood, the chance is small that a part of the map is completely void of updates and remain in its start state. Gradually as learning goes on the function returns a smaller neighborhood, which translates to a more finely tuned topology over a smaller section of the map at the time. 
This can be easily visualized by considering a three-dimensional surface, where billowing hills are the result of a wider neighborhood and smaller, sharper peaks are the result of a smaller neighborhood.

The learning rate δ, a concept often used in machine learning, controls the time the system takes to converge. With a high δ value (for example 1.0), fluctuations and divergence may occur, but the lower the δ value, the slower the convergence. In essence, the δ value controls how much the system may learn from a single training pattern. Depending on the complexity of the network, as well as the availability of data, a system might be run for a number of iterations (epochs) through the training set to reach convergence. In the case of on-line classification, there might be no need to run through several iterations, since convergence might not be needed.
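The winner selection and the neighborhood update described above can be illustrated with a minimal sketch. It is not the implementation used in this article: the one-dimensional chain of nodes, the Gaussian neighborhood shape, and the names `find_winner` and `som_step` are assumptions made for illustration only.

```python
import math

def euclidean(x, y):
    """Equation 2.10: square root of the summed squared differences."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def manhattan(x, y):
    """Taxicab distance: sum of absolute differences along each axis."""
    return sum(abs(xi - yi) for xi, yi in zip(x, y))

def find_winner(nodes, x):
    """Index of the node closest to the input x."""
    return min(range(len(nodes)), key=lambda i: euclidean(nodes[i], x))

def som_step(nodes, x, delta, radius):
    """One learning step on a 1-D chain of nodes (assumed layout).

    radius is the neighborhood width in index steps; radius 0 gives the
    winner-takes-it-all strategy. The Gaussian shape is an assumption.
    """
    w = find_winner(nodes, x)
    for i, node in enumerate(nodes):
        if radius > 0:
            h = math.exp(-((i - w) ** 2) / (2.0 * radius ** 2))
        else:
            h = 1.0 if i == w else 0.0
        # Move the node a fraction delta * h of the way toward the input.
        nodes[i] = [n + delta * h * (xi - n) for n, xi in zip(node, x)]
    return w

nodes = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
winner = som_step(nodes, [1.2, 0.9], delta=0.5, radius=0)
```

Calling `som_step` repeatedly with a gradually shrinking `radius` mirrors the wide-then-narrow neighborhood schedule described in the text.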

Kohonen [28] mentions some ways of speeding up the SOM calculations using pointers to tentative winners, which reduces the number of comparison operations from quadratic, when learning through exhaustive search, to linear when using the pointers. While that would speed up the SOM, it is still not as quick as the Hopfield net, due to the learning procedure involved in training a SOM as well as in querying the system [21].

Chapter 3 Method and Implementation

3.1 Simulation

There are several ways of obtaining the data needed for a learning system. The most effective method from the system's point of view is to use real-world data. In many cases this is infeasible, however, due to the impossibility of collecting the amount of data needed as well as the cost of acquiring such data. Simulating data is a cheaper solution if real-world data is not available, but the simulation requires some work if the generated data is to be accurate to any degree.

To provide data that is in any way useful for the scenario outlined in this article, the simulation needs to be defined both by a number of parameters and by a number of rules and assumptions. Each of these assumptions will separate the generated data somewhat from real-world data, but the assumptions also lower the time and complexity of programming the simulation. This is direly needed in this case, to allow the focus of the article to lie on the learning algorithm rather than on the simulation. The programming of the simulation will happen in several steps, first defining the basic assumptions used and thereafter improving the assumptions stepwise to provide a better model of the data.

Assumption 1 The three-dimensional space used will have a right-handed base, meaning that when x points to the right, y will point straight up and z will point straight out of the paper.

Assumption 2 Input parameters to the simulation will be given in three dimensions, with the camera placed at position (0,0,0) in the Cartesian coordinate system, and it will point straight ahead along the z axis for simplified calculations.

Assumptions one and two mainly define how the conversion from three-dimensional space to two-dimensional screen space will occur. Carlbom and Paciorek present information about how to project three dimensions down to two dimensions in the same way that certain cameras do [12].
The fact that cameras use similar techniques allows the simulation to be defined in three dimensions while still achieving the important perspective effects, that is, that something appears smaller further away from the camera despite being the same size in three dimensions, and that shapes change when projected down depending on their position relative to the camera. Calculations similar to those presented by Carlbom and Paciorek can be found in many books and lecture notes dealing with computer graphics, since the perspective projection transformation is so vital to that field of computer science. Since the cameras used in the live situation will provide coordinates in two dimensions with proper perspective and positioning, the simulation needs to provide two-dimensional points as well. Otherwise the differences between the simulation and the live situation will be too large to present any meaningful data.

Assumption 3 Movements outside defined windows will primarily be parallel to the window, with few exceptions. The size of the moving shapes will be arbitrary.

Assumption three mainly provides a starting point for the learning algorithms. The assumption is that the better part of any movements recorded are of people walking and cars driving by the window, and that these movements are therefore usually parallel to the window. With this basic assumption made, changes can be made later on to provide a more accurate representation of such movements.

Assumption 4 Movements defined by pets within the room will have an arbitrary direction and velocity. The movement events will be defined by the size of the pet.

There are a number of ways that pets can move within a house, with varying speed, positions of rest and general movement. These cannot all be simulated, and even simulating a single one of these continuous movements realistically is time-consuming and complex.
Therefore the first basic assumption is that movements recorded from pets are not connected, and that pets move more or less randomly. This does not fit very well with reality, but if such arbitrary movements can be classified with some reliability, then more regular movements should hopefully be easier to classify. Regardless, this assumption can be improved upon at a later date if the simulation is kept.

The simulation generates the events used by the system, and it employs some programmatic techniques (mainly inheritance) that provide easier implementation of events, both events needed by the system and ones not within the scope of this article. By employing a time step and querying a normal distribution for a random value, the simulation checks whether a new event should be generated. Each event type to be simulated has a slightly different distribution connected to it, and that distribution decides the ratio of events and at what times during the day the events should be concentrated. When an event has been generated, it is appended to an output file that is used as input by the learning system.
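The conversion from three-dimensional simulation coordinates to two-dimensional screen space described in this section can be sketched as a simple pinhole projection. This is an illustrative sketch only, not the projection code of the simulation: the field of view, the image size, the function name `project`, and the convention that points in front of the camera have positive z are all assumptions.

```python
import math

def project(point, fov_deg=60.0, width=640, height=480):
    """Project a camera-space point (x, y, z) onto the image plane.

    Assumes the camera sits at (0, 0, 0) and looks along the z axis, with
    positive z in front of the camera (an assumed sign convention).
    Returns pixel coordinates (u, v), or None for points behind the camera.
    """
    x, y, z = point
    if z <= 0:
        return None
    # Focal length in pixels derived from the horizontal field of view.
    f = (width / 2.0) / math.tan(math.radians(fov_deg) / 2.0)
    u = width / 2.0 + f * x / z    # x grows to the right on screen
    v = height / 2.0 - f * y / z   # y points up in space, down on screen
    return (u, v)

near = project((1.0, 0.0, 2.0))
far = project((1.0, 0.0, 4.0))
```

The same offset from the optical axis lands half as far from the image center when the distance doubles, which is the perspective effect the simulation needs to reproduce.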

3.2 Visualizing the Results

Data from the live scenario consists of either scalars or two-dimensional data, which means that it can be easily visualized using two-dimensional graphics in any mathematical program. The simulation can also return data with three dimensions, namely the points before they have been projected into the two-dimensional screen space. Therefore it can be useful to provide a top-down view of a defined room for these three-dimensional points. In the top-down projection the y axis is simply ignored, allowing an R^3 → R^2 projection. Figure 3.1 shows an example of how the top-down projection looks with some example data. Since the reverse projection, R^2 → R^3, is not easily done, the top-down view only works for simulated data.

To provide a useful data visualization for both the simulation and the live scenario, we might wish to present the two-dimensional screen projection of the bounding boxes defining a detected movement event. Adding windows to this screen space projection is done by simply projecting the three-dimensional coordinates of the window in the simulation case, or by using given two-dimensional screen space coordinates (see section 3.4). This visualization, as shown by figure 3.2, is especially useful for reviewing the effect of window event classifications, since it shows windows within the two-dimensional screen space used by the live scenario. Other events, such as pet events, may not be as apparent since they are not as constrained by the room geometry.

Figure 3.1: Top-down view example using only the window classifier. Data points are the center points of detected movement bounding boxes, with black circles being considered normal and red squares anomalous.

Figure 3.2: 2D perspective projection example using only the window classifier. Red squares show unrelated events that are there merely to sidetrack the window filter, black circles are successfully classified window false alarm events, and red circles are window events that the window filter has not managed to classify as such.

While the above visualizations lay the foundations by showing training and test data and the classifications for those sets, additions to these foundations allow for a more informative visualization. To appropriately show the effect and behavior of the competitive learning algorithm used by the window classifier (see section 3.3), the nodes used in the algorithm need to be visualized. Since the nodes work only in two dimensions, the screen space projection visualization above is suitable for adding the nodes. Figure 3.2 also shows the starting positions of the competitive learning nodes as dots and the end positions of the nodes after training on a set as stars.

For the window classifier, visualizing the individual nodes makes sense, but in section 3.4 the nodes used by the Self-Organizing Map for pet sizes will never move in the two dimensions visualized by the figures above. The nodes do however keep a value of the bounding box diagonal that can be used as a third dimension, making a height map a powerful visualization tool for the pet classifier. If the scale of the map is chosen to be the same as in the screen space projection, the previous visualization (figure 3.2) and the height map can be presented together to show a more complete picture, such as figure

Figure 3.3: 2D projection perspective with corresponding SOM heightmap. The heightmap shows how the scaled pet sizes have been modified from the default scaling by the algorithm to accommodate pets above the floor plane. Lighter areas allow a larger pet.

3.3 Keeping track of windows using Vector Quantization

Assumption 1 Windows are the only sources of detected movement that can cause false alarms, and only from movements outside the windows such as pedestrians and cars.

Assumption 2 Initial window coordinates are given to the system either by the user or by the system in some way not within the scope of this article.

For the sake of discussion, let us assume that the above assumptions hold. Then the easiest option for eliminating any false alarms would be to simply ignore any events that are constrained by a window, provided that the window positions are known. By having the user fill in where in the image the windows are, events within the defined areas could be ignored, and as long as the camera angle and position remain the same, the system would know which events to ignore.

Figure 3.4: The window events have no offset, as can be seen from the fact that the windows have not moved, and most events are classified as false alarms (black circles). Red squares represent events that cannot reliably be classified as false alarms.

Figure 3.5: The events have been offset by a change in camera angle, and the window filters have moved to accommodate this change. The start positions of the filters are represented by red dots, and the end positions by blue stars.

There is a major problem with this naive approach, namely the assumption that the camera angle will remain constant. While cameras may not move drastically from day to day, the company has stated that the cameras turn the lens toward the ceiling when deactivated to preserve privacy. The angle they return to may then differ somewhat, which in turn means that the filter initially provided by the user may be slightly misplaced, possibly resulting in false alarms from the windows. Consider an identical event occurring before and after a camera adjustment. Before the adjustment, with the filter properly positioned, the event will be correctly classified as a false alarm and ignored. After the adjustment, the filter may be slightly misplaced and the system may classify the event as a real alarm, despite the event being identical to a previously classified event.

To solve this inconsistency with camera movement, one option is to try to track the position of the window using the events generated by it. To track a given window effectively, only movement events close to the window should be considered, something that can be done by applying some distance limit parameters. This should automatically remove events from windows other than the chosen window, provided that the distance limits are small enough and that the camera movement is not too large. To perform a simple tracking, the filters may be pulled somewhat in the right direction, always trying to keep each filter in the center of the closest window, provided that one filter exists for every window.

Since only the filter closest to any specific window should be moved for an input, thinking of the windows as isolated nodes allows many of the different competitive learning algorithms to be applied to this problem. This works since the nodes in competitive learning algorithms, as mentioned in section 2.2, compete for the right to process and learn from a subset of the possible input space. By using competitive learning with a winner-takes-it-all strategy, as has been done in figures 3.4 and 3.5, the nodes will only learn from events generated by their own specific window, if the distance limit parameters are appropriate.

An issue with tracking the window in this way occurs if the first assumption above does not hold. If there are unrelated events, say from pets, they have the ability to influence the behavior of the window tracking. Such influence would cause the window filter to become unreliable, since any event, at any distance from the window or the current nodes, would pull the nodes away from the window, thus spreading and warping the filter, as can be seen in figure 3.6.

The severity of this issue can be reduced in various ways, the simplest being to create some additional requirements for when the window nodes may learn from an event. The system should always be able to classify an event, regardless of whether it is allowed to learn from it or not.
Setting a maximum Euclidean distance (defined in equation 2.10) allowed between the original window center and the event lets the filter stay reasonably close to the original window position, while still allowing flexibility and limiting the effects of interference from unrelated events.

Obviously there are other solutions available for this particular problem, for example filtering with other filters beforehand so that the assumption above does hold for most reasonable cases, but the solution proposed above is far simpler and less computationally expensive. If the results are acceptable in this situation, then using a simpler solution is often the best choice.

Figure 3.6: Without constraining the filter using some maximum distance limits, the filters may use inputs from too many unrelated events and as such become unstable and unreliable. The middle window filter has moved far away from its position. Black circles are false alarms, red squares are unrelated events or real alarms.

Figure 3.7: If the limiting parameters are used, then the results after training on the data set used in figure 3.6 are greatly improved. The filters stay reasonably close to the original positions, and while the trainer still has some problems with the rightmost window, this is expected due to the event density.

Following is a list of the parameters used in the implementation of the Vector Quantization algorithm defined in pseudo-code below (algorithm 1).

numnodes: Number of nodes used by the learning system. In a winner-takes-it-all strategy, each node will represent a window within the image. If learning is allowed for nodes close to the winner as well, a number of nodes could together represent a window.

delta: The learning rate of the system. This changes how much a single data point influences the system.

singlewinner: Whether the winner-takes-it-all strategy should be used. If the single winner strategy is not used, all nodes in an area may very well converge at a specific point, which may or may not be useful depending on the situation.

neighborhoodsize: Only relevant if the single winner strategy is not used. This variable then describes the size of the neighborhood around any winning node that also gets updates.

maxdist: The maximum Euclidean distance from a node at which a data point can affect it, even if the node is the winner. Used together with maxdistwin to constrain the filter to a window.

maxdistwin: The maximum Euclidean distance from a window centroid, calculated from the initial positions defined by the user, that a node can move. Used together with maxdist to constrain the filter to a window.

Algorithm 1 Vector Quantization algorithm

{Initiating the nodes}
for all node in nodes do
    window ← random(windows)
    node ← randomposwithin(window)
end for
{Learning and classifying events}
for all event in events do
    closestnode ← mineucdist(event, nodes)
    D ← eucdist(closestnode, event)
    closestwin ← mineucdist(event, windows)
    Dwin ← eucdist(closestwin, event)
    diff ← event − closestnode
    if D < maxdist and Dwin < maxdistwin then
        closestnode ← closestnode + diff · delta
        if not singlewinner then
            for all node in neighborhood(closestnode) do
                node ← node + (diff · delta · D) / eucdist(node, event)
            end for
        end if
    end if
    {Classifying events}
    if event iswithinwindow(closestnode) then
        return true
    else
        return false
    end if
end for
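The single-winner case of algorithm 1 can be sketched in runnable form as follows. This is a simplified illustration under stated assumptions, not the implementation used for the results: windows are taken to be axis-aligned rectangles `(xmin, ymin, xmax, ymax)` in screen space, there is exactly one node per window, nodes start at the window centers instead of at random positions, the neighborhood branch is left out, and the class name `WindowTracker` and the default parameter values are hypothetical.

```python
import math

def center(win):
    """Center point of a window rectangle (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = win
    return [(xmin + xmax) / 2.0, (ymin + ymax) / 2.0]

def dist(a, b):
    """Euclidean distance in two dimensions (equation 2.10 with n = 2)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

class WindowTracker:
    def __init__(self, windows, delta=0.1, maxdist=80.0, maxdistwin=120.0):
        self.windows = windows
        self.centroids = [center(w) for w in windows]
        # Simplification: one node per window, started at the window center.
        self.nodes = [center(w) for w in windows]
        self.delta = delta
        self.maxdist = maxdist
        self.maxdistwin = maxdistwin

    def process(self, event):
        """Learn from event (x, y) if allowed; return True for a window event."""
        i = min(range(len(self.nodes)), key=lambda k: dist(self.nodes[k], event))
        node = self.nodes[i]
        d = dist(node, event)
        d_win = min(dist(c, event) for c in self.centroids)
        # Learning is gated by both distance limits; classification is not.
        if d < self.maxdist and d_win < self.maxdistwin:
            node[0] += (event[0] - node[0]) * self.delta
            node[1] += (event[1] - node[1]) * self.delta
        # The filter is a window-sized rectangle centered on the winning node.
        xmin, ymin, xmax, ymax = self.windows[i]
        half_w, half_h = (xmax - xmin) / 2.0, (ymax - ymin) / 2.0
        return abs(event[0] - node[0]) <= half_w and abs(event[1] - node[1]) <= half_h

tracker = WindowTracker([(100, 100, 200, 200)], delta=0.5)
near_window = tracker.process((170, 150))   # close event: learned from, ignored
far_away = tracker.process((400, 400))      # distant event: no learning
```

Keeping classification outside the gated branch reflects the requirement that every event is classified, whether or not the nodes are allowed to learn from it.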

3.4 A Self-Organizing Map as a Height map for pet size Thresholds

After the window events have been dealt with, it is of interest to consider pet events, since they make up the second largest false alarm source in some situations. Pet events occur when pets move within the vision of the cameras and are therefore detected. A reasonable starting assumption is that only pet events will cause a false alarm, analogous to what was assumed for windows in section 3.3.

Assumption 1 The only sources of false alarms are pets being detected within the vision of the camera.

Assumption 2 Pets generally move along the floor plane, but they may move in an arbitrary but predictable manner. They may for example have favorite spots diverging from the floor plane, such as on top of a sofa or table, depending on the pet.

Assumption 3 The length of the pet is given to the system either by the user or by the system in some way not within the scope of this article.

In general, the main difference between pets and their owners is size. Pets do have a different shape than humans, but this shape may not always be visible in the picture due to angles and positions, and therefore the bounding box may not be that different from the bounding box of a moving human. Pets are in general smaller, which is something that can be used to differentiate between humans and pets. A naive approach to filtering out pet-related events would then be to simply classify any event where the bounding box is smaller than or equal to the size of a pet as a false alarm. While this would work for some events, it would cause more problems than it solves, due to the simple fact that without any scaling, a box only just fitting a cat could just as well be a human further away from the camera. Therefore the problem becomes two-fold, provided that the user supplies some data about the camera and the pet.
First the event needs to be scaled depending on where in the image the movement takes place. After the scaling has been done, the values received can be compared with threshold values to decide whether they should be classified as pet-related false alarms or as real alarms. Sections and detail the math behind the scaling operation.

The scaling operation uses information about the length of the pet, the height of the camera, and the tilt angle and field of view angle of the camera. With this information a scaled diagonal of the bounding box can be calculated, depending on the height in the picture at which the bottom corners of the box are positioned. Since self-organizing maps are topology preserving, they can create height maps where the height corresponds to the threshold values. Further, since scaling the event position coordinates (x,y) to fit the map can be done in constant time, the time complexity of classifying an event will be O(1). During training, if the pet has a spot it likes on, for example, a couch, the corresponding nodes in the map will learn to allow larger diagonal sizes to accommodate the difference from the floor plane, which is the norm.

Figure 3.8: Training the SOM using 200 simulated training points. Black circles close to the lower edge are already small enough to be allowed due to the scaling, and as such do not cause any changes to the map. The red squares represent data points not conforming to the scaling, and the map tries to accommodate these anomalies.

The self-organizing map will in essence become a heightmap, as can be seen in figure 3.8, where the height value used is the scaled diagonal allowed at that position. The initial normal class consists of the scaled diagonals allowed at the different nodes, but after training the normal class also includes the changes made to the map to accommodate the anomalies. Below, pseudo-code of the proposed algorithm has been included for completeness.

Algorithm 2 Self-Organizing Map for pet size thresholds

{Initiate map with default threshold}
for all col in columns do
    for all row in rows do
        x ← scale(col)
        y ← scale(row)
        matrix(x, y) ← scaleddiagonal(x, y)
    end for
end for
{Learning and classification phase}
for all event in events do
    diagonal ← event.diag
    x ← scale(event.x)
    y ← scale(event.y)
    storeddiag ← scaleddiag(x, y)
    diff ← diagonal − storeddiag
    if diff > 0 then
        if not singlewinner then
            for all node in neighborhood(x, y) do
                dist ← sqrt((x − node.x)^2 + (y − node.y)^2)
                change ← (diff · delta) / (dist + ε)
                matrix(node.x, node.y) ← matrix(node.x, node.y) + change
            end for
        else
            matrix(x, y) ← storeddiag + diff · delta
        end if
    end if
    if diff > 0 then
        return true
    else
        return false
    end if
end for
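A compact runnable sketch of algorithm 2 is given below for illustration. It rests on several assumptions that are not taken from the article: event positions are assumed to be already scaled to map indices, the default threshold is supplied by a hypothetical `default_diag(col, row)` function standing in for the scaling operation, the neighborhood is a square of the given radius, and ε is set to 1.0 so that the winning node receives exactly `diff * delta`.

```python
import math

class PetSizeMap:
    def __init__(self, cols, rows, default_diag, delta=0.3, radius=1, eps=1.0):
        self.cols, self.rows = cols, rows
        self.delta, self.radius, self.eps = delta, radius, eps
        # Height map of allowed scaled diagonals; default_diag is a
        # hypothetical stand-in for the camera-based scaling operation.
        self.grid = [[default_diag(c, r) for r in range(rows)] for c in range(cols)]

    def process(self, col, row, diagonal, single_winner=False):
        """Return True if the diagonal exceeds the stored threshold.

        When it does, the map is raised around (col, row). The map is never
        lowered, so learning only happens in one direction.
        """
        diff = diagonal - self.grid[col][row]
        if diff <= 0:
            return False
        if single_winner:
            self.grid[col][row] += diff * self.delta
        else:
            c_lo, c_hi = max(0, col - self.radius), min(self.cols, col + self.radius + 1)
            r_lo, r_hi = max(0, row - self.radius), min(self.rows, row + self.radius + 1)
            for c in range(c_lo, c_hi):
                for r in range(r_lo, r_hi):
                    d = math.hypot(col - c, row - r)
                    # Nodes further from the event position are raised less.
                    self.grid[c][r] += diff * self.delta / (d + self.eps)
        return True

som = PetSizeMap(3, 3, lambda c, r: 10.0, delta=0.5)
small = som.process(1, 1, 8.0)    # below the threshold, map unchanged
large = som.process(1, 1, 14.0)   # above the threshold, map is raised
```

Because classification is a single lookup in `grid`, it runs in constant time, in line with the O(1) claim in section 3.4.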

Chapter 4 Results and Conclusions

Before going further into the individual and combined results of this project, one thing must be clearly mentioned. Due to factors outside of the author's direct control, no live data was available for training and testing, contrary to what was originally intended. All of the results and conclusions are therefore based on data provided by the simulation detailed in section 3.1. This affects the available results and the discussion regarding them, as well as the conclusions that can be drawn and the suggestions for future work.

4.1 Window Adjustment Filter

Even though the window filter has some naive elements in its implementation, it can be seen in the following figures that the filter handles both skewed distributions and fairly large window offsets fairly well, despite the naive distance limits explained previously. Datasets that consist of only window-related events, with no unrelated events such as pet events or other random events, can be seen in figures 3.4 and 3.5. There it can be seen that the error rate is close to zero for the training case, and that the test case for that distribution has similar results. What matters most is the result on the testing set, that is, the result on data that has not yet been seen by the system. To accurately measure the results, the test and training sets should have few discernible differences in terms of distribution. With this in mind, we will mostly be considering the testing sets.

Favored Distributions

A distribution favoring a classifier is one where unrelated events make up a smaller part of the dataset than related events, allowing for less interference from such events. If a distribution is not favored, the opposite is true. The classifier then has fewer relevant events to work with and has to cope with more interference from unrelated events.
Figures 4.1b and 4.1c show that in general, on distributions favoring the window classifier, a learning rate (δ) between seems to be most effective, with a success rate of 97% for the slightly offset windows. This also fits well with what is generally known, that the learning rate should not be too high and the default learning rate that many people use is within

Figure 4.1: (a) No window offset, (b) Slight window offset, (c) Heavy window offset. Success rates using data sets favoring the window classifier. The three diagrams show how the learning rate influences the resulting classifications. In sets where the windows are offset, like 4.1c, a very low learning rate will cause a low success rate since the system cannot react to the change quickly enough. Note the difference in scales between the diagrams.

Since the only time learning is needed in this scenario is when the windows have been offset, it makes sense that figure 4.1a shows that the best value for delta is very low (0.02). This simply means that the system is already in the best state it can be, and further learning will only cause the system to overtrain.

Even with heavily offset windows, such as those in figure 4.2, the system manages a degree of success, topping out at 72%. As can be seen in the figure, the two leftmost windows have fewer problems than the rightmost window, which has to do with the fact that the rightmost window is on another wall in the simulated house.

Figure 4.2: A classification using δ = 0.14, the best choice of learning rate from figure 4.1c, showing heavily offset windows. Red squares show unrelated events that are there merely to sidetrack the window filter, black circles are successfully classified window false alarm events, and red circles are window events that the window filter has not managed to classify as such.

Because the window is on the other wall, its angle to the camera is different, and therefore events may be positioned differently in the two-dimensional space. In addition, since the window is on the other wall, it appears thinner than the other windows, which also affects the filter. The same effect shows up in similar datasets when there is a heavy offset in that direction.

When using neutral or non-favoring distributions with the window classifier, it can be seen that the filters are negatively affected in some cases. Without any limiting constraints, the filter may end up in the situation shown in figure 3.6, but this is a very extreme case where no constraints are active. A more realistic example would be figures 4.4 and 4.5, showing the projection view and the top-down view of a non-favored distribution classified by the window classifier. In the first figure, it can be seen at the rightmost window that the filter has problems classifying the rightmost points. This is most likely a result of nearby unrelated events influencing the node, coupled with the difficulty of the rightmost window.

While the result of classifying the distribution discussed above is good (87%) for the chosen learning rate (δ = 0.08), figure 4.3 shows that this result is highly dependent on the learning rate chosen. The success rate takes a sharp dive right after the top value before settling at a very stable success rate of 75%.
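The dependence of the success rate on the learning rate can be explored with a simple sweep over candidate δ values. The sketch below only illustrates the procedure. It assumes that the success rate is the percentage of events whose classification matches a known label, and the helper names `success_rate` and `sweep`, as well as the `make_classifier` factory, are hypothetical. It does not reproduce any of the figures in this chapter.

```python
def success_rate(classify, events, labels):
    """Percentage of events whose classification matches the expected label."""
    correct = sum(1 for event, label in zip(events, labels) if classify(event) == label)
    return 100.0 * correct / len(events)

def sweep(make_classifier, deltas, events, labels):
    """Evaluate a freshly built classifier for every candidate learning rate.

    make_classifier(delta) is assumed to return a function that maps an
    event to True (false alarm) or False (real or unrelated event).
    """
    return {delta: success_rate(make_classifier(delta), events, labels)
            for delta in deltas}

def best_delta(results):
    """Learning rate with the highest success rate in a sweep result."""
    return max(results, key=results.get)
```

Building a new classifier for every δ keeps the runs independent, so that learning done at one setting does not leak into the next.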

Figure 4.3: Diagram showing the effect of different learning rates (δ) on a dataset not favoring the window classifier. There is a heavy offset, which can be seen in figures 4.4 and 4.5, which use the same dataset.

Figure 4.4: For visibility, the unrelated events have been removed from this plot. The events are still there to affect the classifier, and can be seen in figure 4.5, which is the top-down view of the same data set. Black circles represent correctly classified window events, and red circles represent incorrectly classified window events.

Figure 4.5: A top-down view showing the window classification on a heavily offset, unfavored dataset. Here black circles correspond to successfully classified events, and red squares correspond to what has been classified as unrelated events.

To avoid completely cluttering this section of the article with figures, the best results from the various datasets and classifications have been combined into table 4.1. As can be seen, a high success rate can be achieved for both slight and heavy offsets, depending on the chosen learning rate. The table does however show the fact mentioned above, that the window filter has difficulties with unfavorable distributions in certain situations. These problems tend to occur for certain windows and window configurations, such as the one used in all the figures in this section. Some possible solutions to these problems will be discussed in the later sections 4.3 and 5.

Data Set Distribution | δ-value | Success Rate (%)
Window favored, no offset | |
Window favored, no offset | |
Window favored, slight offset | |
Window favored, heavy offset | |
Pet favored, no offset | |
Pet favored, slight offset | |
Pet favored, heavy offset | |
Pet favored, heavy offset | |

Table 4.1: Table showing window classifier specific results from various dataset distributions at a specific δ value.

4.2 Pet Filtering

Due to the random nature of how the simulation generates pet-related events, it is hard to create scenarios that are really lifelike. Therefore the focus of this section is to show how the filter looks after training, without deeply considering the dataset the filter is trained on. The resulting map varies widely depending on whether the calculated default value is used or the height map starts at zero.

For all the figures within this section, a learning rate of δ = 0.3 has been used. This decision was made to allow the filter to learn somewhat speedily, lowering the need for an extended training period and large datasets. Since the system learns in only one direction (raising the map), there is no chance of a situation where a node fluctuates between two points. The only adverse effect a high learning rate might have is that the system might learn too much from a single occurrence. While this is an important factor, as it might lead to over-training, with the lack of real and lifelike data we consider other factors here and leave this as something to be examined in the future, when real data has been obtained.

All the simulated pet favored datasets have a high concentration of events along the lower edge, as seen in figure 4.6, with a lower concentration in the rest of the image. This is an effect of the camera projection and some simplifications made for it in the simulation. At the bottom of the image the largest diagonal sizes are allowed, since the closer a pet is, the larger the bounding box becomes, and as seen along the lower edge these events are allowed (classified as false alarms) after only one iteration. As the iterations continue, the height difference between these normal cases and the anomalies higher up in the image increases, to the point where only a few anomalies are shown and only one point is left that cannot yet be classified as a pet-related false alarm.
There are few cases in the live scenario where forty iterations would be used, since so many iterations would cause a very high coupling between the specific training set and the SOM output, which would degrade the ability to generalize to unseen but similar data. Therefore, for further tests and comparisons only five iterations are used. This might hurt the test with no default values slightly in terms of raw success rate, but it shows how the different cases perform under identical conditions.

If the default values are used during training, convergence will be much quicker, since in theory there is no need to teach the system which diagonal lengths should be allowed along the floor plane. Instead, only anomalies need to be learned, that is, when pets move on top of furniture or stairs and as such leave the normality of the floor plane. The default values may also add robustness, since the resulting map will most likely be much smoother than a map starting from a flat default value of zero. Examples of this can be seen by comparing figures 4.7 and 4.8, where the latter is much more jagged, with obvious dips wherever the training data points do not reach. The former, by contrast, is very smooth with only a few peaks for anomalous points. The two figures have been created with the same axis values to allow for a better comparison.

Figure 4.6: (a) Iterations: 1, (b) Iterations: 40. At early iterations much of the figure appears light, since the difference between the peaks is small. The figure darkens considerably, with only a few lighter peaks, as the iterations continue, meaning that those events are highly anomalous. Black circles represent events classified as false alarms, and red squares represent real alarms.

By comparing the curve in figure 4.7, created by the default values, with the more jagged curve in figure 4.8, it can be seen that they are somewhat similar, showing that the default values can in fact provide improved generalization on datasets similar to those that have been used for the pet filter. This similarity can also be seen in figure 4.9, where the raised images have been placed in a ninety-degree sideways view for ease of comparison. While this result is by no means a complete proof of the effectiveness of the filter, it can be seen as a proof of concept for certain situations, and it remains to be shown whether the default value can provide similar results in a live situation with a camera in any reasonable position. This initial result relies on the assumptions mentioned in section 3.1, and as such cannot be taken as fact by the company until it has been verified in the various live situations that can occur. Therefore the author has decided not to add any table showing exact results for different iteration values, since the values would have little meaning in the live scenario.

Figure 4.7: A raised view of the map shown in figure 4.6, showing the points of the map raised beyond the normal case after five iterations.

Figure 4.8: If a flat default value of zero is used, the filter can still create a workable map, but the map takes longer to reach a similar success rate and will most likely end up looking much more jagged. This figure uses the same dataset as figure 4.6, but with five iterations.

Figure 4.9: (a) Default values used. (b) Flat default value of zero. A 90° sideways view of the map, showing the effect of the curving created by the calculated default values compared with a flat default value of zero. Five iterations have been used on the same dataset as in the previous figures. Note also the differing y-axis scaling.

4.3 Conclusions

Both filters show some positive results, such as they are. Unfortunately, the lack of data from a live situation prevents many concrete conclusions about the effectiveness of the filters. It also means that there is no effective way to measure the type II error rate (real alarms being misclassified), something that was wished for early on (see section 1.1). This needs to be remedied if either of the algorithms is to be used in the live situation described. Integrating any code of some complexity into a working application without thorough testing may very well lead to unforeseen consequences, for example a higher type II error rate than expected. Below the author provides a (possibly incomplete) list of possible consequences:

- Higher type II error rate than expected, due to misclassifying real alarms as false alarms.
- Long training phase, due to low event detection frequency in the live situation.
- Overlearning or lack of ability to generalize, after either a long training period or continuous learning.
- Incorrect assumptions about the nature of the live scenario, such as the derived formulas in sections and
- Camera and image related issues affecting the ability to provide the values the filters need, such as the α angle.
- Time and memory requirements for successful training on live data might be higher than expected.
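Once labelled live data is available, the type II error rate discussed above could be measured along the following lines. This is only a sketch under stated assumptions: the function name `filter_error_rates` and the two boolean input sequences are hypothetical, and the second returned value (the share of false alarms that still get through the filter) is an extra measure added here for illustration, not one defined in the text.

```python
from typing import Sequence, Tuple


def filter_error_rates(
    filtered_out: Sequence[bool], is_real_alarm: Sequence[bool]
) -> Tuple[float, float]:
    """Return (type_ii_rate, remaining_false_alarm_rate).

    filtered_out[i]  : True if the filter classified event i as a false alarm.
    is_real_alarm[i] : True if event i was in fact a real alarm (ground truth).
    Both inputs are hypothetical; no such labelled data exists in the thesis.
    """
    if len(filtered_out) != len(is_real_alarm):
        raise ValueError("both sequences must have the same length")

    real_total = sum(bool(r) for r in is_real_alarm)
    false_total = len(is_real_alarm) - real_total

    # Type II error: a real alarm that the filter suppressed.
    missed_real = sum(f and r for f, r in zip(filtered_out, is_real_alarm))
    # False alarm that the filter did not suppress.
    passed_false = sum(
        (not f) and (not r) for f, r in zip(filtered_out, is_real_alarm)
    )

    type_ii_rate = missed_real / real_total if real_total else 0.0
    remaining_rate = passed_false / false_total if false_total else 0.0
    return type_ii_rate, remaining_rate


if __name__ == "__main__":
    filtered = [True, True, False, False, True]
    real = [False, False, True, False, True]
    print(filter_error_rates(filtered, real))
```

Reporting both numbers side by side makes the trade-off visible: a filter can only be judged once it is known how many real alarms it suppresses in exchange for the false alarms it removes.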

