Embedded-Text Detection and Its Application to Anti-Spam Filtering


UNIVERSITY OF CALIFORNIA
Santa Barbara

Embedded-Text Detection and Its Application to Anti-Spam Filtering

A Thesis submitted in partial satisfaction of the requirements for the degree of Master of Science in Computer Science

by

Ching-Tung Wu

Committee in Charge:
Professor Kwang-Ting Cheng, Chair
Professor Yuan-Fang Wang
Associate Professor Matthew Turk

April 2005

The Thesis of Ching-Tung Wu is approved:

Professor Yuan-Fang Wang
Associate Professor Matthew Turk
Professor Kwang-Ting Cheng, Committee Chairperson

April 2005

Embedded-Text Detection and Its Application to Anti-Spam Filtering
Copyright © 2005 by Ching-Tung Wu

Abstract

Embedded-Text Detection and Its Application to Anti-Spam Filtering

Ching-Tung Wu

Embedded-text in images usually carries important messages about the content. In the past, several algorithms have been proposed to detect text boxes in video frames. Previous work often followed a multi-step framework using a combination of image-analysis and machine-learning techniques. In this work, we propose a unified embedded-text detection framework to efficiently and accurately locate text boxes, particularly in web and email images. We approach the embedded-text problem from the angle of object detection. We define position-independent features to capture the essence of characters and a smart-scan algorithm to trace text lines using their spatial and geometrical properties. We also propose a novel anti-spam system which utilizes visual clues, including the embedded-text information. The experimental results demonstrate the effectiveness of the proposed embedded-text detection framework and the anti-spam filtering system.

Professor Kwang-Ting Cheng
Thesis Committee Chair

Contents

List of Figures
List of Tables

1 Introduction
  1.1 Previous Work
  1.2 Motivation
  1.3 Contributions
  1.4 Outline
2 Embedded-Text Detection
  2.1 Cascade Detector
    2.1.1 Position-Independent Feature - Motivation
    2.1.2 Position-Independent Feature - Definition
    2.1.3 Cascade Detector Training
    2.1.4 Experiments and Discussions
  2.2 Detection Framework
    2.2.1 One-Step Framework
    2.2.2 Two-Step Framework
  2.3 Experiments
    2.3.1 Metrics
    2.3.2 Experimental Results
    2.3.3 Computation Cost
    2.3.4 Memory Consumption
3 Visual-Based Anti-Spam Filtering
  3.1 The Spam Datasets and Analysis
  3.2 The Anti-Spam Filter
    3.2.1 Feature Description
    3.2.2 Feature Extraction
    3.2.3 The Classifier
  3.3 Experiments
    3.3.1 Datasets
    3.3.2 Experimental Results
4 Conclusions
Bibliography

List of Figures

1.1 Embedded-Text in Images Example
2.1 Top two features selected by AdaBoosting in Viola's work
2.2 Canny edge detection result of a sample image
2.3 Edge coding, 3x3 sub-window, horizontal, and vertical pattern
2.4 Edge map of text and confusing background objects
2.5 Top three features selected by BrownBoosting
2.6 Tough case of negative training samples
One-Step Framework Flow
Example of Locality Shifts
Locality-Filter Rectangle Features
Vertical Locality-Filter Example of Letter A
Horizontal Locality-Filter Example of Letter A
Horizontal Locality-Filter Example of Letter I
Horizontal Locality-Filter Example of Letter L
Diagram of the search algorithm
Two-Step Framework Flow
Embedded-Text Detection Raw Results
Embedded-Text Detection Final Results
Spam with embedded-text in image example
Text embedded in image example
Detected embedded-text regions

List of Tables

2.1 18-stage cascade detector using BrownBoosting
2.2 Feature sets comparison
Locality-Filter Verification
Performance Comparison
Computation Cost Comparison
Statistics of emails containing accessible images
Classifier comparison
Filter comparison
Blocked/Missed analysis

Chapter 1

Introduction

As the Internet continues to grow, the type of information available to users has shifted from text-only to multimedia-enriched. Embedded-text in multimedia content (e.g. images and videos) is becoming one of the prevalent means of delivering messages to content viewers. Several embedded-text detection and recognition algorithms have been proposed in the past. However, as multimedia content becomes more diverse and its volume grows exponentially, developing an embedded-text detector that is both accurate and efficient has become an interesting topic and a challenging task.

Embedded-text often carries important messages regarding the where, what, and who of the content. This information clearly helps with the recognition and annotation of the content. Embedded-text within images and videos can be categorized as natural scene text and artificial text. Natural scene text refers to text that appears as part of the scene in images or videos; for example, a news video showing an event that happened in a recognizable location. Artificial text, on the other hand, is manually added to provide content to viewers. One example of artificial text is the closed caption of a video; another is the text banner box on a web image.

The first step in extracting embedded-text from an image or a video frame is to pinpoint the location of the text boxes. In this thesis, we aim to develop a robust and accurate embedded-text detector, particularly for detecting artificial text in web and email images. The detection result can be fed into existing Optical Character Recognition (OCR) packages such as OmniPage or GNU OCR. In the second part of this thesis, we also demonstrate an application to anti-spam filtering which directly uses the text-detection result.

1.1 Previous Work

Artificial text boxes are designed to attract attention. These text boxes usually present strong coherence in space, geometry, and color. For example, they usually align horizontally, contain homogeneous color regions, possess strong edge energies, and exhibit high intensity differences from neighboring objects or a contrasting background. Since visual, spatial, and geometric coherence provides strong clues, previous work has primarily focused on these characteristics for the task of text detection.

Based on the detection framework, previous work can be roughly classified into two categories: (1) image-analysis-based and (2) image-analysis plus machine-learning-based. In the first category, embedded-text detection depends heavily on image analysis techniques, such as edge detection, connected-component analysis, and texture analysis. In the second category, a machine-learning method is applied in addition to the image analysis techniques. The machine-learning step is used mainly for verification purposes because it can further reduce the number of false positives.

Approaches in the image-analysis category can be further sub-categorized into edge-based and texture-based approaches. The common flow works as follows. First, edge detection or texture analysis is performed on the image under consideration. The image's raw features are then grouped according to connectivity, spatial, and geometric correlations to form potential text regions. Second, potential text regions are further examined using rule-based heuristics, such as the size, aspect ratio, and orientation of the region. Fletcher et al. [6] used edge detection and connected-component analysis to generate connected text blocks. Then, they applied a Hough transform to aggregate the text blocks into text strings. LeBourgeois [12] proposed using a linear filter to transform the gradient map, such as the edge energy map of an image, into a texture map by smearing and dilating the gradient map in one direction. Since text pixels appear in a coherent way, text regions usually stand out and can be easily identified after the transformation. Zhou et al. [39] proposed a texture-based approach to extract text from web images using color clustering and connected-component analysis. Wu et al. [23] presented a text segmentation method based on the distinctive textures of text. Text strokes are extracted and refined in multiple phases to form the final bounding boxes.

In the second category, a machine-learning method is applied in conjunction with image analysis techniques. The machine-learning step serves as the text-pixel or text-region verifier. It is usually applied after the image analysis step to further reduce the false-positive rate. Chen et al. [2] used edge dilation to merge horizontal and vertical edges into regions, then applied an SVM model to classify these text regions. The assumption was that text lines contain short horizontal and vertical edges within a confined space. The dilation and merging of these edges make text appear as continuous regions. These regions are then classified using the trained SVM model. Wolf et al. [25] extended LeBourgeois's idea by applying more geometrical and spatial heuristics to the identification of text lines. Then, an SVM model was trained for the final classification. Kim et al. [10] also
presented a texture-based approach that uses an SVM to report the probability of a pixel being part of a text texture and then applies the CAMSHIFT algorithm to the probability map to locate text regions.

What the image-analysis-based and the image-analysis plus machine-learning-based approaches have in common is that they often follow a two-step procedure. In the first step, potential text regions are identified. Then, in the second step, statistics-driven and/or heuristics-based models are applied to refine the potential text regions and to determine the final ones.

1.2 Motivation

The first motivation for our work is that the number of images with embedded-text has increased rapidly, especially in web and email applications. Being able to detect text embedded in these images is a good starting point for further analysis of the received content. One example is Unsolicited Commercial Email (UCE), also known as spam, on the Internet. With the increasing importance of email and the incursions of Internet marketers, spam has become a major problem. In the past, researchers have addressed this problem as a text classification or categorization problem. Recently, spammers have been using embedded-text in images to evade these text-based anti-spam filters. Without the ability to detect embedded-text, it will be very difficult to alleviate this problem.

The second motivation for our work is to improve text-detection accuracy beyond the previous two-step framework. We hope to keep the advantages of the existing two-step framework while reducing or eliminating its disadvantages. In the previous embedded-text frameworks, the text detection problem is divided into two sub-tasks. In the first step, potential text regions are identified using primarily image-analysis techniques. Then, in the second step, potential text regions are further refined using statistics-driven models or heuristic rules based on spatial and geometrical constraints.

One advantage of these frameworks is that the embedded-text detection problem can be divided into two easier sub-problems. Each step in the two-step framework deals with one of the sub-tasks, making the process more efficient. The first step is to locate potential text regions. The second step, which usually takes more time, is to refine and determine the final text regions. By separating the tasks, the slower step only has to deal with part of the image, rather than the entire image. Another advantage is that the false alarms can be further reduced as the text regions pass through each step.

A major disadvantage of the two-step framework is that the two steps depend on each other; hence, the overall detection rate is constrained by the lower of the two. For example, if the lower detection rate of the two steps is 80%, then the overall detection rate cannot surpass 80%.

Viola and Jones [22] proposed a rapid object-detection framework using rectangle features and a boosted cascade detector for the task of face detection. The rectangle features are position-dependent and somewhat rigid. They are suitable for capturing the differences in intensity among facial regions. The cascade detector is a degenerate decision tree which acts as a rejection-based classifier. In each stage of the cascade detector, the classifier is trained using AdaBoosting to pass almost every positive sample while rejecting most of the negative ones. The combination of stages yields a desirable ratio of a high detection rate to a low false-positive rate. Both the rectangle features and the cascade detector can be evaluated efficiently at every location and scale.

In our work, we focus on developing a robust embedded-text detection framework to locate artificial text boxes embedded in images, especially web and email images. Our goals are (1) efficiency, especially in detection speed, and (2) a favorable ratio of high detection rate to low false-positive rate. We approach the embedded-text detection problem from the angle of object detection. We define three sets of Position-Independent Features (PIF) and propose using a boosted cascade detector, similar to that of Viola and Jones, as the text detection engine. The PIF features can be evaluated efficiently. They are suitable for tackling the diverse and somewhat unpredictable shapes of text. Then, we propose a unified embedded-text detection framework using the cascade detector and PIF features. The proposed framework is capable of detecting horizontal text lines embedded within an image at every location and scale.

1.3 Contributions

The main contributions of this thesis are summarized as follows:

We define three sets of position-independent features (PIF) which can be computed efficiently for the task of text detection. The three sets of PIF features are local edge-pattern, local edge-density, and global edge-density. These PIF features are able to capture diverse types and sizes of fonts as well as to differentiate edges of characters from those of random objects.

We propose using a cascade detector for the task of text detection. The proposed cascade detector is trained using BrownBoosting and the position-independent features mentioned above.

We propose a novel unified-step embedded-text detection framework using the above cascade detector. This embedded-text detector does not require any pre-processing step and can do full image scans.

We introduce the use of visual information for anti-spam filtering. By thoroughly analyzing a large collection of spam emails, we demonstrate that useful features and parameters can be derived from images in spam emails for the purpose of anti-spam filtering.

Based on such visual features and parameters, we propose a novel visual-based anti-spam filter. The proposed filter, used in conjunction with existing text-based filters, can improve the filtering accuracy. We have successfully integrated the proposed anti-spam filter with Thunderbird and demonstrated very promising results.

1.4 Outline

The rest of this thesis is organized as follows:

In chapter 2, we present the details of the proposed embedded-text detector. In chapter 2.1, we introduce the concept of the position-independent features as well as details of feature selection and training. In chapter 2.2, we present the detection framework and explain how to apply the cascade detector for the task of text detection. In chapter 2.3, we show experimental results to evaluate the effectiveness of the position-independent features and to demonstrate the robustness of the proposed embedded-text detector.

In chapter 3, we present the details of the visual-based anti-spam filter. In chapter 3.1, we present the statistics of some visual parameters from a thorough analysis of more than 120K spam emails downloaded from SpamArchive [30]. In chapter 3.2, we present the proposed filtering system in detail. Then, in chapter 3.3, we present experimental results to show the effectiveness of the proposed anti-spam filter.

In chapter 4, we present the conclusions of this thesis.

Figure 1.1: Embedded-Text in Images Example

Chapter 2

Embedded-Text Detection

The heart of the proposed embedded-text detection system is the cascade detector and the Position-Independent Features (PIF). The cascade detector is a degenerate decision tree which consists of several stages organized in a sequential manner. Each stage, in turn, is a boosted classifier that is trained to pass almost every positive sample while rejecting as many negative ones as possible. The PIF features are designed to differentiate the edges of text from those of random objects. Both the cascade detector and the PIF features can be evaluated efficiently. Therefore, they can be applied to part of, or even the entire, image under consideration.

In this chapter, we present the architecture of the proposed embedded-text detection system. In chapter 2.1, we discuss the details of the PIF features, feature selection, and cascade detector training. We elaborate on the rationale behind the design of the PIF features and the training strategy. In chapter 2.2, we demonstrate how to apply the text detector engine (the trained cascade detector and the PIF features) in the traditional two-step framework and in the proposed unified one-step framework.

2.1 Cascade Detector

Viola and Jones [22] proposed a cascade framework that is capable of detecting faces rapidly while achieving a high detection accuracy. There are three key contributions in their approach. The first is the introduction of a new image structure called the Integral Image, which allows the features to be computed efficiently. The second is the use of the AdaBoosting learning algorithm to select a small number of critical features and combine them into efficient yet strong classifiers. The third is a smart framework for organizing classifiers in a cascade, which allows background regions to be quickly discarded while spending more computation on promising face-like regions. Our purpose is to extend a similar idea to a new task: detecting embedded-text within images.
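The integral image mentioned above can be illustrated with a minimal sketch. This is not code from the thesis or from Viola and Jones; the function names (`integral_image`, `rect_sum`, `two_rect_feature`) and the list-of-lists image layout are assumptions made for illustration. The point it shows is that, once the table is built in a single pass, the sum over any rectangle needs only four lookups, which is what makes rectangle-style features cheap at every location and scale.

```python
from typing import List

Grid = List[List[int]]


def integral_image(img: Grid) -> Grid:
    """Build a (h+1) x (w+1) table where ii[y][x] is the sum of img rows < y and columns < x."""
    h = len(img)
    w = len(img[0]) if h else 0
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0  # running sum of the current row
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii: Grid, top: int, left: int, height: int, width: int) -> int:
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii: Grid, top: int, left: int, height: int, width: int) -> int:
    """Illustrative two-rectangle feature: upper half sum minus lower half sum."""
    half = height // 2
    upper = rect_sum(ii, top, left, half, width)
    lower = rect_sum(ii, top + half, left, half, width)
    return upper - lower


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    table = integral_image(image)
    print(rect_sum(table, 1, 1, 2, 2))          # sum of 5, 6, 8, 9
    print(two_rect_feature(table, 0, 0, 2, 3))  # first row minus second row
```

The table has one extra row and column of zeros so that rectangles touching the image border need no special case.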

2.1.1 Position-Independent Feature - Motivation

Viola and Jones defined a set of rectangle features, which can be calculated by computing the differences of raw pixel sums among regions within the rectangle under consideration. More than 40K rectangle features can be extracted from a 24x24 training sample, and each feature value can be calculated in fewer than 100 machine instructions using a data structure called the integral image. Although rectangle features have proven effective and efficient for the specific purpose of face detection, no work has been done so far on general object detection using this type of feature. The main reason is that rectangle features carry position information (they are position-dependent), and thus the target object has to strictly follow a well-defined rigid structure. For example, the first feature selected by the AdaBoosting algorithm, as shown in Figure 2.1, measures the difference in intensity between the eyes and the upper cheeks. The feature capitalizes on the observation that the eye region is often darker than the cheeks. However, in the task of text detection, because text varies in font, size, and shape, position-dependent features cannot capture information that is consistent across a set of samples.

Figure 2.1: Top two features selected by AdaBoosting in Viola's work

Edge information is a dominant feature used by many researchers for embedded-text detection in images. Figure 2.2 shows the Canny edge detection result of a sample image.

Figure 2.2: Canny edge detection result of a sample image

By comparing the edges of the text with those of the background, we make several key observations:

Text edges always follow a few fixed patterns, such as vertical and horizontal lines, half-circles, and T-like junctions, while background edges are totally unpredictable.

Text edges should be smooth and continuous, and thus the edge points of an individual character should be connected to each other. Therefore, if we define a small sub-window around a text edge point, the number of edge points within the sub-window should fall within a certain range. In contrast, the number of edge points within a sub-window around a background edge point is completely random.

For a sub-window containing a full character, the edge points are often spread evenly over the entire sub-window. However, for a sub-window containing a background object, the edge points often cluster into small regions.
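The second observation can be made concrete with a small sketch. It assumes a binary edge map stored as a list of lists (1 for an edge point, 0 otherwise) and counts the edge points in a window centred on each edge point. The names `neighbourhood_count` and `counts_at_edge_points`, the 3x3 default window, and the two toy edge maps are all hypothetical and are not taken from the thesis.

```python
from typing import List

EdgeMap = List[List[int]]  # 1 = edge point, 0 = non-edge point


def neighbourhood_count(edge: EdgeMap, y: int, x: int, size: int = 3) -> int:
    """Count edge points in the size x size window centred on (y, x), centre included."""
    r = size // 2
    h, w = len(edge), len(edge[0])
    total = 0
    # Clip the window at the image border.
    for yy in range(max(0, y - r), min(h, y + r + 1)):
        for xx in range(max(0, x - r), min(w, x + r + 1)):
            total += edge[yy][xx]
    return total


def counts_at_edge_points(edge: EdgeMap, size: int = 3) -> List[int]:
    """Neighbourhood counts for every edge point, in row-major order."""
    return [
        neighbourhood_count(edge, y, x, size)
        for y, row in enumerate(edge)
        for x, value in enumerate(row)
        if value
    ]


# Toy data: a thin, connected stroke and a dense clump of edge points.
STROKE = [[1 if y == 2 else 0 for _ in range(5)] for y in range(5)]
CLUMP = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)] for y in range(5)]

if __name__ == "__main__":
    print(counts_at_edge_points(STROKE))  # stays in a narrow range
    print(counts_at_edge_points(CLUMP))   # varies much more widely
```

On the thin stroke the counts stay between 2 and 3, while on the clump they range from 4 to 9, which is the kind of difference the observation relies on. Real edge maps are far noisier than these toy inputs.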

2.1.2 Position-Independent Feature - Definition

In this section, we introduce three different types of edge-based features corresponding to the three key observations mentioned above. In addition, the proposed features satisfy the following rules:

Features must be able to differentiate edges of text from those of background objects.

Features do not include any position information, i.e. they are position-independent.

Features can be computed efficiently.

Local Edge-Pattern

Figure 2.3: Edge coding, 3x3 sub-window, horizontal, and vertical pattern
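The edge coding named in the caption of Figure 2.3 can be sketched as follows: the eight neighbours of an edge point in its 3x3 sub-window are packed into an 8-bit code, and a feature is the number of times a given code occurs inside a rectangle. The bit order used here is an assumption, since Figure 2.3a is not reproduced in this text; it was chosen so that a horizontal line yields 66 and a vertical line yields 24, the decimal values the thesis reports for Figures 2.3b and 2.3c. The function names are hypothetical, and the count is done by direct iteration rather than with per-pattern integral images.

```python
from typing import List, Optional

EdgeMap = List[List[int]]  # 1 = edge point, 0 = non-edge point

# Assumed neighbour order as (dy, dx), most significant bit first:
# left column top to bottom, then middle column, then right column.
NEIGHBOURS = [(-1, -1), (0, -1), (1, -1), (-1, 0), (1, 0), (-1, 1), (0, 1), (1, 1)]


def pattern_code(edge: EdgeMap, y: int, x: int) -> int:
    """Pack the 8 neighbours of (y, x) into an integer in 0..255."""
    h, w = len(edge), len(edge[0])
    code = 0
    for dy, dx in NEIGHBOURS:
        yy, xx = y + dy, x + dx
        bit = edge[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0  # outside counts as 0
        code = (code << 1) | bit
    return code


def pattern_map(edge: EdgeMap) -> List[List[Optional[int]]]:
    """Pattern code at every edge point, None at non-edge points."""
    return [
        [pattern_code(edge, y, x) if value else None for x, value in enumerate(row)]
        for y, row in enumerate(edge)
    ]


def pattern_count(pmap: List[List[Optional[int]]], code: int,
                  top: int, left: int, height: int, width: int) -> int:
    """Histogram entry: how often `code` occurs in the rectangle, wherever it occurs."""
    return sum(
        1
        for y in range(top, top + height)
        for x in range(left, left + width)
        if pmap[y][x] == code
    )


def horizontal_line(row: int, size: int = 5) -> EdgeMap:
    """Toy edge map containing one horizontal line."""
    return [[1 if y == row else 0 for _ in range(size)] for y in range(size)]


if __name__ == "__main__":
    centred = pattern_map(horizontal_line(2))
    shifted = pattern_map(horizontal_line(1))
    # The count of the horizontal pattern does not depend on where the line sits.
    print(pattern_count(centred, 66, 0, 0, 5, 5), pattern_count(shifted, 66, 0, 0, 5, 5))
```

Because only the frequency of a code inside the rectangle is kept, moving the line within the window leaves the feature value unchanged, which is the sense in which such a feature is position-independent.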

To compute the local edge-pattern, we first apply the Canny edge detector to the image under consideration and then binarize the detection result, using 0 to represent non-edge points and 1 to represent edge points. For each edge point, if we define a 3x3 sub-window as shown in Figure 2.3a, we can use 8-bit binary strings to represent different edge patterns. Figure 2.3b and Figure 2.3c show the horizontal and vertical patterns, which represent a horizontal and a vertical line respectively. Each binary number can be mapped to a decimal number. For example, the binary and decimal numbers for Figure 2.3b and Figure 2.3c are 01000010 (66) and 00011000 (24) respectively. By using the binary coding scheme and the 3x3 sub-window, we can extract and compute edge patterns efficiently. The total number of edge patterns defined by a 3x3 sub-window is 256. These edge patterns can be used to represent horizontal, vertical, diagonal, curvy, and even junction patterns.

In Algorithm 1, we show the local edge-pattern extraction work flow. Theoretically, we can compute 256 integral images, one for each edge pattern. In a real implementation, however, we do not need to compute all of them. Instead, we only need to compute the features that are actually selected during the cascade training. It should be noted that multiple integral images can be generated in a single scan. The additional cost is mainly in memory, not in computation. Moreover, we use a histogram representation for the features of the rectangle being evaluated. This design implicitly eliminates the position information. In other words, we only consider the frequency of a pattern of interest in a sub-window, not the position of that pattern. Therefore, we refer to our features as Position-Independent Features (PIF), in contrast to Viola and Jones's rectangle features.

Algorithm 1 Local Edge-Pattern Extraction
INPUT: Image
OUTPUT: Local Edge-Pattern Features
step1: Produce an edge map by using the Canny edge detector
step2: Obtain a pattern map by evaluating each edge point
step3: Compute integral images based on the pattern map
step4: Extract and evaluate histogram features

Local Edge-Density

Based on the second observation in section 2.1.1, we concluded that the local edge-density around each edge point is useful for discriminating edges of text from those of background objects. Therefore, we propose the local edge-density feature set to capitalize on this information. We define a number of small sub-windows around each edge point and count the number of edge points within the sub-windows. Empirically, we adopt sub-windows of different sizes, from 3x3 to 7x7. Using an algorithm similar to the one above, we can calculate the integral images based on the local edge-density maps with respect to each pre-defined window. In order to eliminate position information, we adopt a normalized histogram representation as well.

Global Edge-Density

Besides local edge-density, we have also observed that the edges of text in the rectangle under consideration often form a well-proportioned distribution. Therefore, we propose the global edge-density feature set to capitalize on this finding. For an NxN rectangle, we use an N/2 x N/2 sub-window to do a sliding scan. Each sub-window steps N/4 pixels in the x and/or y direction from the previous one. As a result, a total of nine sub-windows are scanned. Based on the integral image calculated from the edge map, we can extract nine numbers from the nine sub-windows. These nine numbers give an intuitive cue for the global distribution of the edge points. However, a potential problem is that they are not position-independent, since each sub-window contains position information with respect to the rectangle. Therefore, we sort these nine numbers to make them position-independent. Moreover, we can produce more features from the nine basic numbers, such as the sum or difference of any pair of them. Because this feature set requires the additional cost of
sorting, the computation is significantly increased. In chapter 2.1.3, we introduce a hierarchical training algorithm to minimize the computation cost.

2.1.3 Cascade Detector Training

The cascade detector is a degenerate decision tree in which several simple classifiers (stages) are organized in a sequential manner. During the testing phase, the stages are applied successively to a region of interest until the candidate is rejected at some stage or all the stages are passed. In general, such a hierarchical structure, i.e. a cascade, can be learned using a boosting algorithm, which is a method of combining several weak learners into a strong learner. In Viola and Jones's framework, a variant of AdaBoosting, discrete AdaBoosting, was adopted to train the cascade detector for the task of face detection. However, for the specific purpose of embedded-text detection, we adopted not only a different boosting algorithm, BrownBoosting [8], but also a different training strategy, hierarchical training. The reasons are explained in detail as follows:

BrownBoosting: Different boosting algorithms use different strategies to re-weight the training samples. In discrete AdaBoosting, the weight of a sample is adjusted based on its classification result. If a sample is correctly classified, then the weight of this sample is reduced for the next round of training. On the other hand, if a sample is misclassified, the weight of the sample remains the same. While the success of AdaBoosting is indisputable, there is a rising awareness that the algorithm is quite susceptible to noise. Dietterich has given a detailed explanation of this behavior in [9]. He shows that, in general, boosting tends to assign much higher weights to noisy samples than to legitimate ones. As a result, the hypotheses generated in the later iterations often over-fit the noisy samples. In our embedded-text detection, this weakness would significantly degrade the performance of the resulting classifier. In Figure 2.4, we show the edge maps of two letters, T and H, as well as three examples of noisy objects. We observed an interesting phenomenon: if we only consider the features proposed in section 2.1.2, i.e. local edge-pattern, local edge-density, and global edge-density, it is very difficult to differentiate the edges of text from those of noisy background objects. Because (1) some negative samples are very similar to positive ones (as shown in Figure 2.4) and (2) millions of negative samples are required during the cascade training, so that a careful and thorough check of every sample is practically impossible, the final classifier would inevitably be less desirable if we followed the original AdaBoosting algorithm. Therefore, we borrow a key idea from BrownBoosting [8], another variant of AdaBoosting. We use a non-monotone function to re-weight the misclassified samples: for small values of the margin, the weight increases in a way very similar to AdaBoosting; however, from some point onwards, the weight decreases to restrain the impact of the noisy examples (e.g. confusing background objects).

Hierarchical Training: We noticed that the computation costs of the three proposed sets of PIF features differ from each other. The local edge-pattern feature set is as cheap to compute as the rectangle features proposed in Viola and Jones's work, while the local and global edge-density feature sets require more time to compute. During the testing phase, the number of testing samples decreases as they pass through the cascade. Therefore, it is important to keep the early stages as fast as possible. To avoid using the expensive sets of features at the early stages of the cascade, we adopted a hierarchical training strategy. The training procedure is as follows: (1) for the first eleven stages, we only used the local edge-pattern feature set; (2) from the 12th to 14th stages, we added the second feature set, which is the local edge-density set; (3) from the 14th stage on, we applied all three sets of features until the training process converged and the final stopping criterion was reached. Our experimental results showed that the hierarchical training scheme significantly boosted the detection performance without much compromise on the total computation cost. In other words, the hierarchical training scheme helps the boosting algorithm to find the best combination
32 Chapter 2. Embedded-Text Detection of the PIF feature sets by balancing the detection performance and computation cost. Figure 2.4: Edge map of text and confusing background objects Experiments and Discussions For the cascade detector training, we used 5,000 positive samples and 5,000 negative samples in each stage. The positive samples were manually labeled and selected from a pool of web images. Each positive sample contained a single English letter. The negative samples were dynamically generated from a selected set of Corel images [4]. Table 2.1 shows the details of the 18-stage cascade detector and its stage classifiers. The target criteria for each stage was 0.97 detection rate and 0.60 false-positive rate. In other words, each stage classifier must pass at least 97% of the positive samples while rejecting more than 40% of the negative ones. To save computation cost, the local edge-density and global edge-density feature sets were only used in later stages. As shown in Table 2.2, once the new feature sets were added, the number of features selected in that stage were greatly 24

33 Chapter 2. Embedded-Text Detection Table 2.1: 18-stage cascade detector using BrownBoosting Stage Features Detection Rate False Positive Features: Local Edge-Pattern Features: Local Edge-Pattern, Local Edge-Density Features: Local Edge-Pattern, Local Edge-Density, Global Edge-Density reduced. This behavior indicated that the classification accuracy and efficiency can be further improved by combining the three PIF feature sets. In Figure 2.5, we show the top three features selected in the first round of the boosting. The depicted features are considered as the most discriminative ones. An intuitive explanation for these three patterns are as follows: Figure 2.5a represents a horizontal line, Figure 2.5b represents a vertical line, and Figure 2.5c 25

34 Chapter 2. Embedded-Text Detection Table 2.2: Feature sets comparison Stage Features Detection Rate False Positive w/ Local Edge-Density w/o Local Edge-Density w/ Global Edge-Density w/o Global Edge-Density corresponds to a half circle. These patterns appear frequently as parts of the text edges. Figure 2.5: Top three features selected by BrownBoosting Figure 2.6 shows several negative samples which were misclassified when the local edge-pattern feature set was used alone. These samples were correctly classified after adding the local and global edge-density feature sets. This example 26

clearly demonstrates the effectiveness of the edge-density feature sets and supports the rationale behind the design.

Figure 2.6: Tough cases of negative training samples

As shown in Table 2.1, the final detection rate is around 80%. This is relatively low compared to face-detection applications, where the common detection rate is around 90%. However, faces appear in images as independent objects, while text objects appear as regions. This implies that if a text object is detected at some location, there should be text objects of similar size at nearby positions. These additional spatial and geometric regularities can be used not only to enhance the overall detection accuracy but also to reduce the overall false-positive rate. In the following section, we present the details of our embedded-text detection framework.

2.2 Detection Framework

Here we describe two detection frameworks: the proposed unified one-step framework and the traditional two-step framework. In the one-step framework, the

cascade detector acts as a full-image scanner, while in the two-step framework, the cascade detector acts as the text-region classifier. In the following subsections, we present both frameworks in detail.

One-Step Framework

In the one-step framework, the cascade detector is applied to the entire image at every location and at various scales. For the task of text detection, we use a sliding window to scan through the entire image. The scanning starts with an initial window size. After each full image scan, the sliding window is enlarged by 10% until it is larger than the stopping window size or the image size. In our experiments, we found that an initial window size of 12x10 and a stopping window size of 72x60 are sufficient to detect embedded text in web images. Once a sub-window is detected as text, its neighboring windows are examined and aggregated to form the text region.

In addition to the cascade detector, there are three major components in the proposed one-step framework: the locality filter, the search algorithm, and post-processing. The locality filter centers the detection window on the text object. The search algorithm helps the cascade detector find the text line. The post-processing then cleans up the detection results and merges them into regions. These components guide the cascade detector to the right starting points

and optimize the detection performance during the full image scan. They are discussed in detail in the following sub-sections.

Figure 2.7: One-Step Framework Flow

During the scanning, the corresponding PIF features of the sub-window are calculated using integral images, and their values are sent to the cascade detector to determine whether the sub-window contains text.

A. Locality Filter

Since the PIF features used in the cascade detector are completely position independent, they are insensitive to changes in the location of the text within a window. An example is depicted in Figure 2.8. The cascade detector and PIF features cannot tell the difference between the left and right samples without additional help. To avoid this problem, we define three locality filters

to tackle the horizontal and vertical shifts of text objects in the detection sub-windows. The three locality filters are the central, vertical, and horizontal locality filters. The locality filters serve the following purposes: (1) to center the detection sub-window on the edge-rich region by eliminating sub-windows which do not contain enough edge points in the center region; (2) to reduce the number of detected sub-windows around the same text object; (3) to reject sub-windows which do not lie entirely on the text line by eliminating sub-windows with vertical shifts; (4) to pinpoint the start and end positions of a text line by eliminating sub-windows with left and right horizontal shifts.

Figure 2.8: Example of Locality Shifts

The locality filters act as gatekeepers and can be considered the first stage of the cascade detector. Although the locality filters are not trained using a boosting algorithm, they are empirically verified to meet the same detection-rate criterion, i.e. to maintain a near 100% detection rate.

The evaluation of these filters is based on rectangle features which can be calculated from the integral image of edge points in the original image. Figure 2.9 shows the rectangle features used in the three locality filters. These rectangle features are similar to some of the extended features in [14].

The central locality filter examines the number of edge points in the center region of a sub-window. The area of the center region is set to 81% of the overall area of the sub-window (9/10 in width and height). The sub-window is rejected if the number of edge points in the center region is lower than a predefined threshold. In our implementation, we set the threshold to 60% of the total number of edge points in the sub-window.

The vertical locality filter divides the sub-window into top, middle, and bottom regions, then examines the number of edge points in these regions. The height of each region is set to 10% of the sub-window height. If the number of edge points in any of the regions is zero, the sub-window is rejected. Figure 2.10 shows an example of the vertical locality filter on the letter A.

The horizontal locality filter divides the sub-window into left and right regions, then examines the number of edge points in them. The width of each region is set to 10% of the sub-window width. If the number of edge points in one of the regions is zero while the other is not zero, the sub-window is rejected. The reason is that some letters are very narrow, such as the letter I, while

others are nearly square, such as the letters A and D. If the detection sub-window aligns with the letter, the letter should overlap with both regions (square letters) or with neither region (narrow letters). Figures 2.11, 2.12, and 2.13 show examples of the horizontal locality filter on the letters A, I, and L respectively.

Figure 2.9: Locality-Filter Rectangle Features

Figure 2.10: Vertical Locality-Filter Example of Letter A
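The three locality filters described above can be sketched on top of an integral image of the edge map. This is a minimal illustration under stated assumptions, not the thesis implementation: the function names are hypothetical, the strip and margin sizes are rounded down to whole pixels, and the "middle" strip of the vertical filter is assumed to be vertically centered in the sub-window.

```python
import numpy as np

def integral_image(edge_map):
    """Summed-area table of a binary edge map, with a zero row and column prepended."""
    ii = np.zeros((edge_map.shape[0] + 1, edge_map.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(edge_map.astype(np.int64), axis=0), axis=1)
    return ii

def rect_sum(ii, x, y, w, h):
    """Number of edge points in the rectangle with top-left corner (x, y) and size w x h."""
    return int(ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x])

def passes_locality_filters(ii, x, y, w, h, center_ratio=0.6):
    """Return True if the sub-window survives the central, vertical, and horizontal filters."""
    total = rect_sum(ii, x, y, w, h)
    if total == 0:
        return False  # assumption: a window without edge points is rejected outright

    # Central filter: the center region spans about 9/10 of the width and height
    # and must hold at least 60% of the window's edge points.
    mx, my = max(1, w // 20), max(1, h // 20)
    center = rect_sum(ii, x + mx, y + my, w - 2 * mx, h - 2 * my)
    if center < center_ratio * total:
        return False

    # Vertical filter: top, middle, and bottom strips, each 10% of the height,
    # must all contain edge points.
    sh = max(1, h // 10)
    for sy in (y, y + (h - sh) // 2, y + h - sh):
        if rect_sum(ii, x, sy, w, sh) == 0:
            return False

    # Horizontal filter: left and right strips, each 10% of the width, must be
    # both empty (narrow letter) or both non-empty (square letter).
    sw = max(1, w // 10)
    left = rect_sum(ii, x, y, sw, h)
    right = rect_sum(ii, x + w - sw, y, sw, h)
    return (left == 0) == (right == 0)
```

Each rectangle sum costs four table lookups regardless of the window size, which is why the filters are cheap enough to run before the cascade detector.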

Figure 2.11: Horizontal Locality-Filter Example of Letter A

Figure 2.12: Horizontal Locality-Filter Example of Letter I

B. Search Algorithm

In face detection, since faces have no correlation with each other, a random and exhaustive scan of the entire image is necessary in order to detect all the independent faces in the image. Text objects, on the other hand, appear in an image as horizontal regions and have strong spatial and geometrical correlation with each other along the same line. If a text object is detected at a particular location, its neighboring sub-windows are likely to be text objects of the same height as well. This spatial and geometrical coherence can be used to (1)

verify whether a detected sub-window is indeed text, hence reducing false positives; (2) effectively detect text objects along the same line.

Figure 2.13: Horizontal Locality-Filter Example of Letter L

In this paper, we propose an efficient search algorithm which captures the spatial and geometrical coherence among text objects on the same text line. The search algorithm works as follows:

- For each sub-window size, WxH, we divide the image into 20 bands horizontally. The width of each band is equal to the sub-window width W. Depending on W, each band may or may not overlap with its neighbors.
- For each band, we do a vertical sliding scan. Each vertical scan steps 2 pixels down from the previous one until it reaches the bottom of the image.

- During the vertical sliding scan, if a sub-window is detected as text at band I, its ten neighboring sub-windows to the left and to the right of the detected location are scanned. Each sub-window steps by half of the sub-window width to the left or to the right. If fewer than half of the neighboring sub-windows contain text, the original sub-window is regarded as a false positive and discarded. This step is the text verification step.
- If the sub-window is verified as a text sub-window, we do a horizontal sliding scan starting from band I toward band I-1 and from band I toward band I+1. Each sub-window steps by half of the sub-window width to the left or to the right. The sub-windows detected consecutively are merged into a single text rectangle. This step is the horizontal sliding step.
- The vertical sliding scan and horizontal sliding scan are repeatedly applied to the image until the entire image has been visited.

Figure 2.14 shows the diagram of the search algorithm. The search algorithm makes use of the spatial and geometrical coherence between text objects along the same line. Figure 2.16 shows the raw detection results for the images in Figure 1.1 after the vertical and horizontal sliding scans.
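The verification and horizontal sliding steps can be sketched as follows. This is an illustrative sketch only: `is_text` is a hypothetical callback standing in for the locality filters plus the cascade detector, the ten neighbors are assumed to be split as five on each side, and neighbors that fall outside the image are assumed to be skipped.

```python
from typing import Callable, Optional, Tuple

Rect = Tuple[int, int, int, int]  # x, y, width, height
Detector = Callable[[int, int, int, int], bool]

def verify_text(is_text: Detector, x: int, y: int, w: int, h: int,
                image_width: int, per_side: int = 5) -> bool:
    """Text verification step: keep a hit only if at least half of its neighbors are text."""
    step = max(1, w // 2)
    xs = [x + k * step for k in range(-per_side, per_side + 1) if k != 0]
    xs = [nx for nx in xs if nx >= 0 and nx + w <= image_width]
    if not xs:
        return False
    hits = sum(1 for nx in xs if is_text(nx, y, w, h))
    return 2 * hits >= len(xs)

def trace_text_line(is_text: Detector, x: int, y: int, w: int, h: int,
                    image_width: int) -> Optional[Rect]:
    """Horizontal sliding step: grow a verified hit into one merged text rectangle."""
    if not is_text(x, y, w, h) or not verify_text(is_text, x, y, w, h, image_width):
        return None
    step = max(1, w // 2)
    left = x
    while left - step >= 0 and is_text(left - step, y, w, h):
        left -= step
    right = x
    while right + step + w <= image_width and is_text(right + step, y, w, h):
        right += step
    return (left, y, right + w - left, h)
```

With a detector that fires only inside a single text line, `trace_text_line` returns one rectangle covering the consecutive hits, while an isolated hit with no text neighbors is discarded by the verification step.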

Figure 2.14: Diagram of the search algorithm

C. Post-Processing

Due to the exhaustive scan over the entire image, there are usually multiple text rectangles along the same text line. These rectangles can be neighboring, overlapping, or duplicate. Two rectangles are overlapping if their bounding boxes overlap. Two rectangles are neighboring if they do not overlap but the distance between them is within a threshold. (In our application, we set the threshold to 10% of the width of the shorter rectangle.) A rectangle is a duplicate if the major part of it is covered by another one. (In our application, we set the threshold to 80%.)

In the post-processing step, actions are carried out as follows: (1) overlapping text rectangles along the same line are merged into one single rectangle; (2) duplicate rectangles are removed; (3) single rectangles that are narrow or short are removed.

As in [22], detected rectangles are partitioned into subsets based on their height and centroid location on the Y-axis. For two rectangles in the same subset, if their bounding boxes overlap, they are merged by enlarging one of the rectangles to include the other. This process is repeated until there are no more overlapping text rectangles. After the merging process, each partition usually yields a single text line, sometimes a few. Figure 2.16 shows examples of raw detection results. Figure 2.17 shows the final detection results after post-processing.

Two-Step Framework

The cascade detector can also be applied in the two-step framework. In this case, the cascade detector works as a text-region classifier. There are two advantages of using the cascade detector for classification. First, the cascade detector is relatively fast. Second, the cascade detector can achieve a very high ratio of detection rate to false-positive rate.

To classify a text region, we use a sliding window to scan the entire region. The height of the sub-window is set to the height of the text region and the width of the sub-window is set to 1.2 times the height. Each subsequent sub-window

steps 2 pixels from the previous one. The final decision for the text region is determined by the majority vote of the individual sub-windows.

Figure 2.15: Two-Step Framework Flow

2.3 Experiments

We have designed and conducted experiments to examine the following:

- The effectiveness of the locality filters.
- The comparison between the cascade detector and other learning-based classifiers, particularly Support Vector Machines (SVM).
- The performance comparison between the proposed top-down framework and traditional bottom-up approaches.
- The computation cost comparison between the proposed top-down framework and traditional bottom-up approaches.

Metrics

To evaluate the performance of the proposed embedded-text detector, we defined two types of metrics, pixel-based and region-based. Before defining the metrics, we define three kinds of text regions:

- detected region: a text region reported by the text detector.
- matched region: a detected text region which covers more than 90% and less than 110% of the ground-truth region. In other words, a detected text region matches a ground-truth region if they are relatively similar in position and size.
- refined region: a matched region whose top and height are set to those of the matching ground-truth region. This is done because there is sometimes a small shift between the detected region and the ground truth. The refined region compensates for this shift and better reflects the detection results.

The pixel-based metric is measured as the ratio of the number of correctly detected pixels to the total number of ground-truth pixels. The DR_pixel (Detection Rate) and FP_pixel (False Positive) are defined as follows:

$$DR_{pixel} = \frac{\text{area of refined regions}}{\text{area of ground-truth regions}} \tag{2.1}$$

For the pixel-based metric, the false positive comes from the following two parts: (1) If a detected region is larger than the ground-truth region it matches, the region is considered matched; however, the exceeding part is counted as false positive. (2) If a detected text region does not match any of the ground-truth regions, the entire region is counted as false positive.

$$FP_{pixel} = \frac{\text{area of mismatched regions}}{\text{area of the image}} \tag{2.2}$$

The region-based metric is measured as the ratio of the number of correctly detected regions to the total number of ground-truth regions. The false positive is defined as the ratio of the number of mismatched regions to the number of detected regions. The DR_region (Detection Rate) and FP_region (False Positive) are defined as follows:

$$DR_{region} = \frac{\text{\# of matched regions}}{\text{\# of ground-truth regions}} \tag{2.3}$$

$$FP_{region} = \frac{\text{\# of mismatched regions}}{\text{\# of detected regions}} \tag{2.4}$$
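As an illustration of the region-based metrics in Equations 2.3 and 2.4, the sketch below computes both ratios for axis-aligned boxes. The matching rule is only one possible reading of the 90% to 110% criterion and is an assumption: a detection is taken to match a ground-truth box when their intersection exceeds 90% of the ground-truth area and the detection area stays below 110% of it. The function names are hypothetical, and the detection rate here counts each ground-truth region at most once.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height

def area(b: Box) -> int:
    return b[2] * b[3]

def intersection(a: Box, b: Box) -> int:
    """Area of the overlap between two axis-aligned boxes (0 if disjoint)."""
    w = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    h = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    return max(0, w) * max(0, h)

def matches(det: Box, gt: Box, lo: float = 0.9, hi: float = 1.1) -> bool:
    """Assumed reading of the rule: covers > 90% of the ground truth, area < 110% of it."""
    return intersection(det, gt) > lo * area(gt) and area(det) < hi * area(gt)

def region_metrics(dets: List[Box], gts: List[Box]) -> Tuple[float, float]:
    """Return (DR_region, FP_region) as fractions in [0, 1]."""
    matched = sum(1 for g in gts if any(matches(d, g) for d in dets))
    mismatched = sum(1 for d in dets if not any(matches(d, g) for g in gts))
    dr = matched / len(gts) if gts else 0.0
    fp = mismatched / len(dets) if dets else 0.0
    return dr, fp
```

For example, with two ground-truth regions and two detections, of which one matches, `region_metrics` returns a detection rate of 0.5 and a false-positive ratio of 0.5.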

Experimental Results

Locality-Filter Verification

In this experiment, we verify the performance of the locality filters. To conduct this experiment, we prepared three sets of data: the positive set, the vertical shift set, and the horizontal shift set. They are summarized as follows:

- Positive set: 1,000 positive samples randomly selected from the cascade detector training samples.
- Vertical shift set: 5,000 samples manually generated using the positive set. Each sample in the positive set was used to generate five samples by randomly shifting it vertically by zero to half of the sample height.
- Horizontal shift set: 5,000 samples manually generated using the positive set. Each sample in the positive set was used to generate five samples by randomly shifting it horizontally by zero to half of the sample width.

Table 2.3: Locality-Filter Verification
Columns: Filter, Positive Set, Vertical Shift Set, Horizontal Shift Set
Rows: Central Filter; Vertical Filter; Horizontal Filter; Overall
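The construction of the two shift sets can be sketched as below. This is only an assumed procedure for illustration: the helper name `shifted_boxes` is hypothetical, shifts are drawn uniformly in whole pixels, and the shift direction (up or down, left or right) is assumed to be chosen at random, which the description above does not specify.

```python
import random
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # x, y, width, height

def shifted_boxes(box: Box, axis: str, copies: int = 5,
                  rng: Optional[random.Random] = None) -> List[Box]:
    """Generate shifted copies of a positive sample window.

    Each copy is moved by a random amount between zero and half of the
    sample height (axis="vertical") or width (axis="horizontal").
    """
    rng = rng or random.Random()
    x, y, w, h = box
    out = []
    for _ in range(copies):
        sign = rng.choice((-1, 1))  # assumption: direction is random
        if axis == "vertical":
            out.append((x, y + sign * rng.randint(0, h // 2), w, h))
        elif axis == "horizontal":
            out.append((x + sign * rng.randint(0, w // 2), y, w, h))
        else:
            raise ValueError("axis must be 'vertical' or 'horizontal'")
    return out
```

Applying this helper to each of the 1,000 positive windows with `copies=5` yields 5,000 windows per shift set, matching the set sizes listed above.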

Table 2.3 shows the experimental results. As we can see, the proposed locality filters can maintain a near 99% detection rate while rejecting over 50% of the sub-windows with vertical or horizontal shift. The vertical locality filter is particularly powerful in rejecting sub-windows with vertical shift. This filter can help the cascade detector find the text line more precisely.

Framework Experiments

In this experiment, we compared the performance of different frameworks: image analysis based [6, 12, 13, 15, 23], image analysis + machine learning based [2, 25, 38], image analysis + cascade detector, and the proposed framework. To conduct this experiment, we extracted 2,000 images from a pool of spam emails for training and testing. We manually labeled each image and identified 4,461 text regions. We reserved 3,000 text regions (1,435 images) for training Support Vector Machine (SVM) classifiers. The rest of the images were used for testing. The datasets are summarized as follows:

- Testing dataset: 565 images. In total, 1,461 regions were labeled.
- SVM training positive dataset: 1,435 images. In total, 3,000 regions were labeled.
- SVM training negative dataset: 3,000 regions were randomly cropped from a pool of Corel images.

We implemented two commonly used features to train the SVM classifiers. The first one was the distance map [2] and the second one was the star-like pixel pattern [38]. The distance map is generated by transforming every pixel from its raw value to its distance to the closest edge point. The star-like pixel pattern is extracted directly from the gray level using a star pattern. The SVM program we used for this experiment was SVMTorch [3]. The two implementations are referred to as SVM1 and SVM2 respectively in Table 2.4 and Table 2.5.

Table 2.4: Performance Comparison
Framework                   Pixel-based    Region-based
Proposed                    93.60/         /3.70
Image Analysis              90.68/         /14.50
Image Analysis + SVM1       /              /15.13
Image Analysis + SVM2       /              /15.89
Image Analysis + Cascade    87.76/         /9.89

Table 2.4 shows the comparison results. There are three observations. (1) The proposed embedded-text detection framework clearly outperforms the rest. (2) For the other frameworks, which are based on an image analysis step, the detection rate is bounded by that step. (3) The cascade detector based framework yields a relatively lower false-positive rate than the others.

Computation Cost

In this experiment, we compared the computation cost of different frameworks. The experiment was carried out on a Pentium 4 2.8 GHz Linux machine with 1 GB of memory. Table 2.5 shows the experimental results. As indicated in the table, the proposed embedded-text detection system took only 0.32 seconds per image, while the SVM-based systems required 1.98 seconds (SVM1) and 0.54 seconds (SVM2).

Table 2.5: Computation Cost Comparison
Framework                   Step1    Step2    Overall
Proposed                    na       na       0.32s
Image Analysis              0.11s    na       0.11s
Image Analysis + SVM1       0.11s    1.87s    1.98s
Image Analysis + SVM2       0.11s    0.43s    0.54s
Image Analysis + Cascade    0.11s    0.26s    0.37s

Comparing the SVMs and the cascade alone (Step2), the cascade approach is clearly more efficient than the SVM-based ones. The star-like pixel pattern in SVM2 is based on raw pixels, so its cost is close to that of the cascade; however, the cascade is still faster because it generally requires less computation than an SVM.

Memory Consumption

The memory consumption of the embedded-text detector is higher than that of the face detector. In the face detection problem, rectangle features can be calculated from a single integral image. In the proposed text detector, multiple integral images have to be built. To reduce the memory consumption, the image can be divided into sub-regions horizontally or vertically. For example, if the image is divided into four regions, the memory consumption can be reduced to 1/4 of the original without much computation overhead.

Figure 2.16: Embedded-Text Detection Raw Results

Figure 2.17: Embedded-Text Detection Final Results

Chapter 3 Visual-Based Anti-Spam Filtering

With the increasing importance of email and the incursions of Internet marketers, unsolicited commercial email (also known as spam) has become a major problem on the Internet. Junk email has been recognized as a problem since 1975 [27]. It was not a serious concern until marketers began to flood the system, overtaxing the resources of Internet Service Providers (ISPs).

Since the late 90s, several anti-spam filtering solutions have been proposed. In general, these approaches treat the spam filtering problem as a text classification or categorization problem, employing various machine learning techniques to solve it. In [31], the authors proposed using a Naive Bayesian classifier to filter junk emails. Some researchers have also suggested using Support Vector Machines (SVM) [33] and decision trees

[34] for this task. These text-based approaches have achieved remarkable accuracy in filtering spam emails.

However, there are two major limitations to these text-based approaches. First, spammers often use various tricks to confuse text-based anti-spam filters [28]. Examples of these tricks are text obfuscation, random space or word insertion, HTML layout, and text embedded in images. Second, as the scale and capacity of the Internet continue to grow, the type of information in emails has become more diverse. The genre of email content has moved from text-only to multimedia-enriched. These limitations greatly reduce the effectiveness of existing text-based anti-spam filters.

The key issue behind these challenges is that the type of content in emails has switched from text-based to visual-based. On one hand, legitimate message senders have started to add more multimedia content, particularly images, to text-only emails to enrich the message. On the other hand, spammers mask spam emails with content that text filters cannot read, such as images, and employ HTML-based tricks to confound text-based anti-spam filters. The raw content of these spam emails may not make any sense, but the key messages are rendered visible when recipients open them.

Since visual information is becoming more prevalent in emails, it becomes increasingly necessary to use such information to achieve high accuracy in anti-spam filtering. Our research team has investigated ways of using visual information, particularly images, in anti-spam filtering. We studied the spam emails to analyze the characteristics of the visual information in spam. One noticeable characteristic is the types of images used in spam emails. These images are usually artificially generated and contain embedded text (i.e. text boxes embedded into image files).

In this work, we propose a novel anti-spam filter, which classifies multimedia-enriched emails based on their visual information. Specifically, the proposed anti-spam filter extracts image features and uses a one-class SVM classifier to decide whether an unseen email is in the spam category. Experimental results show that, for emails containing image data, the proposed anti-spam filter can achieve a detection rate of 80% or more with less than 1% false positives. In addition, the proposed anti-spam filter can work with existing text-based filters. In comparison with the Bayesian text-based filter used in Thunderbird (a mail client of Mozilla [29]), our proposed filter can improve the spam detection rate from 47.7% to 83.9% for the validation set derived from the SpamArchive dataset [30].

The main contributions of this paper are summarized as follows:

- We introduce the use of visual information for anti-spam filtering. By thoroughly analyzing a large collection of spam emails, we demonstrate that

useful features and parameters can be derived from images in spam emails for the purpose of anti-spam filtering.
- Based on such visual features and parameters, we propose a novel visual-based anti-spam filter. The proposed filter, used in conjunction with existing text-based filters, can improve the filtering accuracy.
- We have successfully integrated the proposed anti-spam filter with Thunderbird and demonstrated very promising results.

The rest of this chapter is organized as follows: In Section 3.1, we present the statistics of some visual parameters from a thorough analysis of more than 120K spam emails randomly downloaded from SpamArchive [30]. In Section 3.2, we present the proposed filtering system in detail. Then, in Section 3.3, we present experimental results to show the effectiveness of the proposed anti-spam filter.

3.1 The Spam Datasets and Analysis

We prepared the following datasets for analysis and experiments:

- SpamArchive dataset: 122,877 spam emails.
- Ling-Spam dataset: 2,412 legitimate emails from the Linguist mailing list and 481 spam emails [32].

The SpamArchive dataset was randomly downloaded from the SpamArchive website. The total number of downloaded and processed spam messages is 122,877. The Ling-Spam dataset is a public anti-spam filtering corpus [32]. For this work, we used the Ling-Spam dataset to train the Bayesian anti-spam filter in Thunderbird. Using these two datasets and the trained Bayesian anti-spam filter, we intended to show experimentally the limitations of text-based anti-spam filters, and also to demonstrate the additional power they can gain with our visual-based anti-spam filter.

In the SpamArchive dataset, 37.76% (46,395/122,877) of the emails contain images. Among those emails containing images, only 43.72% (20,283/46,395) contain accessible images. Many emails did not have images explicitly attached but provided links to the images. A fraction of these links were no longer accessible at the time we processed them. We further analyzed these 20,283 spam emails. The statistics are shown in Table 3.1.

Table 3.1: Statistics of emails containing accessible images
Type                              no. of emails    Percentage
w/ image with embedded-text       16,              %
w/ banner or graphics             19,              %
w/ external image                 19,              %
blocked by Bayesian filter        9,               %

Figure 3.1: Example of a spam email with text embedded in an image

The results shown in Table 3.1 clearly indicate that if a spam email contains images, the images are likely to be artificially generated and external (i.e. not explicitly attached to the email). These images may be banners/graphics or contain embedded text boxes. Figure 3.1 shows an example of a spam email which contains only an embedded-text message and no traditional text. The message embedded in the spam email is clearly visible when rendered by email clients even though the email contains no text at all. Figure 3.2 shows an example of a computer-generated image with embedded-text regions. In the following section, we introduce the ideas behind our visual-based anti-spam filter.

3.2 The Anti-Spam Filter

For each email, we extract a set of features from the images contained in the email. The set of features is then used for classification, where a one-class Support Vector Machine (SVM) is used as the base classifier. In the following subsections, we discuss the two major components, the features and the classifier, respectively.

Feature Description

Based on the observations summarized in Section 3.1, we define three sets of features as follows:

- Embedded-text features
- Banner and graphic features
- Image location features

More and more frequently, spam emails embed text messages in images to get around text-based anti-spam filters. To detect such devious techniques, it is helpful to know (1) whether there is embedded text in the images and (2) if so, the area of the text regions relative to the total image area. To derive such information, we have developed a text-in-image detector which is capable of detecting the text region(s) in an image. The details of the detector will be described later. We use

this text-in-image detector to scan through each image in the email and derive the following embedded-text features: (1) the total number of text regions detected in all images in the email, (2) the percentage of images with detected embedded-text regions, and (3) the ratio of the pixel count of the detected text regions to that of the overall image area. Figure 3.3 shows an example of an image with identified text regions.

Many of the images in spam emails are banners and computer-generated graphics which are part of advertisements. We have developed a banner detector and a graphics detector. Banner images are usually very narrow in width or height, and therefore have a large aspect ratio vertically or horizontally. Graphic images, on the other hand, usually contain a homogeneous background and very little texture. Using these detectors, we can extract the following banner and graphic features: (1) the ratio of the number of banner images to the total number of images, and (2) the ratio of the number of graphic images to the total.

Spammers usually put the images behind web servers and create references in the emails to save server and network resources. This is in contrast to personal emails, where images are usually attached to the emails. We define the image location feature to be the ratio of the number of external images to the total number of images in the email.
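The six features defined above can be assembled into one vector per email, as in the following sketch. It assumes the per-image detector outputs are already available; the `ImageInfo` record and the `email_features` function are hypothetical names introduced for illustration, and the text-pixel ratio assumes that detected text regions do not overlap.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class ImageInfo:
    """Per-image detector outputs (hypothetical container)."""
    width: int
    height: int
    text_regions: List[Tuple[int, int, int, int]] = field(default_factory=list)  # x, y, w, h
    is_banner: bool = False
    is_graphic: bool = False
    is_external: bool = False

def email_features(images: List[ImageInfo]) -> List[float]:
    """Return [text regions, text-image ratio, text-pixel ratio,
    banner ratio, graphic ratio, external ratio] for one email."""
    n = len(images)
    if n == 0:
        return [0.0] * 6
    total_regions = sum(len(im.text_regions) for im in images)
    with_text = sum(1 for im in images if im.text_regions)
    # Assumes text regions do not overlap, so their areas can simply be added.
    text_pixels = sum(w * h for im in images for (_, _, w, h) in im.text_regions)
    image_pixels = sum(im.width * im.height for im in images)
    return [
        float(total_regions),
        with_text / n,
        text_pixels / image_pixels if image_pixels else 0.0,
        sum(im.is_banner for im in images) / n,
        sum(im.is_graphic for im in images) / n,
        sum(im.is_external for im in images) / n,
    ]
```

Keeping the features as ratios makes the vector independent of how many images an email contains, which suits a single classifier applied to all emails.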

Feature Extraction

Banner and Graphic Feature Extraction

Since banner images carry certain geometric characteristics, they can be detected by using a very simple rule-based detector. In extracting banner features, we first use the rule-based detector to check the size and aspect ratio of images; then, we calculate the corresponding features according to the detected number of banner images.

Because computer-generated graphics usually contain homogeneous color patterns, they contain almost no texture at fine resolution. To extract graphics features, we first apply a wavelet transformation to the input images. Then, we extract texture features in three orientations (vertical, horizontal, and diagonal) at fine resolution. If any of these extracted texture features falls below a predefined threshold, the image is likely to be a computer-generated graphic. We calculate the graphic features based on the detected number of graphic images.

Embedded-Text Feature Extraction

Several text-detection and text-recognition methods have been proposed in the past. Previous text-detection methods often followed a multi-step framework using a combination of image analysis and machine learning techniques. While a multi-step framework greatly simplifies the problem by dividing the problem

into several sub-tasks, the overall detection accuracy is the product of the accuracies of every step and is bounded by the lowest one. Since web and email images are diverse, previous text-detection methods could not work well if any of the steps yielded a low detection rate.

We have developed a unified embedded-text detector particularly for web and email images. Based on Viola and Jones' object detection framework [22], we define position-independent features to capture the essence of characters and a smart-scan algorithm to trace text lines using their spatial and geometrical regularities. The embedded-text detector can be applied at every location and at every size of the image.

The heart of the embedded-text detector is the boosted cascade detector. The cascade detector is a degenerate decision tree which acts as a rejection-based classifier. In each stage of the cascade detector, a classifier is trained using BrownBoosting [8], which is a variant of AdaBoosting [7], to reject as many negative samples as possible while keeping almost all the positive samples. As a result, the cascade detector can yield a very high ratio of detection rate to false-positive rate. Figure 3.2 shows an example of an image with embedded text. Figure 3.3 shows the detection result of our embedded-text detector. We calculate the embedded-text features based on the detected text regions, such as the number and area of the regions.
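The rejection-based behavior of the cascade can be illustrated with a short sketch. The stage representation (a scoring function paired with a threshold) and both function names are assumptions made for illustration, not the trained BrownBoosting classifiers. The second helper shows how per-stage rates compound, under the simplifying assumption that stages act independently.

```python
from typing import Callable, Sequence, Tuple

Stage = Tuple[Callable[[Sequence[float]], float], float]  # (score function, threshold)

def cascade_accepts(stages: Sequence[Stage], features: Sequence[float]) -> bool:
    """A sample is accepted only if every stage accepts it; the first
    failing stage rejects it and the remaining stages are never evaluated."""
    for score, threshold in stages:
        if score(features) < threshold:
            return False
    return True

def compounded_rate(per_stage_rate: float, n_stages: int) -> float:
    """Overall rate when every stage passes the same fraction of samples,
    assuming independent stages."""
    return per_stage_rate ** n_stages
```

With the per-stage targets used in Chapter 2 (0.97 detection rate, 0.60 false-positive rate, 18 stages), `compounded_rate` gives roughly 0.58 for detection and about 1e-4 for false positives. These are bounds implied by the targets alone; the trained stages evidently exceed the detection target, since the reported final detection rate is around 80%.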

Figure 3.2: Example of text embedded in an image

The Classifier

In previous approaches, the anti-spam filtering problem has typically been treated as a two-class or multiple-class classification problem. In the two-class case, researchers were trying to determine whether an unseen email was spam, whereas in the multiple-class case, the unseen emails were divided into several categories (such as commercial, financial, objectionable, health, spiritual, etc.). One difficulty with two-class and multiple-class classification is the need for multiple sets of training samples. For example, in the Naive Bayesian approach,

67 Chapter 3. Visual-Based Anti-Spam Filtering Figure 3.3: Detected embedded-text regions one set of spam s and one set of legitimate s are required to train the classifier. While spam datasets are easily accessible, a representative set of legitimate s is difficult to collect. In our anti-spam filter, we define the antispam filtering problem as a task of finding whether an unseen is the spam class. We propose using the one-class SVM [37] as the base classifier. One-Class SVM The basic model of SVM[36] is a maximal margin classifier. Given a positive and a negative dataset, the SVM classifier maps the data from the input space to 59


More information

Mathematics. What to expect Resources Study Strategies Helpful Preparation Tips Problem Solving Strategies and Hints Test taking strategies

Mathematics. What to expect Resources Study Strategies Helpful Preparation Tips Problem Solving Strategies and Hints Test taking strategies Mathematics Before reading this section, make sure you have read the appropriate description of the mathematics section test (computerized or paper) to understand what is expected of you in the mathematics

More information

How to Recognize Captions in Video?

How to Recognize Captions in Video? Caption Analysis and Recognition For Building Video Indexing Systems Fu Chang, Guey-Ching Chen, Chin-Chin Lin and Wen-Hsiung Lin Institute of Information Science, Academia Sinica Dept. of Electrical Engineering,

More information

SOURCE SCANNER IDENTIFICATION FOR SCANNED DOCUMENTS. Nitin Khanna and Edward J. Delp

SOURCE SCANNER IDENTIFICATION FOR SCANNED DOCUMENTS. Nitin Khanna and Edward J. Delp SOURCE SCANNER IDENTIFICATION FOR SCANNED DOCUMENTS Nitin Khanna and Edward J. Delp Video and Image Processing Laboratory School of Electrical and Computer Engineering Purdue University West Lafayette,

More information

3D Scanner using Line Laser. 1. Introduction. 2. Theory

3D Scanner using Line Laser. 1. Introduction. 2. Theory . Introduction 3D Scanner using Line Laser Di Lu Electrical, Computer, and Systems Engineering Rensselaer Polytechnic Institute The goal of 3D reconstruction is to recover the 3D properties of a geometric

More information

Knowledge Discovery and Data Mining

Knowledge Discovery and Data Mining Knowledge Discovery and Data Mining Unit # 6 Sajjad Haider Fall 2014 1 Evaluating the Accuracy of a Classifier Holdout, random subsampling, crossvalidation, and the bootstrap are common techniques for

More information