Fusion of Text and Image Features: A New Approach to Image Spam Filtering
Congfu Xu¹, Kevin Chiew², Yafang Chen¹, and Juxin Liu¹
¹ Institute of Artificial Intelligence, Zhejiang University, Hangzhou, China
² School of Engineering, Tan Tao University, Long An, Vietnam

Abstract. While enjoying the convenience of e-mail communication, many users also suffer from annoying spam. Although current spam detection approaches have gained a competitive edge against text-based spam, they are still challenged by image-based spam (image spam for short). Image spam typically embeds the spam message in an image, in binary rather than text format, and therefore consumes more storage and bandwidth. In this paper, we propose a hybrid image spam filtering framework that detects spam images using both extracted text features and image features. Our experimental results show that the approach achieves a significant improvement in detection accuracy compared with methods that use only text features or only image features, and that it works robustly on images with complex backgrounds or compression artifacts.

1 Introduction

One of the most pervasive applications of the Internet today is the e-mail service, which has brought great convenience to our communications. While enjoying this service, users also face a large volume of annoying spam. E-mail spam, whose volume has reportedly grown tremendously in the past few years, has also degraded the quality of the e-mail service, partly because spam consumes storage and communication bandwidth. Moreover, a recent news report¹ cites research finding that spam produces millions of tons of CO₂ globally every year.

Many solutions have been proposed for detecting and filtering spam e-mails so that they are not received, forwarded, or spread. The basic technique behind these solutions is to train classifiers that distinguish spam images from ham (i.e., legitimate) images.
These classifiers normally use two types of rules: (a) rules based on the connection and relay properties of e-mails, and (b) rules based on features extracted from the contents of e-mails. Rules of the second type, which filter content with machine learning methods such as naive Bayes classification or support vector machines (SVMs), have been a cornerstone of anti-spam systems [16] and have shown high accuracy.

¹ See

Y. Wang and T. Li (Eds.): Practical Applications of Intelligent Systems, AISC 124, pp springerlink.com © Springer-Verlag Berlin Heidelberg 2011

However, a new attack has emerged that could be devastating to content filters. Instead of obscuring the message's text, spammers are now able to defeat text analysis techniques by replacing text with images. A whitepaper released in November 2006 [17] reports that image spam rose from 10% of all spam in April 2006 to 27% in October 2006, totaling up to 48 billion e-mails every day.

Fig. 1. Examples of spam images (note the large amount of text and the use of text obfuscation techniques against OCR)

A possible way to detect image spam is a pipeline consisting of an optical character recognition (OCR) system, which extracts and recognizes the embedded text, followed by a text classifier that separates spam from legitimate content. This approach has been found effective for clean images [8]. However, image spam allows spammers to design spam as CAPTCHAs (see the right part of Figure 1) or to obscure the image text in order to defeat OCR tools. Thus, if an image spam filter relies on an OCR-based module as its only countermeasure, it is vulnerable to image spam with obfuscated text.

In this paper, we propose a solution for image spam filtering. Since most spam images contain a large proportion of text, as shown in Figure 1, our solution first extracts the text information embedded in the images, together with image information that reflects the unique properties [14] of spam images compared with natural scene images or generic computer-generated graphics. We then use a combined filter with a two-layer structure for training and classification: the bottom-layer classifiers compute image spam confidence scores from the two types of features, and a top-layer classifier makes the final decision from the outputs of the bottom-layer classifiers.

The remainder of the paper is organized as follows. Section 2 reviews related work on content-based image spam filtering, and Section 3 introduces our image spam filtering framework in detail.
In Section 4, we report experimental results on real data sets of ham and spam images, and we conclude the paper in Section 5.

2 Related Work

The detection of image spam is a special case of image categorization. It is treated as a two-class classification task between ham and spam images in [1, 6, 8] and has been studied extensively in the context of many important applications.

In [1], Aradhye et al. used a support vector classifier to extract the text regions in an image, and then identified five visual features of spam. The first feature is the relative area of the image occupied by text, based on the idea that spam images usually contain more text than legitimate images. The other features, such as color heterogeneity and saturation, are computed over text and non-text regions, based on the assumption that images that are mainly synthetic are more likely to be spam.

Building on the method in [1], Dredze et al. [6] proposed using different kinds of features. Although some visual features are used (such as average RGB colors, the relative area occupied by the most common color, and color saturation features as in [1]), the most important role is played by metadata extracted from the images. They also introduced a feature selection algorithm (JIT) that selects the most discriminative features based on their speed as well as their predictive power.

Fumera et al. [8] proposed an anti-spam filtering approach that exploits the text information embedded in images sent as attachments. The approach rests on the observation that text embedded in images plays the same role as text in the body of e-mails without images (i.e., it conveys the spam message). After extracting text with OCR tools from images attached to e-mails, they analyzed the text semantically using text categorization techniques like those applied to the body of e-mails without images.

The method in [4] recognizes image spam by detecting the presence of content obscuring techniques that aim to compromise OCR effectiveness. Its implementation is based on two low-level image features: one measures the extent of character breaking or the presence of small noise components, and the other measures the presence of merged characters or large noise components.

Nhung and Phuong [16] used simple edge-based features to compute a vector of similarity scores between an image and a set of templates. This similarity vector is then fed to an SVM to separate spam images from other common categories of images.
In [11], specific features are selected for inspection by a component-based method, and the spam filter system then uses these features to identify image spam by feature matching.

3 Hybrid Framework for Image Spam Filtering

Since content obscuring techniques can defeat attempts to use OCR tools [8] to detect text embedded in images, we propose an image categorization approach that uses both text and image features to filter such image spam. Figure 2 shows the proposed hybrid framework for image spam filtering.

Fig. 2. Architecture of our hybrid framework for image spam filtering

The framework works in three phases. First, we calculate the text features of an input spam e-mail, which involves keyword detection and text-related feature extraction, and use an SVM to obtain an image spam confidence score. Second, we define a small number of reliable spam-indicative features from the image metadata and image color properties, and use a second SVM to classify the image. Finally, we use a fusion classifier to make the decision based on the outputs of both the text and image classifiers.

An example of a spam image is shown in Figure 3. The image classifier identifies it as a ham image, whereas the text classifier identifies it as a spam image; after the two confidence scores are fused, the image is finally identified as spam.

Fig. 3. An example of a spam image

The functions of the major components are introduced as follows.

3.1 Keyword Detection

Semantic analysis of text embedded in images first requires text extraction by techniques such as OCR, which raises two issues: (a) high computational complexity and (b) susceptibility to content obscuring techniques. For the first issue, the computational cost can be reduced by using a hierarchical architecture for the spam filter, in which text extraction and analysis are carried out only if the preceding, less complex modules cannot reliably decide whether an e-mail is legitimate. Techniques based on image signatures could reduce the cost further. For the second issue, since embedded text extraction is often inaccurate, we use keyword detection to improve classification accuracy. We first define a keyword set composed of thirty words and five phrases. Then, for every image, we compute a feature indicating whether at least one element of the keyword set is detected in the text extracted by an OCR system. OCR on the images attached to e-mails is performed with the demo version of the commercial software ABBYY FineReader 8.0 Professional with default parameter settings.
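The keyword feature described above amounts to a single binary test on the OCR output. The following Python sketch illustrates the idea under stated assumptions: the keyword set is hypothetical (the paper does not list its thirty words and five phrases), and the normalization step and the function names `normalize` and `keyword_feature` are chosen here for illustration rather than taken from the authors' implementation.

```python
import re

# Hypothetical keyword set for illustration only; the paper's actual
# thirty words and five phrases are not given in the text.
KEYWORDS = {"viagra", "pharmacy", "stock", "buy now", "click here"}


def normalize(text):
    """Lowercase the text and collapse every run of non-alphanumeric
    characters into a single space (an assumed cleaning step)."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def keyword_feature(ocr_text, keywords=KEYWORDS):
    """Return 1 if at least one keyword or phrase occurs as whole words
    in the OCR output, otherwise 0."""
    padded = " " + normalize(ocr_text) + " "
    return int(any(" " + normalize(k) + " " in padded for k in keywords))
```

Matching on whole words (by padding with spaces) keeps a word such as "stockholm" from triggering the keyword "stock"; a real filter facing noisy OCR output might prefer approximate matching instead.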
3.2 Text-Related Features Extraction

The text-related features capture the properties of text in an image. The text regions in the image are extracted first, and features are then defined from the extracted text regions. Our method of text region extraction comprises the following three main steps.

Step 1: Edge detection. A convolution with a compass operator [12] generates intensity images of four oriented edges, at 0°, 45°, 90° and 135° respectively. Color images are first converted into gray-scale images.

Step 2: Feature generation. We first subdivide the image into a grid of $w \times h$ equally sized cells $C_{ij}$, where $i = 1, \ldots, w$ and $j = 1, \ldots, h$ (each cell has the same fixed size in pixels in this work), and then compute six features over all cells. These six features, namely mean $\mu$, standard deviation $\sigma$, energy $E_g$, entropy $E_t$, inertial-quadrature $I$, and local homogeneity $H$, are defined by the following Equations (1) to (6) [5, 9]:

$$\mu = \frac{1}{w h} \sum_{i=1}^{w} \sum_{j=1}^{h} E(i,j) \tag{1}$$

$$\sigma = \sqrt{\frac{1}{w h} \sum_{i=1}^{w} \sum_{j=1}^{h} \left[E(i,j) - \mu\right]^2} \tag{2}$$

$$E_g = \sum_{i,j} E^2(i,j) \tag{3}$$

$$E_t = -\sum_{i,j} E(i,j) \log E(i,j) \tag{4}$$

$$I = \sum_{i,j} (i-j)^2 E(i,j) \tag{5}$$

$$H = \sum_{i,j} \frac{1}{1+(i-j)^2} E(i,j) \tag{6}$$

in which $E(i,j)$ is the normalized symmetric gray-level co-occurrence matrix (GLCM) of cell $C_{ij}$ [10].

Step 3: Text region detection. We first apply K-means clustering to the above features to separate text areas from background areas, and then refine the text regions by morphological dilation and erosion. Figure 4 illustrates the process of text region detection.

Based on the extracted text regions, we calculate the following simple features, which are the most indicative of spam images: (1) Extent of text regions, defined as the ratio of the area of the extracted text regions to the total area of the image; (2) Number of text regions; and (3) Number of text letters.
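The six texture features of Step 2 can be illustrated with a small pure-Python sketch. It is not the authors' code: the helper `glcm` builds a co-occurrence matrix for horizontally adjacent pixels only (the offset is an assumption, since the text does not state which one is used), the standard deviation is taken with a square root, and the entropy with a leading minus sign, which are the usual textbook forms.

```python
import math


def glcm(cell, levels):
    """Normalized symmetric gray-level co-occurrence matrix of a cell,
    counting horizontally adjacent pixel pairs (assumed offset)."""
    counts = [[0.0] * levels for _ in range(levels)]
    for row in cell:
        for a, b in zip(row, row[1:]):
            counts[a][b] += 1
            counts[b][a] += 1  # count both directions to make it symmetric
    total = sum(map(sum, counts)) or 1.0  # avoid division by zero
    return [[c / total for c in r] for r in counts]


def glcm_features(E):
    """Mean, standard deviation, energy, entropy, inertia and local
    homogeneity of a normalized co-occurrence matrix E."""
    cells = [(i, j, p) for i, row in enumerate(E) for j, p in enumerate(row)]
    n = len(cells)
    mean = sum(p for _, _, p in cells) / n
    return {
        "mean": mean,
        "std": math.sqrt(sum((p - mean) ** 2 for _, _, p in cells) / n),
        "energy": sum(p * p for _, _, p in cells),
        # zero entries contribute nothing to the entropy
        "entropy": -sum(p * math.log(p) for _, _, p in cells if p > 0),
        "inertia": sum((i - j) ** 2 * p for i, j, p in cells),
        "homogeneity": sum(p / (1 + (i - j) ** 2) for i, j, p in cells),
    }
```

For example, `glcm_features(glcm([[0, 0], [1, 1]], levels=2))` describes a cell whose rows are each uniform, so all co-occurrence mass lies on the diagonal and the inertia is zero. The resulting per-cell feature vectors are what a clustering step such as K-means would then group into text and background.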
Fig. 4. Illustration of the process of text region detection: (a) initial picture, (b) candidate text regions, (c) after erosion operation, (d) after dilation operation, (e) final result, (f) labeled by pane

Text may appear naturally in natural scene images in the form of road signs, building names, company names and the like, and synthetic images may also include text. Nevertheless, the text features defined above are intuitively expected to discriminate between spam and non-spam images. Figure 5 shows the distributions of features 1 and 3, from which we can see that spam images and non-spam images fall in different ranges. For feature 1, more than 40% of ham images lie in the range 0 to 0.1, whereas more than 80% of spam images lie in the range 0.2 to 0.6. For feature 3, ham images are concentrated in the range 0 to 6 and spam images in the range 6 to 60.

Following [3], we also use three features to detect the presence of content obscuring. The idea is to measure the perimetric complexity, a measure used in the literature on the psychophysics of reading, and the aspect ratio (the ratio between width and height). The perimetric complexity is defined as the squared length of the boundary between black and white pixels in the whole image, divided by the black area.
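The definition of perimetric complexity can be made concrete with a short sketch. The version below is one possible discretization, not the procedure of [3]: it assumes a binarized image given as rows of 0/1 values (1 = black), measures the boundary as the number of pixel edges between a black pixel and a white neighbour, and treats everything outside the image as white. The function name is hypothetical.

```python
def perimetric_complexity(img):
    """Squared black/white boundary length divided by the black area.

    img is a list of rows containing 0 (white) or 1 (black). The boundary
    length is counted in pixel edges using 4-connectivity.
    """
    height, width = len(img), len(img[0])

    def is_black(y, x):
        # Pixels outside the image are treated as white (an assumption).
        return 0 <= y < height and 0 <= x < width and img[y][x] == 1

    area = 0
    perimeter = 0
    for y in range(height):
        for x in range(width):
            if img[y][x] != 1:
                continue
            area += 1
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                if not is_black(y + dy, x + dx):
                    perimeter += 1
    return perimeter ** 2 / area if area else 0.0
```

Under this discretization a solid 2 × 2 black square has perimeter 8 and area 4, giving a value of 16, while a 1 × 4 bar of the same area has perimeter 10 and a value of 25. Thin or fragmented strokes therefore score higher than compact blobs of equal area.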
Fig. 5. Feature distributions in all images, for ham images and spam images: (a) distribution of extent of text regions (feature 1), (b) distribution of amount of text letters (feature 3)

3.3 Image Features Extraction

Our first group of image features relies on the following metadata: (1) File format. The file format of an image includes its extension, the actual file format (as identified by metadata), and whether the two match; and (2) Image metadata. We extract 10 features contained in the image metadata: whether the image has comments, bits per pixel, number of bands, progressive flag, sample precision, transparent color, approx high, index value, and logical height and width.

The rest of our image features are based on the following color properties: (1) Color saturation. As defined by Frankel et al. [7], color saturation is quantified as the fraction of the pixels in the image for which the difference max(R, G, B) − min(R, G, B) is greater than a predefined threshold; (2) Color histogram. The color histogram is a compact summary of the image, and legitimate images typically contain a much larger number of colors than spam images. We chose a 6-bit color space, leading to 64 features; and (3) Color moments. The use of color moments is based on the assumption that the distribution of color in an image can be interpreted as a probability distribution. The color distribution of spam images is generally not continuous because they are synthetic. In our study, we use three moments of an image's color distribution, namely mean, standard deviation and skewness. Using the RGB channels and three moments for each channel, we obtain nine features.
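The three groups of color features can be sketched in plain Python as follows. Several details are assumptions made for illustration rather than choices documented in the paper: pixels are given as a list of 8-bit `(R, G, B)` tuples, the saturation threshold of 50 is arbitrary, the 6-bit color space is taken to mean two bits per channel, and skewness is computed as the signed cube root of the third central moment.

```python
import math


def saturation_fraction(pixels, threshold=50):
    """Fraction of pixels whose max(R, G, B) - min(R, G, B) exceeds the
    threshold (the threshold value here is hypothetical)."""
    return sum(1 for p in pixels if max(p) - min(p) > threshold) / len(pixels)


def color_histogram_64(pixels):
    """Normalized 64-bin histogram: the top two bits of each 8-bit
    channel are combined into a 6-bit bin index (assumed quantization)."""
    hist = [0] * 64
    for r, g, b in pixels:
        hist[((r >> 6) << 4) | ((g >> 6) << 2) | (b >> 6)] += 1
    return [count / len(pixels) for count in hist]


def color_moments(pixels):
    """Mean, standard deviation and skewness for each of R, G and B,
    returned as a flat list of nine values."""
    n = len(pixels)
    features = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / n
        third = sum((v - mean) ** 3 for v in values) / n
        # signed cube root keeps the sign of the third central moment
        skewness = math.copysign(abs(third) ** (1 / 3), third)
        features += [mean, math.sqrt(variance), skewness]
    return features
```

Concatenating the outputs gives 1 + 64 + 9 = 74 color features per image; in practice the pixel list would come from an image library rather than being written by hand.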
Figure 6 shows several ham and spam images and Figure 7 shows their color saturation, from which we can see that spam images are generally more saturated than images of natural scenes.

Fig. 6. Three ham images and two spam images: (a) ham image 1, (b) ham image 2, (c) ham image 3, (d) spam image 1, (e) spam image 2

Fig. 7. Color saturation of images in Figure 6

3.4 Bottom-Layer Classifiers

We use SVMs as classifiers because of their significant advantages: excellent generalization ability through the maximum-margin approach, the absence of local minima, and a sparse representation of the solution. The text classifier and the image classifier are each an SVM, operating on text features and image features respectively, and their spam confidence scores serve as the inputs to classifier fusion for the final decision. The kernel trick is another key to the success of SVMs. The polynomial kernel, the radial basis function (RBF) kernel and the sigmoid kernel are three typical kernels. In our study, we adopt LIBSVM² with the RBF kernel, whose corresponding Hilbert space is of infinite dimension, and we use the default parameters.

² The software is available at

In the previous sections, we extracted features and obtained the vector space model (VSM) that represents each image. The text-based vector space includes seven features and the image-based vector space includes 87 features. The text classifier and the image classifier take their respective vectors as inputs to the SVM for training and classification.

3.5 Classifier Fusion

Combining the outputs of multiple tools has been reported to improve information retrieval [13, 15] and classification performance [2, 18]. Our experiments also show that accuracy improves when the results of several classifiers are combined. Furthermore, including the outputs of several types of classifiers reduces the risk posed by any single classifier being compromised. We again use an SVM to fuse the confidence scores of the text and image classifiers. The outputs of the bottom-layer classifiers form a vector for SVM training and classification, defined as $(S_t, S_i)$, in which $S_t$ is the confidence score of the text classifier and $S_i$ is the confidence score of the image classifier. As with the bottom-layer classifiers, LIBSVM with the RBF kernel is used for classifier fusion.

4 Experiment

4.1 Experimental Setup

The experiments are carried out on a corpus of images taken from real e-mails. The corpus is a collection of personal e-mails used in [6], containing 2006 ham images and 3297 spam images. To the best of our knowledge, this is the only corpus of real ham images publicly available to the research community³. For the experiments, the images are first split into two subsets: about 60% are randomly chosen for training the bottom-layer classifiers, and the other 40% for testing. Then, for the fusion stage, about 50% of the images are randomly chosen for training and the other 50% for testing. We repeat this random selection 10 times and average all of the results.
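The repeated random-split protocol just described can be sketched generically. The helper names `random_split` and `average_over_runs`, the fixed seed, and the `evaluate` callback (which would train on the first subset and return a metric on the second) are assumptions for illustration; the text does not specify how the authors drew their splits or which images enter the fusion-stage split.

```python
import random


def random_split(items, train_fraction, rng):
    """Shuffle a copy of items and cut it into (train, test) subsets,
    with roughly train_fraction of the items in the training subset."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def average_over_runs(evaluate, items, train_fraction, runs=10, seed=0):
    """Repeat the random split `runs` times and average the metric that
    evaluate(train, test) returns for each split."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    scores = [
        evaluate(*random_split(items, train_fraction, rng))
        for _ in range(runs)
    ]
    return sum(scores) / len(scores)
```

With this shape, a bottom-layer run would call `average_over_runs(evaluate, images, 0.6)` and a fusion-stage run would use a fraction of 0.5, each with its own `evaluate` function.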
We first downscale the images so that the width and height are no more than 200 pixels. This simple step makes our method robust to random pixels and simple scaling. It also keeps the computation manageable, since image analysis is computationally expensive. We then extract features from all the images in the positive and negative test sets. In our evaluation, accuracy, precision, image spam recall (recall for short) and image non-spam recall (non-spam recall for short) are defined as follows:

$$\text{accuracy} = \frac{\text{\# of all images correctly classified}}{\text{\# of all images}}$$

$$\text{precision} = \frac{\text{\# of spam images correctly classified}}{\text{\# of images classified as spam}}$$

$$\text{recall} = \frac{\text{\# of spam images correctly classified}}{\text{\# of all spam images}}$$

$$\text{non-spam recall} = \frac{\text{\# of non-spam images correctly classified}}{\text{\# of all non-spam images}}$$

³ Available at spam/

All experiments are conducted on a typical PC with a Core 2 Quad Q6600 CPU and 4 GB of memory, running Windows XP.

4.2 Experimental Results

Fig. 8. Experimental results: (a) performance comparison of the different approaches (image classifier, text classifier, fusion classifier with averaging, fusion classifier with SVM) in terms of accuracy, precision, recall and non-spam recall; (b) performance comparison of SA with Bayes-OCR, Huang's approach and our approach in terms of precision and recall

Figure 8(a) shows the detailed experimental results. Compared with the text classifier, the image classifier achieves higher accuracy in classifying common categories of images, whereas the text classifier discriminates spam images better. The fusion classifier with averaging achieves better overall accuracy but shows no improvement in the other measures. The discriminative capability improves greatly when we fuse the confidence scores of the text classifier and the image classifier with an SVM. We therefore conclude that the fusion classifier with an SVM combines the text and image classifiers in a complementary fashion that unites the strengths of both.

To evaluate the performance of our approach, we compare it with SpamAssassin⁴ (SA for short), a public spam filter, in its standard configuration and equipped with the Bayes-OCR plug-in for filtering image spam, and with the existing approach presented in a recent paper [11]. The comparative results are shown in Figure 8(b).
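The four evaluation measures defined in Section 4.1 follow directly from the counts of correct and incorrect decisions. The sketch below assumes labels are encoded as 1 for spam and 0 for ham; the function name `spam_metrics` and the convention of returning 0.0 for an empty denominator are illustrative choices.

```python
def spam_metrics(y_true, y_pred):
    """Accuracy, precision, recall and non-spam recall for binary labels
    (1 = spam, 0 = ham)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # spam caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # ham kept
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # ham flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # spam missed

    def ratio(numerator, denominator):
        # Return 0.0 instead of failing when a class is absent.
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "non_spam_recall": ratio(tn, tn + fp),
    }
```

A filter with very high precision and low recall, as reported for the baseline in Figure 8(b), flags few ham images as spam but lets most spam images through.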
The results of SA with Bayes-OCR serve as our baseline: its precision is very good (almost 100%), while its recall remains low (below 40%). Although our experiment and the approach in [11] do not use the same corpora, Figure 8(b) shows that our approach obtains better results: its precision is high enough to compete with that of SA with Bayes-OCR, while its recall is much improved. We also compare our approach with the existing approach in [6], which uses the same corpus. The average accuracy of our approach is better than that of the approach in [6].

⁴ Available at

For text-based anti-spam filtering experiments, a number of public benchmark data sets are available. For our experiments, however, no other shared corpus of ham images is available; the only other public corpus, SpamArchive⁵, consists of 16,021 spam images. We hope that a larger corpus of real spam and non-spam images will become available in the future, so that a fairer comparison of the above-mentioned approaches can be conducted.

5 Conclusion

In this paper, we have presented a novel hybrid framework that detects spam whose content is embedded in images by fusing classifiers. Given a spam image, our method extracts both text and image features, feeds the corresponding feature vectors into the respective bottom-layer classifiers, and obtains the final decision by fusing the outputs of these classifiers. Our experimental results show that the approach achieves a significant improvement in the accuracy of image spam detection compared with other approaches. In the next stage of study, we will further formalize our framework and approach, and develop an online version of the fusion method that takes the spam filter's handling capacity into account and tests the image model's ability in spam detection.

Acknowledgments. This paper is supported by the 863 Plan project of China (No. 2007AA01Z197) and the Natural Science Foundations of China (No ), and partially supported by the National Basic Research Program of China (No. 2010CB327903). We would like to thank Dr. Mark Dredze, who is now in the Department of Computer Science at the University of Pennsylvania, for making his data set publicly available and for sending us his code for feature extraction.

⁵ SpamArchive was downloadable from SpamArchive.org, which has been shut down. It is now available at image spam/

References

1. Aradhye, H.B., Myers, G.K., Herson, J.A.: Image analysis for efficient categorization of image-based spam e-mail. In: Proceedings of International Conference on Document Analysis and Recognition, pp (August 2005)
2. Bennett, P.N., Dumais, S.T., Horvitz, E.: The combination of text classifiers using reliability indicators. Information Retrieval 8(1), (2005)
3. Biggio, B., Fumera, G., Pillai, I., Roli, F.: Image spam filtering by content obscuring detection. In: Proceedings of the Fourth Conference on Email and Anti-Spam (CEAS 2007), pp. 2 3 (August 2007)
4. Biggio, B., Fumera, G., Pillai, I., Roli, F.: Image spam filtering using visual information. In: Proceedings of the 14th International Conference on Image Analysis and Processing (ICIAP 2007), pp (September 2007)
5. Cheng, H.D., Sun, Y.: A hierarchical approach to color image segmentation using homogeneity 9(12), (2000)
6. Dredze, M., Gevaryahu, R., Elias-Bachrach, A.: Learning fast classifiers for image spam. In: Proceedings of the Fourth Conference on Email and Anti-Spam (CEAS 2007), pp (August 2007)
7. Frankel, C., Swain, M., Athitsos, V.: Webseer: an image search engine for the world wide web. Technical report, University of Chicago (1996)
8. Fumera, G., Pillai, I., Roli, F.: Spam filtering based on the analysis of text information embedded into images. Journal of Machine Learning Research (special issue on Machine Learning in Computer Security) 7, (2006)
9. Gopalan, C., Manjula, D.: Statistical modeling for the detection, localization and extraction of text from heterogeneous textual images using combined feature scheme, (2010)
10. Haralick, R., Shanmugam, K., Dinstein, I.: Textural features for image classification 3(6), (1973)
11. Huang, H., Guo, W., Zhang, Y.: A novel method for image spam filtering. In: Proceedings of the 9th International Conference for Young Computer Scientists (ICYCS 2008), pp (November 2008)
12. Jain, A.K.: Fundamentals of Digital Image Processing. Prentice-Hall, Inc., Upper Saddle River (1989)
13. Lynam, T.R., Buckley, C., Clarke, C.L.A., Cormack, G.V.: A multi-system analysis of document and term selection for blind feedback. In: Proceedings of the 13th ACM Conference on Information and Knowledge Management (CIKM 2004), pp (November 2004)
14. Mehta, B., Nangia, S., Gupta, M., Nejdl, W.: Detecting image spam using visual features and near duplicate detection. In: Proceedings of the 17th International Conference on World Wide Web (WWW 2008), pp (April 2008)
15. Montague, M., Aslam, J.A.: Condorcet fusion for improved retrieval. In: Proceedings of the 11th ACM Conference on Information and Knowledge Management (CIKM 2002), pp (November 2002)
16. Nhung, N.P., Phuong, T.M.: An efficient method for filtering image-based spam. In: Proceedings of 2007 IEEE International Conference on Research, Innovation and Vision for the Future, pp (March 2007)
17. Secure Computing Whitepaper. Image spam: The latest attack on the enterprise inbox. Technical report (November 2006)
18. Zhang, Y.: Using Bayesian priors to combine classifiers for adaptive filtering. In: Proceedings of the 27th Conference on Research and Development in Information Retrieval (SIGIR 2004), pp (July 2004)
Neural Network based Vehicle Classification for Intelligent Traffic Control Saeid Fazli 1, Shahram Mohammadi 2, Morteza Rahmani 3 1,2,3 Electrical Engineering Department, Zanjan University, Zanjan, IRAN
More informationA Two-Pass Statistical Approach for Automatic Personalized Spam Filtering
A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering Khurum Nazir Junejo, Mirza Muhammad Yousaf, and Asim Karim Dept. of Computer Science, Lahore University of Management Sciences
More informationA Novel Approach towards Image Spam Classification
A Novel Approach towards Image Spam Classification M.Soranamageswari, Dr.C.Meena Abstract The volume of unsolicited commercial mails has grown extremely in the past few years because of increased internet
More informationCategorical Data Visualization and Clustering Using Subjective Factors
Categorical Data Visualization and Clustering Using Subjective Factors Chia-Hui Chang and Zhi-Kai Ding Department of Computer Science and Information Engineering, National Central University, Chung-Li,
More informationFeature Subset Selection in E-mail Spam Detection
Feature Subset Selection in E-mail Spam Detection Amir Rajabi Behjat, Universiti Technology MARA, Malaysia IT Security for the Next Generation Asia Pacific & MEA Cup, Hong Kong 14-16 March, 2012 Feature
More informationIdentifying Image Spam based on Header and File Properties using C4.5 Decision Trees and Support Vector Machine Learning
Identifying Image Spam based on Header and File Properties using C4.5 Decision Trees and Support Vector Machine Learning Sven Krasser, Yuchun Tang, Jeremy Gould, Dmitri Alperovitch, Paul Judge Abstract
More informationBayesian Spam Filtering
Bayesian Spam Filtering Ahmed Obied Department of Computer Science University of Calgary amaobied@ucalgary.ca http://www.cpsc.ucalgary.ca/~amaobied Abstract. With the enormous amount of spam messages propagating
More informationPSSF: A Novel Statistical Approach for Personalized Service-side Spam Filtering
2007 IEEE/WIC/ACM International Conference on Web Intelligence PSSF: A Novel Statistical Approach for Personalized Service-side Spam Filtering Khurum Nazir Juneo Dept. of Computer Science Lahore University
More informationSpam detection with data mining method:
Spam detection with data mining method: Ensemble learning with multiple SVM based classifiers to optimize generalization ability of email spam classification Keywords: ensemble learning, SVM classifier,
More informationCombining Global and Personal Anti-Spam Filtering
Combining Global and Personal Anti-Spam Filtering Richard Segal IBM Research Hawthorne, NY 10532 Abstract Many of the first successful applications of statistical learning to anti-spam filtering were personalized
More informationsiftservice.com - Turning a Computer Vision algorithm into a World Wide Web Service
siftservice.com - Turning a Computer Vision algorithm into a World Wide Web Service Ahmad Pahlavan Tafti 1, Hamid Hassannia 2, and Zeyun Yu 1 1 Department of Computer Science, University of Wisconsin -Milwaukee,
More informationEmail Spam Detection A Machine Learning Approach
Email Spam Detection A Machine Learning Approach Ge Song, Lauren Steimle ABSTRACT Machine learning is a branch of artificial intelligence concerned with the creation and study of systems that can learn
More informationTerm extraction for user profiling: evaluation by the user
Term extraction for user profiling: evaluation by the user Suzan Verberne 1, Maya Sappelli 1,2, Wessel Kraaij 1,2 1 Institute for Computing and Information Sciences, Radboud University Nijmegen 2 TNO,
More informationArtificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing and Developing E-mail Classifier
International Journal of Recent Technology and Engineering (IJRTE) ISSN: 2277-3878, Volume-1, Issue-6, January 2013 Artificial Neural Network, Decision Tree and Statistical Techniques Applied for Designing
More informationHow To Create A Text Classification System For Spam Filtering
Term Discrimination Based Robust Text Classification with Application to Email Spam Filtering PhD Thesis Khurum Nazir Junejo 2004-03-0018 Advisor: Dr. Asim Karim Department of Computer Science Syed Babar
More informationSpam Filtering Based on Latent Semantic Indexing
Spam Filtering Based on Latent Semantic Indexing Wilfried N. Gansterer Andreas G. K. Janecek Robert Neumayer Abstract In this paper, a study on the classification performance of a vector space model (VSM)
More informationCosdes: A Collaborative Spam Detection System with a Novel E- Mail Abstraction Scheme
IOSR Journal of Engineering (IOSRJEN) e-issn: 2250-3021, p-issn: 2278-8719, Volume 2, Issue 9 (September 2012), PP 55-60 Cosdes: A Collaborative Spam Detection System with a Novel E- Mail Abstraction Scheme
More informationA Content based Spam Filtering Using Optical Back Propagation Technique
A Content based Spam Filtering Using Optical Back Propagation Technique Sarab M. Hameed 1, Noor Alhuda J. Mohammed 2 Department of Computer Science, College of Science, University of Baghdad - Iraq ABSTRACT
More informationRecognition Method for Handwritten Digits Based on Improved Chain Code Histogram Feature
3rd International Conference on Multimedia Technology ICMT 2013) Recognition Method for Handwritten Digits Based on Improved Chain Code Histogram Feature Qian You, Xichang Wang, Huaying Zhang, Zhen Sun
More informationSemantic Video Annotation by Mining Association Patterns from Visual and Speech Features
Semantic Video Annotation by Mining Association Patterns from and Speech Features Vincent. S. Tseng, Ja-Hwung Su, Jhih-Hong Huang and Chih-Jen Chen Department of Computer Science and Information Engineering
More informationA Dynamic Approach to Extract Texts and Captions from Videos
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,
More informationA MACHINE LEARNING APPROACH TO FILTER UNWANTED MESSAGES FROM ONLINE SOCIAL NETWORKS
A MACHINE LEARNING APPROACH TO FILTER UNWANTED MESSAGES FROM ONLINE SOCIAL NETWORKS Charanma.P 1, P. Ganesh Kumar 2, 1 PG Scholar, 2 Assistant Professor,Department of Information Technology, Anna University
More informationA Method of Caption Detection in News Video
3rd International Conference on Multimedia Technology(ICMT 3) A Method of Caption Detection in News Video He HUANG, Ping SHI Abstract. News video is one of the most important media for people to get information.
More informationMining the Software Change Repository of a Legacy Telephony System
Mining the Software Change Repository of a Legacy Telephony System Jelber Sayyad Shirabad, Timothy C. Lethbridge, Stan Matwin School of Information Technology and Engineering University of Ottawa, Ottawa,
More informationMultiscale Object-Based Classification of Satellite Images Merging Multispectral Information with Panchromatic Textural Features
Remote Sensing and Geoinformation Lena Halounová, Editor not only for Scientific Cooperation EARSeL, 2011 Multiscale Object-Based Classification of Satellite Images Merging Multispectral Information with
More informationPredicting the Risk of Heart Attacks using Neural Network and Decision Tree
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
More informationA Personalized Spam Filtering Approach Utilizing Two Separately Trained Filters
2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology A Personalized Spam Filtering Approach Utilizing Two Separately Trained Filters Wei-Lun Teng, Wei-Chung Teng
More informationMachine Learning Final Project Spam Email Filtering
Machine Learning Final Project Spam Email Filtering March 2013 Shahar Yifrah Guy Lev Table of Content 1. OVERVIEW... 3 2. DATASET... 3 2.1 SOURCE... 3 2.2 CREATION OF TRAINING AND TEST SETS... 4 2.3 FEATURE
More informationRepresentation of Electronic Mail Filtering Profiles: A User Study
Representation of Electronic Mail Filtering Profiles: A User Study Michael J. Pazzani Department of Information and Computer Science University of California, Irvine Irvine, CA 92697 +1 949 824 5888 pazzani@ics.uci.edu
More informationSocial Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
More informationComparing the Results of Support Vector Machines with Traditional Data Mining Algorithms
Comparing the Results of Support Vector Machines with Traditional Data Mining Algorithms Scott Pion and Lutz Hamel Abstract This paper presents the results of a series of analyses performed on direct mail
More informationInternational Journal of Computer Science Trends and Technology (IJCST) Volume 3 Issue 3, May-June 2015
RESEARCH ARTICLE OPEN ACCESS Data Mining Technology for Efficient Network Security Management Ankit Naik [1], S.W. Ahmad [2] Student [1], Assistant Professor [2] Department of Computer Science and Engineering
More informationDetecting Image Spam using Visual Features and Near Duplicate Detection
Detecting Image Spam using Visual Features and Near Duplicate Detection Bhaskar Mehta Google Inc. Brandschenkestr 110 Zurich, Switzerland bmehta@google.com Saurabh Nangia* IIT Guwahati Guwahati 781039
More informationImage Classification for Dogs and Cats
Image Classification for Dogs and Cats Bang Liu, Yan Liu Department of Electrical and Computer Engineering {bang3,yan10}@ualberta.ca Kai Zhou Department of Computing Science kzhou3@ualberta.ca Abstract
More informationAn Algorithm for Classification of Five Types of Defects on Bare Printed Circuit Board
IJCSES International Journal of Computer Sciences and Engineering Systems, Vol. 5, No. 3, July 2011 CSES International 2011 ISSN 0973-4406 An Algorithm for Classification of Five Types of Defects on Bare
More informationTracking and Recognition in Sports Videos
Tracking and Recognition in Sports Videos Mustafa Teke a, Masoud Sattari b a Graduate School of Informatics, Middle East Technical University, Ankara, Turkey mustafa.teke@gmail.com b Department of Computer
More informationDetecting Spam Bots in Online Social Networking Sites: A Machine Learning Approach
Detecting Spam Bots in Online Social Networking Sites: A Machine Learning Approach Alex Hai Wang College of Information Sciences and Technology, The Pennsylvania State University, Dunmore, PA 18512, USA
More informationKeywords Phishing Attack, phishing Email, Fraud, Identity Theft
Volume 3, Issue 7, July 2013 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Detection Phishing
More informationSearch Taxonomy. Web Search. Search Engine Optimization. Information Retrieval
Information Retrieval INFO 4300 / CS 4300! Retrieval models Older models» Boolean retrieval» Vector Space model Probabilistic Models» BM25» Language models Web search» Learning to Rank Search Taxonomy!
More informationSpeed Performance Improvement of Vehicle Blob Tracking System
Speed Performance Improvement of Vehicle Blob Tracking System Sung Chun Lee and Ram Nevatia University of Southern California, Los Angeles, CA 90089, USA sungchun@usc.edu, nevatia@usc.edu Abstract. A speed
More informationImage Spam: The Email Epidemic of 2006
S e c u r i t y T r e n d s Overview Image Spam: The Email Epidemic of 2006 S E C U R I T Y T R E N D S O v e r v i e w End-users around the world are reporting an increase in spam. Much of this increase
More informationE-commerce Transaction Anomaly Classification
E-commerce Transaction Anomaly Classification Minyong Lee minyong@stanford.edu Seunghee Ham sham12@stanford.edu Qiyi Jiang qjiang@stanford.edu I. INTRODUCTION Due to the increasing popularity of e-commerce
More informationAzure Machine Learning, SQL Data Mining and R
Azure Machine Learning, SQL Data Mining and R Day-by-day Agenda Prerequisites No formal prerequisites. Basic knowledge of SQL Server Data Tools, Excel and any analytical experience helps. Best of all:
More informationSpam Filtering using Naïve Bayesian Classification
Spam Filtering using Naïve Bayesian Classification Presented by: Samer Younes Outline What is spam anyway? Some statistics Why is Spam a Problem Major Techniques for Classifying Spam Transport Level Filtering
More informationAdaption of Statistical Email Filtering Techniques
Adaption of Statistical Email Filtering Techniques David Kohlbrenner IT.com Thomas Jefferson High School for Science and Technology January 25, 2007 Abstract With the rise of the levels of spam, new techniques
More informationHoodwinking Spam Email Filters
Proceedings of the 2007 WSEAS International Conference on Computer Engineering and Applications, Gold Coast, Australia, January 17-19, 2007 533 Hoodwinking Spam Email Filters WANLI MA, DAT TRAN, DHARMENDRA
More informationResearch of Postal Data mining system based on big data
3rd International Conference on Mechatronics, Robotics and Automation (ICMRA 2015) Research of Postal Data mining system based on big data Xia Hu 1, Yanfeng Jin 1, Fan Wang 1 1 Shi Jiazhuang Post & Telecommunication
More informationDomain Classification of Technical Terms Using the Web
Systems and Computers in Japan, Vol. 38, No. 14, 2007 Translated from Denshi Joho Tsushin Gakkai Ronbunshi, Vol. J89-D, No. 11, November 2006, pp. 2470 2482 Domain Classification of Technical Terms Using
More informationMultimodal Biometric Recognition Security System
Multimodal Biometric Recognition Security System Anju.M.I, G.Sheeba, G.Sivakami, Monica.J, Savithri.M Department of ECE, New Prince Shri Bhavani College of Engg. & Tech., Chennai, India ABSTRACT: Security
More informationFlorida International University - University of Miami TRECVID 2014
Florida International University - University of Miami TRECVID 2014 Miguel Gavidia 3, Tarek Sayed 1, Yilin Yan 1, Quisha Zhu 1, Mei-Ling Shyu 1, Shu-Ching Chen 2, Hsin-Yu Ha 2, Ming Ma 1, Winnie Chen 4,
More informationHow To Train A Classifier With Active Learning In Spam Filtering
Online Active Learning Methods for Fast Label-Efficient Spam Filtering D. Sculley Department of Computer Science Tufts University, Medford, MA USA dsculley@cs.tufts.edu ABSTRACT Active learning methods
More informationPersonalized Spam Filtering for Gray Mail
Personalized Spam Filtering for Gray Mail Ming-wei Chang Computer Science Dept. University of Illinois Urbana, IL, USA mchang21@uiuc.edu Wen-tau Yih Microsoft Research One Microsoft Way Redmond, WA, USA
More informationInvestigation of Support Vector Machines for Email Classification
Investigation of Support Vector Machines for Email Classification by Andrew Farrugia Thesis Submitted by Andrew Farrugia in partial fulfillment of the Requirements for the Degree of Bachelor of Software
More informationData Mining - Evaluation of Classifiers
Data Mining - Evaluation of Classifiers Lecturer: JERZY STEFANOWSKI Institute of Computing Sciences Poznan University of Technology Poznan, Poland Lecture 4 SE Master Course 2008/2009 revised for 2010
More informationAnti-Spam Filter Based on Naïve Bayes, SVM, and KNN model
AI TERM PROJECT GROUP 14 1 Anti-Spam Filter Based on,, and model Yun-Nung Chen, Che-An Lu, Chao-Yu Huang Abstract spam email filters are a well-known and powerful type of filters. We construct different
More informationScienceDirect. Brain Image Classification using Learning Machine Approach and Brain Structure Analysis
Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 50 (2015 ) 388 394 2nd International Symposium on Big Data and Cloud Computing (ISBCC 15) Brain Image Classification using
More informationData Mining Algorithms Part 1. Dejan Sarka
Data Mining Algorithms Part 1 Dejan Sarka Join the conversation on Twitter: @DevWeek #DW2015 Instructor Bio Dejan Sarka (dsarka@solidq.com) 30 years of experience SQL Server MVP, MCT, 13 books 7+ courses
More informationMedical Image Segmentation of PACS System Image Post-processing *
Medical Image Segmentation of PACS System Image Post-processing * Lv Jie, Xiong Chun-rong, and Xie Miao Department of Professional Technical Institute, Yulin Normal University, Yulin Guangxi 537000, China
More informationTowards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis
Towards SoMEST Combining Social Media Monitoring with Event Extraction and Timeline Analysis Yue Dai, Ernest Arendarenko, Tuomo Kakkonen, Ding Liao School of Computing University of Eastern Finland {yvedai,
More informationEnvironmental Remote Sensing GEOG 2021
Environmental Remote Sensing GEOG 2021 Lecture 4 Image classification 2 Purpose categorising data data abstraction / simplification data interpretation mapping for land cover mapping use land cover class
More informationKnowledge Discovery from patents using KMX Text Analytics
Knowledge Discovery from patents using KMX Text Analytics Dr. Anton Heijs anton.heijs@treparel.com Treparel Abstract In this white paper we discuss how the KMX technology of Treparel can help searchers
More informationData Pre-Processing in Spam Detection
IJSTE - International Journal of Science Technology & Engineering Volume 1 Issue 11 May 2015 ISSN (online): 2349-784X Data Pre-Processing in Spam Detection Anjali Sharma Dr. Manisha Manisha Dr. Rekha Jain
More informationFace Recognition For Remote Database Backup System
Face Recognition For Remote Database Backup System Aniza Mohamed Din, Faudziah Ahmad, Mohamad Farhan Mohamad Mohsin, Ku Ruhana Ku-Mahamud, Mustafa Mufawak Theab 2 Graduate Department of Computer Science,UUM
More informationCircle Object Recognition Based on Monocular Vision for Home Security Robot
Journal of Applied Science and Engineering, Vol. 16, No. 3, pp. 261 268 (2013) DOI: 10.6180/jase.2013.16.3.05 Circle Object Recognition Based on Monocular Vision for Home Security Robot Shih-An Li, Ching-Chang
More informationLaser Gesture Recognition for Human Machine Interaction
International Journal of Computer Sciences and Engineering Open Access Research Paper Volume-04, Issue-04 E-ISSN: 2347-2693 Laser Gesture Recognition for Human Machine Interaction Umang Keniya 1*, Sarthak
More informationPredict Influencers in the Social Network
Predict Influencers in the Social Network Ruishan Liu, Yang Zhao and Liuyu Zhou Email: rliu2, yzhao2, lyzhou@stanford.edu Department of Electrical Engineering, Stanford University Abstract Given two persons
More informationPractical Data Science with Azure Machine Learning, SQL Data Mining, and R
Practical Data Science with Azure Machine Learning, SQL Data Mining, and R Overview This 4-day class is the first of the two data science courses taught by Rafal Lukawiecki. Some of the topics will be
More informationColour Image Segmentation Technique for Screen Printing
60 R.U. Hewage and D.U.J. Sonnadara Department of Physics, University of Colombo, Sri Lanka ABSTRACT Screen-printing is an industry with a large number of applications ranging from printing mobile phone
More information6367(Print), ISSN 0976 6375(Online) & TECHNOLOGY Volume 4, Issue 1, (IJCET) January- February (2013), IAEME
INTERNATIONAL International Journal of Computer JOURNAL Engineering OF COMPUTER and Technology ENGINEERING (IJCET), ISSN 0976-6367(Print), ISSN 0976 6375(Online) & TECHNOLOGY Volume 4, Issue 1, (IJCET)
More informationDocument Image Retrieval using Signatures as Queries
Document Image Retrieval using Signatures as Queries Sargur N. Srihari, Shravya Shetty, Siyuan Chen, Harish Srinivasan, Chen Huang CEDAR, University at Buffalo(SUNY) Amherst, New York 14228 Gady Agam and
More informationT-61.3050 : Email Classification as Spam or Ham using Naive Bayes Classifier. Santosh Tirunagari : 245577
T-61.3050 : Email Classification as Spam or Ham using Naive Bayes Classifier Santosh Tirunagari : 245577 January 20, 2011 Abstract This term project gives a solution how to classify an email as spam or
More informationA Genetic Algorithm-Evolved 3D Point Cloud Descriptor
A Genetic Algorithm-Evolved 3D Point Cloud Descriptor Dominik Wȩgrzyn and Luís A. Alexandre IT - Instituto de Telecomunicações Dept. of Computer Science, Univ. Beira Interior, 6200-001 Covilhã, Portugal
More informationBayesian Spam Detection
Scholarly Horizons: University of Minnesota, Morris Undergraduate Journal Volume 2 Issue 1 Article 2 2015 Bayesian Spam Detection Jeremy J. Eberhardt University or Minnesota, Morris Follow this and additional
More informationDistributed forests for MapReduce-based machine learning
Distributed forests for MapReduce-based machine learning Ryoji Wakayama, Ryuei Murata, Akisato Kimura, Takayoshi Yamashita, Yuji Yamauchi, Hironobu Fujiyoshi Chubu University, Japan. NTT Communication
More informationDepartment of Mechanical Engineering, King s College London, University of London, Strand, London, WC2R 2LS, UK; e-mail: david.hann@kcl.ac.
INT. J. REMOTE SENSING, 2003, VOL. 24, NO. 9, 1949 1956 Technical note Classification of off-diagonal points in a co-occurrence matrix D. B. HANN, Department of Mechanical Engineering, King s College London,
More information