Serial and Parallel Bayesian Spam Filtering using Aho-Corasick and PFAC
Saima Haseeb, TIEIT Bhopal
Mahak Motwani, TIEIT Bhopal
Amit Saxena, TIEIT Bhopal

ABSTRACT
With the rapid growth of the Internet, email, with its convenient and efficient characteristics, has become an important means of communication in people's lives. It reduces the cost of communication, but it also brings spam. Spam emails, also known as junk emails, are unsolicited emails sent in bulk with a hidden or forged sender identity, address, and header information. It is vital to pursue more effective spam filtering approaches to maintain the normal operation of email systems and to protect the interests of email users. In this paper we develop a spam filter based on the Bayesian filtering method, using the Aho-Corasick and PFAC string matching algorithms. The filter is an improved version of the traditional Bayesian spam filter, intended to improve spam filtering efficiency and to reduce the chance of misjudging malignant spam. To further speed up the spam filtering process, we transform the filter into a parallel spam filter on GPGPUs by using the PFAC algorithm.

Keywords
Spam Filter, Bayesian Spam Filter, Aho-Corasick, PFAC.

1. INTRODUCTION
With the growing use of electronic mail, spam has become a major concern [1]. Spam consists of unwanted emails that flood the Internet with many copies of the same message. Sometimes spam carries malicious content that harms our systems and degrades their performance [2]. There is therefore a need to design a filter that can handle different varieties of spam and that reduces false positives. Millions of emails pass through mail servers. This increases the demand for a server-side spam filter that is fast enough to keep up with the rate at which spam arrives. Further, the filter should not miss even a small amount of spam, since a missed message is costly for the receiver: the spam may be opened and activated, which affects security. Therefore the filter must have a high accuracy rate [3].
The proliferation of spam occupies a large amount of mail server storage space and violates the privacy of recipients [4]. Spam not only costs recipients time to deal with; more importantly, the harm done by its unhealthy content, including pornography and violence, is difficult to estimate and measure [5].

Various techniques have been used to design spam filters. Among them are list based filters and content based filters [6]. List based filters check mails on the basis of their servers. A predefined list of servers distinguishes spammers from legitimate servers. Mails from spammers are rejected; all others are accepted. Blacklist filters [7], real-time blackhole list filters, whitelist filters and greylist filters fall in this category. Content based filters evaluate words or phrases found in each individual message to determine whether an email is spam or legitimate. Some content based filters are word based filters, heuristic filters and Bayesian filters. Bayesian filters, considered the most advanced form of content-based filtering, employ the laws of mathematical probability to determine which messages are legitimate and which are spam. For a Bayesian filter to block spam effectively, the end user must initially "train" it by manually flagging each message as either junk or legitimate. Over time, the filter takes words and phrases found in legitimate emails and adds them to a list; it does the same with terms found in spam. In this paper we design sequential and parallel Bayesian spam filters based on the Enron data set; the parallel version is based on the PFAC algorithm.

2. RELATED WORK
A blacklist spam filter creates a list of all the email addresses and IP addresses that have previously been used for sending spam. When a mail arrives, the filter checks it against the list: if it is from a listed sender it is rejected, otherwise it is accepted [4]. A real-time blackhole list filter is based on the same concept, except that a third party creates the list of spam senders for the organization. This reduces the burden on IT staff.
The third party receives the mail, checks it against the list and decides accordingly [5]. A whitelist filter works in exactly the opposite way to a blacklist filter. Instead of a list of spammers, a list of legitimate senders is created [4]. A greylist filter is based on the fact that many spammers send a batch of email only once. When a batch of email first reaches the server, the filter rejects it and reports an error message to the sender. If the sender attempts delivery a second time, the mail is considered legitimate and the sender's address is added to the greylist filter's list of legitimate senders. List based filters may misidentify legitimate senders as spammers [4, 5].

A word based filter is a content based filter in which a list of spam words is created. If a received mail contains blocked words it is reported as spam, but a spammer may misspell certain words to pass spam off as legitimate mail. This increases the burden of regularly updating the blocked-word database. Heuristic filtering is likewise a form of content based filtering. It creates a list of suspect words, each with a heuristic count. Whenever a new message arrives, the filter scans the content of the message for the words on the list and calculates the total heuristic count of all the keywords; if it is greater than the current count, the message is considered spam, otherwise ham [6].
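As a rough illustration of the heuristic scoring just described, the following Python sketch sums a per-word heuristic count over a message and compares the total with a limit. The word list, the counts, and the limit of 5 are hypothetical values chosen only for the example; they are not taken from the paper or from any real filter.

```python
import re

# Hypothetical suspect words with their heuristic counts (illustrative only).
HEURISTIC_COUNTS = {"free": 3, "winner": 4, "credit": 2, "urgent": 2}


def heuristic_score(message, counts=HEURISTIC_COUNTS):
    """Sum the heuristic count of every suspect word occurrence in the message."""
    words = re.findall(r"[a-z]+", message.lower())
    return sum(counts.get(word, 0) for word in words)


def is_spam(message, limit=5):
    """Flag the message when its total heuristic count exceeds the assumed limit."""
    return heuristic_score(message) > limit
```

A misspelled word such as "w1nner" would not match any list entry here, which is the weakness of list-driven content filters noted above.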
A Bayesian filter is a probability based filtering technique. It learns from spam as well as good mails. At the initial stage the filter is trained by calculating the spam probability of known spam and ham keywords. Later this list is used to calculate the total spamicity of a test mail. If the spamicity is greater than or equal to a threshold value, the mail is rejected as spam [6, 7]. Content based filters are considered the most efficient filters because they check the content of the message and can therefore also identify spam sent through legitimate users [8].

Content based filters use the Aho-Corasick algorithm to count the occurrences of each pattern in a test mail. Aho-Corasick is a multi-pattern string matching algorithm that locates all occurrences of a set of keywords in a text string. It first creates a deterministic finite automaton for all the predefined keywords and then, using the automaton, processes the text in a single pass. Aho-Corasick works in two phases, a preprocessing phase and a searching phase, as shown in Figures 1 and 2. The preprocessing phase constructs a finite state automaton (or keyword tree) for the set of predefined keywords that are to be found in the text string. After the automaton is constructed, the failure function of each node is calculated. The failure function of a node points to the node for the longest proper suffix of that node's string that is also a prefix of some keyword. The output function for the final states is also calculated. The searching phase scans the test mail using the automaton built in the previous phase and reports the count of each keyword [9, 10, 11].

Aho-Corasick has previously been applied in many areas of networking, computer security and bioinformatics. These applications are computationally demanding and require high speed parallel processing. To speed up the Aho-Corasick algorithm, a parallel version, PFAC (Parallel Failureless Aho-Corasick), was developed [12, 13, 14]. PFAC uses GPGPU to find the occurrences of keywords in a string.
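For reference, the two phases of the classic (serial) Aho-Corasick algorithm described above can be sketched in Python as follows. This is a minimal illustration, not the authors' Visual C++ implementation; the function names `build_automaton` and `count_keywords` are chosen for this example only.

```python
from collections import Counter, deque


def build_automaton(keywords):
    """Preprocessing phase: keyword tree plus failure and output functions."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # keywords reported on entering each state
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)
    # Breadth-first traversal: depth-1 states keep failure link 0.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit keywords that end at the failure state (proper suffixes).
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def count_keywords(text, keywords):
    """Searching phase: one pass over the text, counting every keyword occurrence."""
    goto, fail, out = build_automaton(keywords)
    counts = Counter()
    state = 0
    for ch in text:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        counts.update(out[state])
    return counts
```

Calling `count_keywords("IRISTHER", ["HER", "IRIS", "IS"])` reports one occurrence each of IRIS, IS and HER in a single left-to-right pass; the failure links are what allow the pass to continue without re-reading characters.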
In the preprocessing phase PFAC builds a DFA with no backtrack links; no failure function is calculated for the DFA. Suppose we have three patterns: HER, IRIS and IS. The PFAC DFA for these patterns, without backtrack links, is shown in Fig. 1.

Fig. 1: Preprocessing phase

The DFA is placed in global memory, from where each thread accesses it to obtain its copy of the DFA. Making the DFA available to every thread in this way increases the efficiency of the Aho-Corasick algorithm. In the searching phase, each character of the text is assigned to its own thread, so the total number of threads is equal to the text length. Suppose the text to be scanned is IRISTHER. Each thread accesses its copy of the automaton from global memory and processes its character: if a valid transition is found it proceeds, otherwise it terminates. Thread 0 searches the automaton for the character I and gets a valid transition. After taking the input IRIS, thread 0 reaches state 7, which indicates that the pattern IRIS is matched. Thread 1 starts by scanning the character R; no transition is found for R, so it terminates at state 0. Thread 2 gets a transition for I; after taking the input IS it reaches state 8, which indicates that the pattern IS is matched; no transition is found for T, so it terminates in state 8. Threads 3 and 4 find no transition for S and T and terminate early at state 0. Thread 5 finds a transition to state 1 for H; after taking the input HER it reaches state 3, where the pattern HER is matched, and terminates in state 3. Threads 6 and 7 find no transition for E and R and terminate at state 0.

Fig. 2: Searching phase

GPGPU is a concept used to boost many real-world applications [15, 16]. GPGPU is the use of a GPU for general purpose computation [17, 18]. To use a GPU for general computation, it must be programmed in a parallel programming language such as CUDA or OpenCL [19, 20]. A GPU typically handles computation for computer graphics. The GPU originated in the late 1990s as a coprocessor for accelerating the simulation and visualization of 3D images.
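The PFAC walkthrough given earlier in this section (patterns HER, IRIS and IS over the text IRISTHER) can be emulated on a CPU with the Python sketch below. It is only an illustration of the per-thread logic: the loop in `pfac_search` runs the "threads" one after another, whereas on a GPU each call to the hypothetical helper `pfac_thread` would be a separate OpenCL work-item or CUDA thread reading the shared automaton from global memory.

```python
def build_pfac_trie(patterns):
    """Preprocessing phase: keyword tree only, with no failure (backtrack) links."""
    goto = [{}]   # goto[state][char] -> next state
    final = {}    # final state -> pattern matched in that state
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        final[state] = pattern
    return goto, final


def pfac_thread(goto, final, text, start):
    """Work of one thread: walk from text[start] until no valid transition exists."""
    state, matches = 0, []
    for ch in text[start:]:
        nxt = goto[state].get(ch)
        if nxt is None:
            break  # no valid transition: the thread terminates
        state = nxt
        if state in final:
            matches.append(final[state])
    return state, matches


def pfac_search(text, patterns):
    """Sequential stand-in for launching one thread per character of the text."""
    goto, final = build_pfac_trie(patterns)
    return [pfac_thread(goto, final, text, i) for i in range(len(text))]
```

With the patterns inserted in the order HER, IRIS, IS, the state numbers coincide with those in the walkthrough: thread 0 ends in state 7 (IRIS), thread 2 in state 8 (IS), thread 5 in state 3 (HER), and every other thread stops at state 0.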
From 2006 onward GPUs have become more flexible and are now considered for general purpose computation. Today high application performance is mandatory; to meet that requirement, many applications are parallelized on GPUs to raise their performance [21, 22, 23]. A direct implementation of parallel computation on GPUs is to divide an input stream into multiple segments, each of which is processed by a parallel thread for string matching [24, 25].

3. PROPOSED ALGORITHM
We design a parallel spam filter using GPGPU (general purpose computation on GPU). For this purpose we first design a serial spam filter and then parallelize it. To design the filter we use the Bayesian approach. It works in two phases: a training phase and a filtering phase. In the training phase, three databases are created: a database of keywords taken from ham and spam mails, a database of ham mails, and a database of spam mails. After the databases are created, the spam probability of every keyword is calculated using Bayesian statistics. Bayesian statistics tell us that if the word "connect" appears in 35 of 1000 ham mails and in 750 of 1000 spam mails, then the presence of the word "connect" means that the given message has a 95.54% chance of being spam:

$$\text{Spam probability of ``connect''} = \frac{750}{750 + 35} \approx 95.54\%$$

This phase creates a file containing the list of keywords with their corresponding probabilities, which is later used in the filtering phase. The filtering phase takes the file created in the training phase and a test mail as input and checks whether the mail is spam or ham. The spamicity of the mail is calculated using the formula

$$\text{Spamicity} = \frac{p_1 p_2 \cdots p_m}{p_1 p_2 \cdots p_m + (1 - p_1)(1 - p_2) \cdots (1 - p_m)}$$

If the spamicity is greater than or equal to the threshold, the mail is reported as spam. The filtering process uses Aho-Corasick, a multi-pattern string matching algorithm, to count the occurrences of each keyword in the test mail. This count is used to calculate the spamicity of the mail. The overall diagram of the serial spam filter is shown in Figures 3 and 4. Here T is the trained Bayesian probability data of the keywords.

Fig. 3: Serial Spam Filter

The project is divided into two modules: filter trainer and filter. In the trainer module, three databases are created: spam keywords, trainer ham database, and trainer spam database. These databases are passed to the training algorithm, i.e. Bayesian training. The training algorithm calculates the spam probability of all keywords based on the training data set; the result is named T. In the second module, the Aho-Corasick algorithm is used; it takes the spam keywords and a test mail as inputs and provides the count of all spam keywords in the test mail.

3.1 Training Algorithm for serial spam filter
1. First we create a list of spam keywords and search for them in the ham and spam databases.
2. The spam probability of all keywords is calculated using Bayesian statistics.
3. We create a list containing the keywords and their corresponding probabilities and save it to a file for use in the second module.

3.2 Filtering Algorithm for serial spam filter
1. First we fetch the file produced by the first module, containing the keywords and their corresponding probabilities.
2. Fetch the mail that is to be classified as spam or ham.
3. Scan the mail using the Aho-Corasick algorithm and calculate the frequency count of the spam keywords in the mail.
4. Calculate the spamicity of the mail using Bayes' theorem:
$$\text{Spamicity} = \frac{p_1 p_2 \cdots p_m}{p_1 p_2 \cdots p_m + (1 - p_1)(1 - p_2) \cdots (1 - p_m)}$$
5. Compare the spamicity of the mail with the threshold value (which is set by reverse engineering).
6. If the spamicity is greater than or equal to the threshold, the mail is spam; otherwise it is ham.

The parallel spam filter uses the PFAC approach. The trainer module of the parallel spam filter is the same as in the serial spam filter. In the second module, the PFAC algorithm is used; it takes the spam keywords and a test mail as inputs. One thread is assigned to each character of the test mail, and the threads report the count of the keywords. These counts and the Bayesian formula are used to calculate the spamicity of the mail. If it is greater than or equal to the threshold value, the mail is reported as spam, otherwise as ham. The overall diagram is shown in Figures 5 and 6.

Fig. 5: Overall diagram for parallel spam filter
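The training and filtering arithmetic of Sections 3.1 and 3.2 can be checked with a short Python sketch. Several points here are assumptions rather than details given in the paper: the function names are illustrative, the keyword probability formula assumes equally sized spam and ham training sets (as in the 1000/1000 example), only the probabilities of keywords actually found in the mail are combined (the paper does not spell out how the keyword counts enter the formula), and the threshold helper follows the minimum-spamicity rule that the paper gives in Section 3.5. `math.prod` requires Python 3.8 or later.

```python
import math


def keyword_spam_probability(spam_hits, ham_hits):
    """Spam probability of a keyword from the number of spam and ham mails containing it.

    Assumes equally sized spam and ham training sets and at least one hit.
    """
    return spam_hits / (spam_hits + ham_hits)


def spamicity(probabilities):
    """Combine keyword probabilities p1..pm with the spamicity formula.

    Assumes a non-empty list with no probability of exactly 0 and 1 mixed,
    which would make the denominator zero.
    """
    spam_product = math.prod(probabilities)
    ham_product = math.prod(1.0 - p for p in probabilities)
    return spam_product / (spam_product + ham_product)


def classify(probabilities, threshold):
    """Report spam when the spamicity reaches the threshold, ham otherwise."""
    return "spam" if spamicity(probabilities) >= threshold else "ham"


def threshold_from_spam(spam_mail_probabilities):
    """Assumed reading of Section 3.5: minimum spamicity over known spam mails."""
    return min(spamicity(p) for p in spam_mail_probabilities)
```

For the worked example, `keyword_spam_probability(750, 35)` evaluates to about 0.9554, matching the 95.54% quoted for the word "connect".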
Fig. 6: Parallel Spam Filter

3.3 Training Algorithm for parallel spam filter
1. First we create a list of spam keywords and search for them in the ham and spam databases.
2. The spam probability of all keywords is calculated using Bayesian statistics.
3. Then we create a list containing the keywords and their corresponding probabilities and save it to a file for use in the second module.

3.4 Filtering Algorithm for parallel spam filter
1. First we fetch the file produced by the first module, containing the keywords and their corresponding probabilities.
2. Then we fetch the mail that is to be classified as spam or ham.
3. Scan the mail using the PFAC algorithm and calculate the frequency count of the spam keywords in the mail.
4. Calculate the spamicity of the mail using Bayes' theorem:
$$\text{Spamicity} = \frac{p_1 p_2 \cdots p_m}{p_1 p_2 \cdots p_m + (1 - p_1)(1 - p_2) \cdots (1 - p_m)}$$
5. Compare the spamicity of the mail with the threshold value (which is set by reverse engineering).
6. If the spamicity is greater than or equal to the threshold, the mail is spam; otherwise it is ham.

3.5 Calculation of Threshold by Reverse Engineering
First we scan all the spam mails of the Enron data set and calculate the spamicity of each mail with the help of the second module. The minimum spamicity value among these mails is set as the threshold.

4. EXPERIMENTAL RESULTS AND COMPARATIVE ANALYSIS
The comparative analysis between the serial spam filter and the parallel spam filter is shown in Table 1 and the corresponding figure.

4.1 Experimental Environment
Processor: Core i3
RAM: 4 GB
OS: Windows 7
Language: Visual C++ on Visual Studio 2008
GPGPU: AMD Radeon HD 6800 series
Language (parallel implementation): OpenCL

4.2 Experimental Data
Number of test mails: 1000, 2000, 5000 and …

4.3 Experimental Results
Table 1: Execution time for serial and parallel spam filters

S.No.   No. of Test Mails   Parallel Filter Time   Serial Filter Time
1       …                   … sec.                 129 sec.
2       …                   … sec.                 198 sec.
3       …                   … sec.                 302 sec.
4       …                   … sec.                 601 sec.
A graphical representation of these experimental results is shown in Figure 7.

Fig. 7: Experimental Results

Figure 7 gives the empirical comparison between serial and parallel execution. Parallel execution takes less than 2 seconds on the larger data sets, whereas serial execution takes more than 10 minutes on the same data set. The two lines in the graph represent execution time: one for parallel execution and one for serial execution. The graph shows that parallel execution is much more efficient than serial execution.

Table 2 gives the accuracy of the serial and parallel spam filters. The accuracy of both versions is the same. As the number of mails increases, accuracy decreases, and after a point it becomes constant.

Table 2: Accuracy calculation (columns: S.No., No. of Test Mails, Accuracy %)

Fig. 8: Accuracy calculation

The spam filter's accuracy rate is approximately 70%. For training we used the Enron data set, and the set of keywords is limited. If the training data set and the number of keywords are increased, the results will be more accurate and efficient.

5. CONCLUSION
For filtering emails, the Bayesian spam filter is an adequate spam filter. It is a more advanced form of content based filtering. To improve the efficiency of the Bayesian spam filter, we implemented it in parallel on GPGPU with the Parallel Failureless Aho-Corasick technique. The parallel spam filter is efficient on larger data sets, and its processing time is much better than that of the serial spam filter. The accuracy of our spam filter is approximately 70%.

6. REFERENCES
[1] Wu, Y. L., Using Visual Features For Anti-Spam Filtering, 2005 IEEE International Conference on Image Processing (ICIP2005). Postini: Monitoring + Filtering Blog.
[2] Toshihiro Tabata, SPAM mail filtering: commentary of Bayesian filter, The Journal of Information Science and Technology Association, Vol. 56, No. 10.
[3] Spam corpus, SMS corpus,
[4] /smscorpus/
[5] Amayri O., Bouguila N. (2009). Online Spam Filtering Using Support Vector Machines. IEEE.
[6] C. Pu, S. Webb, O. Kolesnikov, W. Lee, and R. Lipton, Towards the Integration of Diverse Spam Filtering Techniques, in Proc. of IEEE International Conference on Granular Computing, pp. 7-10.
[7] I. Androutsopoulos et al., An Evaluation of Naïve Bayesian Anti-Spam Filtering, 11th European Conference on Machine Learning, pp. 9-17, Barcelona, Spain, June 2000.
[8] Paul Graham, Better Bayesian Filtering.
[9] A. V. Aho and M. J. Corasick, Efficient String Matching: An Aid to Bibliographic Search, Communications of the ACM, Vol. 18, Issue 6.
[10] Cheng-Hung Lin and Shih-Chieh Chang, Efficient pattern matching algorithm for memory architecture, Vol. 19, Issue 1, January.
[11] Chengguo Chang and Hui Wang, Comparison of Two-Dimensional String Matching Algorithms, in Proc. International Conference on Computer Science and Electronics Engineering (ICCSEE), Vol. 3, March.
[12] Raphael Clifford, Markus Jalsenius, Ely Porat and Benjamin Sach, Pattern matching in multiple streams, in Proc. 23rd Annual Conference on Combinatorial Pattern Matching, 2012.
[13] R. Takahashi, U. Inoue, Parallel Text Matching Using GPGPU, in Proc. 13th ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel & Distributed Computing (SNPD), Aug.
[14] C. Lin, et al., Accelerating String Matching Using Multi-Threaded Algorithm on GPU, Proc. IEEE Global Telecommunications Conf., pp. 1-5.
[15] J. D. Owens, et al., A Survey of General-Purpose Computation on Graphics Hardware, Computer Graphics Forum, Vol. 26, No. 1.
[16] C. Lin, C. Liu, L. Chien, and S. Chang, Accelerating Pattern Matching Using a Novel Parallel Algorithm on GPUs, IEEE Transactions on Computers, vol. pp, issue 1.
[17] Zha Xinyan and S. Sahni, Multipattern string matching on a GPU, in Proc. IEEE Conference on Computers and Communications (ISCC), July.
[18] Tran Nhat-Phuong, Lee Myungho, Hong Sugwon and Minho Shin, Memory Efficient Parallelization for Aho-Corasick Algorithm on a GPU, IEEE 14th International Conference on High Performance Computing and Communication, June.
[19] Jungwon Kim, Honggyu Kim, Joo Hwan Lee and Jaejin Lee, Achieving a single compute device image in OpenCL for multiple GPUs, Proceedings of the 16th ACM Symposium on Principles and Practice of Parallel Programming, 2011.
[20] NVIDIA, CUDA Best Practices Guide: NVIDIA CUDA C Programming Best Practices Guide, CUDA Toolkit 4.0, May 2011.
[21] Xinyan Zha and Sartaj Sahni, GPU-to-GPU and Host-to-Host Multipattern String Matching on a GPU, IEEE Transactions on Computers, Volume 62, Issue 6, 2013.
[22] J. E. Stone, D. Gohara, and G. Shi, OpenCL: A parallel programming standard for heterogeneous computing systems, Computing in Science & Engineering, vol. 12, no. 3, pp. 66-73, 2010.
[23] Hyeran Jeon, Xia Yinglong and V. K. Prasanna, Parallel Exact Inference on a CPU-GPGPU Heterogeneous System, in Proc. 39th International Conference on Parallel Processing (ICPP), Sept.
[24] Liang Hu, Che Xilong and Xie Zhenzhen, GPGPU cloud: A paradigm for general purpose computing, Tsinghua Science and Technology, Vol. 18, Issue 1, Feb.
[25] M. C. Schatz and C. Trapnell, Fast Exact String Matching on the GPU, Technical report.

IJCA TM
Speeding Up RSA Encryption Using GPU Parallelization
2014 Fifth International Conference on Intelligent Systems, Modelling and Simulation Speeding Up RSA Encryption Using GPU Parallelization Chu-Hsing Lin, Jung-Chun Liu, and Cheng-Chieh Li Department of
Spam Filtering using Naïve Bayesian Classification
Spam Filtering using Naïve Bayesian Classification Presented by: Samer Younes Outline What is spam anyway? Some statistics Why is Spam a Problem Major Techniques for Classifying Spam Transport Level Filtering
Accelerating Techniques for Rapid Mitigation of Phishing and Spam Emails
Accelerating Techniques for Rapid Mitigation of Phishing and Spam Emails Pranil Gupta, Ajay Nagrale and Shambhu Upadhyaya Computer Science and Engineering University at Buffalo Buffalo, NY 14260 {pagupta,
Achieve more with less
Energy reduction Bayesian Filtering: the essentials - A Must-take approach in any organization s Anti-Spam Strategy - Whitepaper Achieve more with less What is Bayesian Filtering How Bayesian Filtering
GPU System Architecture. Alan Gray EPCC The University of Edinburgh
GPU System Architecture EPCC The University of Edinburgh Outline Why do we want/need accelerators such as GPUs? GPU-CPU comparison Architectural reasons for GPU performance advantages GPU accelerated systems
BARRACUDA. N e t w o r k s SPAM FIREWALL 600
BARRACUDA N e t w o r k s SPAM FIREWALL 600 Contents: I. What is Barracuda?...1 II. III. IV. How does Barracuda Work?...1 Quarantine Summary Notification...2 Quarantine Inbox...4 V. Sort the Quarantine
Cosdes: A Collaborative Spam Detection System with a Novel E- Mail Abstraction Scheme
IOSR Journal of Engineering (IOSRJEN) e-issn: 2250-3021, p-issn: 2278-8719, Volume 2, Issue 9 (September 2012), PP 55-60 Cosdes: A Collaborative Spam Detection System with a Novel E- Mail Abstraction Scheme
An OpenCL Candidate Slicing Frequent Pattern Mining Algorithm on Graphic Processing Units*
An OpenCL Candidate Slicing Frequent Pattern Mining Algorithm on Graphic Processing Units* Che-Yu Lin Science and Information Engineering Chung Hua University [email protected] Kun-Ming Yu Science and
Spam Filtering and Removing Spam Content from Massage by Using Naive Bayesian
www..org 104 Spam Filtering and Removing Spam Content from Massage by Using Naive Bayesian 1 Abha Suryavanshi, 2 Shishir Shandilya 1 Research Scholar, NIIST Bhopal, India. 2 Prof. (CSE), NIIST Bhopal,
Spam DNA Filtering System
The Excedent Spam DNA Filtering System provides webmail.us customers with premium and effective junk email protection. Threats to email services are rising rapidly. A Growing Problem As of November 2002,
Tweaking Naïve Bayes classifier for intelligent spam detection
682 Tweaking Naïve Bayes classifier for intelligent spam detection Ankita Raturi 1 and Sunil Pranit Lal 2 1 University of California, Irvine, CA 92697, USA. [email protected] 2 School of Computing, Information
Graphics Cards and Graphics Processing Units. Ben Johnstone Russ Martin November 15, 2011
Graphics Cards and Graphics Processing Units Ben Johnstone Russ Martin November 15, 2011 Contents Graphics Processing Units (GPUs) Graphics Pipeline Architectures 8800-GTX200 Fermi Cayman Performance Analysis
A Partition-Based Efficient Algorithm for Large Scale. Multiple-Strings Matching
A Partition-Based Efficient Algorithm for Large Scale Multiple-Strings Matching Ping Liu Jianlong Tan, Yanbing Liu Software Division, Institute of Computing Technology, Chinese Academy of Sciences, Beijing,
DRAFT 18-09-2003. 2.1 Gigabit network intrusion detection systems
An Intrusion Detection System for Gigabit Networks (Working paper: describing ongoing work) Gerald Tripp Computing Laboratory, University of Kent. CT2 7NF. UK e-mail: [email protected] This draft
International Journal of Engineering Research ISSN: 2348-4039 & Management Technology November-2015 Volume 2, Issue-6
International Journal of Engineering Research ISSN: 2348-4039 & Management Technology Email: [email protected] November-2015 Volume 2, Issue-6 www.ijermt.org Modeling Big Data Characteristics for Discovering
CAS-ICT at TREC 2005 SPAM Track: Using Non-Textual Information to Improve Spam Filtering Performance
CAS-ICT at TREC 2005 SPAM Track: Using Non-Textual Information to Improve Spam Filtering Performance Shen Wang, Bin Wang and Hao Lang, Xueqi Cheng Institute of Computing Technology, Chinese Academy of
Introduction to GP-GPUs. Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1
Introduction to GP-GPUs Advanced Computer Architectures, Cristina Silvano, Politecnico di Milano 1 GPU Architectures: How do we reach here? NVIDIA Fermi, 512 Processing Elements (PEs) 2 What Can It Do?
SpamNet Spam Detection Using PCA and Neural Networks
SpamNet Spam Detection Using PCA and Neural Networks Abhimanyu Lad B.Tech. (I.T.) 4 th year student Indian Institute of Information Technology, Allahabad Deoghat, Jhalwa, Allahabad, India [email protected]
Impact of Feature Selection Technique on Email Classification
Impact of Feature Selection Technique on Email Classification Aakanksha Sharaff, Naresh Kumar Nagwani, and Kunal Swami Abstract Being one of the most powerful and fastest way of communication, the popularity
Towards better accuracy for Spam predictions
Towards better accuracy for Spam predictions Chengyan Zhao Department of Computer Science University of Toronto Toronto, Ontario, Canada M5S 2E4 [email protected] Abstract Spam identification is crucial
A Load Balancing Algorithm based on the Variation Trend of Entropy in Homogeneous Cluster
, pp.11-20 http://dx.doi.org/10.14257/ ijgdc.2014.7.2.02 A Load Balancing Algorithm based on the Variation Trend of Entropy in Homogeneous Cluster Kehe Wu 1, Long Chen 2, Shichao Ye 2 and Yi Li 2 1 Beijing
Analysis of Spam Filter Methods on SMTP Servers Category: Trends in Anti-Spam Development
Analysis of Spam Filter Methods on SMTP Servers Category: Trends in Anti-Spam Development Author André Tschentscher Address Fachhochschule Erfurt - University of Applied Sciences Applied Computer Science
A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering
A Two-Pass Statistical Approach for Automatic Personalized Spam Filtering Khurum Nazir Junejo, Mirza Muhammad Yousaf, and Asim Karim Dept. of Computer Science, Lahore University of Management Sciences
A Phased Framework for Countering VoIP SPAM
International Journal of Advanced Science and Technology 21 A Phased Framework for Countering VoIP SPAM Jongil Jeong 1, Taijin Lee 1, Seokung Yoon 1, Hyuncheol Jeong 1, Yoojae Won 1, Myuhngjoo Kim 2 1
SPAM FILTER Service Data Sheet
Content 1 Spam detection problem 1.1 What is spam? 1.2 How is spam detected? 2 Infomail 3 EveryCloud Spam Filter features 3.1 Cloud architecture 3.2 Incoming email traffic protection 3.2.1 Mail traffic
About this documentation
Wilkes University, Staff, and Students have a new email spam filter to protect against unwanted email messages. Barracuda SPAM Firewall will filter email for all campus email accounts before it gets to
REVIEW AND ANALYSIS OF SPAM BLOCKING APPLICATIONS
REVIEW AND ANALYSIS OF SPAM BLOCKING APPLICATIONS Rami Khasawneh, Acting Dean, College of Business, Lewis University, [email protected] Shamsuddin Ahmed, College of Business and Economics, United Arab
Online Spam Filter for Duplicate or Near Duplicate Message Content Detection Scheme
Online Spam Filter for Duplicate or Near Duplicate Message Content Detection Scheme 1 Rahul Verma, 2 Joydip Dhar ABV- Indian Institute of Information Technology and Management, Gwalior-474015, India, 1,
Immunity from spam: an analysis of an artificial immune system for junk email detection
Immunity from spam: an analysis of an artificial immune system for junk email detection Terri Oda and Tony White Carleton University, Ottawa ON, Canada [email protected], [email protected] Abstract.
A Time Efficient Algorithm for Web Log Analysis
A Time Efficient Algorithm for Web Log Analysis Santosh Shakya Anju Singh Divakar Singh Student [M.Tech.6 th sem (CSE)] Asst.Proff, Dept. of CSE BU HOD (CSE), BUIT, BUIT,BU Bhopal Barkatullah University,
