Multi-Protocol Content Filtering
Slide 1: Multi-Protocol Content Filtering
Matthew Johnson, MEng Individual Project

Notes: Title, hello, etc.

Slide 2: Why filter content?
- Information overload
- Specific personal interests
- General signal-to-noise ratio, affected by unwanted content, usually commercial or advertisement-based

Notes:
Information overload: too much content, or too many content items to handle, but nothing we specifically don't want to know about. Examples: message digests of mailing lists, Kernel Traffic and friends.
Specific personal interest: lots of content, much of it about things we are not interested in, although that same content may interest other people. For example, on a developers' mailing list we may care about bugfixes and problems with a Linux version but couldn't care less about VMS.
General signal-to-noise ratio (SNR): the ratio of content in which the userbase is interested to content in which it is uninterested.

Slide 3: Why is spam such a problem?
- Not just email: Usenet / Netnews also suffers.
- Significant increase: 48% in the last 12 months.

Notes: Half of all emails monitored by MessageLabs in May 2003 were spam. June seems to have reduced so far, but we are not at the end of the month yet.

Slide 4: Email filtration options
- Killfiles / blacklists: simplistic header-based filters. Spammers regularly spoof headers, so these are not much help.
- Precise hash matches (e.g. Vipul's Razor). Spammers regularly insert hashbusters into their content. But collaborative filtering is not without merit...
- Regexp-based content matching and server blacklisting (e.g. SpamAssassin). Very effective, but suffers due to static heuristic rules.

Notes:
Killfiles are still used because users understand them really well, despite their lack of effectiveness. They remain useful for deliberately blocking posts from contributors who rub you up the wrong way, but are useless for spam.
Matching content is the right direction but not foolproof. Spamware agents have the ability to insert hashbusters, which are deliberately designed to throw off trivial hash-collision detection methods. Note the benefits of collaborative filtering though; when it works, it's good.
Discuss SpamAssassin rules: body and header matching, e.g. Nigerian spam, mail-client spoofing. Effectiveness is excellent but errors remain possible. Quite a lot of confirmation emails (e.g. Easyjet, Ryanair) get misclassified because they match the heuristics. Equally, if spam comes along which doesn't match the static rules, it is not detected.
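
The hashbuster problem described in the notes can be illustrated with a minimal sketch. SHA-256 is used here only as a stand-in for "a precise hash"; this is not a claim about how Vipul's Razor computes its signatures, and the message text and the random suffix are invented for illustration.

```python
import hashlib


def exact_hash(text: str) -> str:
    """Precise content hash: any change to the text changes the digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical spam body, and the same body with a random "hashbuster" appended.
spam = "Buy cheap pills now! Limited offer."
spam_with_hashbuster = spam + " xqzv81kd"

# A collaborator has already reported the original message.
known_spam_hashes = {exact_hash(spam)}

caught_original = exact_hash(spam) in known_spam_hashes
caught_variant = exact_hash(spam_with_hashbuster) in known_spam_hashes

print(caught_original)  # the reported message is matched
print(caught_variant)   # the near-identical variant is missed
```

A single appended token is enough to make the variant unrecognisable to an exact-match database, which is the weakness the later fuzzy-hash slides try to address.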

Slide 5: The dynamic solution
- Static rules can make an educated guess as to what the user thinks may be spam...
- ...but the only way to find out precisely is to have the user tell us.
- The user's wishes are unlikely to be codifiable as a set of static rules; we must find a different way.

Slide 6: Project Objectives
- Implementation of a content filter for mail and news, controlled and influenced by the individual user.
- Content filtration by statistical classification and distribution of content hashes.
- Investigation of statistical classification as applied to news.

Slide 7: System Architecture
[Architecture diagram. Labels: Incoming Mail, Incoming News, Mail Handler, News Handler, Mgmt Clients, Spam Handler, Collab Handler, Content Handler, Management Interface, Core (Bayesian Classifier, Collaborative Filter), Filtered Mail, Filtered News, Collab Messages]

Slide 8: Statistical filtering
- Analyze a set of examples which the user tells us are either spam or non-spam.
- Calculate the prior probability of each word in the examples based on how often it appears in spam content.
- e.g. "Click" appears in 939 out of 2,355 spam examples and 113 out of 4,787 non-spam examples.
  p_spam = [value not preserved in the transcript]
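
The per-word calculation on the statistical filtering slide can be sketched as follows. The slide's own expression did not survive, so the normalisation below (relative frequency in spam divided by the sum of the relative frequencies in spam and non-spam) is an assumption, one common choice among several; the function name `word_spam_probability` is hypothetical.

```python
def word_spam_probability(spam_hits: int, spam_total: int,
                          ham_hits: int, ham_total: int) -> float:
    """Estimate how strongly a word indicates spam (assumed normalisation).

    spam_hits / spam_total: fraction of spam examples containing the word.
    ham_hits / ham_total:   fraction of non-spam examples containing the word.
    """
    spam_freq = spam_hits / spam_total
    ham_freq = ham_hits / ham_total
    if spam_freq + ham_freq == 0:
        return 0.5  # word never seen: treat as neutral (an assumption)
    return spam_freq / (spam_freq + ham_freq)


# Counts quoted on the slide for the word "Click".
p_click = word_spam_probability(939, 2355, 113, 4787)
print(round(p_click, 3))  # roughly 0.944 under this assumed formula
```

Under this reading, "Click" is a strong spam indicator because it is far more frequent in the spam examples than in the non-spam ones; a filter that weights non-spam counts differently would give a different figure.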

Slide 9: The Naïve Bayesian Classifier
- To test a content item, look up the probability of every word in the new content in the table we created.
- Find the most extreme n probabilities (those closest to 0 or 1).
- Use the word probabilities as likelihood indicators for the new content being spam.

$$P_{\text{spam}} = \frac{\prod_{k=1}^{n} p_k}{\prod_{k=1}^{n} p_k + \prod_{k=1}^{n} (1 - p_k)}$$

Slide 10: Collaborative filtration
- Users are generally in some form of community.
- The same spam content may reach more than one member of the community.
- Time delay in mail handling works to our advantage.
- Can we share knowledge within communities to reduce the amount of spam a user sees?
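
The combination rule on the Naïve Bayesian Classifier slide can be sketched as below. This is a minimal illustration, not the project's code: the function names, the default `n=15`, the neutral value of 0.5 for unseen words, and the clamping of probabilities to [0.01, 0.99] are all assumptions added so the example is self-contained and avoids division by zero.

```python
import math


def most_extreme(probabilities, n):
    """Keep the n probabilities furthest from 0.5 (closest to 0 or 1)."""
    return sorted(probabilities, key=lambda p: abs(p - 0.5), reverse=True)[:n]


def combine(probabilities):
    """P_spam = prod(p_k) / (prod(p_k) + prod(1 - p_k))."""
    clamped = [min(0.99, max(0.01, p)) for p in probabilities]  # assumed guard
    spam_product = math.prod(clamped)
    ham_product = math.prod(1.0 - p for p in clamped)
    return spam_product / (spam_product + ham_product)


def classify(tokens, table, n=15, unknown=0.5):
    """Score a tokenised content item against a word-probability table."""
    probs = [table.get(token, unknown) for token in set(tokens)]
    return combine(most_extreme(probs, n))


# Hypothetical table of per-word spam probabilities.
table = {"click": 0.94, "meeting": 0.05}
score = classify(["click", "meeting", "agenda"], table, n=2)
print(round(score, 4))
```

Because only the most extreme words are kept, neutral or unseen words such as "agenda" have no influence on the score once n is smaller than the number of distinct tokens.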

Slide 11: Better content matching
- Current hash-detection systems fail too readily.
- Need a function such that: if content a and b are substantively similar, then their hash values α and β are arithmetically similar.
- A fuzzy hash: a hash where two hashes are quantitatively comparable.

Slide 12: Using fuzzy hashing in collaboration
- Alice receives an email, which is detected as spam.
- Alice's mail filter hashes the content, notes the hash, and sends it on to any interested collaborators.
- Bob's mail filter receives a collaborative message regarding the new spam. It notes the hash.
- Bob then receives an email. The email is hashed and compared with the hashes his filter knows about.
- Bob's mail filter discovers the new mail is a 98% match with the spam Alice reported.
- Bob has set his hash match threshold to 70%, so the mail is detected as spam.
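
The Alice-and-Bob exchange above can be sketched with a simple comparable signature. The slides do not specify the fuzzy hash itself (later notes mention rolling window matches), so this example swaps in a different, well-known technique: a set of hashed word shingles compared by Jaccard similarity. All names, the shingle size, and the message texts are hypothetical; only the 70% threshold comes from the slide.

```python
import hashlib


def fuzzy_hash(text: str, shingle_size: int = 3) -> frozenset:
    """Signature made of short hashes of overlapping word windows."""
    words = text.lower().split()
    if len(words) < shingle_size:
        shingles = [" ".join(words)]
    else:
        shingles = [" ".join(words[i:i + shingle_size])
                    for i in range(len(words) - shingle_size + 1)]
    return frozenset(hashlib.sha1(s.encode("utf-8")).hexdigest()[:8]
                     for s in shingles)


def similarity(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity: shared shingles divided by all shingles."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_known_spam(text: str, known_hashes, threshold: float = 0.70) -> bool:
    """True if the text is at least `threshold` similar to a reported hash."""
    signature = fuzzy_hash(text)
    return any(similarity(signature, known) >= threshold
               for known in known_hashes)


# Alice's filter detects spam and shares its signature with collaborators.
alice_spam = ("cheap pills available now click here to order today "
              "and save big money on every purchase")
bob_known = {fuzzy_hash(alice_spam)}

# Bob receives the same spam with a hashbuster appended, plus a normal mail.
bob_mail = alice_spam + " xqzv81kd"
unrelated = "meeting moved to thursday at ten please bring the quarterly report"

print(is_known_spam(bob_mail, bob_known))
print(is_known_spam(unrelated, bob_known))
```

Unlike an exact hash, the appended token only changes one shingle, so the variant still scores well above Bob's 70% threshold while unrelated mail shares nothing with the reported signature.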

Slide 13: Implementation Challenges
- Homogenization of content from various protocols: abstract message format
- PGP integration for trustworthy collaboration
- News protocol implementation

Slide 14: Results
- Like-for-like testing:
  - My filter: 75% accuracy with no false positives
  - SpamAssassin: 90% accuracy with no false positives
- Hard to test collaborative filtering
- Reasonable performance, but not really comparable with the bleeding edge

Slide 15: Demonstration

Slide 16: Further Work
- Optimization of configuration variables: token thresholds, number of tokens used in testing.
- Optimization of the fuzzy hash matching algorithm: slow due to attempted rolling window matches.
- Addition of other protocols: web-based bulletin boards?
- User interface extensions: provide a usable mail/news client.
- SpamAssassin for news, meta-filtration: the infrastructure could apply SpamAssassin to news; refactor to allow multiple content testing methods.

Slide 17: Summary
- A content filter which functions acceptably
- Bayesian filtering and fuzzy hash matching are useful
- Sole use of these technologies may not be sufficient
- Combining filters is likely to be the best solution

Slide 18: Any further questions?
