SPAM FILTERING IMPLEMENTATION USING OPEN SOURCE SOFTWARE




SPAM FILTERING IMPLEMENTATION USING OPEN SOURCE SOFTWARE

AHMAD BAKHTIAR BIN IBRAHIM

A PROJECT PAPER SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE BACHELOR OF SCIENCE (Hons) IN DATA COMMUNICATION AND NETWORKING

FACULTY OF INFORMATION TECHNOLOGY AND QUANTITATIVE SCIENCE
MARA UNIVERSITY OF TECHNOLOGY
SHAH ALAM

APRIL 2006

SPAM FILTERING IMPLEMENTATION USING OPEN SOURCE SOFTWARE

AHMAD BAKHTIAR BIN IBRAHIM
2004219847

This project is submitted to the Faculty of Information Technology and Quantitative Science, MARA University of Technology, in partial fulfilment of the requirements for the BACHELOR OF SCIENCE (Hons) IN DATA COMMUNICATION AND NETWORKING.

Approved by the Examining Committee:

Mr. Mohammad Yusof bin Darus, Project Supervisor
Mr. Ali bin Isa, Examiner

MARA UNIVERSITY OF TECHNOLOGY
SHAH ALAM
APRIL 2006

ACKNOWLEDGMENT

In the name of Allah S.W.T, the Most Merciful and Most Gracious. Praise be to Allah the Mighty for granting me a good experience throughout this project paper and for all that He has bestowed on me. It is by His grace that this project paper was completed.

First, I would like to express my deepest gratitude and most sincere appreciation to my supervisor, En. Mohammad Yusof bin Darus, for his advice and encouragement in completing this thesis project. His guidance and wise supervision have benefited me greatly.

To my beloved parents, Haji Ibrahim bin Mohamed and Hajjah Siti Sara binti Salleh, and to my siblings: thank you for the love, care and support that gave me the strength to learn.

Last and certainly not least, I am thankful to all my friends, who contributed so many ideas and helpful suggestions. They meant a great deal to me in completing this thesis project.

To all mentioned here, may God bless all of you. Thank you so much.

TABLE OF CONTENTS

1. INTRODUCTION
   1.1 Background
   1.2 Problem Statement
   1.3 Objectives
   1.4 Project Scope
   1.5 Significances of Study
   1.6 Organization of This Report

2. LITERATURE REVIEW
   2.1 Spam Overview
      2.1.1 Definition of Spam
      2.1.2 Spamming in Different Media
         2.1.2.1 Email Spam
         2.1.2.2 Messaging Spam
         2.1.2.3 Forum Spam
         2.1.2.4 Online Games Messaging
      2.1.3 Spam Filtering Methods
         2.1.3.1 Rule Based (Heuristic) Filtering
         2.1.3.2 Bayesian Filtering
      2.1.4 Mail Filtering: SpamAssassin
      2.1.5 How Spammers Operate
   2.2 Email Server Overview
      2.2.1 How Does Email Work
      2.2.2 Email System
         2.2.2.1 Simple Mail Transfer Protocol (SMTP)
         2.2.2.2 POP3 Server
         2.2.2.3 IMAP Server
      2.2.3 Email Filtering
      2.2.4 Understanding of Mail Transfer Agent (MTA), Mail User Agent (MUA) and Mail Delivery Agent (MDA)
         2.2.4.1 Mail Transfer Agent
         2.2.4.2 Mail Delivery Agent
         2.2.4.3 Mail User Agent

3. METHODOLOGY
   3.1 Introduction
   3.2 Hardware and Software Requirement
      3.2.1 Minimum Hardware Requirement
      3.2.2 Software Requirement
   3.3 Process Phase
      3.3.1 Information Gathering
      3.3.2 Planning
      3.3.3 Installation and Configuration
         3.3.3.1 SuSE Linux 10.0
         3.3.3.2 Postfix Mail Transfer Agent (MTA)
         3.3.3.3 MailScanner
         3.3.3.4 MailWatch
         3.3.3.5 SpamAssassin
         3.3.3.6 Bayesian Database
      3.3.4 Testing
      3.3.5 Analysis

4. FINDINGS AND DISCUSSION
   4.1 Components of Email Server with Spam Filtering
      4.1.1 Postfix
      4.1.2 MailScanner
      4.1.3 MailWatch
      4.1.4 SpamAssassin
   4.2 Rule Based Filtering Versus Bayesian Filtering
   4.3 Spam Filtering Analysis

5. CONCLUSION AND RECOMMENDATION
   5.1 Conclusion
   5.2 Recommendation
      5.2.1 Implementation of Anti Virus at Email Server
      5.2.2 Administrator Maintenance

REFERENCES

APPENDIX
   APPENDIX A
   APPENDIX B
   APPENDIX C
   APPENDIX D

LIST OF TABLES

TABLE 3.1: Hardware Requirement
TABLE 3.2: Software Requirement

LIST OF FIGURES

Figure 1: SMTP and POP3
Figure 2: Mail Transfer Agent (MTA) and Mail User Agent (MUA)
Figure 3: The YaST Control Center
Figure 4: SuSE Linux 10.0 Interface
Figure 5: Network Diagram
Figure 6: MailScanner configuration files editing
Figure 7: MailWatch Interface
Figure 8: Today Total Scanning of Email
Figure 9: Detailed of Email
Figure 10: Total Mail Processed By Date
Figure 11: Total of Incoming Email

LIST OF ABBREVIATIONS

Bayesian filtering: The process of using Bayesian statistical methods to classify documents into categories.
GTUBE: Generic Test for Unsolicited Bulk Email. Provides a test by which one can verify that the filter is installed correctly and is detecting incoming spam.
Heuristic filtering: Rule based (heuristic) filtering applies a set of rules to each incoming message to detect spam.
IMAP: Internet Message Access Protocol. A protocol that lets users access their email on a mail server. It serves the same function as POP3.
MDA: Mail Delivery Agent. Software that accepts incoming email messages and distributes them to recipients' individual mailboxes or forwards them back to an SMTP server.
MTA: Mail Transfer Agent. A computer program or software agent that is responsible for receiving, routing, and delivering email messages from one computer to another.
MUA: Mail User Agent. The program that users use to read, write and send email.
POP3: Post Office Protocol 3. A standard protocol for receiving email, implemented in both client and server software.
SMTP: Simple Mail Transfer Protocol. A standard protocol for email transmission over the Internet.
YaST: Yet Another Setup Tool. An operating system setup and configuration tool featured in the SuSE Linux distribution.

ABSTRACT

Spam is a serious problem that increasingly plagues users of the Internet. Programs known as spam filters are employed to help the user decide whether an email is worth reading. This paper focuses on building a mail server with a spam filter and analysing the effectiveness of the rule based (heuristic) and Bayesian filtering methods used in SpamAssassin. The mail server combines freely available open source software into a powerful, low-cost option for fighting spam. The rule based method uses a set of rules to decide whether an incoming email is spam. The Bayesian method, after a training period, uses word probabilities to decide whether an incoming email belongs to the spam or the legitimate mail category. The combination of both methods shows that SpamAssassin is a better spam checker than either method alone.

CHAPTER 1
INTRODUCTION

1.1 PROJECT BACKGROUND

Electronic mail (email) is now considered the easiest and most efficient way to communicate. Internet users can simply type a letter and, at the click of a button, instantaneously communicate with people all over the world. Despite the obvious advantages of email, its simplicity carries spam with it. The term spam refers to unsolicited, unwanted, and inappropriate bulk email. Spam is often referred to as Unsolicited Bulk Email (UBE), Excessive Multi-Posting (EMP), Unsolicited Commercial Email (UCE), Unsolicited Automated Email (UAE), bulk mail or just junk mail.

Spammers use many tactics to obtain the email addresses they send spam to. On the Internet, they can collect addresses from newsgroup postings, web pages or mailing lists. Other tactics are social engineering, such as chain letters, and purchasing addresses from other spammers. They also use computer programs called robots or spiders to harvest email addresses from websites.

The problem with spam is that it makes up 30% to 60% of mail traffic and is on the rise. It can slow mail traffic down. When spam is received and stored, it can cause problems such as the mailbox shutting down. Spam is sometimes difficult or impossible to unsubscribe from. Users also waste time managing and deleting it, which has a negative effect on productivity.

This research presents an analysis of server-side spam filters, including the issues raised by spam and a study of a Linux mail server. The spam filters are discussed in terms of the methods or algorithms they use to filter spam.

The core of this research is the analysis. Our main goal was to compare the effectiveness of the rule based filtering method and the Bayesian filtering method used in SpamAssassin. We wanted to see how well each filter worked and how effective it was at solving the spam problem. To decide which method filters spam more effectively, the analysis looks at how the tool deals with spam. We look at conditions such as whether the program is effective at keeping spam in the spam folder or quarantine area, at keeping spam out of the inbox so that only legitimate mail enters it, and at keeping legitimate mail out of the spam folder or quarantine area. The study also includes simulating a mail server based on open source software, using the Postfix mail transfer agent (MTA).

1.2 PROBLEM STATEMENT

Spam is junk email that is sent to you by someone who has no existing relationship with you. Whether it is called unsolicited commercial email (UCE) or unsolicited bulk email (UBE), spam is defined by the fact that the recipients did not request the mail or reveal their email addresses for the purpose of receiving such mail.

Spam creates a variety of problems for consumers, businesses, and Internet Service Providers (ISPs). From unwanted pornographic images to overtaxed servers, these problems differ in type and severity.

On the consumer side, spam can abuse privacy. In most cases, the spammer knows little or nothing about the consumer. Billions of emails are sent every day, such as invitations to meet new friends or to buy products in which the recipients have no interest at all. Spam also becomes a crime problem for consumers when it includes pornography. Where ISPs charge consumers for time spent online, downloading and reading spam increases the consumer's cost. Nowadays most ISPs charge their customers a flat rate, so this is less of a problem. However, many mobile Internet devices charge users by the hour or minute spent online, so spam remains a cost problem for those users.

Spam affects businesses too. When hit by a spam attack, a business needs to invest resources to reduce the problem. For example, if mail has been sent from a computer inside the company, the company needs to check whether that mail carries a virus or is spam. That means the company's technical support costs will increase. Spammers occasionally put the name of a legitimate company in the email to make recipients believe that the email comes from a well known company or is sent with its authorization. When recipients think the company is sending spam, its reputation and the trust of its consumers suffer. This is an example of the spoofing problem. Spam also affects how legitimate businesses market their products.
Many consumers subscribe to email lists from well known companies in order to receive special discount offers on products. However, these emails are sometimes classed as spam by spam filtering products or by recipients. Filter products that recognize words common in spam, such as sale, order or price, sometimes block wanted email as spam. This makes email marketing more difficult. Another problem is sexual harassment, which can reduce worker productivity.

Spam troubles ISPs because it uses large amounts of bandwidth and storage space and increases technical support costs. To deal with spam, ISPs must build sophisticated programs into their systems. Another problem on the ISP side is server strain. When a large amount of email is sent and received in a short period of time, it puts a strain on the ISP's resources. ISPs have to upgrade their equipment and pay higher bandwidth bills to deal with the rise in traffic. Sometimes spammers send spam to many combinations of common names at popular domain names. For example, they might send to smith@yahoo.com or asmith@yahoo.com. This puts a huge drain on the ISP's servers, and bounce messages are returned for addresses that never existed. When an ISP does not give the best service, customers complain. This reduces the time that help desk or customer service staff have for other work, because they must deal with customer concerns about spam.

1.3 OBJECTIVES

The objectives of this research are:
i. To create a mail server with spam filtering using open source software.
ii. To analyse the effectiveness of the spam filtering methods (rule based and Bayesian) in SpamAssassin.

1.4 PROJECT SCOPE

The scope of this research includes:
i. Filtering spam on the server side, using SpamAssassin to filter email.
ii. Creating a mail server using open source software: SuSE Linux 10.0 as the operating system and Postfix as the Mail Transfer Agent (MTA).
iii. Analysing the effectiveness of SpamAssassin at catching spam using the rule based and Bayesian filtering methods.
iv. Building a small Local Area Network (LAN) consisting of one or two hosts and a mail server.

1.5 SIGNIFICANCES OF STUDY

This thesis focuses mainly on developing a spam filtering server using open source software and network devices. By using open source software, users can create their own spam filter and reduce spam at low cost. Managing and deleting spam or unwanted messages has a negative effect on user productivity. With this spam filtering server, users no longer need to waste time deleting unwanted messages or spam from their mailboxes.

An additional benefit of this project is that it provides valuable information about spam filtering, so that problems such as overloaded mail traffic, mailbox shutdown, and wasted disk storage on the mail server can be reduced and the network can work at high performance. Another advantage is that it gives future researchers a better understanding of spam.

1.6 ORGANIZATION OF THIS REPORT

This chapter has laid out the problem statement of the project, discussed its objectives and scope, and described its expected significance. The rest of the report is structured as follows:

Chapter 2: This chapter reviews the literature. It discusses spam filtering analyses that have been done by others, as well as related studies of spam and Linux mail servers.

Chapter 3: This chapter focuses on the methodology used for this project. It describes the procedures followed to achieve the project's objectives.

Chapter 4: This chapter discusses the findings of the project. It explains the testing that was done and the analysis of the results.

Chapter 5: The last chapter presents the conclusion and recommendations for this project.

CHAPTER 2
LITERATURE REVIEW

2.1 SPAM OVERVIEW

This section discusses spam: its definition, spamming in different media, spam filtering methods, and the spam filtering tools used in this project.

2.1.1 Definition of Spam

According to Paul Lalor (2004), spam refers to unsolicited, unwanted, inappropriate bulk email and is also often referred to as Unsolicited Bulk E-mail (UBE), Excessive Multi-Posting (EMP), Unsolicited Commercial E-mail (UCE), Unsolicited Automated E-mail (UAE), spam mail, bulk email or just junk mail.

Another definition comes from the Mail Abuse Prevention System (MAPS) (2005). An electronic message is spam if the recipient's personal identity and context are irrelevant because the message is equally applicable to many other potential recipients; and the recipient has not verifiably granted deliberate, explicit, and still-revocable permission for it to be sent; and the transmission and reception of the message appears to the recipient to give a disproportionate benefit to the sender.

2.1.2 Spamming In Different Media

According to Wikipedia (2005), spam appears in many different media, such as email, instant messaging, forums and online game messaging.

2.1.2.1 Email Spam

Email spam is the most common form of spamming on the Internet. It involves sending identical or nearly identical unsolicited messages to a large number of recipients, using email addresses harvested by spambots. Spammers also create email viruses that turn an unprotected PC into a "zombie computer". The zombie informs a central unit of its existence, and the central unit commands the zombie to send a low volume of spam. Other tactics for obtaining email addresses are social engineering, such as chain letters, and purchasing addresses from other spammers.

There are four types of email spam:
i. Unsolicited commercial email (UCE) is an email message, received without asking for it, that markets a product or service. It is also called junk email.
ii. Unsolicited bulk email (UBE) refers to email messages that are sent in bulk to thousands of recipients. UBE may be commercial in nature, in which case it is also UCE, but it may be sent for other purposes such as political lobbying or irritation.
iii. Make money fast (MMF) messages, often in the form of chain letters or multi-level marketing schemes, are messages that suggest you can get rich by sending money to the top name on a list, removing that name, adding your name to the bottom of the list, and forwarding the message to other people.

iv. Reputation attacks are messages that appear to be sent from one person or organization, but are actually sent from another. The purpose of the messages is not to promote a particular service or product, but to make the recipients angry at the apparent sender. The cruellest reputation attacks include the actual email addresses, phone numbers, and street addresses of the victim.

2.1.2.2 Messaging Spam

Messaging spam, sometimes called spim, uses instant messaging systems such as AOL Instant Messenger or ICQ to send spam. To send messaging spam, spammers need scriptable software and the recipients' usernames. Instant messaging systems offer a directory of users, including user information such as age and sex, so it is easy for spammers to gather information and send unsolicited messages to the system's users.

2.1.2.3 Forum Spam

Spamming an Internet forum is when a user posts something that has nothing to do with the current subject. It can also mean that a person repeatedly posts about a certain subject in a manner that is unwanted by the general population of the forum. Lastly, there is the case where a person posts messages solely for the purpose of increasing his or her ranking on the forum.

2.1.2.4 Online Games Messaging

Online games usually allow players to chat with other players in chat rooms. Spammers also use this service to send spam, such as promotions for certain websites or online stores, to other players.

2.1.3 Spam Filtering Methods

There are many ways to approach the problem of filtering spam. This section describes the two methods of filtering spam that are used in this project: rule based (heuristic) filtering and Bayesian filtering.

2.1.3.1 Rule Based (Heuristic) Filtering

According to Paul Lalor (2004), heuristic spam filtering uses a feature-matching rule set, gained through experience, to capture spam. Through detailed analysis of incoming email based on carefully designed rules, heuristic filtering assigns a numerical value or score to each message (calculated by assigning a particular number to different words contained in the mail). This score is used to determine whether the message is likely to be spam. For example, if the email contains obvious spam words (sexy, debt, loan, etc.), they are assigned a higher score than innocent words. Finally, an email is classed as spam if its score exceeds a certain threshold value. Through years of learning what spam (and non-spam) mail usually looks like, the default set of rules, and as a result the scores assigned by them, have become very reliable and effective in detecting what is and what is not spam. An example of a rule could be that all email that contains the text "order confirmation", or text in red, is classed as spam.
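The scoring and threshold steps described above can be illustrated with a minimal Python sketch. It is not the rule set used in this project: the word weights, the threshold value and the function names are assumptions chosen only to show the mechanism.

```python
# Hypothetical word weights; real rule sets are far larger and hand-tuned.
WORD_SCORES = {"sexy": 2.0, "debt": 1.5, "loan": 1.5, "free": 1.0}
THRESHOLD = 3.0  # assumed cut-off, not a value taken from the project


def heuristic_score(message: str) -> float:
    """Sum the weights of every known spam word found in the message."""
    words = message.lower().split()
    return sum(WORD_SCORES.get(word.strip(".,!?"), 0.0) for word in words)


def is_spam(message: str) -> bool:
    """Class the message as spam when its score exceeds the threshold."""
    return heuristic_score(message) > THRESHOLD


print(is_spam("Clear your debt with a free loan today!"))  # True
print(is_spam("Meeting moved to Monday morning."))         # False
```

The first message scores 1.5 + 1.0 + 1.5 = 4.0, which exceeds the assumed threshold of 3.0. The second contains no listed words and scores 0.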

Individual rules are easy to write, but the method only works well with a set of rules that makes sense. The problem with this method is that the rules are written by people looking for the obvious characteristics of spam, while spammers' techniques advance every day. For example, a message with the subject "F R E E S E X" is spam, but a rule based program might ignore it because of the spaces between the letters. Another problem is that the more comprehensive a rule based program gets, the slower it runs. Refer to Appendix D for an example of a rule set.

2.1.3.2 Bayesian Filtering

According to Wikipedia (2006), Bayesian spam filtering is the process of using Bayesian statistical methods to classify documents into categories. Using well known mathematics, it is possible to generate a spam-indicating probability for each word. Bayesian filtering differs from other methods because it learns. To decide whether incoming mail is spam, the filter needs to know about the mail that the user receives. Since the test is based on words and their frequencies, the filter keeps a table recording how often each word appears in the user's mail. Word counts for spam are kept in a separate table, and the probabilities are calculated from these tables. Bayes' rule gives this probability:

$$P(\text{spam} \mid \text{word}) = \frac{P(\text{word} \mid \text{spam})\, P(\text{spam})}{P(\text{word})}$$

For example, most email users encounter the word Viagra in spam email, but rarely see it in other email. The filter does not know these probabilities in advance and must be trained first so that it can build them up. During training, the user must manually indicate whether each email is spam. The filter then updates the probabilities in its spam or legitimate email database for every word in the training email. After training, the word probabilities are used to compute the probability that an incoming email belongs to the spam or the legitimate mail category. Email that is classed as spam is automatically moved to the spam mail folder or deleted outright.

2.1.4 Mail Filtering - SpamAssassin

SpamAssassin is a mail filter, or classifier, which attempts to identify spam using a variety of mechanisms including text analysis and Bayesian filtering. It examines each message accessible to it and assigns a score representing the chance that the mail is spam. SpamAssassin uses a wide range of heuristic tests on mail headers and body text to identify spam, also known as unsolicited commercial email. SpamAssassin also includes a Bayesian learning filter, so it is worthwhile to train SpamAssassin with your own collection of non-spam and spam. This makes it more accurate when checking incoming mail.

SpamAssassin is an intelligent email filter which uses a diverse range of tests to identify spam. These tests are applied to email headers and content to classify email using advanced statistical methods. In addition, SpamAssassin has a modular architecture that allows other technologies to be quickly wielded against spam, and it is designed for easy integration into virtually any email system.
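The training step and the word-probability calculation of Section 2.1.3.2 can be sketched in a few lines of Python. This is a toy illustration, not SpamAssassin's Bayesian implementation: the class name, the add-one smoothing and the assumption that words are independent are choices made for this example only.

```python
from collections import Counter


class BayesFilter:
    """Toy word-frequency filter; names and smoothing are illustrative assumptions."""

    def __init__(self) -> None:
        self.counts = {"spam": Counter(), "ham": Counter()}  # one word table per category
        self.totals = {"spam": 0, "ham": 0}                  # messages seen per category

    def train(self, text: str, label: str) -> None:
        """Record which words appear in a message the user has labelled."""
        self.totals[label] += 1
        self.counts[label].update(set(text.lower().split()))

    def word_prob(self, word: str, label: str) -> float:
        """P(word | label), with add-one smoothing so unseen words never give zero."""
        return (self.counts[label][word] + 1) / (self.totals[label] + 2)

    def spam_probability(self, text: str) -> float:
        """Combine per-word evidence with Bayes' rule, assuming independent words."""
        n = self.totals["spam"] + self.totals["ham"]
        p_spam = self.totals["spam"] / n
        p_ham = self.totals["ham"] / n
        for word in set(text.lower().split()):
            p_spam *= self.word_prob(word, "spam")
            p_ham *= self.word_prob(word, "ham")
        return p_spam / (p_spam + p_ham)


# Usage: the filter must be trained on labelled mail before it can classify.
f = BayesFilter()
f.train("cheap viagra offer", "spam")
f.train("viagra discount now", "spam")
f.train("project meeting agenda", "ham")
f.train("lunch meeting tomorrow", "ham")
print(f.spam_probability("viagra offer") > 0.5)    # True
print(f.spam_probability("meeting agenda") < 0.5)  # True
```

With these four training messages, "viagra offer" gets a spam probability of about 0.86 and "meeting agenda" about 0.14. A practical filter would work with sums of logarithms rather than products to avoid numerical underflow on long messages.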

Based on comments from many sources, we felt that SpamAssassin was the best rule based filter available at the time we started our project, which is why we chose to use and test it. SpamAssassin uses a point system when analysing an email. Every email is scanned for instances of each characteristic on its list. If a characteristic is found, the email gains the number of points associated with that characteristic. Points can be both negative and positive. For example, if an email contains the word GUARANTEED in the subject line, the email's score will increase by several points. If the final score of an email is over a certain threshold, configurable by the user, the email is flagged as spam. The list of email characteristics is broad, ranging from specific words to the color of the text. It is also configurable by the user.

The SpamAssassin creators keep a database of emails, the rules they matched, and whether or not those emails were spam. It is not necessary to store the contents of the emails after the matching rules are calculated, so the actual emails are not stored. The SpamAssassin team encourages users to contribute to this database to improve SpamAssassin's accuracy.

Our primary worry with SpamAssassin was its static character. Everyone using SpamAssassin has the same settings, so a spammer could easily test a new message on the latest version of the filter to verify whether it would be blocked. There are even services available online that save the spam designer from maintaining the latest versions of all spam prevention software. This static character is, however, an advantage when implementing the filter. Since it does not learn, its effectiveness is not based on user input and as a result it requires no maintenance.
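The point system described above, with named characteristics, positive and negative points and a user-configurable threshold, might look like the following Python sketch. The rule names, patterns, point values and default threshold are all hypothetical; they are not real SpamAssassin rules or defaults.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    name: str      # hypothetical rule name, not a real SpamAssassin rule
    part: str      # which part of the message to test: "subject" or "body"
    pattern: str   # regular expression that triggers the rule
    points: float  # positive pushes towards spam, negative towards legitimate


RULES = [
    Rule("SUBJ_GUARANTEED", "subject", r"\bGUARANTEED\b", 2.5),
    Rule("BODY_ORDER_NOW", "body", r"(?i)\border now\b", 1.5),
    Rule("BODY_QUOTED_REPLY", "body", r"(?m)^> ", -1.0),
]


def score_message(subject: str, body: str, threshold: float = 3.0):
    """Return (total points, names of rules hit, spam flag) for one message."""
    parts = {"subject": subject, "body": body}
    hits = [r for r in RULES if re.search(r.pattern, parts[r.part])]
    total = sum(r.points for r in hits)
    return total, [r.name for r in hits], total > threshold


print(score_message("GUARANTEED results", "Order now and save!"))
# (4.0, ['SUBJ_GUARANTEED', 'BODY_ORDER_NOW'], True)
```

Returning the names of the rules that matched, and not only the total, mirrors the idea that the matching rules can be recorded without storing the email itself.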

2.1.5 How Spammers Operate

Unlike junk paper mail, email spam costs the sender very little to send. Almost all of the costs are paid by the recipient and the carriers, because the spammer does not have to pay for all of the Internet bandwidth tied up in the delivery of the spam. Because they have no incentive to be efficient in their mass emailing, spammers usually do not put much effort into verifying email addresses. They use automatic programs called bots to scour the Web and Usenet newsgroups, collecting addresses, or buy them in bulk from other companies.

One of the most common tricks used by spammers is to relay messages through the email server of an innocent third party. This tactic doubles the damage: both the receiving system and the innocent relay system are flooded with spam. For any mail that gets through, the flood of complaints often goes back to the innocent site because it was made to look like the origin of the spam. Many spammers send their spam from a free account at a large provider such as AOL, Yahoo!, or Hotmail, then abandon the account and open a new one to use for the next assault.

Another common trick that spammers use is to forge the headers of messages, making it appear as though the message originated elsewhere. This is called spoofed email. There are some pieces of information in the full headers that the spammer cannot forge, but even after technical investigation into the source of the message, the resulting information most often leads to a dead end.
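To make the remark about full headers concrete, the following Python sketch uses the standard library `email` package to list the `Received` lines of a message. The sample message is fabricated for illustration, and all host names and addresses in it are reserved example values. The sketch assumes the usual behaviour in which every relay prepends its own `Received` line, so that lines added by the recipient's own servers are more trustworthy than anything supplied further down the chain.

```python
from email import message_from_string

# Fabricated sample message; hosts and addresses are reserved example values.
RAW = """\
Received: from mx.example.net (mx.example.net [192.0.2.10])
\tby mail.example.org with SMTP; Mon, 3 Apr 2006 10:00:00 +0800
Received: from unknown (HELO friendly-bank.example) (198.51.100.7)
\tby mx.example.net with SMTP; Mon, 3 Apr 2006 09:59:58 +0800
From: "Friendly Bank" <support@friendly-bank.example>
Subject: Account notice

Please confirm your details.
"""

msg = message_from_string(RAW)

# The From header is whatever the sender typed, so on its own it proves nothing.
print("Claimed sender:", msg["From"])

# Each relay prepends its own Received line, so the list runs newest first.
received_lines = msg.get_all("Received", [])
for hop, received in enumerate(received_lines, start=1):
    print(hop, " ".join(received.split()))
```

In this sample the second `Received` line records the connecting address 198.51.100.7 as seen by the relay, regardless of the name the sender claimed in its greeting.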