Spam, Spam and More Spam
cs5480/cs6480
Matthew J. Probst
*with some slides/graphics adapted from J.F. Kurose and K.W. Ross

Spammers: Cost to Send
- Assuming a $10/month dial-up account, about 13.4 million messages might be sent per month.
- A cost of about 1 penny per 14,300 messages.
- Free trials make it free!

$$ You: Cost to Receive $$
- 10+ billion spam messages are sent each day.
- At 5 seconds per spam (to recognize and delete), that's 50 billion seconds of lost productivity each day (39,457 work years).
- Assuming a $36k average income per person: $1.5 billion per day in lost productivity to the economy.

Driving Business Incentives
- Pump-and-dump penny stocks
- Scams: Nigerian investments, phishing, etc.
- Meds
- Insurance
- Porn
- Loans/mortgages
- Others

Botnets and Spammers
[Diagram: Vendor, Spammer and Bot controller; botnet uses: DDoS, Replication, Spam]
- Example: the Storm worm, currently running on up to 50 million infected computers.
- More computing power than the top 500 supercomputers in the world combined!
- Used for DDoS attacks, penny-stock spam and propagating itself via email.

Mail access protocols
- SMTP: delivery/storage to the receiver's server
- Mail access protocol: retrieval from the server
  - POP: Post Office Protocol [RFC 1939]: authorization (agent <--> server) and download
  - IMAP: Internet Mail Access Protocol [RFC 1730]: more features (more complex), manipulation of stored msgs on server
  - HTTP: Hotmail, Yahoo! Mail, etc.
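
Ideal place to filter?
- Source machine
- Source MTA server
- In the middle of the network
- Recipient MTA server
- Recipient machine
- Pros & cons of each.

ISP IP block white-listing
[Diagram: customer host 12.1.1.5 sends SMTP through the ISP's mail server and retrieves mail via POP3 or IMAP; only 12.1.X.X allowed!]
- Source MTA filter.
- ISPs allow any IP address in their own blocks to relay through their mail servers.
- Disallows mobility (a roaming customer is outside the allowed blocks).
- Allows viruses (an infected customer machine is still inside the allowed blocks).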
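
The IP block white-listing check can be sketched in a few lines. This is only an illustration: the block `12.1.0.0/16` is taken from the slide's 12.1.X.X example, and the names `ALLOWED_BLOCKS` and `may_relay` are made up for this sketch.

```python
import ipaddress

# Assumed address block owned by the ISP (the slide's 12.1.X.X).
ALLOWED_BLOCKS = [ipaddress.ip_network("12.1.0.0/16")]


def may_relay(client_ip: str) -> bool:
    """Return True if the connecting client is inside one of the ISP's blocks."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in block for block in ALLOWED_BLOCKS)


print(may_relay("12.1.1.5"))  # customer on the ISP's own network
print(may_relay("13.1.1.1"))  # roaming customer or outsider
```

The second call shows the mobility problem: a legitimate customer connecting from another network looks the same as an outsider.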

SMTP-AUTH
[Diagram: client sends username and password over SMTP to the source MTA; mail is retrieved via POP3 or IMAP]
- Source MTA requires name/password before relaying a message.
- Only the ISP's own customers are allowed to relay.
- Optional: block all other outgoing SMTP.
- Allows mobility, blocks dumb viruses.
- Still open to free-trial ISP accounts and fraudulently acquired accounts.

Rate throttling
[Diagram: SMTP limited to 25 M/H]
- Simple: source MTA limits the number/rate of emails from individual senders.
- Limit on:
  - Max recipients per message
  - Max messages per time period
  - etc.
- Problems:
  - Spammers can code their own MTAs.
  - Millions of throttled bots can still spam-a-lot!
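
The two limits named on the rate-throttling slide (max recipients per message, max messages per time period) might be enforced with a sliding window per sender. A minimal sketch, assuming in-memory state; the class name `SenderThrottle` and its default limits are illustrative, not taken from any particular MTA.

```python
from collections import defaultdict, deque


class SenderThrottle:
    """Per-sender sliding-window limiter (illustrative defaults)."""

    def __init__(self, max_messages=25, window_seconds=3600, max_recipients=50):
        self.max_messages = max_messages
        self.window_seconds = window_seconds
        self.max_recipients = max_recipients
        self._sent = defaultdict(deque)  # sender -> timestamps of accepted messages

    def allow(self, sender, recipient_count, now):
        """Return True and record the message if it is within both limits."""
        if recipient_count > self.max_recipients:
            return False
        history = self._sent[sender]
        # Forget messages that have fallen out of the time window.
        while history and now - history[0] >= self.window_seconds:
            history.popleft()
        if len(history) >= self.max_messages:
            return False
        history.append(now)
        return True


throttle = SenderThrottle(max_messages=2, window_seconds=60, max_recipients=3)
print([throttle.allow("alice", 1, t) for t in (0, 1, 2, 61)])
```

Each bot in a botnet is a separate sender here, which is why millions of throttled bots can still send a lot in aggregate.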

SPF (Sender Policy Framework)
[Diagram: recipient MTA filter receives mail relayed from 13.1.1.1 and asks Alice.com "spf? (13.1.1.1)"]
- Recipient MTA filter.
- TXT DNS record on a domain that lists the authorized relays for email marked as coming from that domain.
- Only effective with mass adoption.
- Spammers can comply with SPF too.

Relay Blacklists (RBLs)
[Diagram: recipient MTA filter asks rbl1, rbl2 and rbl3 "13.1.1.1 ok?"]
- Recipient MTA filter.
- DB of IP addresses (and blocks) that should not be allowed to relay email.
- 100s of lists publicly available.
- Mail servers commonly use several RBLs.
- Individually and group maintained.
- Conservative vs. ultraliberal inclusion.
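
RBLs are usually queried over DNS: the relay's IPv4 octets are reversed and prepended to the list's zone, and any answer means "listed". The sketch below follows that convention. The zone names `rbl1.example` and `rbl2.example` are placeholders, and `resolve` is injectable so the logic can be tried without network access.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up, e.g. 13.1.1.1 -> 1.1.1.13.<zone>."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def listed_on(ip, zones, resolve=socket.gethostbyname):
    """Return the zones that have an entry for this relay IP."""
    hits = []
    for zone in zones:
        try:
            resolve(dnsbl_query_name(ip, zone))
        except OSError:  # socket.gaierror (name not found) is a subclass
            continue
        hits.append(zone)
    return hits


print(dnsbl_query_name("13.1.1.1", "rbl1.example"))
```

A mail server using several lists would pass them all in `zones` and decide what to do based on how many of them report a hit.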

Relay Blacklists (RBLs) cont.
- Spamhaus stats: http://www.spamhaus.org/statistics/
- Take-it-or-leave-it, one-size-fits-all (a list is either too aggressive or too passive).
- Central RBL servers are easy to DDoS.
- If done within the network, it prevents SMTP-AUTH.

Relay White-lists
[Diagram: recipient MTA filter asks wl1, wl2 and wl3 "13.1.1.1 ok?"]
- Recipient MTA filter.
- Automatically allows email from specific domains, relays and senders through.
- Easy to get out of date?
- Spammers can use legitimate email addresses, ISPs and domains (botnets, etc.).

Greylists
- Don't fully allow (not a whitelist).
- Don't completely block (not a blacklist).
- Slow down handshaking & negotiation (tarpit) and/or take more time/resources to scan.
- Tarpitting doesn't block very determined spammers.

Tricking Spammers
[Diagram: "bob.com mx?" returns 14.1.1.1 and 14.1.1.2; SMTP to the fake MTA at 14.1.1.1 fails; the real Bob.com MTA is at 14.1.1.2, with mail retrieved via POP3 or IMAP]
- Require MTAs to adhere to the full SMTP RFC.
- Point the primary MX record at a null sink.
- Point the secondary MX record at the real MTA.
- Spammers can make their MTAs smarter.
- Some spammers use existing ISP MTAs.
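
The MX trick relies on a standards-following sender falling back to the next MX record when the preferred one fails, while a naive spam engine tries only the first. A small simulation of that difference, using the slide's addresses; the function `deliver` and the preference values 10 and 20 are illustrative assumptions.

```python
def deliver(mx_records, accepting_hosts, tries_all_mx):
    """Return the host that accepted the message, or None if delivery failed.

    mx_records: list of (preference, host); a lower preference is tried first.
    accepting_hosts: hosts that actually run an MTA.
    tries_all_mx: True for a sender that falls back to later MX records.
    """
    ordered = [host for _, host in sorted(mx_records)]
    candidates = ordered if tries_all_mx else ordered[:1]
    for host in candidates:
        if host in accepting_hosts:
            return host
    return None


# Primary MX points at a null sink, secondary MX at the real MTA.
bob_mx = [(10, "14.1.1.1"), (20, "14.1.1.2")]
real_mtas = {"14.1.1.2"}

print(deliver(bob_mx, real_mtas, tries_all_mx=True))   # compliant MTA
print(deliver(bob_mx, real_mtas, tries_all_mx=False))  # naive spam engine
```

A spammer who sets `tries_all_mx` to True in their own software, or who relays through an ISP's compliant MTA, gets through, which is the weakness the slide lists.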

Domain Keys Identified Mail (DKIM)
[Diagram: the sender MTA for Alice.com signs the message; the recipient MTA asks Alice.com "Pub Key?", receives <PubKey> and authenticates the message]
- Sender MTA signs the message hash with its private key.
- Adds the signature as a new header: DKIM-Signature (DomainKey-Signature in the older DomainKeys scheme).
- Recipient MTA uses a TXT record to find the public key and authenticate the signature.
- Needs wide adoption.
- Spammer domains can conform.
- Spammers can use a legitimate ISP account.

S/MIME Signatures
[Diagram: sender signs the message with a cert issued by a CA; recipient verifies the signature]
- Senders obtain a digital cert from a legitimate Certificate Authority (CA).
- Can use the cert for both signing and encryption of messages.
- Recipients can verify certs via the certificate chain (just like web browsers).
- Needs wide adoption.
- Cost of a per-sender cert.
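
For the DKIM slide, the "message hash" half of the process can be illustrated with the standard library alone. The sketch below computes a SHA-256 body hash using the "simple" body canonicalization described in RFC 6376 (trailing empty lines are dropped, and the body always ends in a single CRLF). Signing that hash with the domain's private key is deliberately left out, since Python's standard library has no RSA implementation; the function names are illustrative.

```python
import base64
import hashlib


def canonicalize_body_simple(body: bytes) -> bytes:
    """Drop trailing empty lines and make the body end with exactly one CRLF."""
    while body.endswith(b"\r\n\r\n"):
        body = body[:-2]
    if not body.endswith(b"\r\n"):
        body += b"\r\n"
    return body


def body_hash(body: bytes) -> str:
    """Base64-encoded SHA-256 of the canonicalized body."""
    digest = hashlib.sha256(canonicalize_body_simple(body)).digest()
    return base64.b64encode(digest).decode("ascii")


# Trailing blank lines do not change the hash.
print(body_hash(b"Hello Bob\r\n") == body_hash(b"Hello Bob\r\n\r\n\r\n"))
```

The recipient MTA recomputes the same hash from the message it received; the signature check then ties that hash to the public key published in the sender domain's TXT record.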

Bayesian Content Filters
[Diagram: recipient filter looks up Hash("Viagra") in the spam DB: SPAM!]
- Recipient filter.
- Individualized DB.
- Requires training.
- Learns common words & phrases from spam.
- A spam score is given to each message.
- Weak against randomized spam content, misspellings and jpeg/pdf spam.

Vipul's Razor
[Diagram: recipient filter computes the signature 2e821f039 and asks the Razor DB "ok?"]
- Recipient filter.
- Hash of the email body or paragraphs (the message's "signature").
- Look up this signature in a centralized DB of known spam.
- Only authorized reporters can register spam signatures.
- Weak against randomized content and jpeg/pdf spam.
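
The train-then-score loop of a Bayesian content filter can be sketched as a tiny naive Bayes classifier over whitespace-separated words. This is a teaching sketch with add-one smoothing, not the algorithm of any specific product; the class name `TinyBayes` and the training messages are made up.

```python
import math
from collections import Counter


class TinyBayes:
    """Word-count naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record the words of one message under 'spam' or 'ham'."""
        self.words[label].update(text.lower().split())
        self.messages[label] += 1

    def spam_probability(self, text):
        """Return the estimated probability that the message is spam."""
        vocab = set(self.words["spam"]) | set(self.words["ham"])
        total_msgs = sum(self.messages.values())
        log_score = {}
        for label in ("spam", "ham"):
            total_words = sum(self.words[label].values())
            # Smoothed prior, then one smoothed likelihood term per word.
            score = math.log((self.messages[label] + 1) / (total_msgs + 2))
            for word in text.lower().split():
                count = self.words[label][word]
                score += math.log((count + 1) / (total_words + len(vocab) + 1))
            log_score[label] = score
        # Normalize in log space to avoid underflow on long messages.
        top = max(log_score.values())
        spam = math.exp(log_score["spam"] - top)
        ham = math.exp(log_score["ham"] - top)
        return spam / (spam + ham)


f = TinyBayes()
f.train("cheap viagra now", "spam")
f.train("viagra loans cheap", "spam")
f.train("meeting notes attached", "ham")
f.train("lunch meeting tomorrow", "ham")
print(f.spam_probability("cheap viagra") > 0.5)
print(f.spam_probability("v1agra ch3ap") > 0.5)
```

The last line illustrates the misspelling weakness: words the filter has never seen carry no evidence either way, so the score falls back to the prior.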

Spam Training Honeypots
- Dedicate an inbox to receive only spam.
- Use a randomly generated name (asdf@domain.com) or a common but unused name (bob@jones.com).
- Email received by this box can be fed to the Bayesian filter, Vipul's Razor & personal RBLs.

What is used today?
- Combination of all of these techniques.
- SpamAssassin as an example.
- RBLs are low-hanging fruit: they commonly block 80%+ of spam.

Remaining Problems
- Increased client mobility.
- P2P email (no reliance on central scanners or CA).
- Fast vs. slow path selection based on trust of the sender & the sender's email path.
- Fast reaction to entity behavior changes ("zombiefication" of hosts).

Micro-payments
- Senders pay a fraction of a cent for each email they send.
- Won't deter normal emailers, but would definitely stop many spammers.
- Variation: rather than charge for each email, force all emailers to put $$ in escrow, only charging the account upon receiving a complaint.

Transitive Social-net Trust
[Diagram: Nancy, Jim, Carol, Alice and Bob linked by "trust" edges, with an email delivered along them]
- Based on "Small Worlds".
- No centralized filters.
- Can be completely P2P.
- Trust levels are constantly changing (fast reaction to observed misbehaviors).

P2P Experience & RBL
- Users collect their own experience (positive and negative) and share it with their social peers.
- Users generate their own personal RBL mods based on their experience DB.
- Users query for neighbors' experiences using multicasting.
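
One way to picture transitive trust is as a weighted graph in which trust along a chain is the product of the direct trust values, and the best chain wins. The slides do not say how trust is combined, so the product rule, the edge directions and every number below are assumptions made purely for illustration; only the names come from the diagram.

```python
import heapq


def transitive_trust(graph, source, target):
    """Best-path trust from source to target, multiplying edge trusts in [0, 1]."""
    best = {source: 1.0}
    heap = [(-1.0, source)]  # max-heap via negated trust
    while heap:
        neg_trust, node = heapq.heappop(heap)
        trust = -neg_trust
        if node == target:
            return trust
        if trust < best.get(node, 0.0):
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, {}).items():
            candidate = trust * weight
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                heapq.heappush(heap, (-candidate, neighbor))
    return 0.0  # no chain of trust reaches the target


# Hypothetical direct-trust values: graph[a][b] is how much a trusts b.
trust_graph = {
    "Bob": {"Alice": 0.9, "Nancy": 0.4},
    "Alice": {"Carol": 0.8},
    "Carol": {"Jim": 0.5},
    "Nancy": {"Jim": 0.5},
}

print(round(transitive_trust(trust_graph, "Bob", "Jim"), 2))
print(transitive_trust(trust_graph, "Bob", "Mallory"))
```

Lowering a single edge after observed misbehavior immediately lowers every chain that passes through it, which matches the "constantly changing" trust levels on the slide.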

Dynamic Grey-listing
- Selectively decide which messages to send on the fast path (Layer 3) vs. through the tarpit (Layer 7, for further inspection).
- The fast path may include no scanning at all, freeing up scanning resources to be used on untrusted messages.
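
The fast-path vs. tarpit decision could be as simple as comparing a trust score against a threshold. In this sketch the weakest link among the sender and the relays on the message's path decides; that rule, the 0.8 threshold and the function name `choose_path` are assumptions, not something the slides specify.

```python
def choose_path(sender_trust, relay_trusts, threshold=0.8):
    """Return 'fast' for trusted mail, 'tarpit' for mail needing inspection.

    sender_trust: trust in the sender, in [0, 1].
    relay_trusts: trust in each relay the message passed through.
    """
    weakest = min([sender_trust, *relay_trusts])
    return "fast" if weakest >= threshold else "tarpit"


print(choose_path(0.9, [0.95, 0.85]))  # every hop trusted
print(choose_path(0.9, [0.3]))         # one untrusted relay
```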

Questions?
Questions / Comments / Feedback?