SMTP, Porcupine. Amin Vahdat CSE 123b May 4, 2006




Announcements
  Midterm: May 9
  Second assignment due May 15

Load Balancing Switches

F5 Networks 3DNS

Network Service Architecture
  Diagram labels: WAN, Router, Firewall, Load Balancing Switch (integrated?), Ethernet LAN Switches

Network Service Architecture
  Diagram labels: HTTP (front end), Load Balancing Switch, Application Servers, Ethernet LAN Switches, Storage Tier (Fileserver/Database)

Network Service Architecture
  Load balancing switch maps external virtual IP addresses to physical IP addresses in the internal network
  Diagram labels: HTTP (front end), Load Balancing Switch, Virtual Cluster 1 (site1.com), Virtual Cluster 2 (site2.com)
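The virtual-to-physical mapping can be illustrated with a minimal sketch. It assumes a simple round-robin policy and made-up addresses; the class name, the addresses, and the policy are hypothetical and are not taken from any particular switch.

```python
from itertools import cycle

# Hypothetical addresses chosen for illustration; none come from the slides.
VIRTUAL_CLUSTERS = {
    "203.0.113.10": ["10.0.1.1", "10.0.1.2", "10.0.1.3"],  # virtual IP for site1.com
    "203.0.113.20": ["10.0.2.1", "10.0.2.2"],              # virtual IP for site2.com
}


class LoadBalancingSwitch:
    """Maps an external virtual IP to one of several internal physical IPs."""

    def __init__(self, clusters):
        # One independent round-robin rotation per virtual cluster.
        self._rotations = {vip: cycle(servers) for vip, servers in clusters.items()}

    def pick_server(self, vip):
        """Return the next physical server for a connection to this virtual IP."""
        return next(self._rotations[vip])


switch = LoadBalancingSwitch(VIRTUAL_CLUSTERS)
picks = [switch.pick_server("203.0.113.10") for _ in range(4)]
print(picks)
```

Real switches usually offer other policies as well (least connections, fewest outstanding requests); round-robin is used here only because it is the shortest to write down.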

F5 Big-IP

Big IP: Cookie Encryption
    rule when HTTP_REQUEST {
        # Decrypt the cookie on its way to the server.
        HTTP::cookie decrypt "cookie_name" "password-key"
    }
    rule when HTTP_RESPONSE {
        # Encrypt the cookie on its way to the client.
        HTTP::cookie encrypt "cookie_name" "password-key"
    }

Big IP: Sanitize HTTP Headers From All Apps
    rule when HTTP_RESPONSE {
        # Remove all but the given headers.
        HTTP::header sanitize "ETag" "Content-Type" "Connection"
    }

Big IP: Remove SSN from Responses
    rule ssn_rule {
        when HTTP_RESPONSE_DATA {
            set payload [HTTP::payload [HTTP::payload length]]
            set ssnx "xxx-xxx-xxxx"
            # Find the SSN numbers
            regsub -all {\d{3}-\d{2}-\d{4}} $payload $ssnx new_response
        }
        # Replace the content if there was a match

Redline Networks
  Performance
  Security
  Availability
  Management: statistics, event-logging, etc.
  Flexibility: in-line deployment, or single connection w/redirect
  Multi-function

Redline Performance
  Compression Policy Engine
    Automatically compress content to reduce bandwidth consumption
    Optional compression of MIME types: doc, xls, ppt, flash, etc.
    Web services compression for server-to-server protocols including SOAP
  TCP Connection Management
    Terminates and persistently maintains client and server connections
    Servers typically see 1 connection for every 500-1,000 clients
  Transaction Brokering
    All requests upgraded to HTTP/1.1, full support for chunking and pipelining
  Never-Close Client Connections
    Eliminates unnecessary TCP slow-start for faster delivery
    Reduces client connection set-up time for lower latency

Redline Security
  Protocol Security
    Protocol scrubbing never passes packet fragments, ensures only valid, well-formed requests reach servers
    End-to-end and one-way SSL with accelerated download of secure content
  Network Security
    Defend against SYN flood and DoS attacks of 100Mbps+
    Support SYN cache and SYN cookie
  Launch-Point Security
    No UNIX shell, no "root" access
    Cannot install or run shell scripts, arbitrary programs, Trojan horses, etc.
  Deep Packet Inspection
    Block, log or re-write bad URLs, malicious requests
    Buffer overflow inspection + protection

Redline Availability
  Active-N High Availability
    Self-healing mesh of up to 64 boxes actively processing traffic to one or more VIPs with cascading failover
    Linear scaling: only deploy one box rather than a pair for additional capacity
    Also provides Active-Active and Active-Standby high availability
  Server Load Balancing
    Full layer 4-7 server load balancing including HTTP/S, TCP, UDP, and FTP protocols
    Fewest Outstanding Requests, round-robin, least connections, failover chaining, etc.
    Cookie sticky with state shared across up to 64 Redline devices for HTTP or HTTPS (SSL) traffic
    IP sticky for Intranet or Internet IP addresses

Email Unattributed Internet Source

Terminology
  User Agent: end-user mail program
  Message Transfer Agent: responsible for communicating with remote hosts and transmitting/receiving email
    Both client and server
  Mail Exchanger: host that takes care of email for a domain

SMTP
  Used to exchange mail messages between mail servers (Message Transfer Agents)
  Diagram labels: MTA, SMTP, MTA, SMTP, MTA, UA, UA, File System

SMTP Protocol
  SMTP sender is the client
  SMTP receiver is the server
  Alternating dialogue:
    Client sends command and server responds with command status message
    Order of the commands is important
    Status messages include ASCII encoded numeric status code (like HTTP, FTP) and text string
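A status message can be split into its numeric code and text with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, and it relies on the standard SMTP convention that a hyphen after the three-digit code marks a continuation line while a space marks the last line of a reply.

```python
def parse_reply(line):
    """Split one SMTP reply line into (code, is_last_line, text)."""
    line = line.rstrip("\r\n")
    code = int(line[:3])                      # three ASCII digits
    # "250-..." means more lines follow; "250 ..." (or a bare code) ends the reply.
    is_last_line = len(line) == 3 or line[3] == " "
    text = line[4:]
    return code, is_last_line, text


def reply_class(code):
    """Classify a reply by its first digit."""
    return {
        2: "positive completion",
        3: "positive intermediate",
        4: "transient failure",
        5: "permanent failure",
    }.get(code // 100, "unknown")


reply = parse_reply("250 2.1.0 <vahdat@cs.ucsd.edu>... Sender ok\r\n")
print(reply, reply_class(reply[0]))
print(reply_class(354))
```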

SMTP Commands
  HELO - identifies sender
  MAIL FROM: - starts a mail transaction and identifies the mail originator
  RCPT TO: - identifies individual recipient. There may be multiple RCPT TO: commands
  DATA - sender ready to transmit a series of lines of text, each ending with \r\n. A line containing only a period "." indicates the end of the data

Data Format
  ASCII only: must convert binary to an ASCII representation to send via email
  What if we want to send a line containing only a period?
    Sender prepends a period to any line starting with a period (in the message)
    Receiver strips the leading period in any line that starts with a period and has more stuff
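The period rule above can be sketched as a pair of functions. This is a minimal illustration operating on already-split lines; the function names and the sample message are hypothetical.

```python
def dot_stuff(lines):
    """Sender side: prepend a period to every line that starts with a period."""
    return [("." + line) if line.startswith(".") else line for line in lines]


def dot_unstuff(wire_lines):
    """Receiver side: stop at the lone "." and strip one leading period elsewhere."""
    message = []
    for line in wire_lines:
        if line == ".":          # end-of-data marker, not part of the message
            break
        message.append(line[1:] if line.startswith(".") else line)
    return message


message = ["Hi dave", ".", ".foo", "bye"]
wire = dot_stuff(message) + ["."]   # what actually follows the DATA command
print(wire)
print(dot_unstuff(wire) == message)
```

Because every message line that begins with a period gains an extra one, a bare "." on the wire can only ever be the terminator.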

Typical Exchange
    [vahdat@ramp vahdat]$ telnet fast.ucsd.edu 25
    Trying 132.239.15.4...
    Connected to fast.ucsd.edu.
    Escape character is '^]'.
    220 fast.ucsd.edu ESMTP
    HELO ramp.ucsd.edu
    250 fast.ucsd.edu Hello ramp.ucsd.edu [137.110.222.239], pleased to meet you
    MAIL FROM: <vahdat@cs.ucsd.edu>
    250 2.1.0 <vahdat@cs.ucsd.edu>... Sender ok
    RCPT TO: vahdat
    250 2.1.5 vahdat... Recipient ok
    DATA
    354 Enter mail, end with "." on a line by itself
    This is a test message.
    More text.
    .
    250 2.0.0 j3qkgnsr029250 Message accepted for delivery

Leading Period
    DATA
    354 Enter mail, end with "." on a line by itself
    Hi dave - this message is a test of SMTP
    ..
    ..foo
    ..
    .
    250 VAA0771 Message accepted for delivery
  Resulting Message:
    Hi dave - this message is a test of SMTP
    .
    .foo
    .

Other SMTP Commands
  VRFY - confirm that a name is a valid recipient
  EXPN - expand an alias (group email address)
  TURN - switch roles (sender <=> receiver)

More Commands
  SOML - Send Or Mail: if recipient is logged in, display message on terminal, otherwise email
  SAML - Send and Mail
  NOOP - send back a positive reply code
  RSET - abort current transaction

Mail Headers
  Email messages contain many headers
    Some headers are created by the UA
    Some are automatically added by the MTA
      Every MTA adds (at least) a Received: header
      Spamassassin?
  Some of the headers are read (parsed) by intermediate MTAs, but the content is ignored and passed on transparently

POP: Post Office Protocol
  Used to transfer mail from a mail server to a UA
  Diagram labels: File System, Mail Server, POP, UA

POP (version 3)
  Similar to SMTP: command/reply lockstep protocol
  Used to retrieve mail for a single user
    Requires authentication
  Commands and replies are ASCII lines
    Replies start with +OK or -ERR
    Replies may contain multiple lines

POP-3 Commands
  USER - specify username
  PASS - specify password
  STAT - get mailbox status
    Number of messages in the mailbox
  LIST - get a list of messages and sizes
    One per line, termination line contains "." only
  RETR - retrieve a message
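A multi-line LIST reply can be read by consuming lines until the lone "." terminator. The sketch below is a hedged illustration: the function name is hypothetical, and the sample lines mirror the single-message mailbox shown in the POP3 exchange in these slides.

```python
def parse_list_response(lines):
    """Turn a POP3 LIST reply into {message_number: size_in_octets}."""
    status = lines[0]
    if not status.startswith("+OK"):
        raise ValueError("server reported an error: " + status)
    sizes = {}
    for line in lines[1:]:
        if line == ".":                      # termination line ends the listing
            break
        number, octets = line.split()
        sizes[int(number)] = int(octets)
    return sizes


reply_lines = ["+OK 1 visible messages (1761 octets)", "1 1761", "."]
print(parse_list_response(reply_lines))
```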

More POP-3 Commands
  DELE - mark a message for deletion from the mailbox
  NOOP - send back positive reply
  RSET - reset. All deletion marks are unmarked
  QUIT - remove marked messages and close the (TCP) connection

Optional Commands
  TOP - send header lines from messages
  APOP - alternative authentication
    Message digest based on opening greeting sent from POP server
    Requires shared secret
    No cleartext password on network
    Does not authenticate the server
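The APOP digest can be sketched as follows. In the POP3 specification the digest is the MD5 hash, in lowercase hexadecimal, of the timestamp from the server greeting (angle brackets included) followed by the shared secret. The function names and the secret below are hypothetical; the greeting is the one from the POP3 exchange in these slides.

```python
import hashlib


def extract_timestamp(greeting):
    """Pull the <...> timestamp out of the server's opening greeting."""
    start = greeting.index("<")
    end = greeting.index(">", start)
    return greeting[start:end + 1]


def apop_digest(timestamp, shared_secret):
    """MD5 over timestamp + secret, as lowercase hex; the secret never crosses the wire."""
    return hashlib.md5((timestamp + shared_secret).encode("ascii")).hexdigest()


greeting = "+OK QPOP (version 3.1.2) at fast.ucsd.edu starting. <263.1114547001@fast.ucsd.edu>"
timestamp = extract_timestamp(greeting)
digest = apop_digest(timestamp, "not-a-real-secret")  # hypothetical shared secret
print("APOP vahdat " + digest)
```

Because the timestamp changes with every connection, a captured digest cannot simply be replayed. The client still has no proof of the server's identity, which is the limitation noted in the last line of the slide.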

A POP3 Exchange
    [vahdat@ramp vahdat]$ telnet fast pop3
    Trying 132.239.15.4...
    Connected to fast.
    Escape character is '^]'.
    +OK QPOP (version 3.1.2) at fast.ucsd.edu starting. <263.1114547001@fast.ucsd.edu>
    user vahdat
    +OK Password required for vahdat.
    pass mypassw
    +OK vahdat has 1 visible message (0 hidden) in 1761 octets.
    Stat
    +OK 1 1761
    list
    +OK 1 visible messages (1761 octets)
    1 1761
    .

POP3 Example continued
    retr 1
    +OK 1761 octets
    Received: from mailbox7.ucsd.edu (mailbox7.ucsd.edu [132.239.1.59])
    Message-Id: <200504262023.j3QKNATg015405@smtp.ucsd.edu>
    Reply-To: <vahdat@cs.ucsd.edu>
    From: "Amin Vahdat" <vahdat@cs.ucsd.edu>
    To: <vahdat@cs.ucsd.edu>
    Subject: test
    Date: Tue, 26 Apr 2005 13:23:08 -0700
    MIME-Version: 1.0
    Content-Type: text/plain;charset="us-ascii"
    X-Greylisting: NO DELAY (Trusted relay host); processed by UCSD_GL-v1.2 on mailbox7.ucsd.edu;
    X-MailScanner: PASSED (v1.2.8 34463 j3qkna8g038256 mailbox7.ucsd.edu)
    X-Spam-Flag: Spam NO
    X-Scanned-By: milter-spamc/0.15.245 (fast.ucsd.edu [132.239.15.4]); pass=yes; Tue, 26 Apr 2005 13:23:12 -0700
    X-Spam-Status: NO, hits=-4.90 required=5.00
    test
    .

MIME
    From: alice@crepes.fr
    To: bob@hamburger.edu
    Subject: Picture of yummy crepe.
    MIME-Version: 1.0
    Content-Type: multipart/mixed; boundary=98766789
    --98766789
    Content-Transfer-Encoding: quoted-printable
    Content-Type: text/plain
    Dear Bob,
    Please find a picture of a crepe.
    --98766789
    Content-Transfer-Encoding: base64
    Content-Type: image/jpeg
    base64 encoded data.........base64 encoded data
    --98766789--
  (Kevin Jeffay)
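A message with the same shape can be built with Python's standard email package. This is a sketch, not a reproduction of the slide: the library picks its own boundary string and text encoding, and the image bytes and filename are placeholders rather than a real JPEG.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@crepes.fr"
msg["To"] = "bob@hamburger.edu"
msg["Subject"] = "Picture of yummy crepe."

# First part: the plain-text body.
msg.set_content("Dear Bob,\nPlease find a picture of a crepe.\n")

# Second part: binary data, which the library base64-encodes for transport.
placeholder_image = b"\xff\xd8\xff\xe0" + b"\x00" * 16  # placeholder bytes, not a real picture
msg.add_attachment(placeholder_image, maintype="image", subtype="jpeg",
                   filename="crepe.jpg")

parts = list(msg.iter_parts())
print(msg.get_content_type())
for part in parts:
    print(part.get_content_type(), part["Content-Transfer-Encoding"])
```

Adding the attachment is what turns the message into multipart/mixed; the binary part is converted to an ASCII representation, matching the ASCII-only constraint of SMTP.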

Functionally Homogeneous Clustering: A New Architecture for Scalable Data-intensive Internet Services Yasushi Saito

Goals
  Use cheap, unreliable hardware components to build scalable data-intensive Internet services.
    Data-intensive Internet services: email, BBS, calendar, etc.
  Three facets of scalability...
    Performance: linear increase with system size
    Manageability: react to changes automatically
    Availability: survive failures gracefully

Contributions
  Functional homogeneous clustering:
    Dynamic data and function distribution
    Exploitation of application semantics
  Three techniques:
    Naming and automatic recovery
    High-throughput optimistic replication
    Load balancing
  Email as target application
  Evaluation of the architecture using Porcupine

Data-intensive Internet services
  Examples: Email, Usenet, BBS, calendar, Internet collaboration (photobook, equill.com, crit.org)
  Growing rapidly as demand for personal services grows
  High update frequency
  Low access locality
    Web techniques (caching, stateless data transformation) not effective
  Weak data consistency requirements
  Well-defined & structured data access path
  Embarrassingly parallel

Rationale for Email
  Email as target application.
    Most important among data-intensive services
    Service concentration (Hotmail, Gmail, AOL, ...). Practical demands
    The most update-intensive
    No access locality

Conventional Solutions: Big Iron
  Just buy a big machine
  + Easy deployment
  + Easy management
  - Limited scalability
  - Single failure domain
  - Really expensive

Conventional Solutions: Clustering
  Connect many small machines
  + Cheap
  + Incremental scalability
  + Natural failure boundary
  - Software & management complexity

Existing Cluster Solutions
  Static partitioning: assign data & function to nodes statically
  Management problems: manual data partition
  Performance problems: no dynamic load balancing
  Availability problems: limited fault tolerance
  Diagram labels: Internet, SMTP, NFS, Alice's data, Bob's data, Chuck's data

Functionally Homogeneous Clustering
  Static function and data partitioning leads to problems
  So, make everything dynamic:
    Any node can handle any task (client interaction, user management, etc)
    Any node can store any piece of data (email messages, user profile)

Advantages
  Advantages:
    Better load balance, hot spot dispersion
    Support for heterogeneous clusters
    Automatic reconfiguration and task re-distribution upon node failure/recovery
    Easy node addition/retirement
  Results:
    Better Performance
    Better Manageability
    Better Availability

Challenges
  Dynamic function distribution:
    Solution: run every function on every node.
  Dynamic data distribution:
    How is data named and located?
    How is data placed?
    How does data survive failures?

Key Techniques and Relationships
  Framework: Functional Homogeneity
  Techniques: Load Balancing, Name DB w/ Reconfiguration, Replication
  Goals: Performance, Manageability, Availability

Overview: Porcupine
  Diagram labels: Internet, DNS, Router/firewall, LAN, Porcupine cluster
  Components: SMTP server, POP server, IMAP server, Load Balancer, User map, Membership Manager, RPC, Replication Manager
  Data: Email msgs, User profile, Mail map

Receiving Email in Porcupine
  Stages: Protocol handling, User lookup, Load Balancing, Data store (replication)
  Diagram labels: Internet, DNS-RR selection, nodes A, B, C, D, ...
  1. Send mail to bob
  2. Who manages bob? A
  3. Verify bob
  4. OK, bob has msgs on C, D, & E
  5. Pick the best nodes to store new msg: {C,D}
  6. Store msg
  7. Store msg

Basic Data Structures
  hash("bob") = 2
  User map (buckets 0-7, same copy on nodes A, B, C): B C A C A B A C
  Mail map / user profile: bob: {A,C}, suzy: {A,C}, ann: {B}, joe: {B}
  Mailbox storage: A: Bob's MSGs, Suzy's MSGs; B: Ann's MSGs, Joe's MSGs; C: Bob's MSGs, Suzy's MSGs
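The lookup path through these structures can be sketched as follows. The user map and mail map contents are taken from the slide; the hash function is an assumption (the slide does not say which one Porcupine uses), so the bucket computed here for "bob" need not be 2, and the function names are hypothetical.

```python
import hashlib

# User map from the slide: bucket index -> node that manages users hashing there.
USER_MAP = list("BCACABAC")

# Mail map entries from the slide: user -> nodes holding that user's messages.
MAIL_MAP = {
    "bob": {"A", "C"},
    "suzy": {"A", "C"},
    "ann": {"B"},
    "joe": {"B"},
}


def bucket_for(user, n_buckets=len(USER_MAP)):
    """Map a user name to a bucket. SHA-1 is a stand-in, not Porcupine's hash."""
    digest = hashlib.sha1(user.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % n_buckets


def manager_for(user):
    """Node responsible for the user's profile and mail map entry."""
    return USER_MAP[bucket_for(user)]


def mailbox_nodes(user):
    """Nodes that currently store message fragments for the user."""
    return MAIL_MAP.get(user, set())


print(manager_for("bob"), sorted(mailbox_nodes("bob")))
print(USER_MAP[2])  # with the slide's hash("bob") = 2, bob's manager is this node
```

Because every node holds the same small user map, any node that accepts a connection can find a user's manager without contacting a central directory.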

Scaling Performance
  User map distributes user management responsibility evenly to nodes
  Load balancing distributes data storage responsibility evenly to nodes
  Workload is very parallel
  => Scalable performance

Measurement Environment
  Porcupine email server
    Linux-2.2.7 + glibc-2.1.1 + ext2
    50,000 lines of C++ code
  30 node cluster of not-quite-all-identical PCs
    100Mb/s Ethernet + 1Gb/s hubs
    Performance disk-bound
    Homogeneous configuration
  Synthetic load
    Modeled after UWCSE server
    Mixture of SMTP and POP sessions

Porcupine Performance
  POP performance, no email replication
  Plot: messages/second (0-800) vs. cluster size (0-30), Porcupine vs. sendmail+popd; annotations: 68m/day, 25m/day

How Do Computers Fail?
  Large clusters are unreliable
  Assumption: live nodes respond correctly in bounded time most of the time
    Network can partition
    Nodes can become very slow temporarily
    Nodes can fail (and may never recover)
    Byzantine failures excluded

Recovery: Goals and Strategies
  Goals:
    Maintain function after unusual failures
    React to changes quickly
    Graceful performance degradation / improvement
  Strategy: Two complementary mechanisms
    Make data soft as much as possible
    Hard state: email messages, user profile
      Optimistic fine-grain replication
    Soft state: user map, mail map
      Reconstruction after configuration change

Soft-state Recovery Overview
  1. Membership protocol, usermap recomputation
  2. Distributed disk scan
  Diagram (timeline): user map and mail map contents on nodes A, B, and C at each recovery step

How does Porcupine React to Configuration Changes?
  Plot: messages/second (300-700) vs. time (0-800 seconds); curves: no failure, one node failure, three node failures, six node failures
  Marked events: nodes fail, new membership determined, nodes recover, new membership determined