Application Layer Abusayeed Saifullah CS 5600 Computer Networks These slides are adapted from Kurose and Ross
Web caches (proxy server) goal: satisfy client request without involving origin server v user sets browser: Web accesses via cache v browser sends all HTTP requests to cache object in cache: cache returns object else cache requests object from origin server, then returns object to client client client proxy server origin server origin server
More about Web caching v cache acts as both client and server server for original requesting client client to origin server v typically cache is installed by ISP (university, company, residential ISP) why Web caching? v reduce response time for client request v reduce traffic on an institution s access link v Internet dense with caches: enables poor content providers to effectively deliver content (so too does P2P file sharing)
Caching example: assumptions: v v v v v avg object size: 100K bits avg request rate from browsers to origin servers:15/sec avg data rate to browsers: 1.50 Mbps RTT from institutional router to any origin server: 2 sec access link rate: 1.54 Mbps consequences: v LAN utilization: 0.15% problem! v access link utilization = 99% v total delay = Internet delay + access delay + LAN delay = 2 sec + minutes + usecs institutional network public Internet 1.54 Mbps access link origin servers 1 Gbps LAN
Caching example: fatter access link assumptions: v v v v v avg object size: 100K bits avg request rate from browsers to origin servers:15/sec avg data rate to browsers: 1.50 Mbps RTT from institutional router to any origin server: 2 sec access link rate: 1.54 Mbps consequences: v LAN utilization: 0.15% v access link utilization = 99% 9.9% v total delay = Internet delay + access delay + LAN delay = 2 sec + minutes + usecs msecs origin servers 154 Mbps 1.54 Mbps 154 Mbps access link institutional network public Internet Cost: increased access link speed (not cheap!) 1 Gbps LAN
Caching example: install local cache assumptions: v v v v v avg object size: 100K bits avg request rate from browsers to origin servers:15/sec avg data rate to browsers: 1.50 Mbps RTT from institutional router to any origin server: 2 sec access link rate: 1.54 Mbps consequences: v LAN utilization: 0.15% v access link utilization = 100%? v total delay = Internet? delay + access delay + LAN delay How to compute link = 2 sec + minutes + usecs utilization, delay? Cost: web cache (cheap!) institutional network public Internet 1.54 Mbps access link origin servers 1 Gbps LAN local web cache
Caching example: install local cache Calculating access link utilization, delay with cache: v suppose cache hit rate is 0.4 40% requests satisfied at cache, 60% requests satisfied at origin v access link utilization: 60% of requests use access link v data rate to browsers over access link = 0.6*1.50 Mbps =.9 Mbps utilization = 0.9/1.54 =.58 v total delay = 0.6 * (delay from origin servers) +0.4 * (delay when satisfied at cache) = 0.6 (2.01) + 0.4 (~msecs) = ~ 1.2 secs less than with 154 Mbps link (and cheaper too!) institutional network public Internet 1.54 Mbps access link origin servers 1 Gbps LAN local web cache
Conditional GET v Goal: don t send object if cache has up-to-date cached version no object transmission delay lower link utilization v cache: specify date of cached copy in HTTP request If-modified-since: <date> v server: response contains no object if cached copy is up-to-date: HTTP/1.0 304 Not Modified client HTTP request msg If-modified-since: <date> HTTP response HTTP/1.0 304 Not Modified HTTP request msg If-modified-since: <date> HTTP response HTTP/1.0 200 OK <data> server object not modified before <date> object modified after <date>
Roadmap 1 principles of network applications 2 Web and HTTP 3 FTP 4 electronic mail SMTP, POP3, IMAP 5 DNS 6 P2P applications
FTP: the file transfer protocol user at host FTP user interface FTP client local file system file transfer FTP server remote file system v transfer file to/from remote host v client/server model client: side that initiates transfer (either to/from remote) server: remote host v ftp: RFC 959 v ftp server: port 21
FTP: separate control, data connections v FTP client contacts FTP server at port 21, using TCP v client authorized over control connection v client browses remote directory, sends commands over control connection v when server receives file transfer command, server opens 2 nd TCP data connection (for file) to client v after transferring one file, server closes data connection FTP client TCP control connection, server port 21 TCP data connection, server port 20 FTP server v server opens another TCP data connection to transfer another file FTP vs HTTP: v 2 parallel TCP in FTP à control conn. is out of band v FTP server maintains state : current directory, earlier authentication
FTP commands, responses sample commands: v sent as ASCII text over control channel v USER username v PASS password v LIST return list of file in current directory v RETR filename retrieves (gets) file v STOR filename stores (puts) file onto remote host sample return codes v status code and phrase (as in HTTP) v 331 Username OK, password required v 125 data connection already open; transfer starting v 425 Can t open data connection v 452 Error writing file
Roadmap 1 principles of network applications 2 Web and HTTP 3 FTP 4 electronic mail SMTP, POP3, IMAP 5 DNS 6 P2P applications
Electronic mail Three major components: v user agents v mail servers v simple mail transfer protocol: SMTP User Agent v a.k.a. mail reader v composing, editing, reading mail messages v e.g., Outlook, Thunderbird, iphone mail client v outgoing, incoming messages stored on server mail server SMTP mail server user agent user agent SMTP SMTP user agent outgoing message queue mail server user mailbox user agent user agent user agent
Electronic mail: mail servers mail servers: v mailbox contains incoming messages for user v message queue of outgoing (to be sent) mail messages v SMTP protocol between mail servers to send email messages client: sending mail server server : receiving mail server mail server SMTP mail server user agent user agent SMTP SMTP user agent mail server user agent user agent user agent
Electronic Mail: SMTP [RFC 2821] v uses TCP to reliably transfer email message from client to server, port 25 v direct transfer: sending server to receiving server v three phases of transfer handshaking (greeting) transfer of messages closure v command/response interaction (like HTTP, FTP) commands: ASCII text response: status code and phrase v messages must be in 7-bit ASCII
Scenario: Alice sends message to Bob 1) Alice uses UA to compose message to bob@someschool.edu 2) Alice s UA sends message to her mail server; message placed in message queue 3) client side of SMTP opens TCP connection with Bob s mail server 4) SMTP client sends Alice s message over the TCP connection 5) Bob s mail server places the message in Bob s mailbox 6) Bob invokes his user agent to read message 1 user agent mail server 2 3 4 mail server 6 user agent Alice s mail server 5 Bob s mail server
Sample SMTP interaction S: 220 hamburger.edu C: HELO crepes.fr S: 250 Hello crepes.fr, pleased to meet you C: MAIL FROM: <alice@crepes.fr> S: 250 alice@crepes.fr... Sender ok C: RCPT TO: <bob@hamburger.edu> S: 250 bob@hamburger.edu... Recipient ok C: DATA S: 354 Enter mail, end with "." on a line by itself C: Do you like ketchup? C: How about pickles? C:. S: 250 Message accepted for delivery C: QUIT S: 221 hamburger.edu closing connection
Try SMTP interaction for yourself: v telnet servername 25 v see 220 reply from server v enter HELO, MAIL FROM, RCPT TO, DATA, QUIT commands above lets you send email without using email client (reader)
SMTP: final words v SMTP uses persistent connections v SMTP requires message (header & body) to be in 7-bit ASCII v SMTP server uses CRLF.CRLF to determine end of message comparison with HTTP: v HTTP: pull v SMTP: push v both have ASCII command/response interaction, status codes v HTTP: each object encapsulated in its own response msg v SMTP: multiple objects sent in one msg
Mail access protocols user agent SMTP SMTP mail access protocol (e.g., POP, IMAP) user agent sender s mail server receiver s mail server v SMTP: delivery/storage to receiver s server v mail access protocol: retrieval from server POP: Post Office Protocol [RFC 1939]: authorization, download IMAP: Internet Mail Access Protocol [RFC 1730]: more features, including manipulation of stored msgs on server HTTP: gmail, Hotmail, Yahoo! Mail, etc.
POP3 protocol authorization phase v v client commands: user: declare username pass: password server responses +OK -ERR transaction phase, client: v v v v list: list message numbers retr: retrieve message by number dele: delete quit S: +OK POP3 server ready C: user bob S: +OK C: pass hungry S: +OK user successfully logged on C: list S: 1 498 S: 2 912 S:. C: retr 1 S: <message 1 contents> S:. C: dele 1 C: retr 2 S: <message 1 contents> S:. C: dele 2 C: quit S: +OK POP3 server signing off
POP3 (more) and IMAP more about POP3 v previous example uses POP3 download and delete mode Bob cannot re-read e- mail if he changes client v POP3 download-andkeep : copies of messages on different clients v POP3 is stateless across sessions IMAP v keeps all messages in one place: at server v allows user to organize messages in folders v keeps user state across sessions: names of folders and mappings between message IDs and folder name
Roadmap 1 principles of network applications 2 Web and HTTP 3 FTP 4 electronic mail SMTP, POP3, IMAP 5 DNS 6 P2P applications
DNS: domain name system people: many identifiers: SSN, name, passport # Internet hosts, routers: IP address (32 bit) - used for addressing datagrams name, e.g., www.yahoo.com - used by humans Q: how to map between IP address and name, and vice versa? Domain Name System: v distributed database implemented in hierarchy of many name servers v application-layer protocol: hosts, name servers communicate to resolve names (address/name translation) note: core Internet function, implemented as applicationlayer protocol
DNS: services, structure DNS services v hostname to IP address translation v host aliasing canonical, alias names v mail server aliasing v load distribution replicated Web servers: many IP addresses correspond to one name why not centralize DNS? v single point of failure v traffic volume v distant centralized database v maintenance A: doesn t scale!
DNS: a distributed, hierarchical database Root DNS Servers com DNS servers org DNS servers edu DNS servers yahoo.com DNS servers amazon.com DNS servers pbs.org DNS servers poly.edu umass.edu DNS servers DNS servers client wants IP for www.amazon.com; 1 st time: v client queries root server to find com DNS server v client queries.com DNS server to get amazon.com DNS server v client queries amazon.com DNS server to get IP address for www.amazon.com
DNS: root name servers v contacted by local name server that can not resolve name v root name server: contacts authoritative name server if name mapping not known gets mapping returns mapping to local name server e. NASA Mt View, CA f. Internet Software C. Palo Alto, CA (and 48 other sites) c. Cogent, Herndon, VA (5 other sites) d. U Maryland College Park, MD h. ARL Aberdeen, MD j. Verisign, Dulles VA (69 other sites ) a. Verisign, Los Angeles CA (5 other sites) b. USC-ISI Marina del Rey, CA l. ICANN Los Angeles, CA (41 other sites) g. US DoD Columbus, OH (5 other sites) k. RIPE London (17 other sites) i. Netnod, Stockholm (37 other sites) m. WIDE Tokyo (5 other sites) 13 root name servers worldwide
TLD, authoritative servers top-level domain (TLD) servers: responsible for com, org, net, edu, aero, jobs, museums, and all top-level country domains, e.g.: uk, fr, ca, jp Network Solutions maintains servers for.com TLD Educause for.edu TLD authoritative DNS servers: organization s own DNS server(s), providing authoritative hostname to IP mappings for organization s named hosts can be maintained by organization or service provider
Local DNS name server v does not strictly belong to hierarchy v each ISP (residential ISP, company, university) has one also called default name server v when host makes DNS query, query is sent to its local DNS server has local cache of recent name-to-address translation pairs (but may be out of date!) acts as proxy, forwards query into hierarchy
DNS name resolution example root DNS server v host at cis.poly.edu wants IP address for gaia.cs.umass.edu 2 3 4 5 TLD DNS server iterated query: v contacted server replies with name of server to contact v I don t know this name, but ask this server local DNS server dns.poly.edu 1 8 requesting host cis.poly.edu 7 6 authoritative DNS server dns.cs.umass.edu gaia.cs.umass.edu
DNS name resolution example root DNS server recursive query: 2 7 6 3 v puts burden of name resolution on contacted name server local DNS server dns.poly.edu 5 4 TLD DNS server v heavy load at upper levels of hierarchy? 1 8 requesting host cis.poly.edu authoritative DNS server dns.cs.umass.edu gaia.cs.umass.edu
DNS: caching, updating records v once (any) name server learns mapping, it caches mapping cache entries timeout (disappear) after some time (TTL) TLD servers typically cached in local name servers thus root name servers not often visited v cached entries may be out-of-date (best effort name-to-address translation!) if name host changes IP address, may not be known Internet-wide until all TTLs expire v update/notify mechanisms proposed IETF standard RFC 2136
DNS records DNS: distributed db storing resource records (RR) RR format: (name, value, type, ttl) type=a name is hostname value is IP address type=ns name is domain (e.g., foo.com) value is hostname of authoritative name server for this domain type=cname name is alias name for some canonical (the real) name www.ibm.com is really servereast.backup2.ibm.com value is canonical name type=mx value is name of mailserver associated with name
DNS protocol, messages v query and reply messages, both with same message format 2 bytes 2 bytes msg header v v identification: 16 bit # for query, reply to query uses same # flags: query or reply recursion desired recursion available reply is authoritative identification flags # questions # answer RRs # authority RRs # additional RRs questions (variable # of questions) answers (variable # of RRs) authority (variable # of RRs) additional info (variable # of RRs)
DNS protocol, messages 2 bytes 2 bytes identification # questions flags # answer RRs name, type fields for a query RRs in response to query records for authoritative servers additional helpful info that may be used # authority RRs # additional RRs questions (variable # of questions) answers (variable # of RRs) authority (variable # of RRs) additional info (variable # of RRs)
Inserting records into DNS v example: new startup Network Utopia v register name networkuptopia.com at DNS registrar (e.g., Network Solutions) provide names, IP addresses of authoritative name server registrar inserts two RRs into.com TLD server: (networkutopia.com, dns1.networkutopia.com, NS) (dns1.networkutopia.com, 212.212.212.1, A) v create authoritative server type A record for www.networkuptopia.com; type MX record for networkutopia.com (networkutopia.com, 212.212.77.4, A) (networkutopia.com, mail.networkutopia.com, MX)
Roadmap 1 principles of network applications 2 Web and HTTP 3 FTP 4 electronic mail SMTP, POP3, IMAP 5 DNS 6 P2P applications
Pure P2P architecture v no always-on server v arbitrary end systems directly communicate v peers are intermittently connected and change IP addresses examples: file distribution (BitTorrent) Streaming (KanKan) VoIP (Skype)
File distribution: client-server vs P2P Question: how much time to distribute file (size F) from one server to N peers? peer upload/download capacity is limited resource u s : server upload capacity file, size F server u s u 1 d 1 u 2 d 2 d i : peer i download capacity u N d N network (with abundant bandwidth) d i u i u i : peer i upload capacity
File distribution time: client-server v server transmission: must sequentially send (upload) N file copies: time to send one copy: F/u s time to send N copies: NF/u s v client: each client must download file copy d min = min client download rate min client download time at least: F/d min F u s network d i u i time to distribute F to N clients using client-server approach D c-s > max{nf/u s,,f/d min } increases linearly in N
File distribution time: P2P v server transmission: must upload at least one copy time to send one copy: F/u s v client: each client must download file copy min client download time: F/d min v clients: as aggregate must download NF bits F u s network d i u i max upload rate (limting max download rate) is u s + Σu i time to distribute F to N clients using P2P approach D P2P > max{f/u s,,f/d min,,nf/(u s + Σu i )} increases linearly in N but so does this, as each peer brings service capacity
Client-server vs. P2P: example client upload rate = u, F/u = 1 hour, u s = 10u, d min u s Minimum Distribution Time 3.5 3 2.5 2 1.5 1 0.5 0 P2P Client-Server 0 5 10 15 20 25 30 35 N
P2P file distribution: BitTorrent v file divided into 256Kb chunks v peers in torrent send/receive file chunks tracker: tracks peers participating in torrent torrent: group of peers exchanging chunks of a file Alice arrives obtains list of peers from tracker and begins exchanging file chunks with peers in torrent
P2P file distribution: BitTorrent v peer joining torrent: has no chunks, but will accumulate them over time from other peers registers with tracker to get list of peers, connects to subset of peers ( neighbors ) v while downloading, peer uploads chunks to other peers v peer may change peers with whom it exchanges chunks v churn: peers may come and go v once peer has entire file, it may (selfishly) leave or (altruistically) remain in torrent
BitTorrent: requesting, sending file chunks requesting chunks: v at any given time, different peers have different subsets of file chunks v periodically, Alice asks each peer for list of chunks that they have v Alice requests missing chunks from peers, rarest first sending chunks: tit-for-tat v Alice sends chunks to those four peers currently sending her chunks at highest rate other peers are choked by Alice (do not receive chunks from her) re-evaluate top 4 every10 secs v every 30 secs: randomly select another peer, starts sending chunks optimistically unchoke this peer newly chosen peer may join top 4
BitTorrent: tit-for-tat (1) Alice optimistically unchokes Bob (2) Alice becomes one of Bob s top-four providers; Bob reciprocates (3) Bob becomes one of Alice s top-four providers higher upload rate: find better trading partners, get file faster!