Computer Networks. Instructor: Niklas Carlsson Email: niklas.carlsson@liu.se

Notes derived from Computer Networking: A Top Down Approach, by Jim Kurose and Keith Ross, Addison-Wesley. The slides are adapted and modified based on slides from the book's companion Web site, as well as modified slides by Anirban Mahanti and Carey Williamson.

Outline
- Background
- Introduction to App-Layer Protocols
- Brief History of WWW
- Architecture
- HTTP Connections
- HTTP Format
- Web Performance
- Cookies

What's a protocol?
- Protocols define the format and order of messages sent and received among network entities, and the actions taken on message transmission and receipt

Protocol: Connection oriented or not?
- Connection oriented:
  - Handshaking: explicit setup phase for logical connection
  - Connection release afterwards
  - Establishes state information about the connection
  - Mechanisms for reliable data transfer, error control, flow control, etc.
  - Guarantees that data will arrive (eventually)
- Connectionless:
  - No handshaking
  - No (significant) state information (at end points or in network)
  - No mechanisms for flow control etc.
  - No guarantees of arrival (or when)
  - Simpler (and faster?)
- Which is the best? It depends on (i) what it is used for, and (ii) what it is built on top of

Internet protocol stack (CO = connection oriented, CL = connectionless)
- application
- transport: e.g., TCP (CO), UDP (CL)
- network: e.g., IP (CL)
- link: e.g., Ethernet (CL), ATM (CO)
- physical
Two notes on the physical layer:
- Guided (e.g., coaxial cable, fiber) vs. unguided (e.g., satellite, wireless) media
- Signaling, modulation, encoding, etc.

Creating a network app
- Write programs that
  - run on (different) end systems
  - communicate over the network
- No need to write software for network-core devices
  - network-core devices do not run user applications
  - keeping applications on end systems allows for rapid app development and propagation
[Figure: each end system runs the full stack: application, transport, network, data link, physical]

App-layer protocols
- Public-domain protocols:
  - often defined in RFCs
  - allow for interoperability
  - e.g., HTTP, SMTP, BitTorrent
- Proprietary protocols:
  - e.g., Skype, Spotify

Application architectures
- client-server
- peer-to-peer (P2P)
- hybrid of client-server and P2P

Network applications: some jargon
- Process: program running within a host
  - within the same host, two processes communicate using inter-process communication (IPC, defined by the OS)
  - processes running on different hosts communicate with an application-layer protocol
- User agent: interfaces with the user above and the network below
  - implements user interface & application-level protocol
  - Web: browser
  - E-mail: mail reader
  - streaming audio/video: media player

Processes communicating
- Process: program running within a host
  - processes in different hosts communicate by exchanging messages
- Client-server paradigm
  - client process: process that initiates communication
  - server process: process that waits to be contacted

Sockets
- A process sends/receives messages to/from its socket
[Figure: two hosts or servers connected through the Internet; on each, the process is controlled by the app developer, while the socket sits on top of TCP with buffers and variables, controlled by the OS]
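The socket abstraction can be sketched in a few lines of Python. This is only an illustration: it uses `socket.socketpair()` to create two already-connected sockets inside one program, so the two ends stand in for the two processes (an assumption made so that the example needs no network and no second host).

```python
import socket

# Two already-connected sockets; each end plays the role of one process.
client_sock, server_sock = socket.socketpair()

# The "client process" pushes a message into its socket ...
client_sock.sendall(b"hello server")
# ... and the "server process" pulls it out of its own socket.
request = server_sock.recv(1024)

# The reply travels the same way in the opposite direction.
server_sock.sendall(b"hello client")
reply = client_sock.recv(1024)

client_sock.close()
server_sock.close()
```

Everything between the two sockets (buffering, delivery) is handled by the OS; the program only reads from and writes to its own socket.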

Addressing processes
- For a process to receive messages, it must have an identifier
- Every host has a unique 32-bit IP address (IPv4)
- Q: does the IP address of the host on which the process runs suffice for identifying the process?
- Answer: No, many processes can be running on the same host
- Identifier includes both the IP address and the port number associated with the process on the host
- Example port numbers: HTTP server: 80; mail server: 25
- More on this later

What transport service does an app need?
- Data loss
  - some apps (e.g., file transfer, telnet) require 100% reliable data transfer
  - other apps (e.g., audio) can tolerate some loss
- Timing
  - some apps (e.g., Internet telephony, interactive games) require low delay to be effective
- Bandwidth
  - most apps ("elastic apps") make use of whatever bandwidth they get
  - other apps (e.g., multimedia) require a minimum amount of bandwidth to be effective

Internet transport protocols services
- TCP service:
  - connection-oriented: setup required between client and server processes
  - reliable transport between sending and receiving process
  - flow control: sender won't overwhelm receiver
  - congestion control: throttle sender when network overloaded
  - does not provide: timing, minimum throughput guarantees, security
- UDP service:
  - unreliable data transfer between sending and receiving process
  - does not provide: connection setup, reliability, flow control, congestion control, timing, throughput guarantee, or security
- Q: why bother? Why is there a UDP?

Internet apps: application, transport protocols

Application               Application layer protocol                                        Underlying transport protocol
e-mail                    SMTP [RFC 2821]                                                   TCP
remote terminal access    Telnet [RFC 854]                                                  TCP
Web                       HTTP [RFC 2616]                                                   TCP
file transfer             FTP [RFC 959]                                                     TCP
streaming multimedia      proprietary (e.g., RealNetworks, YouTube, Netflix, Spotify)       TCP (or UDP)
Internet telephony        proprietary (e.g., Dialpad, Skype)                                UDP or TCP (typically UDP)

WWW terminology and HTTP overview

Some Web Terminology
- Web page
  - may contain links to other pages (sometimes also called Web objects)
  - an object can be an HTML file, JPEG image, Java applet, audio file, ...
- Web pages are hypertexts
  - one page points to another
- Each object is addressable by a URL: http://www.someschool.edu/somedept/pic.gif
  - protocol: http
  - host name: www.someschool.edu
  - path name: /somedept/pic.gif
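The three URL parts named on the slide can be extracted with Python's standard `urllib.parse` module. The sketch below is only an illustration; the variable names `protocol`, `host_name`, and `path_name` are chosen here to mirror the slide's labels.

```python
from urllib.parse import urlsplit

url = "http://www.someschool.edu/somedept/pic.gif"
parts = urlsplit(url)

protocol = parts.scheme      # the part before "://"
host_name = parts.hostname   # which server to contact
path_name = parts.path       # which object to ask that server for
```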

HTTP overview
- HTTP: hypertext transfer protocol
- Web's application layer protocol
- client/server model
  - client: browser that requests, receives, displays Web objects
  - server: Web server sends objects in response to requests
- HTTP 1.0: RFC 1945
- HTTP 1.1: RFC 2616
[Figure: a PC running Internet Explorer or Firefox and a Mac running Safari exchange HTTP requests and responses with a server running the Apache Web server]

HTTP overview (continued)
- Uses TCP:
  - client initiates TCP connection (creates socket) to server, port 80
  - server accepts TCP connection from client
  - HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and Web server (HTTP server)
  - TCP connection closed
- HTTP is "stateless"
  - server maintains no information about past client requests
- Aside: protocols that maintain "state" are complex!
  - past history (state) must be maintained
  - if server/client crashes, their views of "state" may be inconsistent and must be reconciled

Response time modeling
- Definition of RTT: time for a small packet to travel from client to server and back
- Response time:
  - one RTT to initiate TCP connection
  - one RTT for HTTP request and first few bytes of HTTP response to return
  - file transmission time
- total = 2*RTT + transmit time
[Figure: client/server timeline: initiate TCP connection (one RTT), request file (one RTT), time to transmit file, file received]
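The response time model can be written as a small function. This is a sketch under simplifying assumptions: the transmit time is taken to be file size divided by link rate, and the numbers in the example call are hypothetical, not from the slides.

```python
def response_time(rtt_s, file_bits, rate_bps):
    """Return 2*RTT + transmit time, in seconds."""
    transmit_time = file_bits / rate_bps   # time to push the file onto the link
    return 2 * rtt_s + transmit_time

# Hypothetical numbers: 100 ms RTT, 100 kbit file, 1 Mbps link.
total = response_time(rtt_s=0.1, file_bits=100_000, rate_bps=1_000_000)
```

For small files the two RTTs dominate, which is what motivates the persistent and pipelined connections discussed next.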

HTTP connections
- Non-persistent HTTP
  - At most one object is sent over a TCP connection
  - HTTP/1.0 uses non-persistent HTTP
- Persistent HTTP
  - Multiple objects can be sent (one at a time) over a single TCP connection between client and server
  - HTTP/1.1 uses persistent connections in default mode
  - Two variants: pipelined and non-pipelined

Persistent HTTP
- Non-persistent HTTP issues:
  - requires 2 RTTs per object
  - OS must work and allocate host resources for each TCP connection
  - but browsers often open parallel TCP connections to fetch referenced objects
- Persistent HTTP:
  - server leaves connection open after sending response
  - subsequent HTTP messages between same client/server are sent over the connection
- Persistent without pipelining:
  - client issues new request only when previous response has been received
  - one RTT for each referenced object
- Persistent with pipelining:
  - allowed in HTTP/1.1 (persistent connections are the default; pipelining is optional)
  - client sends requests as soon as it encounters a referenced object
  - as little as one RTT for all the referenced objects

Network View: HTTP and TCP
- TCP is a connection-oriented protocol
[Figure: message exchange between Web client and Web server: SYN, SYN/ACK, ACK, GET URL, data ("YOUR DATA HERE"), FIN, FIN/ACK, ACK]

Example Web Page
[Figure: page.html, titled "Harry Potter Movies", with the text "As you all know, the new HP book will be out in June and then there will be a new movie shortly after that", two embedded images (hpface.jpg and castle.gif), and the caption "Harry Potter and the Bathtub Ring"]

The classic approach in HTTP/1.0 is to use one HTTP request per TCP connection, serially.
[Figure: client/server timeline: TCP SYN, GET page.html, TCP FIN; TCP SYN, GET hpface.jpg, TCP FIN; TCP SYN, GET castle.gif, TCP FIN]

Concurrent (parallel) TCP connections can be used to make things faster.
[Figure: client/server timeline: TCP SYN, GET page.html, TCP FIN; then two connections in parallel, one for hpface.jpg and one for castle.gif]

The persistent HTTP approach can re-use the same TCP connection for multiple HTTP transfers, one after another, serially. Amortizes TCP overhead, but maintains TCP state longer at server.
[Figure: client/server timeline: TCP SYN; GET page.html, GET hpface.jpg, GET castle.gif one after another; timeout; TCP FIN]

The pipelining feature in HTTP/1.1 allows requests to be issued asynchronously on a persistent connection. Requests must be processed in proper order. Can do clever packaging.
[Figure: client/server timeline: TCP SYN; GET page.html; GETs for hpface.jpg and castle.gif sent back to back; timeout; TCP FIN]
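To compare the four approaches, the number of RTTs each one needs can be counted for a base page plus its referenced objects. The function below is a rough sketch with several assumptions that are not in the slides: transmission time is ignored, the parallel case opens one connection per referenced object, and the connection teardown (FIN) is not counted. The function name and mode strings are made up for this illustration.

```python
def fetch_time_rtts(n_objects, mode):
    """RTTs needed for one base page plus n_objects referenced objects."""
    if mode == "nonpersistent-serial":
        return 2 * (1 + n_objects)              # 2 RTTs per object, one after another
    if mode == "nonpersistent-parallel":
        return 2 + (2 if n_objects else 0)      # base page, then all objects at once
    if mode == "persistent":
        return 1 + 1 + n_objects                # setup, base page, one RTT per object
    if mode == "persistent-pipelined":
        return 1 + 1 + (1 if n_objects else 0)  # setup, base page, one RTT for all objects
    raise ValueError(f"unknown mode: {mode}")

# The example page has two referenced objects: hpface.jpg and castle.gif.
rtts = {mode: fetch_time_rtts(2, mode)
        for mode in ("nonpersistent-serial", "nonpersistent-parallel",
                     "persistent", "persistent-pipelined")}
```

With only two referenced objects, parallel non-persistent and plain persistent connections tie in this model; the gap to serial HTTP/1.0 grows with the number of objects.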

HTTP request message
- HTTP request message: ASCII (human-readable format)
- request line (GET, POST, HEAD commands), followed by header lines:

    GET /somedir/page.html HTTP/1.1
    Host: www.someschool.edu
    User-agent: Mozilla/4.0
    Connection: close
    Accept-language:fr
    (extra carriage return, line feed)

- A carriage return and line feed on a line by themselves indicate the end of the message
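Because the request is plain ASCII with CR LF line endings, it can be built and taken apart with ordinary string operations. The helper names `build_request` and `parse_request` are hypothetical, and the parser is a minimal sketch that ignores corner cases a real server has to handle (for example, malformed or repeated headers).

```python
def build_request(method, path, headers):
    """Assemble request line, header lines, and the terminating blank line."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n"

def parse_request(message):
    """Split a request into method, path, version, and a header dict."""
    head, _, _body = message.partition("\r\n\r\n")   # blank line ends the headers
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers

message = build_request("GET", "/somedir/page.html", {
    "Host": "www.someschool.edu",
    "User-agent": "Mozilla/4.0",
    "Connection": "close",
    "Accept-language": "fr",
})
method, path, version, headers = parse_request(message)
```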

HTTP request message: general format
[Figure: general format of an HTTP request message]

HTTP Methods
- GET: retrieve a file (95% of requests)
- HEAD: just get meta-data (e.g., mod time)
- POST: submit a form to a server
- PUT: store enclosed document as URI
- DELETE: remove named resource
- LINK/UNLINK: in 1.0, gone in 1.1
- TRACE: HTTP "echo" for debugging (added in 1.1)
- CONNECT: used by proxies for tunneling (1.1)
- OPTIONS: request for server/proxy options (1.1)

Trying out HTTP (client side) for yourself
1. Telnet to your favorite Web server:
       telnet www.eurecom.fr 80
   Opens a TCP connection to port 80 (default HTTP server port) at www.eurecom.fr. Anything typed in is sent to port 80 at www.eurecom.fr.
2. Type in a GET HTTP request:
       GET /~ross/index.html HTTP/1.0
   By typing this in (hit carriage return twice), you send this minimal (but complete) GET request to the HTTP server.
3. Look at the response message sent by the HTTP server!

HTTP response message
- status line (protocol, status code, status phrase), followed by header lines, a blank line, and then the data (e.g., the requested HTML file):

    HTTP/1.1 200 OK
    Connection: close
    Date: Thu, 06 Aug 1998 12:00:15 GMT
    Server: Apache/1.3.0 (Unix)
    Last-Modified: Mon, 22 Jun 1998...
    Content-Length: 6821
    Content-Type: text/html

    data data data data data...
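A response can be split into status line, headers, and data in the same way as a request. The function name `parse_response_head` is hypothetical, and the sample response is a shortened stand-in for the one on the slide: its 5-byte body and matching Content-Length are assumptions made so the example is self-consistent.

```python
def parse_response_head(raw):
    """Return (version, status code, phrase, headers, body) from raw bytes."""
    head, _, body = raw.partition(b"\r\n\r\n")        # blank line ends the headers
    status_line, *header_lines = head.decode("ascii").split("\r\n")
    version, code, phrase = status_line.split(" ", 2)  # phrase may contain spaces
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), phrase, headers, body

raw = (b"HTTP/1.1 200 OK\r\n"
       b"Connection: close\r\n"
       b"Content-Length: 5\r\n"
       b"Content-Type: text/html\r\n"
       b"\r\n"
       b"hello")
version, code, phrase, headers, body = parse_response_head(raw)
```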

HTTP Response Status Codes
- 1XX: Informational (defined in 1.0, used in 1.1)
  - 100 Continue, 101 Switching Protocols
- 2XX: Success
  - 200 OK, 206 Partial Content
- 3XX: Redirection
  - 301 Moved Permanently, 304 Not Modified
- 4XX: Client error
  - 400 Bad Request, 403 Forbidden, 404 Not Found
- 5XX: Server error
  - 500 Internal Server Error, 503 Service Unavailable, 505 HTTP Version Not Supported
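The first digit of a status code determines its class, so a client can decide how to react even to a code it does not know in detail. The sketch below uses Python's standard `http.HTTPStatus` enum to look up the reason phrase; the `CLASSES` table and the `describe` helper are names introduced here for illustration.

```python
from http import HTTPStatus

CLASSES = {1: "Informational", 2: "Success", 3: "Redirection",
           4: "Client error", 5: "Server error"}

def describe(code):
    """Return (class name, standard reason phrase) for a status code."""
    return CLASSES[code // 100], HTTPStatus(code).phrase
```

For example, `describe(304)` yields the class "Redirection" together with the phrase "Not Modified".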

Web caches (proxy server)
- Goal: satisfy client request without involving origin server
- user sets browser: Web accesses via cache
- browser sends all HTTP requests to cache
  - object in cache: cache returns object
  - else cache requests object from origin server, then returns object to client
[Figure: two clients send requests to a proxy server, which contacts two origin servers on their behalf]

More about Web caching
- cache acts as both client and server
- typically cache is installed by ISP (university, company, residential ISP)

Caching example
[Figure: institutional network (10 Mbps LAN, institutional cache) connected to origin servers on the public Internet through a 1.5 Mbps access link]

Caching example (cont)
[Figure: the same topology with a 10 Mbps access link]

Caching example (cont)
[Figure: the same topology with a 1.5 Mbps access link and the institutional cache]

Caching example (cont): why Web caching?
- Reduce response time for client request
- Reduce traffic on an institution's access link
- Offloads server
- Internet dense with caches: enables "poor" content providers to effectively deliver content (but so does P2P file sharing)

Typical hit rates?
- Traces suggest a 40-50% hit rate for objects and 20-25% for bytes (across geographies and over time)
- But the total relative traffic reduction is shrinking with increasing use of HTTPS
- Example references:
  - P. Gill, M. Arlitt, N. Carlsson, A. Mahanti, C. Williamson, "Characterizing Organizational Use of Web-based Services: Methodology, Challenges, Observations, and Insights", ACM Transactions on the Web (ACM TWEB), Vol. 5, No. 4 (Oct. 2011), pp. 19:1-19:23.
  - A. Wolman, G. Voelker, N. Sharma, N. Cardwell, A. Karlin, and H. Levy, "On the scale and performance of cooperative Web proxy caching", Proc. ACM Symposium on Operating Systems Principles (ACM SOSP), Kiawah Island, SC, Dec. 1999, pp. 16-31.
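The difference between object hit rate and byte hit rate can be made concrete with a small calculation. The sketch below assumes an infinite cache that keeps everything it has seen, and the four-request trace is entirely hypothetical; it is only meant to show why a large, rarely requested object pulls the byte hit rate below the object hit rate.

```python
def hit_rates(trace):
    """trace: list of (object_id, size_bytes) requests; infinite cache assumed."""
    seen = set()
    hits = hit_bytes = total_bytes = 0
    for obj, size in trace:
        total_bytes += size
        if obj in seen:          # already cached: counts as a hit
            hits += 1
            hit_bytes += size
        else:                    # first request: miss, then cache it
            seen.add(obj)
    return hits / len(trace), hit_bytes / total_bytes

# Hypothetical trace: a small, popular object and a large one-timer.
trace = [("logo.gif", 1_000), ("logo.gif", 1_000),
         ("video.mp4", 8_000), ("logo.gif", 1_000)]
object_hit_rate, byte_hit_rate = hit_rates(trace)
```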

Web Caching Hierarchy (from top to bottom)
- national/international proxy cache
- regional proxy cache
- local proxy cache (e.g., local ISP, University)
- client

Some Issues
- Not all objects can be cached
  - E.g., dynamic objects, copyrighted material, HTTPS
- Cache consistency
  - strong
  - weak
- Cache Replacement Policies
  - Variable size objects
  - Varying cost of not finding an object (a "miss") in the cache
- Prefetch?
  - A large fraction of the requests are one-timers
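As one concrete replacement policy, the sketch below implements least-recently-used (LRU) eviction for variable-size objects. LRU is chosen here purely as an example; the slides do not name a specific policy, and the class and its byte-based capacity are assumptions of this illustration.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.items = OrderedDict()    # url -> size, least recently used first

    def get(self, url):
        """Return True on a hit and mark the object as recently used."""
        if url not in self.items:
            return False
        self.items.move_to_end(url)
        return True

    def put(self, url, size):
        """Insert an object, evicting least recently used ones to make room."""
        if size > self.capacity:
            return                    # too large to cache at all
        if url in self.items:
            self.used -= self.items.pop(url)
        while self.used + size > self.capacity:
            _, evicted_size = self.items.popitem(last=False)
            self.used -= evicted_size
        self.items[url] = size
        self.used += size

cache = LRUCache(capacity_bytes=100)
cache.put("a", 40)
cache.put("b", 40)
cache.get("a")          # "a" is now more recently used than "b"
cache.put("c", 40)      # does not fit, so the least recently used object goes
```

Because objects differ in size, a single insertion may evict several small objects, or one large object may push out many; policies that also weigh size or miss cost are variations on this loop.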

Weak Consistency
- Each cached copy has a TTL beyond which it must be validated with the origin server
- TTL = freshness lifetime - age
  - freshness lifetime: often heuristically calculated; sometimes based on the max-age directive or the Expires header
  - age = current time (at client) - timestamp on object (time at which server generated response)
- Age Penalty?
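The two formulas translate directly into code. In this sketch all times are in seconds, and the function names are made up for the illustration; how the freshness lifetime is obtained (heuristic or header) is left to the caller.

```python
def remaining_ttl(freshness_lifetime_s, now_s, object_timestamp_s):
    """TTL = freshness lifetime - age, with age = now - object timestamp."""
    age = now_s - object_timestamp_s
    return freshness_lifetime_s - age

def is_fresh(freshness_lifetime_s, now_s, object_timestamp_s):
    """A cached copy may be served without validation while its TTL is positive."""
    return remaining_ttl(freshness_lifetime_s, now_s, object_timestamp_s) > 0
```

Once `is_fresh` returns False, the cache has to validate the copy with the origin server before serving it again.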

Conditional GET: client-side caching
- Goal: don't send object if client has up-to-date cached version
- client: specify date of cached copy in HTTP request
      If-modified-since: <date>
- server: response contains no object if cached copy is up-to-date
      HTTP/1.0 304 Not Modified
- Object not modified:
  - client sends HTTP request msg with If-modified-since: <date>
  - server sends HTTP response: HTTP/1.0 304 Not Modified
- Object modified:
  - client sends HTTP request msg with If-modified-since: <date>
  - server sends HTTP response: HTTP/1.0 200 OK, followed by <data>
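The server's side of this decision can be sketched as a single comparison of two dates. The function name `conditional_get` and its return convention are assumptions for this illustration; dates are parsed with the standard library's `email.utils.parsedate_to_datetime`, which understands the HTTP date format used in these headers.

```python
from email.utils import parsedate_to_datetime

def conditional_get(last_modified, if_modified_since=None):
    """Return (status code, include_body) for a GET on one object.

    Both arguments are HTTP-date strings; last_modified is the
    server's modification time for the object.
    """
    if if_modified_since is not None:
        cached = parsedate_to_datetime(if_modified_since)
        current = parsedate_to_datetime(last_modified)
        if current <= cached:
            return 304, False     # Not Modified: no object in the response
    return 200, True              # OK: response carries the object
```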

Content distribution networks (CDNs)
- The content providers are the CDN customers
- Content replication
  - CDN company installs hundreds of CDN servers throughout the Internet
    - in lower-tier ISPs, close to users
  - CDN replicates its customers' content in CDN servers
  - When provider updates content, CDN updates servers
  - Different approaches
[Figure: origin server in North America feeds a CDN distribution node, which feeds CDN servers in S. America, Asia, and Europe]

Cookies: keeping state
- Many major Web sites use cookies
- Four components:
  1) cookie header line in the HTTP response message
  2) cookie header line in HTTP request message
  3) cookie file kept on user's host and managed by user's browser
  4) back-end database at Web site
- Example: user visits a specific e-commerce site

Cookies: keeping state (cont.)
- Client's cookie file initially contains: ebay: 8734
- Client sends usual HTTP request msg
- Server creates ID 1678 for user
- Server sends usual HTTP response + Set-cookie: 1678
- Client's cookie file now contains: amazon: 1678, ebay: 8734
- Client sends usual HTTP request msg with cookie: 1678
- Server performs cookie-specific action and sends usual HTTP response msg
- One week later (cookie file still contains amazon: 1678, ebay: 8734):
  - Client sends usual HTTP request msg with cookie: 1678
  - Server performs cookie-specific action and sends usual HTTP response msg
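The exchange can be mimicked with a toy model in which headers are plain dictionaries and no real HTTP is spoken. The `Site` class, its visit counter (standing in for the "cookie-specific action"), and the starting ID are assumptions of this sketch.

```python
import itertools

class Site:
    """Toy Web site that recognizes returning users by cookie ID."""
    def __init__(self, first_id):
        self._ids = itertools.count(first_id)
        self.database = {}            # back-end database: cookie ID -> visits

    def handle(self, request_headers):
        response_headers = {}
        cookie = request_headers.get("Cookie")
        if cookie is None or int(cookie) not in self.database:
            new_id = next(self._ids)          # server creates an ID for the user
            self.database[new_id] = 1
            response_headers["Set-cookie"] = str(new_id)
        else:
            self.database[int(cookie)] += 1   # cookie-specific action
        return response_headers

site = Site(first_id=1678)
cookie_file = {"ebay": "8734"}                # kept by the browser

first = site.handle({})                       # no cookie for this site yet
cookie_file["amazon"] = first["Set-cookie"]   # browser stores the new ID
second = site.handle({"Cookie": cookie_file["amazon"]})
```

HTTP itself stays stateless here: all state lives in the browser's cookie file and the site's database, and the cookie header merely links the two.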

Cookies (continued)
- What cookies can bring:
  - authorization
  - shopping carts
  - recommendations
  - user session state (Web e-mail)
- Aside: cookies and privacy
  - cookies permit sites to learn a lot about you
  - you may supply name and e-mail to sites
  - search engines use redirection & cookies to learn yet more
  - advertising companies obtain info across sites

Web & HTTP
- The major application on the Internet
  - A large fraction of traffic is HTTP
- Client/server model:
  - Clients make requests, servers respond to them
  - Done mostly in ASCII text (helps debugging!)
- Various headers and commands
- Web Caching & Performance
- Content Distribution Networks

More slides

Introduction to HTTP
- HTTP: HyperText Transfer Protocol
  - Communication protocol between clients and servers
  - Application layer protocol for WWW
- Client/Server model:
  - Client: browser that requests, receives, displays object
  - Server: receives requests and responds to them
- Protocol consists of various operations
  - Few for HTTP 1.0 (RFC 1945, 1996)
  - Many more in HTTP 1.1 (RFC 2616, 1999)
[Figure: a laptop with Netscape and a desktop with Explorer each send an HTTP request to a server with Apache and receive an HTTP response]

Request Generation
- User clicks on something
- Uniform Resource Locator (URL):
  - http://www.cnn.com
  - http://www.cpsc.ucalgary.ca
  - https://www.paymybills.com
  - ftp://ftp.kernel.org
- Different URL schemes map to different services
- Hostname is converted from a name to a 32-bit IP address (DNS lookup, if needed)
- Connection is established to server (TCP)

What Happens Next?
- Client downloads HTML document
  - Sometimes called "container page"
  - Typically in text format (ASCII)
  - Contains instructions for rendering (e.g., background color, frames)
  - Links to other pages
- Many have embedded objects:
  - Images: GIF, JPG (logos, banner ads)
  - Usually automatically retrieved
    - I.e., without user involvement
    - user can sometimes control this (e.g., browser options, junkbusters)

Sample HTML file:

    <html>
    <head>
    <meta name="Author" content="Erich Nahum">
    <title> Linux Web Server Performance </title>
    </head>
    <body text="#000000">
    <img width=31 height=11 src="ibmlogo.gif">
    <img src="images/new.gif">
    <h1>Hi There!</h1>
    Here's lots of cool linux stuff!
    <a href="more.html">Click here</a> for more!
    </body>
    </html>
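How a browser finds the embedded objects it has to fetch can be sketched with Python's standard `html.parser` module. The `ObjectFinder` class is a name introduced for this illustration, and the page string is a condensed version of the sample file; a real browser handles many more tags (stylesheets, scripts, frames).

```python
from html.parser import HTMLParser

class ObjectFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.embedded = []   # objects fetched automatically (images)
        self.links = []      # pages fetched only when the user clicks

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and "src" in attrs:
            self.embedded.append(attrs["src"])
        elif tag == "a" and "href" in attrs:
            self.links.append(attrs["href"])

page = ('<html><body><img width=31 height=11 src="ibmlogo.gif">'
        '<img src="images/new.gif"><h1>Hi There!</h1>'
        '<a href="more.html">Click here</a> for more!</body></html>')
finder = ObjectFinder()
finder.feed(page)
```

Each entry in `finder.embedded` triggers a further HTTP request without user involvement, while entries in `finder.links` are only requested after a click.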

Web Server Role
- Respond to client requests; the client is typically a browser
  - Can be a proxy, which aggregates client requests
  - Could be a search engine spider or robot
- May have work to do on client's behalf:
  - Is the client's cached copy still good?
  - Is client authorized to get this document?
- Hundreds or thousands of simultaneous clients
- Hard to predict how many will show up on some day (e.g., "flash crowds", diurnal cycle, global presence)
- Many requests are in progress concurrently

HTTP Request Format

    GET /images/penguin.gif HTTP/1.0
    User-Agent: Mozilla/0.9.4 (Linux 2.2.19)
    Host: www.kernel.org
    Accept: text/html, image/gif, image/jpeg
    Accept-Encoding: gzip
    Accept-Language: en
    Accept-Charset: iso-8859-1,*,utf-8
    Cookie: B=xh203jfsf; Y=3sdkfjej
    <cr><lf>

- Messages are in ASCII (human-readable)
- Carriage-return and line-feed indicate end of headers
- Headers may communicate private information (browser, OS, cookie information, etc.)

Response Format

    HTTP/1.0 200 OK
    Server: Tux 2.0
    Content-Type: image/gif
    Content-Length: 43
    Last-Modified: Fri, 15 Apr 1994 02:36:21 GMT
    Expires: Wed, 20 Feb 2002 18:54:46 GMT
    Date: Mon, 12 Nov 2001 14:29:48 GMT
    Cache-Control: no-cache
    Pragma: no-cache
    Connection: close
    Set-Cookie: PA=wefj2we0-jfjf
    <cr><lf>
    <data follows>

- Similar format to requests (i.e., ASCII)

Outline of an HTTP Transaction
- This section describes the basics of servicing an HTTP GET request from user space
- Assume a single process running in user space, similar to Apache 1.3
- We'll mention relevant socket operations along the way

Server in a nutshell:

    initialize;
    forever do {
        get request;
        process;
        send response;
        log request;
    }

Readying a Server

    s = socket();              /* allocate listen socket */
    bind(s, 80);               /* bind to TCP port 80 */
    listen(s);                 /* indicate willingness to accept */
    while (1) {
        newconn = accept(s);   /* accept new connection */

- First thing a server does is notify the OS it is interested in WWW server requests; these are typically on TCP port 80. Other services use different ports (e.g., SSL is on 443)
- Allocates a socket and bind()s it to the address (port 80)
- Server calls listen() on the socket to indicate willingness to receive requests
- Calls accept() to wait for a request to come in (and blocks)
- When accept() returns, we have a new socket which represents a new connection to a client

Processing a Request

    remoteip = getpeername(newconn);        /* address of the client */
    remotehost = gethostbyaddr(remoteip);   /* reverse lookup: address to name */
    gettimeofday(currenttime);
    read(newconn, reqbuffer, sizeof(reqbuffer));
    reqinfo = serverparse(reqbuffer);

- getpeername() called to get the remote host's address for logging purposes (optional, but done by most)
- gethostbyaddr() called to get the name of the other end, again for logging purposes
- gettimeofday() is called to get the time of the request, both for the Date header and for logging
- read() is called on the new socket to retrieve the request
- request is determined by parsing the data, e.g., "GET /images/jul4/flag.gif"

Processing a Request (cont)

    filename = parseoutfilename(reqbuffer);
    fileattr = stat(filename);
    servercheckfilestuff(filename, fileattr);
    fd = open(filename);

- stat() called to test the file path
  - to see if the file exists/is accessible
  - may not be there, may only be available to certain people
  - "/microsoft/top-secret/plans-for-world-domination.html"
- stat() also used for file meta-data
  - e.g., size of file, last modified time
  - "Has file changed since last time I checked?"
- might have to stat() multiple files and directories
- assuming all is OK, open() called to open the file

Responding to a Request

    read(fd, filebuffer);                   /* read file into user space */
    headerbuffer = serverfigureheaders(filename, reqinfo);
    write(newconn, headerbuffer);           /* send HTTP headers */
    write(newconn, filebuffer);             /* send file contents */
    close(newconn);
    close(fd);
    write(logfile, reqinfo);

- read() called to read the file into user space
- write() is called to send HTTP headers on the socket (early servers called write() for each header!)
- write() is called to write the file on the socket
- close() is called to close the socket
- close() is called to close the open file descriptor
- write() is called on the log file
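The read, parse, look up, write, close, log sequence can be condensed into one runnable Python sketch. Several simplifications are assumptions of this illustration and not part of the slides: files live in an in-memory dictionary instead of on disk, `socket.socketpair()` replaces the listen/accept steps so no port is needed, and the hypothetical `handle_connection` returns the log line instead of writing it to a file.

```python
import socket

def handle_connection(conn, files):
    """Serve one request on conn from the in-memory dict files."""
    request = conn.recv(4096).decode("ascii")       # read the request
    method, path, _version = request.split("\r\n")[0].split(" ")
    if method != "GET":
        status, body = "501 Not Implemented", b""
    elif path in files:                             # plays the role of stat()/open()
        status, body = "200 OK", files[path]
    else:
        status, body = "404 Not Found", b""
    head = (f"HTTP/1.0 {status}\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n\r\n")
    conn.sendall(head.encode("ascii") + body)       # headers and file in one write
    conn.close()                                    # close the connection
    return f"{method} {path} {status}"              # what a server would log

client, server_side = socket.socketpair()
client.sendall(b"GET /images/jul4/flag.gif HTTP/1.0\r\n\r\n")
log_line = handle_connection(server_side, {"/images/jul4/flag.gif": b"GIF89a"})
response = client.recv(4096)
client.close()
```

Sending the headers and the body in a single call avoids the per-header write() calls that the slide attributes to early servers.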

Summary of Web and HTTP
- The major application on the Internet
  - Majority of traffic is HTTP (or HTTP-related)
- Client/server model:
  - Clients make requests, servers respond to them
  - Done mostly in ASCII text (helps debugging!)
- Various headers and commands
  - Too many to go into detail here
  - Many web books/tutorials exist (e.g., Krishnamurthy & Rexford 2001)
