Network Technologies Glenn Strong Department of Computer Science School of Computer Science and Statistics Trinity College, Dublin January 28, 2014
What Happens When Browser Contacts Server I Top view: The browser determines the hostname (the address that was clicked upon or typed) www.adobe.com (host name is Mnemonic - assists human memory). The browser asks the DNS (Domain Name Service) for the IP address of www.adobe.com DNS - a distributed database that translates hostnames to IP addresses DNS replies with the corresponding IP address. For example 192.150.14.120 The browser then makes a TCP connection to port 80 on 192.150.14.120 Handshakes with web server process running on 192.150.14.120 Sets up a reliable connection between the two machines It then sends the HTTP command GET /index.html http/1.0 So the HTTP data command is encapsulated in TCP segment, encapsulated in IP.
What Happens When Browser Contacts Server II The web server running on 192.150.14.120 retrieves index.html from its file system. It attaches some HTTP header information and passes the file stream to its TCP process. Encapsulation again. The browser receives the file and closes the TCP connection Browser reads HTML and renders page Since we are using HTTP 1.0 the browser will reopen a new TCP connection for every graphic or embedded item in the page
TCP and Ports Weve already mentioned that a server machine might have several server processes running on it, each providing different type of service - HTTP, FTP, Telnet, SMTP Each server process can be identified by the port at which it listens on that machine What exactly is a port? We use a port to set up a TCP connection Programming term specifying a logical connection place In TCP/IP, allows a client program specify a particular server process on a computer in a network Port numbers are from 0 to 65536 Ports 0 to 1024 are reserved for use by certain privileged services. See RFC 1700 For the HTTP service, port 80 is defined as a default and it does not have to be specified in the Uniform Resource Locator (URL). FTP: port 27, TELNET: port 23
Ports When taken together, the source port, the destination port, source and destination IP numbers uniquely identify an application process This allows several processes to communicate without the signals becoming mixed up This is all carried out by the transport layer It identifies the data destined for each process by examining the source/destination IP address source/destination port number of each incoming segment The source and destination port numbers are included in each TCP segment.
HTTP Ok - weve introduced what happens when a client contacts a web server After it establishes a TCP connection, we saw that it issues a GET command followed by the file name it requires and version of HTTP it is running HTTP is the application layer protocol for retrieving web pages The browser uses TCP as the transport mechanism to send HTTP information between client and server. HTTP is the Hypertext Transfer Protocol WWWs application layer protocol Client/server model client: browser that requests, receives, displays WWW objects server: WWW server sends objects in response to requests They are standardized. HTTP 1.0 defined by RFC 1945, HTTP 1.1 defined by RFC 2068.
HTTP Example Suppose a user enters the URL: www.tcd.ie/library/index.htm 1 HTTP client initiates TCP connection to HTTP server (process) at www.tcd.ie. Port 80 is default for HTTP server. 2 HTTP server at host www.tcd.ie waiting for TCP connection at port 80 accepts connection, notifying client. 3 HTTP client sends http request message (containing URL) into TCP connection socket. 4 HTTP server receives request message, forms response message containing requested object (Library/index.htm), sends message into socket. 5 HTTP server closes TCP connection. 6 HTTP client receives response message containing HTML file, displays HTML. Parsing the HTML file, the browser finds 10 referenced jpeg objects 7 Steps 1-5 repeated for each of 10 jpeg objects
How many TCP connections do we need to establish? Non-persistent connection: one object in each TCP connection Some browsers create multiple TCP connections simultaneously - one per object Persistent connection: multiple objects transferred within one TCP connection (Default for HTTP/1.1). This is now the most common way web pages and embedded content are transferred.
A Note on Virtual Hosting Ive mentioned servers in the context of a single machine hosting a single web domain. However, a single machine might have several host names associated with its IP address. This is called virtual hosting; It is made possible by HTTP/1.1 Every HTTP/1.1 request includes a HOST field which specifies the domain name associated with the request This allows the server know which part of its file system has been allocated to the domain name. Offering a virtual host facility is often mis-named as offering virtual server facility A virtual server is a facility whereby you are given complete control of a server process Essentially, you are running your own remote server, giving Complete configuration control You have your own IP address You can sub-host (offer your own virtual hosting facility)
Back to the HTTP Protocol The TCP transport service for HTTP: Client initiates TCP connection (creates socket) to server, port 80 Server accepts TCP connection from client HTTP messages (application-layer protocol messages) exchanged between browser (http client) and WWW server (http server) TCP connection closeda Note that each request-response transaction is independent: HTTP is stateless. Server maintains no information about past client requests Protocols that maintain state are complex! past history (state) must be maintained if server/client crashes, their views of state may be inconsistent, must be reconciled
HTTP Message Format: Request There are two types of HTTP message: request response The HTTP request message is ASCII text, in a human-readable format: GET /somedir/page.html HTTP/1.1 Host: www.virtualhost.com Connection: close User-agent: Mozilla/4.0 Accept: text/html, image/gif,image/jpeg Accept-language:fr (extra carriage return line feed)
HTTP Message Format: Request There are several different kinds of request; the previous example was a GET; here is a POST POST /somedir/script.php HTTP/1.0 User-Agent: Mozilla/4.0 Content-Type: application/x-www-form-urlencoded Content-Length: 51 username=fred&password=mumble&sweeties=cola+bottles In which a block of data is sent along with the request. This is a common way to submit form data (in a GET request the form contents are sent as part of the URI). There is another: HEAD which restricts the response to the status and headers
HTTP Message Format: Reply The response sent by the server consists of a status line (the protocol status code and status phrase), followed by header lines and the data requested (e.g. an HTML file). HTTP/1.1 200 OK Connection: close Date: Thu, 06 Apr 2004 12:00:15 GMT Server: Apache/1.3.0 (Unix) Last-Modified: Mon, 22 Mar 2004... Content-Length: 6821 Content-Type: text/html <HTML><HEAD><TITLE>...
HTTP Reply Status Codes In first line in server to client response message. A few sample codes: 200 OK request succeeded, requested object later in this message 301 Moved Permanently requested object moved, new location specified later in this message (Location:) 400 Bad Request request message not understood by server 404 Not Found requested document not found on this server 505 HTTP Version Not Supported you are talking a language I don t understand
User-Server Interaction: Authentication HTTP is stateless. So how do we: control access to server documents? Serve content specifically tailored to a particular user? We need some form of authentication. The goal of authentication is to control access to server documents. stateless: client must present authorization in each request authorization: typically name and password authorization: header line in request if no authorization presented, server refuses access, sends WWW authenticate: header line in response
User-Server Interaction: Cookies [RFC 2109] Server sends cookie to client in response, as part of the headers Set-cookie: UniqueID=1243af54e Client presents cookie in later requests as part of the request headers cookie: UniqueID=1243af54e Server matches presented-cookie with server-stored cookies authentication remembering user preferences, previous choices If the cookie has options they are set as part of the value string:set-cookie: UniqueID=1243af54e; expires=fri, 9-Apr-2010 09:00:00 GMT
How Amazon.com Redirects your Browser and Sets a Cookie When you type www.amazon.com into your browser for the first time several things occur before you receive a web page Firstly Amazon.com (like many large sites) do not keep their front page at index.html. Your browser is redirected to the front page by a HTTP 302 response When your browser then requests this new page, the Amazons server checks to see if a cookie has already been set If not, it sends your browser another 302 response this time with a cookie Your browser sets the cookie Your browser requests the new Amazon front page, this time returning the cookie ID The Amazon server at last sends you the front page of its site After the cookie has been set your browser will include a cookie field in every request it makes to www.amazon.com You can follow this transaction using a telnet tool to emulate a browser.
Client Caching: Conditional GET Goal: dont send object if client has up-to-date stored (cached) version What if the cached version is stale? Client does a quick check: Client specifies last-modified date of cached copy in a HTTP request If-modified-since: <date> Server: response contains no object if cached copy up-to-date: HTTP/1.0 304 Not Modified
Web Caches (Proxy Server) Goal: satisfy client request without involving origin server User sets browser so that it accesses the web a via the proxy server The proxy has a cache of pages already downloaded Client sends all HTTP requests to web cache if object at web cache, web cache immediately returns object in HTTP response else proxy server requests object from origin server, then returns http response to client Note: client first opens TCP connection to proxy server and sends HTTP request over connection. If proxy doesnt have the item, it opens a TCP connection to remote web server and sends request for the item Retrieves item - stores it in cache and forwards it over original client TCP connection