Internet Technologies
  - World Wide Web (WWW)
  - Proxy Server
  - Network Address Translator (NAT)

What is WWW?
  - A system of interlinked hypertext documents
  - Documents contain text, images, videos, and other multimedia
  - Users navigate between documents via hyperlinks

Components of WWW
  - Web server
      - Maintains the resources that the user wants to share
      - The resources are linked to each other, so a user can navigate from one to another
      - Also called an HTTP server
  - HTML
      - Language used to publish the contents
      - Describes how the document should be displayed and presented
  - Hypertext Transfer Protocol (HTTP)
      - Language in which the server and client communicate
  - Web browser

URL
  - A character string that specifies where a known resource is available on the Internet and the mechanism for retrieving it
  - Syntax:
        scheme://domain:port/path?query_string#fragment_id
  - Scheme: the type of service used to access the resource, such as http, https, ftp, or mailto
  - Domain or IP address: the domain name or IP address of the web server where the resource is located
  - Port: the port number of the destination HTTP server
      - The port number is optional
      - If it is omitted, port 80 is assumed (the default HTTP port)
      - The default port for HTTPS is 443

URL (continued)
  - Path: the path of the resource to be fetched, or the path of the script to be executed by the server
  - Query string: contains the form data to be processed by the program running on the server
  - Fragment ID: specifies a part or a position within the overall resource or document
      - With HTTP, it usually identifies a section or location within the page, and the browser may scroll to display that part of the page

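The component breakdown above can be checked with Python's standard `urllib.parse` module. This is only a sketch: the helper name `split_url`, the `DEFAULT_PORTS` table, and the sample URL are made up for illustration, and the default-port rule covers just the two schemes the slides mention.

```python
from urllib.parse import urlsplit

# Default ports for the two web schemes named in the slides.
DEFAULT_PORTS = {"http": 80, "https": 443}


def split_url(url):
    """Return the URL components as a dict, filling in the default port."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "scheme": parts.scheme,
        "domain": parts.hostname,
        "port": port,
        "path": parts.path,
        "query_string": parts.query,
        "fragment_id": parts.fragment,
    }


# Hypothetical URL used only for illustration.
info = split_url("http://www.example.org/docs/page.html?id=7#intro")
for name, value in info.items():
    print(f"{name}: {value}")
```

Because no port appears in the sample URL, the `port` entry falls back to 80, matching the rule stated on the slide.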
URL Examples
  - http://google.com:80
      - Performs an HTTP request to the host google.com on port 80
  - mailto:askcs@uohyd.ernet.in
      - Starts an e-mail composer with the address askcs@uohyd.ernet.in in the To field
  - ftp://asmith:passwd@ftp.example.org
  - http://dcis.uohyd.ernet.in/~askcs/ca502.html
  - http://www.xyz.com/cgi-bin/xyz.pl?roll=1234&sex=m
      - The form data is provided as input to the script xyz.pl
  - http://www.example.org/foo.html#bar

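To see what "form data is provided as input to the script" means for the xyz.pl example, the query string can be decoded with `urllib.parse.parse_qs`. This is a sketch of what a server-side program might do with it, not a description of how xyz.pl itself works.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.xyz.com/cgi-bin/xyz.pl?roll=1234&sex=m"
parts = urlsplit(url)

# The path names the script; the query string carries the form fields.
script_path = parts.path
form_data = parse_qs(parts.query)  # each field maps to a list of values

print(script_path)
print(form_data)
```

Each field maps to a list because the same field name may legitimately appear more than once in a query string.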
HTTP Basics
  - Protocol for client/server communication
  - The client sends a request message; the server replies with a response message
  - The client first establishes a socket connection to the server, then sends the HTTP request over it
  - Stateless: successive requests are independent of each other

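The connect-then-request order can be sketched with plain sockets. To keep the example self-contained it talks to a throwaway one-shot server on the loopback interface; the fixed `RESPONSE`, the helper names `serve_once` and `fetch`, and the path `/index.html` are all assumptions made for this illustration.

```python
import socket
import threading

# Fixed reply sent by the toy server, whatever the request was.
RESPONSE = b"HTTP/1.0 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 2\r\n\r\nhi"


def serve_once(listener):
    """Accept one connection, read the request headers, send a fixed response."""
    conn, _ = listener.accept()
    with conn:
        data = b""
        while b"\r\n\r\n" not in data:  # the blank line ends the request headers
            chunk = conn.recv(1024)
            if not chunk:
                break
            data += chunk
        conn.sendall(RESPONSE)


def fetch(host, port, path):
    """Open a socket connection first, then send the HTTP request over it."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii"))
        reply = b""
        while True:  # the server closes the connection after replying
            chunk = sock.recv(1024)
            if not chunk:
                break
            reply += chunk
    return reply.decode("ascii")


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

thread = threading.Thread(target=serve_once, args=(listener,))
thread.start()
reply = fetch("127.0.0.1", port, "/index.html")
thread.join()
listener.close()

print(reply.splitlines()[0])
```

The server keeps nothing once the connection closes, so a second request would have to open a new connection and repeat everything it needs.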
HTTP Request commands
  - GET: retrieve the document specified by the URL
  - HEAD: retrieve the header information about the document specified by the URL
  - POST: give information to the server
  - PUT: store the specified document under the given URL
  - DELETE: remove the document specified by the URL

HTTP Request methods
  - Basic syntax of an HTTP request:
      - an initial line,
      - zero or more header lines,
      - a blank line (i.e. a CRLF by itself), and
      - an optional message body (e.g. a file, or query data, or query output)
  - Example initial lines:
        GET /path/to/file/index.html HTTP/1.0
        GET /path/script.cgi?field1=value1&field2=value2 HTTP/1.0

Initial Response Line (Status Line)
  - The initial response line, called the status line, also has three parts separated by spaces:
      - the HTTP version,
      - a response status code that gives the result of the request, and
      - an English reason phrase describing the status code
  - Typical status lines:
        HTTP/1.0 200 OK
        HTTP/1.0 404 Not Found

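The message layout described above (initial line, header lines, blank line, optional body) and the three-part status line can be sketched as two small helpers. The names `build_request` and `parse_status_line` are invented for this example, and HTTP/1.0 is hard-coded to match the slides.

```python
def build_request(method, target, headers=None, body=b""):
    """Assemble initial line + header lines + blank line + optional body."""
    lines = [f"{method} {target} HTTP/1.0"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Every line ends with CRLF; a CRLF by itself closes the header section.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def parse_status_line(line):
    """Split a status line into (version, status code, reason phrase)."""
    version, code, reason = line.split(" ", 2)  # the reason may contain spaces
    return version, int(code), reason


request = build_request("GET", "/path/to/file/index.html",
                        {"User-Agent": "HTTPTool/1.0"})
print(request.decode("ascii"))
print(parse_status_line("HTTP/1.0 404 Not Found"))
```

Splitting at most twice matters because a reason phrase such as "Not Found" contains a space of its own.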
HTTP response codes
  - The status code is a three-digit integer, and the first digit identifies the general category of response:
      - 1xx indicates an informational message only
      - 2xx indicates success of some kind
      - 3xx redirects the client to another URL
      - 4xx indicates an error on the client's part
      - 5xx indicates an error on the server's part
  - The most common status codes:
      - 200 OK: the request succeeded, and the resulting resource is returned in the message body
      - 404 Not Found: the requested resource doesn't exist

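Since the category depends only on the first digit, a client can classify any status code with integer division. A minimal sketch follows; the function name `status_category` and the short category labels are chosen for this example.

```python
# First digit of the status code -> general category of response.
CATEGORIES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_category(code):
    """Return the general category for a three-digit HTTP status code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return CATEGORIES[code // 100]


for code in (200, 404):
    print(code, status_category(code))
```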
Some HTTP exchanges with GET
  - To retrieve the file at the URL http://www.somehost.com/path/file.html
  - Client request:
        GET /path/file.html HTTP/1.0
        From: someuser@jmarshall.com
        User-Agent: HTTPTool/1.0
        [blank line here]
  - Server response:
        HTTP/1.0 200 OK
        Date: Fri, 31 Dec 1999 23:59:59 GMT
        Content-Type: text/html
        Content-Length: 1354

        <html>
        <body>
        <h1>Happy New Millennium!</h1>
        (more file contents)...
        </body>
        </html>

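A client has to take a response like the one above apart again: status line, headers, then body. The sketch below does that for a shortened version of the response; the body is a stand-in, so the `Content-Length` value is computed from it instead of reusing 1354, and `parse_response` is a name invented here.

```python
def parse_response(raw):
    """Split a raw HTTP response into (status line, headers dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")  # first blank line ends the headers
    lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")  # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers, body


# Stand-in body; the slide's real page is longer and not reproduced here.
page = b"<html><body><h1>Happy New Millennium!</h1></body></html>"
raw = (
    b"HTTP/1.0 200 OK\r\n"
    b"Date: Fri, 31 Dec 1999 23:59:59 GMT\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: " + str(len(page)).encode("ascii") + b"\r\n"
    b"\r\n" + page
)

status_line, headers, body = parse_response(raw)
print(status_line)
print(headers["content-type"], headers["content-length"])
```

Splitting each header at the first colon only keeps values such as the `Date` timestamp, which contains colons itself, intact.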
HEAD HTTP Request method
  - Similar to GET, but requests only the document's header information
  - The document contents are not downloaded from the server
  - Useful for checking the characteristics of a resource without downloading it
  - The response to a HEAD request never contains a message body

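The difference between HEAD and GET can be observed with the standard `http.server` and `http.client` modules. This sketch runs a small test server on the loopback interface; `DemoHandler`, `PAGE`, and the helper `request` are assumptions made for the demonstration, and a real server decides for itself how to answer HEAD.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body><h1>Hello</h1></body></html>"


class DemoHandler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(PAGE)  # GET: headers followed by the document

    def do_HEAD(self):
        self._send_headers()  # HEAD: same headers, no message body

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def request(method, port):
    """Send one request and return (status, Content-Length header, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request(method, "/")
    resp = conn.getresponse()
    result = (resp.status, resp.getheader("Content-Length"), resp.read())
    conn.close()
    return result


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
port = server.server_address[1]
thread = threading.Thread(target=server.serve_forever)
thread.start()
head_result = request("HEAD", port)
get_result = request("GET", port)
server.shutdown()
thread.join()
server.server_close()

print("HEAD:", head_result)
print("GET: ", get_result)
```

Both responses carry the same `Content-Length` header, which is what makes HEAD useful for checking a resource's size without transferring it.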
POST HTTP Request
  - POST is used when data has to be sent to the server to be processed by a program running there
  - It is used when an HTML form specifies POST as its method
  - Example:
        POST /path/script.cgi HTTP/1.0
        From: frog@jmarshall.com
        User-Agent: HTTPTool/1.0
        Content-Type: application/x-www-form-urlencoded
        Content-Length: 32

        home=cosby&favorite+flavor=flies

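The body and the `Content-Length` value in the example above follow directly from the form fields. A short sketch using `urllib.parse.urlencode`, assuming the form had the two fields `home` and `favorite flavor`:

```python
from urllib.parse import urlencode

# Form fields as the user filled them in (assumed from the example body).
form = {"home": "cosby", "favorite flavor": "flies"}

# urlencode joins fields with "&" and encodes the space as "+".
body = urlencode(form)

request = (
    "POST /path/script.cgi HTTP/1.0\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(body)}\r\n"
    "\r\n"
    f"{body}"
)

print(body)
print(len(body))  # 32, the Content-Length used on the slide
```

Unlike GET, the form data travels in the message body and not in the URL, so the server needs `Content-Length` to know how many bytes to read.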
HTTP Proxy Server
  - An HTTP proxy is a program that acts as an intermediary between a client and a server
  - It receives requests from clients and forwards them to the intended servers
  - The responses pass back through it in the same way
  - It acts as both a server and a client
  - A proxy can also be used as a network firewall
  - When a proxy is present, clients have to provide the complete URL of the resource:
        GET http://www.somehost.com/path/file.html HTTP/1.0
  - Otherwise the proxy has no information about the domain name of the intended server

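Why the proxy needs the complete URL becomes clear when looking at what it must do with the request line: find out which server to contact, then pass on a request for the path alone. The following is only a sketch of that one step; `proxy_forward` is a hypothetical helper, and a real proxy also handles headers, caching, and errors.

```python
from urllib.parse import urlsplit


def proxy_forward(request_line):
    """Return (host, port, request line to send to the origin server)."""
    method, url, version = request_line.split(" ")
    parts = urlsplit(url)
    if not parts.hostname:
        # A bare path such as /path/file.html does not say which server to contact.
        raise ValueError("proxy request needs a complete URL")
    port = parts.port or 80  # default HTTP port when none is given
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return parts.hostname, port, f"{method} {target} {version}"


host, port, line = proxy_forward("GET http://www.somehost.com/path/file.html HTTP/1.0")
print(host, port)
print(line)
```

Given only `GET /path/file.html HTTP/1.0`, the helper raises `ValueError`, which mirrors the last point on the slide.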
Network Address Translator (NAT)
  - Public IP addresses are limited in number
  - It is not always possible to assign a public IP address to each system on the network
  - NAT allows a single device (a router) to act as an agent between the Internet (public network) and the local (private) network
  - Only a single public IP address is required to represent a group of local computers

Types of NAT
  - Static NAT
      - Maps an unregistered IP address to a registered IP address on a one-to-one basis
      - Saves no addresses, so it is of limited use apart from some content filtering
  - Dynamic NAT
      - Maps an unregistered IP address to a registered IP address taken from a group of registered IP addresses
  - Overloading
      - Maps multiple unregistered IP addresses to a single registered IP address by using different port numbers (Port Address Translation)

Stub Domain
  - A LAN that uses IP addresses internally (private IP addresses)
  - Most of its traffic is local
  - A stub domain can include both registered and unregistered IP addresses
  - Hosts with unregistered IP addresses use NAT to communicate with the rest of the world

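Overloading (Port Address Translation) comes down to a translation table kept by the router. The sketch below models only that table; the class name `PortAddressTranslator`, the starting port 40000, and all addresses are made up for illustration, and a real NAT device also rewrites packets, tracks protocols, and expires old entries.

```python
class PortAddressTranslator:
    """Toy PAT table: many private (ip, port) pairs share one public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}   # (private_ip, private_port) -> public_port
        self.back = {}  # public_port -> (private_ip, private_port)

    def outbound(self, private_ip, private_port):
        """Translate the source of an outgoing packet to (public_ip, public_port)."""
        key = (private_ip, private_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out[key]

    def inbound(self, public_port):
        """Find the private host a reply belongs to, or None if unknown."""
        return self.back.get(public_port)


# Hypothetical addresses: two private hosts behind one registered address.
nat = PortAddressTranslator("203.0.113.5")
a = nat.outbound("192.168.0.10", 5000)
b = nat.outbound("192.168.0.11", 5000)
print(a, b)
print(nat.inbound(b[1]))
```

Both hosts use the same private port, yet replies can still be told apart because each mapping gets its own public port on the shared address.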