World Wide Web




João Neves (Joao.Neves@fe.up.pt)

Before WWW
  Major search tools: Gopher and Archie
  Archie
    Searches indexes of FTP archives
    Filename-based queries
  Gopher
    Friendly interface
    Menu-driven queries

Web Born
  Tim Berners-Lee et al. at CERN in 1991
  HyperText Transfer Protocol (HTTP)
  Hypertext: links embedded in text that point to another text document (hyperlinks)
  RFC 1945, May 1996: HTTP/1.0
  RFC 2068, obsoleted by RFC 2616, June 1999: HTTP/1.1

Internet Evolution
  Year    Hosts (*)
  1983          562
  1984         1024
  1985         1961
  1986         2308
  1987         5089
  1988        28174
  1989        80000
  1990       290000
  1991       500000
  1992       727000
  1993      1200000
  1994      2217000
  1995      4852000

[Figure: Total Sites Across All Domains, August 1995 - March 2008. Source: http://news.netcraft.com/archives/web_server_survey.html]

Layering
  Application: HyperText Transfer Protocol, Telnet, Simple Network Management Protocol, Dynamic Host Configuration Protocol
  Transport:   Transmission Control Protocol (TCP), User Datagram Protocol (UDP)
  Network:     Internet Protocol (IP)
  Link:        Ethernet, Wi-Fi, SONET

HTTP
  Standard protocol for web transfer
  Request-response interaction between client and server
  The server holds resources such as HTML files and images
  Request methods: GET, HEAD, PUT, POST, DELETE, ...
  Response: status line + additional information (e.g., a web page)

Introduction to HTTP
  In use by the World-Wide Web global information initiative since 1990
  The first version (referred to as HTTP/0.9) was a simple protocol for raw data transfer across the Internet
  HTTP/1.0 improved the protocol by allowing messages in a MIME-like format, containing metainformation about the data transferred and modifiers on the request/response semantics

HTTP Transaction
  [Diagram: Client <-> HTTP <-> Server; the server serves dir/file.html from its WebRoot]
  HTTP client: web browser
  HTTP server: web server
  Standard port: 80
  Suggested alternate ports: 81, 8080, 8081
  HTTP is used to transmit resources:
    Files/documents
    Image files
    Query results
    Output from CGI scripts
    Anything that can be identified by a URL

Web Clients
  Lynx 2.0 (1993, character-based interface)
  NCSA Mosaic (1993, first widely used browser with a graphical interface)
  Marc Andreessen (author of Mosaic) moved to Netscape
  Microsoft Internet Explorer ("new name for Mosaic")
  Mozilla Firefox
  Opera
  Safari
  Chrome

The Browser
  The browser
    1. fetches the requested page
    2. interprets the text and formatting commands that it contains
    3. displays the page, properly formatted, on the screen
  On the page: strings of text that are links to other pages, called hyperlinks
  On the screen: the hyperlinks are highlighted, either by underlining, by displaying them in a special color, or both

Web Servers
  Server                                          License
  NCSA HTTPd                                      non-commercial, free
  Apache HTTP Server                              freeware
  Apache Tomcat                                   freeware
  lighttpd                                        freeware
  Microsoft Internet Information Services (IIS)   payware
  Zeus Web Server                                 payware
  Zope                                            freeware
  ...

Server Share
  [Figure: Server Share amongst the Million Busiest Sites, March 2009. Source: http://news.netcraft.com/archives/web_server_survey.html]

Markup Languages
  HTML
  SHTML
  SGML
  XML

Markup
  Markup consists of codes inserted into text documents to control formatting, printing, or other processing.
  Descriptive markup indicates the nature, function, or content of the data in a file.
  Procedural markup defines what processing is to be carried out at particular points in the document.

HyperText Markup Language
  Language in which web pages are written
  Contains formatting commands
  Tells the browser what to display and how to display it
  Examples:
    <TITLE> Welcome to My Great Site </TITLE>
      The title of this page is "Welcome to My Great Site"
    <B>Great News!</B>
      Sets "Great News!" in boldface
    <A HREF="http://www.xptoo.org/">I'm the One</A>
      A link pointing to the web page http://www.xptoo.org/index.html, displayed with the text "I'm the One"
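To make the anchor example concrete, here is a minimal sketch of how a program might pull links out of HTML the way a browser has to before it can highlight them. It uses Python's standard html.parser module; the names LinkCollector and extract_links are hypothetical and chosen only for this illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, text) pairs from <A> elements (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the anchor currently open, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names, so <A HREF=...> matches.
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return the list of (href, link text) pairs found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    page = '<B>Great News!</B> <A HREF="http://www.xptoo.org/">I\'m the One</A>'
    print(extract_links(page))
```

Anchors without an HREF attribute (named anchors) are skipped in this sketch, since they are not links to follow.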

Sample HTML Tags
  Tag                  Meaning
  <A> </A>             Anchor (link or name)
  <BODY> </BODY>       Document contents
  <BR>                 Line break
  <FORM> </FORM>       Input form
  <H1> </H1>           Heading level 1
  <HEAD> </HEAD>       Header of a document
  <HR>                 Horizontal rule
  <HTML> </HTML>       The document type is HTML
  <LI>                 List item
  <OL> </OL>           Ordered list
  <P> </P>             Paragraph
  <PRE> </PRE>         Preformatted text
  <TITLE> </TITLE>     Document title
  <UL> </UL>           Unnumbered list

Uniform Resource Identifiers
  A URI is an identifier for some resource; a Uniform Resource Locator (URL) additionally gives specific information on how to obtain that resource (RFC 2396, August 1998)
  HTTP is also used as a generic protocol for communication between user agents and proxies/gateways to other Internet systems, including those supported by the following protocols: SMTP, NNTP, FTP
  In this way, HTTP allows basic hypermedia access to resources available from diverse applications

Uniform Resource Identifiers
  The following examples illustrate URLs that are in common use:

  Name     Utility                                                       Example
  ftp      ftp scheme for File Transfer Protocol services                ftp://ftp.is.co.za/rfc/rfc1808.txt
  http     http scheme for Hypertext Transfer Protocol services          http://www.math.uio.no/faq/compressionfaq/part1.html
  file     Local file                                                    file:/usr/local/etc/ntp.conf
  news     news scheme for USENET news groups and articles               news:comp.infosystems.www.servers.unix
  telnet   telnet scheme for interactive services via the TELNET         telnet://melvyl.ucop.edu/
           Protocol
  mailto   mailto scheme for electronic mail addresses                   mailto:mduerst@ifi.unizh.ch
  gopher   gopher scheme for Gopher and Gopher+ Protocol services        gopher://stap.umn.edu/00/weather/ca/los%20angeles

Uniform Resource Locator
  <scheme>://[userinfo@]hostname[:port]/path[;parameters][?query]
  Some URL schemes use the format "user:password" in the userinfo field. This practice is NOT RECOMMENDED, because the passing of authentication information in clear text (such as URI) has proven to be a security risk in almost every case where it has been used. [RFC2396]
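The generic URL syntax above can be taken apart with Python's standard urllib.parse module. The following is a small sketch, not a full RFC 2396 parser; the function name describe_url and the dictionary keys are assumptions made for the example.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the components named in the generic syntax above."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,      # None when no userinfo is present
        "hostname": parts.hostname,
        "port": parts.port,          # None when the URL names no explicit port
        "path": parts.path,
        "query": parts.query,
    }


if __name__ == "__main__":
    for example in (
        "ftp://ftp.is.co.za/rfc/rfc1808.txt",
        "http://proxy.xptoo.org:3128/path/to/file.html?lang=en",
    ):
        print(describe_url(example))
```

Note that urlsplit leaves any ";parameters" segment inside the path; urllib.parse.urlparse can be used instead when that segment needs to be separated.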

HTTP: HyperText Transfer Protocol
  A very simple, stateless protocol for sessionless exchanges
  The browser creates a new connection each time it wants to make a new request (for a page, image, etc.)
  Exceptions:
    HTTP/1.1 added support for persistent connections and pipelining
    Clients and servers might keep state information
    Cookies provide a way of recording state

The HTTP protocol: more
  HTTP uses the TCP transport service:
    1. The client initiates a TCP connection (creates a socket) to the server, port 80
    2. The server accepts the TCP connection from the client
    3. HTTP messages (application-layer protocol messages) are exchanged between the browser (HTTP client) and the Web server (HTTP server)
    4. The TCP connection is closed
  HTTP is stateless: the server maintains no information about past client requests
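The connect, send, receive, close sequence described above can be sketched with Python's standard socket module. This is an illustration under stated assumptions, not a complete client: the function names build_get_request and fetch are hypothetical, the Host header is added because most servers expect it even though HTTP/1.0 does not require it, and the host passed to fetch is whatever server the reader chooses.

```python
import socket


def build_get_request(host, path="/"):
    """Build the bytes of a minimal HTTP/1.0 GET request."""
    lines = [f"GET {path} HTTP/1.0", f"Host: {host}", "", ""]
    # Header lines end with CRLF; an empty line terminates the header section.
    return "\r\n".join(lines).encode("ascii")


def fetch(host, path="/", port=80, timeout=5.0):
    """Open a TCP connection, send one request, read until the server closes."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:          # empty read: the server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


if __name__ == "__main__":
    # Only builds and prints the request; call fetch(...) to contact a real server.
    print(build_get_request("www.example.org", "/index.html").decode("ascii"))
```

Because each call opens and closes its own connection and nothing is remembered between calls, the sketch also mirrors the stateless, one-request-per-connection behaviour of HTTP/1.0.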

HTTP
  GET /path/to/file/index.html HTTP/1.0
    GET: the HTTP method
    /path/to/file/index.html: the path, i.e. the part of the URL after the hostname (the request URI)
    HTTP/1.0: the HTTP version

HTTP Session
    jneves@bart(1)$ telnet www.inescporto.pt 80
    [...]
    GET /~jneves/index.html HTTP/1.0
    From: Joao.Neves@xptoo.org
    User-Agent: Camachina/5.0

    HTTP/1.1 200 OK
    Date: Tue, 26 May 2009 18:06:13 GMT
    Server: Apache/2.30 (Unix) PHP/5.5 DAV/2 mod_perl/2.9 Perl/v5.20
    Last-Modified: Fri, 04 May 2007 18:41:20 GMT
    Accept-Ranges: bytes
    Content-Length: 91
    Connection: close
    Content-Type: text/html

    <html>
    <head>
    <meta HTTP-EQUIV="REFRESH" content="0; url=./index.shtml">
    </head>
    </html>
    Connection closed by foreign host.
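A response like the one in the session above has three parts: a status line, header lines, and a body after the first empty line. The sketch below splits a response text along those lines; parse_response is a hypothetical helper, and it assumes CRLF line endings and no repeated or folded headers.

```python
def parse_response(raw):
    """Split a raw HTTP response into (version, code, reason, headers, body)."""
    # The header section ends at the first empty line (CRLF CRLF).
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Status line: version, numeric code, then a reason phrase that may contain spaces.
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return version, int(code), reason, headers, body


if __name__ == "__main__":
    sample = (
        "HTTP/1.1 200 OK\r\n"
        "Date: Tue, 26 May 2009 18:06:13 GMT\r\n"
        "Content-Type: text/html\r\n"
        "\r\n"
        "<html></html>"
    )
    print(parse_response(sample))
```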

HTTP Request Headers
  Header               Description
  From                 RFC 822 e-mail address of the user
  User-Agent           Client software
  Accept               File types that the client will accept, e.g., text/plain, text/html
  Accept-Encoding      Compression methods, e.g., x-compress; x-zip
  Accept-Language      Language(s) used (optional)
  Referer              URL of the document (or element within the document) from which the URL in the request was obtained
  If-Modified-Since    Return the document only if modified since the specified date
  Content-Length       Length in octets of the data to follow
  Content-Type         Type of the item
  Pragma: no-cache     Directive understood by a proxy server; when present, the proxy should not return a document from the cache

HTTP Response Headers
  Header               Description
  Server               Server software
  Date                 Current date
  Last-Modified        Modification date of the document
  Expires              Document expiration date
  Location             The location of the document in redirection responses
  Pragma               A hint, e.g. no-cache
  MIME-Version
  Link                 URL of the document's parent
  Content-Length       Length in octets
  Allow                Requests that the user can issue, e.g., GET
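If-Modified-Since and Last-Modified work together: the client sends the date of its cached copy, and the server compares it with the document's modification date. A minimal sketch of that comparison follows. The function name conditional_status is hypothetical, and the 304 Not Modified status it returns is part of HTTP but is not listed in these slides.

```python
from email.utils import parsedate_to_datetime


def conditional_status(if_modified_since, last_modified):
    """Return 304 if the document is unchanged since the client's copy, else 200."""
    if if_modified_since is None:
        return 200  # unconditional request: always send the document
    client_time = parsedate_to_datetime(if_modified_since)
    server_time = parsedate_to_datetime(last_modified)
    # Not modified after the client's date: the cached copy is still valid.
    return 304 if server_time <= client_time else 200


if __name__ == "__main__":
    last_modified = "Fri, 04 May 2007 18:41:20 GMT"
    print(conditional_status("Tue, 26 May 2009 18:06:13 GMT", last_modified))
    print(conditional_status(None, last_modified))
```

Both dates are assumed to be in the HTTP date format with a GMT suffix, as in the session shown earlier.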

HTTP Status Codes
  Code   Text
  2xx    Success
  3xx    Redirection
    301  Moved Permanently
    302  Found
  4xx    Client Errors
    400  Bad Request
    401  Unauthorized
    404  Not Found
  5xx    Server Errors
    500  Internal Server Error
    502  Bad Gateway
    503  Service Unavailable (e.g., server overloaded)

HTTP over TLS
    bash-4.0# openssl s_client -connect secure.xptoo.org:443 -showcerts
    CONNECTED(00000004)
    [...]
    ---
    GET / HTTP/1.0

    HTTP/1.1 200 OK
    [...]
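The first digit of a status code determines its class, so a client can react sensibly even to a code it does not know in detail. A small sketch of that rule follows; status_class is a hypothetical name, and the 1xx Informational class is an addition from HTTP/1.1 that the status code table above does not list.

```python
def status_class(code):
    """Map a three-digit HTTP status code to the name of its class."""
    classes = {
        1: "Informational",  # HTTP/1.1 only; not listed in the table above
        2: "Success",
        3: "Redirection",
        4: "Client Error",
        5: "Server Error",
    }
    # Integer division by 100 keeps only the leading digit of the code.
    return classes.get(code // 100, "Unknown")


if __name__ == "__main__":
    for code in (200, 302, 404, 500):
        print(code, status_class(code))
```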

HTTP 1.1 Features
  Persistent TCP connections: remain open for multiple requests
  Partial document transfers: clients can specify start and stop positions
  Conditional fetch: several additional conditions
  Better content negotiation
  More flexible authentication

Static vs. Dynamic Pages
  HTML pages vs. database
  Personalized
  Context-aware services
  Browsing
  Device-dependent
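Partial document transfers in HTTP/1.1 are requested with a Range header whose start and stop positions are inclusive byte offsets. The sketch below shows how such a header could be built and applied to a document held in memory; range_header and apply_range are hypothetical names, and only the simple "bytes=start-stop" and "bytes=start-" forms are handled.

```python
def range_header(start, stop):
    """Build the value of a Range header for bytes start..stop (inclusive)."""
    return f"bytes={start}-{stop}"


def apply_range(body, header):
    """Return the slice of body selected by a simple byte-range header value."""
    unit, _, spec = header.partition("=")
    if unit != "bytes":
        raise ValueError("only byte ranges are handled in this sketch")
    first, _, last = spec.partition("-")
    start = int(first)
    # An omitted stop position means "through the end of the document".
    end = int(last) if last else len(body) - 1
    return body[start:end + 1]  # +1 because HTTP byte ranges are inclusive


if __name__ == "__main__":
    document = b"0123456789"
    print(apply_range(document, range_header(2, 5)))
    print(apply_range(document, "bytes=7-"))
```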

HTTP Proxy
  An intermediary program which acts as both a server and a client for the purpose of making requests on behalf of other clients
  Requests are serviced internally or by passing them on, with possible translation, to other servers
  A proxy must implement both the client and server requirements of the HTTP specification
  The client makes a request to the proxy server using the complete URL
  The proxy server connects to the remote server and requests the resource relative to that server (no protocol and hostname in the URL)

HTTP Proxy (message flow)
  Client -> Proxy:   GET http://hostname/path/to/file.html HTTP/1.0
  Proxy  -> Server:  GET /path/to/file.html HTTP/1.0
  Server -> Proxy:   HTTP/1.0 200 Document...
  Proxy  -> Client:  HTTP/1.0 200 Document...
  The server serves dir/file.html from its WebRoot
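The rewriting step a proxy performs, from the complete URL it receives to the server-relative request it forwards, can be sketched as follows. The function name rewrite_for_origin is hypothetical, and the sketch assumes a well-formed request line with an http URL; port 80 is used when the URL names no port.

```python
from urllib.parse import urlsplit


def rewrite_for_origin(request_line):
    """Turn a proxy-style request line into ((host, port), origin request line)."""
    method, target, version = request_line.split(" ")
    parts = urlsplit(target)
    # Keep only the path (and query, if any); drop the protocol and hostname.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    address = (parts.hostname, parts.port or 80)
    return address, f"{method} {path} {version}"


if __name__ == "__main__":
    print(rewrite_for_origin("GET http://hostname/path/to/file.html HTTP/1.0"))
```

The returned address tells the proxy where to open its own TCP connection, and the rewritten line is what it sends over that connection.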

HTTP Proxy + Cache (message flow)
  Client -> Proxy:   GET http://hostname/path/to/file.html HTTP/1.0
  Proxy  -> Server:  GET /path/to/file.html HTTP/1.0
  Server -> Proxy:   HTTP/1.0 200 Document...
  Proxy  -> Client:  HTTP/1.0 200 Document...
  The server serves dir/file.html from its WebRoot; the proxy keeps a copy in its cache

HTTP Proxy
  Transparent
  Configured (http://proxy.xptoo.org:3128/)
  Automatic (Web Proxy AutoDiscovery)
  Advantages vs. disadvantages
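The proxy-plus-cache arrangement can be illustrated with a small in-memory cache keyed by URL. This is a sketch under simplifying assumptions: the class name ProxyCache, the fixed time-to-live, and the injected fetch function are all hypothetical, and a real proxy would also honour headers such as Expires and Pragma: no-cache.

```python
import time


class ProxyCache:
    """Serves repeated requests for a URL from memory until the entry expires."""

    def __init__(self, fetch, ttl=60.0, clock=time.monotonic):
        self._fetch = fetch    # function that retrieves a URL from the origin server
        self._ttl = ttl        # seconds a cached document stays fresh (assumed policy)
        self._clock = clock
        self._store = {}       # url -> (time stored, document)

    def get(self, url):
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]            # cache hit: no traffic to the origin server
        document = self._fetch(url)    # cache miss or stale entry: go to the origin
        self._store[url] = (now, document)
        return document


if __name__ == "__main__":
    origin_requests = []

    def fake_origin(url):
        origin_requests.append(url)
        return "Document..."

    cache = ProxyCache(fake_origin)
    cache.get("http://hostname/path/to/file.html")
    cache.get("http://hostname/path/to/file.html")
    print(len(origin_requests))  # the second request was answered from the cache
```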

Why Web Caching (Proxies)?
  Assume: the cache is close to the client (e.g., in the same network)
  Smaller response time: the cache is closer to the client
  Decreased traffic to distant servers: the link out of the institutional/local ISP network is often the bottleneck
  [Diagram: institutional network with a 10 Mb/s LAN and an institutional cache, connected to the Internet and the origin servers by a 1.5 Mb/s access link (bottleneck)]

Web Load Handling
  Thousands of clients
  Load sharing
    DNS Round Robin
    Web Switching (L4, L7)
  Load Balancing Devices
    Nortel Alteon
    A10 Networks
    Cisco Content Switching
    ...
  Akamai
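DNS round robin shares load by answering successive lookups of the same name with the server addresses in rotated order, so that different clients try different servers first. The sketch below imitates that behaviour in memory; RoundRobinResolver and the addresses used are hypothetical, and no real DNS queries are made.

```python
from collections import deque


class RoundRobinResolver:
    """Returns the same address list on each lookup, rotated by one position."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def resolve(self):
        answer = list(self._addresses)
        # Rotate left so the next lookup starts with the next server.
        self._addresses.rotate(-1)
        return answer


if __name__ == "__main__":
    resolver = RoundRobinResolver(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
    for _ in range(4):
        # Clients usually connect to the first address in the answer.
        print(resolver.resolve()[0])
```

This spreads new connections evenly but takes no account of server load or failures, which is one reason the dedicated L4/L7 switching devices listed above exist.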
