Client-Side Dynamic Web Page Generation

CGI, PHP, JSP, and ASP scripts solve the problem of handling forms and interactions with databases on the server. They can all accept incoming information from forms, look up information in one or more databases, and generate HTML pages with the results. What none of them can do is respond to mouse movements or interact with users directly. For this purpose, it is necessary to have scripts embedded in HTML pages that are executed on the client machine rather than the server machine. Starting with HTML 4.0, such scripts are permitted using the tag <script>.

The most popular scripting language for the client side is JavaScript. JavaScript is a scripting language, very loosely inspired by some ideas from the Java programming language. It is definitely not Java. Like other scripting languages, it is a very high-level language. For example, in a single line of JavaScript it is possible to pop up a dialog box, wait for text input, and store the resulting string in a variable.

The JavaScript language has all the power of C or Java. It has variables, strings, arrays, objects, functions, and all the usual control structures. It also has a large number of facilities specific to Web pages, including the ability to manage windows and frames, set and get cookies, deal with forms, and handle hyperlinks. High-level features like these make JavaScript ideal for designing interactive Web pages. On the other hand, although the core language is standardized as ECMAScript, differences between browser implementations have made it difficult to write JavaScript programs that work on all platforms, but maybe one day it will stabilize.

The difference between server-side scripting and client-side scripting is illustrated in Fig (1), including the steps involved. In both cases, the numbered steps start after the form has been displayed. Step 1 consists of accepting the user input. Then comes the processing of the input, which differs in the two cases.

Fig (1) (a) Server-side scripting with PHP. (b) Client-side scripting with JavaScript.
This difference does not mean that JavaScript is better than PHP. Their uses are completely different. PHP (and, by implication, JSP and ASP) is used when interaction with a remote database is needed. JavaScript is used when the interaction is with the user at the client computer. It is certainly possible (and common) to have HTML pages that use both PHP and JavaScript, although they cannot do the same work or own the same button, of course.

JavaScript is not the only way to make Web pages highly interactive. Another popular method is through the use of applets, which are small Java programs that can be embedded in HTML pages (between <applet> and </applet>). Because Java applets are interpreted rather than directly executed, the Java interpreter can prevent them from doing Bad Things, at least in theory. In addition, Java applets are more portable than JavaScript programs.

Microsoft's answer to Sun's Java applets was allowing Web pages to hold ActiveX controls, which are programs executed on the bare hardware. ActiveX controls are faster and more flexible than interpreted Java applets because they can do anything a program can do. When Internet Explorer sees an ActiveX control in a Web page, it downloads it, verifies its identity, and executes it. However, downloading and running foreign programs raises security issues. As a general rule, JavaScript programs are easier to write, Java applets execute faster, and ActiveX controls run fastest of all.

Before leaving the subject of dynamic Web content, let us briefly summarize what we have covered so far. Complete Web pages can be generated by various scripts on the server machine. Once they are received by the browser, they are treated as normal HTML pages and just displayed. The scripts can be written in Perl, PHP, JSP, or ASP, as shown in Fig (2).

Fig (2) The various ways to generate and display content.
Dynamic content generation is also possible on the client side. Web pages can be written in XML and then converted to HTML according to an XSL file. (XSL stands for eXtensible Stylesheet Language, a style sheet language for XML documents.) JavaScript programs can perform arbitrary computations. Finally, plug-ins and helper applications can be used to display content in a variety of formats.

How the Browser Finds Things: URLs

URLs: Addresses for Web Pages

Before a browser can connect with a website, it needs to know the site's address, the URL. The URL (Uniform Resource Locator) is a string of characters that points to a specific piece of information anywhere on the web. In other words, the URL is the website's unique address. A URL consists of: (1) the web protocol, (2) the domain name or web server name, (3) the directory (or folder) on that server, and (4) the file within that directory (perhaps with an extension such as html or htm). Consider the following example of a URL for a website offered by the National Park Service for Yosemite National Park (located in California, USA):

http://www.nps.gov/yose/home.htm

Let's look at these elements.

The protocol: http://
A protocol is a set of communication rules for exchanging information. The web protocol appears at the beginning of some web addresses. It stands for HyperText Transfer Protocol (HTTP), the communications rules that allow browsers to connect with web servers. (Note: Most browsers assume that all web addresses begin with http://, and so you don't need to type this part; just start with whatever follows, such as www.)

The domain name (web server name): www.nps.gov/
A domain is simply a location on the internet, the particular web server. Domain names tell the location and the type of address. Domain-name components are separated by periods (called "dots"). The last part of the domain, called the top-level domain, is a three-letter extension that describes the domain type: .gov, .com, .net, .edu, .org, .mil, or .int, for government, commercial, network, educational, nonprofit, military, or international organization, respectively. In our example, the www stands for World Wide Web, of course; the .nps stands for National Park Service, and .gov is the top-level domain name indicating that this is a government website. Some top-level domain names also include a two-letter code extension for the country, for example .us for the United States and .ca for Canada. These country codes are optional.

The directory name: yose/
The directory name is the name on the server for the directory, or folder, from which a browser needs to pull the file.

The file name and extension: home.htm
The file is the particular page or document that you are seeking. Here it is home.htm, because you have gone to the home page, or welcome page. The .htm is an extension to the file name, and this extension informs the browser that the file is an HTML file.

Web Portals: Starting Points for Finding Information

There are many guidebooks for finding information on the web, sort of internet superstations known as web portals.

Types of Web Portals

A web portal is a type of gateway website that functions as an "anchor site" and offers a broad array of resources and services: online shopping malls, email support, community forums, current news and weather, stock quotes, travel information, and links to other popular subject categories. In addition, there are wireless portals, designed for web-enabled portable devices. An example is Yahoo! Mobile, which offers Yahoo! oneSearch, Yahoo! Maps, Yahoo! Entertainment, and so on. Yahoo! Mobile users can access not only email, calendar, news, and stock quotes but also Yahoo!'s directory of wireless sites, movies, and auctions.

Portals may be general public portals (horizontal portals or megaportals), such as Yahoo!, Google, Bing (formerly MSN), and AOL.
There are also specialized portals called vertical portals, or vortals, which focus on specific narrow audiences or communities such as ivillage.com for women, Burpee.com for gardeners, and searchnetworking.techtarget.com for network administrators.
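
As a brief aside on the URL anatomy described earlier, the four parts of a web address (protocol, domain name, directory, and file) can be separated mechanically. The following is a minimal sketch using Python's standard urllib.parse module. The function name split_url and the keys of the dictionary it returns are illustrative choices, not terminology from the text, and the sketch assumes a simple URL with one directory level like the Yosemite example.

```python
from urllib.parse import urlsplit
import posixpath


def split_url(url):
    """Break a simple URL into the parts named in the text (illustrative helper)."""
    parts = urlsplit(url)
    # Separate the path into its directory and the file at the end.
    directory, filename = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,                            # e.g. http
        "domain": parts.hostname,                            # e.g. www.nps.gov
        "top_level_domain": parts.hostname.split(".")[-1],   # e.g. gov
        "directory": directory.strip("/"),                   # e.g. yose
        "file": filename,                                    # e.g. home.htm
        "extension": posixpath.splitext(filename)[1],        # e.g. .htm
    }


for name, value in split_url("http://www.nps.gov/yose/home.htm").items():
    print(f"{name}: {value}")
```

Real-world URLs can also carry ports, query strings, and fragments, which this sketch deliberately ignores.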
Search Services & Search Engines, & How They Work

Search services are organizations that maintain databases accessible through websites to help you find information on the internet. Examples include not only parts of portals such as Yahoo! and Bing/MSN but also Google and Ask.com, to name just a few. Search services also maintain search engines, programs that enable you to ask questions or use keywords to help locate information on the web.

Search services compile their databases by using special programs called spiders (also known as crawlers, bots (for "robots"), or agents) that crawl through the World Wide Web, following links from one web page to another and indexing the words on each page. This method of gathering information has two important implications:

o A Search Never Covers the Entire Web
Whenever you are doing a search with a search engine, you are never searching the entire web. You are searching the index that the spiders built at some earlier time, which covers only the pages they reached. An exception is some news databases, such as Yahoo! News or Google Breaking News, which offer up-to-the-minute reports on a number of subjects. In addition, you should realize that there are a lot of databases whose material is not publicly available. Finally, a lot of published material from the 1970s and earlier has never been scanned into databases and made available.

o Search Engines Differ in What They Cover
Search engines list their results according to some kind of relevance ranking, and different search engines use different ranking schemes. Some search engines, for instance, rank web pages according to popularity (frequency of visits by people looking for a particular keyword), but others don't.
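
The spider-and-index process described above can be sketched in a few lines. The example below is a toy illustration, not how any real search service is implemented: the PAGES dictionary is a hypothetical in-memory stand-in for the web, the page addresses are made up, and the crawl and search functions are illustrative names. It follows links outward from a starting page, records which pages contain each word, and then answers keyword queries from that index alone.

```python
from collections import deque

# Hypothetical miniature "web": page address -> (page text, outgoing links).
PAGES = {
    "a.example/home": ("yosemite park hiking trails",
                       ["a.example/maps", "b.example/news"]),
    "a.example/maps": ("park maps and trails", ["a.example/home"]),
    "b.example/news": ("park news and weather", []),
    # No page links here, so a spider starting elsewhere never finds it.
    "c.example/private": ("unlinked yosemite archive", []),
}


def crawl(start):
    """Follow links breadth-first from start, indexing every word seen."""
    index = {}                 # word -> set of pages containing that word
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        text, links = PAGES[page]
        for word in text.split():
            index.setdefault(word, set()).add(page)
        for link in links:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, keyword):
    """Return the indexed pages containing keyword, in alphabetical order."""
    return sorted(index.get(keyword, set()))


index = crawl("a.example/home")
print(search(index, "park"))      # the three pages reachable by links
print(search(index, "yosemite"))  # the unlinked archive page is missing
```

The second query shows the first implication in miniature: the page c.example/private contains the keyword but is never returned, because no link leads to it and so it never entered the index.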
Four Web Search Tools: Individual Search Engines, Subject Directories, Metasearch Engines, & Specialized Search Engines

There are many types of search tools, but the most popular versions can be categorized as (1) individual search engines, (2) subject directories, (3) metasearch engines, and (4) specialized search engines. The most popular search sites, measured in share of visitors, are Google, Yahoo!, Bing, and Ask.

1. Individual Search Engines
An individual search engine compiles its own searchable database on the web. You search for information by typing one or more keywords, and the search engine then displays a list of web pages, or hits, that contain those keywords, ordered from most likely to least likely to contain the information you want. Hits are defined as the sites that a search engine returns after running a keyword search. Examples of this kind of search engine are Ask, Bing, Google, and Yahoo!. The search engine Ask allows users to ask questions in a natural way, such as: What is the population of the United States?

2. Subject Directories
Unlike a search engine, a subject directory is created and maintained by human editors, not electronic spiders, and allows you to search for information by selecting lists of categories or topics, such as "Health and Fitness" or "Science and Technology." Directories tend to be smaller than search engine databases, usually indexing only the top-level pages of a website. Subject directories are best for browsing and for searches of a more general nature. Examples of subject directories are Galaxy, Google Directory, and Yahoo! Directory.

3. Metasearch Engines
A metasearch engine allows you to search several search engines simultaneously. Metasearch engines are very fast and can give you a good picture of what's available across the web and where it can be found. Examples are Clusty, Mamma, MetaCrawler, and WebCrawler. Clusty organizes search results into groups or clusters. Thus, for example, if you do a search on the word "Indians," the top results will be grouped into clusters such as Tribe, Native Americans, Baseball, and Indian Students.

4. Specialized Search Engines
There are also specialized search engines, which help locate specialized subject matter, such as material about movies, health, and jobs. These overlap with the specialized portals, or vortals, we discussed above.
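
To make the metasearch idea from item 3 concrete, here is a minimal sketch of combining the ranked hit lists of several engines into one list. The scoring rule (each engine contributes 1/rank for every page it returns) is an assumption chosen for simplicity, not the method used by any of the metasearch engines named above. The engine result lists and site names are made up for illustration.

```python
def merge_results(result_lists):
    """Merge ranked hit lists; each engine adds 1/rank to a page's score."""
    scores = {}
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / rank
    # Highest combined score first; ties are broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical ranked hits from two individual search engines.
engine_a = ["x.example", "y.example", "z.example"]
engine_b = ["y.example", "x.example", "w.example"]

merged = merge_results([engine_a, engine_b])
print(merged)
```

Pages that several engines agree on rise to the top of the merged list, while pages found by only one engine still appear further down, which is how a metasearch engine gives a picture of what is available across more than one index.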