The World Wide Web


Background

The World Wide Web (WWW) was first proposed by Tim Berners-Lee of CERN (European Laboratory for Particle Physics) in March 1989. Berners-Lee intended to create a wide-area hypermedia information retrieval system giving universal access to a large universe of documents. Originally aimed at the High Energy Physics community, it has spread to other areas and attracted much interest in user support, resource discovery, and collaborative work. Currently, it is the most advanced information system deployed on the Internet and is equipped to embrace many future advances in technology, including new networks, protocols, and data formats.

Although the WWW was originally intended only to link documents, recent advancements have expanded this concept. It is now possible to transmit picture files, audio files, and even movies stored in a variety of formats (provided the user has software that supports those formats). In addition, the WWW can support documents written in PostScript, ASCII, and HTML, among others.

Over the past few years, the WWW has grown at a phenomenal rate. As of January 1993, the WWW was ranked 127th of all network services in terms of sheer byte traffic. That August, the WWW was ranked 13th. In addition, the amount of byte traffic across the National Science Foundation's North American network attributed to WWW use increased 414-fold during this period ("Entering the World Wide Web: A Guide to Cyberspace", Kevin Hughes).

The Big Picture

Imagine thousands of computers worldwide wishing to view the same group of documents. Without any type of communication system, these documents would have to be located on each computer, thus requiring tremendous disk space and numerous copies of each document. Now imagine a system of common links connecting all of these computers together (the Internet). These computers can now "talk" to each other, and can be given access to the same documents without needing to store a copy on their own disks. But how do they get these documents? There are many transfer methods (protocols) to choose from, and each computer would need access to every protocol to obtain every available document. Alternatively, each computer could use a single information system that supports almost every available protocol (the World Wide Web). This is why the WWW was created.

Protocols Supported by the WWW

The first thing to decide when making information available is which protocol to use. Each protocol has its own strengths, depending on factors such as the intended use and the data formats involved. The following is a list of the protocols supported by the WWW, a description of their capabilities, and a suggestion of when each should be used.

WAIS
Description: The Wide Area Information Server (WAIS) automates the search and retrieval of many forms of electronic information over wide area networks. Its primary use is to search through documents for keywords and display any matches occurring in its database.

Information Retrieval: WAIS is useful for finding the documents in a particular database that relate to a given subject.

Information Dissemination: WAIS should be used when a person wishes to make a document available and expects the user will locate this document using a keyword search.
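
The keyword-matching idea behind WAIS can be illustrated with a minimal Python sketch. This is not the WAIS protocol or its actual ranking method; the search function, the sample database, and the simple count-based score are all hypothetical, chosen only to show documents being matched against the keywords of a query.

```python
import re

# Hypothetical in-memory "database": document title -> document text.
DOCUMENTS = {
    "Guide to NIST": "An overview of NIST laboratories and services.",
    "WWW Primer": "The World Wide Web links documents across the Internet.",
    "Gopher Notes": "Gopher lets users browse Internet resources using menus.",
}


def search(database, query):
    """Return titles ranked by how many query keywords each document contains."""
    keywords = set(re.findall(r"\w+", query.lower()))
    results = []
    for title, text in database.items():
        words = set(re.findall(r"\w+", (title + " " + text).lower()))
        score = len(keywords & words)  # number of distinct keywords matched
        if score:
            results.append((score, title))
    # Highest score first; ties are broken alphabetically by title.
    results.sort(key=lambda item: (-item[0], item[1]))
    return [title for _, title in results]
```

With this sample data, a query for "internet menus" would list "Gopher Notes" ahead of "WWW Primer", because the former contains both keywords and the latter only one.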

Network News
Description: The Network News protocol provides a list of Internet newsgroups sorted by area of interest. A newsgroup is a group of articles written by various users throughout the Internet on a specific topic.

Information Retrieval: Network News allows users to browse or read available news articles by simply clicking on them.

Information Dissemination: Network News is used only for newsgroups and does not support any other forms of information dissemination.

Gopher
Description: Gopher, also known as the "Internet Gopher", allows users to browse for resources using menus. When users find something they like, they can read or access it through Gopher without having to worry about domain names, IP addresses, etc.

Information Retrieval: Gopher is used when the geographical location of a document or group of documents is known. For example, to view "The Guide to NIST", first click on North America from the main menu, then USA, then Maryland, then NIST, then NIST General Information, and then Guide to NIST.

Information Dissemination: Gopher should be used when it is anticipated that the user will be accessing a document by its geographical location. Gopher can also support directory hierarchies, which allow documents to be grouped and therefore be easier to locate in a directory structure.
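
The menu-by-menu navigation described above can be sketched in Python as a walk through a nested structure. This only models the idea of a menu hierarchy; it does not speak the Gopher protocol, and the MENU contents and the follow_menus function are hypothetical, built from the "Guide to NIST" example.

```python
# Hypothetical menu tree: nested dicts are menus, strings are documents.
MENU = {
    "North America": {
        "USA": {
            "Maryland": {
                "NIST": {
                    "NIST General Information": {
                        "Guide to NIST": "(text of the guide)",
                    },
                },
            },
        },
    },
}


def follow_menus(menu, choices):
    """Select one menu entry per step and return what the last choice leads to."""
    item = menu
    for choice in choices:
        if not isinstance(item, dict):
            raise ValueError(f"{choice!r} selected, but the previous item is not a menu")
        if choice not in item:
            raise KeyError(f"no menu entry named {choice!r}")
        item = item[choice]
    return item
```

Passing the six choices from the example, in order, returns the document at the end of the path; passing only the first few returns the menu reached so far.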

Telnet
Description: Telnet allows users to connect to a specified remote machine and appear as if they were working from that machine. The user must have the necessary permissions (as determined by the system administrator) to telnet successfully.

Information Retrieval: Telnet is primarily used to run applications on remote machines that are not available on a user's machine. Some possible applications are archie, gopher, WAIS, and veronica. Specific logins are needed for different applications.

Information Dissemination: Telnet should be used to make an application available to the public. It should not be used to make non-executable files, such as publications, available.

Anonymous FTP
Description: File Transfer Protocol (FTP) transfers files to and from a remote network site. Anonymous FTP lets users who have no account on the remote site log in as anonymous instead of using a personal login name; the directories available to them are generally restricted through user permissions. This is permitted only when the remote site is set up for such a login.

Information Retrieval: FTP is used to transfer a file when the exact location of the file is known. Navigating through a directory structure and listing the files in a particular directory are supported. Viewing the contents of a file is not supported. FTP should only be used to transfer a file from one location to another.

Information Dissemination: Anonymous FTP should be used to make a file or group of files available to the public to copy. It should not be used when the user may wish to view the file before copying it.
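
As an illustration, an anonymous FTP transfer might look like the following Python sketch, which uses the standard ftplib module. The function names are hypothetical, and any host or path passed to them is an assumption of the caller; the sketch simply follows the steps described above (log in as anonymous, move to a directory, list it, copy one file).

```python
from ftplib import FTP
import posixpath


def split_remote_path(remote_path):
    """Split '/pub/docs/guide.txt' into its directory and its file name."""
    directory, filename = posixpath.split(remote_path)
    return directory or ".", filename


def fetch_anonymous(host, remote_path, local_path):
    """Copy one file from an anonymous FTP site to local_path."""
    directory, filename = split_remote_path(remote_path)
    with FTP(host) as ftp:      # connect on the default FTP port (21)
        ftp.login()             # no arguments: log in as "anonymous"
        ftp.cwd(directory)      # navigate the directory structure
        print(ftp.nlst())       # list the files in that directory
        with open(local_path, "wb") as out:
            # The file is transferred, not displayed.
            ftp.retrbinary(f"RETR {filename}", out.write)
```

A call such as fetch_anonymous("ftp.example.org", "/pub/docs/guide.txt", "guide.txt") would work only if that (made-up) site existed and permitted anonymous logins.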

Whois
Description: Whois is a simple Internet telephone book system. A list of colleges and organizations is provided as an entry menu, each linked to its respective electronic telephone listing. Once a location is selected and a name is entered, a keyword search is performed to find the telephone number of the person requested.

Information Retrieval: Whois is used to find out information about a particular person when their organization is known.

Information Dissemination: Whois should be used only to make information about staff (email addresses, phone numbers) available.

HTTP
Description: Hypertext Transfer Protocol (HTTP) is used to transfer documents written in Hypertext Markup Language (HTML). It has the lightness and speed necessary for a distributed collaborative hypermedia information system.

Information Retrieval: HTTP should be used with a WWW browser to access documents written in HTML when the URL is known.

Information Dissemination: HTTP is used for documents written in HTML. Such documents must reside on a server that supports HTTP access.
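
To give a feel for how lightweight an HTTP request is, the following Python sketch builds the text a client could send to a server to ask for a document. It is a simplified illustration that assumes a plain HTTP/1.0 GET request; build_get_request is a hypothetical helper, not part of any library, and the URL used with it below is made up.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Return (host, port, request_text) for a simple HTTP/1.0 GET of url."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        raise ValueError("this sketch only handles http:// URLs")
    host = parts.hostname
    port = parts.port or 80      # 80 is the default HTTP port
    path = parts.path or "/"     # an empty path means the server's root document
    if parts.query:
        path += "?" + parts.query
    request = (
        f"GET {path} HTTP/1.0\r\n"
        f"Host: {host}\r\n"
        "\r\n"                   # a blank line ends the request headers
    )
    return host, port, request
```

For the URL http://www.example.org/docs/index.html, the request text consists of the line "GET /docs/index.html HTTP/1.0", a Host line, and a terminating blank line, to be sent to port 80 of www.example.org.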

Clients, Servers, and How They Are Related

Overview

Computers communicate with each other by various methods. One of these is the client-server model, in which one system (the server) serves the requests of another system (the client). This model has been adopted in information dissemination and retrieval processes.

Clients

A client is a software package located on the user's machine which allows access to documents in the WWW. Clients usually provide an attractive user interface that makes moving through documents as easy as the touch of a button. Although Mosaic is a popular WWW client, there are many other clients available. Different clients (also known as browsers) are available for different platforms.

Servers

A server is a program, normally located on a remote machine, which responds to incoming connections from a client and provides a service. There are many varieties of WWW server software to serve different forms of data.

Relationship

Any user wishing to access information through the WWW must use a client. Likewise, any information made available must be offered through a server. When users type the URL of the document they wish to view, the client contacts the server which handles that URL and "asks" for the document using the specified protocol. The server finds the document and "hands" it to the client for viewing. The document only resides on the client's machine while being viewed. This entire process happens relatively quickly (between a couple of seconds and a couple of minutes) - even if the connection stretches to the other side of the world!
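
The exchange between client and server can be tried out on a single machine with Python's standard library. The sketch below is only an illustration under simplifying assumptions: the server runs on the local machine rather than a remote one, the protocol is HTTP, and the DOCUMENTS table, DocumentHandler class, and fetch_from_local_server function are hypothetical names invented for the example.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical documents held by the server: URL path -> HTML text.
DOCUMENTS = {"/guide.html": "<html><body>Guide to NIST</body></html>"}


class DocumentHandler(BaseHTTPRequestHandler):
    """Server side: find the requested document and hand it to the client."""

    def do_GET(self):
        body = DOCUMENTS.get(self.path)
        if body is None:
            self.send_error(404, "Document not found")
            return
        data = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet


def fetch_from_local_server(path):
    """Client side: start a local server, ask it for path, return (status, text)."""
    server = HTTPServer(("127.0.0.1", 0), DocumentHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", path)        # the client "asks" for the document
        response = conn.getresponse()    # the server "hands" it back
        text = response.read().decode("utf-8")
        conn.close()
        return response.status, text
    finally:
        server.shutdown()
        server.server_close()
```

Calling fetch_from_local_server("/guide.html") returns status 200 together with the document text, while a path the server does not hold yields status 404. The client keeps the document only in memory, which mirrors the point that the document resides on the client's machine only while being viewed.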

Further Reading

There are many other references which go into further detail regarding what the WWW and the Internet offer. Many of them are located online. Several books were also used to help write this document; complete citations are available in the references section.


Last Edited: Tuesday, 23-Jul-1996 13:55:14 EDT

Written by: Craig Schlenoff