How To Find Just About Anything
Information Highway
There's a good reason the World Wide Web is called the information highway. With its many hundreds of millions of pages there for the choosing and perusing, you can find out just about anything you'd want to know. The downside of all this is that sifting through millions and millions of pages is a daunting prospect when you're looking for that one piece of trivia. To top it off, the authors of web pages may give their compositions less than descriptive names, making it hard to glean their subject matter at a glance.
So, say you want to read up on a specific topic. How are you going to know which pages are your best bets for finding out what you want to know? It's simple—you turn yourself over to an Internet search engine.
Three Basic Tasks
Search engines are designed to sift through all that vast material to help you home in on the specific information you're seeking. Not all search engines are created equal, however, and the various search engines work in different ways. No matter how the goal is accomplished, search engines all carry out the same three tasks:
*Search engines look for keywords scattered throughout the Internet
*Search engines maintain an index of all words found as well as their location
*Search engines enable users to seek words or chains of words within the engine's index
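The three tasks above can be sketched in a few lines of code. This is a minimal illustration, not how any real search engine is built: the sample "pages" and function names are hypothetical stand-ins, and the index here is a simple in-memory mapping from each word to the pages that contain it.

```python
def build_index(pages):
    """Map each lowercased word to the set of page names containing it."""
    index = {}
    for name, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(name)
    return index

def search(index, query):
    """Return pages containing every word in the query (AND semantics)."""
    words = query.lower().split()
    if not words:
        return set()
    results = index.get(words[0], set()).copy()
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical stand-ins for crawled Web pages.
pages = {
    "page1": "search engines index the web",
    "page2": "spiders crawl the web for keywords",
    "page3": "engines answer keyword queries",
}

index = build_index(pages)
print(sorted(search(index, "web")))          # → ['page1', 'page2']
print(sorted(search(index, "engines web")))  # → ['page1']
```

Real engines refine this basic structure in many ways, for example by storing where each word appears on a page and ranking the results, but the word-to-location index is the core idea.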
In the early days of search engines, each maintained an index of a mere 300,000 documents and pages, and received some 2000 queries every day. This is a far cry from modern times in which the most popular search engine adds hundreds of millions of pages to its index on a daily basis and performs tens of millions of searches on behalf of its users each day. Let's take a look at how this all works.
Data Collection
Most people refer to the Internet when they are really talking about the World Wide Web--the most visible component of the Internet. Before the Web became so prominent, people in the know were already employing search engines to find information. Two of the most prominent programs for this purpose were named "gopher" and "Archie." These programs collected data and kept it indexed, making a huge difference in how long it took users to find the documents or programs they sought. The savvy computer user of the late 1980s used these engines to maximize their use of the Internet. Today's Internet users are focused on searching the Web, and use search engines that help them navigate the contents of Web pages.
Search engines use special robots called "spiders" to tabulate the words they find on websites. The busy work of these spiders, going through vast quantities of material and compiling lists, is called Web crawling. Because a search engine's spiders need to look at so much material, they are designed to use as their starting point the servers that receive the most traffic and the pages viewed most often. The spider indexes the words on these pages, and proceeds to follow any links found at the site. In this manner, the spider begins to travel throughout the Web, lapping up information and compiling user-friendly indices for Web surfers.
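The crawling process just described can be illustrated with a toy example. This sketch assumes a hypothetical in-memory link graph in place of real HTTP fetches: the spider starts from a popular "seed" page, indexes its words, and queues up the links it finds there.

```python
from collections import deque

# Hypothetical in-memory "Web": page name -> (page text, outgoing links).
site = {
    "home": ("popular portal page", ["news", "sports"]),
    "news": ("breaking news stories", ["weather"]),
    "sports": ("sports scores today", []),
    "weather": ("local weather forecast", []),
}

def crawl(site, seed):
    """Breadth-first crawl: index each page's words, then follow its links."""
    index, visited = {}, set()
    queue = deque([seed])
    while queue:
        page = queue.popleft()
        if page in visited or page not in site:
            continue
        visited.add(page)
        text, links = site[page]
        for word in text.lower().split():
            index.setdefault(word, set()).add(page)
        queue.extend(links)  # the spider follows links found on the page
    return index

index = crawl(site, "home")
print(sorted(index["weather"]))  # → ['weather']
```

Starting from the high-traffic seed page mirrors how real spiders prioritize heavily visited servers; the "visited" set keeps the spider from indexing the same page twice when pages link to each other.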