1. General SEO information
1.1 History of search engines
In the early days of Internet development, its users were a privileged
minority and the amount of available information was relatively small. Access
was mainly restricted to employees of various universities and laboratories who
used it to access scientific information. In those days, the problem of finding
information on the Internet was not nearly as critical as it is now.
Site directories were one of the first methods used to facilitate access to
information resources on the network. Links to these resources were grouped by
topic. Yahoo, opened in April 1994, was the first project of this kind. As the
number of sites in the Yahoo directory inexorably increased, the developers of
Yahoo made the directory searchable. Of course, it was not a search engine in
its true form because searching was limited to those resources whose listings
were put into the directory. It did not actively seek out resources, and the
concept of SEO was yet to arrive.
Such link directories have been used extensively in the past, but nowadays
they have lost much of their popularity. The reason is simple: even modern
directories with lots of resources only provide information on a tiny fraction
of the Internet. For example, the largest directory on the network is currently
DMOZ (or Open Directory Project). It contains information on about five million
resources. Compare this with the Google search engine database containing more
than eight billion documents.
The WebCrawler project started in 1994 and was the first full-featured search
engine. The Lycos and AltaVista search engines appeared in 1995 and for many
years AltaVista was the major player in this field.
In 1997 Sergey Brin and Larry Page created Google as a research project at
Stanford University. Google is now the most popular search engine in the world.
Currently, there are three leading international search engines: Google,
Yahoo and MSN Search. They each have their own databases and search algorithms.
Many other search engines use results originating from these three major search
engines, and the same SEO expertise can be applied to all of them. For example,
the AOL search engine (search.aol.com) uses the Google database while AltaVista,
Lycos and AllTheWeb all use the Yahoo database.
1.2 Common search engine principles
To understand SEO, you need to be aware of the architecture of search engines.
They all contain the following main components:
Spider - a browser-like program that downloads web pages.
Crawler - a program that automatically follows all of the links on
each web page.
Indexer - a program that analyzes web pages downloaded by the spider
and the crawler.
Database - storage for downloaded and processed pages.
Results engine - extracts search results from the database.
Web server - a server that is responsible for interaction between the
user and other search engine components.
Specific implementations of search mechanisms may differ. For example, the
Spider+Crawler+Indexer component group might be implemented as a single program
that downloads web pages, analyzes them and then uses their links to find new
resources. However, the components listed are inherent to all search engines and
the SEO principles are the same.
Spider. This program downloads web pages just like a web browser. The
difference is that a browser displays the information presented on each page
(text, graphics, etc.) while a spider does not have any visual components and
works directly with the underlying HTML code of the page. You may already know
that there is an option in standard web browsers to view source HTML code.
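The spider's job can be illustrated with a minimal Python sketch. It is not how any particular search engine is implemented. The function name fetch_html, the injectable opener parameter and the fake_opener stand-in are assumptions made for this example so that it runs without network access.

```python
import io
import urllib.request


def fetch_html(url, opener=urllib.request.urlopen):
    """Return the raw HTML source of a page as text, without rendering it."""
    with opener(url) as response:
        raw_bytes = response.read()
    # Assumption: pages are UTF-8; undecodable bytes are replaced, not fatal.
    return raw_bytes.decode("utf-8", errors="replace")


def fake_opener(url):
    # Hypothetical stand-in for the network so the sketch runs offline.
    return io.BytesIO(b"<html><body><h1>Hello</h1></body></html>")


source = fetch_html("http://example.com/", opener=fake_opener)
print(source)  # the same markup a browser shows under "view source"
```

With the default opener, the same function would download a real page. Either way, the spider only ever sees markup, never the rendered layout.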
Crawler. This program finds all links on each page. Its task is to
determine where the spider should go either by evaluating the links or according
to a predefined list of addresses. The crawler follows these links and tries to
find documents not already known to the search engine.
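The crawler's behaviour described above can be sketched roughly as follows. This is a simplified illustration, not a real engine's algorithm: the names LinkExtractor, extract_links and crawl, the breadth-first visiting order and the in-memory PAGES dictionary (standing in for the web) are all assumptions of the example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the absolute target of every <a href="..."> on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own address.
                    self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


def crawl(seed_urls, fetch, max_pages=100):
    """Visit pages breadth-first, starting from a predefined list of addresses."""
    frontier = deque(seed_urls)   # addresses the spider should visit next
    seen = set(seed_urls)         # addresses already known to the engine
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:          # page could not be downloaded
            continue
        visited.append(url)
        for link in extract_links(url, html):
            if link not in seen:  # only queue documents not already known
                seen.add(link)
                frontier.append(link)
    return visited


# Hypothetical four-page site used instead of real network access.
PAGES = {
    "http://site.test/": '<a href="/a.html">A</a> <a href="b.html">B</a>',
    "http://site.test/a.html": '<a href="/">home</a> <a href="/c.html">C</a>',
    "http://site.test/b.html": "no links here",
    "http://site.test/c.html": '<a href="/a.html">back</a>',
}

order = crawl(["http://site.test/"], PAGES.get)
print(order)
```

The seen set is what keeps the crawler from revisiting pages that link back to each other, as a.html and c.html do here.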
Indexer. This component parses each page and analyzes the various
elements, such as text, headers, structural or stylistic features, special HTML
tags, etc.
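As a rough illustration of what analyzing the various elements might involve, the sketch below separates the words of a page by where they appear: title, headers or body text. The class name PageAnalyzer, the three word lists and the tokenization rule are assumptions of this example. A real indexer examines many more features.

```python
import re
from html.parser import HTMLParser


class PageAnalyzer(HTMLParser):
    """Splits a page's words into title, heading and body buckets."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
    SKIPPED_TAGS = {"script", "style"}  # their content is not visible text

    def __init__(self):
        super().__init__()
        self.title_words = []
        self.heading_words = []
        self.body_words = []
        self._open_tags = []

    def handle_starttag(self, tag, attrs):
        self._open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; this also discards
        # unclosed tags such as <br> or <img> nested inside it.
        if tag in self._open_tags:
            while self._open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        if any(tag in self.SKIPPED_TAGS for tag in self._open_tags):
            return
        words = re.findall(r"[a-z0-9]+", data.lower())
        if "title" in self._open_tags:
            self.title_words.extend(words)
        elif any(tag in self.HEADING_TAGS for tag in self._open_tags):
            self.heading_words.extend(words)
        else:
            self.body_words.extend(words)


def analyze_page(html):
    analyzer = PageAnalyzer()
    analyzer.feed(html)
    return analyzer


page = analyze_page(
    "<html><head><title>Search Basics</title>"
    "<style>p {color: red}</style></head>"
    "<body><h1>How Spiders Work</h1>"
    "<p>A spider downloads pages.</p></body></html>"
)
print(page.title_words)    # ['search', 'basics']
print(page.heading_words)  # ['how', 'spiders', 'work']
print(page.body_words)     # ['a', 'spider', 'downloads', 'pages']
```

Keeping the buckets separate lets a later ranking step treat a word in the title or a header differently from the same word in ordinary text.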
Database. This is the storage area for the data that the search engine
downloads and analyzes. Sometimes it is called the index of the search engine.
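One common way to organize such an index is an inverted index, which maps each word to the documents that contain it. The article does not say which structure any given engine uses, so the sketch below is only an assumption for illustration. The function name build_index and the sample documents are hypothetical.

```python
from collections import defaultdict


def build_index(documents):
    """Map each word to {document id: number of occurrences}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word][doc_id] = index[word].get(doc_id, 0) + 1
    return index


# Hypothetical, already-processed page texts.
documents = {
    "page1": "search engines index pages",
    "page2": "engines rank pages and pages",
}

index = build_index(documents)
print(index["pages"])  # {'page1': 1, 'page2': 2}
print(index["rank"])   # {'page2': 1}
```

Looking up a word is then a single dictionary access rather than a scan of every stored page.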
Results Engine. The results engine ranks pages. It determines
which pages best match a user's query and in what order the pages should be
listed. This is done according to the ranking algorithms of the search engine.
It follows that a page's rank is a valuable property, and it is what any SEO
specialist is most interested in when trying to improve a site's search
results. In this article, we will discuss the SEO factors that influence page
rank in some detail.
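To make the idea of ranking concrete, here is a deliberately naive sketch that orders pages by how often the query words occur in them. Real ranking algorithms are not public and weigh far more factors. The function name rank, the scoring rule and the sample pages are assumptions made purely for illustration.

```python
def rank(query, documents):
    """Return ids of matching documents, best match first."""
    terms = query.lower().split()
    scored = []
    for doc_id, text in documents.items():
        words = text.lower().split()
        # Naive score: total number of occurrences of the query terms.
        score = sum(words.count(term) for term in terms)
        if score > 0:
            scored.append((score, doc_id))
    # Highest score first; ties are broken alphabetically by id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored]


# Hypothetical page texts.
pages = {
    "a.html": "cheap flights and cheap hotels",
    "b.html": "flights to paris",
    "c.html": "garden tools",
}

print(rank("cheap flights", pages))  # ['a.html', 'b.html']
```

Even this toy version shows why the wording of a page affects where it appears: a.html outranks b.html only because it repeats the query terms more often.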
Web server. The search engine web server usually contains an HTML page
with an input field where the user can enter a search query. The web server
is also responsible for displaying search results
to the user in the form of an HTML page.
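The web server's two duties, showing an input field and returning results as an HTML page, can be sketched with Python's standard http.server module. This is a minimal illustration under stated assumptions: the search function is a hypothetical placeholder for the results engine, and the names render_page and SearchHandler are invented for this example.

```python
from html import escape
from http.server import BaseHTTPRequestHandler
from urllib.parse import parse_qs, urlparse


def search(query):
    # Hypothetical placeholder for the results engine.
    return ["http://site.test/a.html"] if query else []


def render_page(query, results):
    """Build the HTML page: a search form followed by the result list."""
    items = "".join(
        f'<li><a href="{escape(url)}">{escape(url)}</a></li>' for url in results
    )
    return (
        "<html><body>"
        '<form action="/" method="get">'
        f'<input name="q" value="{escape(query)}">'
        '<input type="submit" value="Search">'
        "</form>"
        f"<ol>{items}</ol>"
        "</body></html>"
    )


class SearchHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The query arrives in the URL, e.g. /?q=cheap+flights
        params = parse_qs(urlparse(self.path).query)
        query = params.get("q", [""])[0]
        body = render_page(query, search(query)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


page = render_page("cheap flights", search("cheap flights"))
print(page)
```

To try it in a browser, the handler could be served with HTTPServer(("localhost", 8000), SearchHandler).serve_forever() from the same module. The user's query is escaped before being echoed back into the page so that it cannot inject markup.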