Sunday, February 15, 2009

How Do Search Engines Work?

(My Original Blog Post: How Do Search Engines Work?)
In most cases it is the search engines that finally bring your website to the notice of prospective customers. It therefore helps to know how search engines actually work and how they present information to a potential customer who starts a search.

There are basically two types of search engines. The first type is powered by robots called crawlers or spiders.

Search engines use spiders to index websites on the web. When you submit your website pages to a search engine through its submission page, or (in most cases) when the search engine finds a link to your site on another site, the search engine's spider indexes your site by following the links from one page to another. A 'spider' is an automated program that is run by the search engine and follows rules the search engine has set for it. The spider visits a website, reads the content on the site and the site's meta tags, and follows the internal and external links placed on the site. It then returns all that information to a central repository, where the data is indexed, much as a public library indexes its holdings. It will visit each link you have on your website and index those sites and pages as well, unless specifically instructed not to by the 'nofollow' attribute. Some spiders will only index a certain number of pages on your site, so don't create a site with 500 pages just for indexing!
Instead, create pages for humans; if the pages have quality content, spiders will find the information too.
The spider periodically returns to the sites to check for any information that has changed. How often this happens is determined by the rules of the search engine.
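
To make the crawling steps above concrete, here is a minimal sketch in Python. It is not how any real search engine's spider is written. FAKE_WEB is a made-up in-memory stand-in for the web (a real spider would fetch pages over HTTP), and the names LinkExtractor, crawl and max_pages are my own, chosen for illustration. It follows links page by page, skips links marked rel="nofollow", and stops after a page limit.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical stand-in for the web: URL -> HTML. Not real pages.
FAKE_WEB = {
    "http://example.com/": (
        '<a href="http://example.com/about">About</a> '
        '<a href="http://example.com/private" rel="nofollow">Private</a>'
    ),
    "http://example.com/about": (
        '<p>About us</p><a href="http://example.com/">Home</a>'
        '<a href="http://example.com/products">Products</a>'
    ),
    "http://example.com/products": "<p>Our products</p>",
    "http://example.com/private": "<p>Not meant to be followed</p>",
}


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags, skipping rel="nofollow" links."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        rel = (attrs.get("rel") or "").lower().split()
        if href and "nofollow" not in rel:
            self.links.append(href)


def crawl(start_url, max_pages=100):
    """Breadth-first crawl; returns {url: html} for the pages visited."""
    seen = {start_url}          # URLs already queued, to avoid loops
    queue = deque([start_url])  # URLs waiting to be visited
    store = {}                  # the "central repository" of fetched pages
    while queue and len(store) < max_pages:
        url = queue.popleft()
        html = FAKE_WEB.get(url)  # assumption: lookup replaces an HTTP fetch
        if html is None:
            continue
        store[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return store


pages = crawl("http://example.com/")
print(sorted(pages))
```

In this toy web the /private page is never visited, because the only link to it carries the 'nofollow' attribute, and passing a small max_pages shows how a spider might index only part of a site.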

A spider's index is almost like a book: it contains the table of contents, the actual content, and the links and references for all the websites the spider finds during its search. A spider may index up to a million pages a day or more.

Examples: Google, Yahoo, Live, Alexa, etc.

When you ask a search engine to find information, it is actually searching through the index it has created, not searching the Internet itself. Different search engines produce different rankings because not every search engine uses the same algorithm (or set of rules) to search through its index.
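
Here is a small Python sketch of what "searching the index, not the Internet" can look like. It builds a simple inverted index (word to set of pages) and answers queries from that index alone. The page names and texts are invented for the example, and tokenize, build_index and search are hypothetical helper names, not any search engine's actual API.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return pages containing every word of the query (index lookup only)."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Made-up pages, standing in for what a spider brought back.
pages = {
    "a.html": "Search engines use spiders to index websites",
    "b.html": "A spider follows links from page to page",
    "c.html": "The index is searched, not the live web",
}
index = build_index(pages)
print(sorted(search(index, "index")))
print(sorted(search(index, "index web")))
```

Note that search never touches the pages themselves once the index is built. Real engines add ranking on top of this lookup, and that ranking step is where their algorithms differ.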

One of the things a search engine algorithm scans for is the frequency and location of keywords on a web page, but it can also detect artificial keyword stuffing, or spam-indexing. This kind of practice by website owners is called black-hat SEO. The algorithm then analyzes the way pages link to other pages on the Web. By checking how pages link to each other, an engine can determine both what a page is about and, if the keywords of the linked pages are similar to the keywords on the original page, whether the page is of honest quality or web spam.
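
As a rough illustration of these signals, the Python sketch below counts keyword frequency, gives extra weight to keywords in the title (location), flags suspiciously high keyword density, and measures keyword overlap between two linked pages. The weights and the 0.3 density threshold are arbitrary values I picked for the example, and all function names are hypothetical. Real ranking algorithms are far more involved and are not public.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_density(text, keyword):
    """Fraction of the words in text that are the keyword."""
    words = tokenize(text)
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)


def looks_stuffed(text, keyword, max_density=0.3):
    """Naive stuffing check; the 0.3 threshold is an arbitrary assumption."""
    return keyword_density(text, keyword) > max_density


def keyword_score(title, body, keyword, title_weight=3.0):
    """Count keyword hits, weighting title hits more (assumed weight)."""
    kw = keyword.lower()
    return title_weight * tokenize(title).count(kw) + tokenize(body).count(kw)


def keyword_overlap(text_a, text_b):
    """Jaccard overlap of the word sets of two linked pages (0.0 to 1.0)."""
    a, b = set(tokenize(text_a)), set(tokenize(text_b))
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


print(looks_stuffed("cheap cheap cheap shoes", "cheap"))
print(looks_stuffed("we sell cheap shoes online", "cheap"))
print(keyword_score("Cheap shoes", "Buy cheap shoes here", "cheap"))
```

A higher keyword_overlap between a page and the pages it links to is a crude hint that the links are on topic. A page that scores well on keywords but links to unrelated pages would, under this toy model, look more like web spam.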
