General Search Engine Information


Search Engine: A machine “tuned” by humans to index web pages. For instance, Excite.

Algorithm: The way in which the search engine is “tuned”; that is, the set of rules it is programmed with to determine ranks. An algorithm may take only certain things into account – like keywords in the title or link popularity. Some engines use cyclical algorithms – meaning they may change algorithms from week to week.

Directory: A list of sites compiled by humans. For instance, Yahoo!

Spider: A spider goes to your site and finds your pages. It then stores those pages in a database for future retrieval by the search engine.

Indexing: The process in which the search engine takes the pages from the database that the spider has created and places them in an order based on the algorithms of that engine. Every search engine has a different indexing process, because each uses different algorithms, which is why you get different results in different engines.

Query: The keywords that a person types into a search box. A person is “querying” the search engine.

Crawling: When the spider follows the links from the page you submit – the spider is “crawling” your site.

Automatic Update: When the spider returns to your pages at periodic intervals to check to see if you’ve made any changes.

Optimizing: You can optimize, tune or configure your web pages for a specific search engine. This means that you are employing specific strategies for specific engines.

Examples of Spam

  1. Using the same keyword more than three times in your keywords tag.
  2. Putting keywords into your tags that have nothing to do with your actual page content.
  3. Using text, spacers, or borders the same color as the background.
  4. Using tiny text with keywords in an attempt to increase ranks.
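
The first item in the list above is easy to check before you submit a page. Here is a minimal sketch, assuming the keywords tag holds a comma-separated list and that “the same keyword” means an exact phrase match. The function name, the sample tag, and the default limit of three are illustrative choices, not any engine’s actual rule.

```python
from collections import Counter


def repeated_keywords(keywords_tag: str, limit: int = 3) -> dict:
    """Return the keywords that appear more than `limit` times in a keywords tag."""
    # Split the comma-separated tag content and normalise case and whitespace.
    words = [w.strip().lower() for w in keywords_tag.split(",") if w.strip()]
    counts = Counter(words)
    # Keep only the entries that exceed the limit.
    return {word: n for word, n in counts.items() if n > limit}


# Hypothetical tag content: "toys" is listed four times as its own entry.
tag = "toys, educational toys, toys, wooden toys, toys, toys"
print(repeated_keywords(tag))
```

Note that “educational toys” and “wooden toys” are counted as separate phrases here. An engine that counts individual words would be stricter.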

Search Engines v. Directories

There is a difference between a search engine and a directory. A search engine is a machine – or a “robot”. A human may program algorithms for a search engine, but a human will have nothing to do with your site when the spider is visiting your site or the engine is indexing your pages.

A directory can be compiled by a robot, but more often than not, it is compiled by humans. Yahoo! is a prime example of a directory. When you submit your site to Yahoo!, a human will review your site for consideration in their index.

The lines between search engines and directories are becoming blurred. This is because each major “search engine” is now associated with a “directory”. For instance, we used to call AltaVista a search engine. However, we have to be careful with that terminology. When you go to AltaVista and you type in a search – you are definitely getting results from the “engine” part of AltaVista. But when you search down through the “categories” – you haven’t typed anything into the “search-box” – you are now getting results from a directory (these results come from two directories – Open Directory Project and LookSmart).

There is a relationship between search results in the “engine” and the directory or directories that are associated with a particular search engine. It appears that many search engine algorithms have been set to include results based on the directory. Therefore, it is imperative that you are listed in the directory associated with each search engine.

What happens when I submit my site to a search engine?

First, the search engine’s spider will visit your site immediately, and schedule your site for inclusion in the search engine’s index.

Second, usually within a few weeks, the engine will place your site in their index.

Third, the spider will revisit your site to include any updates. Once you are included in the index, the spider will usually revisit every two weeks; this periodic revisit is also called “automatic update”. The spider will also begin to “crawl” your site by following the links from the page that you submitted. With Excite, these new updates seem to be included automatically once the spider has visited the site. However, if you are dealing with the Inktomi spider, Slurp, which gathers data for Hotbot, Snap, Yahoo!, and others, this information may not be included in each particular engine’s index for several weeks.

Fourth, when someone uses a search engine, they type “keywords” into the search box. They are submitting a query to a search engine. The search engine, depending on how it has been tuned, will pull up all of the relevant sites which pertain to that query.
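
The four steps above can be sketched as a toy program. This is only an illustration under heavy assumptions: the three-page site, its text, and the function names are made up, the “spider” reads from a dictionary instead of the web, and the query step does no ranking at all (it simply returns pages that contain every keyword).

```python
from collections import deque

# Hypothetical three-page site: each URL maps to (page text, outgoing links).
SITE = {
    "index.htm": ("educational toys for children", ["toys.htm", "about.htm"]),
    "toys.htm": ("wooden educational toys and puzzles", ["index.htm"]),
    "about.htm": ("about our family toy shop", []),
}


def crawl(start: str) -> dict:
    """Spider step: follow links from the submitted page and store each page found."""
    database, queue = {}, deque([start])
    while queue:
        url = queue.popleft()
        if url in database or url not in SITE:
            continue  # already stored, or a link to a page that does not exist
        text, links = SITE[url]
        database[url] = text
        queue.extend(links)
    return database


def build_index(database: dict) -> dict:
    """Indexing step: map each word to the set of pages that contain it."""
    index = {}
    for url, text in database.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(url)
    return index


def query(index: dict, keywords: str) -> list:
    """Query step: return the pages containing every keyword, sorted by URL."""
    word_sets = [index.get(word, set()) for word in keywords.lower().split()]
    if not word_sets:
        return []
    return sorted(set.intersection(*word_sets))


index = build_index(crawl("index.htm"))
print(query(index, "educational toys"))  # ['index.htm', 'toys.htm']
```

Only index.htm was “submitted”, yet all three pages end up in the database because the spider followed the links. A real engine would then apply its algorithm to order the matching pages.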

What if I don’t want a page indexed?

If you want to prevent a search engine from indexing certain pages, use a <meta name="robots" content="noindex"> tag on every page that you don’t want search engines to index. Better yet, you can very simply include a robots.txt file in your main (root) directory – in this way you can exclude all engines from certain parts of your site, or specific engines from specific pages.
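
To see how robots.txt rules are read, you can test a draft file with Python’s standard urllib.robotparser module before uploading it. This is a sketch only: the file contents, the paths, and the “ExampleBot” and “OtherBot” spider names are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the paths and the "ExampleBot" name are made up.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /drafts/

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Report whether the rules above let the spider `agent` fetch `path`."""
    return parser.can_fetch(agent, path)


# One specific spider is kept out of /drafts/; every other spider is kept out of /private/.
for agent, path in [
    ("ExampleBot", "/drafts/page.htm"),
    ("OtherBot", "/private/page.htm"),
    ("OtherBot", "/index.htm"),
]:
    print(agent, path, allowed(agent, path))
```

One thing to watch: a spider that has its own User-agent section follows only that section. In this sketch ExampleBot is not bound by the /private/ rule, so repeat any rule you want it to obey inside its own section.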

Variables That Affect Ranks

When you are tuning (from now on we’ll call it optimizing) your web pages for certain engines, you must always keep in mind that the frequency of your keywords in the text and the location of those keywords are the most important factors in how the engine will rank your pages. ALL search engines rank pages based on the frequency and location of keywords.
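
To make “frequency and location” concrete, here is a toy scoring function. It is a sketch, not any engine’s real algorithm: the bonus values, the ten-word cutoff for an “early” mention, and the sample pages are all assumptions chosen for illustration.

```python
def keyword_score(title: str, body: str, keyword: str) -> float:
    """Toy relevance score built from keyword frequency and keyword location."""
    keyword = keyword.lower()
    body_words = body.lower().split()
    title_words = title.lower().split()
    # Frequency: the share of body words that match the keyword.
    frequency = body_words.count(keyword) / len(body_words) if body_words else 0.0
    # Location: made-up bonuses for a title match and for an early mention.
    title_bonus = 1.0 if keyword in title_words else 0.0
    early_bonus = 0.5 if keyword in body_words[:10] else 0.0
    return frequency + title_bonus + early_bonus


# Hypothetical pages: URL -> (title, body text).
pages = {
    "a.htm": ("Educational Toys", "toys that teach wooden toys and puzzles"),
    "b.htm": ("Our Shop", "we sell books games and a few toys"),
}
ranked = sorted(pages, key=lambda url: keyword_score(*pages[url], "toys"), reverse=True)
print(ranked)  # ['a.htm', 'b.htm']
```

Page a.htm ranks first because “toys” appears in its title and makes up a larger share of its text. Change the bonus values and the order can change, which is one way to picture why different engines return different results.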

Some engines are also tuned to give a boost to pages that meet the following criteria:

1. link popularity

2. keywords in the title, most important keywords first

3. keywords in the names of the linked pages
for instance: <a href="educational-toys.htm">educational toys</a>

4. keywords in alt tags (the alt text of your images)

5. keywords as names of images
for instance: <img src="educational-toys.gif" alt="educational toys">

6. keywords in the description tag

7. keywords in the keywords tag, most important keywords first
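
To check where your keywords actually appear on a page, you can pull out the items in the list above with Python’s standard html.parser module. This is a minimal sketch: the class name, the layout of the result dictionary, and the sample page are hypothetical, and link popularity is left out because it depends on other sites.

```python
from html.parser import HTMLParser


class SignalParser(HTMLParser):
    """Collect the title, meta tags, image names, alt text and link targets of a page."""

    def __init__(self):
        super().__init__()
        self.signals = {"title": "", "meta": {}, "images": [], "alt": [], "links": []}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and name in ("description", "keywords"):
            self.signals["meta"][name] = attrs.get("content") or ""
        elif tag == "img":
            self.signals["images"].append(attrs.get("src") or "")
            self.signals["alt"].append(attrs.get("alt") or "")
        elif tag == "a" and attrs.get("href"):
            self.signals["links"].append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text is only collected while inside the <title> element.
        if self._in_title:
            self.signals["title"] += data


# Hypothetical page that uses the same examples as the list above.
PAGE = """
<html><head>
<title>Educational Toys</title>
<meta name="description" content="Wooden educational toys for children">
<meta name="keywords" content="educational toys, wooden toys">
</head><body>
<a href="educational-toys.htm">educational toys</a>
<img src="educational-toys.gif" alt="educational toys">
</body></html>
"""

parser = SignalParser()
parser.feed(PAGE)
print(parser.signals)
```

Running this against your own pages gives a quick view of which of the seven places already contain your most important keywords.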
