How search engines operate

Search engines have two major functions: crawling and building an index, and providing search users with a ranked list of the websites they’ve determined are the most relevant.

Imagine the World Wide Web as a network of stops in a big city subway system.

Each stop is a unique document (usually a web page, but sometimes a PDF, JPG, or other file). The search engines need a way to “crawl” the entire city and find all the stops along the way, so they use the best path available: links.

  1. Crawling and Indexing: crawling and indexing the billions of documents, pages, files, news, videos, and media on the World Wide Web.
  2. Providing Answers: providing answers to user queries, most frequently through lists of relevant pages that they’ve retrieved and ranked for relevancy.

The link structure of the web serves to bind all of the pages together.

Links allow the search engines’ automated robots, called “crawlers” or “spiders,” to reach the many billions of interconnected documents on the web.
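To make link-following concrete, here is a minimal sketch of a crawler walking a tiny in-memory "web." It is an illustration only, not how any real engine is implemented: the `TOY_WEB` dictionary, its URLs, and the `crawl` function are all hypothetical, and a breadth-first traversal is just one plausible way to visit pages.

```python
from collections import deque

# Hypothetical toy "web": each URL maps to the URLs it links to.
TOY_WEB = {
    "a.example/": ["a.example/about", "b.example/"],
    "a.example/about": ["a.example/"],
    "b.example/": ["c.example/paper.pdf"],
    "c.example/paper.pdf": [],
    "orphan.example/": ["a.example/"],  # no page links to this one
}

def crawl(seed, web):
    """Follow links breadth-first from a seed URL; return pages in visit order."""
    seen = {seed}
    queue = deque([seed])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in web.get(url, []):
            if link not in seen:  # skip pages already queued or visited
                seen.add(link)
                queue.append(link)
    return order

discovered = crawl("a.example/", TOY_WEB)
print(discovered)
```

Starting from `a.example/`, the sketch reaches every page that some chain of links leads to. The page `orphan.example/` is never discovered because nothing links to it, which is why links matter so much for getting a page found.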

Once the engines find these pages, they decipher the code from them and store selected pieces in massive databases, to be recalled later when needed for a search query. To accomplish the monumental task of holding billions of pages that can be accessed in a fraction of a second, the search engine companies have constructed datacenters all over the world.
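One common way to store "selected pieces" of pages for fast recall is an inverted index, which maps each word to the pages that contain it. The sketch below assumes that technique for illustration; the text above does not say which data structures the engines actually use, and the sample `PAGES`, `tokenize`, `build_index`, and `lookup` names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical crawled pages: URL -> extracted text.
PAGES = {
    "uni-a.example": "State university admissions and tuition",
    "uni-b.example": "Private university research programs",
    "bakery.example": "Fresh bread baked daily",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Build an inverted index: word -> set of URLs containing that word."""
    index = defaultdict(set)
    for url, text in pages.items():
        for term in tokenize(text):
            index[term].add(url)
    return index

def lookup(index, query):
    """Return the URLs that contain every word in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

index = build_index(PAGES)
print(sorted(lookup(index, "university")))
```

Because the index is built ahead of time, answering a query is a matter of a few dictionary lookups and set intersections rather than rereading every page.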

These monstrous storage facilities hold thousands of machines processing large quantities of information very quickly. When a person performs a search at any of the major engines, they demand results instantaneously; even a one- or two-second delay can cause dissatisfaction, so the engines work hard to provide answers as fast as possible.

How do search engines determine relevance and popularity?

To a search engine, relevance means more than finding a page with the right words. In the early days of the web, search engines didn’t go much further than this simplistic step, and search results were of limited value. Over the years, smart engineers have devised better ways to match results to searchers’ queries. Today, hundreds of factors influence relevance, and we’ll discuss the most important of these in this guide.

Search engines typically assume that the more popular a site, page, or document, the more valuable the information it contains must be. This assumption has proven fairly successful in terms of user satisfaction with search results.

Popularity and relevance aren’t determined manually. Instead, the engines employ mathematical equations (algorithms) to sort the wheat from the chaff (relevance), and then to rank the wheat in order of quality (popularity).
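The two-step idea (filter by relevance, then order by popularity) can be sketched in a few lines. This is a deliberately simplified illustration: real algorithms weigh hundreds of factors, and here relevance is reduced to "shares a word with the query" while popularity is assumed to be a count of inbound links. The `PAGES` and `INBOUND_LINKS` data and the `rank` function are hypothetical.

```python
# Hypothetical page text and inbound-link counts (a stand-in for popularity).
PAGES = {
    "uni-a.example": "state university admissions",
    "uni-b.example": "private university research",
    "bakery.example": "fresh bread daily",
}
INBOUND_LINKS = {"uni-a.example": 120, "uni-b.example": 80, "bakery.example": 500}

def rank(query, pages, inbound_links):
    """Keep pages sharing a word with the query, then sort by popularity."""
    terms = set(query.lower().split())
    # Step 1 (relevance): discard pages with no query word at all.
    relevant = [
        url for url, text in pages.items()
        if terms & set(text.lower().split())
    ]
    # Step 2 (popularity): most inbound links first, URL as a tie-breaker.
    return sorted(relevant, key=lambda url: (-inbound_links.get(url, 0), url))

results = rank("university", PAGES, INBOUND_LINKS)
print(results)
```

Note that `bakery.example` has the most inbound links but never appears for the query "university": in this sketch popularity only orders pages that have already passed the relevance filter.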

These algorithms often comprise hundreds of variables. In the search marketing field, we refer to them as “ranking factors.” Moz crafted a resource specifically on this subject: Search Engine Ranking Factors.

Search engines are answer machines. When a person performs an online search, the search engine scours its corpus of billions of documents and does two things: first, it returns only those results that are relevant or useful to the searcher’s query; second, it ranks those results according to the popularity of the websites serving the information. It is both relevance and popularity that the process of SEO is meant to influence.

You can surmise that search engines believe that Ohio State is the most relevant and popular page for the query “Universities” while the page for Harvard is less relevant/popular.