Sunday, April 12, 2009

Some Facts on Google Cache

Why a cache? Given the two hyperlinks to the underlying website, it may not seem particularly useful for the result set to contain a hyperlink to the potentially stale version in the Google cache. However, there may be times when the underlying website is offline, has had pages deleted, or has broken links. In these situations, having a cached backup can be a lifesaver.

The Google cache is basically a repository of crawled web pages.

When a user clicks on the Cached hyperlink, a disclaimer is posted at the top of the cached page. Webmasters can prevent Google from caching their site by using the ‘noarchive’ meta tag. Webmasters can also prevent Google from indexing their site at all by using the ‘noindex’ meta tag or putting the [...]
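To make the meta tags concrete, here is a minimal Python sketch that reads the robots directives out of a page's HTML. It only illustrates how such tags can be detected and is not how Google itself processes them. The class name `RobotsMetaParser`, the function `robots_directives`, and the sample page are made up for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> or <meta name="googlebot"> tags.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalize to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() in ("robots", "googlebot"):
            # The content attribute is a comma-separated list of directives.
            for directive in attr_map.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots directives declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Sample page (made up) that asks search engines not to keep a cached copy.
page = '<html><head><meta name="robots" content="noarchive"></head><body>Hello</body></html>'
directives = robots_directives(page)
print("cache allowed:", "noarchive" not in directives)
print("index allowed:", "noindex" not in directives)
```

A page that declares neither directive yields an empty set, so both checks come out as allowed.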

In its unending quest to index the billions of web pages available on the Internet, Googlebot locates, crawls, analyzes, and catalogs web pages in Google's searchable index. Consequently, when you type in a search term on Google, rather than crawling the entire web for pages matching your criteria and returning a result set a month later, Google accesses its index using optimized algorithms and returns a result set in a matter of milliseconds.
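The difference between crawling at query time and looking up a prebuilt index can be sketched with a toy inverted index. This is a simplified illustration under stated assumptions, not Google's actual data structure or algorithm. The function names `build_index` and `search`, and the sample pages, are hypothetical.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it.

    `pages` is a dict of {url: page_text}. This is the slow step, done
    ahead of time, long before any query arrives.
    """
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query.

    Only the index is consulted; no page is fetched or re-read.
    """
    words = query.lower().split()
    if not words:
        return set()
    results = None
    for word in words:
        urls = index.get(word, set())
        results = set(urls) if results is None else results & urls
    return results


# Made-up sample pages standing in for crawled content.
pages = {
    "http://example.com/a": "google cache stores crawled pages",
    "http://example.com/b": "the index makes search fast",
    "http://example.com/c": "crawled pages feed the index",
}
index = build_index(pages)
print(sorted(search(index, "crawled pages")))
```

The cost of reading every page is paid once, while the index is built. Each query then reduces to a few dictionary lookups and set intersections, which is why answers can come back quickly.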

The result set, as shown below, usually contains the following six items: Title, Snippet, URL, Size, Similar Pages, and Cached (the last three are optional, depending on the site). As you can see from the illustration, the Title is the prominent part of the result set, and it also contains the hyperlink to the underlying website. The Snippet consists of one or two sentences from the Google index that have previously been extracted from the website.
