What is Google Page Rank (PR)?

by Canonical SEO on August 25, 2009

Page Rank

Google Page Rank, commonly referred to as PR, was named after Larry Page, one of the cofounders of Google.  Page Rank was first described in Section 2.1 of the whitepaper “The Anatomy of a Large-Scale Hypertextual Web Search Engine”, which was written by Larry Page and Sergey Brin at Stanford.  This document was basically the blueprint of what became the Google search engine. 

Page Rank is Google’s measurement of the link popularity of a given URL.  At the time that the idea of PR was conceived, search engines had been using inbound link counts as a measurement of link popularity.  But Google changed things with Page Rank by not treating all inbound links equally. The PR of a URL measures not only the quantity of inbound links to that URL, but also the quality of each of those inbound links. 

Originally each outbound link from a page passed a predictable amount of PR to the target of  the link – essentially, the Page Rank of the page where the link was located divided by the number of outbound links on that page.  This has been refined over the years as Google modified their definition of the “quality” of links.  But the general concepts described in the original documentation of Page Rank still hold true. 
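
To make that original "divide it evenly" idea concrete, here is a minimal sketch.  The function name pr_passed_per_link and the sample numbers are my own, made up for illustration, and it ignores the damping factor and any link-quality weighting that Google may apply today.

```python
def pr_passed_per_link(page_pr: float, outbound_links: int) -> float:
    """Original, simplified idea: a page splits its PR evenly across its outbound links."""
    if outbound_links <= 0:
        return 0.0  # a page with no outbound links passes nothing
    return page_pr / outbound_links


# Hypothetical page with a PR of 6.0 and 3 outbound links.
share = pr_passed_per_link(6.0, 3)
print(share)  # 2.0
```

So the more outbound links a page has, the smaller the slice each link target receives.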

Google Page Rank: Live or Memorex?

Originally there was a single Page Rank for a given URL, but because of the Google Toolbar there are now two different kinds of Google Page Rank.  I like to use the “Is it live, or is it Memorex?” metaphor to describe them.  Hopefully, you remember the Memorex commercials back in the day when cassette tapes were the state of the art home recording medium. One version of Page Rank is live or real-time (the “actual” PR of your URL) and the other is like a Memorex recording from days long gone (the Google Toolbar PR of your URL). 

What is “actual” Page Rank?

The “actual” PR is what is used by Google as one of their 200+ ranking factors to rank your URL for a particular keyword phrase.   It is the important version of Google Page Rank, though its importance has diminished greatly from the early days when it played a more prominent part in the ranking algorithm.  Today the Page Rank of the URL being ranked is a VERY minor ranking factor.  It is very important to note that you will NEVER know what the “actual” Page Rank is for a given URL.

Your URL’s “actual” PR is constantly updated as Google crawls your URL, crawls URLs that link to your URL, crawls URLs that link to URLs that link to your URL, etc. ad infinitum.  Because the web is a mesh of interlinked web pages that allow cycles (A -> B -> A  or A -> B -> C -> A), calculating PR is a recursive, continuous process.  When a page at one URL adds or drops a link to some other page, it causes a waterfall of updates to “actual” PR for neighboring pages, the neighbors of those neighboring pages, and so on. 

How is “actual” Page Rank Calculated?

Page Rank was originally defined in the Stanford whitepaper linked to above by Brin and Page as follows:

We assume page A has pages T1…Tn which point to it (i.e., are citations). The parameter d is a damping factor which can be set between 0 and 1. We usually set d to 0.85. There are more details about d in the next section. Also C(A) is defined as the number of links going out of page A. The PageRank of a page A is given as follows:

PR(A) = (1-d) + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn))

Note that the PageRanks form a probability distribution over web pages, so the sum of all web pages’ PageRanks will be one.

Once you take the time to absorb the above definition, you see that several things are quite obvious.  A URL’s Page Rank depends on the pages that link to it… specifically it’s a function of the number of pages that link to it, the PR of those pages that link to it, and how many outbound links appear on each of the pages that link to it.  There is also a damping or decay factor which figures into the equation such that the total amount of PR passed out of a page on its outbound links is always a little less than the PR of the page itself. 
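
If it helps to see the formula in motion, below is a small sketch that simply applies PR(A) = (1-d) + d (PR(T1)/C(T1) + … + PR(Tn)/C(Tn)) over and over until the numbers settle.  The three-page "web", the function name pagerank, and the iteration count are assumptions of mine for illustration only.  This is the formula as quoted from the paper, not whatever Google actually runs today.

```python
def pagerank(links: dict[str, list[str]], d: float = 0.85, iterations: int = 100) -> dict[str, float]:
    """Repeatedly apply PR(A) = (1-d) + d * sum(PR(T)/C(T)) for every page T linking to A."""
    # Every page that links out or is linked to takes part in the calculation.
    pages = set(links) | {t for targets in links.values() for t in targets}
    pr = {page: 1.0 for page in pages}  # arbitrary starting guess
    for _ in range(iterations):
        new_pr = {}
        for page in pages:
            # Each linking page passes its PR divided by its number of outbound links.
            inbound = sum(
                pr[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_pr[page] = (1 - d) + d * inbound
        pr = new_pr
    return pr


# Hypothetical tiny web with a cycle: A links to B and C, B links to C, C links back to A.
web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(web)
for page in sorted(ranks):
    print(page, round(ranks[page], 3))  # roughly A 1.163, B 0.644, C 1.192
```

Notice that C ends up with the highest value because it collects links from both A and B, while B only gets half of what A passes out.  Also note that with the formula written this way, the values in this example add up to the number of pages (3) rather than to one; dividing each value by the page count would turn them into a probability distribution.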

Page Rank can be thought of as the probability that a web surfer who starts at a random URL and keeps clicking on links will land on a given page.  Page Rank also takes into account that at some point the surfer will likely get bored and enter another random URL.  In the formula above, d is the probability that the surfer keeps following links, so the probability that they stop clicking and enter another random URL is 1-d (0.15 when d is set to 0.85).  These same basic concepts generally hold true today.

The diagram below gives a simplified picture of how Google’s Page Rank is derived: 

Diagram showing how actual Page Rank is calculated

Disclaimer:  The above image does NOT take into account the damping or decay factor in an effort to make the example easier to understand.  It also does not take into account the various ways that individual links may be weighted depending on quality factors which might possibly include attributes like the trust, authority, etc. of the source of the link which have likely become part of the Google Page Rank algorithm as it has evolved over the years.

What is Google Toolbar Page Rank?

The Google Toolbar Page Rank is what most people refer to when they speak about PR.  It’s that little green bar you see in the Google Toolbar that every webmaster seems to freak out over… that SEO Holy Grail!  NOT!  

The Google Toolbar PR has displayed values of 0/10 – 10/10 (and also “Current page is not ranked by Google” for URLs with no visible PR).  This version of Page Rank is NOT used by Google for any type of ranking.  Its sole purpose is to be used by the Google Toolbar to give webmasters “a general idea” of what their URLs’ actual Page Ranks might be.  There are several things that should be noted about Google Toolbar PR.

The toolbar version of Page Rank is very misleading.  Contrary to most people’s belief, it is NOT an accurate depiction of a URL’s actual Page Rank.  The Google Toolbar PR is a snapshot of what the URL’s actual PR was at some point in the past.  You’re looking at history…  The Google Toolbar PR is an old, outdated PR value for your URL, NOT the “actual” PR that is being used to rank your URLs.  More accurately stated, you’re looking at a “mapping” or “scaling” of an old “actual” PR to a 0-10 scale. 

Because it is an outdated mapping of an old PR, a URL’s Google Toolbar Page Rank might say “Current page is not ranked by Google” when, in fact, the URL in question has an actual PR value that is being used by the ranking algorithm.  Because you are looking at historical data, the Google Toolbar may also show an incorrect PR such as PR3 when in reality the “actual” Page Rank for the URL might be PR1 or PR5.

Because you are looking at an outdated mapping of what a URL’s “actual” Page Rank was at some point in the past and because that Google Toolbar PR is only updated by Google typically once every few months, a URL’s Google Toolbar PR should always be taken with a grain of salt.  And from a ranking perspective, it is meaningless.

How is Google Toolbar Page Rank calculated?

What follows is likely somewhat simplified, but it should help you to visualize how Google is calculating the Google Toolbar PR. Imagine…

At random intervals (typically once every 3-6 months over the past several years, but more frequently in the last 6 months it seems) Google takes a snapshot of all of the URLs in their index and the corresponding “actual” PR of those URLs to create a mini-index that can be used by the Google Toolbar.  They then apply a logarithmic algorithm (base unknown) to map all of the “actual” Page Ranks in the snapshot to a value between 0 and 10.  They then store the 0-10 value as a third column in the mini-index as the Google Toolbar PR.  

Anytime a URL is requested by a browser with the Google Toolbar installed, the Toolbar can look up the Toolbar PR in the mini-index using the URL as the key and display it as a little green bar in the toolbar.  When it looks up a URL that does NOT exist in the mini-index (which would mean that the URL was not indexed at the time of the snapshot), then rather than displaying a Toolbar PR, the toolbar displays “Current page is not ranked by Google”. 
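
Here is a rough sketch of that lookup, purely to help you picture it.  The dictionary standing in for the mini-index, the example.com URLs, and the function name toolbar_lookup are all hypothetical; nobody outside Google knows how the real lookup is implemented.  The not-ranked message is the one quoted earlier in this post.

```python
NOT_RANKED = "Current page is not ranked by Google"


def toolbar_lookup(snapshot: dict[str, int], url: str) -> str:
    """Return the stored 0-10 value for a URL, or the not-ranked message if it was not in the snapshot."""
    if url not in snapshot:
        # The URL was not indexed when the snapshot was taken, so there is nothing to show.
        return NOT_RANKED
    return f"{snapshot[url]}/10"


# Hypothetical snapshot taken months ago: URL -> Toolbar PR at that time.
snapshot = {"http://example.com/": 4, "http://example.com/about": 2}

print(toolbar_lookup(snapshot, "http://example.com/"))          # 4/10
print(toolbar_lookup(snapshot, "http://example.com/new-page"))  # Current page is not ranked by Google
```

The point to notice is that the lookup never touches the "actual" PR.  A page published after the snapshot shows as not ranked no matter what its actual PR is today.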

For example, if you think of the actual PR as being some potentially huge integer number and the logarithmic algorithm that maps “actual” PR to Google Toolbar PR as being base 10 then Google might map each of the “actual” Page Ranks for the URLs in the snapshot to their corresponding Google Toolbar Page Ranks similar to the following:

Actual PR                  ->  Google Toolbar PR
0-9                        ->  0
10-99                      ->  1
100-999                    ->  2
1000-9999                  ->  3
10000-99999                ->  4
100000-999999              ->  5
1000000-9999999            ->  6
10000000-99999999          ->  7
100000000-999999999        ->  8
1000000000-9999999999      ->  9
10000000000-99999999999    ->  10

Disclaimer:  I’m NOT saying that the “actual” PR is stored as a large integer.  It could just as easily be stored as a real number between 0 and 1 representing a probability that a user will click on a link to the site.  I’m also not saying that the logarithm is base 10.  No one knows the actual scaling algorithm.  I’m just using it as an example to show how it gets exponentially harder and harder to get to the next Toolbar PR.
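
The same example mapping can be written as a few lines of code.  This is only a sketch under the same assumptions as the table above (a big integer "actual" PR and a base of 10); the function name toolbar_pr is mine, and the real scaling algorithm is unknown.

```python
def toolbar_pr(actual_pr: int, base: int = 10) -> int:
    """Map a hypothetical integer 'actual' PR to a 0-10 value on a logarithmic scale."""
    score = 0
    # Count how many times the value can be divided by the base.
    # Integer division avoids the rounding surprises of floating-point logarithms.
    while actual_pr >= base and score < 10:
        actual_pr //= base
        score += 1
    return score


for value in (5, 50, 999, 1000, 10**10):
    print(value, "->", toolbar_pr(value))
```

Going from a Toolbar PR of 2 to 3 in this example takes about 900 more "points" of actual PR, while going from 9 to 10 takes about 9 billion more.  That is the sense in which each step up gets exponentially harder.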

You should take away a few key points from this post. 

  • There are two distinct versions of Page Rank for a given URL: the “actual” Page Rank and the Google Toolbar Page Rank.
  • The actual Page Rank of a URL is updated constantly and is used by the ranking algorithm.  However, it plays only a VERY minor role overall in how a URL ranks for a particular keyword phrase.
  • The Google Toolbar Page Rank is never up-to-date and is NOT used by the ranking algorithm. 

Hope this post has helped you visualize what Page Rank is, how it is used, how it is calculated, and the difference between “actual” PR and Google Toolbar PR.
