Google Page Rank Python Script
This isn’t my script, but I thought it would appeal to readers of this blog. It looks up the Google Page Rank for any website, using the same interface the Google Toolbar uses. I’d like to thank Fred Cirera for writing it; you can check out his blog post about this script here.
I’m not exactly sure what I would use this for, but it might have applications for anyone who wants to do some really advanced SEO work and find a real way to accomplish Page Rank sculpting, perhaps by finding the best websites to put links on.
The reason it involves so much math is that it needs to compute a checksum of the URL before the request will work. It should be pretty reliable since it doesn’t involve any scraping.
Example usage:
$ python pagerank.py http://www.google.com/
PageRank: 10 URL: http://www.google.com/
$ python pagerank.py http://www.mozilla.org/
PageRank: 9 URL: http://www.mozilla.org/
$ python pagerank.py http://halotis.com
PageRank: 3 URL: http://www.halotis.com/
And the script:
#!/usr/bin/env python
#
#  Script for getting Google Page Rank of page
#  Google Toolbar 3.0.x/4.0.x Pagerank Checksum Algorithm
#
#  original from http://pagerank.gamesaga.net/
#  this version was adapted from http://www.djangosnippets.org/snippets/221/
#  by Corey Goldberg - 2010
#
#  Licensed under the MIT license: http://www.opensource.org/licenses/mit-license.php

import sys
import urllib


def get_pagerank(url):
    # Build the toolbar-style query, including the checksum in the "ch" parameter.
    hsh = check_hash(hash_url(url))
    gurl = 'http://www.google.com/search?client=navclient-auto&features=Rank:&q=info:%s&ch=%s' % (urllib.quote(url), hsh)
    try:
        f = urllib.urlopen(gurl)
        # Drop the first 9 characters of the response; the rest is the rank.
        rank = f.read().strip()[9:]
    except Exception:
        rank = 'N/A'
    if rank == '':
        rank = '0'
    return rank


def int_str(string, integer, factor):
    # Rolling multiply-and-add hash, with the product kept within 32 bits.
    for i in range(len(string)):
        integer *= factor
        integer &= 0xFFFFFFFF
        integer += ord(string[i])
    return integer


def hash_url(string):
    # Two hashes of the URL with different seeds and factors.
    c1 = int_str(string, 0x1505, 0x21)
    c2 = int_str(string, 0, 0x1003F)

    # Fold the bits of c1 down, then interleave them with bits of c2.
    c1 >>= 2
    c1 = ((c1 >> 4) & 0x3FFFFC0) | (c1 & 0x3F)
    c1 = ((c1 >> 4) & 0x3FFC00) | (c1 & 0x3FF)
    c1 = ((c1 >> 4) & 0x3C000) | (c1 & 0x3FFF)

    t1 = (c1 & 0x3C0) << 4
    t1 |= c1 & 0x3C
    t1 = (t1 << 2) | (c2 & 0xF0F)

    t2 = (c1 & 0xFFFFC000) << 4
    t2 |= c1 & 0x3C00
    t2 = (t2 << 0xA) | (c2 & 0xF0F0000)

    return (t1 | t2)


def check_hash(hash_int):
    hash_str = '%u' % (hash_int)
    flag = 0
    check_byte = 0

    # Walk the decimal digits from right to left, doubling every second one
    # and summing the digits of the result.
    i = len(hash_str) - 1
    while i >= 0:
        byte = int(hash_str[i])
        if 1 == (flag % 2):
            byte *= 2
            byte = byte / 10 + byte % 10
        check_byte += byte
        flag += 1
        i -= 1

    check_byte %= 10
    if 0 != check_byte:
        check_byte = 10 - check_byte
        if 1 == flag % 2:
            if 1 == check_byte % 2:
                check_byte += 9
            check_byte >>= 1

    # Checksum string: '7', the check digit, then the hash in decimal.
    return '7' + str(check_byte) + hash_str


if __name__ == '__main__':
    if len(sys.argv) != 2:
        url = 'http://www.google.com/'
    else:
        url = sys.argv[1]
    print get_pagerank(url)
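The script above is written for Python 2 (urllib.quote, the print statement). For anyone on Python 3, here is a rough sketch of just the offline part: the URL hash, the check digit, and the query string. It is my own port under a few assumptions, not something from the original author: MASK32 and build_query_url are names I made up for illustration, and it never contacts Google, so it says nothing about whether the endpoint still answers.

```python
from urllib.parse import quote

MASK32 = 0xFFFFFFFF  # illustrative name; the script above uses the literal


def int_str(string, integer, factor):
    """Rolling multiply-and-add hash; the product is kept within 32 bits."""
    for ch in string:
        integer = (integer * factor) & MASK32
        integer += ord(ch)
    return integer


def hash_url(string):
    """Combine two hashes of the URL with the same bit folding as the script above."""
    c1 = int_str(string, 0x1505, 0x21)
    c2 = int_str(string, 0, 0x1003F)

    c1 >>= 2
    c1 = ((c1 >> 4) & 0x3FFFFC0) | (c1 & 0x3F)
    c1 = ((c1 >> 4) & 0x3FFC00) | (c1 & 0x3FF)
    c1 = ((c1 >> 4) & 0x3C000) | (c1 & 0x3FFF)

    t1 = (c1 & 0x3C0) << 4
    t1 |= c1 & 0x3C
    t1 = (t1 << 2) | (c2 & 0xF0F)

    t2 = (c1 & 0xFFFFC000) << 4
    t2 |= c1 & 0x3C00
    t2 = (t2 << 0xA) | (c2 & 0xF0F0000)

    return t1 | t2


def check_hash(hash_int):
    """Return '7' + check digit + the hash written in decimal."""
    hash_str = str(hash_int)
    check_byte = 0
    flag = 0
    # Right to left: double every second digit and sum the digits of the result.
    for digit in reversed(hash_str):
        byte = int(digit)
        if flag % 2 == 1:
            byte *= 2
            byte = byte // 10 + byte % 10  # integer division, as in Python 2
        check_byte += byte
        flag += 1

    check_byte %= 10
    if check_byte != 0:
        check_byte = 10 - check_byte
        if flag % 2 == 1:
            if check_byte % 2 == 1:
                check_byte += 9
            check_byte >>= 1

    return '7' + str(check_byte) + hash_str


def build_query_url(url):
    """Hypothetical helper: assemble the toolbar-style query without sending it."""
    ch = check_hash(hash_url(url))
    return ('http://www.google.com/search?client=navclient-auto'
            '&features=Rank:&q=info:%s&ch=%s' % (quote(url), ch))


if __name__ == '__main__':
    print(build_query_url('http://www.google.com/'))
```

Running it prints the request URL the script would have fetched, which makes it easy to compare the checksum against other implementations without touching the network.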
Seems to have stopped working.
Thanks for letting me know. I’ll look into it.
Any luck sorting out why it stopped working?
Unfortunately it doesn’t seem to be fixable. :(
Previous versions of the Google Toolbar had a simple way to get the data, but since they introduced SideWiki they changed how the toolbar works. The new API seems to return binary data (possibly encrypted). Cracking this new scheme is not going to be easy.