List the Links in Your Twitter Timeline Python Script

This is a simple Python script that checks your Twitter friends timeline and prints out any links that have been posted. It also visits each URL, finds the title of the destination page, and prints it alongside the link. The script demonstrates an easy way to spot some of the hottest trends on the internet the moment they happen.

If you set up a Twitter account within a niche and follow a few of the key players in that niche, you can collect any links they post and check whether each one is on topic (using some keyword heuristics). You can then either notify yourself of the interesting content or automatically scrape it for use on one of your related websites. That gives you perhaps the most up-to-date content possible, before it hits Google Trends. It also gives you a chance to promote it before the social news sites find it (or to be the first to submit it to them).
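As a rough illustration of the keyword check mentioned above, here is a minimal sketch that scores a page title against a list of niche terms. The keyword list, the function names, and the 0.3 threshold are all made up for this example, and it is written for Python 3 rather than the Python 2 used by the script further down.

```python
import re

# Hypothetical keyword list for an example niche; replace with your own terms.
NICHE_KEYWORDS = {"python", "seo", "scraping"}


def keyword_score(title, keywords=NICHE_KEYWORDS):
    """Return the fraction of keywords that appear as whole words in the title."""
    if not title or not keywords:
        return 0.0
    words = set(re.findall(r"[a-z0-9]+", title.lower()))
    return len(words & set(keywords)) / len(keywords)


def is_on_topic(title, keywords=NICHE_KEYWORDS, threshold=0.3):
    """Treat a title as on topic when enough of the keywords show up in it."""
    return keyword_score(title, keywords) >= threshold
```

A title of None (which is what the script's getTitle returns when no title is found) simply scores 0.0. Matching whole words only is a deliberately crude choice; stemming or matching against the page body would catch more, at the cost of more false positives.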

With a bit more work you could parse out some of the meta tag keywords/description, crawl the website, or find and cut out the content from the page. If it’s a blog you could post a comment.
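Pulling out the meta keywords and description could look something like the sketch below. It uses the standard library's HTMLParser and is written for Python 3; the class and function names are invented for this example, and it assumes you already have the page's HTML as a string.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect the content of <meta name="keywords"> and <meta name="description">."""

    WANTED = ("keywords", "description")

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        # Attribute values can be None (e.g. <meta name>), so guard before lowercasing
        name = (attrs.get("name") or "").lower()
        content = attrs.get("content")
        if name in self.WANTED and content:
            # Keep the first occurrence if a page repeats a tag
            self.meta.setdefault(name, content.strip())


def extract_meta(page):
    """Return a dict with whichever of 'keywords' and 'description' the page declares."""
    parser = MetaTagParser()
    parser.feed(page)
    return parser.meta
```

Pages that declare neither tag give back an empty dict, so the caller can fall back to the title alone.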

Example Usage:

$ python TwitterLinks.py
http://bit.ly/s8rQX - Twitter Status - Tweets from users you follow may be missing from your timeline
http://bit.ly/26hiT - Why Link Exchanges Are a Terrible, No-Good Idea - Food Blog Alliance
http://FrankAndTrey.com - Frank and Trey
http://bit.ly/yPRHp - Gallery: Cute animals in the news this week
...

And here’s the Python code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# (C) 2009 HalOtis Marketing
# written by Matt Warren
# http://halotis.com/
 
try:
    import json
except ImportError:
    import simplejson as json # http://undefined.org/python/#simplejson
import twitter     #http://code.google.com/p/python-twitter/
 
from urllib2 import urlopen
import re
 
SETTINGS = {'user':'twitter user name', 'password':'your password here'}
 
def listFriendsURLs(user, password):
    # Capture the first HTTP or HTTPS URL in the status text
    re_pattern = r'(https?://\w[^\s"]*)'
    rg = re.compile(re_pattern, re.IGNORECASE)
 
    api = twitter.Api(user, password)
    timeline = api.GetFriendsTimeline(user)
 
    for status in timeline:
        m = rg.search(status.text)
        if m:
            httpurl=m.group(1)
            title = getTitle(httpurl)
            print httpurl, '-', title
 
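def getTitle(url):
    # Fetch the page; return None if the URL cannot be retrieved
    try:
        html = urlopen(url).read()
    except IOError:
        return None

    # Allow attributes on the <title> tag and titles that span several lines
    re_pattern = '<title[^>]*>(.*?)</title>'
    rg = re.compile(re_pattern, re.IGNORECASE|re.DOTALL)

    m = rg.search(html)
    if m:
        return m.group(1).strip()
    return None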
 
if __name__ == '__main__':
    listFriendsURLs(SETTINGS['user'], SETTINGS['password'])
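
The script above targets Python 2 and reports only the first link it finds in each status. As a hedged sketch of how the two text-processing steps might look in Python 3, the helpers below pull every URL out of a status and extract a cleaned-up title from HTML you have already downloaded. The function names are made up for this example, and the network and Twitter calls are deliberately left out.

```python
import html
import re

# Simplified URL pattern: scheme, then everything up to whitespace or a quote
URL_RE = re.compile(r'https?://[^\s"]+', re.IGNORECASE)
TITLE_RE = re.compile(r'<title[^>]*>(.*?)</title>', re.IGNORECASE | re.DOTALL)


def find_urls(text):
    """Return every HTTP/HTTPS URL in the text, in order of appearance."""
    return URL_RE.findall(text)


def title_from_html(page):
    """Return the page title with entities decoded and whitespace collapsed, or None."""
    m = TITLE_RE.search(page)
    if not m:
        return None
    return " ".join(html.unescape(m.group(1)).split())
```

Note that the simple URL pattern will also swallow trailing punctuation such as a full stop at the end of a sentence, so you may want to strip that before fetching.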

1 Comment »

Comment by SJL Web Design
2009-11-23 09:30:04

Thanks for the great Python script, I had been looking to create something similar for a new project but I might just alter this script as it is much better than anything I could write just yet.
