Translate An RSS Feed To Another Language in Python

A reader suggested that it might be useful to have a script that could fetch an RSS feed, translate it into another language, and republish that feed somewhere else. Thankfully, that’s pretty easy to do in Python.

I wrote this script by taking bits and pieces from some of the other scripts that I’ve posted on this blog in the past. It’s surprising just how much of a resource this site has turned into.

It uses the Google Translate service to convert the RSS feed content from one language to another and simply prints the new RSS content to standard output. If you want to republish the content, you can redirect the output to a file and upload that file to your web server.
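
As a minimal sketch of the republishing step, the XML string can also be written to a file from Python instead of redirecting standard output. The helper name `save_feed` and the default encoding are assumptions for illustration and are not part of the original script; the encoding should match whatever the XML declaration of the generated feed says. The sketch assumes Python 3.

```python
def save_feed(xml, path, encoding="iso-8859-1"):
    """Write the feed XML string to `path` and return the path."""
    # errors="xmlcharrefreplace" turns characters the encoding cannot
    # represent into numeric references (e.g. &#8217;), which is still valid XML
    with open(path, "w", encoding=encoding, errors="xmlcharrefreplace") as handle:
        handle.write(xml)
    return path
```

With the script below, this could be called as `save_feed(translate_rss('en', 'fr', 'http://www.halotis.com/feed/'), 'feed.xml')`, after which `feed.xml` is ready to upload.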

Example Usage:

$ python translateRSS.py
<?xml version="1.0" encoding="iso-8859-1"?>
<rss version="2.0"><channel><title>HalOtis Marketing</title><link>http://www.halotis.com</link><description>Esprit d&amp;#39;entreprise dans le 21ème siècle</description>
.....
</channel></rss>
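
The `&amp;#39;` in the description above looks like an apostrophe that was escaped twice: the translated text appears to come back containing the HTML entity `&#39;`, and the RSS generator then escapes its `&` again. One possible way to avoid this, sketched here under the assumption of Python 3 and its standard `html` module, is to decode entities before handing the text to the feed generator. The helper name `clean_translation` is hypothetical.

```python
import html

def clean_translation(text):
    """Decode HTML entities such as &#39; so they are not escaped a second time."""
    return html.unescape(text)

print(clean_translation("Esprit d&#39;entreprise"))  # prints: Esprit d'entreprise
```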

Here’s the Script:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# (C) 2009 HalOtis Marketing
# written by Matt Warren
# http://halotis.com/
 
import feedparser  # available at feedparser.org
from translate import translate  # available at http://www.halotis.com/2009/07/20/translating-text-using-google-translate-and-python/
import PyRSS2Gen  # available at http://www.dalkescientific.com/Python/PyRSS2Gen.html
 
import datetime 
import re
 
def remove_html_tags(data):
    p = re.compile(r'<.*?>')
    return p.sub('', data)
 
def translate_rss(sl, tl, url):
 
    d = feedparser.parse(url)
 
    #unfortunately feedparser doesn't output rss so we need to create the RSS feed using PyRSS2Gen
    items = [PyRSS2Gen.RSSItem( 
        title = translate(sl, tl, x.title), 
        link = x.link, 
        description = translate(sl, tl, remove_html_tags(x.summary)), 
        guid = x.link, 
        pubDate = datetime.datetime( 
            x.modified_parsed[0], 
            x.modified_parsed[1], 
            x.modified_parsed[2], 
            x.modified_parsed[3], 
            x.modified_parsed[4], 
            x.modified_parsed[5])) 
        for x in d.entries]
 
    rss = PyRSS2Gen.RSS2( 
        title = d.feed.title, 
        link = d.feed.link, 
        description = translate(sl, tl, d.feed.description), 
        lastBuildDate = datetime.datetime.now(), 
        items = items) 
    #emit the feed 
    xml = rss.to_xml()
 
    return xml
 
if __name__ == '__main__':
    feed = translate_rss('en', 'fr', 'http://www.halotis.com/feed/')
    print(feed)
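
The `pubDate` construction in the script indexes six fields one by one and assumes every entry carries a parsed modification date. As a hedged sketch, the same conversion can be done in one step, with a fallback for entries that have no date. The helper name `to_datetime` and the fallback behaviour are assumptions, not part of the original script.

```python
import datetime
import time

def to_datetime(parsed, default=None):
    """Convert a time.struct_time (or 9-tuple) to a datetime.

    Returns `default` when `parsed` is None, e.g. for an entry without a date.
    """
    if parsed is None:
        return default
    # The first six fields are year, month, day, hour, minute, second
    return datetime.datetime(*parsed[:6])

print(to_datetime(time.gmtime(0)))  # prints: 1970-01-01 00:00:00
```

In the script this could replace the six indexed arguments, for example `pubDate = to_datetime(x.get('modified_parsed'))`. Which key holds the parsed date may depend on the feedparser version in use, so check the entries your version returns.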
