Translating Text Using Google Translate and Python

translate_logoSometimes is can be quite useful to be able to translate content from one language to another from within a program. There are many compelling reasons why you might like the idea of auto translating text. The reason why I’m interested in writing this script is that it is useful to sometimes create unique content online for SEO reasons. Search engines like to see unique content rather than words that have been copied and pasted from other websites. What you’re looking for in web content is:

  1. A lot of it.
  2. Highly related to the keywords you’re targeting.

When trying to get a great position in the organic search results it is important to recognize that you’re competing against an army of low cost outsourced people that are pumping out page after page of mediocre content and then running scripts to generate thousands of back-links to the sites they are trying to rank.  It is very much impossible to get the top spot for any desirable keyword if you’re writing all the content yourself.  You need some help with this.

That’s where Google Translate comes in.

Take an article from somewhere, push it through a round trip of translation such as English->French->English and the content will then be unique enough that it won’t raise any flags that it has been copied from somewhere else on the internet.  The content may not be readable but it will make for fodder for the search engines to eat up.

Using this technique it is possible to build massive websites of unique content overnight and have it quickly rank highly.

Unfortunately Google doesn’t provide an API for translating text.  That means the script has to resort to scraping which is inherently prone to breaking.  The script uses BeautifulSoup to help with the parsing of the HTML content. (Note: I had to use the older 3.0.x series of BeautifulSoup to successfully parse the content)

The code for this was based on this script by technobabble.

import sys
import urllib2
import urllib
 
from BeautifulSoup import BeautifulSoup # available at: http://www.crummy.com/software/BeautifulSoup/
 
def translate(sl, tl, text):
    """ Translates a given text from source language (sl) to
        target language (tl) """
 
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)')]
 
    translated_page = opener.open(
        "http://translate.google.com/translate_t?" + 
        urllib.urlencode({'sl': sl, 'tl': tl}),
        data=urllib.urlencode({'hl': 'en',
                               'ie': 'UTF8',
                               'text': text.encode('utf-8'),
                               'sl': sl, 'tl': tl})
    )
 
    translated_soup = BeautifulSoup(translated_page)
 
    return translated_soup('div', id='result_box')[0].string
 
if __name__=='__main__':
    print translate('en', 'fr', u'hello')

To generate unique content you can use this within your own python program like this:

import translate
 
content = get_content()
new_content = translate('fr', 'en', translate('en','fr', content))
publish_content(new_content)
Bookmark and Share

Technorati Tags: , , , , , , , ,

Related posts:

  1. Scrape Google Search Results Page
  2. Scrape Advertisements from Google Search Results with Python
  3. SEOCheck: Track Your Google Position Over Time
  4. Scrape Yahoo Search Results Page
  5. Getting Ezine Article Content Automatically with Python
Stumble it!


RSS feed | Trackback URI

4 Comments »

Comment by tardmrr
2009-07-20 15:57:52

This reeks of sleeze and I want nothing to do with it.

This comment was originally posted on Reddit

 
Comment by joe
2009-08-03 02:34:21

very useful article, i was looking for this. Also, it would be nice to be able to create my own plugin so that i can create unique content by google translation from RSS feeds content. thanks

 
Comment by Uri Subscribed to comments via email
2009-08-14 20:30:01

I like this. I do not love googles frame in the fly xlation and it stops working during navigation. So, since i don’t know a lot yet, i subscribing to this post so i learn how to implement this on my product site. maybe i’ll pay you to integrate for me? let me send you a link to my site pls. thanks. Uri

 
2009-09-16 10:07:12

[...] by WP Greet BoxOk, so this isn’t my script but it’s a much nicer version of the one I wrote that scrapes the actual Google translate website to do the same thing. I’d like to thank [...]

 
Name (required)
E-mail (required - never shown publicly)
URI
Subscribe to comments via email
Your Comment (smaller size | larger size)
You may use <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre lang="" line="" escaped=""> in your comment.



Additional comments powered by BackType