Automated localization

January 16, 2010 at 9:15 am (Programming, python)

I’ve never localized an application myself, but I have a fair idea of how it is done. Basically, the process involves two major things:

  • Extraction of text strings from the software
  • Generation of translated documentation

In the first part,

  1. A .pot file is generated, which contains all the strings in your application that need to be translated.
  2. A .po file is generated for a specific language code.

The .po file contains a filled-in ‘msgid’ for each instrumented string and an empty ‘msgstr’ to hold each translation. What is alarming here is that people normally edit the .po file manually to provide the actual translation for each msgstr in the target language. Oh gross! You start acting like a translator rather than an engineer/programmer. I find it a sheer waste of time.

I wondered how nice it would be to have a built-in library do the word-for-word translation and avoid the unnecessary manual process. The first thing that comes to mind is “Google Translate”. I searched for its APIs and found that one is available as an AJAX API. Now what do you do if yours isn’t a web application? Find a way to call the AJAX methods from the language you are using. Thankfully, the “simplejson” module in Python came to my rescue. Now write a Python script to call the translate API, or find it here.

# Written for Python 2 (urllib.urlopen, urllib.urlencode, xrange).
import urllib
import simplejson

baseUrl = "http://ajax.googleapis.com/ajax/services/language/translate"

def getSplits(text, splitLength=4500):
    '''
    The Translate API limits the length of text (4500 characters) that can
    be translated at once, so yield the text in chunks of that size.
    '''
    return (text[index:index + splitLength] for index in xrange(0, len(text), splitLength))

def translate(text, src='', to='en'):
    '''
    A Python wrapper for the Google AJAX Language API:
    * Uses Google language detection when no source language is given
    * Splits up text longer than 4500 characters, a limit imposed by the API
    '''
    params = {'langpair': '%s|%s' % (src, to),
              'v': '1.0'}
    retText = ''
    for chunk in getSplits(text):
        params['q'] = chunk
        # Passing data makes urlopen send the request as a POST.
        resp = simplejson.load(urllib.urlopen(baseUrl, data=urllib.urlencode(params)))
        # Any error raised on an unexpected response propagates to the caller.
        retText += resp['responseData']['translatedText']
    return retText

Now call the translate method with the text to be translated, the source language code and the destination language code.
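To tie this back to the .po file, here is a minimal Python 3 sketch of filling in the empty msgstr entries. The names fill_po and translate_fn are hypothetical, and the regular expression only handles single-line msgid/msgstr pairs; multi-line strings, plural forms and escaped quotes would need a real .po parser. A stand-in translator keeps the example self-contained.

```python
import re

# Matches a single-line, non-empty msgid followed directly by an empty msgstr.
# Assumption: no multi-line strings, plural forms, or escaped quotes.
ENTRY = re.compile(r'^msgid "(?P<id>[^"\n]+)"\nmsgstr ""$', re.MULTILINE)


def fill_po(po_text, translate_fn):
    """Return po_text with every empty msgstr filled by translate_fn(msgid)."""
    def replace(match):
        source = match.group("id")
        # Escape double quotes so the result stays a valid .po string.
        translated = translate_fn(source).replace('"', '\\"')
        return 'msgid "%s"\nmsgstr "%s"' % (source, translated)
    return ENTRY.sub(replace, po_text)


if __name__ == "__main__":
    sample = (
        'msgid ""\nmsgstr ""\n\n'               # header entry, left untouched
        'msgid "Hello"\nmsgstr ""\n\n'          # untranslated, gets filled
        'msgid "Bye"\nmsgstr "Au revoir"\n'     # already translated, kept
    )
    # Stand-in translator so the sketch runs without network access.
    fake = {"Hello": "Bonjour"}
    print(fill_po(sample, lambda s: fake.get(s, s)))
```

In practice translate_fn would be a function with the same shape as translate, for example lambda s: translate(s, 'en', 'fr'). Machine output is still worth a human review before shipping.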

Once the .po file is ready with the translations, compile it to get the .mo file and you are done. Localization made easy ;)
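The compile step is usually done with GNU gettext’s msgfmt tool. Below is a minimal Python 3 sketch, assuming msgfmt is installed and on your PATH; the helper names are hypothetical. The target path follows the layout Python’s gettext module looks for: localedir/lang/LC_MESSAGES/domain.mo.

```python
import os
import subprocess


def mo_path(localedir, lang, domain):
    """Path where Python's gettext module looks for a compiled catalog."""
    return os.path.join(localedir, lang, "LC_MESSAGES", domain + ".mo")


def msgfmt_command(po_file, mo_file):
    """Build the msgfmt invocation that compiles po_file into mo_file."""
    return ["msgfmt", "-o", mo_file, po_file]


def compile_po(po_file, localedir, lang, domain):
    """Compile po_file with msgfmt (assumed to be on PATH); return the .mo path."""
    target = mo_path(localedir, lang, domain)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    subprocess.check_call(msgfmt_command(po_file, target))
    return target
```

After compiling, gettext.translation(domain, localedir, languages=[lang]) should be able to load the catalog.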


Recursion gotcha

January 3, 2010 at 10:23 am (Programming, python)

As programmers, we use recursion when the need arises and it fits our requirement. We do consider the depth to which it has to keep recurring, but do we think of the recursion depth your compiler or interpreter is willing to take? Frankly, I never bothered until now; I didn’t even think of such an issue.

For Python, the default recursion depth limit is 1000. I haven’t yet explored other languages. So beware while using recursive method calls, and in case your recursive calls exceed a depth of 1000, change the depth limit explicitly. Add the following lines to avoid the error “RuntimeError: maximum recursion depth exceeded”.

import sys
sys.setrecursionlimit(x)

where x is an integer value as per your need (say 1500).
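Here is a minimal Python 3 sketch of the effect. In Python 3 the error surfaces as RecursionError, a subclass of RuntimeError. The names countdown and run_with_limit are hypothetical helpers; the second one restores the previous limit afterwards so the change stays local.

```python
import sys


def countdown(n):
    """Recurse n levels deep and return the depth reached."""
    if n == 0:
        return 0
    return 1 + countdown(n - 1)


def run_with_limit(func, arg, limit):
    """Call func(arg) under the given recursion limit, then restore the old one."""
    old_limit = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        return func(arg)
    finally:
        sys.setrecursionlimit(old_limit)


if __name__ == "__main__":
    try:
        # 2000 nested calls do not fit under a limit of 1000.
        run_with_limit(countdown, 2000, 1000)
    except RecursionError as exc:
        print("failed:", exc)
    # The same call succeeds once the limit is raised.
    print(run_with_limit(countdown, 2000, 5000))
```

Keep the limit modest: the Python documentation warns that a limit set too high can crash the interpreter, so for very deep recursion an iterative rewrite is often the safer route.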


Psyco prototype

January 3, 2010 at 9:58 am (Programming, python)

Python programmers don’t need to get psyched about performance issues. The Psyco prototype is there for your relief. It’s a representation-based just-in-time specializer for Python, which claims to make your Python code run as fast as C.

Get the packages from the project’s home page, install them, and add the following lines to your code.

import psyco
psyco.full()
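Since Psyco is an optional third-party extension, a defensive variant is to guard the import so the script still runs where Psyco isn’t installed. This is only a sketch; enable_psyco is a hypothetical helper name.

```python
def enable_psyco():
    """Turn on Psyco if it is importable; return True when it was enabled."""
    try:
        import psyco  # optional third-party module; may be missing
    except ImportError:
        return False
    psyco.full()  # ask Psyco to compile every function it can handle
    return True


if __name__ == "__main__":
    print("Psyco enabled:", enable_psyco())
```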

To know more, follow the link. :)


“python” hobby horse

March 17, 2009 at 12:34 pm (python)

Even the smallest things have their own significance. Some sort of style is associated with everything around you, and so it is with the code snippets you write. Shame on me that I read the Python PEP 8 specification only after two years of my journey with Python development :( There’s no reason why I didn’t get into this earlier. Anyway, after reading it, I strongly feel that everyone must read this before writing their first Python code. Get it here :)

