Monday, January 25, 2010

Do it

Behind this fitting phrase hides an implementation of a make-like build tool in Python. doit lets you define build rules in a Python script, so dependencies do not have to be hard-coded. A change in a dependency is detected from the file's MD5 hash, not just from its last modification time.

With the help of this framework I was able to make our server start roughly five times faster by creating build rules for compiling Cheetah templates.
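
As a rough illustration of such a rule, here is a minimal sketch of a dodo.py that creates one sub-task per template. The templates directory, the helper name template_tasks and the exact "cheetah compile" command line are assumptions made for this example, not our real setup. The dictionary keys (name, actions, file_dep, targets) are the ones doit reads from a task.

```python
# dodo.py: a sketch only; directory layout and command line are assumptions
import glob
import os

TEMPLATE_DIR = "templates"  # hypothetical location of the *.tmpl files


def template_tasks(template_dir):
    """Yield one doit sub-task per Cheetah template found in template_dir."""
    for src in sorted(glob.glob(os.path.join(template_dir, "*.tmpl"))):
        target = os.path.splitext(src)[0] + ".py"
        yield {
            "name": src,                            # sub-task name
            "actions": ["cheetah compile " + src],  # assumed command line
            "file_dep": [src],                      # re-run when this file changes
            "targets": [target],                    # file the action should produce
        }


def task_compile_templates():
    """doit collects functions whose names start with task_."""
    for task in template_tasks(TEMPLATE_DIR):
        yield task
```

Running doit in the directory containing this file would then recompile only the templates whose content has changed since the last successful run.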

Friday, January 22, 2010

Sorting according to local customs in Python

Last week I played with sorting strings according to the Czech locale. I tried to sort records using Czech collation in our Firebird 2.1 database, but I found no way to do it for the UTF-8 charset (please share if you know how). So I decided to sort the records in the application layer.

Python offers two modules for locale-aware sorting, locale and PyICU, and both depend on the locales installed on the system (on Debian-based systems, look at dpkg-reconfigure locales). However, the locale module changes the locale setting for the whole process, unlike PyICU, the binding for the ICU (International Components for Unicode) library, which lets you choose a locale for a single collator object. Because our threaded web application may use several locales at the same time, the Python locale module is not an option.
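
To make the problem with the locale module concrete, here is a minimal sketch of what sorting with it looks like. The function name sort_with_process_locale and the locale name cs_CZ.UTF-8 are assumptions for this example; the locale has to be installed for the call to succeed, so the sketch returns None when it is not.

```python
import locale


def sort_with_process_locale(words, locale_name="cs_CZ.UTF-8"):
    """Sort words with the locale module, or return None if the locale is missing."""
    previous = locale.setlocale(locale.LC_COLLATE)  # query the current setting
    try:
        # This call changes collation for the whole process, all threads included.
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None  # the requested locale is not available on this system
    try:
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)  # restore the old setting


print(sort_with_process_locale(["šárka", "suzan", "čech"]))
```

Between the two setlocale calls every other thread sees the Czech collation as well, which is exactly what a web application serving several locales cannot accept.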

PyICU is only a thin SWIG-generated wrapper around a C++ library, so its interface is not very Pythonic.

I use a function that generates sort keys, because a custom comparison function for sort is not supported in Python 3.0 and above. Sorting by key is also much more efficient.


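# -*- coding: utf-8 -*-

import PyICU  # newer PyICU releases use "import icu" instead


def getCollate(locale):
    """Return a sort-key function for the given ICU locale name."""
    icuLoc = PyICU.Locale(locale)
    icuCol = PyICU.Collator.createInstance(icuLoc)

    def getCollationKey(s):
        # Binary key; comparing the keys byte by byte follows the locale's order.
        s = PyICU.UnicodeString(s)
        k = icuCol.getCollationKey(s)
        return k.getByteArray()

    return getCollationKey


names = [u'šárka', u'suzan', u'čech']
names.sort(key=getCollate('cs_CZ'))
print(names)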


If you get an error, don't forget to check which locales are installed on your system.
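
The efficiency claim about sort keys can be checked with a small experiment that uses only the standard library. This is a sketch with made-up names (counting_key, counting_cmp, and a case-insensitive ordering standing in for a real collation): it counts how often each kind of function is called while sorting the same list.

```python
from functools import cmp_to_key

calls = {"key": 0, "cmp": 0}


def counting_key(s):
    """Key function: called once for every element."""
    calls["key"] += 1
    return s.lower()


def counting_cmp(a, b):
    """Comparison function: called once for every pair the sort compares."""
    calls["cmp"] += 1
    a, b = a.lower(), b.lower()
    return (a > b) - (a < b)


words = ["delta", "Alpha", "charlie", "Bravo", "echo", "Foxtrot"]
by_key = sorted(words, key=counting_key)
by_cmp = sorted(words, key=cmp_to_key(counting_cmp))

print(by_key == by_cmp)
print(calls)
```

The key function runs exactly once per element, while the comparison function runs once per comparison and lower-cases both arguments each time. With an expensive step such as building an ICU collation key, that difference grows with the length of the list.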