Monthly Archives: April 2009

Amazon's Kindle2 and scientific papers

I own a Kindle2, and it is very easy to push PDFs to it. Contrary to a common misconception, the Kindle2 is NOT protected: Amazon offers “free” (see below) conversion of PDFs to the *.azw format, which you can then “push” to your Kindle2.

Here is how it works:

When you buy a Kindle2, you can register the device with your Amazon account and then associate a new email address with the Kindle2.

Once you do this, you can send PDFs to your device in the following two ways:
1) Email the PDF as an attachment to that Kindle2 address. Amazon converts the file and pushes it to your device wirelessly (over the “whispernet”). You get charged 0.10 per email for pushing it to your device.


2) You can email the same attachment to the free variant of that address (note the FREE). Amazon converts the file and emails it to your Amazon account email address. I can then save the file to my computer and use the provided USB cable to copy the converted document to the “documents” folder on the Kindle. This way you don't get charged anything.
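Either route is just an email with a PDF attached, so it can be scripted. Here is a minimal sketch using Python's standard library. Both example.com addresses are hypothetical placeholders, and build_kindle_message is a made-up helper name. The sketch only builds the message and does not send anything:

```python
from email.message import EmailMessage


def build_kindle_message(sender, kindle_address, pdf_bytes, filename):
    """Build an email that carries one PDF as an attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = kindle_address
    msg["Subject"] = "document for my kindle"
    msg.set_content("PDF attached for conversion.")
    # Attach the raw PDF bytes with the application/pdf content type.
    msg.add_attachment(pdf_bytes, maintype="application",
                       subtype="pdf", filename=filename)
    return msg


# Hypothetical addresses and placeholder PDF bytes, for illustration only.
message = build_kindle_message("me@example.com", "my-kindle@example.com",
                               b"%PDF-1.4 placeholder", "paper.pdf")
print(message["To"])
```

To actually send it you would hand the message to your own mail server, for instance with smtplib.SMTP(...).send_message(message); the server details depend on your mail provider.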

In addition to the above “Amazon-blessed” ways of pushing content to your Kindle2, there are standalone applications like Calibre and Mobipocket that convert documents (PDFs, Word doc files, HTML pages) to the Mobipocket format, on which the Kindle2's *.azw format is based. Calibre runs on Mac, Windows and Linux, while Mobipocket is a Windows-only app. I have not yet tested Mobipocket, but Calibre by Kovid Goyal is a free and open source app that offers a multi-platform, iTunes-like front end to push and manage third-party content on the Kindle2. With Calibre you can convert PDFs to the *.mobi format and save them to the documents folder for reading on your Kindle.
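Recent Calibre releases also ship a command-line converter called ebook-convert, which may differ from the tooling that was available when this post was written. Here is a rough sketch of driving it from Python, assuming ebook-convert is installed and on your PATH. The mobi_command and convert helpers and the file name are made up for illustration:

```python
import subprocess
from pathlib import Path


def mobi_command(pdf_path):
    """Return the ebook-convert command that writes a .mobi next to the PDF."""
    src = Path(pdf_path)
    return ["ebook-convert", str(src), str(src.with_suffix(".mobi"))]


def convert(pdf_path):
    # Runs Calibre's converter; raises CalledProcessError if the conversion
    # fails, or FileNotFoundError if ebook-convert is not installed.
    subprocess.run(mobi_command(pdf_path), check=True)


# Only show the command here; call convert("paper.pdf") to really run it.
print(mobi_command("paper.pdf"))
```

The resulting .mobi file can then be copied to the documents folder over USB as described above.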

I recently tested an old Acta Cryst paper (1999) for conversion to the Kindle2 format using all three approaches above. The conversion and upload to the device was trivial, whether via the whispernet push, via the free email followed by a USB copy, or via Calibre on the Mac. The converted paper immediately shows up in your library. The text of the paper was immensely readable, but when it came to equations and symbols all hell broke loose. Most of the equations in these papers were probably not embedded as images but typeset with symbol fonts instead. I am sure the encoding of these fonts to the Mobipocket format is not trivial, and it shows when simple equations like

B = A + T

have the = and + symbols replaced with (?) symbols once converted, i.e. B ? A ? T

So it is quite difficult to read the paper on the Kindle2 when many of the equations are garbled.

This is obviously an evolving space. Even for PDF, until five or six years ago it was not uncommon to have funky characters replacing our alphas and taus in the printed PDF. What's most interesting is that projects like Calibre bring the power of open-source approaches to the Kindle and other ebook reader platforms. I won't be surprised if the open source world rallies behind open software that lets users create and share content that can be read on the ebook reader of their choice.

Seeing how it took PDF nearly ten years to become the format of choice for electronic WYSIWYG documents, I hope we don't have to wait too long for all content to be seamlessly transcoded for reading on any given ebook reader.

refs: Nature, in its 2nd April 2009 issue, has two news features on ebook readers; ireadreview is a good site for reviews on all things ebook readers.

Command line handling in Python with optparse

Command line interfaces to programs are very empowering. I started using computers with the Linux command line, and I have never strayed too far from programs that are predominantly command line based. Whether it was rasmol, povray, phenix or ffmpeg, I always found that the command line gave the program a more transparent interface. By that I mean it was easier (at least for me) to decipher a manual page than to go looking for a particular functionality in a GUI window.

Now in the case of Python, I have been writing scripts that take user input from the command line for a while now. In these scripts, I would generally accept only one input, and that would be the first argument in the sys.argv list. If I had more than one input, I would iterate over the input list and try to figure out what the inputs were. Even worse, in most cases I would hard code the order of inputs into my code (terrible practice). Fortunately for me, my discovery of the optparse module has changed all that.

The optparse module is an object-oriented (don't let that scare you) and super-intuitive way to add command line options to any Python script. So say you want to add an input file command line switch with the -i flag. All you have to do is

from optparse import OptionParser

optparser_object = OptionParser()

optparser_object.add_option("-i", "--infile", dest="infile", help="input file for script", metavar="[infile.txt]")

Once you do this, you can easily have the module parse the sys.argv list and make sense of it. So you would add the following line

options_object, spillover_options = optparser_object.parse_args()

Then options_object.infile will hold the value of the input option; the attribute name is set by the dest argument to add_option. The nice thing about the module is that several switches can be mapped to the same options_object.infile destination. For example, I have mapped "-i" and "--infile" to the same destination. Even better is the option to add a help string with the help="help text" argument. This help is printed when the user runs the script with -h or --help, or when the code specifically calls the optparser_object.print_help() function. An option that the script cannot handle produces a short usage line and an error message instead.
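Putting the pieces together, here is a small self-contained sketch. The -v/--verbose flag, the build_parser helper and the file names are additions for illustration and not part of the snippet above. parse_args is given an explicit list here so that the example does not depend on the real command line:

```python
from optparse import OptionParser


def build_parser():
    """Create a parser with an input-file option and a verbose flag."""
    parser = OptionParser(usage="usage: %prog [options] [extra args]")
    parser.add_option("-i", "--infile", dest="infile",
                      help="input file for script", metavar="[infile.txt]")
    # Hypothetical extra flag, to show a boolean switch next to a value option.
    parser.add_option("-v", "--verbose", dest="verbose", action="store_true",
                      default=False, help="print progress messages")
    return parser


parser = build_parser()

# Short form of the switch, plus one positional argument that is not an option.
options, leftover = parser.parse_args(["-i", "data.txt", "-v", "extra.txt"])
print(options.infile)
print(options.verbose)
print(leftover)

# The long form lands in the same destination attribute.
options_long, _ = parser.parse_args(["--infile", "other.txt"])
print(options_long.infile)
```

In a real script you would call parser.parse_args() with no arguments so that it reads sys.argv[1:]. Newer Python versions also include argparse, which follows much the same pattern.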

For a concrete example of how to use the optparse module, consult the docs or my example code on GitHub.