Category Archives: Python

Pretty documentation and code snippets with Snipt

Whether I am learning a new module or thinking up a new set of Python modules to solve a problem at hand, I use Google Docs extensively to collect my thoughts and plan out my code.

In my Google Docs, I very often grab screenshots (especially of dragged-out screen areas) with ⌘-CTRL-Shift-4 and paste them directly into Google Docs on Google Chrome as images. However, when it comes to inserting snippets of code I wrote into these documents, I really like using Snipt, which I first came across after Kenneth Love used it in his Kickstarter-funded Getting Started with Django lessons. Copy-pasting HTML-formatted, syntax-colored code is awesome!

With Snipt, any code snippet is a copy-paste away. I found myself using it so much that I decided to get the pro account. To create a snipt, I copy and paste my code into the form. Then, by playing with a number of formatting options, I can change the way my code is colored and syntax highlighted. Once I am happy with the look, I save and close the snippet. Then I just copy and paste the final snippet view into Google Docs, or make the snipt public, grab the JavaScript embed code, and paste it into the Text compose view in WordPress.

For example, here is some pretty, formatted Python code output rendered by JavaScript.

By default, all snippets in paid accounts are private. Both public and private posts can be tagged and given a title and an optional description. Public posts can be published directly with a default built-in template, allowing for very quick “code-focused blogging”. These posts have a comments section that integrates with Facebook, Twitter, Google and Disqus logins.

A code snippet “blog post” on Snipt

Although I don't use its blogging tools much, Snipt has made inserting syntax-colored code into my Google Docs and blog posts such as this one super easy!

If you spend any time on the shell command line, use “z”. It will save you tonnes of time!

I am not one for using superlatives to attract attention, but I cannot emphasize enough how amazing this shell add-in has been for my daily activities.

I do spend a lot of time on the command line changing between directories. A typical directory path of mine reads something like this:

cd /home/hari/data_processing/cro_data_4_14_23/p53_p212121/compound_id_4642222/final_procesing

Now imagine if you could just cd to that directory by matching against a keyword unique to it. I could just type

z 2222 TAB

And bing! It “cd’s” you into that directory!

THAT is the amazing power of “z”
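Under the hood, z keeps a ranked (“frecency”) list of the directories you visit and jumps to the best match for your keyword. A simplified sketch of that matching idea (illustrative only; `best_match` and the history dict are made up here, not z's actual code):

```python
# Simplified sketch of z-style matching: pick the highest-ranked
# visited directory whose path contains the keyword.
# (Illustrative only -- real z tracks "frecency" in a data file.)
def best_match(history, keyword):
    candidates = [(rank, path) for path, rank in history.items() if keyword in path]
    if not candidates:
        return None
    # max() compares tuples, so the highest rank wins
    return max(candidates)[1]

visited = {
    "/home/hari/data_processing/cro_data_4_14_23/p53_p212121/compound_id_4642222/final_procesing": 42,
    "/home/hari/projects/demo": 7,
}
print(best_match(visited, "2222"))
```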

This and a few other tips like it came from a talk on tools for web developers by Paul Irish.


Customizing your Django user admin with inline sections


I was using django-userena for user management on a Django app. Userena is a Django app that works with “django.contrib.auth.user” to provide additional features such as account activation, password resets over email, and messaging, in addition to sign-ins and sign-ups.

With Userena, the Django auth admin site had a number of “inline” management sections that incorporated the information from the Userena app relevant to the particular user.

The sections were:
Personal Info
Important Dates
Userena Registrations

I wanted to add my custom profile to the bottom of that list. My custom profile is the one you set as the “AUTH_PROFILE_MODULE”:

AUTH_PROFILE_MODULE = “accounts.MyCustomProfile”

How does one do that?

Basically, you want to extend the “custom” admin interface. The relevant Django doc for all things ModelAdmin is located here.

The model I want to add in is:

from django.db import models
from django.contrib.auth.models import User
from django.utils.translation import ugettext as _
from userena.models import UserenaBaseProfile


class MyCustomProfile(UserenaBaseProfile):
    """The Userena user profile."""
    user = models.OneToOneField(User,
                                unique=True,
                                verbose_name=_('user'),
                                related_name='my_profile')
    bio = models.CharField(max_length=1000, default="", blank=True)
    # Not editable because it controls access to a uid from a legacy db
    my_legacy_user_object = models.SmallIntegerField(null=True, editable=False, unique=True)

    def __unicode__(self):
        return "Profile of %s" % self.user.username
To accomplish the inline display of this model alongside the sections Userena already put into the admin page for the user, I added this as the admin.py for the accounts app:

__author__ = 'hari'
from django.contrib import admin
from django.contrib.auth.models import User
from userena.admin import UserenaAdmin, UserenaSignupInline
from .models import MyCustomProfile


class MyCustomProfileAdmin(admin.ModelAdmin):
    list_display = ["user", "bio", "my_legacy_user_object"]
    search_fields = ["user__first_name", "user__last_name", "my_legacy_user_object"]

admin.site.register(MyCustomProfile, MyCustomProfileAdmin)


class MyCustomProfileAdminInline(admin.StackedInline):
    model = MyCustomProfile


class MyCustomProfileAddedAdmin(UserenaAdmin):
    inlines = [UserenaSignupInline, MyCustomProfileAdminInline]

# Re-register the User admin so it picks up the extra inline section
admin.site.unregister(User)
admin.site.register(User, MyCustomProfileAddedAdmin)

The result is the custom profile section nicely inlined at the bottom of the user admin page.

Getting Solr 4.0 running on Amazon EC2 for django-haystack

After almost two years, this is my attempt at getting back to blogging. These instructions are not too detailed, but I spent some time trying to get Solr 4.0 set up to serve my django-haystack search results and figured I would document the two hurdles I faced.

I have been running apache-solr for my django-haystack search on my home Ubuntu Linux box. The whole setup was working great, but the requirement to keep the box powered on all day, coupled with its noisy fans, made me decide to switch to hosting my Solr instance on the cloud.
After much “googling” I couldn't find anything warning against running Apache Solr on an Amazon micro instance, so I decided to give it a try since it was going to be a dev instance anyway.

What I wanted to do: get a full Solr 4.0 setup that would index my Django dev database and serve up search results. I am using a t1.micro instance on the Amazon EC2 cloud.
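On the Django side, haystack needs to be pointed at the remote Solr instance. A minimal settings sketch (the EC2 hostname is a placeholder, and this assumes haystack 2.x-style HAYSTACK_CONNECTIONS; haystack 1.x used HAYSTACK_SOLR_URL instead):

```python
# settings.py fragment: point django-haystack at the Solr core on EC2.
# The hostname below is a placeholder for your instance's public DNS name.
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.solr_backend.SolrEngine',
        'URL': 'http://ec2-xx-xx-xx-xx.compute-1.amazonaws.com:8983/solr/collection1',
    },
}
```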

Step 0: Get a new Amazon micro instance. I am using an Ubuntu 12.04 LTS 64-bit instance. I got all the required packages and Java on there and installed tomcat6 (sorry, this step is deficient in details).

Step 1: Get and install apache-solr. I used the 4.0 release (dubbed SolrCloud), which is quite different from earlier Solr releases. From what I understand, Solr 4.0 has better support for distributed indexes.

tar -zxvf apache-solr-4.0.0.tgz
cd apache-solr-4.0.0

Step 2: Get the schema.xml from Django to play nice with the new Solr 4.0.
Solr 4.0 has changed how it does things a little. While start.jar is in the same place, the conf directory and schema.xml now live in a few places, since Solr has split up the data blocks into collection directories. For now I decided to co-opt the collection1 directory to serve up my Django index.

The schema.xml from Django (generated by running “python manage.py build_solr_schema”) now needs to be placed into the example/solr/collection1/conf directory:

cp schema.xml $HOME/apache-solr-4.0.0/example/solr/collection1/conf

Edit this schema.xml to add the reserved _version_ field name to the fields section. Since Solr's update log is “ON” by default, the _version_ field is required in the configuration. Alternatively, you could turn it “OFF” by editing solrconfig.xml. I chose to leave that setting as it is and instead added the now-required field name to schema.xml. Look for the “fields” block in the Django-generated schema.xml and add the line shown below anywhere inside it. Here is what mine looks like after I added the _version_ field:

<!-- general -->
<field name="id" type="string" indexed="true" stored="true" multiValued="false" required="true"/>
<field name="django_ct" type="string" indexed="true" stored="true" multiValued="false"/>
<field name="django_id" type="string" indexed="true" stored="true" multiValued="false"/>
<!-- added this field for solrcloud , to play friendly with updatelog -->
<field name="_version_" type="long" indexed="true" stored="true"/>
<dynamicField name="*_i"  type="int"    indexed="true"  stored="true"/>
<dynamicField name="*_s"  type="string"  indexed="true"  stored="true"/>
<dynamicField name="*_l"  type="long"   indexed="true"  stored="true"/>
<dynamicField name="*_t"  type="text_en"    indexed="true"  stored="true"/>

Once this was done, I could start my Solr Jetty server and then ask Django to rebuild the index.

Step 3: Start the Solr Jetty server. In directory apache-solr-4.0.0/example:
java -jar start.jar

Step 4: Rebuild the index:

python manage.py rebuild_index

And everything is up and running.
Edit: I had some trouble setting up start and stop scripts with Ubuntu that would start and stop the Solr process on boot and add it to the default run level. Finally I followed the clear instructions at this blog entry, and the script described there worked just great.

Boto: Super easy Python library to interact with Amazon Web Services

I had first heard of Mitch Garnaat's boto thanks to the “Monster Muck Mashup”, where he used boto to build a video transcoding service using Amazon's compute and storage clouds, EC2 and S3. Being a relative Python newbie then, I remember reading the code examples and not entirely understanding what they were doing.

After a long period of dormancy, I decided to resurrect and clean up my S3 buckets. Stupid me! I had turned on logging for the code-itch bucket ever so long ago. The result: tens of thousands of log entries. Every S3 browser I tried struggled to display and scroll through the massive number of files. I needed a script, and quick.

Initially I considered JetS3t, a Java-based super-library for all things AWS. But after reading an answer to this question on Stack Overflow, I decided to try boto again. This time around, I found the syntax very easy to comprehend. Within minutes I had a script that worked. The script read like pseudocode. I will reproduce it here.

I will definitely be delving more into boto to explore amazons many offerings.

Mitch Garnaat also writes an excellent blog on all things AWS and boto at elastician.

Boto library home on googlecode

import boto
import re

# Regexp to match the S3 access-log file names
log_key = re.compile(r"code-itch\.[\d]{4}\-[\d]{1,2}\-[\d]{1,2}\-[\d]{1,2}\-[\d]{1,2}\-[\d]{1,2}\-[a-fA-F\d]{16}")

if __name__ == '__main__':
    f = open("listbuckets_code-itch_deleted.txt", "w")
    s3 = boto.connect_s3()
    mybucket = s3.get_bucket("code-itch")
    for key in mybucket:
        if log_key.match(key.name):
            # Record the name, delete the log object, and report
            f.write(key.name + "\n")
            key.delete()
            print "Deleted", key.name
    f.close()