Getting at the nodes and attributes in an XML document
May 17th, 2007 by harijay
In the last screencast and post, we used a Python script and the minidom package to parse and print the XML file exported from backpackit.
Over the last year I have been gathering a lot of my experimental data in the form of “note entries” on several backpackit pages (one for each project). As a first task, I wanted to get a feeling for how many such notes there are.
So with a little help from Mark Pilgrim's excellent book, Dive Into Python, we can do the following two things:
First, obtain an array containing all the note nodes of the kind
<note title="Bt-TRK pH 7.0 uptake flux with Trypsin treated Bt-TRK 06/19/06" id="1005550" created_at="2006-08-22 22:38:31"
Then, take the created_at attribute of each note and write it to a text file for further processing.
So now for the code:
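from xml.dom import minidom

# Create XML object
xmldoc = minidom.parse('backpackit.xml')
# Getting all the note nodes is as simple as one call to getElementsByTagName
notelist = xmldoc.getElementsByTagName('note')
Num_notes = len(notelist)
print("I have " + str(Num_notes) + " notes total")
# Write the created_at value of each note on its own line
outfile = open('notelist.txt', 'w')
for i in range(Num_notes):
    date_elem = notelist[i].attributes["created_at"]
    outfile.write(date_elem.nodeValue + "\n")
outfile.close()
print("Wrote file notelist.txt")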
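If you want to try the same two steps without a backpackit export at hand, here is a minimal sketch that parses a small inline document instead. The sample XML, the wrapping notes element, and the helper name created_at_values are made up for illustration; only the note tag and the created_at attribute come from the real export. It also uses hasAttribute and getAttribute rather than the attributes mapping, so that a note without a created_at attribute is skipped instead of raising a KeyError.

```python
from xml.dom import minidom

# Hypothetical sample data: the <note> tag and created_at attribute follow
# the export shown in the post; the <notes> wrapper and values are invented.
SAMPLE_XML = """<notes>
  <note title="First note" id="1" created_at="2006-08-22 22:38:31">body one</note>
  <note title="Second note" id="2" created_at="2006-09-01 09:15:00">body two</note>
  <note title="No date" id="3">body three</note>
</notes>"""


def created_at_values(xmldoc):
    """Return the created_at value of every <note> element that has one."""
    values = []
    for note in xmldoc.getElementsByTagName("note"):
        # Skip notes without a timestamp; note.attributes["created_at"]
        # would raise KeyError for them.
        if note.hasAttribute("created_at"):
            values.append(note.getAttribute("created_at"))
    return values


xmldoc = minidom.parseString(SAMPLE_XML)
dates = created_at_values(xmldoc)
print("Found " + str(len(dates)) + " dated notes")
for date in dates:
    print(date)
```

Swapping minidom.parseString(SAMPLE_XML) for minidom.parse('backpackit.xml') should run the same helper against the real file, assuming the export uses note elements as shown above.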
[youtube=http://www.youtube.com/watch?v=yb6R87eFR44]
So now we have a file with all the created_at dates and times. What I really want to know is how busy I was over the last year, so I would like to chart how many posts I made per week for the whole year. Let's try to get at that information in the next codeitch project.
Refs: minidom Python doc