Python extension module tutorial

The hippo Python extension module is designed to be used interactively or via Python scripts.

Thus the interface is somewhat different from the C++ interface to the HippoDraw library.

Using HippoDraw interactively can be as simple as two lines of Python code. Below is an example session.

> python
Python 2.4 (#2, Apr 15 2005, 17:09:59)
[GCC 3.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import hippo
>>> app = hippo.HDApp()
>>>

Obviously, even typing these two lines for every session can get boring. Instead, one can put the commands in a file and use that as an initialization step for the Python session. For example, the file canvas.py in the testsuite directory contains

import hippo
app = hippo.HDApp()
canvas = app.canvas()

where we also show how to get a handle on the current canvas window. One can run this script from a UNIX shell or Windows command prompt like this

> python -i canvas.py
>>>

This launches the complete HippoDraw application in a separate thread with the ability to interact with it from the Python shell.

Getting help and documentation

Python has an interactive help system. To get all the help for the hippo module, just do the following...

> python
Python 2.4 (#2, Apr 15 2005, 17:09:59)
[GCC 3.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import hippo
>>> help ( hippo )

This gives you all the built-in documentation in a pager, like the UNIX more or less command, which is not always convenient. However, one can get the documentation on one class by ...

> python
Python 2.4 (#2, Apr 15 2005, 17:09:59)
[GCC 3.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import hippo
>>> help ( hippo.HDApp )

Or even one member function like this ...

> python
Python 2.4 (#2, Apr 15 2005, 17:09:59)
[GCC 3.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import hippo
>>> help ( hippo.HDApp.canvas )

Another way to access the same information is to use the pydoc program that came with your Python installation (except under Windows).

> pydoc hippo

This also gives you all the built-in help in a pager. Probably the most convenient method is to generate an HTML version of the documentation. You do this by typing ...

> pydoc -w hippo

and a hippo.html file is created in your working directory. Here is what it looks like. Not very pretty, but it is the standard output from pydoc. You can use this link as your documentation for HippoDraw. However, it is updated for each release, so it may not match an older version that you are using.
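The same built-in documentation can also be captured as a string from within Python, which makes it easy to search or save. The sketch below uses the render_doc function of the standard pydoc module. It targets the standard random module as a stand-in, since hippo may not be importable where you try this, and it assumes a Python 3 interpreter, because the renderer argument is not available in the Python 2.4 session shown above.

```python
import pydoc
import random

# Stand-in target: replace random.gauss with hippo.HDApp.canvas (or any
# other object) once the hippo module is importable.
text = pydoc.render_doc(random.gauss, renderer=pydoc.plaintext)

# The result is an ordinary string, so it can be searched or saved.
first_line = text.splitlines()[0]
print(first_line)
```

The plaintext renderer is chosen so that the string contains no terminal formatting characters.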

The following sections show and explain some example scripts. See also Examples of HippoDraw use with Python for more examples.

Creating the NTuple in Python

One might generate some data in Python that one wants to display with HippoDraw. For example, you could generate a Python list of random numbers with a Gaussian distribution like this

>>> import random
>>> x = []
>>> for i in range ( 10000 ) :
...     x.append ( random.gauss ( 45, 10 ) )
...
>>>
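Before handing such a list to a display, it can be worth confirming that the sample has the expected mean and spread. This is a minimal sketch in plain Python. The fixed seed is an assumption of the sketch, used only to make the run repeatable.

```python
import math
import random

random.seed(12345)  # fixed seed, only so the sketch is repeatable

# Same data as the loop above, written as a list comprehension.
x = [random.gauss(45, 10) for _ in range(10000)]

# Sample mean and sample standard deviation.
mean = sum(x) / len(x)
variance = sum((v - mean) ** 2 for v in x) / (len(x) - 1)
sigma = math.sqrt(variance)

print("mean = %.2f, sigma = %.2f" % (mean, sigma))
```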

To display the data as a histogram, one can then type

>>> from hippo import Display
>>> hist = Display ( 'Histogram', ( x, ), ('Gaussian', ) )
>>> canvas.addDisplay ( hist )
>>>

The first argument to the Display function specifies the type of display to create. The second is a Python tuple of Python list objects that will be used by the display. The third argument is a Python tuple of string labels for the lists.

You can now modify the plot. For example, the width of the bins can be changed in two ways. From the Python shell, one can invoke a member function of the histogram object like this...

>>> hist.setBinWidth ( 'x', 2 )
>>>

But it is much easier to use the Axis inspector and change it with the slider or text field.

The Display function created a DataSource called a ListTuple. It holds references to the Python list objects as columns. The lists are not copied, just referenced. It also holds the labels of each column. Displays don't redraw themselves unless they know there's been a change, such as a change of the bin width. Should the contents of your Python list change, the Display wouldn't know about it. However, you can force the display to redraw like this...

>>> hist.update()
>>>
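The consequence of holding references rather than copies can be seen without HippoDraw at all. In the sketch below, ColumnHolder is a hypothetical stand-in and not part of the hippo module. It keeps a reference to a list in the way the text describes for a ListTuple, while a copy made with list() plays the role of a data source that copies its input.

```python
class ColumnHolder(object):
    """Hypothetical stand-in that stores a reference to a list."""

    def __init__(self, label, values):
        self.label = label
        self.values = values        # reference, not a copy


data = [1.0, 2.0, 3.0]
by_reference = ColumnHolder('Gaussian', data)
by_copy = ColumnHolder('Gaussian', list(data))   # independent copy

# Changing the original list in place is visible through the
# reference but not through the copy.
data.append(4.0)

print(len(by_reference.values))   # 4
print(len(by_copy.values))        # 3
```

Nothing notifies the holder that the list grew, which is why an explicit hist.update() is needed after the data changes.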

The Python tuple of strings provides the data source labels, but it also gives the bindings of the display to the data source. Some displays have bindings that are optional. For example, an "XY Plot" display has bindings for the X and Y axes and, optionally, for an error on X or Y. To say which optional bindings are not to be used, the "nil" column label is used. Then we can do the following

"""
   Demonstrates making simple XY plot.  

   author Paul F. Kunz <Paul_Kunz@slac.stanford.edu>
"""
#
# load the HippoDraw module
#
from load_hippo import app, canvas

from hippo import Display

# Create list of data
energy = [90.74, 91.06, 91.43, 91.50, 92.16, 92.22, 92.96, 89.24, 89.98, 90.35]
sigma  = [ 29.0, 30.0,  28.40, 28.80, 21.95, 22.90, 13.50,  4.50, 10.80, 24.20]
errors = [  5.9,  3.15,  3.0,   5.8,  7.9,   3.1,   4.6,    3.5,   4.6,   3.6]

# make a plot to test it.
xy = Display ( "XY Plot", [ energy, sigma, errors ],
               ['Energy', 'Sigma', 'nil', 'error' ] )

canvas.addDisplay ( xy )

xy.setTitle ( 'Mark II Z0 scan' )

print "An XY plot is now displayed.   You can use the Inspector dialog"
print "to modify the appearance or fit a function to it."

The "nil" string can also be used by the Data inspector. Note that in this example we used a list of lists instead of a tuple of lists. Either can be used.
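To make the role of "nil" concrete, the sketch below pairs binding names with column labels and drops the unused ones. The binding names and their order (X, Y, X error, Y error) are assumptions inferred from the example above. The function is illustrative and not part of the hippo module.

```python
# Assumed binding order for an "XY Plot", inferred from the example.
XY_BINDINGS = ('X', 'Y', 'X error', 'Y error')


def used_bindings(labels, bindings=XY_BINDINGS):
    """Return {binding: label} for every label that is not 'nil'."""
    return dict((b, l) for b, l in zip(bindings, labels) if l != 'nil')


bound = used_bindings(['Energy', 'Sigma', 'nil', 'error'])
for name in XY_BINDINGS:
    print("%-8s -> %s" % (name, bound.get(name, '(unused)')))
```

Under these assumptions, the 'error' column ends up bound to the Y error and the X error is left unused.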

Speaking of the Data Inspector, sometimes it is more convenient to give HippoDraw all the data you might want to use for displays, and use the Data Inspector to create them. To do this, one creates a DataSource manually. There are three kinds supported: ListTuple, NTuple, and NumArrayTuple. They share a common interface and differ in how they store the column data. As we've seen, the ListTuple stores references to Python list objects. The NTuple makes copies of Python list objects and stores them internally as C++ vectors of doubles. The NumArrayTuple stores references to rank 1 numarray objects. The NTuple has the feature that you can add and replace rows or columns.

Creating displays with the Data Inspector doesn't preclude one from also creating them with Python. The interface is similar to what we've already seen. For example

>>> energy = [90.74, 91.06, 91.43, 91.5, 92.16, 92.22, 92.96, 89.24, 89.98 ]
>>> sigma  = [ 29.0, 30.0,  28.40, 28.8, 21.95, 22.9,  13.5,   4.5,  10.8 ]
>>> errors = [  5.9,  3.15,  3.0,   5.8,  7.9,   3.1,   4.6,   3.5,   4.6,]
>>> from hippo import Display, NTuple, NTupleController
>>> ntuple = NTuple () # an empty NTuple
>>> ntc = NTupleController.instance ()
>>> ntc.registerNTuple ( ntuple )
>>> ntuple.addColumn ( 'Energy', energy )
>>> ntuple.addColumn ( 'Sigma', sigma )
>>> ntuple.addColumn ( 'error', errors )

>>> xy = Display ( "XY Plot", ntuple,  ('Energy', 'Sigma', 'nil', 'error' ) )
>>> canvas.addDisplay ( xy )
>>>

Registering the ntuple with the NTupleController is necessary in order for the Data Inspector to know of its existence.

Getting data from a file

An NTuple data source can also be created by reading a plain text file. See ASCII file for the details. The example file, histogram.py, in the testsuite directory shows how to read a file and create displays from Python. It contains ...

""" -*- mode: python -*-

This script tests the creation and modification of Histogram along
with some test of exception handling.

Copyright (C) 2001, 2003, 2004 The Board of Trustees of The Leland
Stanford Junior University. All Rights Reserved.

@author Paul F. Kunz <Paul_Kunz@slac.stanford.edu>

$Id: histogram.py.in,v 1.26 2006/02/15 19:29:56 pfkeb Exp $

"""
import setPath
from hippo import HDApp
app = HDApp.instance()
canvas = app.canvas()

from hippo import NTupleController
ntc = NTupleController.instance()

nt1 = ntc.createNTuple ( '../examples/aptuple.tnt' )

from hippo import Display

hist = Display ("Histogram", nt1, ('Cost',) )
canvas.addDisplay( hist )
hist.setAspectRatio ( 1.5 )

hist.setRange ( 'x', 0., 30000. )

After reading a HippoDraw-compatible data source file, the full script creates two displays, setting the range on the first and the bin width on the second; the excerpt above shows only the first. The results of running the script are shown below.

hist_2.png
Result of using histogram.py

The Display class is actually a small wrapper around the internal HippoDraw C++ library class. It is needed because Qt is running in a separate thread from Python. Since creating a display and perhaps modifying it requires interaction with Qt's event loop, the application must be locked before calling a member function of the actual HippoDraw class and then unlocked when returning.

Using the hippoplotter interface

Making use of Python's default parameter values in function calls, Jim Chiang has extended the HippoDraw interface with his hippoplotter.py module.

The file, pl_exp_test.py, in the testsuite directory shows an example of using this module.

""" -*- mode: python -*-

   Testing the PowerLaw and Exponential classes and
   exercising the hippoplotter.py wrapper.

   author: James Chiang <jchiang@slac.stanford.edu>
   
"""
#
# $Id: pl_exp_test.py.in,v 1.12 2005/04/08 02:08:42 jchiang Exp $
#
# Author: J. Chiang <jchiang@slac.stanford.edu>
#

from setPath import *

import random, math

import hippoplotter as plot

#
# Generate power-law and exponential distributions
#
nt1 = plot.newNTuple( ([], ), ("x", ) )
pl_display = plot.Histogram(nt1, "x", xlog=1, ylog=1,
                               title="power-law dist.")

nt2 = plot.newNTuple( ([], ), ("x", ) )
exp_display = plot.Histogram(nt2, "x", ylog=1, title="exponential dist.")

both = plot.newNTuple( ([], ), ("x",) )
combo_display =plot.Histogram(both, "x", ylog=1,
                                 title="power-law & exponential dists.")

shoot = random.uniform

#
# The parameters describing the distributions
#
x0 = 5.            # Characteristic scale for exponential

xmin = 1.          # Bounds for the power-law distribution
xmax = 100.
gamma = 2.1        # The power-law index

xpmax = math.pow(xmax, 1.-gamma)
xpmin = math.pow(xmin, 1.-gamma)

nsamp = 10000

print "Filling NTuple with data."

for i in range(nsamp):
    xi = shoot(0, 1)
    xx1 = math.pow( xi*(xpmax - xpmin) + xpmin, 1./(1.-gamma) )
    nt1.addRow( (xx1, ) )
    both.addRow( (xx1,) )
    
    xi = shoot(0, 1)
    xx2 = -x0*math.log(1. - xi)
    nt2.addRow( (xx2, ) )
    both.addRow( (xx2, ) )

#
# Fit these distributions
#
Function = plot.hippo.Function
powerlaw = Function( "PowerLaw", pl_display.getDataRep() )
powerlaw.addTo( pl_display )
powerlaw.fit()

exponential = Function( "Exponential", exp_display.getDataRep() )
exponential.addTo( exp_display )
exponential.fit()

#
# Do a fit to sum of functions.
#
pl = Function ( "PowerLaw", combo_display.getDataRep() )
pl.addTo( combo_display )

exp2 = Function ( "Exponential", combo_display.getDataRep() )
exp2.addTo( combo_display )

# Fit to linear sum
exp2.fit()

print "Demonstrated power law, exponential, and linear sum fitting"
print ""

The above script leads to the canvas shown below

hist_exp.png
Results of pl_exp_test.py script
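The two sampling formulas in the loop of the script are instances of inverse transform sampling: a uniform random number in [0, 1) is pushed through the inverse of the cumulative distribution. The sketch below isolates them as plain functions, independent of HippoDraw. The function names and the fixed seed are choices made for this sketch. The parameter values are the ones used in the script.

```python
import math
import random


def sample_power_law(u, xmin, xmax, gamma):
    """Map uniform u in [0, 1) to x in [xmin, xmax], density ~ x**-gamma."""
    xpmin = math.pow(xmin, 1. - gamma)
    xpmax = math.pow(xmax, 1. - gamma)
    return math.pow(u * (xpmax - xpmin) + xpmin, 1. / (1. - gamma))


def sample_exponential(u, x0):
    """Map uniform u in [0, 1) to x >= 0, density ~ exp(-x/x0)."""
    return -x0 * math.log(1. - u)


random.seed(1)  # fixed seed, only so the sketch is repeatable
pl = [sample_power_law(random.random(), 1., 100., 2.1) for _ in range(10000)]
ex = [sample_exponential(random.random(), 5.) for _ in range(10000)]

print("power law range: %.3f to %.3f" % (min(pl), max(pl)))
print("exponential mean: %.3f" % (sum(ex) / len(ex)))
```

The power-law samples stay within the bounds xmin and xmax, and the mean of the exponential samples should come out close to the scale x0.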

Extracting data from a display

The interaction with HippoDraw from Python is not just in one direction. One can extract data from the displays and use them in Python. The file function_ntuple.py illustrates this...

""" -*- python -*-

This is a script to test adding functions and fitting. It only
works with the C Python interface. Jython interface needs Function
class in Java.

Copyright (C) 2002, 2003 The Board of Trustees of The Leland Stanford
Junior University. All Rights Reserved.

Author: Paul_Kunz@slac.stanford.edu

$Id: function_ntuple.py.in,v 1.4 2004/07/07 21:48:48 pfkeb Exp $nt

"""
import setPath
from load_hippo import app, canvas

from hippo import NTupleController
ntc = NTupleController.instance()

# Create NTuple with its controller so Inspector can see it.
nt1 = ntc.createNTuple ( '../examples/aptuple.tnt' )

from hippo import Display

hist = Display ( "Histogram", nt1, ("Cost", ) )
canvas.addDisplay( hist )
datarep1 = hist.getDataRep()

from hippo import Function

gauss = Function ( "Gaussian", datarep1 )
# hist.addDataRep ( gauss )
gauss.addTo ( hist )

print "Before fitting"
parmnames = gauss.parmNames ( )
print parmnames

parms = gauss.parameters ( )
print parms

# Now do the fitting.
gauss.fit ( )

print "After fitting"
parms = gauss.parameters ( )
print parms

gauss1 = Function ( "Gaussian", datarep1 )
gauss1.addTo ( hist )

# Do another fit, should fit to linear sum
gauss1.fit ()

result = hist.createNTuple ()
ntc.registerNTuple ( result )

coords = result.getColumn ( 'Cost' )
values = result.getColumn ( 'Density' )
res = []
for i in range ( result.rows ) :
    x = coords[i]
    diff = values[i] - gauss1.valueAt ( x )
    res.append ( diff )

result.addColumn ( 'residuals', res )
resplot=Display ( "XY Plot", result, ( 'Cost', 'residuals', 'nil', 'Error' ) )
canvas.addDisplay ( resplot )

Like the previous script, it fits two functions to a histogram. It also shows how to extract the function parameter names and their values. Near the end of the script, one extracts the contents of the histogram bins in the form of an NTuple. In the for loop at the end, one uses the NTuple to calculate the residuals between the function and the bin contents and puts them in a Python list. Then the list is added as a column to the NTuple. Finally, one creates an XY Plot to display them and adds it to the canvas. The result looks like this...

hist_resid.png
Results of function_ntuple.py
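The residual calculation itself is ordinary Python. The sketch below reproduces the loop with made-up bin centers and contents, and with a hand-written Gaussian in place of gauss1.valueAt. All numbers and the gaussian helper are illustrative assumptions, not output from HippoDraw.

```python
import math


def gaussian(x, norm, mean, sigma):
    """Plain Gaussian, standing in for a fitted function's valueAt()."""
    return norm * math.exp(-0.5 * ((x - mean) / sigma) ** 2)


# Made-up histogram: bin centers and bin contents.
coords = [1.0, 2.0, 3.0, 4.0, 5.0]
values = [1.5, 6.0, 10.5, 6.2, 1.2]

# Residual = bin content minus function value at the bin center.
res = [v - gaussian(x, 10.0, 3.0, 1.0) for x, v in zip(coords, values)]

for x, r in zip(coords, res):
    print("x = %.1f  residual = %+.3f" % (x, r))
```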

However, one didn't have to write this script to plot the residuals, as there is a control in the Function inspector that does it for you.

Using a FITS file

A FITS file can be used as input to HippoDraw. Here's how one can use it to view an image of the EGRET All-Sky survey from such a file

""" -*- mode: python -*-

   Displaying the EGRET All Sky survey

   author: James Chiang <jchiang@slac.stanford.edu>

"""
#
# $Id: egret.py.in,v 1.11 2003/09/29 18:12:50 jchiang Exp $
#

import sys
sys.path.reverse()
sys.path.append('../python')
sys.path.append('../python')
sys.path.reverse()

from setPath import *

import hippoplotter as plot

file = "../examples/EGRET_AllSky.fits"
plot.fitsImage(file, zlog=1, aspect=2)

The resulting canvas is shown below

canvas_egret.png
The EGRET All-Sky survey.

The FITS data format is a standard astronomical data format and is mandated by NASA for some projects. It supports images as well as binary or ASCII tables. A FITS table is essentially an NTuple with added information in the form of keyword-value pairs. James Chiang also wrote the following Python class to convert a FITS table to a HippoDraw NTuple.

"""
Read in a series of FITS table files and make them accessible as
numarrays, optionally creating a HippoDraw NTuple.

@author J. Chiang <jchiang@slac.stanford.edu>
"""
#
# $Id: FitsNTuple.py,v 1.13 2009/03/10 21:00:58 jchiang Exp $
#
import sys, pyfits
try:
    import numpy as num
except ImportError:
    import numarray as num

class FitsNTuple:
    def __init__(self, fitsfiles, extension=1):
        cat = num.concatenate
        #
        # If fitsfile is not a list or tuple of file names, assume
        # it's a single file name and put it into a single element
        # tuple.
        #
        if type(fitsfiles) != type([]) and type(fitsfiles) != type(()):
            fitsfiles = (fitsfiles, )
        #
        # Process each file named in the list or tuple.
        #
        columnData = {}
        for i, file in zip(xrange(sys.maxint), fitsfiles):
            #print "adding", file
            table = pyfits.open(file.strip(" "))
            if i == 0:
                self.names = table[extension].columns.names
            for name in self.names:
                if i == 0:
                    columnData[name] = table[extension].data.field(name)
                else:
                    columnData[name] = cat((columnData[name],
                                            table[extension].data.field(name)))
        #
        # Add these columns to the internal dictionary.
        #
        self.__dict__.update(columnData)

    def makeNTuple(self, name=None, useNumArray=1):
        import hippo
        if useNumArray:
            nt = hippo.NumArrayTuple()
        else:
            nt = hippo.NTuple()
        if name != None:
            nt.setTitle(name)
        ntc = hippo.NTupleController.instance()
        ntc.registerNTuple(nt)
        for name in self.names:
            if len(self.__dict__[name].shape) > 1: # have multicolumn variable
                columns = self.__dict__[name]
                columns.transpose()
                for i, col in enumerate(columns):
                    colname = "%s%i" % (name, i)
                    nt.addColumn(colname, col)
            else:
                try:
                    nt.addColumn(name, self.__dict__[name])
                except TypeError:
                    pass
        return nt
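The multicolumn branch of makeNTuple is meant to split a field whose rows each hold several values into one NTuple column per component, named by appending the component index to the field name. The sketch below shows the same idea with plain Python lists. The helper name, the field name "DIR" and the values are made up for illustration.

```python
def split_multicolumn(name, rows):
    """Turn rows of n values into n named columns: name0, name1, ..."""
    columns = zip(*rows)                      # transpose rows -> columns
    return [("%s%i" % (name, i), list(col)) for i, col in enumerate(columns)]


# Three rows, each holding a 2-component value.
rows = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0)]
split = split_multicolumn("DIR", rows)

for colname, col in split:
    print("%s: %s" % (colname, col))
```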

Using ROOT files

Another example is reading a ROOT file that has the form of an ntuple as defined in the RootNTuple class. The Python code might look like this...

""" -*- mode:python -*-

Demo of reading ROOT file with function, cuts, and calculation

author: Paul F. Kunz <Paul_Kunz@slac.stanford.edu>

"""

import setPath
from load_hippo import app, canvas

from hippo import RootController, Display
rc = RootController.instance()

filename = "/nfs/farm/g/glast/u33/InstrumentAnalysis/MC/EngineeringModel-v6r070329p28/Surface_muons/surface_muons_4M_merit.root"

ntuple_names = rc.getNTupleNames ( filename )
print "In this file, tree names are ", ntuple_names

ntuple = rc.createDataArray ( filename, ntuple_names[1] )
print "Number of columns = ", ntuple.columns

labels = ntuple.getLabels()
print "First ten column labels are ... ", labels[:10]

print "Number of rows = ", ntuple.rows

hist = Display ( "Histogram", ntuple, ('TkrEnergy', ) )
canvas.addDisplay ( hist )

hist.setLog ( 'y', True )

from hippo import Cut

hits_cut = Cut ( ntuple, ('TkrTotalHits',) )
canvas.addDisplay ( hits_cut )
hits_cut.setLog ( 'y', True )

hits_cut.addTarget ( hist )
hits_cut.setCutRange ( 4, 110, 'x' )

hist.setRange ( 'x', 40, 700 )

from hippo import Function
datarep = hist.getDataRep ()

exp1 = Function ( "Exponential", datarep )
exp1.addTo ( hist )

exp1.fit ()
pnames = exp1.parmNames ()
print pnames

parms = exp1.parameters ()
print parms

exp2 = Function ( "Exponential", datarep )
exp2.addTo ( hist )

exp1.fit() # always fit to linear sum

label = "Raw sum"
ntuple [ label ] = ntuple [ 'TkrEnergy' ] + ntuple [ 'CalEnergyRaw' ]

sum_hist = Display ( 'Histogram', ntuple, (label, ) )
canvas.addDisplay ( sum_hist )

sum_hist.setLog ( 'x', True )
sum_hist.setLog ( 'y', True )

merit_sum = Display ( 'Histogram', ntuple, ( 'EvtEnergyCorr', ) )
canvas.addDisplay ( merit_sum )

merit_sum.setLog ( 'x', True )
merit_sum.setLog ( 'y', True )

sum_hist.setRange ( 'x', 1.0, 1e+06 )

This script not only uses ROOT, but it also uses numarray. It converts a ROOT branch into a numarray array so it can do vector calculations. The ROOT C++ macro to do the equivalent of the above Python script would be considerably more complex.
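The line that fills the "Raw sum" column relies on array semantics: adding two columns adds them row by row. Plain Python lists behave differently, which is worth keeping in mind when mixing the two. This is a small illustration with made-up values and no HippoDraw or numarray dependency.

```python
tkr = [10.0, 20.0, 30.0]
cal = [1.0, 2.0, 3.0]

# With plain lists, + concatenates: the result has six elements.
concatenated = tkr + cal

# The element-wise sum that an array addition performs, spelled out.
raw_sum = [t + c for t, c in zip(tkr, cal)]

print(len(concatenated))   # 6
print(raw_sum)             # [11.0, 22.0, 33.0]
```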

Limitations

With this release, not all of HippoDraw's C++ library is exposed to Python. Although this could be done, it is not thought to be necessary. Rather, selected higher level components are exposed in one of two ways. Some classes are exposed directly, with a one-to-one relationship between the C++ member functions and Python member functions. An example is the NTuple class.

One can view the reference documentation for the hippo extension module with Python's online help command. One can also use the pydoc program to view it, or generate an HTML file with the command "pydoc -w hippo".

In order to be able to have an interactive Python session that interacts with the HippoDraw canvas items and at the same time have interaction with the same items from the Inspector, it was necessary to run the HippoDraw application object in a separate thread. Threading conflicts could then occur. Thus some of HippoDraw's C++ classes are exposed to Python via a thin wrapper class which locks the Qt application object before invoking an action and unlocks it when done.
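The lock-then-call pattern behind such a thin wrapper can be sketched in pure Python. Everything below is hypothetical: LockedProxy and Plot are stand-ins written for this illustration, and a threading.Lock takes the place of locking the Qt application object. This is not how the hippo module is implemented, only the general idea.

```python
import threading


class LockedProxy(object):
    """Hypothetical wrapper: hold a lock around every method call."""

    def __init__(self, target, lock):
        self._target = target
        self._lock = lock

    def __getattr__(self, name):
        # Only called for names not found on the proxy itself.
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def locked_call(*args, **kwargs):
            with self._lock:                    # lock before invoking
                return attr(*args, **kwargs)    # released on return
        return locked_call


class Plot(object):
    """Stand-in for a wrapped C++ display object."""

    def __init__(self):
        self.title = ''

    def setTitle(self, title):
        self.title = title


app_lock = threading.Lock()     # stand-in for the Qt application lock
plot = LockedProxy(Plot(), app_lock)
plot.setTitle('Mark II Z0 scan')
print(plot.title)
```

Using the with statement guarantees that the lock is released even if the wrapped call raises an exception.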

One good thing about Python is that, whatever you do, the interpreter reports an error rather than crashing. Thus, whatever you do with the HippoDraw extension module should not crash Python either. An interactive user, however, can easily mistype an argument to a function. For example, one could try to create a display with "ContourPlot" instead of "Contour Plot". For such errors, the C++ library throws a C++ exception. The HippoDraw extension module catches it and translates it to a Python exception. Thus, when the Python user makes an error, the user receives a message and the session does not crash.
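From the user's side, a translated error is handled like any other Python exception. The tutorial does not say which exception class the module raises, so the sketch below catches RuntimeError as an assumption. It also uses a stand-in make_display function instead of hippo's Display, so that it runs without HippoDraw installed.

```python
# Display type names mentioned in this tutorial.
KNOWN_TYPES = ('Histogram', 'XY Plot', 'Contour Plot')


def make_display(kind):
    """Stand-in for Display(): reject unknown display type names."""
    if kind not in KNOWN_TYPES:
        raise RuntimeError("No such display type: %s" % kind)
    return kind


def try_display(kind):
    """Return the display, or None after reporting the error."""
    try:
        return make_display(kind)
    except RuntimeError as err:     # assumed exception class
        print("Could not create display: %s" % err)
        return None


bad = try_display('ContourPlot')      # mistyped name: reported, no crash
good = try_display('Contour Plot')
```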

Another reason the wrapper classes exist is to try to present a more "interactive friendly" Python interface to the user than the raw C++ interface, which was designed for the application writer. With this release, it is not clear what a more "friendly" interface should look like. Maybe the Python extension module should be closer to the C++ interface and provide Python classes to wrap it in a more friendly way, like James Chiang has done. Feedback on this topic would be very welcome.

