Profiling, debugging and testing with Python
Jonathan Bollback, Georg Rieckh and Jose Guzman
Overview

1.- Profiling
    Profiling: timeit
    Profiling: exercise
2.- Debugging
    Debugging: pdb
    Debugging: exercise
3.- Testing
    Simple case testing
    Unittesting: exercise
Documentation

... it is all about writing good code
1.- Profiling

Code optimization refers to the practice of reducing the processor time of an algorithm or function.

Two rules to write optimized code (in Python):
1. avoid for loops if possible
2. use existing routines or modules (e.g. the NumPy/SciPy libraries)

To evaluate where the code spends most of its time we need a profiler.
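As a minimal sketch of what a profiler reports, the standard library module cProfile can be run on a small workload. The functions loop_sum_of_squares, builtin_sum_of_squares and workload are hypothetical examples written for this illustration; they are not part of the course material.

```python
import cProfile
import io
import pstats


def loop_sum_of_squares(n):
    """Hypothetical workload: sum of squares with an explicit for loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def builtin_sum_of_squares(n):
    """Same result, delegating the accumulation to the built-in sum()."""
    return sum(i * i for i in range(n))


def workload():
    """Call both variants so that they show up in the same profile."""
    loop_sum_of_squares(200000)
    builtin_sum_of_squares(200000)


profiler = cProfile.Profile()
profiler.runcall(workload)          # run workload() under the profiler

report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(10)  # up to ten most expensive entries
print(report.getvalue())
```

The report lists, for every function, the number of calls and the time spent in it, which shows where an optimization effort would pay off.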
Profiling: timeit

Syntax: Timer("main statement", "setup statement")

timeit() takes the number of executions as argument and returns the total time (in seconds) needed to execute the main statement that many times.

    >>> from timeit import Timer
    >>> myprofiler = Timer("np.random.rand(1000000)", "import numpy as np")
    >>> myprofiler.timeit(2) # call main statement 2 times
    0.061825990676879883

repeat(n, m) calls timeit(m) n times, i.e. it executes the main statement m times in each of n rounds, and returns a list with the n timings.

    >>> myprofiler.repeat(3, 2) # call timeit(2) 3 times
    [0.065870046615600586, 0.061974048614501953, 0.061568975448608398]
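The same Timer interface can be used to compare two ways of writing one computation. The sketch below uses only the standard library; the statements and the input list data are assumptions chosen for illustration, and the timings will differ from machine to machine.

```python
from timeit import Timer

# Hypothetical input, created once by the setup statement
setup = "data = list(range(10000))"

loop_statement = """
total = 0
for value in data:
    total += value
"""
builtin_statement = "total = sum(data)"

# repeat(3, 100): three rounds, each executing the statement 100 times
loop_times = Timer(loop_statement, setup).repeat(3, 100)
builtin_times = Timer(builtin_statement, setup).repeat(3, 100)

print("for loop :", min(loop_times))
print("sum()    :", min(builtin_times))
```

Taking the minimum of the rounds is one common choice, because larger values are usually caused by other processes interfering with the measurement.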
Profiling: exercise

We will evaluate the following functions defined in matrices.py:
1. matrix_python0(n,p)
2. matrix_python1(n,p)
3. matrix_python2(n,p)
4. matrix_numpy(n,p)
5. matrix_numpy1(n,p)
6. matrix_c(n,p)

where n is the size of a square (n x n) matrix whose elements are one with probability p, and zero otherwise.

To profile:
- generate 10 matrices of size 1000 with probability 0.5 (n=1000, p=0.5)
- return the average of at least 20 profiles to have a representative measurement
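One possible way to organize the measurement is sketched below. Since matrices.py is not reproduced here, matrix_standin is a hypothetical pure-Python stand-in, and average_runtime with its parameter names is likewise an assumption made for this illustration.

```python
import random
from timeit import Timer


def matrix_standin(n, p):
    """Hypothetical stand-in for the functions in matrices.py:
    an n x n list of lists whose entries are 1 with probability p, else 0."""
    return [[1 if random.random() < p else 0 for _ in range(n)]
            for _ in range(n)]


def average_runtime(func, n, p, n_matrices=10, n_profiles=20):
    """Average time (in seconds) needed to generate n_matrices matrices,
    taken over n_profiles independent profiles."""
    timer = Timer(lambda: func(n, p))       # Timer also accepts a callable
    profiles = timer.repeat(n_profiles, n_matrices)
    return sum(profiles) / len(profiles)


# A small n keeps this sketch fast; the exercise uses n=1000, p=0.5.
print(average_runtime(matrix_standin, n=100, p=0.5))
```

Passing the function as an argument lets the same helper be reused for each of the six implementations.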
2.- Debugging

A bug is a failure in a routine that blocks the standard execution of a program.

A bug is not:
- An exception (i.e. the program does not work under specified conditions)
- A feature that is not implemented (i.e. the program does not cover all features)

Example: find 2+1 bugs in the code below

    def sphere_volume(radius):
        """ returns the volume of a sphere of radius given as argument """
        vol = (3/4)*PI*(raduis)**3
        return(vol)
Debugging: pdb

pdb is a command-line debugger. It opens an interactive shell that allows us to:
- examine and change the value of variables
- execute code line by line
- set up breakpoints
- examine call stacks

To run the Python debugger on a script we simply type in our shell:

    $ python -m pdb <filename>.py
pdb basic commands:

    quit [q]        quit debugging
    next [n]        execute the next statement
    print [p]       print the value of a variable
    list [l]        list the current code
    continue [c]    continue until the next breakpoint (e.g. a pdb.set_trace() statement)

Demo

We will run the pdb debugger on simple.py to check whether the variable val takes the right values.

    def parabola(x, c):
        """ returns the solution to f(x) = x^2 + c """
        return(x**2+c)

    if __name__ == '__main__':
        offset = 8
        for i in range(0,10,2):
            val = parabola(x=i, c=offset)
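The commands above are normally typed at the (Pdb) prompt. As a rough sketch of what such a session does, the debugger can also be fed its commands from a string, which makes the example reproducible without a keyboard. The scripted setup with io.StringIO is an assumption made for this illustration; in everyday use pdb is run interactively.

```python
import io
import pdb


def parabola(x, c):
    """Return the solution to f(x) = x^2 + c."""
    return x**2 + c


# Commands that would normally be typed at the (Pdb) prompt
commands = io.StringIO("p x\np c\ncontinue\n")
transcript = io.StringIO()

# readrc=False ignores any personal .pdbrc file; nosigint=True leaves
# the Ctrl-C handler untouched
debugger = pdb.Pdb(stdin=commands, stdout=transcript,
                   readrc=False, nosigint=True)

# runcall() stops on the first line of parabola and waits for commands
result = debugger.runcall(parabola, 4, 8)

print(transcript.getvalue())
print("result:", result)
```

The transcript contains the position where the debugger stopped followed by the values printed by the two p commands; continue then lets the function run to completion.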
Debugging: exercise

Use a debugger to find the cause of the following discrepancy in debugme.py:

    >>> from debugme import average1
    >>> mydata = range(10)
    >>> average1(mydata)
    0
    >>> sum(mydata)/10. # point in denominator to return a float
    4.5
3.- Testing

Test suites are a fundamental part of modern programming techniques. unittest is the standard Python testing library.

Unit testing is designed to avoid errors of the type shown below:

    from math import pi as PI

    def sphere_volume(radius):
        """ returns the volume of a sphere of radius given as argument
        volume(r) = 4/3*pi*r^3
        """
        vol = (3/4)*PI*(radius)**3
        return(vol)
Simple case testing

A basic schema for unit testing looks like this:

    import unittest

    class FirstTestCase(unittest.TestCase):
        """ first set of tests """
        def test_true(self):
            """ methods beginning with 'test' are executed """
            self.assertTrue(True)
            self.assertFalse(False)

    if __name__ == '__main__':
        unittest.main()
Simple case testing

A first implementation for testing that the volume of a sphere is a float would be...

    import unittest
    from volume import sphere_volume as v_sphere

    class ReturnValues(unittest.TestCase):
        def test_is_float(self):
            """ Test if return values are floats """
            sol = v_sphere(radius=1)
            self.assertTrue( type(sol) == float )

        def test_is_not_int(self):
            """ Test that return values are not integers """
            sol = v_sphere(radius=1)
            self.assertFalse( type(sol) == int )
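A self-contained variant of these type checks is sketched below. Because the volume module is not reproduced here, sphere_volume is a stand-in written for this illustration, and the class name ReturnTypes is likewise an assumption. The sketch uses assertIsInstance and assertNotIsInstance, which report the offending value when a check fails.

```python
import unittest
from math import pi


def sphere_volume(radius):
    """Stand-in for volume.sphere_volume: V(r) = 4/3 * pi * r^3."""
    return 4.0 / 3.0 * pi * radius**3


class ReturnTypes(unittest.TestCase):
    def test_is_float(self):
        """The return value should be a float."""
        self.assertIsInstance(sphere_volume(radius=1), float)

    def test_is_not_int(self):
        """The return value should not be an int."""
        self.assertNotIsInstance(sphere_volume(radius=1), int)


# Build and run the suite explicitly instead of calling unittest.main(),
# so the snippet also works when pasted into an interactive session.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ReturnTypes)
result = unittest.TextTestRunner(verbosity=2).run(suite)
print("all tests passed:", result.wasSuccessful())
```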
Unittesting cases

    TestCase method        Example that passes                        Example that fails
    assertTrue()           assertTrue( isinstance([1,2], list) )      assertTrue( 'Hi'.islower() )
    assertFalse()          assertFalse( isinstance([1,2], float) )    assertFalse( isinstance('hello', str) )
    assertEqual()          assertEqual( [2,3], [2,3] )                assertEqual( 3.14, 3.1416 )
    assertNotEqual()       assertNotEqual( 1.2, 1.21 )                assertNotEqual( 3.14, 3.14 )
    assertAlmostEqual()    assertAlmostEqual( 1.125, 1.12, 2 )        assertAlmostEqual( 1.125, 1.12, 3 )
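The third argument of assertAlmostEqual is the number of decimal places to which the difference of the two values is rounded before it is compared to zero. A small sketch of this behaviour follows; the name checker is a hypothetical helper used only here.

```python
import unittest

# A bare TestCase instance gives access to the assert methods
checker = unittest.TestCase()

# places=2: the difference rounds to 0.0, so the assertion passes silently
checker.assertAlmostEqual(1.125, 1.12, places=2)

# places=3: the difference rounds to 0.005, so the assertion fails
try:
    checker.assertAlmostEqual(1.125, 1.12, places=3)
    failed = False
except AssertionError as error:
    failed = True
    print("failed as expected:", error)
```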
Unittesting: exercise

Perform the testing of the function sphere_volume in volume.py. For that, you can test that it returns the correct value: for radius=2 units the volume of a sphere is 33.51 units^3.
Documentation

Beyond debugging techniques, one of the best practices to write good code is to provide good documentation.

To provide thoughtful documentation of your code:

1. Enter a docstring at the beginning of your file

    """
    models.py
    Date: Sat Sep 8 2012
    this script contains probability distribution functions...
    """
    import numpy as np

This allows the reader to have additional information about the file (titles tend to be rather unspecific).
2. Use docstrings in every new function/method

    def random(n, seed=None):
        """
        returns a NumPy array with random numbers

        Arguments:
        n    -- number of random numbers in the array
        seed -- the seed

        Example:
        >>> random(5, 100)
        array([ 0.54340494, 0.27836939, 0.42451759, 0.84477613, 0.00471886])
        """
        if seed is not None:
            np.random.seed(seed)
        return np.random.rand(n)

3. Comment any non-obvious piece of code

    np.random.rand(1000000) # we did not use a seed here

Your docstrings will be accessible via the help() function in Python. To create HTML documentation of your scripts use

    pydoc -w myscript
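To illustrate how a docstring becomes available to the reader, the sketch below documents a small function and retrieves the text programmatically. The function scale is a hypothetical example that avoids the NumPy dependency; it is not part of the course material.

```python
import inspect


def scale(values, factor=2):
    """Return a list with every element of values multiplied by factor.

    Arguments:
    values -- an iterable of numbers
    factor -- the multiplier (default 2)

    Example:
    >>> scale([1, 2, 3], 10)
    [10, 20, 30]
    """
    # Build a new list instead of modifying the input in place
    return [value * factor for value in values]


# help(scale) shows the same text in an interactive session;
# inspect.getdoc() returns it as a string with the indentation cleaned up.
print(inspect.getdoc(scale))
```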