CIS 192: Lecture 13 Scientific Computing and Unit Testing Lili Dworkin University of Pennsylvania
Scientific Computing
- Python is really popular in the scientific and statistical computing world
- Why? Python is slow!
- So we use libraries with routines written in C/C++
- We will look at Numpy/Scipy and Matplotlib/Pylab
- But first, how do we benchmark Python?
Cosine Similarity
- Goal: find how similar two vectors are
- One measure: compute the angle $\theta$ between them
- $\cos(0) = 1$, $\cos(\pi) = -1$
- Cosine similarity of two vectors $U$ and $V$: $\cos(\theta) = \frac{U \cdot V}{\|U\| \, \|V\|}$
Cosine Similarity

    import math

    def cosine_similarity(u, v):
        # Accumulate the dot product and both squared magnitudes in one pass.
        mag1, mag2, dot = 0.0, 0.0, 0.0
        for a, b in zip(u, v):
            dot += a * b
            mag1 += a ** 2
            mag2 += b ** 2
        return dot / (math.sqrt(mag1) * math.sqrt(mag2))
Timeit
Previously I used time.time(); don't do that. Instead:

    >>> import timeit
    >>> t = timeit.Timer("<statement to time>", "<setup code>")
    >>> t.timeit()

- The second argument is setup code, usually an import, that runs once to prepare the environment for the statement
- By default, timeit() calls the statement 1 million times and returns the total elapsed time in seconds
- timing.py
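As a rough sketch of how the Timer recipe above might be applied to the cosine_similarity function from the earlier slide: the test vectors and the repetition count below are arbitrary choices made for illustration, and the statement is passed as a callable rather than a string so that no setup code is needed.

```python
import math
import timeit

def cosine_similarity(u, v):
    """Pure-Python cosine similarity, as defined earlier in the lecture."""
    mag1, mag2, dot = 0.0, 0.0, 0.0
    for a, b in zip(u, v):
        dot += a * b
        mag1 += a ** 2
        mag2 += b ** 2
    return dot / (math.sqrt(mag1) * math.sqrt(mag2))

# Hypothetical test vectors; the sizes are arbitrary.
u = list(range(1, 101))
v = list(range(101, 201))

# Timer also accepts a callable in place of a statement string.
timer = timeit.Timer(lambda: cosine_similarity(u, v))

# number overrides the default of 1,000,000 repetitions so the run is quick.
elapsed = timer.timeit(number=1000)
print("1000 calls took %.4f seconds" % elapsed)
```

The printed time depends on the machine, so only relative comparisons between implementations are meaningful.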
Numpy/Scipy
- Basic operations: numpy_demo.py
- Doing things faster: numpy_timing.py
Numpy/Scipy
- Fast fixed-size integer and floating point types (e.g. numpy.int64, numpy.float64)
- Fast arrays (numpy.ndarray, created with numpy.array), which also represent matrices, and fast operations over every element
- Tons of functions and algorithms, including linear algebra, Fourier transforms, etc.
- Scipy depends on Numpy, and gathers a lot of high-level science and engineering modules together (integrate, linalg, optimize, stats...)
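To make "fast operations over every element" concrete, here is a minimal sketch of elementwise arithmetic and a vectorized cosine similarity. The function name cosine_similarity_np is made up for this example and is not taken from numpy_demo.py or numpy_timing.py.

```python
import numpy as np

def cosine_similarity_np(u, v):
    """Cosine similarity using NumPy's vectorized routines."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    # np.dot and np.linalg.norm loop in compiled code, not in Python.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

a = np.array([1.0, 2.0, 3.0])
doubled = a * 2      # elementwise: every entry is multiplied by 2
summed = a + a       # elementwise addition of two arrays

print(doubled, summed)
print(cosine_similarity_np([1, 0], [1, 1]))  # cosine of 45 degrees
```

No explicit Python loop appears anywhere; that is where the speedup over the pure-Python version would be expected to come from.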
Matplotlib/Pylab
- Plotting/charting library
- Good alternative to Excel
- Matplotlib is the whole package
- matplotlib.pyplot: non-interactive plotting (scripting)
- matplotlib.pylab: interactive calculations and plotting
- pylab_demo.py
Polya's Urn
Doctest
The doctest module searches for pieces of text that look like interactive Python sessions, and then executes those sessions to verify that they work exactly as shown.
Doctest

    >>> import doctest
    >>> options = doctest.IGNORE_EXCEPTION_DETAIL | doctest.NORMALIZE_WHITESPACE
    >>> doctest.testmod(optionflags=options)
Doctest
How do exceptions work?

    >>> factorial(30.1)
    Traceback (most recent call last):
        ...
    ValueError: n must be exact integer

If we specify doctest.IGNORE_EXCEPTION_DETAIL, everything to the right of the colon is ignored. Example in doctest_demo.py.
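A minimal sketch of how these pieces could fit together in one file follows. The body of factorial here is an assumption written for illustration; the actual contents of doctest_demo.py are not shown in these slides.

```python
import doctest

def factorial(n):
    """Return n! for a non-negative integer n.

    >>> factorial(5)
    120
    >>> factorial(30.1)
    Traceback (most recent call last):
        ...
    ValueError: n must be exact integer
    """
    # Illustrative implementation, not the one from doctest_demo.py.
    if n != int(n):
        raise ValueError("n must be exact integer")
    result = 1
    for k in range(2, int(n) + 1):
        result *= k
    return result

if __name__ == "__main__":
    options = doctest.IGNORE_EXCEPTION_DETAIL | doctest.NORMALIZE_WHITESPACE
    # testmod() runs every example found in this module's docstrings
    # and returns a summary of how many were attempted and how many failed.
    print(doctest.testmod(optionflags=options))
```

Because IGNORE_EXCEPTION_DETAIL is set, the second example would still pass if the message after "ValueError:" were reworded.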
Unit Testing
- Well-written programs can be broken up into units
  - Functions, methods
  - Classes
  - Modules
  - Packages
- Unit testing aims to test the functionality of all the units in your program
Unittest
- The unittest module makes it relatively easy to write a suite of unit tests for your programs
- Modeled after JUnit, a Java unit testing framework
- Caveat: there are many ways to use it; we'll just look at one approach
Random
Python's random module:

    >>> import random
    >>> random.random()
    0.41872335193283494
    >>> random.randint(10, 20)
    18
    >>> seq = [2, 4, 6, 8, 10]
    >>> random.shuffle(seq)
    >>> seq
    [8, 4, 10, 2, 6]
    >>> random.choice(seq)
    4
    >>> random.sample(seq, 3)
    [4, 2, 6]
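Unittest

    import unittest
    import random

    class TestSequenceFunctions(unittest.TestCase):
        def setUp(self):
            # list() so the sequence can be shuffled in place
            self.seq = list(range(10))

        def test_shuffle(self):
            pass  # body left blank on the slide

        def test_choice(self):
            pass  # body left blank on the slide

        def test_sample(self):
            pass  # body left blank on the slide

    if __name__ == '__main__':
        unittest.main()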
Unittest
- Inherit from unittest.TestCase, a base class for test cases
- Here we used it to define multiple test cases at once
- The setUp method is called before each test case is run
- Any method whose name starts with test defines a test case
- unittest.main() runs all test cases
Unittest
How do we write test cases? Use TestCase.assert* methods:
- self.assertEqual()
- self.assertTrue()
- self.assertRaises()
Let's practice in testing.py.
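One possible way to fill in the three test methods with these assert* calls is sketched below. The test bodies are an illustration, not the contents of testing.py, and the suite is run through an explicit TextTestRunner so the snippet behaves the same whether it is run as a script or imported; unittest.main() would find the same tests.

```python
import random
import unittest

class TestSequenceFunctions(unittest.TestCase):
    def setUp(self):
        # Runs before every test method, so each test gets a fresh list.
        self.seq = list(range(10))

    def test_shuffle(self):
        random.shuffle(self.seq)
        self.seq.sort()
        # Shuffling must not add or drop elements.
        self.assertEqual(self.seq, list(range(10)))

    def test_choice(self):
        element = random.choice(self.seq)
        self.assertTrue(element in self.seq)

    def test_sample(self):
        # Asking for more items than the population holds raises ValueError.
        self.assertRaises(ValueError, random.sample, self.seq, 20)
        for element in random.sample(self.seq, 5):
            self.assertTrue(element in self.seq)

# Collect every method whose name starts with "test" and run them.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSequenceFunctions)
result = unittest.TextTestRunner().run(suite)
```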
Digression
Each of the following functions takes a callable and a list of arguments to provide it:

    self.assertRaises(ValueError, random.sample, self.seq, 20)

and

    Thread(target=add, args=(5, 6))
Digression
Let's look at the headers:

    assertRaises(exception, callable, *args, **kwds)

vs.

    Thread(target=None, args=(), kwargs={})
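The two calling conventions can be compared side by side in a short sketch. The helper raises below is a hypothetical stand-in written for this example; it only imitates how assertRaises forwards *args and **kwargs and is not how unittest implements it.

```python
import random
import threading

def raises(exception, func, *args, **kwargs):
    """Hypothetical stand-in for assertRaises: forwards the extra
    arguments to func and reports whether `exception` was raised."""
    try:
        func(*args, **kwargs)
    except exception:
        return True
    return False

def add(a, b, results):
    # Threads do not return values, so store the sum in a shared list.
    results.append(a + b)

# assertRaises style: the arguments follow the callable inline.
sample_too_big = raises(ValueError, random.sample, list(range(10)), 20)

# Thread style: the arguments are packed into an explicit tuple.
results = []
worker = threading.Thread(target=add, args=(5, 6, results))
worker.start()
worker.join()

print(sample_too_big, results)
```

With *args, the caller writes the arguments as if calling the function directly. With an explicit args tuple, the remaining keyword names stay free for the wrapper's own options.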
Unittest
- unittest distinguishes between failures and errors
- Failure: an assertion did not hold
  - self.assertEqual(range(5), range(10))
- Error: something is wrong with the code
  - print(5 + "hi")