Agile Development and Testing in Python Grig Gheorghiu and Titus Brown PyCon 2006, Feb. 23, Addison, TX
Introduction: agile concepts. What is "agile"? Rapid feedback; continuous integration; automated testing; collaboration / whole-team approach; small, frequent iterations. Testing pyramid: unit tests; functional/acceptance tests; UI tests.
Introduction: testing pyramid (figure courtesy of Jason "Selenium" Huggins).
Introduction (cont.): functionality is implemented as stories; a story is not done unless it is unit tested and acceptance tested; automated tests are a safety net that enables merciless refactoring; unit tests ("code-facing tests") should run fast; acceptance/functional tests ("customer tests", "business-facing tests") should run via the continuous integration process.
Introduction (cont.): "tracer bullet" development technique: like candle making; dip the wick in wax and you get a thin but functional candle; keep dipping until you get a fully formed one. Test-Enhanced Development (TED): write some code; write unit tests (a different person can do it); write more code; write more tests; etc.
Introduction (cont.): lessons learned: continuous integration and automated testing are essential; remote pair programming works really well, since each of the two is accountable to the other; whole-team approach to both development and testing; no code thrown over the wall to QA.
MailOnnaStick: our AUT (application under test). Three basic requirements: browse e-mail; search e-mail; comment on e-mail.
Technology base: CherryPy (Web framework); Durus (object database); Commentary (AJAX commenting system); jwzthreading (Netscape 4.x-like threading); py.lib (logging & HTML generation); quixote.html (HTML escaping code).
Implementation history (since Dec. 2005) Browse mailboxes via the Web. Add search functionality. Add commenting. Refactored to allow multiple mailboxes per mail source (e.g. Mail/ dir, IMAP). Next step: optimize mail indexing. After that: optimize database storage/retrieval. Throughout, test the bejeezus out of it.
Testing MOS is a challenge: accesses the network for outside mailboxes; has a Web GUI; Commentary uses AJAX/JavaScript.
Agile project management with Trac: wiki: allows easy editing/jotting down of ideas and encourages brainstorming; defect tracking system: simplifies keeping track of bugs, tasks and enhancements; source browser: makes it easy to see all the SVN changesets; roadmap: milestones that are tied into the tracker.
Agile project management with Trac (cont.): ability to link between bug/task tickets, changesets, source and wiki pages, since everything is treated as wiki text; timeline: a constantly updated, RSS-ified chronological view of all the important events in the project (code checkins, ticket changes, wiki changes, milestones).
Trac lessons learned: ease of wiki editing makes it a good way to jot down ideas; same with the ticketing system; great collaborative tool for doing Test-Enhanced Development; user stories / requirements can be specified as tickets of type 'enhancement' and 'task'.
Source code management with Subversion: centralized source code management system like CVS, but better in a number of ways; repository can be accessed via file system, SSH, Apache/DAV; integrates with Trac; standard layout: trunk (main body of code), branches (correspond to a specific release), tags (copy of the code corresponding to a specific release); email notifications via SvnReporter.
Unit testing: completely automated tests that should run quickly; test small, individual units of function; test functionality of a specific interface or function; DON'T test a full path through some code; traditionally require setup/teardown (e.g. load/reset database contents, or build mock objects).
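A minimal sketch of the setup/teardown pattern described above, using the standard library's unittest. The Mailbox class is a hypothetical stand-in invented for illustration; it is not MailOnnaStick code.

```python
import unittest


class Mailbox:
    """Hypothetical stand-in for an application object (not MailOnnaStick code)."""

    def __init__(self):
        self.messages = []

    def add(self, subject):
        self.messages.append(subject)

    def search(self, word):
        # Case-insensitive substring match on the subject line.
        return [s for s in self.messages if word.lower() in s.lower()]


class TestMailboxSearch(unittest.TestCase):
    def setUp(self):
        # Load known contents before every test.
        self.mbox = Mailbox()
        self.mbox.add("PyCon 2006 schedule")
        self.mbox.add("Lunch on Friday?")

    def tearDown(self):
        # Drop the fixture so tests cannot affect each other.
        self.mbox = None

    def test_search_finds_match(self):
        self.assertEqual(self.mbox.search("pycon"), ["PyCon 2006 schedule"])

    def test_search_no_match(self):
        self.assertEqual(self.mbox.search("dinner"), [])


def run_tests():
    """Run the two tests above and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMailboxSearch)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test exercises one small interface (search) rather than a full path through the application, and each gets a fresh fixture.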
Using nose to unit test: easy to build ad hoc unit test code; test frameworks like nose give you a structure within which to run, manipulate, display, and manage tests and results; nose is one of many Python unit test frameworks; author: Jason Pellerin; based on concepts from py.test; can be integrated into setup.py.
nose caveats: path handling is a bitch (where's my app code?!); because all code is executed within a single interpreter, you can have cross-module artifacts and inconsistencies, and isolating code properly can be difficult; stdout/stderr capture is annoying.
nose features and hacks: select a subset of tests to execute on the command line; hack the unit test display to time individual unit tests; option to AVOID slow unit tests.
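A rough sketch of the two hacks above (timing individual tests, skipping slow ones), written against the standard library's unittest rather than nose. TimingResult, run_timed, and the "slow" naming convention are assumptions made for illustration; they are not the nose hack the authors used.

```python
import time
import unittest


class TimingResult(unittest.TestResult):
    """Records the wall-clock duration of each test, keyed by test id."""

    def __init__(self):
        super().__init__()
        self.timings = {}
        self._started = None

    def startTest(self, test):
        super().startTest(test)
        self._started = time.perf_counter()

    def stopTest(self, test):
        self.timings[test.id()] = time.perf_counter() - self._started
        super().stopTest(test)


class ExampleTests(unittest.TestCase):
    def test_fast(self):
        self.assertEqual(1 + 1, 2)

    def test_slow(self):
        time.sleep(0.05)  # stands in for a genuinely slow test
        self.assertTrue(True)


def run_timed(case, skip_slow=False):
    """Run a TestCase class; optionally leave out tests with 'slow' in the name."""
    suite = unittest.TestSuite()
    for name in unittest.defaultTestLoader.getTestCaseNames(case):
        if skip_slow and "slow" in name:
            continue
        suite.addTest(case(name))
    result = TimingResult()
    suite.run(result)
    return result
```

Calling run_timed(ExampleTests).timings gives a per-test duration table; run_timed(ExampleTests, skip_slow=True) runs only the fast test.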
nose lessons learned: tests that are both fast and easy to run ( % nosetests ) are more likely to be used; 30 seconds was getting somewhat long. Unit tests serve as a kind of explicit documentation of functions/interfaces. In particular, this makes writing unit tests for other people's code a very worthwhile activity: you learn the code well.
Acceptance testing with Fit/FitNesse: FitNesse is a more user-friendly variant of Ward Cunningham's Fit framework; "business-facing" or customer-facing tests, as opposed to "code-facing" tests (i.e. unit tests); tests are expressed as stories ("storytests"); business domain language, higher level compared to unit tests; FitNesse tests make sure you "write the right code"; unit tests make sure you "write the code right".
Acceptance testing with Fit/FitNesse (cont.): wiki format encourages collaboration; Fit/FitNesse brings together business customers, developers and testers, and it forces them to focus on the core business rules of the application; James Shore: "Done right, FIT fades into the background".
Acceptance testing with Fit/FitNesse (cont.): PyFIT is a Python port of FIT/FitNesse; tests are written in tabular format, with inputs and expected outputs; fixtures are a thin layer of glue code that ties the test tables to the application.
Acceptance testing with Fit/FitNesse (cont.): ColumnFixture: similar to SQL select/insert of a row from/into a table; RowFixture: similar to SQL select from a table; declarative style (ColumnFixture) vs. procedural style (ActionFixture); tests can be executed from the command line; can be included in a smoke test run via a continuous integration tool such as buildbot.
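To make the table-plus-fixture idea concrete, here is a toy, self-contained sketch in the declarative (ColumnFixture-like) style. It does not use PyFIT and does not reproduce its API; SearchFixture, run_table and the message data are hypothetical. It borrows only the convention that a column header ending in "()" names a method whose result is checked against the cell.

```python
class SearchFixture:
    """Hypothetical fixture: thin glue between a test table and application code."""

    # Stand-in data for the application under test.
    MESSAGES = ["PyCon 2006 schedule", "Lunch on Friday?", "Re: PyCon talk"]

    def __init__(self):
        self.query = ""  # input column

    def match_count(self):  # calculated column, compared with the expected cell
        return sum(self.query.lower() in m.lower() for m in self.MESSAGES)


def run_table(fixture_cls, header, rows):
    """Run each row against a fresh fixture; return one pass/fail flag per row."""
    results = []
    for row in rows:
        fixture = fixture_cls()
        ok = True
        for name, cell in zip(header, row):
            if name.endswith("()"):
                # Output column: call the method and compare as text.
                actual = getattr(fixture, name[:-2])()
                ok = ok and str(actual) == cell
            else:
                # Input column: set the attribute on the fixture.
                setattr(fixture, name, cell)
        results.append(ok)
    return results


# The "table" a customer might write: inputs on the left, expectations on the right.
TABLE_HEADER = ["query", "match_count()"]
TABLE_ROWS = [["pycon", "2"], ["lunch", "1"], ["dinner", "0"]]
```

run_table(SearchFixture, TABLE_HEADER, TABLE_ROWS) returns a list of booleans, one per row, which a runner could colour green or red the way Fit does.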
Fit/FitNesse lessons learned: acceptance tests written with FitNesse can be seen as an alternative GUI into the business logic of the application; as such, they prove to be very resilient in the presence of UI changes; GUI tests are fragile and need frequent changes to keep up with changes in the UI.
Regression testing with TextTest: TextTest is a tool for "behavior-based" acceptance testing; it looks at the behavior of the AUT as expressed by log files, stdout and stderr; golden images of logs and stdout/stderr are compared with what is obtained at run time; documentation on the project's home page is plentiful, but lacks howtos.
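The golden-image comparison can be sketched with the standard library's difflib. This only illustrates the idea; it is not how TextTest is implemented, and compare_to_golden and the sample log lines are made up for the example.

```python
import difflib


def compare_to_golden(golden_text, actual_text):
    """Return a unified diff as a list of lines; an empty list means no change."""
    return list(difflib.unified_diff(
        golden_text.splitlines(),
        actual_text.splitlines(),
        fromfile="golden",
        tofile="actual",
        lineterm="",
    ))


# Hypothetical log output saved from a known-good run, and two later runs.
GOLDEN = "indexed 3 messages\nsearch: 2 hits\n"
SAME_RUN = "indexed 3 messages\nsearch: 2 hits\n"
CHANGED_RUN = "indexed 3 messages\nsearch: 1 hits\n"

unchanged_diff = compare_to_golden(GOLDEN, SAME_RUN)
changed_diff = compare_to_golden(GOLDEN, CHANGED_RUN)
```

A regression run passes when the diff is empty; otherwise the diff itself is the failure report.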
TextTest lessons learned: to get the best mileage out of it, plan your logging carefully, with different severity levels, and have different log files for different functionality areas of the app (a single log file tends to change too much and too rapidly).
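One possible way to lay out such logging with the standard logging module: one logger and one file per functional area, each with its own severity threshold. The area names, the "mos." prefix and make_area_logger are assumptions for illustration. The format deliberately omits timestamps so that the files stay stable enough to compare against golden copies.

```python
import logging
import os
import tempfile


def make_area_logger(area, log_dir, level=logging.INFO):
    """Create a logger that writes only to <log_dir>/<area>.log."""
    logger = logging.getLogger("mos." + area)  # "mos." prefix is hypothetical
    logger.setLevel(level)
    logger.propagate = False  # keep each area's messages in its own file
    handler = logging.FileHandler(os.path.join(log_dir, area + ".log"))
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


log_dir = tempfile.mkdtemp()
index_log = make_area_logger("index", log_dir)
search_log = make_area_logger("search", log_dir, level=logging.WARNING)

index_log.info("indexed 3 messages")
search_log.info("query 'pycon'")            # below WARNING, so not written
search_log.warning("slow query: 'pycon'")
```

A change in search behaviour then shows up only in search.log, leaving the golden copy of index.log untouched.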
twill functional Web tests: twill is an HTTP driver program for scripting Web sessions, i.e. it's a scriptable command-line browser. Titus is the primary author: inject grain o' salt. Layers a domain-specific language on top of a simple set of Python functions, e.g. go http://python.org/ becomes go("http://python.org/").
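The mapping from a script line to a Python call can be illustrated with a toy interpreter. This is not twill's parser or its command set: the go and find functions here are stand-ins that only record their arguments, and run_script is a hypothetical helper.

```python
import shlex

calls = []  # records what the stand-in commands were asked to do


def go(url):
    calls.append(("go", url))


def find(pattern):
    calls.append(("find", pattern))


# Command name -> plain Python function.
COMMANDS = {"go": go, "find": find}


def run_script(text):
    """Treat each non-blank, non-comment line as 'command arg arg ...'."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, *args = shlex.split(line)
        COMMANDS[name](*args)


run_script('''
# a two-line script
go http://python.org/
find "Python"
''')
```

Because every command is an ordinary function, the same operations are available both from scripts and from Python code.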
more twill: twill scripts are simple to write. twill tests are just twill scripts that shouldn't ever fail. Interactive browsing via the command line is pretty cool ;). Extending twill with Python is easy.
Automating twill tests: integrating twill into a unit test framework is great: completely automated functional testing. However, setup/teardown is annoying: must start a full HTTP server on some port; run tests; then kill the server. Nonetheless, should definitely do it as a smoke test. Other drawbacks: multiple processes/threads make coverage tracking and performance profiling difficult.
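A sketch of the start-server / run-tests / kill-server cycle using only the standard library. It makes several assumptions: the server runs in a background thread rather than a separate process, HelloHandler is a trivial stand-in for the real application, and http.client plays the role of the browser instead of twill.

```python
import http.client
import threading
import unittest
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Trivial stand-in for the application under test."""

    def do_GET(self):
        body = b"hello from the AUT"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet


class SmokeTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Port 0 asks the OS for any free port.
        cls.server = HTTPServer(("127.0.0.1", 0), HelloHandler)
        cls.port = cls.server.server_address[1]
        cls.thread = threading.Thread(target=cls.server.serve_forever, daemon=True)
        cls.thread.start()

    @classmethod
    def tearDownClass(cls):
        # "Kill" the server once all tests in the class have run.
        cls.server.shutdown()
        cls.server.server_close()
        cls.thread.join()

    def test_front_page(self):
        conn = http.client.HTTPConnection("127.0.0.1", self.port, timeout=5)
        try:
            conn.request("GET", "/")
            resp = conn.getresponse()
            self.assertEqual(resp.status, 200)
            self.assertEqual(resp.read(), b"hello from the AUT")
        finally:
            conn.close()


def run_smoke_tests():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SmokeTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Even in this small form the server lives in a second thread, which is the kind of arrangement that complicates coverage tracking and profiling.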
In-process twill testing: can use the wsgi_intercept module to reroute httplib (client) requests directly to a WSGI application object.
Standard HTTP client/server
In-process httplib -> WSGI interception
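In-process httplib -> WSGI interception:

    import twill

    def create_app():
        # wsgi_app is the application's WSGI callable, defined elsewhere
        return wsgi_app

    # requests to http://localhost:80/ now go straight to the in-process app
    twill.add_wsgi_intercept('localhost', 80, create_app)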
In-process twill testing lets us avoid the complexities of multi-threaded/multi-process apps. This also lets us analyze coverage of tests & profile the app. (Also works with all other Python Web testing frameworks.) N.b. long article on how to do this for CherryPy and Quixote at advogato.org.
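What "in-process" buys you can be seen by calling a WSGI application object directly, with no socket or server involved. The sketch below is not wsgi_intercept (which patches httplib so that unmodified clients are rerouted); wsgi_app and call_in_process are hypothetical names, and only wsgiref from the standard library is used.

```python
from wsgiref.util import setup_testing_defaults


def wsgi_app(environ, start_response):
    """Tiny stand-in WSGI application."""
    body = ("you asked for " + environ["PATH_INFO"]).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_in_process(app, path="/"):
    """Invoke a WSGI app in the current process; return (status, body)."""
    environ = {"PATH_INFO": path, "REQUEST_METHOD": "GET"}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_in_process(wsgi_app, "/inbox")
```

Since client and application share one interpreter and one thread, a tracer or profiler started in the test sees the application code as well.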
Simple coverage testing: determines which lines of code were executed, or covered, by a particular test; only free/OSS tool is coverage.py, by Gareth Rees and Ned Batchelder; uses sys.settrace to record which lines of code were executed.
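The sys.settrace mechanism can be shown in a few lines. This is a bare-bones illustration of the technique, not coverage.py's implementation; trace_lines and classify are made-up names, and line numbers are reported as offsets from the function's def line.

```python
import sys


def trace_lines(func, *args, **kwargs):
    """Run func and return the set of executed line offsets within it."""
    code = func.__code__
    executed = set()

    def tracer(frame, event, arg):
        if frame.f_code is code:
            if event == "line":
                executed.add(frame.f_lineno - code.co_firstlineno)
            return tracer  # keep receiving events for this frame
        return None  # do not trace any other function

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore the earlier trace function
    return executed


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"


lines_for_positive = trace_lines(classify, 5)
lines_for_negative = trace_lines(classify, -1)
```

Comparing the two sets shows that neither call alone covers both return statements, which is exactly the gap a coverage report points out.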
Coverage testing with twill (or really any functional) tests: can use coverage analysis to figure out which parts of your app aren't hit by other tests; usually quite easy to quickly run through that code in a new functional test.
Profiling: many profilers for Python; most are poorly documented. Personally, I like statprof, a statistical profiler developed by Andy Wingo.
Profiling with twill: using twill, you can automate a particular user path through your site; then, do profiling analysis to figure out where the bottlenecks are; lets you target specific user paths, which seems cool.
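A sketch of profiling one scripted path. Two substitutions are made here and should be read as assumptions: the standard library's deterministic cProfile stands in for statprof, and user_path is a plain function standing in for a twill script run against the application.

```python
import cProfile
import io
import pstats


def slow_step():
    # Stands in for an expensive request handler.
    return sum(i * i for i in range(200000))


def fast_step():
    return 42


def user_path():
    """Stand-in for one automated user path through the site."""
    slow_step()
    fast_step()


def profile(func):
    """Profile func and return a text report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
    return out.getvalue()


report = profile(user_path)
```

Printing report lists the most expensive calls first, so the bottleneck on this particular path (slow_step) stands out.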
twill lessons learned: twill scripts are easier to write quickly than Python code; possible to achieve high levels of coverage quite quickly by hacking together new twill functional tests with an eye on coverage; twill scripts can be re-used as automated unit tests, setup/teardown for other tests, and command-line tests of functionality.
Web UI testing with Selenium: functional/acceptance testing at the UI level for Web applications; uses an actual browser driven via JavaScript; unique features: client-side JavaScript testing (think AJAX) and browser compatibility testing (can be used cross-platform and cross-browser); the Selenium framework needs to be deployed on the same server that is running the AUT.
Web UI testing with Selenium (cont.): individual tests and test suites are written as HTML tables, similar to tests written in FIT/FitNesse; tests consist of actions (open, type, select, click) and assertions; Selenium IDE: record/playback tool, aims to be a full-fledged IDE, uses Mozilla chrome to get around JavaScript security; other useful tools: XPath Checker and XPather Firefox extensions.
Web UI testing with Selenium (cont.): "Driven mode": stand-alone Selenium server using Twisted + XML-RPC; reverse proxy gets around the JavaScript cross-site scripting security limitation; driven by scripts written in any language with XML-RPC bindings. Selenium tests can be added to the continuous integration process: setup/teardown via twill; post results for reporting purposes.
Selenium lessons learned: Selenium tests are fairly brittle (as all GUI tests are) in the presence of UI changes; it's no great fun to write the tests, but Selenium IDE helps a lot; Selenium is the only way known to mankind for testing AJAX.
Agile documentation with doctest and epydoc: doctest: "literate testing"/"executable documentation"; unit tests expressed as stories (big docstrings) that offer context; storytests: stories together with acceptance criteria that validate their implementation; many projects generate documentation from doctests with minimal processing code (Django, Zope3); epydoc makes it trivial to show the storytests.
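A small example of a docstring that reads as a story and doubles as a test, run with the standard doctest module. The function thread_subject and its behaviour are invented for illustration and are not taken from MailOnnaStick.

```python
import doctest


def thread_subject(subject):
    """Strip reply prefixes so that replies group with the original message.

    Story: a user browsing a mailbox expects replies to appear under the
    message they answer, whatever prefix the mail client added.

    >>> thread_subject("Re: PyCon talk")
    'PyCon talk'
    >>> thread_subject("RE: re: PyCon talk")
    'PyCon talk'
    >>> thread_subject("PyCon talk")
    'PyCon talk'
    """
    while subject.lower().startswith("re:"):
        subject = subject[3:].lstrip()
    return subject


def run_story():
    """Run the examples in the docstring; return doctest's (failed, attempted)."""
    test = doctest.DocTestParser().get_doctest(
        thread_subject.__doc__,
        {"thread_subject": thread_subject},  # names visible to the examples
        "thread_subject",
        None,
        0,
    )
    return doctest.DocTestRunner(verbose=False).run(test)
```

The same docstring is what a documentation generator would display, so the prose and the acceptance criteria cannot drift apart unnoticed.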
Agile documentation with doctest and epydoc (cont.): test list: set of unit tests for a given module; test map: set of unit test functions that exercise a given application function/method; easy to generate automatically and integrate with epydoc. Lessons learned: unit test duplication is not necessarily a bad thing; when writing "agile documentation", we discovered bugs that weren't caught by the initial unit tests.
Continuous integration process
Continuous integration with buildbot: continuous integration: automate time-consuming tasks; important for the immediate public feedback it gives you about changes you made; the more often you build and test your software, the quicker you will discover and fix bugs. buildbot: based on Twisted; pretty hard to configure, but works really well once you have it up and running; can be deployed behind Apache for access control purposes.
Continuous integration with buildbot (cont.): master process kicks off the build and test process on configured slaves (via scheduler or email notifications); easy to extend the master.cfg module with extensions for running various types of commands and tests; you get the most value out of the process if you have slaves running on as many of the OSes/Python versions/etc. as you plan to support.
Continuous integration with buildbot (cont.): run as many types of testing as possible with buildbot: unit tests; acceptance with FitNesse & TextTest; unit/functional with twill; UI with Selenium; coverage and profiling; egg creation and installation (this exercises setup.py!). The more aspects of the application you cover, the better protected you are against breakages.
Continuous integration with buildbot (cont.): lessons learned: running unit/acceptance tests as a separate user under a different environment will uncover all kinds of environment-specific issues: OS versions; Python versions; required package versions; hard-coded paths; log locations.
buildbot hacks. Goal: automate everything & stick it in Buildbot. Problem: Selenium tests run inside a real browser. Solution: VNC and Firefox.
buildbot hacks (cont.). Goal: be able to do effective post-mortems on failed tests. Problem: not all output belongs on stdout. Solution: record all output & dump in a build-specific directory.
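One way to record everything a test command produces into a per-build directory, sketched with subprocess. This is not buildbot configuration; run_and_record, the file names and the "build-42" directory are assumptions for illustration, and the child command is a one-line Python script standing in for a real test run.

```python
import os
import subprocess
import sys
import tempfile


def run_and_record(command, build_dir):
    """Run command; save stdout, stderr and exit code under build_dir."""
    os.makedirs(build_dir, exist_ok=True)
    proc = subprocess.run(command, capture_output=True, text=True)
    with open(os.path.join(build_dir, "stdout.txt"), "w") as f:
        f.write(proc.stdout)
    with open(os.path.join(build_dir, "stderr.txt"), "w") as f:
        f.write(proc.stderr)
    with open(os.path.join(build_dir, "exit_code.txt"), "w") as f:
        f.write(str(proc.returncode))
    return proc.returncode


# Stand-in for a failing test run: writes to both streams, then exits with 3.
CHILD = "import sys; print('ok'); print('warn', file=sys.stderr); sys.exit(3)"

build_dir = os.path.join(tempfile.mkdtemp(), "build-42")
exit_code = run_and_record([sys.executable, "-c", CHILD], build_dir)
```

After a failed build, everything needed for the post-mortem sits in one directory named after that build.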
Conclusions: holistic testing: test the application at all levels (no one test type does it all): unit testing; functional/acceptance testing of business logic and UI; regression testing. Make it as easy as humanly possible to run all these types of tests automatically via a continuous integration tool, or they will not get run!
Meta-Lesson Learned: [tests] DO OR DO NOT [pass]. THERE IS NO TRY. (Agile Master Yoda)