Documentation and Project Organization
Software Engineering Workshop, December 5-6, 2005
Jan Beutel, ETH Zürich, Institut TIK
December 5, 2005
Overview
- Project organization
  - Specification
  - Bug tracking/milestones
  - Access to information: mail, web, files, version control
- Documentation
  - User guides, specifications
  - Inline docs: commented code, automated docs
  - README vs. ChangeLog
  - Mailing lists
  - Structure and overview documents
  - Examples
- Dos and don'ts in a university environment (SA/DA)
Project Organization - Different Means
- Plain text documents
- MS Project
- File server
- Mailing lists
- Web pages
  - Authored/static
  - Wikis
- Repositories
  - sourceforge.net
  - savannah.nongnu.org
  - BSCW
Single Documents
- Plain text
  - The ultimate LT-interface
  - Easy to generate and read
  - Universal
  - Easy to automate with shell scripts, Perl, Java, etc.
  - Requires lots of discipline (formatting, location, versions, debugging)
- File server storage
  - Easy to use, fast
  - Offline sync works with Windows, CVS, SVN or gnu sync
  - Not universally accessible
  - Not maintenance-free (human intervention necessary)
  - Permission problems
MS Project et al.
- Industry standard for large projects
- Integrates resource/personnel planning
- Based on tool availability
- Revision control?
- Good means for planning/tracking
  - Milestones
  - Subprojects
  - Resources/personnel
- Extensive reporting capabilities
  - Automated checks
  - Exporting to HTML
- Bit of an overkill for most of us
Email - Mailing Lists
- Email: fast, universal, accessible
- Mailing lists
  - Different audiences use different lists (devel, user, core)
  - Usually too much information to stay current
  - Usually come with an archival function
  - Operation is not maintenance-free (access, spam, etc.)
- Mailing lists: searching the archives
  - Local mail dir starts at subscription date and bloats
  - No straightforward search through everything in the archives
  - Not all lists are searched (or searchable) by Google
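The archive-search problem can be eased a little when the archives are available as local files. The following is a minimal sketch, assuming the list archives have been downloaded as plain-text mbox files; `search_subjects` is a hypothetical helper name, and the search term is treated as a basic regular expression.

```shell
#!/bin/sh
# Sketch: find messages by subject across several local archive files.
# Assumption: archives are plain-text mbox files with "Subject:" headers.
# search_subjects is a hypothetical helper, not part of any mail tool.
search_subjects() {
    term=$1
    shift
    # -i: ignore case, -H: always print the file name with each match
    grep -i -H -- "^Subject:.*$term" "$@"
}

# Example: search_subjects btnode archives/*.mbox
```

This only looks at subject lines; a full-text search would drop the `^Subject:` anchor at the cost of many more hits.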
Web Pages
- Authored/static pages
  - Fast
  - Who is the webmaster?
- Wiki pages
  - Everyone is the webmaster!
  - Hierarchical permissions
  - Continuous improvement of content
  - Transparent ChangeLog
  - Control/accuracy of content?
Wiki Pages - www.btnode.ethz.ch
Wiki Pages - Simple Markup for Everyone
Wiki Pages - Integrated ChangeLog
Wiki Pages - History Compare
Wiki Pages - Access Control, Internal Page
Wiki Pages - iGEM Collaboration Success
Wiki Pages Revisited
- Projects must have a certain size/impact
- DB-based wikis require maintenance
- Easy porting of static pages using script interfaces
- Some docs still need static pages
  - Auto-generated content: weblogs, Doxygen, downloads
  - Long-term static links
Weblogs - Interesting Project Statistics
Project-Specific Weblog in 5 Minutes
- Grep through the access_log for your files
- Run webalizer over this data

    #!/bin/sh
    # Collect all requests that mention the project, then build the statistics.
    cd /foo_bar_dir/btnode_weblog || exit 1
    rm -f btnode_access.log
    grep -h -i btnode /install_dir/apache/logs/access_log > /foo_bar_dir/btnode_weblog/btnode_access.log
    # -c selects the configuration file, -Q suppresses all messages
    webalizer -c ./webalizer.conf -Q
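For a quick look without webalizer, the same grep step can feed a short pipeline. The sketch below assumes the Apache common or combined log format, in which the seventh whitespace-separated field is the request path; `top_paths` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list the ten most requested paths for one project.
# Assumption: Apache common/combined log format (field 7 = request path).
# top_paths is a hypothetical helper; pattern and log file are arguments.
top_paths() {
    pattern=$1
    logfile=$2
    grep -h -i -- "$pattern" "$logfile" |
        awk '{ count[$7]++ } END { for (p in count) print count[p], p }' |
        sort -rn |
        head -n 10
}

# Example: top_paths btnode /install_dir/apache/logs/access_log
```

Each output line holds a hit count followed by the path, with the most requested path first.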
sourceforge.net
SourceForge.net is the world's largest open source software development web site, hosting more than 100,000 projects and over 1,000,000 registered users with a centralized resource for managing projects, issues, communications, and code.
- Free of cost
- Hosting of many services: web pages, release management, CVS, news, support, tracking facilities, stats, mailing lists, user management
- Access/availability: very good for developers, stable enough for the public
SourceForge.net
SourceForge.net - Feature Requests
SourceForge.net - Tracker
BSCW Repositories
- Basic Support for Cooperative Work
- Originates in very large, multinational projects
- Many features, lots of maintenance, complicated usage
Software Documentation
Software documentation or source code documentation is written text that accompanies computer software. It either explains how the software operates or how to use it. [wikipedia.org]
Types of documentation:
- Architecture: architectural overview of software, including relations to an environment, construction principles to be used in design and technical documentation, etc.
- Design: the design of software components.
- Technical: documentation of code, algorithms, interfaces, APIs.
- End user: manuals for the end user.
- Operator: manuals for the systems administrator.
- Application operator: manuals for the "superuser" of the software.
- Help desk: manuals for first and second line support.
Software Documentation cont.
Tools such as Doxygen, javadoc, ROBODoc, POD or TwinText can often be used to auto-generate the code documents; that is, they extract the comments from the source code and create reference manuals in forms such as text or HTML files. Code documents are often organized in a reference-guide style, allowing a programmer to quickly look up an arbitrary function or class.
Many programmers like the idea of auto-generating documentation for various reasons. For example, because it is extracted from the source code itself (for example, through comments), the programmer can write it while referring to the code, and can use the same tools used to create the source code to make the documentation. This makes it much easier to keep the documentation up to date.
A downside is that only programmers can edit this kind of documentation, and it depends on them to refresh the output (for example, by running a cron job to update the documents nightly). Some would characterize this as a pro rather than a con.
Donald Knuth has argued that writing software documentation as an afterthought can be very difficult, and has advocated literate programming, where documentation is written at the same time as the source code and extracted by automatic means.
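The nightly refresh mentioned above could be wired up roughly as follows. This is a sketch under assumptions: `regen_docs` and the `DOC_CMD` variable are hypothetical names, the generator defaults to running Doxygen on a `Doxyfile` in the source directory, and the crontab path is a placeholder.

```shell
#!/bin/sh
# Sketch: regenerate reference docs and log the outcome.
# regen_docs and DOC_CMD are hypothetical names; DOC_CMD defaults to
# running Doxygen on a Doxyfile found in the source directory.
regen_docs() {
    src_dir=$1
    log_file=$2
    doc_cmd=${DOC_CMD:-doxygen Doxyfile}
    # Run in a subshell so the caller's working directory is unchanged.
    if ( cd "$src_dir" && $doc_cmd ) >>"$log_file" 2>&1; then
        echo "$(date '+%Y-%m-%d %H:%M') docs regenerated" >>"$log_file"
    else
        echo "$(date '+%Y-%m-%d %H:%M') doc generation FAILED" >>"$log_file"
        return 1
    fi
}

# Example crontab entry (03:00 every night), assuming the function is
# wrapped in a script at the placeholder path /foo_bar_dir/regen_docs.sh:
# 0 3 * * * /foo_bar_dir/regen_docs.sh
```

Logging failures explicitly matters here, since a silently broken cron job leaves stale documentation online without anyone noticing.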
Industry Documentation Flows
Useful Documentation Types
- User guides
- Specification documents
- Inline docs: comments, automated docs
- README vs. ChangeLog
- Mailing lists
- Structure and overview documents
- Examples
User Guides
- Engineers tend to ignore these
  - We usually unpack, plug in, start up and run into trouble
- Usually contain interesting details
  - Basic overview
  - Release notes
  - Configuration information
- Large effort to create and maintain
Specifications
TinyOS TEP Conventions

General Conventions
============================================================

- Avoid the use of acronyms and abbreviations that are not well known. Try not to abbreviate "just because".
- Acronyms should be capitalized (as in Java), i.e., Adc, not ADC. Exception: 2-letter acronyms should be all caps (e.g., AM for active messages, not Am)
- If you need to abbreviate a word, do so consistently. Try to be consistent with code outside your own.
- All code should be documented using `nesdoc` [nesdoc]_, `Doxygen` [Doxygen]_ or `Javadoc` [Javadoc]_. Ideally each command, event and function has documentation. At a bare minimum the interface, component, class or file needs a paragraph of description.
- If you write code for a file, add an `@author` tag to the toplevel documentation block.

TinyOS TEP Structure
Doxygen
- Doxygen is a documentation system for C++, C, Java, Objective-C, Python, IDL (Corba and Microsoft flavors) and to some extent PHP, C#, ...
- HTML, LaTeX, RTF and XML output
- Simple configuration files and wizards
- Simple, versatile markup
Doxygen - Overview Output
Doxygen - Simple Document Markup
- Describe every function
- Add additional information
  - Parameters
  - Usage
  - Context
  - Data types
- Separate overview docs
- Single configuration file
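To illustrate how small the single configuration file can be, here is a sketch that writes a minimal `Doxyfile`. The helper name `write_doxyfile`, the project name and the directory names are placeholders; the option names are standard Doxygen settings, and anything not listed keeps its default.

```shell
#!/bin/sh
# Sketch: write a minimal Doxygen configuration file.
# write_doxyfile is a hypothetical helper; project name and input
# directory are placeholders supplied by the caller.
write_doxyfile() {
    project=$1
    input_dir=$2
    cat > Doxyfile <<EOF
PROJECT_NAME     = "$project"
INPUT            = $input_dir
RECURSIVE        = YES
EXTRACT_ALL      = YES
GENERATE_HTML    = YES
GENERATE_LATEX   = NO
OUTPUT_DIRECTORY = doc
EOF
}

# Example: write_doxyfile "btnode" src && doxygen Doxyfile
```

`doxygen -g` produces a fully commented template instead, which is the better starting point once more than a handful of options are needed.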
Doxygen - File References
Doxygen - File Reference and Examples
Doxygen - Simple Centralized Descriptions
Project ChangeLog
- THE document to stay up to date
- Easy to implement
- Easy to maintain
- Usage and discipline must be strictly enforced
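One way to lower the discipline barrier is a tiny helper that adds entries in a fixed layout. This is a sketch only: `add_changelog_entry` is a hypothetical name, and the layout (date and author line, then a tab-indented message) is one common convention rather than anything prescribed here.

```shell
#!/bin/sh
# Sketch: prepend a dated entry to a plain-text ChangeLog.
# add_changelog_entry is a hypothetical helper; the entry layout is an
# assumed convention (date, author, tab-indented message).
add_changelog_entry() {
    changelog=$1
    author=$2
    message=$3
    tmp=$(mktemp) || return 1
    {
        printf '%s  %s\n\n' "$(date '+%Y-%m-%d')" "$author"
        printf '\t* %s\n\n' "$message"
        # Keep older entries below the new one.
        if [ -f "$changelog" ]; then
            cat "$changelog"
        fi
    } > "$tmp"
    # Copy back instead of moving so the file keeps its permissions.
    cat "$tmp" > "$changelog" && rm -f "$tmp"
}

# Example: add_changelog_entry ChangeLog "A. Developer" "Fixed build script"
```

Newest entries end up at the top, so a reader sees the latest change first.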
Maintained ChangeLogs in Source Files
Handcrafted Comments and Version Info
Handcrafted Comments
- Often necessary to understand application context
- Must be used with care
- Quality control is hard
  - Who wrote it? When? Why?
  - Is it still current?
  - What was the exact case?
- Often not cleaned up, lacking maintenance
- Can be pretty private
Handcrafted Comments - The Good & Bad
Structure and Overview Docs
Worthwhile Tutorials
- Don't really work unless moderated or someone is exceptionally motivated
Maintenance Intensive - FAQs
Documentation and Organization - Typical Problems
- People from different groups, departments, schools, companies, countries
  - No common infrastructure available
  - Access hierarchies (internal, external)
- Most projects follow an explorative path
  - You organize as you go
  - Not much long-term planning
  - Different people join at different times
  - All information must be archived and accessible
- Public visibility
  - Must be organized, structured and clean
  - Will lead to questions (and workload)
Dos and Don'ts
- University: environments, responsibilities, project membership and working styles change frequently
  - People come and go fast
  - New/young people start from scratch every 3-4 years
  - No corporate policies on resources/docs
- Do not set up complicated/maintenance-intensive systems
- Do not rely on information to be pushed at you
Undergrad Student Projects
- Enforce discipline
  - Weekly meetings or status updates (email)
- This is exceptional!
Undergrad Student Projects
- Enforce discipline
  - Weekly meetings or status updates (email)
  - Interactive preparation of milestones
  - Use of version control
  - Reading and posting to mailing lists (very hard)
- Do not rely on quality/results of student projects to meet project/paper deadlines
  - SA projects are dry runs, time is short
- SAs usually provide a good proof of concept
  - Technical documentation usually good
  - Context, scrutinizing questions, numerical analysis often lacking in precision/depth