Sampling from the Debian GNU/Linux Distribution:



Similar documents
On the Influence of Free Software on Code Reuse in Software Development

Open Source Software Project Success Factors / Modularity As a Success Factor

Traditional Commercial Software Development. Open Source Development. Traditional Assumptions. Intangible Goods. Dr. James A.

Collaborative Software Development Using R-Forge

Of Penguins and Wildebeest. Anthony Rodgers VA7IRL

Open Source Software Developer and Project Networks

Selection and Management of Open Source Software in Libraries.

Two case studies of Open Source Software Development: Apache and Mozilla

Open Source. Knowledge Base. By: Karan Malik INTRODUCTION

Problems and Measures Regarding Waste 1 Management and 3R Era of public health improvement Situation subsequent to the Meiji Restoration

Libre software business models (from an European point of view)

Open Source Sustainability and RDM. Scott Wilson

Latest Trends in Testing. Ajay K Chhokra

FOSS License Restrictions and Some Important Issues

Free software GNU/Linux TOR project

DESIGN FOR QUALITY: THE CASE OF OPEN SOURCE SOFTWARE DEVELOPMENT

Software, Shareware and Opensource CSCU9B2

Library Technology Reports

Writing Open Source Software for BlackBerry

How To Use Open Source Software

Open Source tools for geospatial tasks

Aspects of Software Quality Assurance in Open Source Software Projects: Two Case Studies from Apache Project

Google and Open Source. Jeremy Allison Google Open Source Programs Office

A Crowd Method for Internet-based Software with Big Data

A TOPOLOGICAL ANALYSIS OF THE OPEN SOURCE SOFTWARE DEVELOPMENT COMMUNITY

Retour d'expérience sur le choix d'une forge logicielle / Choosing a software forge

PROPOSAL: OCP COMMON LINUX SWITCH DISTRIBUTION. Rob Sherwood and Mansour Karam OCP November 2013

A Framework to Represent Antecedents of User Interest in. Open-Source Software Projects

System requirements. Java SE Runtime Environment(JRE) 7 (32bit) Java SE Runtime Environment(JRE) 6 (64bit) Java SE Runtime Environment(JRE) 7 (64bit)

Faronics Products SYSTEM REQUIREMENTS Last modified: October 2014

Open source development

Embedded Linux development with Buildroot training 3-day session

Open Source vs. Collaborative Software: FOSS is Not Enough

Open-Source Software Development

Report of the LHC Computing Grid Project. Software Management Process RTAG CERN

Profiling an Open Source Project Ecology and Its Programmers

Zend Server 4.0 Beta 2 Release Announcement What s new in Zend Server 4.0 Beta 2 Updates and Improvements Resolved Issues Installation Issues

The Impact of Project License and Operating System on the Effectiveness. of the Defect-Fixing Process in Open Source Software Projects

Ubuntu, FEAP, and Virtualiza3on. Jonathan Wong Lab Mee3ng 11/08/10

Open Source Software Usage in the Schools conceptual strategy

Issues and Challenges in Open Source Software Environment with Special Reference to India

Open Source India. Open Source. Community meets Business. Michael. Meskes. credativ

( ) = ( ) = {,,, } β ( ), < 1 ( ) + ( ) = ( ) + ( )

Free/Libre Open Source Software Development: What We Know and What We Do Not Know

Free Java and OpenJDK. Andrew Haley Tech Lead, Open Source Java

Discovering Determinants of Project Participation in an Open Source Social Network

The Efficiency of Open Source Software Development

apt-p2p: A Peer-to-Peer Distribution System for Software Package Releases and Updates

Bacula The Network Backup Solution

Development of a questionnaire to study motivations of Drupal contributors Introduction Sample

Keynote Speech. Free and Open Software: Features, Development, Experiences, Benefits and Opportunities

Bacula The Network Backup Tool for *BSD, Linux, Mac, Unix and Windows

Quality Assurance under the. Open Source Development Model

Transcription:

Sampling from the Debian GNU/Linux Distribution: Software Reuse in Open Source Software Development HICSS 2007, Hawaii Authors: Sebastian Spaeth, Matthias Stuermer, Stefan Haefliger, Georg von Krogh

Research Project on Software Reuse Motivation: Software reuse lowers development costs. Software reuse is difficult to achieve. Software reuse is abundant in OSS development. Research question: What are the characteristics of software components that are reused more often than others? Sampling: What are the advantages of sampling from Debian GNU/Linux? January 4 th 2007 Chair of Strategic Management and Innovation 2

Facts about Debian GNU/Linux Founded 1993 by Ian Murdock (his wife Debra -> Debian) Community-controlled Linux distribution 20,000 ready compiled software packages Categorized in sections such as mail, text or libs Information on packages: name, version, maintainer, license... Dependency information January 4 th 2007 Chair of Strategic Management and Innovation 3

The Debian package system: Binaries, sources and dependencies Binary packages (B) are compiled from S1 B1.1 B1.3 source packages (S) B1.2 Dependencies are among binary packages S2 B2.1 January 4 th 2007 Chair of Strategic Management and Innovation 4

Mozilla Firefox 07.01.07 ETH Folienlayout 5

Advantages of sampling Debian vs. SourceForge 1. SourceForge.net excludes systematically OSS projects. other collaboration platforms (Tigris etc.) individually hosted projects (Mozilla, Apache, GNU, GNOME) 2. Debian maintainers doing peer review 3. Software actually in use 4. Packages form an integrated environment January 4 th 2007 Chair of Strategic Management and Innovation 6

Limitations Using Debian for Sampling exclusion of non-linux software (e.g. OSS for Windows) not appropriate to measure project failure license restrictions for Debian-acceptable software January 4 th 2007 Chair of Strategic Management and Innovation 7

Sample and Method Total of 19,692 binary packages (for 32bit computers) Total of 8,890 source packages (for all platforms) 1,146 source packages contain packages marked as libs or oldlibs (categorized by Debian developers as reusable library component) Our sample: random sub-sample of 466 components Total reuse of 16,949 average reuse 36 Log-linear regression Deductive research Reuse as dependent variable January 4 th 2007 Chair of Strategic Management and Innovation 8

Results: Coefficients of log-linear regression Multiple R 2 : 0.256 * 10% ** 5% *** 1% significance level ln(y)= ln( α) + ln( β)x Variable coef(std.err) (Intercept) -0.33(-0.3) website 0.06(0.1) doc 0.38(0.4)* freshmeat 0.73(0.7)*** umbrella 0.47(0.5)*** legal_entity 0.25(0.3)* C_prog 0.50(0.5)*** standard -0.18(-0.2) strict_lic -0.50(-0.5)*** bugs 0.01(0.0)*** binmodules 0.08(0.1)*** binsizemb -0.04(-0.0)* age 0.13(0.1)*** January 4 th 2007 Chair of Strategic Management and Innovation 9

Descriptive Statistics Variable Mean SD 1 2 3 4 5 6 7 8 9 10 11 1. website 0,86 0,34 2. doc 0,58 0,4 0,53 3. freshmeat 0,54 0,5 0,35 0,37 4. umbrella 0,35 0,48-0,07-0,09-0,1 5. legal_entity 0,37 0,48-0,05 0,06-0,07 0,32 6. C_prog 0,71 0,46 0-0,1 0,06-0,05-0,07 7. standard 0,4 0,49 0,17 0,15 0,15-0,14 0,03 0,14 8. strict_lic 0,28 0,45-0,07-0,1 0,02 0,11-0,05 0-0,07 9. bugs 9,43 37,63 0,05 0,06 0,11 0,06 0,15-0,03 0,08 0,16 10. binmodules 4,1 4,62 0,08 0,14 0 0,03 0,1-0,1-0,05 0,09 0,24 11. binsizemb 1,71 4,28 0,05 0,16 0,02-0,03 0,14-0,17-0,03 0,06 0,32 0,66 12. age 5,83 2,69 0,05 0,1 0,12-0,18 0,06 0,16 0,05-0,08 0,16 0,2 0,21 n=466 January 4 th 2007 Chair of Strategic Management and Innovation 10

Results: Hypotheses and Control Variables # Hypotheses/Control Variable Result Effect on Reuse H1 A dedicated web page of a component has a positive effect on reuse. x Not significant H2 Published documentation of a component has a positive effect on reuse. + 46% * H3 The listing of a component on a platform (Freshmeat) has a positive effect on reuse. + 107% *** H4 The existence of an umbrella project for a component has a positive effect on reuse. + 60% *** H5 The existence of a legal entity for a component has a positive effect on reuse. + 29% * H6 The use of C as a programming languagehas a positive effect on reuse. + 65% *** H7 The implementation of a standard has a positive effect on reuse. x Not significant H8 The use of a restrictive license (GPL) has a negative effect on reuse. - 39% *** C Number of bugs in the Debian bug database for this component + 0.5% per bug *** C Number of binary packages per source package + 7.9% per package * C Size of binary packages of a source package - 3.7% per MB *** C Age of a component + 13.7% per year *** n=466 r 2 =0.256 January 4 th 2007 Chair of Strategic Management and Innovation 11

Discussion Questions? Contact and weblog: www.smi.ethz.ch January 4 th 2007 Chair of Strategic Management and Innovation 12