Web 2.0 and Collaborative Software Development




pp.107-120
http://dx.doi.org/10.14257/ijseia.2014.8.7,09

Dr. Javed Ferzund¹, Robail Yasrab² and Saad Razzaq³
¹,² COMSATS Institute of Information Technology, Sahiwal
³ University of Sargodha
jferzund@ciitsahiwal.edu.pk, robail.yasrab@ciitsahiwal.edu.pk, saadrazzaq@uos.edu.pk

ISSN: 1738-9984 IJSEIA
Copyright © 2014 SERSC

Abstract

Effective communication and collaboration are among the most important elements of the software development lifecycle. When software is developed online using development networks, it becomes even more critical to provide effective communication and collaboration facilities throughout the software development process. Web 2.0 based technologies now offer extensive support for communication and collaboration, and so help to improve the process of software development. The basic objective of this study is to assess the influence of effective communication and collaboration tools and facilities on online collaborative software development. We establish nine hypotheses to analyze the relationship between web 2.0 based communication tools and software quality. We have found that online collaborative software development based on web 2.0 communication tools such as blogs, instant messaging, project news and RSS feeds shows significant improvement in software product and process quality.

Keywords: Web 2.0, Social Networks, Open Source software, Communication, Collaboration, Software Quality, Virtual Teams, Software Development, Blogs, News, Bugs

1. Introduction

At present, the trend of distributed software engineering is attracting more and more software development firms and individuals [10]. Most corporations in the United States are involved in offshore software outsourcing. Software corporations that have implemented globally distributed development arrangements are able to attain many advantages, such as low production cost, access to an extremely skilled labor market, and a shorter distance to their clients. On the other hand, online collaborative software development requires a high degree of collaboration and coordination due to the dispersed nature of project teams.

Previous research has shown that software development companies have not always succeeded in understanding the problems of the people working in project teams. Ignoring the people (social) issues in software engineering can have a harmful impact on the software development process. Here, the social issue is more than the combined qualities, skills and actions of individual team members; for instance, a team should be able to perform as a single unit of analysis. In large projects, typical systems developers spend more than 70% of their time working with others or in collaboration. Team related activities account for more than 85% of the expenditure on huge software systems [2].

The Internet offers excellent support for cooperation among geographically distributed development teams. However, in collaborative system development using social networks, many issues emerge, such as determining whether a partner is needed, availability of the right mix of development team, project management, code quality, good relationships with stakeholders and with team members, quality assurance and development models. Distributed software development also brings many issues and disputes to software development groups.

The main objectives of this study include:

o Analyze the role of Web 2.0 tools in online collaborative software development
o A comprehensive survey of the online platforms supporting collaborative software development
o Study the impact of several modern technology based awareness and communication tools on code quality

Many issues emerge in web based collaborative software development, and their basic source is the lack of communication and collaboration among project team members. Thus, this research intends to analyze the impact of web 2.0 communication facilities on collaborative software development. We have formulated our research problem as nine research hypotheses. We consider the bug fixing rate an important factor in improving software quality. Further, we consider the number of downloads an indicator of good quality. The number of members present at a development network may also influence the development process. So, we base our hypotheses on the communication tools, the number of members, the number of downloads and the bug fixing rate. The aim is to identify whether higher bug fix rates are due to a large number of members or to the communication tools. Following is a list of the established Null Hypotheses:

H₀1: Higher number of members at online open source collaborative software development network leads to higher number of blogs based communication and collaboration.
H₀2: Higher number of members at open source collaborative software development network leads to increased news based communication and collaboration.
H₀3: Higher number of members at open source collaborative software development network leads to higher number of software downloads.
H₀4: Blogs based communication and collaboration at some open source collaborative software development network leads to higher rate of bug fixing.
H₀5: Higher number of software downloads at some open source collaborative software development network is due to the higher rate of bug fixing.
H₀6: Higher number of members at open source collaborative software development network leads to higher rate of bug fixing.
H₀7: News portal based communication and collaboration at some open source collaborative software development network leads to higher rate of bug fixing.
H₀8: Higher number of software downloads at some open source collaborative software development network is due to the higher number of news postings at that network.
H₀9: Higher number of software downloads at some open source collaborative software development network is due to the higher number of blog postings at that network.

To conduct the study and obtain empirical results, a number of collaborative development networks are selected on the basis of their support for one or more communication tools. A large number of open source projects are hosted on these platforms. We randomly select five projects from each collaborative development network. Data about the selected projects is retrieved from the corresponding web sites. We collect information in two categories: the first relates to communication facilities such as blogs, news and instant messaging; the second relates to quality features such as bug fixes and number of downloads.

The collected data is used to test all the Null Hypotheses established for this study. We use the Chi-Square test, which is used to find out the association between two variables of a given sample of data. We analyze whether high bug fix rates are due to a large number of members or to the communication facilities. Similarly, we analyze whether the web 2.0 facilities result in high quality software, as evidenced by a higher number of downloads.

The remainder of the paper is organized as follows: in Section 2 we discuss the related work. In Section 3, details of the experiment are presented. Results are discussed in Section 4 and finally we conclude the paper in Section 5.

2. Related Work

Nowadays it is hard to imagine a software development process carried out by individuals without any cooperation and coordination among themselves, not simply because human beings are naturally social but because projects are too large and complex to be managed and controlled individually [21, 23]. High level collaboration in system development is important because software development procedures have become more complex. The complexity arises from changes in requirements, real-time transformations and the services being delivered [14]. System development teams are normally geographically distributed, which adds the complications of diverse organizational policies, different time zones and different languages. All these factors require development teams to communicate and monitor development throughout the development lifecycle and to work jointly to find solutions that deliver efficient software on time and within budget [14].

Humans have numerous limitations that influence their capability to produce a piece of software. Their memory is excellent, but not deep or precise enough to retain a project's countless particulars. They are also unable to follow what everybody is doing in a huge group, and consequently risk colliding with or duplicating the work of others. Even when they work at high levels of abstraction, such as collecting system requirements, designing the system, writing code or producing test cases, they can be error prone and slow. As a result, they have to work jointly to complete small and big projects in the given time. Software engineering projects naturally demand a cooperative environment, in which many software developers work together to develop a big software system. However, when they start working in cooperation with other people, they face additional issues. For instance, the language they use to communicate is amazingly expressive, yet often ambiguous [23].

Human factors create many collaboration and knowledge exchange problems, which can affect the quality of software development tasks. Furthermore, empirical research has shown that technically driven design decisions have a great deal of influence over knowledge sharing and coordination in system development teams, which can in turn reduce their productivity [11]. Software engineering tools offer poor support for communication and collaboration in remote development scenarios. In addition, these SE tools are founded upon a single-user view of the system development lifecycle. A frequent problem is the failure of software engineering tools to tackle transactional clashes, where simultaneous changes to the project conflict with each other semantically [18].

Many research and assessment studies have analyzed the issues and gaps in communication and collaboration on online software development platforms. In distributed collaborative software development, coordination and communication problems comprise a high time investment for better contact, a lack of understanding between different sites, and an excessively high dependence on the expertise of the social group working at remote sites [3]. Brown, et al., state some of the important security concerns for open source software development platforms. These security concerns are attributed to the less secure communication infrastructure of web based social networks [5]. Katsamakas, et al., state that open source software development through a web based platform is naturally unbalanced. This happens because of less effective system development and role management at such platforms. Lack of commitment and of its proper enforcement leads to low quality software development, owing to less effective task management and communication among the development groups [15]. System design and development tools used in commercial software development are not comparable to those of online free open source development [7]. Open source development patterns are not effective for software systems that involve more contributing users who are not programmers. As a result, most web based open source projects do not achieve considerable success [22].

Ehrlich, et al., offered systematic evidence of a major communication delay and a task completion delay for change requests involving cross-site work. An assessment of data from cross-site and same-site projects showed that development tasks involving distributed contributors take about 2.5 times longer to finish than comparable same-site tasks. This outcome is explained by the apparent communication delay. Other factors that affect development success include the number of people involved in the job and the size of the job. Unsurprisingly, there were major differences in the size of same-site and distributed communication networks. In this scenario, distance has a negative influence on the characteristics of distributed social networks [9].

Moreover, numerous research works in software engineering and social networks have shown the issues of communication and collaboration in distributed software development teams [12]. In particular, Herbsleb, et al., [13] discovered that communication hindrance among distributed software development teams is extremely high. In recent years, however, some studies [17, 6] have outlined strategies to enhance communication in distributed software environments. Moreover, many communication and collaboration systems and tools have been developed to deal with these problems [7].

Table 1. Websites Selected

Platform                                   Selected
SOURCEFORGE (www.sourceforge.net)          Yes
OSOR (www.osor.eu)                         Yes
ADVOGATO (www.advogato.org)                Yes
TIGRIS (www.tigris.org)                    Yes
CODEPLEX (www.codeplex.com)                Yes
ECLIPSE (www.eclipse.org)                  Yes
FORGE (www.forge.ow2.org)                  Yes
JAVAFORGE (www.javaforge.com)              Yes
GFORGE (www.gforge.org)                    Yes
KENAI (www.kenai.com)                      Yes
LAUNCHPAD (www.launchpad.net)              Yes
KFORGE (www.kforgeproject.com)             No
FUSIONFORGE (www.fusionforge.org)          No
LIBRESOURCE (www.dev.libresource.org)      No
GITORIOUS (www.gitorious.org)              No
SAVANE (www.savane.com)                    No

Our work is similar to the previous work, as we also study the issues in collaborative software development. We particularly focus on Web 2.0 based communication tools to solve the problems highlighted in the previous studies. Moreover, we provide an empirical study to analyze the importance of Web 2.0 tools in the collaborative development process.

3. Experimental Details

In order to determine the impact of Web 2.0 tools on collaborative software development, we considered several aspects of online projects and their quality. For this purpose we have selected some well-known online open source development websites and have gathered project features from those web sites. In the coming sections, we present a brief overview of most of the websites and their features, such as communication facilities offered, bugs reported and bugs fixed.

3.1. Selection of Websites

For the quantitative analysis of web based collaborative software development networks, we have collected data from a number of websites. We conducted a survey and collected web 2.0 supported features from these websites. The main websites we have used for this research are listed in Table 1.

The most important question about this selection is why these particular websites were chosen. Our research focuses on some specific features (communication tools, bug fixing, members' contribution, etc.), and there are only a few online collaborative development platforms compared to other portals such as entertainment, games, music, commerce and communication based websites. Additionally, we need to analyze communication and collaboration features and decide on potential improvements to these aspects. So, we searched for websites dedicated to online open source software development and collected their features and characteristics. We find that most of the websites lack important features for collaborative software development, such as web 2.0 communication tools and bug reporting tools; these websites include KForge, FusionForge and LibreSource. Thus, considering the objectives of our study, we have chosen the websites listed in Table 1. The selected websites offer better facilities and a better communication platform for the development of open source software. A brief description of the selected websites is given below:

SourceForge is a web based open source software development community network. This platform helps people build a leading resource for the open source software community. It presently offers many online software development facilities, enabling 2.7 million open source developers to develop over 260,000 projects with more than 46 million clients.

OSOR offers facilities such as technology based guidance, news, contacts, links, and work on an open source software projects repository or Collaboration Development Environment.

Advogato is a very old website and one of the initial trend-setters for new ways of communication. It is a community website devoted to free open systems development. It is also one of the first social networking platforms to combine the most recent entries from every user's diary into a single news and project information feed, known as the recent-log.

Codeplex is an open source software development platform established by Microsoft. Microsoft has also established a vision for open source software development, and Codeplex is one of the main examples of that vision. We can use the Codeplex platform to develop new systems, to connect with other people who have already started their own software projects, to share information with other people, or to use the free software on the website and offer feedback. In addition, Codeplex offers a comprehensive web based software development environment for system developers to build, host and manage their open source projects and applications throughout the project development lifecycle.

Eclipse is a free, open source and modern Java application development platform. Eclipse also serves as an IDE (Integrated Development Environment), offering advanced system development tools to handle workspaces and to launch, develop and debug software. Eclipse also makes it easy to share software applications and source code with a development team as well as with other website members.

Forge OW2 is another online open source collaborative development community, also known as Object-Web Forge. Through this platform we are able to access the OW2 technical platform, which comprises Subversion, CVS, bug tracking, mailing lists, task management, message boards/forums, permanent file archival, site hosting, overall web based administration and complete software backups.

Gforge is a system development website built to improve communication and collaboration in open source application development for the software community. It offers a comprehensively configured development system with versioning, a project website and a system for communication among members of a software development team. It also permits the development of an open source development knowledge base.

Javaforge is also a free website for open source software hosting. It was established by JavaLobby in 2005 as one of the first Java-based, Subversion-supported free online collaborative development websites in the open source world. The website currently hosts hundreds of well-known open source projects and thousands of open source software development devotees.

Kenai is a web based collaborative hosting website intended for free and open source systems. Project Kenai was established by Sun Microsystems and is at present owned by Oracle. It helps developers discover each other's interests and collaborate, and it provides free software project hosting.

Launchpad is an online open source application that supports software development and maintenance. Launchpad offers many features that help in producing free software and in enhancing the overall quality of open source application development. These include features similar to Trac and Bugzilla. Moreover, it offers information about system development bugs.

3.2. Data Collection

For the data collection from online open source collaborative development websites, we use the project as the object of study. We have paid most attention to gathering attributes that are closely aligned with effective project development through better communication among system developers. We selected five projects from each website and collected information about these projects. The projects are selected randomly from different development domains such as business, networks, security and office applications. We tried to collect data that is proper and complete, so that we could better perform our analysis. Following is a list of the features of the projects which have been used for the analysis. The basic purpose of selecting these features was to assess the influence of the communication and collaboration facilities on the online open source development process.

International Journal of Software Engineering and Its Applications

o Members
o Bugs Fixed
o Downloads
o Support of RSS feed
o Instant messages
o Audio video chat support
o Live help or support
o Blog of site
o Total blog entries
o News portal of site
o Total news updates

We collected information about the features listed above and aggregated the values for each website, so that we could apply a statistical tool to the resulting data sets. Table 2 presents the values of the different features for each website.

3.3. Analysis Method

For the purpose of data analysis, we reviewed several analytical and statistical tools and methods. The basic aim was to find a tool that could best assess the mutual relation among the different research parameters of our study. We selected the Chi-Square statistical test for the data analysis, which is used to measure the difference between observed (O) and expected (E) frequencies of given variables. There are two fundamental categories of Chi-Square analysis. The first is the "Test of Independence", used with two variables, and the second is the "Goodness of Fit Test". However, both forms of data analysis use the same chi-square formula [7].

Table 2. Data Collected from Websites

The columns News, IM, Live, RSS and Blogs record tool support. Only their "No" entries are legible in the source; they are listed in the last column without column positions.

Website      Members  No. of Bugs  No. of Bugs  No. of     Total Blog  Total News  News / IM / Live /
                      Reported     Fixed        Downloads  Entries     Entries     RSS / Blogs
Sourceforge  3687     2479         2326         2800       2195        1693        No No
Osor         61       122          108          185        121         101         No No
Advogato     9        44           42           59         59          37          No No
Tigris       25       129          119          172        137         120         No No
Codeplex     469      69           65           214        72          72          No No
Eclipse      72       990          864          1200       990         702         No
Forge.ow2    47       113          97           136        1           86          No No No
Javaforge    55       139          135          156        142         108         No
Gforge       14       32           19           46         21          10          No No No
Kenai        35       64           49           80         56          0           No No No No
Launchpad    72       610          490          709        510         0           No No No No

In addition, Chi-Square is an analytical test for checking the statistical significance of bivariate tabular analysis (or cross-breaks). A properly executed test of statistical significance allows us to identify the level of confidence we can have when rejecting or accepting a pre-established hypothesis. Normally, a hypothesis tested using the chi-square statistical test tells us whether or not two data samples (of projects, products, etc.) are different enough in some features or characteristics of their behavior. In this way we are able to generalize their mutual relation to the data space from which our samples are drawn [1]. The formula of Chi-Square is given below:

$$\chi^2 = \sum \frac{(O - E)^2}{E} \qquad (1)$$

In the above formula, "O" refers to the observed frequency, that is, the actual count or real data we have collected. "E" stands for the expected frequency, that is, the theoretical count for the previously gathered data entries; its value is computed with a frequency calculation formula. The formula of Chi-Square takes the sum of the values of O-E squared and divided by E. The more "O" differs from "E", the greater the value of $\chi^2$ will be [20].

The chi-square "test for independence" is used to find out the association between two variables of a given sample of data. In this test, independence refers to the fact that two given factors are not associated. Normally in social-science research, we are more concerned with discovering data factors that are related, for example income and education, or behavior and age [16].

Figure 1. Bug Fixing Percentage (x-axis: Websites; y-axis: Percentage)

The test for independence provides a two-way Chi-Square analysis, which is a convenient method for assessing the significance of the difference between frequencies of occurrence in two or more classes with two or more groups [19].
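The bug fixing percentages plotted in Figure 1 can be related to the bug counts in Table 2 as bugs fixed divided by bugs reported. The following is a minimal Python sketch, assuming that this simple ratio is the percentage formula used; the names `BUGS` and `bug_fix_percentage` are illustrative and do not come from the original study.

```python
# Bugs (reported, fixed) per website, copied from Table 2.
BUGS = {
    "SOURCEFORGE": (2479, 2326),
    "OSOR": (122, 108),
    "ADVOGATO": (44, 42),
    "TIGRIS": (129, 119),
    "CODEPLEX": (69, 65),
    "ECLIPSE": (990, 864),
    "FORGE.OW2": (113, 97),
    "JAVAFORGE": (139, 135),
    "GFORGE": (32, 19),
    "KENAI": (64, 49),
    "LAUNCHPAD": (610, 490),
}


def bug_fix_percentage(reported, fixed):
    """Return fixed bugs as a percentage of reported bugs (one decimal place)."""
    return round(100.0 * fixed / reported, 1)


# Assumed formula: percentage = 100 * bugs fixed / bugs reported.
percentages = {site: bug_fix_percentage(r, f) for site, (r, f) in BUGS.items()}

# Print the websites from highest to lowest bug fixing percentage.
for site, pct in sorted(percentages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{site:12s} {pct:5.1f}%")
```

Under this assumption, the website with the highest percentage is JAVAFORGE and the lowest is GFORGE.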

4. Results

Bugs seriously affect software quality, and the effort required to fix them also indicates code quality. If bugs persist in an application, acceptance of that application declines, so bugs should be fixed as soon as possible. We investigated the role of Web 2.0 tools in the bug resolution process by comparing the bug fixing rate with the bug reporting rate. A percentage formula is used to calculate the percentage of bugs fixed at each of the selected online open source software development networks; Figure 1 shows the result for each network. We find that every website with better communication and collaboration technology shows a higher bug fixing rate, which leads to better quality of the online projects. We conclude that a website offering Web 2.0 based communication facilities supports quick bug fixing, which ultimately improves the performance of the software development process.

The open source software development website JAVAFORGE has the highest bug fixing rate because it is the only website that offers "Instant Messaging" to online application developers. This facility gives better support for interaction among the members of an online software development team, and its live message transfer makes it more effective than the other communication media. We therefore conclude that better communication and collaboration lead to better interaction and relationships among software development team members. This ultimately leads to a better bug fixing process, which enhances software quality.

Table 3. Fixed vs. Blogs Observed Data

| Website | Fixed (no. of problems fixed) | Blog (no. of blogs reported) | Row Total |
|---|---|---|---|
| SOURCEFORGE | 2326 | 2195 | 4521 |
| OSOR | 108 | 121 | 229 |
| ADVOGATO | 42 | 59 | 101 |
| TIGRIS | 119 | 137 | 256 |
| CODEPLEX | 65 | 72 | 137 |
| ECLIPSE | 864 | 990 | 1854 |
| FORGE.OW2 | 1 | 1 | 2 |
| JAVAFORGE | 135 | 142 | 277 |
| GFORGE | 19 | 21 | 40 |
| KENAI | 49 | 56 | 105 |
| LAUNCHPAD | 490 | 510 | 1000 |
| Column Total | 4218 | 4304 | 8522 |

4.1. Chi-Square Analysis

We used the Chi-Square test to accept or reject each of the Null Hypotheses established in Section 1. The test is applied to the gathered quantitative data to assess the overall probability of acceptance of the results. We calculate the Chi-Square statistic and the P-Value for each dataset. The P-Value, also called the probability value, is the probability of acceptance of the results. We used the standard acceptance level of 0.05 (5 percent): if the P-Value is higher than 0.05, we accept the Null Hypothesis; otherwise we reject the pre-established Null Hypothesis.

Due to space constraints, we present the testing of H₀4 in detail and summarize the rest. The main purpose of testing H₀4 is to determine the relationship between the bug fix rate and the availability of blogs. In other words, at any collaborative development website, the facility of blogs results in higher bug fixing rates. The observed data was collected from the chosen websites to assess the relation between the number of bugs fixed and the number of blogs reported. We gathered the overall number of development communication blog entries at these websites and the overall number of bugs fixed in currently operational projects at these websites, as shown in Table 3.

The expected distribution of data is presented in Table 4. It is the distribution we should get if the variable had no influence. The expected values are calculated using the following formula:

Expected value of a cell = (row total × column total) / total number

Table 4. Fixed vs. Blogs Expected Value

| Website | Fixed (no. of problems fixed) | Blog (no. of blogs reported) | Row Total |
|---|---|---|---|
| SOURCEFORGE | 2237.69 | 2283.31 | 4521 |
| OSOR | 113.34 | 115.66 | 229 |
| ADVOGATO | 49.99 | 51.01 | 101 |
| TIGRIS | 126.71 | 129.29 | 256 |
| CODEPLEX | 67.81 | 69.19 | 137 |
| ECLIPSE | 917.65 | 936.35 | 1854 |
| FORGE.OW2 | 0.99 | 1.01 | 2 |
| JAVAFORGE | 137.10 | 139.90 | 277 |
| GFORGE | 19.80 | 20.20 | 40 |
| KENAI | 51.97 | 53.03 | 105 |
| LAUNCHPAD | 494.95 | 505.05 | 1000 |
| Column Total | 4218 | 4304 | 8522 |

The P-Value is greater than 0.05, so this Null Hypothesis can be accepted:

Table 5. Chi Square and P Value

| Method | Results |
|---|---|
| P-Value | 0.058 |
| Chi | 12472.38 |
| DoF | 10 |

The Chi-Square analysis gave a P-Value higher than the standard value (0.05); see Table 5. This allows us to accept the established Null Hypothesis. So, based on the P-Value, our Null Hypothesis is accepted.
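The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' tooling: the function names (`expected_counts`, `chi_square`, `p_value_even_dof`) are hypothetical, the observed counts are copied from Table 3, and the P-Value routine uses a closed-form expression that is valid only when the number of degrees of freedom is even (it is 10 here). Run on the Table 3 data, the sketch reproduces the expected values of Table 4 and yields a statistic of roughly 17.9 with 10 degrees of freedom and a P-Value of about 0.057, which is close to the P-Value reported in Table 5 but does not match the Chi value printed there.

```python
import math

# Observed counts from Table 3: website -> (bugs fixed, blog entries).
OBSERVED = {
    "SOURCEFORGE": (2326, 2195),
    "OSOR": (108, 121),
    "ADVOGATO": (42, 59),
    "TIGRIS": (119, 137),
    "CODEPLEX": (65, 72),
    "ECLIPSE": (864, 990),
    "FORGE.OW2": (1, 1),
    "JAVAFORGE": (135, 142),
    "GFORGE": (19, 21),
    "KENAI": (49, 56),
    "LAUNCHPAD": (490, 510),
}


def expected_counts(observed):
    """Expected cell value = row total * column total / grand total."""
    rows = list(observed.values())
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    return {
        site: tuple(sum(row) * c / grand_total for c in col_totals)
        for site, row in observed.items()
    }


def chi_square(observed):
    """Return the Pearson chi-square statistic and the degrees of freedom."""
    expected = expected_counts(observed)
    stat = sum(
        (o - e) ** 2 / e
        for site, row in observed.items()
        for o, e in zip(row, expected[site])
    )
    n_rows = len(observed)
    n_cols = len(next(iter(observed.values())))
    return stat, (n_rows - 1) * (n_cols - 1)


def p_value_even_dof(stat, dof):
    """Upper-tail chi-square probability (closed form, even dof only)."""
    if dof <= 0 or dof % 2:
        raise ValueError("closed form needs a positive, even number of degrees of freedom")
    half = stat / 2.0
    term, total = 1.0, 1.0
    # Sum (stat/2)^i / i! for i = 0 .. dof/2 - 1.
    for i in range(1, dof // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


stat, dof = chi_square(OBSERVED)
p = p_value_even_dof(stat, dof)
print(f"chi-square = {stat:.2f}, dof = {dof}, p-value = {p:.3f}")
```

For tables with an odd number of degrees of freedom, a statistics library would be needed in place of `p_value_even_dof`.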
Thus, we can declare that a higher bug fixing rate is possible through better blog based communication among open source project developers.

Similarly, we tested all the other Null Hypotheses established in Section 1. The results of acceptance or rejection are presented in Table 6. We find that most of the Null Hypotheses are rejected. Out of the nine Null Hypotheses, only two are accepted. A review of the results indicates that the accepted Null Hypotheses are related to the availability of communication and collaboration tools. H₀7 is also accepted; it states that a higher bug fixing rate is possible through better news based communication among open source software development communities. This analysis also demonstrates the strong need for communication and collaboration facilities at online open source software development networks. It indicates that establishing an effective communication medium on an online open source software development platform can yield quality systems.

Table 6. Results of Null Hypothesis Evaluation

| No. | Null-Hypothesis | P-Value | Status |
|---|---|---|---|
| H₀1 | Members vs Blogs: Higher number of members at online open source collaborative software development network leads to higher number of blogs based communication and collaboration. | 0 | Rejected |
| H₀2 | Members vs News: Higher number of members at open source collaborative software development network leads to increased news based communication and collaboration. | 4.71 | Rejected |
| H₀3 | Members vs Downloads: Higher number of members at open source collaborative software development network leads to higher number of software downloads. | 0 | Rejected |
| H₀4 | Fixed vs Blog: Blogs based communication and collaboration at some open source collaborative software development network leads to higher rate of bug fixing. | 0.058 | Accepted |
| H₀5 | Fixed vs Downloads: Higher number of software downloads at some open source collaborative software development network is due to the higher rate of bug fixing. | 1.60 | Rejected |
| H₀6 | Fixed vs Members: Higher number of members at open source collaborative software development network leads to higher rate of bug fixing. | 0 | Rejected |
| H₀7 | Fixed vs News: News portal based communication and collaboration at some open source collaborative software development network leads to higher rate of bug fixing. | 0.06 | Accepted |
| H₀8 | Downloads vs News: Higher number of software downloads at some open source collaborative software development network is due to the higher number of news postings at that network. | 7.11 | Rejected |
| H₀9 | Downloads vs Blogs: Higher number of software downloads at some open source collaborative software development network is due to the higher number of blogs postings at that network. | 3.56 | Accepted |

This research has found that communication and collaboration are essential requirements for software development using open source web based applications. Many factors are involved in the software engineering process, such as time zone, nature of the application, language compatibility and platform independence. Even so, traditional software engineering practice has a vital need for communication and collaboration technologies, and this need is greater in web based software engineering. In open source application development, where members are less interested in the overall development process, delayed and restricted communication media (like e-mail) are not very effective. In this scenario, the new Web 2.0 based communication and collaboration tools can offer greater support to online software developers. This research has demonstrated that multiple and better communication facilities lead to better collaboration among software development teams.

Traditionally, the quality of open source software has been considered extremely poor. The main reason for the poor quality is the less effective handling of the overall software development process [8]. Internet communication technologies like email and instant messaging, and Web 2.0 based communication and collaboration tools like RSS feeds, blogs, project news and live help, offer a tremendous facility for online open source collaborative application development.
The availability of these tools and technologies has shown that better software development is possible when effective communication and collaboration facilities are established among distant development team members. This ultimately improves the development lifecycle and system quality. The overall statistical analysis has shown that the quality of online collaborative systems can be improved through better communication. Better communication supports the overall software development process, which ultimately leads to better software quality. The quality of open source software can be assessed by its acceptance rate, number of downloads, number of bugs and bug resolution time. With better communication and collaboration, software development teams can interact more closely with each other, which yields superior software development performance and quality.

5. Conclusions

In this paper we have presented an empirical study investigating the role of Web 2.0 tools in online collaborative software development. We selected a number of online platforms/websites that support collaborative software development, based on their support for Web 2.0 tools such as instant messaging, blogs, news and RSS feeds. We analyzed the impact of these tools on software quality, with bug fixing rate and number of downloads selected as the parameters indicating software quality. Data was collected about a number of projects from the selected platforms. We established nine Null Hypotheses to find the associations between Web 2.0 tools, number of members, number of downloads and the bug fixing rate. The Chi-Square test was used to evaluate the established Null Hypotheses. Two of the nine Null Hypotheses were accepted, indicating that news and blog based communications result in a higher bug fixing rate. Early and on-time bug resolution results in high acceptance rates of online projects. Thus, communication tools result in better software quality. It was found that a higher number of members does not mean that the number of communications will be high. Similarly, a higher number of members does not influence the bug fixing rate or downloads. All of the Null Hypotheses related to members are rejected. In short, better availability of communication and collaboration leads to better software quality, management and handling at web based collaborative software development networks. It is evident from the empirical results that, by establishing effective communication and collaboration technology and tools, we can achieve quality and performance in online collaboratively developed systems.

6. Future Work

We want to extend this study to find other ways of communication and collaboration that can simplify the collaborative development process. Web 3.0 features are of particular interest to us. In this regard, we want to analyze semantic search and linking of development artifacts.

7. References

[1] Y. Amemiya and T. W. Anderson, The Annals of Statistics, Institute of Mathematical Statistics, vol. 18, no. 3, (1990), pp. 1453-1463.
[2] C. Amrit and J. V. Hillegersberg, Detecting Coordination Problems in Collaborative Software Development Environments, Information Systems Management, vol. 25, no. 1, (2008), pp. 57-70.
[3] R. Battin and R. Crocker, Leveraging Resources in Global Software Development, IEEE Software, vol. 18, no. 2, (2001).
[4] B. Berliner, CVS II: Parallelizing Software Development, In Proceedings of the USENIX Winter 1990 Technical Conference, Berkeley, CA, (1990).
[5] A. W. Brown and G. Booch, Reusing Open-Source Software and Practices, The Impact of Open-Source on Commercial Vendors, (2002).
[6] E. Carmel and R. Agarwal, Tactical Approaches for Alleviating Distance in Global Software Development, IEEE Software, (2001), pp. 22-29.
[7] D. Cubranic, Hipikat, University of British Columbia, IBM Ottawa Software Lab, and the National Research Council of Canada, (2004) October, pp. 117-126, Canada.
[8] M. E. Domino, Conflicts in Collaborative Software Development, Proceedings of the 2003 SIGMIS conference on Computer personnel research: Freedom in Philadelphia leveraging differences and diversity in the IT workforce, Philadelphia, Pennsylvania, ACM Press New York, NY, USA, (2003).
[9] K. Ehrlich, G. Valetto and M. Helander, Seeing inside: Using social network analysis to understand patterns of collaboration and coordination in global software teams, Global Software Engineering, (2007).
[10] S. Handschuh, The Nepomuk Project - On the Way to the Social Semantic Desktop, In Proceedings of ISemantic 07', vol. 9, (2007), pp. 201-211.
[11] J. Herbsleb and A. Mockus, An empirical study of speed and communication in globally-distributed software development, IEEE Trans. Softw. Eng., vol. 29, (2003), pp. 1-14.
[12] J. Herbsleb and D. Moitra, Guest Editors' Introduction: Global Software Development, IEEE Software, (2001), pp. 16-20.
[13] J. Herbsleb, D. Paulish and M. Bass, Global software development at Siemens: experience from nine projects, Proceedings of the 27th international conference on Software engineering, St. Louis, MO, USA: ACM, (2005).
[14] E. V. Hippel and G. Krogh, Open Source Software and the "Private-Collective" Innovation Model: Issues for Organization Science, Organization Science, vol. 14, no. 2, (2003), pp. 209-223.
[15] E. Katsamakas and N. Georgantzas, Why most open source development projects do not succeed?, Fordham University, NY, (2007).
[16] H. O. Lancaster and E. Seneta, Chi-Square Distribution, Encyclopedia of Biostatistics, (2005).

[17] B. Lings, Ten Strategies for Successful Distributed Development, The Transfer and Diffusion of Information Technology for Organizational Resilience, (2006), pp. 19-37.
[18] T. A. Mens, State-of-the-Art Survey on Software Merging, In Transactions on Software Engineering, IEEE, vol. 28, no. 5, (2002), pp. 449-462.
[19] R. Miller and D. Siegmund, Maximally Selected Chi Square Statistics, Biometrics, vol. 38, (1982), pp. 1011-1016.
[20] A. Satorra and P. M. Bentler, A scaled difference chi-square test statistic for moment structure analysis, vol. 66, no. 4, (1994), pp. 507-514, DOI: 10.1007/BF02296192.

Authors

Dr. Javed Ferzund
I am Head of Department at the Department of Computer Science, COMSATS Institute of Information Technology, Sahiwal. I received my PhD from Graz University of Technology, Austria in 2009. My main research interests are in software engineering, semantic web and machine learning. Particularly, I am interested in software maintenance & evolution, reverse engineering, software metrics, ontologies, semantic techniques and mining software repositories. I am working in the Knowledge Systems Research Lab at COMSATS Institute Sahiwal.

Saad Razzaq
I am an assistant professor at the Department of Computer Science, University of Sargodha. I received my MS from FAST-NU, Pakistan in 2006. My main research interests are in machine learning, data warehouse, and cloud computing.

Robail Yasrab
I am a lecturer at the Department of Computer Science, COMSATS Institute of Information Technology, Sahiwal. I received my MS from University of Sargodha, Pakistan in 2012. My main research interests are in software engineering.