A Crowd Method for Internet-based Software with Big Data
- Madeleine Lawrence
1 2014 Central South University-Intel Workshop on Transparent Computing and Big Data. A Crowd Method for Internet-based Software with Big Data. Gang Yin, Software Collaboration and Data Mining Group, National University of Defense Technology. Changsha, July 1, 2014.
2 Contents: Motivation, Approach, Application. From Bazaar to Big Data. 2014/7/14
3 Internet-based Software: On the Internet, In the Internet, Through the Internet. The various online user communities are reshaping the development of Internet-based software. (Figure labels: software Q&A, software versions.)
4 Characteristics of Internet-based Software. Function: Attractive Solutions and Features. Construction: Rapid Experience and Response. Evolution: Continuous Evolution and Improvement.
5 Characteristics of Internet-based Software: Production Oriented, Innovation Oriented.
6 Open Source Miracles. Richard Stallman: launched the GNU Project, wrote the GPL. Linus Torvalds: leads the Linux kernel project. Eric Raymond.
7 Open Source Miracles: Collaborative Development Communities (SourceForge, GitHub, MIUI, Baidu Crowd Test). SourceForge: 3.5 million users, 400,000 projects. GitHub: 4 million users, 6 million repositories. MIUI: 1 million users.
8 Open Source Miracles: Knowledge Sharing Communities (StackOverflow, OSChina, CSDN, ZDNet, Slashdot). 2 million users; users, developers, and IT practitioners; 14 million topics; average response time: 11 minutes. Open source software has strongly demonstrated the power of the Crowds.
9 Open Source Miracles
10 Other Peer-based Practices: Peering, Sharing, Collaboration.
11 Crowd-based Approach. Timeline (1960s, 1970s, 1990s): High-Level Language, Software Engineering, Open Source. Approaches: Automated Approach, Engineering Approach, Crowd-based Approach?
12 Crowd-based Approach: Step I. Crowd-based Approach: Traditional Approaches, Peer-based Approaches.
13 Big Data in Software Development. Collaborative Development Communities: project profile, source code, issue tracker, mailing list, API; software, user, tag, time. Knowledge Sharing Communities: Q&A, tags/features, forum posts, blogs/news. These data contain valuable information and knowledge for crowd-based software development.
14 The Power of Big Data. Crowd-based Approach: Traditional Approaches, Peer-based Approaches. Scope, Quality. Internet-based Software Communities: SourceForge, GitHub, Ohloh, Softpedia, StackOverflow.
15 Crowd-based Approach: Step II. Crowd-based Approach with Big Data? Human-Centric: Fundamental Approaches, Peer-based Approaches. Data-Centric: Approaches for Mining Engineering Data, Approaches for Mining Community Data. How can the strengths of the Crowds and Big Data be combined?
16 Trustie Project. National High-Tech Development Plan (863 Program). National Trustworthy Software Resource Sharing and Cooperating Production Environment (Trustie, since 2007).
17 Contents: Motivation, Approach, Application. The secret of our approach is the meaning of trustworthiness.
18 Software Trustworthiness. "Given enough eyeballs, all bugs are shallow." The history of Linux suggested a surprising theory about software engineering. Human-Centric Vision.
19 Software Trustworthiness. Trustworthiness of Internet-based software is hidden in the big data: Novelty, Productivity, Quality. Open source software gives us a new sense of value for software development. Engineering Data + Community Data. Data-Centric Vision.
20 Data-centric Innovation Cycle. Software Data; Crowd-based Creation, Crowd-based Construction, Crowd-based Evolution.
21 The Crowd Method. Three Key Principles: Open Sharing, Mass Collaboration, Data Analysis.
22 Principles of the Crowd Method. The three key principles should be carried out during all innovation cycles.
23 Research Issues on Software Big Data (Internet Software Communities). Mass Collaboration: How can engineers and crowds be supported in collaborating on large-scale development? How can crowd development be enabled for industrial software production? Data Analysis: How can the contributions of developers in projects be evaluated? How can the trustworthiness of software artifacts be evaluated? Open Resource Sharing: How can software be found more accurately across the various Internet communities? How can trustworthy software artifacts be located in Internet communities? The Trustie team has published papers in international journals (TSE, TSC, JASE, ...) and top-level conferences (ICSE, ASE, FSE, ICSM, ...).
24 Results on Data Analysis. Developer productivity plateaus within 6-7 months in small and medium projects, and takes up to 12 months in large projects. Minghui Zhou, Audris Mockus: Developer fluency: achieving true mastery in software projects. SIGSOFT FSE 2010.
25 Results on Data Analysis. The crowds can find interesting projects. The crowds can collaborate with engineers. Minghui Zhou, Audris Mockus: What make long term contributors: Willingness and opportunity in OSS community. ICSE 2012.
26 New Results on Mass Collaboration. Android Issue Tracker (bugs) and StackOverflow (Q&A community). Text: similarity of the texts of bugs and posts. Time: the time when the issues and Q&A are published. Co-occurred users: users who appear in both communities. Automatic Knowledge Propagation across Communities: A Case Study of Android Issue Tracker and Stack Overflow, to be submitted.
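The three signals named on this slide (text, time, co-occurred users) can be illustrated with a minimal sketch. The slide does not say how the study computes them, so everything here is an assumption for illustration: the record layout, the hypothetical `link_signals` helper, and the use of Jaccard word overlap as the text similarity.

```python
import re
from datetime import datetime

def tokens(text):
    """Lowercased word set; a stand-in for whatever text model the study used."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def link_signals(bug, post):
    """Return the three signals for one (bug, post) pair (hypothetical helper)."""
    return {
        # Text: similarity of the bug text and the post text
        "text_sim": jaccard(tokens(bug["text"]), tokens(post["text"])),
        # Time: distance in days between the two publication dates
        "days_apart": abs((bug["published"] - post["published"]).days),
        # Co-occurred users: accounts active on both items
        "shared_users": set(bug["users"]) & set(post["users"]),
    }

# Made-up example records, not data from the study
bug = {"text": "Camera preview crashes on rotation",
       "published": datetime(2014, 3, 1), "users": ["alice", "bob"]}
post = {"text": "App crashes when camera preview is rotated",
        "published": datetime(2014, 3, 4), "users": ["bob", "carol"]}
signals = link_signals(bug, post)
```

A real pipeline would likely weight or learn from these signals rather than inspect them one pair at a time; the sketch only shows what each signal measures.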
27 New Results on Mass Collaboration. Coder, Reviewer; Prediction Classifier; Top-N; 0.17. Who Should Review this Pull-Request: Recommending Reviewers to Expedite Crowd Collaboration, to be submitted.
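A Top-N reviewer recommendation can be sketched as a ranking problem. The slide mentions a prediction classifier but gives no details, so the sketch below swaps in a much simpler technique, word overlap with previously reviewed pull requests, and should not be read as the authors' method. The `recommend_reviewers` function and the sample history are hypothetical.

```python
import re
from collections import defaultdict

def words(text):
    """Lowercased word list of a pull-request title or description."""
    return re.findall(r"[a-z0-9]+", text.lower())

def recommend_reviewers(history, new_pr_text, n=2):
    """Rank past reviewers by word overlap between the new PR and PRs they reviewed.

    history: list of (pr_text, [reviewer, ...]) pairs (assumed layout).
    Returns at most n reviewer names, best first, ties broken alphabetically.
    """
    new_words = set(words(new_pr_text))
    scores = defaultdict(int)
    for pr_text, reviewers in history:
        overlap = len(new_words & set(words(pr_text)))
        for reviewer in reviewers:
            scores[reviewer] += overlap
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    # Reviewers with no overlap at all are not recommended
    return [name for name, score in ranked[:n] if score > 0]

# Made-up review history for illustration
history = [
    ("Fix memory leak in image cache", ["dana"]),
    ("Add image cache eviction policy", ["dana", "eli"]),
    ("Update build script for CI", ["fay"]),
]
top = recommend_reviewers(history, "Image cache returns stale entries", n=2)
```

Replacing the overlap score with a trained classifier's predicted probability would keep the same Top-N ranking step.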
28 New Results on Resource Sharing. SourceForge hierarchical categories; software communities (Ohloh, Freecode); aggregation of online descriptions; hierarchical classifier. Fine-grained, efficient software resource classification for crowd-generated artifacts. Tao Wang, Huaimin Wang, Gang Yin, Charles X. Ling, Xiang Li, Peng Zou: Mining Software Profile across Multiple Repositories for Hierarchical Categorization. ICSM 2013.
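The two steps on this slide, aggregating a project's descriptions from several communities and then classifying it into a category hierarchy, can be sketched as follows. This is only an illustration under stated assumptions: the category tree and its keywords are invented, and keyword overlap stands in for the trained classifier used in the cited paper.

```python
import re

# Hypothetical two-level category tree with keyword sets (not SourceForge's real one)
CATEGORY_TREE = {
    "Development": {"Version Control": {"git", "commit", "repository"},
                    "Build Tools": {"build", "compile", "make"}},
    "Multimedia": {"Audio": {"audio", "mp3", "sound"},
                   "Video": {"video", "codec", "stream"}},
}

def words(text):
    """Lowercased word set of one description."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def aggregate(descriptions):
    """Merge one project's descriptions from several communities into one word set."""
    merged = set()
    for text in descriptions.values():
        merged |= words(text)
    return merged

def classify(descriptions):
    """Top-down classification: pick the best parent category, then the best child."""
    profile = aggregate(descriptions)

    def parent_score(children):
        return sum(len(profile & keywords) for keywords in children.values())

    parent = max(CATEGORY_TREE, key=lambda p: parent_score(CATEGORY_TREE[p]))
    children = CATEGORY_TREE[parent]
    child = max(children, key=lambda c: len(profile & children[c]))
    return parent, child

# Made-up descriptions of one project in three communities
project = {
    "sourceforge": "A fast audio player",
    "ohloh": "Plays mp3 and other sound formats",
    "freecode": "Lightweight player with stream support",
}
label = classify(project)
```

Aggregation matters in this example because no single description contains all three audio keywords; only the merged profile does.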
29 Platform and Practices. Application Practices: Application in Large Scale Software Industries (Neusoft, Careland, Wonders Group, Digital China); Application in Mission Critical Systems (space flight, electricity, flight control, defense). Common Application Modes and Platforms: Enterprise Version, Community Version, Education Version; Component-based SPL, Service-oriented SPL, Heterogeneous SPL, Runtime-monitoring SPL, Third-party SPL. Development Environment: Trustie Collaborative Development Toolset, Trustie Software Resource Sharing Toolset, Trustie Software Data Storage and Analysis Toolset; Software Communities. Technology System: Large Scale Software Resource Sharing Technologies, Large Scale Software Collaborative Development Technologies, Big Data enabled Software Trustworthiness Analysis Technologies; Crowd-based Software Development Approach.
30 Contents: Motivation, Approach, Application (software industries, software engineering education, critical information systems). Is the Crowd Method practically efficient, or not?
31 Application in Internet Communities. Collaboration Community: more than 32,000 users; more than 1,500 projects; users and projects can be analyzed comprehensively. Sharing Community: various kinds of software resources (OSS, services, components); more than 60,000 evaluated resources.
32 Application in Software Industries. Neusoft Corporation: Trustie supported the development of 8 health care information systems at Neusoft. Software reusability increased by 75%; productivity increased by 65%. Digital China Holdings Limited: Digital China set up the industrial SPL for trustworthy taxation software development. Software reusability increased by 60%; the number of bugs decreased by 20%. Trustie has been adopted by more than 10 software companies in China and has successfully supported 22 large-scale software projects.
33 Application in Software Industries
34 Application in Universities. Diagram labels: course, project, course project, interests project, collaboration; MOOC, MOOP, MOOC 2.0. Big Data for Education?
35 Application in Universities. Project Hosting: version control, issue tracking, project profile, forum/wiki, Gantt/documents. Course Hosting: course management, member management, exercise monitoring, resource management, forum/message/board. Contest Hosting: contest publishing, submission of works, discussion, ranking, notification. Social Network + Data Analysis.
36 Future Work. Application of Trustie Technologies (Industry, Education, Critical System): MOOP, MOOC 2.0; software engineering education; software garden and industries. Research on the Crowd Method (Software Engineering, Network Analysis, Data Mining): data-driven collaborative development; data-driven software resource sharing; data-driven trustworthiness analysis.
37 2014 Central South University-Intel Workshop on Transparent Computing and Big Data. Thank You! Questions?
