NextBug: A Tool for Recommending Similar Bugs in Open-Source Systems
|
|
|
- Jeffery Gregory
- 9 years ago
- Views:
Transcription
1 NextBug: A Tool for Recommending Similar Bugs in Open-Source Systems Henrique S. C. Rocha 1, Guilherme A. de Oliveira 2, Humberto T. Marques-Neto 2, Marco Túlio O. Valente 1 1 Department of Computer Science Federal University of Minas Gerais (UFMG) Belo Horizonte MG Brazil 2 Department of Computer Science Pontifical Catholic University of Minas Gerais (PUC Minas) Belo Horizonte MG Brazil [email protected], guilherme.oliveira @sga.pucminas.br [email protected], [email protected] Abstract. Due to the characteristics of the maintenance process of open-source systems, grouping similar bugs to improve developers productivity is a challenging task. In this paper, we proposed and evaluate a tool, called NextBug, for recommending similar bugs in open-source systems. NextBug is implemented as Bugzilla plug-in and it was design to help maintainers selecting the next bug he/she would fix. We also report an experience on using NextBug with 109,145 bugs previously reported for Mozilla products. Video URL: < 1. Introduction Considering the great importance, the costs, and the increasing complexity of software maintenance activities, most organizations usually maintain their systems by performing tasks periodically, i.e., maintenance requests are grouped and implemented as part of large software projects [Tan and Mookerjee 2005; Aziz et al. 2009; Junio et al. 2011; Marques-Neto et al. 2013]. On the other hand, open-source projects typically adopt continuous maintenance policies where the maintenance requests are addressed by maintainers with different skills and commitment levels, as soon as possible, after being registered in an issue tracking platform, such as Bugzilla and Jira [Mockus et al. 2002; Tan and Mookerjee 2005; Liu et al. 2012]. However, this process is usually uncoordinated, which results in a high number of issues from which many are invalid or duplicated [Liu et al. 2012]. In 2005, a certified maintainer from the Mozilla Software foundation made the following comment on this situation: everyday, almost 300 bugs appear that need triaging. This is far too much for only the Mozilla programmers to handle [Anvik et al. 2006]. The dataset formed by bugs reported for the Mozilla projects indicates that, in 2011, the number of reported issues per year increased approximately 75% when compared to In this context, tools to assist in the issue processing would be very helpful and could contribute to increase the productivity of open-source systems development.
2 In this paper, we claim that a very simple form of periodic maintenance policy can be promoted in open-source systems by recommending similar maintenance requests to maintainers whenever they manifest interest in handling a given request. Suppose that a developer has manifested interest in a bug with a textual description d i. In this case, we rely on text mining techniques to retrieve open bugs with descriptions d j similar to d i and we recommend such bugs to the maintainers. More specifically, we present NextBug, a tool to recommend similar bugs to maintainers based on the textual description of each bug stored in Bugzilla, an issue tracking system widely used by open-source projects. The proposed tool is compatible with the software development process followed by open-source systems for the following reasons: (a) it is based on recommendations and, therefore, maintainers are not required to accept extra bugs to fix; (b) it is a fully automatic and unsupervised approach which does not depend on human intervention; and (c) it relies on information readily available in Bugzilla. Assuming the recommendations effectively denote similar bugs and supposing that the maintainers would accept the recommendations pointed out by NextBug, the tool can contribute to introduce gains of scale similar to the ones achieved with periodic policies [Banker and Slaughter 1997]. We also report a field study when we populated NextBug with a dataset of bugs reported for Mozilla systems. The remainder of this paper is organized as follows. Section 2 discuss tools for finding duplicated issue reports in bug tracking systems and also tools that assign bugs to developers. The architecture and the central features of NextBug are described in Section 3. An example of usage is presented in Section 4. Finally, conclusions are offered in Section Related Tools Most open-source systems adopt an Issue Tracking System (ITS) to support their maintenance processes. Normally, in such systems both users and testers can report modification requests [Liu et al. 2012]. This practice usually results in a continuous maintenance process where maintainers address the change requests as soon as possible. The ITS provides a central knowledge repository which also serves as a communication channel for geographically distributed developers and users [Anvik et al. 2006; Ihara et al. 2009]. Recent studies have focused on finding duplicated issue reports in bug tracking systems. Duplicated reports can hamper the bug triaging process and may drain maintenance resources [Cavalcanti et al. 2013]. Typically, studies for finding duplicated issues rely on traditional information retrieval techniques such as natural language processing, vector space model, and cosine similarity [Alipour et al. 2013]. Approaches to infer the most suitable developer to correct a software issue are also reported in the literature. Most of them can be viewed as recommendation systems that suggest developers to handle a reported bug. For instance, [Anvik and Murphy 2011] proposed an approach based on supervised machine learning that requires training to create a classifier. This classifier assigns the data (bug reports) to the closest developer. However, to the best of our knowledge, we are not aware of any tool designed to recommend similar bugs to maintainers of open-source systems.
3 Figure 1. NextBug Screenshot (similar bugs are shown in the lower right corner) 3. NextBug in a Nutshell In this section, we present NextBug 1 main features (Section 3.1). We also present the tool s architecture and main components (Section 3.2) Main Features Currently, there are several ITSs that are used in software maintenance such as Bugzilla, Jira, Mantis, and RedMine. NextBug was implemented as a Bugzilla plug-in mainly because this ITS is used by the Mozilla project, which was used to validate our tool. When a developer is analyzing or browsing an issue, NextBug can recommend similar bugs in the usual Bugzilla web interface. As described in Section 3.2, NextBug uses a textual similarity algorithm to verify the similarity among bug reports. Figure 1 shows an usage example of our tool. This figure shows a real bug from the Mozilla project, which refers to a FirefoxOS application issue related to a mobile device camera (Bug ). As we can observe, Bugzilla shows detailed information about this bug, such as a summary description, creation date, product, component, operational system, and hardware information. NextBug extends this original interface by showing a list of similar bugs to the browsed one. This list is shown on the lower right corner. Another important feature is that NextBug is only executed if its Ajax link is clicked and, thus, it will not cause additional overhead or hinder performance to developers who do not wish to use similar bug recommendations. In Figure 1, NextBug suggested three similar bugs to the one which is browsed on the screenshot. As we can note, NextBug not only detects similar bugs but it also calculates an index to express this similarity. Our final goal is to guide the developer s workflow by suggesting similar bugs to the one he/she is currently browsing. If a developer chooses to handle one of the recommended bugs, we claim he/she can minimize the context change inherent to the task of handling different bugs and, consequently, improve his/her productivity. 1 NextBug is open-source and available under the Mozilla Public License (MPL) at < labsoft.dcc.ufmg.br/nextbug/>.
4 Figure 2. NextBug Architecture 3.2. Architecture and Algorithms Figure 2 shows NextBug s architecture, including the system main components and the interaction among them. As described in Section 3.1, NextBug is a plug-in for Bugzilla. Therefore, it is implemented in Perl, the same language used in the implementation of Bugzilla. Basically, NextBug instruments the Bugzilla interface used for browsing and for selecting bugs reported for a system. NextBug registers an Ajax event in this interface that calls NextBug passing the browsed issue as an input parameter. NextBug architecture has two central components: Information Retrieval (IR) Process and Recommender. The IR Process component obtains all open issues currently available on the Bugzilla system along with the browsed issue. Then it relies on IR techniques for natural language processing such as: tokenization, stemming, and stop-words removal [Runeson et al. 2007]. We implemnted all such techniques in Perl. After this processing, the issues are transformed into vectors using the vector space model [Baeza-Yates and Ribeiro-Neto 1999; Runeson et al. 2007]. VSM is a classical information retrieval model to process documents and to quantify their similarities. The usage of VSM is accomplished by decomposing the data (available bug reports and queries) into t-dimensional vectors, assigning weights to each indexed term. The weights w i are positive real numbers that represent the i-th index term in the vector. To calculate w i we used the following equation, which is called a tf-idf weight formula: w i = (1 + log 2 f i ) log 2 N n i where f i is the frequency of the i-th term in the document, N is the total number of documents, and n i is the number of documents in which the i-th term occurs. The Recommender component receives the processed issues and verifies the ones similar to the browsed issue. The similarity is computed using the cosine similarity measure [Baeza-Yates and Ribeiro-Neto 1999; Runeson et al. 2007]. More specifically, the similarity between the vectors of a document d j and a query q is described by the following equation, which is called the cosine similarity because it measures the cosine of the angle between the two vectors: Sim(d j, q) = cos(θ) = dj q d j q = t t i=1 w i,d w i,q i=1 (w i,d) 2 t i=1 (w i,q) 2
5 Since all the weights are greater or equal to zero, we have 0 Sim(d j, q) 1, where zero indicates that there is no relation between the two vectors, and one indicates the highest possible similarity, i.e., both vectors are actually the same. The issues are then ordered according to their similarity before being returned to Bugzilla. Since NextBug started as an Ajax event, the recommendations are showed in the same Bugzilla interface used by developers for browsing and selecting bugs to fix. 4. Evaluation We used a dataset with bugs from the Mozilla project to evaluate the proposed tool. Mozilla is composed of 69 products from different domains which are implemented in different programming languages. Mozilla project includes some popular systems such as Firefox, Thunderbird, SeaMonkey, and Bugzilla. We considered only issues that were actually fixed from January 2009 to October More specifically, we ignored issue types such as duplicated, incomplete, and invalid. Mozilla issues are also classified according to their severity in the following scale: blocker, critical, major, normal, minor, and trivial. Table 1 shows the number and the percentage of each of these severity categories in the considered dataset. This scale also includes enhancements as a particular severity category. Although, it was not considered in our study, i.e., we do not provide recommendations for similar enhancements. Severity Table 1. Issues per Severity Issues Days to Resolve Number % Min Max Avg Dev Med blocker 2, critical 7, enhancement 3, major 7, minor 3, normal 103, trivial 2, Total 130, Final Dataset 109, Table 1 also shows the number of days required to fix the issues in each category. We can observe that blocker bugs are quickly corrected by developers, showing the lowest values for maximum, average, standard deviation, and median measures among the considered categories. The presented lifetimes also indicate that issues with critical and major severity are closer to each other. Finally, enhancements are very different from the others, showing the highest values for average, standard deviation, and median. Issues marked as blocker, critical, or major were not considered in our evaluation because developers have to fix them as quickly as possible. In other words, they would probably not consider fixing other issues together, since their ultimate priority is to fix the main blocker issue. In other words, our dataset is formed by fixed issues classified as normal, minor, and trivial. These issues count for 109,154 bugs (83.64%) from our initial population of bugs available for the NextBug evaluation.
6 We used three metrics in our evaluation: Feedback, Precision, and Likelihood. These metrics were inspired by the evaluation followed by the ROSE recommendation system [Zimmermann et al. 2004]. Feedback presents the ratio of queries where NextBug makes at least k recommendations. Precision indicates the percentage of recommendations that were actually relevant among the top-k suggestions by NextBug. Finally, Likelihood indicates whether at least one relevant recommendation is included in NextBug s top-k suggestions. In this evaluation, we defined a relevant recommendation as one that shares the same developer with the main issue. More specifically, we consider that a recently created issue q is connected to a second opened issue when they are handled by the same developer. The assumption in this case is that our approach fosters gains of productivity whenever it recommends issues that were later fixed anyway by the same developer. Figure 3 shows the average feedback (left chart), precision (central chart) and likelihood (right chart) up to k = 5. metric value Feedback Precision Likelihood k=1 k=2 k=3 k=4 k=5 Figure 3. Average Evaluation Results for k = 1 to k = 5. We summarize our results as follows: We achieved a feedback of 0.63 for k = 1. Therefore, on average, NextBug made at least one suggestion for 63% of the bugs, i.e., for every five bugs NextBug was able to provide at least one similar recommendation for three of those. Moreover, NextBug showed on average 3.2 recommendations for its queries. We achieved a precision of 0.31 or more for all values of k. In other words, the NextBug recommendations were on average 31% relevant (i.e., further handled by the same developer), no matter how many suggestions were given. We achieved a likelihood of 0.54 for k = 3. More specifically, in about 54% of the cases, there is a top-3 recommendation that was later handled by the same developer responsible for the original bug. We also conducted a survey with Mozilla developers using our tool. We gave recommendations suggested by NextBug to 176 Mozilla maintainers and asked them a few questions. Our summarized results are: (i) 77% found our recommendations relevant; (ii) 85% confirmed that a tool to recommend similar bugs would be useful to the Mozilla community and it would allow them to do more work in less time.
7 4.1. Example of Recommendation Table 2 presents an example of a bug (browsed or main issue) opened for the component DOM:Device Interfaces of the Core Mozilla product and the first three recommendations (top-3) suggested by our tool for this bug. As we can observe in the summary description, both query and recommendations require maintenance in the Device Storage API, used by Web apps to access local file systems. Moreover, all four issues were handled by the same developer (Dev ID ). Table 2. Example of Recommendation Similarity Bug ID Summary Creation Date Fix Date Device Storage - Default Browsed location for device storage on windows should be NS WIN PERSONAL DIR Top-1 56% Device Storage - Clean up error strings Top-2 47% Device Storage - Convert tests to use public types Top-3 42% Device Storage - use a properties file instead of the mime service We can also observe that the three recommended issues were created before the original query. In fact, the developer fixed the bugs associated to the second and the third recommendations in the same date which he has fixed the original query, i.e. on However, he only resolved the other recommended bug (ID ) 41 days latter, i.e., on Therefore, our approach would have helped this maintainer to discover quickly the related issues. This task probably demanded more effort without a recommendation automatically provided by a tool like NextBug. 5. Conclusion This paper presented NextBug, a tool for recommending similar bugs. NextBug is implemented as a plug-in for Bugzilla, a widely used Issue Tracking Systems (ITS), specially used by open-source systems. The proposed tool relies on information retrieval techniques to extract semantic information from issue reports in order to identify the similarity of open bugs with the one that is being handled by a developer. We evaluate the NextBug with a dataset of 109,154 Mozilla bugs, achieving feedback results of 63%, precision results around 31% and likelihood results greater than 54%. Those results are very reasonable compared to other recommendation tools. We also conducted a survey with 176 Mozilla developers using recommendations provided by NextBug. From such developers, 77% of them thought our recommendations were relevant and 85% confirmed that a tool like NextBug would be useful to the Mozilla community. 6. Acknowledgements This work was supported by CNPq, CAPES, and FAPEMIG.
8 References [Alipour et al. 2013] Alipour, A., Hindle, A., and Stroulia, E. (2013). A contextual approach towards more accurate duplicate bug report detection. In 10th Working Conference on Mining Software Repositories (MSR), pages [Anvik et al. 2006] Anvik, J., Hiew, L., and Murphy, G. C. (2006). Who should fix this bug? In 28th International Conference on Software engineering (ICSE), pages [Anvik and Murphy 2011] Anvik, J. and Murphy, G. C. (2011). Reducing the effort of bug report triage: recommenders for development-oriented decisions. ACM Transactions on Software Engineering Methodology (TOSEM), 20(3):10:1 10:35. [Aziz et al. 2009] Aziz, J., Ahmed, F., and Laghari, M. (2009). Empirical analysis of team and application size on software maintenance and support activities. In 1st International Conference on Information Management and Engineering (ICIME), pages [Baeza-Yates and Ribeiro-Neto 1999] Baeza-Yates, R. A. and Ribeiro-Neto, B. (1999). Modern information retrieval. Addison-Wesley, 2nd edition. [Banker and Slaughter 1997] Banker, R. D. and Slaughter, S. A. (1997). A field study of scale economies in software maintenance. Management Science, 43: [Cavalcanti et al. 2013] Cavalcanti, Y. C., Mota Silveira Neto, P. A., Lucrédio, D., Vale, T., Almeida, E. S., and Lemos Meira, S. R. (2013). The bug report duplication problem: an exploratory study. Software Quality Journal, 21(1): [Ihara et al. 2009] Ihara, A., Ohira, M., and Matsumoto, K. (2009). An analysis method for improving a bug modification process in open source software development. In 7th International Workshop Principles of Software Evolution and Software Evolution (IWPSE-Evol), pages [Junio et al. 2011] Junio, G., Malta, M., de Almeida Mossri, H., Marques-Neto, H., and Valente, M. (2011). On the benefits of planning and grouping software maintenance requests. In 15th European Conference on Software Maintenance and Reengineering (CSMR), pages [Liu et al. 2012] Liu, K., Tan, H. B. K., and Chandramohan, M. (2012). Has this bug been reported? In 20th ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE), pages 28:1 28:4. [Marques-Neto et al. 2013] Marques-Neto, H., Aparecido, G. J., and Valente, M. T. (2013). A quantitative approach for evaluating software maintenance services. In 28th ACM Symposium on Applied Computing (SAC), pages [Mockus et al. 2002] Mockus, A., Fielding, R. T., and Herbsleb, J. D. (2002). Two case studies of open source software development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology, 11(3): [Runeson et al. 2007] Runeson, P., Alexandersson, M., and Nyholm, O. (2007). Detection of duplicate defect reports using natural language processing. In 29th International Conference on Software Engineering (ICSE), pages [Tan and Mookerjee 2005] Tan, Y. and Mookerjee, V. (2005). Comparing uniform and flexible policies for software maintenance and replacement. IEEE Transactions on Software Engineering, 31(3): [Zimmermann et al. 2004] Zimmermann, T., Weisgerber, P., Diehl, S., and Zeller, A. (2004). Mining version histories to guide software changes. In 26th International Conference on Software Engineering (ICSE), pages
An Empirical Study on Recommendations of Similar Bugs
An Empirical Study on Recommendations of Similar Bugs Henrique Rocha, Marco Tulio Valente, Humberto Marques-Neto, and Gail C. Murphy Federal University of Minas Gerais, Brazil Pontifical Catholic University
Assisting bug Triage in Large Open Source Projects Using Approximate String Matching
Assisting bug Triage in Large Open Source Projects Using Approximate String Matching Amir H. Moin and Günter Neumann Language Technology (LT) Lab. German Research Center for Artificial Intelligence (DFKI)
Assisting bug Triage in Large Open Source Projects Using Approximate String Matching
Assisting bug Triage in Large Open Source Projects Using Approximate String Matching Amir H. Moin and Günter Neumann Language Technology (LT) Lab. German Research Center for Artificial Intelligence (DFKI)
BugMaps-Granger: A Tool for Causality Analysis between Source Code Metrics and Bugs
BugMaps-Granger: A Tool for Causality Analysis between Source Code Metrics and Bugs César Couto 1,2, Pedro Pires 1, Marco Túlio Valente 1, Roberto S. Bigonha 1, Andre Hora 3, Nicolas Anquetil 3 1 Department
A Visualization Approach for Bug Reports in Software Systems
, pp. 37-46 http://dx.doi.org/10.14257/ijseia.2014.8.10.04 A Visualization Approach for Bug Reports in Software Systems Maen Hammad 1, Somia Abufakher 2 and Mustafa Hammad 3 1, 2 Department of Software
Classification Algorithms for Detecting Duplicate Bug Reports in Large Open Source Repositories
Classification Algorithms for Detecting Duplicate Bug Reports in Large Open Source Repositories Sarah Ritchey (Computer Science and Mathematics) [email protected] - student Bonita Sharif (Computer
Open Source Bug Tracking Characteristics In Two Open Source Projects, Apache and Mozilla
Open Source Bug Tracking Characteristics In Two Open Source Projects, Apache and Mozilla Mikko Koivusaari, Jouko Kokko, Lasse Annola Abstract 1. Introduction 2. Bug tracking in open source projects 2.1
What Questions Developers Ask During Software Evolution? An Academic Perspective
What Questions Developers Ask During Software Evolution? An Academic Perspective Renato Novais 1, Creidiane Brito 1, Manoel Mendonça 2 1 Federal Institute of Bahia, Salvador BA Brazil 2 Fraunhofer Project
University of Alberta. Anahita Alipour. Master of Science. Department of Computing Science
University of Alberta A CONTEXTUAL APPROACH TOWARDS MORE ACCURATE DUPLICATE BUG REPORT DETECTION by Anahita Alipour A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment
Does the Act of Refactoring Really Make Code Simpler? A Preliminary Study
Does the Act of Refactoring Really Make Code Simpler? A Preliminary Study Francisco Zigmund Sokol 1, Mauricio Finavaro Aniche 1, Marco Aurélio Gerosa 1 1 Department of Computer Science University of São
BugMaps-Granger: a tool for visualizing and predicting bugs using Granger causality tests
Couto et al. Journal of Software Engineering Research and Development 2014, 2:1 SOFTWARE Open Access BugMaps-Granger: a tool for visualizing and predicting bugs using Granger causality tests Cesar Couto
Integrating Service Oriented MSR Framework and Google Chart Tools for Visualizing Software Evolution
2012 Fourth International Workshop on Empirical Software Engineering in Practice Integrating Service Oriented MSR Framework and Google Chart Tools for Visualizing Software Evolution Yasutaka Sakamoto,
Semantic Search in Portals using Ontologies
Semantic Search in Portals using Ontologies Wallace Anacleto Pinheiro Ana Maria de C. Moura Military Institute of Engineering - IME/RJ Department of Computer Engineering - Rio de Janeiro - Brazil [awallace,anamoura]@de9.ime.eb.br
BUGMINER: Software Reliability Analysis Via Data Mining of Bug Reports
BUGMINER: Software Reliability Analysis Via Data Mining of Bug Reports Leon Wu Boyi Xie Gail Kaiser Rebecca Passonneau Department of Computer Science Columbia University New York, NY 10027 USA {leon,xie,kaiser,becky}@cs.columbia.edu
PDF hosted at the Radboud Repository of the Radboud University Nijmegen
PDF hosted at the Radboud Repository of the Radboud University Nijmegen The following full text is an author's version which may differ from the publisher's version. For additional information about this
Bug Localization Using Revision Log Analysis and Open Bug Repository Text Categorization
Bug Localization Using Revision Log Analysis and Open Bug Repository Text Categorization Amir H. Moin and Mohammad Khansari Department of IT Engineering, School of Science & Engineering, Sharif University
Understanding the popularity of reporters and assignees in the Github
Understanding the popularity of reporters and assignees in the Github Joicy Xavier, Autran Macedo, Marcelo de A. Maia Computer Science Department Federal University of Uberlândia Uberlândia, Minas Gerais,
Processing and data collection of program structures in open source repositories
1 Processing and data collection of program structures in open source repositories JEAN PETRIĆ, TIHANA GALINAC GRBAC AND MARIO DUBRAVAC, University of Rijeka Software structure analysis with help of network
A Systematic Review Process for Software Engineering
A Systematic Review Process for Software Engineering Paula Mian, Tayana Conte, Ana Natali, Jorge Biolchini and Guilherme Travassos COPPE / UFRJ Computer Science Department Cx. Postal 68.511, CEP 21945-970,
JSClassFinder: A Tool to Detect Class-like Structures in JavaScript
JSClassFinder: A Tool to Detect Class-like Structures in JavaScript Leonardo Humberto Silva 1, Daniel Hovadick 2, Marco Tulio Valente 2, Alexandre Bergel 3,Nicolas Anquetil 4, Anne Etien 4 1 Department
Enriching SE Ontologies with Bug Report Quality
Enriching SE Ontologies with Bug Report Quality Philipp Schuegerl 1, Juergen Rilling 1, Philippe Charland 2 1 Department of Computer Science and Software Engineering, Concordia University, Montreal, Canada
Automated Duplicate Detection for Bug Tracking Systems
Automated Duplicate Detection for Bug Tracking Systems Nicholas Jalbert University of Virginia Charlottesville, Virginia 22904 [email protected] Westley Weimer University of Virginia Charlottesville,
Assigning Bug Reports using a Vocabulary-Based Expertise Model of Developers
Assigning Bug Reports using a Vocabulary-Based Expertise Model of Developers Dominique Matter, Adrian Kuhn, Oscar Nierstrasz Software Composition Group University of Bern, Switzerland http://scg.unibe.ch
Analysis of Software Project Reports for Defect Prediction Using KNN
, July 2-4, 2014, London, U.K. Analysis of Software Project Reports for Defect Prediction Using KNN Rajni Jindal, Ruchika Malhotra and Abha Jain Abstract Defect severity assessment is highly essential
Semantic Concept Based Retrieval of Software Bug Report with Feedback
Semantic Concept Based Retrieval of Software Bug Report with Feedback Tao Zhang, Byungjeong Lee, Hanjoon Kim, Jaeho Lee, Sooyong Kang, and Ilhoon Shin Abstract Mining software bugs provides a way to develop
Automating the Measurement of Open Source Projects
Automating the Measurement of Open Source Projects Daniel German Department of Computer Science University of Victoria [email protected] Audris Mockus Avaya Labs Department of Software Technology Research
Clustering Technique in Data Mining for Text Documents
Clustering Technique in Data Mining for Text Documents Ms.J.Sathya Priya Assistant Professor Dept Of Information Technology. Velammal Engineering College. Chennai. Ms.S.Priyadharshini Assistant Professor
A COMBINED TEXT MINING METHOD TO IMPROVE DOCUMENT MANAGEMENT IN CONSTRUCTION PROJECTS
A COMBINED TEXT MINING METHOD TO IMPROVE DOCUMENT MANAGEMENT IN CONSTRUCTION PROJECTS Caldas, Carlos H. 1 and Soibelman, L. 2 ABSTRACT Information is an important element of project delivery processes.
The Importance of Bug Reports and Triaging in Collaborational Design
Categorizing Bugs with Social Networks: A Case Study on Four Open Source Software Communities Marcelo Serrano Zanetti, Ingo Scholtes, Claudio Juan Tessone and Frank Schweitzer Chair of Systems Design ETH
Database Marketing, Business Intelligence and Knowledge Discovery
Database Marketing, Business Intelligence and Knowledge Discovery Note: Using material from Tan / Steinbach / Kumar (2005) Introduction to Data Mining,, Addison Wesley; and Cios / Pedrycz / Swiniarski
Bug Localization Using Revision Log Analysis and Open Bug Repository Text Categorization
Bug Localization Using Revision Log Analysis and Open Bug Repository Text Categorization Amir H. Moin and Mohammad Khansari Department of IT Engineering, School of Science & Engineering, Sharif University
Creating Synthetic Temporal Document Collections for Web Archive Benchmarking
Creating Synthetic Temporal Document Collections for Web Archive Benchmarking Kjetil Nørvåg and Albert Overskeid Nybø Norwegian University of Science and Technology 7491 Trondheim, Norway Abstract. In
Text Mining Approach for Big Data Analysis Using Clustering and Classification Methodologies
Text Mining Approach for Big Data Analysis Using Clustering and Classification Methodologies Somesh S Chavadi 1, Dr. Asha T 2 1 PG Student, 2 Professor, Department of Computer Science and Engineering,
Mining the Software Change Repository of a Legacy Telephony System
Mining the Software Change Repository of a Legacy Telephony System Jelber Sayyad Shirabad, Timothy C. Lethbridge, Stan Matwin School of Information Technology and Engineering University of Ottawa, Ottawa,
Data Mining Yelp Data - Predicting rating stars from review text
Data Mining Yelp Data - Predicting rating stars from review text Rakesh Chada Stony Brook University [email protected] Chetan Naik Stony Brook University [email protected] ABSTRACT The majority
Predict the Popularity of YouTube Videos Using Early View Data
000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050
ANALYSIS OF OPEN SOURCE DEFECT TRACKING TOOLS FOR USE IN DEFECT ESTIMATION
ANALYSIS OF OPEN SOURCE DEFECT TRACKING TOOLS FOR USE IN DEFECT ESTIMATION Catherine V. Stringfellow, Dileep Potnuri Department of Computer Science Midwestern State University Wichita Falls, TX U.S.A.
Search engine ranking
Proceedings of the 7 th International Conference on Applied Informatics Eger, Hungary, January 28 31, 2007. Vol. 2. pp. 417 422. Search engine ranking Mária Princz Faculty of Technical Engineering, University
Data Collection from Open Source Software Repositories
Data Collection from Open Source Software Repositories GORAN MAUŠA, TIHANA GALINAC GRBAC SEIP LABORATORY FACULTY OF ENGINEERING UNIVERSITY OF RIJEKA, CROATIA Software Defect Prediction (SDP) Aim: Focus
Using Provenance to Improve Workflow Design
Using Provenance to Improve Workflow Design Frederico T. de Oliveira, Leonardo Murta, Claudia Werner, Marta Mattoso COPPE/ Computer Science Department Federal University of Rio de Janeiro (UFRJ) {ftoliveira,
How To Cluster On A Search Engine
Volume 2, Issue 2, February 2012 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: A REVIEW ON QUERY CLUSTERING
Combining Text Mining and Data Mining for Bug Report Classification
2014 IEEE International Conference on Software Maintenance and Evolution Combining Text Mining and Data Mining for Bug Report Classification Yu Zhou 1,2, Yanxiang Tong 1, Ruihang Gu 1, Harald Gall 3 1
ReLink: Recovering Links between Bugs and Changes
ReLink: Recovering Links between Bugs and Changes Rongxin Wu, Hongyu Zhang, Sunghun Kim and S.C. Cheung School of Software, Tsinghua University Beijing 100084, China [email protected], [email protected]
Context Capture in Software Development
Context Capture in Software Development Bruno Antunes, Francisco Correia and Paulo Gomes Knowledge and Intelligent Systems Laboratory Cognitive and Media Systems Group Centre for Informatics and Systems
1 o Semestre 2007/2008
Departamento de Engenharia Informática Instituto Superior Técnico 1 o Semestre 2007/2008 Outline 1 2 3 4 5 Outline 1 2 3 4 5 Exploiting Text How is text exploited? Two main directions Extraction Extraction
Data Mining in Web Search Engine Optimization and User Assisted Rank Results
Data Mining in Web Search Engine Optimization and User Assisted Rank Results Minky Jindal Institute of Technology and Management Gurgaon 122017, Haryana, India Nisha kharb Institute of Technology and Management
RANKING WEB PAGES RELEVANT TO SEARCH KEYWORDS
ISBN: 978-972-8924-93-5 2009 IADIS RANKING WEB PAGES RELEVANT TO SEARCH KEYWORDS Ben Choi & Sumit Tyagi Computer Science, Louisiana Tech University, USA ABSTRACT In this paper we propose new methods for
The Empirical Commit Frequency Distribution of Open Source Projects
The Empirical Commit Frequency Distribution of Open Source Projects Carsten Kolassa Software Engineering RWTH Aachen University, Germany [email protected] Dirk Riehle Friedrich-Alexander-University Erlangen-Nürnberg,
COURSE RECOMMENDER SYSTEM IN E-LEARNING
International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand
Applied Mathematical Sciences, Vol. 7, 2013, no. 112, 5591-5597 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013.
Applied Mathematical Sciences, Vol. 7, 2013, no. 112, 5591-5597 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ams.2013.38457 Accuracy Rate of Predictive Models in Credit Screening Anirut Suebsing
An Information Retrieval using weighted Index Terms in Natural Language document collections
Internet and Information Technology in Modern Organizations: Challenges & Answers 635 An Information Retrieval using weighted Index Terms in Natural Language document collections Ahmed A. A. Radwan, Minia
Homework 2. Page 154: Exercise 8.10. Page 145: Exercise 8.3 Page 150: Exercise 8.9
Homework 2 Page 110: Exercise 6.10; Exercise 6.12 Page 116: Exercise 6.15; Exercise 6.17 Page 121: Exercise 6.19 Page 122: Exercise 6.20; Exercise 6.23; Exercise 6.24 Page 131: Exercise 7.3; Exercise 7.5;
A Bug You Like: A Framework for Automated Assignment of Bugs
A Bug You Like: A Framework for Automated Assignment of Bugs Olga Baysal Michael W. Godfrey Robin Cohen School of Computer Science University of Waterloo Waterloo, ON, Canada {obaysal, migod, rcohen}@uwaterloo.ca
Characterizing and Predicting Blocking Bugs in Open Source Projects
Characterizing and Predicting Blocking Bugs in Open Source Projects Harold Valdivia Garcia and Emad Shihab Department of Software Engineering Rochester Institute of Technology Rochester, NY, USA {hv1710,
Efficient Collection and Analysis of Program Data - Hackystat
You can t even ask them to push a button: Toward ubiquitous, developer-centric, empirical software engineering Philip Johnson Department of Information and Computer Sciences University of Hawaii Honolulu,
Debt-Prone Bugs: Technical Debt in Software Maintenance
Debt-Prone Bugs: Technical Debt in Software Maintenance Jifeng Xuan, Yan Hu, * He Jiang School of Software, Dalian University of Technology Dalian 116621, China [email protected], [email protected],
A Manual Categorization of Android App Development Issues on Stack Overflow
2014 IEEE International Conference on Software Maintenance and Evolution A Manual Categorization of Android App Development Issues on Stack Overflow Stefanie Beyer Software Engineering Research Group University
Multi-Project Software Engineering: An Example
Multi-Project Software Engineering: An Example Pankaj K Garg [email protected] Zee Source 1684 Nightingale Avenue, Suite 201, Sunnyvale, CA 94087, USA Thomas Gschwind [email protected] Technische
Legal Informatics Final Paper Submission Creating a Legal-Focused Search Engine I. BACKGROUND II. PROBLEM AND SOLUTION
Brian Lao - bjlao Karthik Jagadeesh - kjag Legal Informatics Final Paper Submission Creating a Legal-Focused Search Engine I. BACKGROUND There is a large need for improved access to legal help. For example,
Efficient Bug Triaging Using Text Mining
2185 Efficient Bug Triaging Using Text Mining Mamdouh Alenezi and Kenneth Magel Department of Computer Science, North Dakota State University Fargo, ND 58108, USA Email: {mamdouh.alenezi, kenneth.magel}@ndsu.edu
Mobile Phone APP Software Browsing Behavior using Clustering Analysis
Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7 9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis
AN EFFICIENT APPROACH TO PERFORM PRE-PROCESSING
AN EFFIIENT APPROAH TO PERFORM PRE-PROESSING S. Prince Mary Research Scholar, Sathyabama University, hennai- 119 [email protected] E. Baburaj Department of omputer Science & Engineering, Sun Engineering
Agile Software Engineering, a proposed extension for in-house software development
Journal of Information & Communication Technology Vol. 5, No. 2, (Fall 2011) 61-73 Agile Software Engineering, a proposed extension for in-house software development Muhammad Misbahuddin * Institute of
arxiv:1506.04135v1 [cs.ir] 12 Jun 2015
Reducing offline evaluation bias of collaborative filtering algorithms Arnaud de Myttenaere 1,2, Boris Golden 1, Bénédicte Le Grand 3 & Fabrice Rossi 2 arxiv:1506.04135v1 [cs.ir] 12 Jun 2015 1 - Viadeo
Distributed Computing and Big Data: Hadoop and MapReduce
Distributed Computing and Big Data: Hadoop and MapReduce Bill Keenan, Director Terry Heinze, Architect Thomson Reuters Research & Development Agenda R&D Overview Hadoop and MapReduce Overview Use Case:
Augmented Search for Web Applications. New frontier in big log data analysis and application intelligence
Augmented Search for Web Applications New frontier in big log data analysis and application intelligence Business white paper May 2015 Web applications are the most common business applications today.
Predicting Bug Category Based on Analysis of Software Repositories
Predicting Bug Category Based on Analysis of Software Repositories Mostafa M. Ahmed, Abdel Rahman M. Hedar, and Hosny M. Ibrahim Abstract Defective software modules cause software failures, in-crease development
Machine Learning Log File Analysis
Machine Learning Log File Analysis Research Proposal Kieran Matherson ID: 1154908 Supervisor: Richard Nelson 13 March, 2015 Abstract The need for analysis of systems log files is increasing as systems
PayLess: A Low Cost Network Monitoring Framework for Software Defined Networks
PayLess: A Low Cost Network Monitoring Framework for Software Defined Networks Shihabur R. Chowdhury, Md. Faizul Bari, Reaz Ahmed and Raouf Boutaba David R. Cheriton School of Computer Science, University
Characterizing User Behavior on a Mobile SMS-Based Chat Service
Characterizing User Behavior on a Mobile SMS-Based Chat Service Rafael de A. Oliveira, Wladmir C. Brandão & Humberto T. Marques-Neto Instituto de Ciências Exatas e Informática Pontifícia Universidade Católica
ISTQB Certified Tester. Foundation Level. Sample Exam 1
ISTQB Certified Tester Foundation Level Version 2015 American Copyright Notice This document may be copied in its entirety, or extracts made, if the source is acknowledged. #1 When test cases are designed
Optimization of Search Results with Duplicate Page Elimination using Usage Data A. K. Sharma 1, Neelam Duhan 2 1, 2
Optimization of Search Results with Duplicate Page Elimination using Usage Data A. K. Sharma 1, Neelam Duhan 2 1, 2 Department of Computer Engineering, YMCA University of Science & Technology, Faridabad,
Mining a Corpus of Job Ads
Mining a Corpus of Job Ads Workshop Strings and Structures Computational Biology & Linguistics Jürgen Jürgen Hermes Hermes Sprachliche Linguistic Data Informationsverarbeitung Processing Institut Department
