Incorporating Data Mining Techniques on Software Cost Estimation: Validation and Improvement


Narendra Sharma, Ratnesh Litoriya
Department of Computer Science and Engineering, Jaypee University of Engg & Technology, Guna, India
narendra_sharma88@yahoo.com, ratnesh.litoriya@juet.ac.in

Abstract

Generally, data mining is the process of analyzing data from different perspectives and summarizing it into useful information that can be used to increase revenue, cut costs, or both. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. We use the data mining tool WEKA to identify the important and common cost drivers that are used to generate the estimate of a project. Cost drivers are multiplicative factors that determine the effort required to complete a software project. In analogy-based estimation models, the cost drivers are the basis of the estimate: a new project is estimated by comparing it with past project data and setting the values of its cost drivers accordingly. The aim of this research is to identify the important cost drivers in past project data with the help of WEKA.

Keywords: data mining, Agile COCOMO, software estimation tools, WEKA data mining tool, software engineering.

I. INTRODUCTION

Cost estimation is an approximation of the probable cost of a product, program, or project, computed on the basis of available information. Accurate cost estimation is very important for every kind of project. If a project is not estimated properly, its cost can become very high and can sometimes far exceed the original estimate. It is therefore necessary to estimate the project correctly.
This research brings together two different fields: software engineering and data mining. Data mining helps us classify past project data and generate valuable information. This information is applied in cost estimation models to generate an approximate estimate on the basis of past project data. In this research we try to identify the common cost drivers that affect the cost of a project. To estimate the cost of a new project we use the Agile COCOMO model [2]. This paper investigates the systemic cost estimation issues that have been identified and the best performing machine learning techniques. We have found that Agile COCOMO II, a software estimation model with publicly available algorithms developed by Barry Boehm et al. [9], is a very robust model; it generates more accurate results when the past project data are very similar to the new project. However, these results were only internally validated, using leave-one-out cross validation, with the historical data within the data mining system. We seek to find the prediction accuracy of the new model developed by the data mining system against new external data, in order to evaluate the true effectiveness of these models in comparison to standard cost models that do not use machine learning techniques. We used the data mining tool WEKA to perform the data mining. The main aim of the research is to increase the efficiency of software cost estimation with the help of data mining techniques [1,3].

II. INTRODUCTION OF DATA MINING AND WEKA TOOLS

Software cost estimation models are often unable to produce accurate estimates: they can be off by more than 50% from the actual cost, and sometimes by considerably more.
We therefore need new methods or models that help generate estimates closer to the actual cost, and the accuracy of such methods is being investigated. Even methods that show a small improvement are considered valuable in the field of software estimation [2].
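Leave-one-out cross validation and the idea of an estimate being "off" by some percentage can be illustrated with a minimal sketch. The effort values and the baseline predictor (the mean of the remaining projects) are assumptions chosen only for illustration; they are not taken from the paper or from any tool it uses.

```python
from statistics import mean


def relative_error(actual, estimate):
    """Fraction by which the estimate misses the actual effort."""
    return abs(actual - estimate) / actual


def leave_one_out_errors(efforts, predict):
    """Hold out each project in turn and estimate it from the others."""
    errors = []
    for i, actual in enumerate(efforts):
        training = efforts[:i] + efforts[i + 1:]
        errors.append(relative_error(actual, predict(training)))
    return errors


# Hypothetical actual efforts in person-months (illustrative only).
past_efforts = [10.0, 20.0, 30.0]

# Assumed baseline predictor: the mean effort of the remaining projects.
errors = leave_one_out_errors(past_efforts, mean)
mean_error = mean(errors)
print(errors, mean_error)
```

A relative error of 0.5 corresponds to an estimate that is 50% off the actual cost. Any other predictor that accepts the list of remaining efforts could be passed in place of `mean`.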

With the enormous amount of data stored in files, databases, and other repositories, it is increasingly important, if not necessary, to develop powerful means for analysis and perhaps interpretation of such data and for the extraction of interesting knowledge that could help in decision-making. Data mining, also popularly known as Knowledge Discovery in Databases (KDD), refers to the nontrivial extraction of implicit, previously unknown and potentially useful information from data in databases. While data mining and KDD are frequently treated as synonyms, data mining is actually part of the knowledge discovery process [5,7].

Data mining, at its core, is the transformation of large amounts of data into meaningful patterns and rules. It can be broken down into two types: directed and undirected. In directed data mining, we try to predict a particular data point, for example the sales price of a house given information about other houses for sale in the neighborhood. In undirected data mining, we try to create groups of data or find patterns in existing data, for example creating the "Soccer Mom" demographic group. In effect, every U.S. census is data mining, as the government looks to gather data about everyone in the country and turn it into useful information. Today data mining is used in many kinds of applications, such as banking, insurance, medicine, and education.

A. Some basic operations of data mining: Regression

Regression is the oldest and most well-known statistical technique that the data mining community utilizes. Basically, regression takes a numerical dataset and develops a mathematical formula that fits the data. When you are ready to use the results to predict future behavior, you simply take your new data, plug it into the developed formula, and you have a prediction. The major limitation of this technique is that it only works well with continuous quantitative data (like weight, speed or age). If you are working with categorical data where order is not significant (like color, name or gender), you are better off choosing another technique [2,7].

B. Classification

If you are working with categorical data or a mixture of continuous numeric and categorical data, classification analysis might suit your needs well. This technique is capable of processing a wider variety of data than regression and is growing in popularity. Its output is also much easier to interpret: instead of the complicated mathematical formula given by the regression technique, you receive a decision tree that requires a series of binary decisions. One popular algorithm for grouping data is the k-means clustering algorithm.

K-means clustering is a data mining/machine learning algorithm used to cluster observations into groups of related observations without any prior knowledge of those relationships. The k-means algorithm is one of the simplest clustering techniques and it is commonly used in medical imaging, biometrics and related fields.

WEKA

Data mining is not solely the domain of big companies and expensive software. In fact, there is a piece of software that does almost all the same things as these expensive packages: it is called WEKA. WEKA is a product of the University of Waikato (New Zealand) and is released under the GNU General Public License (GPL). The WEKA interface is shown in Figure 1. The software is written in the Java language and contains a GUI for interacting with data files and producing visual results (think tables and curves). It also has a general API, so WEKA can be embedded, like any other library, in our own applications for tasks such as automated server-side data mining. We use the k-means clustering algorithm to group the data. Working with WEKA does not require deep knowledge of data mining, which is one reason it is a very popular data mining tool. WEKA also provides a graphical user interface and many other facilities [4, 7].

C. The k-means Algorithm

The k-means algorithm is an iterative algorithm that gains its name from its method of operation. The algorithm clusters observations into k groups, where k is provided as an input parameter.

It then assigns each observation to a cluster based upon the observation's proximity to the mean of the cluster. The cluster's mean is then recomputed and the process begins again. Here is how the algorithm works [7]:

1. The algorithm arbitrarily selects k points as the initial cluster centres ("means").
2. Each point in the dataset is assigned to the closest cluster, based upon the Euclidean distance between each point and each cluster centre.
3. Each cluster centre is recomputed as the average of the points in that cluster.
4. Steps 2 and 3 repeat until the clusters converge.

Convergence may be defined differently depending upon the implementation, but it normally means that either no observations change clusters when steps 2 and 3 are repeated or that the changes do not make a material difference in the definition of the clusters.

III. INTRODUCTION OF COST ESTIMATION

In recent years, software has become the most expensive component of computer system projects. The bulk of the cost of software development is due to human effort, and most cost estimation methods focus on this aspect and give estimates in terms of person-months [9]. Accurate software cost estimates are critical to both developers and customers. They can be used for generating requests for proposals, contract negotiations, scheduling, monitoring and control. Underestimating the costs may result in management approving proposed systems that then exceed their budgets, with underdeveloped functions and poor quality, and failure to complete on time. Overestimating may result in too many resources being committed to the project or, during contract bidding, in not winning the contract, which can lead to loss of jobs [6].

Figure 1: front view of WEKA

IV. WHY WE NEED THIS STUDY

Many techniques are available for software cost estimation, but they are not very effective. Little work has been done on combining data mining and software engineering. We try to predict good results by combining both fields.

V. EXISTING METHODS FOR ESTIMATION

Data mining techniques are being used extensively in a variety of fields. They have frequently been applied in the business arena for customer relationship management and market analysis. In addition to the multitude of applications of data mining, there has been parallel research in improving data mining algorithms. While data mining techniques have been applied across broad domains, they have rarely been applied in the field of software cost estimation, a subfield of software engineering [4].

Estimation is the process of determining the amount of effort, money, resources and time needed to build a software project with the help of the available quality information. Many estimation methods have been proposed in the last 30 years, and almost all of them require quantitative information about productivity, size of project and other important factors that affect the project. There are various practices of software estimation, such as analogy, expert opinion and empirical practices [Jones, 2007]. Analogy based practices require historical project data as an input for comparison, whereas expert opinion is intuition based [Jorgenson and Sheppard, 2007]. The empirical way is a practice of deriving the cost of software using some mathematical or algorithmic model. Examples of methods that use such practices are the FP based method and the COCOMO II method in TEMs. Most traditional software development methods follow either COCOMO II or FP based estimation successfully because a complete requirement specification is available.
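Analogy based estimation, as described above, compares a new project with historical project data. A minimal sketch of that idea is a nearest-neighbour lookup over cost-driver values. The project names, driver values and efforts below are hypothetical, and the use of Euclidean distance is an assumption for illustration rather than the method of any particular tool.

```python
import math

# Hypothetical past projects: numeric cost-driver values plus known effort
# in person-months. None of these figures come from the paper.
past_projects = [
    {"name": "P1", "drivers": {"rely": 1.0, "acap": 0.9, "tool": 1.1}, "effort": 120.0},
    {"name": "P2", "drivers": {"rely": 1.3, "acap": 1.2, "tool": 0.9}, "effort": 310.0},
    {"name": "P3", "drivers": {"rely": 0.9, "acap": 1.0, "tool": 1.0}, "effort": 95.0},
]


def distance(a, b):
    """Euclidean distance between two projects over the same cost drivers."""
    return math.sqrt(sum((a[key] - b[key]) ** 2 for key in a))


def estimate_by_analogy(new_drivers, history):
    """Return the name and effort of the most similar past project."""
    closest = min(history, key=lambda p: distance(new_drivers, p["drivers"]))
    return closest["name"], closest["effort"]


new_project = {"rely": 1.25, "acap": 1.1, "tool": 0.95}
analogue, effort = estimate_by_analogy(new_project, past_projects)
print(analogue, effort)
```

In practice the effort of the analogue would be adjusted for the differences between the two projects rather than reused unchanged.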

The figure below shows the methodology of the research. We apply the k-means clustering algorithm to group the data. 2CEE is the first model to combine machine learning algorithms with cost estimation algorithms to generate the cost of a project. It was designed specifically for NASA, so it cannot be used publicly, but it gives important guidelines for new researchers [2, 4].

Figure 2: functional diagram of existing methodology

VI. 2CEE COST ESTIMATION TOOLS

2CEE (21st Century Effort Estimation) is a cost estimation tool that draws on both data mining and software engineering. It was developed for NASA and is copyrighted by NASA. It uses a variety of data mining and machine learning techniques (nearest neighbour, feature subset selection, bootstrapping, local calibration) to propose the most accurate software cost model. It is designed to explore the uncertainty in the model and in the estimate, to allow estimates early in the lifecycle by representing new projects as ranges of values, and to provide numerous calibration options. 2CEE has been encoded in a Windows based tool that can be used both to generate an estimate and to allow the model developer to calibrate and develop models using various machine learning, data mining, and statistical techniques. By automating many tasks for the user, it provides gains in cost analyst efficiency. 2CEE uses leave-one-out cross validation as a measure of model performance.

Agile COCOMO model

Agile COCOMO is a COCOMO tool that is very simple to use and easy to learn. It incorporates the full COCOMO parametric model and uses analogy-based estimation to generate accurate results for a new project. Estimation by analogy is one of the most popular ways to estimate software cost and effort.
While comparing similarities between the new and old projects provides a great way to estimate, results could still be inaccurate from overlooking differences between the two projects, especially if the grounds of dissimilarity are fairly important. To build on the estimation by analogy approach while accounting for differences between projects, USC-CSE has created Agile COCOMO-II, a cost estimation tool that is based on COCOMO-II. It uses analogy based estimation to generate accurate results while being very simple to use and easy to learn. It can estimate a project in various ways, as shown in Figure 5: in terms of person-months, dollars, object points, function points, etc. In this paper, we discuss motivation for the program, the program's structure, the results of our research, and provide insight into the future direction of this tool [10].

VII. AN INTRODUCTION OF SCALE FACTORS AND COST DRIVER

A. The Scale Drivers

In the COCOMO II model, some of the most important factors contributing to a project's duration and cost are the Scale Drivers. You set each Scale Driver to describe your project; these Scale Drivers determine the exponent used in the Effort Equation. There are five scale drivers in the COCOMO model and each plays an important role in the estimation [5,9]. The 5 Scale Drivers are:

Precedentedness
Development Flexibility
Architecture / Risk Resolution
Team Cohesion
Process Maturity

B. Cost Drivers

COCOMO II has 17 cost drivers; the project, development environment, and team are assessed to set each cost driver. The cost drivers are multiplicative factors that determine the effort required to complete your software project. For example, if your project will develop software that controls an airplane's flight, you would set the Required Software Reliability (RELY) cost driver to Very High. That rating corresponds to an effort multiplier of 1.26, meaning that your project will require 26% more effort than a typical software project. In the COCOMO model the cost drivers are divided into the four groups shown below, followed by a short introduction to some of them [5].

Personnel Factors:
1. Analyst Capability
2. Programmer Capability
3. Applications Experience
4. Platform Experience
5. Personnel Continuity
6. Use of Software Tools

Product cost drivers:
1. Required Software Reliability
2. Data Base Size
3. Required Reusability
4. Documentation match to life-cycle needs, etc.

Platform Factors:
1. Execution Time Constraint
2. Platform Volatility

Project Factors:
1. Required Development Schedule
2. Multisite Development, etc.

C. Introduction of some cost drivers

1. Required Software Reliability (RELY)
This is the measure of the extent to which the software must perform its intended function over a period of time. If the effect of a software failure is only slight inconvenience then RELY is low. If a failure would risk human life then RELY is very high.

2. Data Base Size (DATA)
This measure attempts to capture the effect large data requirements have on product development. The rating is determined by calculating D/P, the ratio of database size to program size. The size of the database is important to consider because of the effort required to generate the test data that will be used to exercise the program.

3. Product Complexity (CPLX)
Complexity is divided into five areas: control operations, computational operations, device-dependent operations, data management operations, and user interface management operations. Select the area or combination of areas that characterize the product or a sub-system of the product. The complexity rating is the subjective weighted average of these areas.

4. Required Reusability (RUSE)
This cost driver accounts for the additional effort needed to construct components intended for reuse on the current or future projects. This effort is consumed with creating a more generic design of the software, more elaborate documentation, and more extensive testing to ensure components are ready for use in other applications.

5. Execution Time Constraint (TIME)
This is a measure of the execution time constraint imposed upon a software system. The rating is expressed in terms of the percentage of available execution time expected to be used by the system or subsystem consuming the execution time resource. The rating ranges from nominal, less than 50% of the execution time resource used, to extra high, 95% of the execution time resource consumed.

6. Analyst Capability (ACAP)
Analysts are personnel that work on requirements, high level design and detailed design. The major attributes that should be considered in this rating are analysis and design ability, efficiency and thoroughness, and the ability to communicate and cooperate. The rating should not consider the level of experience of the analyst; that is rated with AEXP. Analysts that fall in the 15th percentile are rated very low and those that fall in the 95th percentile are rated very high.

7. Programmer Capability (PCAP)
Current trends continue to emphasize the importance of highly capable analysts. However, the increasing role of complex COTS packages, and the significant productivity leverage associated with programmers' ability to deal with these COTS packages, indicates a trend toward higher importance of programmer capability as well. Evaluation should be based on the capability of the programmers as a team rather than as individuals. Major factors which should be considered in the rating are ability, efficiency and thoroughness, and the ability to communicate and cooperate. The experience of the programmer should not be considered here; it is rated with AEXP. A very low rated programmer team is in the 15th percentile and a very high rated programmer team is in the 95th percentile.

8. Applications Experience (AEXP)
This rating is dependent on the level of applications experience of the project team developing the software system or subsystem. The ratings are defined in terms of the project team's equivalent level of experience with this type of application. A very low rating is for application experience of less than 2 months. A very high rating is for experience of 6 years or more.

9. Platform Experience (PEXP)
The Post-Architecture model broadens the productivity influence of PEXP, recognizing the importance of understanding the use of more powerful platforms, including more graphic user interface, database, networking, and distributed middleware capabilities.

10. Use of Software Tools (TOOL)
Software tools have improved significantly since the 1970s projects used to calibrate COCOMO. The tool rating ranges from simple edit and code, very low, to integrated lifecycle management tools, very high [5].

VIII. COCOMO II EFFORT EQUATION

The COCOMO II model makes its estimates of required effort (measured in Person-Months, PM) based primarily on your estimate of the software project's size (measured in thousands of SLOC, KSLOC):

\[ \text{Effort} = 2.94 \times \text{EAF} \times (\text{KSLOC})^{E} \]

where EAF is the Effort Adjustment Factor derived from the Cost Drivers, and E is an exponent derived from the five Scale Drivers.

As an example, a project with all Nominal Cost Drivers and Scale Drivers would have an EAF of 1.00 and the nominal value of the exponent E. Assuming that the project is projected to consist of 9,000 source lines of code, COCOMO II estimates that 29.9 Person-Months of effort is required to complete it [1,9]:

\[ \text{Effort} = 2.94 \times (1.0) \times (9)^{E} = 29.9 \ \text{Person-Months} \]

Methodology

Our methodology is very simple. We combine two different fields, data mining and software engineering, and try to generate an accurate project cost with the help of past project data whose cost or effort is known, finding the common cost factors. We used WEKA for data mining and the Agile COCOMO tool for software estimation. We use the PROMISE data set for the analysis.

IX. DATASET

This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable, verifiable, refutable, and/or improvable predictive models of software engineering. The data files are in the .arff and .csv formats. These data sets can be loaded directly into WEKA, where the various algorithms are applied. The results from WEKA are then applied in the Agile COCOMO model.
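The effort equation of Section VIII can be sketched in a few lines. The constant 2.94 and the RELY multiplier of 1.26 are quoted in the text; treating EAF as the product of the effort multipliers follows from the description of cost drivers as multiplicative factors. The function names and the exponent value 1.10 are assumptions for illustration only, since the text does not state a usable value for E.

```python
import math

A = 2.94  # constant from the effort equation in Section VIII


def effort_adjustment_factor(multipliers):
    """EAF as the product of the effort multipliers of the cost drivers."""
    return math.prod(multipliers.values())


def cocomo_effort(ksloc, eaf, exponent):
    """Effort in person-months: A * EAF * KSLOC ** E."""
    return A * eaf * ksloc ** exponent


# All drivers nominal (1.0) except RELY = Very High (1.26, as quoted in
# Section VII). Only three drivers are listed to keep the example short.
multipliers = {"RELY": 1.26, "DATA": 1.0, "CPLX": 1.0}
eaf = effort_adjustment_factor(multipliers)

# HYPOTHETICAL exponent: 1.10 is a placeholder, not a value from the paper.
ASSUMED_EXPONENT = 1.10
effort = cocomo_effort(ksloc=9.0, eaf=eaf, exponent=ASSUMED_EXPONENT)
nominal = cocomo_effort(ksloc=9.0, eaf=1.0, exponent=ASSUMED_EXPONENT)
print(round(effort / nominal, 2))  # ratio equals the EAF
```

Because EAF enters the equation as a plain multiplier, raising RELY to Very High increases the estimate by 26% regardless of the project size or exponent.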

Result

Agile COCOMO is an analogy model: a new project is estimated by comparing it with past project data, on the assumption that the features of the new project are very similar to those of the past projects. With the help of WEKA and Agile COCOMO we obtained some useful results. In this research we took data from 60 past NASA projects whose efforts are already known; the list of projects is shown in Figure 3. We searched for the common cost drivers and scale factors that mainly affect the project estimate. With the Agile COCOMO model we changed the value of one cost driver or scale factor at a time and predicted the resulting values. Figure 4 shows the grouping obtained after applying the k-means clustering algorithm. With the help of clustering we grouped similar cost drivers. These cost drivers are very helpful in estimating new projects. WEKA also provides facilities for classifying data; we used the Apriori algorithm. It offers both a graphical user interface and a command line interface. Tables 1 and 2 show the cost drivers found by the analysis of the past project data. These cost drivers are used in every type of project.

Figure 3: past project dataset in WEKA

Figure 3 shows the different cost drivers used in the various past projects. We took the data of 60 past NASA projects and loaded them into WEKA. The figure also shows the actual effort of the past projects. We take these as base values in the Agile COCOMO model and set new values of the cost drivers. After applying k-means clustering we found the clusters that hold similar cost drivers. The result of k-means clustering is shown in Figure 4. With the help of clustering we grouped instances with similar behaviour into clusters.

Figure 4: clustering

The next figure shows the front view of the Agile COCOMO model. It can estimate a project in various ways, such as in terms of dollars, person-months, function points, and object points.

Figure 5: front view of Agile COCOMO model
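The k-means procedure used above for grouping projects (arbitrary initial centres, assignment by Euclidean distance, recomputation of the means, repetition until no assignment changes) can be sketched in plain Python. This is an illustrative sketch and not WEKA's implementation; the data points, the optional `initial_centres` parameter and the fixed seed are assumptions introduced to make the example reproducible.

```python
import math
import random


def kmeans(points, k, initial_centres=None, max_iter=100, seed=0):
    """Cluster points into k groups; returns (labels, centres)."""
    rng = random.Random(seed)
    # Step 1: pick k initial centres (arbitrarily, unless supplied).
    centres = list(initial_centres) if initial_centres else rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every point to its closest centre.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centres[c])) for p in points
        ]
        # Step 4: stop when no observation changes cluster.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute each centre as the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centre
                centres[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centres


# Hypothetical projects described by two numeric attributes each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centres = kmeans(points, k=2, initial_centres=[points[0], points[3]])
print(labels)
```

Supplying the initial centres makes the run deterministic; with arbitrary initial centres the cluster numbering, and occasionally the grouping itself, can differ between runs.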

The next figure shows the various cost factors. We set new values of the cost factors, changing them with respect to the cost drivers of the past projects. We found some important cost drivers that can be used in every project and that are responsible for increasing or decreasing the cost of a project. These cost drivers are shown in Tables 1 and 2.

Figure 6: the various cost drivers

Table 1: cost drivers whose values are increased (increase these to decrease effort)
ACAP    analyst capability
PCAP    programmer capability
AEXP    application experience
MODP    modern programming practices
TOOL    use of software tools
LEXP    language experience
etc.

Table 2: cost drivers whose values are decreased (decrease these to decrease the cost of the project)
STOR    main memory constraint
DATA    data base size
TIME    time constraint for CPU
VIRT    machine volatility
RELY    required software reliability
etc.

X. CONCLUSION

These results suggest that building data mining and machine learning techniques into existing software estimation techniques such as COCOMO can effectively improve the performance of a proven method. We used WEKA for data mining because it offers a range of machine learning algorithms that help us classify the data easily. We understand that there is a lack of serious research in this field. Our main aim is to show that data mining is also very useful for the field of software engineering. Not all data mining techniques performed better than the traditional method of local calibration. However, a couple of techniques used in combination did provide more accurate software cost models than the traditional technique. While the best combination of data mining techniques was not consistent across the different stratifications of data, this shows that there are different populations of software projects and that rigorous data collection should be continued to improve the development of accurate cost estimation models.
On the basis of this research we can say that cost drivers and scale factors play an important role in estimation with any analogy model. We found some common cost drivers that can be used for all projects. Future work should investigate further data mining algorithms that can help improve the process of software cost estimation and that are easy to use. We chose the COCOMO model for this research because it is a leading model of software cost estimation and is publicly available.

International Journal of Emerging Technology and Advanced Engineering

References

[1] COCOMO II Model Definition Manual, version 1.4, University of Southern California.
[2] Karen T. Lum, Daniel R. Baker, and Jairus M. Hihn, "The Effects of Data Mining Techniques on Software Cost Estimation", IEEE, 2009.
[3] Zhihao Chen, Tim Menzies, Dan Port, "Feature Subset Selection Can Improve Software Cost Estimation Accuracy", Center for Software Engineering, Univ. of Southern California.
[4] Jairus Hihn, Karen Lum, "2CEE, A Twenty First Century Effort Estimation Methodology", Lane Dept. CSEE, West Virginia University, ISPA / SCEA 2009 Joint International Conference.
[5] Oscar Marbán, Antonio de Amescua, Juan J. Cuadrado, Luis García, "Cost Drivers of a Parametric Cost Estimation Model for Data Mining Projects", Notes, vol. 30, no. 4, pp. 1-6, 2005.
[6] Oscar Marbán, Antonio de Amescua, Juan J. Cuadrado, Luis García, "A cost model to estimate the effort of data mining projects", Universidad Carlos III de Madrid (UC3M).
[7] Dr. Alassane Ndiaye and Dr. Dominik Heckmann, "Weka: Practical machine learning tools and techniques with Java implementations", AI Tools Seminar, University of Saarland, WS 06/07.
[8] S. Chandrasekaran, R. Lavanya and V. Kanchana, "Multi-Criteria Approach for Agile Software Cost Estimation Model".
[9] Capers Jones, "Estimating Software Costs", Tata McGraw-Hill Edition, 2007.
[10] AgileCOCOMOII/Main.html


More information


DMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support DMDSS: Data Mining Based Decision Support System to Integrate Data Mining and Decision Support Rok Rupnik, Matjaž Kukar, Marko Bajec, Marjan Krisper University of Ljubljana, Faculty of Computer and Information

More information