Research Article Predicting Software Projects Cost Estimation Based on Mining Historical Data
|
|
|
- Theodora Bradford
- 10 years ago
- Views:
Transcription
1 International Scholarly Research Network ISRN Software Engineering Volume 2012, Article ID , 8 pages doi: /2012/ Research Article Predicting Software Projects Cost Estimation Based on Mining Historical Data Hassan Najadat, 1 Izzat Alsmadi, 2 and Yazan Shboul 1 1 Computer Information Systems Department, Jordan University of Science and Technology, P.O. Box 3030, Irbid 22110, Jordan 2 Computer Information Systems Department, Yarmouk University, P.O. Box 566, Irbid 21163, Jordan Correspondence should be addressed to Izzat Alsmadi, [email protected] Received 18 November 2011; Accepted 16 January 2012 Academic Editors: J. Cao and O. Greevy Copyright 2012 Hassan Najadat et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. In this research, a hybrid cost estimation model is proposed to produce a realistic prediction model that takes into consideration software project, product, process, and environmental elements. A cost estimation dataset is built from a large number of open source projects. Those projects are divided into three domains: communication, finance, and game projects. Several data mining techniques are used to classify software projects in terms of their development complexity. Data mining techniques are also used to study association between different software attributes and their relation to cost estimation. Results showed that finance metrics are usually the most complex in terms of code size and some other complexity metrics. Results showed also that games applications have higher values of the SLOCmath, coupling, cyclomatic complexity, and MCDC metrics. Information gain is used in order to evaluate the ability of object-oriented metrics to predict software complexity. MCDC metric is shown to be the first metric in deciding a software project complexity. A software project effort equation is created based on clustering and based on all software projects attributes. According to the software metrics weights values developed in this project, we can notice that MCDC, LOC, and cyclomatic complexity of the traditional metrics are still the dominant metrics that affect our classification process, while number of children and depth of inheritance are the dominant from the object-oriented metrics as a second level. 1. Introduction Software companies are interested in determining the software development cost in the early stages to control and plan software tasks, risks, budgets, and schedules. In order to analyze software cost, we have to gather metrics from the current and the previous projects and draw similarities and analogies to be able to come up with predictions regarding the current project. It is important for software companies to benefit from the company historical software data in order to use it to estimate the cost and the schedule of the new ordered projects. So they can manage the budget, the development staff, and the schedules for the work process. It is essential for both developers and customers to get accurate software cost estimation by accurately estimating the new project cost. Project managers can provide the customers with an accurate deadline for their projects and debate some of the contract negotiation s issues, so that customers can expect actual development costs to be in line with estimated cost, and these estimations can be used by developers to generate reports and proposals and to determine what resources are needed to commit to the project and how well these resources will be used. Accurate software cost estimation makes projects management easier to be managed and controlled as resources are better matched to the real needs. Many researchers in software engineering field have studied in depth how to predict the software project cost which is important for the project managers and software development organizations. Cost estimation or what is called by other researchers effort prediction is the process of estimating the cost of the software system development. This estimation can generally be estimated through three methods: experts judgment, algorithmic model, and by analogy [1].
2 2 ISRN Software Engineering A Pseudocode // pseudocode for the McCabe graph B Source lines of node A If ( ) then source lines of node A C D Else source lines of node B E If ( ) then source lines of node C Else source lines of node D Source lines of node E F Source lines of node F Figure 1: McCabe graph and its pseudocode. In this work, a new approach is proposed to predict the cost and the complexity of building new software projects using the data mining classification techniques based on the metrics similarity. We have built a dataset from open source projects classified according to three domains (communication, finance, and games) Important software related features are extracted from them using a tool developed by Alsmadi and Magel [2]. This work builds a channel between both the data mining and the software engineering approaches. It is important for software companies to specify the current new ordered project cost in a bid to manage the project cost, schedule, and to determine the staff and time table of the project development. Most of the previous works have suffered from some problems of accuracy and building their models depending on their own datasets [3]. In this work, we built a dataset or a dictionary of prediction where users can relate their projects to a domain in the dictionary and then use such information to make the cost estimation. A little research has been done on estimating the effort of object-oriented software that applies the use of case model [4]. Also, there are few researches that analyze the effort prediction of open source code projects. 2. Background and Related Work Software metrics are the software features (measures) and characteristics. Since software measurements are essential in software engineering, there have been many researches over the last four decades to provide a comprehensive measure of software complexity and to use it in software cost estimation and software analysis. Although the first software metrics book was published in 1976 [5], the history of software metrics researches went back to the 1960s, when the lines of code (LOC) metric was used to measure the productivity of the programmer and software complexity and quality. LOC was used as a main key in effort prediction for some prediction models such as [6, 7]. In 1984, Basili and Perricone made an analysis of the relationship between module size and error proneness [8]. They analyzed some modules with LOC less than 200 lines of FORTRAN language project, and they concluded that the modules with less LOC have the greater bug density. Another research article published in 1985 made an analysis on the software of Pascal, PL/S, and assembly language [9]. In their researchshen et al. [9] concluded that the higher bug density occurs with large modules when LOC is greater than or equals 500. In his research of Ada program in 1990, Withrow [10] validated the concave relationship between software size and bug density where the bug density increases when LOC value is more than 250 and decreased when LOC value is less than 250. In the mid of the 1970s, the interest in software complexity increased when graph theoretical complexity is discussed by McCabe in [11]. He developed a mathematical technique for program modularization. Some definitions of the graph theory were used in order to measure and control the number of paths through a software program that is called the Cyclomatic Complexity metric. Then this metric has been used for complexity measurements instead of size metrics. McCabe Cyclomatic Complexity metrics compute the number of paths that may be executed through the program using the graph theory. The nodes of the graph represent the source code lines of the software program, and the directed edges between nodes represent the second source code lines that may be executed. Figure 1 represents an example of the McCabe Cyclomatic Complexity graph. According to McCabe [12], the Cyclomatic Complexity metric value is measured by the following formula: V(G) = e n +2, (1) where e refers to edges and n refers to the nodes. For example, in the above graph V(G) = 7 6+2= 3.
3 ISRN Software Engineering 3 In their research in 1984, Basili and Perricone [8] found a correlation between McCabe Cyclomatic Complexity and module sizes. They discovered that large modules have high complexity. Halstead introduced other software metrics in 1977 [13]. These metrics have been developed in order to estimate the programming effort. The Halstead metrics are measured using some statistic numbers. These statistics are (i) n1 = number of unique or distinct operators which appear in that implementation; (ii) n2 = number of unique or distinct operands which appear in that implementation; (iii) N1 = total usage of all of the operators which appear in that implementation; (iv) N2 = total usage of all of the operands which appear in that implementation. The above statistics contain operators and operands. The operators are used to specify the manipulation to be performed while the operands are used as logic units to be operated. From the above statistics Halstead complexity metrics are defined as (i) the vocabulary n = n1+n2, (ii) the program length N = N1+N2. In 2004, Fei et al. in their article [14] proposedanimprovement on the Halstead complexity metrics. They added weights to the Halstead metrics. They gave different operators and operands different weights. Six object-oriented design metrics were developed and evaluated by Chidamber and Kemerer in 1994 [15]. These object-oriented metrics are called CK metrics. The CK metrics that resulted from Chidamber and Kemerer are weighted methods per class (WMC), depth of inheritance tree (DIT), number of children (NOC), coupling between object classes (CBO), response for a class (RFC), and lack of cohesion in methods (LCOM) Cost Estimation. Software systems and applications are the most expensive part of the computer systems due to the human effort that is used in producing software systems. This reason (expansive software development) motivated many researchers to focus on this aspect of research. Software cost estimation or what is called by other researchers effort prediction is the process of estimating software cost accurately. Over the past 30 years, many studies have been conducted in the software cost estimation field [16] where two main types of cost estimation methods have been discussed including the algorithmic and nonalgorithmic methods. Some cost estimation models use data from previous projects in order to derive cost formulas; these models are called empirical models like COCOMO model [17], while other models depend on global assumptions to derive their cost formulas. These models are called analytical models; for example, Putnam in his paper [7] proposed an approach to accurately estimate effort, cost, and time of software projects. Leung and Fan in their article [16] gave an overview of the software cost estimation, and they highlighted the importance of the accuracy in estimating the cost of the software Cost Estimation Models. Cost estimation models can be classified into two categories according to the approach and the procedures used to measure the software cost. There are two major categories which are algorithmic and nonalgorithmic models Nonalgorithmic Models. There are many non-algorithmic models used in software cost estimation. The following are some of these models used in the literature. (i) Analogy Model. Prediction by analogy is one of the most non-algorithmic methods used in effort prediction. Using analogy prediction depends on previous completed projects where we can predict efforts using actual existing projects cost values. In prediction by analogy, the software project is characterized through variables, and then Euclidean distance is measured in n-dimensional space. A prediction tool to find an analogous of the current software project from the set of completed projects called ANGEL is developed, where it is flexible to return three analogous [1]. (ii) Expert Judgment. Usually, more than one expert s opinions are involved in the estimation process. Therefore, deriving the software cost estimation is neither explicit nor repeatable. In fact, experts consider some techniques such as Delphi technique or PERT [18, 19] mechanisms in a bid to reach a consensus between each other resolve the inconsistencies in the estimation. In the Delphi mechanisms, a coordinator asks each expert to fill a form to record estimations, and then the coordinator prepares a summary of all the estismations from the experts [16]. Parkinson Model. Thecostisdeterminedbytheavailableresources rather than based on an objective assessment according to Parkinson s principle [20]. Algorithmic models. The algorithmic model estimates the software cost through some formulas that depend mainly on the size of the project which is measured in terms of function point, object point, and lines of code (LOC). Beside the size of the project, there are a number of variables that are involved in the algorithmic model function such as: Effort = f ( y 1, y 2, y 3,..., y n ), (2) where Effort is a cost estimation measure that is usually measured by (person-month), and f refers to the function form, and (y 1, y 2, y 3,..., y n ) refers to the cost factors. Boehm introduced the first version of COCOMO as a model for estimating the effort, cost, and schedule; this COCOMO version was called COCOMO 81 [6] where 63 projects ranging from 2000 to lines of code are used in that study. In 1997, Boehm enhanced his first version of COCOMO and introduced another model called COCOMO 2 [21]. This model provides more support for modern
4 4 ISRN Software Engineering software development processes. In COCOMO models, LOC is used as a software code size and given in thousands to measure the effort which is measured in person-month. 3. Goals and Approaches A software metrics tool called CodeMetrics is built in order to extract software metrics from source codes which are used to build software metrics data set. CodeMetrics is extended from SWMetrics which was built by Alsmadi and Megal [2]. This tool can extract metrics from C#, C++, and Java source code. CodeMetrics can build and classify the software data set according to their domains (e.g., communication, finance, and games). CodeMetrics can extract some selected CK metrics as well as the traditional metrics like the metrics extracted from SWMetrics [15]. The metrics extracted are (Lines, LOC, SLOC, SLOCmath, MCDC, MaxNest, CComplexity, Averaged method per class, averaged method CComplexity, Max Inheritance, Coupling, and Number of Children). Our metrics tool parses all the source code files of the selected language only. For example, for C#, C++, and Java languages projects, CodeMetrics parses all the files of the extension types.cs,.cpp, and.java, respectively. Through the parsing process, a counter for each software metrics is used to compute the metric value. Before parsing the source code we should select the language; after completing the metrics counting we can specify the project domain and save it under that domain class. CodeMetrics can be extended to more than three domains types within the domains list such as educational and scientific systems which can be added to the domains list Open Source Code Dataset. We have built our data set using open source projects collected from different open source web sites which are (i) (ii) (iii) We have classified our source code projects into communication, finance, and games domains. All collected projects are of C# code language only. Table 1 shows the number of projects for each domain, 38.1% of the gathered projects are from the games domain, 35.7% of the projects are Finance applications, and 26.2% of the projects are from communication domain. Communication applications are related to , file transfer, client-server, and chat applications. Finance applications are related to stock and billing systems andsoforth.anexampleofsuchprojectsisnopcommerce and LinqCommerce. NopCommerce is an open source e- commerce solution with comprehensive features that is easy to use for new online businesses and is included within finance domain. LinqCommerce was created by JMA Web Technologies Inc. in order to be a part of e-commerce solutions. Table 1: Source code projects domain. Domain Number of application percentage Communication % Finance % Game % 3.2. Data Mining for Cost Estimation. We have used information gain method as a subset selection method in order to know the best subset of the metrics that play a major role in the classification process. A decision tree method (J48) is used as a classification method for the software data set depending on the software metrics. A clustering technique (Kmean) is used to classify the data set and grouping the similar projects within the same cluster Attribute Selection. There are many methods used for attribute subset selection such as information gain, gain ratio, and gini index methods. An attribute subset selection method is used to select the best separates of the attributes. In this work, we have used information gain method to determine the best splitting attributes. This method computes the information gain value for each attributes, and the attribute with the highest information gain is the best attribute that can identify the class label of a tuple in the data set. The following formulas are used to compute the information gain values: n Info(D) = pi log ( pi ), (3) i=0 where Info(D) is the average amount of information needed to identify the class label and p i is the probability that an tuple in the data set belongs to a class label Info A (D) = n k=0 Dk D Info(Dk), (4) where the Info A (D) is the expected information needed to classify a tuple based on an attribute A Dk / D is the weight of the k partition Gain(A) = Info(D) Info A (D), (5) where Gain (A) refers to the amount of information required to classify the data set of attribute A J48 Decision Tree. J48 decision tree is one of the classification methods that construct a decision tree classifier. J48 classifier works as follows. (1) Take the data set, attribute list, and the attribute selection method as input. (2) Check if the tuples in the data set are all with the same class label or not. (3) Check if the attribute list is empty or not. (4) Apply attribute selection method to find the best separate attribute.
5 ISRN Software Engineering 5 (5) Remove the splitting attribute from the attribute list. (6) For each outcomes of the tree, grow subtrees for each partition. We have used decision tree classifier as a knowledge discovery method because it can handle high-dimensional data like the data type of our data set. Another reason to use the decision tree classifier is that the construction of the decision tree does not need any domain knowledge Clustering by K-mean. Clustering techniques used to put the similar data object within the same cluster and the dissimilar objects in other clusters. We used K-mean clustering method to put the similar projects within the same cluster. In the K-mean clustering method we first select the number of clusters (K) and the distance method. K-means method worksasfollows (1) Randomly select K of the object, which represents a cluster center. (2) Assign each object to the cluster to which the object is the most similar. (3) Calculate the mean value of the objects for each cluster. (4) Reassign each object to the cluster to which the object is the most similar. (5) Repeat steps 3-4 until no changes. (6) Assign the cluster number as a class label for each object. After applying the K-mean clustering method, each projects of the data set is labeled with the cluster number which refers to a complexity level for each project as will be described later Using WEKA Data Mining Tool. In order to analyze our data set using data mining techniques, we used one of the most data mining tools that contain implementations of the data mining techniques. We used WEKA 3.6 version tool. WEKA is one of the most popular tools that are used by many researchers to analyze their data. WEKA supports data mining tasks such as data preprocessing, data classification, data clustering, and attributes selection. 4. Implementation and Experimental Work We first analyzed our dataset considering the domain types of the projects and concluding general characteristics for each domain type. We also grouped the projects using K-mean clustering method that is implemented through WEKA tool and concluding some analysis data Source Code Domains Analysis. This section discusses software source code according to their domains. We have analyzed software metrics within each domain and have extracted some information about each domain and made a comparison between the studied domains applications. In this work, we have studied three types of software application domains which are the communication, the finance, and the games applications. The following two sections analyze the traditional metrics and the object-oriented metrics within domains Traditional Metrics Analysis. Collected traditional metrics show different values for each domain application. Traditional metrics collected using the developed tool includes LOC, SLOCmath, MCDC, MaxNesting, and Cyclomatic Complexity. The following graphs show some characteristics for each domain application. The size metrics are lines, LOC, SLOC, MCDC, Slocmath, Maxnest, and Cyclomatic complexity. We have found that finance applications have higher source size codes than the other application types. Results showed that finance applications have higher nesting values than other applications domains. Results showed also that games applications have higher values of the SLOCmath and MCDC metrics, while lower values of these three metrics are measured within communication source code applications. It clearly shows that games applications have more complex source code than finance and communication applications, while the finance applications are the least complex applications. The above analysis and graphs conclude that these software application domains have some differences and that each application type has general characteristics. The analysis of our data set that is gathered from open sources software and classified into three domains (communication, finance, and games) showed that finance applications require the largest number of software size which is measured by computing source code lines of code. Finance applications have the deepest nesting levels which increase the complexity of the source codes. Games applications have higher values of SLOCmath andmcdcmetrics.thisresultisexpectedforthegames applications because these types of software applications deal with the players choices and the players probabilities through playing the game. Also games applications have the highest complexity values of the Cyclomatic Complexity metric which means that games applications have larger number of execution paths relative to communication and finance applications. Results showed also that finance applications have limited number of execution paths. This result refers to the stability of the finance applications which depend on clear and static business rules Object-Oriented Metrics Analysis. Object-oriented metrics information measures the quality of the software application and helps in accurately estimating the software cost. Results for Average Cyclomatic Complexity for methods within a class metric showed that games applications methods are more complex than other domains applications methods, while lower complexity levels appeared within communication source code methods. In terms of the metric average number of methods per class, results showed also that games applications classes have much more methods
6 6 ISRN Software Engineering than other classes at other domains, while finance application s classes are the lowest ones. We analyzed the inheritance depth of the classes within the three domains, and we found that the depth of inheritance for the finance applications is deeper than that of communication applications, while the depth of inheritance for the games applications is the deepest of the three. Coupling of a class measures how many methods and method calls exist in a call to all other classes. In coupling, results showed that games objects within any class are invoked many times by methods in different classes more than those in finance or communication applications. Results showed also the average number of children classes over the three analyzed domains (i.e., Games, Finance, and communication). It is clearly noticed that games application s classes have much more children classes than applications in other domains. The above analysis shows that applications of the game domain are more complex than other software applications within other domains. Average Cyclomatic Complexity per methods shows that games applications have the highest values of this metrics which can increase the software cost (i.e., building and maintenance effort) for games applications. Gamesprogramsrequiretointeractwiththeplayer soptions and reactions, so games programmers need to include all the probabilities of the game and the interactions with the games players options which explain the high values of objectoriented metrics for the game source codes Software Metrics Selection. Inthissection,wehaveanalyzed the software metrics and have assigned weights for the software metrics in order to specify the metrics that determine the software domain (the dominant metric). Attribute subset selection method from data mining is used in a bid to find the dominant metrics that specify the software domain. In order to know each metric weight, we have used the information gain method (InfoGain) that is implemented by WEKA tool. In a bid to know the effects of object-oriented metrics on the analysis, we have made two methods of analysis: on the traditional metrics only (i.e., without object-oriented metrics) and on all available software metrics Traditional Dominant Metrics. Applying information gain method on the traditional metrics shows that MCDC metric is the dominant metric since its value can often determine the software domain. Table 2 shows the information gain values for other metrics. The second metric that plays an important role in determining software domain is the cyclomatic complexity with a value 0.45 which is the nearest value to the MCDC metric. LOC and SLOCmath are in the third and fourth positions with values close to each other. The MaxNest metric is the last metric that may affect the software domain. The decision tree (J48) clearly shows the heuristic for selecting the splitting criterion metrics. Tree 2 shows that MCDC metric is at the root of the tree which means that MCDC has a major role in splitting the tree. J48 classifier has Table 2: Metrics information gain value. Metrics number Metrics Information gain value 1 MCDC CComplexity LOC Slocmath maxnest Table 3: J48 confusion matrix. Domain/classified as Communication Finance Game Communication Finance Game a tree with 13 leaves and 7 levels depth with % prediction accuracy. Table 3 shows the confusion matrix of J48 classifier. The main mismatches are within communication software where 10 communication projects are classified as finance projects. While finance and games projects have very high percentages of correctly classified projects. Results showed that all the traditional metrics play a role in determining the software domains, and there are different ends with a domain type. The shortest path in the above tree is the path that includes two metrics only which are MCDC and MaxNest that end with 18 games projects. The longest path ends with two finance and two communication projects. This path includes the following metrics: MCDC CComplexity LOC MaxNest MaxNest LOC. The path that ends with the biggest number of projects is the path that ends with 47 financial projects with 10 projects mismatching the domain, and so 37 of the finance projects are found through the same path which refers to the high similarities between finance projects. The tree 3.1 shows that all the financial projects have MCDC value less than 10221, while most games projects have MCDC values larger than Traditional-CK Dominant Metrics. In this section, we have added several CK metrics to the traditional metrics of the software. We have also found that MCDC metric is the most significant metric that can determine the source code domain by using information gain method as shown in Table 4. The above table shows the metrics arranged decreasingly according to the information gain value. The CK metrics have less effect on splitting the projects except the CBO metric which has the third position within the software metrics. The J48 classifier shows a high accuracy value (97.619%) of using traditional software metrics plus CK metrics where 123 projects are correctly classified. Table 5 shows the confusion matrix of applying J48 on traditional plus CK metrics. (6)
7 ISRN Software Engineering 7 Table 4: Traditional-CK metric information gain value. Metrics number Metric Information gain value 1 MCDC CComplexity CBO LOC SLOCmath MaxNest avgmethod per class Childern Table 5: Traditional CK metrics confusion matrix. Domain/classified as Communication Finance Game Communication Finance Game This high accuracy value when adding CK metrics refers to the enhancement of the prediction process when considering CK-metrics in addition to the traditional metrics. Results showed that traditional CK metrics has % of accuracy while traditional metrics only has % of accuracy. 5. Assumptions (i) Grouping the software projects into 3 groups using data mining clustering methods, each group represents a cost factor for the projects within each group where we assume (Low, Mid, and High) for the three grouping approach. (ii) Using one of the attribute subset selection methods in order to get metrics weights, we have used the information gain method and have assumed our effort formula according to the linear approach divided by Our collected data set contains traditional metrics as well as some CK metrics. All of these metrics are numerical, and so in order to cluster the data set into similar groups we used one of the data mining approaches. We used K-mean method which is one of the known clustering techniques. In the following sections, K-means is applied on our data set in a bid to put the collected software projects within clusters according to the similarity between clustered projects. 6. Conclusion Cost estimation remains a complex problem that attracts researchers to study and to try different approaches and methodologies to solve it. In this thesis, a new cost estimation approach for analyzing software projects is presented in order to help project managers to take their decisions. Our approach utilizes data mining techniques in the analysis process, and we have used K-mean in a bid to classify the gathered projects into clusters and find the similarities between each cluster s projects. In order to gather the data for our dataset a software metric tool called CodeMetrics is built to extract traditional metrics as well as some of the objectoriented metrics (CK metrics). A dataset of software metrics is established according to the projects domain types, and, to our knowledge, there is not any datasets that contain software metrics classified according to the domain type. The following points refer to the final experimental results of this work. (1) Finance projects require a larger number of lines and nesting relative to games and communication projects. (2) Slocmath, mcdc, and complexity metrics for games projects have higher values relative to finance and communication projects, while the complexity of the communication projects is larger than finance projects. (3) Applications of games type have higher values of object-oriented metrics relative to the other projects (method CComplexity, averaged method per class, inheritance depth, coupling, children). (4) Using data mining techniques for attribute subset selection we have found that MCDC plays a big role in determining source code domains more than LOC metric as all cost estimation models depend on it. (5) Adding object-oriented metrics to the metrics list increases the prediction accuracy where the accuracy becomes (97.619%) instead of ( %) (using J48 classifier). (6) Our results conclude that MCDC, LOC, MaxNets are the major metrics that affect the classification of the software projects while most of the previous studies consider LOC metric as the major metric that plays a role in effort estimation. (7) Adding object-oriented metrics to the metrics list increases the correlation between our model and COCOMO model where the correlation becomes for effort instead of , and the correlation of the development time becomes instead of References [1] M. Shepperd, C. Schofield, and B. Kitchenham, Effort estimation using analogy, in Proceedings of the 18th International Conference on Software Engineering, pp , Berlin, Germany, [2] I. Alsmadi and K. Magel, Open source evolution analysis, in Proceedings of the 22nd IEEE International Conference on Software Maintenance (ICSM 06), Philadelphia, Pa, USA, [3] B. Boehm, B. Clark, E. Horowitz, R. Madachy, R. Shelby, and C. Westland, Cost models for future software life cycle process: COCOMO 2.0, in Annals of Software Engineering Special Volume on Software Process and Product Measurement,
8 8 ISRN Software Engineering J. D. Arther and S. M. Henry, Eds., vol. 1, pp , J.C. Baltzer AG, Science Publishers, Amsterdam, The Netherlands, [4] K. Ribu, Estimating Object-Oriented Software Projects with Use Cases, M.S. thesis, University of Oslo Department of Informatics, [5] T. Gilb, Software Metrics, Chartwell-Bratt, [6] B. W. Boehm, Software Engineering Economics, Prentice-Hall, Englewood Cliffs, NJ, USA, [7] L. H. Putnam, A general empirical solution to the macro software sizing and estimating problem, IEEE Transactions on Software Engineering, vol. 4, no. 4, pp , [8] V. R. Basili and B. T. Perricone, Software errors and complexity: an empirical investigation, Communications of the ACM, vol. 27, no. 1, pp , [9] V. Y. Shen, T. J. Yu, S. M. Thebaut, and L. R. Paulsen, Identifying error-prone software an empirical study, IEEE Transactions on Software Engineering, vol. 11, no. 4, pp , [10] C. Withrow, Error density and size in Ada software, IEEE Software, vol. 7, no. 1, pp , [11] T. J. McCabe, A complexity measure, IEEE Transactions on Software Engineering, vol. SE-2, no. 4, pp , [12] J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann, San Francisco, Calif, USA, 2th edition, [13] M. H. Halstead, Elements of Software Science, ElsevierNorth Holland, [14] Y. Y. Fei, Z. Zhi, and Z. S. Chao, Improvements about Halstead model in software science, Journal of Computer Applications, pp , [15] S. R. Chidamber and C. F. Kemerer, Metrics suite for object oriented design, IEEE Transactions on Software Engineering, vol. 20, no. 6, pp , [16] H.LeungandZ.Fan,Software Cost Estimation,Departmentof Computing, The Hong Kong Polytechnic University, [17] T. Xie, S. Thummalapenta, D. Lo, and C. Liu, Data mining for software engineering, Computer, vol. 42, no. 8, pp , [18] H.LeungandZ.Fan,Software Cost Estimation,Departmentof Computing, The Hong Kong Polytechnic University, [19] J. D. Aron, Estimating Resource for Large Programming Systems, NATO Science Committee, Rome, Italy; October [20] G. N. Parkinson, Parkinson s Law and Other Studies in Administration, Houghton-Miffin, Boston, Mass, USA, [21] B. W. Boehm et al., The COCOMO 2.0 Software Cost Estimation Model, American Programmer, 1996.
9 Journal of Industrial Engineering Multimedia The Scientific World Journal Applied Computational Intelligence and Soft Computing International Journal of Distributed Sensor Networks Fuzzy Systems Modelling & Simulation in Engineering Submit your manuscripts at Journal of Computer Networks and Communications Advances in Artificial Intelligence Hindawi Publishing Corporation Computational Intelligence & Neuroscience Artificial Neural Systems Volume 2014 Software Engineering International Journal of Computer Games Technology Computer Engineering International Journal of Human-Computer Interaction Biomedical Imaging International Journal of Reconfigurable Computing Journal of Journal of Electrical and Computer Engineering Robotics
EVALUATING METRICS AT CLASS AND METHOD LEVEL FOR JAVA PROGRAMS USING KNOWLEDGE BASED SYSTEMS
EVALUATING METRICS AT CLASS AND METHOD LEVEL FOR JAVA PROGRAMS USING KNOWLEDGE BASED SYSTEMS Umamaheswari E. 1, N. Bhalaji 2 and D. K. Ghosh 3 1 SCSE, VIT Chennai Campus, Chennai, India 2 SSN College of
Open Source Software: How Can Design Metrics Facilitate Architecture Recovery?
Open Source Software: How Can Design Metrics Facilitate Architecture Recovery? Eleni Constantinou 1, George Kakarontzas 2, and Ioannis Stamelos 1 1 Computer Science Department Aristotle University of Thessaloniki
Definitions. Software Metrics. Why Measure Software? Example Metrics. Software Engineering. Determine quality of the current product or process
Definitions Software Metrics Software Engineering Measure - quantitative indication of extent, amount, dimension, capacity, or size of some attribute of a product or process. Number of errors Metric -
Quality prediction model for object oriented software using UML metrics
THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS TECHNICAL REPORT OF IEICE. UML Quality prediction model for object oriented software using UML metrics CAMARGO CRUZ ANA ERIKA and KOICHIRO
Software Development Cost and Time Forecasting Using a High Performance Artificial Neural Network Model
Software Development Cost and Time Forecasting Using a High Performance Artificial Neural Network Model Iman Attarzadeh and Siew Hock Ow Department of Software Engineering Faculty of Computer Science &
Using Data Mining for Mobile Communication Clustering and Characterization
Using Data Mining for Mobile Communication Clustering and Characterization A. Bascacov *, C. Cernazanu ** and M. Marcu ** * Lasting Software, Timisoara, Romania ** Politehnica University of Timisoara/Computer
PREDICTING STOCK PRICES USING DATA MINING TECHNIQUES
The International Arab Conference on Information Technology (ACIT 2013) PREDICTING STOCK PRICES USING DATA MINING TECHNIQUES 1 QASEM A. AL-RADAIDEH, 2 ADEL ABU ASSAF 3 EMAN ALNAGI 1 Department of Computer
EXTENDED ANGEL: KNOWLEDGE-BASED APPROACH FOR LOC AND EFFORT ESTIMATION FOR MULTIMEDIA PROJECTS IN MEDICAL DOMAIN
EXTENDED ANGEL: KNOWLEDGE-BASED APPROACH FOR LOC AND EFFORT ESTIMATION FOR MULTIMEDIA PROJECTS IN MEDICAL DOMAIN Sridhar S Associate Professor, Department of Information Science and Technology, Anna University,
Comparison of K-means and Backpropagation Data Mining Algorithms
Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and
Software Metrics. Lord Kelvin, a physicist. George Miller, a psychologist
Software Metrics 1. Lord Kelvin, a physicist 2. George Miller, a psychologist Software Metrics Product vs. process Most metrics are indirect: No way to measure property directly or Final product does not
Percerons: A web-service suite that enhance software development process
Percerons: A web-service suite that enhance software development process Percerons is a list of web services, see http://www.percerons.com, that helps software developers to adopt established software
How To Solve The Kd Cup 2010 Challenge
A Lightweight Solution to the Educational Data Mining Challenge Kun Liu Yan Xing Faculty of Automation Guangdong University of Technology Guangzhou, 510090, China [email protected] [email protected]
Software Defect Prediction for Quality Improvement Using Hybrid Approach
Software Defect Prediction for Quality Improvement Using Hybrid Approach 1 Pooja Paramshetti, 2 D. A. Phalke D.Y. Patil College of Engineering, Akurdi, Pune. Savitribai Phule Pune University ABSTRACT In
An Evaluation of Neural Networks Approaches used for Software Effort Estimation
Proc. of Int. Conf. on Multimedia Processing, Communication and Info. Tech., MPCIT An Evaluation of Neural Networks Approaches used for Software Effort Estimation B.V. Ajay Prakash 1, D.V.Ashoka 2, V.N.
An Introduction to Data Mining. Big Data World. Related Fields and Disciplines. What is Data Mining? 2/12/2015
An Introduction to Data Mining for Wind Power Management Spring 2015 Big Data World Every minute: Google receives over 4 million search queries Facebook users share almost 2.5 million pieces of content
DATA MINING TECHNIQUES AND APPLICATIONS
DATA MINING TECHNIQUES AND APPLICATIONS Mrs. Bharati M. Ramageri, Lecturer Modern Institute of Information Technology and Research, Department of Computer Application, Yamunanagar, Nigdi Pune, Maharashtra,
Data Mining. 1 Introduction 2 Data Mining methods. Alfred Holl Data Mining 1
Data Mining 1 Introduction 2 Data Mining methods Alfred Holl Data Mining 1 1 Introduction 1.1 Motivation 1.2 Goals and problems 1.3 Definitions 1.4 Roots 1.5 Data Mining process 1.6 Epistemological constraints
EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH
EMPIRICAL STUDY ON SELECTION OF TEAM MEMBERS FOR SOFTWARE PROJECTS DATA MINING APPROACH SANGITA GUPTA 1, SUMA. V. 2 1 Jain University, Bangalore 2 Dayanada Sagar Institute, Bangalore, India Abstract- One
Customer Classification And Prediction Based On Data Mining Technique
Customer Classification And Prediction Based On Data Mining Technique Ms. Neethu Baby 1, Mrs. Priyanka L.T 2 1 M.E CSE, Sri Shakthi Institute of Engineering and Technology, Coimbatore 2 Assistant Professor
A New Approach in Software Cost Estimation with Hybrid of Bee Colony and Chaos Optimizations Algorithms
A New Approach in Software Cost Estimation with Hybrid of Bee Colony and Chaos Optimizations Algorithms Farhad Soleimanian Gharehchopogh 1 and Zahra Asheghi Dizaji 2 1 Department of Computer Engineering,
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree
Predicting the Risk of Heart Attacks using Neural Network and Decision Tree S.Florence 1, N.G.Bhuvaneswari Amma 2, G.Annapoorani 3, K.Malathi 4 PG Scholar, Indian Institute of Information Technology, Srirangam,
Chap 4. Using Metrics To Manage Software Risks
Chap 4. Using Metrics To Manage Software Risks. Introduction 2. Software Measurement Concepts 3. Case Study: Measuring Maintainability 4. Metrics and Quality . Introduction Definition Measurement is the
Software Defect Prediction Modeling
Software Defect Prediction Modeling Burak Turhan Department of Computer Engineering, Bogazici University [email protected] Abstract Defect predictors are helpful tools for project managers and developers.
Mining the Software Change Repository of a Legacy Telephony System
Mining the Software Change Repository of a Legacy Telephony System Jelber Sayyad Shirabad, Timothy C. Lethbridge, Stan Matwin School of Information Technology and Engineering University of Ottawa, Ottawa,
131-1. Adding New Level in KDD to Make the Web Usage Mining More Efficient. Abstract. 1. Introduction [1]. 1/10
1/10 131-1 Adding New Level in KDD to Make the Web Usage Mining More Efficient Mohammad Ala a AL_Hamami PHD Student, Lecturer m_ah_1@yahoocom Soukaena Hassan Hashem PHD Student, Lecturer soukaena_hassan@yahoocom
DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES
DECISION TREE INDUCTION FOR FINANCIAL FRAUD DETECTION USING ENSEMBLE LEARNING TECHNIQUES Vijayalakshmi Mahanra Rao 1, Yashwant Prasad Singh 2 Multimedia University, Cyberjaya, MALAYSIA 1 [email protected]
A New Approach For Estimating Software Effort Using RBFN Network
IJCSNS International Journal of Computer Science and Network Security, VOL.8 No.7, July 008 37 A New Approach For Estimating Software Using RBFN Network Ch. Satyananda Reddy, P. Sankara Rao, KVSVN Raju,
Design methods. List of possible design methods. Functional decomposition. Data flow design. Functional decomposition. Data Flow Design (SA/SD)
Design methods List of possible design methods Functional decomposition Data Flow Design (SA/SD) Design based on Data Structures (JSD/JSP) OO is good, isn t it Decision tables E-R Flowcharts FSM JSD JSP
Vragen en opdracht. Complexity. Modularity. Intra-modular complexity measures
Vragen en opdracht Complexity Wat wordt er bedoeld met design g defensively? Wat is het gevolg van hoge complexiteit icm ontwerp? Opdracht: http://www.win.tue.nl/~mvdbrand/courses/se/1011/opgaven.html
COURSE RECOMMENDER SYSTEM IN E-LEARNING
International Journal of Computer Science and Communication Vol. 3, No. 1, January-June 2012, pp. 159-164 COURSE RECOMMENDER SYSTEM IN E-LEARNING Sunita B Aher 1, Lobo L.M.R.J. 2 1 M.E. (CSE)-II, Walchand
Pragmatic Peer Review Project Contextual Software Cost Estimation A Novel Approach
www.ijcsi.org 692 Pragmatic Peer Review Project Contextual Software Cost Estimation A Novel Approach Manoj Kumar Panda HEAD OF THE DEPT,CE,IT & MCA NUVA COLLEGE OF ENGINEERING & TECH NAGPUR, MAHARASHTRA,INDIA
Chapter 24 - Quality Management. Lecture 1. Chapter 24 Quality management
Chapter 24 - Quality Management Lecture 1 1 Topics covered Software quality Software standards Reviews and inspections Software measurement and metrics 2 Software quality management Concerned with ensuring
Software Defect Prediction Tool based on Neural Network
Software Defect Prediction Tool based on Neural Network Malkit Singh Student, Department of CSE Lovely Professional University Phagwara, Punjab (India) 144411 Dalwinder Singh Salaria Assistant Professor,
ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION
ISSN 9 X INFORMATION TECHNOLOGY AND CONTROL, 00, Vol., No.A ON INTEGRATING UNSUPERVISED AND SUPERVISED CLASSIFICATION FOR CREDIT RISK EVALUATION Danuta Zakrzewska Institute of Computer Science, Technical
Data Mining Project Report. Document Clustering. Meryem Uzun-Per
Data Mining Project Report Document Clustering Meryem Uzun-Per 504112506 Table of Content Table of Content... 2 1. Project Definition... 3 2. Literature Survey... 3 3. Methods... 4 3.1. K-means algorithm...
STOCK MARKET TRENDS USING CLUSTER ANALYSIS AND ARIMA MODEL
Stock Asian-African Market Trends Journal using of Economics Cluster Analysis and Econometrics, and ARIMA Model Vol. 13, No. 2, 2013: 303-308 303 STOCK MARKET TRENDS USING CLUSTER ANALYSIS AND ARIMA MODEL
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE
A NEW DECISION TREE METHOD FOR DATA MINING IN MEDICINE Kasra Madadipouya 1 1 Department of Computing and Science, Asia Pacific University of Technology & Innovation ABSTRACT Today, enormous amount of data
COMP3420: Advanced Databases and Data Mining. Classification and prediction: Introduction and Decision Tree Induction
COMP3420: Advanced Databases and Data Mining Classification and prediction: Introduction and Decision Tree Induction Lecture outline Classification versus prediction Classification A two step process Supervised
Using Code Quality Metrics in Management of Outsourced Development and Maintenance
Using Code Quality Metrics in Management of Outsourced Development and Maintenance Table of Contents 1. Introduction...3 1.1 Target Audience...3 1.2 Prerequisites...3 1.3 Classification of Sub-Contractors...3
Measuring Software Complexity to Target Risky Modules in Autonomous Vehicle Systems
Measuring Software Complexity to Target Risky Modules in Autonomous Vehicle Systems M. N. Clark, Bryan Salesky, Chris Urmson Carnegie Mellon University Dale Brenneman McCabe Software Inc. Corresponding
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
FACULTY OF COMPUTER SCIENCE AND INFORMATION TECHNOLOGY AUTUMN 2016 BACHELOR COURSES
FACULTY OF COMPUTER SCIENCE AND INFORMATION TECHNOLOGY Please note! This is a preliminary list of courses for the study year 2016/2017. Changes may occur! AUTUMN 2016 BACHELOR COURSES DIP217 Applied Software
Web Forensic Evidence of SQL Injection Analysis
International Journal of Science and Engineering Vol.5 No.1(2015):157-162 157 Web Forensic Evidence of SQL Injection Analysis 針 對 SQL Injection 攻 擊 鑑 識 之 分 析 Chinyang Henry Tseng 1 National Taipei University
Data Mining: A Preprocessing Engine
Journal of Computer Science 2 (9): 735-739, 2006 ISSN 1549-3636 2005 Science Publications Data Mining: A Preprocessing Engine Luai Al Shalabi, Zyad Shaaban and Basel Kasasbeh Applied Science University,
Fault Prediction Using Statistical and Machine Learning Methods for Improving Software Quality
Journal of Information Processing Systems, Vol.8, No.2, June 2012 http://dx.doi.org/10.3745/jips.2012.8.2.241 Fault Prediction Using Statistical and Machine Learning Methods for Improving Software Quality
Estimation of the COCOMO Model Parameters Using Genetic Algorithms for NASA Software Projects
Journal of Computer Science 2 (2): 118-123, 2006 ISSN 1549-3636 2006 Science Publications Estimation of the COCOMO Model Parameters Using Genetic Algorithms for NASA Software Projects Alaa F. Sheta Computers
Mining Metrics to Predict Component Failures
Mining Metrics to Predict Component Failures Nachiappan Nagappan, Microsoft Research Thomas Ball, Microsoft Research Andreas Zeller, Saarland University Overview Introduction Hypothesis and high level
A metrics suite for JUnit test code: a multiple case study on open source software
Toure et al. Journal of Software Engineering Research and Development (2014) 2:14 DOI 10.1186/s40411-014-0014-6 RESEARCH Open Access A metrics suite for JUnit test code: a multiple case study on open source
LVQ Plug-In Algorithm for SQL Server
LVQ Plug-In Algorithm for SQL Server Licínia Pedro Monteiro Instituto Superior Técnico [email protected] I. Executive Summary In this Resume we describe a new functionality implemented
Software project cost estimation using AI techniques
Software project cost estimation using AI techniques Rodríguez Montequín, V.; Villanueva Balsera, J.; Alba González, C.; Martínez Huerta, G. Project Management Area University of Oviedo C/Independencia
Project Planning and Project Estimation Techniques. Naveen Aggarwal
Project Planning and Project Estimation Techniques Naveen Aggarwal Responsibilities of a software project manager The job responsibility of a project manager ranges from invisible activities like building
Estimating Size and Effort
Estimating Size and Effort Dr. James A. Bednar [email protected] http://homepages.inf.ed.ac.uk/jbednar Dr. David Robertson [email protected] http://www.inf.ed.ac.uk/ssp/members/dave.htm SAPM Spring 2007:
Mobile Phone APP Software Browsing Behavior using Clustering Analysis
Proceedings of the 2014 International Conference on Industrial Engineering and Operations Management Bali, Indonesia, January 7 9, 2014 Mobile Phone APP Software Browsing Behavior using Clustering Analysis
Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing
www.ijcsi.org 198 Data Mining Framework for Direct Marketing: A Case Study of Bank Marketing Lilian Sing oei 1 and Jiayang Wang 2 1 School of Information Science and Engineering, Central South University
Data Quality Mining: Employing Classifiers for Assuring consistent Datasets
Data Quality Mining: Employing Classifiers for Assuring consistent Datasets Fabian Grüning Carl von Ossietzky Universität Oldenburg, Germany, [email protected] Abstract: Independent
DATA MINING USING INTEGRATION OF CLUSTERING AND DECISION TREE
DATA MINING USING INTEGRATION OF CLUSTERING AND DECISION TREE 1 K.Murugan, 2 P.Varalakshmi, 3 R.Nandha Kumar, 4 S.Boobalan 1 Teaching Fellow, Department of Computer Technology, Anna University 2 Assistant
Prediction of Stock Performance Using Analytical Techniques
136 JOURNAL OF EMERGING TECHNOLOGIES IN WEB INTELLIGENCE, VOL. 5, NO. 2, MAY 2013 Prediction of Stock Performance Using Analytical Techniques Carol Hargreaves Institute of Systems Science National University
Financial Trading System using Combination of Textual and Numerical Data
Financial Trading System using Combination of Textual and Numerical Data Shital N. Dange Computer Science Department, Walchand Institute of Rajesh V. Argiddi Assistant Prof. Computer Science Department,
Hathaichanok Suwanjang and Nakornthip Prompoon
Framework for Developing a Software Cost Estimation Model for Software Based on a Relational Matrix of Project Profile and Software Cost Using an Analogy Estimation Method Hathaichanok Suwanjang and Nakornthip
A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS
A STUDY ON DATA MINING INVESTIGATING ITS METHODS, APPROACHES AND APPLICATIONS Mrs. Jyoti Nawade 1, Dr. Balaji D 2, Mr. Pravin Nawade 3 1 Lecturer, JSPM S Bhivrabai Sawant Polytechnic, Pune (India) 2 Assistant
Analysis and Evaluation of Quality Metrics in Software Engineering
Analysis and Evaluation of Quality Metrics in Software Engineering Zuhab Gafoor Dand 1, Prof. Hemlata Vasishtha 2 School of Science & Technology, Department of Computer Science, Shri Venkateshwara University,
Grid Density Clustering Algorithm
Grid Density Clustering Algorithm Amandeep Kaur Mann 1, Navneet Kaur 2, Scholar, M.Tech (CSE), RIMT, Mandi Gobindgarh, Punjab, India 1 Assistant Professor (CSE), RIMT, Mandi Gobindgarh, Punjab, India 2
Stephen MacDonell Andrew Gray Grant MacLennan Philip Sallis. The Information Science Discussion Paper Series. Number 99/12 June 1999 ISSN 1177-455X
DUNEDIN NEW ZEALAND Software Forensics for Discriminating between Program Authors using Case-Based Reasoning, Feed-Forward Neural Networks and Multiple Discriminant Analysis Stephen MacDonell Andrew Gray
Classification algorithm in Data mining: An Overview
Classification algorithm in Data mining: An Overview S.Neelamegam #1, Dr.E.Ramaraj *2 #1 M.phil Scholar, Department of Computer Science and Engineering, Alagappa University, Karaikudi. *2 Professor, Department
Extending Change Impact Analysis Approach for Change Effort Estimation in the Software Development Phase
Extending Change Impact Analysis Approach for Change Effort Estimation in the Software Development Phase NAZRI KAMA, MEHRAN HALIMI Advanced Informatics School Universiti Teknologi Malaysia 54100, Jalan
Information Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 01 : 06/10/2015 Practical informations: Teacher: Alberto Ceselli ([email protected])
Application of Data Mining Methods in Health Care Databases
6 th International Conference on Applied Informatics Eger, Hungary, January 27 31, 2004. Application of Data Mining Methods in Health Care Databases Ágnes Vathy-Fogarassy Department of Mathematics and
How To Calculate Class Cohesion
Improving Applicability of Cohesion Metrics Including Inheritance Jaspreet Kaur 1, Rupinder Kaur 2 1 Department of Computer Science and Engineering, LPU, Phagwara, INDIA 1 Assistant Professor Department
Unit 11: Software Metrics
Unit 11: Software Metrics Objective Ð To describe the current state-of-the-art in the measurement of software products and process. Why Measure? "When you can measure what you are speaking about and express
Clustering Data Streams
Clustering Data Streams Mohamed Elasmar Prashant Thiruvengadachari Javier Salinas Martin [email protected] [email protected] [email protected] Introduction: Data mining is the science of extracting
PREDICTIVE TECHNIQUES IN SOFTWARE ENGINEERING : APPLICATION IN SOFTWARE TESTING
PREDICTIVE TECHNIQUES IN SOFTWARE ENGINEERING : APPLICATION IN SOFTWARE TESTING Jelber Sayyad Shirabad Lionel C. Briand, Yvan Labiche, Zaheer Bawar Presented By : Faezeh R.Sadeghi Overview Introduction
IT services for analyses of various data samples
IT services for analyses of various data samples Ján Paralič, František Babič, Martin Sarnovský, Peter Butka, Cecília Havrilová, Miroslava Muchová, Michal Puheim, Martin Mikula, Gabriel Tutoky Technical
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM
TOWARDS SIMPLE, EASY TO UNDERSTAND, AN INTERACTIVE DECISION TREE ALGORITHM Thanh-Nghi Do College of Information Technology, Cantho University 1 Ly Tu Trong Street, Ninh Kieu District Cantho City, Vietnam
International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
Efficient Indicators to Evaluate the Status of Software Development Effort Estimation inside the Organizations
Efficient Indicators to Evaluate the Status of Software Development Effort Estimation inside the Organizations Elham Khatibi Department of Information System Universiti Teknologi Malaysia (UTM) Skudai
ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA
ENSEMBLE DECISION TREE CLASSIFIER FOR BREAST CANCER DATA D.Lavanya 1 and Dr.K.Usha Rani 2 1 Research Scholar, Department of Computer Science, Sree Padmavathi Mahila Visvavidyalayam, Tirupati, Andhra Pradesh,
A Comparison of Calibrated Equations for Software Development Effort Estimation
A Comparison of Calibrated Equations for Software Development Effort Estimation Cuauhtemoc Lopez Martin Edgardo Felipe Riveron Agustin Gutierrez Tornes 3,, 3 Center for Computing Research, National Polytechnic
Software Cost Estimation: A Tool for Object Oriented Console Applications
Software Cost Estimation: A Tool for Object Oriented Console Applications Ghazy Assassa, PhD Hatim Aboalsamh, PhD Amel Al Hussan, MSc Dept. of Computer Science, Dept. of Computer Science, Computer Dept.,
AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM
AUTO CLAIM FRAUD DETECTION USING MULTI CLASSIFIER SYSTEM ABSTRACT Luis Alexandre Rodrigues and Nizam Omar Department of Electrical Engineering, Mackenzie Presbiterian University, Brazil, São Paulo [email protected],[email protected]
Reference Books. Data Mining. Supervised vs. Unsupervised Learning. Classification: Definition. Classification k-nearest neighbors
Classification k-nearest neighbors Data Mining Dr. Engin YILDIZTEPE Reference Books Han, J., Kamber, M., Pei, J., (2011). Data Mining: Concepts and Techniques. Third edition. San Francisco: Morgan Kaufmann
A Study on Software Metrics and Phase based Defect Removal Pattern Technique for Project Management
International Journal of Soft Computing and Engineering (IJSCE) A Study on Software Metrics and Phase based Defect Removal Pattern Technique for Project Management Jayanthi.R, M Lilly Florence Abstract:
An Overview of Knowledge Discovery Database and Data mining Techniques
An Overview of Knowledge Discovery Database and Data mining Techniques Priyadharsini.C 1, Dr. Antony Selvadoss Thanamani 2 M.Phil, Department of Computer Science, NGM College, Pollachi, Coimbatore, Tamilnadu,
BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL
The Fifth International Conference on e-learning (elearning-2014), 22-23 September 2014, Belgrade, Serbia BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL SNJEŽANA MILINKOVIĆ University
Extension of Decision Tree Algorithm for Stream Data Mining Using Real Data
Fifth International Workshop on Computational Intelligence & Applications IEEE SMC Hiroshima Chapter, Hiroshima University, Japan, November 10, 11 & 12, 2009 Extension of Decision Tree Algorithm for Stream
DEA implementation and clustering analysis using the K-Means algorithm
Data Mining VI 321 DEA implementation and clustering analysis using the K-Means algorithm C. A. A. Lemos, M. P. E. Lins & N. F. F. Ebecken COPPE/Universidade Federal do Rio de Janeiro, Brazil Abstract
Evaluation of Complexity of Some Programming Languages on the Travelling Salesman Problem
International Journal of Applied Science and Technology Vol. 3 No. 8; December 2013 Evaluation of Complexity of Some Programming Languages on the Travelling Salesman Problem D. R. Aremu O. A. Gbadamosi
Analysis Of Source Lines Of Code(SLOC) Metric
Analysis Of Source Lines Of Code(SLOC) Metric Kaushal Bhatt 1, Vinit Tarey 2, Pushpraj Patel 3 1,2,3 Kaushal Bhatt MITS,Datana Ujjain 1 [email protected] 2 [email protected] 3 [email protected]
EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT S ACADEMIC PERFORMANCE
EFFICIENCY OF DECISION TREES IN PREDICTING STUDENT S ACADEMIC PERFORMANCE S. Anupama Kumar 1 and Dr. Vijayalakshmi M.N 2 1 Research Scholar, PRIST University, 1 Assistant Professor, Dept of M.C.A. 2 Associate
A Study of Web Log Analysis Using Clustering Techniques
A Study of Web Log Analysis Using Clustering Techniques Hemanshu Rana 1, Mayank Patel 2 Assistant Professor, Dept of CSE, M.G Institute of Technical Education, Gujarat India 1 Assistant Professor, Dept
Use of Data Mining Techniques to Improve the Effectiveness of Sales and Marketing
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 4, April 2015,
ASSOCIATION RULE MINING ON WEB LOGS FOR EXTRACTING INTERESTING PATTERNS THROUGH WEKA TOOL
International Journal Of Advanced Technology In Engineering And Science Www.Ijates.Com Volume No 03, Special Issue No. 01, February 2015 ISSN (Online): 2348 7550 ASSOCIATION RULE MINING ON WEB LOGS FOR
Experiments in Web Page Classification for Semantic Web
Experiments in Web Page Classification for Semantic Web Asad Satti, Nick Cercone, Vlado Kešelj Faculty of Computer Science, Dalhousie University E-mail: {rashid,nick,vlado}@cs.dal.ca Abstract We address
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set
EM Clustering Approach for Multi-Dimensional Analysis of Big Data Set Amhmed A. Bhih School of Electrical and Electronic Engineering Princy Johnson School of Electrical and Electronic Engineering Martin
