4 Techniques for Analyzing Large Data Sets
Pablo A. Goloboff

Contents

1 Introduction
2 Traditional Techniques
3 Composite Optima: Why Do Traditional Techniques Fail?
4 Techniques for Analyzing Large Data Sets
4.1 Ratchet
4.2 Sectorial searches
4.3 Tree-fusing
4.4 Tree-drifting
4.5 Combined methods
4.6 Minimum length: multiple trees or multiple hits?
5 TNT: Implementation of the New Methods
6 Remarks and Conclusions
Acknowledgments
References

1 Introduction

Parsimony problems with medium or large numbers of taxa can be analyzed only by means of trial-and-error or "heuristic" methods. Traditional strategies for finding most parsimonious trees have long been in use, implemented in the programs Hennig86 [1], PAUP [2], and NONA [3]. Although successful for small and medium-sized data sets, these techniques normally fail for very large data sets, i.e., data sets with 200 or more taxa. This is because, rather than simply requiring more of the same kind of work used to analyze smaller data sets, very large data sets require qualitatively different techniques. The techniques described here have so far been used only for prealigned sequences, but they could be adapted for other methods of analysis, like the direct optimization method of Wheeler [4].

Methods and Tools in Biosciences and Medicine. Techniques in molecular systematics and evolution, ed. by Rob DeSalle et al. Birkhäuser Verlag Basel/Switzerland
2 Traditional Techniques

The two basic heuristic techniques for finding most parsimonious trees are Wagner trees and branch-swapping. A Wagner tree is a tree created by sequentially adding the taxa at the most parsimonious available branch. At each point during the addition of taxa, only part of the data are actually used. A taxon may be placed best in some part of the tree when only some taxa are present, but it may be placed best somewhere else when all the taxa are considered. Therefore, the order in which taxa have been added determines the outcome of a Wagner tree, so that different addition sequences will lead - for large data sets - to different results. Branch-swapping is a widely used technique for improving the trees produced by the Wagner method. Branch-swapping takes a tree and evaluates the parsimony of each of a series of branch rearrangements (discarding, adding, or replacing the new tree if it is, respectively, worse, equal, or better than previously found trees). The number of rearrangements needed to complete swapping depends strongly on the number of taxa. The most widely used branch-swapping algorithm is "tree bisection reconnection" or TBR ([5]; called "branch-breaking" in Hennig86 [1]). In TBR, the tree is clipped in two, and the two subtrees are rejoined in each possible way. The number of rearrangements to complete TBR increases with the cube of the number of taxa, and thus the time needed to complete TBR on a tree with twice the taxa is much more than twice the time. Thus, if a tree of 10 taxa requires x rearrangements for complete swapping, a tree of 20 taxa will require 8x, 40 taxa will require 64x, and 80 taxa will require 512x.
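To make the Wagner-tree procedure concrete, here is a minimal Python sketch (an illustrative toy, not the implementation of any of the programs cited): trees are nested tuples of leaf names, rooted for simplicity; `tree_len` computes Fitch length; `insertions` enumerates every branch where a new taxon can be attached; and `wagner` adds each taxon at the best available branch. All names and the toy data layout are assumptions made here for illustration.

```python
def tree_len(tree, chars, nchar):
    """Fitch length of a rooted binary tree given as nested tuples of leaf names."""
    def fitch(t):
        if isinstance(t, str):  # a leaf: one state set per character
            return [frozenset((chars[t][i],)) for i in range(nchar)], 0
        (ls, ln), (rs, rn) = fitch(t[0]), fitch(t[1])
        sets, extra = [], 0
        for a, b in zip(ls, rs):
            inter = a & b
            if inter:                  # states agree: no step needed
                sets.append(inter)
            else:                      # disagreement: one step, keep the union
                sets.append(a | b)
                extra += 1
        return sets, ln + rn + extra
    return fitch(tree)[1]

def insertions(tree, x):
    """Yield every tree obtained by attaching leaf x to one branch of `tree`."""
    yield (x, tree)
    if isinstance(tree, tuple):
        a, b = tree
        for t in insertions(a, x):
            yield (t, b)
        for t in insertions(b, x):
            yield (a, t)

def wagner(order, chars, nchar):
    """Sequential addition: place each taxon at its most parsimonious branch."""
    tree = (order[0], order[1])
    for x in order[2:]:
        tree = min(insertions(tree, x), key=lambda t: tree_len(t, chars, nchar))
    return tree
```

Because only part of the data is in play at each addition step, different permutations of `order` can return different trees; shuffling `order` before each call is the random addition sequence (RAS) referred to below.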
Because of special short-cuts, which allow deducing tree length for rearrangements without unnecessary calculations (see [6, 7] for basic descriptions, and [8] for a description of techniques for multi-character optimization), the rearrangements for larger trees can in many cases be evaluated more quickly than the rearrangements for smaller trees. Therefore, the time for swapping increases in those cases with less than the cube of the number of taxa (although it is still more than the square). In implementations which do not (or cannot) use some of these short-cuts, the time to complete TBR may well increase with the cube of the number of taxa (the use of some of the techniques described here, like sectorial searches and tree-fusing, would be even more beneficial under those circumstances). For even relatively small data sets (i.e., 30 or 40 taxa), TBR may be unable, given some starting trees, to find the most parsimonious trees. In computer science, this is known as the problem of local optima (known in systematics as the problem of "islands" of trees; [9]). This is easily visualized by thinking of the parsimony of the trees as a "landscape" with peaks and valleys. The goal of the analysis is to get to the highest possible peak; this is done by taking a series of "steps" in several possible directions, going back if the step took us to a lower elevation, continuing from the new point if the step took us higher. Note that if the "steps" with which the swapping algorithm "walks" in this landscape are too
short, it may easily get trapped in an isolated peak of non-maximal height. To reach higher peaks, the algorithm would have to descend and then go up again - but the algorithm does not do so, by virtue of its own design. The two traditional strategies around the problem of local optima for the TBR algorithm are the use of multiple starting points for TBR and the retention of suboptimal trees during swapping. The first is more efficient and is thus the only one considered here. The multiple starting points for TBR are best obtained by building Wagner trees using different addition sequences. Typically, the addition sequence is randomized to obtain many different Wagner trees to be later input to TBR - this has been termed a "random addition sequence" or RAS. The expectation is that some of the initial trees will eventually be near or on the slopes of the highest peaks. For data sets of 50 to 150 taxa, this method generally works well, although it may require large numbers of RAS+TBR. The strategy of RAS+TBR, however, is very inefficient for data sets of much larger size. It might appear that larger data sets simply require a larger number of replications, but the number of RAS+TBR needed to actually find optimal trees for data sets with 500 or more taxa seems to increase exponentially.

3 Composite Optima: Why Do Traditional Techniques Fail?

Traditional techniques fail because very large trees can exhibit what Goloboff [10] termed composite optima. The TBR algorithm can get stuck in local optima for many data sets with 50 taxa. But a tree with (say) 500 taxa has many regions or sectors that can be seen as sub-problems of 50 taxa. Each of these sub-problems might have its own "local" and "global" optima. Whether a given sector is in a globally optimal configuration will be, to some extent, independent of whether other sectors in the tree are in their optimal configurations.
For a tree to be optimal, all sectors in the tree have to be in a globally optimal configuration, but the chances of achieving this result in a given RAS+TBR may be extremely low. If five sectors of the tree are in an optimal configuration, just starting a new RAS+TBR will possibly place other sectors of the tree in optimal configurations, but it is unlikely also to place the same five sectors that were optimal again in optimal configurations. Consider the following analogy: you have six dice, and the goal is to achieve the highest sum of values by throwing them. You can either take the six dice and throw all of them at once, in which case the probability of getting the highest value is (1/6)^6, or about 2 in 100,000. Or, you can use a divisive strategy: throw all the dice together only once, and then take each of the six dice and, in turn, throw it 50 times, keeping the highest value in each case. In the first case, you may well not find the highest possible value in 100,000 throws. With the divisive strategy of the second case, you would be
almost guaranteed to find the highest possible value with a total of 301 throws. In the real world, parsimony problems do not have sectors as clearly identified as the dice, and the resolution of different sectors is often not really independent. This simply makes the problem more difficult. It is then easy to understand why finding a shortest tree using RAS+TBR may become so difficult for large real data sets. Consider a tree of 500 taxa; such a tree could have 10 different sectors, each of which can have its own local optima. If a given RAS+TBR has a chance of 0.5 to find a globally optimal configuration for a given sector, then the chances of a given RAS+TBR finding a most parsimonious tree are 0.5^10, or less than 1 in 1,000. Thus, not only does the number of rearrangements necessary to complete TBR swapping increase rapidly with the number of taxa, but so does the number of replications of RAS+TBR that have to be done in order to find optimal trees.

4 Techniques for Analyzing Large Data Sets

The best way to analyze data sets with composite optima is by means analogous to the divisive strategy described above for the dice. Re-starting a new replication every time a replication of RAS+TBR gets stuck will simply not do the job in a reasonable time. There are four basic methods that have been proposed to cope with the problem of local optima. The first one to be developed was the parsimony ratchet ([11], originally presented at a symposium in 1998; see [12]). Subsequently developed methods are sectorial searches, tree-fusing, and tree-drifting [10]. The expected difference in performance between the traditional and these new techniques is about as much as one would expect for the two strategies for throwing the dice.
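The arithmetic behind the dice analogy and the composite-optimum example can be checked directly; the 50-throws-per-die and ten-sector figures below are the ones used in the text:

```python
from fractions import Fraction

# Strategy 1: throw all six dice at once and hope for six sixes.
p_all_at_once = Fraction(1, 6) ** 6        # 1/46656, about 2 in 100,000

# Strategy 2 (divisive): throw each die on its own 50 times, keeping its best.
# A single die shows a 6 at least once with probability 1 - (5/6)^50.
p_one_die = 1 - Fraction(5, 6) ** 50
p_divisive = p_one_die ** 6                # about 0.999: near-certain success

# Composite optima: 10 quasi-independent sectors, each with probability 0.5
# of reaching its globally optimal configuration in one RAS+TBR.
p_tree = Fraction(1, 2) ** 10              # 1/1024, less than 1 in 1,000
```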
4.1 Ratchet

The ratchet is based on slightly perturbing the data once TBR gets stuck, repeating a TBR search for the perturbed data using the same tree as starting point, and then using the resulting tree to search again under the original data. The perturbation is normally done either by increasing the weights of a proportion (10 to 15%) of the characters, or by eliminating some characters, as in jackknifing (but with lower probabilities of deletion). The TBR searches for both the perturbed and the original data must be made saving only one (or very few) trees. The effectiveness of the ratchet is not significantly increased by saving more trees, but run times are (see [11] for details). The ratchet works because the perturbation phase makes partial changes to the tree, without changing its entire structure. The changes are made, at each round, to only part of the tree, improving, it is hoped, the tree a few parts at a time. In the end, the changes made by the ratchet are determined by
character conflict: a given TBR rearrangement can improve the tree for the perturbed data only if some characters actually favor the alternative groupings. Since it is character conflict in the first place that determines the existence of local optima, the ratchet addresses the problem of local optima at its very heart. The ratchet is very effective for finding shortest trees. In the case of the 500-taxon data set of Chase et al. [13], the ratchet can find a shortest tree in about 2 hours (on a 266 MHz Pentium II machine). Using only multiple RAS+TBR, it takes from 48 to 72 hours to find minimum length for that data set.

4.2 Sectorial searches

Sectorial searches choose a sector of the tree of a size that can be properly handled by the TBR algorithm, create a reduced data set for that part of the tree, and analyze that sector by doing some number of RAS+TBR (without saving multiple trees). Then the best tree found for the sector is replaced onto the entire tree. The process is repeated several times, choosing different sectors. The sectors can be chosen at random, or based on a consensus previously calculated by some means. Details are given in Goloboff [10]. Sectorial searches find short trees much more effectively than TBR alone; in the case of Chase et al.'s data set, finding equally short trees using TBR alone would require over 10 times more replications than when using sectorial searches, and this would take about 7 times longer. Sectorial searches alone rarely find an optimal tree for large data sets. Used alone, they are less effective than the ratchet, normally going down to some non-minimal length (much lower than TBR alone), and then getting stuck. Sectorial searches, however, analyze many reduced data sets, which take almost no time at all. They thus have the advantage that they get down to a non-minimal length faster than the ratchet.
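The logic of a sectorial search - solve small sub-problems while the rest of the solution is held fixed, then splice the result back - can be illustrated on a toy problem: a bitstring stands in for the tree, pairwise binary "characters" for the data, and random subsets of positions for tree sectors. Everything here (the cost function, sector size, number of rounds) is an assumption for illustration, not Goloboff's actual procedure, which picks sectors of the tree itself and analyzes reduced data sets with RAS+TBR.

```python
import random
from itertools import product

def cost(state, chars):
    # each toy "character" (i, j, want_equal) adds one step when violated
    return sum(1 for i, j, eq in chars if (state[i] == state[j]) != eq)

def optimize_sector(state, chars, sector):
    """Exhaustively solve one small sector with the rest of `state` fixed,
    then splice the best assignment back in (the analogue of analyzing a
    reduced data set and replacing the sector onto the whole tree)."""
    best, best_c = list(state), cost(state, chars)
    for bits in product((0, 1), repeat=len(sector)):
        cand = list(state)
        for pos, b in zip(sector, bits):
            cand[pos] = b
        c = cost(cand, chars)
        if c < best_c:
            best, best_c = cand, c
    return best

def sectorial_search(state, chars, rounds=40, size=8, rng=random):
    """Repeatedly pick a random sector and solve it exactly."""
    for _ in range(rounds):
        sector = rng.sample(range(len(state)), size)
        state = optimize_sector(state, chars, sector)
    return state
```

Each sector is cheap to solve, which is why this kind of search descends quickly; but, as noted above, it tends to stall above the global optimum, so it is best used as an initial stage.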
They are then useful as initial stages of the search, in combination with other methods.

4.3 Tree-fusing

Tree-fusing takes two trees and evaluates all possible exchanges of subtrees with identical taxon composition. The subtree exchanges that improve the tree are then actually made. See Goloboff [10] for details. Tree-fusing is best done by successively fusing pairs of trees, and thus needs several trees as input to produce results; getting those trees will require several replications of RAS+TBR, possibly followed by some other method (like a sectorial search, the ratchet, or tree-drifting). Once several close-to-optimal trees have been obtained, tree-fusing produces dramatic improvements in almost no time. It is easy to see why: each of the sectors will be in an optimal configuration in at least
some of the trees, and tree-fusing simply merges those optimal sectors to achieve a globally optimal tree. In this sense, tree-fusing makes it possible to make good use of trees which are not globally optimal, as long as they have at least some sectors in optimal configuration.

4.4 Tree-drifting

Tree-drifting is based on an idea quite similar to that of the ratchet. It is based on doing rounds of TBR, alternately accepting only optimal trees, and suboptimal as well as optimal trees. The suboptimal trees are accepted, during the drift phase, with a probability that depends on how suboptimal the trees are. One of the key components of the method is the function for determining the probability of acceptance, which is based on both the absolute step difference and a measure of character conflict (the relative fit difference, i.e., the ratio of steps gained and saved across all characters between the two trees; see [14]). Trees as good as or better than the one being swapped are always accepted. Once a given number of rearrangements has been accepted, a round of TBR accepting only optimal trees is made, and the process is repeated (as in the ratchet) a certain number of times. Tree-drifting is about as effective as the ratchet at finding shortest trees, although in current implementations tree-drifting seems to find minimum length about two to three times faster than the ratchet itself. This difference is probably a consequence of the fact that the ratchet analyzes the perturbed data set until completion of TBR, while the equivalent phase in tree-drifting only does a fixed number of replacements. Since there is no point in having the ratchet find the actually optimal trees for the perturbed data, the ratchet could easily be modified such that the perturbed phase finishes as soon as a certain number of rearrangements has been accepted. Most likely this would make the ratchet about as fast as tree-drifting.
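The ratchet (Sect. 4.1) and tree-drifting share a perturb-then-search template, which the sketch below illustrates on a deliberately simple stand-in problem: the "tree" is a bitstring, "characters" are weighted pairwise constraints, and single-bit hill climbing plays the role of TBR. The 15% reweighting fraction follows the text; the exp(-delta) acceptance rule is an assumed simplification of the relative-fit-difference function of [10, 14], not the published formula.

```python
import math
import random

def cost(state, chars, w=None):
    # each toy "character" (i, j, want_equal) counts its weight when violated
    return sum((w[c] if w else 1)
               for c, (i, j, eq) in enumerate(chars)
               if (state[i] == state[j]) != eq)

def climb(state, chars, w=None):
    # strict phase: keep flipping single bits while any flip helps (TBR stand-in)
    improved = True
    while improved:
        improved = False
        for k in range(len(state)):
            before = cost(state, chars, w)
            state[k] ^= 1
            if cost(state, chars, w) < before:
                improved = True
            else:
                state[k] ^= 1
    return state

def ratchet_cycle(state, chars, rng):
    # ratchet: upweight ~15% of the characters, climb, restore weights, climb
    w = [3 if rng.random() < 0.15 else 1 for _ in chars]
    climb(state, chars, w)
    return climb(state, chars)

def drift_cycle(state, chars, rng, accepted_target=30):
    # drift: accept equal-or-better flips always, worse ones with a
    # probability shrinking in the cost difference, then climb strictly
    accepted = 0
    while accepted < accepted_target:
        k = rng.randrange(len(state))
        before = cost(state, chars)
        state[k] ^= 1
        delta = cost(state, chars) - before
        if delta <= 0 or rng.random() < math.exp(-delta):
            accepted += 1
        else:
            state[k] ^= 1          # reject: undo the flip
    return climb(state, chars)

def search(state, chars, cycle, cycles=10, rng=random):
    state = climb(list(state), chars)
    best = list(state)
    for _ in range(cycles):
        state = cycle(state, chars, rng)
        if cost(state, chars) < cost(best, chars):
            best = list(state)
    return best
```

Calling `search(start, chars, ratchet_cycle)` or `search(start, chars, drift_cycle)` shows the shared structure: both escape local optima by temporarily relaxing the objective, then re-searching under the original one.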
4.5 Combined methods

The methods described above can be combined. The best results have been obtained when RAS+TBR is first followed by sectorial searches, then some drifting or ratcheting, and the resulting trees are fused. Repeating this procedure will sometimes find minimum length much more quickly than other times. If the procedure uses (say) ten initial replications, on occasion the first four or five replications will find a shortest tree, the rest of the time effectively being wasted - at least as far as hitting minimum length is concerned. On other occasions, the ten replications will not be enough to find minimum length, but then there is no point in starting from scratch with another ten replications: maybe just adding a few more, and tree-fusing those new replications with the previous ten,
will do the job. The most efficient results, unsurprisingly, are then obtained when the methods described above are combined, and the parameters for the entire search are supervised and changed at run time. At each point, the number of initial replications is changed according to how many replications had to be used in previous hits to minimum length; if fewer replications were needed, the number is decreased, and vice versa. Goloboff [10] suggested that it would also be beneficial to change the number of sectorial searches, and the number of drift cycles, to be done within each replication (although this has not actually been implemented so far). The process just described in the end also makes it likely that the best results obtained correspond to the actual minimum length. Each hit to minimum length will use as many initial replications as necessary to reproduce the previously found best length; if the length used so far as a bound is in fact not optimal, shorter trees will eventually be found. Every certain number of hits to minimum length, the results from all previous replications can be submitted to tree-fusing. If the trees from several independent hits to some length do not produce shorter trees when subjected to fusing, it is likely that that length indeed represents the minimum possible (and thus tree-fusing provides an additional criterion, beyond mere convergence, to determine whether the actual minimum length has been found in a particular case). Alternatively, the search parameters can be made very aggressive (i.e., many replications, with lots of drifting and fusing, etc.) at first, to make sure that one has the actual minimum length, and subsequently they can be switched to the more effort-saving strategy when it comes to determining the consensus tree for the data set being analyzed.

4.6 Minimum length: multiple trees or multiple hits?
The approach to parsimony analysis for many years has been to try to actually find each and every possible most parsimonious tree for the data. Getting all possible most parsimonious trees for large data sets can be a difficult task (since there can be millions of them). What is more important, for the purpose of taxonomic studies, is that there is absolutely no point in doing so. Since the trees found are to be used to create a (strict) consensus tree, it would be much less wasteful to simply gather the minimum number of trees necessary to produce the same consensus that would be produced by all possible most parsimonious trees. In this sense, it is more fruitful to find additional trees of minimum length by producing new, independent hits to minimum length, than it is to find trees from the same hit by doing TBR saving multiple trees. Doing TBR saving multiple trees will produce, by necessity, trees which are in the same local optimum or island, differing by few rearrangements, while the trees from new hits to minimum length could, potentially, be more different - possibly belonging to different islands. The consensus from a few trees from independent hits to minimum length is likely to be the same as the consensus from every possible most parsimonious tree, especially when the trees are collapsed more stringently. The trees can be collapsed by applying the TBR algorithm, not to find multiple trees, but rather to collapse all the nodes between source and destination node when a rearrangement produces a tree of the same length as the tree being swapped. This produces the same results as would be produced by saving large numbers of trees, but more quickly and using less RAM. This is one of the main ideas in Farris et al.'s [15] paper, further explored in Goloboff and Farris [14]. Thus, the current implementation of the methods described here exploits this idea. As minimum length is successively hit, the consensus of the results obtained so far can be calculated. The consensus will become less and less resolved with additional hits to minimum length, up to a point where it becomes stable. Once additional hits to minimum length do not further de-resolve the consensus, the search can be stopped, and it is likely that the consensus corresponds to the same consensus that would be obtained if each and every most parsimonious tree were used to produce a consensus. If the user wants more confidence that the actual consensus has been obtained, once the consensus has become stable, it is possible to restart calculating a consensus from the new (subsequent) hits to minimum length, until it becomes stable again; the grand consensus of both consensuses is less likely to contain spurious groups (i.e., actually unsupported groups, present in some most parsimonious trees, but not in all of them). For Chase et al.'s data set, when the consensus is calculated every three hits to minimum length, until stability is achieved twice, the analysis takes (on a 266 MHz Pentium II) an average time of only 4 hours (minimum length being hit 20 to 40 times).
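The stopping rule based on consensus stability can be sketched as follows. Trees are nested tuples of leaf names, the strict consensus is computed as a set of shared clades, and `batch` and `times` mirror the "every three hits, stable twice" scheme; this is a toy illustration and does not reproduce TNT's more stringent collapsing rules.

```python
def clades(tree):
    """Return (leaf set, set of clades) for a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves, found = frozenset(), set()
    for sub in tree:
        l, f = clades(sub)
        leaves |= l
        found |= f
    found.add(leaves)
    return leaves, found

def strict_consensus(trees):
    """Clades shared by every tree (the strict consensus, as a clade set)."""
    return set.intersection(*(clades(t)[1] for t in trees))

def stable_consensus(hits, batch=3, times=2):
    """Recompute the strict consensus every `batch` hits to minimum length,
    stopping once it has remained unchanged `times` consecutive checks."""
    trees, consensus, stable = [], None, 0
    for tree in hits:               # `hits` yields one tree per hit
        trees.append(tree)
        if len(trees) % batch:
            continue
        new = strict_consensus(trees)
        if new == consensus:
            stable += 1
            if stable >= times:     # consensus has stopped de-resolving
                break
        else:
            consensus, stable = new, 0
    return strict_consensus(trees)
```

Because the consensus can only lose groups as more minimum-length trees are added, stability under new hits is evidence (not proof) that the consensus of all most parsimonious trees has been reached.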
The exact consensus is obtained 80% of the time, and the 20% of cases where the consensus is not exact exhibit only one or two spurious nodes. The consensus could be made more reliable by re-calculating it until stability is reached more times, and by re-calculating it less frequently (e.g., every five hits to minimum length, instead of three). This is in stark contrast with a search like Rice et al.'s [16] analysis, based on the trees found in 3.5 months of analysis from a single replication, which produced 46 spurious nodes.

5 TNT: Implementation of the New Methods

The techniques described here have been implemented in "Tree analysis using New Technology" (TNT), a new program by P. Goloboff, J. Farris, and K. Nixon [17]. The program is still a prototype, but demonstration versions are available. The program has a full Windows interface (although command-driven versions for other operating systems are anticipated). The input format is as for Hennig86 and NONA (see Siddall, this volume). The program allows the user to change the parameters of the search, either by hand, or by letting the program try to identify the best parameters for a given size of data set and degree of exhaustiveness. In general, a few recommendations can be made. Data sets with
fewer than 100 taxa will be difficult to analyze only when extremely incongruent. In those cases, the methods of tree-fusing and sectorial searches perform more poorly (these methods assume that some isolated sectors in the tree can indeed be identified, but this is unlikely to be the case for such data sets). Therefore, smaller data sets are best analyzed by means of extensive ratcheting and/or tree-drifting, reducing tree-fusing and sectorial searches to a minimum. Larger data sets can be analyzed with essentially only sectorial searches plus tree-fusing if they are rather clean. However, as data sets become more difficult, it is necessary to increase not only the number of initial replications, but also the exhaustiveness of each replication. This is best done by selecting (at some point in each of the initial replications) sectors of larger size and analyzing them with tree-drifting instead of simply RAS+TBR (this is the "DSS" option in the sectorial-search dialogue of TNT). Larger sectors are more likely to identify areas of conflict, and it is less likely that better solutions will be missed because they would require that some taxon be moved outside the sector being analyzed. After a certain number of sector selections have been analyzed with tree-drifting, several cycles of global tree-drifting further improve the trees, before submitting them to tree-fusing. The tree-drifting can be done faster if some nodes are constrained during the search (the constraint is created from a consensus of the previous tree and the tree resulting from the perturbed round of TBR; see [10]). This might conceivably decrease the effectiveness of the drift, but that can be countered by doing an unconstrained cycle of drift with some periodicity; and since it means more cycles of drift per unit time, in the end it means an increase in effectiveness.
The "hard cycles" option in the "Drift" dialogue box of TNT sets the number of constrained drift cycles to do before an unconstrained cycle is done. If large numbers of drift cycles are to be done, it is advisable to set the hard cycles so that a large portion of the drift cycles are constrained (e.g., eight or nine out of ten). For difficult data sets, making the searches more exhaustive will take more time per replication, but in the end will mean that minimum length can be found much more quickly. The number of hits to re-check for consensus stability, and the number of times the consensus should reach stability, are changed from the main dialogue box of the "New Technology Search." As discussed above, this determines the reliability of the consensus tree obtained, with larger numbers meaning more reliable results. If users so prefer, they may simply decide to hit minimum length a certain number of times and then let the program stop.

6 Remarks and Conclusions

New methods for analysis of large data sets perform at speeds that were unimaginable only a few years ago. Parsimony problems of a few hundred taxa had been considered "intractable" by many authors, but they can now be easily analyzed. No doubt the enormous progress made in the last few years in this area has been facilitated by the fact that people have recently started publishing and openly discussing new algorithms and ideas. Although at
present it is difficult to predict whether the currently used methods will be further improved, the possibility certainly exists: the field of computational cladistics is still an area of active discussion and ferment.

Acknowledgments

The author wishes to thank Martin Ramirez and Gonzalo Giribet for comments and help during the preparation of the manuscript. Part of the research was carried out with the deeply appreciated support from PICT (Agencia Nacional de Promoción Científica y Tecnológica), and from PEI 0324/97 (CONICET).

References

1. Farris JS (1988) Hennig86, version 1.5, program and documentation. Port Jefferson, NY
2. Swofford DL (1993) PAUP: Phylogenetic analysis using parsimony, program and documentation. Illinois
3. Goloboff PA (1994b) NONA, program and documentation. Available at ftp.unt.edu.ar/pub/parsimony
4. Wheeler WC (1996) Optimization alignment: the end of multiple sequence alignment in phylogenetics? Cladistics 12: 1-9
5. Swofford D, Olsen G (1990) Phylogeny reconstruction. In: D Hillis and C Moritz (eds): Molecular Systematics
6. Goloboff PA (1994a) Character optimization and calculation of tree lengths. Cladistics 9:
7. Goloboff PA (1996) Methods for faster parsimony analysis. Cladistics 12:
8. Moilanen A (1999) Searching for most parsimonious trees with simulated evolutionary optimization. Cladistics 15:
9. Maddison D (1991) The discovery and importance of multiple islands of most parsimonious trees. Syst. Zool. 40:
10. Goloboff PA (1999) Analyzing large data sets in reasonable times: solutions for composite optima. Cladistics 15:
11. Nixon KC (1999) The Parsimony Ratchet, a new method for rapid parsimony analysis. Cladistics 15:
12. Horovitz I (1999) A report on "One Day Symposium on Numerical Cladistics". Cladistics 15:
13. Chase MW, Soltis DE, Olmstead RG, Morgan D et al. (1993) Phylogenetics of seed plants: An analysis of nucleotide sequences from the plastid gene rbcL. Ann. Mo. Bot. Gard. 80:
14. Goloboff PA, Farris JS (2001) Methods for quick consensus estimation. Cladistics 17:
15. Farris JS, Albert VA, Källersjö M, Lipscomb D et al. (1996) Parsimony jackknifing outperforms neighbor-joining. Cladistics 12:
16. Rice KA, Donoghue MJ, Olmstead RG (1997) Analyzing large data sets: rbcL 500 revisited. Syst. Biol. 46:
17. Goloboff PA, Farris JS, Nixon KC (1999) TNT: Tree analysis using New Technology. Available at
More informationLaboratory work in AI: First steps in Poker Playing Agents and Opponent Modeling
Laboratory work in AI: First steps in Poker Playing Agents and Opponent Modeling Avram Golbert 01574669 agolbert@gmail.com Abstract: While Artificial Intelligence research has shown great success in deterministic
More informationPeer-to-peer Cooperative Backup System
Peer-to-peer Cooperative Backup System Sameh Elnikety Mark Lillibridge Mike Burrows Rice University Compaq SRC Microsoft Research Abstract This paper presents the design and implementation of a novel backup
More informationChapter 4 SUPPLY CHAIN PERFORMANCE MEASUREMENT USING ANALYTIC HIERARCHY PROCESS METHODOLOGY
Chapter 4 SUPPLY CHAIN PERFORMANCE MEASUREMENT USING ANALYTIC HIERARCHY PROCESS METHODOLOGY This chapter highlights on supply chain performance measurement using one of the renowned modelling technique
More informationEffect of Using Neural Networks in GA-Based School Timetabling
Effect of Using Neural Networks in GA-Based School Timetabling JANIS ZUTERS Department of Computer Science University of Latvia Raina bulv. 19, Riga, LV-1050 LATVIA janis.zuters@lu.lv Abstract: - The school
More informationMolecular Clocks and Tree Dating with r8s and BEAST
Integrative Biology 200B University of California, Berkeley Principals of Phylogenetics: Ecology and Evolution Spring 2011 Updated by Nick Matzke Molecular Clocks and Tree Dating with r8s and BEAST Today
More informationResearch on a Heuristic GA-Based Decision Support System for Rice in Heilongjiang Province
Research on a Heuristic GA-Based Decision Support System for Rice in Heilongjiang Province Ran Cao 1,1, Yushu Yang 1, Wei Guo 1, 1 Engineering college of Northeast Agricultural University, Haerbin, China
More information6 Creating the Animation
6 Creating the Animation Now that the animation can be represented, stored, and played back, all that is left to do is understand how it is created. This is where we will use genetic algorithms, and this
More informationRouting in packet-switching networks
Routing in packet-switching networks Circuit switching vs. Packet switching Most of WANs based on circuit or packet switching Circuit switching designed for voice Resources dedicated to a particular call
More informationSocial Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
More information!"!!"#$$%&'()*+$(,%!"#$%$&'()*""%(+,'-*&./#-$&'(-&(0*".$#-$1"(2&."3$'45"
!"!!"#$$%&'()*+$(,%!"#$%$&'()*""%(+,'-*&./#-$&'(-&(0*".$#-$1"(2&."3$'45"!"#"$%&#'()*+',$$-.&#',/"-0%.12'32./4'5,5'6/%&)$).2&'7./&)8'5,5'9/2%.%3%&8':")08';:
More informationInformation Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay
Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 17 Shannon-Fano-Elias Coding and Introduction to Arithmetic Coding
More informationSingle machine models: Maximum Lateness -12- Approximation ratio for EDD for problem 1 r j,d j < 0 L max. structure of a schedule Q...
Lecture 4 Scheduling 1 Single machine models: Maximum Lateness -12- Approximation ratio for EDD for problem 1 r j,d j < 0 L max structure of a schedule 0 Q 1100 11 00 11 000 111 0 0 1 1 00 11 00 11 00
More informationLoad Distribution in Large Scale Network Monitoring Infrastructures
Load Distribution in Large Scale Network Monitoring Infrastructures Josep Sanjuàs-Cuxart, Pere Barlet-Ros, Gianluca Iannaccone, and Josep Solé-Pareta Universitat Politècnica de Catalunya (UPC) {jsanjuas,pbarlet,pareta}@ac.upc.edu
More informationBorges, J. L. 1998. On exactitude in science. P. 325, In, Jorge Luis Borges, Collected Fictions (Trans. Hurley, H.) Penguin Books.
... In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, and the map of the Empire, the entirety of a Province. In time, those
More informationTesting Metrics. Introduction
Introduction Why Measure? What to Measure? It is often said that if something cannot be measured, it cannot be managed or improved. There is immense value in measurement, but you should always make sure
More informationThe 7 Attributes of a Good Software Configuration Management System
Software Development Best Practices The 7 Attributes of a Good Software Configuration Management System Robert Kennedy IBM Rational software Benefits of Business Driven Development GOVERNANCE DASHBOARD
More information3/8/2011. Applying Integrated Risk and Performance Management: A Case Study. Background: Our Discussion Today
FINANCIAL SERVICES Applying Integrated Risk and Performance Management: A Case Study KPMG LLP Our Discussion Today Background What is Integrated Risk and Performance Management: Economic Theory Perspective
More informationHow To Compare Load Sharing And Job Scheduling In A Network Of Workstations
A COMPARISON OF LOAD SHARING AND JOB SCHEDULING IN A NETWORK OF WORKSTATIONS HELEN D. KARATZA Department of Informatics Aristotle University of Thessaloniki 546 Thessaloniki, GREECE Email: karatza@csd.auth.gr
More informationLotto Master Formula (v1.3) The Formula Used By Lottery Winners
Lotto Master Formula (v.) The Formula Used By Lottery Winners I. Introduction This book is designed to provide you with all of the knowledge that you will need to be a consistent winner in your local lottery
More informationGEOENGINE MSc in Geomatics Engineering (Master Thesis) Anamelechi, Falasy Ebere
Master s Thesis: ANAMELECHI, FALASY EBERE Analysis of a Raster DEM Creation for a Farm Management Information System based on GNSS and Total Station Coordinates Duration of the Thesis: 6 Months Completion
More informationOutline. NP-completeness. When is a problem easy? When is a problem hard? Today. Euler Circuits
Outline NP-completeness Examples of Easy vs. Hard problems Euler circuit vs. Hamiltonian circuit Shortest Path vs. Longest Path 2-pairs sum vs. general Subset Sum Reducing one problem to another Clique
More informationStudio 5.0 User s Guide
Studio 5.0 User s Guide wls-ug-administrator-20060728-05 Revised 8/8/06 ii Copyright 2006 by Wavelink Corporation All rights reserved. Wavelink Corporation 6985 South Union Park Avenue, Suite 335 Midvale,
More informationRoulette Wheel Testing. Report on Stage 3.1 of NWML/GBGB Project Proposal
NOTICE: Following large wins from professional roulette teams, the UK Weights and Measures Lab (government lab) conducted a study to determine if particular "wheel conditions" made roulette spins predictable.
More informationAnalysis of Micromouse Maze Solving Algorithms
1 Analysis of Micromouse Maze Solving Algorithms David M. Willardson ECE 557: Learning from Data, Spring 2001 Abstract This project involves a simulation of a mouse that is to find its way through a maze.
More informationCompact Representations and Approximations for Compuation in Games
Compact Representations and Approximations for Compuation in Games Kevin Swersky April 23, 2008 Abstract Compact representations have recently been developed as a way of both encoding the strategic interactions
More informationChapter 13: Binary and Mixed-Integer Programming
Chapter 3: Binary and Mixed-Integer Programming The general branch and bound approach described in the previous chapter can be customized for special situations. This chapter addresses two special situations:
More informationINTEGER PROGRAMMING. Integer Programming. Prototype example. BIP model. BIP models
Integer Programming INTEGER PROGRAMMING In many problems the decision variables must have integer values. Example: assign people, machines, and vehicles to activities in integer quantities. If this is
More informationCredit Card Market Study Interim Report: Annex 4 Switching Analysis
MS14/6.2: Annex 4 Market Study Interim Report: Annex 4 November 2015 This annex describes data analysis we carried out to improve our understanding of switching and shopping around behaviour in the UK
More informationWhat makes a good process?
Rob Davis Everyone wants a good process. Our businesses would be more profitable if we had them. But do we know what a good process is? Would we recognized one if we saw it? And how do we ensure we can
More informationClassification/Decision Trees (II)
Classification/Decision Trees (II) Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Right Sized Trees Let the expected misclassification rate of a tree T be R (T ).
More informationHow to Learn Good Cue Orders: When Social Learning Benefits Simple Heuristics
How to Learn Good Cue Orders: When Social Learning Benefits Simple Heuristics Rocio Garcia-Retamero (rretamer@mpib-berlin.mpg.de) Center for Adaptive Behavior and Cognition, Max Plank Institute for Human
More informationAutomatic Inventory Control: A Neural Network Approach. Nicholas Hall
Automatic Inventory Control: A Neural Network Approach Nicholas Hall ECE 539 12/18/2003 TABLE OF CONTENTS INTRODUCTION...3 CHALLENGES...4 APPROACH...6 EXAMPLES...11 EXPERIMENTS... 13 RESULTS... 15 CONCLUSION...
More informationLoad Balancing and Rebalancing on Web Based Environment. Yu Zhang
Load Balancing and Rebalancing on Web Based Environment Yu Zhang This report is submitted as partial fulfilment of the requirements for the Honours Programme of the School of Computer Science and Software
More informationGenetic algorithms for changing environments
Genetic algorithms for changing environments John J. Grefenstette Navy Center for Applied Research in Artificial Intelligence, Naval Research Laboratory, Washington, DC 375, USA gref@aic.nrl.navy.mil Abstract
More informationDeployment of express checkout lines at supermarkets
Deployment of express checkout lines at supermarkets Maarten Schimmel Research paper Business Analytics April, 213 Supervisor: René Bekker Faculty of Sciences VU University Amsterdam De Boelelaan 181 181
More informationResearch Paper Business Analytics. Applications for the Vehicle Routing Problem. Jelmer Blok
Research Paper Business Analytics Applications for the Vehicle Routing Problem Jelmer Blok Applications for the Vehicle Routing Problem Jelmer Blok Research Paper Vrije Universiteit Amsterdam Faculteit
More informationMultiobjective Multicast Routing Algorithm
Multiobjective Multicast Routing Algorithm Jorge Crichigno, Benjamín Barán P. O. Box 9 - National University of Asunción Asunción Paraguay. Tel/Fax: (+9-) 89 {jcrichigno, bbaran}@cnc.una.py http://www.una.py
More informationImproved Single and Multiple Approximate String Matching
Improved Single and Multiple Approximate String Matching Kimmo Fredriksson Department of Computer Science, University of Joensuu, Finland Gonzalo Navarro Department of Computer Science, University of Chile
More informationThe Mathematics of the RSA Public-Key Cryptosystem
The Mathematics of the RSA Public-Key Cryptosystem Burt Kaliski RSA Laboratories ABOUT THE AUTHOR: Dr Burt Kaliski is a computer scientist whose involvement with the security industry has been through
More informationThe mathematical branch of probability has its
ACTIVITIES for students Matthew A. Carlton and Mary V. Mortlock Teaching Probability and Statistics through Game Shows The mathematical branch of probability has its origins in games and gambling. And
More informationUsing Analytic Hierarchy Process (AHP) Method to Prioritise Human Resources in Substitution Problem
Using Analytic Hierarchy Process (AHP) Method to Raymond Ho-Leung TSOI Software Quality Institute Griffith University *Email:hltsoi@hotmail.com Abstract In general, software project development is often
More informationThe Classes P and NP
The Classes P and NP We now shift gears slightly and restrict our attention to the examination of two families of problems which are very important to computer scientists. These families constitute the
More informationConcept of Cache in web proxies
Concept of Cache in web proxies Chan Kit Wai and Somasundaram Meiyappan 1. Introduction Caching is an effective performance enhancing technique that has been used in computer systems for decades. However,
More informationA data management framework for the Fungal Tree of Life
Web Accessible Sequence Analysis for Biological Inference A data management framework for the Fungal Tree of Life Kauff F, Cox CJ, Lutzoni F. 2007. WASABI: An automated sequence processing system for multi-gene
More informationLecture 10: Regression Trees
Lecture 10: Regression Trees 36-350: Data Mining October 11, 2006 Reading: Textbook, sections 5.2 and 10.5. The next three lectures are going to be about a particular kind of nonlinear predictive model,
More informationWhat is Data Mining, and How is it Useful for Power Plant Optimization? (and How is it Different from DOE, CFD, Statistical Modeling)
data analysis data mining quality control web-based analytics What is Data Mining, and How is it Useful for Power Plant Optimization? (and How is it Different from DOE, CFD, Statistical Modeling) StatSoft
More informationChoices, choices, choices... Which sequence database? Which modifications? What mass tolerance?
Optimization 1 Choices, choices, choices... Which sequence database? Which modifications? What mass tolerance? Where to begin? 2 Sequence Databases Swiss-prot MSDB, NCBI nr dbest Species specific ORFS
More information(Refer Slide Time: 01:52)
Software Engineering Prof. N. L. Sarda Computer Science & Engineering Indian Institute of Technology, Bombay Lecture - 2 Introduction to Software Engineering Challenges, Process Models etc (Part 2) This
More informationServer & Client Optimization
Table of Contents: Farmers WIFE / Farmers WIFE Advanced Make sure your Server specification is within the requirements... 2 Operating System... 2 Hardware... 2 Processor Server... 2 Memory... 2 Hard disk
More informationARLA Members Survey of the Private Rented Sector
Prepared for The Association of Residential Letting Agents ARLA Members Survey of the Private Rented Sector Fourth Quarter 2013 Prepared by: O M Carey Jones 5 Henshaw Lane Yeadon Leeds LS19 7RW December,
More information8.1 Min Degree Spanning Tree
CS880: Approximations Algorithms Scribe: Siddharth Barman Lecturer: Shuchi Chawla Topic: Min Degree Spanning Tree Date: 02/15/07 In this lecture we give a local search based algorithm for the Min Degree
More informationAsexual Versus Sexual Reproduction in Genetic Algorithms 1
Asexual Versus Sexual Reproduction in Genetic Algorithms Wendy Ann Deslauriers (wendyd@alumni.princeton.edu) Institute of Cognitive Science,Room 22, Dunton Tower Carleton University, 25 Colonel By Drive
More informationIMPROVING PERFORMANCE OF RANDOMIZED SIGNATURE SORT USING HASHING AND BITWISE OPERATORS
Volume 2, No. 3, March 2011 Journal of Global Research in Computer Science RESEARCH PAPER Available Online at www.jgrcs.info IMPROVING PERFORMANCE OF RANDOMIZED SIGNATURE SORT USING HASHING AND BITWISE
More informationThe Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy
BMI Paper The Effects of Start Prices on the Performance of the Certainty Equivalent Pricing Policy Faculty of Sciences VU University Amsterdam De Boelelaan 1081 1081 HV Amsterdam Netherlands Author: R.D.R.
More informationSmart Queue Scheduling for QoS Spring 2001 Final Report
ENSC 833-3: NETWORK PROTOCOLS AND PERFORMANCE CMPT 885-3: SPECIAL TOPICS: HIGH-PERFORMANCE NETWORKS Smart Queue Scheduling for QoS Spring 2001 Final Report By Haijing Fang(hfanga@sfu.ca) & Liu Tang(llt@sfu.ca)
More informationA COOL AND PRACTICAL ALTERNATIVE TO TRADITIONAL HASH TABLES
A COOL AND PRACTICAL ALTERNATIVE TO TRADITIONAL HASH TABLES ULFAR ERLINGSSON, MARK MANASSE, FRANK MCSHERRY MICROSOFT RESEARCH SILICON VALLEY MOUNTAIN VIEW, CALIFORNIA, USA ABSTRACT Recent advances in the
More informationEmpirically Identifying the Best Genetic Algorithm for Covering Array Generation
Empirically Identifying the Best Genetic Algorithm for Covering Array Generation Liang Yalan 1, Changhai Nie 1, Jonathan M. Kauffman 2, Gregory M. Kapfhammer 2, Hareton Leung 3 1 Department of Computer
More informationAcing Math (One Deck At A Time!): A Collection of Math Games. Table of Contents
Table of Contents Introduction to Acing Math page 5 Card Sort (Grades K - 3) page 8 Greater or Less Than (Grades K - 3) page 9 Number Battle (Grades K - 3) page 10 Place Value Number Battle (Grades 1-6)
More informationReinvent your storage infrastructure for e-business
Reinvent your storage infrastructure for e-business Paul Wang SolutionSoft Systems, Inc. 2345 North First Street, Suite 210 San Jose, CA 95131 pwang@solution-soft.com 408.346.1400 Abstract As the data
More informationGenetic Algorithms and Sudoku
Genetic Algorithms and Sudoku Dr. John M. Weiss Department of Mathematics and Computer Science South Dakota School of Mines and Technology (SDSM&T) Rapid City, SD 57701-3995 john.weiss@sdsmt.edu MICS 2009
More informationMemory Allocation Technique for Segregated Free List Based on Genetic Algorithm
Journal of Al-Nahrain University Vol.15 (2), June, 2012, pp.161-168 Science Memory Allocation Technique for Segregated Free List Based on Genetic Algorithm Manal F. Younis Computer Department, College
More informationAdaptive Tolerance Algorithm for Distributed Top-K Monitoring with Bandwidth Constraints
Adaptive Tolerance Algorithm for Distributed Top-K Monitoring with Bandwidth Constraints Michael Bauer, Srinivasan Ravichandran University of Wisconsin-Madison Department of Computer Sciences {bauer, srini}@cs.wisc.edu
More informationSo today we shall continue our discussion on the search engines and web crawlers. (Refer Slide Time: 01:02)
Internet Technology Prof. Indranil Sengupta Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No #39 Search Engines and Web Crawler :: Part 2 So today we
More informationCAD Algorithms. P and NP
CAD Algorithms The Classes P and NP Mohammad Tehranipoor ECE Department 6 September 2010 1 P and NP P and NP are two families of problems. P is a class which contains all of the problems we solve using
More informationParallel Scalable Algorithms- Performance Parameters
www.bsc.es Parallel Scalable Algorithms- Performance Parameters Vassil Alexandrov, ICREA - Barcelona Supercomputing Center, Spain Overview Sources of Overhead in Parallel Programs Performance Metrics for
More informationComparing Alternate Designs For A Multi-Domain Cluster Sample
Comparing Alternate Designs For A Multi-Domain Cluster Sample Pedro J. Saavedra, Mareena McKinley Wright and Joseph P. Riley Mareena McKinley Wright, ORC Macro, 11785 Beltsville Dr., Calverton, MD 20705
More informationProblems, Methods and Tools of Advanced Constrained Scheduling
Problems, Methods and Tools of Advanced Constrained Scheduling Victoria Shavyrina, Spider Project Team Shane Archibald, Archibald Associates Vladimir Liberzon, Spider Project Team 1. Introduction In this
More informationUSING BACKTRACKING TO SOLVE THE SCRAMBLE SQUARES PUZZLE
USING BACKTRACKING TO SOLVE THE SCRAMBLE SQUARES PUZZLE Keith Brandt, Kevin R. Burger, Jason Downing, Stuart Kilzer Mathematics, Computer Science, and Physics Rockhurst University, 1100 Rockhurst Road,
More informationAn Empirical Study of Two MIS Algorithms
An Empirical Study of Two MIS Algorithms Email: Tushar Bisht and Kishore Kothapalli International Institute of Information Technology, Hyderabad Hyderabad, Andhra Pradesh, India 32. tushar.bisht@research.iiit.ac.in,
More informationIf A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C?
Problem 3 If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C? Suggested Questions to ask students about Problem 3 The key to this question
More informationAnomaly Detection in Predictive Maintenance
Anomaly Detection in Predictive Maintenance Anomaly Detection with Time Series Analysis Phil Winters Iris Adae Rosaria Silipo Phil.Winters@knime.com Iris.Adae@uni-konstanz.de Rosaria.Silipo@knime.com Copyright
More informationWhat mathematical optimization can, and cannot, do for biologists. Steven Kelk Department of Knowledge Engineering (DKE) Maastricht University, NL
What mathematical optimization can, and cannot, do for biologists Steven Kelk Department of Knowledge Engineering (DKE) Maastricht University, NL Introduction There is no shortage of literature about the
More informationAn approach of detecting structure emergence of regional complex network of entrepreneurs: simulation experiment of college student start-ups
An approach of detecting structure emergence of regional complex network of entrepreneurs: simulation experiment of college student start-ups Abstract Yan Shen 1, Bao Wu 2* 3 1 Hangzhou Normal University,
More informationR-trees. R-Trees: A Dynamic Index Structure For Spatial Searching. R-Tree. Invariants
R-Trees: A Dynamic Index Structure For Spatial Searching A. Guttman R-trees Generalization of B+-trees to higher dimensions Disk-based index structure Occupancy guarantee Multiple search paths Insertions
More informationResource Allocation Schemes for Gang Scheduling
Resource Allocation Schemes for Gang Scheduling B. B. Zhou School of Computing and Mathematics Deakin University Geelong, VIC 327, Australia D. Walsh R. P. Brent Department of Computer Science Australian
More information8. KNOWLEDGE BASED SYSTEMS IN MANUFACTURING SIMULATION
- 1-8. KNOWLEDGE BASED SYSTEMS IN MANUFACTURING SIMULATION 8.1 Introduction 8.1.1 Summary introduction The first part of this section gives a brief overview of some of the different uses of expert systems
More informationSnapshots in the Data Warehouse BY W. H. Inmon
Snapshots in the Data Warehouse BY W. H. Inmon There are three types of modes that a data warehouse is loaded in: loads from archival data loads of data from existing systems loads of data into the warehouse
More informationIn the IEEE Standard Glossary of Software Engineering Terminology the Software Life Cycle is:
In the IEEE Standard Glossary of Software Engineering Terminology the Software Life Cycle is: The period of time that starts when a software product is conceived and ends when the product is no longer
More informationA Review of Data Mining Techniques
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 4, April 2014,
More informationEvaluation of a New Method for Measuring the Internet Degree Distribution: Simulation Results
Evaluation of a New Method for Measuring the Internet Distribution: Simulation Results Christophe Crespelle and Fabien Tarissan LIP6 CNRS and Université Pierre et Marie Curie Paris 6 4 avenue du président
More information