Metamodeling by using Multiple Regression Integrated K-Means Clustering Algorithm
Emre Irfanoglu, Ilker Akgun, Murat M. Gunal
Institute of Naval Science and Engineering, Turkish Naval Academy, Tuzla, Istanbul, TURKEY

Keywords: simulation optimization, K-means clustering, metamodel, multiple regression

Abstract

A metamodel in simulation modeling, also known as a response surface, emulator, or auxiliary model, relates a simulation model's outputs to its inputs without the need for further experimentation. A metamodel is essentially a regression model and is often described as the model of a simulation model. It may be used for validation and verification, sensitivity or what-if analysis, and optimization of a simulation model. In this study, we propose a new metamodeling approach that integrates multiple regression with the K-means clustering algorithm, aimed especially at simulation optimization. Our aim is to evaluate the feasibility of creating multiple metamodels by clustering the input-output variables of a simulation model according to their similarities. In this approach, first, we run the simulation model of a system; second, we apply the K-means clustering algorithm and create a metamodel for each cluster; and third, we seek the minima (or maxima) of each metamodel. We tested our approach on a fictitious call center. We observed that the approach increases the accuracy of a metamodel and decreases the sum of squared errors. These observations give some insight into the usefulness of clustering in metamodeling for simulation optimization.

1. INTRODUCTION

Coupling the speed of optimization techniques with the flexibility of simulation has given rise to a research area called Simulation Optimization (SimOpt), which has also affected practice [1-3]. In the history of Operational Research, SimOpt methods started to appear in the 1990s, with the basic idea of merging the advantages of simulation modeling with those of optimization.
Simulation methods are known for their flexibility in tackling the complexity of systems. Although simulation models require extensive amounts of data, they help decision makers make better decisions. Optimization methods, on the other hand, are less flexible in modeling complexity, but once built, they generate more accurate and faster results than simulation does. Many methods for SimOpt have been developed, mainly in four categories: gradient-based and random search algorithms, evolutionary algorithms and metaheuristics, mathematical programming-based approaches, and statistical search techniques.

In this study, we suggest a four-phase approach to improve the metamodeling process for SimOpt. Our approach includes simulation experimentation, clustering, metamodeling, and optimization. In the first phase, conventional simulation experimentation techniques are used. Note that we assume we have a simulation model of a typical call center system, and we aim to optimize some objective function. In the second phase, we apply a clustering algorithm (K-means) to the simulation inputs. In the third phase, a metamodel is developed for each cluster. Finally, we apply optimization techniques to each metamodel. Unlike classical metamodeling in SimOpt, we apply clustering before fitting the multiple regression metamodel and generate one metamodel per cluster instead of one metamodel for all data.

We first review some of the SimOpt methods in the literature in Section 2. In Sections 3 and 4, we give brief information about metamodels and clustering. In Section 5, we present our proposed approach and compare it with the classical approach. To show an application of the proposed approach, we experimented with a call center simulation model and showed that clustered metamodels outperform the classical approach.
2. REVIEW OF SIMULATION OPTIMIZATION TECHNIQUES

In SimOpt, a simulation model is used to estimate the performance of a system; based on that estimate, an optimization algorithm then searches for new input values that maximize or minimize the estimated performance. As in conventional optimization models, the input values, or decision variables, are constrained. The iterative nature of this approach generally makes the simulation model a bottleneck, and therefore the performance of the model is significant. We review some of the well-known simulation optimization techniques as follows:

Gradient-based and random search algorithms (e.g. stochastic approximation): Gradient-based search methods are optimization techniques that use the gradient of the objective function to find an optimal solution [4]. In each iteration of the algorithm, the values of the decision variables are adjusted so that the simulation produces a lower objective function value. Gradient-based methods work well in high-dimensional spaces provided that these spaces do not have local minima. The drawback is that, when local minima exist, the global minimum is likely to remain unfound.

Evolutionary algorithms and metaheuristics (e.g. Genetic Algorithms, Tabu Search and Simulated Annealing): Heuristic-based methods strike a balance between exploration and exploitation. This balance permits the identification of local minima, but encourages the discovery of a globally optimal solution []. Heuristic techniques generate good candidate solutions when the search space is large and nonlinear.

Mathematical programming-based approaches (e.g. the Sample Path Method): Sample path optimization (also known as stochastic counterpart or sample average approximation; see [6]) runs many simulations first, and then optimizes the resulting estimates by using conventional mathematical programming solution algorithms.

Statistical search techniques (e.g. Sequential Response Surface Methodology): Response surface methodology (RSM) is a statistical method for fitting a series of regression models to the output of a simulation model []. The goal of RSM is to construct a functional relationship between the decision variables and the output, to demonstrate how changes in the values of the decision variables affect the output. Relationships constructed from RSM are often called metamodels [7].
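The idea of fitting a regression model to simulation output, as described for RSM above, can be sketched as follows. This is only a minimal illustration under stated assumptions: the single input `xs` and the responses `ys` are made-up numbers standing in for simulation results (they are not data from this study), and the helper names `solve_linear_system` and `fit_quadratic` are hypothetical. The fit uses ordinary least squares via the normal equations.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x^2 (normal equations)."""
    rows = [[1.0, x, x * x] for x in xs]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Hypothetical, noise-free "simulation outputs" generated from a known curve,
# so the recovered coefficients are easy to check.
xs = [0, 1, 2, 3, 4, 5]
ys = [2 + 3 * x + 0.5 * x * x for x in xs]
coeffs = fit_quadratic(xs, ys)  # approximately [2.0, 3.0, 0.5]
```

With real simulation output the responses are noisy, so the fitted coefficients would only approximate the underlying relationship, and a statistics package would normally be used instead of hand-written elimination.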
RSM usually consists of a screening phase that eliminates unimportant variables in the simulation [8]. After the screening phase, linear models are used to build a surface and find the region of optimality. Then, second or higher order models are fitted to find the optimal values of the decision variables. The eventual objective of RSM is to determine the optimum operating conditions for the system, or to determine a region of the factor space in which operating requirements are satisfied [9].

In the formal application of RSM for optimization, and for design of experiments in general, one of the most important steps is factor screening: the initial identification of the "important" parameters, those factors that have the greatest influence on the response. However, in our discussion of optimization of discrete-event simulation models, we assume that this has already been determined. In most discrete-event system applications this is usually the case, since there are underlying analytic models that give a rough idea of the influence of the various parameters. For example, in manufacturing systems and telecommunications networks, the analyst knows from queuing network models which routing probabilities and service times have an effect on the performance measures of interest. RSM procedures usually presuppose a more "black box" approach to the problem, so it is unclear a priori which factors are of importance at all [10]. Additionally, Fu [10] classifies the application of RSM in two main categories: metamodels and sequential procedures. Metamodels are special cases of the RSM representation, and therefore the remainder of this paper uses the term metamodel rather than RSM.

3. METAMODEL

A metamodel is a polynomial model that relates the input-output behavior of a simulation model. A metamodel is often a least squares regression model of the form given in Eq. (1):

$$E(y) = \beta_0 + \sum_{i=1}^{k} \beta_i x_i + \sum_{i=1}^{k} \beta_{ii} x_i^2 + \sum_{i=1}^{k} \sum_{j=1,\, j>i}^{k} \beta_{ij} x_i x_j + \dots \tag{1}$$

where $\beta_i$, $\beta_{ii}$, and $\beta_{ij}$ represent regression coefficients, $x_i$ ($i = 1, \dots, k$) are design variables, and $y$ is the response. The simple form of a metamodel can reveal the general characteristics of behavior in complex simulation models. The objective of a metamodel is to effectively relate the output data of a simulation model to the model's input, to aid in the purpose for which the simulation model was developed [11].

Since our aim in this study is to form metamodels by using clustering algorithms, we review the related literature in the following section. Note that we aim at classifying the input variables according to their similarities; after clustering the data, there will be n grouped (clustered) data sets and hence n metamodels. We discuss the details of this approach after introducing the clustering algorithms.

4. CLUSTERING

Clustering is a way to examine similarities and dissimilarities of observations or objects. Data often fall naturally into groups, or clusters, of observations, where the characteristics of objects in the same cluster are similar and the characteristics of objects in different clusters are dissimilar. Both the similarity and the dissimilarity should be examinable in a clear and meaningful way. Measures of similarity depend on the application.

Clustering is widespread, and a wealth of clustering algorithms has been developed to solve different problems in specific fields. However, there is no clustering algorithm that can be universally used to solve all problems [12]. Clustering has been applied in a wide range of areas, from engineering (machine learning, artificial intelligence, pattern recognition, mechanical engineering, electrical engineering), computer sciences (web mining, spatial database analysis, textual document collection, image segmentation), and life and medical sciences (genetics, biology, microbiology, paleontology, psychiatry, clinic, pathology), to earth sciences (geography, geology, remote sensing), social sciences (sociology, psychology, archeology, education), and economics (marketing, business) [13-14].

There are two common clustering techniques, distinguished by the properties of the clusters generated [13-15]: hierarchical clustering and partitioned clustering. Hierarchical clustering groups the data over a variety of scales by creating a cluster tree. The tree is not a single set of clusters, but rather a multilevel hierarchy, where clusters at one level are joined as clusters at the next level. This allows you to decide the level or scale of clustering that is most appropriate for your application. In partitioned clustering, the data objects are divided into a specified number of clusters. The K-means clustering algorithm is one of the well-known methods in this category. K-means partitions data into k mutually exclusive clusters and returns the index of the cluster to which each observation has been assigned. We used this technique in our methodology to cluster the simulation inputs.
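The partitioning behavior described above can be sketched with a plain implementation of the standard K-means iteration (assign each point to its nearest centroid, then recompute centroids as cluster means). This is an illustrative sketch, not the implementation used in the study: the function names, the squared Euclidean distance, the explicit initial centroids, and the six sample points are all assumptions made for the example.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, initial_centroids, max_iterations=100):
    """Return (labels, centroids) after iterating until assignments stop changing."""
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # no point changed cluster, so the partition is stable
        labels = new_labels
        # Update step: each centroid becomes the mean of its member points.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Hypothetical two-dimensional inputs forming two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = k_means(points, initial_centroids=[points[0], points[3]])
# labels holds the cluster index of each observation.
```

The result depends on the initial centroids, which is why practical tools usually repeat the procedure from several starting points and keep the best partition.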
The K-means clustering algorithm treats each observation in the data as an object having a location in space. It finds a partition in which objects within each cluster are as close to each other as possible, and as far from objects in other clusters as possible. There are several different distance measures, depending on the kind of data you are clustering. Each cluster in the partition is defined by its member objects and by its centroid, or center point. The centroid of each cluster is the point to which the sum of distances from all objects in that cluster is minimized. K-means computes the cluster centroids differently for each distance measure, to minimize the sum with respect to the specified measure.

K-means uses an iterative algorithm that minimizes the sum of distances from each object to its cluster centroid, over all clusters. The algorithm moves objects between clusters until the sum cannot be decreased further. The result is a set of clusters that are as compact and well-separated as possible. An example of clustered data points is shown in Figure 1.

Figure 1. An example of clustered data points (taken from a Matlab example).

5. PROPOSED APPROACH

In this study, we propose a new metamodeling approach that integrates multiple regression with the K-means clustering algorithm, aimed especially at simulation optimization. Our approach works in four phases: Experimentation, Clustering, Metamodeling, and Optimization. There are ten steps in total, as presented in Figure 2. We assume that we have a simulation model built for the system for which we want to find the optimum values of some decision variables. Note that in this case, the decision variables are simulation model inputs.

In the experimentation phase, the modeler designs the experiments according to the size of the search space. For example, if there are n input variables and we decide to run a low and a high value of each variable, we end up with 2^n factorial experiments. In some cases, that many experiments may not be enough and more experimentation might be required.
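As a small illustration of the experimentation phase, the sketch below enumerates a two-level full factorial design, that is, every combination of the low and high value of each input. The variable names and bounds are hypothetical placeholders, not the values used in the call center study.

```python
from itertools import product


def two_level_factorial(bounds):
    """Return all low/high combinations for the given {name: (low, high)} bounds."""
    names = list(bounds)
    levels = [bounds[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*levels)]


# Hypothetical decision variables with assumed (low, high) levels.
bounds = {
    "sales_staff": (0, 4),
    "tech_staff": (0, 6),
    "trunk_lines": (10, 30),
}
design = two_level_factorial(bounds)
# Three variables at two levels each give 2**3 = 8 experiments.
```

Because the number of runs doubles with every added variable, a full factorial design quickly becomes expensive, which is one reason an existing set of experiments may be reused instead.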
In the second phase, we cluster the simulation inputs. This is an iterative process: in each iteration we check a performance criterion, and if the criterion for clustering is below the acceptable level, we increase the number of clusters. For example, in the K-means clustering method, the performance criterion is the silhouette value.

In the third phase, we develop a metamodel for each cluster. As in the previous phase, we check some quality measure of the metamodels, for example the R-square value. The purpose of a metamodel is to estimate output values without further simulation experimentation. Therefore, after this phase, we can estimate simulation outputs without running the model. However, since we have multiple metamodels, we need to determine rules for when to use each metamodel. These rules might be based on the limits of the simulation inputs.

The final phase is the optimization phase. Based on the objective function, we seek the minima or maxima of each metamodel. This requires differentiating the regression model and setting the derivative to zero to find the roots of the equation. Then, we choose the minimum or maximum among the clusters' optimum values.

6. APPLICATION

6.1. Problem Definition

To test our approach, we used an example call center simulation model created with the ARENA program [16]. We chose this model to benchmark our methodology. Therefore, the model structure and its parameter values are taken from the original problem definition as written in [16].

The call center provides technical support, sales information, and order processing for a company. Calls arrive at the call center with exponentially distributed interarrival times with a mean of 0.87 minute. The call center has 6 trunk lines, which means that at most 6 calls can be in progress concurrently. If all lines are busy, the next arriving call is rejected.
An incoming call is directed to one of three options: technical support, sales information, or order status inquiry. Their percentages are 76, 16, and 8 respectively. The estimated time for this activity is UNIF(0.1, 0.6); all times are in minutes.

Figure 2. Flowchart of the proposed methodology

For technical support calls, a recorded welcome message is played first, which takes UNIF(0.1, 0.) minutes. During this message, the caller is expected to choose one of the three product types. The percentages of product types 1, 2 and 3 are 25, 34 and 41 respectively. If a qualified technical support person is available for the selected product type, the call is automatically routed to that person. Otherwise, the customer is placed in an electronic queue until a support person is available. All technical support call durations are triangularly distributed with parameters 3, 6, 18 minutes. After being served, the caller exits the system.

The second type of call is the sales call. These calls are routed to the sales staff. A sales call duration is triangularly distributed with the parameters 4, 1, 4 minutes. As in technical support, the caller leaves the system after completion of the call.

The third type of call, order status, is handled by computers. However, some customers may ask to talk to a real operator. This happens in 1 of this type of calls. Order status calls are also triangularly distributed, with parameters , 3, 4 minutes. Note that when these calls are placed in a queue for a real operator, they have lower priority than sales calls. An operator handles these calls with triangularly distributed times (3,, 10 minutes). These callers then exit the system.

In our base experimentation, there are 11 technical support employees to answer the technical support calls. Two are qualified to handle calls for product Type 1 only, three for product Type 2 only, three for product Type 3 only, two for product Types 1 and 3 only, and one is qualified to handle calls for all three product types. There are four employees to answer the sales calls and those order-status calls that ask to speak to a real person.

Our main output variable is the total cost, which includes three types of cost: (1) staffing and resource costs, (2) costs due to poor customer service, and (3) costs of rejected calls. A sales staff member costs $0/hour and a tech-support staff member costs $18-$0/hour, depending on their level of training and flexibility. The second type of cost is incurred by making customers wait on hold. In a call center, at some point callers start to get angry, and from then on the system incurs a cost.
Although it is difficult to measure this cost, we assumed that for tech calls this point is 3 minutes; for sales calls, it's 1 minute; and for order status it's minutes. Beyond this tolerance point for each call type, the system incurs a cost of 36.8 cents/minute for tech calls, 81.8 cents/minute for sales calls, and 34.6 cents/minute for order status calls. For rejected calls, it is required that no more than of incoming calls get a busy signal; any model configuration not meeting this requirement is regarded as unacceptable. Since the number of trunk lines affects the number of rejected calls, each trunk line incurs a cost of $98/week.

In the optimization part, we used this call center simulation model to find the minimum total cost while holding the percentage of rejected calls to and less. The decision variables and their lower and upper bounds are shown in Table 1. There are two constraints in the problem definition: first, the number of trunk lines must be between 6 and 0; second, the call center can accommodate 1 operators at most.

6.2. Steps of the Methodology

Step-1 Specify the decision variables: We chose the six decision variables shown in Table 1, which affect our performance criterion (the total cost).

Table 1. Decision variables and their lower and upper bounds

Decision Variable    Lower Bound    Upper Bound
New Sales (1)        0              1
New Tech 1 (2)       0              1
New Tech 2 (3)       0              1
New Tech 3 (4)       0              1
New Tech All (5)     0              1
Trunk Line (6)       6              0

Step-2 Simulation Experimentation: For this stage, instead of designing our own experiments, we chose the experiments already specified by Arena's OptQuest. To ease the process, we first ran OptQuest for 00 experiments to find the optimum. As a result, OptQuest found the values in Table 2 with an objective function value of $1,017. The run length for the model is 1000 hours and we made 10 replications in each experiment.

Table 2. Minimum total cost and values of decision variables via OptQuest

Step-3 Evaluate the Simulation Output: 16 experiments among the 00 experimental results were removed since they were in the infeasible region.

Step-4 Determine the Number of Clusters: In this step, we cluster the inputs of the simulation model by examining the silhouette values. The silhouette plot displays a measure of how close each point in one cluster is to points in the neighboring clusters. The silhouette value ranges from +1 to -1: +1 indicates points that are very distant from the neighboring clusters, 0 indicates points that are not distinctly in one cluster or another, and -1 indicates points that are assigned to the wrong cluster. The value is defined as

$$S(i) = \frac{\min_k b(i,k) - a(i)}{\max\{a(i),\ \min_k b(i,k)\}}$$

where a(i) is the average distance from the ith point to the other points in its cluster, and b(i,k) is the average distance from the ith point to the points in another cluster k.
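The silhouette definition above can be computed directly, as in the following sketch. It assumes Euclidean distance and at least two clusters, treats a point that is alone in its cluster as having silhouette 0 (a common convention, assumed here), and uses four made-up points; the function names are hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def silhouette_values(points, labels):
    """Return S(i) for every point; requires at least two distinct labels."""
    clusters = sorted(set(labels))
    values = []
    for i, p in enumerate(points):
        own = [q for j, q in enumerate(points) if labels[j] == labels[i] and j != i]
        if not own:
            values.append(0.0)  # assumed convention for a single-member cluster
            continue
        # a(i): average distance to the other points in the same cluster.
        a = sum(euclidean(p, q) for q in own) / len(own)
        # min over k of b(i,k): smallest average distance to another cluster.
        b = min(
            sum(euclidean(p, q) for q, lab in zip(points, labels) if lab == k)
            / labels.count(k)
            for k in clusters
            if k != labels[i]
        )
        values.append((b - a) / max(a, b))
    return values


# Hypothetical points: two tight pairs that are far apart from each other.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_values(points, [0, 0, 1, 1])  # natural grouping
bad = silhouette_values(points, [0, 1, 0, 1])   # deliberately poor grouping
mean_good = sum(good) / len(good)
mean_bad = sum(bad) / len(bad)
```

Comparing the mean silhouette across candidate numbers of clusters, as done in the following steps, amounts to repeating this calculation for each partition and keeping the one with the highest mean.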
Step-5 Cluster Simulation Inputs: We clustered the simulation inputs using the Euclidean distance between the inputs. Here, we clustered the inputs into up to 8 clusters to compare the silhouette plots.

Step-6 Cluster Validation: To validate the clusters, we analyzed the silhouette plots and means. Here, the best plot belongs to the 5-cluster partition (mean 0.), as shown in Figure 3. Therefore, we end up with 5 metamodels.

Figure 3. Silhouette plot for the experiments (horizontal axis: silhouette value; vertical axis: cluster).

Step-7 Create Metamodel of Every Cluster: We created the metamodels by using Minitab [17], according to the number of clusters found in Step-6. Equations (2) to (6) show the metamodel of each cluster.

[Equations (2)-(6): regression metamodels of Clusters 1 to 5; the coefficients are illegible in the source.]

Step-8 Evaluate the Results: In this step, we evaluate the metamodels of Step-7 by conducting some statistical tests (ANOVA, R-square, residual sum of squares). The metamodels' corresponding R-Square values are 79,83, 63,8, 6,7, 67,70 and 96.6 respectively. Additionally, the square roots of the mean square errors (MSE) are given in Table 3. When we compare the R-Square and MSE values with those of the single metamodel, that is, the classic metamodel without clustering, we see that the single metamodel's R-Square value is 81.1; its MSE is listed in Table 3.

Table 3. Statistical results of proposed approach and classic metamodel

[Table 3 columns: Method (proposed approach, Cluster-1 to Cluster-5; classic metamodel), R-square, MSE, F and p value; the cell values are illegible in the source.]

Step-9 Find the Optimum of Each Metamodel: To optimize the objective functions of the five metamodels, we used Matlab's [19] Optimization Tool. Table 4 shows the minimum total costs and the values of the decision variables.
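To make the optimization of a single metamodel concrete, the sketch below minimizes a quadratic metamodel over box bounds with projected gradient descent. It is a stand-in for illustration only: the study used Matlab's Optimization Tool, while the two-variable metamodel, its coefficients, the bounds, the step size, and the function names here are all invented for the example.

```python
def metamodel(x):
    """Hypothetical quadratic cost metamodel; the coefficients are made up."""
    x1, x2 = x
    return 50 + (x1 - 3) ** 2 + 2 * (x2 - 12) ** 2


def gradient(x):
    """Analytic gradient of the hypothetical metamodel above."""
    x1, x2 = x
    return [2 * (x1 - 3), 4 * (x2 - 12)]


def minimize_in_box(grad, start, bounds, step=0.1, iterations=500):
    """Projected gradient descent: take a gradient step, then clip to the bounds."""
    x = list(start)
    for _ in range(iterations):
        g = grad(x)
        x = [
            min(max(xi - step * gi, lo), hi)
            for xi, gi, (lo, hi) in zip(x, g, bounds)
        ]
    return x


# Assumed lower and upper bounds for the two decision variables.
bounds = [(0.0, 10.0), (0.0, 10.0)]
best = minimize_in_box(gradient, start=[5.0, 5.0], bounds=bounds)
best_cost = metamodel(best)
# The unconstrained minimum is at (3, 12); the bound on x2 moves it to (3, 10).
```

In the proposed approach this search would be repeated for each cluster's metamodel, and the candidate with the lowest cost would then be checked against the simulation model. Since staffing levels and trunk lines are whole numbers, a continuous optimum would also need to be rounded or searched over integers before testing.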
Table 4. Objective functions and decision variables values

Method       Obj. Func. Value    Decision Variables [1;2;3;4;5;6]    Tested Obj. Func.
OptQuest     $1017               [3;0;0;0;3;9]                       -
Cluster-1    $1394               [3.76;0;6.17;4;8;0]                 $870
Cluster-2    $484                [3.4;0.9;1.4;1.9;6.83;0]            $6343
Cluster-3    $3994               [3.;0;3.9;6;0;0]                    $646
Cluster-4    $1888               [7.;0;4;.6;0;41]                    $171
Cluster-5    $034                [4;0;0;0;;9]                        $1986

Step-10 Test the Optimum by Using Simulation Model: We tested the optimum of each cluster obtained in Step-9 by using the Arena simulation model. Note that the minimum total cost belongs to the Cluster- 's metamodel, as shown in Table-3. After running those decision variable values in our call center simulation model, the result is $1646 ("Tested Obj. Func." column), which is close to the minimum total cost that OptQuest finds, $ .

7. CONCLUSION

Simulation optimization techniques have developed significantly in the last two decades. In this study, we aim to contribute to the literature by proposing a new approach in which the K-means clustering algorithm is integrated into metamodeling. We tested the proposed approach by using a call center simulation model. In this example we used 00 scenarios created by the Arena OptQuest optimization tool, and then clustered the inputs into five groups. The clusters helped to create plausible metamodels with satisfactory R-Square and MSE values and near-optimal solutions. This gives us an indication of the advantage of the proposed approach. When the solution space is large and searching is costly, the proposed approach can be used as an alternative to heuristic search algorithms. However, to generalize the usefulness of this approach, we aim to study more cases in the future.

8. ACKNOWLEDGMENTS

The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of any affiliated organization or government.

9. REFERENCES

[1] Tekin, E. and Sabuncuoglu, I., 2004. Simulation Optimization: A Comprehensive Review on Theory and Applications. IIE Transactions, 36.
[2] Law, M. and Kelton, W. D., 2001. Simulation Modeling and Analysis, McGraw-Hill, Second Edition, United States.
[3] Fu, M., 2002. Optimization for Simulation: Theory vs. Practice. INFORMS Journal on Computing, 14(3): 192-215.
[4] Waziruddin, S., Brogan, D. C. and Reynolds, P. F.: Coercion through Optimization: A Classification of Optimization Techniques. Proceedings of the 2004 Fall Simulation Interoperability Workshop, Orlando, FL, September 2004.
[5] Carson, Y. and Maria, A.: Simulation Optimization: Methods and Applications. Proceedings of the 1997 Winter Simulation Conference.
[6] Rubinstein, R. Y. and Shapiro, A.: Discrete Event Systems: Sensitivity Analysis and Stochastic Optimization by the Score Function Method. New York: John Wiley & Sons.
[7] Fu, M.: Simulation Optimization. Proceedings of the 2001 Winter Simulation Conference, 2001.
[8] Myers, R. H. and Montgomery, D. C.: Response Surface Methodology: Process and Product Optimization Using Designed Experiments, Wiley-Interscience, 2002.
[9] Montgomery, D. C. (1991) Design and Analysis of Experiments, John Wiley & Sons, New York, NY.
[10] Fu, M. C. (1994) Optimization via simulation: A review. Annals of Operations Research, 53.
[11] Sargent, R. G.: Research Issues in Metamodeling. Proceedings of the 1991 Winter Simulation Conference.
[12] Xu, R.: Survey of Clustering Algorithms. IEEE Transactions on Neural Networks, Vol. 16, No. 3, May 2005.
[13] Everitt, B., Landau, S. and Leese, M.: Cluster Analysis. London: Arnold, 2001.
[14] Hartigan, J.: Clustering Algorithms. New York: Wiley, 1975.
[15] Jain, A., Murty, M. and Flynn, P.: Data clustering: A review. ACM Comput. Surv., vol. 31, no. 3.
[16] Kelton, W. D., Sadowski, R. P. and Sturrock, D. T.: Simulation with Arena, McGraw-Hill, Fourth Edition, United States.
[17] Minitab, [accessed Jan. 2013]
[18] Arena Simulation Software, [accessed Jan. 2013]
[19] Matlab, [accessed Jan. 2013]
Biography

Emre İrfanoglu is pursuing his MSc in Naval Operations Research at the Institute of Naval Science and Engineering. He holds a BSc degree in Industrial Engineering, which he received in 00 from the Turkish Naval Academy.

Ilker Akgun is an assistant professor at the Turkish Naval Academy. He completed his PhD at Istanbul Technical University and his MSc studies at Middle East Technical University in 01 and 00 respectively.

Murat Gunal is an assistant professor at the Turkish Naval Academy. He completed his PhD and MSc studies at Lancaster University, UK, in 2008 and 2000 respectively. His PhD thesis is titled "Simulation Modelling for Performance Measurement in Hospitals". He has done research and worked in the simulation field for many years.
More informationUSING OPNET TO SIMULATE THE COMPUTER SYSTEM THAT GIVES SUPPORT TO AN ON-LINE UNIVERSITY INTRANET
USING OPNET TO SIMULATE THE COMPUTER SYSTEM THAT GIVES SUPPORT TO AN ON-LINE UNIVERSITY INTRANET Norbert Martínez 1, Angel A. Juan 2, Joan M. Marquès 3, Javier Faulin 4 {1, 3, 5} [ norbertm@uoc.edu, jmarquesp@uoc.edu
More informationSIMULATION FOR IT SERVICE DESK IMPROVEMENT
QUALITY INNOVATION PROSPERITY/KVALITA INOVÁCIA PROSPERITA XVIII/1 2014 47 SIMULATION FOR IT SERVICE DESK IMPROVEMENT DOI: 10.12776/QIP.V18I1.343 PETER BOBER Received 7 April 2014, Revised 30 June 2014,
More informationCREATING VALUE WITH BUSINESS ANALYTICS EDUCATION
ISAHP Article: Ozaydin, Ulengin/Creating Value with Business Analytics Education, Washington D.C., U.S.A. CREATING VALUE WITH BUSINESS ANALYTICS EDUCATION Ozay Ozaydin Faculty of Engineering Dogus University
More informationAn Analysis on Density Based Clustering of Multi Dimensional Spatial Data
An Analysis on Density Based Clustering of Multi Dimensional Spatial Data K. Mumtaz 1 Assistant Professor, Department of MCA Vivekanandha Institute of Information and Management Studies, Tiruchengode,
More informationPower Prediction Analysis using Artificial Neural Network in MS Excel
Power Prediction Analysis using Artificial Neural Network in MS Excel NURHASHINMAH MAHAMAD, MUHAMAD KAMAL B. MOHAMMED AMIN Electronic System Engineering Department Malaysia Japan International Institute
More informationPrinciples of Data Mining by Hand&Mannila&Smyth
Principles of Data Mining by Hand&Mannila&Smyth Slides for Textbook Ari Visa,, Institute of Signal Processing Tampere University of Technology October 4, 2010 Data Mining: Concepts and Techniques 1 Differences
More informationLecture. Simulation and optimization
Course Simulation Lecture Simulation and optimization 1 4/3/2015 Simulation and optimization Platform busses at Schiphol Optimization: Find a feasible assignment of bus trips to bus shifts (driver and
More informationEnhanced Boosted Trees Technique for Customer Churn Prediction Model
IOSR Journal of Engineering (IOSRJEN) ISSN (e): 2250-3021, ISSN (p): 2278-8719 Vol. 04, Issue 03 (March. 2014), V5 PP 41-45 www.iosrjen.org Enhanced Boosted Trees Technique for Customer Churn Prediction
More informationA Genetic Algorithm Approach for Solving a Flexible Job Shop Scheduling Problem
A Genetic Algorithm Approach for Solving a Flexible Job Shop Scheduling Problem Sayedmohammadreza Vaghefinezhad 1, Kuan Yew Wong 2 1 Department of Manufacturing & Industrial Engineering, Faculty of Mechanical
More informationHierarchical Cluster Analysis Some Basics and Algorithms
Hierarchical Cluster Analysis Some Basics and Algorithms Nethra Sambamoorthi CRMportals Inc., 11 Bartram Road, Englishtown, NJ 07726 (NOTE: Please use always the latest copy of the document. Click on this
More informationThe Applied and Computational Mathematics (ACM) Program at The Johns Hopkins University (JHU) is
The Applied and Computational Mathematics Program at The Johns Hopkins University James C. Spall The Applied and Computational Mathematics Program emphasizes mathematical and computational techniques of
More informationAachen Summer Simulation Seminar 2014
Aachen Summer Simulation Seminar 2014 Lecture 07 Input Modelling + Experimentation + Output Analysis Peer-Olaf Siebers pos@cs.nott.ac.uk Motivation 1. Input modelling Improve the understanding about how
More informationThere are a number of different methods that can be used to carry out a cluster analysis; these methods can be classified as follows:
Statistics: Rosie Cornish. 2007. 3.1 Cluster Analysis 1 Introduction This handout is designed to provide only a brief introduction to cluster analysis and how it is done. Books giving further details are
More informationGerard Mc Nulty Systems Optimisation Ltd gmcnulty@iol.ie/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I
Gerard Mc Nulty Systems Optimisation Ltd gmcnulty@iol.ie/0876697867 BA.,B.A.I.,C.Eng.,F.I.E.I Data is Important because it: Helps in Corporate Aims Basis of Business Decisions Engineering Decisions Energy
More informationAn Introduction to Data Mining
An Introduction to Intel Beijing wei.heng@intel.com January 17, 2014 Outline 1 DW Overview What is Notable Application of Conference, Software and Applications Major Process in 2 Major Tasks in Detail
More informationData Mining Applications in Higher Education
Executive report Data Mining Applications in Higher Education Jing Luan, PhD Chief Planning and Research Officer, Cabrillo College Founder, Knowledge Discovery Laboratories Table of contents Introduction..............................................................2
More informationStandardization and Its Effects on K-Means Clustering Algorithm
Research Journal of Applied Sciences, Engineering and Technology 6(7): 399-3303, 03 ISSN: 040-7459; e-issn: 040-7467 Maxwell Scientific Organization, 03 Submitted: January 3, 03 Accepted: February 5, 03
More informationA Robust Method for Solving Transcendental Equations
www.ijcsi.org 413 A Robust Method for Solving Transcendental Equations Md. Golam Moazzam, Amita Chakraborty and Md. Al-Amin Bhuiyan Department of Computer Science and Engineering, Jahangirnagar University,
More informationDATA MINING TECHNOLOGY. Keywords: data mining, data warehouse, knowledge discovery, OLAP, OLAM.
DATA MINING TECHNOLOGY Georgiana Marin 1 Abstract In terms of data processing, classical statistical models are restrictive; it requires hypotheses, the knowledge and experience of specialists, equations,
More informationAdvanced Ensemble Strategies for Polynomial Models
Advanced Ensemble Strategies for Polynomial Models Pavel Kordík 1, Jan Černý 2 1 Dept. of Computer Science, Faculty of Information Technology, Czech Technical University in Prague, 2 Dept. of Computer
More informationA New Approach for Evaluation of Data Mining Techniques
181 A New Approach for Evaluation of Data Mining s Moawia Elfaki Yahia 1, Murtada El-mukashfi El-taher 2 1 College of Computer Science and IT King Faisal University Saudi Arabia, Alhasa 31982 2 Faculty
More informationThe Predictive Data Mining Revolution in Scorecards:
January 13, 2013 StatSoft White Paper The Predictive Data Mining Revolution in Scorecards: Accurate Risk Scoring via Ensemble Models Summary Predictive modeling methods, based on machine learning algorithms
More informationA Complete Gradient Clustering Algorithm for Features Analysis of X-ray Images
A Complete Gradient Clustering Algorithm for Features Analysis of X-ray Images Małgorzata Charytanowicz, Jerzy Niewczas, Piotr A. Kowalski, Piotr Kulczycki, Szymon Łukasik, and Sławomir Żak Abstract Methods
More informationModeling Stochastic Inventory Policy with Simulation
Modeling Stochastic Inventory Policy with Simulation 1 Modeling Stochastic Inventory Policy with Simulation János BENKŐ Department of Material Handling and Logistics, Institute of Engineering Management
More informationDiscrete-Event Simulation
Discrete-Event Simulation Prateek Sharma Abstract: Simulation can be regarded as the emulation of the behavior of a real-world system over an interval of time. The process of simulation relies upon the
More informationENHANCED CONFIDENCE INTERPRETATIONS OF GP BASED ENSEMBLE MODELING RESULTS
ENHANCED CONFIDENCE INTERPRETATIONS OF GP BASED ENSEMBLE MODELING RESULTS Michael Affenzeller (a), Stephan M. Winkler (b), Stefan Forstenlechner (c), Gabriel Kronberger (d), Michael Kommenda (e), Stefan
More informationComparison of K-means and Backpropagation Data Mining Algorithms
Comparison of K-means and Backpropagation Data Mining Algorithms Nitu Mathuriya, Dr. Ashish Bansal Abstract Data mining has got more and more mature as a field of basic research in computer science and
More informationClassifying Large Data Sets Using SVMs with Hierarchical Clusters. Presented by :Limou Wang
Classifying Large Data Sets Using SVMs with Hierarchical Clusters Presented by :Limou Wang Overview SVM Overview Motivation Hierarchical micro-clustering algorithm Clustering-Based SVM (CB-SVM) Experimental
More informationInternational Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 3, May-Jun 2014
RESEARCH ARTICLE OPEN ACCESS A Survey of Data Mining: Concepts with Applications and its Future Scope Dr. Zubair Khan 1, Ashish Kumar 2, Sunny Kumar 3 M.Tech Research Scholar 2. Department of Computer
More informationManjeet Kaur Bhullar, Kiranbir Kaur Department of CSE, GNDU, Amritsar, Punjab, India
Volume 5, Issue 6, June 2015 ISSN: 2277 128X International Journal of Advanced Research in Computer Science and Software Engineering Research Paper Available online at: www.ijarcsse.com Multiple Pheromone
More informationChapter ML:XI (continued)
Chapter ML:XI (continued) XI. Cluster Analysis Data Mining Overview Cluster Analysis Basics Hierarchical Cluster Analysis Iterative Cluster Analysis Density-Based Cluster Analysis Cluster Evaluation Constrained
More informationWhite Paper Business Process Modeling and Simulation
White Paper Business Process Modeling and Simulation WP0146 May 2014 Bhakti Stephan Onggo Bhakti Stephan Onggo is a lecturer at the Department of Management Science at the Lancaster University Management
More informationLinear programming approach for online advertising
Linear programming approach for online advertising Igor Trajkovski Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University in Skopje, Rugjer Boshkovikj 16, P.O. Box 393, 1000 Skopje,
More informationData Mining Practical Machine Learning Tools and Techniques
Ensemble learning Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 8 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Combining multiple models Bagging The basic idea
More informationModelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches
Modelling, Extraction and Description of Intrinsic Cues of High Resolution Satellite Images: Independent Component Analysis based approaches PhD Thesis by Payam Birjandi Director: Prof. Mihai Datcu Problematic
More informationDATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS
DATA MINING CLUSTER ANALYSIS: BASIC CONCEPTS 1 AND ALGORITHMS Chiara Renso KDD-LAB ISTI- CNR, Pisa, Italy WHAT IS CLUSTER ANALYSIS? Finding groups of objects such that the objects in a group will be similar
More informationThe primary goal of this thesis was to understand how the spatial dependence of
5 General discussion 5.1 Introduction The primary goal of this thesis was to understand how the spatial dependence of consumer attitudes can be modeled, what additional benefits the recovering of spatial
More informationRobust Outlier Detection Technique in Data Mining: A Univariate Approach
Robust Outlier Detection Technique in Data Mining: A Univariate Approach Singh Vijendra and Pathak Shivani Faculty of Engineering and Technology Mody Institute of Technology and Science Lakshmangarh, Sikar,
More informationMemory Allocation Technique for Segregated Free List Based on Genetic Algorithm
Journal of Al-Nahrain University Vol.15 (2), June, 2012, pp.161-168 Science Memory Allocation Technique for Segregated Free List Based on Genetic Algorithm Manal F. Younis Computer Department, College
More informationBOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL
The Fifth International Conference on e-learning (elearning-2014), 22-23 September 2014, Belgrade, Serbia BOOSTING - A METHOD FOR IMPROVING THE ACCURACY OF PREDICTIVE MODEL SNJEŽANA MILINKOVIĆ University
More informationANN Based Fault Classifier and Fault Locator for Double Circuit Transmission Line
International Journal of Computer Sciences and Engineering Open Access Research Paper Volume-4, Special Issue-2, April 2016 E-ISSN: 2347-2693 ANN Based Fault Classifier and Fault Locator for Double Circuit
More informationResearch on the Performance Optimization of Hadoop in Big Data Environment
Vol.8, No.5 (015), pp.93-304 http://dx.doi.org/10.1457/idta.015.8.5.6 Research on the Performance Optimization of Hadoop in Big Data Environment Jia Min-Zheng Department of Information Engineering, Beiing
More informationData Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression
Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction
More informationClustering. Data Mining. Abraham Otero. Data Mining. Agenda
Clustering 1/46 Agenda Introduction Distance K-nearest neighbors Hierarchical clustering Quick reference 2/46 1 Introduction It seems logical that in a new situation we should act in a similar way as in
More informationDecision Support System Methodology Using a Visual Approach for Cluster Analysis Problems
Decision Support System Methodology Using a Visual Approach for Cluster Analysis Problems Ran M. Bittmann School of Business Administration Ph.D. Thesis Submitted to the Senate of Bar-Ilan University Ramat-Gan,
More informationBig Data in Finance. Alexander Grigoriev. School of Business and Economics Sharing Success
Big Data in Finance Alexander Grigoriev Definitions Wiki: Big Data Gartner s 3V-definition [2012]: Big data is high volume, high velocity, and/or high variety information assets that require new forms
More informationTRAFFIC ENGINEERING OF DISTRIBUTED CALL CENTERS: NOT AS STRAIGHT FORWARD AS IT MAY SEEM. M. J. Fischer D. A. Garbin A. Gharakhanian D. M.
TRAFFIC ENGINEERING OF DISTRIBUTED CALL CENTERS: NOT AS STRAIGHT FORWARD AS IT MAY SEEM M. J. Fischer D. A. Garbin A. Gharakhanian D. M. Masi January 1999 Mitretek Systems 7525 Colshire Drive McLean, VA
More informationAS-D1 SIMULATION: A KEY TO CALL CENTER MANAGEMENT. Rupesh Chokshi Project Manager
AS-D1 SIMULATION: A KEY TO CALL CENTER MANAGEMENT Rupesh Chokshi Project Manager AT&T Laboratories Room 3J-325 101 Crawfords Corner Road Holmdel, NJ 07733, U.S.A. Phone: 732-332-5118 Fax: 732-949-9112
More informationContents. Dedication List of Figures List of Tables. Acknowledgments
Contents Dedication List of Figures List of Tables Foreword Preface Acknowledgments v xiii xvii xix xxi xxv Part I Concepts and Techniques 1. INTRODUCTION 3 1 The Quest for Knowledge 3 2 Problem Description
More informationStrategic Online Advertising: Modeling Internet User Behavior with
2 Strategic Online Advertising: Modeling Internet User Behavior with Patrick Johnston, Nicholas Kristoff, Heather McGinness, Phuong Vu, Nathaniel Wong, Jason Wright with William T. Scherer and Matthew
More informationCLUSTERING FOR FORENSIC ANALYSIS
IMPACT: International Journal of Research in Engineering & Technology (IMPACT: IJRET) ISSN(E): 2321-8843; ISSN(P): 2347-4599 Vol. 2, Issue 4, Apr 2014, 129-136 Impact Journals CLUSTERING FOR FORENSIC ANALYSIS
More informationClustering & Visualization
Chapter 5 Clustering & Visualization Clustering in high-dimensional databases is an important problem and there are a number of different clustering paradigms which are applicable to high-dimensional data.
More informationConstrained Optimization in Expensive Simulation: Novel Approach
Constrained Optimization in Expensive Simulation: Novel Approach Jack P.C. Kleijnen a), Wim van Beers b) and Inneke van Nieuwenhuyse c) a) Department of Information Management, Tilburg University, Postbox
More informationSTOCK MARKET TRENDS USING CLUSTER ANALYSIS AND ARIMA MODEL
Stock Asian-African Market Trends Journal using of Economics Cluster Analysis and Econometrics, and ARIMA Model Vol. 13, No. 2, 2013: 303-308 303 STOCK MARKET TRENDS USING CLUSTER ANALYSIS AND ARIMA MODEL
More informationTime series clustering and the analysis of film style
Time series clustering and the analysis of film style Nick Redfern Introduction Time series clustering provides a simple solution to the problem of searching a database containing time series data such
More informationCLUSTERING LARGE DATA SETS WITH MIXED NUMERIC AND CATEGORICAL VALUES *
CLUSTERING LARGE DATA SETS WITH MIED NUMERIC AND CATEGORICAL VALUES * ZHEUE HUANG CSIRO Mathematical and Information Sciences GPO Box Canberra ACT, AUSTRALIA huang@cmis.csiro.au Efficient partitioning
More informationChapter 7. Cluster Analysis
Chapter 7. Cluster Analysis. What is Cluster Analysis?. A Categorization of Major Clustering Methods. Partitioning Methods. Hierarchical Methods 5. Density-Based Methods 6. Grid-Based Methods 7. Model-Based
More informationSTUDY OF PROJECT SCHEDULING AND RESOURCE ALLOCATION USING ANT COLONY OPTIMIZATION 1
STUDY OF PROJECT SCHEDULING AND RESOURCE ALLOCATION USING ANT COLONY OPTIMIZATION 1 Prajakta Joglekar, 2 Pallavi Jaiswal, 3 Vandana Jagtap Maharashtra Institute of Technology, Pune Email: 1 somanprajakta@gmail.com,
More informationUsing Simulation to Understand and Optimize a Lean Service Process
Using Simulation to Understand and Optimize a Lean Service Process Kumar Venkat Surya Technologies, Inc. 4888 NW Bethany Blvd., Suite K5, #191 Portland, OR 97229 kvenkat@suryatech.com Wayne W. Wakeland
More informationArtificial Neural Networks and Support Vector Machines. CS 486/686: Introduction to Artificial Intelligence
Artificial Neural Networks and Support Vector Machines CS 486/686: Introduction to Artificial Intelligence 1 Outline What is a Neural Network? - Perceptron learners - Multi-layer networks What is a Support
More informationPLAANN as a Classification Tool for Customer Intelligence in Banking
PLAANN as a Classification Tool for Customer Intelligence in Banking EUNITE World Competition in domain of Intelligent Technologies The Research Report Ireneusz Czarnowski and Piotr Jedrzejowicz Department
More informationStatistics for BIG data
Statistics for BIG data Statistics for Big Data: Are Statisticians Ready? Dennis Lin Department of Statistics The Pennsylvania State University John Jordan and Dennis K.J. Lin (ICSA-Bulletine 2014) Before
More informationRequirements Analysis Concepts & Principles. Instructor: Dr. Jerry Gao
Requirements Analysis Concepts & Principles Instructor: Dr. Jerry Gao Requirements Analysis Concepts and Principles - Requirements Analysis - Communication Techniques - Initiating the Process - Facilitated
More informationMarkovian Process and Novel Secure Algorithm for Big Data in Two-Hop Wireless Networks
Markovian Process and Novel Secure Algorithm for Big Data in Two-Hop Wireless Networks K. Thiagarajan, Department of Mathematics, PSNA College of Engineering and Technology, Dindigul, India. A. Veeraiah,
More informationCOMBINING THE METHODS OF FORECASTING AND DECISION-MAKING TO OPTIMISE THE FINANCIAL PERFORMANCE OF SMALL ENTERPRISES
COMBINING THE METHODS OF FORECASTING AND DECISION-MAKING TO OPTIMISE THE FINANCIAL PERFORMANCE OF SMALL ENTERPRISES JULIA IGOREVNA LARIONOVA 1 ANNA NIKOLAEVNA TIKHOMIROVA 2 1, 2 The National Nuclear Research
More informationResearch on Clustering Analysis of Big Data Yuan Yuanming 1, 2, a, Wu Chanle 1, 2
Advanced Engineering Forum Vols. 6-7 (2012) pp 82-87 Online: 2012-09-26 (2012) Trans Tech Publications, Switzerland doi:10.4028/www.scientific.net/aef.6-7.82 Research on Clustering Analysis of Big Data
More informationUse of Data Mining Techniques to Improve the Effectiveness of Sales and Marketing
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 4, April 2015,
More information2014-2015 The Master s Degree with Thesis Course Descriptions in Industrial Engineering
2014-2015 The Master s Degree with Thesis Course Descriptions in Industrial Engineering Compulsory Courses IENG540 Optimization Models and Algorithms In the course important deterministic optimization
More informationFOREX TRADING PREDICTION USING LINEAR REGRESSION LINE, ARTIFICIAL NEURAL NETWORK AND DYNAMIC TIME WARPING ALGORITHMS
FOREX TRADING PREDICTION USING LINEAR REGRESSION LINE, ARTIFICIAL NEURAL NETWORK AND DYNAMIC TIME WARPING ALGORITHMS Leslie C.O. Tiong 1, David C.L. Ngo 2, and Yunli Lee 3 1 Sunway University, Malaysia,
More informationHow To Identify Noisy Variables In A Cluster
Identification of noisy variables for nonmetric and symbolic data in cluster analysis Marek Walesiak and Andrzej Dudek Wroclaw University of Economics, Department of Econometrics and Computer Science,
More informationMachine Learning using MapReduce
Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous
More informationUSING THE AGGLOMERATIVE METHOD OF HIERARCHICAL CLUSTERING AS A DATA MINING TOOL IN CAPITAL MARKET 1. Vera Marinova Boncheva
382 [7] Reznik, A, Kussul, N., Sokolov, A.: Identification of user activity using neural networks. Cybernetics and computer techniques, vol. 123 (1999) 70 79. (in Russian) [8] Kussul, N., et al. : Multi-Agent
More informationData Mining and Knowledge Discovery in Databases (KDD) State of the Art. Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland
Data Mining and Knowledge Discovery in Databases (KDD) State of the Art Prof. Dr. T. Nouri Computer Science Department FHNW Switzerland 1 Conference overview 1. Overview of KDD and data mining 2. Data
More informationHadoop Operations Management for Big Data Clusters in Telecommunication Industry
Hadoop Operations Management for Big Data Clusters in Telecommunication Industry N. Kamalraj Asst. Prof., Department of Computer Technology Dr. SNS Rajalakshmi College of Arts and Science Coimbatore-49
More informationJoseph Twagilimana, University of Louisville, Louisville, KY
ST14 Comparing Time series, Generalized Linear Models and Artificial Neural Network Models for Transactional Data analysis Joseph Twagilimana, University of Louisville, Louisville, KY ABSTRACT The aim
More informationA Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data
A Comparative Study of the Pickup Method and its Variations Using a Simulated Hotel Reservation Data Athanasius Zakhary, Neamat El Gayar Faculty of Computers and Information Cairo University, Giza, Egypt
More informationClustering. 15-381 Artificial Intelligence Henry Lin. Organizing data into clusters such that there is
Clustering 15-381 Artificial Intelligence Henry Lin Modified from excellent slides of Eamonn Keogh, Ziv Bar-Joseph, and Andrew Moore What is Clustering? Organizing data into clusters such that there is
More information