Runtime Hardware Reconfiguration using Machine Learning
|
|
|
- Martha Page
- 10 years ago
- Views:
Transcription
1 Runtime Hardware Reconfiguration using Machine Learning Tanmay Gangwani University of Illinois, Urbana-Champaign Abstract Tailoring the machine hardware to varying needs of the software running on top of it is crucial to achieving energy-efficient execution. However, the characteristics of the code are non-deterministic and depend on system environments, precluding the use of static methods for tuning the hardware to match the requirements. We propose the use of machine learning models which can be trained to suggest energy-optimal hardware configurations, given a previously unseen application. We evaluate our model using application traces from an x86 simulator, and measure the closeness of the predicted hardware configuration to the one achieving the best energy-efficiency for the application. The machine learning model is trained using Google TensorFlow [1]. I. Introduction The shift to multi-core hardware has put more focus on design principles that reduce the overall power consumption of the chip. This is because using excessive energy entails high cooling costs and poses a threat to physical stability of the chip. To conserve energy, hardware manufacturers provide platform features to switch off parts of the chip not in use [2]. Turning off unused area, albiet for a short while, is beneficial since transistors drain leakage current if they are powered up. Applications for a multi-core environment exhibit various phases, with each phase being potentially different in terms of the amount of the chip resources it needs for execution. Micro-managing hardware to adapt to the phases, thus, can result in subtantial energy savings and cost reduction. Adapting hardware to program phases is challenging because of the nondeterministic nature of applications. In general, a program could take different execution paths depending on user-provided inputs, while also demanding varying hardware resources. As such, statically defining the best hardware for a program phase is not possible. Our idea is to use softmax machine learning technique to train a model using data from pre-run program phases as input. Once trained, the model can predict the best hardware configuration for an unseen test program phase. Selective parts of the hardware can then be switched off to reap power benefits during the execution of the phase. Table 1: Input Features and Labels Features Label-classes (3 values per class) ROB occupancy (histogram) ROB size L1D-mpki L1D cache size L2D-mpki L2D cache size IPC Dispatch width Branch misprediction rate Branch predictor size Reuse distance (histogram) -
2 II. The Model We aim to learn a relationship between the program phases and the best hardware configurations. In machine learning jargon, the program phases form the input features, and the hardware configurations are the output labels. The program phases are characterized by a set of hardware counters measured during the run. The counters used, along with the labels, are outlined in Table 1. We consider 5 label classes, each of which is a hardware resource with possibility for reconfiguration on the fly. Each label class can take 1 of the 3 possible values, defined separately for the class - e.g. 48, 64, 96 for ROB size; 1, 2, 4 for Dispatch width. We measure the hardware counters to generate a feature vector of 44 values - 20 each for the ROB occupancy and Reuse distance histograms, and 1 each for others. We run every program phase with all the possible permutations of the labels, and select the top k configurations based the energy-efficiency values. Thus, from each program phase we get a set of good configurations (labels), and the corresponding hardware counters that form the feature space. We use the numbers to model a conditional distribution P( Y X) of good microarchitectural configurations using soft-max [3]. Similar to the approach adopted in [4], we consider the hardware parameter values (label classes) to be independent of each other, given the counters (features): P( Y X) = 5 i=1 P(y X) (1) As noted by the authors in [4], there exist dependencies between hardware parameter values as one would expect in a hardware system. However, the model assumes that good parameter values are conditionally indepedent, given the features, and not marginally independent. Such an assumption, similar to naive Bayes, makes the problem of computing the overall conditional distribution P( Y X) tractable. A decision on a test sample is done as follows: Y = argmax P( Y X), (2) Y where due to conditional independence, we can compute the maximum for each label class separately, using a different distribution P(y X). These distributions are defined by the soft-max function: P(y = k X) = exp( w k X) K i=1 exp( w i X) III. I. Simulator Evaluation (3) We use traces from single-threaded SPEC benchmarks [5] to train our model. Each trace represents a particular phase in a SPEC program. These are generated using the SimPoint methodology [6], and run on a cycle-approximate x86 simulator, Sniper [7], in detailed mode for 30M instructions following a 100M warmup phase. Sniper is modified to produce the desired performance counters, which are the features of our model. The hardware parameter values (labels) are changed using the Sniperconfig file. We generate input data features and labels for good configurations from each SPEC trace. Sniper uses McPAT [8] to provide energy values for each run. Energy-efficiency of a configuration is measured in terms of ipc/watt for the complete run of the application phase.
3 ipc is the number of instructions executed per cycle, and watt refers to the power consumption in Watts. Since the features represent hardware counter values, their ranges differ significantly. To have bettter generalization, we standardize the feature values using Python scikit-learn. Next, the dimensionality of the feature space is reduced to improve computation time. We use Random Forest [9] to find the top k features for each label class, and then use them to find the parameters of the soft-max function. We train 5 soft-max functions for the 5 label classes, following our conditional independence assumption. II. TensorFlow TensorFlow is a rich library of machine learning algorithms from Google. It uses a data-flow graph model where the nodes represent mathematical operations and the edges represent communication of tensors between operations. We use the in-built soft-max function in TensorFlow. Gradient descent is used as the optimization algorithm in a stochastic training methodology. The error function used is cross-entropy, along with L2 regularization to prevent overfitting. The objective function is: y. log y pred + 1 m 2 w 2 (4) The TensorFlow source files, along with scripts for feature generation, training and testing are available on github. 1 IV. Results We run a total of 90 traces, each representing a different phase of a SPEC benchmark. 1 Of these, data from 68 traces (739 feature vectors) is used to train the parameters of the soft-max functions. We predict the accuracy of our model on the remaining 22 traces. For the train-traces, multiple good configurations are generated by changing the hardware parameter values using the simulator knobs. For the test-traces, we generate the performance counters (features) by running the phase with the maximum value for each of the hardware parameters. We refer to this setting as base, for basic configuration, as would be the case in a system which is provisioned to run all general applications without any dynamic reconfiguration. The configuration produced by the machine learning model using the input performance counters is refered to as pred, for predicted configuration. We measure success in terms of the number of application phases where pred achieves a more energy-efficient execution than base. We also compare pred against a second configuration, refered to as best, for the overall most energy-efficient configuration for the phase. Such a configuration would be found by an online exhaustive search space algorithm, which is globally optimal, but hard to realize in practice. Figure 1 plots two energy efficiency ratios - pred/base and pred/best. Out of the 22 traces, pred performs better than base in 16 of them. The maximum ratio is 1.09x, indicating a 9% improvement in energy efficiency. The ratio, averaged across all applications is 1.02x. For pred/best, we see an average ratio of 0.96x across applications; i.e. our prediction lags behind the exhaustive search technique by only 4%.
4 Figure 1: Energy efficiency ratios for different configuration combinations I. Model Accuracy In this subsection, we provide some discussion on the soft-max functions learned using TensorFlow. Our each input vector is of dimension 1x44, and we have 3 possible values for each label class. Hence, we learn a 44x3 matrix of weights and a 1x3 vector of baises for each of the label classes, of which we have 5. We omit the label class L2D cache size from the discussion 2. The accuracies on the training data for the other label classe are - ROB size (97%), L1D cache size (73%), Dispatch width (97%) and Branch predictor size (59%). This indicates difficulty in correctly learning the optimal L1D cache size and Branch predictor size based on the performance counters used in this work. We believe this to be a major source of efficiency-loss for the predicted configuration (figure 1). Correlating to the accuracy numbers, figure 2 shows how the objective (loss) function changes with iterations of the SGD algorithm, for ROB size and Branch predictor size. The error in latter converges to a value that is more than 4 times higher than the convergence value for the former. (a) (b) Figure 2: Loss for ROB and Branch predictor size V. Conclusion and Future Work In this work, our goal was to dynamically tune hardware to software, rather than the other way round, as done by programmers today. By learning a mapping between the data obtained from performance counters during an application run and the hardware resources required for efficient execution, we were able to suggest hardware reconfiguration opportunities for new, un- 2 A last minute bug was found in the manner data was generated with different values for this label class. The affected feature vectors were removed from the training set.
5 seen applications. We implemented a collection of soft-max functions in TensorFlow to realize this goal. The results indicate that while sensitivity to some hardware structures can be captured using this technique, the impact of others are hard to predict. Future work could throw a bigger net, and evaluate more hardware resources which have a significant impact on performance and power, like load-store queues and NoC. We could also look to other algorithms like nueral nets to learn complex mappings. Interestingly, we noted that our prediction accuracies are also dependent on the energy-efficiency metric chosen for optimization, i.e. learning is different when optimizing for ipc 3 /watt and ipc/watt. References [1] [2] events/files/slides/linuxconpowercapping_jpan.pdf [3] [4] Dubach, Christophe, et al. "A predictive model for dynamic microarchitectural adaptivity control." Proceedings of the rd Annual IEEE/ACM International Symposium on Microarchitecture. IEEE Computer Society, [5] [6] calder/simpoint/ [7] [8] [9]
STA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! [email protected]! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
In-network Monitoring and Control Policy for DVFS of CMP Networkson-Chip and Last Level Caches
In-network Monitoring and Control Policy for DVFS of CMP Networkson-Chip and Last Level Caches Xi Chen 1, Zheng Xu 1, Hyungjun Kim 1, Paul V. Gratz 1, Jiang Hu 1, Michael Kishinevsky 2 and Umit Ogras 2
Computer Science 146/246 Homework #3
Computer Science 146/246 Homework #3 Due 11:59 P.M. Sunday, April 12th, 2015 We played with a Pin-based cache simulator for Homework 2. This homework will prepare you to setup and run a detailed microarchitecture-level
Logistic Regression for Spam Filtering
Logistic Regression for Spam Filtering Nikhila Arkalgud February 14, 28 Abstract The goal of the spam filtering problem is to identify an email as a spam or not spam. One of the classic techniques used
Applying Data Analysis to Big Data Benchmarks. Jazmine Olinger
Applying Data Analysis to Big Data Benchmarks Jazmine Olinger Abstract This paper describes finding accurate and fast ways to simulate Big Data benchmarks. Specifically, using the currently existing simulation
Car Insurance. Havránek, Pokorný, Tomášek
Car Insurance Havránek, Pokorný, Tomášek Outline Data overview Horizontal approach + Decision tree/forests Vertical (column) approach + Neural networks SVM Data overview Customers Viewed policies Bought
Energy-Efficient, High-Performance Heterogeneous Core Design
Energy-Efficient, High-Performance Heterogeneous Core Design Raj Parihar Core Design Session, MICRO - 2012 Advanced Computer Architecture Lab, UofR, Rochester April 18, 2013 Raj Parihar Energy-Efficient,
Statistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
Characterizing Task Usage Shapes in Google s Compute Clusters
Characterizing Task Usage Shapes in Google s Compute Clusters Qi Zhang 1, Joseph L. Hellerstein 2, Raouf Boutaba 1 1 University of Waterloo, 2 Google Inc. Introduction Cloud computing is becoming a key
TPCalc : a throughput calculator for computer architecture studies
TPCalc : a throughput calculator for computer architecture studies Pierre Michaud Stijn Eyerman Wouter Rogiest IRISA/INRIA Ghent University Ghent University [email protected] [email protected]
CHAPTER 1 INTRODUCTION
1 CHAPTER 1 INTRODUCTION 1.1 MOTIVATION OF RESEARCH Multicore processors have two or more execution cores (processors) implemented on a single chip having their own set of execution and architectural recourses.
A Logistic Regression Approach to Ad Click Prediction
A Logistic Regression Approach to Ad Click Prediction Gouthami Kondakindi [email protected] Satakshi Rana [email protected] Aswin Rajkumar [email protected] Sai Kaushik Ponnekanti [email protected] Vinit Parakh
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
Advanced analytics at your hands
2.3 Advanced analytics at your hands Neural Designer is the most powerful predictive analytics software. It uses innovative neural networks techniques to provide data scientists with results in a way previously
Solution: start more than one instruction in the same clock cycle CPI < 1 (or IPC > 1, Instructions per Cycle) Two approaches:
Multiple-Issue Processors Pipelining can achieve CPI close to 1 Mechanisms for handling hazards Static or dynamic scheduling Static or dynamic branch handling Increase in transistor counts (Moore s Law):
CSCI567 Machine Learning (Fall 2014)
CSCI567 Machine Learning (Fall 2014) Drs. Sha & Liu {feisha,yanliu.cs}@usc.edu September 22, 2014 Drs. Sha & Liu ({feisha,yanliu.cs}@usc.edu) CSCI567 Machine Learning (Fall 2014) September 22, 2014 1 /
Supervised Learning (Big Data Analytics)
Supervised Learning (Big Data Analytics) Vibhav Gogate Department of Computer Science The University of Texas at Dallas Practical advice Goal of Big Data Analytics Uncover patterns in Data. Can be used
Method for detecting software anomalies based on recurrence plot analysis
Journal of Theoretical and Applied Computer Science Vol. 6, No. 1, 2012, pp. 3-12 ISSN 2299-2634 http://www.jtacs.org Method for detecting software anomalies based on recurrence plot analysis Michał Mosdorf
Energy Efficient MapReduce
Energy Efficient MapReduce Motivation: Energy consumption is an important aspect of datacenters efficiency, the total power consumption in the united states has doubled from 2000 to 2005, representing
Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University [email protected]
Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University [email protected] 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian
Probabilistic Models for Big Data. Alex Davies and Roger Frigola University of Cambridge 13th February 2014
Probabilistic Models for Big Data Alex Davies and Roger Frigola University of Cambridge 13th February 2014 The State of Big Data Why probabilistic models for Big Data? 1. If you don t have to worry about
Maximize Revenues on your Customer Loyalty Program using Predictive Analytics
Maximize Revenues on your Customer Loyalty Program using Predictive Analytics 27 th Feb 14 Free Webinar by Before we begin... www Q & A? Your Speakers @parikh_shachi Technical Analyst @tatvic Loves js
DESIGN CHALLENGES OF TECHNOLOGY SCALING
DESIGN CHALLENGES OF TECHNOLOGY SCALING IS PROCESS TECHNOLOGY MEETING THE GOALS PREDICTED BY SCALING THEORY? AN ANALYSIS OF MICROPROCESSOR PERFORMANCE, TRANSISTOR DENSITY, AND POWER TRENDS THROUGH SUCCESSIVE
Big Data at Spotify. Anders Arpteg, Ph D Analytics Machine Learning, Spotify
Big Data at Spotify Anders Arpteg, Ph D Analytics Machine Learning, Spotify Quickly about me Quickly about Spotify What is all the data used for? Quickly about Spark Hadoop MR vs Spark Need for (distributed)
Adaptive Resource Optimizer For Optimal High Performance Compute Resource Utilization
Technical Backgrounder Adaptive Resource Optimizer For Optimal High Performance Compute Resource Utilization July 2015 Introduction In a typical chip design environment, designers use thousands of CPU
Making Sense of the Mayhem: Machine Learning and March Madness
Making Sense of the Mayhem: Machine Learning and March Madness Alex Tran and Adam Ginzberg Stanford University [email protected] [email protected] I. Introduction III. Model The goal of our research
The Artificial Prediction Market
The Artificial Prediction Market Adrian Barbu Department of Statistics Florida State University Joint work with Nathan Lay, Siemens Corporate Research 1 Overview Main Contributions A mathematical theory
Improved metrics collection and correlation for the CERN cloud storage test framework
Improved metrics collection and correlation for the CERN cloud storage test framework September 2013 Author: Carolina Lindqvist Supervisors: Maitane Zotes Seppo Heikkila CERN openlab Summer Student Report
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus
Facebook Friend Suggestion Eytan Daniyalzade and Tim Lipus 1. Introduction Facebook is a social networking website with an open platform that enables developers to extract and utilize user information
Expanding the CASEsim Framework to Facilitate Load Balancing of Social Network Simulations
Expanding the CASEsim Framework to Facilitate Load Balancing of Social Network Simulations Amara Keller, Martin Kelly, Aaron Todd 4 June 2010 Abstract This research has two components, both involving the
MACHINE LEARNING IN HIGH ENERGY PHYSICS
MACHINE LEARNING IN HIGH ENERGY PHYSICS LECTURE #1 Alex Rogozhnikov, 2015 INTRO NOTES 4 days two lectures, two practice seminars every day this is introductory track to machine learning kaggle competition!
Parallelism and Cloud Computing
Parallelism and Cloud Computing Kai Shen Parallel Computing Parallel computing: Process sub tasks simultaneously so that work can be completed faster. For instances: divide the work of matrix multiplication
1 Maximum likelihood estimation
COS 424: Interacting with Data Lecturer: David Blei Lecture #4 Scribes: Wei Ho, Michael Ye February 14, 2008 1 Maximum likelihood estimation 1.1 MLE of a Bernoulli random variable (coin flips) Given N
Big Data Analytics CSCI 4030
High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering data streams SVM Recommen der systems Clustering Community Detection Web advertising
Software Performance and Scalability
Software Performance and Scalability A Quantitative Approach Henry H. Liu ^ IEEE )computer society WILEY A JOHN WILEY & SONS, INC., PUBLICATION Contents PREFACE ACKNOWLEDGMENTS xv xxi Introduction 1 Performance
Perfmon2: A leap forward in Performance Monitoring
Perfmon2: A leap forward in Performance Monitoring Sverre Jarp, Ryszard Jurga, Andrzej Nowak CERN, Geneva, Switzerland [email protected] Abstract. This paper describes the software component, perfmon2,
Social Media Mining. Data Mining Essentials
Introduction Data production rate has been increased dramatically (Big Data) and we are able store much more data than before E.g., purchase data, social media data, mobile phone data Businesses and customers
Testing Automation for Distributed Applications By Isabel Drost-Fromm, Software Engineer, Elastic
Testing Automation for Distributed Applications By Isabel Drost-Fromm, Software Engineer, Elastic The challenge When building distributed, large-scale applications, quality assurance (QA) gets increasingly
Large-Scale Data Sets Clustering Based on MapReduce and Hadoop
Journal of Computational Information Systems 7: 16 (2011) 5956-5963 Available at http://www.jofcis.com Large-Scale Data Sets Clustering Based on MapReduce and Hadoop Ping ZHOU, Jingsheng LEI, Wenjun YE
Machine Learning in Python with scikit-learn. O Reilly Webcast Aug. 2014
Machine Learning in Python with scikit-learn O Reilly Webcast Aug. 2014 Outline Machine Learning refresher scikit-learn How the project is structured Some improvements released in 0.15 Ongoing work for
Precise and Accurate Processor Simulation
Precise and Accurate Processor Simulation Harold Cain, Kevin Lepak, Brandon Schwartz, and Mikko H. Lipasti University of Wisconsin Madison http://www.ece.wisc.edu/~pharm Performance Modeling Analytical
OPNET Network Simulator
Simulations and Tools for Telecommunications 521365S: OPNET Network Simulator Jarmo Prokkola Research team leader, M. Sc. (Tech.) VTT Technical Research Centre of Finland Kaitoväylä 1, Oulu P.O. Box 1100,
Performance Impacts of Non-blocking Caches in Out-of-order Processors
Performance Impacts of Non-blocking Caches in Out-of-order Processors Sheng Li; Ke Chen; Jay B. Brockman; Norman P. Jouppi HP Laboratories HPL-2011-65 Keyword(s): Non-blocking cache; MSHR; Out-of-order
5MD00. Assignment Introduction. Luc Waeijen 16-12-2014
5MD00 Assignment Introduction Luc Waeijen 16-12-2014 Contents EEG application Background on EEG Early Seizure Detection Algorithm Implementation Details Super Scalar Assignment Description Tooling (simple
Car Insurance. Prvák, Tomi, Havri
Car Insurance Prvák, Tomi, Havri Sumo report - expectations Sumo report - reality Bc. Jan Tomášek Deeper look into data set Column approach Reminder What the hell is this competition about??? Attributes
McPAT: An Integrated Power, Area, and Timing Modeling Framework for Multicore and Manycore Architectures
McPAT: An Integrated Power, Area, and Timing Modeling Framework for Multicore and Manycore Architectures Sheng Li, Junh Ho Ahn, Richard Strong, Jay B. Brockman, Dean M Tullsen, Norman Jouppi MICRO 2009
Performance Analysis and Optimization Tool
Performance Analysis and Optimization Tool Andres S. CHARIF-RUBIAL [email protected] Performance Analysis Team, University of Versailles http://www.maqao.org Introduction Performance Analysis Develop
14.10.2014. Overview. Swarms in nature. Fish, birds, ants, termites, Introduction to swarm intelligence principles Particle Swarm Optimization (PSO)
Overview Kyrre Glette kyrrehg@ifi INF3490 Swarm Intelligence Particle Swarm Optimization Introduction to swarm intelligence principles Particle Swarm Optimization (PSO) 3 Swarms in nature Fish, birds,
Euler: A System for Numerical Optimization of Programs
Euler: A System for Numerical Optimization of Programs Swarat Chaudhuri 1 and Armando Solar-Lezama 2 1 Rice University 2 MIT Abstract. We give a tutorial introduction to Euler, a system for solving difficult
Practical Graph Mining with R. 5. Link Analysis
Practical Graph Mining with R 5. Link Analysis Outline Link Analysis Concepts Metrics for Analyzing Networks PageRank HITS Link Prediction 2 Link Analysis Concepts Link A relationship between two entities
Predicting the Stock Market with News Articles
Predicting the Stock Market with News Articles Kari Lee and Ryan Timmons CS224N Final Project Introduction Stock market prediction is an area of extreme importance to an entire industry. Stock price is
Optimizing Shared Resource Contention in HPC Clusters
Optimizing Shared Resource Contention in HPC Clusters Sergey Blagodurov Simon Fraser University Alexandra Fedorova Simon Fraser University Abstract Contention for shared resources in HPC clusters occurs
Multivariate Analysis of Ecological Data
Multivariate Analysis of Ecological Data MICHAEL GREENACRE Professor of Statistics at the Pompeu Fabra University in Barcelona, Spain RAUL PRIMICERIO Associate Professor of Ecology, Evolutionary Biology
Applied Multivariate Analysis - Big data analytics
Applied Multivariate Analysis - Big data analytics Nathalie Villa-Vialaneix [email protected] http://www.nathalievilla.org M1 in Economics and Economics and Statistics Toulouse School of
MapReduce/Bigtable for Distributed Optimization
MapReduce/Bigtable for Distributed Optimization Keith B. Hall Google Inc. [email protected] Scott Gilpin Google Inc. [email protected] Gideon Mann Google Inc. [email protected] Abstract With large data
CHAPTER 1 INTRODUCTION
CHAPTER 1 INTRODUCTION 1.1 Background The command over cloud computing infrastructure is increasing with the growing demands of IT infrastructure during the changed business scenario of the 21 st Century.
Energy Constrained Resource Scheduling for Cloud Environment
Energy Constrained Resource Scheduling for Cloud Environment 1 R.Selvi, 2 S.Russia, 3 V.K.Anitha 1 2 nd Year M.E.(Software Engineering), 2 Assistant Professor Department of IT KSR Institute for Engineering
1. Simulation of load balancing in a cloud computing environment using OMNET
Cloud Computing Cloud computing is a rapidly growing technology that allows users to share computer resources according to their need. It is expected that cloud computing will generate close to 13.8 million
Collaborative Filtering Scalable Data Analysis Algorithms Claudia Lehmann, Andrina Mascher
Collaborative Filtering Scalable Data Analysis Algorithms Claudia Lehmann, Andrina Mascher Outline 2 1. Retrospection 2. Stratosphere Plans 3. Comparison with Hadoop 4. Evaluation 5. Outlook Retrospection
Classification algorithm in Data mining: An Overview
Classification algorithm in Data mining: An Overview S.Neelamegam #1, Dr.E.Ramaraj *2 #1 M.phil Scholar, Department of Computer Science and Engineering, Alagappa University, Karaikudi. *2 Professor, Department
Semi-Supervised Support Vector Machines and Application to Spam Filtering
Semi-Supervised Support Vector Machines and Application to Spam Filtering Alexander Zien Empirical Inference Department, Bernhard Schölkopf Max Planck Institute for Biological Cybernetics ECML 2006 Discovery
Distributed forests for MapReduce-based machine learning
Distributed forests for MapReduce-based machine learning Ryoji Wakayama, Ryuei Murata, Akisato Kimura, Takayoshi Yamashita, Yuji Yamauchi, Hironobu Fujiyoshi Chubu University, Japan. NTT Communication
Machine Learning using MapReduce
Machine Learning using MapReduce What is Machine Learning Machine learning is a subfield of artificial intelligence concerned with techniques that allow computers to improve their outputs based on previous
Leveraging Ensemble Models in SAS Enterprise Miner
ABSTRACT Paper SAS133-2014 Leveraging Ensemble Models in SAS Enterprise Miner Miguel Maldonado, Jared Dean, Wendy Czika, and Susan Haller SAS Institute Inc. Ensemble models combine two or more models to
EVALUATING METRICS AT CLASS AND METHOD LEVEL FOR JAVA PROGRAMS USING KNOWLEDGE BASED SYSTEMS
EVALUATING METRICS AT CLASS AND METHOD LEVEL FOR JAVA PROGRAMS USING KNOWLEDGE BASED SYSTEMS Umamaheswari E. 1, N. Bhalaji 2 and D. K. Ghosh 3 1 SCSE, VIT Chennai Campus, Chennai, India 2 SSN College of
Software Test Plan (STP) Template
(STP) Template Items that are intended to stay in as part of your document are in bold; explanatory comments are in italic text. Plain text is used where you might insert wording about your project. This
Schedule Risk Analysis Simulator using Beta Distribution
Schedule Risk Analysis Simulator using Beta Distribution Isha Sharma Department of Computer Science and Applications, Kurukshetra University, Kurukshetra, Haryana (INDIA) [email protected] Dr. P.K.
18-447 Computer Architecture Lecture 3: ISA Tradeoffs. Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 1/18/2013
18-447 Computer Architecture Lecture 3: ISA Tradeoffs Prof. Onur Mutlu Carnegie Mellon University Spring 2013, 1/18/2013 Reminder: Homeworks for Next Two Weeks Homework 0 Due next Wednesday (Jan 23), right
The Union-Find Problem Kruskal s algorithm for finding an MST presented us with a problem in data-structure design. As we looked at each edge,
The Union-Find Problem Kruskal s algorithm for finding an MST presented us with a problem in data-structure design. As we looked at each edge, cheapest first, we had to determine whether its two endpoints
Building an energy dashboard. Energy measurement and visualization in current HPC systems
Building an energy dashboard Energy measurement and visualization in current HPC systems Thomas Geenen 1/58 [email protected] SURFsara The Dutch national HPC center 2H 2014 > 1PFlop GPGPU accelerators
Exponential Random Graph Models for Social Network Analysis. Danny Wyatt 590AI March 6, 2009
Exponential Random Graph Models for Social Network Analysis Danny Wyatt 590AI March 6, 2009 Traditional Social Network Analysis Covered by Eytan Traditional SNA uses descriptive statistics Path lengths
Analysis of MapReduce Algorithms
Analysis of MapReduce Algorithms Harini Padmanaban Computer Science Department San Jose State University San Jose, CA 95192 408-924-1000 [email protected] ABSTRACT MapReduce is a programming model
Model Combination. 24 Novembre 2009
Model Combination 24 Novembre 2009 Datamining 1 2009-2010 Plan 1 Principles of model combination 2 Resampling methods Bagging Random Forests Boosting 3 Hybrid methods Stacking Generic algorithm for mulistrategy
large-scale machine learning revisited Léon Bottou Microsoft Research (NYC)
large-scale machine learning revisited Léon Bottou Microsoft Research (NYC) 1 three frequent ideas in machine learning. independent and identically distributed data This experimental paradigm has driven
Intrusion Detection via Static Analysis
Intrusion Detection via Static Analysis IEEE Symposium on Security & Privacy 01 David Wagner Drew Dean Presented by Yongjian Hu Outline Introduction Motivation Models Trivial model Callgraph model Abstract
Performance Workload Design
Performance Workload Design The goal of this paper is to show the basic principles involved in designing a workload for performance and scalability testing. We will understand how to achieve these principles
USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS
USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA [email protected]
Heterogeneous Workload Consolidation for Efficient Management of Data Centers in Cloud Computing
Heterogeneous Workload Consolidation for Efficient Management of Data Centers in Cloud Computing Deep Mann ME (Software Engineering) Computer Science and Engineering Department Thapar University Patiala-147004
Software Engineering for LabVIEW Applications. Elijah Kerry LabVIEW Product Manager
Software Engineering for LabVIEW Applications Elijah Kerry LabVIEW Product Manager 1 Ensuring Software Quality and Reliability Goals 1. Deliver a working product 2. Prove it works right 3. Mitigate risk
A Hybrid Analytical Modeling of Pending Cache Hits, Data Prefetching, and MSHRs 1
A Hybrid Analytical Modeling of Pending Cache Hits, Data Prefetching, and MSHRs 1 XI E. CHEN and TOR M. AAMODT University of British Columbia This paper proposes techniques to predict the performance impact
Predicting Soccer Match Results in the English Premier League
Predicting Soccer Match Results in the English Premier League Ben Ulmer School of Computer Science Stanford University Email: [email protected] Matthew Fernandez School of Computer Science Stanford University
EE482: Advanced Computer Organization Lecture #11 Processor Architecture Stanford University Wednesday, 31 May 2000. ILP Execution
EE482: Advanced Computer Organization Lecture #11 Processor Architecture Stanford University Wednesday, 31 May 2000 Lecture #11: Wednesday, 3 May 2000 Lecturer: Ben Serebrin Scribe: Dean Liu ILP Execution
Tutorial on Markov Chain Monte Carlo
Tutorial on Markov Chain Monte Carlo Kenneth M. Hanson Los Alamos National Laboratory Presented at the 29 th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Technology,
Use-it or Lose-it: Wearout and Lifetime in Future Chip-Multiprocessors
Use-it or Lose-it: Wearout and Lifetime in Future Chip-Multiprocessors Hyungjun Kim, 1 Arseniy Vitkovsky, 2 Paul V. Gratz, 1 Vassos Soteriou 2 1 Department of Electrical and Computer Engineering, Texas
Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model
Location matters. 3 techniques to incorporate geo-spatial effects in one's predictive model Xavier Conort [email protected] Motivation Location matters! Observed value at one location is
OPNET - Network Simulator
Simulations and Tools for Telecommunications 521365S: OPNET - Network Simulator Jarmo Prokkola Project Manager, M. Sc. (Tech.) VTT Technical Research Centre of Finland Kaitoväylä 1, Oulu P.O. Box 1100,
Prediction of Software Development Modication Eort Enhanced by a Genetic Algorithm
Prediction of Software Development Modication Eort Enhanced by a Genetic Algorithm Gerg Balogh, Ádám Zoltán Végh, and Árpád Beszédes Department of Software Engineering University of Szeged, Szeged, Hungary
CloudRank-D:A Benchmark Suite for Private Cloud Systems
CloudRank-D:A Benchmark Suite for Private Cloud Systems Jing Quan Institute of Computing Technology, Chinese Academy of Sciences and University of Science and Technology of China HVC tutorial in conjunction
Introduction to Machine Learning Using Python. Vikram Kamath
Introduction to Machine Learning Using Python Vikram Kamath Contents: 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. Introduction/Definition Where and Why ML is used Types of Learning Supervised Learning Linear Regression
Learning Gaussian process models from big data. Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu
Learning Gaussian process models from big data Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu Machine learning seminar at University of Cambridge, July 4 2012 Data A lot of
BIDM Project. Predicting the contract type for IT/ITES outsourcing contracts
BIDM Project Predicting the contract type for IT/ITES outsourcing contracts N a n d i n i G o v i n d a r a j a n ( 6 1 2 1 0 5 5 6 ) The authors believe that data modelling can be used to predict if an
