Parallel & Distributed Optimization. Based on Mark Schmidt's slides
1 Parallel & Distributed Optimization. Based on Mark Schmidt's slides
2 Motivation for parallel & distributed optimization. Performance: computational throughput grew exponentially over time (Moore's law), but only so many transistors can fit in a limited space (atomic sizes), so serial computation throughput has plateaued as Moore's law comes to an end. Space: large datasets cannot fit on a single machine.
3-5 Introduction. Parallel computing: one machine with multiple processors (quad-core CPU, GPU). Distributed computing: multiple computers linked via a network.
6-7 Distributed optimization. Strategy: each machine handles a subset of the dataset. Issues: link failures between machines, and synchronization, since every machine that depends on the parameter vector x must wait for the slowest machine to finish processing its data. Solutions: devise algorithms that limit communication, and decentralize the optimization.
8-9 Straightforward distributed optimization: run different algorithms/strategies on different machines/cores (Gauss-Southwell coordinate descent, randomized coordinate descent, gradient descent, stochastic gradient descent, and even your own computer), and the first one that finishes wins, as sketched below.
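The "first one finishes wins" idea can be expressed directly with a pool of concurrent solvers. Below is a minimal sketch (my own illustration, not code from the slides): two hypothetical solvers, gradient_descent and randomized_cd, race on the same least-squares problem and the first completed result is kept. A real deployment would run each candidate on a separate machine or process; Python threads are used here only to show the pattern.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor, as_completed

def gradient_descent(A, b, iters=500):
    # plain deterministic gradient descent on (1/2n)||Ax - b||^2
    n, d = A.shape
    x = np.zeros(d)
    step = n / np.linalg.norm(A, 2) ** 2          # 1/L for this objective
    for _ in range(iters):
        x -= step * (A.T @ (A @ x - b)) / n
    return "gradient descent", x

def randomized_cd(A, b, iters=5000, seed=0):
    # randomized coordinate descent with exact coordinate minimization
    n, d = A.shape
    x, r = np.zeros(d), -b.copy()                 # residual r = Ax - b
    rng = np.random.default_rng(seed)
    for _ in range(iters):
        j = rng.integers(d)
        a_j = A[:, j]
        delta = -(a_j @ r) / (a_j @ a_j)
        x[j] += delta
        r += delta * a_j
    return "randomized coordinate descent", x

def race(A, b):
    # run all strategies concurrently and keep whichever finishes first
    # (a real version would also cancel the losers)
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(f, A, b) for f in (gradient_descent, randomized_cd)]
        winner, x = next(as_completed(futures)).result()
        return winner, x

A, b = np.random.randn(200, 20), np.random.randn(200)
print(race(A, b)[0])
```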
10-13 Parallel first-order methods: synchronized deterministic gradient descent. With n samples split across m machines, the gradient is a sum over samples, so it breaks into separable components: each machine computes the partial gradient over its own n/m samples, and a central node sums the m partial gradients and takes the step.
14-15 This allows optimal linear speedups, so you should always consider it first! Issue: what if one of the computers is very slow, or one of the links fails? The whole synchronized step has to wait.
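A minimal sketch of the synchronized scheme (assuming a least-squares objective and simulating the m machines with a Python loop; function and variable names are illustrative, not from the slides): each shard computes its partial gradient, and the central node aggregates them before taking one step.

```python
import numpy as np

def partial_gradient(A_shard, b_shard, x):
    # gradient contribution of one machine's samples for (1/2)||Ax - b||^2
    return A_shard.T @ (A_shard @ x - b_shard)

def synchronized_gradient_descent(A, b, m=4, iters=200):
    # split the n samples across m machines and take synchronized steps
    n, d = A.shape
    shards = np.array_split(np.arange(n), m)
    x = np.zeros(d)
    step = n / np.linalg.norm(A, 2) ** 2            # 1/L for (1/2n)||Ax - b||^2
    for _ in range(iters):
        # in a real system these m calls run in parallel, one per machine
        grads = [partial_gradient(A[idx], b[idx], x) for idx in shards]
        x -= step * sum(grads) / n                  # central node aggregates and steps
    return x

A, b = np.random.randn(400, 10), np.random.randn(400)
x_hat = synchronized_gradient_descent(A, b)
```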
16-21 Centralized asynchronous gradient descent (Hogwild!): a stochastic gradient method on shared memory. Workers each process their own mini-batch (mini-batch 1, 2, 3, ...) and update x asynchronously, without locks, which saves a lot of time compared with waiting for a synchronized step.
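A minimal Hogwild!-style sketch (my own illustration, not code from the slides): the parameter vector lives in shared memory and each worker thread applies stochastic gradient updates to it without any locking, touching only the coordinates its sample uses. Because of Python's GIL this shows the lock-free access pattern rather than a real speedup.

```python
import threading
import numpy as np

def hogwild_sgd(A, b, n_workers=4, step=1e-2, epochs=5):
    # x is shared by all workers; updates are applied without locks
    n, d = A.shape
    x = np.zeros(d)

    def worker(sample_ids):
        rng = np.random.default_rng()
        for _ in range(epochs):
            for i in rng.permutation(sample_ids):
                a_i = A[i]
                nz = np.flatnonzero(a_i)                  # coordinates this sample touches
                g = (a_i[nz] @ x[nz] - b[i]) * a_i[nz]    # stochastic gradient on its support
                x[nz] -= step * g                         # unsynchronized write to shared x

    shards = np.array_split(np.arange(n), n_workers)
    threads = [threading.Thread(target=worker, args=(ids,)) for ids in shards]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return x
```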
22-26 Centralized asynchronous coordinate descent. Communicating the full parameter vector x between the machines (each holding its own mini-batch) and the central server is expensive, so use coordinate descent and transmit one coordinate update at a time: each machine sends only the coordinates that have changed. The step size must be decreased for convergence (it is effectively stochastic coordinate descent).
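A minimal sketch of the "send only what changed" idea (illustrative names, not from the slides): a worker performs a single-coordinate update on its mini-batch and ships just the (index, delta) pair to the server, instead of the whole vector x.

```python
import numpy as np

def worker_coordinate_step(A_batch, b_batch, x_local, j, step):
    # one coordinate update on this worker's mini-batch; the message is tiny
    grad_j = A_batch[:, j] @ (A_batch @ x_local - b_batch)
    delta = -step * grad_j
    x_local[j] += delta
    return (j, delta)                       # send one index and one float, not all of x

def server_apply(x_global, message):
    # parameter server applies a received coordinate delta
    j, delta = message
    x_global[j] += delta

# usage sketch: three mini-batches, one worker each, server keeps the global x
rng = np.random.default_rng(0)
A, b = rng.standard_normal((30, 6)), rng.standard_normal(30)
batches = np.array_split(np.arange(30), 3)
x_global = np.zeros(6)
x_locals = [x_global.copy() for _ in batches]
for t in range(100):
    for w, idx in enumerate(batches):
        j = rng.integers(6)
        msg = worker_coordinate_step(A[idx], b[idx], x_locals[w], j, step=0.05)
        server_apply(x_global, msg)         # only the changed coordinate crosses the network
# as the slide notes, the step size would be decreased over time, and the
# local copies of x would periodically be refreshed from the server
```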
27 Decentralized coordinate descent for sparse datasets. Least-squares problem min_x (1/2)||Ax - b||^2 with a sparse matrix A; the coordinate update rule is x_j <- x_j - a_j^T(Ax - b)/||a_j||^2, where a_j is column j of A. The problem does not seem separable at first sight, since the update involves the full residual Ax - b, but when A is sparse most entries of the residual are unnecessary for any single coordinate update.
28-31 With a sparse A, each coordinate touches only the samples (rows) whose entries in that column are non-zero: in the example, coordinates 1 & 4 appear only in samples 2 & 3, coordinate 2 only in sample 1, and coordinate 3 only in sample 4.
32-34 Say you can only fit one sample in a machine: the machines holding samples 2 and 3, which share coordinates 1 & 4, run a mini centralized coordinate descent among themselves, while the machines holding samples 1 and 4 can update their coordinates independently.
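A minimal sketch of why sparsity makes this workable (illustrative code, assuming the same least-squares objective): updating coordinate j only needs the residual entries of the samples whose column-j entries are non-zero, so those are the only samples a machine has to hold or talk to.

```python
import numpy as np

def sparse_coordinate_descent(A, b, iters=2000, seed=0):
    # coordinate descent for (1/2)||Ax - b||^2 exploiting sparsity of A
    n, d = A.shape
    x = np.zeros(d)
    r = A @ x - b                                            # residual, starts at -b
    supports = [np.flatnonzero(A[:, j]) for j in range(d)]   # samples touching coordinate j
    rng = np.random.default_rng(seed)
    for _ in range(iters):
        j = rng.integers(d)
        rows = supports[j]
        if rows.size == 0:
            continue
        a_j = A[rows, j]
        delta = -(a_j @ r[rows]) / (a_j @ a_j)               # exact minimization along coordinate j
        x[j] += delta
        r[rows] += delta * a_j                               # only these samples' residuals change
    return x

# hypothetical matrix matching the slides' example: coordinates 1 & 4 live only in
# samples 2 & 3, coordinate 2 only in sample 1, coordinate 3 only in sample 4
A = np.array([[0., 2., 0., 0.],
              [1., 0., 0., 1.],
              [3., 0., 0., 2.],
              [0., 0., 1., 0.]])
b = np.array([1., 2., 3., 4.])
x_hat = sparse_coordinate_descent(A, b)
```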
35-43 Decentralized gradient descent. Distribute the data across machines when we do not want to update a centralized vector x: each machine has its own data samples (in the sparse-A example, machine 1 holds sample 1, ..., machine 4 holds sample 4) and its own copy of the parameter vector.
44 Update rule: each machine takes a gradient step on its own samples and then exchanges parameters only with the machines it shares non-zero coordinates with. In the example, machine 1 needs no communication, machine 2 communicates with machine 3, machine 3 communicates with machine 2, and machine 4 needs no communication (see the sketch below).
45 This gives similar convergence to gradient descent with central communication; the rate depends on the sparsity of the dataset.
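A minimal sketch of decentralized gradient descent (my own illustration, reusing the hypothetical sparse example above with one sample per machine): every machine keeps its own copy of x, takes a local gradient step, and then averages its copy only with its neighbours. For simplicity the full local copies are averaged here, whereas the slides' variant would exchange only the shared coordinates.

```python
import numpy as np

def decentralized_gd_round(samples, xs, neighbours, step):
    # 1) each machine takes a gradient step on its own sample, no central server
    stepped = []
    for (a_k, b_k), x_k in zip(samples, xs):
        grad = (a_k @ x_k - b_k) * a_k
        stepped.append(x_k - step * grad)
    # 2) each machine averages its copy with its neighbours only (gossip step)
    mixed = []
    for k, x_k in enumerate(stepped):
        group = [x_k] + [stepped[j] for j in neighbours[k]]
        mixed.append(sum(group) / len(group))
    return mixed

# slides' example with 0-indexed machines: machine 1 (sample 2) and machine 2
# (sample 3) share coordinates 1 & 4, so they are neighbours; machines 0 and 3
# need no communication at all
A = np.array([[0., 2., 0., 0.],
              [1., 0., 0., 1.],
              [3., 0., 0., 2.],
              [0., 0., 1., 0.]])
b = np.array([1., 2., 3., 4.])
samples = [(A[k], b[k]) for k in range(4)]
neighbours = {0: [], 1: [2], 2: [1], 3: []}
xs = [np.zeros(4) for _ in range(4)]
for _ in range(300):
    xs = decentralized_gd_round(samples, xs, neighbours, step=0.1)
```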
46 Summary. Using parallel and distributed systems is important for speeding up optimization on big data. Synchronized deterministic gradient descent: optimization halts on a link failure or when one machine is slow at processing its data. Centralized asynchronous gradient descent: communicating the vector x is costly. Centralized asynchronous coordinate descent: centralization still adds communication overhead. Decentralized asynchronous coordinate descent: helpful for sparse datasets, no communication between machines. Decentralized gradient descent: helpful for sparse datasets, machines communicate with only a few neighbours.