Distributed Structured Prediction for Big Data


A. G. Schwing (ETH Zurich)    T. Hazan (TTI Chicago)    M. Pollefeys (ETH Zurich)    R. Urtasun (TTI Chicago)

Abstract

The biggest limitations of learning structured predictors from big data are the computation time and the memory demands. In this paper, we propose to handle these big data problems efficiently by distributing and parallelizing the resource requirements. We present a distributed structured prediction learning algorithm for large scale models that cannot be handled effectively by a single cluster node. Importantly, convergence and optimality guarantees of recently developed algorithms are preserved while keeping between-node communication low.

1 Introduction

In the past few years, structured models have become an important tool in domains such as natural language processing, computer vision and computational biology. The growing variability within data sets requires an increasing expressiveness that is achieved by modeling the influence of more and more variables. Hence memory and computational limits of desktop computers are reached quickly. In computer vision, for example, uncompressed full HD video streams produce more than a hundred megabytes of data per second.

Several structured prediction frameworks have been developed in the past. Notable examples are conditional random fields (CRFs) [2], structured support vector machines (SSVMs) [7, 8] and their generalizations [1]. All three frameworks aim at minimizing a regularized surrogate loss. While CRFs and SSVMs are the method of choice for tree-structured or sub-modular models, approximations, e.g., [1], are in general required. Note that all three approaches are inherently parallel in the training data. But none of the aforementioned frameworks addresses the underlying memory limitations of large scale models arising from real-world problems. This is important since nowadays big data tasks of increasing volume, variety and velocity call for large models.

Hence we are interested in making structured prediction algorithms practical for large scale scenarios. We present an algorithm which distributes and parallelizes the computation and memory requirements while reducing communication between cluster nodes and conserving convergence and optimality guarantees. Our approach is based on the principle of dual decomposition, i.e., computation is done in parallel by partitioning the model and imposing agreement on variables that are required to be consistent. Thus, we split the graph-based optimization program into several local optimization problems solved in parallel, and cluster nodes exchange information occasionally to enforce consistency.

2 A Review on Structured Prediction

Let us first consider a setting where X denotes the input space (e.g., a video or a document) and S is a structured label space (e.g., a video segmentation or a set of parse trees). Further, let $\phi : X \times S \rightarrow \mathbb{R}^F$ denote a mapping from the input and label space to an F-dimensional feature space. When using structured prediction approaches, we are commonly interested in finding the parameters $w \in \mathbb{R}^F$ of a log-linear model $p_w(s \mid x) \propto \exp\!\big(w^\top \phi(x, s)/\epsilon\big)$ with covariance $\epsilon$, which best describes the possible labelings $s \in S$ of $x \in X$. For training, we are given a data set $D = \{(x_i, s_i)\}_{i=1}^N$ containing N pairs, each composed of an input space object $x \in X$ and a label space object $s \in S$.
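As a toy illustration of such a log-linear model (not part of the paper; the chain-structured feature map, the binary label space and all names below are invented for the example), the following Python snippet enumerates a tiny label space and evaluates $p_w(s \mid x) \propto \exp(w^\top\phi(x,s)/\epsilon)$ by explicit normalization:

import numpy as np

def feature_map(x, s):
    """Toy joint feature map phi(x, s) in R^F with F = 2 (illustrative only)."""
    # x: 1-D array of observations, s: tuple of binary labels of the same length
    unary = sum(x[i] if s[i] == 1 else -x[i] for i in range(len(s)))
    pairwise = sum(1.0 if s[i] == s[i + 1] else -1.0 for i in range(len(s) - 1))
    return np.array([unary, pairwise])

def log_linear_probs(w, x, labelings, eps=1.0):
    """p_w(s|x) proportional to exp(w^T phi(x, s) / eps), normalized over `labelings`."""
    scores = np.array([w @ feature_map(x, s) / eps for s in labelings])
    scores -= scores.max()                     # numerical stability
    p = np.exp(scores)
    return p / p.sum()

if __name__ == "__main__":
    x = np.array([0.5, -1.2, 0.3])
    labelings = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
    w = np.array([1.0, 0.2])
    print(log_linear_probs(w, x, labelings))   # distribution over all 8 labelings

For real structured label spaces this explicit enumeration is infeasible, which is what the approximations reviewed below address.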

In order to find the model parameters w that best describe the annotations, we are often able to construct a task loss $\ell_{(x,s)}(\hat s)$ which measures the fitness of any labeling $\hat s \in S$. The vector $v = \sum_{(x,s)\in D} \phi(x,s)$ denotes the empirical mean, and we commonly assume independent and identically distributed data in addition to a prior $p(w) \propto \exp\!\big(-\tfrac{C}{\epsilon p}\|w\|_p^p\big)$. During learning we minimize the negative loss-augmented data-log-posterior, i.e.,

$$\min_w \;\sum_{(x,s)\in D} \epsilon \ln \sum_{\hat s\in S}\exp\!\left(\frac{\ell_{(x,s)}(\hat s) + w^\top\phi(x,\hat s)}{\epsilon}\right) \;-\; w^\top v \;+\; \frac{C}{p}\,\|w\|_p^p. \qquad (1)$$

Note that the covariance $\epsilon = 1$ recovers the CRF objective [2], while $\epsilon \to 0$ smoothly approximates the max-function, hence recovering the SSVM formulation [7, 8]. Due to the sum over all label space configurations $\hat s \in S$ being generally exponential in size, the unconstrained minimization problem given in Eq. (1) is NP-hard in general.

Elements $\phi_r$ of the feature vector $\phi$ often describe interactions between subsets of random variables, i.e., $\phi_r(x,s) = \sum_{i \in V_{r,x}} \phi_{r,i}(x,s_i) + \sum_{\alpha \in E_{r,x}} \phi_{r,\alpha}(x,s_\alpha)$. Note that a labeling $s = (s_i)_{i \in V} \in S$ is a tuple subsuming |V| variables, each having $|S_i|$ discrete states. The sparse interactions induced by the feature functions $\phi_r(x,s)$ are visually depicted by a factor graph $G_{r,x}$, with the individual variables $i \in V_{r,x}$ of sample (x, s) being vertices that are connected to factors $\alpha \in E_{r,x}$ iff vertex i is a neighbor of factor $\alpha \in E_{r,x}$. The union graph $G_x = \cup_r G_{r,x}$ describes the relationship over all features r, and we say that vertex $i \in V_x = \cup_r V_{r,x}$ is a neighbor of factor $\alpha \in E_x = \cup_r E_{r,x}$ if variable $s_i$ is part of the variable set $s_\alpha$ in any of the features of sample (x, s), i.e., $i \in N(\alpha)$. Conversely, all factors that variable i participates in are referred to by $\alpha \in N(i)$.

Approximations [1] are one way to deal with the previously outlined intractability. The dual to the program given in Eq. (1) is described by means of joint distributions ranging, for each data sample (x, s), over the label space S. We describe this probability by its variable and factor marginals $b_{(x,s),i}(s_i)$ and $b_{(x,s),\alpha}(s_\alpha)$ and approximate the entropies of those joint distributions by their marginal entropies $H(b_{(x,s),i})$ and $H(b_{(x,s),\alpha})$, using chosen counting numbers $c_i$ and $c_\alpha$ for better approximation accuracy. To ensure consistency, we require the beliefs to fulfill marginalization constraints corresponding to the structure of the graph $G_x$ while maximizing the approximated dual cost function

$$\sum_{(x,s)}\left[\sum_i \epsilon c_i H(b_{(x,s),i}) + \sum_\alpha \epsilon c_\alpha H(b_{(x,s),\alpha}) + \sum_{i,\hat s_i} b_{(x,s),i}(\hat s_i)\,\ell_{(x,s),i}(\hat s_i) + \sum_{\alpha,\hat s_\alpha} b_{(x,s),\alpha}(\hat s_\alpha)\,\ell_{(x,s),\alpha}(\hat s_\alpha)\right]$$
$$-\;\frac{C}{q}\sum_r\left|\frac{1}{C}\left(\sum_{(x,s),\, i\in V_{r,x},\, \hat s_i} b_{(x,s),i}(\hat s_i)\,\phi_{r,i}(x,\hat s_i) + \sum_{(x,s),\, \alpha\in E_{r,x},\, \hat s_\alpha} b_{(x,s),\alpha}(\hat s_\alpha)\,\phi_{r,\alpha}(x,\hat s_\alpha) - v_r\right)\right|^{q}, \qquad (2)$$

with $1/p + 1/q = 1$. Since the first term in both the original primal (Eq. (1)) and the approximated dual (Eq. (2)) is a sum ranging over the training samples, computation of the gradient is inherently parallel in the data set elements. With real-world models $G_x$ often being too large for the resources provided by a single cluster node, we next discuss a possibility to partition the optimization task while preserving the original convergence properties.
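The role of $\epsilon$ in Eq. (1) can also be illustrated numerically before moving on (a minimal sketch with random toy data; the eight candidate labelings replace the exponential sum, and the helper name `surrogate` is ours, not the paper's): as $\epsilon$ shrinks, the smoothed term approaches the loss-augmented maximum of the SSVM objective.

import numpy as np

def surrogate(w, phi, loss, v, eps, C=1.0, p=2):
    """eps-smoothed loss-augmented objective of Eq. (1) for a single sample.

    phi : (K, F) array, feature vector of each of the K candidate labelings
    loss: (K,) array, task loss of each candidate labeling
    v   : (F,) array, feature vector of the ground-truth labeling
    """
    scores = (loss + phi @ w) / eps
    smoothed_max = eps * (scores.max() + np.log(np.exp(scores - scores.max()).sum()))
    return smoothed_max - w @ v + (C / p) * np.sum(np.abs(w) ** p)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    phi = rng.normal(size=(8, 3))      # 8 candidate labelings, F = 3 features
    loss = rng.uniform(size=8)
    v = phi[0]                         # pretend labeling 0 is the annotation
    w = rng.normal(size=3)
    for eps in (1.0, 0.1, 1e-3):       # CRF-like smoothing -> SSVM-like max
        print(eps, surrogate(w, phi, loss, v, eps))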
3 Distributed Structured Prediction

To cope with current model size needs, we are interested in an algorithm that maximizes Eq. (2) while leveraging the sparsity given by the graph structure $G_x$. In addition, we partition the vertices of the model such that each of the distributed cluster nodes solves an independent program defined on a subgraph induced by the variables of each partition (Fig. 1(a)). To ensure consistency for the global model, the distributed solutions are combined by exchanging information between connected subgraphs. The distributed structured prediction algorithm extends existing frameworks by introducing a high-level factor graph (Fig. 1(b)) describing the cluster node interactions. Occasional exchange of information corresponds to messages being sent on this factor graph. It is important to note that we do not require an exchange of information at every iteration.

More concretely, let $P_x$ be a partition of all the vertices $i \in V_x$ for sample (x, s) into disjoint subsets $n_x \in P_x$, each containing the variables $i \in n_x$ that are assigned to cluster node $n_x$. The vertices assigned to node $n_x \in P_x$ induce a subgraph $G_{x,n_x}$.
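A minimal sketch of this partitioning step (the data structures and names below are illustrative, not the paper's implementation): given a variable-to-node assignment, it induces the per-node subgraphs and collects the factors whose scope straddles several nodes, i.e., the factors for which consistency will have to be enforced.

def induce_subgraphs(variables, factors, partition):
    """variables: list of variable ids
    factors:   dict factor id -> set of variable ids in its scope (N(alpha))
    partition: dict variable id -> cluster node id
    Returns per-node variable/factor sets and the factors shared between nodes."""
    nodes = {}
    for i in variables:
        nodes.setdefault(partition[i], {"vars": set(), "factors": set()})
        nodes[partition[i]]["vars"].add(i)
    shared = {}
    for alpha, scope in factors.items():
        owners = {partition[i] for i in scope}
        for n in owners:
            nodes[n]["factors"].add(alpha)
        if len(owners) > 1:
            shared[alpha] = owners    # N_P(alpha): nodes that hold a copy of b_alpha
    return nodes, shared

if __name__ == "__main__":
    # a 2x2 grid-structured model split onto a "left" and a "right" cluster node
    variables = [0, 1, 2, 3]
    factors = {"a01": {0, 1}, "a23": {2, 3}, "a02": {0, 2}, "a13": {1, 3}}
    partition = {0: "left", 2: "left", 1: "right", 3: "right"}
    nodes, shared = induce_subgraphs(variables, factors, partition)
    print(nodes)
    print(shared)   # factors a01 and a23 end up replicated on both nodes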

Figure 1: (a): Two samples, each distributed onto two cluster nodes (indicated by color). (b): The cluster node factor graph for consistency messages. (c), (d): Convergence of the inference task w.r.t. iterations and time.

As before, this subgraph describes the marginalization constraints required to be enforced on cluster node $n_x$ for its assigned variable beliefs $b^{n_x}_{(x,s),i}(\hat s_i)$ $\forall (x,s),\, i \in n_x,\, \hat s_i$ and the factor beliefs $b^{n_x}_{(x,s),\alpha}(\hat s_\alpha)$ $\forall (x,s),\, i \in n_x,\, \alpha \in N(i),\, \hat s_\alpha$, i.e., $\sum_{\hat s_\alpha\setminus\hat s_i} b^{n_x}_{(x,s),\alpha}(\hat s_\alpha) = b^{n_x}_{(x,s),i}(\hat s_i)$. A factor $\alpha$ that is assigned to multiple subgraphs $G_{x,n_x}$ corresponds to a set of beliefs $b^{n_x}_{(x,s),\alpha}$, each of them optimized independently on the cluster nodes $n_x \in N_{P_x}(\alpha)$. Since these distributed beliefs originate from a single $b_{(x,s),\alpha}$ in Eq. (2), we are required to ensure consistency. Formally, we construct a factor graph $G_{P_x}$ with cluster nodes $n_x$ being the vertices that are connected to shared factors $\alpha$ iff $n_x \in N_{P_x}(\alpha)$. Conversely, we denote by $N_{P_x}(n_x)$ all factors $\alpha$ that are shared between multiple nodes, one of them being $n_x$. To keep the shared beliefs consistent, we add the constraints $b^{n_x}_{(x,s),\alpha}(\hat s_\alpha) = b_{(x,s),\alpha}(\hat s_\alpha)$ $\forall (x,s),\, \alpha,\, n_x \in N_{P_x}(\alpha),\, \hat s_\alpha$.

To ensure optimization of the cost function given in Eq. (2), we further need to balance the entropy $H(b_{(x,s),\alpha})$, the loss $\ell_{(x,s),\alpha}$ and the features $\phi_{r,\alpha}$ for those factors $\alpha$ that are distributed onto different cluster nodes. To this end, we let $\hat c_\alpha = c_\alpha / |N_{P_x}(\alpha)|$, $\hat\ell_{(x,s),\alpha} = \ell_{(x,s),\alpha} / |N_{P_x}(\alpha)|$ and $\hat\phi_{r,\alpha} = \phi_{r,\alpha} / |N_{P_x}(\alpha)|$ for all shared factors. For the remaining factors, the quantities augmented by the hat symbol equal the original quantities. Consequently, we obtain the following maximization, equivalent to Eq. (2):

$$\sum_{(x,s),\, n_x\in P_x}\left[\sum_{i\in G_{x,n_x}} \epsilon c_i H(b^{n_x}_{(x,s),i}) + \sum_{\alpha\in G_{x,n_x}} \epsilon \hat c_\alpha H(b^{n_x}_{(x,s),\alpha}) + \sum_{i\in G_{x,n_x},\hat s_i} b^{n_x}_{(x,s),i}(\hat s_i)\,\ell_{(x,s),i}(\hat s_i) + \sum_{\alpha\in G_{x,n_x},\hat s_\alpha} b^{n_x}_{(x,s),\alpha}(\hat s_\alpha)\,\hat\ell_{(x,s),\alpha}(\hat s_\alpha)\right] - \frac{C}{q}\left\|\frac{z - v}{C}\right\|_q^q, \qquad (3)$$

with marginalization constraints $\sum_{\hat s_\alpha\setminus\hat s_i} b^{n_x}_{(x,s),\alpha}(\hat s_\alpha) = b^{n_x}_{(x,s),i}(\hat s_i)$ $\forall (x,s),\, n_x,\, i,\, \hat s_i,\, \alpha\in N(i)$, consistency constraints $b^{n_x}_{(x,s),\alpha}(\hat s_\alpha) = b_{(x,s),\alpha}(\hat s_\alpha)$ $\forall (x,s),\, n_x,\, \alpha\in N_{P_{(x,s)}}(n_x),\, \hat s_\alpha$, and the variable $z_r = \sum_{(x,s),n_x,i,\hat s_i} b^{n_x}_{(x,s),i}(\hat s_i)\,\phi_{r,i}(x,\hat s_i) + \sum_{(x,s),n_x,\alpha,\hat s_\alpha} b^{n_x}_{(x,s),\alpha}(\hat s_\alpha)\,\hat\phi_{r,\alpha}(x,\hat s_\alpha)$, $r \in \{1,\dots,F\}$.

We would like to utilize the structure of the graph to obtain memory efficient and fast algorithms. Since the structure is employed to express the marginalization constraints, the dual program of Eq. (3), with its Lagrange multipliers $\lambda_{(x,s),i\to\alpha}(\hat s_i)$ corresponding to the marginalization constraints and $\nu_{(x,s),n_x\to\alpha}(\hat s_\alpha)$ originating from the consistency constraints between different cluster nodes, is our preferred task. The dual program to Eq. (3) is given by the following claim.

Claim 1. Set $\nu_{(x,s),n_x\to\alpha} = 0$ for every $\alpha \notin G_{P_x}$ and enforce $\sum_{n_x\in N_{P_{(x,s)}}(\alpha)} \nu_{(x,s),n_x\to\alpha}(\hat s_\alpha) = 0$ $\forall (x,s),\, \alpha,\, \hat s_\alpha$. With $\hat\phi_{(x,s),i}(\hat s_i) = \ell_{(x,s),i}(\hat s_i) + \sum_{r: i\in V_{r,x}} w_r\,\phi_{r,i}(x,\hat s_i)$ and $\hat\phi_{(x,s),\alpha}(\hat s_\alpha) = \hat\ell_{(x,s),\alpha}(\hat s_\alpha) + \sum_{r: \alpha\in E_{r,x}} w_r\,\hat\phi_{r,\alpha}(x,\hat s_\alpha)$, the dual program of the approximated structured prediction dual in Eq. (3) reads as

$$g = \sum_{(x,s),n_x,\, i\in G_{x,n_x}} \epsilon c_i \ln \sum_{\hat s_i}\exp\!\left(\frac{\hat\phi_{(x,s),i}(\hat s_i) - \sum_{\alpha\in N(i)}\lambda_{(x,s),i\to\alpha}(\hat s_i)}{\epsilon c_i}\right) - w^\top v + \frac{C}{p}\|w\|_p^p$$
$$+ \sum_{(x,s),n_x,\, \alpha\in G_{x,n_x}} \epsilon\hat c_\alpha \ln \sum_{\hat s_\alpha}\exp\!\left(\frac{\hat\phi_{(x,s),\alpha}(\hat s_\alpha) + \sum_{i\in N(\alpha)}\lambda_{(x,s),i\to\alpha}(\hat s_i) + \nu_{(x,s),n_x\to\alpha}(\hat s_\alpha)}{\epsilon\hat c_\alpha}\right). \qquad (4)$$

Proof: Follows [1, 5].

Looking at the distributed approximated primal given in Eq. (4) more closely, we note that both terms involving the two types of Lagrange multipliers are now preceded by sums ranging over the samples as well as the compute nodes $n_x$.
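The rebalancing of shared factors can be sketched as follows (illustrative only; factor tables are represented by plain scalars so the snippet stays self-contained): every quantity of a factor replicated on k cluster nodes is divided by k, so that summing the per-node programs recovers the undistributed objective of Eq. (2).

def rebalance_shared_factors(c, loss, phi, shared):
    """c, loss, phi: dicts mapping factor id -> counting number / loss / feature value
    shared: dict factor id -> set of cluster nodes holding a copy (N_P(alpha))"""
    c_hat, loss_hat, phi_hat = dict(c), dict(loss), dict(phi)
    for alpha, owners in shared.items():
        k = len(owners)
        c_hat[alpha] = c[alpha] / k
        loss_hat[alpha] = loss[alpha] / k
        phi_hat[alpha] = phi[alpha] / k
    return c_hat, loss_hat, phi_hat

if __name__ == "__main__":
    c = {"a01": 1.0, "a02": 1.0}
    loss = {"a01": 0.3, "a02": 0.1}
    phi = {"a01": 2.0, "a02": 1.5}
    shared = {"a01": {"left", "right"}}   # a01 is replicated on two nodes
    print(rebalance_shared_factors(c, loss, phi, shared))
    # a01's quantities are halved, a02 (not shared) is unchanged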

To derive an efficient algorithm we perform block-coordinate descent on this approximated primal. Fixing the consistency messages $\nu_{(x,s),n_x\to\alpha}(\hat s_\alpha)$, the optimal $\lambda_{(x,s),i\to\alpha}(\hat s_i)$, $\forall i\in G_{x,n_x}$, is computed without considering current information from other cluster nodes. A status update in the form of consistency messages $\nu_{(x,s),n_x\to\alpha}(\hat s_\alpha)$ is computed analytically by synchronizing messages between the different machines. The Armijo iterations performed to optimize $w_r$ require computation of the beliefs as well as the primal cost function value, which is done on the distributed nodes before another synchronization. The resulting block-coordinate descent and gradient steps are given by the following claim.

Claim 2. With
$$\mu_{(x,s),\alpha\to i}(\hat s_i) = \epsilon\hat c_\alpha \ln \sum_{\hat s_\alpha\setminus\hat s_i}\exp\!\left(\frac{\hat\phi_{(x,s),\alpha}(\hat s_\alpha) + \sum_{j\in N(\alpha)\setminus i}\lambda_{(x,s),j\to\alpha}(\hat s_j) + \nu_{(x,s),n_x\to\alpha}(\hat s_\alpha)}{\epsilon\hat c_\alpha}\right),$$
the block-coordinate steps in $\lambda$ and $\nu$ and the gradient in $w_r$ are
$$\lambda_{(x,s),i\to\alpha}(\hat s_i) \leftarrow \frac{\hat c_\alpha}{c_i + \sum_{\hat\alpha\in N(i)}\hat c_{\hat\alpha}}\Big(\hat\phi_{(x,s),i}(\hat s_i) + \sum_{\beta\in N(i)}\mu_{(x,s),\beta\to i}(\hat s_i)\Big) - \mu_{(x,s),\alpha\to i}(\hat s_i),$$
$$\nu_{(x,s),n_x\to\alpha}(\hat s_\alpha) \leftarrow \frac{1}{|N_{P_{(x,s)}}(\alpha)|}\sum_{i\in N(\alpha)}\lambda_{(x,s),i\to\alpha}(\hat s_i) - \sum_{i\in N(\alpha)\cap n_x}\lambda_{(x,s),i\to\alpha}(\hat s_i),$$
$$\frac{\partial g}{\partial w_r} = \sum_{(x,s),n_x,i,\hat s_i} b^{n_x}_{(x,s),i}(\hat s_i)\,\phi_{r,i}(x,\hat s_i) + \sum_{(x,s),n_x,\alpha,\hat s_\alpha} b^{n_x}_{(x,s),\alpha}(\hat s_\alpha)\,\hat\phi_{r,\alpha}(x,\hat s_\alpha) - v_r + C\,|w_r|^{p-1}\operatorname{sgn}(w_r).$$
Proof: Follows [1, 5].

Since the order of the block-coordinate descent steps does not impact the convergence guarantees, we iteratively update the $\lambda$ messages within a cluster node and the model parameters $w_r$ before exchanging information between machines in the form of consistency messages. Note that updating the model parameters requires cluster nodes to exchange only the accumulated feature moments, while the size of the consistency messages depends on the size of the shared factors, which is commonly larger than a single real value.

4 Related Work and Discussion

Data parallel frameworks, like MapReduce, simplify the implementation of large-scale data processing but do not naturally support the development of efficient learning algorithms. One of the most notable publicly available engines working towards efficient distributed algorithms is GraphLab which, originally supporting only shared-memory environments [3], was recently extended to distributed environments [4]. However, minimization of the communication overhead between cluster nodes is not considered, which potentially reduces computational performance. Our recent work on a parallel inference task that explicitly minimizes the communication overhead was presented in [5]. Fig. 1(c) and Fig. 1(d), taken from [5], show the convergence of an inference task w.r.t. iterations and time when communicating between machines every 1, 10, . . . , 100 iterations. Although convergence in terms of iterations is best when transmitting information frequently, the communication overhead reduces wall-clock performance when exchanging variables often. The drop in performance depends on the graph connectivity and clique size (e.g., a common pairwise 4-connected grid in our case) and the cluster infrastructure (LAN or InfiniBand connection). Since learning involves inference, a similar time dependence is expected.
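The communication schedule just discussed (cheap local iterations, synchronization only every few iterations) can be sketched as follows. The snippet only mimics the schedule: the placeholder local update and the averaging step stand in for the lambda/w updates and the nu exchange of Claim 2 and are not the paper's actual computations.

import numpy as np

class Worker:
    """Toy stand-in for one cluster node; it only stores the beliefs of the
    factors it shares with other nodes (real updates are not reproduced here)."""
    def __init__(self, seed):
        self.shared_beliefs = np.random.default_rng(seed).uniform(size=4)

    def local_updates(self):
        # placeholder for the lambda message updates and the w gradient step
        self.shared_beliefs += 0.01 * (0.5 - self.shared_beliefs)

def synchronize(workers):
    """Consistency step: drive the replicated beliefs of all workers towards
    agreement (here simply by averaging them)."""
    avg = np.mean([w.shared_beliefs for w in workers], axis=0)
    for w in workers:
        w.shared_beliefs = avg.copy()

def train(workers, iterations=100, sync_every=10):
    for t in range(iterations):
        for w in workers:
            w.local_updates()              # no inter-machine communication
        if (t + 1) % sync_every == 0:
            synchronize(workers)           # occasional exchange between machines

if __name__ == "__main__":
    workers = [Worker(seed) for seed in range(2)]
    train(workers, iterations=100, sync_every=10)
    print(workers[0].shared_beliefs)       # identical across workers after the last sync
    print(workers[1].shared_beliefs)

Increasing sync_every trades per-iteration progress for lower communication overhead, which is exactly the behavior shown in Fig. 1(c) and (d).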
5 Conclusion

We have presented a distributed structured prediction algorithm that is able to process models which exceed the resource restrictions of a single cluster node. Our approach divides the computation and memory requirements onto multiple machines, while convergence and optimality guarantees are preserved by introducing a new type of consistency message. Our algorithm benefits particularly from the availability of multiple cluster nodes, but it is also useful on a single machine since we derive explicit rules for swapping parts of the model between memory and hard disk. Extensions towards latent variable models [6] and towards automatically finding an effective partitioning of graphical models are subject to future research.

References

[1] T. Hazan and R. Urtasun. A Primal-Dual Message-Passing Algorithm for Approximated Large Scale Structured Prediction. In Proc. NIPS, 2010.
[2] J. Lafferty, A. McCallum, and F. Pereira. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In Proc. ICML, 2001.
[3] Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, and J. M. Hellerstein. GraphLab: A New Parallel Framework for Machine Learning. In Proc. UAI, 2010.
[4] Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, and J. M. Hellerstein. Distributed GraphLab: A Framework for Machine Learning in the Cloud. In Proc. Very Large Data Bases, 2012.
[5] A. G. Schwing, T. Hazan, M. Pollefeys, and R. Urtasun. Distributed Message-Passing for Large-Scale Graphical Models. In Proc. CVPR, 2011.
[6] A. G. Schwing, T. Hazan, M. Pollefeys, and R. Urtasun. Efficient Structured Prediction with Latent Variables for General Graphical Models. In Proc. ICML, 2012.
[7] B. Taskar, C. Guestrin, and D. Koller. Max-Margin Markov Networks. In Proc. NIPS, 2003.
[8] I. Tsochantaridis, T. Hofmann, T. Joachims, and Y. Altun. Support Vector Learning for Interdependent and Structured Output Spaces. In Proc. ICML, 2004.
