BIG DATA PROBLEMS AND LARGE-SCALE OPTIMIZATION: A DISTRIBUTED ALGORITHM FOR MATRIX FACTORIZATION
|
|
|
- Kimberly Young
- 10 years ago
- Views:
Transcription
1 BIG DATA PROBLEMS AND LARGE-SCALE OPTIMIZATION: A DISTRIBUTED ALGORITHM FOR MATRIX FACTORIZATION Ş. İlker Birbil Sabancı University Ali Taylan Cemgil 1, Hazal Koptagel 1, Figen Öztoprak 2, Umut Şimşekli 1 1: Boğaziçi University, 2: Bilgi University Nottingham University March, 2015 Ş. İlker Birbil (Sabancı University) Big Data Optimization 1 / 22
2 LARGE-SCALE OPTIMIZATION AND MACHINE LEARNING Introduction Exploiting the Structure Need for Parallel Algorithms F. Öztoprak Ş. İlker Birbil (Sabancı University) Big Data Optimization 2 / 22
3 DATA SCIENCE Ş. İlker Birbil (Sabancı University) Big Data Optimization 3 / 22
4 GRADUATE COURSES Ş. İlker Birbil (Sabancı University) Big Data Optimization 4 / 22
5 NONLINEAR OPTIMIZATION Introduction Exploiting the Structure Need for Parallel Algorithms Typically, Nonlinear a nonlinear Programming optimization problem (NLP) isproblem defined as minimize f (x) x R n Covers optimization problems subject to c i(x) = 0, i E, min c f(x) x2x i(x) 0, i I, where where f : R n X = {x 2 R R is the n : g(x) apple 0}, the functions g objective function and c i : R n : R n! R R for m, f : R i E n! R are I are the continuous and not necessarily linear. constraint functions. At least one of these functions is nonlinear. (1) x* Ş. İlker Birbil (Sabancı University) Big Data Optimization 5 / 22
6 ROLE OF NONLINEAR OPTIMIZATION Introduction Exploiting the Structure Need for Parallel Algorithms Molecular Biology (Protein Folding) Engineering Design (Machining) Global Optimization Finance (Risk Management) Derivative Free Optimization Nonlinear Stochastic Prog. Statistics Large Scale Core NLP Computer Science Applied Mathematics Convex Optimization Mixed Integer NLP Operations Research Machine Learning (Image Recovery) PDE Constrained Optimization Production (Chemical Complex Design) Health (Cancer Treatment) F. Öztoprak Ş. İlker Birbil (Sabancı University) Big Data Optimization 6 / 22
7 OUR RESEARCH GROUP Three faculty members, four PhD students, three MSc students (Coupled) Tensor or matrix factorization Distributed and parallel algorithms: Bayesian inference Nonlinear optimization Processor 1 Core 1 Core 2 Core 3 Core 4 Memory 1 Processor 2 Core 1 Core 2 Core 3 Core 4 Memory 2 Processor 3 Core 1 Core 2 Core 3 Core 4 Memory 3 Ş. İlker Birbil (Sabancı University) Big Data Optimization 7 / 22
8 OUR RESEARCH GROUP Three faculty members, four PhD students, three MSc students (Coupled) Tensor or matrix factorization Distributed and parallel algorithms: Bayesian inference Nonlinear optimization Processor 1 Core 1 Core 2 Core 3 Core 4 Memory 1 Processor 2 Core 1 Core 2 Core 3 Core 4 Memory 2 Processor 3 Core 1 Core 2 Core 3 Core 4 Memory 3 Ş. İlker Birbil (Sabancı University) Big Data Optimization 7 / 22
9 LINK PREDICTION VIA TENSOR FACTORIZATION X 1(i, j, k): if user i visits location j and performs activity k X 2(i, m): frequency of a user i visiting location m X j(j, n): points of interest for a location j Ş. İlker Birbil (Sabancı University) Big Data Optimization 8 / 22
10 TENSOR FACTORIZATION Matrix & Tensor Factorizations Tensor Factorization Tensor Factorization Tensor Multidimensional Array (X i,j,k,...) Extension of matrix factorizations to higher-order tensors Tensor factorizations are used to extract the underlying factors in higher-order data I Tensor Multidimensional Array I Used toi extract the underlying factors in higher-order data sets sets Tensor Factorisation + 7/1 X (i, j, k) X (i, r)z 2(j, r)z 3(k, r) X(i, j, k) r (i, r)z 2(j, r)z 3(k, r) r Cemgil Probabilistic Latent Tensor Factorisation. IFG19SabanciUniversity 14 Ş. İlker Birbil (Sabancı University) Big Data Optimization 9 / 22
11 X X 12 Z 2 MATRIX FACTORIZATION X (, ) X X(, ) i Z(, 1 ( i)z,i)z 2 (i, 2 ) (i, ) An inverse problem: Estimate i and Z 2 given data matrix X assuming X Z 2 X M "! ˆX Z 2 #! " #! " able error Overall function optimization subject problem to constraints (e.g., nonnegativity, ble error function subject to constraints (e.g., nonnegativity, minimize X Z 2 2 F subject to, Z 2 Z, (, Z 2 ) =argmind(x Z 2 )+ R(, Z 2 ),Z 2 where Z is the feasible region. When Z is the first orthant, we have the 1,Z 2 nonnegative ) = arg matrixmin factorization D(X Z problem. 1 Z 2 )+ R(,Z 2 ),Z 2 Ş. İlker Birbil (Sabancı University) Big Data Optimization 10 / 22
12 MOVIE RECOMMENDATION minimize X Z 2 2 F subject to 0, Z 2 0 Ş. İlker Birbil (Sabancı University) Big Data Optimization 11 / 22
13 DISTRIBUTED IMPLEMENTATION Time Slot 1: Perform X 12 (1,:) X 12 = (1,:)Z 2 (:,2) on P1 X 31 Time Slot 2: X 23 (2,:) (3,:) x Z 2 (:,1) Z 2 (:,2) Z 2 (:,3) X 23 = (2,:)Z 2 (:,3) on P2 X 31 = (3,:)Z 2 (:,1) on P3 by employing IPA. X 11 (1,:) X 22 (2,:) x Z 2 (:,1) Z 2 (:,2) Z 2 (:,3) X 33 (3,:) Time Slot 3: X 13 (1,:) X 21 (2,:) x Z 2 (:,1) Z 2 (:,2) Z 2 (:,3) X 32 (3,:) Time Slot 4:... Ş. İlker Birbil (Sabancı University) Big Data Optimization 12 / 22
14 REFORMULATION 1" minimize subject to X Z 2 2 F, Z 2 Z 1" 2" 3" Z 2 4" 5" 6" z."."." 6" GENERIC PROBLEM minimize f i(z) subject to i {1,,m} z ζ Ş. İlker Birbil (Sabancı University) Big Data Optimization 13 / 22
15 DISTRUBUTED OPTIMIZATION Time Slot 1: X 31 Time Slot 2: X 11 Time Slot 3: X 21 Time Slot 4:... X 12 X 22 X 32 X 23 X 33 X 13 (1,:) (2,:) (3,:) (1,:) (2,:) (3,:) (1,:) (2,:) (3,:) x x x Z 2 (:,1) Z 2 (:,2) Z 2 (:,3) Z 2 (:,1) Z 2 (:,2) Z 2 (:,3) Z 2 (:,1) Z 2 (:,2) Z 2 (:,3) Perform X 12 = (1,:)Z 2 (:,2) on P1 X 23 = (2,:)Z 2 (:,3) on P2 X 31 = (3,:)Z 2 (:,1) on P3 by employing IPA. " 2" 3" 1 Z 2 4" 5" 6" z 1"."."." 6" minimize subject to i {1,,m} z ζ f i(z) At each time slot k, we solve a subset S k of the component functions f i, i {1, 2,, m} We make sure that each data block is visited after c passes (c = 3 in the figure) Ş. İlker Birbil (Sabancı University) Big Data Optimization 14 / 22
16 INCREMENTAL QUASI-NEWTON ALGORITHM Unlike gradient-based methods, the proposed algorithm uses second order information through Hessian approximation (L-BFGS quasi-newton method) The proposed algorithm visits each subset of component functions in the same order (incremental and deterministic) We do not assume convexity of the function (matrix factorization can be solved) CORE STEP Solve a quadratic approximation of the (partial) objective function: Q t k(z) = (z z k) Sk f (z k) (z zk) H t(z z k) βt z zk 2. Ş. İlker Birbil (Sabancı University) Big Data Optimization 15 / 22
17 INCREMENTAL QUASI-NEWTON ALGORITHM (CONT D) Q t k(z) = (z z k) Sk f (z k) (z zk) H t(z z k) βt z zk 2. Algorithm 1: HAMSI input: y 0,β 1 1 for t = 0, 1, 2, do 2 z 1 = y t 3 Compute H t 4 for k = 1, 2,, c do 5 Choose a subset S k {1,, m} 6 Compute Sk f (z k) 7 z k+1 = arg min z ζ Q t k(z) 8 end 9 y t+1 = z c+1 10 Set β t+1 β t 11 end Ş. İlker Birbil (Sabancı University) Big Data Optimization 16 / 22
18 CONVERGENCE ANALYSIS (ζ = R n ) ASSUMPTIONS 1. Hessians of the component functions and (H t + β ti) are uniformly bounded: i S k 2 i f (y t) L t L S k, y t. 2. The smallest eigenvalue of (H t + β ti) is bounded away from zero: U t (H t + β ti) 1 M t t. 3. The gradient norms are uniformly bounded: Sk f (y t) C S k, y t. Ş. İlker Birbil (Sabancı University) Big Data Optimization 17 / 22
19 CONVERGENCE ANALYSIS (CONT D) LEMMA At each outer iteration t of Algorithm 1 and for k = 1,, c, we have k 1 δ k = Sk f (z k) Sk f (y t) L tm t (1 + L tm t) k 1 j Sj f (y t) j=1 THEOREM Consider the iterates y t produced by Algorithm 1. Then, all accumulation points of {y t} are stationary points of the generic problem. Ş. İlker Birbil (Sabancı University) Big Data Optimization 18 / 22
20 CONVERGENCE ANALYSIS (CONT D) LEMMA At each outer iteration t of Algorithm 1 and for k = 1,, c, we have k 1 δ k = Sk f (z k) Sk f (y t) L tm t (1 + L tm t) k 1 j Sj f (y t) j=1 THEOREM Consider the iterates y t produced by Algorithm 1. Then, all accumulation points of {y t} are stationary points of the generic problem. COROLLARY Algorithm 1 solves the matrix factorization problem. Ş. İlker Birbil (Sabancı University) Big Data Optimization 18 / 22
21 PRELIMINARY EXPERIMENTS - SETUP Linux cluster with 15 nodes Each node has 8, Intel Xeon 2.50 GHz processor with 16 GB RAM This setting allows execution of 120 parallel tasks in parallel MovieLens data (1M) is used for our preliminary experiments Ş. İlker Birbil (Sabancı University) Big Data Optimization 19 / 22
22 PRELIMINARY EXPERIMENTS FIGURE: Objective function values Ş. İlker Birbil (Sabancı University) Big Data Optimization 20 / 22
23 PRELIMINARY EXPERIMENTS (CONT D) FIGURE: Root mean square error Ş. İlker Birbil (Sabancı University) Big Data Optimization 21 / 22
24 CONCLUDING REMARKS Ş. İlker Birbil (Sabancı University) Big Data Optimization 22 / 22
25 CONCLUDING REMARKS SUMMARY A promising research path at the intersection of operations research and computer science Ş. İlker Birbil (Sabancı University) Big Data Optimization 22 / 22
26 CONCLUDING REMARKS SUMMARY A promising research path at the intersection of operations research and computer science A new distributed and parallel implementation for matrix factorization Ş. İlker Birbil (Sabancı University) Big Data Optimization 22 / 22
27 CONCLUDING REMARKS SUMMARY A promising research path at the intersection of operations research and computer science A new distributed and parallel implementation for matrix factorization A generic analysis that could be used for showing convergence of other algorithms Ş. İlker Birbil (Sabancı University) Big Data Optimization 22 / 22
28 CONCLUDING REMARKS SUMMARY A promising research path at the intersection of operations research and computer science A new distributed and parallel implementation for matrix factorization A generic analysis that could be used for showing convergence of other algorithms FUTURE RESEARCHJ Extensive computational study Ş. İlker Birbil (Sabancı University) Big Data Optimization 22 / 22
29 CONCLUDING REMARKS SUMMARY A promising research path at the intersection of operations research and computer science A new distributed and parallel implementation for matrix factorization A generic analysis that could be used for showing convergence of other algorithms FUTURE RESEARCHJ Extensive computational study Stochastic version of the proposed algorithm Ş. İlker Birbil (Sabancı University) Big Data Optimization 22 / 22
30 CONCLUDING REMARKS SUMMARY A promising research path at the intersection of operations research and computer science A new distributed and parallel implementation for matrix factorization A generic analysis that could be used for showing convergence of other algorithms FUTURE RESEARCHJ Extensive computational study Stochastic version of the proposed algorithm Quasi-Newton-based Bayesian inference Ş. İlker Birbil (Sabancı University) Big Data Optimization 22 / 22
2.3 Convex Constrained Optimization Problems
42 CHAPTER 2. FUNDAMENTAL CONCEPTS IN CONVEX OPTIMIZATION Theorem 15 Let f : R n R and h : R R. Consider g(x) = h(f(x)) for all x R n. The function g is convex if either of the following two conditions
(Quasi-)Newton methods
(Quasi-)Newton methods 1 Introduction 1.1 Newton method Newton method is a method to find the zeros of a differentiable non-linear function g, x such that g(x) = 0, where g : R n R n. Given a starting
Numerisches Rechnen. (für Informatiker) M. Grepl J. Berger & J.T. Frings. Institut für Geometrie und Praktische Mathematik RWTH Aachen
(für Informatiker) M. Grepl J. Berger & J.T. Frings Institut für Geometrie und Praktische Mathematik RWTH Aachen Wintersemester 2010/11 Problem Statement Unconstrained Optimality Conditions Constrained
Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh
Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh Peter Richtárik Week 3 Randomized Coordinate Descent With Arbitrary Sampling January 27, 2016 1 / 30 The Problem
Adaptive Online Gradient Descent
Adaptive Online Gradient Descent Peter L Bartlett Division of Computer Science Department of Statistics UC Berkeley Berkeley, CA 94709 bartlett@csberkeleyedu Elad Hazan IBM Almaden Research Center 650
Computing a Nearest Correlation Matrix with Factor Structure
Computing a Nearest Correlation Matrix with Factor Structure Nick Higham School of Mathematics The University of Manchester [email protected] http://www.ma.man.ac.uk/~higham/ Joint work with Rüdiger
Optimal Scheduling for Dependent Details Processing Using MS Excel Solver
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 8, No 2 Sofia 2008 Optimal Scheduling for Dependent Details Processing Using MS Excel Solver Daniela Borissova Institute of
Probabilistic Models for Big Data. Alex Davies and Roger Frigola University of Cambridge 13th February 2014
Probabilistic Models for Big Data Alex Davies and Roger Frigola University of Cambridge 13th February 2014 The State of Big Data Why probabilistic models for Big Data? 1. If you don t have to worry about
10. Proximal point method
L. Vandenberghe EE236C Spring 2013-14) 10. Proximal point method proximal point method augmented Lagrangian method Moreau-Yosida smoothing 10-1 Proximal point method a conceptual algorithm for minimizing
Solutions of Equations in One Variable. Fixed-Point Iteration II
Solutions of Equations in One Variable Fixed-Point Iteration II Numerical Analysis (9th Edition) R L Burden & J D Faires Beamer Presentation Slides prepared by John Carroll Dublin City University c 2011
Nonlinear Optimization: Algorithms 3: Interior-point methods
Nonlinear Optimization: Algorithms 3: Interior-point methods INSEAD, Spring 2006 Jean-Philippe Vert Ecole des Mines de Paris [email protected] Nonlinear optimization c 2006 Jean-Philippe Vert,
t := maxγ ν subject to ν {0,1,2,...} and f(x c +γ ν d) f(x c )+cγ ν f (x c ;d).
1. Line Search Methods Let f : R n R be given and suppose that x c is our current best estimate of a solution to P min x R nf(x). A standard method for improving the estimate x c is to choose a direction
Further Study on Strong Lagrangian Duality Property for Invex Programs via Penalty Functions 1
Further Study on Strong Lagrangian Duality Property for Invex Programs via Penalty Functions 1 J. Zhang Institute of Applied Mathematics, Chongqing University of Posts and Telecommunications, Chongqing
Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows
TECHNISCHE UNIVERSITEIT EINDHOVEN Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows Lloyd A. Fasting May 2014 Supervisors: dr. M. Firat dr.ir. M.A.A. Boon J. van Twist MSc. Contents
Inner Product Spaces
Math 571 Inner Product Spaces 1. Preliminaries An inner product space is a vector space V along with a function, called an inner product which associates each pair of vectors u, v with a scalar u, v, and
Date: April 12, 2001. Contents
2 Lagrange Multipliers Date: April 12, 2001 Contents 2.1. Introduction to Lagrange Multipliers......... p. 2 2.2. Enhanced Fritz John Optimality Conditions...... p. 12 2.3. Informative Lagrange Multipliers...........
Tensor Methods for Machine Learning, Computer Vision, and Computer Graphics
Tensor Methods for Machine Learning, Computer Vision, and Computer Graphics Part I: Factorizations and Statistical Modeling/Inference Amnon Shashua School of Computer Science & Eng. The Hebrew University
Big Data Techniques Applied to Very Short-term Wind Power Forecasting
Big Data Techniques Applied to Very Short-term Wind Power Forecasting Ricardo Bessa Senior Researcher ([email protected]) Center for Power and Energy Systems, INESC TEC, Portugal Joint work with
A characterization of trace zero symmetric nonnegative 5x5 matrices
A characterization of trace zero symmetric nonnegative 5x5 matrices Oren Spector June 1, 009 Abstract The problem of determining necessary and sufficient conditions for a set of real numbers to be the
Parallel Computing for Option Pricing Based on the Backward Stochastic Differential Equation
Parallel Computing for Option Pricing Based on the Backward Stochastic Differential Equation Ying Peng, Bin Gong, Hui Liu, and Yanxin Zhang School of Computer Science and Technology, Shandong University,
Statistical Machine Learning
Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes
Introduction to Algebraic Geometry. Bézout s Theorem and Inflection Points
Introduction to Algebraic Geometry Bézout s Theorem and Inflection Points 1. The resultant. Let K be a field. Then the polynomial ring K[x] is a unique factorisation domain (UFD). Another example of a
2.2 Creaseness operator
2.2. Creaseness operator 31 2.2 Creaseness operator Antonio López, a member of our group, has studied for his PhD dissertation the differential operators described in this section [72]. He has compared
Lecture 3: Linear methods for classification
Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,
MATHEMATICAL METHODS OF STATISTICS
MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS
17.3.1 Follow the Perturbed Leader
CS787: Advanced Algorithms Topic: Online Learning Presenters: David He, Chris Hopman 17.3.1 Follow the Perturbed Leader 17.3.1.1 Prediction Problem Recall the prediction problem that we discussed in class.
Lecture 3. Linear Programming. 3B1B Optimization Michaelmas 2015 A. Zisserman. Extreme solutions. Simplex method. Interior point method
Lecture 3 3B1B Optimization Michaelmas 2015 A. Zisserman Linear Programming Extreme solutions Simplex method Interior point method Integer programming and relaxation The Optimization Tree Linear Programming
GenOpt (R) Generic Optimization Program User Manual Version 3.0.0β1
(R) User Manual Environmental Energy Technologies Division Berkeley, CA 94720 http://simulationresearch.lbl.gov Michael Wetter [email protected] February 20, 2009 Notice: This work was supported by the U.S.
MapReduce and Distributed Data Analysis. Sergei Vassilvitskii Google Research
MapReduce and Distributed Data Analysis Google Research 1 Dealing With Massive Data 2 2 Dealing With Massive Data Polynomial Memory Sublinear RAM Sketches External Memory Property Testing 3 3 Dealing With
Pricing and calibration in local volatility models via fast quantization
Pricing and calibration in local volatility models via fast quantization Parma, 29 th January 2015. Joint work with Giorgia Callegaro and Martino Grasselli Quantization: a brief history Birth: back to
The Advantages and Disadvantages of Online Linear Optimization
LINEAR PROGRAMMING WITH ONLINE LEARNING TATSIANA LEVINA, YURI LEVIN, JEFF MCGILL, AND MIKHAIL NEDIAK SCHOOL OF BUSINESS, QUEEN S UNIVERSITY, 143 UNION ST., KINGSTON, ON, K7L 3N6, CANADA E-MAIL:{TLEVIN,YLEVIN,JMCGILL,MNEDIAK}@BUSINESS.QUEENSU.CA
Zeros of Polynomial Functions
Zeros of Polynomial Functions The Rational Zero Theorem If f (x) = a n x n + a n-1 x n-1 + + a 1 x + a 0 has integer coefficients and p/q (where p/q is reduced) is a rational zero, then p is a factor of
Cyber-Security Analysis of State Estimators in Power Systems
Cyber-Security Analysis of State Estimators in Electric Power Systems André Teixeira 1, Saurabh Amin 2, Henrik Sandberg 1, Karl H. Johansson 1, and Shankar Sastry 2 ACCESS Linnaeus Centre, KTH-Royal Institute
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 4: LINEAR MODELS FOR CLASSIFICATION Introduction In the previous chapter, we explored a class of regression models having particularly simple analytical
Machine Learning and Pattern Recognition Logistic Regression
Machine Learning and Pattern Recognition Logistic Regression Course Lecturer:Amos J Storkey Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh Crichton Street,
Simulation-based optimization methods for urban transportation problems. Carolina Osorio
Simulation-based optimization methods for urban transportation problems Carolina Osorio Civil and Environmental Engineering Department Massachusetts Institute of Technology (MIT) Joint work with: Prof.
Duality in General Programs. Ryan Tibshirani Convex Optimization 10-725/36-725
Duality in General Programs Ryan Tibshirani Convex Optimization 10-725/36-725 1 Last time: duality in linear programs Given c R n, A R m n, b R m, G R r n, h R r : min x R n c T x max u R m, v R r b T
CSCI567 Machine Learning (Fall 2014)
CSCI567 Machine Learning (Fall 2014) Drs. Sha & Liu {feisha,yanliu.cs}@usc.edu September 22, 2014 Drs. Sha & Liu ({feisha,yanliu.cs}@usc.edu) CSCI567 Machine Learning (Fall 2014) September 22, 2014 1 /
Introduction to Online Learning Theory
Introduction to Online Learning Theory Wojciech Kot lowski Institute of Computing Science, Poznań University of Technology IDSS, 04.06.2013 1 / 53 Outline 1 Example: Online (Stochastic) Gradient Descent
Prime Numbers and Irreducible Polynomials
Prime Numbers and Irreducible Polynomials M. Ram Murty The similarity between prime numbers and irreducible polynomials has been a dominant theme in the development of number theory and algebraic geometry.
Lecture 8 February 4
ICS273A: Machine Learning Winter 2008 Lecture 8 February 4 Scribe: Carlos Agell (Student) Lecturer: Deva Ramanan 8.1 Neural Nets 8.1.1 Logistic Regression Recall the logistic function: g(x) = 1 1 + e θt
The Steepest Descent Algorithm for Unconstrained Optimization and a Bisection Line-search Method
The Steepest Descent Algorithm for Unconstrained Optimization and a Bisection Line-search Method Robert M. Freund February, 004 004 Massachusetts Institute of Technology. 1 1 The Algorithm The problem
SINGLE-STAGE MULTI-PRODUCT PRODUCTION AND INVENTORY SYSTEMS: AN ITERATIVE ALGORITHM BASED ON DYNAMIC SCHEDULING AND FIXED PITCH PRODUCTION
SIGLE-STAGE MULTI-PRODUCT PRODUCTIO AD IVETORY SYSTEMS: A ITERATIVE ALGORITHM BASED O DYAMIC SCHEDULIG AD FIXED PITCH PRODUCTIO Euclydes da Cunha eto ational Institute of Technology Rio de Janeiro, RJ
LABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING. ----Changsheng Liu 10-30-2014
LABEL PROPAGATION ON GRAPHS. SEMI-SUPERVISED LEARNING ----Changsheng Liu 10-30-2014 Agenda Semi Supervised Learning Topics in Semi Supervised Learning Label Propagation Local and global consistency Graph
Big Data Science. Prof. Lise Getoor University of Maryland, College Park. http://www.cs.umd.edu/~getoor. October 17, 2013
Big Data Science Prof Lise Getoor University of Maryland, College Park October 17, 2013 http://wwwcsumdedu/~getoor BIG Data is not flat 2004-2013 lonnitaylor Data is multi-modal, multi-relational, spatio-temporal,
How To Prove The Dirichlet Unit Theorem
Chapter 6 The Dirichlet Unit Theorem As usual, we will be working in the ring B of algebraic integers of a number field L. Two factorizations of an element of B are regarded as essentially the same if
5.1 Bipartite Matching
CS787: Advanced Algorithms Lecture 5: Applications of Network Flow In the last lecture, we looked at the problem of finding the maximum flow in a graph, and how it can be efficiently solved using the Ford-Fulkerson
Big Data - Lecture 1 Optimization reminders
Big Data - Lecture 1 Optimization reminders S. Gadat Toulouse, Octobre 2014 Big Data - Lecture 1 Optimization reminders S. Gadat Toulouse, Octobre 2014 Schedule Introduction Major issues Examples Mathematics
STA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! [email protected]! http://www.cs.toronto.edu/~rsalakhu/ Lecture 6 Three Approaches to Classification Construct
Optimization Modeling for Mining Engineers
Optimization Modeling for Mining Engineers Alexandra M. Newman Division of Economics and Business Slide 1 Colorado School of Mines Seminar Outline Linear Programming Integer Linear Programming Slide 2
Efficient Curve Fitting Techniques
15/11/11 Life Conference and Exhibition 11 Stuart Carroll, Christopher Hursey Efficient Curve Fitting Techniques - November 1 The Actuarial Profession www.actuaries.org.uk Agenda Background Outline of
Notes on Symmetric Matrices
CPSC 536N: Randomized Algorithms 2011-12 Term 2 Notes on Symmetric Matrices Prof. Nick Harvey University of British Columbia 1 Symmetric Matrices We review some basic results concerning symmetric matrices.
Proximal mapping via network optimization
L. Vandenberghe EE236C (Spring 23-4) Proximal mapping via network optimization minimum cut and maximum flow problems parametric minimum cut problem application to proximal mapping Introduction this lecture:
Optimization of Supply Chain Networks
Optimization of Supply Chain Networks M. Herty TU Kaiserslautern September 2006 (2006) 1 / 41 Contents 1 Supply Chain Modeling 2 Networks 3 Optimization Continuous optimal control problem Discrete optimal
Solving polynomial least squares problems via semidefinite programming relaxations
Solving polynomial least squares problems via semidefinite programming relaxations Sunyoung Kim and Masakazu Kojima August 2007, revised in November, 2007 Abstract. A polynomial optimization problem whose
Pacific Journal of Mathematics
Pacific Journal of Mathematics GLOBAL EXISTENCE AND DECREASING PROPERTY OF BOUNDARY VALUES OF SOLUTIONS TO PARABOLIC EQUATIONS WITH NONLOCAL BOUNDARY CONDITIONS Sangwon Seo Volume 193 No. 1 March 2000
Big Data Optimization: Randomized lock-free methods for minimizing partially separable convex functions
Big Data Optimization: Randomized lock-free methods for minimizing partially separable convex functions Peter Richtárik School of Mathematics The University of Edinburgh Joint work with Martin Takáč (Edinburgh)
Convex Programming Tools for Disjunctive Programs
Convex Programming Tools for Disjunctive Programs João Soares, Departamento de Matemática, Universidade de Coimbra, Portugal Abstract A Disjunctive Program (DP) is a mathematical program whose feasible
Semi-Supervised Support Vector Machines and Application to Spam Filtering
Semi-Supervised Support Vector Machines and Application to Spam Filtering Alexander Zien Empirical Inference Department, Bernhard Schölkopf Max Planck Institute for Biological Cybernetics ECML 2006 Discovery
A FIRST COURSE IN OPTIMIZATION THEORY
A FIRST COURSE IN OPTIMIZATION THEORY RANGARAJAN K. SUNDARAM New York University CAMBRIDGE UNIVERSITY PRESS Contents Preface Acknowledgements page xiii xvii 1 Mathematical Preliminaries 1 1.1 Notation
A progressive method to solve large-scale AC Optimal Power Flow with discrete variables and control of the feasibility
A progressive method to solve large-scale AC Optimal Power Flow with discrete variables and control of the feasibility Manuel Ruiz, Jean Maeght, Alexandre Marié, Patrick Panciatici and Arnaud Renaud [email protected]
Network Traffic Modelling
University of York Dissertation submitted for the MSc in Mathematics with Modern Applications, Department of Mathematics, University of York, UK. August 009 Network Traffic Modelling Author: David Slade
Notes from Week 1: Algorithms for sequential prediction
CS 683 Learning, Games, and Electronic Markets Spring 2007 Notes from Week 1: Algorithms for sequential prediction Instructor: Robert Kleinberg 22-26 Jan 2007 1 Introduction In this course we will be looking
Fixed Point Theorems
Fixed Point Theorems Definition: Let X be a set and let T : X X be a function that maps X into itself. (Such a function is often called an operator, a transformation, or a transform on X, and the notation
Parallel Selective Algorithms for Nonconvex Big Data Optimization
1874 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 63, NO. 7, APRIL 1, 2015 Parallel Selective Algorithms for Nonconvex Big Data Optimization Francisco Facchinei, Gesualdo Scutari, Senior Member, IEEE,
Chapter 13: Binary and Mixed-Integer Programming
Chapter 3: Binary and Mixed-Integer Programming The general branch and bound approach described in the previous chapter can be customized for special situations. This chapter addresses two special situations:
Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs
CSE599s: Extremal Combinatorics November 21, 2011 Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs Lecturer: Anup Rao 1 An Arithmetic Circuit Lower Bound An arithmetic circuit is just like
SECOND DERIVATIVE TEST FOR CONSTRAINED EXTREMA
SECOND DERIVATIVE TEST FOR CONSTRAINED EXTREMA This handout presents the second derivative test for a local extrema of a Lagrange multiplier problem. The Section 1 presents a geometric motivation for the
STORM: Stochastic Optimization Using Random Models Katya Scheinberg Lehigh University. (Joint work with R. Chen and M. Menickelly)
STORM: Stochastic Optimization Using Random Models Katya Scheinberg Lehigh University (Joint work with R. Chen and M. Menickelly) Outline Stochastic optimization problem black box gradient based Existing
A NEW LOOK AT CONVEX ANALYSIS AND OPTIMIZATION
1 A NEW LOOK AT CONVEX ANALYSIS AND OPTIMIZATION Dimitri Bertsekas M.I.T. FEBRUARY 2003 2 OUTLINE Convexity issues in optimization Historical remarks Our treatment of the subject Three unifying lines of
Scheduling Home Health Care with Separating Benders Cuts in Decision Diagrams
Scheduling Home Health Care with Separating Benders Cuts in Decision Diagrams André Ciré University of Toronto John Hooker Carnegie Mellon University INFORMS 2014 Home Health Care Home health care delivery
Tensor Factorization for Multi-Relational Learning
Tensor Factorization for Multi-Relational Learning Maximilian Nickel 1 and Volker Tresp 2 1 Ludwig Maximilian University, Oettingenstr. 67, Munich, Germany [email protected] 2 Siemens AG, Corporate
Massive Data Classification via Unconstrained Support Vector Machines
Massive Data Classification via Unconstrained Support Vector Machines Olvi L. Mangasarian and Michael E. Thompson Computer Sciences Department University of Wisconsin 1210 West Dayton Street Madison, WI
Mixed Precision Iterative Refinement Methods Energy Efficiency on Hybrid Hardware Platforms
Mixed Precision Iterative Refinement Methods Energy Efficiency on Hybrid Hardware Platforms Björn Rocker Hamburg, June 17th 2010 Engineering Mathematics and Computing Lab (EMCL) KIT University of the State
Computational Optical Imaging - Optique Numerique. -- Deconvolution --
Computational Optical Imaging - Optique Numerique -- Deconvolution -- Winter 2014 Ivo Ihrke Deconvolution Ivo Ihrke Outline Deconvolution Theory example 1D deconvolution Fourier method Algebraic method
Statistics Graduate Courses
Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.
Properties of BMO functions whose reciprocals are also BMO
Properties of BMO functions whose reciprocals are also BMO R. L. Johnson and C. J. Neugebauer The main result says that a non-negative BMO-function w, whose reciprocal is also in BMO, belongs to p> A p,and
Parameter Estimation for Bingham Models
Dr. Volker Schulz, Dmitriy Logashenko Parameter Estimation for Bingham Models supported by BMBF Parameter Estimation for Bingham Models Industrial application of ceramic pastes Material laws for Bingham
NMR Measurement of T1-T2 Spectra with Partial Measurements using Compressive Sensing
NMR Measurement of T1-T2 Spectra with Partial Measurements using Compressive Sensing Alex Cloninger Norbert Wiener Center Department of Mathematics University of Maryland, College Park http://www.norbertwiener.umd.edu
Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression
Logistic Regression Department of Statistics The Pennsylvania State University Email: [email protected] Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max
Distributed Machine Learning and Big Data
Distributed Machine Learning and Big Data Sourangshu Bhattacharya Dept. of Computer Science and Engineering, IIT Kharagpur. http://cse.iitkgp.ac.in/~sourangshu/ August 21, 2015 Sourangshu Bhattacharya
24. The Branch and Bound Method
24. The Branch and Bound Method It has serious practical consequences if it is known that a combinatorial problem is NP-complete. Then one can conclude according to the present state of science that no
FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM
International Journal of Innovative Computing, Information and Control ICIC International c 0 ISSN 34-48 Volume 8, Number 8, August 0 pp. 4 FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT
Performance Characteristics of Large SMP Machines
Performance Characteristics of Large SMP Machines Dirk Schmidl, Dieter an Mey, Matthias S. Müller [email protected] Rechen- und Kommunikationszentrum (RZ) Agenda Investigated Hardware Kernel Benchmark
Solving NP Hard problems in practice lessons from Computer Vision and Computational Biology
Solving NP Hard problems in practice lessons from Computer Vision and Computational Biology Yair Weiss School of Computer Science and Engineering The Hebrew University of Jerusalem www.cs.huji.ac.il/ yweiss
Echidna: Efficient Clustering of Hierarchical Data for Network Traffic Analysis
Echidna: Efficient Clustering of Hierarchical Data for Network Traffic Analysis Abdun Mahmood, Christopher Leckie, Parampalli Udaya Department of Computer Science and Software Engineering University of
AN INTRODUCTION TO NUMERICAL METHODS AND ANALYSIS
AN INTRODUCTION TO NUMERICAL METHODS AND ANALYSIS Revised Edition James Epperson Mathematical Reviews BICENTENNIAL 0, 1 8 0 7 z ewiley wu 2007 r71 BICENTENNIAL WILEY-INTERSCIENCE A John Wiley & Sons, Inc.,
Nonlinear Programming Methods.S2 Quadratic Programming
Nonlinear Programming Methods.S2 Quadratic Programming Operations Research Models and Methods Paul A. Jensen and Jonathan F. Bard A linearly constrained optimization problem with a quadratic objective
Stochastic Inventory Control
Chapter 3 Stochastic Inventory Control 1 In this chapter, we consider in much greater details certain dynamic inventory control problems of the type already encountered in section 1.3. In addition to the
Parallel & Distributed Optimization. Based on Mark Schmidt s slides
Parallel & Distributed Optimization Based on Mark Schmidt s slides Motivation behind using parallel & Distributed optimization Performance Computational throughput have increased exponentially in linear
Duality of linear conic problems
Duality of linear conic problems Alexander Shapiro and Arkadi Nemirovski Abstract It is well known that the optimal values of a linear programming problem and its dual are equal to each other if at least
Scheduling a sequence of tasks with general completion costs
Scheduling a sequence of tasks with general completion costs Francis Sourd CNRS-LIP6 4, place Jussieu 75252 Paris Cedex 05, France [email protected] Abstract Scheduling a sequence of tasks in the acceptation
Fast Analytics on Big Data with H20
Fast Analytics on Big Data with H20 0xdata.com, h2o.ai Tomas Nykodym, Petr Maj Team About H2O and 0xdata H2O is a platform for distributed in memory predictive analytics and machine learning Pure Java,
TOMLAB - For fast and robust largescale optimization in MATLAB
The TOMLAB Optimization Environment is a powerful optimization and modeling package for solving applied optimization problems in MATLAB. TOMLAB provides a wide range of features, tools and services for
A simpler and better derandomization of an approximation algorithm for Single Source Rent-or-Buy
A simpler and better derandomization of an approximation algorithm for Single Source Rent-or-Buy David P. Williamson Anke van Zuylen School of Operations Research and Industrial Engineering, Cornell University,
