BIG DATA PROBLEMS AND LARGE-SCALE OPTIMIZATION: A DISTRIBUTED ALGORITHM FOR MATRIX FACTORIZATION


Ş. İlker Birbil (Sabancı University), with Ali Taylan Cemgil (Boğaziçi University), Hazal Koptagel (Boğaziçi University), Figen Öztoprak (Bilgi University), and Umut Şimşekli (Boğaziçi University). Presented at Nottingham University, March 2015.
LARGE-SCALE OPTIMIZATION AND MACHINE LEARNING

Outline: Introduction; Exploiting the Structure; Need for Parallel Algorithms.
DATA SCIENCE

GRADUATE COURSES
NONLINEAR OPTIMIZATION

Typically, a nonlinear optimization problem (NLP) is defined as

    minimize   f(x),  x ∈ R^n,
    subject to c_i(x) = 0,  i ∈ E,
               c_i(x) ≤ 0,  i ∈ I,

where f : R^n → R is the objective function and c_i : R^n → R, i ∈ E ∪ I, are the constraint functions. These functions are continuous and not necessarily linear; at least one of them is nonlinear.
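As a toy illustration of this formulation (my own example, not from the talk; it assumes SciPy is available), a small NLP with one equality constraint (i ∈ E) and one inequality constraint (i ∈ I) can be solved with SciPy's SLSQP method:

```python
import numpy as np
from scipy.optimize import minimize

# Toy NLP: minimize (x1 - 1)^2 + (x2 - 2)^2
# subject to x1 + x2 = 1   (equality, i in E)
# and        x1 >= 0       (inequality, i in I; SLSQP expects c(x) >= 0).
f = lambda x: (x[0] - 1.0)**2 + (x[1] - 2.0)**2
cons = [
    {"type": "eq",   "fun": lambda x: x[0] + x[1] - 1.0},
    {"type": "ineq", "fun": lambda x: x[0]},
]
res = minimize(f, x0=np.array([0.5, 0.5]), method="SLSQP", constraints=cons)
# On the line x1 + x2 = 1 the minimizer is (0, 1), which also satisfies x1 >= 0.
```

The solver returns the constrained minimizer in `res.x`; the hypothetical objective and constraints here are chosen only so the answer can be checked by hand.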
ROLE OF NONLINEAR OPTIMIZATION

Core NLP connects to many fields and applications: molecular biology (protein folding), engineering design (machining), global optimization, finance (risk management), derivative-free optimization, nonlinear stochastic programming, statistics, large-scale optimization, computer science, applied mathematics, convex optimization, mixed-integer NLP, operations research, machine learning (image recovery), PDE-constrained optimization, production (chemical complex design), and health (cancer treatment).
OUR RESEARCH GROUP

Three faculty members, four PhD students, three MSc students. Topics: (coupled) tensor or matrix factorization; distributed and parallel algorithms; Bayesian inference; nonlinear optimization. Computing model: several processors, each with multiple cores and its own memory.
LINK PREDICTION VIA TENSOR FACTORIZATION

X_1(i, j, k): whether user i visits location j and performs activity k
X_2(i, m): frequency of user i visiting location m
X_3(j, n): points of interest for location j
TENSOR FACTORIZATION

A tensor is a multidimensional array X(i, j, k, ...). Tensor factorizations extend matrix factorizations to higher-order tensors and are used to extract the underlying factors in higher-order data sets:

    X(i, j, k) ≈ Σ_r Z_1(i, r) Z_2(j, r) Z_3(k, r)

(Cemgil, Probabilistic Latent Tensor Factorisation.)
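The three-way model above can be checked numerically. The sketch below (my own illustration, using NumPy's einsum) builds a rank-R tensor from factor matrices Z_1, Z_2, Z_3 and verifies one entry against the explicit sum over r:

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R = 4, 5, 6, 3
Z1 = rng.random((I, R))
Z2 = rng.random((J, R))
Z3 = rng.random((K, R))

# X(i, j, k) = sum_r Z1(i, r) Z2(j, r) Z3(k, r)
X = np.einsum('ir,jr,kr->ijk', Z1, Z2, Z3)

# Spot-check one entry against the explicit sum over r.
explicit = sum(Z1[1, r] * Z2[2, r] * Z3[3, r] for r in range(R))
assert np.isclose(X[1, 2, 3], explicit)
```

The einsum subscript string mirrors the factorization formula directly: one shared summation index r over three factor matrices.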
MATRIX FACTORIZATION

    X(i, j) ≈ Σ_r Z_1(i, r) Z_2(r, j),  i.e.,  X ≈ Z_1 Z_2.

An inverse problem: estimate Z_1 and Z_2 given the data matrix X, assuming X ≈ Z_1 Z_2. The overall optimization problem minimizes an error function subject to constraints (e.g., nonnegativity):

    minimize ‖X − Z_1 Z_2‖_F^2  subject to  Z_1, Z_2 ∈ Z,

or, more generally,

    (Z_1, Z_2) = argmin_{Z_1, Z_2} D(X ‖ Z_1 Z_2) + R(Z_1, Z_2),

where Z is the feasible region, D is a divergence, and R is a regularizer. When Z is the first orthant, we obtain the nonnegative matrix factorization problem.
MOVIE RECOMMENDATION

    minimize ‖X − Z_1 Z_2‖_F^2  subject to  Z_1 ≥ 0, Z_2 ≥ 0
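A standard way to attack this nonnegative problem (not the distributed algorithm of the talk) is Lee–Seung multiplicative updates, which preserve nonnegativity and decrease the Frobenius objective. A minimal NumPy sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((20, 15))          # nonnegative data matrix
R = 4                             # factorization rank
Z1 = rng.random((20, R)) + 0.1    # nonnegative initial factors
Z2 = rng.random((R, 15)) + 0.1
eps = 1e-12                       # guard against division by zero

def objective(Z1, Z2):
    return np.linalg.norm(X - Z1 @ Z2, 'fro')**2

before = objective(Z1, Z2)
for _ in range(200):
    # Multiplicative updates: entrywise ratios keep Z1, Z2 >= 0.
    Z1 *= (X @ Z2.T) / (Z1 @ (Z2 @ Z2.T) + eps)
    Z2 *= (Z1.T @ X) / ((Z1.T @ Z1) @ Z2 + eps)
after = objective(Z1, Z2)
```

Because every update multiplies the current factor by a nonnegative ratio, the iterates never leave the feasible region, which is exactly the constraint Z_1, Z_2 ≥ 0 above.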
DISTRIBUTED IMPLEMENTATION

The data matrix is partitioned into blocks X_{ij}; row blocks correspond to rows of Z_1 and column blocks to columns of Z_2. In each time slot, every processor updates one block, chosen so that no two processors touch the same rows of Z_1 or columns of Z_2:

Time slot 1: X_12 = Z_1(1,:) Z_2(:,2) on P1; X_23 = Z_1(2,:) Z_2(:,3) on P2; X_31 = Z_1(3,:) Z_2(:,1) on P3, by employing IPA.
Time slot 2: X_11 on P1; X_22 on P2; X_33 on P3.
Time slot 3: X_13 on P1; X_21 on P2; X_32 on P3.
Time slot 4: ...
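The staggered block pattern can be generated programmatically. The sketch below is my own reconstruction of the idea (the slot ordering is arbitrary): in slot t, processor p updates block X[p, (p+t) mod n], so within a slot no two processors share a row block of Z_1 or a column block of Z_2, and after n slots every block has been updated exactly once:

```python
# n processors, an n x n grid of data blocks; one block per processor per slot.
n = 3
visited = set()
for t in range(n):
    slot = [(p, (p + t) % n) for p in range(n)]   # (row block, column block)
    cols = {j for _, j in slot}
    assert len(cols) == n        # all Z2 column blocks distinct in this slot
    visited.update(slot)
    print(f"slot {t + 1}: " +
          ", ".join(f"X[{i+1},{j+1}] on P{i+1}" for i, j in slot))
# After n slots, every one of the n*n blocks has been visited exactly once.
assert len(visited) == n * n
```

This is why the updates within a slot can run in parallel without locking: they read and write disjoint pieces of the factors.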
REFORMULATION

The factorization objective, minimize ‖X − Z_1 Z_2‖_F^2 subject to Z_1, Z_2 ∈ Z, is rewritten over the data blocks (numbered 1 through 6 in the figure), each block contributing one component function.

GENERIC PROBLEM

    minimize   Σ_{i ∈ {1,...,m}} f_i(z)
    subject to z ∈ ζ
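To make the generic form concrete: the objective splits into component functions, one per data block, that sum to the full objective. A small least-squares analogue (my own toy example, not the talk's notation) checks the decomposition:

```python
import numpy as np

rng = np.random.default_rng(2)
A, b = rng.random((12, 4)), rng.random(12)
m = 3
blocks = np.array_split(np.arange(12), m)   # rows owned by each component

def f_i(z, i):
    # Component objective: least squares over one block of rows.
    Ai, bi = A[blocks[i]], b[blocks[i]]
    return 0.5 * np.sum((Ai @ z - bi)**2)

z = rng.random(4)
total = sum(f_i(z, i) for i in range(m))
full = 0.5 * np.sum((A @ z - b)**2)
assert np.isclose(total, full)              # f(z) = sum_i f_i(z)
```

Any algorithm that only ever evaluates a subset of the f_i per step (as in the next slides) relies on exactly this additive structure.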
DISTRIBUTED OPTIMIZATION

At each time slot k, we solve over a subset S_k of the component functions f_i, i ∈ {1, 2, ..., m}. We make sure that each data block is visited after c passes (c = 3 in the figure).
INCREMENTAL QUASI-NEWTON ALGORITHM

Unlike gradient-based methods, the proposed algorithm uses second-order information through a Hessian approximation (the L-BFGS quasi-Newton update). The algorithm visits each subset of component functions in the same order (incremental and deterministic). We do not assume convexity of the objective, so the matrix factorization problem can be solved.

CORE STEP: Solve a quadratic approximation of the (partial) objective function:

    Q_k^t(z) = (z − z_k)ᵀ ∇_{S_k} f(z_k) + (1/2)(z − z_k)ᵀ H_t (z − z_k) + (β_t/2) ‖z − z_k‖².
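When ζ = R^n, the quadratic model Q has the closed-form minimizer z = z_k − (H_t + β_t I)^{-1} ∇_{S_k} f(z_k), obtained by setting its gradient to zero. A NumPy check of this (my own sketch, with a random PSD stand-in for H_t):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
g = rng.random(n)              # partial gradient  grad_{S_k} f(z_k)
B = rng.random((n, n))
H = B @ B.T                    # symmetric PSD Hessian approximation H_t
beta = 0.5
zk = rng.random(n)

# Minimize Q: set grad Q(z) = g + (H + beta I)(z - z_k) to zero.
M = H + beta * np.eye(n)
z_star = zk - np.linalg.solve(M, g)

assert np.allclose(g + M @ (z_star - zk), 0.0)   # stationarity of Q
```

The proximal term (β_t/2)‖z − z_k‖² is what guarantees M = H_t + β_t I is invertible even when the L-BFGS matrix alone is singular.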
INCREMENTAL QUASI-NEWTON ALGORITHM (CONT'D)

    Q_k^t(z) = (z − z_k)ᵀ ∇_{S_k} f(z_k) + (1/2)(z − z_k)ᵀ H_t (z − z_k) + (β_t/2) ‖z − z_k‖².

Algorithm 1: HAMSI
  input: y_0, β_1
  for t = 0, 1, 2, ... do
      z_1 = y_t
      Compute H_t
      for k = 1, 2, ..., c do
          Choose a subset S_k ⊆ {1, ..., m}
          Compute ∇_{S_k} f(z_k)
          z_{k+1} = argmin_{z ∈ ζ} Q_k^t(z)
      end
      y_{t+1} = z_{c+1}
      Set β_{t+1} ≥ β_t
  end
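The loop structure of Algorithm 1 can be sketched on a toy separable least-squares problem. This is only an illustration under simplifying assumptions, not the talk's implementation: H_t is a fixed identity (HAMSI builds an L-BFGS approximation), β_t is held constant, and the subsets S_k are fixed row blocks:

```python
import numpy as np

rng = np.random.default_rng(4)
A, b = rng.random((30, 4)), rng.random(30)
blocks = np.array_split(np.arange(30), 3)   # c = 3 subsets S_1..S_c

def f(z):
    return 0.5 * np.sum((A @ z - b)**2)

def grad_block(z, idx):
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ z - bi)             # grad_{S_k} f(z)

y = np.zeros(4)
beta = 20.0
H = np.eye(4)                               # crude stand-in for L-BFGS H_t
f0 = f(y)
for t in range(50):                         # outer iterations
    z = y.copy()
    for idx in blocks:                      # inner pass over all subsets
        g = grad_block(z, idx)
        # z_{k+1} = argmin Q  =>  z_k - (H + beta I)^{-1} g  (unconstrained)
        z = z - np.linalg.solve(H + beta * np.eye(4), g)
    y = z                                   # y_{t+1} = z_{c+1}
f_final = f(y)
```

With a constant β the iterates only reach a neighborhood of the optimum; the algorithm's nondecreasing β_t schedule is what drives the incremental steps to a stationary point.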
CONVERGENCE ANALYSIS (ζ = R^n)

ASSUMPTIONS
1. The Hessians of the component functions and (H_t + β_t I) are uniformly bounded: ‖∇² f_i(y_t)‖ ≤ L_t ≤ L for all i ∈ S_k, all S_k, and all y_t.
2. The smallest eigenvalue of (H_t + β_t I) is bounded away from zero: ‖(H_t + β_t I)^{-1}‖ ≤ M_t ≤ M for all t.
3. The gradient norms are uniformly bounded: ‖∇_{S_k} f(y_t)‖ ≤ C for all S_k and y_t.
CONVERGENCE ANALYSIS (CONT'D)

LEMMA. At each outer iteration t of Algorithm 1 and for k = 1, ..., c, we have

    δ_k = ‖∇_{S_k} f(z_k) − ∇_{S_k} f(y_t)‖ ≤ L_t M_t Σ_{j=1}^{k−1} (1 + L_t M_t)^{k−1−j} ‖∇_{S_j} f(y_t)‖.

THEOREM. Consider the iterates y_t produced by Algorithm 1. Then, all accumulation points of {y_t} are stationary points of the generic problem.
COROLLARY. Algorithm 1 solves the matrix factorization problem.
PRELIMINARY EXPERIMENTS: SETUP

Linux cluster with 15 nodes. Each node has 8 Intel Xeon 2.50 GHz cores and 16 GB RAM, so the setting allows execution of up to 120 tasks in parallel. The MovieLens (1M) data set is used for our preliminary experiments.
PRELIMINARY EXPERIMENTS

FIGURE: Objective function values

PRELIMINARY EXPERIMENTS (CONT'D)

FIGURE: Root mean square error
CONCLUDING REMARKS

SUMMARY
A promising research path at the intersection of operations research and computer science. A new distributed and parallel implementation for matrix factorization. A generic analysis that could be used for showing convergence of other algorithms.

FUTURE RESEARCH
Extensive computational study. Stochastic version of the proposed algorithm. Quasi-Newton-based Bayesian inference.
More informationHigh Performance Computing for Operation Research
High Performance Computing for Operation Research IEF  Paris Sud University claude.tadonki@upsud.fr INRIAAlchemy seminar, Thursday March 17 Research topics Fundamental Aspects of Algorithms and Complexity
More informationIntroduction to Linear Programming.
Chapter 1 Introduction to Linear Programming. This chapter introduces notations, terminologies and formulations of linear programming. Examples will be given to show how reallife problems can be modeled
More informationAdaptive Search with Stochastic Acceptance Probabilities for Global Optimization
Adaptive Search with Stochastic Acceptance Probabilities for Global Optimization Archis Ghate a and Robert L. Smith b a Industrial Engineering, University of Washington, Box 352650, Seattle, Washington,
More informationLABEL PROPAGATION ON GRAPHS. SEMISUPERVISED LEARNING. Changsheng Liu 10302014
LABEL PROPAGATION ON GRAPHS. SEMISUPERVISED LEARNING Changsheng Liu 10302014 Agenda Semi Supervised Learning Topics in Semi Supervised Learning Label Propagation Local and global consistency Graph
More informationParallel Selective Algorithms for Nonconvex Big Data Optimization
1874 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 63, NO. 7, APRIL 1, 2015 Parallel Selective Algorithms for Nonconvex Big Data Optimization Francisco Facchinei, Gesualdo Scutari, Senior Member, IEEE,
More informationBig Data Optimization: Randomized lockfree methods for minimizing partially separable convex functions
Big Data Optimization: Randomized lockfree methods for minimizing partially separable convex functions Peter Richtárik School of Mathematics The University of Edinburgh Joint work with Martin Takáč (Edinburgh)
More informationOptimization of Supply Chain Networks
Optimization of Supply Chain Networks M. Herty TU Kaiserslautern September 2006 (2006) 1 / 41 Contents 1 Supply Chain Modeling 2 Networks 3 Optimization Continuous optimal control problem Discrete optimal
More informationFixed Point Theorems
Fixed Point Theorems Definition: Let X be a set and let T : X X be a function that maps X into itself. (Such a function is often called an operator, a transformation, or a transform on X, and the notation
More informationELECE8104 Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems
Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems Minimum Mean Square Error (MMSE) MMSE estimation of Gaussian random vectors Linear MMSE estimator for arbitrarily distributed
More informationFUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM
International Journal of Innovative Computing, Information and Control ICIC International c 0 ISSN 3448 Volume 8, Number 8, August 0 pp. 4 FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT
More informationSolving polynomial least squares problems via semidefinite programming relaxations
Solving polynomial least squares problems via semidefinite programming relaxations Sunyoung Kim and Masakazu Kojima August 2007, revised in November, 2007 Abstract. A polynomial optimization problem whose
More informationOptimization of Design. Lecturer:DungAn Wang Lecture 12
Optimization of Design Lecturer:DungAn Wang Lecture 12 Lecture outline Reading: Ch12 of text Today s lecture 2 Constrained nonlinear programming problem Find x=(x1,..., xn), a design variable vector of
More informationSemiSupervised Support Vector Machines and Application to Spam Filtering
SemiSupervised Support Vector Machines and Application to Spam Filtering Alexander Zien Empirical Inference Department, Bernhard Schölkopf Max Planck Institute for Biological Cybernetics ECML 2006 Discovery
More informationOptimal File Sharing in Distributed Networks
Optimal File Sharing in Distributed Networks Moni Naor Ron M. Roth Abstract The following file distribution problem is considered: Given a network of processors represented by an undirected graph G = (V,
More informationEXPLICIT ABS SOLUTION OF A CLASS OF LINEAR INEQUALITY SYSTEMS AND LP PROBLEMS. Communicated by Mohammad Asadzadeh. 1. Introduction
Bulletin of the Iranian Mathematical Society Vol. 30 No. 2 (2004), pp 2138. EXPLICIT ABS SOLUTION OF A CLASS OF LINEAR INEQUALITY SYSTEMS AND LP PROBLEMS H. ESMAEILI, N. MAHDAVIAMIRI AND E. SPEDICATO
More informationSeveral Views of Support Vector Machines
Several Views of Support Vector Machines Ryan M. Rifkin Honda Research Institute USA, Inc. Human Intention Understanding Group 2007 Tikhonov Regularization We are considering algorithms of the form min
More informationPacific Journal of Mathematics
Pacific Journal of Mathematics GLOBAL EXISTENCE AND DECREASING PROPERTY OF BOUNDARY VALUES OF SOLUTIONS TO PARABOLIC EQUATIONS WITH NONLOCAL BOUNDARY CONDITIONS Sangwon Seo Volume 193 No. 1 March 2000
More information