BIG DATA PROBLEMS AND LARGE-SCALE OPTIMIZATION: A DISTRIBUTED ALGORITHM FOR MATRIX FACTORIZATION


Ş. İlker Birbil (Sabancı University), with Ali Taylan Cemgil (Boğaziçi University), Hazal Koptagel (Boğaziçi University), Figen Öztoprak (Bilgi University), and Umut Şimşekli (Boğaziçi University). Presented at Nottingham University, March 2015.
LARGE-SCALE OPTIMIZATION AND MACHINE LEARNING

Outline: Introduction; Exploiting the Structure; Need for Parallel Algorithms.
DATA SCIENCE

GRADUATE COURSES
NONLINEAR OPTIMIZATION

Typically, a nonlinear optimization problem (NLP) is defined as

    minimize   f(x),  x ∈ R^n,
    subject to c_i(x) = 0,  i ∈ E,
               c_i(x) ≤ 0,  i ∈ I,

where f : R^n → R is the objective function and c_i : R^n → R, i ∈ E ∪ I, are the constraint functions. These functions are continuous and not necessarily linear; at least one of them is nonlinear.
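As a toy illustration of this formulation (my own example, not from the talk; it assumes SciPy is available), a small NLP with one equality constraint (i ∈ E) and one inequality constraint (i ∈ I) can be solved with SciPy's SLSQP method:

```python
import numpy as np
from scipy.optimize import minimize

# Toy NLP: minimize (x1 - 1)^2 + (x2 - 2)^2
# subject to x1 + x2 = 1   (equality, i in E)
# and        x1 >= 0       (inequality, i in I; SLSQP expects c(x) >= 0).
f = lambda x: (x[0] - 1.0)**2 + (x[1] - 2.0)**2
cons = [
    {"type": "eq",   "fun": lambda x: x[0] + x[1] - 1.0},
    {"type": "ineq", "fun": lambda x: x[0]},
]
res = minimize(f, x0=np.array([0.5, 0.5]), method="SLSQP", constraints=cons)
# On the line x1 + x2 = 1 the minimizer is (0, 1), which also satisfies x1 >= 0.
```

The solver returns the constrained minimizer in `res.x`; the hypothetical objective and constraints here are chosen only so the answer can be checked by hand.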
ROLE OF NONLINEAR OPTIMIZATION

Core NLP connects to many fields and applications: molecular biology (protein folding), engineering design (machining), global optimization, finance (risk management), derivative-free optimization, nonlinear stochastic programming, statistics, large-scale optimization, computer science, applied mathematics, convex optimization, mixed-integer NLP, operations research, machine learning (image recovery), PDE-constrained optimization, production (chemical complex design), and health (cancer treatment).
OUR RESEARCH GROUP

Three faculty members, four PhD students, three MSc students. Topics: (coupled) tensor or matrix factorization; distributed and parallel algorithms; Bayesian inference; nonlinear optimization. Computing model: several processors, each with multiple cores and its own memory.
LINK PREDICTION VIA TENSOR FACTORIZATION

X_1(i, j, k): whether user i visits location j and performs activity k
X_2(i, m): frequency of user i visiting location m
X_3(j, n): points of interest for location j
TENSOR FACTORIZATION

A tensor is a multidimensional array X(i, j, k, ...). Tensor factorizations extend matrix factorizations to higher-order tensors and are used to extract the underlying factors in higher-order data sets:

    X(i, j, k) ≈ Σ_r Z_1(i, r) Z_2(j, r) Z_3(k, r)

(Cemgil, Probabilistic Latent Tensor Factorisation.)
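The three-way model above can be checked numerically. The sketch below (my own illustration, using NumPy's einsum) builds a rank-R tensor from factor matrices Z_1, Z_2, Z_3 and verifies one entry against the explicit sum over r:

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R = 4, 5, 6, 3
Z1 = rng.random((I, R))
Z2 = rng.random((J, R))
Z3 = rng.random((K, R))

# X(i, j, k) = sum_r Z1(i, r) Z2(j, r) Z3(k, r)
X = np.einsum('ir,jr,kr->ijk', Z1, Z2, Z3)

# Spot-check one entry against the explicit sum over r.
explicit = sum(Z1[1, r] * Z2[2, r] * Z3[3, r] for r in range(R))
assert np.isclose(X[1, 2, 3], explicit)
```

The einsum subscript string mirrors the factorization formula directly: one shared summation index r over three factor matrices.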
MATRIX FACTORIZATION

    X(i, j) ≈ Σ_r Z_1(i, r) Z_2(r, j),  i.e.,  X ≈ Z_1 Z_2.

An inverse problem: estimate Z_1 and Z_2 given the data matrix X, assuming X ≈ Z_1 Z_2. The overall optimization problem minimizes an error function subject to constraints (e.g., nonnegativity):

    minimize ‖X − Z_1 Z_2‖_F^2  subject to  Z_1, Z_2 ∈ Z,

or, more generally,

    (Z_1, Z_2) = argmin_{Z_1, Z_2} D(X ‖ Z_1 Z_2) + R(Z_1, Z_2),

where Z is the feasible region, D is a divergence, and R is a regularizer. When Z is the first orthant, we obtain the nonnegative matrix factorization problem.
MOVIE RECOMMENDATION

    minimize ‖X − Z_1 Z_2‖_F^2  subject to  Z_1 ≥ 0, Z_2 ≥ 0
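A standard way to attack this nonnegative problem (not the distributed algorithm of the talk) is Lee–Seung multiplicative updates, which preserve nonnegativity and decrease the Frobenius objective. A minimal NumPy sketch on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((20, 15))          # nonnegative data matrix
R = 4                             # factorization rank
Z1 = rng.random((20, R)) + 0.1    # nonnegative initial factors
Z2 = rng.random((R, 15)) + 0.1
eps = 1e-12                       # guard against division by zero

def objective(Z1, Z2):
    return np.linalg.norm(X - Z1 @ Z2, 'fro')**2

before = objective(Z1, Z2)
for _ in range(200):
    # Multiplicative updates: entrywise ratios keep Z1, Z2 >= 0.
    Z1 *= (X @ Z2.T) / (Z1 @ (Z2 @ Z2.T) + eps)
    Z2 *= (Z1.T @ X) / ((Z1.T @ Z1) @ Z2 + eps)
after = objective(Z1, Z2)
```

Because every update multiplies the current factor by a nonnegative ratio, the iterates never leave the feasible region, which is exactly the constraint Z_1, Z_2 ≥ 0 above.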
DISTRIBUTED IMPLEMENTATION

The data matrix is partitioned into blocks X_{ij}; row blocks correspond to rows of Z_1 and column blocks to columns of Z_2. In each time slot, every processor updates one block, chosen so that no two processors touch the same rows of Z_1 or columns of Z_2:

Time slot 1: X_12 = Z_1(1,:) Z_2(:,2) on P1; X_23 = Z_1(2,:) Z_2(:,3) on P2; X_31 = Z_1(3,:) Z_2(:,1) on P3, by employing IPA.
Time slot 2: X_11 on P1; X_22 on P2; X_33 on P3.
Time slot 3: X_13 on P1; X_21 on P2; X_32 on P3.
Time slot 4: ...
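The staggered block pattern can be generated programmatically. The sketch below is my own reconstruction of the idea (the slot ordering is arbitrary): in slot t, processor p updates block X[p, (p+t) mod n], so within a slot no two processors share a row block of Z_1 or a column block of Z_2, and after n slots every block has been updated exactly once:

```python
# n processors, an n x n grid of data blocks; one block per processor per slot.
n = 3
visited = set()
for t in range(n):
    slot = [(p, (p + t) % n) for p in range(n)]   # (row block, column block)
    cols = {j for _, j in slot}
    assert len(cols) == n        # all Z2 column blocks distinct in this slot
    visited.update(slot)
    print(f"slot {t + 1}: " +
          ", ".join(f"X[{i+1},{j+1}] on P{i+1}" for i, j in slot))
# After n slots, every one of the n*n blocks has been visited exactly once.
assert len(visited) == n * n
```

This is why the updates within a slot can run in parallel without locking: they read and write disjoint pieces of the factors.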
REFORMULATION

The factorization objective, minimize ‖X − Z_1 Z_2‖_F^2 subject to Z_1, Z_2 ∈ Z, is rewritten over the data blocks (numbered 1 through 6 in the figure), each block contributing one component function.

GENERIC PROBLEM

    minimize   Σ_{i ∈ {1,...,m}} f_i(z)
    subject to z ∈ ζ
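To make the generic form concrete: the objective splits into component functions, one per data block, that sum to the full objective. A small least-squares analogue (my own toy example, not the talk's notation) checks the decomposition:

```python
import numpy as np

rng = np.random.default_rng(2)
A, b = rng.random((12, 4)), rng.random(12)
m = 3
blocks = np.array_split(np.arange(12), m)   # rows owned by each component

def f_i(z, i):
    # Component objective: least squares over one block of rows.
    Ai, bi = A[blocks[i]], b[blocks[i]]
    return 0.5 * np.sum((Ai @ z - bi)**2)

z = rng.random(4)
total = sum(f_i(z, i) for i in range(m))
full = 0.5 * np.sum((A @ z - b)**2)
assert np.isclose(total, full)              # f(z) = sum_i f_i(z)
```

Any algorithm that only ever evaluates a subset of the f_i per step (as in the next slides) relies on exactly this additive structure.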
DISTRIBUTED OPTIMIZATION

At each time slot k, we solve over a subset S_k of the component functions f_i, i ∈ {1, 2, ..., m}. We make sure that each data block is visited after c passes (c = 3 in the figure).
INCREMENTAL QUASI-NEWTON ALGORITHM

Unlike gradient-based methods, the proposed algorithm uses second-order information through a Hessian approximation (the L-BFGS quasi-Newton update). The algorithm visits each subset of component functions in the same order (incremental and deterministic). We do not assume convexity of the objective, so the matrix factorization problem can be solved.

CORE STEP: Solve a quadratic approximation of the (partial) objective function:

    Q_k^t(z) = (z − z_k)ᵀ ∇_{S_k} f(z_k) + (1/2)(z − z_k)ᵀ H_t (z − z_k) + (β_t/2) ‖z − z_k‖².
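When ζ = R^n, the quadratic model Q has the closed-form minimizer z = z_k − (H_t + β_t I)^{-1} ∇_{S_k} f(z_k), obtained by setting its gradient to zero. A NumPy check of this (my own sketch, with a random PSD stand-in for H_t):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5
g = rng.random(n)              # partial gradient  grad_{S_k} f(z_k)
B = rng.random((n, n))
H = B @ B.T                    # symmetric PSD Hessian approximation H_t
beta = 0.5
zk = rng.random(n)

# Minimize Q: set grad Q(z) = g + (H + beta I)(z - z_k) to zero.
M = H + beta * np.eye(n)
z_star = zk - np.linalg.solve(M, g)

assert np.allclose(g + M @ (z_star - zk), 0.0)   # stationarity of Q
```

The proximal term (β_t/2)‖z − z_k‖² is what guarantees M = H_t + β_t I is invertible even when the L-BFGS matrix alone is singular.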
INCREMENTAL QUASI-NEWTON ALGORITHM (CONT'D)

    Q_k^t(z) = (z − z_k)ᵀ ∇_{S_k} f(z_k) + (1/2)(z − z_k)ᵀ H_t (z − z_k) + (β_t/2) ‖z − z_k‖².

Algorithm 1: HAMSI
  input: y_0, β_1
  for t = 0, 1, 2, ... do
      z_1 = y_t
      Compute H_t
      for k = 1, 2, ..., c do
          Choose a subset S_k ⊆ {1, ..., m}
          Compute ∇_{S_k} f(z_k)
          z_{k+1} = argmin_{z ∈ ζ} Q_k^t(z)
      end
      y_{t+1} = z_{c+1}
      Set β_{t+1} ≥ β_t
  end
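The loop structure of Algorithm 1 can be sketched on a toy separable least-squares problem. This is only an illustration under simplifying assumptions, not the talk's implementation: H_t is a fixed identity (HAMSI builds an L-BFGS approximation), β_t is held constant, and the subsets S_k are fixed row blocks:

```python
import numpy as np

rng = np.random.default_rng(4)
A, b = rng.random((30, 4)), rng.random(30)
blocks = np.array_split(np.arange(30), 3)   # c = 3 subsets S_1..S_c

def f(z):
    return 0.5 * np.sum((A @ z - b)**2)

def grad_block(z, idx):
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ z - bi)             # grad_{S_k} f(z)

y = np.zeros(4)
beta = 20.0
H = np.eye(4)                               # crude stand-in for L-BFGS H_t
f0 = f(y)
for t in range(50):                         # outer iterations
    z = y.copy()
    for idx in blocks:                      # inner pass over all subsets
        g = grad_block(z, idx)
        # z_{k+1} = argmin Q  =>  z_k - (H + beta I)^{-1} g  (unconstrained)
        z = z - np.linalg.solve(H + beta * np.eye(4), g)
    y = z                                   # y_{t+1} = z_{c+1}
f_final = f(y)
```

With a constant β the iterates only reach a neighborhood of the optimum; the algorithm's nondecreasing β_t schedule is what drives the incremental steps to a stationary point.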
CONVERGENCE ANALYSIS (ζ = R^n)

ASSUMPTIONS
1. The Hessians of the component functions and (H_t + β_t I) are uniformly bounded: ‖∇² f_i(y_t)‖ ≤ L_t ≤ L for all i ∈ S_k, all S_k, and all y_t.
2. The smallest eigenvalue of (H_t + β_t I) is bounded away from zero: ‖(H_t + β_t I)^{-1}‖ ≤ M_t ≤ M for all t.
3. The gradient norms are uniformly bounded: ‖∇_{S_k} f(y_t)‖ ≤ C for all S_k and y_t.
CONVERGENCE ANALYSIS (CONT'D)

LEMMA. At each outer iteration t of Algorithm 1 and for k = 1, ..., c, we have

    δ_k = ‖∇_{S_k} f(z_k) − ∇_{S_k} f(y_t)‖ ≤ L_t M_t Σ_{j=1}^{k−1} (1 + L_t M_t)^{k−1−j} ‖∇_{S_j} f(y_t)‖.

THEOREM. Consider the iterates y_t produced by Algorithm 1. Then, all accumulation points of {y_t} are stationary points of the generic problem.
COROLLARY. Algorithm 1 solves the matrix factorization problem.
PRELIMINARY EXPERIMENTS: SETUP

Linux cluster with 15 nodes. Each node has 8 Intel Xeon 2.50 GHz cores and 16 GB RAM, so the setting allows execution of up to 120 tasks in parallel. The MovieLens (1M) data set is used for our preliminary experiments.
PRELIMINARY EXPERIMENTS

FIGURE: Objective function values

PRELIMINARY EXPERIMENTS (CONT'D)

FIGURE: Root mean square error
CONCLUDING REMARKS

SUMMARY
A promising research path at the intersection of operations research and computer science. A new distributed and parallel implementation for matrix factorization. A generic analysis that could be used for showing convergence of other algorithms.

FUTURE RESEARCH
Extensive computational study. Stochastic version of the proposed algorithm. Quasi-Newton-based Bayesian inference.
More informationHigh Performance Computing for Operation Research
High Performance Computing for Operation Research IEF  Paris Sud University claude.tadonki@upsud.fr INRIAAlchemy seminar, Thursday March 17 Research topics Fundamental Aspects of Algorithms and Complexity
More informationIntroduction to Linear Programming.
Chapter 1 Introduction to Linear Programming. This chapter introduces notations, terminologies and formulations of linear programming. Examples will be given to show how reallife problems can be modeled
More informationAdaptive Search with Stochastic Acceptance Probabilities for Global Optimization
Adaptive Search with Stochastic Acceptance Probabilities for Global Optimization Archis Ghate a and Robert L. Smith b a Industrial Engineering, University of Washington, Box 352650, Seattle, Washington,
More informationLABEL PROPAGATION ON GRAPHS. SEMISUPERVISED LEARNING. Changsheng Liu 10302014
LABEL PROPAGATION ON GRAPHS. SEMISUPERVISED LEARNING Changsheng Liu 10302014 Agenda Semi Supervised Learning Topics in Semi Supervised Learning Label Propagation Local and global consistency Graph
More informationParallel Selective Algorithms for Nonconvex Big Data Optimization
1874 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 63, NO. 7, APRIL 1, 2015 Parallel Selective Algorithms for Nonconvex Big Data Optimization Francisco Facchinei, Gesualdo Scutari, Senior Member, IEEE,
More informationBig Data Optimization: Randomized lockfree methods for minimizing partially separable convex functions
Big Data Optimization: Randomized lockfree methods for minimizing partially separable convex functions Peter Richtárik School of Mathematics The University of Edinburgh Joint work with Martin Takáč (Edinburgh)
More informationOptimization of Supply Chain Networks
Optimization of Supply Chain Networks M. Herty TU Kaiserslautern September 2006 (2006) 1 / 41 Contents 1 Supply Chain Modeling 2 Networks 3 Optimization Continuous optimal control problem Discrete optimal
More informationFixed Point Theorems
Fixed Point Theorems Definition: Let X be a set and let T : X X be a function that maps X into itself. (Such a function is often called an operator, a transformation, or a transform on X, and the notation
More informationELECE8104 Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems
Stochastics models and estimation, Lecture 3b: Linear Estimation in Static Systems Minimum Mean Square Error (MMSE) MMSE estimation of Gaussian random vectors Linear MMSE estimator for arbitrarily distributed
More informationFUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT MINING SYSTEM
International Journal of Innovative Computing, Information and Control ICIC International c 0 ISSN 3448 Volume 8, Number 8, August 0 pp. 4 FUZZY CLUSTERING ANALYSIS OF DATA MINING: APPLICATION TO AN ACCIDENT
More informationSolving polynomial least squares problems via semidefinite programming relaxations
Solving polynomial least squares problems via semidefinite programming relaxations Sunyoung Kim and Masakazu Kojima August 2007, revised in November, 2007 Abstract. A polynomial optimization problem whose
More informationOptimization of Design. Lecturer:DungAn Wang Lecture 12
Optimization of Design Lecturer:DungAn Wang Lecture 12 Lecture outline Reading: Ch12 of text Today s lecture 2 Constrained nonlinear programming problem Find x=(x1,..., xn), a design variable vector of
More informationSemiSupervised Support Vector Machines and Application to Spam Filtering
SemiSupervised Support Vector Machines and Application to Spam Filtering Alexander Zien Empirical Inference Department, Bernhard Schölkopf Max Planck Institute for Biological Cybernetics ECML 2006 Discovery
More informationOptimal File Sharing in Distributed Networks
Optimal File Sharing in Distributed Networks Moni Naor Ron M. Roth Abstract The following file distribution problem is considered: Given a network of processors represented by an undirected graph G = (V,
More informationEXPLICIT ABS SOLUTION OF A CLASS OF LINEAR INEQUALITY SYSTEMS AND LP PROBLEMS. Communicated by Mohammad Asadzadeh. 1. Introduction
Bulletin of the Iranian Mathematical Society Vol. 30 No. 2 (2004), pp 2138. EXPLICIT ABS SOLUTION OF A CLASS OF LINEAR INEQUALITY SYSTEMS AND LP PROBLEMS H. ESMAEILI, N. MAHDAVIAMIRI AND E. SPEDICATO
More informationSeveral Views of Support Vector Machines
Several Views of Support Vector Machines Ryan M. Rifkin Honda Research Institute USA, Inc. Human Intention Understanding Group 2007 Tikhonov Regularization We are considering algorithms of the form min
More informationPacific Journal of Mathematics
Pacific Journal of Mathematics GLOBAL EXISTENCE AND DECREASING PROPERTY OF BOUNDARY VALUES OF SOLUTIONS TO PARABOLIC EQUATIONS WITH NONLOCAL BOUNDARY CONDITIONS Sangwon Seo Volume 193 No. 1 March 2000
More information