# Link-based Analysis on Large Graphs. Presented by Weiren Yu Mar 01, 2011

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Link-based Analysis on Large Graphs Presented by Weiren Yu Mar 01, 2011

2 Overview 1 Introduction 2 Problem Definition 3 Optimization Techniques 4 Experimental Results 2

3 1. Introduction Many applications require a measure of similarity between objects. Recommender System (amazon.com) similarity search Citation of Scientific Papers (citeseer.com) Web Search Engine (google.com) Graph Clustering 3

4 Existing Similarity Measures Textual-Content Similarity (text-based) Vector-cosine similarity, Pearson correlation in IR, Structural-Context Similarity (link-based) PageRank : A page s authority is decided by its neighbors authorities. HITS : a good hub --- a page that pointed to many other pages a good authority --- a page that was linked by many hubs SimRank : similar objects are referenced by similar objects. LinkFusion : reinforcement assumption Penetrating-Rank : entities are similar if (1) they are referenced by similar entities; (2) they reference similar entities. 4

5 What is SimRank? The similarity in a domain can be modeled as graphs. [ vertices objects, edges relationships ] SimRank is an important similarity measure which exploits the relationships between vertices on web graphs. (Glen Jeh & Jennifer Widom,2002) Basic intuition: Two objects are similar if their neighbors are similar. (the recursive definition) Objects are maximally similar to themselves. (the base case ) 5

6 Overview 1 Introduction 2 Problem Definition 3 Optimization Techniques 4 Experimental Results 6

7 SimRank Equation Definition 1 (SimRank similarity) Let s : V 2 [0, 1] be a similarity function on G 2 if a = b, s (a, b) = 1, if I(a) or I(b) =, s (a, b) = 0, otherwise: s(a,b) C I(a) I(b) I( a) I(b) i 1 j 1 s(i(a),i (b)) i j C is a decay factor btw. 0 & 1 symmetric : s(a,b) = s(b,a) Similarity btw. a & b is the average similarity btw. in-neighbors of a and in-neighbors of b. 7

8 Naïve SimRank Computation Iterative Paradigm: S (a,b) 0 0 a b 1 a b I( a) I(b) C k 1 k i j I(a) I(b) i 1 j 1 s (a,b) s (I(a),I (b)), k 0,1, (monotonicity) S k (a, b) S(a, b) as k. (symmetry) S (a, b) = S (b, a). (stability) S (a, b) is independent of S 0 (a, b). (complexity) Time : O(Kn 2 d 2 ), Space : O(n 2 ), where d is the average of I( ) over all nodes. 8

9 Existing Techniques for SimRank Optimization Deterministic Method [PVLDB 08] ( to compute s(, ) iteratively for finding a fixed point ) I( a) I(b) C k 1 k i j I(a) I(b) i 1 j 1 s (a,b) s (I(a),I (b)), k 0,1, Advantage: accurate Disadvantage: high time complexity O(Kn 3 ) Probabilistic Method [WWW 05] ( to estimate s(, ) stochastically by using Monte-Carlo ) s(a,b) = E (c T(a,b) ), where T (a,b) : the first meeting time btw. a & b Advantage: scalable (linear time) Disadvantage: low similarity quality 9

10 Overview 1 Introduction 2 Problem Definition 3 Optimization Techniques 4 Experimental Results 10

11 Motivation The computational time, which has been reduced to O(Kn 3 ) by [PVLDB08], is still rather costly for practical purposes. Optimization for SimRank storage space has not been addressed in scientific literature yet. The accuracy estimate ϵ = c k in [PVLDB08] is solely based on the empirical inductive method and, therefore, is not preferable. 11

12 Our Contributions A matrix representation and a storage scheme for SimRank model has been introduced. to reduce space from O(n 2 ) to O(m + n) to improve time from O(n 3 ) to O (min {n m, n r }) in the worst case, where r log 2 7 Optimization techniques for minimizing the matrix bandwidth have been developed. to improve I/O efficiency A successive over-relaxation method has been showed. to accelerate the rate of convergence 12

13 3.1 Matrix Representations for SimRank Model Let S = (s i,j ) R n n be a SimRank matrix, where s i,j = the SimRank value btw. vertices i and j. P = (p i,j ) N n n be an adjacency matrix, where p i,j = # of edges from vertices i to j. SimRank in matrix notation O(n 3 ) for matrix multiplication (0) S In (k 1) (k) T S c Q S Q I (k 0,1, ) n I( a) I(b) C s (a,b) s (I(a),I (b)) k 1 k i j I(a) I(b) i 1 j 1 p p n n i,a (k) j,b C s, k 0,1, i,j i 1 j 1 p p i i,a j j,b (k 1) (k) T S c Q S Q a b 13

14 3.1 Matrix Representations for SimRank Model (cont.) For dense graphs Fast matrix multiplication algorithms can be applied to speed up the SimRank computation. Strassen Algorithm: O(n r ), where r = log 2 7 Coppersmith-Winograd Algorithm: O(n 2.38 ) For sparse graphs Compressed Sparse Row (CSR) are used to represent Q due to its high compression ratio. CSR has O(m n) time with the space O(m+n). 14

15 3.2 Permuted SimRank Iterative Approach The permutation method allows improving I/O efficiency for SimRank computation. The main idea involves 2 steps: Reversed Cuthill-McKee (RCM) algorithm for nonsymmetric matrix is introduced for finding an optimal permutation while reordering the matrix Q during the precomputation phase. Permuted SimRank iterative equation is developed for reducing the matrix bandwidth for SimRank computation. 15

16 3.2 Permuted SimRank Iterative Approach (cont.) The permutation π can be thought of as a bijection between the vertices of the labeled graph G Q and G π(q). β (π(q)) β (Q). We extend the original RCM to the directed graph by adding the mate Q T and apply RCM to Q + Q T. 16

17 3.2 Permuted SimRank Iterative Approach (cont.) Permuted SimRank Equation Let π be an arbitrary permutation with an induced permutation matrix Θ. For a given graph G, SimRank similarity score can be computed as where ˆ (k) 1 (k) S (S ) (0) Ŝ In ˆ ˆ (k 1) (k) T S c (Q) S (Q) I, k 0,1, n For the computation to be I/O efficient, Q needs to be preordered during the precomputation phrase. 17

18 3.3 Successive Over-relaxation (SOR) SimRank Algorithm SOR can be used for computing S (k) to effectively exhibit faster rate of convergence. SOR SimRank Equation: Let Q = (q i,j ) R n n, S (k) = (s (k) 1 s (k) 2 s (k) n ), where s (k) j is the j-th column vector of S (k), then s c Q ( q s q s ) I GS (k 1) (k) (k 1) i i,j j i,j j n j i j i s (1 ) s s SOR (k 1) SOR (k) GS (k 1) i i i 18

19 Overview 1 Introduction 2 Problem Definition 3 Optimization Techniques 4 Experimental Results 19

20 4 Experimental Evaluation Experimental Setup Hardware 2.0GHz Pentium(R) Dual-Core / 2GB RAM Windows Vista OS / Visual C Data Sets Synthetic graph with an average of 8 links per page. 10 sample adjacency matrices from 1K to 10K with ξ uniform[0; 16] out-links on each row. Real-life Wikipedia (3.2M articles with 110M intra-wiki links Oct. 07) We choose the relationship : a category contains an article to be a link from the category to the article. Parameter Settings c = 0.8, ω = 1.3, ϵ =

21 Experimental Results 21

22 22

23 23

24 24

25 25

26 26

27

28 Overview 1 Introduction 2 Problem Definition 3 Optimization Techniques 4 Experimental Results 5 Conclusions 28

29 Conclusions We formalized the SimRank equation in matrix notations. We investigated optimization issues for SimRank computation. A compressed storage scheme for sparse graphs is adopted for reducing the space from O(n 2 ) to O (n + m). A fast matrix multiplication for dense graphs is used for improving the time from O(n 2 d) to O (min {n m, n r }), r log 2 7. A permuted SimRank iteration was developed in combination of the extended RCM algorithm to achieve its I/O efficiency. A SOR method has been showed to significantly speed up the convergence rate of the SimRank iteration. Our experimental evaluations on synthetic and real-life data sets demonstrate the efficiency of our methods. 29

### Yousef Saad University of Minnesota Computer Science and Engineering. CRM Montreal - April 30, 2008

A tutorial on: Iterative methods for Sparse Matrix Problems Yousef Saad University of Minnesota Computer Science and Engineering CRM Montreal - April 30, 2008 Outline Part 1 Sparse matrices and sparsity

### Practical Graph Mining with R. 5. Link Analysis

Practical Graph Mining with R 5. Link Analysis Outline Link Analysis Concepts Metrics for Analyzing Networks PageRank HITS Link Prediction 2 Link Analysis Concepts Link A relationship between two entities

### DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

### KEYWORD SEARCH OVER PROBABILISTIC RDF GRAPHS

ABSTRACT KEYWORD SEARCH OVER PROBABILISTIC RDF GRAPHS In many real applications, RDF (Resource Description Framework) has been widely used as a W3C standard to describe data in the Semantic Web. In practice,

### 1 o Semestre 2007/2008

Departamento de Engenharia Informática Instituto Superior Técnico 1 o Semestre 2007/2008 Outline 1 2 3 4 5 Outline 1 2 3 4 5 Exploiting Text How is text exploited? Two main directions Extraction Extraction

### Graph Processing and Social Networks

Graph Processing and Social Networks Presented by Shu Jiayu, Yang Ji Department of Computer Science and Engineering The Hong Kong University of Science and Technology 2015/4/20 1 Outline Background Graph

### CSE 494 CSE/CBS 598 (Fall 2007): Numerical Linear Algebra for Data Exploration Clustering Instructor: Jieping Ye

CSE 494 CSE/CBS 598 Fall 2007: Numerical Linear Algebra for Data Exploration Clustering Instructor: Jieping Ye 1 Introduction One important method for data compression and classification is to organize

### A. V. Gerbessiotis CS Spring 2014 PS 3 Mar 24, 2014 No points

A. V. Gerbessiotis CS 610-102 Spring 2014 PS 3 Mar 24, 2014 No points Problem 1. Suppose that we insert n keys into a hash table of size m using open addressing and uniform hashing. Let p(n, m) be the

### Rank one SVD: un algorithm pour la visualisation d une matrice non négative

Rank one SVD: un algorithm pour la visualisation d une matrice non négative L. Labiod and M. Nadif LIPADE - Universite ParisDescartes, France ECAIS 2013 November 7, 2013 Outline Outline 1 Data visualization

### Fast Multipole Method for particle interactions: an open source parallel library component

Fast Multipole Method for particle interactions: an open source parallel library component F. A. Cruz 1,M.G.Knepley 2,andL.A.Barba 1 1 Department of Mathematics, University of Bristol, University Walk,

### Big Data Technology Motivating NoSQL Databases: Computing Page Importance Metrics at Crawl Time

Big Data Technology Motivating NoSQL Databases: Computing Page Importance Metrics at Crawl Time Edward Bortnikov & Ronny Lempel Yahoo! Labs, Haifa Class Outline Link-based page importance measures Why

### Asking Hard Graph Questions. Paul Burkhardt. February 3, 2014

Beyond Watson: Predictive Analytics and Big Data U.S. National Security Agency Research Directorate - R6 Technical Report February 3, 2014 300 years before Watson there was Euler! The first (Jeopardy!)

### Web Graph Analyzer Tool

Web Graph Analyzer Tool Konstantin Avrachenkov INRIA Sophia Antipolis 2004, route des Lucioles, B.P.93 06902, France Email: K.Avrachenkov@sophia.inria.fr Danil Nemirovsky St.Petersburg State University

### Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max

### SIGMOD RWE Review Towards Proximity Pattern Mining in Large Graphs

SIGMOD RWE Review Towards Proximity Pattern Mining in Large Graphs Fabian Hueske, TU Berlin June 26, 21 1 Review This document is a review report on the paper Towards Proximity Pattern Mining in Large

### Notes on Matrix Multiplication and the Transitive Closure

ICS 6D Due: Wednesday, February 25, 2015 Instructor: Sandy Irani Notes on Matrix Multiplication and the Transitive Closure An n m matrix over a set S is an array of elements from S with n rows and m columns.

### Social Media Mining. Network Measures

Klout Measures and Metrics 22 Why Do We Need Measures? Who are the central figures (influential individuals) in the network? What interaction patterns are common in friends? Who are the like-minded users

### Lecture Notes on Spanning Trees

Lecture Notes on Spanning Trees 15-122: Principles of Imperative Computation Frank Pfenning Lecture 26 April 26, 2011 1 Introduction In this lecture we introduce graphs. Graphs provide a uniform model

### Advanced Algebra 2. I. Equations and Inequalities

Advanced Algebra 2 I. Equations and Inequalities A. Real Numbers and Number Operations 6.A.5, 6.B.5, 7.C.5 1) Graph numbers on a number line 2) Order real numbers 3) Identify properties of real numbers

### SECTIONS 1.5-1.6 NOTES ON GRAPH THEORY NOTATION AND ITS USE IN THE STUDY OF SPARSE SYMMETRIC MATRICES

SECIONS.5-.6 NOES ON GRPH HEORY NOION ND IS USE IN HE SUDY OF SPRSE SYMMERIC MRICES graph G ( X, E) consists of a finite set of nodes or vertices X and edges E. EXMPLE : road map of part of British Columbia

### 6. Cholesky factorization

6. Cholesky factorization EE103 (Fall 2011-12) triangular matrices forward and backward substitution the Cholesky factorization solving Ax = b with A positive definite inverse of a positive definite matrix

### Efficient visual search of local features. Cordelia Schmid

Efficient visual search of local features Cordelia Schmid Visual search change in viewing angle Matches 22 correct matches Image search system for large datasets Large image dataset (one million images

### STATISTICS AND DATA ANALYSIS IN GEOLOGY, 3rd ed. Clarificationof zonationprocedure described onpp. 238-239

STATISTICS AND DATA ANALYSIS IN GEOLOGY, 3rd ed. by John C. Davis Clarificationof zonationprocedure described onpp. 38-39 Because the notation used in this section (Eqs. 4.8 through 4.84) is inconsistent

Neural Network Add-in Version 1.5 Software User s Guide Contents Overview... 2 Getting Started... 2 Working with Datasets... 2 Open a Dataset... 3 Save a Dataset... 3 Data Pre-processing... 3 Lagging...

### (57) (58) (59) (60) (61)

Module 4 : Solving Linear Algebraic Equations Section 5 : Iterative Solution Techniques 5 Iterative Solution Techniques By this approach, we start with some initial guess solution, say for solution and

### (67902) Topics in Theory and Complexity Nov 2, 2006. Lecture 7

(67902) Topics in Theory and Complexity Nov 2, 2006 Lecturer: Irit Dinur Lecture 7 Scribe: Rani Lekach 1 Lecture overview This Lecture consists of two parts In the first part we will refresh the definition

### The PageRank Citation Ranking: Bring Order to the Web

The PageRank Citation Ranking: Bring Order to the Web presented by: Xiaoxi Pang 25.Nov 2010 1 / 20 Outline Introduction A ranking for every page on the Web Implementation Convergence Properties Personalized

### Graph Algorithms. Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar

Graph Algorithms Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar To accompany the text Introduction to Parallel Computing, Addison Wesley, 3. Topic Overview Definitions and Representation Minimum

### Entropy based Graph Clustering: Application to Biological and Social Networks

Entropy based Graph Clustering: Application to Biological and Social Networks Edward C Kenley Young-Rae Cho Department of Computer Science Baylor University Complex Systems Definition Dynamically evolving

### Announcements. CompSci 230 Discrete Math for Computer Science. Test 1

CompSci 230 Discrete Math for Computer Science Sep 26, 2013 Announcements Exam 1 is Tuesday, Oct. 1 No class, Oct 3, No recitation Oct 4-7 Prof. Rodger is out Sep 30-Oct 4 There is Recitation: Sept 27-30.

### Relations Graphical View

Relations Slides by Christopher M. Bourke Instructor: Berthe Y. Choueiry Introduction Recall that a relation between elements of two sets is a subset of their Cartesian product (of ordered pairs). A binary

### Part 1: Link Analysis & Page Rank

Chapter 8: Graph Data Part 1: Link Analysis & Page Rank Based on Leskovec, Rajaraman, Ullman 214: Mining of Massive Datasets 1 Exam on the 5th of February, 216, 14. to 16. If you wish to attend, please

### Perron vector Optimization applied to search engines

Perron vector Optimization applied to search engines Olivier Fercoq INRIA Saclay and CMAP Ecole Polytechnique May 18, 2011 Web page ranking The core of search engines Semantic rankings (keywords) Hyperlink

### USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS

USING SPECTRAL RADIUS RATIO FOR NODE DEGREE TO ANALYZE THE EVOLUTION OF SCALE- FREE NETWORKS AND SMALL-WORLD NETWORKS Natarajan Meghanathan Jackson State University, 1400 Lynch St, Jackson, MS, USA natarajan.meghanathan@jsums.edu

### ANALYSIS, THEORY AND DESIGN OF LOGISTIC REGRESSION CLASSIFIERS USED FOR VERY LARGE SCALE DATA MINING

ANALYSIS, THEORY AND DESIGN OF LOGISTIC REGRESSION CLASSIFIERS USED FOR VERY LARGE SCALE DATA MINING BY OMID ROUHANI-KALLEH THESIS Submitted as partial fulfillment of the requirements for the degree of

### Matrix Multiplication

Matrix Multiplication CPS343 Parallel and High Performance Computing Spring 2016 CPS343 (Parallel and HPC) Matrix Multiplication Spring 2016 1 / 32 Outline 1 Matrix operations Importance Dense and sparse

### Introduction to Flocking {Stochastic Matrices}

Supelec EECI Graduate School in Control Introduction to Flocking {Stochastic Matrices} A. S. Morse Yale University Gif sur - Yvette May 21, 2012 CRAIG REYNOLDS - 1987 BOIDS The Lion King CRAIG REYNOLDS

### Hadoop SNS. renren.com. Saturday, December 3, 11

Hadoop SNS renren.com Saturday, December 3, 11 2.2 190 40 Saturday, December 3, 11 Saturday, December 3, 11 Saturday, December 3, 11 Saturday, December 3, 11 Saturday, December 3, 11 Saturday, December

### Data Mining. Cluster Analysis: Advanced Concepts and Algorithms

Data Mining Cluster Analysis: Advanced Concepts and Algorithms Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 More Clustering Methods Prototype-based clustering Density-based clustering Graph-based

### MapReduce Approach to Collective Classification for Networks

MapReduce Approach to Collective Classification for Networks Wojciech Indyk 1, Tomasz Kajdanowicz 1, Przemyslaw Kazienko 1, and Slawomir Plamowski 1 Wroclaw University of Technology, Wroclaw, Poland Faculty

### A General Approach to Variance Estimation under Imputation for Missing Survey Data

A General Approach to Variance Estimation under Imputation for Missing Survey Data J.N.K. Rao Carleton University Ottawa, Canada 1 2 1 Joint work with J.K. Kim at Iowa State University. 2 Workshop on Survey

### Technical Report. Parallel iterative solution method for large sparse linear equation systems. Rashid Mehmood, Jon Crowcroft. Number 650.

Technical Report UCAM-CL-TR-65 ISSN 1476-2986 Number 65 Computer Laboratory Parallel iterative solution method for large sparse linear equation systems Rashid Mehmood, Jon Crowcroft October 25 15 JJ Thomson

### Probabilistic Models for Big Data. Alex Davies and Roger Frigola University of Cambridge 13th February 2014

Probabilistic Models for Big Data Alex Davies and Roger Frigola University of Cambridge 13th February 2014 The State of Big Data Why probabilistic models for Big Data? 1. If you don t have to worry about

### APPM4720/5720: Fast algorithms for big data. Gunnar Martinsson The University of Colorado at Boulder

APPM4720/5720: Fast algorithms for big data Gunnar Martinsson The University of Colorado at Boulder Course objectives: The purpose of this course is to teach efficient algorithms for processing very large

### Francesco Sorrentino Department of Mechanical Engineering

Master stability function approaches to analyze stability of the synchronous evolution for hypernetworks and of synchronized clusters for networks with symmetries Francesco Sorrentino Department of Mechanical

### 10-810 /02-710 Computational Genomics. Clustering expression data

10-810 /02-710 Computational Genomics Clustering expression data What is Clustering? Organizing data into clusters such that there is high intra-cluster similarity low inter-cluster similarity Informally,

### Sparse direct methods

Sparse direct methods Week 6: Monday, Sep 24 Suppose A is a sparse matrix, and P A = LU. Will L and U also be sparse? The answer depends in a somewhat complicated way on the structure of the graph associated

### The Trimmed Iterative Closest Point Algorithm

Image and Pattern Analysis (IPAN) Group Computer and Automation Research Institute, HAS Budapest, Hungary The Trimmed Iterative Closest Point Algorithm Dmitry Chetverikov and Dmitry Stepanov http://visual.ipan.sztaki.hu

### Analysis and Computation of Google s PageRank

Analysis and Computation of Google s PageRank Ilse Ipsen North Carolina State University Joint work with Rebecca M. Wills IMACS p.1 Overview Goal: Compute (citation) importance of a web page Simple Web

### Lecture 9. 1 Introduction. 2 Random Walks in Graphs. 1.1 How To Explore a Graph? CS-621 Theory Gems October 17, 2012

CS-62 Theory Gems October 7, 202 Lecture 9 Lecturer: Aleksander Mądry Scribes: Dorina Thanou, Xiaowen Dong Introduction Over the next couple of lectures, our focus will be on graphs. Graphs are one of

### Course: Model, Learning, and Inference: Lecture 5

Course: Model, Learning, and Inference: Lecture 5 Alan Yuille Department of Statistics, UCLA Los Angeles, CA 90095 yuille@stat.ucla.edu Abstract Probability distributions on structured representation.

### Equivalence Concepts for Social Networks

Equivalence Concepts for Social Networks Tom A.B. Snijders University of Oxford March 26, 2009 c Tom A.B. Snijders (University of Oxford) Equivalences in networks March 26, 2009 1 / 40 Outline Structural

### Network Analysis and Visualization of Staphylococcus aureus. by Russ Gibson

Network Analysis and Visualization of Staphylococcus aureus by Russ Gibson Network analysis Based on graph theory Probabilistic models (random graphs) developed by Erdős and Rényi in 1959 Theory and tools

### Scalable Distributed Schur Complement Solvers for Internal and External Flow Computations on Many-Core Architectures

Scalable Distributed Schur Complement Solvers for Internal and External Flow Computations on Many-Core Architectures Dr.-Ing. Achim Basermann, Dr. Hans-Peter Kersken, Melven Zöllner** German Aerospace

### Practical Guide to the Simplex Method of Linear Programming

Practical Guide to the Simplex Method of Linear Programming Marcel Oliver Revised: April, 0 The basic steps of the simplex algorithm Step : Write the linear programming problem in standard form Linear

### NETZCOPE - a tool to analyze and display complex R&D collaboration networks

The Task Concepts from Spectral Graph Theory EU R&D Network Analysis Netzcope Screenshots NETZCOPE - a tool to analyze and display complex R&D collaboration networks L. Streit & O. Strogan BiBoS, Univ.

### IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 20, NO. 7, JULY 2009 1181

IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 20, NO. 7, JULY 2009 1181 The Global Kernel k-means Algorithm for Clustering in Feature Space Grigorios F. Tzortzis and Aristidis C. Likas, Senior Member, IEEE

### An Optimized Scheme for Vertical Partitioning of a Distributed Database

310 An Optimized Scheme for Vertical Partitioning of a Distributed Database Eltayeb Salih Abuelyaman CCIS, Prince Sultan University, Riyadh 11586, Saudi Arabia Summary This paper proposes a scheme for

### Scheduling Shop Scheduling. Tim Nieberg

Scheduling Shop Scheduling Tim Nieberg Shop models: General Introduction Remark: Consider non preemptive problems with regular objectives Notation Shop Problems: m machines, n jobs 1,..., n operations

### 5. Orthogonal matrices

L Vandenberghe EE133A (Spring 2016) 5 Orthogonal matrices matrices with orthonormal columns orthogonal matrices tall matrices with orthonormal columns complex matrices with orthonormal columns 5-1 Orthonormal

### A Spectral Clustering Approach to Validating Sensors via Their Peers in Distributed Sensor Networks

A Spectral Clustering Approach to Validating Sensors via Their Peers in Distributed Sensor Networks H. T. Kung Dario Vlah {htk, dario}@eecs.harvard.edu Harvard School of Engineering and Applied Sciences

### Learning Gaussian process models from big data. Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu

Learning Gaussian process models from big data Alan Qi Purdue University Joint work with Z. Xu, F. Yan, B. Dai, and Y. Zhu Machine learning seminar at University of Cambridge, July 4 2012 Data A lot of

### Sibyl: a system for large scale machine learning

Sibyl: a system for large scale machine learning Tushar Chandra, Eugene Ie, Kenneth Goldman, Tomas Lloret Llinares, Jim McFadden, Fernando Pereira, Joshua Redstone, Tal Shaked, Yoram Singer Machine Learning

### Bilinear Prediction Using Low-Rank Models

Bilinear Prediction Using Low-Rank Models Inderjit S. Dhillon Dept of Computer Science UT Austin 26th International Conference on Algorithmic Learning Theory Banff, Canada Oct 6, 2015 Joint work with C-J.

### 1 Cubic Hermite Spline Interpolation

cs412: introduction to numerical analysis 10/26/10 Lecture 13: Cubic Hermite Spline Interpolation II Instructor: Professor Amos Ron Scribes: Yunpeng Li, Mark Cowlishaw, Nathanael Fillmore 1 Cubic Hermite

### Part 2: Community Detection

Chapter 8: Graph Data Part 2: Community Detection Based on Leskovec, Rajaraman, Ullman 2014: Mining of Massive Datasets Big Data Management and Analytics Outline Community Detection - Social networks -

### Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs

CSE599s: Extremal Combinatorics November 21, 2011 Lecture 15 An Arithmetic Circuit Lowerbound and Flows in Graphs Lecturer: Anup Rao 1 An Arithmetic Circuit Lower Bound An arithmetic circuit is just like

### SCAN: A Structural Clustering Algorithm for Networks

SCAN: A Structural Clustering Algorithm for Networks Xiaowei Xu, Nurcan Yuruk, Zhidan Feng (University of Arkansas at Little Rock) Thomas A. J. Schweiger (Acxiom Corporation) Networks scaling: #edges connected

### Scalable Data Analysis in R. Lee E. Edlefsen Chief Scientist UserR! 2011

Scalable Data Analysis in R Lee E. Edlefsen Chief Scientist UserR! 2011 1 Introduction Our ability to collect and store data has rapidly been outpacing our ability to analyze it We need scalable data analysis

### arxiv:1408.3610v1 [math.pr] 15 Aug 2014

PageRank in scale-free random graphs Ningyuan Chen, Nelly Litvak 2, Mariana Olvera-Cravioto Columbia University, 500 W. 20th Street, 3rd floor, New York, NY 0027 2 University of Twente, P.O.Box 27, 7500AE,

### We call this set an n-dimensional parallelogram (with one vertex 0). We also refer to the vectors x 1,..., x n as the edges of P.

Volumes of parallelograms 1 Chapter 8 Volumes of parallelograms In the present short chapter we are going to discuss the elementary geometrical objects which we call parallelograms. These are going to

### Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows

TECHNISCHE UNIVERSITEIT EINDHOVEN Branch-and-Price Approach to the Vehicle Routing Problem with Time Windows Lloyd A. Fasting May 2014 Supervisors: dr. M. Firat dr.ir. M.A.A. Boon J. van Twist MSc. Contents

### BIG DATA PROBLEMS AND LARGE-SCALE OPTIMIZATION: A DISTRIBUTED ALGORITHM FOR MATRIX FACTORIZATION

BIG DATA PROBLEMS AND LARGE-SCALE OPTIMIZATION: A DISTRIBUTED ALGORITHM FOR MATRIX FACTORIZATION Ş. İlker Birbil Sabancı University Ali Taylan Cemgil 1, Hazal Koptagel 1, Figen Öztoprak 2, Umut Şimşekli

### A scalable multilevel algorithm for graph clustering and community structure detection

A scalable multilevel algorithm for graph clustering and community structure detection Hristo N. Djidjev 1 Los Alamos National Laboratory, Los Alamos, NM 87545 Abstract. One of the most useful measures

Adaptive Linear Programming Decoding Mohammad H. Taghavi and Paul H. Siegel ECE Department, University of California, San Diego Email: (mtaghavi, psiegel)@ucsd.edu ISIT 2006, Seattle, USA, July 9 14, 2006

### Thinkwell s Homeschool Algebra 2 Course Lesson Plan: 34 weeks

Thinkwell s Homeschool Algebra 2 Course Lesson Plan: 34 weeks Welcome to Thinkwell s Homeschool Algebra 2! We re thrilled that you ve decided to make us part of your homeschool curriculum. This lesson

### CSC2420 Fall 2012: Algorithm Design, Analysis and Theory

CSC2420 Fall 2012: Algorithm Design, Analysis and Theory Allan Borodin November 15, 2012; Lecture 10 1 / 27 Randomized online bipartite matching and the adwords problem. We briefly return to online algorithms

### Approximation Algorithms

Approximation Algorithms or: How I Learned to Stop Worrying and Deal with NP-Completeness Ong Jit Sheng, Jonathan (A0073924B) March, 2012 Overview Key Results (I) General techniques: Greedy algorithms

### Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes

Parallel Programming at the Exascale Era: A Case Study on Parallelizing Matrix Assembly For Unstructured Meshes Eric Petit, Loïc Thebault, Quang V. Dinh May 2014 EXA2CT Consortium 2 WPs Organization Proto-Applications

### 1 Solving LPs: The Simplex Algorithm of George Dantzig

Solving LPs: The Simplex Algorithm of George Dantzig. Simplex Pivoting: Dictionary Format We illustrate a general solution procedure, called the simplex algorithm, by implementing it on a very simple example.

### Interval Pseudo-Inverse Matrices and Interval Greville Algorithm

Interval Pseudo-Inverse Matrices and Interval Greville Algorithm P V Saraev Lipetsk State Technical University, Lipetsk, Russia psaraev@yandexru Abstract This paper investigates interval pseudo-inverse

### Compact Representations and Approximations for Compuation in Games

Compact Representations and Approximations for Compuation in Games Kevin Swersky April 23, 2008 Abstract Compact representations have recently been developed as a way of both encoding the strategic interactions

### Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

### 3. Interpolation. Closing the Gaps of Discretization... Beyond Polynomials

3. Interpolation Closing the Gaps of Discretization... Beyond Polynomials Closing the Gaps of Discretization... Beyond Polynomials, December 19, 2012 1 3.3. Polynomial Splines Idea of Polynomial Splines

### Big Graph Processing: Some Background

Big Graph Processing: Some Background Bo Wu Colorado School of Mines Part of slides from: Paul Burkhardt (National Security Agency) and Carlos Guestrin (Washington University) Mines CSCI-580, Bo Wu Graphs

### Large-scale Data Mining: MapReduce and Beyond Part 2: Algorithms. Spiros Papadimitriou, IBM Research Jimeng Sun, IBM Research Rong Yan, Facebook

Large-scale Data Mining: MapReduce and Beyond Part 2: Algorithms Spiros Papadimitriou, IBM Research Jimeng Sun, IBM Research Rong Yan, Facebook Part 2:Mining using MapReduce Mining algorithms using MapReduce

### Tensor Decompositions for Analyzing Multi-link Graphs

Tensor Decompositions for Analyzing Multi-link Graphs Danny Dunlavy, Tammy Kolda, Philip Kegelmeyer Sandia National Laboratories SIAM Parallel Processing for Scientific Computing March 13, 2008 Sandia

### Social Network Mining

Social Network Mining Data Mining November 11, 2013 Frank Takes (ftakes@liacs.nl) LIACS, Universiteit Leiden Overview Social Network Analysis Graph Mining Online Social Networks Friendship Graph Semantics

### Algebraic Complexity Theory and Matrix Multiplication

Algebraic Complexity Theory and Matrix Multiplication [Handout for the Tutorial] François Le Gall Department of Computer Science Graduate School of Information Science and Technology The University of

### A Markovian model of the RED mechanism solved with a cluster of computers

Annales UMCS Informatica AI 5 (2006) 19-27 Annales UMCS Informatica Lublin-Polonia Sectio AI http://www.annales.umcs.lublin.pl/ A Markovian model of the RED mechanism solved with a cluster of computers

### PART III. OPS-based wide area networks

PART III OPS-based wide area networks Chapter 7 Introduction to the OPS-based wide area network 7.1 State-of-the-art In this thesis, we consider the general switch architecture with full connectivity

### Acoustics Analysis of Speaker

Acoustics Analysis of Speaker 1 Introduction ANSYS 14.0 offers many enhancements in the area of acoustics. In this presentation, an example speaker analysis will be shown to highlight some of the acoustics

### Current Standard: Mathematical Concepts and Applications Shape, Space, and Measurement- Primary

Shape, Space, and Measurement- Primary A student shall apply concepts of shape, space, and measurement to solve problems involving two- and three-dimensional shapes by demonstrating an understanding of:

### Statistical Machine Learning

Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

### Binary Image Analysis

Binary Image Analysis Segmentation produces homogenous regions each region has uniform gray-level each region is a binary image (0: background, 1: object or the reverse) more intensity values for overlapping

### Optimization of the HOTS score of a website s pages

Optimization of the HOTS score of a website s pages Olivier Fercoq and Stéphane Gaubert INRIA Saclay and CMAP Ecole Polytechnique June 21st, 2012 Toy example with 21 pages 3 5 6 2 4 20 1 12 9 11 15 16

### The Green Index: A Metric for Evaluating System-Wide Energy Efficiency in HPC Systems

202 IEEE 202 26th IEEE International 26th International Parallel Parallel and Distributed and Distributed Processing Processing Symposium Symposium Workshops Workshops & PhD Forum The Green Index: A Metric

### Parallel Data Selection Based on Neurodynamic Optimization in the Era of Big Data

Parallel Data Selection Based on Neurodynamic Optimization in the Era of Big Data Jun Wang Department of Mechanical and Automation Engineering The Chinese University of Hong Kong Shatin, New Territories,