# A deeper analysis indicates, at least in qualitative

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Further formal analysis We have already Indicated that the logistic model allows for the introduction of new parameters, which may represent, for example, interactions between pairs of existing query terms or the addition of new terms to the query. A superficial analysis suggests that any additional parameters should be useful in helping to distinguish relevant fron non-relevant documents, or at worst neutral. However, some of our early experiments were conducted on term interactions, with entirely negative results. A deeper analysis indicates, at least in qualitative terms, why this might be. here is only a limited amount of data from which the parameters are to be estimated, and introducing new parameters in effect stretches this data further. hus all the parameters are less precisely estimated, and this drop in precision more than outweighs (or may do so) the potential improvement due to including the new parameters. his property was indicated previously in van Rijsbergen's (979) phrase "the curse of dimensionality". Is it possible to quantify this property, so as to derive a rule or rules which would define when a parameter should or should not be added? in the present section. his question is addressed We have not in fact succeeded in developing an immediately applicable rule; however, we feel that the analysis gives some insight into the problem, and may also provide a formal basis suitable for further work. 4. Definitions Suppose that we take repeated samples of n_ documents and the probability of a document in cell j If k.(t) being relevant is constant. is the number of relevant documents in cell j after t 0 samples then k.(t)/t 0 notation > n.p. as 3 * <*> a.s. 0 0 with the obvious

2 39. d -r f (k(t) 3 tn ; p)» ^> n.p.p. - n.log (l+e p j) ~~ -r J = he value of which maximises the limit above will be demoted ^ can J - IL.'tn be thought of as a sort of asymptotic bias. It is where we lose out because our model is wrong. If C then J yy^ = I. As is made larger the bias will decrease but the expected value of J^ - JLvn will increase Suppose we add another vector to m, increasing its dimension by one. We would like to try and get estimates of the way the "bias" J_ - j_ m and the "variance ( I JLm " JLyvJ ) vary. 4.2 he "variance" Let ty (x) be any real valued function and suppose ty(x) has a maximum at x = 0. Expanding ^ as a power series we have say 2 3 \b(x) = a + a 0 x + a 7 x + 2 ^ (x) = 2a c,x + ZCLJX + ^(x) = 2a 2 + 6a 2x + 2 and so tf (x)/ \$" (x) = x + 0(x ) as x ^0. Now suppose ty has its maximum at x. hen if \x - is small ^' (x)/^" (x) is a reasonable estimate of x - x.

3 40. When x_ is a vector variable ^? (x) becomes the vector of first order partial derivatives and ^" (x) the matrix of second order derivatives but the argument still works. Hence we might be able to get some idea of the behaviour of ^ - ILyn by looking at the first and second derivatives of f. 3f - e % -* = k. - n. = k. - n.r. say; % + e % with the obvious notation f = *-** Also i!t -z- = j 3p. p. J p. n, ~ = n.r.(l-r J % = j which we will abreviate as 9 2 f,l - = ff _ _ (VJ where _ ^ is the diagonal matrix whose i'th diagonal entry is r.( - r.j. Now let ^_ be a d x m matrix whose columns are a basis for rfl. (m being the dimension of ). Writing p = A o we have (with some abuse of notation) ^ f(a a) = / r ; = / (k - # r) 2 2 8o f(a a) = ^ ^4 fpj 4 3p = A N <j> W i4

4 4. As both jiyy> and JLyyi such that JLyy, = are in d. > >n = A n we can '^ f ind 7 &nd ^ and o W *H t> e the which maximises f(a_ ). Hence we can approximate JL fr\. ~ ij^ by 2 A< -^ / M a))' JL /M a; -3 A f/v ( m ; ^; 2 /* (k- I Em) First we note that A^ # \$_ A_ will be non-singular if and only if N_ A_ has rank m but this is implied by our assumptions that N y = 0 and ye 7l=^>y = 0. Now, if instead of the usual Euclidian distance, we use the norm I. = N_ (jp y^ ) we find that *Wfc - ivp_ m ) A_ (/N_ ±A) 2 A (k - N^^ ) () Because of the way p_rr\. is defined we have that A (N_ p_ - N_ p_ ^ ) and so we can replace our estimate () by (k - N ) A(^N_ \$_A) 2 A (k- NpJ (2) Now for each j. k. - n.p. is a random variable with mean 0. If for each j we define X. 3 ~k. - n.p J/(n.p.a - p.)r n. = 0 J _a N(0 3 ) random variable independent of the other X. n; = 0 -z. 0

5 42. k then for all j X.(n x>. (-v.)r = k. - p.n. and we can replace our estimate by Now E(X.X.) and so the expectation of (3) given N matrix. Reorganising it a bit we get will be the trace of the inner E(( lm - ^ m ) N* (p m )( lm - l m ; \ N) r(/l f )A)~ ^ v (A N_ * ( Z m )A) ~ 2 (^l (R)A) he first point about this is that if 2.Yt\ = R then the above estimate equals m (the dimension of Ml J and each new vector we add increases it by. We will come back to the more general case later. here are two main questions about the above estimate. "How good is it?" and "Can we do better?". It is possible that it is not too bad. he difference between f'/f and jn_ - J_ might average out when we take the expectation. Clearly more work could be done. 4.3 he "bias" Let [ft and A_ be defined as before and suppose we increase to include the vector a. We write n = m & <a> and B = (A ex) We assume that both maxima exist.

6 43. i.e. f(p) - ^> n o [ PJ P J " log( + e ^ } has a finite maximum in I L at and in u[ at [^. Jl/* will be the unique vector satisfying. r m elfl and //' 0* m ) = 0_ but / f^yw ^ = #( " E JYI ^ and so ZLttl is that unique vector satisying (i) jr m e V/l (ii) A N( - m ) = similarly for jr we have (iii) yv (iv) / ff< - n ) = 0_ Note that (i)=>(iii) and (iv) =Xii). d Now let us suppose we have an arc y_ (t) in, ~* with the following properties (as usual git) = " ) +e ^ () x <0> =!IYn and ^ (i) = ^n (2) /V ( - (t)) = tf, 0 4 <_ (3) x<*>' e Yl5 0 i * i (4) a, N^ ( - git)) is monotonously decreasing for (9 ^ < ^ By (3) we can write x (*) = JL (t) C(t)elR

7 44. and by (2) L K. ( "* C*)> = j A[( ~ (*))JOL where H = m + dim. differentiating both sides w.r.t. t gives i) d 3t, exp B t, (t) a N( - - SL(t)) H = R N. + exp B_ (t) B N ~7~. B t, (t).2 B r,'(t) ( B_N\$ (q(t)) B s (t) but x' (t) = S _c' <t) and so Y' (t) = BCEFN (glb)' n ^ «fe _ ^*^ Suppose we can consider that the variation in _ (g) is fairly small compared with its size. hen we can write. i m _ \ ±(o) - cz; BCA ^^ ;B; 2 n J;i(v_ - v_^) Now if we choose as. t. # ( yn J^.=. BfF 2^ (f) B; n = B( AA ^ A) 0 (a. N \$ a) r] = (</# <j> ^ - = (a N \$ a) a

8 45. giving ^m -ilyv SL BR " Etvi > a N <{> a Hence a ff <f) a If we go back to forumla (4) in the "variance" section and impose the conditions a_ N_ < _ A_ = 0_ and ^(gy* ) ~ / ) we find that the change in the variance going from VYl to Kl is about tfk i (pj^/(ik i fem.^ a if Comparing these we get the criterion that we should include An Example It is interesting to look at the case when we have only two terms. hen d = 4, the 4 classes being class 0 : all the documents with neither term class : the documents with term but not term 2 class 2 : the documents with term 2 but not term class 3 : the documents with both terms We want to see what happens when we add the interaction to the independence model. he independence model has dimension 3. A suitable basis for to. would be the columns of A where

9 \ 0 0 \ \ A ) If we assume that there are some documents* in each class ( N is non-singular) then a suitable candidate for a is a = (Mfcy^ ) where 2 \-l \-l K ) he criterion for the inclusion of a then becomes I 2-2 If we assume ^ - ^^i is small compared to <f> we can simplify this 2 (n (& (R ~ m )) > 3 (p.(l - pj)n. -i he way the left-hand side varies with N_ needs more investigation but it should be sandwiched between positive upper and lower bounds. he RHS on the other hand tends to 0 only if all the n. tend to infinity. he tentative conclusion we might draw from this is that there is no point in including an interaction unless there are a number of documents in each of the four classes.

### Least Squares Estimation

Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

### Notes V General Equilibrium: Positive Theory. 1 Walrasian Equilibrium and Excess Demand

Notes V General Equilibrium: Positive Theory In this lecture we go on considering a general equilibrium model of a private ownership economy. In contrast to the Notes IV, we focus on positive issues such

### Vector and Matrix Norms

Chapter 1 Vector and Matrix Norms 11 Vector Spaces Let F be a field (such as the real numbers, R, or complex numbers, C) with elements called scalars A Vector Space, V, over the field F is a non-empty

### ANALYTIC HIERARCHY PROCESS (AHP) TUTORIAL

Kardi Teknomo ANALYTIC HIERARCHY PROCESS (AHP) TUTORIAL Revoledu.com Table of Contents Analytic Hierarchy Process (AHP) Tutorial... 1 Multi Criteria Decision Making... 1 Cross Tabulation... 2 Evaluation

### ECON20310 LECTURE SYNOPSIS REAL BUSINESS CYCLE

ECON20310 LECTURE SYNOPSIS REAL BUSINESS CYCLE YUAN TIAN This synopsis is designed merely for keep a record of the materials covered in lectures. Please refer to your own lecture notes for all proofs.

### Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models General Linear Models - part I Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs. Lyngby

### Lecture 3: Linear methods for classification

Lecture 3: Linear methods for classification Rafael A. Irizarry and Hector Corrada Bravo February, 2010 Today we describe four specific algorithms useful for classification problems: linear regression,

### Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model

Overview of Violations of the Basic Assumptions in the Classical Normal Linear Regression Model 1 September 004 A. Introduction and assumptions The classical normal linear regression model can be written

### Factor analysis. Angela Montanari

Factor analysis Angela Montanari 1 Introduction Factor analysis is a statistical model that allows to explain the correlations between a large number of observed correlated variables through a small number

### 3. INNER PRODUCT SPACES

. INNER PRODUCT SPACES.. Definition So far we have studied abstract vector spaces. These are a generalisation of the geometric spaces R and R. But these have more structure than just that of a vector space.

### SECOND DERIVATIVE TEST FOR CONSTRAINED EXTREMA

SECOND DERIVATIVE TEST FOR CONSTRAINED EXTREMA This handout presents the second derivative test for a local extrema of a Lagrange multiplier problem. The Section 1 presents a geometric motivation for the

### Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh

Modern Optimization Methods for Big Data Problems MATH11146 The University of Edinburgh Peter Richtárik Week 3 Randomized Coordinate Descent With Arbitrary Sampling January 27, 2016 1 / 30 The Problem

### Lecture 4: BK inequality 27th August and 6th September, 2007

CSL866: Percolation and Random Graphs IIT Delhi Amitabha Bagchi Scribe: Arindam Pal Lecture 4: BK inequality 27th August and 6th September, 2007 4. Preliminaries The FKG inequality allows us to lower bound

### Inner Product Spaces

Math 571 Inner Product Spaces 1. Preliminaries An inner product space is a vector space V along with a function, called an inner product which associates each pair of vectors u, v with a scalar u, v, and

### Similarity and Diagonalization. Similar Matrices

MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that

### 8. Linear least-squares

8. Linear least-squares EE13 (Fall 211-12) definition examples and applications solution of a least-squares problem, normal equations 8-1 Definition overdetermined linear equations if b range(a), cannot

### Solution to Homework 2

Solution to Homework 2 Olena Bormashenko September 23, 2011 Section 1.4: 1(a)(b)(i)(k), 4, 5, 14; Section 1.5: 1(a)(b)(c)(d)(e)(n), 2(a)(c), 13, 16, 17, 18, 27 Section 1.4 1. Compute the following, if

### 15.062 Data Mining: Algorithms and Applications Matrix Math Review

.6 Data Mining: Algorithms and Applications Matrix Math Review The purpose of this document is to give a brief review of selected linear algebra concepts that will be useful for the course and to develop

### Example 4.1 (nonlinear pendulum dynamics with friction) Figure 4.1: Pendulum. asin. k, a, and b. We study stability of the origin x

Lecture 4. LaSalle s Invariance Principle We begin with a motivating eample. Eample 4.1 (nonlinear pendulum dynamics with friction) Figure 4.1: Pendulum Dynamics of a pendulum with friction can be written

### Nonparametric adaptive age replacement with a one-cycle criterion

Nonparametric adaptive age replacement with a one-cycle criterion P. Coolen-Schrijner, F.P.A. Coolen Department of Mathematical Sciences University of Durham, Durham, DH1 3LE, UK e-mail: Pauline.Schrijner@durham.ac.uk

### Two classes of ternary codes and their weight distributions

Two classes of ternary codes and their weight distributions Cunsheng Ding, Torleiv Kløve, and Francesco Sica Abstract In this paper we describe two classes of ternary codes, determine their minimum weight

### MATH4427 Notebook 2 Spring 2016. 2 MATH4427 Notebook 2 3. 2.1 Definitions and Examples... 3. 2.2 Performance Measures for Estimators...

MATH4427 Notebook 2 Spring 2016 prepared by Professor Jenny Baglivo c Copyright 2009-2016 by Jenny A. Baglivo. All Rights Reserved. Contents 2 MATH4427 Notebook 2 3 2.1 Definitions and Examples...................................

### Inequality, Mobility and Income Distribution Comparisons

Fiscal Studies (1997) vol. 18, no. 3, pp. 93 30 Inequality, Mobility and Income Distribution Comparisons JOHN CREEDY * Abstract his paper examines the relationship between the cross-sectional and lifetime

### The van Hoeij Algorithm for Factoring Polynomials

The van Hoeij Algorithm for Factoring Polynomials Jürgen Klüners Abstract In this survey we report about a new algorithm for factoring polynomials due to Mark van Hoeij. The main idea is that the combinatorial

### Lecture 5 Least-squares

EE263 Autumn 2007-08 Stephen Boyd Lecture 5 Least-squares least-squares (approximate) solution of overdetermined equations projection and orthogonality principle least-squares estimation BLUE property

### Diagonal, Symmetric and Triangular Matrices

Contents 1 Diagonal, Symmetric Triangular Matrices 2 Diagonal Matrices 2.1 Products, Powers Inverses of Diagonal Matrices 2.1.1 Theorem (Powers of Matrices) 2.2 Multiplying Matrices on the Left Right by

### Tail inequalities for order statistics of log-concave vectors and applications

Tail inequalities for order statistics of log-concave vectors and applications Rafał Latała Based in part on a joint work with R.Adamczak, A.E.Litvak, A.Pajor and N.Tomczak-Jaegermann Banff, May 2011 Basic

### 3.8 Finding Antiderivatives; Divergence and Curl of a Vector Field

3.8 Finding Antiderivatives; Divergence and Curl of a Vector Field 77 3.8 Finding Antiderivatives; Divergence and Curl of a Vector Field Overview: The antiderivative in one variable calculus is an important

### Machine Learning and Pattern Recognition Logistic Regression

Machine Learning and Pattern Recognition Logistic Regression Course Lecturer:Amos J Storkey Institute for Adaptive and Neural Computation School of Informatics University of Edinburgh Crichton Street,

### 1 Review of Least Squares Solutions to Overdetermined Systems

cs4: introduction to numerical analysis /9/0 Lecture 7: Rectangular Systems and Numerical Integration Instructor: Professor Amos Ron Scribes: Mark Cowlishaw, Nathanael Fillmore Review of Least Squares

### MATH 240 Fall, Chapter 1: Linear Equations and Matrices

MATH 240 Fall, 2007 Chapter Summaries for Kolman / Hill, Elementary Linear Algebra, 9th Ed. written by Prof. J. Beachy Sections 1.1 1.5, 2.1 2.3, 4.2 4.9, 3.1 3.5, 5.3 5.5, 6.1 6.3, 6.5, 7.1 7.3 DEFINITIONS

### 18.06 Problem Set 4 Solution Due Wednesday, 11 March 2009 at 4 pm in 2-106. Total: 175 points.

806 Problem Set 4 Solution Due Wednesday, March 2009 at 4 pm in 2-06 Total: 75 points Problem : A is an m n matrix of rank r Suppose there are right-hand-sides b for which A x = b has no solution (a) What

### 1. Introduction to multivariate data

. Introduction to multivariate data. Books Chat eld, C. and A.J.Collins, Introduction to multivariate analysis. Chapman & Hall Krzanowski, W.J. Principles of multivariate analysis. Oxford.000 Johnson,

### Section 6.1 - Inner Products and Norms

Section 6.1 - Inner Products and Norms Definition. Let V be a vector space over F {R, C}. An inner product on V is a function that assigns, to every ordered pair of vectors x and y in V, a scalar in F,

### FALSE ALARMS IN FAULT-TOLERANT DOMINATING SETS IN GRAPHS. Mateusz Nikodem

Opuscula Mathematica Vol. 32 No. 4 2012 http://dx.doi.org/10.7494/opmath.2012.32.4.751 FALSE ALARMS IN FAULT-TOLERANT DOMINATING SETS IN GRAPHS Mateusz Nikodem Abstract. We develop the problem of fault-tolerant

### Figure 2.1: Center of mass of four points.

Chapter 2 Bézier curves are named after their inventor, Dr. Pierre Bézier. Bézier was an engineer with the Renault car company and set out in the early 196 s to develop a curve formulation which would

### Notes on Determinant

ENGG2012B Advanced Engineering Mathematics Notes on Determinant Lecturer: Kenneth Shum Lecture 9-18/02/2013 The determinant of a system of linear equations determines whether the solution is unique, without

### 1 Fixed Point Iteration and Contraction Mapping Theorem

1 Fixed Point Iteration and Contraction Mapping Theorem Notation: For two sets A,B we write A B iff x A = x B. So A A is true. Some people use the notation instead. 1.1 Introduction Consider a function

### פרויקט מסכם לתואר בוגר במדעים )B.Sc( במתמטיקה שימושית

המחלקה למתמטיקה Department of Mathematics פרויקט מסכם לתואר בוגר במדעים )B.Sc( במתמטיקה שימושית הימורים אופטימליים ע"י שימוש בקריטריון קלי אלון תושיה Optimal betting using the Kelly Criterion Alon Tushia

### BANACH AND HILBERT SPACE REVIEW

BANACH AND HILBET SPACE EVIEW CHISTOPHE HEIL These notes will briefly review some basic concepts related to the theory of Banach and Hilbert spaces. We are not trying to give a complete development, but

### 0.1 Phase Estimation Technique

Phase Estimation In this lecture we will describe Kitaev s phase estimation algorithm, and use it to obtain an alternate derivation of a quantum factoring algorithm We will also use this technique to design

### Convex analysis and profit/cost/support functions

CALIFORNIA INSTITUTE OF TECHNOLOGY Division of the Humanities and Social Sciences Convex analysis and profit/cost/support functions KC Border October 2004 Revised January 2009 Let A be a subset of R m

### 1 Introduction to Matrices

1 Introduction to Matrices In this section, important definitions and results from matrix algebra that are useful in regression analysis are introduced. While all statements below regarding the columns

### DATA ANALYSIS II. Matrix Algorithms

DATA ANALYSIS II Matrix Algorithms Similarity Matrix Given a dataset D = {x i }, i=1,..,n consisting of n points in R d, let A denote the n n symmetric similarity matrix between the points, given as where

### Mathematical Methods of Engineering Analysis

Mathematical Methods of Engineering Analysis Erhan Çinlar Robert J. Vanderbei February 2, 2000 Contents Sets and Functions 1 1 Sets................................... 1 Subsets.............................

### 4. Matrix inverses. left and right inverse. linear independence. nonsingular matrices. matrices with linearly independent columns

L. Vandenberghe EE133A (Spring 2016) 4. Matrix inverses left and right inverse linear independence nonsingular matrices matrices with linearly independent columns matrices with linearly independent rows

### Section 1.1. Introduction to R n

The Calculus of Functions of Several Variables Section. Introduction to R n Calculus is the study of functional relationships and how related quantities change with each other. In your first exposure to

### Linear Threshold Units

Linear Threshold Units w x hx (... w n x n w We assume that each feature x j and each weight w j is a real number (we will relax this later) We will study three different algorithms for learning linear

### Solution of Linear Systems

Chapter 3 Solution of Linear Systems In this chapter we study algorithms for possibly the most commonly occurring problem in scientific computing, the solution of linear systems of equations. We start

### 1 Error in Euler s Method

1 Error in Euler s Method Experience with Euler s 1 method raises some interesting questions about numerical approximations for the solutions of differential equations. 1. What determines the amount of

### The basic unit in matrix algebra is a matrix, generally expressed as: a 11 a 12. a 13 A = a 21 a 22 a 23

(copyright by Scott M Lynch, February 2003) Brief Matrix Algebra Review (Soc 504) Matrix algebra is a form of mathematics that allows compact notation for, and mathematical manipulation of, high-dimensional

### 3. Reaction Diffusion Equations Consider the following ODE model for population growth

3. Reaction Diffusion Equations Consider the following ODE model for population growth u t a u t u t, u 0 u 0 where u t denotes the population size at time t, and a u plays the role of the population dependent

### MATHEMATICAL METHODS OF STATISTICS

MATHEMATICAL METHODS OF STATISTICS By HARALD CRAMER TROFESSOK IN THE UNIVERSITY OF STOCKHOLM Princeton PRINCETON UNIVERSITY PRESS 1946 TABLE OF CONTENTS. First Part. MATHEMATICAL INTRODUCTION. CHAPTERS

### Inverses and powers: Rules of Matrix Arithmetic

Contents 1 Inverses and powers: Rules of Matrix Arithmetic 1.1 What about division of matrices? 1.2 Properties of the Inverse of a Matrix 1.2.1 Theorem (Uniqueness of Inverse) 1.2.2 Inverse Test 1.2.3

### Option Properties. Liuren Wu. Zicklin School of Business, Baruch College. Options Markets. (Hull chapter: 9)

Option Properties Liuren Wu Zicklin School of Business, Baruch College Options Markets (Hull chapter: 9) Liuren Wu (Baruch) Option Properties Options Markets 1 / 17 Notation c: European call option price.

### Multivariate normal distribution and testing for means (see MKB Ch 3)

Multivariate normal distribution and testing for means (see MKB Ch 3) Where are we going? 2 One-sample t-test (univariate).................................................. 3 Two-sample t-test (univariate).................................................

### Department of Economics

Department of Economics On Testing for Diagonality of Large Dimensional Covariance Matrices George Kapetanios Working Paper No. 526 October 2004 ISSN 1473-0278 On Testing for Diagonality of Large Dimensional

### Factorization Theorems

Chapter 7 Factorization Theorems This chapter highlights a few of the many factorization theorems for matrices While some factorization results are relatively direct, others are iterative While some factorization

### 4. Joint Distributions of Two Random Variables

4. Joint Distributions of Two Random Variables 4.1 Joint Distributions of Two Discrete Random Variables Suppose the discrete random variables X and Y have supports S X and S Y, respectively. The joint

### Efficiency of algorithms. Algorithms. Efficiency of algorithms. Binary search and linear search. Best, worst and average case.

Algorithms Efficiency of algorithms Computational resources: time and space Best, worst and average case performance How to compare algorithms: machine-independent measure of efficiency Growth rate Complexity

### MATH 4330/5330, Fourier Analysis Section 11, The Discrete Fourier Transform

MATH 433/533, Fourier Analysis Section 11, The Discrete Fourier Transform Now, instead of considering functions defined on a continuous domain, like the interval [, 1) or the whole real line R, we wish

### Notes for STA 437/1005 Methods for Multivariate Data

Notes for STA 437/1005 Methods for Multivariate Data Radford M. Neal, 26 November 2010 Random Vectors Notation: Let X be a random vector with p elements, so that X = [X 1,..., X p ], where denotes transpose.

### Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components

Eigenvalues, Eigenvectors, Matrix Factoring, and Principal Components The eigenvalues and eigenvectors of a square matrix play a key role in some important operations in statistics. In particular, they

### 1 Formulating The Low Degree Testing Problem

6.895 PCP and Hardness of Approximation MIT, Fall 2010 Lecture 5: Linearity Testing Lecturer: Dana Moshkovitz Scribe: Gregory Minton and Dana Moshkovitz In the last lecture, we proved a weak PCP Theorem,

### Principle of Data Reduction

Chapter 6 Principle of Data Reduction 6.1 Introduction An experimenter uses the information in a sample X 1,..., X n to make inferences about an unknown parameter θ. If the sample size n is large, then

### SOME ASPECTS OF GAMBLING WITH THE KELLY CRITERION. School of Mathematical Sciences. Monash University, Clayton, Victoria, Australia 3168

SOME ASPECTS OF GAMBLING WITH THE KELLY CRITERION Ravi PHATARFOD School of Mathematical Sciences Monash University, Clayton, Victoria, Australia 3168 In this paper we consider the problem of gambling with

### Chapter 1. Vector autoregressions. 1.1 VARs and the identi cation problem

Chapter Vector autoregressions We begin by taking a look at the data of macroeconomics. A way to summarize the dynamics of macroeconomic data is to make use of vector autoregressions. VAR models have become

### LOGISTIC REGRESSION. Nitin R Patel. where the dependent variable, y, is binary (for convenience we often code these values as

LOGISTIC REGRESSION Nitin R Patel Logistic regression extends the ideas of multiple linear regression to the situation where the dependent variable, y, is binary (for convenience we often code these values

### STAT 830 Convergence in Distribution

STAT 830 Convergence in Distribution Richard Lockhart Simon Fraser University STAT 830 Fall 2011 Richard Lockhart (Simon Fraser University) STAT 830 Convergence in Distribution STAT 830 Fall 2011 1 / 31

### T ( a i x i ) = a i T (x i ).

Chapter 2 Defn 1. (p. 65) Let V and W be vector spaces (over F ). We call a function T : V W a linear transformation form V to W if, for all x, y V and c F, we have (a) T (x + y) = T (x) + T (y) and (b)

### Math 215 HW #6 Solutions

Math 5 HW #6 Solutions Problem 34 Show that x y is orthogonal to x + y if and only if x = y Proof First, suppose x y is orthogonal to x + y Then since x, y = y, x In other words, = x y, x + y = (x y) T

### Multi-state transition models with actuarial applications c

Multi-state transition models with actuarial applications c by James W. Daniel c Copyright 2004 by James W. Daniel Reprinted by the Casualty Actuarial Society and the Society of Actuaries by permission

### Lecture notes for Choice Under Uncertainty

Lecture notes for Choice Under Uncertainty 1. Introduction In this lecture we examine the theory of decision-making under uncertainty and its application to the demand for insurance. The undergraduate

### Since [L : K(α)] < [L : K] we know from the inductive assumption that [L : K(α)] s < [L : K(α)]. It follows now from Lemma 6.

Theorem 7.1. Let L K be a finite extension. Then a)[l : K] [L : K] s b) the extension L K is separable iff [L : K] = [L : K] s. Proof. Let M be a normal closure of L : K. Consider first the case when L

### Mathematical Induction

Mathematical Induction In logic, we often want to prove that every member of an infinite set has some feature. E.g., we would like to show: N 1 : is a number 1 : has the feature Φ ( x)(n 1 x! 1 x) How

### F Matrix Calculus F 1

F Matrix Calculus F 1 Appendix F: MATRIX CALCULUS TABLE OF CONTENTS Page F1 Introduction F 3 F2 The Derivatives of Vector Functions F 3 F21 Derivative of Vector with Respect to Vector F 3 F22 Derivative

### Logistic Regression. Vibhav Gogate The University of Texas at Dallas. Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld.

Logistic Regression Vibhav Gogate The University of Texas at Dallas Some Slides from Carlos Guestrin, Luke Zettlemoyer and Dan Weld. Generative vs. Discriminative Classifiers Want to Learn: h:x Y X features

### Hedging Options In The Incomplete Market With Stochastic Volatility. Rituparna Sen Sunday, Nov 15

Hedging Options In The Incomplete Market With Stochastic Volatility Rituparna Sen Sunday, Nov 15 1. Motivation This is a pure jump model and hence avoids the theoretical drawbacks of continuous path models.

### Logistic Regression. Jia Li. Department of Statistics The Pennsylvania State University. Logistic Regression

Logistic Regression Department of Statistics The Pennsylvania State University Email: jiali@stat.psu.edu Logistic Regression Preserve linear classification boundaries. By the Bayes rule: Ĝ(x) = arg max

### Inner Product Spaces and Orthogonality

Inner Product Spaces and Orthogonality week 3-4 Fall 2006 Dot product of R n The inner product or dot product of R n is a function, defined by u, v a b + a 2 b 2 + + a n b n for u a, a 2,, a n T, v b,

### DERIVATIVES AS MATRICES; CHAIN RULE

DERIVATIVES AS MATRICES; CHAIN RULE 1. Derivatives of Real-valued Functions Let s first consider functions f : R 2 R. Recall that if the partial derivatives of f exist at the point (x 0, y 0 ), then we

### Statistical Machine Learning

Statistical Machine Learning UoC Stats 37700, Winter quarter Lecture 4: classical linear and quadratic discriminants. 1 / 25 Linear separation For two classes in R d : simple idea: separate the classes

### COMMUTATIVITY DEGREES OF WREATH PRODUCTS OF FINITE ABELIAN GROUPS

Bull Austral Math Soc 77 (2008), 31 36 doi: 101017/S0004972708000038 COMMUTATIVITY DEGREES OF WREATH PRODUCTS OF FINITE ABELIAN GROUPS IGOR V EROVENKO and B SURY (Received 12 April 2007) Abstract We compute

### Sequences and Series

Sequences and Series Consider the following sum: 2 + 4 + 8 + 6 + + 2 i + The dots at the end indicate that the sum goes on forever. Does this make sense? Can we assign a numerical value to an infinite

### Data Matching Optimal and Greedy

Chapter 13 Data Matching Optimal and Greedy Introduction This procedure is used to create treatment-control matches based on propensity scores and/or observed covariate variables. Both optimal and greedy

### LIMITS AND CONTINUITY

LIMITS AND CONTINUITY 1 The concept of it Eample 11 Let f() = 2 4 Eamine the behavior of f() as approaches 2 2 Solution Let us compute some values of f() for close to 2, as in the tables below We see from

### 0 x = 0.30 x = 1.10 x = 3.05 x = 4.15 x = 6 0.4 x = 12. f(x) =

. A mail-order computer business has si telephone lines. Let X denote the number of lines in use at a specified time. Suppose the pmf of X is as given in the accompanying table. 0 2 3 4 5 6 p(.0.5.20.25.20.06.04

### A QUICK GUIDE TO THE FORMULAS OF MULTIVARIABLE CALCULUS

A QUIK GUIDE TO THE FOMULAS OF MULTIVAIABLE ALULUS ontents 1. Analytic Geometry 2 1.1. Definition of a Vector 2 1.2. Scalar Product 2 1.3. Properties of the Scalar Product 2 1.4. Length and Unit Vectors

### Section 5.3. Section 5.3. u m ] l jj. = l jj u j + + l mj u m. v j = [ u 1 u j. l mj

Section 5. l j v j = [ u u j u m ] l jj = l jj u j + + l mj u m. l mj Section 5. 5.. Not orthogonal, the column vectors fail to be perpendicular to each other. 5..2 his matrix is orthogonal. Check that

### Section 1.4. Lines, Planes, and Hyperplanes. The Calculus of Functions of Several Variables

The Calculus of Functions of Several Variables Section 1.4 Lines, Planes, Hyperplanes In this section we will add to our basic geometric understing of R n by studying lines planes. If we do this carefully,

### What is Linear Programming?

Chapter 1 What is Linear Programming? An optimization problem usually has three essential ingredients: a variable vector x consisting of a set of unknowns to be determined, an objective function of x to

### Notes on Symmetric Matrices

CPSC 536N: Randomized Algorithms 2011-12 Term 2 Notes on Symmetric Matrices Prof. Nick Harvey University of British Columbia 1 Symmetric Matrices We review some basic results concerning symmetric matrices.

### Orbits of the Lennard-Jones Potential

Orbits of the Lennard-Jones Potential Prashanth S. Venkataram July 28, 2012 1 Introduction The Lennard-Jones potential describes weak interactions between neutral atoms and molecules. Unlike the potentials

### CASH FLOW MATCHING PROBLEM WITH CVaR CONSTRAINTS: A CASE STUDY WITH PORTFOLIO SAFEGUARD. Danjue Shang and Stan Uryasev

CASH FLOW MATCHING PROBLEM WITH CVaR CONSTRAINTS: A CASE STUDY WITH PORTFOLIO SAFEGUARD Danjue Shang and Stan Uryasev PROJECT REPORT #2011-1 Risk Management and Financial Engineering Lab Department of

### Lecture 9: Introduction to Pattern Analysis

Lecture 9: Introduction to Pattern Analysis g Features, patterns and classifiers g Components of a PR system g An example g Probability definitions g Bayes Theorem g Gaussian densities Features, patterns

### FIRST YEAR CALCULUS. Chapter 7 CONTINUITY. It is a parabola, and we can draw this parabola without lifting our pencil from the paper.

FIRST YEAR CALCULUS WWLCHENW L c WWWL W L Chen, 1982, 2008. 2006. This chapter originates from material used by the author at Imperial College, University of London, between 1981 and 1990. It It is is

### Probability and Statistics

CHAPTER 2: RANDOM VARIABLES AND ASSOCIATED FUNCTIONS 2b - 0 Probability and Statistics Kristel Van Steen, PhD 2 Montefiore Institute - Systems and Modeling GIGA - Bioinformatics ULg kristel.vansteen@ulg.ac.be

### 171:290 Model Selection Lecture II: The Akaike Information Criterion

171:290 Model Selection Lecture II: The Akaike Information Criterion Department of Biostatistics Department of Statistics and Actuarial Science August 28, 2012 Introduction AIC, the Akaike Information