Chapter 2 Fundamentals of Statistical Analysis

To make this book self-contained, this chapter reviews the relevant mathematical concepts used throughout the book. We first review basic probability and statistical concepts. Then we introduce mathematical notation for statistical processes with multiple variables, together with variable reduction methods. We then go through statistical analysis approaches such as the Monte Carlo (MC) method and the spectral stochastic method. Finally, we discuss fast techniques to compute sums of random variables with log-normal distributions.

1 Basic Concepts in Probability Theory

An understanding of probability theory is essential to statistical analysis. In this section, we explain some basic concepts in probability theory [13]. More details and other stochastic theories can be found in [13].

1.1 Experiment, Sample Space, and Event

Definition 2.1. An experiment is any process of observation or procedure that can be repeated (theoretically) an infinite number of times and has a well-defined set of possible outcomes.

Definition 2.2. A sample space is the set of all possible outcomes of an experiment.

Definition 2.3. An event is a subset of the sample space of an experiment.

Consider the following experiments as examples:

Example 1. Tossing a coin. Sample space: $S = \{\text{head}, \text{tail}\}$ or $S = \{0, 1\}$, where 0 represents a tail and 1 represents a head.

1.2 Random Variable and Expectation

Usually, we are interested in some value associated with a random event rather than in the event itself. For example, in the experiment of tossing two dice, we may only care about the sum of the two dice, not the outcome of each die.

Definition 2.4. A random variable $X$ on a sample space $S$ is a real-valued function $X: S \to \mathbb{R}$.

Definition 2.5. A discrete random variable is a random variable that takes only a finite or countably infinite number of values (it arises from counting).

Definition 2.6. A continuous random variable is a random variable whose set of assumed values is uncountable (it arises from measurement).

Let $X$ be a random variable and let $a \in \mathbb{R}$. The event $X = a$ represents the set $\{s \in S \mid X(s) = a\}$, and the probability of this event is written as
\[ \Pr(X = a) = \sum_{s \in S:\, X(s) = a} \Pr(s). \]

Example 2. Continuous random variable. A CPU is picked randomly from a group of CPUs whose area should be 1 cm². Due to errors in the manufacturing process, the area of a chip can vary from chip to chip in the range 0.9 cm² to 1.05 cm², excluding the latter. Let $X$ denote the area of a selected chip. Possible outcomes: $0.9 \le X < 1.05$.

Example 3. Refer to the previous example. The area of a selected chip is a continuous random variable. A table gives the areas in cm² of 100 chips, listing the observed values of the continuous random variable, the corresponding frequencies (number of chips), and their probabilities $\Pr(a \le X < b)$, with a Total row. [Numeric table entries not recoverable from the transcription.]

Definition 2.7. The expectation $E[X]$, or $\mu$, of a discrete random variable $X$ is
\[ E[X] = \mu = \sum_i i \Pr(X = i), \]
where the sum is taken over all values in the range of $X$. If $\sum_i |i| \Pr(X = i)$ converges, then the expectation is finite. Otherwise, the expectation is said to be unbounded. $E(X)$ is also called the mean value of the probability distribution.
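As a quick illustration of Definition 2.7, and of the two-dice experiment mentioned above, the following sketch computes $E[X]$ for the sum $X$ of two fair dice as a probability-weighted sum; the use of exact fractions is just one convenient choice.

```python
# A minimal sketch of Definition 2.7: the expectation of a discrete random
# variable as a probability-weighted sum, using the sum of two fair dice.
from fractions import Fraction

# Pr(X = s) for the sum s: count outcomes over the 36-point sample space.
pmf = {}
for d1 in range(1, 7):
    for d2 in range(1, 7):
        s = d1 + d2
        pmf[s] = pmf.get(s, Fraction(0)) + Fraction(1, 36)

# E[X] = sum_i i * Pr(X = i)
expectation = sum(i * p for i, p in pmf.items())
print(expectation)  # 7
```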

1.3 Variance and Moments of a Random Variable

Theorem 2.1 (Markov's inequality). For a random variable $X$ that takes on only nonnegative values and for all $a > 0$, we have
\[ \Pr(X \ge a) \le \frac{E[X]}{a}. \]

Proof. Let $X$ be a random variable such that $X \ge 0$ and let $a > 0$. Define a random variable $I$ by
\[ I = \begin{cases} 1, & \text{if } X \ge a, \\ 0, & \text{otherwise,} \end{cases} \]
where $E[I] = \Pr(I = 1) = \Pr(X \ge a)$ and
\[ I \le \frac{X}{a}. \quad (2.1) \]
Taking expectations on both sides of (2.1) gives the inequality
\[ E[I] = \Pr(X \ge a) \le E\!\left[\frac{X}{a}\right] = \frac{E[X]}{a}, \]
where we used Lemma 2.3. □

Definition 2.8. The $k$th moment of a random variable $X$ is $E[X^k]$. The variance of $X$ is
\[ \mathrm{Var}[X] = E[(X - E[X])^2] = E[X^2 - 2X E[X] + (E[X])^2] = E[X^2] - 2E[X]E[X] + (E[X])^2 = E[X^2] - (E[X])^2, \]
and the standard deviation of $X$ is defined as
\[ \sigma(X) = \sqrt{\mathrm{Var}[X]}. \]

Theorem 2.2 (Chebyshev's inequality). For any $a > 0$ and a random variable $X$, we have
\[ \Pr(|X - E[X]| \ge a) \le \frac{\mathrm{Var}[X]}{a^2}. \]

Proof. Note that
\[ \Pr(|X - E[X]| \ge a) = \Pr\!\left((X - E[X])^2 \ge a^2\right) \]
and that the random variable $(X - E[X])^2 \ge 0$. Use Markov's inequality and the definition of variance to obtain
\[ \Pr\!\left((X - E[X])^2 \ge a^2\right) \le \frac{E[(X - E[X])^2]}{a^2} = \frac{\mathrm{Var}[X]}{a^2}, \]
as required. □

Corollary 2.1. For any $t > 1$ and a random variable $X$, we have
\[ \Pr\!\left(|X - E[X]| \ge t\,\sigma(X)\right) \le \frac{1}{t^2}, \qquad \Pr\!\left(|X - E[X]| \ge t\,E[X]\right) \le \frac{\mathrm{Var}[X]}{t^2 (E[X])^2}. \]

Proof. The results follow from the definitions of variance and standard deviation and from Chebyshev's inequality. □

1.4 Distribution Functions

Definition 2.9. A discrete probability distribution is a table (or a formula) listing all possible values that a discrete variable can take on, together with the associated probabilities.

Definition 2.10. The function $f(x)$ is called a probability density function (PDF) for the continuous random variable $X$ if
\[ \int_a^b f(x)\,dx = \Pr(a \le X \le b) \quad (2.2) \]
for any values of $a$ and $b$. That is to say, the area under the curve of $f(x)$ between any two ordinates $x = a$ and $x = b$ is the probability that $X$ lies between $a$ and $b$. It is easy to see that the total area under the PDF curve bounded by the $x$-axis is equal to 1:
\[ \int_{-\infty}^{\infty} f(x)\,dx = 1. \quad (2.3) \]
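The following sketch checks (2.2) and (2.3) numerically for a hypothetical density $f(x) = 2x$ on $[0, 1]$ (any valid PDF would do); it assumes SciPy is available for the quadrature.

```python
# Numeric check of Definition 2.10: the integral of a PDF over [a, b] gives
# Pr(a <= X <= b) (Eq. 2.2), and the total integral is 1 (Eq. 2.3).
from scipy.integrate import quad

f = lambda x: 2.0 * x if 0.0 <= x <= 1.0 else 0.0  # hypothetical example PDF

total, _ = quad(f, 0.0, 1.0)   # area under the whole PDF
prob, _ = quad(f, 0.25, 0.5)   # Pr(0.25 <= X <= 0.5)
print(total)  # ~1.0
print(prob)   # ~0.1875 = 0.5**2 - 0.25**2
```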

Definition 2.11. For a real-valued random variable $X$, the probability distribution is completely characterized by its cumulative distribution function (CDF):
\[ F(x) = \int_{-\infty}^{x} f(t)\,dt = \Pr[X \le x], \quad x \in \mathbb{R}, \quad (2.4) \]
which describes the probability that the random variable falls in the interval $(-\infty, x]$.

1.5 Gaussian and Log-Normal Distributions

Definition 2.12. A Gaussian distribution (also called a normal distribution) is denoted as $N(\mu, \sigma^2)$, where, as usual, $\mu$ identifies the mean and $\sigma^2$ the variance. The PDF is defined as follows:
\[ f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}. \quad (2.5) \]
The CDF of the standard normal distribution is denoted by $\Phi(x)$ and can be computed as an integral of the PDF:
\[ \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt = \frac{1}{2}\left[1 + \mathrm{erf}\!\left(\frac{x}{\sqrt{2}}\right)\right], \quad x \in \mathbb{R}, \quad (2.6) \]
where erf is the error function.

Definition 2.13. If $X$ is distributed normally with mean $\mu$ and variance $\sigma^2$, then the exponential of $X$, $Y = \exp(X)$, follows a log-normal distribution. That is to say, a log-normal distribution is the probability distribution of a random variable whose logarithm is normally distributed. The PDF and CDF of a log-normal distribution are as follows:
\[ f(x; \mu, \sigma) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}, \quad x > 0, \quad (2.7) \]
\[ F_X(x; \mu, \sigma) = \frac{1}{2} + \frac{1}{2}\,\mathrm{erf}\!\left(\frac{\ln x - \mu}{\sigma\sqrt{2}}\right) = \Phi\!\left(\frac{\ln x - \mu}{\sigma}\right). \quad (2.8) \]
More details about the sum of multiple log-normal distributions are given in Sect. 4 of this chapter.
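As a numerical illustration of (2.6) and (2.8), the sketch below evaluates the standard normal CDF through the error function and checks the log-normal CDF against an empirical CDF built from samples of $\exp(X)$; the parameter values are arbitrary.

```python
# Standard normal CDF via erf (Eq. 2.6), and the log-normal CDF as
# Phi((ln x - mu)/sigma) (Eq. 2.8), checked against samples of exp(X).
import math
import numpy as np

def phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

mu, sigma = 0.5, 0.3                      # arbitrary parameters
rng = np.random.default_rng(0)
samples = np.exp(rng.normal(mu, sigma, size=200_000))  # log-normal samples

x = 1.8
empirical = np.mean(samples <= x)
analytic = phi((math.log(x) - mu) / sigma)
print(empirical, analytic)  # the two should agree to ~3 decimal places
```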

1.6 Basic Concepts for Multiple Random Variables

Definition 2.14. Two random variables $X$ and $Y$ are independent if
\[ \Pr\!\left((X = x) \cap (Y = y)\right) = \Pr(X = x) \cdot \Pr(Y = y) \]
for all $x, y \in \mathbb{R}$. Furthermore, the random variables $X_1, X_2, \ldots, X_k$ are mutually independent if for any subset $I \subseteq \{1, 2, \ldots, k\}$ and any values $x_i$, $i \in I$, we have
\[ \Pr\!\left(\bigcap_{i \in I} X_i = x_i\right) = \prod_{i \in I} \Pr(X_i = x_i). \]

Theorem 2.3 (Linearity of expectations). Let $X_1, X_2, \ldots, X_n$ be a finite collection of discrete random variables with finite expectations. Then
\[ E\!\left[\sum_i X_i\right] = \sum_i E[X_i]. \]

Proof. We use induction on the number of random variables. For the base case, let $X$ and $Y$ be random variables. Use the law of total probability to get
\[ E[X + Y] = \sum_i \sum_j (i + j) \Pr\!\left((X = i) \cap (Y = j)\right) = \sum_i \sum_j i \Pr\!\left((X = i) \cap (Y = j)\right) + \sum_i \sum_j j \Pr\!\left((X = i) \cap (Y = j)\right) \]
\[ = \sum_i i \sum_j \Pr\!\left((X = i) \cap (Y = j)\right) + \sum_j j \sum_i \Pr\!\left((X = i) \cap (Y = j)\right) = \sum_i i \Pr(X = i) + \sum_j j \Pr(Y = j) = E[X] + E[Y]. \quad \Box \]

Linearity of expectations holds for any collection of random variables, even if they are not independent. Furthermore, if $\sum_{i=1}^{\infty} E[|X_i|]$ converges, then it can be shown that

\[ E\!\left[\sum_{i=1}^{\infty} X_i\right] = \sum_{i=1}^{\infty} E[X_i]. \]

Lemma 2.1. Let $c$ be any constant and $X$ a random variable. Then $E[cX] = c\,E[X]$.

Proof. The case $c = 0$ is trivial. Suppose $c \ne 0$. Then
\[ E[cX] = \sum_i i \Pr(cX = i) = c \sum_i (i/c) \Pr(X = i/c) = c \sum_k k \Pr(X = k) = c\,E[X], \]
as required. □

If $X$ and $Y$ are two random variables, their covariance is
\[ \mathrm{Cov}(X, Y) = E\!\left[(X - E[X])(Y - E[Y])\right] = E\!\left[(Y - E[Y])(X - E[X])\right] = \mathrm{Cov}(Y, X). \]

Theorem 2.4. For any two random variables $X$ and $Y$, we have
\[ \mathrm{Var}[X + Y] = \mathrm{Var}[X] + \mathrm{Var}[Y] + 2\,\mathrm{Cov}(X, Y). \]

Proof. Use the linearity of expectations and the definitions of variance and covariance to obtain
\[ \mathrm{Var}[X + Y] = E\!\left[(X + Y - E[X + Y])^2\right] = E\!\left[(X + Y - E[X] - E[Y])^2\right] = E\!\left[\left((X - E[X]) + (Y - E[Y])\right)^2\right] \]
\[ = E\!\left[(X - E[X])^2\right] + E\!\left[(Y - E[Y])^2\right] + 2E\!\left[(X - E[X])(Y - E[Y])\right] = \mathrm{Var}[X] + \mathrm{Var}[Y] + 2\,\mathrm{Cov}(X, Y), \]
as required. □
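Theorem 2.4 is easy to check by simulation. The sketch below builds two deliberately correlated Gaussian variables and compares both sides of the identity; the mixing coefficients are arbitrary.

```python
# Simulation check of Theorem 2.4: Var[X+Y] = Var[X] + Var[Y] + 2 Cov(X, Y).
import numpy as np

rng = np.random.default_rng(1)
z = rng.standard_normal((2, 500_000))
x = z[0]
y = 0.6 * z[0] + 0.8 * z[1]   # correlated with x by construction

lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2.0 * np.cov(x, y)[0, 1]
print(lhs, rhs)               # the two values should match closely
```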

Theorem 2.4 can be extended to a sum of any finite number of random variables. For a collection $X_1, \ldots, X_n$ of random variables, it can be shown that
\[ \mathrm{Var}\!\left[\sum_i X_i\right] = \sum_i \mathrm{Var}[X_i] + 2 \sum_i \sum_{j > i} \mathrm{Cov}(X_i, X_j). \]

Theorem 2.5. For any two independent random variables $X$ and $Y$, we have
\[ E[X \cdot Y] = E[X] \cdot E[Y]. \]

Proof. Let the indices $i$ and $j$ assume all values in the ranges of $X$ and $Y$, respectively. As $X$ and $Y$ are independent random variables,
\[ E[X \cdot Y] = \sum_i \sum_j ij \Pr\!\left((X = i) \cap (Y = j)\right) = \sum_i \sum_j ij \Pr(X = i) \Pr(Y = j) = \left[\sum_i i \Pr(X = i)\right] \left[\sum_j j \Pr(Y = j)\right] = E[X] \cdot E[Y], \]
as required. □

Corollary 2.2. For any independent random variables $X$ and $Y$, we have
\[ \mathrm{Cov}(X, Y) = 0 \quad \text{and} \quad \mathrm{Var}[X + Y] = \mathrm{Var}[X] + \mathrm{Var}[Y]. \]

Proof. As $X$ and $Y$ are independent, so are $X - E[X]$ and $Y - E[Y]$. For any random variable $Z$, we have
\[ E[Z - E[Z]] = E[Z] - E[E[Z]] = 0. \]
Using Theorem 2.5, the covariance of $X$ and $Y$ is
\[ \mathrm{Cov}(X, Y) = E\!\left[(X - E[X])(Y - E[Y])\right] = E[X - E[X]] \cdot E[Y - E[Y]] = 0. \]

Conclude via the latter equation and Theorem 2.4 that
\[ \mathrm{Var}[X + Y] = \mathrm{Var}[X] + \mathrm{Var}[Y] + 2\,\mathrm{Cov}(X, Y) = \mathrm{Var}[X] + \mathrm{Var}[Y], \]
as required. □

Definition 2.15. For a collection of random variables $X = X_1, \ldots, X_n$, the covariance matrix $\Sigma_{n \times n}$ is defined as
\[
\Sigma = \begin{pmatrix}
\mathrm{Var}(X_1) & \mathrm{Cov}(X_1, X_2) & \ldots & \mathrm{Cov}(X_1, X_n) \\
\mathrm{Cov}(X_2, X_1) & \mathrm{Var}(X_2) & \ldots & \mathrm{Cov}(X_2, X_n) \\
\vdots & \vdots & \ddots & \vdots \\
\mathrm{Cov}(X_{n-1}, X_1) & \mathrm{Cov}(X_{n-1}, X_2) & \ldots & \mathrm{Cov}(X_{n-1}, X_n) \\
\mathrm{Cov}(X_n, X_1) & \mathrm{Cov}(X_n, X_2) & \ldots & \mathrm{Var}(X_n)
\end{pmatrix}.
\]
When $X_1, \ldots, X_n$ are mutually independent random variables, it can be shown by induction that
\[ \mathrm{Var}\!\left[\sum_i X_i\right] = \sum_i \mathrm{Var}[X_i], \]
and the covariance matrix is a diagonal matrix in this case.

2 Multiple Random Variables and Variable Reduction

2.1 Components of Covariance in Process Variation

In general, process variation can be classified into two categories [13]: inter-die and intra-die. Inter-die variations are variations from die to die, while intra-die variations correspond to variability within a single chip. Inter-die variations are global and, hence, affect all the devices on a chip in a similar fashion. For example, they can make the channel lengths of all the devices on the same chip smaller. Intra-die variations may affect devices on the same chip differently. For example, they can cause some devices to have smaller gate oxide thicknesses and others to have larger gate oxide thicknesses. Intra-die variations may exhibit spatial correlation: for example, devices located close to each other are more likely to have similar characteristics.

[Fig. 2.1: Grid-based model for spatial correlations (Gates 1 to 5 placed on a grid of squares).]

In general, we can model parameter variation as follows:
\[ \delta_{\mathrm{total}} = \delta_{\mathrm{inter}} + \delta_{\mathrm{intra}}, \quad (2.9) \]
where $\delta_{\mathrm{inter}}$ and $\delta_{\mathrm{intra}}$ represent the inter-die variation and intra-die variation, respectively. In general [13, 95, 169], $\delta_{\mathrm{inter}}$ and $\delta_{\mathrm{intra}}$ can be modeled as Gaussian random variables with normal distribution. In this chapter, we discuss both Gaussian and non-Gaussian cases. Note that, due to the global effect of inter-die variation, a single random variable $\delta_{\mathrm{inter}}$ is used for all gates/grids in one chip. For $\delta_{\mathrm{intra}}$, the value of a parameter $p$ located at $(x, y)$ can be modeled as a normally distributed random variable [101] dependent on location:
\[ p = p_0 + \delta_x \cdot x + \delta_y \cdot y + \varepsilon, \quad (2.10) \]
where $p_0$ is the mean value (nominal design parameter value) at $(0, 0)$, $\delta_x$ and $\delta_y$ stand for the gradients of the parameter, indicating the spatial variations of $p$ along the $x$ and $y$ directions, respectively, and $\varepsilon$ represents the random intra-chip variation. Due to spatial correlations in the intra-chip variation, the vector of all random components $\varepsilon$ across the chip has a correlated multivariate normal distribution, $N(0, \Sigma)$, where $\Sigma$ is the covariance matrix of the spatially correlated parameters.

A grid-based method is introduced by [13] for the consideration of correlation. In the grid-based method, the die is partitioned into $\sqrt{n}_{\mathrm{row}} \times \sqrt{n}_{\mathrm{col}} = n$ grids. Since devices close to each other are more likely to have similar characteristics than those placed far away, grid-based methods assume perfect correlation among the devices in the same grid, high correlation among those in close grids, and low to zero correlation between faraway grids. For example, in Fig. 2.1, Gate 1 and Gate 2 (drawn exaggeratedly large) are located in the same grid square; hence, their parameter variations, such

as the variations of their gate channel lengths, are assumed to be always identical. Gate 1 and Gate 3 lie in neighboring grids; hence, their parameter variations are not identical but are highly correlated due to their spatial proximity. For example, when Gate 1 has a larger-than-nominal gate channel length, Gate 3 is also likely to have a larger-than-nominal gate channel length. On the other hand, Gate 1 and Gate 4 are far away from each other; their parameters can be assumed to be weakly correlated or uncorrelated. For example, when Gate 1 has a larger-than-nominal gate channel length, the gate channel length of Gate 4 may be either larger or smaller than nominal.

With the grid-based model, we can use a single random variable $p(x, y)$ to model a parameter variation in a single grid at location $(x, y)$. As a result, $n$ random variables are needed for each type of parameter, where each represents the value of a parameter in one of the $n$ grids. In addition, we assume that correlation exists only among the same type of parameter in different grids. Note that this assumption is not critical and can easily be removed. For example, the gate length $L$ of transistors in the $i$th grid is correlated with those in nearby grids, but is uncorrelated with other parameters, such as the gate oxide thickness $T_{ox}$, in any grid including the $i$th grid itself. For each type of parameter, a correlation matrix of size $n \times n$ represents the spatial correlation of this parameter. Notice that the number of grid partitions needed is determined by the process, not the circuit, so we can apply the same correlation model to different designs under the same process.

2.2 Random Variable Decoupling and Reduction

Owing to correlation, the large number of random variables involved in VLSI design can be reduced. After random variable decoupling via correlation, one may further reduce the cost of statistical analysis by the spectral stochastic method, as discussed in Sect. 3. Since the random variables are correlated, this correlation should be removed before using the spectral stochastic method. In this part, we first present the theoretical basis for decoupling the correlation of random variables.

Proposition 2.1. For a set of zero-mean Gaussian-distributed variables $\eta$ whose covariance matrix is $\Sigma$, if there is a matrix $L$ satisfying $\Sigma = LL^T$, then $\eta$ can be represented by a set of independent standard normally distributed variables $\xi$ as $\eta = L\xi$.

Proof. According to the characteristics of the normal distribution, a linear transformation does not affect the zero mean of the variables and yields another normal distribution. Thus, we only need to prove that the covariance matrix remains unchanged under the transformation. According to the definition of covariance,
\[ \mathrm{cov}(L\xi) = E\!\left[L\xi (L\xi)^T\right] = L\,E[\xi \xi^T]\,L^T. \quad (2.11) \]

Since $\xi$ follows the standard normal distribution, $E[\xi \xi^T] = I_n$, so
\[ L\,E[\xi \xi^T]\,L^T = LL^T = \Sigma. \quad (2.12) \]
□

2.3 Principal Factor Analysis Technique

Note that the solution for decoupling is not unique. For example, Cholesky decomposition can be used to find $L$, since the covariance matrix is always a semipositive-definite matrix. However, Cholesky decomposition cannot reduce the number of variables. Principal factor analysis (PFA) [74] can substitute for Cholesky decomposition when variable reduction is needed. Eigendecomposition of the covariance matrix yields
\[ \Sigma = LL^T, \qquad L = \left[\sqrt{\lambda_1}\,e_1, \ldots, \sqrt{\lambda_n}\,e_n\right], \quad (2.13) \]
where the $\lambda_i$ are the eigenvalues in order of descending magnitude and the $e_i$ are the corresponding eigenvectors. PFA reduces the number of components in $\xi$ by truncating $L$ to its first $k$ columns. The error of PFA can be controlled through $k$:
\[ \mathrm{err} = \frac{\sum_{i=k+1}^{n} \lambda_i}{\sum_{i=1}^{n} \lambda_i}, \quad (2.14) \]
where a bigger $k$ leads to a more accurate result. PFA is efficient, especially when the correlation length is large. In our experiments, we set the correlation length to eight times the width of the wires. As a result, PFA can reduce the number of variables from 40 to 14 with an error of about 1% in an example with 20 parallel wires.

2.4 Weighted PFA Technique

One idea is to consider the importance of the outputs during the reduction process when using PFA. Recently, the weighted PFA (wPFA) technique has been used [204] to improve variable reduction efficiency. If a weight $w_i$ is defined for each physical variable $\eta_i$ to reflect its impact on the output, then a set of new variables is formed:
\[ \zeta = W\eta, \quad (2.15) \]

where $W = \mathrm{diag}(w_1, w_2, \ldots, w_n)$ is a diagonal matrix of weights. As a result, the covariance matrix of $\zeta$, $\Sigma(\zeta)$, now contains the weight information, and performing PFA on $\Sigma(\zeta)$ leads to the weighted variable reduction. Specifically, we have
\[ \Sigma(\zeta) = E\!\left[W\eta (W\eta)^T\right] = W\,\Sigma(\eta)\,W^T; \quad (2.16) \]
denote its eigenvalues and eigenvectors by $\lambda_i^*$ and $e_i^*$. Then the variables $\eta$ can be approximated by the linear combination of a set of independent dominant variables $\xi$:
\[ \eta = W^{-1}\zeta \approx W^{-1} \sum_{i=1}^{k} \sqrt{\lambda_i^*}\, e_i^*\, \xi_i. \quad (2.17) \]
The error-controlling process is similar to (2.14) but uses the weighted eigenvalues $\lambda_i^*$.

2.5 Principal Component Analysis Technique

We first briefly review the concept of principal component analysis (PCA), which is used here to transform correlated random variables into uncorrelated random variables [75]. Suppose that $x$ is a vector of $n$ random variables, $x = [x_1, x_2, \ldots, x_n]^T$, with covariance matrix $\Sigma$ and mean vector $\mu_x = [\mu_{x_1}, \mu_{x_2}, \ldots, \mu_{x_n}]$. To find the orthogonal random variables, we first calculate the eigenvalues and corresponding eigenvectors. Then, by ordering the eigenvectors in descending order of their eigenvalues, the orthogonal matrix $A$ is obtained. Here, $A$ is expressed as
\[ A = [e_1, e_2, \ldots, e_n]^T, \quad (2.18) \]
where $e_i$ is the eigenvector corresponding to eigenvalue $\lambda_i$, which satisfies
\[ \Sigma e_i = \lambda_i e_i, \quad i = 1, 2, \ldots, n, \quad (2.19) \]
and
\[ \lambda_i < \lambda_{i-1}, \quad i = 2, 3, \ldots, n. \quad (2.20) \]
With $A$, we can perform the transformation to get the orthogonal random variables $y = [y_1, y_2, \ldots, y_n]^T$ by using
\[ y = A(x - \mu_x). \quad (2.21) \]
A code sketch of this decoupling and reduction flow is given below.
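The following sketch walks through the flow of Sects. 2.2 to 2.5 under a hypothetical exponentially decaying spatial covariance: eigendecompose $\Sigma$ as in (2.13), truncate with the error control of (2.14), and decorrelate samples via (2.21). The covariance model and tolerance are assumptions for illustration only.

```python
# PFA truncation (Eqs. 2.13-2.14) and PCA decorrelation (Eqs. 2.18, 2.21),
# sketched for a hypothetical spatial covariance matrix Sigma.
import numpy as np

def pfa_factor(sigma, err_tol=0.01):
    """Truncated factor L_k with Sigma ~ L_k @ L_k.T (Eqs. 2.13-2.14)."""
    lam, vecs = np.linalg.eigh(sigma)        # eigenvalues in ascending order
    lam, vecs = lam[::-1], vecs[:, ::-1]     # reorder to descending
    frac = np.cumsum(lam) / np.sum(lam)      # captured-variance fraction
    k = int(np.searchsorted(frac, 1.0 - err_tol)) + 1  # smallest k: err <= tol
    return vecs[:, :k] * np.sqrt(lam[:k])

# Hypothetical covariance: correlation decaying with grid distance.
n = 8
idx = np.arange(n)
sigma = np.exp(-np.abs(idx[:, None] - idx[None, :]) / 3.0)

L = pfa_factor(sigma)
print(L.shape, np.max(np.abs(sigma - L @ L.T)))  # reduced k, small residual

# PCA transform (Eq. 2.21): the rows of A are the eigenvectors e_i.
rng = np.random.default_rng(2)
x = rng.multivariate_normal(np.zeros(n), sigma, size=100_000)
lam, vecs = np.linalg.eigh(sigma)
A = vecs[:, ::-1].T
y = (A @ x.T).T                                  # y = A(x - mu_x), mu_x = 0 here
print(np.round(np.cov(y.T), 2))                  # ~diag(lambda_1, ..., lambda_n)
```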

Each transformed variable $y_i$ is a random variable with a Gaussian distribution; its mean $\mu_{y_i}$ is 0 and its standard deviation $\sigma_{y_i}$ is $\sqrt{\lambda_i}$, on the condition that [75]
\[ e_i^T e_i = 1, \quad i = 1, 2, \ldots, n. \quad (2.22) \]
Here, because of the orthogonal property of the matrix $A$,
\[ A^{-1} = A^T. \quad (2.23) \]
To reconstruct the original random variables, we use the following equation:
\[ x = A^T y + \mu_x. \quad (2.24) \]

3 Statistical Analysis Approaches

3.1 Monte Carlo Method

Monte Carlo techniques [41] are usually used to estimate the value of a definite, finite-dimensional integral of the form
\[ G = \int_S g(X) f(X)\,dX, \quad (2.25) \]
where $S$ is a finite domain and $f(X)$ is a PDF over $X$, i.e., $f(X) \ge 0$ for all $X$ and $\int_S f(X)\,dX = 1$. We can accomplish the MC estimation of the value of $G$ by drawing a set of independent samples $X_1, X_2, \ldots, X_{MC}$ from $f(X)$ and applying
\[ G_{MC} = \frac{1}{MC} \sum_{i=1}^{MC} g(X_i). \quad (2.26) \]
The estimator $G_{MC}$ above is a random variable. Its mean value is the integral $G$ to be estimated, i.e., $E(G_{MC}) = G$, making it an unbiased estimator. The variance of $G_{MC}$ is $\mathrm{Var}(G_{MC}) = \sigma^2/MC$, where $\sigma^2$ is the variance of the random variable $g(X)$, given by
\[ \sigma^2 = \int_S g^2(X) f(X)\,dX - G^2. \quad (2.27) \]
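A minimal sketch of the estimator (2.25) to (2.27) follows, using $g(x) = x^2$ with $X \sim N(0, 1)$, so that the exact value is $G = E[X^2] = 1$; the sample standard deviation stands in for the $\sigma$ of the error measure discussed next.

```python
# Monte Carlo estimate of G = E[g(X)] per Eq. 2.26, with a sample-based
# estimate of sigma (Eq. 2.27) and the resulting 95% half-width.
import numpy as np

rng = np.random.default_rng(3)
MC = 100_000
x = rng.standard_normal(MC)     # independent samples from the PDF f
g = x**2

G_MC = g.mean()                 # Eq. 2.26; exact answer is 1
sigma_hat = g.std()             # sample estimate of sigma from Eq. 2.27
half_width = 1.96 * sigma_hat / np.sqrt(MC)
print(G_MC, "+/-", half_width)
```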

We can use the standard deviation of $G_{MC}$ to assess its accuracy in estimating $G$. If the number of samples $MC$ is sufficiently large, then by the central limit theorem, $(G_{MC} - G)/(\sigma/\sqrt{MC})$ has an approximately standard normal distribution $N(0, 1)$. Hence,
\[ P\!\left(G - 1.96\frac{\sigma}{\sqrt{MC}} \le G_{MC} \le G + 1.96\frac{\sigma}{\sqrt{MC}}\right) \approx 0.95, \quad (2.28) \]
where $P$ is the probability measure. Equation (2.28) shows that $G_{MC}$ will be in the interval $\left[G - 1.96\,\sigma/\sqrt{MC},\; G + 1.96\,\sigma/\sqrt{MC}\right]$ with 95% confidence. Thus, one can use the error measure
\[ |\mathrm{error}| \approx \frac{\sigma}{\sqrt{MC}} \quad (2.29) \]
to assess the accuracy of the estimator.

3.2 Spectral Stochastic Method Using Stochastic Orthogonal Polynomial Chaos

One recent advance in fast statistical analysis is to apply stochastic orthogonal polynomial chaos (OPC) [187] to nanometer-scale integrated circuit analysis. Based on the Askey scheme [196], any stochastic random variable can be represented by OPC, and random variables with different probability distribution types are associated with different types of orthogonal polynomials. Hermite polynomial chaos (Hermite PC or HPC) utilizes a series of orthogonal polynomials (with respect to the Gaussian distribution) to facilitate stochastic analysis [197]. These polynomials are used as an orthogonal basis to decompose a random process, in a way similar to how sine and cosine functions are used to decompose a periodic signal in a Fourier series expansion. Note that for Gaussian and log-normal distributions, Hermite polynomials are the best choice, as they lead to an exponential convergence rate [45]. For non-Gaussian and non-log-normal distributions, there are other orthogonal polynomials, such as Legendre polynomials for the uniform distribution, Charlier polynomials for the Poisson distribution, and Krawtchouk polynomials for the binomial distribution [44, 187].

For a random variable $y(\xi)$ with limited variance, where $\xi = [\xi_1, \xi_2, \ldots, \xi_n]$ is a vector of zero-mean orthogonal Gaussian random variables, the random variable can be approximated by the truncated Hermite PC expansion as follows [45]:
\[ y(\xi) = \sum_{k=0}^{P} a_k H_k^n(\xi), \quad (2.30) \]

where $n$ is the number of independent random variables, $H_k^n(\xi)$ are the $n$-dimensional Hermite polynomials, and the $a_k$ are deterministic coefficients. The number of terms $P$ is given by
\[ P = \sum_{k=0}^{p} \frac{(n - 1 + k)!}{k!\,(n - 1)!}, \quad (2.31) \]
where $p$ is the order of the Hermite PC. Similarly, a random process $v(t, \xi)$ with limited variance can be approximated as
\[ v(t, \xi) = \sum_{k=0}^{P} a_k(t) H_k^n(\xi). \quad (2.32) \]
If only one random variable/process is considered, the one-dimensional Hermite polynomials are expressed as follows:
\[ H_0^1(\xi) = 1, \quad H_1^1(\xi) = \xi, \quad H_2^1(\xi) = \xi^2 - 1, \quad H_3^1(\xi) = \xi^3 - 3\xi, \quad \ldots. \quad (2.33) \]
Hermite polynomials are orthogonal with respect to the Gaussian-weighted expectation (the superscript $n$ is dropped for simple notation):
\[ \langle H_i(\xi), H_j(\xi) \rangle = \langle H_i^2(\xi) \rangle\, \delta_{ij}, \quad (2.34) \]
where $\delta_{ij}$ is the Kronecker delta and $\langle \cdot, \cdot \rangle$ denotes an inner product defined as follows:
\[ \langle f(\xi), g(\xi) \rangle = \frac{1}{\sqrt{(2\pi)^n}} \int f(\xi)\, g(\xi)\, e^{-\frac{1}{2}\xi^T \xi}\, d\xi. \quad (2.35) \]
Similar to Fourier series, the coefficients $a_k$ for a random variable $y$ and $a_k(t)$ for a random process $v(t)$ can be found by a projection operation onto the HPC basis:
\[ a_k = \frac{\langle y(\xi), H_k(\xi) \rangle}{\langle H_k^2(\xi) \rangle}, \quad (2.36) \]
\[ a_k(t) = \frac{\langle v(t, \xi), H_k(\xi) \rangle}{\langle H_k^2(\xi) \rangle}, \quad \forall k \in \{0, \ldots, P\}. \quad (2.37) \]
Once we obtain the Hermite PC, we can calculate the mean and variance of the random variable $y(\xi)$ by a one-time analysis (one-Gaussian-variable case):
\[ E(y(\xi)) = y_0, \qquad \mathrm{Var}(y(\xi)) = y_1^2\,\mathrm{Var}(\xi_1) + y_2^2\,\mathrm{Var}(\xi_1^2 - 1) = y_1^2 + 2y_2^2. \quad (2.38) \]
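The identity (2.38) can be checked by sampling. The sketch below assumes hypothetical second-order coefficients $y_0, y_1, y_2$ for a single Gaussian variable.

```python
# Sampling check of Eq. 2.38 for y(xi) = y0 + y1*xi + y2*(xi^2 - 1):
# E[y] = y0 and Var[y] = y1^2 + 2*y2^2.
import numpy as np

y0, y1, y2 = 2.0, 0.5, 0.3   # hypothetical HPC coefficients
rng = np.random.default_rng(4)
xi = rng.standard_normal(1_000_000)
y = y0 + y1 * xi + y2 * (xi**2 - 1.0)

print(y.mean(), y0)                     # ~2.0
print(y.var(), y1**2 + 2.0 * y2**2)     # ~0.43
```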

Similarly, for a random process $v(t, \xi)$ (one-Gaussian-variable case), the mean and variance are as follows:
\[ E(v(t, \xi)) = v_0(t), \qquad \mathrm{Var}(v(t, \xi)) = v_1^2(t)\,\mathrm{Var}(\xi_1) + v_2^2(t)\,\mathrm{Var}(\xi_1^2 - 1) = v_1^2(t) + 2v_2^2(t). \quad (2.39) \]

One critical problem that remains is how to obtain the coefficients of the Hermite PC in (2.36) and (2.37) efficiently. There are two kinds of techniques to calculate these coefficients: the collocation-based spectral stochastic method and the Galerkin-based spectral stochastic method. In short, we refer to them in the later parts of the book as collocation-based and Galerkin-based methods.

3.3 Collocation-Based Spectral Stochastic Method

The collocation method is mainly based on computing the definite integral of a function [70]; Gaussian quadrature is the commonly used method. With it, we can compute the coefficients $a_k$ and $a_k(t)$ in (2.36) and (2.37), respectively. We review this method using the Hermite polynomials shown above. Our objective is to determine the numerical solution of the integral $\langle y(\xi), H_k(\xi) \rangle$ ($y$ can be a random variable or a random process). In our problem, this is a one-dimensional numerical quadrature problem based on Hermite polynomials [70]. Thus, we have
\[ \langle y(\xi), H_k(\xi) \rangle = \frac{1}{\sqrt{2\pi}} \int y(\xi)\, H_k(\xi)\, e^{-\frac{1}{2}\xi^2}\, d\xi \approx \sum_{i=0}^{P} y(\xi_i)\, H_k(\xi_i)\, w_i. \quad (2.40) \]
Here we have only a single random variable; $\xi_i$ and $w_i$ are the Gauss-Hermite quadrature abscissas (quadrature points) and weights. The quadrature rule states that if we select the roots of the $P$th Hermite polynomial as the quadrature points, the quadrature (2.40) is exact for all polynomials of degree $2P - 1$ or less. This is called the $(2P-1)$-level accuracy of the Gauss-Hermite quadrature.

For multiple random variables, a multidimensional quadrature is required. The traditional way of computing a multidimensional quadrature is to use a direct tensor product based on the one-dimensional Gauss-Hermite quadrature abscissas

and weights [16]. With this method, the number of quadrature points needed for $n$ dimensions at level $P$ is about $(P + 1)^n$, which is well known as the curse of dimensionality.

Smolyak quadrature [16], also known as sparse grid quadrature, is used as an efficient method to reduce the number of quadrature points. Let us define a one-dimensional sparse grid quadrature point set $\Theta_P^1 = \{\xi_1, \xi_2, \ldots, \xi_{P+1}\}$, which uses $P + 1$ points to achieve degree $2P + 1$ of exactness. The sparse grid for an $n$-dimensional quadrature at degree $P$ chooses points from the following set:
\[ \Theta_P^n = \bigcup_{P+1 \le |i| \le P+n} \left(\Theta_{i_1}^1 \times \cdots \times \Theta_{i_n}^1\right), \quad (2.41) \]
where $|i| = \sum_{j=1}^{n} i_j$. The corresponding weight is
\[ w_{j_1 \ldots j_n}^{i_1 \ldots i_n} = (-1)^{P+n-|i|} \binom{n-1}{n+P-|i|} \prod_{m=1}^{n} w_{j_m}^{i_m}, \quad (2.42) \]
where $\binom{n-1}{n+P-|i|}$ is the combinatorial number and $w_{j_m}^{i_m}$ is the weight of the corresponding one-dimensional quadrature point. It has been shown that interpolation on a Smolyak grid ensures a bound for the mean-square error [16]:
\[ |E_P| = O\!\left(N_P^{-r} (\log N_P)^{(r+1)(n-1)}\right), \]
where $N_P$ is the number of quadrature points and $r$ is the order of the maximum derivative that exists for the delay function. The number of quadrature points increases as $O(2^P n^P / P!)$.

It can be shown that a sparse grid of at least level $P$ is required for an order-$P$ representation. The reason is that the approximation contains order-$P$ polynomials for both $y(\xi)$ and $H_j(\xi)$. Thus, there exist terms $y(\xi) H_j(\xi)$ of order $2P$, which require a sparse grid of at least level $P$ with an exactness degree of $2P + 1$. Therefore, level-1 and level-2 sparse grids are required for linear and quadratic models, respectively. The number of quadrature points is about $2n$ for the linear model and $2n^2$ for the quadratic model. The computational cost is about the same as that of the Taylor-conversion method, while keeping the accuracy of the homogeneous chaos expansion.

In addition to the sparse grid technique, we can also employ several accelerating techniques. First, when $n$ is too small, the number of quadrature points for the sparse grid may be larger than that of the direct tensor product of a Gaussian quadrature. For example, if there are only two variables, the numbers of points are 5 and 15 for level-1 and level-2 sparse grids, compared to 4 and 9 for the direct tensor product. In this case, the sparse grid will not be used. Second, the set of quadrature points (2.41) may contain the same point several times with different weights. For example, the level-2 sparse grid for three variables contains four instances of the point (0, 0, 0). Combining these points by summing their weights reduces the cost of evaluating $y(\xi_i)$.
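For the one-dimensional case, the projection (2.40) can be carried out directly with Gauss-Hermite quadrature. The sketch below uses NumPy's probabilists' Hermite (HermiteE) routines, which match the basis (2.33); tensor-product and sparse grids are omitted. As a test function we take $y(\xi) = e^{\xi}$, whose exact coefficients $e^{1/2}/k!$ follow from Sect. 4.2 with $\sigma_g = 1$.

```python
# One-dimensional collocation per Eqs. 2.36/2.40 with probabilists' Hermite
# polynomials He_k (He_0 = 1, He_1 = xi, He_2 = xi^2 - 1, ...).
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval

def hpc_coeffs(y, order, npts=20):
    # a_k = <y(xi) He_k(xi)> / <He_k(xi)^2>, with <He_k^2> = k!
    pts, wts = hermegauss(npts)
    wts = wts / np.sqrt(2.0 * np.pi)   # normalize so the weights sum to 1
    coeffs = []
    for k in range(order + 1):
        c = np.zeros(k + 1)
        c[k] = 1.0                     # coefficient vector selecting He_k
        Hk = hermeval(pts, c)
        coeffs.append(float(np.sum(wts * y(pts) * Hk)) / math.factorial(k))
    return coeffs

# y(xi) = exp(xi) is log-normal; its exact HPC coefficients are e^{1/2}/k!.
print(hpc_coeffs(np.exp, order=2))
print([math.exp(0.5) / math.factorial(k) for k in range(3)])
```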

3.4 Galerkin-Based Spectral Stochastic Method

The Galerkin-based method is based on the principle of orthogonality: the best approximation $\hat{y}(\xi)$ of $y(\xi)$ is obtained when the error $\Delta(\xi)$, defined as
\[ \Delta(\xi) = y(\xi) - \hat{y}(\xi), \quad (2.43) \]
is orthogonal to the approximation. That is,
\[ \langle \Delta(\xi), H_k(\xi) \rangle = 0, \quad k = 0, 1, \ldots, P, \quad (2.44) \]
where the $H_k(\xi)$ are Hermite polynomials. In this way, we transform the stochastic analysis process into a deterministic form, in which we only need to compute the corresponding coefficients of the Hermite PC.

For illustration purposes, consider two Gaussian variables $\xi = [\xi_1, \xi_2]$ and assume that the charge vector on the panels can be written as a second-order ($p = 2$) Hermite PC. We then have
\[ y(\xi) = y_0 + y_1 \xi_1 + y_2 \xi_2 + y_3 (\xi_1^2 - 1) + y_4 (\xi_2^2 - 1) + y_5\, \xi_1 \xi_2, \quad (2.45) \]
which is solved via (2.44). Once the Hermite PC of $y(\xi)$ is known, the mean and variance of $y(\xi)$ can be evaluated trivially. For example, for one random variable, the mean and variance are calculated as
\[ E(y(\xi)) = y_0, \qquad \mathrm{Var}(y(\xi)) = y_1^2\,\mathrm{Var}(\xi) + y_2^2\,\mathrm{Var}(\xi^2 - 1) = y_1^2 + 2y_2^2. \quad (2.46) \]
To account for correlations among random variables, we apply PCA (Sect. 2.5) to transform the correlated variables into a set of independent variables.

4 Sum of Log-Normal Random Variables

The leakage current distribution is usually log-normal. Due to the exponential convergence rate, Hermite PC can be used to represent log-normal variables and the sum of log-normal variables [109].

4.1 Hermite PC Representation of Log-Normal Variables

Let $g(\xi)$ be a Gaussian random variable and $l(\xi)$ be the random variable obtained by taking the exponential of $g(\xi)$:
\[ l(\xi) = e^{g(\xi)}, \qquad g(\xi) = \ln(l(\xi)). \quad (2.47) \]
For the log-normal random variable $l$, let the mean and the variance of $g(\xi)$ be $\mu_g$ and $\sigma_g^2$; then the mean and variance of $l(\xi)$ are
\[ \mu_l = e^{\mu_g + \sigma_g^2/2} \quad (2.48) \]
and
\[ \sigma_l^2 = e^{2\mu_g + \sigma_g^2}\left[e^{\sigma_g^2} - 1\right], \quad (2.49) \]
respectively. A general Gaussian variable $g(\xi)$ can always be represented in the following affine form:
\[ g(\xi) = \sum_{i=0}^{n} g_i \xi_i, \quad (2.50) \]
where the $\xi_i$ are orthogonal Gaussian variables, that is, $\langle \xi_i \xi_j \rangle = \delta_{ij}$, $\langle \xi_i \rangle = 0$, and $\xi_0 = 1$, and $g_i$ is the coefficient of the individual Gaussian variable. Note that such a form can always be obtained by using the Karhunen-Loeve orthogonal expansion method [45]. In our problem, we need to represent the log-normal random variable $l(\xi)$ in the Hermite PC expansion form
\[ l(\xi) = \sum_{k=0}^{P} l_k H_k^n(\xi), \quad (2.51) \]
where $l_0 = \exp(\mu_g + \sigma_g^2/2)$. To find the other coefficients, we can apply (2.36):
\[ l_k = \frac{\langle l(\xi), H_k(\xi) \rangle}{\langle H_k^2(\xi) \rangle}, \quad \forall k \in \{0, \ldots, P\}. \quad (2.52) \]
As was shown in [44], the leading coefficient of $l(\xi)$ can be written as
\[ l_0 = \langle l(\xi) \rangle = \exp\!\left[\mu_g + \frac{1}{2} \sum_{j=1}^{n} g_j^2\right], \quad (2.53) \]
where $n$ is the number of independent Gaussian random variables.
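A quick sampling check of (2.48) and (2.49) is straightforward; the values of $\mu_g$ and $\sigma_g$ below are arbitrary.

```python
# Sampling check of Eqs. 2.48-2.49 for l = exp(g) with g ~ N(mu_g, sigma_g^2).
import numpy as np

mu_g, sigma_g = -1.0, 0.4   # arbitrary parameters
rng = np.random.default_rng(5)
l = np.exp(rng.normal(mu_g, sigma_g, size=1_000_000))

print(l.mean(), np.exp(mu_g + sigma_g**2 / 2))                      # Eq. 2.48
print(l.var(), np.exp(2*mu_g + sigma_g**2) * (np.exp(sigma_g**2) - 1.0))  # Eq. 2.49
```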

Using (2.53), the log-normal variable can then be written as
\[ l(\xi) = l_0 \left(1 + \sum_{i=1}^{n} g_i \xi_i + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} g_i g_j\,(\xi_i \xi_j - \delta_{ij}) + \cdots\right), \quad (2.54) \]
where $g_i$ is defined in (2.50).

4.2 Hermite PC Representation with One Gaussian Variable

In this case, $\xi = [\xi_1]$. For the second-order Hermite PC ($P = 2$), following (2.54), we have
\[ l(\xi) = l_0 \left(1 + \sigma_g \xi_1 + \frac{1}{2} \sigma_g^2 (\xi_1^2 - 1)\right). \quad (2.55) \]
Hence, the desired Hermite PC coefficients $l_i$, $i = 0, 1, 2$, can be expressed as $l_0$, $l_0 \sigma_g$, and $\frac{1}{2} l_0 \sigma_g^2$, respectively.

4.3 Hermite PC Representation of Two and More Gaussian Variables

For two random variables ($n = 2$), assume that $\xi = [\xi_1, \xi_2]$ is a normalized uncorrelated Gaussian random variable vector that represents the random variable $g(\xi)$:
\[ g(\xi) = \mu_g + \sigma_1 \xi_1 + \sigma_2 \xi_2. \quad (2.56) \]
Note that, for $i \ne j$,
\[ \langle (\xi_i \xi_j - \delta_{ij})^2 \rangle = \langle \xi_i^2 \xi_j^2 \rangle = \langle \xi_i^2 \rangle \langle \xi_j^2 \rangle = 1. \]
Therefore, the expansion of the log-normal random variable using second-order Hermite PCs can be expressed as
\[ l(\xi) = l_0 \left(1 + \sigma_1 \xi_1 + \sigma_2 \xi_2 + \frac{1}{2}\sigma_1^2 (\xi_1^2 - 1) + \frac{1}{2}\sigma_2^2 (\xi_2^2 - 1) + \sigma_1 \sigma_2\, \xi_1 \xi_2\right), \quad (2.57) \]
where
\[ \mu_l = l_0 = \exp\!\left[\mu_g + \frac{1}{2}\left(\sigma_1^2 + \sigma_2^2\right)\right]. \]
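The truncated expansion (2.57) can be validated by sampling; with small $\sigma_1, \sigma_2$ (chosen arbitrarily here), the second-order Hermite PC tracks $l = e^{g}$ closely.

```python
# Sampling check of Eq. 2.57: the second-order HPC expansion of a log-normal
# variable versus the exact l = exp(g), for two independent Gaussian components.
import numpy as np

mu_g, s1, s2 = 0.0, 0.2, 0.1   # arbitrary small sigmas
l0 = np.exp(mu_g + 0.5 * (s1**2 + s2**2))

rng = np.random.default_rng(6)
xi1, xi2 = rng.standard_normal((2, 500_000))
exact = np.exp(mu_g + s1 * xi1 + s2 * xi2)
hpc = l0 * (1 + s1*xi1 + s2*xi2
            + 0.5*s1**2*(xi1**2 - 1) + 0.5*s2**2*(xi2**2 - 1)
            + s1*s2*xi1*xi2)

rel_rms = np.sqrt(np.mean((exact - hpc)**2)) / exact.mean()
print(rel_rms)                        # small relative RMS truncation error
print(exact.mean(), hpc.mean(), l0)   # all three means agree
```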

From (2.57), the desired Hermite PC coefficients $l_i$, $i = 0, 1, 2, 3, 4, 5$, can be expressed as $l_0$, $l_0 \sigma_1$, $l_0 \sigma_2$, $\frac{1}{2} l_0 \sigma_1^2$, $\frac{1}{2} l_0 \sigma_2^2$, and $l_0 \sigma_1 \sigma_2$, respectively.

Similarly, for four Gaussian random variables, assume that $\xi = [\xi_1, \xi_2, \xi_3, \xi_4]$ is a normalized, uncorrelated Gaussian random variable vector. The random variable $g(\xi)$ can be expressed as
\[ g = \mu_g + \sum_{i=1}^{4} \sigma_i \xi_i. \quad (2.58) \]
As a result, the log-normal random variable $l(\xi)$ can be expressed as
\[ l(\xi) = l_0 \left(1 + \sum_{i=1}^{4} \sigma_i \xi_i + \sum_{i=1}^{4} \frac{1}{2}\sigma_i^2 (\xi_i^2 - 1) + \sum_{i=1}^{4} \sum_{j=i+1}^{4} \sigma_i \sigma_j\, \xi_i \xi_j\right), \quad (2.59) \]
where
\[ \mu_l = l_0 = \exp\!\left(\mu_g + \frac{1}{2} \sum_{i=1}^{4} \sigma_i^2\right). \]
Hence, the desired Hermite PC coefficients can be read off from (2.59) above.

5 Summary

An understanding of the preliminaries of probability theory is required to follow statistical analysis and modeling for VLSI design in the nanometer regime. In this chapter, we introduced the relevant fundamentals employed in statistical analysis. First, we presented the basic concepts and quantities, such as the mean, variance, and covariance arising from process variation. After that, we reviewed techniques for statistical variable decoupling and reduction based on PFA/PCA analysis. We further discussed the spectral stochastic analysis required for the extraction, mismatch, and yield analyses used in the later chapters. We also discussed different methods to estimate the sum of random variables, as required for leakage current estimation.


More information

Measurements of central tendency express whether the numbers tend to be high or low. The most common of these are:

Measurements of central tendency express whether the numbers tend to be high or low. The most common of these are: A PRIMER IN PROBABILITY This handout is intended to refresh you on the elements of probability and statistics that are relevant for econometric analysis. In order to help you prioritize the information

More information

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.

Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not. Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C

More information

POLYNOMIAL HISTOPOLATION, SUPERCONVERGENT DEGREES OF FREEDOM, AND PSEUDOSPECTRAL DISCRETE HODGE OPERATORS

POLYNOMIAL HISTOPOLATION, SUPERCONVERGENT DEGREES OF FREEDOM, AND PSEUDOSPECTRAL DISCRETE HODGE OPERATORS POLYNOMIAL HISTOPOLATION, SUPERCONVERGENT DEGREES OF FREEDOM, AND PSEUDOSPECTRAL DISCRETE HODGE OPERATORS N. ROBIDOUX Abstract. We show that, given a histogram with n bins possibly non-contiguous or consisting

More information

Distributed computing of failure probabilities for structures in civil engineering

Distributed computing of failure probabilities for structures in civil engineering Distributed computing of failure probabilities for structures in civil engineering Andrés Wellmann Jelic, University of Bochum (andres.wellmann@rub.de) Matthias Baitsch, University of Bochum (matthias.baitsch@rub.de)

More information

Lecture 6: Discrete & Continuous Probability and Random Variables

Lecture 6: Discrete & Continuous Probability and Random Variables Lecture 6: Discrete & Continuous Probability and Random Variables D. Alex Hughes Math Camp September 17, 2015 D. Alex Hughes (Math Camp) Lecture 6: Discrete & Continuous Probability and Random September

More information

Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering

Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering Two Topics in Parametric Integration Applied to Stochastic Simulation in Industrial Engineering Department of Industrial Engineering and Management Sciences Northwestern University September 15th, 2014

More information

Monte Carlo Methods in Finance

Monte Carlo Methods in Finance Author: Yiyang Yang Advisor: Pr. Xiaolin Li, Pr. Zari Rachev Department of Applied Mathematics and Statistics State University of New York at Stony Brook October 2, 2012 Outline Introduction 1 Introduction

More information

Similarity and Diagonalization. Similar Matrices

Similarity and Diagonalization. Similar Matrices MATH022 Linear Algebra Brief lecture notes 48 Similarity and Diagonalization Similar Matrices Let A and B be n n matrices. We say that A is similar to B if there is an invertible n n matrix P such that

More information

CITY UNIVERSITY LONDON. BEng Degree in Computer Systems Engineering Part II BSc Degree in Computer Systems Engineering Part III PART 2 EXAMINATION

CITY UNIVERSITY LONDON. BEng Degree in Computer Systems Engineering Part II BSc Degree in Computer Systems Engineering Part III PART 2 EXAMINATION No: CITY UNIVERSITY LONDON BEng Degree in Computer Systems Engineering Part II BSc Degree in Computer Systems Engineering Part III PART 2 EXAMINATION ENGINEERING MATHEMATICS 2 (resit) EX2005 Date: August

More information

Nonlinear Iterative Partial Least Squares Method

Nonlinear Iterative Partial Least Squares Method Numerical Methods for Determining Principal Component Analysis Abstract Factors Béchu, S., Richard-Plouet, M., Fernandez, V., Walton, J., and Fairley, N. (2016) Developments in numerical treatments for

More information

T ( a i x i ) = a i T (x i ).

T ( a i x i ) = a i T (x i ). Chapter 2 Defn 1. (p. 65) Let V and W be vector spaces (over F ). We call a function T : V W a linear transformation form V to W if, for all x, y V and c F, we have (a) T (x + y) = T (x) + T (y) and (b)

More information

Bindel, Spring 2012 Intro to Scientific Computing (CS 3220) Week 3: Wednesday, Feb 8

Bindel, Spring 2012 Intro to Scientific Computing (CS 3220) Week 3: Wednesday, Feb 8 Spaces and bases Week 3: Wednesday, Feb 8 I have two favorite vector spaces 1 : R n and the space P d of polynomials of degree at most d. For R n, we have a canonical basis: R n = span{e 1, e 2,..., e

More information

Introduction: Overview of Kernel Methods

Introduction: Overview of Kernel Methods Introduction: Overview of Kernel Methods Statistical Data Analysis with Positive Definite Kernels Kenji Fukumizu Institute of Statistical Mathematics, ROIS Department of Statistical Science, Graduate University

More information

MAS108 Probability I

MAS108 Probability I 1 QUEEN MARY UNIVERSITY OF LONDON 2:30 pm, Thursday 3 May, 2007 Duration: 2 hours MAS108 Probability I Do not start reading the question paper until you are instructed to by the invigilators. The paper

More information

Java Modules for Time Series Analysis

Java Modules for Time Series Analysis Java Modules for Time Series Analysis Agenda Clustering Non-normal distributions Multifactor modeling Implied ratings Time series prediction 1. Clustering + Cluster 1 Synthetic Clustering + Time series

More information

Dimensionality Reduction: Principal Components Analysis

Dimensionality Reduction: Principal Components Analysis Dimensionality Reduction: Principal Components Analysis In data mining one often encounters situations where there are a large number of variables in the database. In such situations it is very likely

More information

Least Squares Estimation

Least Squares Estimation Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

More information

Algebra Unpacked Content For the new Common Core standards that will be effective in all North Carolina schools in the 2012-13 school year.

Algebra Unpacked Content For the new Common Core standards that will be effective in all North Carolina schools in the 2012-13 school year. This document is designed to help North Carolina educators teach the Common Core (Standard Course of Study). NCDPI staff are continually updating and improving these tools to better serve teachers. Algebra

More information

Tail inequalities for order statistics of log-concave vectors and applications

Tail inequalities for order statistics of log-concave vectors and applications Tail inequalities for order statistics of log-concave vectors and applications Rafał Latała Based in part on a joint work with R.Adamczak, A.E.Litvak, A.Pajor and N.Tomczak-Jaegermann Banff, May 2011 Basic

More information

Nonparametric adaptive age replacement with a one-cycle criterion

Nonparametric adaptive age replacement with a one-cycle criterion Nonparametric adaptive age replacement with a one-cycle criterion P. Coolen-Schrijner, F.P.A. Coolen Department of Mathematical Sciences University of Durham, Durham, DH1 3LE, UK e-mail: Pauline.Schrijner@durham.ac.uk

More information

THE DYING FIBONACCI TREE. 1. Introduction. Consider a tree with two types of nodes, say A and B, and the following properties:

THE DYING FIBONACCI TREE. 1. Introduction. Consider a tree with two types of nodes, say A and B, and the following properties: THE DYING FIBONACCI TREE BERNHARD GITTENBERGER 1. Introduction Consider a tree with two types of nodes, say A and B, and the following properties: 1. Let the root be of type A.. Each node of type A produces

More information