# The Concept of Exchangeability and its Applications


José M. Bernardo

Departament d'Estadística i I.O., Universitat de València, Facultat de Matemàtiques, Burjassot, València, Spain.

Printed on December 9, 1996. Invited paper for the Bhattacharya Memorial Volume, to be published in the Far East J. Math. Sci.

José M. Bernardo is Professor of Statistics at the University of Valencia. Research partially funded with grant PB of the DGICYT, Madrid, Spain.

## SUMMARY

The general concept of exchangeability allows the more flexible modelling of most experimental setups. The representation theorems for exchangeable sequences of random variables establish that any coherent analysis of the information thus modelled requires the specification of a joint probability distribution on all the parameters involved, hence forcing a Bayesian approach. The concept of partial exchangeability provides a further refinement, by permitting appropriate modelling of related experimental setups, leading to coherent information integration by means of so-called hierarchical models. Recent applications of hierarchical models for combining information from similar experiments in education, medicine and psychology have been produced under the name of meta-analysis.

Keywords: BAYESIAN INFERENCE; EXCHANGEABILITY; HIERARCHICAL MODELS; META-ANALYSIS; REFERENCE DISTRIBUTIONS; REPRESENTATION THEOREMS.

## 1. INTRODUCTION

It is generally agreed that the uncertainty relative to the possible values of the observable outcome $\boldsymbol{x}=\{x_1,\dots,x_n\}$ of an experiment of size $n$ is appropriately described by its joint probability distribution with, say, density $p(\boldsymbol{x})=p(x_1,\dots,x_n)$, so that the probability that $\boldsymbol{x}$ belongs to a region $A$ is

$$P(\boldsymbol{x}\in A)=\int_A p(\boldsymbol{x})\,d\boldsymbol{x}$$

and, hence, by standard arguments of probability theory, the (predictive) probability that a future similar observation $x_{n+1}$ belongs to an interval $I$ given the information provided by $\boldsymbol{x}$ is

$$P(x_{n+1}\in I\mid \boldsymbol{x})=\int_I p(x_{n+1}\mid x_1,\dots,x_n)\,dx_{n+1},\qquad p(x_{n+1}\mid \boldsymbol{x})=\frac{p(x_1,\dots,x_{n+1})}{p(x_1,\dots,x_n)}.$$

It follows that, in order to predict a future observable quantity given a sequence of similar observations, which is one of the basic problems in the analysis of scientific data, it is necessary and sufficient to assess, for any $n$, the form of the joint probability density $p(x_1,\dots,x_n)$.

In Section 2, we describe the concept of exchangeability, which makes precise the sense in which the observations must be similar. In Section 3 we discuss the radical consequences of the exchangeability assumptions which are implied by the so-called representation theorems. In Section 4, we describe a further elaboration, introducing hierarchical models as an appropriate tool for information integration. Finally, Section 5 contains additional remarks, and some references to recent applied work on the combination of information, which makes use of the concepts reviewed here.

## 2. EXCHANGEABILITY

The joint density $p(x_1,\dots,x_n)$, which we have to specify, must encapsulate the type of dependence assumed among the individual random quantities $x_i$. In general, there is a vast number of possible assumptions about the form such dependencies might take, but there are some particularly simple forms which accurately describe a large class of experimental setups.

Suppose that in considering $p(x_1,\dots,x_n)$ the scientist makes the judgement that the subscripts, the labels identifying the individual random quantities, are uninformative, in the sense that the information that the $x_i$'s provide is independent of the order in which they are collected. This judgement of similarity or symmetry is captured by requiring that

$$p(x_1,\dots,x_n)=p(x_{\pi(1)},\dots,x_{\pi(n)})$$

for all permutations $\pi$ defined on the set $\{1,\dots,n\}$. A sequence of random quantities is said to be exchangeable if this property holds for every finite subset of them.

Example (Physiological responses). Suppose $(x_1,\dots,x_n)$ are real-valued measurements of a specific physiological response in human subjects when a particular drug is administered. If the drug is administered at more than one dose level, and if there are male and female subjects from different ethnic groups, one would be reluctant to make a judgement of exchangeability for the entire sequence of results. However, within each combination of dose level, sex, and ethnic group, an assumption of exchangeability would often be reasonable.

We shall now review the important consequences of an exchangeability assumption implied by the general representation theorem.

## 3. THE GENERAL REPRESENTATION THEOREM

The similarity assumption of exchangeability has strong mathematical implications. Formally, the general representation theorem provides an integral representation of the joint density $p(x_1,\dots,x_n)$ of any subset of exchangeable random quantities. More specifically, if $\{x_1,x_2,\dots\}$ is an exchangeable sequence of real-valued random quantities, then there exists a parametric model, $p(x\mid\theta)$, labeled by some parameter $\theta\in\Theta$ which is the limit (as $n\to\infty$) of some function of the $x_i$'s, and there exists a probability distribution for $\theta$, with density $p(\theta)$, such that

$$p(x_1,\dots,x_n)=\int_\Theta \prod_{i=1}^{n} p(x_i\mid\theta)\,p(\theta)\,d\theta.$$

In more conventional terminology, this means that, if a sequence of observations is judged to be exchangeable, then any finite subset of them is a random sample of some model $p(x_i\mid\theta)$, and [...]
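As a concrete check of the integral representation, the following Python sketch takes binary observations, assumes a Bernoulli model for $p(x_i\mid\theta)$ and a Beta(2, 3) density for $p(\theta)$, and verifies numerically that the resulting joint probability does not change when the observations are permuted. It also evaluates the predictive probability of Section 1 as a ratio of joint probabilities. The Bernoulli and Beta choices, the numerical values and all function names are illustrative assumptions, not taken from the paper.

```python
from itertools import permutations
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def joint_by_integration(xs, a=2.0, b=3.0, grid=2000):
    """Midpoint-rule approximation of the integral over (0, 1) of
    prod_i p(x_i | theta) * p(theta), with Bernoulli p(x | theta)
    and a Beta(a, b) density for p(theta) (assumed for illustration)."""
    norm = exp(log_beta(a, b))
    h = 1.0 / grid
    total = 0.0
    for k in range(grid):
        theta = (k + 0.5) * h
        likelihood = 1.0
        for x in xs:  # Bernoulli terms multiplied in the order given
            likelihood *= theta if x == 1 else 1.0 - theta
        prior = theta ** (a - 1) * (1.0 - theta) ** (b - 1) / norm
        total += likelihood * prior
    return total * h


def joint_closed_form(xs, a=2.0, b=3.0):
    """Same integral in closed form: B(a + s, b + n - s) / B(a, b),
    where s is the number of ones among the n observations."""
    s, n = sum(xs), len(xs)
    return exp(log_beta(a + s, b + n - s) - log_beta(a, b))


def predictive_one(xs, a=2.0, b=3.0):
    """P(x_{n+1} = 1 | x_1, ..., x_n) as a ratio of joint probabilities."""
    return joint_closed_form(list(xs) + [1], a, b) / joint_closed_form(xs, a, b)


if __name__ == "__main__":
    x = (1, 0, 1, 1)
    for p in sorted(set(permutations(x))):
        print(p, joint_by_integration(p))
    print("predictive:", predictive_one(x))
```

Because the joint probability depends on the data only through the number of ones, every ordering of the same observations receives the same value, which is the permutation invariance that defines exchangeability.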

## 4. HIERARCHICAL MODELS

One often has several related sequences of exchangeable random quantities, the distribution of which depends on separately sufficient statistics in the sense that, if one has $m$ of such sequences,

$$p(\boldsymbol{x}_1,\dots,\boldsymbol{x}_m)=\prod_{i=1}^{m} p(\boldsymbol{x}_i\mid t_i),$$

where $t_i$ is a sufficient statistic for $\boldsymbol{x}_i$. The corresponding integral representation is then of the form

$$p(\boldsymbol{x}_1,\dots,\boldsymbol{x}_m)=\int_\Theta \prod_{i=1}^{m}\prod_{j=1}^{n_i} p_i(x_{ij}\mid\theta_i)\,p(\theta_1,\dots,\theta_m)\,d\theta_1\cdots d\theta_m.$$

Most often, the fact that the $m$ sequences are being considered together means that all random sequences relate to the same measurement procedure, so that one typically has $p_i(x\mid\theta_i)=p(x\mid\theta_i)$.

Example (Continued). If $x_{ij}$, $j=1,\dots,n_i$, $i=1,\dots,m$, denote the responses to a drug from females from $m$ ethnic groups in the conditions described above, then one would typically have a representation of the form

$$p(\boldsymbol{x}_1,\dots,\boldsymbol{x}_m)=\int_\Theta \prod_{i=1}^{m}\prod_{j=1}^{n_i} N(x_{ij}\mid\mu_i,\sigma)\,p(\mu_1,\dots,\mu_m,\sigma)\,d\mu_1\cdots d\mu_m\,d\sigma.$$

Thus, $\{x_{i1},\dots,x_{in_i}\}$ must be considered as a random sample from a normal distribution $N(x\mid\mu_i,\sigma)$ and there must exist a prior distribution $p(\mu_1,\dots,\mu_m,\sigma)$ which has to describe our assumptions on the relationship among the means of the $m$ groups.

In this context, judgements of exchangeability are not only appropriate within each of the $m$ separate sequences of observations, but also between the $m$ corresponding parameters, so that their joint distribution has an integral representation of the form

$$p(\theta_1,\dots,\theta_m)=\int_\Phi \prod_{i=1}^{m} p(\theta_i\mid\phi)\,p(\phi)\,d\phi,$$

and, hence, the parameter values which correspond to each sequence may be seen as a random sample from some parameter population with density $p(\theta\mid\phi)$, and there must exist a prior distribution $p(\phi)$ describing the initial information about the hyperparameter $\phi$ which labels $p(\theta\mid\phi)$.

Example (Continued). If the mean responses $\{\mu_1,\dots,\mu_m\}$ within each of the $m$ ethnic groups considered are judged to be exchangeable, then

$$p(\mu_1,\dots,\mu_m,\sigma)=\prod_{i=1}^{m} p(\mu_i\mid\sigma)\,p(\sigma).$$

If, furthermore, their mean and standard deviation are judged sufficient to capture all relevant information about the $\mu_i$'s, then

$$p(\mu_1,\dots,\mu_m,\sigma)=p(\sigma)\int \prod_{i=1}^{m} N(\mu_i\mid\mu_0,\sigma_0)\,p(\mu_0,\sigma_0)\,d\mu_0\,d\sigma_0,$$

and, hence, the $\mu_i$'s, the means of the responses within each ethnic group, may be regarded as a random sample from a normal population of female mean responses with overall mean $\mu_0$ and standard deviation $\sigma_0$.
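To make the two-level structure of this example concrete, the following Python sketch simulates it: the group means are first drawn as a random sample from a normal population with mean $\mu_0$ and standard deviation $\sigma_0$, and the responses within each group are then drawn from a normal distribution centred at that group's mean with standard deviation $\sigma$. The function name, the seed handling and every numerical value are hypothetical; the paper specifies none of them.

```python
import random
from statistics import mean


def simulate_groups(n_per_group, mu0, sigma0, sigma, seed=0):
    """Draw data from the two-level normal model of the example.

    n_per_group : list with the sample size n_i of each of the m groups
    mu0, sigma0 : mean and standard deviation of the population of group means
    sigma       : common within-group standard deviation
    Returns (mus, data), where data[i] holds the n_i responses of group i.
    """
    rng = random.Random(seed)
    # Second level: group means are a random sample from N(mu0, sigma0).
    mus = [rng.gauss(mu0, sigma0) for _ in n_per_group]
    # First level: responses in group i are a random sample from N(mu_i, sigma).
    data = [[rng.gauss(mu, sigma) for _ in range(n)]
            for mu, n in zip(mus, n_per_group)]
    return mus, data


if __name__ == "__main__":
    mus, data = simulate_groups([3, 5, 2], mu0=10.0, sigma0=2.0, sigma=1.0, seed=1)
    for mu, xs in zip(mus, data):
        print(f"group mean {mu:.2f}, sample mean {mean(xs):.2f}, n = {len(xs)}")
```

With small groups the sample means scatter visibly around their own group means, which is the situation in which pooling information across groups is most useful.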

If no prior information is assumed on $\mu_0$, $\sigma_0$ and $\sigma$, one may use the corresponding reference prior and derive the appropriate posterior distributions for each of the $\mu_i$'s, $\pi(\mu_i\mid\boldsymbol{x}_1,\dots,\boldsymbol{x}_m)$. It is easily verified that, conditional on $\sigma$ and $\sigma_0$, the posterior means of the $\mu_i$'s are given by

$$E[\mu_i\mid\boldsymbol{x}_1,\dots,\boldsymbol{x}_m,\sigma,\sigma_0]=\omega_i\,\bar{x}_i+(1-\omega_i)\,\bar{x},$$

with

$$\bar{x}=\frac{\sum_{i=1}^{m}\omega_i\,\bar{x}_i}{\sum_{i=1}^{m}\omega_i},\qquad \omega_i=\frac{n_i\sigma_0^2}{n_i\sigma_0^2+\sigma^2}.$$

This shows that there is a shrinkage from the sample mean within the $i$-th group, $\bar{x}_i$, towards the overall weighted mean, $\bar{x}$, which reflects the fact that the exchangeability assumption about the $\mu_i$'s makes all data relevant to draw conclusions about each of them, thus providing a very simple example of information integration. For details, see Berger and Bernardo (1992b).

Hierarchical modelling provides a powerful and flexible approach to the representation of assumptions about observables in complex data structures. This section merely provides a brief sketch of the basic ideas. A comprehensive discussion of hierarchical models requires a dedicated monograph.

## 5. DISCUSSION

The representation theorems are mainly due to de Finetti (1930, 1970/1974), Hewitt and Savage (1955) and Diaconis and Freedman (1984, 1987); for a recent review at a textbook level see Bernardo and Smith (1994, Ch. 4). The detailed mathematics of the representation theorems are involved, but their main message is very clear: if a sequence of observations is judged to be exchangeable, then any subset of them must be regarded as a random sample from some model, and there exists a prior distribution on the parameter of such a model, hence requiring a Bayesian approach.

A simple hierarchical model is of the form

$$p(\boldsymbol{x}_1,\dots,\boldsymbol{x}_m\mid\theta_1,\dots,\theta_m)=\prod_{i=1}^{m} p(\boldsymbol{x}_i\mid\theta_i),$$

$$p(\theta_1,\dots,\theta_m\mid\phi)=\prod_{i=1}^{m} p(\theta_i\mid\phi),$$

$$p(\phi),$$

which is to be interpreted as follows. Sequences of observables $\boldsymbol{x}_1,\dots,\boldsymbol{x}_m$ are available from $m$ different, but related sources: for example, $m$ clinical trial centres involved in the same study.
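For sources such as these trial centres, the conditional posterior means stated at the end of Section 4 can be evaluated directly once $\sigma$ and $\sigma_0$ are fixed. The Python sketch below does so with the weights $\omega_i=n_i\sigma_0^2/(n_i\sigma_0^2+\sigma^2)$; the function name and the small data set are made up for illustration, and treating $\sigma$ and $\sigma_0$ as known is a simplification of the full reference analysis.

```python
from statistics import mean


def shrinkage_means(groups, sigma, sigma0):
    """Posterior means of the group means, conditional on sigma and sigma0.

    groups : list of lists, groups[i] holding the n_i observations of source i
    sigma  : within-group standard deviation (assumed known here)
    sigma0 : standard deviation of the population of group means (assumed known)
    Returns (posterior_means, weights, overall_mean).
    """
    sizes = [len(g) for g in groups]
    sample_means = [mean(g) for g in groups]
    # omega_i = n_i * sigma0^2 / (n_i * sigma0^2 + sigma^2)
    weights = [n * sigma0 ** 2 / (n * sigma0 ** 2 + sigma ** 2) for n in sizes]
    # Overall mean: weighted average of the group sample means.
    overall = sum(w * xb for w, xb in zip(weights, sample_means)) / sum(weights)
    # Each group mean is pulled from its sample mean towards the overall mean.
    posterior = [w * xb + (1.0 - w) * overall
                 for w, xb in zip(weights, sample_means)]
    return posterior, weights, overall


if __name__ == "__main__":
    centres = [[1.0, 3.0], [4.0, 6.0, 5.0, 5.0], [10.0]]  # made-up data
    post, w, overall = shrinkage_means(centres, sigma=2.0, sigma0=1.0)
    print("weights:", w)
    print("overall weighted mean:", overall)
    print("posterior means:", post)
```

In this made-up data set the sample means are 2, 5 and 10, the overall weighted mean is 5, and the posterior means are approximately 4, 5 and 6: the single observation of the third centre is shrunk the most, because its weight $\omega_3=1/5$ is the smallest.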
The first stage of the hierarchy specifies the parametric model of each of the $m$ sequences. Since the sequences are similar, the parameters $\theta_1,\dots,\theta_m$ are themselves judged to be exchangeable; the second and third stages of the hierarchy thus provide a prior of the form

$$p(\theta_1,\dots,\theta_m)=\int_\Phi \prod_{i=1}^{m} p(\theta_i\mid\phi)\,p(\phi)\,d\phi,$$

where the hyperparameter $\phi$ typically has an interpretation in terms of the characteristics (for example, mean and variance) of the population (for example, trial centres) from which the $m$ data sequences are drawn. Very often, no reliable prior information is available on the hyperparameter $\phi$, in which case a reference prior $\pi(\phi)$ may be used; for details, see Bernardo (1979), Berger and Bernardo (1992a) or Bernardo and Smith (1994, Sec. 5.4).

The information about both the unit characteristics $\theta_i$ and the population characteristics $\phi$ after the data $\boldsymbol{x}_1,\dots,\boldsymbol{x}_m$ have been collected is described by their corresponding posterior densities, which may be obtained, from standard probability arguments involving Bayes' theorem, as

$$p(\theta_i\mid\boldsymbol{x}_1,\dots,\boldsymbol{x}_m)=\int_\Phi p(\theta_i\mid\phi,\boldsymbol{x}_1,\dots,\boldsymbol{x}_m)\,p(\phi\mid\boldsymbol{x}_1,\dots,\boldsymbol{x}_m)\,d\phi,$$

$$p(\phi\mid\boldsymbol{x}_1,\dots,\boldsymbol{x}_m)\propto p(\boldsymbol{x}_1,\dots,\boldsymbol{x}_m\mid\phi)\,p(\phi),$$

where

$$p(\theta_i\mid\phi,\boldsymbol{x}_1,\dots,\boldsymbol{x}_m)\propto p(\boldsymbol{x}_1,\dots,\boldsymbol{x}_m\mid\theta_i)\,p(\theta_i\mid\phi),$$

$$p(\boldsymbol{x}_1,\dots,\boldsymbol{x}_m\mid\phi)=\int_\Theta p(\boldsymbol{x}_1,\dots,\boldsymbol{x}_m\mid\theta_1,\dots,\theta_m)\,p(\theta_1,\dots,\theta_m\mid\phi)\,d\theta_1\cdots d\theta_m.$$

Naturally, actual implementation requires the evaluation of the appropriate integrals, and this is not necessarily trivial. Some key references on hierarchical models are Good (1965, 1980), Lindley (1971), Lindley and Smith (1972), Smith (1973), Mouchart and Simar (1980), Goel and DeGroot (1981), Berger and Bernardo (1992), George et al. (1994) and Morris and Christiansen (1996).

The term meta-analysis is often used in the education, medicine and psychology literature to describe the practice of combining results from similar independent experiments. Hierarchical models provide the appropriate tool for understanding and generalizing those analyses. Some key references within this specialized topic are Cochran (1954), Edwards et al. (1963), Hedges and Olkin (1985), DerSimonian and Laird (1986), Berlin et al. (1989), Goodman (1989), DuMouchel (1990), Wachter and Straf (1990), Cook et al. (1992), Morris and Normand (1992), Wolpert and Warren-Hicks (1992), Cooper and Hedges (1994) and Petitti (1994).

## REFERENCES

Berger, J. O. and Bernardo, J. M. (1992a). On the development of reference priors. Bayesian Statistics 4 (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.). Oxford: University Press (with discussion).

Berger, J. O. and Bernardo, J. M. (1992b). Reference priors in a variance components problem. Bayesian Analysis in Statistics and Econometrics (P. K. Goel and N. S. Iyengar, eds.). Berlin: Springer.

Berlin, J. A., Laird, N. M., Sacks, H. S. and Chalmers, T. C. (1989). A comparison of statistical methods for combining event rates from clinical trials. Statistics in Medicine 8.

Bernardo, J. M. (1979b). Reference posterior distributions for Bayesian inference. J. Roy. Statist. Soc. B 41 (with discussion).

Bernardo, J. M. and Smith, A. F. M. (1994). Bayesian Theory. Chichester: Wiley.

Berry, D. A., Thor, A., Cirrincione, C., Edgerton, S., Muss, H., Marks, J., Lui, E., Wood, W., Budman, D., Perloff, M., Peters, W. and Henderson, I. C. (1996). Scientific inference and predictions; multiplicities and convincing stories: a case study in breast cancer therapy. Bayesian Statistics 5 (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.). Oxford: University Press, 45–67 (with discussion).

Cochran, W. G. (1954). The combination of estimates from different experiments. Biometrics 10.

Cook, T. D., Cooper, H., Cordray, D. S., Hartmann, H., Hedges, L. V., Light, R. J., Louis, T. A. and Mosteller, F. (1992). Meta-Analysis for Explanation: A Casebook. New York: Russell Sage Foundation.

Cooper, H. and Hedges, L. V. (eds.) (1994). The Handbook of Research Synthesis. New York: Russell Sage Foundation.

de Finetti, B. (1930). Funzione caratteristica di un fenomeno aleatorio. Mem. Acad. Naz. Lincei 4.

de Finetti, B. (1970/1974). Teoria delle Probabilità 1. Turin: Einaudi. English translation as Theory of Probability 1 in 1974, Chichester: Wiley.

DerSimonian, R. and Laird, N. (1986). Meta-analysis in clinical trials. Controlled Clinical Trials 7.

Diaconis, P. and Freedman, D. (1984). Partial exchangeability and sufficiency. Statistics: Applications and New Directions (J. K. Ghosh and J. Roy, eds.). Calcutta: Indian Statist. Institute.

Diaconis, P. and Freedman, D. (1987). A dozen de Finetti-style results in search of a theory. Ann. Inst. H. Poincaré 23.

DuMouchel, W. H. (1990). Bayesian Meta-Analysis. Statistical Methodology in the Pharmaceutical Sciences (D. A. Berry, ed.). New York: Marcel Dekker.

Edwards, W., Lindman, H. and Savage, L. J. (1963). Bayesian statistical inference for psychological research. Psychological Rev. 70.

George, E. I., Makov, U. E. and Smith, A. F. M. (1994). Bayesian hierarchical analysis for exponential families via Markov chain Monte Carlo. Aspects of Uncertainty: a Tribute to D. V. Lindley (P. R. Freeman and A. F. M. Smith, eds.). Chichester: Wiley.

Goel, P. K. and DeGroot, M. H. (1981). Information about hyperparameters in hierarchical models. J. Amer. Statist. Assoc. 76.

Good, I. J. (1965). The Estimation of Probabilities. An Essay on Modern Bayesian Methods. Cambridge, Mass: The MIT Press.

Good, I. J. (1980). Some history of the hierarchical Bayesian methodology. Bayesian Statistics (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.). Valencia: University Press.

Goodman, S. N. (1989). Meta-analysis and evidence. Controlled Clinical Trials 10.

Hedges, L. V. and Olkin, I. (1985). Statistical Methods for Meta-analysis. New York: Academic Press.

Hewitt, E. and Savage, L. J. (1955). Symmetric measures on Cartesian products. Trans. Amer. Math. Soc. 80.

Lindley, D. V. (1971). The estimation of many parameters. Foundations of Statistical Inference (V. P. Godambe and D. A. Sprott, eds.). Toronto: Holt, Rinehart and Winston (with discussion).

Lindley, D. V. and Smith, A. F. M. (1972). Bayes estimates for the linear model. J. Roy. Statist. Soc. B 34, 1–41 (with discussion).

Morris, C. N. and Christiansen, C. L. (1996). Hierarchical models for ranking and for identifying extremes, with applications. Bayesian Statistics 5 (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.). Oxford: University Press (with discussion).

Morris, C. N. and Normand, S. L. (1992). Hierarchical models for combining information and for meta-analysis. Bayesian Statistics 4 (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.). Oxford: University Press (with discussion).

Mouchart, M. and Simar, L. (1980). Least squares approximation in Bayesian analysis. Bayesian Statistics (J. M. Bernardo, M. H. DeGroot, D. V. Lindley and A. F. M. Smith, eds.). Valencia: University Press (with discussion).

Petitti, D. B. (1994). Meta-Analysis, Decision Analysis, and Cost-Effectiveness Analysis: Methods for Quantitative Synthesis in Medicine. Oxford: University Press.

Smith, A. F. M. (1973). A general Bayesian linear model. J. Roy. Statist. Soc. B 35.

Wachter, K. W. and Straf, M. L. (eds.) (1990). The Future of Meta-Analysis. New York: Russell Sage Foundation.

Wolpert, R. L. and Warren-Hicks, W. J. (1992). Bayesian hierarchical logistic models for combining field and laboratory survival data. Bayesian Statistics 4 (J. M. Bernardo, J. O. Berger, A. P. Dawid and A. F. M. Smith, eds.). Oxford: University Press (with discussion).


CS 70 Discrete Mathematics and Probability Theory Fall 2009 Satish Rao,David Tse Note Conditional Probability A pharmaceutical company is marketing a new test for a certain medical condition. According

### Continuous random variables

Continuous random variables So far we have been concentrating on discrete random variables, whose distributions are not continuous. Now we deal with the so-called continuous random variables. A random

### Current Standard: Mathematical Concepts and Applications Shape, Space, and Measurement- Primary

Shape, Space, and Measurement- Primary A student shall apply concepts of shape, space, and measurement to solve problems involving two- and three-dimensional shapes by demonstrating an understanding of:

### AN ALGORITHM FOR DETERMINING WHETHER A GIVEN BINARY MATROID IS GRAPHIC

AN ALGORITHM FOR DETERMINING WHETHER A GIVEN BINARY MATROID IS GRAPHIC W. T. TUTTE. Introduction. In a recent series of papers [l-4] on graphs and matroids I used definitions equivalent to the following.

### Bayesian Statistical Analysis in Medical Research

Bayesian Statistical Analysis in Medical Research David Draper Department of Applied Mathematics and Statistics University of California, Santa Cruz draper@ams.ucsc.edu www.ams.ucsc.edu/ draper ROLE Steering

### P (A) = lim P (A) = N(A)/N,

1.1 Probability, Relative Frequency and Classical Definition. Probability is the study of random or non-deterministic experiments. Suppose an experiment can be repeated any number of times, so that we

### Bayesian Adaptive Methods for Clinical Trials

CENTER FOR QUANTITATIVE METHODS Department of Biostatistics COURSE: Bayesian Adaptive Methods for Clinical Trials Bradley P. Carlin (Division of Biostatistics, University of Minnesota) and Laura A. Hatfield

### Using game-theoretic probability for probability judgment. Glenn Shafer

Workshop on Game-Theoretic Probability and Related Topics Tokyo University February 28, 2008 Using game-theoretic probability for probability judgment Glenn Shafer For 170 years, people have asked whether

### Non-Inferiority Tests for One Mean

Chapter 45 Non-Inferiority ests for One Mean Introduction his module computes power and sample size for non-inferiority tests in one-sample designs in which the outcome is distributed as a normal random

### Probability and Statistics

Probability and Statistics Syllabus for the TEMPUS SEE PhD Course (Podgorica, April 4 29, 2011) Franz Kappel 1 Institute for Mathematics and Scientific Computing University of Graz Žaneta Popeska 2 Faculty

### Math Department Student Learning Objectives Updated April, 2014

Math Department Student Learning Objectives Updated April, 2014 Institutional Level Outcomes: Victor Valley College has adopted the following institutional outcomes to define the learning that all students

### New Axioms for Rigorous Bayesian Probability

Bayesian Analysis (2009) 4, Number 3, pp. 599 606 New Axioms for Rigorous Bayesian Probability Maurice J. Dupré, and Frank J. Tipler Abstract. By basing Bayesian probability theory on five axioms, we can

### THE CENTRAL LIMIT THEOREM TORONTO

THE CENTRAL LIMIT THEOREM DANIEL RÜDT UNIVERSITY OF TORONTO MARCH, 2010 Contents 1 Introduction 1 2 Mathematical Background 3 3 The Central Limit Theorem 4 4 Examples 4 4.1 Roulette......................................

### POWER SETS AND RELATIONS

POWER SETS AND RELATIONS L. MARIZZA A. BAILEY 1. The Power Set Now that we have defined sets as best we can, we can consider a sets of sets. If we were to assume nothing, except the existence of the empty

### An Introduction to Basic Statistics and Probability

An Introduction to Basic Statistics and Probability Shenek Heyward NCSU An Introduction to Basic Statistics and Probability p. 1/4 Outline Basic probability concepts Conditional probability Discrete Random

### On the Union of Arithmetic Progressions

On the Union of Arithmetic Progressions Shoni Gilboa Rom Pinchasi August, 04 Abstract We show that for any integer n and real ɛ > 0, the union of n arithmetic progressions with pairwise distinct differences,

### The distance of a curve to its control polygon J. M. Carnicer, M. S. Floater and J. M. Peña

The distance of a curve to its control polygon J. M. Carnicer, M. S. Floater and J. M. Peña Abstract. Recently, Nairn, Peters, and Lutterkort bounded the distance between a Bézier curve and its control

### Probability Theory. Elementary rules of probability Sum rule. Product rule. p. 23

Probability Theory Uncertainty is key concept in machine learning. Probability provides consistent framework for the quantification and manipulation of uncertainty. Probability of an event is the fraction

### Approaches for Analyzing Survey Data: a Discussion

Approaches for Analyzing Survey Data: a Discussion David Binder 1, Georgia Roberts 1 Statistics Canada 1 Abstract In recent years, an increasing number of researchers have been able to access survey microdata

### A Tutorial on Probability Theory

Paola Sebastiani Department of Mathematics and Statistics University of Massachusetts at Amherst Corresponding Author: Paola Sebastiani. Department of Mathematics and Statistics, University of Massachusetts,

### EC 112 SAMPLING, ESTIMATION AND HYPOTHESIS TESTING. One and One Half Hours (1 1 2 Hours) TWO questions should be answered

ECONOMICS EC 112 SAMPLING, ESTIMATION AND HYPOTHESIS TESTING One and One Half Hours (1 1 2 Hours) TWO questions should be answered Statistical tables are provided. The approved calculator (that is, Casio

### Principle of Data Reduction

Chapter 6 Principle of Data Reduction 6.1 Introduction An experimenter uses the information in a sample X 1,..., X n to make inferences about an unknown parameter θ. If the sample size n is large, then

### MATHEMATICS (CLASSES XI XII)

MATHEMATICS (CLASSES XI XII) General Guidelines (i) All concepts/identities must be illustrated by situational examples. (ii) The language of word problems must be clear, simple and unambiguous. (iii)

### Fixed-Effect Versus Random-Effects Models

CHAPTER 13 Fixed-Effect Versus Random-Effects Models Introduction Definition of a summary effect Estimating the summary effect Extreme effect size in a large study or a small study Confidence interval

### Tutorial on Markov Chain Monte Carlo

Tutorial on Markov Chain Monte Carlo Kenneth M. Hanson Los Alamos National Laboratory Presented at the 29 th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Technology,

### 2. Reduction to the simple case Proposition 2.1. It suffices to prove Theorem 1.1 for finite non-abelian simple groups.

ON THE NAVARRO-WILLEMS CONJECTURE FOR BLOCKS OF FINITE GROUPS C. BESSENRODT, G. NAVARRO, J. B. OLSSON, AND P. H. TIEP Abstract. We prove that a set of characters of a finite group can only be the set of

### Definition of Random Variable A random variable is a function from a sample space S into the real numbers.

.4 Random Variable Motivation example In an opinion poll, we might decide to ask 50 people whether they agree or disagree with a certain issue. If we record a for agree and 0 for disagree, the sample space

### Duality of linear conic problems

Duality of linear conic problems Alexander Shapiro and Arkadi Nemirovski Abstract It is well known that the optimal values of a linear programming problem and its dual are equal to each other if at least

### The Math. P (x) = 5! = 1 2 3 4 5 = 120.

The Math Suppose there are n experiments, and the probability that someone gets the right answer on any given experiment is p. So in the first example above, n = 5 and p = 0.2. Let X be the number of correct

### Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay

Information Theory and Coding Prof. S. N. Merchant Department of Electrical Engineering Indian Institute of Technology, Bombay Lecture - 17 Shannon-Fano-Elias Coding and Introduction to Arithmetic Coding

### Power and sample size in multilevel modeling

Snijders, Tom A.B. Power and Sample Size in Multilevel Linear Models. In: B.S. Everitt and D.C. Howell (eds.), Encyclopedia of Statistics in Behavioral Science. Volume 3, 1570 1573. Chicester (etc.): Wiley,

### Why the Normal Distribution?

Why the Normal Distribution? Raul Rojas Freie Universität Berlin Februar 2010 Abstract This short note explains in simple terms why the normal distribution is so ubiquitous in pattern recognition applications.

### Artificial Intelligence Mar 27, Bayesian Networks 1 P (T D)P (D) + P (T D)P ( D) =

Artificial Intelligence 15-381 Mar 27, 2007 Bayesian Networks 1 Recap of last lecture Probability: precise representation of uncertainty Probability theory: optimal updating of knowledge based on new information

### Sums of Independent Random Variables

Chapter 7 Sums of Independent Random Variables 7.1 Sums of Discrete Random Variables In this chapter we turn to the important question of determining the distribution of a sum of independent random variables

### 3.4 Complex Zeros and the Fundamental Theorem of Algebra

86 Polynomial Functions.4 Complex Zeros and the Fundamental Theorem of Algebra In Section., we were focused on finding the real zeros of a polynomial function. In this section, we expand our horizons and

### CORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA

We Can Early Learning Curriculum PreK Grades 8 12 INSIDE ALGEBRA, GRADES 8 12 CORRELATED TO THE SOUTH CAROLINA COLLEGE AND CAREER-READY FOUNDATIONS IN ALGEBRA April 2016 www.voyagersopris.com Mathematical

### Introduction to Markov Chain Monte Carlo

Introduction to Markov Chain Monte Carlo Monte Carlo: sample from a distribution to estimate the distribution to compute max, mean Markov Chain Monte Carlo: sampling using local information Generic problem

### Timely Issues in Risk Management. Uncertainty and Subjectivity in Quantitative Science.

Timely Issues in Risk Management. Uncertainty and Subjectivity in Quantitative Science. Klaus Böcker CQD Workshop, Zagreb, April 9, 2010 2 3 Disclaimer The opinions expressed in this talk are those of