Borges, J. L. 1998. On exactitude in science. P. 325 in Jorge Luis Borges, Collected Fictions (Trans. Hurley, H.). Penguin Books.

1 ... In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, and the map of the Empire, the entirety of a Province. In time, those Unconscionable Maps no longer satisfied, and the Cartographers Guilds struck a Map of the Empire whose size was that of the Empire, and which coincided point for point with it. The following Generations, who were not so fond of the Study of Cartography as their Forebears had been, saw that that vast Map was Useless, and not without some Pitilessness was it, that they delivered it up to the Inclemencies of Sun and Winters. In the Deserts of the West, still today, there are Tattered Ruins of that Map, inhabited by Animals and Beggars; in all the Land there is no other Relic of the Disciplines of Geography.
Suárez Miranda, Viajes de varones prudentes, Libro IV, Cap. XLV, Lérida, 1658
Borges, J. L. 1998. On exactitude in science. P. 325 in Jorge Luis Borges, Collected Fictions (Trans. Hurley, H.). Penguin Books.

2 Fitting (more) Macroevolutionary Models to Data Luke J. Harmon

3 Slatkin and Pollack 2005

4 within species Slatkin and Pollack 2005

5 within species among species Slatkin and Pollack 2005

6 Hansen and Martins 1996

7 Topics. Fitting models to comparative data: what do we know? Extending the set of models we can fit. The future of comparative methods.

8 Topics. Fitting models to comparative data: what do we know? Extending the set of models we can fit. The future of comparative methods.

9 Example: Anolis lizards. Lizards on Caribbean islands. Phylogenetic and body size data for 73 species (out of ~140 total). [Photo: Anolis baleatus]

10

11 Brownian Motion. Two parameters: starting value ($z_0$) and rate ($\sigma^2$). $dz(t) = \sigma\,dB(t)$ [Figure: trait value $z(t)$ against time $t$, starting from $z_0$]
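A minimal sketch of the Brownian motion model on this slide, simulated in discrete time steps. The function name, step count, and parameter values are illustrative assumptions, not taken from the talk. It checks the defining property that the variance of the trait after time t is roughly σ² t.

```python
import math
import random


def simulate_bm(z0, sigma2, total_time, n_steps, rng):
    """Return a discretized Brownian path z(0), ..., z(total_time)."""
    dt = total_time / n_steps
    step_sd = math.sqrt(sigma2 * dt)  # each increment ~ Normal(0, sigma2 * dt)
    path = [z0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, step_sd))
    return path


rng = random.Random(1)
# Hypothetical values: z0 = 0, sigma2 = 0.5, ten time units.
endpoints = [simulate_bm(0.0, 0.5, 10.0, 100, rng)[-1] for _ in range(2000)]
mean_end = sum(endpoints) / len(endpoints)
var_end = sum((z - mean_end) ** 2 for z in endpoints) / (len(endpoints) - 1)
print(mean_end, var_end)  # the variance should be near sigma2 * total_time
```

Across replicates the endpoint mean stays near the starting value while the variance grows linearly with time, which is what makes the two parameters separately estimable.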

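12 Phylogeny [Figure: tree with tips A and B]

13 Phylogeny: var(a) [Figure: tree with tips A and B]

14 Phylogeny: var(a), var(b) [Figure: tree with tips A and B]

15 Phylogeny: var(a), var(b), cov(a,b) [Figure: tree with tips A and B]

16 Phylogenetic variance-covariance (VCV) matrix (rows and columns: A, B)

17 Phylogenetic variance-covariance (VCV) matrix (rows and columns: A, B), diagonal entries var(a) and var(b)

18 Phylogenetic variance-covariance (VCV) matrix (rows and columns: A, B): $\begin{pmatrix} \mathrm{var}(a) & \mathrm{cov}(a,b) \\ \mathrm{cov}(a,b) & \mathrm{var}(b) \end{pmatrix}$

19 Phylogenetic variance-covariance (VCV) matrix (rows and columns: A, B; $\sigma^2$ also shown): $\begin{pmatrix} \mathrm{var}(a) & \mathrm{cov}(a,b) \\ \mathrm{cov}(a,b) & \mathrm{var}(b) \end{pmatrix}$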

20 genetic drift (co)variance + -

21 General form. Tip data follow a multivariate normal distribution with mean vector $z_0$ and a variance-covariance matrix where $\mathrm{var}(i) = \sigma^2 d_i$, with $d_i$ = distance from root to tip $i$, and $\mathrm{cov}(i,j) = \sigma^2 c_{i,j}$, with $c_{i,j}$ = shared path of tips $i$ and $j$.
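The general form above can be sketched in a few lines. The three-tip tree, its branch lengths, and the helper names below are hypothetical. Each tip is stored as its root-to-tip path, so the shared path c(i,j) is the summed length of the leading branches two paths have in common, and the diagonal reduces to the root-to-tip distance.

```python
# Hypothetical ultrametric tree ((A:1, B:1):2, C:3), written as the
# root-to-tip path of each tip: a list of (branch_name, branch_length).
paths = {
    "A": [("AB", 2.0), ("A", 1.0)],
    "B": [("AB", 2.0), ("B", 1.0)],
    "C": [("C", 3.0)],
}


def shared_path(path_i, path_j):
    """Summed length of the leading branches shared by two root-to-tip paths."""
    shared = 0.0
    for (name_i, length), (name_j, _) in zip(path_i, path_j):
        if name_i != name_j:
            break
        shared += length
    return shared


def bm_vcv(paths, sigma2):
    """Brownian motion VCV: sigma2 * d_i on the diagonal, sigma2 * c_ij elsewhere."""
    tips = sorted(paths)
    matrix = [[sigma2 * shared_path(paths[i], paths[j]) for j in tips] for i in tips]
    return tips, matrix


tips, vcv = bm_vcv(paths, sigma2=0.5)
for tip, row in zip(tips, vcv):
    print(tip, row)
```

Tip C shares no history with A or B after the root, so its covariances are zero, while A and B covary through their shared internal branch.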

22 We can fit a Brownian motion model to comparative data using likelihood
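As an illustration of likelihood fitting, the sketch below uses the standard closed-form maximum-likelihood estimates for Brownian motion on a tree: a generalized-least-squares root state and the mean squared deviation weighted by the inverse shared-path matrix. The tree matrix and trait values are made up for the example, and the small linear solver is there only to keep the block dependency-free. It is not the code used for the Anolis analysis.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_bm(shared, traits):
    """ML estimates (z0_hat, sigma2_hat) given the shared-path matrix C."""
    n = len(traits)
    cinv_ones = solve(shared, [1.0] * n)
    cinv_traits = solve(shared, traits)
    z0_hat = sum(cinv_traits) / sum(cinv_ones)       # (1'C^-1 x) / (1'C^-1 1)
    resid = [x - z0_hat for x in traits]
    cinv_resid = solve(shared, resid)
    sigma2_hat = sum(r * w for r, w in zip(resid, cinv_resid)) / n
    return z0_hat, sigma2_hat


# Hypothetical shared-path matrix for tips A, B, C and made-up trait values.
shared = [[3.0, 2.0, 0.0], [2.0, 3.0, 0.0], [0.0, 0.0, 3.0]]
z0_hat, sigma2_hat = fit_bm(shared, [1.0, 1.4, 2.2])
print(z0_hat, sigma2_hat)
```

Note that the root estimate is not the plain average of the tips: closely related tips A and B are down-weighted because they carry partly redundant information.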

23 [Figure: axes $\sigma^2$ (Sigma Squared) and $z_0$ (Theta)]

24 Quantitative genetics. A quantitative genetics model of pure genetic drift also produces Brownian motion. Three parameters: $G$, $N_e$, $z_0$. $\sigma^2 = G/N_e$

25 [Figure: axes $G$ and $N_e$]

26 [Figure: axes $G$ and $N_e$] $\sigma^2 = G/N_e$

27 $\sigma^2 = G/N_e$

28 $\sigma^2 = G/N_e$; $\sigma^2/V_p = h^2/N_e$

29 Across a wide range of taxa, $\sigma^2/V_p$ is about 0.74.

30 Across a wide range of taxa, $\sigma^2/V_p$ is about 0.74. That is, the average Brownian rate parameter is about 0.74 phenotypic standard deviations per million years.

31 Across a wide range of taxa, $\sigma^2/V_p$ is about 0.74. That is, the average Brownian rate parameter is about 0.74 phenotypic standard deviations per million years. That translates to a variance of $1.2 \times 10^{-6}$ phenotypic sd per generation.

32 Across a wide range of taxa, $\sigma^2/V_p$ is about 0.74. That is, the average Brownian rate parameter is about 0.74 phenotypic standard deviations per million years. That translates to a variance of $1.2 \times 10^{-6}$ phenotypic sd per generation. About half of the time, the change from one generation to the next is phenotypic s.d.

33 [Figure: axes $G$ and $N_e$] $\sigma^2 = G/N_e$

34 [Figure: axes $G$ and $N_e$] Evolution is too slow for drift. $\sigma^2 = G/N_e$
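One way to see why the observed rates look too slow for drift is to invert the relation σ²/Vp = h²/Ne for Ne. The sketch below assumes the 1.2 × 10⁻⁶ figure quoted above can be read as σ²/Vp per generation, and the heritability value is a hypothetical placeholder, not a number from the talk.

```python
def implied_ne(h2, rate_over_vp):
    """Effective population size implied by sigma^2 / Vp = h^2 / Ne under pure drift."""
    return h2 / rate_over_vp


h2_assumed = 0.4          # hypothetical heritability, for illustration only
rate_per_generation = 1.2e-6  # sigma^2 / Vp per generation, as read from the slides
ne = implied_ne(h2_assumed, rate_per_generation)
print(round(ne))
```

Under these assumptions pure drift would require an effective population size in the hundreds of thousands, maintained over the whole history of a clade.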

35 What about selection?

36 Brownian motion can also result if selection is random in direction and relatively weak

37

38 BM model: About half of the time, the change from one generation to the next is phenotypic s.d.

39 What happens with OU models?

40 OU Model - single optimum. Three parameters: starting value (μ), rate ($\sigma^2$), and constraint parameter (α). [Figure: tips i and j with shared path length $s_{ij}$; T = total tree depth]
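The formula on this slide did not survive transcription. The sketch below uses the commonly cited covariance for a single-optimum OU process on an ultrametric tree, written with the slide's symbols (shared path s_ij, total depth T). Treat it as an assumption about what the slide showed; the function name is hypothetical.

```python
import math


def ou_cov(sigma2, alpha, s_ij, depth):
    """OU covariance between two tips sharing path length s_ij on a tree of depth T.

    cov = sigma2 / (2 alpha) * exp(-2 alpha (T - s_ij)) * (1 - exp(-2 alpha s_ij))
    """
    decay = math.exp(-2.0 * alpha * (depth - s_ij))
    shared = -math.expm1(-2.0 * alpha * s_ij)  # 1 - exp(-2 alpha s_ij), computed stably
    return sigma2 / (2.0 * alpha) * decay * shared


# As alpha approaches zero the OU covariance approaches the Brownian value sigma2 * s_ij.
print(ou_cov(1.0, 1e-8, 2.0, 3.0))
# With a stronger constraint, covariance from shared history is eroded.
print(ou_cov(1.0, 0.34, 2.0, 3.0))
```

The constraint parameter both caps the tip variance at σ²/(2α) and erases covariance that accumulated early in the tree.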

41 stabilizing selection (co)variance + -

42 $\alpha = 1/(\omega^2 + P)$

43 $\alpha = 1/(\omega^2 + P)$; mean α across clades: 0.34

44 $\alpha = 1/(\omega^2 + P)$; mean α across clades: 0.34. Typical $\omega^2$ = 3-50

45 $\alpha = 1/(\omega^2 + P)$; mean α across clades: 0.34. Typical $\omega^2$ = 3-50. P would have to be negative to get these alpha values

46 $\alpha = 1/(\omega^2 + P)$; mean α across clades: 0.34. Typical $\omega^2$ = 3-50. P would have to be negative to get these alpha values (i.e., stabilizing selection is typically stronger than what OU values suggest)
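The arithmetic behind this slide can be checked directly. Reading the relation as α = 1/(ω² + P), which is the reading under which the slide's conclusion follows from its own numbers, solving for P gives P = 1/α − ω². The function name is illustrative.

```python
def implied_p(alpha, omega2):
    """Phenotypic variance P implied by alpha = 1 / (omega2 + P)."""
    return 1.0 / alpha - omega2


mean_alpha = 0.34                # mean alpha across clades, from the slide
for omega2 in (3.0, 50.0):       # ends of the typical range quoted on the slide
    print(omega2, implied_p(mean_alpha, omega2))
```

Since 1/0.34 is slightly below 3, every ω² in the quoted range forces a negative P, which is impossible for a variance.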

47 What does this mean? Commonly used models can fit comparative data. Simplistic quantitative genetics interpretations of these models are probably not correct.

48 Brownian motion is not drift. Ornstein-Uhlenbeck is not stabilizing selection on a single peak.

49 Hansen-Martins environmental change model. Imagine populations are subject to strong stabilizing selection on optima. But the position of these optima varies according to a Brownian motion model.

50 $\sigma_b^2$ = overall rate of drift; $\sigma_e^2$ = rate of drift of optima (Hansen and Martins 1996)

51 $\sigma_b^2$ = overall rate of drift; $\sigma_e^2$ = rate of drift of optima. When $\sigma_e^2$ is larger than (left term): (Hansen and Martins 1996)

52 $\sigma_b^2$ = overall rate of drift; $\sigma_e^2$ = rate of drift of optima. When $\sigma_e^2$ is larger than (left term): $\sigma_b^2 = \sigma_e^2$ (Hansen and Martins 1996)

53 Main idea: patterns of trait means on trees might reflect the dynamics of the adaptive landscape more than they reflect processes of adaptation within populations

54 Slatkin and Pollack 2005

55 within species Slatkin and Pollack 2005

56 within species among species Slatkin and Pollack 2005

57 Possible model: The location of the optimum changes rapidly from one generation to the next. Globally, optima are constrained to be within a certain range of values.

58 Topics. Fitting models to comparative data: what do we know? Extending the set of models we can fit. The future of comparative methods.

59 Extending Models. Solving likelihoods for new models. Using Bayesian approaches. ABC.

60 Extending Models. Solving likelihoods for new models. Using Bayesian approaches. ABC.

61 Early Burst Model (EB). Rate of evolution slows through time. Highest rate at the root of the tree. Three parameters: starting value (μ), starting rate ($\sigma_0^2$), and rate change (r). [Figure: tips i and j with shared path length $s_{ij}$]
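A short sketch of how the early burst covariance can be computed, assuming the rate changes exponentially with time since the root, σ²(t) = σ₀² exp(r t), with r < 0 for a slowdown. The slide names the parameters but not the functional form, so the exponential form and the function name are assumptions. Integrating the rate along the shared path gives the covariance.

```python
import math


def eb_cov(sigma2_0, r, s_ij):
    """Early burst covariance: integral of sigma2_0 * exp(r t) from 0 to s_ij."""
    if r == 0.0:
        return sigma2_0 * s_ij  # no rate change: plain Brownian motion
    return sigma2_0 * math.expm1(r * s_ij) / r


print(eb_cov(1.0, 0.0, 2.0))   # Brownian case
print(eb_cov(1.0, -0.5, 2.0))  # slowdown: less covariance than Brownian motion
```

Setting r to zero recovers Brownian motion, so the two models are nested and can be compared on the same data.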

62 [Figure: proportion of weight; panels: body size, body shape; groups: squamates, birds, fish, insects, mammals, amphibians; legend: BM, CC, EB, NA]

63 Extending Models. Solving likelihoods for new models. Using Bayesian approaches. ABC.

64 Standard Bayesian MCMC: $\text{acceptance probability} = \text{prior odds} \cdot \text{likelihood ratio}$

65 rjMCMC: reversible-jump MCMC. rjMCMC is an MCMC algorithm that can jump between models of differing complexity.

66 rjMCMC moves. Update parameter values (root state, rate $\sigma_i^2$). Split single rate category into two. Merge two rate categories into one.

67 (α1 α1 α1 α1 α1 α1 α1 α1 α1 α1) k = 1

68 (α1 α1 α1 α1 α1 α1 α1 α1 α1 α1) k = 1 split move (α1 α1 α1 α1 α1 α2 α2 α2 α2 α2) k = 2

69 (α1 α1 α1 α1 α1 α1 α1 α1 α1 α1) k = 1 split move (α1 α1 α1 α1 α1 α2 α2 α2 α2 α2) k = 2 split move (α3 α3 α3 α1 α1 α2 α2 α2 α2 α2) k = 3

70 (α1 α1 α1 α1 α1 α1 α1 α1 α1 α1) k = 1 split move (α1 α1 α1 α1 α1 α2 α2 α2 α2 α2) k = 2 split move (α3 α3 α3 α1 α1 α2 α2 α2 α2 α2) k = 3 merge move (α3 α3 α3 α1 α1 α1 α1 α1 α1 α1) k = 2
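The split and merge sequence on these slides can be reproduced with a toy data structure. The list-of-labels representation and the function names are illustrative assumptions; a real implementation assigns categories to branches of a tree and also computes the acceptance probability for each move.

```python
def split(categories, start, stop, new_label):
    """Split move: give positions start..stop-1 a new rate category."""
    if new_label in categories:
        raise ValueError("new_label must not already be in use")
    return categories[:start] + [new_label] * (stop - start) + categories[stop:]


def merge(categories, absorbed, kept):
    """Merge move: fold the `absorbed` category into the `kept` category."""
    return [kept if label == absorbed else label for label in categories]


def n_categories(categories):
    """Number of distinct rate categories, k."""
    return len(set(categories))


state = ["a1"] * 10                    # k = 1
state = split(state, 5, 10, "a2")      # k = 2
state = split(state, 0, 3, "a3")       # k = 3
state = merge(state, "a2", "a1")       # k = 2
print(state, n_categories(state))
```

Each move changes the number of parameters by one, which is why the sampler needs the extra correction term that distinguishes reversible-jump from standard MCMC.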

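71 Standard Bayesian MCMC: $\text{acceptance probability} = \text{prior odds} \cdot \text{likelihood ratio}$

72 Standard Bayesian MCMC: $\text{acceptance probability} = \text{prior odds} \cdot \text{likelihood ratio}$. Reversible jump MCMC: $\text{acceptance probability} = \text{prior odds} \cdot \text{likelihood ratio} \cdot \text{Hastings ratio}$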

73 AUTEUR ACCOMMODATING UNCERTAINTY IN TRAIT EVOLUTION USING R Jonathan M. Eastman, Michael E. Alfaro, Paul Joyce, Andrew E. Hipp, and Luke J. Harmon

74 AUTEUR Do rates of trait evolution vary across clades in a phylogenetic tree? Are traits in some clades evolving faster than in others?

75

76 Em

77 Em Graptemys Pseudemys

78 Extending Models. Solving likelihoods for new models. Using Bayesian approaches. ABC.

79 Standard Bayes MCMC: $\text{acceptance probability} = \text{prior odds} \cdot \text{likelihood ratio}$

80

81 Prior [Figure: prior density over θ]

82 Prior [Figure: prior density over θ]

83 Prior; compute likelihood of data under model M with θ [Figure: prior density over θ]

84 Prior; compute likelihood of data under model M with θ; likelihood ratio [Figure: prior density over θ]

85 Prior; compute likelihood of data under model M with θ; likelihood ratio; accept proposal with probability h [Figures: prior and posterior densities over θ]

86 Prior; compute likelihood of data under model M with θ; likelihood ratio; accept proposal with probability h [Figures: prior and posterior densities over θ]

87 Prior; compute likelihood of data under model M with θ; likelihood ratio; accept proposal with probability h, otherwise reject [Figures: prior and posterior densities over θ]

88 Prior; compute likelihood of data under model M with θ; likelihood ratio; accept proposal with probability h, otherwise reject [Figures: prior and posterior densities over θ]
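The accept/reject loop sketched on these slides can be written as a small Metropolis sampler. The model below (normal data with unit variance, a wide normal prior on θ, a symmetric random-walk proposal) and all names are assumptions chosen to keep the example short. The acceptance probability h is the prior ratio times the likelihood ratio, capped at one.

```python
import math
import random


def log_prior(theta, prior_sd=10.0):
    """Log density (up to a constant) of a Normal(0, prior_sd^2) prior."""
    return -0.5 * (theta / prior_sd) ** 2


def log_likelihood(theta, data):
    """Log likelihood (up to a constant) of Normal(theta, 1) data."""
    return -0.5 * sum((x - theta) ** 2 for x in data)


def metropolis(data, n_iter, step_sd, rng):
    theta = 0.0
    samples = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step_sd)  # symmetric proposal
        log_h = (log_prior(proposal) - log_prior(theta)) + (
            log_likelihood(proposal, data) - log_likelihood(theta, data)
        )
        if rng.random() < math.exp(min(0.0, log_h)):  # accept with probability h
            theta = proposal                          # otherwise reject and stay put
        samples.append(theta)
    return samples


rng = random.Random(3)
data = [rng.gauss(3.0, 1.0) for _ in range(50)]  # made-up data
samples = metropolis(data, 5000, 0.3, rng)
kept = samples[1000:]                            # discard burn-in
posterior_mean = sum(kept) / len(kept)
data_mean = sum(data) / len(data)
print(posterior_mean, data_mean)
```

Every iteration needs the likelihood at the proposed θ, which is exactly the quantity that becomes unavailable for more complex models.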

89 Standard Bayes MCMC: $\text{acceptance probability} = \text{prior odds} \cdot \text{likelihood ratio}$

90 Standard Bayes MCMC: $\text{acceptance probability} = \text{prior odds} \cdot \text{likelihood ratio}$. But what if we can't compute the likelihood ratio?

91 Standard Bayes MCMC: $\text{acceptance probability} = \text{prior odds} \cdot \text{likelihood ratio}$. But what if we can't compute the likelihood ratio? Use Approximate Bayesian Computation (ABC)

92 ABC

93 ABC: prior [Figure: prior density over θ]

94 ABC: prior [Figure: prior density over θ]

95 ABC: prior; simulate data under model M with θ [Figure: prior density over θ]

96 ABC: prior; simulate data under model M with θ [Figure: prior density over θ]

97 ABC: prior; simulate data under model M with θ [Figure: prior density over θ]

98 ABC: prior; simulate data under model M with θ [Figure: prior density over θ]

99 ABC: prior; simulate data under model M with θ [Figure: prior density over θ]

100 ABC: prior; simulate data under model M with θ [Figure: prior density over θ]

101 ABC: prior; simulate data under model M with θ; posterior [Figures: prior and posterior densities over θ]

102 ABC: prior; simulate data under model M with θ; posterior [Figures: prior and posterior densities over θ]
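A minimal ABC rejection sampler illustrates the idea on these slides: draw θ from the prior, simulate data under the model, and keep θ when a summary of the simulated data is close to the same summary of the observed data. The toy model (independent lineages evolving by Brownian motion), the uniform prior, the tolerance, and all names are assumptions for illustration. This is not the MECCA algorithm.

```python
import math
import random


def simulate_tips(sigma2, n_tips, time, rng):
    """Toy model: independent lineages evolving by Brownian motion from 0."""
    sd = math.sqrt(sigma2 * time)
    return [rng.gauss(0.0, sd) for _ in range(n_tips)]


def summary(tips):
    """Summary statistic: sample variance of the tip values."""
    mean = sum(tips) / len(tips)
    return sum((x - mean) ** 2 for x in tips) / (len(tips) - 1)


def abc_rejection(observed, n_draws, tolerance, time, rng):
    target = summary(observed)
    accepted = []
    for _ in range(n_draws):
        sigma2 = rng.uniform(0.0, 5.0)                      # draw from the prior
        simulated = simulate_tips(sigma2, len(observed), time, rng)
        if abs(summary(simulated) - target) <= tolerance:   # close enough: keep
            accepted.append(sigma2)
    return accepted


rng = random.Random(7)
observed = simulate_tips(1.0, 50, 2.0, rng)  # made-up data, true sigma2 = 1.0
accepted = abc_rejection(observed, 5000, 0.3, 2.0, rng)
posterior_mean = sum(accepted) / len(accepted)
print(len(accepted), posterior_mean)
```

No likelihood is ever evaluated. The accepted draws approximate the posterior, with accuracy governed by the tolerance and by how informative the summary statistic is.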

103 MECCA Simulation Algorithm

104 n

105 1. Draw ancestral character state Θ [Figure: tree with root state Θ; labels n, 3, 5]

106 1. Draw ancestral character state Θ. 2. Simulate characters along branches under BM [Figure: tree with root state Θ; labels n, 3, 5]

107 3. Simulate characters in unresolved clades using a branching diffusion process (birth-death plus Brownian motion)
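Step 3 can be sketched as a forward simulation in which lineages speciate, go extinct, and evolve by Brownian motion between events. The function name, the root state, and the duration are hypothetical; the birth and death rates reuse the λ and μ estimates quoted later in the talk purely as example inputs. Unlike a full implementation, this sketch does not condition on the known species richness of the clade.

```python
import math
import random


def branching_diffusion(root_state, sigma2, birth, death, duration, rng):
    """Simulate tip trait values for a clade under birth-death plus Brownian motion."""
    lineages = [root_state]
    remaining = duration
    while lineages:
        total_rate = (birth + death) * len(lineages)
        wait = rng.expovariate(total_rate) if total_rate > 0 else math.inf
        dt = min(wait, remaining)
        step_sd = math.sqrt(sigma2 * dt)
        # Every surviving lineage diffuses independently until the next event.
        lineages = [z + rng.gauss(0.0, step_sd) for z in lineages]
        if wait >= remaining:
            break                                  # reached the present
        remaining -= wait
        index = rng.randrange(len(lineages))
        if rng.random() < birth / (birth + death):
            lineages.append(lineages[index])       # speciation: daughter inherits state
        else:
            lineages.pop(index)                    # extinction
    return lineages


tips = branching_diffusion(2.0, 0.5, 0.19, 0.10, 10.0, random.Random(5))
print(len(tips), tips)
```

An empty list means the whole clade went extinct before the present, so a caller would typically discard that replicate and simulate again.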

108 ABC summary statistics

109 ABC summary statistics

110 ABC summary statistics a

111 ABC summary statistics: $\sigma^2$, a

112 ABC summary statistics: $\sigma^2$, a

113 [Diagram: λ, μ, θ, a]

114 [Diagram: λ, μ, θ, a]

115 [Diagram: λ, μ, θ, a]

116 [Diagram: λ, μ, θ, a, $\sigma^2$]

117 [Diagram: λ, μ, θ, a, $\sigma^2$]

118 [Diagram: λ, μ, θ, a, $\sigma^2$]

119 [Diagram: λ, μ, θ, a, $\sigma^2$]

120 Diversification rates

121 Diversification rates [Figure: panels for Lambda and Mu; axes: density, index]

122 Diversification rates: λ = 0.19 ( ); μ = 0.10 ( ) [Figure: panels for Lambda and Mu; axes: density, index]

123 Character evolution [Figure: density panels for $\sigma^2$ (sigmasq) and ln θ (root state)]

124 Character evolution: $\sigma^2$ ( ); root state 7.1 kg ( ) [Figure: density panels for $\sigma^2$ (sigmasq) and ln θ (root state)]

125 MECCA: Modeling the Evolution of Continuous Characters using ABC. Graham Slater, Luke Harmon, Paul Joyce, Liam Revell, and Michael Alfaro

126 Topics. Fitting models to comparative data: what do we know? Extending the set of models we can fit. The future of comparative methods.

127 Fitting the data people actually have. Incomplete data where sampling is nonrandom. Combinations of different types of data, like within and among species.

128 Adding to the set of models. Moving beyond Brownian motion. Estimating meaningful parameters. Adding complexity when possible.

129 New Statistical Tools. Dealing with all forms of uncertainty. More flexible Bayesian analyses. ABC.

130 Topics. Fitting models to comparative data: what do we know? Extending the set of models we can fit. The future of comparative methods.


More information

Department of Industrial Engineering

Department of Industrial Engineering Department of Industrial Engineering Master of Engineering Program in Industrial Engineering (International Program) M.Eng. (Industrial Engineering) Plan A Option 2: Total credits required: minimum 39

More information

Quantitative Methods for Finance

Quantitative Methods for Finance Quantitative Methods for Finance Module 1: The Time Value of Money 1 Learning how to interpret interest rates as required rates of return, discount rates, or opportunity costs. 2 Learning how to explain

More information

Is a Brownian motion skew?

Is a Brownian motion skew? Is a Brownian motion skew? Ernesto Mordecki Sesión en honor a Mario Wschebor Universidad de la República, Montevideo, Uruguay XI CLAPEM - November 2009 - Venezuela 1 1 Joint work with Antoine Lejay and

More information

Parametric Models Part I: Maximum Likelihood and Bayesian Density Estimation

Parametric Models Part I: Maximum Likelihood and Bayesian Density Estimation Parametric Models Part I: Maximum Likelihood and Bayesian Density Estimation Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2015 CS 551, Fall 2015

More information

Physics Notes Class 11 CHAPTER 13 KINETIC THEORY

Physics Notes Class 11 CHAPTER 13 KINETIC THEORY 1 P a g e Physics Notes Class 11 CHAPTER 13 KINETIC THEORY Assumptions of Kinetic Theory of Gases 1. Every gas consists of extremely small particles known as molecules. The molecules of a given gas are

More information

Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni

Web-based Supplementary Materials for Bayesian Effect Estimation. Accounting for Adjustment Uncertainty by Chi Wang, Giovanni 1 Web-based Supplementary Materials for Bayesian Effect Estimation Accounting for Adjustment Uncertainty by Chi Wang, Giovanni Parmigiani, and Francesca Dominici In Web Appendix A, we provide detailed

More information

Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris

Class #6: Non-linear classification. ML4Bio 2012 February 17 th, 2012 Quaid Morris Class #6: Non-linear classification ML4Bio 2012 February 17 th, 2012 Quaid Morris 1 Module #: Title of Module 2 Review Overview Linear separability Non-linear classification Linear Support Vector Machines

More information

Network Tomography. Christos Gkantsidis

Network Tomography. Christos Gkantsidis Network Tomography Christos Gkantsidis Introduction Internet AS AS 2 What is the: Bandwidth Loss Rate Connectivity AS 3 of the links of the network? Using only end-to-end measurements. Network Tomography

More information

Maximum likelihood estimation of mean reverting processes

Maximum likelihood estimation of mean reverting processes Maximum likelihood estimation of mean reverting processes José Carlos García Franco Onward, Inc. jcpollo@onwardinc.com Abstract Mean reverting processes are frequently used models in real options. For

More information

Marketing Mix Modelling and Big Data P. M Cain

Marketing Mix Modelling and Big Data P. M Cain 1) Introduction Marketing Mix Modelling and Big Data P. M Cain Big data is generally defined in terms of the volume and variety of structured and unstructured information. Whereas structured data is stored

More information

MAN-BITES-DOG BUSINESS CYCLES ONLINE APPENDIX

MAN-BITES-DOG BUSINESS CYCLES ONLINE APPENDIX MAN-BITES-DOG BUSINESS CYCLES ONLINE APPENDIX KRISTOFFER P. NIMARK The next section derives the equilibrium expressions for the beauty contest model from Section 3 of the main paper. This is followed by

More information

Markov Chain Monte Carlo Simulation Made Simple

Markov Chain Monte Carlo Simulation Made Simple Markov Chain Monte Carlo Simulation Made Simple Alastair Smith Department of Politics New York University April2,2003 1 Markov Chain Monte Carlo (MCMC) simualtion is a powerful technique to perform numerical

More information

EC 6310: Advanced Econometric Theory

EC 6310: Advanced Econometric Theory EC 6310: Advanced Econometric Theory July 2008 Slides for Lecture on Bayesian Computation in the Nonlinear Regression Model Gary Koop, University of Strathclyde 1 Summary Readings: Chapter 5 of textbook.

More information

Bayesian coalescent inference of population size history

Bayesian coalescent inference of population size history Bayesian coalescent inference of population size history Alexei Drummond University of Auckland Workshop on Population and Speciation Genomics, 2016 1st February 2016 1 / 39 BEAST tutorials Population

More information

LECTURES ON REAL OPTIONS: PART II TECHNICAL ANALYSIS

LECTURES ON REAL OPTIONS: PART II TECHNICAL ANALYSIS LECTURES ON REAL OPTIONS: PART II TECHNICAL ANALYSIS Robert S. Pindyck Massachusetts Institute of Technology Cambridge, MA 02142 Robert Pindyck (MIT) LECTURES ON REAL OPTIONS PART II August, 2008 1 / 50

More information

QUALITY ENGINEERING PROGRAM

QUALITY ENGINEERING PROGRAM QUALITY ENGINEERING PROGRAM Production engineering deals with the practical engineering problems that occur in manufacturing planning, manufacturing processes and in the integration of the facilities and

More information

P (x) 0. Discrete random variables Expected value. The expected value, mean or average of a random variable x is: xp (x) = v i P (v i )

P (x) 0. Discrete random variables Expected value. The expected value, mean or average of a random variable x is: xp (x) = v i P (v i ) Discrete random variables Probability mass function Given a discrete random variable X taking values in X = {v 1,..., v m }, its probability mass function P : X [0, 1] is defined as: P (v i ) = Pr[X =

More information

Monte Carlo Methods and Models in Finance and Insurance

Monte Carlo Methods and Models in Finance and Insurance Chapman & Hall/CRC FINANCIAL MATHEMATICS SERIES Monte Carlo Methods and Models in Finance and Insurance Ralf Korn Elke Korn Gerald Kroisandt f r oc) CRC Press \ V^ J Taylor & Francis Croup ^^"^ Boca Raton

More information

Sequence Analysis 15: lecture 5. Substitution matrices Multiple sequence alignment

Sequence Analysis 15: lecture 5. Substitution matrices Multiple sequence alignment Sequence Analysis 15: lecture 5 Substitution matrices Multiple sequence alignment A teacher's dilemma To understand... Multiple sequence alignment Substitution matrices Phylogenetic trees You first need

More information

PS 271B: Quantitative Methods II. Lecture Notes

PS 271B: Quantitative Methods II. Lecture Notes PS 271B: Quantitative Methods II Lecture Notes Langche Zeng zeng@ucsd.edu The Empirical Research Process; Fundamental Methodological Issues 2 Theory; Data; Models/model selection; Estimation; Inference.

More information

Probability Theory. Elementary rules of probability Sum rule. Product rule. p. 23

Probability Theory. Elementary rules of probability Sum rule. Product rule. p. 23 Probability Theory Uncertainty is key concept in machine learning. Probability provides consistent framework for the quantification and manipulation of uncertainty. Probability of an event is the fraction

More information

Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com

Bayesian Machine Learning (ML): Modeling And Inference in Big Data. Zhuhua Cai Google, Rice University caizhua@gmail.com Bayesian Machine Learning (ML): Modeling And Inference in Big Data Zhuhua Cai Google Rice University caizhua@gmail.com 1 Syllabus Bayesian ML Concepts (Today) Bayesian ML on MapReduce (Next morning) Bayesian

More information

Introduction To Genetic Algorithms

Introduction To Genetic Algorithms 1 Introduction To Genetic Algorithms Dr. Rajib Kumar Bhattacharjya Department of Civil Engineering IIT Guwahati Email: rkbc@iitg.ernet.in References 2 D. E. Goldberg, Genetic Algorithm In Search, Optimization

More information

A Basic Introduction to Missing Data

A Basic Introduction to Missing Data John Fox Sociology 740 Winter 2014 Outline Why Missing Data Arise Why Missing Data Arise Global or unit non-response. In a survey, certain respondents may be unreachable or may refuse to participate. Item

More information

The QOOL Algorithm for fast Online Optimization of Multiple Degree of Freedom Robot Locomotion

The QOOL Algorithm for fast Online Optimization of Multiple Degree of Freedom Robot Locomotion The QOOL Algorithm for fast Online Optimization of Multiple Degree of Freedom Robot Locomotion Daniel Marbach January 31th, 2005 Swiss Federal Institute of Technology at Lausanne Daniel.Marbach@epfl.ch

More information

Kinetic Molecular Theory

Kinetic Molecular Theory Kinetic Molecular Theory Particle volume - The volume of an individual gas particle is small compaired to that of its container. Therefore, gas particles are considered to have mass, but no volume. There

More information

Gaussian Conjugate Prior Cheat Sheet

Gaussian Conjugate Prior Cheat Sheet Gaussian Conjugate Prior Cheat Sheet Tom SF Haines 1 Purpose This document contains notes on how to handle the multivariate Gaussian 1 in a Bayesian setting. It focuses on the conjugate prior, its Bayesian

More information

Estimating the evidence for statistical models

Estimating the evidence for statistical models Estimating the evidence for statistical models Nial Friel University College Dublin nial.friel@ucd.ie March, 2011 Introduction Bayesian model choice Given data y and competing models: m 1,..., m l, each

More information

How to Build a Phylogenetic Tree

How to Build a Phylogenetic Tree How to Build a Phylogenetic Tree Phylogenetics tree is a structure in which species are arranged on branches that link them according to their relationship and/or evolutionary descent. A typical rooted

More information

Step 5: Conduct Analysis. The CCA Algorithm

Step 5: Conduct Analysis. The CCA Algorithm Model Parameterization: Step 5: Conduct Analysis P Dropped species with fewer than 5 occurrences P Log-transformed species abundances P Row-normalized species log abundances (chord distance) P Selected

More information

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12

SENSITIVITY ANALYSIS AND INFERENCE. Lecture 12 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Practice Questions 1: Evolution

Practice Questions 1: Evolution Practice Questions 1: Evolution 1. Which concept is best illustrated in the flowchart below? A. natural selection B. genetic manipulation C. dynamic equilibrium D. material cycles 2. The diagram below

More information

Lab 2/Phylogenetics/September 16, 2002 1 PHYLOGENETICS

Lab 2/Phylogenetics/September 16, 2002 1 PHYLOGENETICS Lab 2/Phylogenetics/September 16, 2002 1 Read: Tudge Chapter 2 PHYLOGENETICS Objective of the Lab: To understand how DNA and protein sequence information can be used to make comparisons and assess evolutionary

More information

9.1: Mechanisms of Evolution and Their Effect on Populations pg. 350-359

9.1: Mechanisms of Evolution and Their Effect on Populations pg. 350-359 9.1: Mechanisms of Evolution and Their Effect on Populations pg. 350-359 Key Terms: gene flow, non-random mating, genetic drift, founder effect, bottleneck effect, stabilizing selection, directional selection

More information

1 Phylogenetic History: The Evolution of Marine Mammals

1 Phylogenetic History: The Evolution of Marine Mammals 1 Phylogenetic History: The Evolution of Marine Mammals Think for a moment about marine mammals: seals, walruses, dugongs and whales. Seals and walruses are primarily cold-water species that eat mostly

More information

Chapter 16 How Populations Evolve

Chapter 16 How Populations Evolve Title Chapter 16 How Populations Evolve Copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display. Population Genetics A population is all of the members of a single species

More information