# Bayesian Phylogeny and Measures of Branch Support

## Transcription

1 Bayesian Phylogeny and Measures of Branch Support

2 Bayesian Statistics Imagine we have a bag containing 100 dice, of which we know that 90 are fair and 10 are biased. The unfair dice are strongly biased: [figure: face probabilities of the biased die]. Imagine that you take one die from the bag and throw it twice, obtaining a four and a six. The problem is: what kind of die did you roll?

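3 Bayesian Statistics The likelihood of the observed rolls (a four and a six) under each hypothesis is:

$$L_u = \Pr(\text{four, six} \mid \text{unbiased die}) = \tfrac{1}{6} \times \tfrac{1}{6} = \tfrac{1}{36}$$

$$L_b = \Pr(\text{four, six} \mid \text{biased die}) = \tfrac{4}{21} \times \tfrac{6}{21} = \tfrac{24}{441}$$

Bayesian inferences are based on the posterior probability of a hypothesis:

$$\Pr(\text{biased die} \mid \text{four, six}) = \frac{0.1 \times L_b}{0.1 \times L_b + 0.9 \times L_u} = \frac{32}{179} \approx 0.179$$

This means that our opinion that the die is biased changed from 0.1 to about 0.179 after observing a four and a six.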
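The arithmetic of the dice example can be checked with a few lines of Python. This is only a sketch of the calculation: the priors (0.9 and 0.1) come from the bag of 100 dice and the two likelihoods from the slide, while the variable names are chosen here for illustration.

```python
from fractions import Fraction

# Priors from the bag: 90 fair dice and 10 biased dice out of 100.
prior_fair = Fraction(90, 100)
prior_biased = Fraction(10, 100)

# Likelihoods of rolling a four and a six, as given on the slide.
lik_fair = Fraction(1, 6) * Fraction(1, 6)       # 1/36
lik_biased = Fraction(4, 21) * Fraction(6, 21)   # 24/441

# Bayes' theorem: posterior = prior * likelihood / total probability of the data.
evidence = prior_fair * lik_fair + prior_biased * lik_biased
posterior_biased = prior_biased * lik_biased / evidence

print(posterior_biased, float(posterior_biased))  # 32/179, about 0.179
```

Using exact fractions keeps the result free of rounding error, so the posterior can be read off as 32/179 before converting to a decimal.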

4 Bayes Theorem

5 Bayes Theorem

6 Bayes Theorem Bayesian analysis depends on good priors (both a weakness and a strength of the method).
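For reference, Bayes' theorem in its standard form is given below. The symbols are notation chosen for this note rather than taken from the slides: $H$ is the hypothesis of interest, $D$ the observed data, and the sum runs over all competing hypotheses $H_i$.

$$\Pr(H \mid D) = \frac{\Pr(D \mid H)\,\Pr(H)}{\sum_i \Pr(D \mid H_i)\,\Pr(H_i)}$$

Here $\Pr(H)$ is the prior, $\Pr(D \mid H)$ the likelihood, and $\Pr(H \mid D)$ the posterior probability. In the dice example, $H$ is "the die is biased", $\Pr(H) = 0.1$, and the denominator sums over the biased and unbiased cases.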

7 Likelihood: the probability that a hypothesis would have generated the newly observed data. Ignores pre-existing information. Bayesian: the posterior probability is the probability that a hypothesis is true, given the newly observed data AND existing knowledge. Considers pre-existing information (the prior).

8 How does this relate to phylogenetics?

Likelihood analysis (e.g. PHYML, RAxML)
- Best tree = maximum likelihood tree (ML tree)
- Pool of plausible trees obtained by bootstrapping

Bayesian analysis (e.g. MrBayes)
- Best tree = maximum posterior probability tree (MPP tree)
- Pool of plausible trees obtained by Markov chain Monte Carlo

9 Non-parametric bootstrap

10 Likelihood: (Nonparametric) Bootstrapping Used to generate the pool of plausible trees in ML. Resamples CHARACTERS. Majority-rule consensus tree. A simple way of ascertaining clade support: 70% bootstrap support is strong (rough rule of thumb).
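Resampling characters means drawing alignment columns with replacement until the pseudo-replicate is as long as the original alignment. The sketch below shows only that resampling step; the toy alignment and the function name `bootstrap_alignment` are hypothetical, and the tree search that would be run on every replicate is left out.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one pseudo-replicate by resampling columns with replacement."""
    n_sites = len(next(iter(alignment.values())))
    # Draw n_sites column indices with replacement; the taxa (rows) stay fixed.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        taxon: "".join(seq[i] for i in columns)
        for taxon, seq in alignment.items()
    }

# Hypothetical toy alignment, for illustration only.
alignment = {
    "A": "ATTTGCTC",
    "B": "ATTCGCTC",
    "C": "GTCCGCTT",
    "D": "GTCCACTT",
}

rng = random.Random(1)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_alignment(alignment, rng) for _ in range(100)]
print(replicates[0])
```

Each replicate would then be analysed with the same tree-building method, and the bootstrap support of a clade is the percentage of replicate trees that contain it.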

11 Bayesian: Markov-Chain Monte Carlo Used to generate the pool of plausible trees in Bayesian methods. Resamples PARAMETERS (e.g. branch length, transition/transversion bias, base frequencies).

12 Bayesian: Markov-Chain Monte Carlo

13 Bayesian: Markov-Chain Monte Carlo

14 Bayesian Markov Chain Monte Carlo Initially the likelihoods will increase rapidly (the first random tree will have a low likelihood, which can be improved with random moves). Eventually, the likelihoods will hit a plateau (once sampled trees are very good, most changes will not lead to improved likelihoods and will be rejected).

15 Bayesian Markov Chain Monte Carlo Burn-in: initially the likelihoods will increase rapidly (the first random tree will have a low likelihood, which can be improved with random moves). Stationarity: eventually, the likelihoods will hit a plateau (once sampled trees are very good, most changes will not lead to improved likelihoods and will be rejected).

16 Bayesian Markov Chain Monte Carlo At stationarity, the MCMC method will sample trees in proportion to their posterior probability.

17 Bayesian Markov Chain Monte Carlo At stationarity, the MCMC method will sample trees in proportion to their posterior probability. Out of this pool of trees, one SAMPLED tree topology will be most representative of the clades found in the whole sample: the maximum credibility tree. Often, people instead compute a majority-rule consensus of all sampled trees, which is not the same thing. The distinction is analogous to getting the ML tree versus getting the bootstrap consensus.
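The two summaries can be contrasted on a toy sample. In this sketch each sampled tree is represented simply as a set of clades, and each clade as a frozenset of taxon names; the sample itself is invented. The maximum credibility tree is taken here to be the sampled tree with the highest product of clade frequencies, which is a common definition but an assumption as far as these slides are concerned.

```python
import math
from collections import Counter

def clade_support(sampled_trees):
    """Fraction of sampled trees that contain each clade."""
    counts = Counter()
    for clades in sampled_trees:
        counts.update(clades)
    n = len(sampled_trees)
    return {clade: c / n for clade, c in counts.items()}

# Hypothetical post-burn-in sample of ten rooted four-taxon trees.
AB, CD, ABC, AC, BD = (frozenset(s) for s in ("AB", "CD", "ABC", "AC", "BD"))
sampled_trees = [{AB, CD}] * 6 + [{AB, ABC}] * 3 + [{AC, BD}] * 1

support = clade_support(sampled_trees)

# Majority-rule consensus: keep every clade found in more than half of the trees.
majority_rule = {clade for clade, s in support.items() if s > 0.5}

# Maximum credibility tree: the single SAMPLED tree whose clades score best.
mcc_tree = max(sampled_trees, key=lambda t: math.prod(support[c] for c in t))

print(sorted("".join(sorted(c)) for c in majority_rule))
print(sorted("".join(sorted(c)) for c in mcc_tree))
```

In this small sample both summaries contain the same clades, but in general the consensus can be a tree that was never sampled, whereas the maximum credibility tree is always one of the sampled trees.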

18 Bayesian: Markov-Chain Monte Carlo Used to generate the pool of plausible trees in Bayesian methods. Resamples PARAMETERS (e.g. branch length, transition/transversion bias, base frequencies). Markov chain: trees are sampled one after the other, and the next tree is determined only by the current tree (not earlier ones). Monte Carlo: the next tree is obtained by a random perturbation of parameters.
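A minimal Metropolis sampler illustrates both halves of the name, along with burn-in and sampling in proportion to posterior probability. This is a toy sketch under strong assumptions: the state is just one of three hypothetical topologies with made-up unnormalized posterior weights, whereas a real program such as MrBayes also perturbs branch lengths and substitution-model parameters.

```python
import random
from collections import Counter

# Hypothetical unnormalized posterior weights (prior x likelihood).
weights = {"((A,B),C)": 6.0, "((A,C),B)": 3.0, "((B,C),A)": 1.0}
topologies = list(weights)

def metropolis(n_steps, burn_in, rng):
    """Sample topologies with a symmetric proposal and Metropolis acceptance."""
    current = rng.choice(topologies)  # random starting tree
    samples = []
    for step in range(n_steps):
        # Monte Carlo: propose a random change to the current state.
        proposal = rng.choice([t for t in topologies if t != current])
        # Markov chain: acceptance depends only on the current and proposed states.
        if rng.random() < min(1.0, weights[proposal] / weights[current]):
            current = proposal
        # Discard the burn-in, then record every state visited.
        if step >= burn_in:
            samples.append(current)
    return samples

rng = random.Random(42)
samples = metropolis(n_steps=50_000, burn_in=1_000, rng=rng)
freqs = {t: n / len(samples) for t, n in Counter(samples).items()}
print(freqs)
```

After burn-in, the sampled frequencies should be close to the normalized weights (0.6, 0.3 and 0.1 here), which is what "sampling trees in proportion to their posterior probability" means in practice.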

19 ML versus Bayesian

Likelihood analysis (e.g. PHYML, RAxML)
- Best tree = maximum likelihood tree (ML tree)
- Pool of plausible trees obtained by bootstrapping (perturbs CHARACTERS)

Bayesian analysis (e.g. MrBayes)
- Best tree = maximum posterior probability tree (MPP tree)
- Pool of plausible trees obtained by Markov chain Monte Carlo (perturbs PARAMETERS)

20 ML versus Bayesian

21 Discussion session

22 Process of Phylogenetic Estimation [Flow diagram: Sequence data; MSA; Algorithm (neighbor joining, parsimony, ML, Bayesian); Substitution model (HKY +, JTT, WAG+F, mtrev24); Estimate of phylogeny]

23 Sources of Systematic Error [Diagram: sequence data, substitution model, algorithm, estimate of phylogeny] Alignment: residues are included in the analysis that are not related by substitutions. Countermeasures: carefully examine and edit the MSA; remove regions from the analysis that are likely to be misaligned.

24 Sources of Systematic Error [Diagram: sequence data, substitution model, algorithm, estimate of phylogeny] Model: substitutions may occur very differently from those described by the model used in the phylogenetic analysis. Countermeasures: examine sequences for signs of such model mis-specification, e.g. check that the frequencies of residues are similar in all sequences. If possible, exclude sequences or residues that seem to violate the model. If not possible, interpret the resulting phylogeny critically.
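The residue-frequency check can be sketched as follows. The sequences and the function name `base_frequencies` are hypothetical, and comparing GC content by eye is only a first look; a formal test of compositional homogeneity would be the more rigorous option.

```python
from collections import Counter

def base_frequencies(seq, alphabet="ACGT"):
    """Relative frequency of each base, ignoring gaps and ambiguity codes."""
    counts = Counter(ch for ch in seq.upper() if ch in alphabet)
    total = sum(counts.values())
    return {base: counts[base] / total for base in alphabet}

# Hypothetical aligned sequences, for illustration only.
sequences = {
    "taxon1": "ATGCATGCATGC",
    "taxon2": "ATGCATGGATGC",
    "taxon3": "GGGCGCGCGGCC",
}

freqs = {name: base_frequencies(seq) for name, seq in sequences.items()}
gc_content = {name: f["G"] + f["C"] for name, f in freqs.items()}

for name, gc in gc_content.items():
    print(f"{name}: GC = {gc:.2f}")
```

A sequence whose composition differs sharply from the rest (taxon3 in this toy data) is a candidate for exclusion or, at least, for cautious interpretation of its placement in the tree.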

25 Sources of Systematic Error [Diagram: sequence data, substitution model, algorithm, estimate of phylogeny] Algorithm: incorporates assumptions about sequence evolution that lead to model mis-specification, OR the algorithm fails (e.g. ML gets trapped in local maxima). Countermeasures: compare the results of different algorithms; if they agree, it is less likely that specific algorithms have failed. Run algorithms using different starting conditions (e.g. different initial values for the parameters of the likelihood model).

26 Exam Questions:

- What is the difference between local and global alignment?
- What does the following dot plot depict? What are the differences between sequences A and B?
- Draw a dot plot which has an insertion in sequence A in comparison to sequence B.
- Please write down the following tree topology in NEWICK format.
- Please draw the tree that is given by the following NEWICK format.
- What is the difference between orthologs and paralogs?
- What is the difference between the following two DNA models: HKY and FEL?
- Why can codon models be used to detect selection?
- Are the HKY model and the JC model nested? If yes, how many degrees of freedom should be used for a likelihood ratio test?
- Describe the difference between bootstrap and Bayesian branch support values.
- Please name the steps in the hierarchical structure of de novo sequencing.
