Bayesian Phylogeny and Measures of Branch Support
- Rafe Booker
- 10 years ago
1 Bayesian Phylogeny and Measures of Branch Support
2 Bayesian Statistics: Imagine we have a bag containing 100 dice, of which we know that 90 are fair and 10 are biased. The unfair dice are strongly biased (the likelihood calculation on the next slide uses Pr(4) = 4/21 and Pr(6) = 6/21 for a biased die). Imagine that you take one die from the bag and throw it 2 times, obtaining a four and a six. The problem is: what kind of die did you roll?
3 Bayesian Statistics: The likelihood that this is an unbiased die is L_u = Pr(four, six | unbiased die) = 1/6 × 1/6 = 1/36. The likelihood that this is a biased die is L_b = Pr(four, six | biased die) = 4/21 × 6/21 = 24/441. Bayesian inferences are based on the posterior probability of a hypothesis: Pr(biased | four, six) = L_b × 0.1 / (L_b × 0.1 + L_u × 0.9) ≈ 0.179. This means that our opinion that the die is biased changed from 0.1 to about 0.179 after observing a four and a six.
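The dice calculation can be checked with a few lines of Python. This is a minimal sketch that uses only the numbers given on the slides (90 fair and 10 biased dice, and 4/21 and 6/21 as the biased-die probabilities of a four and a six); the variable names are illustrative.

```python
from fractions import Fraction

# Priors from the bag: 90 fair dice and 10 biased dice out of 100.
prior_fair = Fraction(90, 100)
prior_biased = Fraction(10, 100)

# Likelihood of rolling a four and a six under each hypothesis.
# 4/21 and 6/21 are the biased-die probabilities used on the slide.
lik_fair = Fraction(1, 6) * Fraction(1, 6)
lik_biased = Fraction(4, 21) * Fraction(6, 21)

# Bayes' theorem: posterior = likelihood * prior / total probability of the data.
evidence = lik_fair * prior_fair + lik_biased * prior_biased
posterior_biased = lik_biased * prior_biased / evidence

print(posterior_biased, float(posterior_biased))
```

Exact fractions are used so that the result is not affected by floating-point rounding.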
4 Bayes Theorem
5 Bayes Theorem
6 Bayes Theorem: Bayesian analysis depends on good priors (this is both a weakness and a strength of the method)
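The formulas on the Bayes Theorem slides did not survive transcription. For reference, the standard statement of the theorem is given here, with H for a hypothesis and D for the observed data (these symbols are chosen for illustration and are not taken from the slides):

```latex
\Pr(H \mid D) \;=\; \frac{\Pr(D \mid H)\,\Pr(H)}{\sum_{i} \Pr(D \mid H_i)\,\Pr(H_i)}
```

Here \Pr(H) is the prior, \Pr(D \mid H) is the likelihood, and the denominator sums over all competing hypotheses H_i. In the dice example the hypotheses are "fair" and "biased".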
7 Likelihood: Likelihood is the probability that a hypothesis would have generated the newly observed data. It ignores pre-existing information. Bayesian: The Bayesian posterior probability is the probability that a hypothesis is true, given the newly observed data AND existing knowledge. It considers pre-existing information (the prior).
8 How does this relate to phylogenetics? Likelihood analysis (e.g. PHYML, RAxML) - Best tree = Maximum likelihood tree (ML tree) - Pool of plausible trees obtained by bootstrapping. Bayesian analysis (e.g. MrBayes) - Best tree = Maximum posterior probability tree (MPP tree) - Pool of plausible trees obtained by Markov chain Monte Carlo
9 Non-parametric bootstrap
10 Likelihood: (Nonparametric) Bootstrapping. Used to generate the pool of plausible trees in ML. Resamples CHARACTERS. Majority-rule consensus tree: a simple way of ascertaining clade support. 70% bootstrap support is strong (rough rule of thumb).
11 Bayesian: Markov-Chain Monte Carlo. Used to generate the pool of plausible trees in Bayesian methods. Resamples PARAMETERS (e.g. branch length, transition/transversion bias, base frequencies).
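Resampling characters means drawing alignment columns with replacement until a pseudo-alignment of the original length is obtained. The sketch below shows only this resampling step on a made-up toy alignment; the function name and the sequences are hypothetical, and the tree search that would be run on each replicate (e.g. in PHYML or RAxML) is not shown.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: columns drawn with replacement."""
    n_sites = len(next(iter(alignment.values())))
    # Pick n_sites column indices at random, with replacement.
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    # Apply the same column choice to every sequence so columns stay intact.
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}

# Toy alignment, invented for illustration only.
toy_alignment = {
    "taxonA": "ACGTACGTAC",
    "taxonB": "ACGTTCGTAC",
    "taxonC": "ACGAACGTTC",
}

rng = random.Random(1)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(100)]
print(replicates[0])
```

A tree would then be estimated from each replicate, and the fraction of replicate trees containing a clade is that clade's bootstrap support.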
12 Bayesian: Markov-Chain Monte Carlo
13 Bayesian: Markov-Chain Monte Carlo
14 Bayesian Markov Chain Monte Carlo: Initially the likelihoods will increase rapidly (the first random tree will have a low likelihood, which can be improved with random moves). Eventually, the likelihoods will hit a plateau (once sampled trees are very good, most changes will not lead to improved likelihoods and will be rejected).
15 Bayesian Markov Chain Monte Carlo: Burn-in: initially the likelihoods will increase rapidly (the first random tree will have a low likelihood, which can be improved with random moves). Stationarity: eventually, the likelihoods will hit a plateau (once sampled trees are very good, most changes will not lead to improved likelihoods and will be rejected).
16 Bayesian Markov Chain Monte Carlo At stationarity, the MCMC method will sample trees in proportion to their posterior probability.
17 Bayesian Markov Chain Monte Carlo: At stationarity, the MCMC method will sample trees in proportion to their posterior probability. Out of this pool of trees, one SAMPLED tree topology will be most representative of the clades found in the whole sample: the maximum credibility tree. Often, people instead compute a majority-rule consensus of all sampled trees; the two are not the same. This is analogous to getting the ML tree versus getting the bootstrap consensus.
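The two summaries can be illustrated on a toy posterior sample. This is a sketch under strong simplifying assumptions: each tree is represented only as a set of clades (each clade a set of taxon names), the sample of ten trees is invented, and the "credibility" score is taken to be the product of the clade frequencies, which is one common definition but is not spelled out on the slide.

```python
from collections import Counter
from math import prod

# Clades as frozensets of taxon names (hypothetical four-taxon example).
AB, ABC = frozenset("AB"), frozenset("ABC")
CD, BC = frozenset("CD"), frozenset("BC")

# Three topologies, each described by its non-trivial clades.
t1 = frozenset({AB, ABC})
t2 = frozenset({AB, CD})
t3 = frozenset({BC, ABC})

# Invented post-burn-in sample of ten trees.
sampled_trees = [t1] * 5 + [t2] * 3 + [t3] * 2

def clade_support(trees):
    """Fraction of sampled trees that contain each clade."""
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: n / len(trees) for clade, n in counts.items()}

def majority_rule(support, threshold=0.5):
    """Clades found in more than `threshold` of the sampled trees."""
    return {clade for clade, p in support.items() if p > threshold}

def max_credibility_tree(trees, support):
    """Sampled tree whose clades have the highest product of supports."""
    return max(trees, key=lambda tree: prod(support[c] for c in tree))

support = clade_support(sampled_trees)
consensus = majority_rule(support)
best = max_credibility_tree(sampled_trees, support)
print(support[AB], sorted(map(sorted, consensus)), best == t1)
```

In this toy sample the consensus clades happen to coincide with one sampled tree. In general the majority-rule consensus may be only partly resolved and need not match any tree that was actually sampled, which is why the two summaries are not the same.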
18 Bayesian: Markov-Chain Monte Carlo. Used to generate the pool of plausible trees in Bayesian methods. Resamples PARAMETERS (e.g. branch length, transition/transversion bias, base frequencies). Markov Chain: trees are sampled one after the other; the next tree is determined only by the current tree (not by earlier ones). Monte Carlo: the next tree is obtained by a random perturbation of parameters.
19 ML versus Bayesian: Likelihood analysis (e.g. PHYML, RAxML) - Best tree = Maximum likelihood tree (ML tree) - Pool of plausible trees obtained by bootstrapping (perturbs CHARACTERS). Bayesian analysis (e.g. MrBayes) - Best tree = Maximum posterior probability tree (MPP tree) - Pool of plausible trees obtained by Markov chain Monte Carlo (perturbs PARAMETERS)
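A toy Metropolis sampler makes the "Markov chain" and "Monte Carlo" parts concrete. This is only a sketch: the three topologies and their unnormalised posterior weights are invented, and a real program such as MrBayes also perturbs branch lengths and model parameters rather than choosing among three fixed trees.

```python
import random
from collections import Counter

# Hypothetical unnormalised posterior weights (likelihood x prior).
weights = {"((A,B),C)": 6.0, "((A,C),B)": 3.0, "((B,C),A)": 1.0}
topologies = list(weights)

def metropolis(n_steps, burn_in, rng):
    """Sample topologies; the next state depends only on the current one."""
    current = rng.choice(topologies)  # random starting tree
    samples = []
    for step in range(n_steps):
        # Monte Carlo: propose a random change (symmetric proposal).
        proposal = rng.choice([t for t in topologies if t != current])
        # Accept with probability min(1, posterior ratio).
        if rng.random() < min(1.0, weights[proposal] / weights[current]):
            current = proposal
        # Discard the burn-in, keep everything after it.
        if step >= burn_in:
            samples.append(current)
    return samples

samples = metropolis(n_steps=60000, burn_in=1000, rng=random.Random(42))
freqs = {t: n / len(samples) for t, n in Counter(samples).items()}
for t in topologies:
    print(t, round(freqs[t], 3), weights[t] / sum(weights.values()))
```

After burn-in, the sampled frequencies should be close to the normalised weights (0.6, 0.3 and 0.1 here), which is what "sampling trees in proportion to their posterior probability" means.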
20 ML versus Bayesian
21 Discussion session
22 Process of Phylogenetic Estimation: Sequence data -> MSA -> algorithm (neighbor joining, parsimony, ML, Bayesian) with a substitution model (e.g. HKY, JTT, WAG+F, mtREV24) -> estimate of phylogeny
23 Sources of Systematic Error (sequence data -> substitution model -> algorithm -> estimate of phylogeny). Alignment: residues are included in the analysis that are not related by substitutions. Countermeasures: carefully examine and edit the MSA; remove regions from the analysis that are likely to be misaligned.
24 Sources of Systematic Error (sequence data -> substitution model -> algorithm -> estimate of phylogeny). Model: substitutions may occur very differently from those described by the model used in the phylogenetic analysis. Countermeasures: examine sequences for signs of such model mis-specification, e.g. check that the frequencies of residues are similar in all sequences. If possible, exclude sequences/residues that seem to violate the model. If not possible, interpret the resulting phylogeny critically.
25 Sources of Systematic Error (sequence data -> substitution model -> algorithm -> estimate of phylogeny). Algorithm: incorporates assumptions about sequence evolution that lead to model mis-specification, OR the algorithm fails (e.g. ML gets trapped in local maxima). Countermeasures: compare the results of different algorithms; if they agree, it is less likely that specific algorithms have failed. Run algorithms using different starting conditions (e.g. different initial values for the parameters of the likelihood model).
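The residue-frequency check mentioned above can be sketched as follows. This is a rough illustration, not a formal test: the sequences are invented, the function names are hypothetical, and the 0.15 cut-off is an arbitrary assumption. A proper analysis would use a statistical test of compositional homogeneity.

```python
from collections import Counter

ALPHABET = "ACGT"

def base_frequencies(seq):
    """Relative frequency of each base, ignoring gaps and ambiguity codes."""
    counts = Counter(ch for ch in seq.upper() if ch in ALPHABET)
    total = sum(counts.values())
    return {ch: counts[ch] / total for ch in ALPHABET}

def composition_outliers(alignment, tolerance=0.15):
    """Names of sequences whose composition deviates from the mean."""
    freqs = {name: base_frequencies(seq) for name, seq in alignment.items()}
    mean = {ch: sum(f[ch] for f in freqs.values()) / len(freqs) for ch in ALPHABET}
    return [
        name
        for name, f in freqs.items()
        if max(abs(f[ch] - mean[ch]) for ch in ALPHABET) > tolerance
    ]

# Toy data: taxonC is deliberately GC-rich.
toy_alignment = {
    "taxonA": "ACGTACGTACGT",
    "taxonB": "ACGTACGAACGT",
    "taxonC": "GGGCGCGCGGCC",
    "taxonD": "ACGTTGCAACGT",
}

outliers = composition_outliers(toy_alignment)
print(outliers)
```

Sequences flagged this way are candidates for exclusion, or at least a reason to interpret their placement in the tree with caution.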
26 Exam Questions: What is the difference between local and global alignment? What does the following dotplot depict? What are the differences between sequence A and B? Draw a dot plot which has an insertion in sequence A in comparison to sequence B. Please write down the following tree topology in NEWICK format. Please draw the tree that is given by the following NEWICK format. What is the difference between orthologs and paralogs? What is the difference between the following two DNA models: HKY and FEL? Why can codon models be used to detect selection? Are the HKY model and the JC model nested? If yes, how many degrees of freedom should be used for a likelihood ratio test? Describe the difference between bootstrap and Bayesian branch support values. Please name the steps in the hierarchical structure of de novo sequencing.
