Review of Hypothesis Testing
- Justin Oliver
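1 Review of Hypothesis Testing. Classic two-sample problem: $X_1, X_2, \ldots, X_m \sim F$ and $Y_1, Y_2, \ldots, Y_n \sim G$, with $H_0: F = G$. E.g., $F$ and $G$ are both Gaussian, possibly with different means. The hypothesis testing framework: we compute a test statistic $\hat\theta$ that takes on a larger (or more extreme) value if $H_0$ is not true. Assume that the null distribution of $\hat\theta$ is known. Compute $P_0(\hat\theta^* > \hat\theta)$, where $\hat\theta^*$ is a draw of the statistic under $H_0$ and $\hat\theta$ is the observed value; we call this the achieved significance level (ASL).
2 Review of Hypothesis Testing. Achieved significance level: $P_0(\hat\theta^* > \hat\theta)$, the null probability that a draw $\hat\theta^*$ exceeds the observed $\hat\theta$. ASL small $\Rightarrow$ strong evidence against $H_0$ $\Rightarrow$ conclude $H_0$ is not true. This framework is asymmetric: if $H_0$ and $H_A$ are equally likely, then accept $H_0$. Evidence against $H_0$ does not count as evidence for $H_A$.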
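3 Review of Hypothesis Testing. $X_1, X_2, \ldots, X_m \sim N(\mu_T, \sigma^2)$ and $Y_1, Y_2, \ldots, Y_n \sim N(\mu_C, \sigma^2)$, with $H_0: \mu_T = \mu_C$. Take $\hat\theta = \bar X - \bar Y$. Then under $H_0$, $\hat\theta \sim N(0, \sigma^2(1/n + 1/m))$, so $\mathrm{ASL} = P\left(Z > \frac{\hat\theta}{\sigma\sqrt{1/n + 1/m}}\right)$, where $Z \sim N(0, 1)$. If $\sigma^2$ is estimated from the data by $\hat\sigma^2 = \frac{\sum_i (X_i - \bar X)^2 + \sum_i (Y_i - \bar Y)^2}{n + m - 2}$, then use the $t_{n+m-2}$ distribution instead.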
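The Gaussian case above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names `asl_z_test` and `pooled_variance` are made up for this example, and the ASL is computed one-sided, matching the $P(Z > \cdot)$ form on the slide. The $t_{n+m-2}$ tail probability is not computed here because the Python standard library has no t-distribution CDF.

```python
import math
from statistics import NormalDist, fmean


def asl_z_test(x, y, sigma):
    """One-sided ASL for H0: mu_T = mu_C when the common sigma is known.

    x is the treatment sample (size m), y the control sample (size n).
    """
    m, n = len(x), len(y)
    theta_hat = fmean(x) - fmean(y)              # observed statistic X-bar minus Y-bar
    se = sigma * math.sqrt(1.0 / n + 1.0 / m)    # null standard deviation of theta_hat
    return 1.0 - NormalDist().cdf(theta_hat / se)


def pooled_variance(x, y):
    """Pooled estimate of sigma^2 with n + m - 2 degrees of freedom."""
    xbar, ybar = fmean(x), fmean(y)
    ss = sum((xi - xbar) ** 2 for xi in x) + sum((yi - ybar) ** 2 for yi in y)
    return ss / (len(x) + len(y) - 2)
```

With an estimated variance, the standardized statistic would be referred to a $t_{n+m-2}$ table or a statistics library rather than to `NormalDist`.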
4 Where do the Bootstrap and permutations come in? The main practical difficulty with hypothesis tests comes in estimating the ASL. Often, the null distribution is not known because $F$, $G$ and/or the statistic $\hat\theta$ are not as nice. Thus, instead of using theoretical distributions, we estimate the ASL by Monte Carlo sampling from the Bootstrap or permutation distribution.
5 Fisher's permutation test. Combine the $n + m$ measurements and draw samples of size $n$ from the pool, each with probability $1/\binom{n+m}{n}$. Permutation Lemma: under $H_0: F = G$, each set of $n$ observations has the above probability of appearing in the treatment set. $\mathrm{ASL}_{\mathrm{perm}} = \#\{\hat\theta^* > \hat\theta\} / \binom{n+m}{n}$, where $\hat\theta^*$ is the statistic recomputed on a reassigned sample. This is different from the Bootstrap: it is purely based on random assignment, and there is no empirical distribution $\hat F$ at play. The permutation method works for any statistic $\hat\theta$.
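As a rough sketch of the permutation ASL, the code below enumerates every way of choosing the treatment set from the pooled data. The names `mean_diff` and `asl_perm_exact` are hypothetical, the difference in means is only one possible choice of statistic, and the strict inequality follows the slide (some texts count ties as well). Full enumeration is only feasible for small samples; for larger ones the same count would be estimated from randomly drawn assignments.

```python
from itertools import combinations
from statistics import fmean


def mean_diff(treat, control):
    """Example statistic: difference in sample means."""
    return fmean(treat) - fmean(control)


def asl_perm_exact(treat, control, stat=mean_diff):
    """Exact permutation ASL: fraction of treatment assignments whose
    statistic strictly exceeds the observed one."""
    pool = list(treat) + list(control)
    n = len(treat)
    observed = stat(treat, control)
    indices = range(len(pool))
    exceed = total = 0
    for chosen in combinations(indices, n):      # every possible treatment set
        chosen_set = set(chosen)
        t = [pool[i] for i in chosen]
        c = [pool[i] for i in indices if i not in chosen_set]
        total += 1
        if stat(t, c) > observed:
            exceed += 1
    return exceed / total
```

Passing a different `stat` function (for example a difference in medians) reuses the same loop, which reflects the point that the permutation method works for any statistic.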
6 Bootstrapping the two-sample problem. 1. Draw samples of size $n + m$ from the combined pool with replacement. 2. Assign the first $m$ to control and the next $n$ to treatment: $X_1^*, \ldots, X_m^*, Y_1^*, \ldots, Y_n^*$. Compute $\hat\theta^* = \hat\theta(X^*, Y^*)$. Do this $B$ times. 3. $\mathrm{ASL}_{\mathrm{boot}} = \#\{\hat\theta^* > \hat\theta_{\mathrm{obs}}\} / B$. For the two-sample problem, the only difference from permutations is that samples are drawn with replacement. Not surprisingly, this gives results similar to those of the permutation test.
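The three steps above might look as follows in Python. This is an illustrative sketch only: `asl_boot_two_sample` is a made-up name, the statistic is assumed to be the treatment mean minus the control mean, and the defaults `B=2000` and `seed=0` are arbitrary choices made so that the example is reproducible.

```python
import random
from statistics import fmean


def asl_boot_two_sample(treat, control, B=2000, seed=0):
    """Bootstrap ASL for the two-sample problem using the pooled data."""
    rng = random.Random(seed)
    pool = list(control) + list(treat)
    m, n = len(control), len(treat)
    observed = fmean(treat) - fmean(control)
    exceed = 0
    for _ in range(B):
        draw = rng.choices(pool, k=m + n)        # step 1: sample with replacement
        boot_control = draw[:m]                  # step 2: first m go to control,
        boot_treat = draw[m:]                    #         next n go to treatment
        if fmean(boot_treat) - fmean(boot_control) > observed:
            exceed += 1
    return exceed / B                            # step 3: fraction exceeding observed
```

Replacing `rng.choices(pool, k=m + n)` with a shuffle of `pool` would turn this into a Monte Carlo permutation test, which is the "only difference" the slide refers to.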
7 When there is nothing to permute... Consider the one-sample problem: $X_1, \ldots, X_n$, $H_0: E[X_i] = \mu_0$. There is nothing to permute for a permutation test. On the other hand, we can Bootstrap from an estimated null distribution $\hat F_0$. Note: $\hat F_0 \neq \hat F_n$! Simple trick: assume $\hat F_n$ has the correct shape and variance but just not the correct mean. Let $\tilde X_i = X_i - \bar X + \mu_0$. Bootstrap from the empirical distribution of $\{\tilde X_i : i = 1, \ldots, n\}$. The key in using the Bootstrap for hypothesis testing is to identify the null distribution and to construct it from the observed data.
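A minimal sketch of the shifting trick, under assumptions that are not in the slides: the test statistic is taken to be $\bar X - \mu_0$, the test is one-sided, and the function name `asl_boot_one_sample` and its defaults are invented for the example.

```python
import random
from statistics import fmean


def asl_boot_one_sample(x, mu0, B=2000, seed=0):
    """Bootstrap ASL for H0: E[X] = mu0 using the recentred sample."""
    rng = random.Random(seed)
    xbar = fmean(x)
    # Recentre so the empirical distribution has mean mu0 (the estimated null).
    shifted = [xi - xbar + mu0 for xi in x]
    observed = xbar - mu0                        # assumed statistic: X-bar minus mu0
    exceed = 0
    for _ in range(B):
        resample = rng.choices(shifted, k=len(x))
        if fmean(resample) - mu0 > observed:
            exceed += 1
    return exceed / B
```

Resampling from `x` directly, rather than from `shifted`, would sample from $\hat F_n$ instead of $\hat F_0$ and would not give a null distribution.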
8 Bootstrap phylogeny. References: Felsenstein, 1985; Hillis & Bull, 1993; Efron et al., 1996.
9 Phylogeny: the evolutionary history of species. Many different approaches for phylogenetic inference: distance-based, maximum parsimony, maximum likelihood, Bayesian. Gen245, Spring 2009
10 Data: aligned sequences $x_i$. Tree-building programs, e.g. PHYLIP: $t = T(x_1, x_2, \ldots, x_k)$.
11 Bootstrap confidence level. What does confidence mean? Real world: sequence multiple regions of the same length and build a phylogeny on each region. Bootstrap world: resample (with replacement) nucleotide positions, $t^* = T(x_1^*, x_2^*, \ldots, x_k^*)$. Confidence = proportion of the $t^*$ that include a particular clade.
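The resampling of nucleotide positions can be illustrated with a toy example. Everything here is a stand-in rather than the method used in the lecture: the tree builder `closest_pair` handles only three taxa, where the pair with the smallest Hamming distance is the one a UPGMA-style method would join first, and `bootstrap_clade_support` is a hypothetical name. A real analysis would call a tree-building program in place of `closest_pair`.

```python
import random
from itertools import combinations


def closest_pair(seqs):
    """Toy tree builder for three taxa: return the pair of names with the
    smallest Hamming distance, i.e. the clade joined first."""
    def hamming(a, b):
        return sum(ca != cb for ca, cb in zip(seqs[a], seqs[b]))
    return min(combinations(sorted(seqs), 2), key=lambda pair: hamming(*pair))


def bootstrap_clade_support(seqs, clade, B=500, seed=0):
    """Proportion of bootstrap replicates in which `clade` is recovered."""
    rng = random.Random(seed)
    names = sorted(seqs)
    length = len(seqs[names[0]])
    hits = 0
    for _ in range(B):
        # Resample alignment columns (nucleotide positions) with replacement,
        # using the same columns for every sequence.
        cols = rng.choices(range(length), k=length)
        resampled = {name: "".join(seqs[name][j] for j in cols) for name in names}
        if set(closest_pair(resampled)) == set(clade):
            hits += 1
    return hits / B
```

The important detail is that whole columns are resampled, so each replicate remains a valid alignment of the same length as the original.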
12 Bootstrap example. Aligned sequences $x_i$. Tree-building programs, e.g. PHYLIP: $t = T(x_1, x_2, \ldots, x_k)$.
13 The controversy. An empirical test (Hillis & Bull, 1993): simulate a phylogeny; simulate sequence data; compute the bootstrap confidence level.
14 The controversy. An empirical test (Hillis & Bull, 1993): simulate a phylogeny ("tree A"); simulate sequence data (multiple real data); compute the bootstrap confidence level (for each data set). Is the bootstrap confidence level accurate?
15 What went wrong? Key idea of the bootstrap: $D(\hat\theta^* - \hat\theta) \approx D(\hat\theta - \theta)$. Not true: $D(\hat\theta^*) \approx D(\hat\theta)$. Ref: Efron, Halloran & Holmes, 1996.
19 How should the p-values of fraction overlap be computed? (1) Uniformly resample the start positions of one feature, keeping the positions of the other feature fixed. (2) Treat the features as 0/1 vectors and apply the classic Bootstrap. What is wrong with these approaches?
20 Simulation Study