Statistics and the Search for Scientific Truth




Statistics and the Search for Scientific Truth
Martin Hazelton (presenter: m.hazelton@massey.ac.nz)
Institute of Fundamental Sciences, Massey University
11 November 2015
U3A, November 2015

Science in a World of Infinite Variety
The world is full of variation. A lot of science is about explaining variations and differences.

Designed Experiments
Experiments provide a powerful approach to explaining differences. The goal is to relate variations in a response to changes in experimental factors.

Statistical Analysis of Experimental Data
Typically there are multiple sources of variation beyond the experimental factors. The result is noisy experimental data, which can make it difficult to attribute changes in the response to the treatment. Statistics provides the tools to objectively assess the evidence for an association between treatment and response. It makes use of probability models to represent alternative sources of variation.

The Lady Tasting Tea
Muriel Bristol ("The Lady") was a scientist working at Rothamsted Experimental Station in 1919. She claimed to be able to tell whether the milk or the tea was poured into a cup first. Her co-worker Ronald Fisher devised an experiment to test this.

The Lady Tasting Tea: The Experiment
Eight cups of tea were prepared: 4 had milk poured first, then tea, and 4 had tea poured first, then milk. The cups were presented to The Lady in random order, and she had to identify each one as milk first or tea first.

The Lady Tasting Tea: Analysis of Results
The Lady correctly identified the milk/tea pouring order for all 8 cups. Does this provide evidence of her discernment, or could it have been lucky guesswork? We can assess the evidence using probability.

The Lady Tasting Tea: A Simplified Case
Suppose there had been just two cups of tea, one with milk first and one with tea first. The cups are presented in random order, so there are two equally likely possibilities:
1. M T
2. T M
There is a 50% chance of selecting the correct one by chance. Even if the Lady did identify the correct order, we could not distinguish between discernment and guesswork.

The Lady Tasting Tea: Possibilities from Full Experiment
MMMMTTTT MMMTMTTT MMMTTTTM MMMTTTMT MMMTTMTT MMTMTTTM MMTMTTMT MMTMTMTT MMTMMTTT MMTTTTMM
MMTTTMTM MMTTTMMT MMTTMTTM MMTTMTMT MMTTMMTT MTMTMMTT MTMTMTMT MTMTMTTM MTMTTMMT MTMTTMTM
MTMTTTMM MTMMTMTT MTMMTTMT MTMMTTTM MTMMMTTT MTTMMTTM MTTMMTMT MTTMMMTT MTTMTMMT MTTMTMTM
MTTMTTMM MTTTMMMT MTTTMMTM MTTTMTMM MTTTTMMM TMMTMMTT TMMTMTMT TMMTMTTM TMMTTMMT TMMTTMTM
TMMTTTMM TMMMTTMT TMMMTTTM TMMMTMTT TMMMMTTT TMTTTMMM TMTTMMTM TMTTMMMT TMTTMTMM TMTMTTMM
TMTMTMTM TMTMTMMT TMTMMTTM TMTMMTMT TMTMMMTT TTMMMMTT TTMMMTMT TTMMMTTM TTMMTMMT TTMMTMTM
TTMMTTMM TTMTMMMT TTMTMMTM TTMTMTMM TTMTTMMM TTTMMMTM TTTMMMMT TTTMMTMM TTTMTMMM TTTTMMMM


The Lady Tasting Tea: Conclusions
Recall that the Lady correctly identified the milk/tea pouring order for all 8 cups. There are two possible explanations:
1. The Lady truly can detect whether tea or milk was poured first.
2. The Lady made a very lucky guess, which has probability 1/70 ≈ 1.4%.
Since so lucky a guess is rather improbable, explanation 1 is plausible.
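The 1/70 figure can be checked by listing every way of placing 4 milk-first cups among 8 positions. The following is a minimal Python sketch, not part of the original talk; the function name `arrangements` and the variable names are illustrative assumptions.

```python
from itertools import combinations
from math import comb


def arrangements(n_cups=8, n_milk_first=4):
    """List every ordering of the cups as a string of M (milk first) and T (tea first)."""
    orders = []
    for milk_positions in combinations(range(n_cups), n_milk_first):
        orders.append(
            "".join("M" if i in milk_positions else "T" for i in range(n_cups))
        )
    return orders


all_orders = arrangements()

# Under pure guesswork every ordering is equally likely,
# so exactly one of them matches the true pouring order.
p_lucky_guess = 1 / len(all_orders)

print(len(all_orders), comb(8, 4))       # both counts should agree
print(f"{100 * p_lucky_guess:.1f}%")     # chance of a fully correct guess
```

The count from the enumeration agrees with the binomial coefficient "8 choose 4", which is why the chance of a fully correct guess is 1 in 70.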

Probability
Valid statistical analysis of experimental data requires probability models. Unfortunately, humans don't seem to be intuitively good at probability.

The Monty Hall Problem: Probability Challenge #1
Based on a U.S. game show. The contestant must choose one of three doors. Behind two are goats; behind the other is a car. After the initial choice, the game show host identifies one of the other doors as hiding a goat. The contestant may either stick with the original choice or switch to the other unopened door.

The Monty Hall Problem: Example
The contestant picks door 1. The host indicates that door 3 hides a goat. The contestant can now stick with door 1, or change to door 2. What is the best strategy? What is the probability of winning the car if the contestant sticks with door 1?

The Monty Hall Problem: Explanation
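The transcript does not preserve the explanation given on this slide. As a stand-in, here is a small simulation sketch in Python. It assumes the standard version of the game: the host knows where the car is, always reveals a goat behind one of the two doors the contestant did not pick, and chooses at random when both of those doors hide goats. The function names, trial count and seed are illustrative choices.

```python
import random


def play(switch, rng):
    """Play one game; return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Assumption: the host always opens a goat door that is not the contestant's pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither the first pick nor the opened door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=2015):
    """Estimate the winning probability of a strategy by repeated play."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials


stick_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stick:  {stick_rate:.3f}")
print(f"switch: {switch_rate:.3f}")
```

Under these assumptions the sticking strategy wins about one game in three and the switching strategy about two in three: sticking wins only when the first pick was already the car.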

Patterns in Coin Tossing Data: The Audience Gets Another Go
Pattern 1: H
Pattern 2: T

Patterns in Coin Tossing Data: The Audience Gets Another Go
Pattern 1: HTH
Pattern 2: HTT

Patterns in Coin Tossing Data: The Audience Gets Another Go
Repeated Pattern 1: HTHTH
Repeated Pattern 2: HTTHTT
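The transcript does not record the question put to the audience about these patterns. One classic question, assumed here purely for illustration, is how many tosses of a fair coin are needed on average before a pattern first appears. A known result for a fair coin is that this expected waiting time is the sum of 2^k over every length k at which the pattern's first k symbols equal its last k symbols. The Python sketch below applies that rule; the function name `expected_wait` is an illustrative choice.

```python
def expected_wait(pattern):
    """Expected number of fair-coin tosses until `pattern` first appears.

    Uses the overlap rule: add 2**k for every k such that the first k
    symbols of the pattern equal its last k symbols.
    """
    n = len(pattern)
    return sum(2 ** k for k in range(1, n + 1) if pattern[:k] == pattern[-k:])


for pattern in ["H", "T", "HTH", "HTT"]:
    print(pattern, expected_wait(pattern))
```

Under this assumption HTH takes 10 tosses on average and HTT only 8, even though both patterns are equally likely in any given three tosses. The difference comes from self-overlap: HTH can begin again from its own final H (as the repeated form HTHTH shows), whereas HTT cannot overlap with itself.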

Probability Models for Genetic Data
... can be really tricky.

Mistakes in the Statistical Analysis of Scientific Data
Mistakes in the statistical analysis of scientific data are not uncommon. These can occur through incorrect probability calculations, and also when simply processing and manipulating the raw data. Modern datasets can be of huge size and complexity; genetic data is a case in point. Such datasets require careful computer processing, and a single error in a line of code can have profound effects.

Lies, Damn Lies and Statistics: The Problem of Scientific Fraud
The complexity and scale of the analysis of many modern datasets provide plenty of opportunity for well-disguised scientific fraud.

The Duke University Cancer Research Scandal
Advances in human genetics have the potential to revolutionize medicine. A major long-term goal is the development of effective personalized treatments for cancer. Progress had been slower than hoped, but a major breakthrough was announced in 2006 by researchers at Duke University.

The Duke University Cancer Research Scandal: Claimed Research Breakthrough
The Duke researchers examined the drug sensitivity of a standard panel of cell lines derived from 9 types of human tumours. They then looked at the genetic profiles of the most resistant and most susceptible cell lines using microarrays. In theory, these results provide a means of selecting the best chemotherapy treatment based on the genetic profile of a patient's tumour.

The Duke University Cancer Research Scandal: Fame and Fortune (Science Style)
The research was widely regarded as a remarkable breakthrough. The paper by the Duke research team that first described the results was hailed as one of the top publications of 2006. Subsequent papers appeared in several leading journals. The researcher primarily responsible for the work was Anil Potti. He became something of a poster boy for Duke University.

The Duke University Cancer Research Scandal: Houston, We Have a Problem
There was great interest in Potti's work from research groups worldwide. One such group was at the famous M. D. Anderson Cancer Center in Houston, Texas.

The Duke University Cancer Research Scandal: Statisticians Take Centre Stage
Keith Baggerly and Kevin Coombes were statisticians working at the M. D. Anderson Cancer Center. At the request of medical research colleagues, Baggerly and Coombes sought to learn more about the remarkable findings of Potti and colleagues. They requested the data and the computer code for the statistical analysis from Duke University.

The Duke University Cancer Research Scandal: Irreproducibility
Baggerly and Coombes' first step was to try to reproduce the data analysis done at Duke. They ran into problems almost straight away. There were issues with data integrity: some nominally resistant cell lines appeared more sensitive than some sensitive ones, and vice versa. Data were also mislabelled in other ways. Some of the data processing had been done by hand, producing results that Baggerly and Coombes could not replicate. Based on their best understanding of the data and the labelling techniques employed, Baggerly and Coombes found that the experimental results for genetic effects were no better than pure chance.

The Duke University Cancer Research Scandal: Is Anybody Listening?
Baggerly and Coombes raised their concerns through a variety of avenues. Duke University conducted a perfunctory internal review, which found no significant problems with Potti's work. Potti and colleagues were allowed to continue their research. By 2007, 109 patients were enrolled in a clinical trial based on Potti's findings. Baggerly and Coombes also sent letters to journals that had published Potti's work, but with limited success.

The Duke University Cancer Research Scandal: Can You Hear Us Now?
Eventually Baggerly and Coombes resorted to publishing their own interpretation of the Duke data in a top statistics journal.

The Duke University Cancer Research Scandal: Denouement
The evidence of scientific fraud became overwhelming:
1. Baggerly and Coombes' research paper.
2. Several international research groups had tried and failed to replicate Potti's findings.
3. 33 senior statisticians, epidemiologists and medical researchers wrote to the senior management of Duke University to express concerns.
4. Paul Goldberg, editor of The Cancer Letter, found that Anil Potti had lied on his curriculum vitae.
Anil Potti eventually admitted to problems with the data, and subsequently resigned. Almost all the published papers on Potti's research were retracted.

Not the End of the Story (Unfortunately)
Grant Steen showed that over the period 2000 to 2010, about 80,000 patients had participated in clinical trials based on research that was incorrect and for which the papers were retracted.
Steen, R. G. (2011). Retractions in the medical literature: how many patients are put at risk by flawed research? Journal of Medical Ethics, jme-2011.

Hope for the Future
There is growing recognition in the scientific community of the pivotal importance of reproducibility in data processing and analysis. Increasingly, journals will only publish work for which the data and computer code are supplied. Statisticians play a central role in the search for scientific truth.

Thank You for Listening