# Permutation Test & Monte Carlo Sampling

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Permutation Test & Monte Carlo Sampling by MA, Jianqiang March 18th, 2009

2 Outline Introduction to Permutation Test Permutation Test in Linguistics : Measuring Syntactic Differences Brief View of Monte Carlo Method Monte Carlo in Linguistics: An Simple Example Conclusion

3 Hypothesis Test Define H0,H1.. Choose Test (t, Z, F, etc) then we know test statistic distribution under H0 Compute Test Statistic Make Statistical Decision by looking the observed statistic in the distribution P-value: that probability that we would observe a statistic value as extreme or more extreme than the one we did observe

4 Assumption for a z-test, t-test or F-test When conducting a z-test or a t-test, we are actually assuming that the data (or the random errors) follow a normal distribution. Based on this assumption, we know the distribution of the test statistic (T.S.) under the null hypothesis. Based on the distribution (z-distribution, t-distribution or F-distribution), we get a p-value for each observed T.S.. This can be referred to as parametric approaches.

5 What if the distributional assumption does not hold? If the normal assumption does not hold for the data and the sample size is small, the results of z-test, t- or F-test are not reliable. What can we do? 1) Transformation of data to make the data normal 2) Choose some tests that do not make such distributional assumptions nonparametric approaches

6 Permutation Test Permutation Test (randomization tests) can be used without the normal assumption for the distribution of data. Permutation Test is a resampling test (like bootstrapping) Permutation Test is an Exact Test Monte Carlo Sampling: makes testing on large data possible

7 Idea of permutation test Under H0 (the null hypothesis ), some of the data are exchangeable. We permute (rearrange) the data by shuffling their labels of treatments, and then calculate our T.S. on each permutation. The collection of T.S. from the permuted data constructs the distribution under H0.

8 An example of Permutation Two groups of participants, score of a linguistics test: Group A: Group B: Statistic= XA- XB, In the observation=173-68=105 Rearrange the observations and compute corresponding T.S. Compare the T.S. from original observation with the ones from re-arranged data. In this case, TS(observation) is the biggest, thus the p-value is 1/20=0.05

9 Distribution of XA- XB

10 Application: Measuring S yntactic Distance By John Nerbonne and Wybo Wiersema 2006 Measure linguistic contamination mobility, multilinguality Languages in contact influence one another first languages influence second languages, vise versa What are the factors, how important are they? experience, attitude, instruction, relations of languages Differences between varieties of a language

11 The Idea Goal: detect lots of syntactic differences Material: Corpora of language use in contact situations (e.g. 2 corpus of Finnish Australian Immigrants, of adults and kids respectively) Mark syntactic categories of words with Part-of-speech (POS) tags Collect and analyse trigrams of tags

12 How to measure? Indirectly! We aim to observe differences in syntactic use including overuse and underuse, not just errors Indirect, since it's difficult to model syntactic difference Lexical categories mirror syntactic analysis We assume that syntactic differences correlate strongly with the distribution of POS tag-trigrams

13 Trigram Vectors and their Differences Finnish people who emigrated to Australia Two groups of participants, got two sub-corpus Kids (< 17) 30 interviews & Adults ( >=17) 60 interviews Frequency Vectors containing the counts of 13,784 different POS trigrams, one for each of the sub-corpus Measure Vector Differences Using cosine, R/Rsq comparing two vectors

14 Statistical Significance Aarts & Granger examined tag-trigrams, but did not subject their collections to statistical analysis We do not have general distribution of these trigrams or distribution of syntactic differences We have:13,784 trigrams actually occurred Solution: permutation test, with Monte Carlo techniques

15 Normalization Problem in this case we need to permute sentences, not trigrams to avoid measuring only the effect of syntactic coherence Normalization for sentences length Since average sentence length differs in two sub-corpus (24 wd/sent. vs. 16 wd/sent.), number of trigrams will differ across permutation as well numbers of trigrams in each group will vary if no normalization is applied.

16 Normalization in Detail (1) Initially: a series of counts of all the trigrams of vectors the young group vs. the older group. 1. Sums no. of trigrams for each vector 2. compute the frequencies based on counts and sums.

17 Normalization in Detail (2) 3. weight these frequencies on the basis of the distributions in the aggregated categories 4. compute final elements of vectors (here, ) Another Normalization is skipped here, anyway, we can see from this case normalization is useful for deal with real data in which is not perfectly exchangeable

18 Apply Permutation Test 1. Determine difference between 2 vectors of trigrams, which is our test statistic 2. Permute a pair of sentences from two sub-corpus, compare the differences of resulting two vectors of trigrams (compute test statistics for this permutation) 3. Repeat step (3) e.g. 10,000 times, each time, we pick pairs of sentences randomly. 4. Estimation of stat. significance, the probability that the original samples were due to chance (p-value).

19 Findings Relative difference between young and old emigrants significant (P<0.001) Some striking patterns: Problems caused by tagger (elided here)

20 So, where is Monte Carlo? --what is Monte Carlo (sampling)? 3. Repeat step (3) e.g. 10,000 times, each time, we pick pairs of sentences randomly. --why bothering? In permutation test, there may be too many possible orderings of the data to conveniently allow complete enumeration This is done by generating the reference distribution by Monte Carlo sampling, which takes a relatively small random sample of the possible replicates

21 Monte Carlo principle Given a very large set X and a distribution p(x) over it Draw N samples randomly from the distribution Approximate the distribution using these samples Can also approximate expectation

22 Monte Carlo: a simple example Find out the probability that, out of a group of 30 people, 2 people share a birthday 1. Pick 30 random numbers in the range [1,365]. Each number represents one day of the year. 2. Check to see if any of the thirty are equal. 3. Go back to step 1 and repeat 10,000 times. 4. Report the fraction of trials that have matching days. --Results: , which is very close to exact result Another example: calculating pi:

23 Features of Monte Carlo in general A domain of possible inputs Random number generating and sampling rejection, metropolis and exact sampling... Error estimation

24 An application: Identifying Language Language Model: most frequent words/most frequent N-grams of alphabets Document Model: similar features as in language model Classification Methods: rank order statistic, mutual information statistics, Monte Carlo Method

25 Identifying Language by Monte Carlo (1) By Arjen Poutsma Find the most probable language given a certain document, i.e. maximize Apply Bayesian Law: As both language and documents are features:

26 Identifying Language by Monte Carlo (2) we can determine the language of this document to be the language which results most often from these random features. Monte Carlo Approach: 1. Generating random number and sample one feature from all features of the document 2. Check which language(s) also have this feature 3. Repeat 1~2 for N times

27 Results of Monte Carlo Method Performance is close to the best Time complexity is much lower than the best

28 Conclusion Permutation Test is a good choice for hypothesis test of unknown distribution. It works regardless of the shape and size of the population gives exact p value Monte Carlo Sampling is introduced to permutation test when it is impossible to complete enumeration the data. Monte Carlo Method can well approximate the distribution using random samples

### Nonparametric tests, Bootstrapping

Nonparametric tests, Bootstrapping http://www.isrec.isb-sib.ch/~darlene/embnet/ Hypothesis testing review 2 competing theories regarding a population parameter: NULL hypothesis H ( straw man ) ALTERNATIVEhypothesis

### Permutation Tests for Comparing Two Populations

Permutation Tests for Comparing Two Populations Ferry Butar Butar, Ph.D. Jae-Wan Park Abstract Permutation tests for comparing two populations could be widely used in practice because of flexibility of

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### Quantitative Biology Lecture 5 (Hypothesis Testing)

15 th Oct 2015 Quantitative Biology Lecture 5 (Hypothesis Testing) Gurinder Singh Mickey Atwal Center for Quantitative Biology Summary Classification Errors Statistical significance T-tests Q-values (Traditional)

### Permutation & Non-Parametric Tests

Permutation & Non-Parametric Tests Statistical tests Gather data to assess some hypothesis (e.g., does this treatment have an effect on this outcome?) Form a test statistic for which large values indicate

### LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

LAB 4 INSTRUCTIONS CONFIDENCE INTERVALS AND HYPOTHESIS TESTING In this lab you will explore the concept of a confidence interval and hypothesis testing through a simulation problem in engineering setting.

### Hypothesis Testing Level I Quantitative Methods. IFT Notes for the CFA exam

Hypothesis Testing 2014 Level I Quantitative Methods IFT Notes for the CFA exam Contents 1. Introduction... 3 2. Hypothesis Testing... 3 3. Hypothesis Tests Concerning the Mean... 10 4. Hypothesis Tests

### Nonparametric Test Procedures

Nonparametric Test Procedures 1 Introduction to Nonparametrics Nonparametric tests do not require that samples come from populations with normal distributions or any other specific distribution. Hence

### Nonparametric Statistics

1 14.1 Using the Binomial Table Nonparametric Statistics In this chapter, we will survey several methods of inference from Nonparametric Statistics. These methods will introduce us to several new tables

### Outline. Definitions Descriptive vs. Inferential Statistics The t-test - One-sample t-test

The t-test Outline Definitions Descriptive vs. Inferential Statistics The t-test - One-sample t-test - Dependent (related) groups t-test - Independent (unrelated) groups t-test Comparing means Correlation

### Projects Involving Statistics (& SPSS)

Projects Involving Statistics (& SPSS) Academic Skills Advice Starting a project which involves using statistics can feel confusing as there seems to be many different things you can do (charts, graphs,

### Section 13, Part 1 ANOVA. Analysis Of Variance

Section 13, Part 1 ANOVA Analysis Of Variance Course Overview So far in this course we ve covered: Descriptive statistics Summary statistics Tables and Graphs Probability Probability Rules Probability

### Hypothesis testing S2

Basic medical statistics for clinical and experimental research Hypothesis testing S2 Katarzyna Jóźwiak k.jozwiak@nki.nl 2nd November 2015 1/43 Introduction Point estimation: use a sample statistic to

### 3. Nonparametric methods

3. Nonparametric methods If the probability distributions of the statistical variables are unknown or are not as required (e.g. normality assumption violated), then we may still apply nonparametric tests

### StatCrunch and Nonparametric Statistics

StatCrunch and Nonparametric Statistics You can use StatCrunch to calculate the values of nonparametric statistics. It may not be obvious how to enter the data in StatCrunch for various data sets that

### Statistics Review PSY379

Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

### Rank-Based Non-Parametric Tests

Rank-Based Non-Parametric Tests Reminder: Student Instructional Rating Surveys You have until May 8 th to fill out the student instructional rating surveys at https://sakai.rutgers.edu/portal/site/sirs

### Chapter 7 Notes - Inference for Single Samples. You know already for a large sample, you can invoke the CLT so:

Chapter 7 Notes - Inference for Single Samples You know already for a large sample, you can invoke the CLT so: X N(µ, ). Also for a large sample, you can replace an unknown σ by s. You know how to do a

### Study Guide for the Final Exam

Study Guide for the Final Exam When studying, remember that the computational portion of the exam will only involve new material (covered after the second midterm), that material from Exam 1 will make

### Combining Paired and Two-Sample Data Using a Permutation Test

Journal of Data Science 11(2013), 767-779 Combining Paired and Two-Sample Data Using a Permutation Test Richard L. Einsporn and Desale Habtzghi University of Akron Abstract: This paper presents a permutation

### Tagging with Hidden Markov Models

Tagging with Hidden Markov Models Michael Collins 1 Tagging Problems In many NLP problems, we would like to model pairs of sequences. Part-of-speech (POS) tagging is perhaps the earliest, and most famous,

### NCSS Statistical Software

Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

### Statistiek I. Proportions aka Sign Tests. John Nerbonne. CLCG, Rijksuniversiteit Groningen. http://www.let.rug.nl/nerbonne/teach/statistiek-i/

Statistiek I Proportions aka Sign Tests John Nerbonne CLCG, Rijksuniversiteit Groningen http://www.let.rug.nl/nerbonne/teach/statistiek-i/ John Nerbonne 1/34 Proportions aka Sign Test The relative frequency

### Nonparametric Statistics

Nonparametric Statistics J. Lozano University of Goettingen Department of Genetic Epidemiology Interdisciplinary PhD Program in Applied Statistics & Empirical Methods Graduate Seminar in Applied Statistics

### NCSS Statistical Software

Chapter 06 Introduction This procedure provides several reports for the comparison of two distributions, including confidence intervals for the difference in means, two-sample t-tests, the z-test, the

### Effect estimation versus hypothesis testing

Department of Epidemiology and Public Health Unit of Biostatistics and Computational Sciences Effect estimation versus hypothesis testing PD Dr. C. Schindler Swiss Tropical and Public Health Institute

### Hypothesis Test for Mean Using Given Data (Standard Deviation Known-z-test)

Hypothesis Test for Mean Using Given Data (Standard Deviation Known-z-test) A hypothesis test is conducted when trying to find out if a claim is true or not. And if the claim is true, is it significant.

### Hypothesis Testing & Data Analysis. Statistics. Descriptive Statistics. What is the difference between descriptive and inferential statistics?

2 Hypothesis Testing & Data Analysis 5 What is the difference between descriptive and inferential statistics? Statistics 8 Tools to help us understand our data. Makes a complicated mess simple to understand.

### Module 5 Hypotheses Tests: Comparing Two Groups

Module 5 Hypotheses Tests: Comparing Two Groups Objective: In medical research, we often compare the outcomes between two groups of patients, namely exposed and unexposed groups. At the completion of this

### Single sample hypothesis testing, II 9.07 3/02/2004

Single sample hypothesis testing, II 9.07 3/02/2004 Outline Very brief review One-tailed vs. two-tailed tests Small sample testing Significance & multiple tests II: Data snooping What do our results mean?

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### Random Uniform Clumped. 0 1 2 3 4 5 6 7 8 9 Number of Individuals per Sub-Quadrat. Number of Individuals per Sub-Quadrat

4-1 Population ecology Lab 4: Population dispersion patterns I. Introduction to population dispersion patterns The dispersion of individuals in a population describes their spacing relative to each other.

### Nonparametric tests these test hypotheses that are not statements about population parameters (e.g.,

CHAPTER 13 Nonparametric and Distribution-Free Statistics Nonparametric tests these test hypotheses that are not statements about population parameters (e.g., 2 tests for goodness of fit and independence).

### Statistical Significance and Bivariate Tests

Statistical Significance and Bivariate Tests BUS 735: Business Decision Making and Research 1 1.1 Goals Goals Specific goals: Re-familiarize ourselves with basic statistics ideas: sampling distributions,

### Statistical tests for SPSS

Statistical tests for SPSS Paolo Coletti A.Y. 2010/11 Free University of Bolzano Bozen Premise This book is a very quick, rough and fast description of statistical tests and their usage. It is explicitly

### research/scientific includes the following: statistical hypotheses: you have a null and alternative you accept one and reject the other

1 Hypothesis Testing Richard S. Balkin, Ph.D., LPC-S, NCC 2 Overview When we have questions about the effect of a treatment or intervention or wish to compare groups, we use hypothesis testing Parametric

### NCSS Statistical Software. One-Sample T-Test

Chapter 205 Introduction This procedure provides several reports for making inference about a population mean based on a single sample. These reports include confidence intervals of the mean or median,

### Supplementary methods and results

1 Supplementary methods and results Computational analysis Calculation of model evidence Task 1 Task 2 1 2 3 4 5 6 7 8 9 10 Figure S1. Calculating model evidence example The procedure employed to calculate

### Module 9: Nonparametric Tests. The Applied Research Center

Module 9: Nonparametric Tests The Applied Research Center Module 9 Overview } Nonparametric Tests } Parametric vs. Nonparametric Tests } Restrictions of Nonparametric Tests } One-Sample Chi-Square Test

### Stat 411/511 THE RANDOMIZATION TEST. Charlotte Wickham. stat511.cwick.co.nz. Oct 16 2015

Stat 411/511 THE RANDOMIZATION TEST Oct 16 2015 Charlotte Wickham stat511.cwick.co.nz Today Review randomization model Conduct randomization test What about CIs? Using a t-distribution as an approximation

### Nonparametric statistics and model selection

Chapter 5 Nonparametric statistics and model selection In Chapter, we learned about the t-test and its variations. These were designed to compare sample means, and relied heavily on assumptions of normality.

### HYPOTHESIS TESTING WITH SPSS:

HYPOTHESIS TESTING WITH SPSS: A NON-STATISTICIAN S GUIDE & TUTORIAL by Dr. Jim Mirabella SPSS 14.0 screenshots reprinted with permission from SPSS Inc. Published June 2006 Copyright Dr. Jim Mirabella CHAPTER

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Online Learning Centre Technology Step-by-Step - Excel Microsoft Excel is a spreadsheet software application

### Statistiek I. t-tests. John Nerbonne. CLCG, Rijksuniversiteit Groningen. John Nerbonne 1/35

Statistiek I t-tests John Nerbonne CLCG, Rijksuniversiteit Groningen http://wwwletrugnl/nerbonne/teach/statistiek-i/ John Nerbonne 1/35 t-tests To test an average or pair of averages when σ is known, we

### General Procedure for Hypothesis Test. Five types of statistical analysis. 1. Formulate H 1 and H 0. General Procedure for Hypothesis Test

Five types of statistical analysis General Procedure for Hypothesis Test Descriptive Inferential Differences Associative Predictive What are the characteristics of the respondents? What are the characteristics

### Chapter 7. Section Introduction to Hypothesis Testing

Section 7.1 - Introduction to Hypothesis Testing Chapter 7 Objectives: State a null hypothesis and an alternative hypothesis Identify type I and type II errors and interpret the level of significance Determine

### UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates

UCLA STAT 13 Statistical Methods - Final Exam Review Solutions Chapter 7 Sampling Distributions of Estimates 1. (a) (i) µ µ (ii) σ σ n is exactly Normally distributed. (c) (i) is approximately Normally

### STAT 35A HW2 Solutions

STAT 35A HW2 Solutions http://www.stat.ucla.edu/~dinov/courses_students.dir/09/spring/stat35.dir 1. A computer consulting firm presently has bids out on three projects. Let A i = { awarded project i },

### Data Mining Chapter 6: Models and Patterns Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University

Data Mining Chapter 6: Models and Patterns Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Models vs. Patterns Models A model is a high level, global description of a

### Outline of Topics. Statistical Methods I. Types of Data. Descriptive Statistics

Statistical Methods I Tamekia L. Jones, Ph.D. (tjones@cog.ufl.edu) Research Assistant Professor Children s Oncology Group Statistics & Data Center Department of Biostatistics Colleges of Medicine and Public

### MONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010

MONT 07N Understanding Randomness Solutions For Final Examination May, 00 Short Answer (a) (0) How are the EV and SE for the sum of n draws with replacement from a box computed? Solution: The EV is n times

### Statistics 641 - EXAM II - 1999 through 2003

Statistics 641 - EXAM II - 1999 through 2003 December 1, 1999 I. (40 points ) Place the letter of the best answer in the blank to the left of each question. (1) In testing H 0 : µ 5 vs H 1 : µ > 5, the

### INTRODUCTORY STATISTICS

INTRODUCTORY STATISTICS FIFTH EDITION Thomas H. Wonnacott University of Western Ontario Ronald J. Wonnacott University of Western Ontario WILEY JOHN WILEY & SONS New York Chichester Brisbane Toronto Singapore

### Two-sample inference: Continuous data

Two-sample inference: Continuous data Patrick Breheny April 5 Patrick Breheny STA 580: Biostatistics I 1/32 Introduction Our next two lectures will deal with two-sample inference for continuous data As

### 1.2 Statistical testing by permutation

Statistical testing by permutation 17 Excerpt (pp. 17-26) Ch. 13), from: McBratney & Webster (1981), McBratney et al. (1981), Webster & Burgess (1984), Borgman & Quimby (1988), and François-Bongarçon (1991).

### Reliable and Cost-Effective PoS-Tagging

Reliable and Cost-Effective PoS-Tagging Yu-Fang Tsai Keh-Jiann Chen Institute of Information Science, Academia Sinica Nanang, Taipei, Taiwan 5 eddie,chen@iis.sinica.edu.tw Abstract In order to achieve

### Chi-Square Test. Contingency Tables. Contingency Tables. Chi-Square Test for Independence. Chi-Square Tests for Goodnessof-Fit

Chi-Square Tests 15 Chapter Chi-Square Test for Independence Chi-Square Tests for Goodness Uniform Goodness- Poisson Goodness- Goodness Test ECDF Tests (Optional) McGraw-Hill/Irwin Copyright 2009 by The

### Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures Jamie DeCoster Department of Psychology University of Alabama 348 Gordon Palmer Hall Box 870348 Tuscaloosa, AL 35487-0348 Phone:

### 14.0 Hypothesis Testing

14.0 Hypothesis Testing 1 Answer Questions Hypothesis Tests Examples 14.1 Hypothesis Tests A hypothesis test (significance test) is a way to decide whether the data strongly support one point of view or

### Statistics 2014 Scoring Guidelines

AP Statistics 2014 Scoring Guidelines College Board, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks of the College Board. AP Central is the official online home

### Chapter 3: Nonparametric Tests

B. Weaver (15-Feb-00) Nonparametric Tests... 1 Chapter 3: Nonparametric Tests 3.1 Introduction Nonparametric, or distribution free tests are so-called because the assumptions underlying their use are fewer

### Invariance and optimality Linear rank statistics Permutation tests in R. Rank Tests. Patrick Breheny. October 7. STA 621: Nonparametric Statistics

Rank Tests October 7 Power Invariance and optimality Permutation testing allows great freedom to use a wide variety of test statistics, all of which lead to exact level-α tests regardless of the distribution

### Chapter 26: Tests of Significance

Chapter 26: Tests of Significance Procedure: 1. State the null and alternative in words and in terms of a box model. 2. Find the test statistic: z = observed EV. SE 3. Calculate the P-value: The area under

### 15.0 More Hypothesis Testing

15.0 More Hypothesis Testing 1 Answer Questions Type I and Type II Error Power Calculation Bayesian Hypothesis Testing 15.1 Type I and Type II Error In the philosophy of hypothesis testing, the null hypothesis

### Inferential Statistics. Probability. From Samples to Populations. Katie Rommel-Esham Education 504

Inferential Statistics Katie Rommel-Esham Education 504 Probability Probability is the scientific way of stating the degree of confidence we have in predicting something Tossing coins and rolling dice

### Nonparametric Statistics

Nonparametric Statistics References Some good references for the topics in this course are 1. Higgins, James (2004), Introduction to Nonparametric Statistics 2. Hollander and Wolfe, (1999), Nonparametric

### Non-Inferiority Tests for One Mean

Chapter 45 Non-Inferiority ests for One Mean Introduction his module computes power and sample size for non-inferiority tests in one-sample designs in which the outcome is distributed as a normal random

### II. DISTRIBUTIONS distribution normal distribution. standard scores

Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

### Exact Nonparametric Tests for Comparing Means - A Personal Summary

Exact Nonparametric Tests for Comparing Means - A Personal Summary Karl H. Schlag European University Institute 1 December 14, 2006 1 Economics Department, European University Institute. Via della Piazzuola

### Chapter 7 Section 7.1: Inference for the Mean of a Population

Chapter 7 Section 7.1: Inference for the Mean of a Population Now let s look at a similar situation Take an SRS of size n Normal Population : N(, ). Both and are unknown parameters. Unlike what we used

### 11-2 Goodness of Fit Test

11-2 Goodness of Fit Test In This section we consider sample data consisting of observed frequency counts arranged in a single row or column (called a one-way frequency table). We will use a hypothesis

### SAMPLING & INFERENTIAL STATISTICS. Sampling is necessary to make inferences about a population.

SAMPLING & INFERENTIAL STATISTICS Sampling is necessary to make inferences about a population. SAMPLING The group that you observe or collect data from is the sample. The group that you make generalizations

### NONPARAMETRIC STATISTICS 1. depend on assumptions about the underlying distribution of the data (or on the Central Limit Theorem)

NONPARAMETRIC STATISTICS 1 PREVIOUSLY parametric statistics in estimation and hypothesis testing... construction of confidence intervals computing of p-values classical significance testing depend on assumptions

### Association Between Variables

Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi

### WHAT IS A JOURNAL CLUB?

WHAT IS A JOURNAL CLUB? With its September 2002 issue, the American Journal of Critical Care debuts a new feature, the AJCC Journal Club. Each issue of the journal will now feature an AJCC Journal Club

### Non-parametric tests I

Non-parametric tests I Objectives Mann-Whitney Wilcoxon Signed Rank Relation of Parametric to Non-parametric tests 1 the problem Our testing procedures thus far have relied on assumptions of independence,

### Inference for Data Visualization

Inference for Data Visualization Christof Seiler Stanford University, Spring 2016, Stats 205 Introduction Exploratory data analysis is usually not parametric For instance, in Principle Component Analysis

### Simple Linear Regression Inference

Simple Linear Regression Inference 1 Inference requirements The Normality assumption of the stochastic term e is needed for inference even if it is not a OLS requirement. Therefore we have: Interpretation

### How to Conduct a Hypothesis Test

How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some

### Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Lesson : Comparison of Population Means Part c: Comparison of Two- Means Welcome to lesson c. This third lesson of lesson will discuss hypothesis testing for two independent means. Steps in Hypothesis

### Research Methodology: Tools

MSc Business Administration Research Methodology: Tools Applied Data Analysis (with SPSS) Lecture 11: Nonparametric Methods May 2014 Prof. Dr. Jürg Schwarz Lic. phil. Heidi Bruderer Enzler Contents Slide

### Statistical Inference and t-tests

1 Statistical Inference and t-tests Objectives Evaluate the difference between a sample mean and a target value using a one-sample t-test. Evaluate the difference between a sample mean and a target value

### Inferences About Differences Between Means Edpsy 580

Inferences About Differences Between Means Edpsy 580 Carolyn J. Anderson Department of Educational Psychology University of Illinois at Urbana-Champaign Inferences About Differences Between Means Slide

### Chapter 19 The Chi-Square Test

Tutorial for the integration of the software R with introductory statistics Copyright c Grethe Hystad Chapter 19 The Chi-Square Test In this chapter, we will discuss the following topics: We will plot

### Statistiek I. Nonparametric Tests. John Nerbonne. CLCG, Rijksuniversiteit Groningen.

Statistiek I Nonparametric Tests John Nerbonne CLCG, Rijksuniversiteit Groningen http://www.let.rug.nl/nerbonne/teach/statistiek-i/ John Nerbonne 1/36 Overview 1 Mann-Whitney U-Test 2 Wilcoxon s Signed

### CHAPTER 11 CHI-SQUARE: NON-PARAMETRIC COMPARISONS OF FREQUENCY

CHAPTER 11 CHI-SQUARE: NON-PARAMETRIC COMPARISONS OF FREQUENCY The hypothesis testing statistics detailed thus far in this text have all been designed to allow comparison of the means of two or more samples

### Contents. NRJF, University of Sheffield, 2011/12 Semester 1. Why Randomize? 3: Randomization. Simple randomization. Restricted Randomization

Contents Preliminaries 0: Introduction 1: Background & Basic Concepts 2: Basic Trial analysis 3: Randomization 4: Protocol Deviations 5: Size of the Trial 6: Multiplicity & Interim Analysis 7: Crossover

### Research Design Concepts. Independent and dependent variables Data types Sampling Validity and reliability

Research Design Concepts Independent and dependent variables Data types Sampling Validity and reliability Research Design Action plan for carrying out research How the research will be conducted to investigate

### Chapter 11-12 1 Review

Chapter 11-12 Review Name 1. In formulating hypotheses for a statistical test of significance, the null hypothesis is often a statement of no effect or no difference. the probability of observing the data

### The binomial distribution and proportions. Erik-Jan Smits & Eleonora Rossi

The binomial distribution and proportions Erik-Jan Smits & Eleonora Rossi Seminar in statistics and methodology 2005 Sampling distributions Moore and McCabe (2003:367) : Nature of sampling distribution

### UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

### Part 3. Comparing Groups. Chapter 7 Comparing Paired Groups 189. Chapter 8 Comparing Two Independent Groups 217

Part 3 Comparing Groups Chapter 7 Comparing Paired Groups 189 Chapter 8 Comparing Two Independent Groups 217 Chapter 9 Comparing More Than Two Groups 257 188 Elementary Statistics Using SAS Chapter 7 Comparing

### Instructions for : TI83, 83-Plus, 84-Plus for STP classes, Ela Jackiewicz

Computing areas under normal curves: option 2 normalcdf(lower limit, upper limit, mean, standard deviation) will give are between lower and upper limits (mean=0 and St.dev=1 are default values) Ex1 To

Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

### Introduction to Hypothesis Testing. Point estimation and confidence intervals are useful statistical inference procedures.

Introduction to Hypothesis Testing Point estimation and confidence intervals are useful statistical inference procedures. Another type of inference is used frequently used concerns tests of hypotheses.

### Simple maths for keywords

Simple maths for keywords Adam Kilgarriff Lexical Computing Ltd adam@lexmasterclass.com Abstract We present a simple method for identifying keywords of one corpus vs. another. There is no one-sizefits-all

### Statistical matching: Experimental results and future research questions

Statistical matching: Experimental results and future research questions 2015 19 Ton de Waal Content 1. Introduction 4 2. Methods for statistical matching 5 2.1 Introduction to statistical matching 5 2.2

There are three kinds of people in the world those who are good at math and those who are not. PSY 511: Advanced Statistics for Psychological and Behavioral Research 1 Positive Views The record of a month