Roadmap to Data Analysis. Introduction to the Series, and I. Introduction to Statistical Thinking-A (Very) Short Introductory Course for Agencies



Similar documents
HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION

DATA ANALYSIS. QEM Network HBCU-UP Fundamentals of Education Research Workshop Gerunda B. Hughes, Ph.D. Howard University

Correlational Research

COMPARISONS OF CUSTOMER LOYALTY: PUBLIC & PRIVATE INSURANCE COMPANIES.

CORRELATION ANALYSIS

Simple Linear Regression Inference

UNIVERSITY OF NAIROBI

Organizing Your Approach to a Data Analysis

UNDERSTANDING THE DEPENDENT-SAMPLES t TEST

Pearson s Correlation

Introduction to Regression and Data Analysis

CHAPTER 5 COMPARISON OF DIFFERENT TYPE OF ONLINE ADVERTSIEMENTS. Table: 8 Perceived Usefulness of Different Advertisement Types

Class 19: Two Way Tables, Conditional Distributions, Chi-Square (Text: Sections 2.5; 9.1)

Unit 31 A Hypothesis Test about Correlation and Slope in a Simple Linear Regression

Economic Statistics (ECON2006), Statistics and Research Design in Psychology (PSYC2010), Survey Design and Analysis (SOCI2007)

Chapter 2 Probability Topics SPSS T tests

Calculating, Interpreting, and Reporting Estimates of Effect Size (Magnitude of an Effect or the Strength of a Relationship)

Inferential Statistics. What are they? When would you use them?

II. DISTRIBUTIONS distribution normal distribution. standard scores

LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE

AIE: 85-86, 193, , 294, , , 412, , , 682, SE: : 339, 434, , , , 680, 686

Using Excel for Statistical Analysis

MATH 140 HYBRID INTRODUCTORY STATISTICS COURSE SYLLABUS

Recommend Continued CPS Monitoring. 63 (a) 17 (b) 10 (c) (d) 20 (e) 25 (f) 80. Totals/Marginal

Descriptive Statistics

Econometrics and Data Analysis I

Simulation Exercises to Reinforce the Foundations of Statistical Thinking in Online Classes

Fairfield Public Schools

" Y. Notation and Equations for Regression Lecture 11/4. Notation:

Correlational Research. Correlational Research. Stephen E. Brock, Ph.D., NCSP EDS 250. Descriptive Research 1. Correlational Research: Scatter Plots

ROCHESTER INSTITUTE OF TECHNOLOGY COURSE OUTLINE FORM COLLEGE OF SCIENCE. School of Mathematical Sciences

Your Questions from Chapter 1. General Psychology PSYC 200. Your Questions from Chapter 1. Your Questions from Chapter 1. Science is a Method.

Statistical tests for SPSS

Introduction to Hypothesis Testing OPRE 6301

Chapter 7: Simple linear regression Learning Objectives

RARITAN VALLEY COMMUNITY COLLEGE ACADEMIC COURSE OUTLINE MATH 111H STATISTICS II HONORS

Which WJ-III Subtests Should I Administer?

Nursing Journal Toolkit: Critiquing a Quantitative Research Article

Obtaining Knowledge. Lecture 7 Methods of Scientific Observation and Analysis in Behavioral Psychology and Neuropsychology.

2 GENETIC DATA ANALYSIS

SAMPLE SIZE CONSIDERATIONS

Introduction. Hypothesis Testing. Hypothesis Testing. Significance Testing

Basic Concepts in Research and Data Analysis

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

Testing Group Differences using T-tests, ANOVA, and Nonparametric Measures

SCIENCE ROAD JOURNAL

E10: Controlled Experiments

Working with data: Data analyses April 8, 2014

Service courses for graduate students in degree programs other than the MS or PhD programs in Biostatistics.

Understand the role that hypothesis testing plays in an improvement project. Know how to perform a two sample hypothesis test.

CONTENTS OF DAY 2. II. Why Random Sampling is Important 9 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE

Foundation of Quantitative Data Analysis

Hypothesis testing. c 2014, Jeffrey S. Simonoff 1

Lesson 1: Comparison of Population Means Part c: Comparison of Two- Means

Section 7.1. Introduction to Hypothesis Testing. Schrodinger s cat quantum mechanics thought experiment (1935)

Additional sources Compilation of sources:

Understanding Confidence Intervals and Hypothesis Testing Using Excel Data Table Simulation

Introduction to Statistics and Quantitative Research Methods

Overview of Factor Analysis

University of Chicago Graduate School of Business. Business 41000: Business Statistics

Bachelor Program in Analytical Finance, 180 credits

Measuring Evaluation Results with Microsoft Excel

MTH 140 Statistics Videos

MULTIPLE REGRESSION WITH CATEGORICAL DATA

BA 275 Review Problems - Week 5 (10/23/06-10/27/06) CD Lessons: 48, 49, 50, 51, 52 Textbook: pp

Hypothesis Test for Mean Using Given Data (Standard Deviation Known-z-test)

IMPLEMENTATION NOTE. Validating Risk Rating Systems at IRB Institutions

6: Introduction to Hypothesis Testing

Chapter Four. Data Analyses and Presentation of the Findings

COMP6053 lecture: Relationship between two variables: correlation, covariance and r-squared.

Sample Size and Power in Clinical Trials

Multivariate Analysis of Variance. The general purpose of multivariate analysis of variance (MANOVA) is to determine

Introduction to Hypothesis Testing. Hypothesis Testing. Step 1: State the Hypotheses

Glossary of Terms Ability Accommodation Adjusted validity/reliability coefficient Alternate forms Analysis of work Assessment Battery Bias

Consulting projects: What really matters

The correlation coefficient

Project Management. Individual Program Information Macomb1 ( )

MONT 107N Understanding Randomness Solutions For Final Examination May 11, 2010

Association Between Variables

Analyzing Experimental Data

Chapter 7. Comparing Means in SPSS (t-tests) Compare Means analyses. Specifically, we demonstrate procedures for running Dependent-Sample (or

A Power Primer. Jacob Cohen New York University ABSTRACT

11. Analysis of Case-control Studies Logistic Regression

A Hands-On Exercise Improves Understanding of the Standard Error. of the Mean. Robert S. Ryan. Kutztown University

Tel: Tuesdays 12:00-2:45

Pearson s Correlation Coefficient

Having a coin come up heads or tails is a variable on a nominal scale. Heads is a different category from tails.

Fixed-Effect Versus Random-Effects Models

Chapter 13 Introduction to Linear Regression and Correlation Analysis

Do Supplemental Online Recorded Lectures Help Students Learn Microeconomics?*

STAT 360 Probability and Statistics. Fall 2012

Business Valuation Review

Section 14 Simple Linear Regression: Introduction to Least Squares Regression

Introduction to. Hypothesis Testing CHAPTER LEARNING OBJECTIVES. 1 Identify the four steps of hypothesis testing.

Exploratory Research Design. Primary vs. Secondary data. Advantages and uses of SD

Transcription:

Roadmap to Data Analysis Introduction to the Series, and I. Introduction to Statistical Thinking-A (Very) Short Introductory Course for Agencies

Objectives of the Series Roadmap to Data Analysis Provide and introduction to basic statistical procedures relevant to SOT agencies Provide a foundation to analyze and interpret data on clients, services, and outcomes Provide an introduction to the use of available tools for basic analysis of data Provide an understanding of the limitations of statistical analysis, and guidelines for agency staff about when to seek statistical consultation 2

The Series Roadmap to Data Analysis I. Introduction to Statistical Thinking II. Primer on Measurement and Variables III. Choosing the Right Statistical Test IV. Comparing Averages The t Tests V. Comparing Counts Chi Square VI. Relationship Between Two Continuous Variables Pearson s r Correlation 3

Learning Objectives: I. Introduction to Statistical Thinking Understand basics of statistical thinking and inference Understand concepts of population and sample in quantitative research Understand the process by which inferences can be made to a population based on a sample Understand hypothesis testing and probability value 4

Inference Statistical analysis is all about understanding ( making inferences about) a population based on a sample from the population. We rarely have the opportunity to measure an entire population, such as all torture survivors in the U.S. Your agency has ready-made samples of the population of torture survivors The extent to which a sample is representative of the population can be quantified By a statistic some measure of the sample (i.e. an average), and By a quantified value that explains how well your sample statistic describes the population (i.e. a standard deviation) 5

Definitions: Population : the entire universe of individuals to which you seek to generalize Sample : that part of the population from which you collect data A sample should be representative of the population from which it is drawn The extent to which a sample is representative of the population can be quantified. This forms the basis for all types of statistical inference 6

Characteristics of the Sample A sample should be representative of the population from which it is drawn Population of Torture Survivors in the US (N=500,000) Sample of 1000 Torture Survivors Note: you can also define your population as all of your agency s clients. If you take a sample from that population, then you are inferring statistical results to the agency population. 7

A practical example If we say the longer torture survivors in your agency participate in socialization groups, the more they will show improved functioning, we are implying that, in the general population of torture survivors, more participation in socialization groups will result in improved functioning. How is this type of inference possible? 8

Minimum requirements: The sample is reasonably representative of the population to which it will generalize There is a specific research question that addresses the sample and can be answered quantitatively Is the length of participation in socialization groups related to improved functioning? There is an hypothesis implied by the research question that can be tested The analysis of data correctly matches the type of data being analyzed (i.e. using the right statistic for the data) 9

Hypothesis testing An hypothesis is a statement about the relationship among variables A variable is a factor, or characteristic, that varies It is hypothesized that the longer torture survivors participate in our agency s socialization group, the better the functioning. Two variables 1) Length of participation in the group, and 2) a measure of functioning Hypotheses should be generated based on the available scientific literature and some theoretical basis (i.e. what leads you to think the length of participation in socialization groups might improve functioning?) 10

Statistics for inference Analysis of the data results in a statistic about the sample A number representing improved functioning, such as the difference between the average pre- and postfunctioning scores A quantified value that explains how well your sample statistic describes the population (such as a standard deviation a measure of how much individuals in the sample deviate from the average), and A quantified value of how confident you can be that the sample statistic reflects the population (a probability value ) 11

Probability Value A p value is the mathematical probability that a relationship between variables found within a sample may have been produced by chance or error P <.05 Statistically significant at <.05 this is a typical threshold Technical note: p values are calculated based on distribution tables for the specific statistic used. The p value is provided by statistics software programs To interpret: the probability is less than 5 in 100 that improvement in functioning was the result of chance alone. In other words, the improvement in functioning from your sample likely reflects the same experience in the larger population that length of participation in the socialization group is related to improved functioning (putting aside the thorny issue of cause and effect, for now ) 12

Important Caveats (Partial List!) P values are useful but can t tell the whole story about treatment effects (look up effect size ). Also, non-significant findings can be valuable information as well. Statistics can t compensate for poorly written questions or instruments that are not valid Especially in torture treatment settings, there are too few instruments that have been tested with many of our current survivor groups Statistics can t compensate for a weak research design i.e. for understanding the impact of treatment it s best to have a non-treatment control or comparison group (to address that thorny causality issue) Seek consultation from statistical and content experts when considering or implementing data collection instruments They can comment on reliability and validity concerns They can guide analysis strategies, including sample size requirements and choosing the right statistical analyses They can help interpret results 13