2 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses Independent T-test Paired T-test One-way ANOVA Two-way ANOVA Regression
3 Measurement Scales Before we can examine a variable statistically, we must first observe, or measure, the variable. Measurement is the assignment of numbers to observations based on certain rules. Measurement attempts to transform attributes into numbers. How much is high vs. low stress? How much is fast vs. slow learning of a maze? How much is good vs. bad memory?
4 Measurement Scales Non-metric (or qualitative) Nominal scale (Categories): Numbers indicate difference in kind; no order info (e.g., ethnicity, gender, ID numbers). Say that men are assigned 0 and women are assigned 1; this doesn't mean 1 is better than 0. Ordinal scale (Orders): Numbers represent rank orderings; distances are not equal (e.g., grades, rank orderings on a survey)
5 Measurement Scales Metric (or quantitative) Interval scale: Equal intervals, arbitrary zero. Ratios have no meaning (e.g., temperature in degrees F: equal differences in degrees are equivalent, but 60°F ≠ 2 × 30°F). Ratio scale: Equal intervals, absolute zero. Equal ratios are equivalent (e.g., weight, height)
6 Populations vs. Samples Population: all members of a specific group. Parameter: a measure (e.g., mean and variance) computed for the population. Sample: a finite subset of a predefined population. Statistic: a measure (e.g., mean and variance) computed for the sample
7 Continuous vs. Discrete Discrete variable: one in which a measure can take on distinct values but not intermediate values (e.g., number of children: it is either 1 or 2, but not 1.2). The most common form of discrete variable is based on counting. Continuous variable: one whose measured values are approximations of the exact value; it is not possible to obtain the exact measure on a continuous variable, because there are always finer gradations of measure (e.g., height: we can say someone is 72 inches tall, but this really means somewhere between 71.5 and 72.5 inches).
8 Independent vs. Dependent Independent variable: one manipulated by the experimenter, or the observed variable thought to cause or predict the dependent variable. In the relation Y = f(X), X is the independent variable because the value of X does not depend on the value of Y. Dependent variable: one thought to result from the independent variable. In the relation Y = f(X), Y is the dependent variable because the value it takes on depends on the value of X
9 Descriptive Statistics - Descriptive Statistics (a.k.a. Summary Statistics) - Primarily concerned with the summary and description of a collection of data - Serves to reduce a large and unmanageable set of data to a smaller and more interpretable set of information
10 Descriptive Statistics Frequency distribution & histogram: a function that summarizes how individual observations fall into measurement classes. Can be constructed regardless of whether the scale is nominal, ordinal, interval or ratio, as long as each and every observation goes into one and only one class.
11 Descriptive Statistics One of the goals in stats is to compare distributions of data, one data distribution with another data distribution. This would be easier if each data distribution can be summarized into one or two numbers. Central Tendency & Variability -- what is the descriptive central number and how much do individual scores vary from the number?
12 Descriptive Statistics Measures of Central Tendency Mean: typical/average score, sensitive to extreme scores Median: middlemost score; useful for skewed distribution Mode: most common or frequent score
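The three measures of central tendency can be compared on a small data set. The following is a minimal sketch using Python's standard `statistics` module; the scores are hypothetical values invented for illustration, with one deliberately extreme score to show how the mean reacts while the median and mode do not.

```python
import statistics

# Hypothetical scores, invented for illustration; 160 is a deliberate outlier.
scores = [85, 90, 100, 100, 105, 110, 160]

mean = statistics.mean(scores)      # pulled upward by the extreme score
median = statistics.median(scores)  # middle value of the sorted scores
mode = statistics.mode(scores)      # most frequent value

print(f"mean={mean:.2f} median={median} mode={mode}")
```

With these made-up scores the mean lands above the median, which is the pattern expected for a distribution skewed toward high values.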
13 Descriptive Statistics [Figure: frequency distribution of IQ scores; axes: IQ score, Frequency]
14 Descriptive Statistics Measures of Variability Variance (dispersion or spread): degree of spread in X (variable) Standard deviation (SD): a measure of variability in the original metric units of X (variable); the square root of the variance
15 Variance $S^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$, where $\bar{x}$ is the sample mean and $n$ is the sample size
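The sample variance formula (sum of squared deviations from the mean, divided by n - 1) and its square root, the standard deviation, can be sketched in a few lines. The function name `sample_variance` and the data are assumptions made for this illustration; the result is cross-checked against `statistics.variance` from the standard library.

```python
import math
import statistics

def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)

# Hypothetical data, invented for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

s2 = sample_variance(data)  # variance, in squared units of X
sd = math.sqrt(s2)          # standard deviation, back in the original units of X

# The hand-rolled value should agree with the standard library.
print(s2, statistics.variance(data), sd)
```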
17 [Figure: frequency distributions of IQ scores in several panels; axes: IQ score, Frequency]
18 Other Measures Skewness is a measure of symmetry, or more precisely, the lack of symmetry. Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution. That is, data sets with high kurtosis tend to have a distinct peak near the mean, decline rather rapidly, and have heavy tails. Data sets with low kurtosis tend to have a flat top near the mean rather than a sharp peak.
19 Pop Quiz! Variance is the average of the squared differences between data points and the mean. Then why are the differences squared? Standard deviation is the square root of variance. Then why is the variance square rooted?
20 Inferential Statistics - A formalized method for solving a class of problems relating to the inference of properties to a large set of data from examination of a small set of data - Goal is to predict or to estimate characteristics of a population based on information obtained from a sample drawn from that population
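21 Inferential Statistics We want to know about these: the population and its parameter (e.g., the population mean). We have this to work with: a sample and its statistic (e.g., the sample mean). Random selection leads from the population to the sample; inference leads from the statistic back to the parameter.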
22 Normal distribution About 68% of data lie within 1 SD of the mean; about 95% of data lie within 2 SD of the mean
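The share of a normal distribution lying within k standard deviations of the mean can be checked numerically. This is a small sketch using the error function from Python's `math` module; the helper name `normal_coverage` is an assumption made for this illustration.

```python
import math

def normal_coverage(k):
    """Probability that a normal variable falls within k SDs of its mean."""
    return math.erf(k / math.sqrt(2))

within_1sd = normal_coverage(1)  # a little over two thirds
within_2sd = normal_coverage(2)  # roughly 95%

print(f"within 1 SD: {within_1sd:.4f}, within 2 SD: {within_2sd:.4f}")
```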
23 Poisson distribution [Figure: Poisson distribution with the mean marked] When the mean is small, mostly nothing happens (lots of zeros)
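The "lots of zeros" behaviour can be seen from the Poisson probability formula, P(k) = e^(-mean) * mean^k / k!. The sketch below assumes a hypothetical mean of 0.5 events per interval, a value chosen only for illustration; `poisson_pmf` is likewise a name made up here.

```python
import math

def poisson_pmf(k, mean):
    """Probability of observing exactly k events when the expected count is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

mean = 0.5  # hypothetical expected count, chosen for illustration
probs = [poisson_pmf(k, mean) for k in range(5)]

for k, p in enumerate(probs):
    print(f"P({k} events) = {p:.4f}")
```

With a mean this small, a count of zero is the single most likely outcome, and the probabilities fall off quickly for larger counts.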
25 Hypothesis testing 1. Assume the null hypothesis (H0) (e.g., the two sets of samples come from the same population). 2. Construct the alternative hypothesis (H1) (e.g., the two sets of samples do not come from the same population). 3. Calculate the test statistic. 4. Decide on the rejection region for the null hypothesis (e.g., 95% confidence in rejecting the null hypothesis).
26 Hypotheses Null (H0): no effect of our experimental treatment, status quo. Alternative (H1): there is an effect
27 T-tests One-sample t-test: compares a group to a known value (e.g., comparing the IQ of a specific group to the known average of 100). Paired-samples t-test: compares one group at two points in time (e.g., comparing pretest and posttest scores). Independent-samples t-test: compares two groups to each other
28 Paired t-test More examples: Before-and-after observations on the same subjects (e.g., students' diagnostic test results before and after a particular module or course). A comparison of two different methods of measurement or two different treatments where the measurements or treatments are applied to the same subjects (e.g., blood pressure measurements)
29 Paired t-test 1. Calculate the difference between the two observations on each pair, making sure you distinguish between positive and negative differences. 2. Calculate statistics (mean, SD, etc.) for these difference scores. 3. Calculate the t-statistic (T). Under the null hypothesis, this statistic follows a t-distribution with n - 1 degrees of freedom (n = sample size, i.e., the number of pairs). 4. Use tables of the t-distribution to compare your value for T to the t distribution with n - 1 degrees of freedom.
30 Paired t-test Example: Suppose a sample of n students were given a diagnostic test before studying a particular subject and then again after completing it. [Table columns: Student, Pre-test, Post-test, Difference]
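The paired t-test procedure (difference scores, their mean and SD, then the t-statistic with n - 1 degrees of freedom) can be sketched as follows. The pre-test and post-test scores are hypothetical numbers invented for this illustration, not the data from the slide, and `paired_t` is a name made up here. The t-statistic is taken as the mean difference divided by its standard error, SD / sqrt(n).

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, df) for a paired t-test on two equal-length lists."""
    # Step 1: signed difference for each pair (after minus before).
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # Step 2: mean and sample SD of the difference scores.
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)
    # Step 3: t = mean difference / standard error of the mean difference.
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1

# Hypothetical scores for five students, invented for illustration.
pre = [18, 21, 16, 22, 19]
post = [22, 25, 17, 24, 16]

t, df = paired_t(pre, post)
print(f"t({df}) = {t:.3f}")
```

The resulting t would then be compared against a t-table at the chosen significance level, as in step 4.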
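31 Independent t-test Question: Do two samples come from different populations? [Figure: if NO (H0), samples A and B are drawn from one population; if YES, A and B are drawn from separate populations]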
32 Independent t-test The answer depends on whether the difference between samples is much greater than the difference within each sample. [Figure: samples A and B are judged different when Between >> Within]
33 Degrees of freedom (df) df = (number of independent observations) - (number of restraints), or df = (number of independent observations) - (number of population estimates). For groups of equal size: df = a(n - 1), where a = number of different groups and n = number of observations per group (i.e., the sample size of each group)
34 Independent t-test How many degrees of freedom when sample sizes are different? df = (n1 - 1) + (n2 - 1)
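The degrees-of-freedom rule for unequal sample sizes can be seen in a short independent t-test sketch. The slides do not give the formula for the standard error, so the pooled-variance (equal variances assumed) form used here is an assumption, as are the function name `independent_t` and the data.

```python
import math
import statistics

def independent_t(a, b):
    """Return (t, df) for a pooled-variance independent-samples t-test."""
    n1, n2 = len(a), len(b)
    df = (n1 - 1) + (n2 - 1)
    # Pooled variance: each sample variance weighted by its own df.
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / df
    # Standard error of the difference between the two means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return t, df

# Hypothetical groups of different sizes, invented for illustration.
group_a = [5, 7, 6, 9, 8]
group_b = [4, 3, 6, 5]

t, df = independent_t(group_a, group_b)
print(f"t({df}) = {t:.3f}")
```

The numerator is the between-sample difference and the denominator reflects within-sample variability, which is the "between vs. within" comparison the test rests on.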
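35 T-tables [Table: critical t values by df for two-tailed and one-tailed tests; as df approaches infinity, the two-tailed 0.05 critical value is 1.960] Two samples, each n = 3, with a t-statistic of 2.50: significantly different?
36 T-tables Two samples, each n = 3, with a t-statistic of 2.50: significantly different? No! With df = (3 - 1) + (3 - 1) = 4, the two-tailed critical value at alpha = 0.05 is 2.776, which is larger than 2.50.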
38 One-way (factor) ANOVA General form of the t-test; can have more than 2 samples. H0: All samples the same. H1: At least one sample different
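39 One-way (factor) ANOVA General form of the t-test; can have more than 2 samples. [Figure: under H0, samples A, B, and C come from one population; under H1, at least one sample differs, e.g., AB vs. C, A vs. BC, or AC vs. B]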
40 One-way (factor) ANOVA Just like the t-test, compares differences between samples to differences within samples. T-test statistic: t = (difference between means) / (standard error within sample). ANOVA statistic: F = (MS between groups) / (MS within groups)
41 ANOVA table
Source                       df      SS    MS                 F           p
Treatment (between groups)   df(X)   SSX   MSX = SSX / df(X)  MSX / MSE   Look up!
Error (within groups)        df(E)   SSE   MSE = SSE / df(E)
Total                        df(T)   SST
42 Suppose there are 3 treatment groups (i.e., one factor with three levels), and there are 5 observations per group. With alpha = 0.05, the critical value is F(2, 12) = 3.89. [Table to complete: rows Treatment (between groups), Error (within groups), Total; columns df, SS, MS, F, p]
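A one-way ANOVA table for a design of this shape (3 groups, 5 observations each) can be filled in with a short script. The observations below are hypothetical numbers invented for illustration, not the values from the slide, and `one_way_anova` is a name made up here. The critical value 3.89 is the F(2, 12) cutoff at alpha = 0.05 quoted on the slide.

```python
def one_way_anova(groups):
    """Return the entries of a one-way ANOVA table as a dict."""
    all_obs = [x for g in groups for x in g]
    grand_mean = sum(all_obs) / len(all_obs)
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups SS: group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups SS: observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_obs) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between, "df_within": df_within,
        "ss_between": ss_between, "ss_within": ss_within,
        "ss_total": ss_between + ss_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# Hypothetical data: 3 treatment groups, 5 observations each.
groups = [[4, 5, 6, 5, 5], [7, 8, 6, 7, 7], [9, 10, 9, 8, 9]]
F_CRIT = 3.89  # F(2, 12) at alpha = 0.05, as quoted on the slide

result = one_way_anova(groups)
significant = result["F"] > F_CRIT
print(result, "reject H0" if significant else "fail to reject H0")
```

The degrees of freedom come out as 2 and 12 for any data of this shape, matching the F(2, 12) lookup.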
43 Two-way ANOVA Just like one-way ANOVA, except it subdivides the treatment SS into: Treatment 1; Treatment 2; Interaction between 1 & 2
44 Two-way ANOVA Suppose there are two levels of treatment 1 and two levels of treatment 2, giving 4 groups (cells), and there are 10 observations in each group: Treatment 1 (2 levels, so df = 1); Treatment 2 (2 levels, so df = 1); Treatment 1 x Treatment 2 interaction (1 df x 1 df = 1 df); Error? df = k(n - 1) = 4(10 - 1) = 36, where k = number of groups and n = observations per group
45 Two-way ANOVA table
Source                       df   SS          MS          F
Treatment 1                  1    SS(T1)      MS(T1)      MS(T1) / MSE
Treatment 2                  1    SS(T2)      MS(T2)      MS(T2) / MSE
Treatment 1 x Treatment 2    1    SS(T1xT2)   MS(T1xT2)   MS(T1xT2) / MSE
Error (within groups)        36   SSE         MSE
Total                        39   SST
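The degrees-of-freedom bookkeeping for a balanced two-way design can be written out as a small helper. This is only a sketch of the arithmetic for designs with the same number of observations in every cell; the function name `two_way_df` and its parameters are assumptions made for this illustration.

```python
def two_way_df(levels_1, levels_2, n_per_cell):
    """Degrees of freedom for a balanced two-way ANOVA."""
    cells = levels_1 * levels_2
    df = {
        "treatment_1": levels_1 - 1,
        "treatment_2": levels_2 - 1,
        "interaction": (levels_1 - 1) * (levels_2 - 1),
        "error": cells * (n_per_cell - 1),   # k(n - 1) with k = number of cells
        "total": cells * n_per_cell - 1,     # all observations minus one
    }
    return df

# The 2 x 2 design with 10 observations per cell.
df = two_way_df(2, 2, 10)
print(df)
```

A useful check is that the treatment, interaction, and error df add up to the total df.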
47 Interactions A combination of treatments gives a non-additive effect. In a plot of group means, anything not parallel!
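For a 2 x 2 design, "not parallel" can be expressed as a single number: the effect of treatment 2 at one level of treatment 1 minus its effect at the other level. The sketch below works on cell means only and ignores sampling error, so it illustrates the idea rather than testing it; the cell means and the name `interaction_contrast` are hypothetical.

```python
def interaction_contrast(cell_means):
    """Difference between the treatment-2 effects at the two levels of treatment 1.

    cell_means[i][j] is the mean for level i of treatment 1 and
    level j of treatment 2 in a 2 x 2 design. Zero means parallel lines.
    """
    (a1b1, a1b2), (a2b1, a2b2) = cell_means
    return (a1b2 - a1b1) - (a2b2 - a2b1)

# Hypothetical cell means, invented for illustration.
additive = [[10, 14], [12, 16]]      # treatment 2 adds 4 at both levels: parallel
non_additive = [[10, 14], [12, 22]]  # treatment 2 adds 4 vs. 10: not parallel

print(interaction_contrast(additive), interaction_contrast(non_additive))
```

Whether a non-zero contrast is statistically reliable is what the interaction F-test in the two-way ANOVA assesses.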
48 How to report Independent t-test: (Example) There was no overall difference in performance on control RAT items between younger and older adults, Ms = 0.39 and 0.32, respectively, t(18) = 1.34, p > .05.
49 How to report ANOVA (or F-test): (Example) Reading time (in seconds) on the control story was compared to the mean reading time for the four stories with distraction using a 2 (Age: young and old) X 2 (Story Type: without and with distraction) ANOVA with age as a between-subject variable and story type as a within-subject variable. Older adults were slower overall than younger adults, M = and 51.33, respectively, F(1, 18) = 18.94, p < .01; the stories with distraction took longer to read than the stories without distraction, M = and 37.95, respectively, F(1, 18) = , p < .01; and, in replication of the earlier work, the slowdown between the stories with and without distraction was greater for older than for younger adults, F(1, 18) = 7.43, p < .05.