What are Data? The Research Question (Randomised Controlled Trials (RCTs)) The Research Question (Non RCTs)


What are Data?
Quantitative data
o Sets of measurements or objective descriptions of physical and behavioural events; susceptible to statistical analysis
Qualitative data
o Descriptive: views, actions and activities, non-verbal behaviour and interactions; susceptible to interpretation

The Research Question (Randomised Controlled Trials (RCTs))
P = Population: Who is the question about?
I = Intervention: What is happening / being done to P?
C = Comparison: What could be done instead of I?
O = Outcome(s): What happens to P as a result of I?

The Research Question (Non RCTs)
P = Population: Who is the question about?
I = Intervention: The group with the disease / characteristic of interest
C = Comparison: The group without the disease / characteristic of interest
O = Outcome(s): The variable we are measuring for both the I and C groups

Descriptive statistics
o Data and methods that say something about a complete population
Inferential statistics
o Data and methods that say something, which is probably true, about a larger population

What are we measuring?
We need to know what we are measuring and how it is being measured. How we measure the variables will influence the types of analysis we can carry out on our data.

2 main types of variable
o Categorical: categories, e.g. age ranges, gender, cat/dog
o Metric: actual values, not grouped, e.g. weight, time

Levels of measurement
4 types of scale for measuring variables:
o Nominal: categories and lists, e.g. dog, cat, mouse; yes, no
o Ordinal: ordered or ranked positions, not true numbers, e.g. educational achievements, income bandings
o Continuous: values can lie anywhere within the possible range and are true numbers, e.g. height, which can be any point on a scale
o Discrete: whole numbers that arise from counting things, e.g. number of decayed or missing teeth

Identifying data type
Can the data be put in order?
o No: Nominal
o Yes: Do the data have units (inc. numbers of things)?
   o No: Ordinal
   o Yes: Metric. Do the data come from measuring or counting things?
      o Measuring: Continuous
      o Counting: Discrete

How do you describe Data? The role of summary statistics
Central tendency: the typical values in a set of scores
Example scores: 1 1 2 2 2 3 4 4 5 5 5 5 6
o Mode: the most frequently occurring category or score (here 5)
o Median: the mid-point in a set of scores (here 4)
o Mean: the average score = sum of X (scores) / N (number of scores) = 45 / 13 = 3.5 (to one decimal place)

Summarising Data
Percentage
The frequency of people with a given characteristic expressed as a number out of 100. E.g. if 52 people out of every 100 studied had blue eyes, this can also be expressed as 52%. A percentage can also be defined as a rate per 100.
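The mode, median and mean of the example scores above can be checked with a few lines of Python. This is only a minimal sketch using the standard-library `statistics` module; the variable names are illustrative and not part of the original slides.

```python
from statistics import mean, median, mode

# The example scores from the slide, already in ascending order.
scores = [1, 1, 2, 2, 2, 3, 4, 4, 5, 5, 5, 5, 6]

score_mode = mode(scores)      # most frequently occurring score
score_median = median(scores)  # middle value of the sorted scores
score_mean = mean(scores)      # sum of scores / number of scores

print("mode:", score_mode)
print("median:", score_median)
print("mean (1 d.p.):", round(score_mean, 1))
```

The mean is rounded only for display; the unrounded value is 45 / 13.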

Rate
The frequency of people with a given characteristic expressed as a number out of a total population (usually a multiple of 100). E.g. the rate of people with blue eyes can be expressed as 52 per 100, 520 per 1000, or 5200 per 10,000.

What can we do with our data?
Prevalence
Defined as the proportion of individuals with a particular disease:
P = total number of cases at a given time / total population at that time
Prevalence is measured at a particular point in time, and as such may be referred to as a point prevalence.

Incidence
Defined as the proportion of new cases in a population previously without disease in a specified period:
I = number of new cases in a period of time / population at risk
N.B. the time period involved must always be specified when presenting incidence rates. This is also referred to as the cumulative incidence.

Which summary statistic should I use?
o Nominal: Mode / percentage
o Ordinal: Median
o Metric: Normal distribution?
   o Yes: Mean
   o No: Median

The Normal Distribution
A lot of biological measures are approximately normally distributed.
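The rate, prevalence and incidence definitions above are simple ratios, and a short Python sketch may make them concrete. The function names and the disease counts below are hypothetical, chosen only for illustration; only the blue-eyes figures come from the slides.

```python
def rate_per(count, population, per=100):
    """Frequency of a characteristic expressed per `per` people."""
    return count * per / population


def prevalence(cases_at_time, population_at_time):
    """Point prevalence: proportion of the population with the disease
    at a given time."""
    return cases_at_time / population_at_time


def cumulative_incidence(new_cases, population_at_risk):
    """Proportion of a previously disease-free population that becomes
    a new case within a specified period."""
    return new_cases / population_at_risk


# Blue-eyes example from the slides: 52 out of every 100 people.
print(rate_per(52, 100, per=100))     # 52.0 per 100
print(rate_per(52, 100, per=1000))    # 520.0 per 1000
print(rate_per(52, 100, per=10_000))  # 5200.0 per 10,000

# Hypothetical figures: 30 existing cases in a town of 1200 people,
# then 12 new cases over one year among the 1170 people at risk.
print(prevalence(30, 1200))
print(cumulative_incidence(12, 1170))
```

Note that the incidence denominator is the population at risk (those without the disease at the start), not the whole population, and the period (here one year) has to be stated alongside the result.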

In a distribution of values that looks like this when plotted (a symmetrical, bell-shaped curve), the mean, median and mode are the same.

Negative and Positive skews
o Negative = mean is less than median
o Positive = mean is greater than median

How do you know if the data is normally distributed?
To test for this you can either:
o Plot a frequency diagram
o See if mean = median = mode
o Check whether the standard deviation fits twice into the mean: if it does not, a measure that cannot be negative is very unlikely to be normally distributed (this is a good tip when looking at research papers)

Measures of Dispersion
[Figure: data in ascending order from minimum to maximum, divided by the quartiles Q1, Q2 and Q3 into four parts of 25% each; the inter-quartile range (IQR) runs from Q1 to Q3]
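The two quick checks above (comparing mean with median, and seeing whether the SD fits twice into the mean) can be sketched in Python as follows. The function names and both data sets are hypothetical, and the sample standard deviation (`statistics.stdev`) is an assumption, since the slides do not say which form to use. Passing these checks does not prove normality; failing them is only a warning sign.

```python
from statistics import mean, median, stdev


def skew_direction(values):
    """Compare mean and median to suggest the direction of skew."""
    m, med = mean(values), median(values)
    if m < med:
        return "negative"
    if m > med:
        return "positive"
    return "none"


def sd_fits_twice_into_mean(values):
    """Rule of thumb for measures that cannot be negative:
    is 2 x SD no larger than the mean?"""
    return 2 * stdev(values) <= mean(values)


# Hypothetical waiting times (minutes) with one very long wait.
waiting_times = [1, 1, 2, 2, 3, 3, 4, 5, 9, 30]
# Hypothetical heights (cm), symmetrical about 170.
heights = [160, 165, 170, 175, 180]

for name, data in [("waiting times", waiting_times), ("heights", heights)]:
    print(name, skew_direction(data), sd_fits_twice_into_mean(data))
```

For the waiting times the mean is pulled above the median by the single long wait, so the sketch reports a positive skew and the SD rule fails; the heights pass both checks.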

Examples of median and IQR
2 sets of data:
13, 2, 16, 1, 17
13, 14, 13, 14, 13
First sort them into numerical order:
1, 2, 13, 16, 17
13, 13, 13, 14, 14
- The median is the middle value, so for both it is 13.
- We can calculate the position of the lower quartile by (n+1) x 0.25 (n = number of values). The upper quartile is at position (n+1) x 0.75. With n = 5 these positions are 1.5 and 4.5, so each quartile lies midway between two neighbouring values.
  1, 2, 13, 16, 17: LQ = 1.5, UQ = 16.5
  13, 13, 13, 14, 14: LQ = 13, UQ = 14
  This shows that although they have the same median, they have very different ranges.
- If both the median and IQR are presented we can see that the data are different.
- The values are more dispersed in the 1st data set.

Standard Deviation
The standard deviation is very useful, as statisticians have calculated that about 68% of a normally distributed population will lie within 1 standard deviation (SD) of the mean, approximately 95% within 2 SD and approximately 99.7% within 3 SD. However, these figures are usually tabulated for a distribution with a mean of 0 and an SD of 1. The obvious problem is that we rarely collect data with a mean of 0 and an SD of 1; often the data we collect only has positive values. For example, the mean assessment score in a class may be 55 with an SD of 12, with nobody achieving a mark less than 0. What statistics programmes such as SPSS do is convert the data so that it has a mean of 0 and an SD of 1, generating Z scores, i.e. normalised (standardised) scores.

The standard deviation:
o Is the measure of dispersion used when the mean is the measure of central tendency
o Is based on all the individual scores
o Describes how individual scores typically vary from the mean
o The larger the SD, the more spread out the scores are about the mean
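The quartile positions and the Z score conversion described above can be sketched in Python. The function names are hypothetical; the interpolation between neighbouring values is an assumption that happens to reproduce the quartiles in the worked example, and statistics packages may use other quartile conventions. The marks fed to `z_score` are illustrative, while the mean of 55 and SD of 12 come from the class example.

```python
def quartile(values, fraction):
    """Value at position (n + 1) * fraction in the sorted data (1-based),
    interpolating between neighbours when the position is not whole."""
    ordered = sorted(values)
    n = len(ordered)
    # Keep the position inside the data for very small samples.
    position = min(max((n + 1) * fraction, 1), n)
    lower = int(position)       # 1-based index at or below the position
    weight = position - lower   # how far towards the next value
    if lower >= n:
        return ordered[-1]
    return ordered[lower - 1] + weight * (ordered[lower] - ordered[lower - 1])


def z_score(score, mean, sd):
    """Number of standard deviations a score lies from the mean."""
    return (score - mean) / sd


first = [13, 2, 16, 1, 17]
second = [13, 14, 13, 14, 13]

for data in (first, second):
    lq, med, uq = (quartile(data, f) for f in (0.25, 0.5, 0.75))
    print(sorted(data), "LQ:", lq, "median:", med, "UQ:", uq, "IQR:", uq - lq)

# Class example: mean mark 55, SD 12.
for mark in (43, 55, 79):
    print(mark, "->", z_score(mark, 55, 12))
```

Both data sets share a median of 13, but the inter-quartile range is 15 for the first and 1 for the second, which is the difference in spread the example is pointing at.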