# Basic Biostatistics for Clinical Research. Ramses F Sadek, PhD GRU Cancer Center

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Basic Biostatistics for Clinical Research Ramses F Sadek, PhD GRU Cancer Center 1

2 1. Basic Concepts 2. Data & Their Presentation Part One 2

3 1. Basic Concepts Statistics Biostatistics Populations and samples Statistics and parameters Statistical inferences variables Random Variables Simple random sample 3

4 Statistics Statistics is a field of study concerned with 1- collection, organization, summarization and analysis of data. 2- drawing of inferences about a population when only a part of the data is observed. Statisticians try to interpret and communicate the results to others. 4

5 Biostatistics Biostatistics can be defined as the application of the mathematical tools used in statistics to the fields of biological sciences and medicine. Biostatistics is a growing field with applications in many areas of biology including epidemiology, medical sciences, health sciences, educational research and environmental sciences. 5

6 Variables A variable is an object, characteristic or property that can have different values in different places, persons, or things. A quantitative variable can be measured in some way. Examples: Heart rate, heights, weight, age, size of tumor, volume of a dose. A qualitative (categorical) variable is characterized by its inability to be measured but it can be sorted into categories. Examples: gender, race, drug name, disease status. 6

7 Populations and Samples A population is the collection or set of all of the values that a variable may have. A sample is a part of a population. We use the data from the sample to make inference about the population The sample mean is not true mean but might be very close. Closeness depends on sample size. Population of interest sample 7

8 Sampling Approaches-1 Convenience Sampling: select the most accessible and available subjects in target population. Inexpensive, less time consuming, but sample is nearly always non-representative of target population. Random Sampling (Simple): select subjects at random from the target population. Need to identify all in target population first. Provides representative sample frequently. 8

9 Sampling Approaches-2 Systematic Sampling: Identify all in target population, and select every x th person as a subject. Stratified Sampling: Identify important subgroups in your target population. Sample from these groups randomly or by convenience. Ensures that important sub-groups are included in sample. May not be representative. More complex sampling 9

10 Sampling Error The discrepancy between the true population parameter and the sample statistic Sampling error likely exists in most studies, but can be reduced by using larger sample sizes Sampling error approximates 1 / n Note that larger sample sizes also require time and expense to obtain, and that large sample sizes do not eliminate sampling error 10

11 Parameters vs. Statistics A parameter is a population characteristic A statistic is a sample characteristic Example: we estimate the sample mean to tell us about the true population mean the sample mean is a statistic the population mean is a parameter 11

12 Descriptive & Inferential Statistics Descriptive Statistics deal with the enumeration, organization and graphical representation of data from a sample Inferential Statistics deal with reaching conclusions from incomplete information, that is, generalizing from the specific sample Inferential statistics use available information in a sample to draw inferences about the population from which the sample was selected 12

13 Random Variables A random variable is one that cannot be predicted in advance because it arises by chance. Observations or measurements are used to obtain the value of a random variable. A discrete random variable has gaps or interruptions in the values that it can have. The values may be whole numbers or have spaces between them. A continuous random variable does not have gaps in the values it can assume. Its properties are like the real numbers. 13

14 2- Data and Their Presentation Data Data sources Records Surveys Experiments Types of data Categorical variables Frequency tables Numerical variables Categorization Bar charts Histograms Box plots Bar charts by another variable Histogram by another variable Box plots by another variable Scatter plots 14

15 Data The raw material of Statistics is data. We may define data as figures. Figures result from the process of counting or from taking a measurement. Example: - When a hospital administrator counts the number of patients (counting). - When a nurse weighs a patient (measurement) 15

16 Sources of Data Data are obtained from Records Surveys Experiments 16

17 Data Sources: Records, Reports and Other Sources Look for data to serve as the raw material for our investigation. 1- Routinely kept records. - Hospital medical records contain immense amounts of information on patients. - Hospital accounting records contain a wealth of data on the facility s business activities. 2- External sources. The data needed to answer a question may already exist in the form of published reports, commercially available data banks, or the research literature, i.e. someone else 17 has already asked the same question.

18 Data Sources: Surveys Survey may be necessary if the data needed is about answering certain questions. Example: If the administrator of a clinic wishes to obtain information regarding the mode of transportation used by patients to visit the clinic, then a survey may be conducted among patients to obtain this information 18

19 Data Sources: Experiments Frequently the data needed to answer a question are available only as the result of an experiment. For example: If a nurse wishes to know which of several strategies is best for maximizing patient compliance, she might conduct an experiment in which the different strategies of motivating compliance are tried with different patients. Clinical trials is the most obvious example. 19

20 Types of Data Data are made up of a set of variables: Categorical variable Numerical variables 20

21 Categorical Variables Any variable that is not numerical (values have no numerical meaning) (e.g. gender, race, drug, disease status) Nominal variables The data are unordered (e.g. RACE: 1=Caucasian, 2=Asian American, 3=African American, 4=others) A subset of these variables are Binary or dichotomous variables: have only two categories (e.g. GENDER: 1=male, 2=female) Ordinal variables The data are ordered (e.g. AGE: 1=10-19 years, 2=20-29 years, 3=30-39 years; likelihood of participating in a vaccine trial). Income: Low, medium, high. 21

22 FrequencyTables Categorical variables are summarized by Frequency counts how many are in each category Relative frequency or percent (a number from 0 to 100) Or proportion (a number from 0 to 1) Gender of new HIV clinic patients, , Mbarara, Uganda. n (%) Male 415 (39) Female 645 (61) Total 1060 (100) 22

23 Numerical Variables (Quantitative) Naturally measured as numbers for which meaningful arithmetic operations make sense (e.g. height, weight, age, salary, viral load, CD4 cell counts) Discrete variables: can be counted (e.g. number of children in household: 0, 1, 2, 3, etc.) Continuous variables: can take any value within a given range (e.g. weight: g, g) 23

24 Manipulation of Variables Continuous variables can be discretized E.g., age can be rounded to whole numbers Continuous or discrete variables can be categorized E.g., age categories Categorical variables can be re-categorized E.g., lumping from 5 categories down to 2 24

25 Categorization Continuous variables can categorized in meaningful ways Choice of cut-off points Even intervals (5 year age intervals) Meaningful cut-points related to a health outcome or decision Meaningful CD4 count (below 200, -350, -500, 500+) Equal percentage of the data falling into each category (quartiles, centiles,..) 25

26 Organizing Data and Presentation Some of common methods: Frequency Table Frequency Histogram Relative Frequency Histogram Frequency polygon Relative Frequency polygon Bar chart Pie chart Box plot Scatter plots. 26

27 Frequency Tables CD4 cell counts (mm 3 ) of newly diagnosed HIV positives at Mulago Hospital, Kampala (N=268) n (%) < (14.9) (26.9) (21.6) > (36.6) 27

28 Bar Charts General graph for categorical variables Graphical equivalent of a frequency table The x-axis does not have to be numerical Alcohol consumption in Mulago Hospital patients enrolling in VCT study, n= Proportion Never >1 year ago Within the past year 28

29 Histograms Bar chart for numerical data The number of bins and the bin width will make a difference in the appearance of this plot and may affect interpretation CD4 among new HIV positives at Mulago CD4 cell count 29

30 Histograms This histogram has less detail but gives us the % of persons with CD4 <350 cells/mm 3 CD4 among new HIV positives at Mulago CD4 cell count 30

31 What Does This Graph Tell Us?.25 Days drank alcohol among current drinkers.05 Relative freq Days 31

32 Middle line=median (50 th percentile) Middle box=25 th to 75 th percentiles (interquartile range) Bottom whisker: Data point at or above 25 th percentile 1.5*IQR Days drank alcohol Top whisker: Data point at or below 75 th percentile + 1.5*IQR Box Plots

33 Box Plots 1,000 1,500 CD4 count among new HIV positives at Mulago cd4count 33

34 Box Plots By Another Variable We can divide up our graphs by another variable What type of variable is gender? male female 0 Days drank alcohol Graphs by a1. sex 34

35 Relative freq Histograms By Another Variable male female Days consumed alcohol of prior 30 35

36 Frequency Polygon Use to identify the distribution of your data Female Male Frequency Age in years 36

37 Scatter Plots CD4 cell count CD4 cell count versus age a4. how old are you? 37

38 Part Two Numerical Variable Summaries and Measures 38

39 Measures of Central Tendency and Dispersion Where is the center of the data? Median Mean Mode How variable the data are? Range, Interquartile range Variance, Standard Deviation, Standard Error Coefficient of variation 39

40 Measures of Central Tendency: Median Median the 50 th percentile = the middle value If n is odd: the median is the (n+1)/2 observations (e.g. if n=31 then median is the 16 th highest observation) If n is even: the median is the average of the two middle observations (e.g. if n=30 then the median is the average of the 15 th and16th observation Example; Median Progression Free Survival in a cancer Clinical trial sample= 195 days. 40

41 Measures of Central Tendency: Mode Mode the value (or range of values) that occurs most frequently Sometimes there is more than one mode, e.g. a bi-modal distribution (both modes do not have to be the same height) The mode only makes sense when the values are discrete, rounded off, or binned f Grades 41

42 Measures of Central Tendency: Mean Mean : x 1 n i 1 n Mean arithmetic average Means are sensitive to very large or small values (outliers), while the median is not. Mean CD4 cell count: Mean age: 32.5 When data is highly skewed, the median is preferred. Both measures are close when data are not skewed. xi 42

43 Interpreting the Formula is the symbol for the sum of the elements immediately to the right of the symbol These elements are indexed (i.e. subscripted) with the letter i The index letter could be any letter, though i is commonly used) 1 n Mean : x i x 1 i n The elements are lined up in a list, and the first one in the list is denoted as x 1, the second one is x 2, the third one is x 3 and the last one is x n. n is the number of elements in the list. n x i i x x x n... 43

44 Measures of Variation: Range, IQR 1. Range Minimum to maximum or difference (e.g. age range or range=43) CD4 cell count range: (0-1368) 2. Interquartile range (IQR) 25 th and 75 th percentiles (e.g. IQR for age: 23-36) or difference (e.g. 13) Less sensitive to extreme values CD4 cell count IQR: (92-422) 44

45 Measures of Variation: Variance SD, SE Sample variance Amount of spread around the mean, calculated in a sample by s 2 n i 1 ( x i x) n 1 2 Sample standard deviation (SD)or just s is the square root of the variance s The standard deviation has the same units as the mean Standard error (SE) = S / s of CD4 cell count = s of Age = 11.2 N n i 1 ( x i x) n

46 Mean and Standard deviation Mean = 7 SD= Mean = 7 SD= Mean = 7 SD=

47 Measures of Variation: CV Coefficient of variation For the same relative spread around a mean, the variance will be larger for a larger mean CV *100% Can use to compare variability across measurements that are on a different scale (e.g. IQ and head circumference) s x CV for CD4 cell count: 86.0% CV for age: 34.5% 47

48 Empirical Rule For a Normal distribution approximately, a) 68% of the measurements fall within one standard deviation around the mean b) 95% of the measurements fall within two standard deviations around the mean c) 99.7% of the measurements fall within three standard deviations around the mean 48

49 Suppose the reaction time of a particular drug has a Normal distribution with a mean of 10 minutes and a standard deviation of 2 minutes Approximately, a) 68% of the subjects taking the drug will have reaction time between 8 and 12 minutes b) 95% of the subjects taking the drug will have reaction tome between 6 and 14 minutes c) 99.7% of the subjects taking the drug will have reaction tome between 4 and 16 minutes 49

50 Part Three Biostatistics in Clinical Research Role of statistics Statistician role In research In development Study design Controlled designs Observational studies Cohort Case-control 50

51 Role of Statistics In general, statistics is a collection of techniques for extracting information from data, and for ensuring that the data collected contains the desired information. Statistics provides an objective basis for making decisions in the presence of uncertainty. 51

52 Statistician s Role Study design Sample size and power calculation Protocol development Data analysis Analysis interpretations Reporting Publications 52

53 Statistician s Role: In Research Collaboration Design clinical development program Design and analyze clinical trials Manage the report production process Interpret results Communicate results 53

54 Statistician s Role: In Drug Development Develop final statistical statements on the efficacy and safety for marketing purposes Develop publications, package inserts, etc. New indications, line extensions, etc. 54

55 Study Design How do we set up the study to answer the scientific question(s)? Two Main categories: Controlled designs: The experimenter has control exposure, treatments, Randomized Clinical trials. Observational Studies: Cohort studies Case-Control studies 55

56 Controlled Designs Not necessarily randomized e.g. In Cancer research Phase I: dose finding Phase II: 1 or 2-arm preliminary efficacy Phase III: Randomized, efficacy Choice of gold standard, balancing of treatment arms to controlling bias. 56

57 Observational Studies: Cohort P r o c e s s : Identify a cohort Measure exposure Follow for a long time See who gets disease Analyze to see if disease is associated with exposure P r o s Measurement is not biased and usually measured precisely Can estimate prevalence and associations, and relative risks Cons Very expensive Extremely expensive if outcome of interest is rare 57 Sometimes we don t know all of the exposures to measure

58 Observational Studies: Case-Control Process: Identify a set of patients with disease, and corresponding set of controls without disease Find out retrospectively about exposure Analyze data to see if associations exist Pros Relatively inexpensive Takes a short time Works well even for rare disease C o n s Measurement is often biased and imprecise ( recall bias ) Cannot estimate prevalence due to sampling 58

59 Factors Affecting Observation Studies Confounders a confounding variable (factor)is an extraneous variable in a statistical model that correlates (directly or inversely) with both the dependent variable and the independent variable Biases Self-selection Bias self-selection bias arises when individuals select themselves into a group, causing a biased sample with nonprobability sampling. It is commonly used to describe situations where the characteristics of the people which cause them to select themselves in the group create abnormal or undesirable conditions in the group. Recall bias (next slide) Survival bias Etc. 59

60 Recall Bias In Observational Studies Recall bias is a systematic error caused by differences in the accuracy or completeness of the recollections retrieved ("recalled") by study participants regarding events or experiences from the past Sometimes also referred to as response bias, responder bias or reporting bias. This type of measurement bias can be a methodological issue in research that involves interviews or questionnaires (potentially leading to differential misclassification of various types of exposure). Recall bias can be a particular concern in retrospective studies that use a case-control design to investigate the etiological causes of a disease or psychiatric condition. For example, in studies of risk factors for breast cancer, women who have had the disease may search their memories more thoroughly than unaffected controls to try to recall exposure to factors that have been mentioned in the press, such as use of oral contraceptives. 60

61 Survival Bias In Observational Studies One analytic bias has persistently appeared in many reported analyses of observational studies of therapy for HIV infection. Longer survival may increase a patient's chance to use treatment in an observational study. Patients who die sooner have less time to select treatment and thus, by default, are more likely to remain untreated. If this is not considered in the data analysis, the estimate of the treatment effect will be biased. 61

62 Part Four Statistical Inference Statistical Estimation Point estimation Confidence Intervals Testing of Hypotheses 62

63 Statistical Estimation 1. Point estimation Estimating a population parameter by one value: Sample mean, sample SD, Sample Median It is a sample dependent 2. Confidence Interval CI for a parameter is a random interval constructed from data in such a way that the probability that the interval contains the true value of the parameter can be specified before the data are collected. 3. Estimator. An estimator is a rule for "guessing" the value of a population parameter based on a random sample from the population. An estimator is a random variable, because its value depends on which particular sample is obtained, which is random. A canonical example of an estimator is the sample mean, which is an estimator of the population mean. 63

64 Confidence level The confidence level of a confidence interval is the chance that the interval that will result once data are collected will contain the corresponding parameter. If one computes confidence intervals again and again from independent data, the long-term limit of the fraction of intervals that contain the parameter is the confidence level. 95% Confidence interval (95% CI),

65 Hypothesis testing Forming of Hypothesis Null Hypothesis (H 0 ) Alternative Hypothesis (H 1 ) or (H a ) A null hypothesis (typically that there is no effect) is compared with an alternative hypothesis (typically that there is an effect, or that there is an effect of a particular sign). For example, in evaluating whether a new cancer treatment works, the null hypothesis typically would be that the remedy does not work, while the alternative hypothesis would be that the remedy does work..

66 Hypothesis testing When the data are sufficiently improbable under the assumption that the null hypothesis is true, the null hypothesis is rejected in favor of the alternative hypothesis. This does not imply that the data are probable under the assumption that the alternative hypothesis is true, nor that the null hypothesis is false, nor that the alternative hypothesis is true. Statistical hypothesis testing is making a decision between rejecting or not rejecting a null hypothesis, on the basis of a set of observations 66

67 Possible decisions and errors Accept Ho Reject Ho H0 is True α H0 is false β

68 Important Statistical terminology Alpha (α) : Probability of Type-I error Beta (β) : Probability of Type-II error=1-power Power: The power of a test against a specific alternative hypothesis is the chance that the test correctly rejects the null hypothesis when the alternative hypothesis is true. P (P-value): The P value of the null hypothesis given the data is the smallest significance level p for which any of the tests would have rejected the null hypothesis.. d : Confidence interval half width Delta (Δ): difference

69 QUESTIONS??? Contact Me: Ramses Sadek 69

### F. Farrokhyar, MPhil, PhD, PDoc

Learning objectives Descriptive Statistics F. Farrokhyar, MPhil, PhD, PDoc To recognize different types of variables To learn how to appropriately explore your data How to display data using graphs How

### Research Variables. Measurement. Scales of Measurement. Chapter 4: Data & the Nature of Measurement

Chapter 4: Data & the Nature of Graziano, Raulin. Research Methods, a Process of Inquiry Presented by Dustin Adams Research Variables Variable Any characteristic that can take more than one form or value.

### STATS8: Introduction to Biostatistics. Data Exploration. Babak Shahbaba Department of Statistics, UCI

STATS8: Introduction to Biostatistics Data Exploration Babak Shahbaba Department of Statistics, UCI Introduction After clearly defining the scientific problem, selecting a set of representative members

### Outline of Topics. Statistical Methods I. Types of Data. Descriptive Statistics

Statistical Methods I Tamekia L. Jones, Ph.D. (tjones@cog.ufl.edu) Research Assistant Professor Children s Oncology Group Statistics & Data Center Department of Biostatistics Colleges of Medicine and Public

### II. DISTRIBUTIONS distribution normal distribution. standard scores

Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

### Chapter 3: Data Description Numerical Methods

Chapter 3: Data Description Numerical Methods Learning Objectives Upon successful completion of Chapter 3, you will be able to: Summarize data using measures of central tendency, such as the mean, median,

### Exploratory data analysis (Chapter 2) Fall 2011

Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,

### Dr. Peter Tröger Hasso Plattner Institute, University of Potsdam. Software Profiling Seminar, Statistics 101

Dr. Peter Tröger Hasso Plattner Institute, University of Potsdam Software Profiling Seminar, 2013 Statistics 101 Descriptive Statistics Population Object Object Object Sample numerical description Object

### 4. Introduction to Statistics

Statistics for Engineers 4-1 4. Introduction to Statistics Descriptive Statistics Types of data A variate or random variable is a quantity or attribute whose value may vary from one unit of investigation

### Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

Statistics I for QBIC Text Book: Biostatistics, 10 th edition, by Daniel & Cross Contents and Objectives Chapters 1 7 Revised: August 2013 Chapter 1: Nature of Statistics (sections 1.1-1.6) Objectives

### Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts

### Northumberland Knowledge

Northumberland Knowledge Know Guide How to Analyse Data - November 2012 - This page has been left blank 2 About this guide The Know Guides are a suite of documents that provide useful information about

### MBA 611 STATISTICS AND QUANTITATIVE METHODS

MBA 611 STATISTICS AND QUANTITATIVE METHODS Part I. Review of Basic Statistics (Chapters 1-11) A. Introduction (Chapter 1) Uncertainty: Decisions are often based on incomplete information from uncertain

### Variables and Data A variable contains data about anything we measure. For example; age or gender of the participants or their score on a test.

The Analysis of Research Data The design of any project will determine what sort of statistical tests you should perform on your data and how successful the data analysis will be. For example if you decide

### EBM Cheat Sheet- Measurements Card

EBM Cheat Sheet- Measurements Card Basic terms: Prevalence = Number of existing cases of disease at a point in time / Total population. Notes: Numerator includes old and new cases Prevalence is cross-sectional

### Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics

Descriptive statistics is the discipline of quantitatively describing the main features of a collection of data. Descriptive statistics are distinguished from inferential statistics (or inductive statistics),

### Central Tendency. n Measures of Central Tendency: n Mean. n Median. n Mode

Central Tendency Central Tendency n A single summary score that best describes the central location of an entire distribution of scores. n Measures of Central Tendency: n Mean n The sum of all scores divided

### Week 1. Exploratory Data Analysis

Week 1 Exploratory Data Analysis Practicalities This course ST903 has students from both the MSc in Financial Mathematics and the MSc in Statistics. Two lectures and one seminar/tutorial per week. Exam

### Biostatistics: Types of Data Analysis

Biostatistics: Types of Data Analysis Theresa A Scott, MS Vanderbilt University Department of Biostatistics theresa.scott@vanderbilt.edu http://biostat.mc.vanderbilt.edu/theresascott Theresa A Scott, MS

### Descriptive Statistics

Descriptive Statistics Primer Descriptive statistics Central tendency Variation Relative position Relationships Calculating descriptive statistics Descriptive Statistics Purpose to describe or summarize

### Inferential Statistics

Inferential Statistics Sampling and the normal distribution Z-scores Confidence levels and intervals Hypothesis testing Commonly used statistical methods Inferential Statistics Descriptive statistics are

### Mathematics. Probability and Statistics Curriculum Guide. Revised 2010

Mathematics Probability and Statistics Curriculum Guide Revised 2010 This page is intentionally left blank. Introduction The Mathematics Curriculum Guide serves as a guide for teachers when planning instruction

### STATISTICS 8 CHAPTERS 1 TO 6, SAMPLE MULTIPLE CHOICE QUESTIONS

STATISTICS 8 CHAPTERS 1 TO 6, SAMPLE MULTIPLE CHOICE QUESTIONS Correct answers are in bold italics.. This scenario applies to Questions 1 and 2: A study was done to compare the lung capacity of coal miners

### The right edge of the box is the third quartile, Q 3, which is the median of the data values above the median. Maximum Median

CONDENSED LESSON 2.1 Box Plots In this lesson you will create and interpret box plots for sets of data use the interquartile range (IQR) to identify potential outliers and graph them on a modified box

### Introduction to Statistics and Quantitative Research Methods

Introduction to Statistics and Quantitative Research Methods Purpose of Presentation To aid in the understanding of basic statistics, including terminology, common terms, and common statistical methods.

### Lecture 1: Review and Exploratory Data Analysis (EDA)

Lecture 1: Review and Exploratory Data Analysis (EDA) Sandy Eckel seckel@jhsph.edu Department of Biostatistics, The Johns Hopkins University, Baltimore USA 21 April 2008 1 / 40 Course Information I Course

### BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I

BNG 202 Biomechanics Lab Descriptive statistics and probability distributions I Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential

### Introduction to Statistics for Psychology. Quantitative Methods for Human Sciences

Introduction to Statistics for Psychology and Quantitative Methods for Human Sciences Jonathan Marchini Course Information There is website devoted to the course at http://www.stats.ox.ac.uk/ marchini/phs.html

### DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,

### STA-201-TE. 5. Measures of relationship: correlation (5%) Correlation coefficient; Pearson r; correlation and causation; proportion of common variance

Principles of Statistics STA-201-TE This TECEP is an introduction to descriptive and inferential statistics. Topics include: measures of central tendency, variability, correlation, regression, hypothesis

### The Big 50 Revision Guidelines for S1

The Big 50 Revision Guidelines for S1 If you can understand all of these you ll do very well 1. Know what is meant by a statistical model and the Modelling cycle of continuous refinement 2. Understand

### Data Analysis: Describing Data - Descriptive Statistics

WHAT IT IS Return to Table of ontents Descriptive statistics include the numbers, tables, charts, and graphs used to describe, organize, summarize, and present raw data. Descriptive statistics are most

### 2 Descriptive statistics with R

Biological data analysis, Tartu 2006/2007 1 2 Descriptive statistics with R Before starting with basic concepts of data analysis, one should be aware of different types of data and ways to organize data

### Fairfield Public Schools

Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

### Descriptive Statistics. Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion

Descriptive Statistics Purpose of descriptive statistics Frequency distributions Measures of central tendency Measures of dispersion Statistics as a Tool for LIS Research Importance of statistics in research

### Chapter 15 Multiple Choice Questions (The answers are provided after the last question.)

Chapter 15 Multiple Choice Questions (The answers are provided after the last question.) 1. What is the median of the following set of scores? 18, 6, 12, 10, 14? a. 10 b. 14 c. 18 d. 12 2. Approximately

### What are Data? The Research Question (Randomised Controlled Trials (RCTs)) The Research Question (Non RCTs)

What are Data? Quantitative Data o Sets of measurements of objective descriptions of physical and behavioural events; susceptible to statistical analysis Qualitative data o Descriptive, views, actions

### SECOND M.B. AND SECOND VETERINARY M.B. EXAMINATIONS INTRODUCTION TO THE SCIENTIFIC BASIS OF MEDICINE EXAMINATION. Friday 14 March 2008 9.00-9.

SECOND M.B. AND SECOND VETERINARY M.B. EXAMINATIONS INTRODUCTION TO THE SCIENTIFIC BASIS OF MEDICINE EXAMINATION Friday 14 March 2008 9.00-9.45 am Attempt all ten questions. For each question, choose the

### Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

### 10-3 Measures of Central Tendency and Variation

10-3 Measures of Central Tendency and Variation So far, we have discussed some graphical methods of data description. Now, we will investigate how statements of central tendency and variation can be used.

### Lecture 2: Descriptive Statistics and Exploratory Data Analysis

Lecture 2: Descriptive Statistics and Exploratory Data Analysis Further Thoughts on Experimental Design 16 Individuals (8 each from two populations) with replicates Pop 1 Pop 2 Randomly sample 4 individuals

### Chapter 3 Descriptive Statistics: Numerical Measures. Learning objectives

Chapter 3 Descriptive Statistics: Numerical Measures Slide 1 Learning objectives 1. Single variable Part I (Basic) 1.1. How to calculate and use the measures of location 1.. How to calculate and use the

### DATA INTERPRETATION AND STATISTICS

PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

### Variables. Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis involves both graphical displays of data and numerical summaries of data. A common situation is for a data set to be represented as a matrix. There is

### Course Text. Required Computing Software. Course Description. Course Objectives. StraighterLine. Business Statistics

Course Text Business Statistics Lind, Douglas A., Marchal, William A. and Samuel A. Wathen. Basic Statistics for Business and Economics, 7th edition, McGraw-Hill/Irwin, 2010, ISBN: 9780077384470 [This

### CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

### Statistics and research

Statistics and research Usaneya Perngparn Chitlada Areesantichai Drug Dependence Research Center (WHOCC for Research and Training in Drug Dependence) College of Public Health Sciences Chulolongkorn University,

### CHINHOYI UNIVERSITY OF TECHNOLOGY

CHINHOYI UNIVERSITY OF TECHNOLOGY SCHOOL OF NATURAL SCIENCES AND MATHEMATICS DEPARTMENT OF MATHEMATICS MEASURES OF CENTRAL TENDENCY AND DISPERSION INTRODUCTION From the previous unit, the Graphical displays

### Appendix G STATISTICAL METHODS INFECTIOUS METHODS STATISTICAL ROADMAP. Prepared in Support of: CDC/NCEH Cross Sectional Assessment Study.

Appendix G STATISTICAL METHODS INFECTIOUS METHODS STATISTICAL ROADMAP Prepared in Support of: CDC/NCEH Cross Sectional Assessment Study Prepared by: Centers for Disease Control and Prevention National

### Using SPSS, Chapter 2: Descriptive Statistics

1 Using SPSS, Chapter 2: Descriptive Statistics Chapters 2.1 & 2.2 Descriptive Statistics 2 Mean, Standard Deviation, Variance, Range, Minimum, Maximum 2 Mean, Median, Mode, Standard Deviation, Variance,

### Exercise 1.12 (Pg. 22-23)

Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

### Session 1.6 Measures of Central Tendency

Session 1.6 Measures of Central Tendency Measures of location (Indices of central tendency) These indices locate the center of the frequency distribution curve. The mode, median, and mean are three indices

### Numerical Summarization of Data OPRE 6301

Numerical Summarization of Data OPRE 6301 Motivation... In the previous session, we used graphical techniques to describe data. For example: While this histogram provides useful insight, other interesting

### 1.5 NUMERICAL REPRESENTATION OF DATA (Sample Statistics)

1.5 NUMERICAL REPRESENTATION OF DATA (Sample Statistics) As well as displaying data graphically we will often wish to summarise it numerically particularly if we wish to compare two or more data sets.

### Pie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple.

Graphical Representations of Data, Mean, Median and Standard Deviation In this class we will consider graphical representations of the distribution of a set of data. The goal is to identify the range of

### Elementary Statistics

Elementary Statistics Chapter 1 Dr. Ghamsary Page 1 Elementary Statistics M. Ghamsary, Ph.D. Chap 01 1 Elementary Statistics Chapter 1 Dr. Ghamsary Page 2 Statistics: Statistics is the science of collecting,

### A Correlation of. to the. South Carolina Data Analysis and Probability Standards

A Correlation of to the South Carolina Data Analysis and Probability Standards INTRODUCTION This document demonstrates how Stats in Your World 2012 meets the indicators of the South Carolina Academic Standards

### Histogram. Graphs, and measures of central tendency and spread. Alternative: density (or relative frequency ) plot /13/2004

Graphs, and measures of central tendency and spread 9.07 9/13/004 Histogram If discrete or categorical, bars don t touch. If continuous, can touch, should if there are lots of bins. Sum of bin heights

### Study Design & Methodology

UNIVERSITY of LIMERICK OLLSCOIL LUIMNIGH STATISTICAL CONSULTING UNIT Research in Health Sciences Study Design & Methodology Study Design & Methodology Dr Jean Saunders C.Stat Inc extra Declan slides Lyons

### Data Exploration Data Visualization

Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select

### Dongfeng Li. Autumn 2010

Autumn 2010 Chapter Contents Some statistics background; ; Comparing means and proportions; variance. Students should master the basic concepts, descriptive statistics measures and graphs, basic hypothesis

### Data Types. 1. Continuous 2. Discrete quantitative 3. Ordinal 4. Nominal. Figure 1

Data Types By Tanya Hoskin, a statistician in the Mayo Clinic Department of Health Sciences Research who provides consultations through the Mayo Clinic CTSA BERD Resource. Don t let the title scare you.

### Means, standard deviations and. and standard errors

CHAPTER 4 Means, standard deviations and standard errors 4.1 Introduction Change of units 4.2 Mean, median and mode Coefficient of variation 4.3 Measures of variation 4.4 Calculating the mean and standard

### Clinical Study Design and Methods Terminology

Home College of Veterinary Medicine Washington State University WSU Faculty &Staff Page Page 1 of 5 John Gay, DVM PhD DACVPM AAHP FDIU VCS Clinical Epidemiology & Evidence-Based Medicine Glossary: Clinical

### How to Conduct a Hypothesis Test

How to Conduct a Hypothesis Test The idea of hypothesis testing is relatively straightforward. In various studies we observe certain events. We must ask, is the event due to chance alone, or is there some

### Descriptive Statistics

Descriptive Statistics Suppose following data have been collected (heights of 99 five-year-old boys) 117.9 11.2 112.9 115.9 18. 14.6 17.1 117.9 111.8 16.3 111. 1.4 112.1 19.2 11. 15.4 99.4 11.1 13.3 16.9

### 430 Statistics and Financial Mathematics for Business

Prescription: 430 Statistics and Financial Mathematics for Business Elective prescription Level 4 Credit 20 Version 2 Aim Students will be able to summarise, analyse, interpret and present data, make predictions

### ! x sum of the entries

3.1 Measures of Central Tendency (Page 1 of 16) 3.1 Measures of Central Tendency Mean, Median and Mode! x sum of the entries a. mean, x = = n number of entries Example 1 Find the mean of 26, 18, 12, 31,

### Models for Discrete Variables

Probability Models for Discrete Variables Our study of probability begins much as any data analysis does: What is the distribution of the data? Histograms, boxplots, percentiles, means, standard deviations

### Describing and presenting data

Describing and presenting data All epidemiological studies involve the collection of data on the exposures and outcomes of interest. In a well planned study, the raw observations that constitute the data

### Graphical and Tabular. Summarization of Data OPRE 6301

Graphical and Tabular Summarization of Data OPRE 6301 Introduction and Re-cap... Descriptive statistics involves arranging, summarizing, and presenting a set of data in such a way that useful information

### DESCRIPTIVE STATISTICS - CHAPTERS 1 & 2 1

DESCRIPTIVE STATISTICS - CHAPTERS 1 & 2 1 OVERVIEW STATISTICS PANIK...THE THEORY AND METHODS OF COLLECTING, ORGANIZING, PRESENTING, ANALYZING, AND INTERPRETING DATA SETS SO AS TO DETERMINE THEIR ESSENTIAL

### Descriptive Statistics. Understanding Data: Categorical Variables. Descriptive Statistics. Dataset: Shellfish Contamination

Descriptive Statistics Understanding Data: Dataset: Shellfish Contamination Location Year Species Species2 Method Metals Cadmium (mg kg - ) Chromium (mg kg - ) Copper (mg kg - ) Lead (mg kg - ) Mercury

### Introduction to Stata

Introduction to Stata September 23, 2014 Stata is one of a few statistical analysis programs that social scientists use. Stata is in the mid-range of how easy it is to use. Other options include SPSS,

### Chris Slaughter, DrPH. GI Research Conference June 19, 2008

Chris Slaughter, DrPH Assistant Professor, Department of Biostatistics Vanderbilt University School of Medicine GI Research Conference June 19, 2008 Outline 1 2 3 Factors that Impact Power 4 5 6 Conclusions

### WHAT IS A JOURNAL CLUB?

WHAT IS A JOURNAL CLUB? With its September 2002 issue, the American Journal of Critical Care debuts a new feature, the AJCC Journal Club. Each issue of the journal will now feature an AJCC Journal Club

### Descriptive Statistics and Measurement Scales

Descriptive Statistics 1 Descriptive Statistics and Measurement Scales Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample

### CHAPTER THREE COMMON DESCRIPTIVE STATISTICS COMMON DESCRIPTIVE STATISTICS / 13

COMMON DESCRIPTIVE STATISTICS / 13 CHAPTER THREE COMMON DESCRIPTIVE STATISTICS The analysis of data begins with descriptive statistics such as the mean, median, mode, range, standard deviation, variance,

### Hypothesis Testing. Bluman Chapter 8

CHAPTER 8 Learning Objectives C H A P T E R E I G H T Hypothesis Testing 1 Outline 8-1 Steps in Traditional Method 8-2 z Test for a Mean 8-3 t Test for a Mean 8-4 z Test for a Proportion 8-5 2 Test for

### Foundation of Quantitative Data Analysis

Foundation of Quantitative Data Analysis Part 1: Data manipulation and descriptive statistics with SPSS/Excel HSRS #10 - October 17, 2013 Reference : A. Aczel, Complete Business Statistics. Chapters 1

### Sample Exam #1 Elementary Statistics

Sample Exam #1 Elementary Statistics Instructions. No books, notes, or calculators are allowed. 1. Some variables that were recorded while studying diets of sharks are given below. Which of the variables

### Sample Size Planning, Calculation, and Justification

Sample Size Planning, Calculation, and Justification Theresa A Scott, MS Vanderbilt University Department of Biostatistics theresa.scott@vanderbilt.edu http://biostat.mc.vanderbilt.edu/theresascott Theresa

### Introduction to Quantitative Methods

Introduction to Quantitative Methods October 15, 2009 Contents 1 Definition of Key Terms 2 2 Descriptive Statistics 3 2.1 Frequency Tables......................... 4 2.2 Measures of Central Tendencies.................

### Exploratory Data Analysis

Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction

### Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY

Biostatistics: DESCRIPTIVE STATISTICS: 2, VARIABILITY 1. Introduction Besides arriving at an appropriate expression of an average or consensus value for observations of a population, it is important to

### DESCRIPTIVE STATISTICS & DATA PRESENTATION*

Level 1 Level 2 Level 3 Level 4 0 0 0 0 evel 1 evel 2 evel 3 Level 4 DESCRIPTIVE STATISTICS & DATA PRESENTATION* Created for Psychology 41, Research Methods by Barbara Sommer, PhD Psychology Department

### Data Analysis. Lecture Empirical Model Building and Methods (Empirische Modellbildung und Methoden) SS Analysis of Experiments - Introduction

Data Analysis Lecture Empirical Model Building and Methods (Empirische Modellbildung und Methoden) Prof. Dr. Dr. h.c. Dieter Rombach Dr. Andreas Jedlitschka SS 2014 Analysis of Experiments - Introduction

### The Big Picture. Describing Data: Categorical and Quantitative Variables Population. Descriptive Statistics. Community Coalitions (n = 175)

Describing Data: Categorical and Quantitative Variables Population The Big Picture Sampling Statistical Inference Sample Exploratory Data Analysis Descriptive Statistics In order to make sense of data,

### X X X a) perfect linear correlation b) no correlation c) positive correlation (r = 1) (r = 0) (0 < r < 1)

CORRELATION AND REGRESSION / 47 CHAPTER EIGHT CORRELATION AND REGRESSION Correlation and regression are statistical methods that are commonly used in the medical literature to compare two or more variables.

### 1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number

1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression

### business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

### Content Sheet 7-1: Overview of Quality Control for Quantitative Tests

Content Sheet 7-1: Overview of Quality Control for Quantitative Tests Role in quality management system Quality Control (QC) is a component of process control, and is a major element of the quality management

### Probability and Statistics Vocabulary List (Definitions for Middle School Teachers)

Probability and Statistics Vocabulary List (Definitions for Middle School Teachers) B Bar graph a diagram representing the frequency distribution for nominal or discrete data. It consists of a sequence

### Summary of Formulas and Concepts. Descriptive Statistics (Ch. 1-4)

Summary of Formulas and Concepts Descriptive Statistics (Ch. 1-4) Definitions Population: The complete set of numerical information on a particular quantity in which an investigator is interested. We assume

### Competency 1 Describe the role of epidemiology in public health

The Northwest Center for Public Health Practice (NWCPHP) has developed competency-based epidemiology training materials for public health professionals in practice. Epidemiology is broadly accepted as

### CA200 Quantitative Analysis for Business Decisions. File name: CA200_Section_04A_StatisticsIntroduction

CA200 Quantitative Analysis for Business Decisions File name: CA200_Section_04A_StatisticsIntroduction Table of Contents 4. Introduction to Statistics... 1 4.1 Overview... 3 4.2 Discrete or continuous

### Biodiversity Data Analysis: Testing Statistical Hypotheses By Joanna Weremijewicz, Simeon Yurek, Steven Green, Ph. D. and Dana Krempels, Ph. D.

Biodiversity Data Analysis: Testing Statistical Hypotheses By Joanna Weremijewicz, Simeon Yurek, Steven Green, Ph. D. and Dana Krempels, Ph. D. In biological science, investigators often collect biological