Organizing Your Approach to a Data Analysis
- Claude Marsh
- 10 years ago
Biost/Stat 578 B: Data Analysis
Emerson, September 29, 2003
Handout #1

Organizing Your Approach to a Data Analysis

The general theme should be to maximize thinking about the data analysis and to minimize the time spent trying to interpret randomly selected data analyses. In most cases, the statistical methods used to answer the primary question can be selected prior to looking at the data. (Data driven selection of analysis methods weakens confidence in the statistical inference.) Exploratory analyses (i.e., those analyses based on models selected after looking at the data) should be clearly labeled as such.

I. Before looking at the data
   A. Identify overall goal of the study
   B. Identify specific aims and how they relate to overall goal
      1. Identify the current state of scientific knowledge
      2. Identify the competing hypotheses that the study is designed to discriminate between
      3. (Often dictated by available data)
   C. Refine scientific hypotheses into statistical hypotheses
      1. Identify type of question
         a. Prediction, estimation, or testing
         b. Identifying groups, quantifying distributions, or comparing distributions
      2. Where appropriate, specify statistical hypotheses in terms of a summary measure for the distribution of measurements
         a. e.g., mean, median, proportion above a threshold, event rate
   D. Consider design of ideal experiment
      1. Ignore practical, ethical limitations in order to be able to later compare how close the actual situation is to the ideal
         a. Who would be the subjects
         b. What would be the intervention
         c. How would subjects be assigned to the intervention
         d. What would be the variables measured
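The summary measures named in I.C.2 (mean, median, proportion above a threshold, event rate) can be sketched as follows. This is only an illustration: the blood pressure values, the threshold of 140, and the event counts are hypothetical, and the helper names are not part of the handout.

```python
from statistics import mean, median


def proportion_above(values, threshold):
    """Fraction of measurements strictly above the threshold."""
    return sum(v > threshold for v in values) / len(values)


def event_rate(n_events, person_time):
    """Events per unit of follow-up time (e.g., per person-year)."""
    return n_events / person_time


# Hypothetical systolic blood pressure measurements (mmHg).
sbp = [118, 125, 132, 141, 150, 162]

summary = {
    "mean": mean(sbp),
    "median": median(sbp),
    "prop_above_140": proportion_above(sbp, 140),
    # Hypothetical: 3 events observed over 120 person-years of follow-up.
    "event_rate": event_rate(3, 120.0),
}
print(summary)
```

Each measure answers a different scientific question about the same distribution, which is why the handout asks that the choice be made before looking at the data.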
   E. Available data
      1. Sampling scheme and sample size
         a. Retrospective vs prospective
         b. Observational vs intervention
         c. Inclusion, exclusion criteria
      2. Variables in the data set
         a. Names
         b. Relationship to real world quantities
         c. Conditions under which they were measured
         d. Units of measurement (limitations)
            - e.g., qualitative vs quantitative, continuous vs discrete, patterns of missing data
      3. Categorization of variables according to scientific meaning
         a. Demographic (age, sex, etc.)
         b. Baseline physiology (SBP, performance status)
         c. Baseline disease risk factors, prognosis
         d. Measures of treatment intervention
         e. Measures of ancillary clinical course during treatment (e.g., ancillary treatments, environmental conditions)
         f. Measures of treatment outcome
         g. (Others specific to the scientific setting)
      4. Categorization of variables according to use in analysis
         a. Response (outcome) variables
         b. Predictor variable of interest (variable identifying groups)
         c. Variables identifying subgroups to explore effect modification
         d. Potential confounders
            - Causally associated with response variable (in truth), but not in causal pathway of interest
            - Association with predictor of interest (in the sample)
         e. Variables which allow increased precision
            - Variables predictive of response, but not associated with predictor of interest
            - Questions about effects within such groups can be answered with more precision than questions about effects in the larger population (e.g., adjusting for age)
         f. Surrogates for response
            - Variables in the causal pathway of interest
            - Variables measuring a later effect of the response
         g. Irrelevant
II. Univariate descriptive statistics
   A. Goals
      1. Identify errors in the data
         a. Particularly unusual measurements (out of range)
         b. Unusual combinations of measurements
      2. Verify your understanding of the measurements
      3. Identify patterns of missing data
      4. Identify exact population used in study (Materials and Methods)
      5. Identify aspects of the data that may present technical statistical issues
         a. Ideal: allows easiest, most precise statistical inference with smaller sample sizes
            - equal information about all groups being investigated (? equal sample sizes)
            - measurements of response within each group distributed symmetrically with no long tails (outliers)
            - no missing data
         b. Potential problems suggesting possibility of problematic scientific interpretation (problems which cannot necessarily be solved with the available data)
            - missing data patterns
         c. Potential problems suggesting less generalizable statistical analysis (problems not necessarily indicated by the measures of statistical confidence)
            - Outliers in distribution of grouping variables (predictors): i.e., low sample sizes in some groups that are far away from the rest of the data (e.g., trying to determine an age effect in a sample in which most are between 10 and 20 years old, but one subject is 80)
         d. Potential technical problems suggesting possibility of less precise inference (problems that will tend to lower our reported level of statistical precision)
            - Outliers in distribution of response
            - Too little variation in the distribution of the grouping variables (e.g., trying to determine an age effect from a sample in which everyone is between 20 and 21 years old)
            - Too much association among the different grouping variables (e.g., trying to determine an age effect when all the young subjects are male and all the old subjects are female)
         e. Potential technical problems which suggest we might need to use more complicated statistical methods
            - Repeated measurements on the same sampling unit (correlated response)
            - When comparing means: unequal variability across groups being compared
            - When comparing time to events: lack of proportional hazards
            - When adjusting for covariates: nonlinear effects; interactions
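Goal II.A.3 (identify patterns of missing data) can be illustrated with a small tabulation of which variables are missing together. This is a minimal sketch, assuming records are stored as dictionaries with `None` marking a missing value; the variable names and records are hypothetical.

```python
from collections import Counter


def missing_patterns(records, variables):
    """Count how often each combination of missing variables occurs."""
    patterns = Counter()
    for rec in records:
        # The pattern is the tuple of variables missing in this record.
        missing = tuple(v for v in variables if rec.get(v) is None)
        patterns[missing] += 1
    return patterns


# Hypothetical data set: None marks a missing measurement.
records = [
    {"age": 34, "sbp": 120, "outcome": 1},
    {"age": 51, "sbp": None, "outcome": 0},
    {"age": None, "sbp": None, "outcome": 1},
    {"age": 47, "sbp": None, "outcome": 0},
]

patterns = missing_patterns(records, ["age", "sbp", "outcome"])
for pattern, count in patterns.most_common():
    print(pattern or "(complete)", count)
```

Tabulating joint patterns rather than per-variable counts shows whether missingness clusters in particular subjects, which bears on the scientific interpretation problems listed in II.A.5.b.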
   C. Order of investigation
      1. Potential confounders
      2. Predictor of interest
      3. Response
   D. Tools
      1. Frequency tables
      2. Mean, median, standard deviation, etc.
         a. Quick numerical methods of detecting outliers by detecting asymmetry
            - Mean and median markedly different
            - Mean, median not midway between minimum and maximum
            - Mean, median not midway between 25th and 75th percentiles
            - For positive variables: standard deviation larger than two-thirds of the mean
            - Minimum or maximum too many standard deviations away from the mean ("too many" depends on sample size)
      3. Box plots, histograms

III. Bivariate and trivariate descriptive statistics
   A. Goals
      1. Identify confounding relationships
         a. Associations between other variables and predictor of interest
         b. Associations between other variables and response
      2. Identify important predictors of response
         a. Univariate effects
         b. Effect modification (interactions)
      3. Identify surrogates of response
      4. Characterize form of functional relationships (linear, etc.)
   B. Ideal
      1. Predictor of interest has no association with any other predictors
      2. Only a few variables are markedly associated with response
      3. All associations look like a straight line relationship
      4. No interactions (effect modification)
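The quick numerical asymmetry checks listed under the univariate tools (II.D.2.a) can be sketched as below. The function reports the raw quantities and leaves the judgment of "markedly" and "too many" to the analyst, since the handout gives no cutoffs other than the two-thirds rule. The function name, the returned keys, and the age data are hypothetical; quartiles come from `statistics.quantiles` with its default method, and the sketch assumes the values are not all identical.

```python
from statistics import mean, median, quantiles, stdev


def asymmetry_checks(values):
    """Quantities behind the quick asymmetry checks for one variable."""
    m, med, sd = mean(values), median(values), stdev(values)
    lo, hi = min(values), max(values)
    q1, _, q3 = quantiles(values, n=4)  # 25th, 50th, 75th percentiles
    positive = lo > 0
    return {
        # Mean and median markedly different (in standard deviation units).
        "mean_minus_median_in_sd": (m - med) / sd,
        # Position of mean within [min, max]; 0.5 means exactly midway.
        "mean_position_in_range": (m - lo) / (hi - lo),
        # Position of median within [Q1, Q3]; 0.5 means exactly midway.
        "median_position_in_iqr": (med - q1) / (q3 - q1),
        # For positive variables only: SD larger than two-thirds of the mean.
        "sd_exceeds_two_thirds_mean": (sd > 2 * m / 3) if positive else None,
        # Distance of extremes from the mean in standard deviations.
        "min_z": (lo - m) / sd,
        "max_z": (hi - m) / sd,
    }


# Hypothetical ages: most subjects between 10 and 20, one subject aged 80.
ages = [12, 14, 15, 15, 16, 17, 18, 19, 80]
checks = asymmetry_checks(ages)
for name, value in checks.items():
    print(name, value)
```

In this hypothetical sample the single 80-year-old pulls the mean well above the median and toward the lower end of the range, and the standard deviation exceeds two-thirds of the mean.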
   C. Order of investigation
      1. Relationships among other predictors
      2. Relationships between predictor of interest and other predictors
      3. Relationships between response and other predictors
      4. Relationships between predictor of interest and response overall
      5. Relationships between predictor of interest and response within subgroups
   D. Tools
      1. Contingency tables
      2. Stratified means, medians, standard deviations, etc.
      3. Stratified box plots, histograms, etc.
      4. Scatterplots
      5. Stratified scatterplots
      6. Correlations

IV. Defining a suitable context for modeling
   A. Goals
      1. Choosing appropriate form for response variables
         a. Selection of measure of response
            - Transformations of available data
         b. Summary measure to use as basis for statistical model
      2. Selection of groups to be investigated / compared
         - Form for predictor of interest
         - Identification and form of interactions (effect modification)
         - Identification and form of potential confounders to be modeled
         - Identification and form of precision variables to be modeled
      3. Choosing analysis method (type of regression)
   B. Methods
      1. Ideal: Statistical model dictated entirely by scientific question (before looking at the data)
      2. Exploratory: Model building
         a. Educated guess for first models
         b. Fit models
         c. Evaluate validity of necessary assumptions

V. Model Building to Address Primary Question
   A. Goals (in order of importance)
      1. Selection of variables to address scientific questions (main effects and interactions)
      2. Selection of variables to minimize bias (address confounding)
      3. Selection of variables to maximize precision
      4. Selection of models which are easiest to implement (usually: have the least technical requirements on the distribution of response)
   B. Methods
      1. Addressing scientific question: Thinking about the problem
      2. Addressing unanticipated confounding: Adding or removing variables and observing effect on other regression parameters relative to findings in bivariate description of data
      3. Addressing precision: Determining which variables tend to predict response (many difficult issues here)
      4. Evaluate extent to which data meets technical requirements of statistical procedures

VI. Exploratory Analyses for Hypothesis Generation
   A. Modeling of exact form of predictor-response relationship (e.g., dose-response)
   B. Identification of other predictors of response
   C. Subgroup analyses: Compare effect of predictor of interest on response within subgroups (effect modification)

Reporting Results and Interpretation

The basic principles of reporting the results of a statistical analysis are the same ones we learned about in elementary school science. The elements of a proper scientific lab report are:

   A. Scientific Background and Hypotheses
   B. Materials and Methods
      1. Sampling scheme
      2. Most basic descriptive statistics
   C. Results (more objective first)
      1. Descriptive statistics
      2. Results of analyses about primary question
         a. Estimates of effect
            - Point estimates (single best estimate)
            - Interval estimates (range of estimates indicating precision)
         b. Decisions about hypotheses
            - Binary decision (yes or no)
            - Measure of statistical confidence in precision
      3. Results of analyses about prespecified secondary questions or questions which demonstrate consistency (or lack of same) across alternative approaches
      4. Results of analyses about questions that arose during analysis and that the vast majority of readers would agree could and should be answered by the data
   D. Discussion (subjective, including particularly data-driven analyses)
      1. Elaboration on ways that these analyses address the overall goal of the study
      2. Results of the most speculative analyses of the data
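The method in V.B.2 (adding or removing a variable and observing the effect on other regression parameters) can be illustrated for the simplest case of one predictor of interest and one potential confounder. This is a sketch under stated assumptions, not the handout's own procedure: it uses ordinary least squares, obtains the adjusted slope by regressing residuals on residuals (the Frisch-Waugh-Lovell result, which gives the same coefficient as fitting both variables together), and uses hypothetical, noise-free data.

```python
from statistics import mean


def slope(x, y):
    """Least squares slope from regressing y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after a simple linear regression on x."""
    b = slope(x, y)
    a = mean(y) - b * mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def crude_and_adjusted_slope(predictor, confounder, response):
    """Slope for the predictor without and with adjustment for the confounder."""
    crude = slope(predictor, response)
    # Remove the confounder's linear effect from both variables, then regress.
    adjusted = slope(residuals(confounder, predictor),
                     residuals(confounder, response))
    return crude, adjusted


# Hypothetical data: response = 1 * predictor + 2 * confounder exactly,
# with the predictor positively associated with the confounder.
confounder = [1, 2, 3, 4, 5, 6]
predictor = [1, 2, 2, 3, 5, 5]
response = [p + 2 * c for p, c in zip(predictor, confounder)]

crude, adjusted = crude_and_adjusted_slope(predictor, confounder, response)
print(f"crude slope: {crude:.3f}, adjusted slope: {adjusted:.3f}")
```

A large shift between the crude and adjusted estimates, as in this constructed example, is the pattern one would compare against the associations seen in the bivariate description of the data.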
General Requirements for Ph.D. Applied Exam

In the report of your analysis, you should describe the results of your analysis and the conclusions you would reach from those results. This report should look like a formal report to a statistically naive client (i.e., the researcher who brought you the data and/or involved you in the analysis) or an interested lay person. Because a statistical analysis aims to answer a scientific question, you should organize your report in the manner which is customarily used in science. To wit:

1. Summary: Provide a concise description of the question, the data used to try to answer it, and the conclusions of your analysis. Give the most pertinent estimates, confidence intervals, and P values. Note that estimates and confidence intervals regarding the main question of interest are also important even when there is no statistically significant effect. Don't give too much detail here, but do note any significant problems that were encountered. The basic goal is to have all the key information in your summary, and the rest of your report is the supporting detail.

2. Background: Provide a description of the scientific motivation for the analysis. Use your own words rather than copying the description provided by the client. By providing your understanding of the problem, the client may be able to correct any misconceptions that you had about the science. You don't have to go into great detail here, but do give all the facts that entered into your decision process during the analysis.

3. Questions of Interest: List the specific questions that your client posed as well as the questions that you answered. Highlight discrepancies between the two categories of questions. (It is not at all uncommon that the question posed by a researcher could not be answered statistically with the data they provide.)

4. Source of the Data: Describe the source and sampling methods for the data, if known. Note that the source and sampling methods cannot generally be divined from the data (e.g., you cannot tell from the data whether the subjects were randomized to intervention or not). Describe the variables that are available and their meaning for the analysis. Highlight patterns of missing data as well as possible confounding by measured or unmeasured variables. This should not be a detailed presentation of descriptive statistics, however. That will come under Results. But if there are aspects of the descriptive statistics that are of interest solely for a technical description of the sampling plan, they can go here.

5. Statistical Methods: Describe the methods used for the analysis at two levels.
   1) Give a low-level technical description of the analysis for the client to use in the manuscript. Include references for non-standard techniques. You may want to describe the software used, and certainly want to describe the methods used for ensuring the appropriateness of your models. Explain how you handled common problems like missing data, multiple comparisons, etc.
   2) Explain the basic philosophy behind the analysis techniques in layman's terms. Provide interpretations for all parameter estimates. Motivate transformations. Describe the use of P values and confidence intervals if they play an important role in your analysis. Explain why you didn't use more common techniques if necessary.

6. Results: Provide the pertinent results of your analyses. Do not include all the dead-end analyses you might have done unless they provide insight into the question. Do lead the client up to the analyses gradually.
   a. Start off with descriptive statistics. This is an area often given short shrift in previous years. The goal is to describe the basic characteristics of the sample used to address the question, as well as to present simple descriptive statistics (non-model based) that address the questions. Tables and plots are the key tools. If there are any characteristics of the data that present technical problems that needed to be addressed in the modeling, try to present descriptive statistics illustrating those issues. The basic idea is to presage all the issues you will talk about when presenting the models used in statistical inference, insofar as possible with simple descriptive statistics.
   b. Then go to the major models used to answer the primary questions. Present summaries of the statistical inference obtained from these models (point estimates, CI, P values). Make sure you provide scientific units for the estimates. Highlight any particular issues that materially affected the models used to answer the question (confounding, interactions, nonlinearities, etc.). Tables can often be used to good effect here.
   c. Leave exploratory analyses (if any) for last and highlight the exploratory nature of those analyses.

   Present the results of your analyses in tables and publishing quality figures. DO NOT INCLUDE OUTPUT FROM STATISTICAL PROGRAMS. (Such means little to me and nothing to a client.) When possible, use words instead of cryptic variable names. Use forms of estimates that have some meaning to a statistically naive researcher. Thus, if you log transform your response, present geometric mean ratios rather than linear regression parameters. Present confidence intervals rather than the values of Z, t, F, or χ² statistics.

7. Discussion: Discuss the conclusions which you feel can be drawn from the analyses. Suggest directions for future studies and analyses. Highlight the limitations of the data and your analyses.

8. Appendix: Anything of an overly technical nature should be put in an appendix. You may want to include extensive tables in an appendix instead of the main results section.

The major theme of the above is to write to the client and the scientific community rather than to a statistician. If you cannot explain your findings in a straightforward manner, then the analysis is of little value to anyone. Also, lead your reader to all the proper results. You spent a long time analyzing the data. Now provide a brief tour through the high points of your work. Statistical diagnostics, which take a lot of our time, can most often be summarized in a single sentence ("We found no evidence to suggest that we could not rely on the results from our analysis."). You are reporting your major results and impressions of the data. If the client wanted to see every detail, he/she would have to do the analysis himself/herself.

Grading

Written report

Your papers will be graded with respect to three major areas:

1. Scientific approach
   a. Did you investigate problems in the sampling that might materially affect the results?
   b. In addressing each of the questions, did you choose appropriate models to answer the scientific questions?
2. Statistical approach
   a. Were the methods chosen appropriate for the data at hand? Were any key assumptions violated?
   b. Were the methods chosen reasonably efficient?
3. Written report
   a. Were your findings well documented in a succinct manner?
   b. Was the report written at an appropriately low level?
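The advice under Results to present geometric mean ratios when the response has been log transformed can be sketched for the simplest case, a two-group comparison. The difference in mean log response is exponentiated to give a ratio of geometric means, and the confidence limits are exponentiated the same way. This is an illustration only: the function name and data are hypothetical, and the interval uses a large-sample normal approximation with unequal variances, which is an assumption made here (a small sample like this one would ordinarily use a t quantile instead).

```python
from math import exp, log, sqrt
from statistics import NormalDist, mean, variance


def geometric_mean_ratio(group_a, group_b, level=0.95):
    """Ratio of geometric means (b relative to a) with an approximate CI."""
    log_a = [log(v) for v in group_a]
    log_b = [log(v) for v in group_b]
    # Difference in mean log response: the "regression parameter" on the log scale.
    diff = mean(log_b) - mean(log_a)
    se = sqrt(variance(log_a) / len(log_a) + variance(log_b) / len(log_b))
    # Normal quantile for the two-sided interval (large-sample assumption).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Back-transform the estimate and both limits to the ratio scale.
    return exp(diff), exp(diff - z * se), exp(diff + z * se)


# Hypothetical positive-valued responses in two groups.
control = [10.0, 20.0, 40.0]
treated = [20.0, 40.0, 80.0]

ratio, lower, upper = geometric_mean_ratio(control, treated)
print(f"geometric mean ratio: {ratio:.2f} ({lower:.2f} to {upper:.2f})")
```

A statement such as "the geometric mean response was about twice as high in the treated group" is the kind of estimate a statistically naive client can interpret, whereas a coefficient on the log scale is not.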
The Dummy s Guide to Data Analysis Using SPSS
The Dummy s Guide to Data Analysis Using SPSS Mathematics 57 Scripps College Amy Gamble April, 2001 Amy Gamble 4/30/01 All Rights Rerserved TABLE OF CONTENTS PAGE Helpful Hints for All Tests...1 Tests
Exercise 1.12 (Pg. 22-23)
Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.
Students' Opinion about Universities: The Faculty of Economics and Political Science (Case Study)
Cairo University Faculty of Economics and Political Science Statistics Department English Section Students' Opinion about Universities: The Faculty of Economics and Political Science (Case Study) Prepared
Chapter 5: Analysis of The National Education Longitudinal Study (NELS:88)
Chapter 5: Analysis of The National Education Longitudinal Study (NELS:88) Introduction The National Educational Longitudinal Survey (NELS:88) followed students from 8 th grade in 1988 to 10 th grade in
HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION
HYPOTHESIS TESTING: CONFIDENCE INTERVALS, T-TESTS, ANOVAS, AND REGRESSION HOD 2990 10 November 2010 Lecture Background This is a lightning speed summary of introductory statistical methods for senior undergraduate
What is the purpose of this document? What is in the document? How do I send Feedback?
This document is designed to help North Carolina educators teach the Common Core (Standard Course of Study). NCDPI staff are continually updating and improving these tools to better serve teachers. Statistics
UNIT 1: COLLECTING DATA
Core Probability and Statistics Probability and Statistics provides a curriculum focused on understanding key data analysis and probabilistic concepts, calculations, and relevance to real-world applications.
DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS
DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi - 110 012 [email protected] 1. Descriptive Statistics Statistics
Roadmap to Data Analysis. Introduction to the Series, and I. Introduction to Statistical Thinking-A (Very) Short Introductory Course for Agencies
Roadmap to Data Analysis Introduction to the Series, and I. Introduction to Statistical Thinking-A (Very) Short Introductory Course for Agencies Objectives of the Series Roadmap to Data Analysis Provide
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus
Auxiliary Variables in Mixture Modeling: 3-Step Approaches Using Mplus Tihomir Asparouhov and Bengt Muthén Mplus Web Notes: No. 15 Version 8, August 5, 2014 1 Abstract This paper discusses alternatives
Quantitative Methods for Finance
Quantitative Methods for Finance Module 1: The Time Value of Money 1 Learning how to interpret interest rates as required rates of return, discount rates, or opportunity costs. 2 Learning how to explain
HMRC Tax Credits Error and Fraud Additional Capacity Trial. Customer Experience Survey Report on Findings. HM Revenue and Customs Research Report 306
HMRC Tax Credits Error and Fraud Additional Capacity Trial Customer Experience Survey Report on Findings HM Revenue and Customs Research Report 306 TNS BMRB February2014 Crown Copyright 2014 JN119315 Disclaimer
Exploratory data analysis (Chapter 2) Fall 2011
Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,
Geostatistics Exploratory Analysis
Instituto Superior de Estatística e Gestão de Informação Universidade Nova de Lisboa Master of Science in Geospatial Technologies Geostatistics Exploratory Analysis Carlos Alberto Felgueiras [email protected]
Assumptions. Assumptions of linear models. Boxplot. Data exploration. Apply to response variable. Apply to error terms from linear model
Assumptions Assumptions of linear models Apply to response variable within each group if predictor categorical Apply to error terms from linear model check by analysing residuals Normality Homogeneity
Adequacy of Biomath. Models. Empirical Modeling Tools. Bayesian Modeling. Model Uncertainty / Selection
Directions in Statistical Methodology for Multivariable Predictive Modeling Frank E Harrell Jr University of Virginia Seattle WA 19May98 Overview of Modeling Process Model selection Regression shape Diagnostics
Introduction to Statistics and Quantitative Research Methods
Introduction to Statistics and Quantitative Research Methods Purpose of Presentation To aid in the understanding of basic statistics, including terminology, common terms, and common statistical methods.
COMMON CORE STATE STANDARDS FOR
COMMON CORE STATE STANDARDS FOR Mathematics (CCSSM) High School Statistics and Probability Mathematics High School Statistics and Probability Decisions or predictions are often based on data numbers in
Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010
Curriculum Map Statistics and Probability Honors (348) Saugus High School Saugus Public Schools 2009-2010 Week 1 Week 2 14.0 Students organize and describe distributions of data by using a number of different
Example: Credit card default, we may be more interested in predicting the probabilty of a default than classifying individuals as default or not.
Statistical Learning: Chapter 4 Classification 4.1 Introduction Supervised learning with a categorical (Qualitative) response Notation: - Feature vector X, - qualitative response Y, taking values in C
APPENDIX E THE ASSESSMENT PHASE OF THE DATA LIFE CYCLE
APPENDIX E THE ASSESSMENT PHASE OF THE DATA LIFE CYCLE The assessment phase of the Data Life Cycle includes verification and validation of the survey data and assessment of quality of the data. Data verification
Guidelines for AJO-DO submissions: Randomized Clinical Trials June 2015
Guidelines for AJO-DO submissions: Randomized Clinical Trials June 2015 Complete and transparent reporting allows for accurate assessment of the quality of trial and correct interpretation of the trial
Algebra 1 2008. Academic Content Standards Grade Eight and Grade Nine Ohio. Grade Eight. Number, Number Sense and Operations Standard
Academic Content Standards Grade Eight and Grade Nine Ohio Algebra 1 2008 Grade Eight STANDARDS Number, Number Sense and Operations Standard Number and Number Systems 1. Use scientific notation to express
Johns Hopkins University Bloomberg School of Public Health
Johns Hopkins University Bloomberg School of Public Health Report on Johns Hopkins University School of Medicine Faculty Salary Analysis, 2003-2004 With Additional Comments November 29, 2005 Objectives:
STAT 360 Probability and Statistics. Fall 2012
STAT 360 Probability and Statistics Fall 2012 1) General information: Crosslisted course offered as STAT 360, MATH 360 Semester: Fall 2012, Aug 20--Dec 07 Course name: Probability and Statistics Number
Summarizing and Displaying Categorical Data
Summarizing and Displaying Categorical Data Categorical data can be summarized in a frequency distribution which counts the number of cases, or frequency, that fall into each category, or a relative frequency
Indiana Academic Standards Mathematics: Probability and Statistics
Indiana Academic Standards Mathematics: Probability and Statistics 1 I. Introduction The college and career ready Indiana Academic Standards for Mathematics: Probability and Statistics are the result of
Algebra 1 Course Information
Course Information Course Description: Students will study patterns, relations, and functions, and focus on the use of mathematical models to understand and analyze quantitative relationships. Through
FACILITATOR/MENTOR GUIDE
FACILITATOR/MENTOR GUIDE Descriptive analysis variables table shells hypotheses Measures of association methods design justify analytic assess calculate analysis problem stratify confounding statistical
430 Statistics and Financial Mathematics for Business
Prescription: 430 Statistics and Financial Mathematics for Business Elective prescription Level 4 Credit 20 Version 2 Aim Students will be able to summarise, analyse, interpret and present data, make predictions
QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS
QUANTITATIVE METHODS BIOLOGY FINAL HONOUR SCHOOL NON-PARAMETRIC TESTS This booklet contains lecture notes for the nonparametric work in the QM course. This booklet may be online at http://users.ox.ac.uk/~grafen/qmnotes/index.html.
RUTHERFORD HIGH SCHOOL Rutherford, New Jersey COURSE OUTLINE STATISTICS AND PROBABILITY
RUTHERFORD HIGH SCHOOL Rutherford, New Jersey COURSE OUTLINE STATISTICS AND PROBABILITY I. INTRODUCTION According to the Common Core Standards (2010), Decisions or predictions are often based on data numbers
UNIVERSITY OF NAIROBI
UNIVERSITY OF NAIROBI MASTERS IN PROJECT PLANNING AND MANAGEMENT NAME: SARU CAROLYNN ELIZABETH REGISTRATION NO: L50/61646/2013 COURSE CODE: LDP 603 COURSE TITLE: RESEARCH METHODS LECTURER: GAKUU CHRISTOPHER
Statistics 151 Practice Midterm 1 Mike Kowalski
Statistics 151 Practice Midterm 1 Mike Kowalski Statistics 151 Practice Midterm 1 Multiple Choice (50 minutes) Instructions: 1. This is a closed book exam. 2. You may use the STAT 151 formula sheets and
ECON 523 Applied Econometrics I /Masters Level American University, Spring 2008. Description of the course
ECON 523 Applied Econometrics I /Masters Level American University, Spring 2008 Instructor: Maria Heracleous Lectures: M 8:10-10:40 p.m. WARD 202 Office: 221 Roper Phone: 202-885-3758 Office Hours: M W
Statistical Rules of Thumb
Statistical Rules of Thumb Second Edition Gerald van Belle University of Washington Department of Biostatistics and Department of Environmental and Occupational Health Sciences Seattle, WA WILEY AJOHN
Tutorial 3: Graphics and Exploratory Data Analysis in R Jason Pienaar and Tom Miller
Tutorial 3: Graphics and Exploratory Data Analysis in R Jason Pienaar and Tom Miller Getting to know the data An important first step before performing any kind of statistical analysis is to familiarize
1. The parameters to be estimated in the simple linear regression model Y=α+βx+ε ε~n(0,σ) are: a) α, β, σ b) α, β, ε c) a, b, s d) ε, 0, σ
STA 3024 Practice Problems Exam 2 NOTE: These are just Practice Problems. This is NOT meant to look just like the test, and it is NOT the only thing that you should study. Make sure you know all the material
