1. Data and information


- Random sampling application
- Information base and data collection

References
Data sampling:
Cochran, W.G. (1977) Sampling Techniques. John Wiley & Sons, Inc., New York and others.
Lohr, S.L. (1999) Sampling: Design and Analysis. Brooks/Cole Publishing Company.

Random Sampling of Farming Systems
- Sampling has to cope with reality rather than with controlled conditions (trial, laboratory). Certain violations of the rules for random sampling may be unavoidable.
- Lists of all units (farms, households) in a total population rarely exist, in particular in developing countries.
- The design of random sampling plans may depend more on ensuring equal probabilities for survey units to be chosen than on decreasing variances in strata.
- The application of justified sampling for gaining representative knowledge is usually hampered by limited prior knowledge of the existing types of farming systems.
- Total populations are far from being infinite.

Basics in data sampling on farming systems
- Select a sample survey plan and identify the respective formulae for estimators before you start the survey.
- Calculate the required or possible size of your sample and the optimal distribution between the strata, taking your resources into consideration.
- Plan the steps of your survey as precisely as possible in advance. This includes the methodological parts (intended analyses, questionnaire, database structure, sampling plan) as well as the logistic issues (budget, time, staff, transport, required software, data input).
- Plan and conduct the survey as closely as possible to the requirements of the theory of your chosen type of sample survey, but stay pragmatic. Try to adapt the theory to reality and document unavoidable divergence (= limitations of inferences from the sample on the total population).
- Low resources are no excuse for bad planning of a survey!

Sampling Plan
The choice of the sampling plan (simple sampling, stratified sampling, more complex sampling schemes) depends on the specific situation. Criteria are:
- available prior knowledge
- possibilities to assure controlled probabilities of selection
- possibilities to minimize variance between surveyed units (e.g. farming systems)
- capacity limits (funding, time, transport, staff etc.)
The chosen sampling plan determines the required estimators (= formulae for calculating estimates for the total population from the surveyed units).

Estimators 1
An estimator is a function (a formula) that allows the estimation of the situation in the total population by using information from a sample.
Sample (information) -> Estimator -> Total population (estimation)

Estimators 2
The correct estimators (= formulae) depend on the applied sampling plan. Estimators are required for:
- parameters of location: "averages" (e.g. mean values), totals, frequencies and proportions
- parameters of spread (e.g. variance, standard deviation)

Estimators 3: Example for correct estimators, quantitative data
Simple random sampling (SRS):
- mean value: $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$
- variance: $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2$
Stratified sampling (STS):
- mean value: $\bar{y}_{vp} = \sum_{z} \frac{N_z}{N_{vp}} \bar{y}_z$
- variance: $s^2_{vp} = \sum_{z} \frac{N_z}{N_{vp}} s^2_z$
whereby:
$y_i$ = information from farm i
$\bar{y}$, $s^2$ = mean and variance (SRS)
$\bar{y}_z$, $s^2_z$ = mean and variance in stratum z
$\bar{y}_{vp}$, $s^2_{vp}$ = mean and variance (STS)
$n$ = number of interviewed farmers
$N_z$ = total number of farmers in stratum z
$N_{vp}$ = total number of farmers in the study region
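The estimators for the two sampling plans can be tried out on a few numbers. The following is only a minimal sketch: the function names and the example figures are invented for illustration, and the stratified variance is assumed to be the average of the stratum variances weighted by N_z / N_vp, in the same way as the stratified mean.

```python
from statistics import mean, variance


def srs_estimates(values):
    """Mean and variance under simple random sampling (SRS)."""
    # statistics.variance() divides by n - 1, as in the SRS estimator.
    return mean(values), variance(values)


def sts_estimates(strata):
    """Mean and variance under stratified sampling (STS).

    `strata` maps a stratum name to (N_z, sample values of that stratum).
    Assumption: both estimates weight the strata by N_z / N_vp.
    """
    n_vp = sum(n_z for n_z, _ in strata.values())
    mean_vp = sum(n_z * mean(vals) for n_z, vals in strata.values()) / n_vp
    var_vp = sum(n_z * variance(vals) for n_z, vals in strata.values()) / n_vp
    return mean_vp, var_vp


# Hypothetical survey: 300 small farms and 100 large farms in the region.
strata = {
    "small": (300, [2.0, 3.0, 4.0]),
    "large": (100, [10.0, 12.0, 14.0]),
}
print(srs_estimates([2.0, 3.0, 4.0]))  # (3.0, 1.0)
print(sts_estimates(strata))           # (5.25, 1.75)
```

Pooling all six observations as if they came from one simple random sample would give a mean of 7.5, because the large farms would be overrepresented compared with their share of the total population.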

Sample size (1)
Criteria for sample size:
1. Variance of the desired information within the total population
2. Acceptable non-systematic sampling error
3. Available resources for the survey (logistics, time, etc.)
Two basic approaches:
1. Calculating the sample size based on the accepted error
2. Determining the sample size by available resources and calculating the obtained error in retrospect

Sample size (2)
Calculating the required sample size:
- The required sample size has to be calculated for every single criterion under research.
- Attention: calculating the required sample size by the following formulae presupposes that the criterion concerned is normally distributed in the total population.
- The variance of the criteria concerned in the total population has to be estimated in advance. This can be done by applying general characteristics of a normal distribution.
- Example for quantitative criteria: standard deviation = 1/6 of range [= maximal value - minimal value], variance = (range/6)²

Formulae for sample size: Quantitative criteria
- small ratio sample/total population: $n_0 = \frac{t_{1-\alpha/2}^2 \, s^2}{e^2}$
- large ratio sample/total population ($n_0$ > 5% of $N$): $n = n_0 \cdot \frac{N}{n_0 + N - 1}$
whereby: n, n_0 = sample size, t = quantile of t-distribution, s² = variance according to chosen sampling plan, N = total population, e = accepted error in units of the criterion (±)

Formulae for sample size: Qualitative criteria (proportions, frequencies)
- small ratio sample/total population: $n_0 = \frac{t_{1-\alpha/2}^2 \, p q}{e^2}$
- large ratio sample/total population ($n_0$ > 5% of $N$): $n = n_0 \cdot \frac{N}{n_0 + N - 1}$
whereby: n, n_0 = sample size, t = quantile of t-distribution, p = proportion of total population with criterion, q = proportion of total population without criterion, pq = variance according to chosen sampling plan, N = total population, e = accepted error in % (±)

N.B.: These formulae do not consider budget restrictions. Respective considerations (example: Neyman-Tschuprow formula) deal with the minimization of error under a given budget in stratified sampling.
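A short calculation may make the sample size formulae more concrete. This is a sketch under stated assumptions: the function names and figures are illustrative, t = 1.96 is used as a large-sample stand-in for the t-quantile at 95% confidence, and the correction for a large sample/population ratio is taken as n = n_0 N / (n_0 + N - 1), which corresponds to the factor (N - n)/(N - 1) used for the confidence limits.

```python
import math


def variance_from_range(minimum, maximum):
    """Rule of thumb from the slides: standard deviation = range / 6."""
    return ((maximum - minimum) / 6) ** 2


def sample_size(t, variance, error, population=None):
    """Required sample size for an accepted error (same units as the criterion).

    For proportions pass variance = p * q and the error as a proportion.
    """
    n0 = t ** 2 * variance / error ** 2
    # Apply the finite-population correction when n0 exceeds 5% of N.
    if population is not None and n0 > 0.05 * population:
        n0 = n0 * population / (n0 + population - 1)
    return math.ceil(n0)


# Hypothetical criterion ranging from 0 to 60 units, accepted error +/- 2 units.
s2 = variance_from_range(0, 60)                  # 100.0
print(sample_size(1.96, s2, 2))                  # 97
print(sample_size(1.96, s2, 2, population=500))  # 81
# Proportion with p = q = 0.5 and an accepted error of 5 percentage points.
print(sample_size(1.96, 0.5 * 0.5, 0.05))        # 385
```

Because the calculation has to be repeated for every criterion under research, the largest resulting n would determine the survey size.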

PC-Exercise: file C31sampDB.xls
Introduction to PC in general and Excel (if required)
Create a copy of the exercise file:
- Open Windows Explorer
- switch to partition "user of <name of PC-room>", usually partition U:
- choose subdirectory: kurs\m319
- Copy file sampling1.xls from this subdirectory to your partition (usually H:)
- Start Excel
- Open file sampling1.xls (resp. H:\[your subdirectory]\sampling1.xls)
- explain structure of Excel files (multiple worksheets)
1. Exercises with Spreadsheet "sample size"
- explain calculation and interpret results
- parametrize data in the example and allowed errors and discuss the resulting impacts on the sample size

Determination of obtained precision
Confidence limits of estimators (see 1.1b) express the obtained precision. The calculation of confidence limits corresponds to the calculation of the sample size, with the difference that n is known and e is required.
Quantitative criteria:
- small ratio sample/total population: $\pm\, t_{1-\alpha/2} \frac{s}{\sqrt{n}}$
- large ratio sample/total population: $\pm\, t_{1-\alpha/2} \frac{s}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$
whereby: n = sample size, t = quantile of t-distribution, s = standard deviation according to chosen sampling plan, N = total population
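The obtained precision can be computed in the same spirit. The sketch below assumes the confidence limit is t times s divided by the square root of n, multiplied by the square root of (N - n)/(N - 1) when the sample is a large share of the population. The function name and numbers are illustrative, and t = 1.96 again stands in for the t-quantile.

```python
import math


def confidence_limit(t, s, n, population=None):
    """Half-width (+/-) of the confidence interval of an estimated mean.

    For a proportion pass s = sqrt(p * q).
    """
    e = t * s / math.sqrt(n)
    if population is not None:
        # Finite-population correction for a large sample/population ratio.
        e *= math.sqrt((population - n) / (population - 1))
    return e


# Hypothetical survey: 100 farms interviewed, standard deviation 10 units.
print(round(confidence_limit(1.96, 10, 100), 3))
print(round(confidence_limit(1.96, 10, 100, population=500), 3))
# Proportion p = 0.5 observed in a sample of 400 households.
print(round(confidence_limit(1.96, math.sqrt(0.5 * 0.5), 400), 3))
```

The correction always narrows the interval, since a sample that covers a noticeable part of a finite population leaves less of it unobserved.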

Determination of obtained precision
Qualitative criteria (proportions, frequencies):
- large total populations: $\pm\, t_{1-\alpha/2} \sqrt{\frac{pq}{n}}$
- small total populations: $\pm\, t_{1-\alpha/2} \sqrt{\frac{pq}{n}} \sqrt{\frac{N-n}{N-1}}$
whereby: n = sample size, t = quantile of t-distribution, p = proportion of total population with criterion, q = proportion of total population without criterion, pq = variance according to chosen sampling plan, N = total population

Data entry and editing
1. Transfer of data from questionnaires and data sheets to files
2. Coding of alphanumeric information
3. Check for data entry errors and correction
4. Identify data problems (extreme values, missing data)
5. Solve data problems and provide operational database

Identification of outliers - exploratory data analysis
[Figure: box-and-whisker plot of the variable REVPV, marking upper hinge, median, lower hinge, the h-spread, the range of 1.5 x h-spread and an outlier]
Rule for identification of outliers in box-and-whisker plots: all values above the upper hinge + 1.5 x h-spread and below the lower hinge - 1.5 x h-spread are extreme observations.

Missing values
- The mechanisms that lead to missing data determine the possible solutions for further analyses of the data sets concerned.
- The use of imputation values requires that these mechanisms are ignorable, i.e. not linked to the information content.
Imputation procedures:
- Advantage: use of standard methods for calculations possible
- Disadvantage: no consideration of added uncertainty
- Method: replacement of individual missing values by values that are derived from complete sets of the sample
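Both data problems, extreme values and missing data, can be illustrated with a few lines of code. This is a sketch only: the function names and data are made up, the hinges are computed as the medians of the lower and upper half of the ordered values (each half including the median when the number of values is odd), and the hot deck variant simply treats every complete case as comparable.

```python
import random
from statistics import mean, median


def hinges(values):
    """Lower and upper hinge of a set of values."""
    data = sorted(values)
    half = (len(data) + 1) // 2  # each half includes the median if n is odd
    return median(data[:half]), median(data[-half:])


def extreme_observations(values):
    """Values beyond the hinges by more than 1.5 x h-spread."""
    lower, upper = hinges(values)
    h_spread = upper - lower
    return [v for v in values
            if v < lower - 1.5 * h_spread or v > upper + 1.5 * h_spread]


def impute_mean(values):
    """Replace missing values (None) by the mean of the observed values."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def impute_hot_deck(values, rng):
    """Replace missing values (None) by randomly drawn observed values."""
    observed = [v for v in values if v is not None]
    return [rng.choice(observed) if v is None else v for v in values]


print(extreme_observations([1, 2, 3, 4, 5, 6, 7, 8, 9, 40]))  # [40]
print(impute_mean([1.0, None, 3.0]))                          # [1.0, 2.0, 3.0]
print(impute_hot_deck([1.0, None, 3.0], random.Random(1)))
```

Mean imputation shrinks the variance of the completed data, which is one reason why variance-neutral or hot deck procedures may be preferred.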

Generation of Imputation Values
- use of mean values, medians or modes
- variance-neutral imputation
- regression values (if covariate data available)
- hot deck imputation (random choice of values from comparable cases in the same survey)
- cold deck imputation (random choice of values from other sources)
- nearest-neighbour imputation (value from next record)

2. Classifications (Cluster analyses)

References
Bi- and Multivariate Analysis:
Aldenderfer, M.S.; Blashfield, R.K. (1984) Cluster Analysis. West Hilcrest.
Backhaus, K.; Erichson, B.; Plinke, W.; Weiber, R. (1994) Multivariate Analysemethoden. Springer Verlag, Berlin u.a. (German language)
Henze, A. (1994) Marktforschung. UTB 179, Ulmer Verlag, Stuttgart (German language)
Tukey, J.W. (1977) Exploratory Data Analysis. Addison-Wesley Publishing Company, Inc., Reading.
SPSS online help.

Steps in the Computer Application
The computer exercises follow the most common sequence of application in reality rather than the steps in learning econometrics:
- univariate classification: ordering and identifying the best class boundaries
- multivariate classification: checking for too strong relationships between selected classification criteria and application of cluster procedures (cluster algorithm + distance measure)
- testing for differences: statistical tests for checking significant differences between classes (χ²-test, nonparametric tests) and between observations in time (z-test, t-test)
- models of linear dependencies: linear regression, multiple regression, probit and logit models

Univariate classification
PC-Exercise: file C3class.xls, sheet: univ, Software: Excel
- sort the selected (quantitative) classification criterion in ascending order
- determine the differences (Δ) from one value to the next within this order
- check for disproportionately large steps (graphically and/or numerically) -> preliminary class borders
- determine homogeneity (coefficient of variation within classes = standard deviation / mean value) and heterogeneity (distance between class means) of the preliminary classes
- check whether moving border cases improves the measures of heterogeneity and homogeneity

Multivariate classification
PC-Exercise: file C3class.xls, sheet: multi (+ derivates), Software: Excel, SPSS
- import "multi" into SPSS, check for linear correlations and exclude one of each pair of too highly correlated variables from the further process
- set up the cluster procedure (selection of distance measure, cluster algorithm and standardization of classification variables)
- the exploratory approach: interpret results (dendrograms) from different sets of procedures (size and number of clusters, development of homogeneity within clusters)
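The univariate classification steps (ordering, differences, large steps, homogeneity) can be sketched outside Excel as follows. The rule for what counts as a disproportionately large step is an assumption made here for illustration (more than 1.5 times the average step); the exercise itself leaves this to graphical or numerical judgement. Function names and data are hypothetical.

```python
from statistics import mean, stdev


def preliminary_classes(values, factor=1.5):
    """Split the ordered values wherever the step to the next value is large.

    Assumption: a step is "large" if it exceeds `factor` times the mean step.
    """
    data = sorted(values)
    steps = [b - a for a, b in zip(data, data[1:])]
    threshold = factor * mean(steps)
    classes, current = [], [data[0]]
    for value, step in zip(data[1:], steps):
        if step > threshold:  # preliminary class border before `value`
            classes.append(current)
            current = []
        current.append(value)
    classes.append(current)
    return classes


def coefficient_of_variation(values):
    """Homogeneity measure: standard deviation / mean (needs >= 2 values)."""
    return stdev(values) / mean(values)


# Hypothetical farm sizes in hectares.
classes = preliminary_classes([30, 1, 11, 2, 12, 3, 31, 10])
print(classes)  # [[1, 2, 3], [10, 11, 12], [30, 31]]
print([round(coefficient_of_variation(c), 3) for c in classes])
print([mean(c) for c in classes])  # class means, for the heterogeneity check
```

Moving a border case from one class to its neighbour and recomputing these measures shows whether the reassignment improves homogeneity and heterogeneity.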

Testing for differences
PC-Exercise: file C3class.xls, sheet: multi (+ derivates), Software: Excel, SPSS
- Checking significant differences between clusters of selected results: nonparametric tests for quantitative variables, χ²-test for qualitative variables in SPSS (tables for statistical tests inbuilt)
- Interpretation of test results towards a description of the assumed clusters
- Checking significant differences between the current sample and information from the past (z-test, t-test) in Excel (use of a printed table of test values of the standard normal distribution)
[Table: probabilities (e.g. 95%) and the corresponding test values of the standard normal distribution (e.g. 1.2816 for 90%)]

- Design of family models
- Application of family models
- Uncertainty and risk
- Gap analysis and interpretation

References
Modelling in General:
Dantzig, G.B. (1963) Linear Programming and Extensions. Princeton University Press, Princeton, N.J.
France, J.; Thornley, J.H.M. (1984) Mathematical Models in Agriculture. Butterworth, London.
MOTAD:
Hazell, P.B.R. (1971) A linear alternative to quadratic and semivariance programming for farm planning under uncertainty. American Journal of Agricultural Economics, 53, pp. 53-62.
Doppler, W.; Salman, A.Z.; Al-Karablieh, E.K.; Wolff, H.-P. (2002) The impact of water price strategies on the allocation of irrigation water - the case of the Jordan Valley. Agricultural Water Management, 55, Elsevier Science Ltd.

LP-Models in EXCEL
- An LP matrix can be set up in several ways in EXCEL. The one used within M319 is just one alternative. The required elements are, however, always the same.
- SOLVER is provided as a standard add-in to EXCEL, which is why it is used in the module.
- Other software (e.g. XA or GAMS) is suited as well, in some regards even better, but has to be purchased and requires learning how it works.

Set-Up of the LP Matrix
PC-Exercise: file C33mod1.xls, sheet: LP1, Software: Excel
Exercises with Spreadsheet "LP (1)":
- set-up of a planning matrix
- transform the planning matrix for use with the EXCEL Solver
- explain EXCEL Solver settings: cells, constraints etc.
- explain reports on solution, sensitivity and limits
- explain the use of ranges and the SUMPRODUCT command
- run the basic model and discuss the resulting reports on (1) the optimal solution and (2) the sensitivity analysis

Parameterization
PC-Exercise: file C33mod1.xls, sheet: LP, Software: Excel
Exercises with Spreadsheet "LP (2)":
- explain and demonstrate the approach of parametrizing
- explain and apply integer constraints
- add additional activities and constraints
- run the model with changed parameters and discuss the resulting reports on (1) the optimal solution and (2) the sensitivity analysis
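The structure of a planning matrix and the role of SUMPRODUCT can be mimicked in a few lines. The sketch below is not how the Excel Solver works internally: it assumes integer activity levels and simply tries every combination, which is only feasible for very small models. All names, gross margins and capacities are hypothetical.

```python
from itertools import product


def sumproduct(a, b):
    """Equivalent of Excel's SUMPRODUCT for two ranges of equal length."""
    return sum(x * y for x, y in zip(a, b))


def best_integer_plan(margins, matrix, rhs, upper):
    """Brute-force search over integer activity levels 0..upper.

    margins: objective function coefficients (gross margin per unit)
    matrix:  one row of capacity requirements per constraint
    rhs:     available capacity per constraint (requirements <= rhs)
    """
    best_plan, best_value = None, float("-inf")
    for plan in product(range(upper + 1), repeat=len(margins)):
        feasible = all(sumproduct(row, plan) <= limit
                       for row, limit in zip(matrix, rhs))
        if feasible and sumproduct(margins, plan) > best_value:
            best_plan, best_value = plan, sumproduct(margins, plan)
    return best_plan, best_value


# Hypothetical farm: two crops, 10 ha of land and 32 labour days available.
margins = [300, 500]       # gross margin per ha
matrix = [[1, 1],          # land requirement per ha
          [2, 4]]          # labour requirement per ha
rhs = [10, 32]
print(best_integer_plan(margins, matrix, rhs, upper=10))  # ((4, 6), 4200)
```

Changing one gross margin or capacity and re-running the search is the code counterpart of the parameterization exercise.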

A brief introduction to MOTAD models
PC-Exercise: file C33mod.xls, Software: Excel, Solver
- The MOTAD approach is a linear approximation of the (µ,σ)-criterion (which is also referred to in the literature as the E-V model, cf. also lecture chapter 6).
- MOTAD = Minimization Of Total Absolute Deviation (i.e. it uses a deviation measure rather than the variance to measure the variability of returns).
- Advantage over quadratic programming (E-V models): the solution of the model requires a linear algorithm only.

Required data for a MOTAD model
- Available capacities, required capacities per realized unit of activities, contribution of alternative activities to the objective function (= data requirements of an E-model, i.e. a model based on one expected value per activity only)
- The distribution (= variation) of the alternatives' contributions to the objective function
- A "realistic" idea about the desired total expected value (e.g. total gross margin). "Realistic" means equal to or less than the maximum return from an LP model that is based on expected values.

Applied Method
- Set up your basic LP model, but formulate the contribution of your activities to the objective function as a constraint that is forced to yield the expected total gross margin (the respective cells in the objective function stay 0).
- Add constraints for the absolute deviation of the values in your time series (= mean value of the total time series - observed value in t_n). These constraints must be greater than or equal to their RHS value of 0 in the optimal solution.
- Add adjustment activities (columns) that allow for a stepwise (1 step = 1) reduction of the absolute deviation and deliver 1 "unit of variation" to the objective function.

Results
- MOTAD allows for the calculation of a series of optimal combinations of activities for different attitudes towards risk.
- The final selection among the optimal combinations depends on the individual farmer's choice and refers to his specific utility function (i.e. his preference with regard to the combination of expected income and related uncertainty).
- For the mathematical background and justification refer to Hazell (1971).
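The deviation measure behind MOTAD can be made tangible by evaluating a given combination of activities against a time series of gross margins. This is a sketch under assumptions, not the full linear programme: it only computes the expected total gross margin and the total absolute deviation of one plan, with hypothetical names and data. In the MOTAD model itself the solver chooses the plan that minimizes this deviation for a required expected value.

```python
def evaluate_plan(plan, margins_by_year):
    """Expected total gross margin and total absolute deviation of a plan.

    plan:            activity levels (e.g. hectares per crop)
    margins_by_year: one row per year with the gross margin of each activity
    """
    years = len(margins_by_year)
    # Mean gross margin of each activity over the whole time series.
    means = [sum(column) / years for column in zip(*margins_by_year)]
    expected = sum(m * x for m, x in zip(means, plan))
    # Deviation of the plan's total gross margin from its mean in each year.
    deviations = [
        sum((c - m) * x for c, m, x in zip(row, means, plan))
        for row in margins_by_year
    ]
    return expected, sum(abs(d) for d in deviations)


# Hypothetical gross margins per ha of two crops in three years.
series = [[300, 500],
          [200, 700],
          [400, 300]]
print(evaluate_plan((4, 6), series))   # (4200.0, 1600.0)
print(evaluate_plan((10, 0), series))  # (3000.0, 2000.0)
```

In this invented example the mixed plan has both the higher expected value and the lower deviation, because the two crops deviate in opposite directions. With other data the comparison becomes a trade-off, which is what the series of MOTAD solutions describes.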

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm

Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm Mgt 540 Research Methods Data Analysis 1 Additional sources Compilation of sources: http://lrs.ed.uiuc.edu/tseportal/datacollectionmethodologies/jin-tselink/tselink.htm http://web.utk.edu/~dap/random/order/start.htm

More information

Geostatistics Exploratory Analysis

Geostatistics Exploratory Analysis Instituto Superior de Estatística e Gestão de Informação Universidade Nova de Lisboa Master of Science in Geospatial Technologies Geostatistics Exploratory Analysis Carlos Alberto Felgueiras cfelgueiras@isegi.unl.pt

More information

Simple Predictive Analytics Curtis Seare

Simple Predictive Analytics Curtis Seare Using Excel to Solve Business Problems: Simple Predictive Analytics Curtis Seare Copyright: Vault Analytics July 2010 Contents Section I: Background Information Why use Predictive Analytics? How to use

More information

430 Statistics and Financial Mathematics for Business

430 Statistics and Financial Mathematics for Business Prescription: 430 Statistics and Financial Mathematics for Business Elective prescription Level 4 Credit 20 Version 2 Aim Students will be able to summarise, analyse, interpret and present data, make predictions

More information

Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools. Tools for Summarizing Data Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

More information

STATISTICA Formula Guide: Logistic Regression. Table of Contents

STATISTICA Formula Guide: Logistic Regression. Table of Contents : Table of Contents... 1 Overview of Model... 1 Dispersion... 2 Parameterization... 3 Sigma-Restricted Model... 3 Overparameterized Model... 4 Reference Coding... 4 Model Summary (Summary Tab)... 5 Summary

More information

Least Squares Estimation

Least Squares Estimation Least Squares Estimation SARA A VAN DE GEER Volume 2, pp 1041 1045 in Encyclopedia of Statistics in Behavioral Science ISBN-13: 978-0-470-86080-9 ISBN-10: 0-470-86080-4 Editors Brian S Everitt & David

More information

Fairfield Public Schools

Fairfield Public Schools Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

More information

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar

business statistics using Excel OXFORD UNIVERSITY PRESS Glyn Davis & Branko Pecar business statistics using Excel Glyn Davis & Branko Pecar OXFORD UNIVERSITY PRESS Detailed contents Introduction to Microsoft Excel 2003 Overview Learning Objectives 1.1 Introduction to Microsoft Excel

More information

SPSS Tests for Versions 9 to 13

SPSS Tests for Versions 9 to 13 SPSS Tests for Versions 9 to 13 Chapter 2 Descriptive Statistic (including median) Choose Analyze Descriptive statistics Frequencies... Click on variable(s) then press to move to into Variable(s): list

More information

Northumberland Knowledge

Northumberland Knowledge Northumberland Knowledge Know Guide How to Analyse Data - November 2012 - This page has been left blank 2 About this guide The Know Guides are a suite of documents that provide useful information about

More information

BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I

BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I BNG 202 Biomechanics Lab Descriptive statistics and probability distributions I Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential

More information

Algebra 1 Course Information

Algebra 1 Course Information Course Information Course Description: Students will study patterns, relations, and functions, and focus on the use of mathematical models to understand and analyze quantitative relationships. Through

More information

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012

Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization. Learning Goals. GENOME 560, Spring 2012 Why Taking This Course? Course Introduction, Descriptive Statistics and Data Visualization GENOME 560, Spring 2012 Data are interesting because they help us understand the world Genomics: Massive Amounts

More information

Chapter 11 Introduction to Survey Sampling and Analysis Procedures

Chapter 11 Introduction to Survey Sampling and Analysis Procedures Chapter 11 Introduction to Survey Sampling and Analysis Procedures Chapter Table of Contents OVERVIEW...149 SurveySampling...150 SurveyDataAnalysis...151 DESIGN INFORMATION FOR SURVEY PROCEDURES...152

More information

STATISTICAL ANALYSIS WITH EXCEL COURSE OUTLINE

STATISTICAL ANALYSIS WITH EXCEL COURSE OUTLINE STATISTICAL ANALYSIS WITH EXCEL COURSE OUTLINE Perhaps Microsoft has taken pains to hide some of the most powerful tools in Excel. These add-ins tools work on top of Excel, extending its power and abilities

More information

How To Write A Data Analysis

How To Write A Data Analysis Mathematics Probability and Statistics Curriculum Guide Revised 2010 This page is intentionally left blank. Introduction The Mathematics Curriculum Guide serves as a guide for teachers when planning instruction

More information

ADD-INS: ENHANCING EXCEL

ADD-INS: ENHANCING EXCEL CHAPTER 9 ADD-INS: ENHANCING EXCEL This chapter discusses the following topics: WHAT CAN AN ADD-IN DO? WHY USE AN ADD-IN (AND NOT JUST EXCEL MACROS/PROGRAMS)? ADD INS INSTALLED WITH EXCEL OTHER ADD-INS

More information

Students' Opinion about Universities: The Faculty of Economics and Political Science (Case Study)

Students' Opinion about Universities: The Faculty of Economics and Political Science (Case Study) Cairo University Faculty of Economics and Political Science Statistics Department English Section Students' Opinion about Universities: The Faculty of Economics and Political Science (Case Study) Prepared

More information

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition

Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Bowerman, O'Connell, Aitken Schermer, & Adcock, Business Statistics in Practice, Canadian edition Online Learning Centre Technology Step-by-Step - Excel Microsoft Excel is a spreadsheet software application

More information

03 The full syllabus. 03 The full syllabus continued. For more information visit www.cimaglobal.com PAPER C03 FUNDAMENTALS OF BUSINESS MATHEMATICS

03 The full syllabus. 03 The full syllabus continued. For more information visit www.cimaglobal.com PAPER C03 FUNDAMENTALS OF BUSINESS MATHEMATICS 0 The full syllabus 0 The full syllabus continued PAPER C0 FUNDAMENTALS OF BUSINESS MATHEMATICS Syllabus overview This paper primarily deals with the tools and techniques to understand the mathematics

More information

Lecture 2: Descriptive Statistics and Exploratory Data Analysis

Lecture 2: Descriptive Statistics and Exploratory Data Analysis Lecture 2: Descriptive Statistics and Exploratory Data Analysis Further Thoughts on Experimental Design 16 Individuals (8 each from two populations) with replicates Pop 1 Pop 2 Randomly sample 4 individuals

More information

II. DISTRIBUTIONS distribution normal distribution. standard scores

II. DISTRIBUTIONS distribution normal distribution. standard scores Appendix D Basic Measurement And Statistics The following information was developed by Steven Rothke, PhD, Department of Psychology, Rehabilitation Institute of Chicago (RIC) and expanded by Mary F. Schmidt,

More information

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics.

Business Statistics. Successful completion of Introductory and/or Intermediate Algebra courses is recommended before taking Business Statistics. Business Course Text Bowerman, Bruce L., Richard T. O'Connell, J. B. Orris, and Dawn C. Porter. Essentials of Business, 2nd edition, McGraw-Hill/Irwin, 2008, ISBN: 978-0-07-331988-9. Required Computing

More information

Airport Planning and Design. Excel Solver

Airport Planning and Design. Excel Solver Airport Planning and Design Excel Solver Dr. Antonio A. Trani Professor of Civil and Environmental Engineering Virginia Polytechnic Institute and State University Blacksburg, Virginia Spring 2012 1 of

More information

Introduction to Sampling. Dr. Safaa R. Amer. Overview. for Non-Statisticians. Part II. Part I. Sample Size. Introduction.

Introduction to Sampling. Dr. Safaa R. Amer. Overview. for Non-Statisticians. Part II. Part I. Sample Size. Introduction. Introduction to Sampling for Non-Statisticians Dr. Safaa R. Amer Overview Part I Part II Introduction Census or Sample Sampling Frame Probability or non-probability sample Sampling with or without replacement

More information

Basic Data Analysis. Stephen Turnbull Business Administration and Public Policy Lecture 12: June 22, 2012. Abstract. Review session.

Basic Data Analysis. Stephen Turnbull Business Administration and Public Policy Lecture 12: June 22, 2012. Abstract. Review session. June 23, 2012 1 review session Basic Data Analysis Stephen Turnbull Business Administration and Public Policy Lecture 12: June 22, 2012 Review session. Abstract Quantitative methods in business Accounting

More information

Confidence Intervals for the Difference Between Two Means

Confidence Intervals for the Difference Between Two Means Chapter 47 Confidence Intervals for the Difference Between Two Means Introduction This procedure calculates the sample size necessary to achieve a specified distance from the difference in sample means

More information

An introduction to using Microsoft Excel for quantitative data analysis

An introduction to using Microsoft Excel for quantitative data analysis Contents An introduction to using Microsoft Excel for quantitative data analysis 1 Introduction... 1 2 Why use Excel?... 2 3 Quantitative data analysis tools in Excel... 3 4 Entering your data... 6 5 Preparing

More information

Overview of Non-Parametric Statistics PRESENTER: ELAINE EISENBEISZ OWNER AND PRINCIPAL, OMEGA STATISTICS

Overview of Non-Parametric Statistics PRESENTER: ELAINE EISENBEISZ OWNER AND PRINCIPAL, OMEGA STATISTICS Overview of Non-Parametric Statistics PRESENTER: ELAINE EISENBEISZ OWNER AND PRINCIPAL, OMEGA STATISTICS About Omega Statistics Private practice consultancy based in Southern California, Medical and Clinical

More information

240ST014 - Data Analysis of Transport and Logistics

240ST014 - Data Analysis of Transport and Logistics Coordinating unit: Teaching unit: Academic year: Degree: ECTS credits: 2015 240 - ETSEIB - Barcelona School of Industrial Engineering 715 - EIO - Department of Statistics and Operations Research MASTER'S

More information

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013

Statistics I for QBIC. Contents and Objectives. Chapters 1 7. Revised: August 2013 Statistics I for QBIC Text Book: Biostatistics, 10 th edition, by Daniel & Cross Contents and Objectives Chapters 1 7 Revised: August 2013 Chapter 1: Nature of Statistics (sections 1.1-1.6) Objectives

More information

Handling missing data in large data sets. Agostino Di Ciaccio Dept. of Statistics University of Rome La Sapienza

Handling missing data in large data sets. Agostino Di Ciaccio Dept. of Statistics University of Rome La Sapienza Handling missing data in large data sets Agostino Di Ciaccio Dept. of Statistics University of Rome La Sapienza The problem Often in official statistics we have large data sets with many variables and

More information

Quantitative Methods for Finance

Quantitative Methods for Finance Quantitative Methods for Finance Module 1: The Time Value of Money 1 Learning how to interpret interest rates as required rates of return, discount rates, or opportunity costs. 2 Learning how to explain

More information

Week 1. Exploratory Data Analysis

Week 1. Exploratory Data Analysis Week 1 Exploratory Data Analysis Practicalities This course ST903 has students from both the MSc in Financial Mathematics and the MSc in Statistics. Two lectures and one seminar/tutorial per week. Exam

More information

Statistics Review PSY379

Statistics Review PSY379 Statistics Review PSY379 Basic concepts Measurement scales Populations vs. samples Continuous vs. discrete variable Independent vs. dependent variable Descriptive vs. inferential stats Common analyses

More information

Statistics Graduate Courses

Statistics Graduate Courses Statistics Graduate Courses STAT 7002--Topics in Statistics-Biological/Physical/Mathematics (cr.arr.).organized study of selected topics. Subjects and earnable credit may vary from semester to semester.

More information

Using MS Excel to Analyze Data: A Tutorial

Using MS Excel to Analyze Data: A Tutorial Using MS Excel to Analyze Data: A Tutorial Various data analysis tools are available and some of them are free. Because using data to improve assessment and instruction primarily involves descriptive and

More information

Annex 6 BEST PRACTICE EXAMPLES FOCUSING ON SAMPLE SIZE AND RELIABILITY CALCULATIONS AND SAMPLING FOR VALIDATION/VERIFICATION. (Version 01.

Annex 6 BEST PRACTICE EXAMPLES FOCUSING ON SAMPLE SIZE AND RELIABILITY CALCULATIONS AND SAMPLING FOR VALIDATION/VERIFICATION. (Version 01. Page 1 BEST PRACTICE EXAMPLES FOCUSING ON SAMPLE SIZE AND RELIABILITY CALCULATIONS AND SAMPLING FOR VALIDATION/VERIFICATION (Version 01.1) I. Introduction 1. The clean development mechanism (CDM) Executive

More information
