Basic Data Analysis. Stephen Turnbull Business Administration and Public Policy Lecture 12: June 22, 2012. Abstract. Review session.




June 23, 2012 1 review session Basic Data Analysis Stephen Turnbull Business Administration and Public Policy Lecture 12: June 22, 2012 Review session. Abstract

Quantitative methods in business
Accounting is business's way of handling precise and accurate data. It does not admit mistakes, and it uses techniques like double entry to detect and correct them. In contrast, statistics is business's way of handling imprecise, inaccurate, and uncertain data. There is always a possibility of error, and the job of statistical science is to provide consistent ways of tracking, measuring, and adjusting for error.

Dealing with error
What makes statistics difficult is that there is always error, always uncertainty, and our measurement errors are affected by that uncertainty. For business purposes, however, we can typically formulate the costs of risk and uncertainty in terms of the basic values we are dealing with (e.g., maximizing profit) rather than the uncertainty itself. In economics (in consumer theory and general decision theory), we need to worry about all aspects of uncertainty, all the possible values of the distribution of outcomes, and so on. But in typical business applications we are mostly interested in the distribution of a financial outcome or similar. Then we use our best estimate of the measure of error. We don't need to know more about the error variance beyond our best estimate.
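As a rough illustration of a "best estimate" together with a measure of its error, the sketch below computes a sample mean and its standard error. The profit figures are hypothetical, and reading "measure of error" as the standard error of the mean is an assumption made for this example, not something the lecture specifies.

```python
import math
import statistics

# Hypothetical monthly profits (millions of yen); illustrative values only.
profits = [10.0, 12.0, 8.0, 11.0, 9.0]

def estimate_with_error(values):
    """Return (sample mean, estimated standard error of that mean)."""
    mean = statistics.mean(values)
    # Sample standard deviation (divides by n - 1).
    sd = statistics.stdev(values)
    # Standard error: typical distance between the sample mean and the true mean.
    se = sd / math.sqrt(len(values))
    return mean, se

mean_profit, std_error = estimate_with_error(profits)
print(f"estimated mean profit:    {mean_profit:.2f}")
print(f"estimated standard error: {std_error:.2f}")
```

The point of the sketch is that a single number for the outcome and a single number for its error are often all a business decision needs.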

The job of statistics
We do have good ideas of how to calculate best estimates. If we use those, we are able to make good assessments of the risks involved in many or most business situations. The job of statistics is not to make decisions directly. Making decisions involves value judgments, and value judgments need to be made by the affected parties. In business, of course, that's management. The job of statistics, then, is to come up with algorithms (known methods of computation) to assess values and the errors and risks associated with them.

Measurement of value
Human beings have relatively imprecise notions of value, even when units are well defined (such as yen and dollars). We deal with big and small losses differently. Nevertheless, statistics needs numbers. Always remember that the numbers only partly reflect what people want and think. Defining and getting the numbers is not the job of statistics, but of various specialties such as economics, finance, marketing, and so on.

Measurement of uncertainty
Human beings' notion of uncertainty is partly quantitative and partly qualitative. We can usually say that one event is more or less likely than another. But we usually can't assign additive probabilities to these evaluations. Often we don't have a clear idea of how events relate to each other. This means that there is tension between the human side of decision-making and the quantitative side. The human is better at making judgments of what he really wants, but these judgments are not useful for extended calculations.

Making extended calculations
The purpose of statistics is to provide the foundation of, and methods for, extended calculations about uncertainty. Statistics quantifies uncertainty, and for that purpose we use probability. By using probability, we can develop standard methods for measuring the uncertainty and reliability of our quantitative statements.

Experiments and observation
Sometimes we merely study past history, and infer natural laws from that. This is called an observational study. Sometimes we can test our hypothetical laws directly. This is called an experimental study. Experimental studies are preferred by scientists because we can directly infer cause and effect: only our proposed cause changes, so if the effect changes as well, there's a direct relationship. When we have uncontrolled observations, we must have a model of the behavior and of the uncertainty to conduct inference.

Randomized experiments
Often there are many important factors that we cannot control as precisely as physical scientists do, especially when dealing with human beings. Sometimes we can observe those factors, sometimes we cannot. It is important to recognize that we don't always know what the important factors are, and unrecognized factors will often not be observed or recorded at all. We can ensure that those uncontrolled factors have a specific distribution by use of random samples.
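The effect of randomization on an unobserved factor can be sketched with a small simulation. Everything here is hypothetical: the "hidden factor" is a uniform random number, and subjects are assumed to arrive sorted by that factor so that a naive split is visibly unbalanced.

```python
import random

rng = random.Random(2012)  # fixed seed so the run is repeatable

# Hypothetical subjects, each represented only by a hidden factor we never observe.
# They arrive sorted by that factor (e.g. the keenest customers sign up first).
subjects = sorted(rng.random() for _ in range(10_000))
half = len(subjects) // 2

def mean(xs):
    return sum(xs) / len(xs)

# Naive assignment: first arrivals get the treatment, the rest are controls.
naive_gap = abs(mean(subjects[:half]) - mean(subjects[half:]))

# Random assignment: shuffle a copy, then split into two groups.
shuffled = subjects[:]
rng.shuffle(shuffled)
random_gap = abs(mean(shuffled[:half]) - mean(shuffled[half:]))

print(f"gap in hidden factor, naive split:  {naive_gap:.3f}")
print(f"gap in hidden factor, random split: {random_gap:.3f}")
```

With the naive split the two groups differ strongly in the hidden factor; after shuffling, both groups have nearly the same distribution of it, even though it was never measured.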

The distribution of factors in random samples
We usually don't know what that distribution is quantitatively. We can say that the observed distribution of the sample approximates that of the underlying population. We say that such a sample is representative. We make the meaning of "approximate" exact by using the Law of Large Numbers: we can make the probability of a large error as small as we like, at the cost of taking a sufficiently large sample. If the purpose of our study is to understand the unknown distribution, then random sampling helps a lot.
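The Law of Large Numbers statement can be illustrated by simulation. The sketch assumes a hypothetical population in which 30% of individuals have some property, and it uses Chebyshev's inequality as one standard (and fairly loose) bound on the probability of a large error; the lecture does not say which bound it has in mind.

```python
import random

rng = random.Random(2012)   # fixed seed so the run is repeatable
TRUE_SHARE = 0.3            # hypothetical population share, assumed known here

def sample_share(n):
    """Share of individuals with the property in a random sample of size n."""
    hits = sum(1 for _ in range(n) if rng.random() < TRUE_SHARE)
    return hits / n

def chebyshev_bound(n, eps):
    """Upper bound on P(|sample share - true share| >= eps)."""
    variance = TRUE_SHARE * (1 - TRUE_SHARE)
    return min(1.0, variance / (n * eps * eps))

errors = {n: abs(sample_share(n) - TRUE_SHARE) for n in (100, 10_000, 100_000)}
for n, err in errors.items():
    bound = chebyshev_bound(n, 0.01)
    print(f"n={n:>7}: observed error={err:.4f}, P(error >= 0.01) <= {bound:.3f}")
```

The observed error in any single run need not shrink at every step, but the bound on the probability of an error of at least 0.01 falls in proportion to 1/n, which is the sense in which a larger sample buys a smaller chance of a large error.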

Distribution
The fundamental idea of statistics is that of distribution. We assume that we do not care about the individuals, but only about certain measurable properties. (Cf. accounting, where the individuals do matter.) We count the number of individuals with each combination of values of properties. We can count either the number (an absolute distribution) or the fraction (a relative distribution). We can count for a particular combination or range of combinations, called a cell (a frequency distribution), or for all cells up to some level (a cumulative distribution). Sometimes we need to account for the size of cells; in that case we use a density.
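The kinds of distribution named here can be computed in a few lines. The data below are hypothetical (purchases per customer), and each distinct value is treated as its own cell.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical survey: number of purchases made by each of 10 customers.
purchases = [0, 1, 1, 2, 2, 2, 3, 3, 4, 1]

n = len(purchases)
absolute = Counter(purchases)            # absolute frequency: count per cell
cells = sorted(absolute)                 # cells in increasing order
relative = {c: absolute[c] / n for c in cells}   # relative frequency: fraction per cell
# Cumulative distribution: count for all cells up to and including each level.
cumulative = dict(zip(cells, accumulate(absolute[c] for c in cells)))

for c in cells:
    print(f"cell {c}: absolute={absolute[c]}, relative={relative[c]:.1f}, "
          f"cumulative={cumulative[c]}")
```

Note that the individual customers have disappeared from the result: only the counts per cell remain, which is exactly the sense in which a distribution ignores individuals.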

Distributions are represented graphically by histograms.
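A minimal text histogram shows why cell size matters. The incomes and cell boundaries below are hypothetical, and the cells are deliberately of unequal width, so bar length is taken from the density (relative frequency divided by cell width) rather than from the raw count.

```python
# Hypothetical incomes (in 10,000 yen) and unequal-width cells [lo, hi).
incomes = [150, 220, 280, 310, 340, 390, 420, 480, 650, 900]
cells = [(0, 200), (200, 400), (400, 500), (500, 1000)]

def histogram(values, cells):
    """Return one (lo, hi, count, density) row per cell."""
    n = len(values)
    rows = []
    for lo, hi in cells:
        count = sum(1 for v in values if lo <= v < hi)
        # Density: relative frequency per unit of cell width.
        density = count / n / (hi - lo)
        rows.append((lo, hi, count, density))
    return rows

rows = histogram(incomes, cells)
for lo, hi, count, density in rows:
    bar = "#" * round(density * 10_000)   # scale factor chosen only for display
    print(f"[{lo:>4}, {hi:>4}) count={count} density={density:.4f} {bar}")
```

The last two cells contain the same number of individuals, yet the narrower cell gets a much longer bar, because its individuals are packed into a smaller range. With densities, the areas of the bars (density times width) sum to 1.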