Foundation of Quantitative Data Analysis




Part 1: Data manipulation and descriptive statistics with SPSS/Excel

D.B. Khang, HSRS #10 - October 17, 2013

Reference: A. Aczel, Complete Business Statistics, Chapters 1 and 2
Assignment #3: Replicate the classroom exercises.

Objectives

At the end of this lesson, you should be able to:
- Understand the role of statistical analysis in empirical research
- Use Excel and SPSS for data manipulation and the simplest statistical operations
- Recall enough basic probability theory to interpret the findings of statistical analysis properly

Statistical Analysis

Data → Information → Knowledge → Decisions and actions

Statistical analysis: a set of scientific methods used to analyze data in order to provide meaningful information for better understanding and decision making, through:
- An approximation of the real world
- Measurements of the errors of this approximation

Based on the data available and the purpose, we may classify statistical analysis as:
- Descriptive statistics: summarizing and presenting the (population or census) data in order to:
  - Provide insights
  - Explain
  - Assess and evaluate
- Inferential statistics: analysis of the data available (from a sample, an experiment, etc.) to draw conclusions about a larger or unseen group (population, future events, etc.) in order to:
  - Estimate and predict
  - Test hypotheses
  - Provide insights and explain

Types of data

Non-metric (or qualitative) data:
- Nominal: the size of the number is not related to the amount of the characteristic being measured
  - Refers to names or attributes only
  - Examples: brand, color, sex, profession, etc.
- Ordinal: larger numbers indicate more (or less) of the characteristic measured, but not how much more (or less)
  - Refers to ranking
  - Examples: ranks, preferences, age groups, social classes, etc.

Metric (or quantitative) data:
- Interval: has the ordinal properties and, in addition, equal differences between scale points
  - Examples: temperature, date, index number, etc.
- Ratio: has the interval-scale properties and, in addition, a natural zero point
  - Examples: length, counts, weight, sales, age, etc.

Notes:
- The level of the data is critical in determining the appropriate technique to use
- Statistics deals with all kinds of data, assuming that we have enough of them
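The difference between interval and ratio data can be checked numerically. The sketch below is only an illustration with made-up values: it assumes temperature recorded in Celsius or Fahrenheit as the interval example and length as the ratio example. Changing the unit keeps equal differences equal in both cases, but only the ratio-scale variable keeps a statement such as "twice as much" intact.

```python
# Illustrative values only; none of these numbers come from the lesson.

def celsius_to_fahrenheit(celsius):
    """Convert a Celsius reading to Fahrenheit."""
    return celsius * 9 / 5 + 32

# Interval scale: three temperatures spaced 10 degrees Celsius apart.
temps_c = [10.0, 20.0, 30.0]
temps_f = [celsius_to_fahrenheit(t) for t in temps_c]

# Equal differences stay equal after the unit change (18 degrees F each).
step_1_f = temps_f[1] - temps_f[0]
step_2_f = temps_f[2] - temps_f[1]

# Ratios do not survive the unit change, because zero is arbitrary.
ratio_celsius = temps_c[1] / temps_c[0]      # 2.0
ratio_fahrenheit = temps_f[1] / temps_f[0]   # 68 / 50 = 1.36

# Ratio scale: lengths have a natural zero, so ratios are meaningful.
lengths_m = [2.0, 4.0]
lengths_cm = [length * 100 for length in lengths_m]
ratio_metres = lengths_m[1] / lengths_m[0]
ratio_centimetres = lengths_cm[1] / lengths_cm[0]

print("Fahrenheit steps:", step_1_f, step_2_f)
print("Temperature ratios:", ratio_celsius, ratio_fahrenheit)
print("Length ratios:", ratio_metres, ratio_centimetres)
```

This is why a mean or a difference is reasonable for interval data, while statements about proportions are safer reserved for ratio data.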

Storage of data for analysis

Good storage of raw quantitative data is essential for meaningful manipulation, summary, presentation and analysis.

Most databases store data in the format of a table:
- Rows are the data items or subjects
- Columns are the measurements or values assigned to (collected for) the items: the variables
- Data stored in most databases is transferable

Basic data management skills to be developed through practice:
- Enter data into Excel and SPSS, and provide explanations of variables and scores
- Transfer data between these two platforms
- Calculate new variables from the existing data entered

Practical tips:
- Data should be coded numerically
- Full documentation (meanings of variables and their values)
- Consistency: data collection, storage and analysis
- Manipulations of stored data are acceptable but should be transparent

Classroom exercise 1

Consider the data set HBAT.sav.
- Read the description of the data and try to understand the meaning of the variables in the data set.
- Identify the metric and the non-metric variables, and the meanings of the values of the variables.
- Save the file as an Excel file.
- Transfer the file back into an SPSS data file.
- Try to reformat both files for better readability.
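The table layout and the management skills above (numeric coding, documentation, computing a new variable, transferring between platforms) can be sketched outside SPSS and Excel as well. The example below is a minimal Python illustration, not the lesson's procedure: the variable names, codes and values are hypothetical and are not taken from HBAT.sav, and CSV text stands in for the exchange format between two programs.

```python
import csv
import io

# Hypothetical data: one row per subject, one column per variable.
rows = [
    {"id": 1, "region": 1, "score_a": 8.2, "score_b": 7.0},
    {"id": 2, "region": 2, "score_a": 5.7, "score_b": 6.1},
    {"id": 3, "region": 1, "score_a": 8.9, "score_b": 9.3},
]

# Documentation of the numeric codes (hypothetical labels).
codebook = {"region": {1: "domestic", 2: "export"}}

# Calculate a new variable from existing ones, in a transparent way.
for row in rows:
    row["score_mean"] = (row["score_a"] + row["score_b"]) / 2

# "Transfer" the table: write it out as CSV text...
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)

# ...and read it back, restoring the numeric types that CSV does not keep.
buffer.seek(0)
restored = []
for record in csv.DictReader(buffer):
    restored.append({
        "id": int(record["id"]),
        "region": int(record["region"]),
        "score_a": float(record["score_a"]),
        "score_b": float(record["score_b"]),
        "score_mean": float(record["score_mean"]),
    })

for row in restored:
    print(row["id"], codebook["region"][row["region"]], row["score_mean"])
```

The round trip shows why the codebook matters: the exchange file carries only the numbers, so the meaning of each code has to be documented separately.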

Summarizing and presenting data

Most often, data should be summarized and presented in sensible ways that support our objectives (that is, to provide insights, to explain or to evaluate).

Options usually include:
- Presenting summarized distributions: frequency tables, percentiles
- Using measures of central tendency as representative statistics: averages, medians, modes
- Using measures of variability: ranges, variances, standard deviations, inter-quartile ranges
- Using other descriptive statistics: min, max, quartiles, skewness, kurtosis, etc.
- Using tabulations and cross tabulations
- Using graphs and diagrams: line graphs, bar charts, pie charts, frequency diagrams, histograms, box plots and other statistical graphs

Most of these are supported by Excel and SPSS.

Classroom exercise 2

- Apply the descriptive statistical tools of SPSS/Excel to the variables X18 and X19 of the HBAT data set and interpret the results.
- Apply a pie chart to X1 and a histogram to X19.
- Draw the scatter graph of X18 and X19 and interpret the results.
- Draw the frequency tables of X1 and X2 and interpret the results.
- Apply cross tabulation to X1 and X2 and interpret the results.
- Apply cross tabulation with two layers to X1, X3 and X4 and interpret the results.
- Copy the above tables into an Excel file for possible formatting.
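For readers who want to see what these summaries compute, here is a small Python sketch using only the standard library. It is an illustration under stated assumptions: the scores and category labels are invented (they are not HBAT values), and the quartiles use the default method of `statistics.quantiles`, which may differ slightly from the quartile conventions in SPSS or Excel.

```python
import statistics
from collections import Counter

# Hypothetical metric variable (for example, a satisfaction score).
scores = [5.0, 6.5, 7.0, 7.0, 8.0, 8.5, 9.0]

# Measures of central tendency.
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of variability.
data_range = max(scores) - min(scores)
sample_variance = statistics.variance(scores)   # divides by n - 1
sample_sd = statistics.stdev(scores)
q1, q2, q3 = statistics.quantiles(scores, n=4)  # quartiles
iqr = q3 - q1

# Hypothetical non-metric variables, one value per subject.
group = ["A", "B", "A", "A", "B", "A"]
size = ["small", "large", "large", "small", "large", "small"]

# Frequency table of one variable, with percentages.
frequency = Counter(group)
percent = {key: 100 * count / len(group) for key, count in frequency.items()}

# Cross tabulation: counts for each (group, size) combination.
crosstab = Counter(zip(group, size))

print("mean", mean, "median", median, "mode", mode)
print("range", data_range, "sd", sample_sd, "IQR", iqr)
print("frequency", dict(frequency), "percent", percent)
for (g, s), count in sorted(crosstab.items()):
    print(g, s, count)
```

A cross tabulation is simply a frequency table over pairs of values, which is why a counter keyed by `(group, size)` is enough for this sketch.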

Classroom exercise 3

- Create in Excel and SPSS a new variable:

  Z19 = (X19 - μ) / σ

  where μ is the mean of X19 and σ is the standard deviation of X19.
- Apply descriptive statistical tools to Z19 and interpret the results.
- Draw the histograms of X19 and Z19 and interpret the results.

Note: Z19 is called the standardized variable of X19.

Review of probability and distribution

Probability:
- Defined on random events (occurrences)
- Takes values between 0 and 1
- Can be interpreted as the limit of relative frequency (objective probability)

Note: We often also use subjective probabilities, especially in decision making under uncertainty. Such probabilities simply express the extent of our belief in the occurrence of uncertain events. However, most of statistics deals with the objective interpretation, based on random sampling of data.

Random variable:
- The outcome of a measurement (or survey question) taken at random from a given population. Usually we have only sample values of the variable.
- A random variable can (only) be described by its distribution.
- The distribution of a random variable can be approximated from observed values using summary statistics, a histogram, a frequency table or various charts.
- The distribution of a real random variable can also be approximated by theoretical distributions such as the normal, uniform, Student's t, chi-square, etc.

Notation and examples:
- Probability: P(customer is from magazine industry) = 0.52
- Random variable: X19 = customer satisfaction score
- Combined: P(X19 >= 7.8) = ?
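Both ideas above, the standardized variable and probability as relative frequency, can be tried on a few numbers. The Python sketch below makes several assumptions: the scores are invented rather than taken from X19, the sample standard deviation (dividing by n - 1) is used because the exercise does not say which formula to apply, and the probability is only a relative-frequency estimate from the sample.

```python
import statistics

# Hypothetical satisfaction scores standing in for X19.
x = [6.0, 7.5, 8.0, 7.8, 5.5, 9.0, 8.4, 6.8]

# Standardize: subtract the mean, divide by the standard deviation.
mu = statistics.mean(x)
sigma = statistics.stdev(x)  # sample formula; pstdev would be the population one
z = [(value - mu) / sigma for value in x]

# A standardized variable has mean 0 and standard deviation 1
# (up to floating-point rounding) when the same formula is used for both.
z_mean = statistics.mean(z)
z_sd = statistics.stdev(z)

# Relative-frequency estimate of P(X >= 7.8) from the observed values.
threshold = 7.8
p_hat = sum(1 for value in x if value >= threshold) / len(x)

print("mean of z:", round(z_mean, 10), "sd of z:", round(z_sd, 10))
print("estimated P(X >= 7.8):", p_hat)
```

Standardizing changes the location and scale of the values but not the shape of their distribution, which is what the histogram comparison in the exercise is meant to show.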

A small challenge

A two-headed coin, a two-tailed coin and an ordinary coin are placed in a bag. One of the coins is drawn at random and flipped; it comes up heads. What is the probability that there is a head on the other side of this coin?

Solution:
- There are 6 sides, of which 3 are heads: one from the ordinary coin and two from the two-headed coin. Call them H1 (on the ordinary coin), H2 and H3.
- Each side has an equal chance of coming up.
- If you see H1, the other side is a tail; if you see H2 or H3, the other side is a head.
- So, given that a head has come up, the probability that the other side is also a head is 2/3.
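The counting argument in the solution can be checked in two ways: by listing the six equally likely sides, and by simulating many draws. The Python sketch below does both; the function name, the number of trials and the seed are arbitrary choices made for this illustration.

```python
import random
from fractions import Fraction

# Each coin is a pair of sides: two-headed, two-tailed, ordinary.
coins = [("H", "H"), ("T", "T"), ("H", "T")]

# Exact answer: enumerate all six (side facing up, side facing down) outcomes.
outcomes = [(coin[i], coin[1 - i]) for coin in coins for i in (0, 1)]
heads_up = [(up, down) for up, down in outcomes if up == "H"]
exact = Fraction(sum(1 for up, down in heads_up if down == "H"), len(heads_up))

def simulate(trials, seed=0):
    """Estimate P(other side is a head | a head is showing) by simulation."""
    rng = random.Random(seed)
    head_shown = 0
    head_on_both = 0
    for _ in range(trials):
        coin = rng.choice(coins)   # draw a coin at random
        side = rng.randrange(2)    # flip: pick which side lands up
        if coin[side] == "H":
            head_shown += 1
            if coin[1 - side] == "H":
                head_on_both += 1
    return head_on_both / head_shown

estimate = simulate(100_000)
print("exact:", exact)
print("simulated:", round(estimate, 3))
```

The simulation only counts flips where a head is showing, which mirrors the conditioning in the question; the tempting answer of 1/2 comes from counting coins instead of sides.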