Statistics: Descriptive Statistics &
Mediterranean Agronomic Institute of Chania & University of Crete
M.Sc. Program in Business Economics and Management
7 October 2013, Vol. 1.1
Describing the Statistical Problem

Definition
Statistics is a branch of mathematical science that deals with uncertain (or random) phenomena with the help of sampling.

Phenomena
1. Random (or uncertain): the outcome of tossing a coin, the outcome of a bet, the maximum speed of a car, daily rain precipitation, the number of daily births, the return of a stock index, the level of sales, the number of students attending a class, etc.
2. Non-random (certain): the sunrise, the sunset, gravity on Earth, etc.
3. Chaotic: extreme financial events, earthquakes, tsunamis, etc.
Quotations on randomness
1. Aristotle: The probable is what usually happens.
2. Democritus: Everything existing in the universe is the fruit of chance.
3. Plato (in Phaedo): I know too well that these arguments from probabilities are imposters, and unless great caution is observed in the use of them, they are apt to be deceptive.
4. Heraclitus: There is nothing permanent except change.
5. Descartes (in Discourse on Method): It is a truth very certain that when it is not in our power to determine what is true we ought to follow what is most probable.
Random Variable (r.v.)
A random variable is the result of a random experiment, which is characterised by uncertainty. Examples:
1. The number of heads when tossing a coin ten (10) times
2. The daily consumption of a person
3. The maximum speed of a car
4. A stock return today
5. Rain precipitation on a given day
6. The number of students attending a class

Sampling
Statistical sampling, through the collection of a representative number of observations of a random variable, tells us about the statistical characteristics of that random variable.
Defining Sample and Population

Sample
A sample is a smaller number (a subset) of the people or objects that exist within a population.

Population
The population, also referred to as the universe, is the entire set of people or objects of interest. It could be:
1. An infinite number of coin-tossing experiments
2. All adult citizens in a country
3. All cars driven in a country
4. All stock returns
5. All places where rain happens
6. All students in a country
Defining Sample and Population (cont.)

Important!!
A sample is said to be representative if its members tend to have the same characteristics (e.g., region, shopping behaviour, age, income, educational level) as the population from which they were selected.
1. For example, if 45% of the population consists of female drivers, we would like our sample to also include 45% females.
2. When a sample is so large as to include all members of the population, it is referred to as a complete census.
Descriptive Statistics

A simple definition
In descriptive statistics, we simply summarize and describe the data we have collected. For example:
1. Observing the car speed at a specific location on an avenue, you find that the mean speed is 15.4 mph and that the probability that a car reaches a speed of at least 20 mph is 24%. This is descriptive statistics. You are merely describing the data that you have recorded!!
2. According to the Bureau of the Census, there has been an increase of 200% in average UK gas consumption after the year 1980.
3. Rain precipitation data from different locations are characterised by large variation.
Statistical Inference

A simple definition
In inferential statistics, sometimes referred to as inductive statistics, we go beyond mere description of the data and arrive at inferences regarding the phenomenon or phenomena for which sample data were obtained. For example:
1. Having observed the speeds of many cars, the traffic regulator may impose a more realistic speed limit.
2. Having observed the average gas consumption, the ministry of energy may decide to redirect its energy production in order to match future needs.
3. Because of the large variation observed in rain precipitation data from different locations, the Meteorology office decides to increase the number of locations where data are collected.
Types

Qualitative
Qualitative data are expressed in words (categories) rather than measured by numbers. Some of the variables associated with people or objects are qualitative in nature, indicating that the person or object belongs in a category. For example:
1. You are either male or female
2. You are less than 25 years old or not
3. You have a small or a large household
4. You are located in Crete or not
5. You are located in a mountainous area or not
Types (cont.)

Quantitative
Quantitative data are data that can be measured by numbers. There are two types of quantitative variables: discrete and continuous.
1. Discrete quantitative variables can take on only certain values along an interval, with the possible values having gaps between them. Examples of discrete quantitative variables would be the number of employees on the payroll of a manufacturing firm, the number of students attending a class, or the number of births on each calendar day. Discrete variables in business statistics usually consist of observations that we can count and that take integer values.
2. Continuous quantitative variables can take on a value at any point along an interval. For example, the stock index return can take at a given moment the value of 0.0493 or 0.049372, depending on the accuracy with which the return can be measured. The possible values have no gaps between them. Other examples of continuous quantitative variables are the car speed, the rain precipitation, etc.
Types (cont.)

Dummy
Researchers sometimes convert qualitative data to quantitative data using so-called dummy data. Examples:
1. For car speed data, the qualitative variable Sex can take the answer YES for male motorists and the answer NO for female motorists. This can be quantified as 1 for male and 0 for female.
2. For rain precipitation data, the qualitative variable Location can take the answer YES for a mountainous location and the answer NO for a non-mountainous location. This can be quantified as 1 and 0 respectively.
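As a minimal sketch of the dummy coding described above, the Python snippet below maps a qualitative answer to 1 or 0. The function name `to_dummy` and the list of location labels are hypothetical and serve only as an illustration.

```python
def to_dummy(values, yes_category):
    """Map each qualitative value to 1 if it equals yes_category, else 0."""
    return [1 if value == yes_category else 0 for value in values]


# Hypothetical location labels for five rain gauges (not from the slides).
locations = ["mountainous", "non-mountainous", "mountainous",
             "non-mountainous", "non-mountainous"]
location_dummy = to_dummy(locations, "mountainous")
print(location_dummy)
```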
Types (cont.)

Types of quantitative data
1. Cross-section data: these data might refer to people, companies, locations or countries at a given time t:
   $x_1, \dots, x_N$
   where N is the total number of cross-sectional observations at time t. (Important!! The ordering of the data does not matter.)
2. Time-series data: these data might refer to people, companies, locations or countries, collected over a sequence of time intervals at a given location l:
   $x_1, \dots, x_T$
   where T is the total number of time-series observations at location l. (Important!! The ordering of the data does matter.)
Types (cont.)

Transformation in quantitative data
In various time-series applications it is necessary to take raw data from one source and transform them into a different form for the empirical analysis, for example:
1. The percentage change of sales
2. The percentage change of a stock index

Suppose $x_1, \dots, x_T$ represent a stock index time series. Then, for $t \in \{1, \dots, T-1\}$,
$$\frac{x_{t+1} - x_t}{x_t} \times 100$$
stands for the percentage change in this index.
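The percentage-change transformation can be sketched in a few lines of Python. The index levels below are hypothetical numbers chosen for easy arithmetic, and `percentage_change` is an illustrative name.

```python
def percentage_change(series):
    """Return 100 * (x[t+1] - x[t]) / x[t] for t = 1, ..., T-1."""
    return [100.0 * (nxt - cur) / cur for cur, nxt in zip(series, series[1:])]


# Hypothetical stock index levels (T = 3), not taken from the slides.
index_levels = [100.0, 110.0, 99.0]
changes = percentage_change(index_levels)
print(changes)  # T - 1 = 2 percentage changes
```

A series of T observations yields T - 1 percentage changes, which is why the formula runs only up to t = T - 1.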
Values

Table: Car Speed ($x_i$: value, $f_i$: frequency, $F_i$: cumulative frequency)

x_i   f_i   F_i
  4     2     2
  7     2     4
  8     1     5
  9     1     6
 10     3     9
 11     2    11
 12     4    15
 13     4    19
 14     4    23
 15     3    26
 16     2    28
 17     3    31
 18     4    35
 19     3    38
 20     5    43
 22     1    44
 23     1    45
 24     4    49
 25     1    50
Classified Values

Table: Car Speed ($x_i$: class, $f_i$: frequency, $F_i$: cumulative frequency)

x_i        f_i   F_i
[0-5]        2     2
[6-10]       7     9
[11-15]     17    26
[16-20]     17    43
[21-25]      7    50
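The step from the discrete table to the classified table can be reproduced as follows. This is a sketch only: the dictionary `speed_freq` is transcribed from the Car Speed table, and the variable names are illustrative.

```python
from itertools import accumulate

# Frequencies f_i for each observed car speed x_i (from the Car Speed table).
speed_freq = {4: 2, 7: 2, 8: 1, 9: 1, 10: 3, 11: 2, 12: 4, 13: 4, 14: 4,
              15: 3, 16: 2, 17: 3, 18: 4, 19: 3, 20: 5, 22: 1, 23: 1,
              24: 4, 25: 1}

# Cumulative frequencies F_i of the discrete values.
values = sorted(speed_freq)
cumulative = list(accumulate(speed_freq[x] for x in values))

# Classes as printed in the classified table (both bounds inclusive).
classes = [(0, 5), (6, 10), (11, 15), (16, 20), (21, 25)]
class_freq = [sum(f for x, f in speed_freq.items() if low <= x <= high)
              for low, high in classes]
class_cumulative = list(accumulate(class_freq))

for (low, high), f, big_f in zip(classes, class_freq, class_cumulative):
    print(f"[{low}-{high}]", f, big_f)
```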
[Figure: Car Speed. Two panels titled "car speed"; axes: speed, Frequency.]
[Figure: Advertisement Expenditure (in thousand Euro). Two panels titled "advertisement in thous. Euro"; axes: adv, Frequency.]
[Figure: Gas consumption in UK (quarterly data for 1960-1985), plotted against Time.]
[Figure: Toyota car sales in Greece (monthly data for 1998-2003), plotted against Time.]
Main Objectives
1. Describe data using measures of location (central tendency)
2. Describe data using measures of dispersion
3. Describe data using probability measures
4. Compare data measured in different units using standardised techniques
5. Express the relationship between two (2) different random variables
Mean

Description
The mean (or mean average) expresses the arithmetic average of a random variable for a given sample of our experiment.

How to estimate it?
1. For a sample of N observations:
   $$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$
2. For a sample of $N = \sum_{i=1}^{k} f_i$ observations that take k distinct values:
   $$\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}$$
   with $f_i$ the frequency of each value $x_i$.
3. For the population:
   $$\mu_x = \frac{1}{N}\sum_{i=1}^{N} x_i$$
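As an illustration of the frequency-weighted formula, the sketch below computes the mean of the car speed data in two ways: directly from the frequencies, and from the expanded list of raw observations. The dictionary is transcribed from the Car Speed table, and the variable names are illustrative.

```python
# Frequencies f_i for each observed car speed x_i (from the Car Speed table).
speed_freq = {4: 2, 7: 2, 8: 1, 9: 1, 10: 3, 11: 2, 12: 4, 13: 4, 14: 4,
              15: 3, 16: 2, 17: 3, 18: 4, 19: 3, 20: 5, 22: 1, 23: 1,
              24: 4, 25: 1}

# Frequency-weighted mean: sum(f_i * x_i) / sum(f_i).
n = sum(speed_freq.values())
mean_from_freq = sum(f * x for x, f in speed_freq.items()) / n

# Same mean from the raw observations: each x_i repeated f_i times.
raw = [x for x, f in speed_freq.items() for _ in range(f)]
mean_from_raw = sum(raw) / len(raw)

print(n, mean_from_freq, mean_from_raw)
```

Both routes give the same value, since the weighted sum simply groups identical observations.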
Variance

Description
Variance (or volatility) expresses a measure of the dispersion of a random variable from its mean.

How to estimate it?
1. For a sample of N observations ($N \ge 30$):
   $$S^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2$$
2. For a sample with $N < 30$:
   $$S^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2$$
Variance

How to estimate it? (cont.)
1. For the population (and for samples with $N \ge 30$):
   $$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \mu^2$$

Standard Deviation
1. For the sample: $S = \sqrt{S^2}$
2. For the population: $\sigma = \sqrt{\sigma^2}$
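The variance formulas can be compared on a small example. This is a sketch under stated assumptions: the sample is hypothetical, and the `divisor_offset` argument is an illustrative way of switching between the 1/N and 1/(N-1) forms.

```python
import math


def variance(data, divisor_offset=0):
    """Variance with divisor N - divisor_offset (0: 1/N form, 1: 1/(N-1) form)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - divisor_offset)


# Hypothetical small sample (N = 8 < 30), not taken from the slides.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(sample)
mean = sum(sample) / n

var_n = variance(sample)                # 1/N form
var_n_minus_1 = variance(sample, 1)     # 1/(N-1) form, used when N < 30
# Shortcut form: (1/N) * sum(x_i^2) - mean^2, equal to the 1/N form.
shortcut = sum(x ** 2 for x in sample) / n - mean ** 2
std_n = math.sqrt(var_n)                # standard deviation

print(var_n, var_n_minus_1, shortcut, std_n)
```

The 1/(N-1) form is always slightly larger than the 1/N form, and the gap shrinks as N grows.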
Car Speed data

Estimate mean and variance when using Discrete Values
$\bar{x} = 15.4$, $S^2 = 27.4$ (1/N form, computed from the Car Speed frequency table)

How to estimate mean and variance when using Classified Values
1. Set the class midpoints $z_i$ in place of the $x_i$. These are: {3, 8, 13, 18, 23}
2. Compute the new $\bar{z}$ and $S^2$ values over the k = 5 classes:
   $$\bar{z} = \frac{\sum_{i=1}^{k} f_i z_i}{\sum_{i=1}^{k} f_i} = 15$$
   $$S^2 = \frac{1}{N}\sum_{i=1}^{k} z_i^2 f_i - \bar{z}^2 = 26$$
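The classified-values calculation can be checked with a short Python sketch. The midpoints and class frequencies are taken from the slides; the variable names are illustrative.

```python
# Class midpoints z_i and class frequencies f_i from the classified table.
midpoints = [3, 8, 13, 18, 23]
frequencies = [2, 7, 17, 17, 7]

n = sum(frequencies)
# Mean of the classified values: sum(f_i * z_i) / sum(f_i).
z_bar = sum(f * z for f, z in zip(frequencies, midpoints)) / n
# Variance of the classified values: (1/N) * sum(z_i^2 * f_i) - z_bar^2.
s2 = sum(f * z ** 2 for f, z in zip(frequencies, midpoints)) / n - z_bar ** 2

print(n, z_bar, s2)
```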
Median (M)

Description
The median is a measure of location: it is the value of the random variable below which 50% of the probability (or frequency) of the whole sample lies.

How to estimate it?
$$M = \{x : F(x) \ge N/2\}$$
taking the smallest such x.
Percentiles

Description
Percentiles are measures of location: the P-th percentile is the value of the random variable below which P% of the probability (or frequency) of the whole sample lies.

How to estimate it?
1. 1st quartile (25th percentile): $Q_1 = \{x : F(x) \ge N/4\}$
2. 2nd quartile (median, 50th percentile): $Q_2 = \{x : F(x) \ge N/2\}$
3. 3rd quartile (75th percentile): $Q_3 = \{x : F(x) \ge 3N/4\}$
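A minimal sketch of these rules, assuming the set notation means "the smallest x whose cumulative frequency F(x) reaches the required share of N". The frequency dictionary is transcribed from the Car Speed table, and `quantile_from_freq` is an illustrative name.

```python
# Frequencies f_i for each observed car speed x_i (from the Car Speed table).
speed_freq = {4: 2, 7: 2, 8: 1, 9: 1, 10: 3, 11: 2, 12: 4, 13: 4, 14: 4,
              15: 3, 16: 2, 17: 3, 18: 4, 19: 3, 20: 5, 22: 1, 23: 1,
              24: 4, 25: 1}


def quantile_from_freq(freq, fraction):
    """Smallest x whose cumulative frequency F(x) is at least fraction * N."""
    n = sum(freq.values())
    cumulative = 0
    for x in sorted(freq):
        cumulative += freq[x]
        if cumulative >= fraction * n:
            return x
    raise ValueError("fraction must lie in (0, 1]")


q1 = quantile_from_freq(speed_freq, 0.25)
q2 = quantile_from_freq(speed_freq, 0.50)  # the median M
q3 = quantile_from_freq(speed_freq, 0.75)
print(q1, q2, q3)
```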
Car Speed data

Estimate the Median using Classified Values
$$Q_2 = l_i + \frac{\delta_i}{f_i}\left[\frac{N}{2} - F_{i-1}\right]$$
where
1. $l_i$ is the lower bound of the class in which $F(x) \ge N/2$ is first reached
2. $\delta_i$ is the width of that class
3. $f_i$ is the frequency of that class
4. $F_{i-1}$ is the cumulative frequency of the preceding class, $i - 1$
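The interpolation formula can be sketched as below. Two inputs are assumptions rather than values stated in the slides: the lower bound $l_i$ is read literally from the printed class label (11 for the class [11-15]) and the width $\delta_i$ is taken as 5. A different boundary convention (for example a lower bound of 10 or 10.5) shifts the result by the same amount.

```python
def grouped_median(classes, width):
    """Median by linear interpolation within the class where F reaches N/2.

    classes: list of (lower_bound, frequency) pairs in increasing order.
    width:   common class width (delta_i), assumed equal for all classes.
    """
    n = sum(f for _, f in classes)
    cumulative_before = 0  # F_{i-1}
    for lower, f in classes:
        if cumulative_before + f >= n / 2:
            return lower + (width / f) * (n / 2 - cumulative_before)
        cumulative_before += f
    raise ValueError("empty frequency table")


# Lower bounds as printed in the classified table (assumption, see lead-in).
car_speed_classes = [(0, 2), (6, 7), (11, 17), (16, 17), (21, 7)]
median_classified = grouped_median(car_speed_classes, width=5)
print(median_classified)
```

Here the median class is [11-15], because the cumulative frequency first reaches N/2 = 25 there (F = 26), with 9 observations in the classes before it.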
Mode

Description
The mode is a location measure that informs us about the value of our random variable with the highest frequency (or probability).

How to estimate it?
$$m_x = \{x : f(x) = \max(f_1, \dots, f_k)\}$$
where $f_1, \dots, f_k$ are the frequencies of the k distinct values (or classes).

Problems with estimating $m_x$
1. The estimate of the mode ($m_x$) may depend on the classes assigned to the random variable being analysed. A change in the class width may change the mode. When no classes are assigned, the mode is an objective measure of central tendency.
2. The same maximum frequency may appear for two (2) or more values or classes of the random variable. These are the bimodal and the multimodal cases respectively.
Car Speed data

Estimate the mode using Discrete Values
$m_x = 20$

Estimate the mode using Classified Values
$m_x \in [10, 20]$ (the classes [11-15] and [16-20] tie with the highest frequency, 17)
1. Not a precise estimate
2. May assign the mode to values with very low frequency, such as 11 and 16
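Both cases can be reproduced with a small helper that returns every value (or class) attaining the maximum frequency. The tables are transcribed from the slides; the function name `modes` is illustrative.

```python
def modes(freq):
    """Return all keys whose frequency equals the maximum frequency."""
    top = max(freq.values())
    return [key for key in sorted(freq) if freq[key] == top]


# Discrete car speed table: value -> frequency.
speed_freq = {4: 2, 7: 2, 8: 1, 9: 1, 10: 3, 11: 2, 12: 4, 13: 4, 14: 4,
              15: 3, 16: 2, 17: 3, 18: 4, 19: 3, 20: 5, 22: 1, 23: 1,
              24: 4, 25: 1}
# Classified car speed table: (lower, upper) -> frequency.
class_freq = {(0, 5): 2, (6, 10): 7, (11, 15): 17, (16, 20): 17, (21, 25): 7}

discrete_modes = modes(speed_freq)
classified_modes = modes(class_freq)
print(discrete_modes, classified_modes)
```

Returning a list rather than a single value makes the bimodal case of the classified table explicit.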
Absolute Measures of Linear Relationship (2 random variables)

Covariance
Covariance is the measure of how much two random variables move together. If two variables tend to move together in the same direction, then the covariance between the two variables will be positive. If two variables move in opposite directions, the covariance will be negative. If there is no tendency for two variables to move one way or the other, then the covariance will be zero.

How to estimate it?
1. For a sample of N observations:
   $$S_{x,y} = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{N - 1}$$
2. For the population:
   $$\sigma_{x,y} = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$$
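A minimal sketch of the two covariance estimators. The paired data are hypothetical (the advertisement and sales values are invented for easy arithmetic), and the `sample` flag is an illustrative way of choosing the divisor.

```python
def covariance(x, y, sample=True):
    """Covariance with divisor N - 1 (sample=True) or N (sample=False)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cross / (n - 1) if sample else cross / n


# Hypothetical paired observations, not taken from the slides.
advert = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.0, 4.0, 6.0, 8.0, 10.0]

cov_sample = covariance(advert, sales)                # divisor N - 1
cov_population = covariance(advert, sales, sample=False)  # divisor N
print(cov_sample, cov_population)
```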
Relative Measure of Linear Relationship (2 random variables)

Correlation
Correlation is the relative measure of the linear relationship between two (2) random variables.

How to estimate it?
1. For a sample of N observations:
   $$r_{x,y} = \frac{S_{x,y}}{S_x S_y}$$
2. For the population:
   $$\rho_{x,y} = \frac{\sigma_{x,y}}{\sigma_x \sigma_y}$$
Correlation, explained
1. The sign of the correlation depends on the sign of the covariance (e.g. positive covariance leads to positive correlation)
2. $-1 \le \rho_{x,y} \le 1$ and $-1 \le r_{x,y} \le 1$
3. When $\rho_{x,y}, r_{x,y} \approx 0$, we have nearly uncorrelated random variables
4. When $\rho_{x,y}, r_{x,y} > 0$ we have positively correlated, and when $< 0$ negatively correlated, random variables
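The correlation coefficient can be sketched as follows. The data are hypothetical, and the function uses the equivalent form in which the divisor (N or N - 1) cancels between the covariance and the two standard deviations.

```python
import math


def correlation(x, y):
    """Pearson correlation: covariance divided by both standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    # The common divisor cancels, so only the sums of squares are needed.
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical paired observations, not taken from the slides.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 2.0, 5.0, 4.0]

r_xy = correlation(x, y)
r_reversed = correlation(x, y[::-1])  # reversing y flips the sign
print(r_xy, r_reversed)
```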
[Figure: Sales versus Advertisement (in thousand Euros). Scatter plot; axes: Advert. (thous. Euro), Sales (thous. Euro).]
[Figure: Car speed (mph) versus Distance to stop (in ft). Scatter plot; axes: Speed (mph), Stopping distance (ft).]
[Figure: Car weights versus Miles per gallon. Scatter plot; axes: Car Weight, Miles Per Gallon.]
Correlation using data

Correlation in Sales vs Advertisement
$r_{x,y} = 0.9409605$

Correlation in Car speed (mph) versus Distance to stop
$r_{x,y} = 0.8068949$

Correlation in Car weights versus Miles per gallon
$r_{x,y} = -0.8676594$
Coefficient of Variation (CV)

Description
The Coefficient of Variation is a relative dispersion measure for a random variable that expresses the standard deviation as a percentage of the arithmetic mean.

How to estimate it?
1. For a sample of N observations:
   $$CV = \frac{S}{\bar{x}} \times 100$$
2. For the population:
   $$CV = \frac{\sigma}{\mu} \times 100$$
Why is the CV used?
We use it when we want to compare the variation of two populations (or samples) measured in different units.

Example: when sales data from two (2) different populations (with different currencies) are analysed for their dispersion, one can use the variance. However, the variance is an absolute measure of variation, expressed in terms of the currency of each population. Here, the CV provides a relative dispersion measure, so that the dispersions become comparable.
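The currency example can be illustrated with a short sketch. Both sales series are hypothetical, the second being the first multiplied by a fixed exchange rate of 100, and the 1/N form of the standard deviation (`statistics.pstdev`) is an assumed choice.

```python
import statistics


def coefficient_of_variation(data):
    """Standard deviation (1/N form) as a percentage of the mean."""
    return 100.0 * statistics.pstdev(data) / statistics.mean(data)


# Hypothetical sales in two currencies; the second is 100 times the first.
sales_currency_a = [10.0, 12.0, 14.0]
sales_currency_b = [1000.0, 1200.0, 1400.0]

var_a = statistics.pvariance(sales_currency_a)
var_b = statistics.pvariance(sales_currency_b)
cv_a = coefficient_of_variation(sales_currency_a)
cv_b = coefficient_of_variation(sales_currency_b)
print(var_a, var_b)  # variances differ by the square of the scale factor
print(cv_a, cv_b)    # coefficients of variation coincide
```

The variances differ by a factor of 10,000 purely because of the unit, whereas the CV is unchanged by the rescaling.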
Skewness Coefficient

Description
The Skewness Coefficient is a measure of asymmetry for the frequency (or probability) distribution of a random variable.

How to estimate it?
1. Pearson's first coefficient of skewness of the distribution of a random variable, using the mean $\mu_x$, the mode $m_x$ and the standard deviation $\sigma_x$:
   $$a_1 = \frac{\mu_x - m_x}{\sigma_x}$$
2. Skewness coefficient:
   $$a_1 = \frac{\mu^3_x}{\sigma^3_x} \quad \text{with} \quad \mu^3_x = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^3$$
Skewness, explained
1. When $a_1 > 0$ we have positive skewness with $\mu_x > m_x$ (positive asymmetry)
2. When $a_1 < 0$ we have negative skewness with $\mu_x < m_x$ (negative asymmetry)
3. When $a_1 \approx 0$ we have zero (0) skewness (symmetry)
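Both skewness coefficients can be sketched in Python. The data sets and the inputs to `pearson_first_skewness` are hypothetical, and the function names are illustrative.

```python
import math


def central_moment(data, order):
    """(1/N) * sum((x_i - mean)^order)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** order for x in data) / n


def moment_skewness(data):
    """Third central moment divided by the cube of the standard deviation."""
    sigma = math.sqrt(central_moment(data, 2))
    return central_moment(data, 3) / sigma ** 3


def pearson_first_skewness(mean, mode, sigma):
    """(mean - mode) / standard deviation."""
    return (mean - mode) / sigma


# Hypothetical data sets, not taken from the slides.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 2, 2, 10]

print(moment_skewness(symmetric))      # zero skewness: symmetry
print(moment_skewness(right_tailed))   # positive skewness
print(pearson_first_skewness(mean=10.0, mode=8.0, sigma=4.0))
```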
[Figure: Skewness Paradigm]
Kurtosis Coefficient

Description
The Kurtosis Coefficient is a measure that assesses how flat or peaked the frequency (or probability) distribution of a random variable is.

How to estimate it?
Pearson's coefficient of kurtosis:
$$a_2 = \frac{\mu^4_x}{\sigma^4_x} \quad \text{with} \quad \mu^4_x = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^4$$
Kurtosis, explained
1. When $a_2 - 3 > 0$ we have a more peaked distribution (leptokurtic or fat-tailed distribution)
2. When $a_2 - 3 < 0$ we have a less peaked distribution (platykurtic or thin-tailed distribution)
3. When $a_2 - 3 = 0$ we have a mesokurtic distribution
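The kurtosis coefficient and its classification against the benchmark value 3 can be sketched as follows. The two data sets are hypothetical, and the tolerance used to decide the mesokurtic case is an illustrative choice.

```python
def central_moment(data, order):
    """(1/N) * sum((x_i - mean)^order)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** order for x in data) / n


def kurtosis(data):
    """Fourth central moment divided by the squared variance (sigma^4)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2


def classify_kurtosis(a2, tolerance=1e-9):
    """Label a distribution by the sign of a2 - 3."""
    excess = a2 - 3
    if excess > tolerance:
        return "leptokurtic"
    if excess < -tolerance:
        return "platykurtic"
    return "mesokurtic"


# Hypothetical data sets, not taken from the slides.
flat = [1, 2, 3, 4, 5]
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -5, 5]

a2_flat = kurtosis(flat)
a2_peaked = kurtosis(peaked)
print(a2_flat, classify_kurtosis(a2_flat))
print(a2_peaked, classify_kurtosis(a2_peaked))
```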
[Figure: Kurtosis Paradigm]
Skewness and Kurtosis using Advertisement data

Estimate Skewness and Kurtosis when using Discrete Values
$a_1 = 0.06949301$, Positive Asymmetry
$a_2 = 1.174533$, Platykurtic

Estimate Skewness and Kurtosis when using Classified Values
$a_1 = 0.542137$, Positive Asymmetry
$a_2 = 0.7993088$, Platykurtic

1. Similar signs in Asymmetry and Kurtosis
2. Much stronger positive asymmetry in the Classified Values (look at the graph!!!)
Basic Notions of Probability

Definition
Probability is the likelihood that a specific outcome of a random experiment happens:
$$P(E) = \frac{\text{Number of possible outcomes in which the event occurs}}{\text{Total number of possible outcomes}}$$
1. Example I: the probability of getting a head when tossing a coin (theoretical probability) is 1/2
2. Example III: the probability of getting a King when selecting a playing card (theoretical probability) is 4/52
3. Example IV: the probability of car speed = 4 in the Car Speed data sampling experiment (empirical probability) is 2/50
Basic Notions of Probability (cont.)

Types of Probability
1. Empirical: the probability estimated as the outcome of an empirical experiment
2. Theoretical: the probability estimated as the outcome of a theoretical experiment

Important!!
1. Theoretical probabilities generally do not coincide exactly with the empirical ones
2. Researchers increase the sample size N to increase the accuracy of the probability estimate
3. The empirical probabilities converge to the theoretical ones as $N \to \infty$
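The convergence of empirical to theoretical probabilities can be illustrated with a simulated fair coin. This is a sketch only: the seed, the sample sizes and the function name are illustrative choices, and a pseudo-random generator stands in for the physical experiment.

```python
import random


def empirical_head_probability(n_tosses, seed=0):
    """Share of heads in n_tosses simulated tosses of a fair coin."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    heads = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
    return heads / n_tosses


theoretical = 0.5
for n in (10, 1000, 100000):
    estimate = empirical_head_probability(n)
    print(n, estimate, abs(estimate - theoretical))
```

For small N the estimate can be far from 1/2, while for large N the gap typically shrinks.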
[Figure: Car Speed using probabilities. Two panels titled "car speed"; axes: speed, Density.]
Car Speed using probabilities

Estimating Probabilities
1. $P(x = 4) = 0.04$
2. $P(x \le 7) = 0.04 + 0.04 = 0.08$
3. $P(x \ge 20) = 1 - P(x \le 19) = 0.24$
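These empirical probabilities follow directly from the Car Speed frequency table, as the sketch below shows. Exact fractions are used to avoid rounding; the helper name `prob` is illustrative.

```python
from fractions import Fraction

# Frequencies f_i for each observed car speed x_i (from the Car Speed table).
speed_freq = {4: 2, 7: 2, 8: 1, 9: 1, 10: 3, 11: 2, 12: 4, 13: 4, 14: 4,
              15: 3, 16: 2, 17: 3, 18: 4, 19: 3, 20: 5, 22: 1, 23: 1,
              24: 4, 25: 1}
n = sum(speed_freq.values())


def prob(condition):
    """Empirical probability: share of observations satisfying condition."""
    return Fraction(sum(f for x, f in speed_freq.items() if condition(x)), n)


p_eq_4 = prob(lambda x: x == 4)
p_le_7 = prob(lambda x: x <= 7)
p_ge_20 = 1 - prob(lambda x: x <= 19)  # complement rule

print(float(p_eq_4), float(p_le_7), float(p_ge_20))
```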
Basic Notions of Probability (cont.)

Sample Space
The sample space ($\Omega$) is the collection of all possible outcomes in a random experiment:
$$\Omega = \{E_1, \dots, E_k\}$$

Sample Space for Car Speed data
$$\Omega = \{4, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 22, 23, 24, 25\}$$

Redefining Probability
For a given sample space $\Omega$, the probabilities satisfy the following:
1. $0 \le P(E_i) \le 1$, for $i = 1, \dots, k$
2. The probabilities sum to one: $\sum_{i=1}^{k} P(E_i) = 1$
Laws of probability
$\cup$ symbolises the union of events.
$\cap$ symbolises the intersection of events.
1. When $E_1 \cap E_2 = \emptyset$ the events are considered mutually exclusive and $P(E_1 \cap E_2) = 0$. If, in addition, $E_1 \cup E_2 = \Omega$, the events are complementary.
2. For $E_1$ and $E_2$ with $E_1 \cap E_2 \ne \emptyset$, the probability of the union of the events is $P(E_1 \cup E_2) = P(E_1) + P(E_2) - P(E_1 \cap E_2)$.
3. For $E_1$ and $E_2$ with $E_1 \cap E_2 = \emptyset$, the probability of the union of the events is $P(E_1 \cup E_2) = P(E_1) + P(E_2)$.
4. For independent events $E_1$ and $E_2$ we have $P(E_1 \cap E_2) = P(E_1) \, P(E_2)$.
E.g.: $E_1$ is the event of an even outcome when rolling one die, and $E_2$ is the event of an even outcome when rolling another die.
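These laws can be verified by enumerating a small sample space. The sketch below uses two fair six-sided dice as the illustration, which is an assumption of this example; the names `OMEGA`, `P`, `E1`, `E2` and `ODD1` are introduced here only for the check. Exact fractions avoid rounding issues.

```python
from fractions import Fraction
from itertools import product

# Sample space of two fair dice: 36 equally likely ordered pairs (assumed example).
OMEGA = set(product(range(1, 7), repeat=2))


def P(event):
    """Probability of an event (a subset of OMEGA) under equally likely outcomes."""
    return Fraction(len(event), len(OMEGA))


E1 = {w for w in OMEGA if w[0] % 2 == 0}  # first die shows an even number
E2 = {w for w in OMEGA if w[1] % 2 == 0}  # second die shows an even number
ODD1 = OMEGA - E1                         # first die shows an odd number

# Law 4: independent events multiply.
print(P(E1 & E2) == P(E1) * P(E2))
# Law 2: general addition rule for overlapping events.
print(P(E1 | E2) == P(E1) + P(E2) - P(E1 & E2))
# Laws 1 and 3: mutually exclusive events have empty intersection and add up.
print(P(E1 & ODD1) == 0 and P(E1 | ODD1) == P(E1) + P(ODD1))
```

Here `&` and `|` are Python's set intersection and union, mirroring $\cap$ and $\cup$.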
Conditional probability
1. Marginal probability: the probability that a given event will occur. No other events are taken into consideration. A typical expression is $P(A)$ for the event A.
2. Joint probability: the probability that two or more events will all occur. A typical expression is $P(A \cap B)$ for the events A and B.
3. Conditional probability: the probability that an event will occur, given that another event has already happened. A typical expression is $P(A \mid B)$, read as "the probability of A, given B".
Conditional probability (cont.)
Suppose we have data on two (2) random variables. Let $E_i$ denote the quality of a product, being high, medium or low for $i = 1$, $i = 2$ and $i = 3$ respectively. We can also divide our sample space into A and B, for the A-market area and the B-market area respectively. We can assign joint relative frequencies (probabilities) to the 2-entry table as follows:

        A                  B                  Total
E_1     $P(E_1 \cap A)$    $P(E_1 \cap B)$    $P(E_1)$
E_2     $P(E_2 \cap A)$    $P(E_2 \cap B)$    $P(E_2)$
E_3     $P(E_3 \cap A)$    $P(E_3 \cap B)$    $P(E_3)$
Total   $P(A)$             $P(B)$             $P(\Omega)$
Conditional probability (cont.)
Using data we can derive the following 2-entry table:

        A                           B                           Total
E_1     $P(E_1 \cap A) = 0.48$      $P(E_1 \cap B) = 0.12$      $P(E_1) = 0.60$
E_2     $P(E_2 \cap A) = 0.15$      $P(E_2 \cap B) = 0.10$      $P(E_2) = 0.25$
E_3     $P(E_3 \cap A) = 0.0225$    $P(E_3 \cap B) = 0.1275$    $P(E_3) = 0.15$
Total   $P(A) = 0.6525$             $P(B) = 0.3475$             $P(\Omega) = 1$
Conditional probability (cont.)
One can derive the probability of having a product of high quality conditional on being in the A market area as:
$P(E_1 \mid A) = \frac{P(E_1 \cap A)}{P(A)}$,
where
$P(A) = P(E_1 \cap A) + P(E_2 \cap A) + P(E_3 \cap A)$
is the marginal probability, and
$P(E_1 \cap A) = P(E_1 \mid A) \, P(A)$,
$P(E_1 \cap B) = P(E_1 \mid B) \, P(B)$
are the joint probabilities.
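The same calculation can be carried out on the joint probabilities of the 2-entry table given earlier. This is a minimal sketch: the dictionary layout and the names `JOINT`, `marginal_market` and `conditional` are illustrative choices, while the numbers are the table values.

```python
# Joint probabilities P(E_i and market) from the 2-entry table.
JOINT = {
    ("E1", "A"): 0.48,   ("E1", "B"): 0.12,
    ("E2", "A"): 0.15,   ("E2", "B"): 0.10,
    ("E3", "A"): 0.0225, ("E3", "B"): 0.1275,
}


def marginal_market(market):
    """Marginal probability of a market area: sum of its column of joint probabilities."""
    return sum(p for (_, m), p in JOINT.items() if m == market)


def conditional(quality, market):
    """P(quality | market) = P(quality and market) / P(market)."""
    return JOINT[(quality, market)] / marginal_market(market)


p_a = marginal_market("A")
p_e1_given_a = conditional("E1", "A")
print(f"P(A) = {p_a:.4f}")
print(f"P(E1 | A) = {p_e1_given_a:.4f}")
```

Multiplying `conditional("E1", "A")` by `marginal_market("A")` recovers the joint probability 0.48, which is the product form of the rule.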
Conditional probability (cont.)
One can derive the probability of having a product of at least medium quality conditional on being in the A market area as:
$P(E_1 \cup E_2 \mid A) = \frac{P((E_1 \cup E_2) \cap A)}{P(A)}$,
where
$P(A) = P(E_1 \cap A) + P(E_2 \cap A) + P(E_3 \cap A) = 0.6525$
is the marginal probability, and
$P((E_1 \cup E_2) \cap A) = P((E_1 \cap A) \cup (E_2 \cap A)) = P(E_1 \cap A) + P(E_2 \cap A) = 0.63$.
So $P(E_1 \cup E_2 \mid A) = 0.63 / 0.6525 \approx 0.9655$.
Question: Is this probability bigger than that of having a product of at least medium quality conditional on being in the B market area?
$P(E_1 \cup E_2 \mid B) = \frac{P(E_1 \cap B) + P(E_2 \cap B)}{P(B)} = 0.22 / 0.3475 \approx 0.6331$
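The comparison between the two market areas can be computed directly from the joint probabilities of the 2-entry table. The sketch below is illustrative: the names `JOINT`, `marginal` and `conditional_union` are assumptions of this example, and the quality events are treated as disjoint, so their joint probabilities add.

```python
# Joint probabilities P(E_i and market) from the 2-entry table.
JOINT = {
    ("E1", "A"): 0.48,   ("E1", "B"): 0.12,
    ("E2", "A"): 0.15,   ("E2", "B"): 0.10,
    ("E3", "A"): 0.0225, ("E3", "B"): 0.1275,
}


def marginal(market):
    """Marginal probability of a market area (column total)."""
    return sum(p for (_, m), p in JOINT.items() if m == market)


def conditional_union(qualities, market):
    """P(union of the given quality events | market).

    The quality events are mutually exclusive, so the joint probability of
    their union with the market is the sum of the individual joint probabilities.
    """
    joint = sum(JOINT[(q, market)] for q in qualities)
    return joint / marginal(market)


at_least_medium = ("E1", "E2")
p_given_a = conditional_union(at_least_medium, "A")
p_given_b = conditional_union(at_least_medium, "B")
print(f"P(E1 or E2 | A) = {p_given_a:.4f}")
print(f"P(E1 or E2 | B) = {p_given_b:.4f}")
print("Higher in market A" if p_given_a > p_given_b else "Higher in market B")
```

With the table values the two ratios are 0.63 / 0.6525 for market A and 0.22 / 0.3475 for market B.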