Models for Discrete Variables


 Valentine Harrell
 1 years ago
Probability Models for Discrete Variables

Our study of probability begins much as any data analysis does: What is the distribution of the data? Histograms, boxplots, percentiles, means, standard deviations: they all apply with at most minor adjustments.

We begin our study by considering discrete quantitative data. Remember: In discrete data, a lot of the data are ties. We'll start with a simple random variable: the number x of credit cards a randomly selected person has. A probability table and histogram for this data are shown.

x = # of credit cards a person has
p(x) = probability a person has x credit cards

Because this is highly discrete data, percentiles are not very useful. Because of the large numbers of ties, we use relative frequencies to summarize the data. The letter p is used for probability (which is synonymous, in a sense, with proportion and relative frequency).

Here's how to obtain the value of the mean for the random variable x:

Mean = $\mu = \sum x\,p(x)$

Compute the product of each value with its relative frequency. Sum these. (This formula works even when there are no ties: p(x) = 1/N for each value.) For the example, the mean computation is in the third column of the table.

                 Mean                 Variance/SD
  x     p(x)     x p(x)               $(x-\mu)^2\,p(x)$
  0     0.20     0(0.20) = 0.00       (0 − 1.80)²(0.20) = 0.648
  1     0.30     1(0.30) = 0.30       (1 − 1.80)²(0.30) = 0.192
  2     0.20     2(0.20) = 0.40       (2 − 1.80)²(0.20) = 0.008
  3     0.15     3(0.15) = 0.45       (3 − 1.80)²(0.15) = 0.216
  4     0.10     4(0.10) = 0.40       (4 − 1.80)²(0.10) = 0.484
  5     0.05     5(0.05) = 0.25       (5 − 1.80)²(0.05) = 0.512
  Total 1.00     $\mu$ = 1.80         $\sigma^2$ = 2.060

[Figure: histogram of p(x) against x = # of credit cards]

Discrete Populations and Probability Distributions
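The table's third and fourth columns can be reproduced in a few lines of code. The following is a minimal sketch, not part of the original notes: the dictionary `credit_cards` holds the p(x) values from the credit card table, and the helper name `distribution_summary` is a hypothetical choice made for illustration.

```python
import math

# Credit card distribution from the table: value x -> probability p(x).
credit_cards = {0: 0.20, 1: 0.30, 2: 0.20, 3: 0.15, 4: 0.10, 5: 0.05}


def distribution_summary(dist):
    """Return (mean, variance, sd) for a discrete distribution {x: p(x)}."""
    total = sum(dist.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"probabilities sum to {total}, not 1")
    # Mean: sum of value times probability.
    mean = sum(x * p for x, p in dist.items())
    # Variance: sum of squared deviation from the mean times probability.
    variance = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, variance, math.sqrt(variance)


mean, variance, sd = distribution_summary(credit_cards)
print(f"mean = {mean:.2f}, variance = {variance:.3f}, sd = {sd:.3f}")
```

No division by a number of observations appears anywhere: the probabilities already carry it.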
Suppose you had the ideal sample of 100 data values from this situation. Then, sorted, you would have twenty 0s, thirty 1s, twenty 2s, fifteen 3s, ten 4s, and five 5s. The mean of these is 1.80. (The standard deviation is 1.44.) You might notice that the mean could be computed as follows:

MEAN = $\dfrac{\sum x}{n} = \dfrac{0(20) + 1(30) + 2(20) + 3(15) + 4(10) + 5(5)}{100} = 1.80$

The two computations are identical. The $\mu = \sum x\,p(x)$ computation takes advantage of the discreteness of the data: the large number of ties.

What happened to the "divide by the number of observations" in the formula for the mean? Reexamine the computations detailed above: This division is incorporated into the probabilities.

Place your finger under the horizontal axis of the histogram, at the position of the mean: The histogram will balance. The mean does not have to be one of the possible values. No one has 1.8 credit cards. [1]

[1] This is a right skewed distribution, yet the mean is below the mode. This is an exception to the general rule. Such exceptions are usually found when data are highly discrete (here there are only 6 possible values of x).

Variance and Standard Deviation

Let's talk standard deviation. We want to be able, as we did for the mean, to obtain this value without worrying about actual lists of data. We determine the variance, and then the standard deviation, for x as follows:

(Discrete) Variance = $\sigma^2 = \sum (x-\mu)^2\,p(x)$

Standard Deviation = $\sigma = \sqrt{\sum (x-\mu)^2\,p(x)}$

For each value we determine the squared deviation from the mean. We multiply these squared deviations by the probabilities. Sum the results to get the variance. The standard deviation is the square root of the variance. Again, the "divide by how many" is embedded into the relative frequencies.

For the credit card example, the details of this computation are shown in the fourth column of the credit card table. The variance is $\sigma^2 = 2.060$ and the standard deviation is $\sigma = \sqrt{2.060} \approx 1.435$.

This computation could be replaced by simply constructing a data set having the proper relative
frequencies, then inputting the values into a computer or calculator and having the technology determine the value of the standard deviation.

The mean and standard deviation are measures of the center and spread of a distribution. They are not systematically dependent on the size of the data set.

Illustrating Probability and the Mean and Standard Deviation of a Random Variable as Long Term Behavior

Consider the following probability distribution for a random variable x.

  x     p(x)    x p(x)            $(x-\mu)^2\,p(x)$
  1     0.4     1(0.4) = 0.4      (1 − 2)²(0.4) = 0.4
  2     0.3     2(0.3) = 0.6      (2 − 2)²(0.3) = 0.0
  3     0.2     3(0.2) = 0.6      (3 − 2)²(0.2) = 0.2
  4     0.1     4(0.1) = 0.4      (4 − 2)²(0.1) = 0.4
                Mean = 2.0        Variance = 1.0, SD = 1.0

For 10 observations of this variable, the (sample) mean is 2.0 and the standard deviation is 1.3. [2] Relative frequencies (RF) and the sample mean and standard deviation are shown in the second (10) column below. (The RF rows for x = 1, 2, 3, 4 are missing, as are the entries marked "missing".)

  Statistic   n = 10   n = 100   n = 1000    n = 10,000   n = 100,000
  Mean        2.0      1.94      (missing)   1.98         2.006
  SD          1.3      1.04      (missing)   0.99         (missing)

[2] If you are treating a sample of data of this type in tabular form, you should multiply the value you get for $\sigma$ by $\sqrt{n/(n-1)}$ to obtain the sample standard deviation S. If you have a sample on hand, there really is no probability: you aren't going to randomly select items from the sample (you already have them). So the formula for the standard deviation of a random variable is not quite correct when applied to a sample. The adjustment is necessary for technical reasons: the adjusted value is, in one technical sense, a better estimate of the standard deviation for the probability distribution.

Add to these another 90 observations, for 100 total. Results are shown in the third (100) column of the table above. Add another 900 (not shown due to space considerations) for 1000; then another 9,000 for 10,000; then another 90,000 for 100,000. See the fourth through sixth columns of the above table.

Is that a fluke? Try it again.
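The experiment can be repeated as often as you like in a simulation. The sketch below is an illustration rather than the procedure used to produce the tables in these notes: it draws observations from the distribution p(1) = 0.4, p(2) = 0.3, p(3) = 0.2, p(4) = 0.1 and grows the sample from 10 to 100,000, as the text does. The seed value is an arbitrary assumption chosen only to make runs repeatable.

```python
import random
import statistics

values = [1, 2, 3, 4]
probabilities = [0.4, 0.3, 0.2, 0.1]

rng = random.Random(2024)  # arbitrary seed (an assumption), for repeatability

sample = []
results = {}
for n in (10, 100, 1_000, 10_000, 100_000):
    # Add observations to the existing sample instead of starting over.
    sample.extend(rng.choices(values, weights=probabilities, k=n - len(sample)))
    rel_freq = {x: sample.count(x) / n for x in values}
    # statistics.stdev uses the n - 1 divisor (the sample standard deviation S).
    results[n] = (rel_freq, statistics.mean(sample), statistics.stdev(sample))

for n, (rel_freq, mean, sd) in results.items():
    freqs = "  ".join(f"RF({x}) = {rf:.3f}" for x, rf in rel_freq.items())
    print(f"n = {n:>6}: {freqs}  mean = {mean:.3f}  SD = {sd:.3f}")
```

Changing the seed gives a different run, but for large n the relative frequencies, mean, and standard deviation should settle near 0.4, 0.3, 0.2, 0.1, 2.0, and 1.0.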
Results of a second run. (The RF rows for x = 1, 2, 3, 4 are missing, as is the entry marked "missing".)

  Statistic   n = 10   n = 100   n = 1000   n = 10,000   n = 100,000
  Mean        1.60     1.90      2.04       2.003        2.005
  SD          0.70     0.95      1.03       (missing)    1.00

Probabilities are long term relative frequencies. The mean and standard deviation of a random variable also reflect what happens in the long term: the mean and standard deviation for all possible units. If the probability distribution mimics a population distribution, then probabilities are population relative frequencies, and the mean and standard deviation for the probability distribution are the mean and standard deviation for the population.

Probability Talk

Suppose we select a person at random from the general population. Go back to the credit card example: The probability a randomly selected person carries (exactly) 2 credit cards is 0.20. So: If the population has 20 people, it must be the case that 0.20(20) = 4 of the people have 2 credit cards. How else could the probability be 0.20? If the population consists of 8000 people, then 1600 (which is 0.20 of 8000) have 2 credit cards. A probability of 0.20 goes hand in hand with a population for which 0.20 of all people have 2 credit cards, no matter the size of the population.

The population distribution is identical to the probability distribution for the outcome if a single value is randomly selected from the population.

Thinking about things a slightly different way: Consider p(3) = 0.15. It means that the probability of selecting a single person with 3 credit cards is 0.15. This implies that if the experiment of selecting a single person at random is performed repeatedly, in the very long run, 0.15 of the time the person will have 3 credit cards. Probability refers to the relative frequency of occurrences in a huge (infinite) set of identical repeats of the sampling. This also means that 0.15 of all people (the entire population) have 3 credit cards.

Consider the mean of $\mu = 1.80$. (We use the Greek letter $\mu$ to represent the mean of a probability distribution or population.)
As the mean for the probability distribution, it also implies that if the experiment of selecting a single person at random is performed repeatedly, in the very long (infinite) run the average result will be 1.80. This also means that the mean for the entire population is 1.80. (That was probably obvious to you. However, a population is a real thing while a probability distribution is a mathematical object. It is likely not the case that only one single value would ever be randomly selected from the population.)

When we talk about probability distributions we are talking about all possible ways that an experiment might occur. Consider looking at all possible ways of selecting a person. The mean number of credit cards is 1.8, and the standard deviation is $\sqrt{2.060} \approx 1.435$. In 0.30 = 30% of those ways, the selected person has exactly 1 credit card.
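The claim that the population size does not matter can be checked directly. The sketch below is an illustration under stated assumptions: it builds populations of 20, 100, and 8000 people whose relative frequencies equal the credit card probabilities (0.20, 0.30, 0.20, 0.15, 0.10, 0.05), and the helper name `build_population` is hypothetical.

```python
import statistics

# Credit card probabilities: value x -> p(x).
proportions = {0: 0.20, 1: 0.30, 2: 0.20, 3: 0.15, 4: 0.10, 5: 0.05}


def build_population(size):
    """List one value per person so that each x makes up p(x) of the population."""
    population = []
    for x, p in proportions.items():
        population.extend([x] * round(p * size))
    if len(population) != size:
        raise ValueError("size does not split into whole numbers of people")
    return population


summaries = {}
for size in (20, 100, 8000):
    people = build_population(size)
    # pstdev divides by N (population formula); stdev divides by N - 1.
    summaries[size] = (
        statistics.mean(people),
        statistics.pstdev(people),
        statistics.stdev(people),
    )
    mean, pop_sd, sample_sd = summaries[size]
    print(f"N = {size:>4}: mean = {mean:.2f}  pstdev = {pop_sd:.3f}  stdev = {sample_sd:.3f}")
```

The population mean and `pstdev` are the same for every size, matching the distribution's values. Only the n − 1 version (`stdev`) changes with size; for the list of 100 it comes to about 1.44, the standard deviation quoted earlier for the ideal sample of 100.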
Probability Example

Choose a college student at random. Count x = the number of siblings in the family. (Subtract one from each x to arrive at the # of brothers and sisters a student has.)

[Table: probability distribution p(x) for x = # of children, x = 1, 2, …, 10; only p(2) = 0.2806 and p(10) = 0.0003 are legible.]

Here's the interpretation of the probability p(2) = 0.2806: Consider all possible ways of selecting a student. In 0.2806 = 28.06% of those ways, the student is from a 2 child family.

The computation of the mean, variance and standard deviation follow the same pattern as before, with columns x, p(x), x p(x), and $(x-\mu)^2\,p(x)$:

[Table: computation rows for x = 1, 2, …, 10; individual entries missing.]

Mean: $\mu = 2.7430$    Variance: $\sigma^2 = 2.1548$    St Dev: $\sigma = \sqrt{2.1548} \approx 1.4679$

It is not correct to compute 55/10 = 5.5 to get the mean. (Nor is 1/10 = 0.1 correct as a probability for each value.) While 1, 2, …, 10 are the 10 possible values, they do not occur with equal frequency. Each possible value must be weighted by its probability of occurrence.

Here's an interpretation of the mean and standard deviation: Consider all possible ways of selecting a student. The mean number of siblings is 2.7430 with standard deviation 1.4679. In other words: The mean number of siblings for the population of all students is 2.7430, with standard deviation of 1.4679.

Probability calculations

[Figure: histogram of # of children]

What is the probability a randomly chosen student comes from a family with
more than 5 siblings?  P(x > 5) = p(6) + p(7) + p(8) + p(9) + p(10)
at most 3 siblings?  P(x ≤ 3) = p(1) + p(2) + p(3)
at least 7 siblings?  P(x ≥ 7) = p(7) + p(8) + p(9) + p(10)
fewer [3] than 5 siblings?  P(x < 5) = p(1) + p(2) + p(3) + p(4)

(The numerical values of these sums are missing.)

These computations illustrate how to perform probability computations for events that consist of a number of outcomes. For example: The event "more than 5" is formed by all outcomes more than 5: 6, 7, 8, 9, 10. Since we are talking about family size, what's being said is that a family with more than 5 siblings is a family with 6, 7, 8, 9, or 10 siblings (ignoring really large families, as they have sufficiently small probabilities that ignoring them has no impact on fundamental analyses).

[3] Technically, the phrase "less than" is not appropriate for a discrete variable, while "fewer than" is grammatically proper. However, in common speech very few people make this distinction.
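The same "add up the outcomes in the event" idea can be sketched in code. Because the sibling table is not available here, this example substitutes the credit card distribution from earlier in the notes (p = 0.20, 0.30, 0.20, 0.15, 0.10, 0.05 for x = 0, …, 5); the function name `event_probability` is a hypothetical helper, not something defined in the notes.

```python
# Credit card distribution used as a stand-in: value x -> p(x).
credit_cards = {0: 0.20, 1: 0.30, 2: 0.20, 3: 0.15, 4: 0.10, 5: 0.05}


def event_probability(dist, event):
    """Sum p(x) over every outcome x for which event(x) is true."""
    return sum(p for x, p in dist.items() if event(x))


more_than_3 = event_probability(credit_cards, lambda x: x > 3)    # outcomes 4, 5
at_most_1 = event_probability(credit_cards, lambda x: x <= 1)     # outcomes 0, 1
at_least_2 = event_probability(credit_cards, lambda x: x >= 2)    # outcomes 2, 3, 4, 5
fewer_than_2 = event_probability(credit_cards, lambda x: x < 2)   # outcomes 0, 1

print(f"P(x > 3)  = {more_than_3:.2f}")
print(f"P(x <= 1) = {at_most_1:.2f}")
print(f"P(x >= 2) = {at_least_2:.2f}")
print(f"P(x < 2)  = {fewer_than_2:.2f}")
```

Note that "at most 1" and "fewer than 2" pick out the same outcomes for a whole-number variable, and that "at least 2" is the complement of "fewer than 2", so those two probabilities sum to 1.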
Frequency Application

Approximating the mean of continuous data from a histogram. If you are given a histogram for continuous data, but not the data itself, you can approximate the mean and standard deviation:

Step 1) pretend that all the data in a given bin are at the midpoint
Step 2) determine relative frequencies for each bin
Step 3) use the relative frequency approach to computing mean and standard deviation

Example

Consider the failure times (in hours) of 19 industrial machines.

[Figure: histogram of Failure Time (Hours)]

Step 1) We pretend all data are at the midpoints: 35, 45, …, 85.

Step 2) Here are relative frequencies. (For best accuracy use many digits or exact fractions.)

  midpoint   frequency   relative frequency
  35         1           0.0526
  45         2           0.1053
  55         2           0.1053
  65         9           0.4737
  75         3           0.1579
  85         2           0.1053

Step 3) The mean is then approximately

35(0.0526) + 45(0.1053) + 55(0.1053) + 65(0.4737) + 75(0.1579) + 85(0.1053) ≈ 63.95

Step 4) Use the mean in the variance computation:

(35 − 63.95)²(0.0526) + (45 − 63.95)²(0.1053) + (55 − 63.95)²(0.1053) + (65 − 63.95)²(0.4737) + (75 − 63.95)²(0.1579) + (85 − 63.95)²(0.1053) ≈ 156.8

Then SD ≈ $\sqrt{156.8}$ ≈ 12.5.

These are only approximate values. To obtain exact values you must input the raw data and do the computations. (You might also approximate the standard deviation using Range/4, where the range is the Max − Min expected from the histogram. This approach is considerably quicker.)
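The midpoint method can be carried out with exact fractions, as the note about accuracy suggests. The following sketch assumes bin counts of 1, 2, 2, 9, 3, 2; these are inferred from the printed relative frequencies (for example 0.4737 = 9/19) rather than read from the original histogram.

```python
import math
from fractions import Fraction

midpoints = [35, 45, 55, 65, 75, 85]
frequencies = [1, 2, 2, 9, 3, 2]  # assumed counts, inferred from the relative frequencies
n = sum(frequencies)              # 19 machines

# Exact relative frequencies avoid the rounding in 0.0526, 0.1053, ...
rel_freqs = [Fraction(f, n) for f in frequencies]

# Relative frequency approach: treat every observation as sitting at its bin midpoint.
approx_mean = sum(m * rf for m, rf in zip(midpoints, rel_freqs))
approx_variance = sum((m - approx_mean) ** 2 * rf for m, rf in zip(midpoints, rel_freqs))
approx_sd = math.sqrt(approx_variance)

print(f"mean     = {approx_mean} = {float(approx_mean):.3f}")
print(f"variance = {approx_variance} = {float(approx_variance):.3f}")
print(f"SD       = {approx_sd:.3f}")
```

The results are approximations of the raw-data values, since the true failure times are spread across each bin rather than stacked at its midpoint.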
Exercises

1. Consider the set of 25 observations below; each is the number of bags of recycling brought by a family to the recycling center (there is a 7 bag limit).

[Data: 25 observations; values missing.]

a) Use your calculator's (or software's) statistics functions to determine the mean and standard deviation for the data (nearest 0.1 for each).
b) Complete the table below.

  # of bags   Frequency   Relative Frequency

c) Sketch a histogram. What shape is this distribution?
d) Suppose the table from part b describes a population much larger than 25. Determine the mean, variance and standard deviation for this distribution. Use the formulas for population mean, variance and standard deviation of a probability distribution (again to the nearest 0.1):

$\mu = \sum x\,p(x)$    $\sigma^2 = \sum (x-\mu)^2\,p(x)$    $\sigma = \sqrt{\sum (x-\mu)^2\,p(x)}$

Associate each answer with the correct symbol. (If your work is correct, the means and standard deviations from a and d will match.)

2. For the population described below:

  x      1     2     3     4     5     6
  p(x)   1/6   1/6   1/6   1/6   1/6   1/6

a) Visualize a histogram.
b) Determine the mean and standard deviation (nearest 0.01). This is the distribution of results when one tosses a fair six sided die.
c) Suppose you started tossing a die many, many times, recording each result (and entering them into your calculator). After a huge number of tosses: what is the proportion of 3s? What is the mean, as computed by your calculator? What is the value of the standard deviation, as computed by your calculator?

3. The distribution of lengths of students' last names is tabulated below.

  x      2      3      4      5       6       7       8       9      10     11
  p(x)   2.4%   0.0%   9.4%   14.2%   22.0%   22.8%   16.5%   9.4%   2.4%   0.9%

a) Draw a histogram. What shape is this distribution?
b) Find the mean length.
c) Does this distribution describe a large population? Why (not)?
d) Using the range, guess the standard deviation. How does this compare to the actual standard deviation of 1.71 for this data?

4. The table below is the probability distribution of the number of pets in a household from a survey given by the Humane Society:

[Table: x = number of pets and p(x); values missing.]

a) Find the probability that a household picked at random would have four pets.
b) Find the probability that a household has a pet.
c) Find the probability that a household has more than 3 pets.
d) Find the probability that a household has at least 4 pets.
e) Find the probability that a household has less than 2 pets.
f) Find the probability that a household has no more than 5 pets.
g) Find the probability that a household has at most 6 pets.
h) Find the mean and standard deviation.
i) Which is more likely to occur: that a household has 4 or more pets, or at most 2?
j) According to the table, which number of pets is most likely to occur? Is the mean number of pets the same as the number of pets most likely to occur, and what does this indicate?
k) What is the probability that a randomly selected household has a number of pets within one standard deviation of the mean? (Begin by computing $\mu - \sigma$ and $\mu + \sigma$. What is the probability of an outcome between these two values?)
l) What is the probability that a randomly selected household has a number of pets within two standard deviations of the mean? (Begin by computing $\mu - 2\sigma$ and $\mu + 2\sigma$.)
m) According to this table, what is the probability that someone has 9 pets? Is this necessarily representative for every household anywhere?
n) Explain what it means to say the probability of having 3 pets is 0.161. Use either relative frequency or percent in your explanation, as well as the phrase "all possible".
o) Explain what the mean and standard deviation represent.

5. Here is a discrete probability table and chart showing the number of songs downloaded from iTunes in a week by college students who own Apple computers.

[Table and chart: x = number of songs and p(x); values missing.]

a) Identify the units and the variable.
What is the probability that:
b) A college student will download more than 6 songs a week?
c) A college student will download at most 6 songs a week?
d) A college student will download less than 3 songs a week?
e) A college student will download at least 3 songs a week?
f) A college student will download no more than 5 songs a week?
g) A college student will download no less than 5 songs a week?
h) A college student will not download 4 songs?
i) Find values of the mean and standard deviation.
j) If we polled all college students that owned an Apple computer, what percent of them would download more than 6 songs in a week?
k) What is the probability a student's number of downloads is within one standard deviation of the mean? Within two?

6. A casino offers a game of chance. It costs $5 to play. The profit (loss if negative) x has the probability distribution shown below.

[Table: x = profit and p(x); values missing.]

a) Sketch a histogram. (This is bimodal with an outlier. Gambling uses strange distributions.)
b) Determine the mean and standard deviation (nearest 0.01 = one penny). (Be careful when subtracting a negative.) Identify each by symbol.
c) What is the probability you lose money when you play this game? What's the probability you win money?
d) What would happen to a player who plays this game a very large number of times?
e) Convince yourself that the probability of a result within one standard deviation of the mean is 0.970, and find the probability of a result within two standard deviations. The rule of thumb stating 68/95 can be really misleading when data are from a strange distribution like this. This distribution is strange in two ways: Bimodal; Huge outlier.

7. A carnival game costs $2.00 to play. The probability of winning x dollars is shown.

[Table: x = winnings and p(x); values missing.]

a) What is the probability of at least getting your money back?
b) What is the probability of losing money playing this game?
c) What is the probability of breaking even?
d) What is the mean payout for this game?
e) What is the standard deviation?
f) Does this game make a profit for the carnival? If so, how much? Explain.

8. Suppose p(y) is defined for y = 1, 2, …, 9 as follows:

$p(y) = \log_{10}\left(1 + \dfrac{1}{y}\right)$

For example: p(5) = log(1 + 1/5) = log(6/5) = log 1.2 ≈ 0.0792.

a) Construct a probability (relative frequency) table.
Draw a histogram.
b) Show that the probabilities sum to exactly 1.
c) Determine the mean, variance and standard deviation.
d) Determine the exact value of the mean.
This distribution is known as Benford's Law. It is often used to model the leading digits of collections of numbers (not all collections of numbers, just those with certain properties). So:
e) Find the leading digit of the population of each of the 50 states (for example, if the population is 23,483,399 then the leading digit is 2). Construct a relative frequency table for this data.

Solutions

1. a) 4.8 and 1.7. b) The relative frequencies are (in order) 0.04, 0.08, 0.12, 0.16, 0.20, 0.24, 0.16. c) It's a bit left skewed. d) $\mu$ = 4.8, $\sigma$ = 1.7. (The mean is left of the mode, which hints at left skew.)

2. a) The histogram has a flat (uniform) shape. b) 3.50, 1.71. c) Enter the data from each toss in the calculator. Once you have many, many tosses (a large set of data) the proportion of 3s will be very close to 1/6 ≈ 0.167. The mean and standard deviation for the data will closely match 3.50 and 1.71.

3. a) The distribution is fairly symmetric (an outlier at 2?). b) The mean is about 6.6 letters. c) This probably is not a population; in any large population some people would have three-letter last names, others would have names longer than 11 letters. d) 9/4 = 2.25. This isn't too far from the actual 1.71.

4. (Numerical answers for a through g are missing.) a) p(x = 4) b) p(x > 0) c) p(x > 3) d) p(x ≥ 4) e) p(x < 2) f) p(x ≤ 5) g) p(x ≤ 6)
h) The mean is $\mu = \sum x\,p(x) = 1.797$. The variance is $\sigma^2 = \sum (x-\mu)^2\,p(x) = \sum (x-1.797)^2\,p(x)$. The standard deviation is the square root of this: about 1.33.
i) The probability that a household has 4 or more pets is 0.108, and the probability that there are at most 2 pets is 0.731. Since the probability is higher for at most 2 pets, that is more likely to occur.
j) According to the table, it is most likely that a given household has 1 pet. However, the mean number of pets is 1.8. This indicates that the mean doesn't always represent what is likely to occur or what has occurred the most. The mean is not the mode, and the mean being greater suggests right skew.
k) $\mu - \sigma$ = 1.80 − 1.33 = 0.47; $\mu + \sigma$ = 1.80 + 1.33 = 3.13. Results between these two values (between 0.47 and 3.13) are 1, 2, and 3.
The probability of having 1, 2 or 3 pets is 0.77 (not that far from 0.68).
l) $\mu - 2\sigma$ = 1.80 − 2(1.33) = −0.86; $\mu + 2\sigma$ = 1.80 + 2(1.33) = 4.46. Results between these two values (between −0.86 and 4.46) are 0, 1, 2, 3, and 4. The probability of having 0 to 4 pets (inclusive) is not that far from 0.95. (The numerical value is missing.)
m) According to this table the probability is 0, meaning it cannot happen. However, this does not mean that no household anywhere has 9 pets. The model given here is stated more concisely by truncating values from 9 and up, and the lack of this detail has no real impact on the accuracy of the description.
n) If we repeatedly sample households, then in the long run 16.1% of the time the household will have exactly 3 pets. Or: 16.1% of all possible households have 3 pets.
o) Technically it means this: If we repeatedly sample one household randomly, then in the long run the mean number of pets is 1.80 with standard deviation 1.33. Here's the better way: Examining all possible households, the mean number of pets is 1.80 with standard deviation 1.33.

5. a) The units are students with Apple computers; the variable is weekly number of downloads. (Most of the remaining numerical answers are missing.) b) c) d) e) f) g) 0.0. h) i) Mean: 3.099; SD: j) 3.9%. k) 0.613;
6. a) See the histogram. Notice the outlier at 100.

[Figure: histogram of Profit ($) against Percent, with an outlier at 100]

b) $\mu$ = −0.10 (a loss of 10 cents on average); the value of $\sigma$ is missing. c) 0.54 (the second value is missing). d) The player will go broke. On average in the long run the player loses 10 cents per game. So in the very long run this will lose the player huge amounts of money. (However: If you had a million dollars to lose, it would take you around 10 million plays to lose it. So you'd keep entertained for quite some time.)

7. (Numerical answers for a through e are missing.) f) Yes. The mean payout of only $0.80 is easily offset by the $2 cost to play. In the long run the carnival makes a mean of $1.20 per play.

8. Partial solutions.

  y       p(y)
  1       log 2 − log 1 = 0.3010
  2       log 3 − log 2 = 0.1761
  3       log 4 − log 3 = 0.1249
  4       log 5 − log 4 = 0.0969
  5       log 6 − log 5 = 0.0792
  6       log 7 − log 6 = 0.0669
  7       log 8 − log 7 = 0.0580
  8       log 9 − log 8 = 0.0512
  9       log 10 − log 9 = 0.0458
  Total   log 10 − log 1 = 1 − 0 = 1

Mean = 3.4402; Variance = 6.0565; St Dev = 2.4610
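The Benford computations in Exercise 8 can be checked numerically. This is a minimal sketch using only the definition p(y) = log10(1 + 1/y) for y = 1, …, 9. The closed form 9 − log10(9!) for the mean is not stated in the solutions above; it is offered here as one way to answer part d, obtained by writing p(y) = log(y + 1) − log y and collecting terms.

```python
import math

# Benford's Law for leading digits: p(y) = log10(1 + 1/y), y = 1, ..., 9.
benford = {y: math.log10(1 + 1 / y) for y in range(1, 10)}

# The probabilities telescope: the sum is log10(10) - log10(1) = 1.
total = sum(benford.values())

mean = sum(y * p for y, p in benford.items())
variance = sum((y - mean) ** 2 * p for y, p in benford.items())
sd = math.sqrt(variance)

# Candidate exact form of the mean: 9 - log10(9!).
exact_mean = 9 - math.log10(math.factorial(9))

for y, p in benford.items():
    print(f"p({y}) = {p:.4f}")
print(f"total = {total:.4f}")
print(f"mean = {mean:.4f}, variance = {variance:.4f}, sd = {sd:.4f}")
print(f"9 - log10(9!) = {exact_mean:.4f}")
```

The numerical mean and the closed form agree, which supports the algebra without replacing it.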
More informationCh5: Discrete Probability Distributions Section 51: Probability Distribution
Recall: Ch5: Discrete Probability Distributions Section 51: Probability Distribution A variable is a characteristic or attribute that can assume different values. o Various letters of the alphabet (e.g.
More informationBNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I
BNG 202 Biomechanics Lab Descriptive statistics and probability distributions I Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential
More informationIntroduction; Descriptive & Univariate Statistics
Introduction; Descriptive & Univariate Statistics I. KEY COCEPTS A. Population. Definitions:. The entire set of members in a group. EXAMPLES: All U.S. citizens; all otre Dame Students. 2. All values of
More informationDescriptive Statistics
Y520 Robert S Michael Goal: Learn to calculate indicators and construct graphs that summarize and describe a large quantity of values. Using the textbook readings and other resources listed on the web
More information$2 4 40 + ( $1) = 40
THE EXPECTED VALUE FOR THE SUM OF THE DRAWS In the game of Keno there are 80 balls, numbered 1 through 80. On each play, the casino chooses 20 balls at random without replacement. Suppose you bet on the
More informationChapter 4. Probability Distributions
Chapter 4 Probability Distributions Lesson 41/42 Random Variable Probability Distributions This chapter will deal the construction of probability distribution. By combining the methods of descriptive
More informationExploratory Data Analysis. Psychology 3256
Exploratory Data Analysis Psychology 3256 1 Introduction If you are going to find out anything about a data set you must first understand the data Basically getting a feel for you numbers Easier to find
More informationDesciptive Statistics Qualitative data Quantitative data Graphical methods Numerical methods
Desciptive Statistics Qualitative data Quantitative data Graphical methods Numerical methods Qualitative data Data are classified in categories Non numerical (although may be numerically codified) Elements
More informationLecture 5 : The Poisson Distribution. Jonathan Marchini
Lecture 5 : The Poisson Distribution Jonathan Marchini Random events in time and space Many experimental situations occur in which we observe the counts of events within a set unit of time, area, volume,
More informationSession 1.6 Measures of Central Tendency
Session 1.6 Measures of Central Tendency Measures of location (Indices of central tendency) These indices locate the center of the frequency distribution curve. The mode, median, and mean are three indices
More informationCalculation example mean, median, midrange, mode, variance, and standard deviation for raw and grouped data
Calculation example mean, median, midrange, mode, variance, and standard deviation for raw and grouped data Raw data: 7, 8, 6, 3, 5, 5, 1, 6, 4, 10 Sorted data: 1, 3, 4, 5, 5, 6, 6, 7, 8, 10 Number of
More informationMeans, standard deviations and. and standard errors
CHAPTER 4 Means, standard deviations and standard errors 4.1 Introduction Change of units 4.2 Mean, median and mode Coefficient of variation 4.3 Measures of variation 4.4 Calculating the mean and standard