Unit 2: SUMMARIZING DATA

Topic 6: Two-Way Tables

In Topic 2, we learned how to construct a bar chart to represent data for a single categorical variable. Often, we are interested in relationships among two or more categorical variables (sex, race, occupation, etc.). Information on two categorical variables is best represented in a simple two-way table of counts, as shown in the example to follow. All analysis done on categorical variables involves nothing more than computing proportions and summing values.

Example: Data on educational attainment of Americans of different ages were collected by the Census Bureau in 1987. The two-way table below contains counts of the number of people who fell into each education/age category listed on the sides of the table. Totals for each level of educational attainment and age group are given in the last column and row respectively. So, for example, of the 151,616 total people surveyed, 13,183 were over the age of 65 and did not complete high school.

[Table: 1987 Data. Rows (Education): Didn't complete high school; Completed high school; College: 1-3 years; College: 4 or more years; Total. Columns: Age Group categories and Total.]

Note: This two-way table may also be called a 4x5 table to indicate the number of levels of each of the 2 variables. The 4 represents the number of education categories (row variable) and the 5 represents the number of age group categories (column variable). Typically, if an explanatory variable and a response variable can be clearly identified, the explanatory variable will be the column variable, and the response variable will be the row variable.

Virtually all aspects of the relationship between the variables Educational Attainment and Age Group can be found using proportions. To see this, consider the following questions.

What proportion of those surveyed did not complete high school? Of the 151,616 people surveyed, 36,114 did not complete high school, giving a proportion of 36114/151616 = .2382, or 23.82%, who did not complete high school.

What proportion of those surveyed completed high school only? 58941/151616 = .3887, or 38.87%, completed high school only.

It is simple to verify that 17.01% and 20.30% completed 1-3 years of college and 4 or more years of college respectively. These four percentages (23.82%, 38.87%, 17.01%, 20.30%), taken collectively, make up what is known as the marginal distribution of the variable Educational Attainment. Using the word marginal means we are only considering one of the variables (in this case, educational attainment), not the relationship between the two. So the marginal distribution gives the percentages of cases falling into each category of a single categorical variable.

Marginal distributions of categorical variables are best represented using bar graphs, as was done back in Topic 1. The marginal distribution for educational attainment is shown to the right.

[Bar graph: marginal distribution of educational attainment. Categories: Didn't Complete, Completed, College: 1-3 Years, College: 4 or More Years. Vertical axis: 0% to 40%.]

Relationships Between Categorical Variables: Consider now the following question: What proportion of those surveyed between the ages of 35 and 44 completed 4 years of college? The italicized part of this question is to draw your attention to the fact that the proportion sought is only for those between the ages of 35 and 44. Since there are 34,682 people in the 35-44 age group and 9,332 of them completed 4 years of college, 9332/34682 = .2691, or 26.91%, of the 35-44 year-olds completed 4 years of college.

Suppose we wanted to find the proportion of those between 35 and 44 at each of the 4 educational attainment levels. These proportions are .1396, .3806, .2107, and .2691 respectively. Because we are finding these proportions under the condition that the respondents are between 35 and 44 years of age, these values make up the conditional distribution of the educational attainment variable for the age group 35-44.

Probably the best way to visually represent these conditional distributions between categorical variables is through what is known as a segmented bar graph. A segmented bar graph for these data is shown below. Does there appear to be a relationship between educational attainment and age? Describe this relationship.

[Segmented bar graph: one bar per Age Group (the last is >65), each bar divided by Education Level (Didn't Complete, Completed, College: 1-3 Years, College: 4 or More Years). Vertical axis: 0% to 100%.]

R Note: To create simple bar graphs in R, see the corresponding pages of the R-Commander manual. To create a segmented bar graph in R, see the corresponding pages of the R-Commander manual.

Note: These same data were compiled from a sample taken as part of the 2010 census and appear on the next page along with a segmented bar graph. What differences do you see?
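As a quick check on the arithmetic in this section, the following Python sketch recomputes one marginal proportion and one conditional proportion from the counts quoted in the text. The function and variable names are illustrative assumptions, not part of the original notes, and only figures that appear in the text are used.

```python
def proportion(count, total):
    """Return count / total as a proportion between 0 and 1."""
    if total <= 0:
        raise ValueError("total must be positive")
    return count / total


# Counts quoted in the text (1987 data).
TOTAL_SURVEYED = 151_616          # everyone in the table
NO_HIGH_SCHOOL = 36_114           # row total: did not complete high school
AGE_35_44_TOTAL = 34_682          # column total: ages 35-44
AGE_35_44_COLLEGE_4PLUS = 9_332   # cell: ages 35-44 with 4+ years of college

# Marginal proportion: a row total divided by the grand total.
marginal_no_hs = proportion(NO_HIGH_SCHOOL, TOTAL_SURVEYED)

# Conditional proportion: a cell count divided by its column total.
conditional_college = proportion(AGE_35_44_COLLEGE_4PLUS, AGE_35_44_TOTAL)

# Conditional distribution of education for ages 35-44, as quoted in the text.
conditional_35_44 = [0.1396, 0.3806, 0.2107, 0.2691]

print(f"Marginal (no high school): {marginal_no_hs:.4f}")
print(f"Conditional (4+ years college, ages 35-44): {conditional_college:.4f}")
print(f"Conditional distribution sums to {sum(conditional_35_44):.4f}")
```

The only difference between the two calculations is the denominator: the grand total for a marginal proportion, the age-group total for a conditional one.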

[Table: 2010 Data. Rows (Education): Didn't complete high school; Completed high school; College: 1-3 years; College: 4 or more years; Total. Columns: Age Group categories and Total.]

Has the relationship between educational attainment and age changed over the past 25 years?

[Segmented bar graph, 2010 data: one bar per Age Group (the last is >65), each bar divided by Education Level (Didn't Complete, Completed, College: 1-3 Years, College: 4 or More Years). Vertical axis: 0% to 100%.]

Independence: In the age vs. education level example, it was clear that the distribution of education levels depended on the age group. For example, the younger age groups tended to have attained higher levels of education than the older age groups for the 1987 data. Whenever the conditional distribution of one variable is identical for every category of the other variable, the two variables are said to be independent. To illustrate this idea, consider the following data describing the relationship between students' residency status and gender for 100 students. Intuitively, would you expect there to be a relationship between gender and residency status, or would you expect these variables to be independent?

                Male   Female   Total
In-State          30       45      75
Out-of-State      10       15      25
Total             40       60     100

Note that regardless of whether a student is male or female, the probability of having in-state residency is 3/4 and the probability of not having residency is 1/4. By the same token, regardless of a student's residency, the probability of being male is 4/10 and the probability of being female is 6/10. Since the distribution of one variable does not depend on the level of the other variable, the variables residency status and gender are independent.

Relative Risk: Sometimes it is informative to look at the ratio of two proportions to gain information about the relative likelihood of occurrence of two events. Such a ratio of proportions is known as the relative risk between the two groups. For example, if the proportion of men between the ages of 50 and 59 with coronary heart disease (CHD) is 0.5 and the proportion of men between the ages of 30 and 39 with CHD is 0.2, the relative risk of a man having CHD between the two age groups is 0.5/0.2 = 2.5. This says that a randomly selected man in his 50s is 2.5 times as likely to have CHD as a randomly selected man in his 30s. In the age vs. education level example, what is the relative risk among those over 65 between not completing high school and completing 4 or more years of college?

Simpson's Paradox: Consider the following data collected to investigate a possible influence of race on the imposition of the death penalty for murder. Data on the race of the defendant in a murder trial and whether or not the death penalty was given appear in the table below.

                     Death Penalty?
Defendant's Race    Yes     No    Total
White                19    141      160
Black                17    149      166
Total                36    290      326

Is there a higher percentage of whites or blacks sentenced to death overall? Computing, 19 of 160 or 11.88% of the whites and 17 of 166 or 10.24% of the blacks are sentenced to death. So overall, a higher percentage of whites than blacks are sentenced to death when facing the death penalty for murder.

The Paradox: Suppose we consider a third variable, the race of the murder victim. A table incorporating this additional information is given below:

                 White Defendant:      Black Defendant:
                 Death Penalty         Death Penalty
                 Yes     No            Yes     No
White Victim      19    132             11     52
Black Victim       0      9              6     97

Now consider the following pair of questions:

1. Among the cases where the victim was white, was there a higher percentage of whites or blacks sentenced to death?
2. Among the cases where the victim was black, was there a higher percentage of whites or blacks sentenced to death?

Answers:

1. Among the cases with white victims, 19 of 151 or 12.58% of whites were sentenced to death, and 11 of 63 or 17.46% of blacks were sentenced to death.
2. Among the cases with black victims, 0 of 9 or 0.00% of whites were sentenced to death, and 6 of 103 or 5.83% of blacks were sentenced to death.
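The reversal in these answers can be checked directly. The Python sketch below uses only the counts quoted above (19 of 151, 11 of 63, 0 of 9, 6 of 103); the dictionary layout and function names are illustrative assumptions rather than anything prescribed by the notes.

```python
# counts[victim_race][defendant_race] = (sentenced to death, not sentenced)
counts = {
    "white victim": {"white": (19, 132), "black": (11, 52)},
    "black victim": {"white": (0, 9), "black": (6, 97)},
}


def death_rate(yes, no):
    """Proportion of defendants sentenced to death."""
    return yes / (yes + no)


def pooled_rate(defendant):
    """Death-sentence rate for one defendant race, ignoring victim race."""
    yes = sum(counts[victim][defendant][0] for victim in counts)
    no = sum(counts[victim][defendant][1] for victim in counts)
    return death_rate(yes, no)


# Overall (pooled) rates: whites come out higher.
overall = {d: pooled_rate(d) for d in ("white", "black")}

# Rates within each victim-race group: blacks come out higher in both.
by_victim = {
    victim: {d: death_rate(*cell) for d, cell in row.items()}
    for victim, row in counts.items()
}

for d, rate in overall.items():
    print(f"overall, {d} defendants: {rate:.2%}")
for victim, rates in by_victim.items():
    for d, rate in rates.items():
        print(f"{victim}, {d} defendants: {rate:.2%}")
```

One way to read the output: 151 of the 160 white defendants had white victims, the group in which death sentences were more common for both races, so pooling the two groups weights the white defendants' overall rate toward the higher of the two group rates.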

What happened here?!? Although a higher percentage of whites were sentenced to death overall, a higher percentage of blacks were sentenced to death both when the victim was white and when the victim was black! Can you explain this paradox?

Bottom Line: Additional variables, such as the race of the victim here, can play an important role in the analysis of data, and can change our perceptions and conclusions. Variables of this type are known as lurking variables, as first considered in Topic 3.

Topic 7: Displaying and Describing Distributions (Quantitative Data)

When analyzing a distribution, there are many features one might examine. Five of these are outlined below.

1. Center: The center of the distribution is generally the most important and informative aspect of the distribution. Some measures of center with which we are familiar are the mean, median, and mode.

2. Spread or Variability: Giving the center of the distribution is not sufficient. It is also important to give some idea of how spread out the data are. Consider the following two sets of data: (98, 99, 100, 101, 102) and (50, 75, 100, 125, 150). Clearly, these two lists have the same center (100), but the latter has much more variability than the former.

3. Shape: There are 3 common shapes one finds in examining distributions, though not all distributions fall into one of these categories. These shapes are:
(a) Symmetric:
(b) Skewed to the left:

Chapter 6 Notes

Introduction: Chapter 6 is one of the hardest chapters for 216 students to understand, and to make it a bit worse, it comes up very early in the course, while students are still trying to feel comfortable with the new vocabulary and concepts of statistics. When we look for a relationship between 2 quantitative variables, the task is relatively easy to visualize and the concept is rather straightforward to grasp. We put one variable on the y axis (the response variable) and the explanatory variable on the x axis, and, using (x, y) points, we plot them on a scatter plot. Then we visually and statistically find the line or curve of best fit that describes the points. We can even quantify the relationship using statistical values and concepts like the regression coefficient and r-squared values. We can describe various shades of relationship between the two quantities, going from weak to very strong.

When it comes to finding a relationship between 2 categorical variables (the stuff of Chapter 6), it becomes a bit harder and somewhat mysterious, as well as more vague. And that seems reasonable, since we are trying to relate 2 things which have categories, and to do so in some quantitative way. For this task we use a process that involves making or using a 2-way table, then computing conditional and sometimes marginal distributions, followed by a segmented bar graph. From these things we can make a determination of relationship, but it is simply that the 2 variables are relatively independent of each other, or that they are not independent of each other. And the way we do this is rather strange to the eyes of new statistics students.

2-Way Tables: Two-way tables have one variable's categories as the row titles and the other variable's categories as the column titles, with the counts of the observational units in the study who fit in each category of each variable shown as the cell values. The margins of the table contain the total counts of the rows and columns, respectively. The total-total value is shown in the bottom right cell, and represents the total number of observational units used in the study from which the table came. Statistical convention dictates that the variable whose categories make up the row titles is the y or response variable, and the variable whose categories make up the column titles is the x or explanatory variable. When making up a 2-way table, be sure that your total-total is the same whether you are adding the marginal column or adding the marginal row!!

Marginal Distributions: The marginal distribution is a listing of proportions, or fractions, whose denominator (bottom number) is ALWAYS the total-total number and whose numerator (top number) is one of the values in the total row or column. To know whether you are working with the row or column totals, you need to answer the question: Do I want the marginal distribution of the x variable or the y variable? If the answer is the x variable, use the totals in the total row for marginal numerators; if the answer is the y variable, use the totals in the total column (on the right) for marginal numerators.

Conditional Distributions: Like marginal distributions, conditionals are distributions of fractions, but their denominators are NEVER the total-total value; they are the total row or total column values. The numerators of these conditionals are individual cell counts. To determine what values to use for denominators, you have to understand what a conditional means. I think about it this way. The whole world consists of the number of individuals listed in the total-total cell. If we are finding the conditional on the x variable (assuming the x categories are the column labels), then we reduce our world down to just those who live in the first category (i.e., the left column) of the world. My conditional distribution (meaning my world is now reduced, upon condition, to just those in the first category of the x variable) is found by computing the proportions in that left column, with the column total as denominator and the cell value in each category of the y variable as numerator. Then I find the next conditional, which uses the total of all those who live in the next category of the x variable (i.e., the next column to the right), and so on, until I have completed the conditional for every column in the x variable categories.

To do conditionals on the y variable (where my y has category names as row labels), my conditional distribution has proportions whose denominators are the various y variable category totals, and whose numerators are the cell values of that y category. We again start with the conditional of the first row, then the next row, etc., until finished with the distribution. Realize that this process is reversed (rows become columns and vice versa) if the table is not done in the conventional way, and the x variable has categories for row titles and the y variable has categories as column titles. Understanding what I was attempting to explain above takes some practice, and is the reason I say this stuff is a bit hard to get the hang of. Once you get the hang of it, however, you probably won't forget how to do it properly.

Segmented Bar Graph: The segmented bar graph is, in essence, a visualization of the conditional distribution of y on x. In statistics jargon, we always say make a graph or distribution of ... on ..., or ... vs ..., or ... to ..., and the first set of dots will ALWAYS be the y variable and the x variable will ALWAYS come after the on, or vs, or to, etc. This may help you if that phrase is the only one you are given in the problem and you need to determine which variable is response (y) and which is explanatory (x).
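The marginal and conditional recipes above can be sketched in Python. This is a minimal illustration under stated assumptions: the table counts, the category labels ("Yes"/"No", "Group A" and so on), the function names, and the 0.05 tolerance used to decide whether bars "look about the same" are all hypothetical and do not come from the notes or the textbook.

```python
def column_conditionals(table):
    """Conditional distribution of y on x.

    table maps each y category (row) to {x category (column): count}.
    Returns {x category: {y category: proportion}}, where each
    denominator is the column total, never the total-total.
    """
    columns = list(next(iter(table.values())).keys())
    result = {}
    for col in columns:
        col_total = sum(table[row][col] for row in table)
        result[col] = {row: table[row][col] / col_total for row in table}
    return result


def y_marginal(table):
    """Marginal distribution of y: row totals over the total-total."""
    grand_total = sum(sum(row.values()) for row in table.values())
    return {label: sum(row.values()) / grand_total for label, row in table.items()}


def roughly_independent(table, tol=0.05):
    """True if every bar (column conditional) is within tol of the y marginal.

    tol is an arbitrary, hypothetical cutoff for "bars look about the same".
    """
    marginal = y_marginal(table)
    bars = column_conditionals(table)
    return all(
        abs(bars[col][row] - marginal[row]) <= tol
        for col in bars
        for row in marginal
    )


# Hypothetical counts: rows are y categories, columns are x categories.
table = {
    "Yes": {"Group A": 30, "Group B": 45, "Group C": 20},
    "No": {"Group A": 10, "Group B": 15, "Group C": 30},
}

for col, bar in column_conditionals(table).items():
    print(col, {row: round(p, 3) for row, p in bar.items()})
print("roughly independent:", roughly_independent(table))
```

Each inner dictionary holds the segment heights of one bar in the segmented bar graph, so each bar's segments sum to 1.0. Here the Group C bar differs from the others, so the sketch reports that the variables are not independent.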

Segmented bar graphs are composed of bars (the categories of the x variable), where each bar has segments (the categories of the y variable). All bars are the same height (1.0 or 100%), and the segmented bars are placed on a set of x-y axes. The x axis has a label (which is the x variable) and the y axis has the label proportion or percent. Each bar has the x category label under it, and the y variable is stated with little boxes showing the legend of colors or textures used for each category of the y variable (i.e., the colors/textures of the bar segments). Finally, each graph must have a main title.

Conclusion of the study: After all of this work you are now able to determine a conclusion. If your graph has bars that look approximately the same, then the two variables (x and y) are relatively independent. If one or more bars look different from the others, then the 2 variables are not independent. You might ask: if the variables are not independent, why don't we just say that they are dependent? In statistics, for this case, there are various levels and types of dependency, which you cannot know from just a 2-way table, so even though it is clumsy English, it is more precise statistics-speak.

Some might say that it should be the other way around, that different-looking bars indicate independence or freedom. But here is how I look at this concept, and why what we do makes sense to me. Let's say that I am a statistician working for you, and you hired me to construct the study, along with the 2-way table, graph, etc. I do so and show it to you. You then ask: What proportion falls in the first category of the y variable? If all of my bars on the segmented bar graph are about the same, then I can answer that question immediately, with the proportion of the desired category of y on any of the bars. I am independent to give you an answer. However, if I have bars with different segment lengths on them, then I cannot answer your question until I ask you: What category of the x variable are you referring to?, since my answer will change depending on the category. I am not independent to answer your question until YOU ANSWER MINE!

Simpson's Paradox: A paradox is a seeming contradiction. There was a famous guard for the Utah Jazz who was inducted into the NBA Hall of Fame a while back. There was another player on a competing team who had a better career shooting percentage than John (the guard) had, but who was never even considered for the Hall of Fame. John, who was a famous 3-point shooter, had a much better 3-point shooting percentage, and a bit better 2-point shooting percentage, than the other player. The paradox is: how did the other player have an overall better shooting percentage than John did? This situation comes up a lot in 2-way-table statistics and is referred to as Simpson's Paradox. There is a marvelous diagram of this situation in the last WATCH OUT section of Ch 6 in your book. In essence, there is a confounding 3rd variable, which gives different weighting to the proportion you are computing, and this causes the paradox. In our NBA example, John took mostly 3-point shots and had a much better 3-point shooting percentage than the other player did. The other player took mostly 2-point shots, and although he made a slightly lower percentage of those than John did, when you just divide total makes by total attempted shots, the other player is higher, because John missed more of the difficult shots in his overall attempts. The key here is to determine what that hidden 3rd variable is, the one which weights the proportion differently. In John's case, a higher percentage of his shots were the more difficult ones, which dragged down his overall unweighted proportion.

Here is another Simpson's paradox, which shows the problem of weighting. Say you and I take 5 courses one semester and I get 3 A's, 1 B and 1 C. You get 1 A, 2 B's and 2 C's. And you beat me in GPA. How is that possible when I got far more A's than you? Answer: the weighting of the grades. My A's were in 1-credit lab courses and the C was in a 5-credit course, whereas your A was in a 5-credit course and your C's were in 1-credit labs.

Summary: So, as you study this chapter, do so slowly, methodically, thoughtfully, and do many problems and examples. Don't take these concepts too lightly or study too superficially. However, don't overthink this stuff, either. Once you get it, then you will have it.
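Returning to the GPA example in the Simpson's Paradox discussion, the effect of credit weighting can be worked through in a few lines of Python. The sketch assumes a 4-point scale (A = 4, B = 3, C = 2) and assumes every B was earned in a 3-credit course; neither assumption is stated in the notes, which give credit values only for the A's and C's.

```python
# Assumed 4-point grading scale (not stated in the notes).
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0}


def weighted_gpa(courses):
    """Credit-weighted GPA for a list of (grade, credits) pairs."""
    total_credits = sum(credits for _, credits in courses)
    total_points = sum(GRADE_POINTS[grade] * credits for grade, credits in courses)
    return total_points / total_credits


def unweighted_average(courses):
    """Plain average of grade points, ignoring credits."""
    return sum(GRADE_POINTS[grade] for grade, _ in courses) / len(courses)


# My grades: three A's in 1-credit labs, one B (assumed 3 credits), one C in a 5-credit course.
mine = [("A", 1), ("A", 1), ("A", 1), ("B", 3), ("C", 5)]

# Your grades: one A in a 5-credit course, two B's (assumed 3 credits each), two C's in 1-credit labs.
yours = [("A", 5), ("B", 3), ("B", 3), ("C", 1), ("C", 1)]

print(f"My unweighted average:   {unweighted_average(mine):.2f}")
print(f"Your unweighted average: {unweighted_average(yours):.2f}")
print(f"My weighted GPA:         {weighted_gpa(mine):.2f}")
print(f"Your weighted GPA:       {weighted_gpa(yours):.2f}")
```

Under these assumptions my plain grade average is higher, yet your credit-weighted GPA is higher, because credits play the role of the hidden weighting variable.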

Characteristics of Binomial Distributions

Characteristics of Binomial Distributions Lesson2 Characteristics of Binomial Distributions In the last lesson, you constructed several binomial distributions, observed their shapes, and estimated their means and standard deviations. In Investigation

More information

The Big Picture. Describing Data: Categorical and Quantitative Variables Population. Descriptive Statistics. Community Coalitions (n = 175)

The Big Picture. Describing Data: Categorical and Quantitative Variables Population. Descriptive Statistics. Community Coalitions (n = 175) Describing Data: Categorical and Quantitative Variables Population The Big Picture Sampling Statistical Inference Sample Exploratory Data Analysis Descriptive Statistics In order to make sense of data,

More information

MEASURES OF VARIATION

MEASURES OF VARIATION NORMAL DISTRIBTIONS MEASURES OF VARIATION In statistics, it is important to measure the spread of data. A simple way to measure spread is to find the range. But statisticians want to know if the data are

More information

Diagrams and Graphs of Statistical Data

Diagrams and Graphs of Statistical Data Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in

More information

Unit 9 Describing Relationships in Scatter Plots and Line Graphs

Unit 9 Describing Relationships in Scatter Plots and Line Graphs Unit 9 Describing Relationships in Scatter Plots and Line Graphs Objectives: To construct and interpret a scatter plot or line graph for two quantitative variables To recognize linear relationships, non-linear

More information

6.4 Normal Distribution

6.4 Normal Distribution Contents 6.4 Normal Distribution....................... 381 6.4.1 Characteristics of the Normal Distribution....... 381 6.4.2 The Standardized Normal Distribution......... 385 6.4.3 Meaning of Areas under

More information

PURPOSE OF GRAPHS YOU ARE ABOUT TO BUILD. To explore for a relationship between the categories of two discrete variables

PURPOSE OF GRAPHS YOU ARE ABOUT TO BUILD. To explore for a relationship between the categories of two discrete variables 3 Stacked Bar Graph PURPOSE OF GRAPHS YOU ARE ABOUT TO BUILD To explore for a relationship between the categories of two discrete variables 3.1 Introduction to the Stacked Bar Graph «As with the simple

More information

DESCRIPTIVE STATISTICS & DATA PRESENTATION*

DESCRIPTIVE STATISTICS & DATA PRESENTATION* Level 1 Level 2 Level 3 Level 4 0 0 0 0 evel 1 evel 2 evel 3 Level 4 DESCRIPTIVE STATISTICS & DATA PRESENTATION* Created for Psychology 41, Research Methods by Barbara Sommer, PhD Psychology Department

More information

RATIOS, PROPORTIONS, PERCENTAGES, AND RATES

RATIOS, PROPORTIONS, PERCENTAGES, AND RATES RATIOS, PROPORTIOS, PERCETAGES, AD RATES 1. Ratios: ratios are one number expressed in relation to another by dividing the one number by the other. For example, the sex ratio of Delaware in 1990 was: 343,200

More information

Data exploration with Microsoft Excel: analysing more than one variable

Data exploration with Microsoft Excel: analysing more than one variable Data exploration with Microsoft Excel: analysing more than one variable Contents 1 Introduction... 1 2 Comparing different groups or different variables... 2 3 Exploring the association between categorical

More information

Association Between Variables

Association Between Variables Contents 11 Association Between Variables 767 11.1 Introduction............................ 767 11.1.1 Measure of Association................. 768 11.1.2 Chapter Summary.................... 769 11.2 Chi

More information

Simple Regression Theory II 2010 Samuel L. Baker

Simple Regression Theory II 2010 Samuel L. Baker SIMPLE REGRESSION THEORY II 1 Simple Regression Theory II 2010 Samuel L. Baker Assessing how good the regression equation is likely to be Assignment 1A gets into drawing inferences about how close the

More information

Session 7 Bivariate Data and Analysis

Session 7 Bivariate Data and Analysis Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares

More information

DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,

More information

UNDERSTANDING THE TWO-WAY ANOVA

UNDERSTANDING THE TWO-WAY ANOVA UNDERSTANDING THE e have seen how the one-way ANOVA can be used to compare two or more sample means in studies involving a single independent variable. This can be extended to two independent variables

More information

Mind on Statistics. Chapter 4

Mind on Statistics. Chapter 4 Mind on Statistics Chapter 4 Sections 4.1 Questions 1 to 4: The table below shows the counts by gender and highest degree attained for 498 respondents in the General Social Survey. Highest Degree Gender

More information

CALCULATIONS & STATISTICS

CALCULATIONS & STATISTICS CALCULATIONS & STATISTICS CALCULATION OF SCORES Conversion of 1-5 scale to 0-100 scores When you look at your report, you will notice that the scores are reported on a 0-100 scale, even though respondents

More information

Quantitative vs. Categorical Data: A Difference Worth Knowing Stephen Few April 2005

Quantitative vs. Categorical Data: A Difference Worth Knowing Stephen Few April 2005 Quantitative vs. Categorical Data: A Difference Worth Knowing Stephen Few April 2005 When you create a graph, you step through a series of choices, including which type of graph you should use and several

More information

TEACHER NOTES MATH NSPIRED

TEACHER NOTES MATH NSPIRED Math Objectives Students will understand that normal distributions can be used to approximate binomial distributions whenever both np and n(1 p) are sufficiently large. Students will understand that when

More information

Descriptive Statistics and Measurement Scales

Descriptive Statistics and Measurement Scales Descriptive Statistics 1 Descriptive Statistics and Measurement Scales Descriptive statistics are used to describe the basic features of the data in a study. They provide simple summaries about the sample

More information

Modifying Colors and Symbols in ArcMap

Modifying Colors and Symbols in ArcMap Modifying Colors and Symbols in ArcMap Contents Introduction... 1 Displaying Categorical Data... 3 Creating New Categories... 5 Displaying Numeric Data... 6 Graduated Colors... 6 Graduated Symbols... 9

More information

Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs

Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs Types of Variables Chapter 1: Looking at Data Section 1.1: Displaying Distributions with Graphs Quantitative (numerical)variables: take numerical values for which arithmetic operations make sense (addition/averaging)

More information

Statistics 2014 Scoring Guidelines

Statistics 2014 Scoring Guidelines AP Statistics 2014 Scoring Guidelines College Board, Advanced Placement Program, AP, AP Central, and the acorn logo are registered trademarks of the College Board. AP Central is the official online home

More information

Pie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple.

Pie Charts. proportion of ice-cream flavors sold annually by a given brand. AMS-5: Statistics. Cherry. Cherry. Blueberry. Blueberry. Apple. Graphical Representations of Data, Mean, Median and Standard Deviation In this class we will consider graphical representations of the distribution of a set of data. The goal is to identify the range of

More information

Introduction to Statistics for Psychology. Quantitative Methods for Human Sciences

Introduction to Statistics for Psychology. Quantitative Methods for Human Sciences Introduction to Statistics for Psychology and Quantitative Methods for Human Sciences Jonathan Marchini Course Information There is website devoted to the course at http://www.stats.ox.ac.uk/ marchini/phs.html

More information

Summarizing and Displaying Categorical Data

Summarizing and Displaying Categorical Data Summarizing and Displaying Categorical Data Categorical data can be summarized in a frequency distribution which counts the number of cases, or frequency, that fall into each category, or a relative frequency

More information

Fairfield Public Schools

Fairfield Public Schools Mathematics Fairfield Public Schools AP Statistics AP Statistics BOE Approved 04/08/2014 1 AP STATISTICS Critical Areas of Focus AP Statistics is a rigorous course that offers advanced students an opportunity

More information

Students summarize a data set using box plots, the median, and the interquartile range. Students use box plots to compare two data distributions.






