1 Organizing and Graphing Data
- Dayna Dorsey
- 7 years ago
1.1 Organizing and Graphing Categorical Data

After categorical data has been sampled, it should be summarized to provide the following information:
1. Which values have been observed? (red, green, blue, brown, orange, yellow)
2. How often did each value occur?

Categorical data is usually summarized in a table giving the following information:
- categories observed
- frequency, or number of measurements for each category
- relative frequency, or proportion of measurements for each category
- percentage of measurements for each category

Definition: The relative frequency of a particular category is the fraction or proportion of the observations in the data set that fall in that category. It is calculated as

$$\text{relative frequency of a category} = \frac{\text{frequency of that category}}{\text{sum of all frequencies}}$$

$$\text{percent} = 100 \times \text{relative frequency}$$

Example: Sum of all frequencies = sample size = number of observations = n = 200

category    frequency    relative frequency    percentage
wood                                           %
tiles                                          %
linoleum                                       %
carpet                                         %
total                                          %

Such a table is called the frequency distribution table for categorical data.

Once the data is summarized in a frequency distribution table, it can be displayed in a bar chart or pie chart. The bar chart (bar graph) effectively shows the frequencies in the different categories, whereas the pie chart shows the relationship between the parts and the whole.
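The two formulas above can be sketched in a few lines of Python. The counts below are hypothetical (the values of the original table are not available); they were only chosen so that they add up to n = 200, and the function name `frequency_table` is an illustrative choice.

```python
def frequency_table(counts):
    """Return {category: (frequency, relative frequency, percentage)}."""
    n = sum(counts.values())  # sum of all frequencies = sample size
    return {cat: (f, f / n, 100 * f / n) for cat, f in counts.items()}


# Hypothetical counts adding up to n = 200 (NOT the values of the original table).
flooring = {"wood": 60, "tiles": 40, "linoleum": 20, "carpet": 80}

table = frequency_table(flooring)
print(f"{'category':<10}{'freq':>6}{'rel.freq':>10}{'percent':>9}")
for cat, (f, rel, pct) in table.items():
    print(f"{cat:<10}{f:>6}{rel:>10.2f}{pct:>8.1f}%")
```

The relative frequencies always add up to 1 and the percentages to 100, which is a quick check on any frequency distribution table.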
1.1.1 Bar Graph

Definition 1: A graph made of bars whose heights represent the frequencies of the respective categories is called a bar graph.

Instead of frequencies, a bar graph might display the relative frequencies or percentages of the categories.
- For every category, the x-axis is marked with a tick.
- Each category is represented by a bar whose AREA is proportional to the corresponding frequency (relative frequency).
- Label the y-axis.

Remark: The width of each bar should be the same, so that the height is proportional to the corresponding frequency.

Example 1: Suppose the frequency distribution of the mainly used flooring products is:

            frequency    relative freq
wood
tiles
linoleum
carpet
1.1.2 Pie Charts

Pie charts provide an alternative kind of graph for categorical data.

Definition 2: A circle divided into portions that represent the relative frequencies or percentages of a population or sample belonging to different categories is called a pie chart.

The size of the slice representing a particular category is proportional to the frequency (relative frequency) of the observations that fall within this category.

How to create a pie chart:
- Draw a circle.
- Calculate the slice size (angle), that is, the fraction of the circle for the category:
  $$\text{slice angle} = \text{category relative frequency} \times 360^\circ$$
- Use a protractor to mark the angles.

            frequency    relative freq    angle
wood
tiles
linoleum
carpet
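As a minimal sketch of the angle calculation, the following Python snippet turns counts into slice angles. The flooring counts are hypothetical (the original table values are not available), and `slice_angles` is an illustrative name.

```python
def slice_angles(counts):
    """Slice angle in degrees = relative frequency * 360 for each category."""
    n = sum(counts.values())
    return {cat: 360 * f / n for cat, f in counts.items()}


# Hypothetical counts (NOT the values of the original table).
flooring = {"wood": 60, "tiles": 40, "linoleum": 20, "carpet": 80}

angles = slice_angles(flooring)
for cat, angle in angles.items():
    print(f"{cat:<10}{angle:>7.1f} degrees")
```

The angles of all slices add up to 360 degrees, so the slices fill the circle exactly.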
M&M's example: On the M&M's webpage the following information on the distribution of colors in peanut M&M's is provided:

color      brown    yellow    red    blue    orange    green
percent    12%      15%       12%    23%     23%       15%

In order to check whether this distribution is a true description of what is in a bag, someone bought a bag with 200 peanut M&M's and wants to describe the colors of the contents. Color is a categorical variable, so a relative frequency table is obtained.

color     count    rel. freq.    percentage
brown                            %
yellow                           %
red                              %
blue                             %
orange                           %
green                            %
Total                            %

And a bar chart would look like this:

For the pie chart, the angles of the slices have to be determined:

color     count    rel. freq.    angle
brown                            °
yellow                           °
red                              °
blue                             °
orange                           °
green                            °
Total                            °

This results in the following pie chart:
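One way to judge the observed counts is to compare them with the counts the published percentages would predict for a bag of 200. The sketch below computes only these expected counts; the observed counts of the bag are not available here, and the variable names are illustrative.

```python
# Published color distribution for peanut M&M's (percent), as quoted above.
published_percent = {
    "brown": 12, "yellow": 15, "red": 12,
    "blue": 23, "orange": 23, "green": 15,
}
bag_size = 200  # number of M&M's in the purchased bag

# The percentages of a complete distribution must add up to 100.
total_percent = sum(published_percent.values())

# Expected count = bag size * published relative frequency.
expected = {color: bag_size * pct / 100 for color, pct in published_percent.items()}

for color, count in expected.items():
    print(f"{color:<8}{count:>6.0f}")
```

Observed counts that are far from these expected values would cast doubt on the published distribution; how far is "too far" is a question for statistical inference, not for descriptive statistics.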
1.2 Organizing and Graphing Quantitative Data

Graphs from this section display the data for a quantitative variable in such a way that the distribution of the data becomes apparent.

Stem and Leaf Plots

One way of displaying numerical data is the stem and leaf plot. Each observed number is broken into two pieces called the stem and the leaf.

How to do a stem and leaf plot:
1. Divide each measurement into two parts: the first digit(s) of the number are the stems, and the last digit(s) of the number are the leaves.
2. List the stems in a column, with a vertical line to their right.
3. For each measurement, record the leaf portion in the same row as its corresponding stem.
4. Order the leaves from lowest to highest in each stem.
5. Provide a key to your stem and leaf coding so that the reader can recreate the actual measurements.

Example 2: Acceptance rates at some business schools:
16.3, 12.0, 25.1, 20.3, 31.9, 20.7, 30.1, 19.5, 36.2, 46.9, 25.8, 36.7, 33.8, 24.2, 21.5, 35.1, 37.6, 23.9, 17.0, 38.4, 31.2, 43.8, 28.9, 31.4, 48.9

Stem and Leaf Plot:
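The five steps can be sketched in Python for the acceptance rates. The coding used here is an assumption: the stem is the tens digit and the leaf is the rest of the number (ones and tenths), so that every measurement can be recreated exactly; the plot in the original notes may have used a different coding.

```python
from collections import defaultdict

# Acceptance rates from Example 2.
rates = [16.3, 12.0, 25.1, 20.3, 31.9, 20.7, 30.1, 19.5, 36.2, 46.9,
         25.8, 36.7, 33.8, 24.2, 21.5, 35.1, 37.6, 23.9, 17.0, 38.4,
         31.2, 43.8, 28.9, 31.4, 48.9]


def stem_and_leaf(values):
    """Map each stem (tens digit) to its ordered leaves (ones and tenths)."""
    rows = defaultdict(list)
    for v in values:
        stem = int(v // 10)              # step 1: first digit is the stem
        leaf = round(v - 10 * stem, 1)   # step 1: remainder is the leaf
        rows[stem].append(leaf)          # step 3: record leaf next to its stem
    # steps 2 and 4: list stems in order and sort the leaves within each stem
    return {stem: sorted(leaves) for stem, leaves in sorted(rows.items())}


plot = stem_and_leaf(rates)
for stem, leaves in plot.items():
    print(f"{stem} | " + " ".join(f"{leaf:.1f}" for leaf in leaves))
print("Key: 1 | 6.3 stands for 16.3")  # step 5: provide a key
```

Each row of the printed plot is one stem; the length of a row shows how many measurements fall in that range of ten.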
stem = tens, leaf = tenth

It shows: center, range, concentration, nature of distribution (unimodal, bimodal, multimodal), unusual values, skewed to the right/left.

Sometimes the available stem choices result in a plot that contains too few stems and a large number of leaves within each stem. In this situation you can stretch the stems by dividing each into several lines. The two common choices for dividing stems are:
- into two lines, with leaves 0 to 4 and 5 to 9
- into five lines, with leaves 0-1, 2-3, 4-5, 6-7, 8-9

Example (acceptance rates):

You can also use stem and leaf plots to compare the distributions of two groups:

Relative Frequency Histograms

The most common graph for describing numerical continuous data is the histogram. It visualizes the distribution of the underlying variable, that is, how many measurements are found where on the measurement scale.

What a histogram looks like:
Definition: A relative frequency histogram for a quantitative data set is a bar graph in which the height of each bar shows how often (measured as a relative frequency) measurements fall in a particular interval. The classes or intervals are plotted along the horizontal axis.

The first step in creating a histogram is finding the frequency distribution of the variable of interest.

Definition 3: A frequency distribution for quantitative data lists all the classes and the number of values that belong to each class.

How to obtain a frequency distribution:
1. Decide which class intervals (preferably of equal length) to use for the frequency distribution. Each class is given through its lower boundary and its upper boundary. The class width = upper boundary - lower boundary. The number of class intervals used should be approximately the square root of the sample size, but not lower than 4 and not larger than 20. Use sensible interval boundaries: the intervals should, if possible, have the same width, and the boundaries should be round numbers (if possible whole numbers, tenths, or multiples of these).
2. Create a frequency table for the class intervals using the method of left inclusion: a value equal to a boundary is counted in the interval for which it is the lower boundary. List the class intervals and the frequency of values falling within each interval. Also give the relative frequency for each class interval.

These relative frequencies can now be displayed in a histogram. To obtain the histogram from the frequency distribution, follow these steps:
1. Mark the boundaries of the class intervals on a horizontal axis.
2. Use the relative frequency on the vertical axis.
3. Draw a bar for each class interval, with height according to the relative frequency of the corresponding class interval.

Example 3: Histogram for acceptance rates:
1. The sample size is 25, and its square root is 5, but we will use 4 class intervals because the range is about 10-50, which is easily divided into the intervals [10, 20), [20, 30), [30, 40), [40, 50).
2. Frequency table:

class interval    frequency    relative frequency
[10, 20)          4            0.16
[20, 30)          8            0.32
[30, 40)          10           0.40
[40, 50)          3            0.12

This graph uses the frequency (relative frequency is a better choice); the intervals have the same width.

It shows: center, range, concentration, nature of distribution (unimodal, bimodal, multimodal), unusual values, skewed to the right/left.
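The frequency distribution for the acceptance rates can be checked with a short Python sketch that applies left inclusion, meaning a value v is counted in the class [lower, upper) when lower <= v < upper. The function name and the text rendering of the bars are illustrative choices.

```python
# Acceptance rates from Example 2.
rates = [16.3, 12.0, 25.1, 20.3, 31.9, 20.7, 30.1, 19.5, 36.2, 46.9,
         25.8, 36.7, 33.8, 24.2, 21.5, 35.1, 37.6, 23.9, 17.0, 38.4,
         31.2, 43.8, 28.9, 31.4, 48.9]


def frequency_distribution(values, boundaries):
    """Return (lower, upper, frequency, relative frequency) for each class."""
    n = len(values)
    table = []
    for lower, upper in zip(boundaries[:-1], boundaries[1:]):
        # left inclusion: the lower boundary belongs to the class, the upper does not
        freq = sum(lower <= v < upper for v in values)
        table.append((lower, upper, freq, freq / n))
    return table


table = frequency_distribution(rates, [10, 20, 30, 40, 50])
for lower, upper, freq, rel in table:
    bar = "#" * freq  # crude text version of a histogram bar
    print(f"[{lower}, {upper})  {freq:>2}  {rel:.2f}  {bar}")
```

Because all four classes have the same width, the bars drawn from frequencies and from relative frequencies have the same shape; only the scale of the vertical axis differs.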
Features to check for in a histogram:
1. Center: where is the middle of the data?
2. Range: between which values do the data fall (here: 40 and 100)?
3. Number of peaks: unimodal (just one peak), bimodal (two peaks; often occurs if you have observations from two groups, such as men and women), multimodal (more than two peaks).
4. Symmetry: if you can draw a vertical line so that the part to the left is a mirror image of the part to the right, then the histogram is symmetric.
5. Nonsymmetric graphs are skewed. If the upper tail of the histogram stretches out farther than the lower tail, then the histogram is positively skewed, or skewed to the right.
6. If the lower tail is longer than the upper tail, the histogram is negatively skewed, or skewed to the left.
7. Check for outliers.
2 Numerical Descriptive Measures

Methods for describing data JUST FOR NUMERICAL VARIABLES!!

2.1 Measures of Central Tendency

The mean of a set of numerical observations is the familiar arithmetic average. To write the formula for the mean in a mathematical fashion, we have to introduce some notation.

Introduction of notation:
x = the variable for which we have sample data
n = sample size = number of observations
$x_1$ = the first sample observation
$x_2$ = the second sample observation
...
$x_n$ = the nth sample observation

For example, we might have a sample of n = 4 observations on x = battery lifetime (hr):
$x_1 = 5.9$, $x_2 = 7.3$, $x_3 = 6.6$, $x_4 = 5.7$

The sum of $x_1, x_2, \dots, x_n$ can be denoted by $x_1 + x_2 + \dots + x_n$, but this is cumbersome. The Greek letter $\Sigma$ is traditionally used in mathematics to denote summation. In particular, $\sum_{i=1}^{n} x_i$ denotes the sum of $x_1, \dots, x_n$. The abbreviation $\Sigma x$ is used in the book.

For the example above:

$$\sum_{i=1}^{4} x_i = x_1 + x_2 + x_3 + x_4 = 5.9 + 7.3 + 6.6 + 5.7 = 25.5$$
Definition: The sample mean of a numerical sample $x_1, x_2, \dots, x_n$, denoted by $\bar{x}$, is

$$\bar{x} = \frac{\text{sum of all observations}}{\text{number of observations}} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n}$$

The mean battery life is

$$\bar{x} = \frac{5.9 + 7.3 + 6.6 + 5.7}{4} = \frac{25.5}{4} = 6.375$$

Another number to describe the center of a sample is the median. The median is the value that divides the ordered sample into two sets of the same size, so that 50% of the data is less than this number (and 50% is greater than this number).

Definition: The sample median, M, is determined by first ordering the n observations from smallest to largest. Then

$$M = \text{sample median} = \begin{cases} \text{the single middle value} & \text{if } n \text{ is odd} \\ \text{the average of the middle two values} & \text{if } n \text{ is even} \end{cases}$$

Example: Suppose you have the following ordered sample of size 10:

The median would be in this case the mean of the fifth and sixth observation, (6 + 7)/2 = 6.5, and the sample mean is $\bar{x} =$

In a second sample, the median is the third observation, which is 8, and the sample mean is $\bar{x} = 7.4$.

Comparing mean and median

The mean is the balance point of the distribution. If you tried to balance a histogram on a pin, you would have to position the pin at the mean in order to succeed. The median is the point where the distribution is cut into two parts of the same area.
- In a symmetric distribution, mean and median are equal.
- In a positively skewed distribution, the mean is greater than the median.
- In a negatively skewed distribution, the mean is smaller than the median.

2.2 Measures of Dispersion for numerical data

It is not enough just to report a number that describes the center of a sample. The spread, or variability, in a sample is also an important characteristic of the sample.

Examples: graphs

Definition: The range of a sample is the difference between the largest and the smallest value in the sample.

Range = largest value - smallest value

Usually, the greater the range, the larger the variability. However, variability depends on more than just the distance between the two most extreme values. It is a characteristic of the whole data set, and every observation contributes to it.
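The definitions of mean, median, and range translate directly into Python. The sketch below applies them to the four battery lifetimes introduced earlier; the function names are illustrative, and the standard library module `statistics` offers equivalent `mean` and `median` functions.

```python
# Battery lifetimes (hr) from the notation example.
battery = [5.9, 7.3, 6.6, 5.7]


def sample_mean(values):
    """Sum of all observations divided by the number of observations."""
    return sum(values) / len(values)


def sample_median(values):
    """Middle value of the ordered sample (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def sample_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


print(f"mean   = {sample_mean(battery):.3f}")
print(f"median = {sample_median(battery):.3f}")
print(f"range  = {sample_range(battery):.3f}")
```

For the battery data, n = 4 is even, so the median is the average of the two middle values of the ordered sample 5.7, 5.9, 6.6, 7.3.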
Sample 1: * * * * * o * * * * *
Sample 2: * ****O**** *

Definition: The n deviations from the sample mean are the differences

$$x_1 - \bar{x},\ x_2 - \bar{x},\ \dots,\ x_n - \bar{x}$$

A specific deviation is greater than zero if the value is greater than $\bar{x}$ and negative if it is less than $\bar{x}$. The set of deviations describes the variability of the data set, but

$$\sum_{i=1}^{n} (x_i - \bar{x}) = 0.$$

If you square every deviation before summing them up, you will receive a number that characterizes the variability in the data set.

Definition: The sample variance, denoted by $s^2$, is the sum of squared deviations from the mean divided by $n - 1$. That is,

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

The sample standard deviation is the positive square root of the sample variance and is denoted by s:

$$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

For calculating the sample variance for a given sample, the following formula is easier to compute:

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}$$
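That the two variance formulas agree can be checked numerically. The following sketch evaluates both for the four battery lifetimes introduced earlier (5.9, 7.3, 6.6, 5.7); the function names are illustrative.

```python
from math import sqrt

# Battery lifetimes (hr) from the notation example.
battery = [5.9, 7.3, 6.6, 5.7]


def variance_by_definition(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def variance_by_shortcut(values):
    """(sum of x_i^2 - (sum of x_i)^2 / n), divided by n - 1."""
    n = len(values)
    total = sum(values)
    total_of_squares = sum(x * x for x in values)
    return (total_of_squares - total ** 2 / n) / (n - 1)


s2_definition = variance_by_definition(battery)
s2_shortcut = variance_by_shortcut(battery)
s = sqrt(s2_definition)

print(f"s^2 (definition) = {s2_definition:.4f}")
print(f"s^2 (shortcut)   = {s2_shortcut:.4f}")
print(f"s                = {s:.4f}")
```

In floating-point arithmetic the shortcut formula subtracts two large, nearly equal numbers and can lose precision for data with a large mean and small spread, so the definition is the safer choice in software even though the shortcut is easier by hand.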
Example: Calculate the standard deviation of the 4 battery lives.

i     $x_i$    $x_i - \bar{x}$    $(x_i - \bar{x})^2$    $x_i^2$
1     5.9      -0.475             0.226                  34.81
2     7.3       0.925             0.856                  53.29
3     6.6       0.225             0.051                  43.56
4     5.7      -0.675             0.456                  32.49
Σ     25.5      0                 1.589                  164.15

The sample variance is $s^2 = 1.589/3 \approx 0.530$, and the sample standard deviation is $s = \sqrt{0.530} \approx 0.728$.

Using the other formula, first calculate

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} = \frac{164.15 - \frac{(25.5)^2}{4}}{3} = \frac{1.59}{3} = 0.53$$

With this we get $s = \sqrt{0.53} \approx 0.728$.

Measures of position

The concept of the median can be generalized by asking for the number such that k% (instead of 50%) of the data falls below it.

Definition: For any particular number k between 0 and 100, the kth percentile is a value such that k percent of the observations in the data set fall at or below that value.

With this definition, the median is the 50th percentile: 50% of the data fall below the median.

An alternative measure of variability is the interquartile range. Like the mean, the standard deviation is greatly affected by outliers. The interquartile range, like the median, is resistant to outliers. It is based on quantities called quartiles.

Definition:
- The lower quartile $Q_1$ is the 25th percentile; 25% of the data fall below it.
- The median $Q_2$ is the 50th percentile; 50% of the data fall below it.
- The upper quartile $Q_3$ is the 75th percentile; 75% of the data fall below it (and 25% above).

The middle 50% of the measurements fall between the lower and upper quartile.

The quartiles of a sample are obtained by:
1. Divide the n ordered observations into a lower and an upper half; if n is odd, the median is excluded from both halves.
2. The lower quartile $Q_1$ is the median of the lower half.
3. The upper quartile $Q_3$ is the median of the upper half.

Example: (ordered sample with $Q_1$, the median, and $Q_3$ marked)

Definition: The interquartile range (IQR) is given by

IQR = upper quartile - lower quartile = $Q_3 - Q_1$

The IQR in the example is IQR = 8 - 5 = 3. The middle 50% of the data points in this sample are captured in an interval not longer than 3.

Summarizing a data set with a Boxplot

The boxplot is a powerful graphical tool for summarizing data. It shows the center, the spread, and the symmetry or skewness at the same time. It is based on the median, the IQR, and the minimum and maximum of the observations.

Construction of a boxplot:
1. Draw a horizontal or vertical measurement scale.
2. Draw a rectangular box whose lower edge is at the lower quartile and whose upper edge is at the upper quartile.
3. Draw a line segment inside the box at the location of the median.
4. Add line segments from each end of the box to the smallest and largest observation in the data set.

Example: Sample of pulse after exercise of size 92:
mean = 80, median = 76.0, min = 50, max = 140, $q_l = 68$, $q_u = 87$
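A boxplot is drawn from five numbers: minimum, lower quartile, median, upper quartile, and maximum. The sketch below computes them with the median-of-halves rule described above. The sample is hypothetical; it was merely chosen so that the quartiles come out as 5 and 8, and the function names are illustrative.

```python
def median_of(ordered):
    """Median of an already ordered list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # if n is odd, the median itself is excluded from both halves
    upper_half = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (ordered[0], median_of(lower_half), median_of(ordered),
            median_of(upper_half), ordered[-1])


# Hypothetical sample (NOT the data of the original example).
sample = [7, 5, 8, 4, 9, 6, 8]

minimum, q1, med, q3, maximum = five_number_summary(sample)
iqr = q3 - q1
print(f"min={minimum}, Q1={q1}, median={med}, Q3={q3}, max={maximum}, IQR={iqr}")
```

Statistical software often uses interpolation rules for quartiles that differ slightly from the median-of-halves rule, so small discrepancies between hand calculations and software output are to be expected.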
A boxplot can be supplied with even more information. Sometimes a star * is added for the mean. This helps to give a visual comparison between mean and median.

In addition, outliers may be identified in the boxplot. In order to do this, we first have to define what an outlier is.

Definition: An observation is called an outlier if it is more than 1.5 IQR away from the closest quartile.

In order to determine whether there is an outlier present in the data set, calculate:
- upper fence = upper quartile + 1.5 × IQR; every measurement above the upper fence is an upper outlier
- lower fence = lower quartile - 1.5 × IQR; every measurement below the lower fence is a lower outlier

Example: The IQR in the example is 87 - 68 = 19, and 1.5 × 19 = 28.5.
upper fence = 87 + 1.5 × 19 = 115.5. The maximum equals 140, so the data contains at least one upper outlier.
lower fence = 68 - 1.5 × 19 = 39.5. The minimum equals 50, so there is no lower outlier present.

Outliers may be marked by a circle or a star in a boxplot. In this case the whiskers only extend to the smallest and largest non-outliers.

One can create comparative boxplots by drawing several boxes in one graph. This is a good tool for comparing continuous variables in different categories.

Example: Resting pulse and pulse after exercise boxplots in one graph.
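The fence calculation for the pulse example can be reproduced with a few lines of Python. The function name `fences` and its parameter `k` are illustrative choices; the quartiles, minimum, and maximum are the summary values given for the pulse sample.

```python
def fences(lower_quartile, upper_quartile, k=1.5):
    """Return (lower fence, upper fence) at k * IQR beyond the quartiles."""
    iqr = upper_quartile - lower_quartile
    return lower_quartile - k * iqr, upper_quartile + k * iqr


# Summary values of the pulse-after-exercise sample.
q_l, q_u = 68, 87
minimum, maximum = 50, 140

lower_fence, upper_fence = fences(q_l, q_u)
has_upper_outlier = maximum > upper_fence  # the maximum lies above the upper fence
has_lower_outlier = minimum < lower_fence  # the minimum lies above the lower fence

print(f"lower fence = {lower_fence}, upper fence = {upper_fence}")
print(f"upper outlier present: {has_upper_outlier}")
print(f"lower outlier present: {has_lower_outlier}")
```

Checking only the minimum and maximum shows whether outliers exist at all; listing every outlier requires comparing each observation with the fences.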
3 A four step process

1. STATE: What is the practical question in the context of the discipline?
2. PLAN: What statistical tool(s) have to be employed to find an answer?
3. SOLVE: Make the necessary graphs and calculations.
4. CONCLUDE: Give the answer to the question STATEd above in the context of the discipline.

Example (Logging in the Rainforest, pg. 57):
1. STATE: Does logging the tropical rain forest result in its destruction? To answer this question we have data on the number of trees per acre on plots that had never been logged (Group 1), plots that had been logged 1 year earlier (Group 2), and plots that had been logged 8 years earlier (Group 3).
2. PLAN: Do side-by-side boxplots and descriptive statistics for the data from the 3 groups.
3. SOLVE:

GROUP    N    Mean    Median    StDev    Minimum    Maximum    Q1    Q3

4. CONCLUDE: The numerical summary as well as the boxplot suggest that logging results on average in a smaller number of trees per acre, whereas the standard deviation seems to be almost unchanged.
Describing and presenting data All epidemiological studies involve the collection of data on the exposures and outcomes of interest. In a well planned study, the raw observations that constitute the data
More informationShape of Data Distributions
Lesson 13 Main Idea Describe a data distribution by its center, spread, and overall shape. Relate the choice of center and spread to the shape of the distribution. New Vocabulary distribution symmetric
More informationSTAT355 - Probability & Statistics
STAT355 - Probability & Statistics Instructor: Kofi Placid Adragni Fall 2011 Chap 1 - Overview and Descriptive Statistics 1.1 Populations, Samples, and Processes 1.2 Pictorial and Tabular Methods in Descriptive
More information1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number
1) Write the following as an algebraic expression using x as the variable: Triple a number subtracted from the number A. 3(x - x) B. x 3 x C. 3x - x D. x - 3x 2) Write the following as an algebraic expression
More informationDemographics of Atlanta, Georgia:
Demographics of Atlanta, Georgia: A Visual Analysis of the 2000 and 2010 Census Data 36-315 Final Project Rachel Cohen, Kathryn McKeough, Minnar Xie & David Zimmerman Ethnicities of Atlanta Figure 1: From
More informationSampling and Descriptive Statistics
Sampling and Descriptive Statistics Berlin Chen Department of Computer Science & Information Engineering National Taiwan Normal University Reference: 1. W. Navidi. Statistics for Engineering and Scientists.
More informationMEASURES OF VARIATION
NORMAL DISTRIBTIONS MEASURES OF VARIATION In statistics, it is important to measure the spread of data. A simple way to measure spread is to find the range. But statisticians want to know if the data are
More informationUsing SPSS, Chapter 2: Descriptive Statistics
1 Using SPSS, Chapter 2: Descriptive Statistics Chapters 2.1 & 2.2 Descriptive Statistics 2 Mean, Standard Deviation, Variance, Range, Minimum, Maximum 2 Mean, Median, Mode, Standard Deviation, Variance,
More informationFoundation of Quantitative Data Analysis
Foundation of Quantitative Data Analysis Part 1: Data manipulation and descriptive statistics with SPSS/Excel HSRS #10 - October 17, 2013 Reference : A. Aczel, Complete Business Statistics. Chapters 1
More informationIntro to Statistics 8 Curriculum
Intro to Statistics 8 Curriculum Unit 1 Bar, Line and Circle Graphs Estimated time frame for unit Big Ideas 8 Days... Essential Question Concepts Competencies Lesson Plans and Suggested Resources Bar graphs
More informationPractice#1(chapter1,2) Name
Practice#1(chapter1,2) Name Solve the problem. 1) The average age of the students in a statistics class is 22 years. Does this statement describe descriptive or inferential statistics? A) inferential statistics
More informationSECTION 2-1: OVERVIEW SECTION 2-2: FREQUENCY DISTRIBUTIONS
SECTION 2-1: OVERVIEW Chapter 2 Describing, Exploring and Comparing Data 19 In this chapter, we will use the capabilities of Excel to help us look more carefully at sets of data. We can do this by re-organizing
More informationSection 1.1 Exercises (Solutions)
Section 1.1 Exercises (Solutions) HW: 1.14, 1.16, 1.19, 1.21, 1.24, 1.25*, 1.31*, 1.33, 1.34, 1.35, 1.38*, 1.39, 1.41* 1.14 Employee application data. The personnel department keeps records on all employees
More informationInterpreting Data in Normal Distributions
Interpreting Data in Normal Distributions This curve is kind of a big deal. It shows the distribution of a set of test scores, the results of rolling a die a million times, the heights of people on Earth,
More informationNorthumberland Knowledge
Northumberland Knowledge Know Guide How to Analyse Data - November 2012 - This page has been left blank 2 About this guide The Know Guides are a suite of documents that provide useful information about
More informationNumeracy Targets. I can count at least 20 objects
Targets 1c I can read numbers up to 10 I can count up to 10 objects I can say the number names in order up to 20 I can write at least 4 numbers up to 10. When someone gives me a small number of objects
More informationSPSS Manual for Introductory Applied Statistics: A Variable Approach
SPSS Manual for Introductory Applied Statistics: A Variable Approach John Gabrosek Department of Statistics Grand Valley State University Allendale, MI USA August 2013 2 Copyright 2013 John Gabrosek. All
More information+ Chapter 1 Exploring Data
Chapter 1 Exploring Data Introduction: Data Analysis: Making Sense of Data 1.1 Analyzing Categorical Data 1.2 Displaying Quantitative Data with Graphs 1.3 Describing Quantitative Data with Numbers Introduction
More informationEXPLORING SPATIAL PATTERNS IN YOUR DATA
EXPLORING SPATIAL PATTERNS IN YOUR DATA OBJECTIVES Learn how to examine your data using the Geostatistical Analysis tools in ArcMap. Learn how to use descriptive statistics in ArcMap and Geoda to analyze
More informationWeek 11 Lecture 2: Analyze your data: Descriptive Statistics, Correct by Taking Log
Week 11 Lecture 2: Analyze your data: Descriptive Statistics, Correct by Taking Log Instructor: Eakta Jain CIS 6930, Research Methods for Human-centered Computing Scribe: Chris(Yunhao) Wan, UFID: 1677-3116
More informationconsider the number of math classes taken by math 150 students. how can we represent the results in one number?
ch 3: numerically summarizing data - center, spread, shape 3.1 measure of central tendency or, give me one number that represents all the data consider the number of math classes taken by math 150 students.
More informationLecture 2. Summarizing the Sample
Lecture 2 Summarizing the Sample WARNING: Today s lecture may bore some of you It s (sort of) not my fault I m required to teach you about what we re going to cover today. I ll try to make it as exciting
More informationMathematics Content: Pie Charts; Area as Probability; Probabilities as Percents, Decimals & Fractions
Title: Using the Area on a Pie Chart to Calculate Probabilities Mathematics Content: Pie Charts; Area as Probability; Probabilities as Percents, Decimals & Fractions Objectives: To calculate probability
More informationDESCRIPTIVE STATISTICS & DATA PRESENTATION*
Level 1 Level 2 Level 3 Level 4 0 0 0 0 evel 1 evel 2 evel 3 Level 4 DESCRIPTIVE STATISTICS & DATA PRESENTATION* Created for Psychology 41, Research Methods by Barbara Sommer, PhD Psychology Department
More informationMean = (sum of the values / the number of the value) if probabilities are equal
Population Mean Mean = (sum of the values / the number of the value) if probabilities are equal Compute the population mean Population/Sample mean: 1. Collect the data 2. sum all the values in the population/sample.
More informationCHAPTER THREE. Key Concepts
CHAPTER THREE Key Concepts interval, ordinal, and nominal scale quantitative, qualitative continuous data, categorical or discrete data table, frequency distribution histogram, bar graph, frequency polygon,
More informationScope and Sequence KA KB 1A 1B 2A 2B 3A 3B 4A 4B 5A 5B 6A 6B
Scope and Sequence Earlybird Kindergarten, Standards Edition Primary Mathematics, Standards Edition Copyright 2008 [SingaporeMath.com Inc.] The check mark indicates where the topic is first introduced
More informationSession 7 Bivariate Data and Analysis
Session 7 Bivariate Data and Analysis Key Terms for This Session Previously Introduced mean standard deviation New in This Session association bivariate analysis contingency table co-variation least squares
More informationStatistics Revision Sheet Question 6 of Paper 2
Statistics Revision Sheet Question 6 of Paper The Statistics question is concerned mainly with the following terms. The Mean and the Median and are two ways of measuring the average. sumof values no. of
More informationStatistics. Measurement. Scales of Measurement 7/18/2012
Statistics Measurement Measurement is defined as a set of rules for assigning numbers to represent objects, traits, attributes, or behaviors A variableis something that varies (eye color), a constant does
More informationIntroduction; Descriptive & Univariate Statistics
Introduction; Descriptive & Univariate Statistics I. KEY COCEPTS A. Population. Definitions:. The entire set of members in a group. EXAMPLES: All U.S. citizens; all otre Dame Students. 2. All values of
More informationModule 2: Introduction to Quantitative Data Analysis
Module 2: Introduction to Quantitative Data Analysis Contents Antony Fielding 1 University of Birmingham & Centre for Multilevel Modelling Rebecca Pillinger Centre for Multilevel Modelling Introduction...
More informationAP Statistics Solutions to Packet 2
AP Statistics Solutions to Packet 2 The Normal Distributions Density Curves and the Normal Distribution Standard Normal Calculations HW #9 1, 2, 4, 6-8 2.1 DENSITY CURVES (a) Sketch a density curve that
More informationFirst Midterm Exam (MATH1070 Spring 2012)
First Midterm Exam (MATH1070 Spring 2012) Instructions: This is a one hour exam. You can use a notecard. Calculators are allowed, but other electronics are prohibited. 1. [40pts] Multiple Choice Problems
More informationPart 2: Data Visualization How to communicate complex ideas with simple, efficient and accurate data graphics
Part 2: Data Visualization How to communicate complex ideas with simple, efficient and accurate data graphics Why visualize data? The human eye is extremely sensitive to differences in: Pattern Colors
More informationSta 309 (Statistics And Probability for Engineers)
Instructor: Prof. Mike Nasab Sta 309 (Statistics And Probability for Engineers) Chapter 2 Organizing and Summarizing Data Raw Data: When data are collected in original form, they are called raw data. The
More informationBridging Documents for Mathematics
Bridging Documents for Mathematics 5 th /6 th Class, Primary Junior Cycle, Post-Primary Primary Post-Primary Card # Strand(s): Number, Measure Number (Strand 3) 2-5 Strand: Shape and Space Geometry and
More informationExploratory Data Analysis
Exploratory Data Analysis Learning Objectives: 1. After completion of this module, the student will be able to explore data graphically in Excel using histogram boxplot bar chart scatter plot 2. After
More informationDescriptive Statistics
Descriptive Statistics Suppose following data have been collected (heights of 99 five-year-old boys) 117.9 11.2 112.9 115.9 18. 14.6 17.1 117.9 111.8 16.3 111. 1.4 112.1 19.2 11. 15.4 99.4 11.1 13.3 16.9
More informationMathematical Conventions Large Print (18 point) Edition
GRADUATE RECORD EXAMINATIONS Mathematical Conventions Large Print (18 point) Edition Copyright 2010 by Educational Testing Service. All rights reserved. ETS, the ETS logo, GRADUATE RECORD EXAMINATIONS,
More information