# Graphical methods for presenting data

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Chapter 2 Graphical methods for presenting data 2.1 Introduction We have looked at ways of collecting data and then collating them into tables. Frequency tables are useful methods of presenting data; they do, however, have their limitations. With large amounts of data graphical presentation methods are often clearer to understand. Here, we look at methods for producing graphical representations of data of the types we have seen previously. 2.2 Stem and Leaf plots Stem and leaf plots are a quick and easy way of representing data graphically. They can be used with both discrete and continuous data. The method for creating a stem and leaf plot is similar to that for creating a grouped frequency table. The first stage, as with grouped frequency tables, is to decide on a reasonable number of intervals which span the range of data. The interval widths for a stem and leaf plot must be equal. Because of the way the plot works it is best to use sensible values for the interval width i.e. 5, 10, 100, 1000; if a dataset consists of many small values, this interval width could also be 1, or even 0.1 or Once we have decided on our intervals we can construct the stem and leaf plot. This is perhaps best described by demonstration. Consider the following data: 11,12,9,15,21,25,19,8. The first step is to decide on interval widths one obvious choice would be to go up in10s. This would give a stem unit of 10 and a leaf unit of1. The stem and leaf plot is constructed as below Stem Leaf n = 8, stem unit = 10, leaf unit = 1. You can clearly see where the data have been put. The stem units are to the left of the vertical line, while the leaves are to the right. So, for example, our first observation 11 is made up of 12

2 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA 13 a stem unit of one 10 and a leaf unit one 1. It is important to give an equal amount of space to each leaf value by doing so, we can get a clear picture of any patterns in the data (it s almost like a bar chart on its side but it still shows the raw observations!). Before producing a stem and leaf plot, it will probably help to first write down the data in ascending numerical order. Example 1: Percentage returns on a share As you might imagine, the interval width does not have to be 10. The following numbers show the percentage returns on an ordinary share for 23 consecutive months: Here, the largest value is 2.4 and the smallest 2.3, and we have lots of decimal values in between. Thus, it seems sensible here to have a stem unit of 1 and a leaf unit of 0.1. A stem and leaf diagram for this set of returns then might look like: Stem Leaf n = 23, stem unit = 1, leaf unit = 0.1. Example 2: Unemployment rates in the U.S. Hopefully, that should all seem fine so far. So what can go wrong? Consider the following data, which are the percentage unemployment rates for 10 U.S. states: If you were to choose 10 as the interval width (i.e. go up in 10s), the stem and leaf plot would look like Stem Leaf n = 10, stem unit = 10, leaf unit = 1. Here, the interval width is too large, resulting in only two intervals for our data. With such few intervals it is difficult to identify any patterns in the data. We can get a better idea about what is going on if we choose a smaller interval width say 5. Doing so gives the following stem and leaf plot:

3 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA Stem Leaf n = 10, stem unit = 10, leaf unit = 1. Notice now that there are two 1s in the stem one for observations between 10 and 14 (inclusive) and another for observations between 15 and 19 (inclusive). Thus, the stem unit is still 10, but the interval width is now only 5. Changing the interval width like this produces a plot which starts to show some sort of pattern in the data indeed, this is the intention of such graphical presentations. We could, however, go to the other extreme and have too many intervals. If this were the case, any pattern would again be lost because lots of intervals would contain no observations at all. So choose your interval width carefully! Example 3: Call centre data Let s work through the following example. The observations in the table below are the recorded time it takes to get through to an operator at a telephone call centre (in seconds) Stem Leaf n = stem unit = leaf unit =

4 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA 15 Example 4: Production line data If there is more than one significant figure in the data, the extra digits are cut (or truncated), not rounded, to the nearest value; that is to say, 2.97 would become 2.9, not 3.0. To illustrate this, consider the following data on lengths of items on a production line (in cm): The stem and leaf plot for this is as follows: n = 10, stem unit = 1 cm, leaf unit = 0.1 cm. Here the interval width is 0.5. This allows for greater clarity in the plot. Why do you think we cut the extra digits? Example 5: student marks The stem and leaf plot below represents the marks on a test for 50 students n = 50, stem unit = 10, leaf unit = 1. It s easy to see some of the advantages of graphically presenting data. For example, here you can clearly see that the data are centred around a value in the low 30 s and fall away on either side. From stem and leaf plots we can quickly and easily tell if the distribution of the data is symmetric or asymmetric. We can see whether there are any outliers, that is, observations which are either much larger or much smaller than is typical of the data. We could perhaps even tell whether the data are multi modal, that is to say, whether there are two or more peaks on the graph with a gap between them. If so, this could suggest that the sample contains data from two or more groups.

5 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA Using Minitab With the small data sets we have seen so far, it is obviously relatively easy to create stem and leaf plots by hand. With larger data sets this would be more problematic and certainly more time consuming. Fortunately, there are computer packages that will create these plots for us Minitab is one such package, and can be found on most university computers. Details and guidance on using Minitab will be provided in the computer practical sessions in week Bar Charts Bar charts are a commonly used and clear way of presenting categorical data or any ungrouped discrete frequency observations. For example, recall the example on students modes of transport: Student Mode Student Mode Student Mode 1 Car 11 Walk 21 Walk 2 Walk 12 Walk 22 Metro 3 Car 13 Metro 23 Car 4 Walk 14 Bus 24 Car 5 Bus 15 Train 25 Car 6 Metro 16 Bike 26 Bus 7 Car 17 Bus 27 Car 8 Bike 18 Bike 28 Walk 9 Walk 19 Bike 29 Car 10 Car 20 Metro 30 Car The first logical step is to put these into a frequency table, giving Mode Frequency Car 10 Walk 7 Bike 4 Bus 4 Metro 4 Train 1 Total 30

6 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA 17 We can then present this information as a bar chart, by following the five step process shown below: 1. First decide what goes on each axis of the chart. By convention the variable being measured goes on the horizontal (x axis) and the frequency goes on the vertical (y axis). 2. Next decide on a numeric scale for the frequency axis. This axis represents the frequency in each category by its height. It must start at zero and include the largest frequency. It is common to extend the axis slightly above the largest value so you are not drawing to the edge of the graph. 3. Having decided on a range for the frequency axis we need to decide on a suitable number scale to label this axis. This should have sensible values, for example, 0,1,2,..., or 0,10,20..., or other such values as make sense given the data. 4. Draw the axes and label them appropriately. 5. Draw a bar for each category. When drawing the bars it is essential to ensure the following: the width of each bar is the same; the bars are separated from each other by equally sized gaps. This gives the following bar chart: 10 Frequency Car Walk Bike Bus Metro Train This bar chart clearly shows that the most popular mode of transport is the car and that the metro, bus and cycling are all equally popular (in our small sample). Bar charts provide a simple method of quickly spotting simple patterns of popularity within a discrete data set.

7 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA Histograms Bar charts have their limitations; for example, they cannot be used to present continuous data. When dealing with continuous random variables a different kind of graph is required. This is called a histogram. At first sight these look similar to bar charts. There are, however, two critical differences: the horizontal (x-axis) is a continuous scale. As a result of this there are no gaps between the bars (unless there are no observations within a class interval); the height of the rectangle is only proportional to the frequency if the class intervals are all equal. With histograms it is the area of the rectangle that is proportional to their frequency. Initially we will only consider histograms with equal class intervals. Those with uneven class intervals require more careful thought. Producing a histogram is much like producing a bar chart and in many respects can be considered to be the next stage after producing a grouped frequency table. In reality, it is often best to produce a frequency table first which collects all the data together in an ordered format. Once we have the frequency table, the process is very similar to drawing a bar chart. 1. Find the maximum frequency and draw the vertical (y axis) from zero to this value, including a sensible numeric scale. 2. The range of the horizontal (x axis) needs to include not only the full range of observations but also the full range of the class intervals from the frequency table. 3. Draw a bar for each group in your frequency table. These should be the same width and touch each other (unless there are no data in one particular class). The frequency table for the data on service times for a telephone call centre (Section 1.3.2) was Service time Frequency 175 time < time < time < time < time < time < time < time < time < time < Total 50

8 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA 19 The histogram for these data is: Frequency Time (s) Histograms are useful tools in data analysis. They are easy to produce in Minitab for large data sets and provide a clear visual representation of the data. Using histograms, it is easy to spot the modal or most popular class in the data, i.e. the one with the highest peak. It is also easy to spot simple patterns in the data. Is the frequency distribution symmetric, as the histograms produced above, or is it skewed to one side like the left hand histogram in the following graphic?

9 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA 20 Histograms also allow us to make early judgements as to whether all our data come from the same population. Consider the right hand histogram in the graphic above. It clearly contains two separate modes (peaks), each of which has its own symmetric pattern of data. This clearly suggests that the data come from two separate populations, one centred around 85 with a narrow spread and one centred around 100 with a wider spread. In real situations it is unlikely that the difference would be as dramatic, unless you had a poor sampling method. However, the drawing of histograms is often the first stage of a more complex analysis. Finally, when drawing histograms be aware of observations on variables which have boundaries on their ranges. For example heights, weights, times to complete tasks etc. can not take negative values so there is a lower limit at zero. Computer programs do not automatically know this. You should make sure that the lower limit of the first class interval is not negative in such cases. We have seen some basic ways in which we might present data graphically. These methods will often provide the mainstay of business presentations. There are, however, other techniques which are useful and offer advantages in some applications over histograms and bar charts.

10 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA Percentage Relative Frequency Histograms When we produced frequency tables in Section 1.3, we included a column for percentage relative frequency. This contained values for the frequency of each group, relative to the overall sample size, expressed as a percentage. For example, a percentage relative frequency table for the data on service time (in seconds) for calls to a credit card service centre is: Service time Frequency Relative Frequency (%) 175 time < time < time < time < time < time < time < time < time < time < Totals You can plot these data like an ordinary histogram, or, instead of using frequency on the vertical axis (y-axis), you could use the percentage relative frequency Relative frequency (%) Time (s) Note that the y-axis now contains the relative percentages rather than the frequencies. You might well ask why would we want to do this?.

11 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA 22 These percentage relative frequency histograms are useful when comparing two samples that have different numbers of observations. If one sample were larger than the other then a frequency histogram would show a difference simply because of the larger number of observations. Looking at percentages removes this difference and enables us to look at relative differences. For example, in the following graph there are data from two groups and four times as many data points for one group as the other. The left-hand plot shows an ordinary histogram and it is clear that the comparison between groups is masked by the quite different sample sizes. The right-hand plot shows a histogram based on (percentage) relative frequencies and this enables a much more direct comparison of the distributions in the two groups. Overlaying histograms on the same graph can sometimes not produce such a clear picture, particularly if the values in both groups are close or overlap one another significantly.

12 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA Relative Frequency Polygons These are a natural extension of the relative frequency histogram. They differ in that, rather than drawing bars, each class is represented by one point and these are joined together by straight lines. The method is similar to that for producing a histogram: 1. Produce a percentage relative frequency table. 2. Draw the axes Thex-axis needs to contain the full range of the classes used. They-axis needs to range from 0 to the maximum percentage relative frequency. 3. Plot points: pick the mid point of the class interval on thex-axis and go up until you reach the appropriate percentage value on the y-axis and mark the point. Do this for each class. 4. Join adjacent points together with straight lines. The relative frequency polygon is exactly the same as the relative frequency histogram, but instead of having bars we join the mid points of the top of each bar with a straight line. Consider the following simple example. This can be easily drawn by hand: Class Interval Mid Point % Relative Frequency 0 x < x < x < x < x < Relative frequency polygon Relative frequency (%) x

13 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA 24 These percentage relative frequency polygons are very useful for comparing two or more samples we can easily overlay many relative frequency polygons, but overlaying the corresponding histograms could get really messy! Consider the following data on gross weekly income (in ) collected from two sites in Newcastle. Let us suppose that many more responses were collected in Jesmond so that a direct comparison of the frequencies using a standard histogram is not appropriate. Instead we use relative frequencies. Weekly Income ( ) West Road (%) Jesmond Road (%) 0 income < income < income < income < income < income < income < income < income < income < Minitab was used to produce the following plot of the percentage relative frequancy polygons for the two groups. We can clearly see the differences between the two samples. The line connecting the boxes represents the data from West Road and the line connecting the circles represents those for Jesmond Road. The distribution of incomes on West Road is skewed towards lower values, whilst those on Jesmond Road are more symmetric. The graph clearly shows that income in the Jesmond Road area is higher than that in the West Road area.

14 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA Cumulative Frequency Polygons (Ogive) Cumulative percentage relative frequency is also a useful tool. The cumulative percentage relative frequency is simply the sum of the percentage relative frequencies at the end of each class interval (i.e. we add the frequencies up as we go along). Consider the example from the previous section: Class Interval % Relative Frequency Cumulative % Relative Frequency 0 x < x < x < x < x < At the upper limit of the first class the cumulative % relative frequency is simply the % relative frequency in the first class, i.e. 10. However, at the end of the second class, at20, the cumulative % relative frequency is = 30. The cumulative % relative frequency at the end of the last class must be100. The corresponding graph, or ogive, is simple to produce by hand: 1. Draw the axes. 2. Label thex-axis with the full range of the data and they-axis from 0 to 100%. 3. Plot the cumulative % relative frequency at the end point of each class. 4. Join adjacent points, starting at 0% at the lowest class boundary. 100 Ogive Cumulative relative frequency (%) x

15 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA 26 For example,minitab was used to produce the ogive below for the income data from the West Road survey: This graph instantly tells you many things. To see what percentage of respondents earn less than x per week: 1. Find x on thex-axis and draw a line up from this value until you reach the ogive; 2. From this point trace across to they-axis; 3. Read the percentage from they-axis. If we wanted to know what percentage of respondents in the survey in West Road earn less than 250 per week, we simply find 250 on the x-axis, trace up to the ogive and then trace across to the y-axis and we can read a figure of about 47%. The process obviously works in reverse. If we wanted to know what level of income 50% of respondents earned, we would trace across from 50% to the ogive and then down to thex-axis and read a value of about 300.

16 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA 27 Ogives can also be used for comparison purposes. The following plot contains the ogives for the income data at both the West Road and Jesmond Road sites. It clearly shows the ogive for Jesmond Road is shifted to the right of that for West Road. This tells us that the surveyed incomes are higher on Jesmond Road. We can compare the percentages of people earning different income levels between the two sites quickly and easily. This technique can also be used to great effect for examining the changes before and after the introduction of a marketing strategy. For example, daily sales figures of a product for a period before and after an advertising campaign might be plotted. Here, a comparison of the two ogives can be used to help assess whether or not the campaign has been successful.

17 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA Other graphical summaries Multiple Bar Charts The data below gives the daily sales of CDs (in ) by music type for an independent retailer. Day Chart Dance Rest Total Monday Tuesday Wednesday Thursday Friday Saturday Sunday Total Bar charts could be drawn of total sales per music type in the week, or of total daily sales. It might be interesting to see daily sales broken down into music types. This can be done in a similar manner to the bar charts produced previously. The only difference is that the height of the bars is dictated by the total daily sales, and each bar has segments representing each music type. TheMinitab worksheet and chart this produces are shown below: These types of charts are particularly good for presenting such financial information or illustrating any breakdown of data over time for example, the number of new cars sold by month and model.

18 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA Pie charts Pie charts are simple diagrams for displaying categorical or grouped data. These charts are commonly used within industry to communicate simple ideas, for example market share. They are used to show the proportions of a whole. They are best used when there are only a handful of categories to display. A pie chart consists of a circle divided into segments, one segment for each category. The size of each segment is determined by the frequency of the category and measured by the angle of the segment. As the total number of degrees in a circle is 360, the angle given to a segment is360 times the fraction of the data in the category, that is angle = Number in category Total number in sample (n) 360. Consider the data on newspaper sales to 650 students. Paper Frequency Degrees The Times The Sun The Sport The Guardian The Financial Times The Mirror The Daily Mail The Independent Totals The pie chart is constructed by first drawing a circle and then dividing it up into segments whose angles are calculated using this formula. Minitab was used to produce the following pie chart:

19 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA 30 It shows that The Sun, The Times and The Guardian are the most popular papers. Note that the pie chart is a simple circle. Some computer software will draw perspective pie charts, pie charts with slices detached etc. It is best to avoid such gimmicks which merely obscure the information contained in the chart Scatter Plots Scatter plots are used to plot two variables which you believe might be related, for example, height and weight, advertising expenditure and sales, or age of machinery and maintenance costs. Consider the following data for monthly output and total costs at a factory. Total costs ( ) Monthly Output If you were interested in the relationship between the cost of production and the number of units produced you could easily plot this by hand. 1. The response variable is placed on the y-axis. Here we are trying to understand how total costs relate to monthly output and so the response variable is total costs. 2. The variable that is used to try to explain the response variable (here, monthly output) is placed on thex-axis. 3. Plot the pairs of points on the graph. This graph can be produced using Minitab (Select Graph then Scatterplot then Simple and insert the required variables).

20 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA 31 The plot highlights a clear relationship between the two variables: the more units made, the more it costs in total. This relationship is shown on the graph by the upwards trend within the data as monthly output increases so too does total cost. A downwards sloping trend would imply that as output increased, total costs declined, an unlikely scenario. This type of plot is the first stage of a more sophisticated analysis which we will develop in Semester 2 of this course.

21 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA Time Series Plots So far we have only considered data where we can (at least for some purposes) ignore the order in which the data come. Not all data are like this. One exception is the case of time series data, that is, data collected over time. Examples include monthly sales of a product, the price of a share at the end of each day or the air temperature at midday each day. Such data can be plotted by using a scatter plot, but with time as the (horizontal) x-axis, and where the points are connected by lines. Consider the following data (overleaf) on the number of computers sold (in thousands) by quarter (January-March, April-June, July-September, October-December) at a large warehouse outlet. Quarter Units Sold Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q Q By hand, a time series plot is constructed as follows: 1. Draw thex-axis and label over the time scale. 2. Draw they-axis and label with an appropriate scale. 3. Plot each point according to time and value. 4. Draw lines connecting all points.

22 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA 33 The time series plot is: The plot clearly shows us two things: firstly, that there is an upwards trend to the data, and secondly that there is some regular variation around this trend. We will come back to more sophisticated techniques for analysing time series data later in the course.

23 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA Exercises 1. The following table shows the weight (in kilograms) of 50 sacks of potatoes leaving a farm shop Display these data in a stem and leaf plot. Note the number of decimal places and adjust accordingly. State clearly both the stem and the leaf units. 2. A market researcher asked 650 students what their favourite daily newspaper was. The results are summarised in the frequency table below. Represent these data in an appropriate graphical manner. The Times 140 The Sun 200 The Sport 50 The Guardian 120 The Financial Times 20 The Mirror 80 The Daily Mail 10 The Independent Produce a histogram for the data on length of mobile phone calls from the exercises in Chapter 1 (listed again below) and comment on it

24 CHAPTER 2. GRAPHICAL METHODS FOR PRESENTING DATA Consider the following data for daily sales at a small record shop, before and after a local radio advertising campaign. Daily Sales Before After 1000 sales < sales < sales < sales < sales < sales < sales < sales < sales < Totals (a) Calculate the percentage relative frequency for before and after. (b) Calculate the cumulative percentage relative frequency for before and after. (c) Plot the relative frequency polygons for both on one graph and comment. (d) Plot the ogives for both on one graph. (e) Find the level of sales, before and after, that are reached on 25%, 50% and 75% of days.

### Summarizing and Displaying Categorical Data

Summarizing and Displaying Categorical Data Categorical data can be summarized in a frequency distribution which counts the number of cases, or frequency, that fall into each category, or a relative frequency

### Diagrams and Graphs of Statistical Data

Diagrams and Graphs of Statistical Data One of the most effective and interesting alternative way in which a statistical data may be presented is through diagrams and graphs. There are several ways in

### Statistics Revision Sheet Question 6 of Paper 2

Statistics Revision Sheet Question 6 of Paper The Statistics question is concerned mainly with the following terms. The Mean and the Median and are two ways of measuring the average. sumof values no. of

### Descriptive statistics Statistical inference statistical inference, statistical induction and inferential statistics

Descriptive statistics is the discipline of quantitatively describing the main features of a collection of data. Descriptive statistics are distinguished from inferential statistics (or inductive statistics),

### Module 2: Introduction to Quantitative Data Analysis

Module 2: Introduction to Quantitative Data Analysis Contents Antony Fielding 1 University of Birmingham & Centre for Multilevel Modelling Rebecca Pillinger Centre for Multilevel Modelling Introduction...

### Visualizing Data. Contents. 1 Visualizing Data. Anthony Tanbakuchi Department of Mathematics Pima Community College. Introductory Statistics Lectures

Introductory Statistics Lectures Visualizing Data Descriptive Statistics I Department of Mathematics Pima Community College Redistribution of this material is prohibited without written permission of the

### Exploratory data analysis (Chapter 2) Fall 2011

Exploratory data analysis (Chapter 2) Fall 2011 Data Examples Example 1: Survey Data 1 Data collected from a Stat 371 class in Fall 2005 2 They answered questions about their: gender, major, year in school,

### AP * Statistics Review. Descriptive Statistics

AP * Statistics Review Descriptive Statistics Teacher Packet Advanced Placement and AP are registered trademark of the College Entrance Examination Board. The College Board was not involved in the production

### Appendix 2.1 Tabular and Graphical Methods Using Excel

Appendix 2.1 Tabular and Graphical Methods Using Excel 1 Appendix 2.1 Tabular and Graphical Methods Using Excel The instructions in this section begin by describing the entry of data into an Excel spreadsheet.

### Common Tools for Displaying and Communicating Data for Process Improvement

Common Tools for Displaying and Communicating Data for Process Improvement Packet includes: Tool Use Page # Box and Whisker Plot Check Sheet Control Chart Histogram Pareto Diagram Run Chart Scatter Plot

### Drawing a histogram using Excel

Drawing a histogram using Excel STEP 1: Examine the data to decide how many class intervals you need and what the class boundaries should be. (In an assignment you may be told what class boundaries to

### Exercise 1.12 (Pg. 22-23)

Individuals: The objects that are described by a set of data. They may be people, animals, things, etc. (Also referred to as Cases or Records) Variables: The characteristics recorded about each individual.

### Darton College Online Math Center Statistics. Chapter 2: Frequency Distributions and Graphs. Presenting frequency distributions as graphs

Chapter : Frequency Distributions and Graphs 1 Presenting frequency distributions as graphs In a statistical study, researchers gather data that describe the particular variable under study. To present

### Formulas, Functions and Charts

Formulas, Functions and Charts :: 167 8 Formulas, Functions and Charts 8.1 INTRODUCTION In this leson you can enter formula and functions and perform mathematical calcualtions. You will also be able to

### Week 1. Exploratory Data Analysis

Week 1 Exploratory Data Analysis Practicalities This course ST903 has students from both the MSc in Financial Mathematics and the MSc in Statistics. Two lectures and one seminar/tutorial per week. Exam

### The Big Picture. Describing Data: Categorical and Quantitative Variables Population. Descriptive Statistics. Community Coalitions (n = 175)

Describing Data: Categorical and Quantitative Variables Population The Big Picture Sampling Statistical Inference Sample Exploratory Data Analysis Descriptive Statistics In order to make sense of data,

### Data exploration with Microsoft Excel: univariate analysis

Data exploration with Microsoft Excel: univariate analysis Contents 1 Introduction... 1 2 Exploring a variable s frequency distribution... 2 3 Calculating measures of central tendency... 16 4 Calculating

### Statistics Chapter 2

Statistics Chapter 2 Frequency Tables A frequency table organizes quantitative data. partitions data into classes (intervals). shows how many data values are in each class. Test Score Number of Students

### Exploratory Data Analysis

Exploratory Data Analysis Johannes Schauer johannes.schauer@tugraz.at Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Introduction

### SECTION 2-1: OVERVIEW SECTION 2-2: FREQUENCY DISTRIBUTIONS

SECTION 2-1: OVERVIEW Chapter 2 Describing, Exploring and Comparing Data 19 In this chapter, we will use the capabilities of Excel to help us look more carefully at sets of data. We can do this by re-organizing

### STATS8: Introduction to Biostatistics. Data Exploration. Babak Shahbaba Department of Statistics, UCI

STATS8: Introduction to Biostatistics Data Exploration Babak Shahbaba Department of Statistics, UCI Introduction After clearly defining the scientific problem, selecting a set of representative members

### Descriptive Statistics

Descriptive Statistics Descriptive statistics consist of methods for organizing and summarizing data. It includes the construction of graphs, charts and tables, as well various descriptive measures such

### Variables. Exploratory Data Analysis

Exploratory Data Analysis Exploratory Data Analysis involves both graphical displays of data and numerical summaries of data. A common situation is for a data set to be represented as a matrix. There is

### Section 1.1 Exercises (Solutions)

Section 1.1 Exercises (Solutions) HW: 1.14, 1.16, 1.19, 1.21, 1.24, 1.25*, 1.31*, 1.33, 1.34, 1.35, 1.38*, 1.39, 1.41* 1.14 Employee application data. The personnel department keeps records on all employees

### A Picture Really Is Worth a Thousand Words

4 A Picture Really Is Worth a Thousand Words Difficulty Scale (pretty easy, but not a cinch) What you ll learn about in this chapter Why a picture is really worth a thousand words How to create a histogram

### Good luck! BUSINESS STATISTICS FINAL EXAM INSTRUCTIONS. Name:

Glo bal Leadership M BA BUSINESS STATISTICS FINAL EXAM Name: INSTRUCTIONS 1. Do not open this exam until instructed to do so. 2. Be sure to fill in your name before starting the exam. 3. You have two hours

### Chapter 2 Statistical Foundations: Descriptive Statistics

Chapter 2 Statistical Foundations: Descriptive Statistics 20 Chapter 2 Statistical Foundations: Descriptive Statistics Presented in this chapter is a discussion of the types of data and the use of frequency

### Data Analysis Tools. Tools for Summarizing Data

Data Analysis Tools This section of the notes is meant to introduce you to many of the tools that are provided by Excel under the Tools/Data Analysis menu item. If your computer does not have that tool

### ITS Training Class Charts and PivotTables Using Excel 2007

When you have a large amount of data and you need to get summary information and graph it, the PivotTable and PivotChart tools in Microsoft Excel will be the answer. The data does not need to be in one

### Basic Tools for Process Improvement

What is a Histogram? A Histogram is a vertical bar chart that depicts the distribution of a set of data. Unlike Run Charts or Control Charts, which are discussed in other modules, a Histogram does not

### 2. Simple Linear Regression

Research methods - II 3 2. Simple Linear Regression Simple linear regression is a technique in parametric statistics that is commonly used for analyzing mean response of a variable Y which changes according

### STAB22 section 1.1. total = 88(200/100) + 85(200/100) + 77(300/100) + 90(200/100) + 80(100/100) = 176 + 170 + 231 + 180 + 80 = 837,

STAB22 section 1.1 1.1 Find the student with ID 104, who is in row 5. For this student, Exam1 is 95, Exam2 is 98, and Final is 96, reading along the row. 1.2 This one involves a careful reading of the

### Engineering Problem Solving and Excel. EGN 1006 Introduction to Engineering

Engineering Problem Solving and Excel EGN 1006 Introduction to Engineering Mathematical Solution Procedures Commonly Used in Engineering Analysis Data Analysis Techniques (Statistics) Curve Fitting techniques

### DESCRIPTIVE STATISTICS. The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses.

DESCRIPTIVE STATISTICS The purpose of statistics is to condense raw data to make it easier to answer specific questions; test hypotheses. DESCRIPTIVE VS. INFERENTIAL STATISTICS Descriptive To organize,

### AMS 7L LAB #2 Spring, 2009. Exploratory Data Analysis

AMS 7L LAB #2 Spring, 2009 Exploratory Data Analysis Name: Lab Section: Instructions: The TAs/lab assistants are available to help you if you have any questions about this lab exercise. If you have any

### Visualization Quick Guide

Visualization Quick Guide A best practice guide to help you find the right visualization for your data WHAT IS DOMO? Domo is a new form of business intelligence (BI) unlike anything before an executive

### Instruction Manual for SPC for MS Excel V3.0

Frequency Business Process Improvement 281-304-9504 20314 Lakeland Falls www.spcforexcel.com Cypress, TX 77433 Instruction Manual for SPC for MS Excel V3.0 35 30 25 LSL=60 Nominal=70 Capability Analysis

T O P I C 1 2 Techniques and tools for data analysis Preview Introduction In chapter 3 of Statistics In A Day different combinations of numbers and types of variables are presented. We go through these

### 9 RUN CHART RUN CHART

Module 9 RUN CHART RUN CHART 1 What is a Run Chart? A Run Chart is the most basic tool used to display how a process performs over time. It is a line graph of data points plotted in chronological order

### Costing and Break-Even Analysis

W J E C B U S I N E S S S T U D I E S A L E V E L R E S O U R C E S. 28 Spec. Issue 2 Sept 212 Page 1 Costing and Break-Even Analysis Specification Requirements- Classify costs: fixed, variable and semi-variable.

### Statistics and Probability

Statistics and Probability TABLE OF CONTENTS 1 Posing Questions and Gathering Data. 2 2 Representing Data. 7 3 Interpreting and Evaluating Data 13 4 Exploring Probability..17 5 Games of Chance 20 6 Ideas

### seven Statistical Analysis with Excel chapter OVERVIEW CHAPTER

seven Statistical Analysis with Excel CHAPTER chapter OVERVIEW 7.1 Introduction 7.2 Understanding Data 7.3 Relationships in Data 7.4 Distributions 7.5 Summary 7.6 Exercises 147 148 CHAPTER 7 Statistical

### VisualCalc AdWords Dashboard Indicator Whitepaper Rev 3.2

VisualCalc AdWords Dashboard Indicator Whitepaper Rev 3.2 873 Embarcadero Drive, Suite 3 El Dorado Hills, California 95762 916.939.2020 www.visualcalc.com Introduction The VisualCalc AdWords Dashboard

### Geometry Notes PERIMETER AND AREA

Perimeter and Area Page 1 of 57 PERIMETER AND AREA Objectives: After completing this section, you should be able to do the following: Calculate the area of given geometric figures. Calculate the perimeter

### Demographics of Atlanta, Georgia:

Demographics of Atlanta, Georgia: A Visual Analysis of the 2000 and 2010 Census Data 36-315 Final Project Rachel Cohen, Kathryn McKeough, Minnar Xie & David Zimmerman Ethnicities of Atlanta Figure 1: From

### CHAPTER 14 STATISTICS. 14.1 Introduction

238 MATHEMATICS STATISTICS CHAPTER 14 14.1 Introduction Everyday we come across a lot of information in the form of facts, numerical figures, tables, graphs, etc. These are provided by newspapers, televisions,

### DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS

DESCRIPTIVE STATISTICS AND EXPLORATORY DATA ANALYSIS SEEMA JAGGI Indian Agricultural Statistics Research Institute Library Avenue, New Delhi - 110 012 seema@iasri.res.in 1. Descriptive Statistics Statistics

### Premaster Statistics Tutorial 4 Full solutions

Premaster Statistics Tutorial 4 Full solutions Regression analysis Q1 (based on Doane & Seward, 4/E, 12.7) a. Interpret the slope of the fitted regression = 125,000 + 150. b. What is the prediction for

### SPSS Manual for Introductory Applied Statistics: A Variable Approach

SPSS Manual for Introductory Applied Statistics: A Variable Approach John Gabrosek Department of Statistics Grand Valley State University Allendale, MI USA August 2013 2 Copyright 2013 John Gabrosek. All

### Descriptive Statistics

Descriptive Statistics Suppose following data have been collected (heights of 99 five-year-old boys) 117.9 11.2 112.9 115.9 18. 14.6 17.1 117.9 111.8 16.3 111. 1.4 112.1 19.2 11. 15.4 99.4 11.1 13.3 16.9

### Describing and presenting data

Describing and presenting data All epidemiological studies involve the collection of data on the exposures and outcomes of interest. In a well planned study, the raw observations that constitute the data

### 1 of 7 9/5/2009 6:12 PM

1 of 7 9/5/2009 6:12 PM Chapter 2 Homework Due: 9:00am on Tuesday, September 8, 2009 Note: To understand how points are awarded, read your instructor's Grading Policy. [Return to Standard Assignment View]

### DesCartes (Combined) Subject: Mathematics Goal: Statistics and Probability

DesCartes (Combined) Subject: Mathematics Goal: Statistics and Probability RIT Score Range: Below 171 Below 171 Data Analysis and Statistics Solves simple problems based on data from tables* Compares

### BUSINESS OCR LEVEL 2 CAMBRIDGE TECHNICAL. Cambridge TECHNICALS FINANCIAL FORECASTING FOR BUSINESS CERTIFICATE/DIPLOMA IN K/502/5252 LEVEL 2 UNIT 3

Cambridge TECHNICALS OCR LEVEL 2 CAMBRIDGE TECHNICAL CERTIFICATE/DIPLOMA IN BUSINESS FINANCIAL FORECASTING FOR BUSINESS K/502/5252 LEVEL 2 UNIT 3 GUIDED LEARNING HOURS: 30 UNIT CREDIT VALUE: 5 FINANCIAL

### CHAPTER THREE. Key Concepts

CHAPTER THREE Key Concepts interval, ordinal, and nominal scale quantitative, qualitative continuous data, categorical or discrete data table, frequency distribution histogram, bar graph, frequency polygon,

### Interpreting Data in Normal Distributions

Interpreting Data in Normal Distributions This curve is kind of a big deal. It shows the distribution of a set of test scores, the results of rolling a die a million times, the heights of people on Earth,

### Using Excel for descriptive statistics

FACT SHEET Using Excel for descriptive statistics Introduction Biologists no longer routinely plot graphs by hand or rely on calculators to carry out difficult and tedious statistical calculations. These

### This file contains 2 years of our interlibrary loan transactions downloaded from ILLiad. 70,000+ rows, multiple fields = an ideal file for pivot

Presented at the Southeastern Library Assessment Conference, October 22, 2013 1 2 3 This file contains 2 years of our interlibrary loan transactions downloaded from ILLiad. 70,000+ rows, multiple fields

### Trigonometric functions and sound

Trigonometric functions and sound The sounds we hear are caused by vibrations that send pressure waves through the air. Our ears respond to these pressure waves and signal the brain about their amplitude

### How To: Analyse & Present Data

INTRODUCTION The aim of this How To guide is to provide advice on how to analyse your data and how to present it. If you require any help with your data analysis please discuss with your divisional Clinical

### Descriptive Statistics Practice Problems (Total 6 marks) (Total 8 marks) (Total 8 marks) (Total 8 marks) (1)

Descriptive Statistics Practice Problems 1. The age in months at which a child first starts to walk is observed for a random group of children from a town in Brazil. The results are 14.3, 11.6, 12.2, 14.,

### Northumberland Knowledge

Northumberland Knowledge Know Guide How to Analyse Data - November 2012 - This page has been left blank 2 About this guide The Know Guides are a suite of documents that provide useful information about

### Using SPSS, Chapter 2: Descriptive Statistics

1 Using SPSS, Chapter 2: Descriptive Statistics Chapters 2.1 & 2.2 Descriptive Statistics 2 Mean, Standard Deviation, Variance, Range, Minimum, Maximum 2 Mean, Median, Mode, Standard Deviation, Variance,

### Common Core Unit Summary Grades 6 to 8

Common Core Unit Summary Grades 6 to 8 Grade 8: Unit 1: Congruence and Similarity- 8G1-8G5 rotations reflections and translations,( RRT=congruence) understand congruence of 2 d figures after RRT Dilations

### BNG 202 Biomechanics Lab. Descriptive statistics and probability distributions I

BNG 202 Biomechanics Lab Descriptive statistics and probability distributions I Overview The overall goal of this short course in statistics is to provide an introduction to descriptive and inferential

### Data Exploration Data Visualization

Data Exploration Data Visualization What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include Helping to select

### DATA INTERPRETATION AND STATISTICS

PholC60 September 001 DATA INTERPRETATION AND STATISTICS Books A easy and systematic introductory text is Essentials of Medical Statistics by Betty Kirkwood, published by Blackwell at about 14. DESCRIPTIVE

### If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C?

Problem 3 If A is divided by B the result is 2/3. If B is divided by C the result is 4/7. What is the result if A is divided by C? Suggested Questions to ask students about Problem 3 The key to this question

### STATGRAPHICS Online. Statistical Analysis and Data Visualization System. Revised 6/21/2012. Copyright 2012 by StatPoint Technologies, Inc.

STATGRAPHICS Online Statistical Analysis and Data Visualization System Revised 6/21/2012 Copyright 2012 by StatPoint Technologies, Inc. All rights reserved. Table of Contents Introduction... 1 Chapter

### Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1 Review 1. As part of survey of college students a researcher is interested in the variable class standing. She records a 1 if the student is a freshman, a 2 if the student

### 4. Using Diagrams to Present Data 1

1 4. Using Diagrams to Present Data 1 CHAPTER OUTLINE In the last chapter we saw how data could be collected. Now we are going to show how these data can be summarized and presented to an audience. There

### NASA Explorer Schools Pre-Algebra Unit Lesson 2 Student Workbook. Solar System Math. Comparing Mass, Gravity, Composition, & Density

National Aeronautics and Space Administration NASA Explorer Schools Pre-Algebra Unit Lesson 2 Student Workbook Solar System Math Comparing Mass, Gravity, Composition, & Density What interval of values

### Symbolizing your data

Symbolizing your data 6 IN THIS CHAPTER A map gallery Drawing all features with one symbol Drawing features to show categories like names or types Managing categories Ways to map quantitative data Standard

### Chapter 9 Descriptive Statistics for Bivariate Data

9.1 Introduction 215 Chapter 9 Descriptive Statistics for Bivariate Data 9.1 Introduction We discussed univariate data description (methods used to eplore the distribution of the values of a single variable)

### Unit 7 Quadratic Relations of the Form y = ax 2 + bx + c

Unit 7 Quadratic Relations of the Form y = ax 2 + bx + c Lesson Outline BIG PICTURE Students will: manipulate algebraic expressions, as needed to understand quadratic relations; identify characteristics

### PART 1 CROSSING EMA. I will pause between the two parts for questions, but if I am not clear enough at any stage please feel free to interrupt.

PART 1 CROSSING EMA Good evening everybody, I d like to start by introducing myself. My name is Andrew Gebhardt and I am part of Finex LLP, a small Investment Manager which amongst other endeavours also

### Data Visualization Techniques

Data Visualization Techniques From Basics to Big Data with SAS Visual Analytics WHITE PAPER SAS White Paper Table of Contents Introduction.... 1 Generating the Best Visualizations for Your Data... 2 The

### Mathematics. Probability and Statistics Curriculum Guide. Revised 2010

Mathematics Probability and Statistics Curriculum Guide Revised 2010 This page is intentionally left blank. Introduction The Mathematics Curriculum Guide serves as a guide for teachers when planning instruction

### MAS131: Introduction to Probability and Statistics Semester 1: Introduction to Probability Lecturer: Dr D J Wilkinson

MAS131: Introduction to Probability and Statistics Semester 1: Introduction to Probability Lecturer: Dr D J Wilkinson Statistics is concerned with making inferences about the way the world is, based upon

### Introducing the. Tools for. Continuous Improvement

Introducing the Tools for Continuous Improvement The Concept In today s highly competitive business environment it has become a truism that only the fittest survive. Organisations invest in many different

### Web Supplement to Chapter 2

Web upplement to Chapter 2 UPPLY AN EMAN: TAXE 21 Taxes upply and demand analysis is a very useful tool for analyzing the effects of various taxes In this Web supplement, we consider a constant tax per

### Describing Data: Frequency Distributions and Graphic Presentation

Chapter 2 Describing Data: Frequency Distributions and Graphic Presentation GOALS When you have completed this chapter, you will be able to: Organize raw data into a frequency distribution Produce a histogram,

### IBM SPSS Direct Marketing 23

IBM SPSS Direct Marketing 23 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 23, release

### Time Series Forecasting Techniques

03-Mentzer (Sales).qxd 11/2/2004 11:33 AM Page 73 3 Time Series Forecasting Techniques Back in the 1970s, we were working with a company in the major home appliance industry. In an interview, the person

### Descriptive Statistics

CHAPTER Descriptive Statistics.1 Distributions and Their Graphs. More Graphs and Displays.3 Measures of Central Tendency. Measures of Variation Case Study. Measures of Position Uses and Abuses Real Statistics

### MATH 140 Lab 4: Probability and the Standard Normal Distribution

MATH 140 Lab 4: Probability and the Standard Normal Distribution Problem 1. Flipping a Coin Problem In this problem, we want to simualte the process of flipping a fair coin 1000 times. Note that the outcomes

### Box-and-Whisker Plots

Mathematics Box-and-Whisker Plots About this Lesson This is a foundational lesson for box-and-whisker plots (boxplots), a graphical tool used throughout statistics for displaying data. During the lesson,

### MINITAB ASSISTANT WHITE PAPER

MINITAB ASSISTANT WHITE PAPER This paper explains the research conducted by Minitab statisticians to develop the methods and data checks used in the Assistant in Minitab 17 Statistical Software. One-Way

### Consolidation of Grade 3 EQAO Questions Data Management & Probability

Consolidation of Grade 3 EQAO Questions Data Management & Probability Compiled by Devika William-Yu (SE2 Math Coach) GRADE THREE EQAO QUESTIONS: Data Management and Probability Overall Expectations DV1

### IBM SPSS Direct Marketing 22

IBM SPSS Direct Marketing 22 Note Before using this information and the product it supports, read the information in Notices on page 25. Product Information This edition applies to version 22, release

### Data Analysis. Using Excel. Jeffrey L. Rummel. BBA Seminar. Data in Excel. Excel Calculations of Descriptive Statistics. Single Variable Graphs

Using Excel Jeffrey L. Rummel Emory University Goizueta Business School BBA Seminar Jeffrey L. Rummel BBA Seminar 1 / 54 Excel Calculations of Descriptive Statistics Single Variable Graphs Relationships

### TIPS FOR DOING STATISTICS IN EXCEL

TIPS FOR DOING STATISTICS IN EXCEL Before you begin, make sure that you have the DATA ANALYSIS pack running on your machine. It comes with Excel. Here s how to check if you have it, and what to do if you

### CSU, Fresno - Institutional Research, Assessment and Planning - Dmitri Rogulkin

My presentation is about data visualization. How to use visual graphs and charts in order to explore data, discover meaning and report findings. The goal is to show that visual displays can be very effective

Visualizations 101 Table of Contents Find the story within your data Introduction 2 Types of Visualizations 3 Static vs. Animated Charts 6 Drilldowns and Drillthroughs 6 About Logi Analytics 7 1 For centuries,

### Spotfire v6 New Features. TIBCO Spotfire Delta Training Jumpstart

Spotfire v6 New Features TIBCO Spotfire Delta Training Jumpstart Map charts New map chart Layers control Navigation control Interaction mode control Scale Web map Creating a map chart Layers are added

### Exercise 1: How to Record and Present Your Data Graphically Using Excel Dr. Chris Paradise, edited by Steven J. Price

Biology 1 Exercise 1: How to Record and Present Your Data Graphically Using Excel Dr. Chris Paradise, edited by Steven J. Price Introduction In this world of high technology and information overload scientists

### Intro to Excel spreadsheets

Intro to Excel spreadsheets What are the objectives of this document? The objectives of document are: 1. Familiarize you with what a spreadsheet is, how it works, and what its capabilities are; 2. Using

### CHAPTER 1: SPREADSHEET BASICS. AMZN Stock Prices Date Price 2003 54.43 2004 34.13 2005 39.86 2006 38.09 2007 89.15 2008 69.58

1. Suppose that at the beginning of October 2003 you purchased shares in Amazon.com (NASDAQ: AMZN). It is now five years later and you decide to evaluate your holdings to see if you have done well with