Getting started in Excel

Disclaimer: This guide is not complete. It is rather a chronicle of my attempts to start using Excel for data analysis. As I use a Mac with OS X, these directions may need to be varied for your situation.

Installing the Data Analysis Tool Pack

Under the Tools menu, use the Add-Ins option to install the Data Analysis tool pack. You may have more than one option for such a tool pack. Just install all of them. After installing, you should see a Data Analysis menu option as in the picture to the left.

Entering Data by hand

After opening a new workbook using the File menu, you can start entering data. By double clicking at the top of the document you can enter a title for the document. In this picture, I've started entering the data from Table 1.1 of the book. Remember to save your work often.
Entering data automatically

Data is contained on both the CD that came with your book and on the publisher's website: http://bcs.whfreeman.com/bps3e/ Click on the data sets link. You can then download and decompress the data set for your particular computer. It is then a simple matter to open it in Excel.

Entering formulae

At this point, I've entered all of the data and have decided I'd like to calculate the mean (average). To do this, I went to an empty cell (D10) and entered a label. I then went to the cell where I want the average to appear (E10), clicked, and started typing the formula. Notice the window at the top of the screen where the formula

=AVERAGE($B$2:$B$52)

appears? That's what I've typed in cell E10. The = sign tells Excel that I'm entering a formula that it will need to calculate. The word AVERAGE is a command that tells Excel to calculate the mean. You can learn about this under the Help menu. The symbols $B$2:$B$52 tell Excel what data it is supposed to average. The dollar signs make the reference absolute: if I copy this formula to a different cell, Excel will not change the range of data being averaged.
As another example, consider this data: In cell B10 I entered the formula

=AVERAGE(B5:B8)

As you can see, the average appears. Note that I did not use $ signs around the cell locations. Now consider what happens if I copy and paste that formula directly into cell C10: As you can see, the correct average for Exam 2 appears in cell C10. Because I didn't use $ signs, Excel automatically updated the cell locations to refer to the cells that sit in the same position relative to C10 as the original cells did relative to B10. There are many other formulae that are useful; learn to look for them in Excel's Help.
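The copy-and-paste behavior described above can be mimicked outside Excel. The following is only a rough sketch in Python, not how Excel itself is implemented: shift_formula and its two helpers are hypothetical names made up for this illustration, and the pattern matching is deliberately simple (it would also trip over function names that end in digits, such as LOG10).

```python
import re

def column_to_number(letters):
    """Convert a column label such as 'B' or 'AA' to a 1-based number."""
    number = 0
    for ch in letters:
        number = number * 26 + (ord(ch) - ord("A") + 1)
    return number

def number_to_column(number):
    """Convert a 1-based column number back to a label such as 'B'."""
    letters = ""
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters

# One cell reference: optional $, column letters, optional $, row digits.
CELL = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")

def shift_formula(formula, column_offset, row_offset):
    """Hypothetical helper: rewrite the references in a formula as if it had
    been copied column_offset columns right and row_offset rows down."""
    def shift(match):
        col_lock, col, row_lock, row = match.groups()
        if not col_lock:  # no $ before the column, so it moves with the copy
            col = number_to_column(column_to_number(col) + column_offset)
        if not row_lock:  # no $ before the row, so it moves with the copy
            row = str(int(row) + row_offset)
        return f"{col_lock}{col}{row_lock}{row}"
    return CELL.sub(shift, formula)

# Copying from B10 to C10 is one column to the right, zero rows down.
print(shift_formula("=AVERAGE(B5:B8)", 1, 0))       # =AVERAGE(C5:C8)
print(shift_formula("=AVERAGE($B$2:$B$52)", 1, 0))  # =AVERAGE($B$2:$B$52)
```

The first call shows the relative reference following the formula to the Exam 2 column; the second shows that the $ signs pin the range in place.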
A first attempt at a histogram

Select Data Analysis under the Tools menu. The Data Analysis window should appear. Select Histogram. In the Histogram window, in the Input Range box, enter the range of cells containing the data. Checking the New Worksheet Ply button makes the histogram appear in the same document as your data, but on a different sheet. Checking the Chart Output button makes Excel draw a chart representing the histogram.
Here is a picture of the output. I had to move and resize the bar graph in order to make it presentable. The table at the top represents the different classes that Excel used to draw the histogram. We'd like to have different class sizes and also arrange the bar graph to look more like the ones in our text.

A second try at a histogram

To change the class sizes, you can directly edit the column of class sizes labelled bin. I decided to use bins of size 5. Here's what it looks like after I changed it:
Another way of achieving this goal is to list the boundary values for the classes in the document, then enter the range of those cells in the Bin Range box of the Histogram window.
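To check what a set of bin boundaries will do before drawing anything, the counting can be sketched in Python. This assumes that each boundary is the inclusive upper edge of its class and that values above the last boundary fall into a final "More" class, which is my understanding of the Histogram tool rather than something this guide states. The function name histogram_counts and the scores are made up for illustration.

```python
from bisect import bisect_left

def histogram_counts(data, bin_bounds):
    """Count values per class, treating each bound as an inclusive upper edge.
    The last count is for values above the largest bound."""
    bounds = sorted(bin_bounds)
    counts = [0] * (len(bounds) + 1)
    for value in data:
        # bisect_left gives the index of the first bound that is >= value
        counts[bisect_left(bounds, value)] += 1
    return counts

scores = [3, 5, 7, 10, 11, 14, 15, 18, 22]  # made-up data
bounds = [5, 10, 15, 20]                    # bins of size 5

for label, count in zip(bounds + ["More"], histogram_counts(scores, bounds)):
    print(label, count)
```

With these made-up scores the counts come out as 2, 2, 3, 1 and 1, so a value equal to a boundary (such as 10) lands in the class that ends at that boundary.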
To change aspects of how the histogram is drawn, double click on various parts of the graph. In particular, be sure that the histogram has a useful title and that the axes are usefully labelled. It's also important to be sure that the bars of the histogram don't have any meaningless gaps. If you double click on one of the bars in the graph you will get a box where you can change the size of the gaps. Set the gap width to zero.

Histogram pointers

Histograms should be informative, not entertaining. Don't use wacky colors, shapes, or a 3rd dimension.

Excel will (almost) never produce a default histogram which is accurate. In particular, it chooses bin sizes poorly and puts meaningless gaps between bars of the graph.

Be sure that the histogram and its axes have useful and accurate labels.
Time plots

Begin by loading the data from Table 1.3 (average college tuition & fees). We want Excel to plot this data. Highlight the data you would like plotted and click on the chart wizard icon: Choose the XY (Scatter) plot and then pick the option Scatter with data points connected by smoothed Lines. The chart wizard then lets you change various options. Be sure that your axes are labelled and that your chart has a title. Again, by double clicking on various aspects of the chart, you can change such things as the scale and spacing. Here's the graph I drew:

[Figure: time plot titled "Average college tuition and fees". Vertical axis: Ave. Tuition and fees (real dollars), 0 to 18000. Horizontal axis: Academic year, 1970 to 2000. Series: Private, Public.]
The 5 Number Summary and Boxplots

Suppose you have data in cells A2:A17 and you want the 5 number summary. The formula to get the nth quartile Qn is

=QUARTILE($A$2:$A$17,n)

The possible values for n are 0, 1, 2, 3, or 4. These give you the minimum, 1st quartile, median, 3rd quartile, and maximum respectively.

You can also have Excel produce several descriptive statistics. Use the Descriptive Statistics option with the Data Analysis toolbox. Unfortunately, this does not include the first and third quartiles. You'll have to get those separately. See the next two pictures.
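As a cross-check on the 5 number summary, here is a small Python sketch of a quartile calculation. It assumes that QUARTILE interpolates linearly between the sorted data values, with n = 0 and n = 4 giving the minimum and maximum; that is my understanding of the function, not something taken from this guide, and your textbook may compute quartiles by a slightly different rule. The function name quartile and the 16 data values are made up for illustration.

```python
def quartile(data, n):
    """Sketch of an inclusive quartile with linear interpolation, n in 0..4."""
    if n not in (0, 1, 2, 3, 4):
        raise ValueError("n must be 0, 1, 2, 3, or 4")
    values = sorted(data)
    position = (len(values) - 1) * n / 4  # fractional index into the sorted data
    lower = int(position)
    fraction = position - lower
    if fraction == 0:
        return values[lower]
    # Move part of the way from one data value toward the next one
    return values[lower] + fraction * (values[lower + 1] - values[lower])

# 16 made-up values, standing in for cells A2:A17
data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 13, 15, 16, 18, 20, 21, 25]

summary = [quartile(data, n) for n in range(5)]
print(summary)  # [2, 6.5, 11.0, 16.5, 25]
```

If the numbers from a sketch like this and from Excel disagree with a hand calculation from the text, the most likely cause is a different quartile convention rather than a typing error.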