MAX 201 Spring 2008 Assignment #6: Charts & Graphs; Modifying Data Due at the beginning of class on March 18 th Introduction This assignment introduces the charting and graphing capabilities of SPSS and Excel. It also helps you learn how to modify data values using SPSS. All these skills will be useful to you later in the semester when you are writing your paper and preparing your presentation of your research results. [Note about missing values: normally you do NOT want to present NA, NO ANSWER or DK s (Don t Know). Make sure that you are consistent, so that your SPSS and Excel charts contain the same categories.] Excel Charts & Graphs Excel has extensive graphic capabilities, which can help you make strong visual presentations of data. You need to give the proper instructions to the Chart Wizard in order for the charts or graphs to contain the appropriate data. Making a Pie Chart of PADEG Open the GSS 2006.xls file you saved on your H drive in Assignment 5. On a new worksheet in GSS 2006.xls, create a univariate pivot table showing the numerical distribution ( counts ) of PADEG, the same way that you did in Assignment 5. Name the new worksheet Assign6. Note: Remember to save frequently! The GSS 2006.xls file is already about 7 megabytes, which is large enough to increase the chances that Excel will crash. You can prevent a lot of frustration and extra work by saving frequently. To make a pie chart, follow the above directions and: o Create a pivot table o Hide missing values (including Don t Know, Not Applicable, and No Answer ) o Supply value labels for the data consistent with the codebook Select the data and value labels (not the column headings or the grand total), copy and paste special on the same worksheet. Click on the arrow beneath the Paste button on the Home tab and select Paste Special. Choose Values and number formats under Paste, and None under Operations). To activate the Chart Wizard, highlight the data just pasted and click on the Insert tab and select Pie chart (the icon is a circle with blue, yellow, and red slices). Click on Pie in 3-D. Your pie chart should look like the chart for marital status below: 1
Example: Pie Chart of MARITAL Count of marital marital Total Married 2170 Married 2170 Widowed 366 Widowed 366 Divorced 732 Divorced 732 Separated 156 Separated 156 Never Married 1080 Never Married 1080 Grand Total 4504 To format your chart: o Click on the Layout tab. o Under Labels, click on Chart title\above Chart. Double click where it says Chart title and add your own informative title. o Under Legend, select Show Legend at Right o Under Data Labels, select More Data Label Options and make sure that Percentage and Show Leader Lines are checked Place the pie chart as an object in sheet Assign6, where your pivot table and pasted data is. You may have to move and resize the chart so that it is readable. You can play around with the format for the title and data labels by right clicking on whatever you want to re-format. Insert your name, assignment, and lab information at the top of the page. Preview the page before printing to be sure your name, the pivot table, the pasted data, and the chart all fit on one page. 2
Making Charts in SPSS; Modifying Data Values Before issuing commands to SPSS that will analyze data, it is sometimes desirable to modify data values that is, to combine values of the variable for presentation (note that this can be done in Excel also, but it is more complicated). This is a matter for the analyst s judgment for example if there are only a few cases in one category, it may make sense to combine those cases with those taking on the next value. Pay attention to the question and responses, too you may decide it makes sense to combine the responses strongly agree and agree, for example. See the example below. Example: HOMPOP In this example, we want to make a pie chart of the variable HOMPOP, which is the number of household members. The original frequency table appears below. A pie chart with too many pieces (i.e., more than 6 or 7) is difficult to interpret, so we want to combine some categories so that the chart will be more visually appealing and easier to read. We notice that very few people live in households with six or more members. (Categories 6, 7, 8, 9, 10 & 11 all have relatively small values, compared to the values for 1 through 5.) Therefore, we will combine categories 6 through 11 into one group, creating a new variable, HOMPOP2. Frequencies for this are shown in the table below, Recoded household size. Details of how to do this data transformation are in the next section. NUMBER OF PERSONS IN HOUSEHOLD Valid 1.00 2.00 3.00 4.00 5.00 6.00 7.00 8.00 9.00 10.00 11.00 Total Cumulative Frequency Percent Valid Percent Percent 1220 27.1 27.1 27.1 1563 34.7 34.7 61.7 716 15.9 15.9 77.6 590 13.1 13.1 90.7 267 5.9 5.9 96.6 100 2.2 2.2 98.8 37.8.8 99.6 11.2.2 99.9 3.1.1 99.9 2.0.0 100.0 1.0.0 100.0 4510 100.0 100.0 Valid 1.00 2.00 3.00 4.00 5.00 6-11 Total Recoded Household Size Cumulative Frequency Percent Valid Percent Percent 1220 27.1 27.1 27.1 1563 34.7 34.7 61.7 716 15.9 15.9 77.6 590 13.1 13.1 90.7 267 5.9 5.9 96.6 154 3.4 3.4 100.0 4510 100.0 100.0 3
Making a bar chart of BYPHONE in SPSS Similarly, you will modify or recode the variable TVHOURS and then make a chart of the frequencies of the values of this variable. Open the GSS analysis SPSS file you created on your H drive in Assignment 5. Create a Frequency table showing the numerical distribution ( counts ) and percentages of TVHOURS. Print the output and close it without saving. Since simpler is better when presenting data visually, you will want to group the categories. Use the above example (HOMPOP) as a guide for understanding grouping of categories in your distribution of TVHOURS. Decide which categories you want to combine for TVHOURS. Recode TVHOURS to TVHOURS2 by using the Transform>Recode commands. This will produce a Recode into Different Variables dialog box. Select TVHOURS and name the Output variable TVHOURS2. Click Change. Then click Old and New Values. In the Old and New Values dialog box, you must deal with each of the values (or value ranges) of the original variable and assign these to different (or the same) values for the new variable. Ranges mean that you could specify, for example, that values 6 through 10 in the old variable be assigned the value of 6 in the new variable. Be careful when you recode to include all the categories that you want in your new variable. For example if you want category 1 to be the same in both TVHOURS and TVHOURS2 you have to include it in the recode instructions. Any category you do not include, SPSS will assume is missing. Substitute informative labels for the numerical codes associated with the categories of TVHOURS. Go to View Variables and click on the bottom right-hand corner of the Values cell for TVHOURS2 (which will be at the end of the data list). Provide a label for each value. Run the frequency command on the variable TVHOURS2 and choose the chart option (by clicking Charts at the bottom, which takes you to the chart dialog). Choose Bar chart, and Percentages. [Note: this will give you the table and the chart. You could have gone directly to graphs on the toolbar, to produce only the chart.] Print the output (both the table and the chart). SPSS Charts Showing Relationships between Two Variables So far, we ve made charts showing the distribution of a single variable. Charts are also useful to show how two variables are related to one another (basically a visual presentation of crosstab tables). In SPSS, construct a crosstab table showing the relationship between FEPRESCH and SEX. Click Cells and ask for row percents. FEPRESCH is the dependent variable. Put it in the columns, and SEX in the rows. Open a new Word document, put your name, section, and Assignment 6 at the top, and paste the table into this document. Produce a clustered bar chart showing this relationship. On the menu, choose Graphs>Bar. In the dialog box choose clustered and summaries for groups of cases. 4
Click Define for the Define Clustered Bar dialog box. Put FEPRESCH, your dependent variable, in the category axis box and your independent variable, SEX in the define clusters by box. Bars should represent % of cases and not numbers of cases. Under Options be sure that you do not display groups defined by missing values. Paste the bar chart into your Word document. To copy your chart, double click on your chart in the SPSS output so that the Chart Editor opens. In the Edit Menu, select Copy Chart. Then paste your chart into Word. Finally construct a crosstab table and a bar chart showing the relationship between an independent nominal or ordinal variable and a dependent nominal or ordinal variable of your choice. Paste both this second table and the second chart into your Word document, save, and print. Using Excel to Produce Bivariate Charts This part of the assignment is just to illustrate how you can produce charts in either SPSS or Excel this produces a chart for the same two variables used in the section above. Open a new Excel worksheet. Type in the relevant percentage information from your second crosstab table above using the independent and dependent variable of your choice that you created in SPSS. Make sure that you are consistent across SPSS and Excel (if you eliminated DKs and NAs in SPSS, also do it in Excel). If I had produced a bivariate table of ABNOMORE and SEX, I would enter the data into an Excel spreadsheet like this: abnomore Sex Yes No Male 45% 55% Female 40% 60% Highlight everything except for the dependent variable name (ABNOMORE in this example). Click on the Insert tab, Charts\Column\2-D Column\Clustered Column to make a column chart. Use the Layout tab to give the chart the appropriate titles and produce it as an object on the same sheet. Put your name etc. at the top and print the spreadsheet. To summarize: you should have 1. An Excel spreadsheet containing your name & section, two tables, and a pie chart of PADEG 2. SPSS outputs for the original frequency for TVHOURS, the recoded TVHOURS2, and the TVHOURS2 bar chart 3. A Word document containing two crosstab tables and two bar charts (one for FEPRESCH by SEX, and the other of your choosing). 4. An Excel spreadsheet with data and a column chart (same variables as in the second part of the Word document in #3). 5
Part A. Answer the following questions: Interpretation and Analysis 1) Which program do you prefer for showing frequencies using charts, Excel or SPSS? Why? Give some reasons based on your experience during the lab. 2) Describe the sex differences you find in views on relationship of working women with their preschool children. 3) Do you think that these differences are statistically significant, i.e. the sample differences did not result by chance? What makes you think this? Note that we can t judge significance based simply on whether results seem to make sense. 4) In a paragraph, describe the relationship between the two variables you chose to analyze in the last part of the assignment. Identify which is the independent and which the dependent variable, and why; talk about the overall distribution of the dependent variable in the sample; and then talk about the distribution of the dependent variable with the categories of the independent variable. The goal of this part of the analysis is to illustrate that you can describe and analyze the data (that you are illustrating visually with the charts) in paragraph form. Part B Response to the Readings According to the Tuch and Sigelman article, Race, Class, and Black-White Differences in Social Policy Views, which policy issues (dependent variables) show the most significant differences according to race (the independent variable)? Does it matter for your answer if you leave all four categories, or if you combine into only two categories ( should be and should not be )? Answer this question in 100 words or less. 6