September 2015

Data Visualization: Some Basics

Graphical Options

You have many graphical options, but particular types of data are best represented with particular types of graphs. Some of the more common types are overviewed in Table 1.

Table 1: Types of graphs and their uses

Type of Graph    Uses
Line graph       change over time, presenting different types of data together
Bar chart        comparisons, differences / similarities between groups
Table            exact values, concisely communicating (esp. if story not particularly compelling)
Scatter plot     relationships between values, correlation (positive, negative, none) between two variables
Pie chart        proportions (very simple data, large differences between proportions)

Here are the common types of data visualization, with examples, described in more detail:

Line graph: changes over time.

[Figure: Bear Population in Emerald Forest. x-axis: Year (2000 to 2006); y-axis: Population (in thousands), 0 to 150.]

Notice the unit for the x-axis is Years and that we can follow the progress of Population (the y-axis unit) over time by following the pattern of the line connecting the data points.

Bar graph: comparing groups, showing differences or similarities.

[Figure: Time to complete obstacle course for soldiers wearing protective gear. Bars: No Chemical Protection, 4.98; Current Gear, 8.93; Future Warrior Gear, 9.78. y-axis: Time, 0 to 15.]

A quick glance at this graph shows us that there is a large difference between the groups wearing no chemical protection and the groups wearing either the current gear or the future warrior protective gear. We can also see that there is not much of a difference between the group wearing the current gear and the group wearing the future warrior protective gear.
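The comparison that the bar graph makes visually can also be checked with a few lines of code. The sketch below is only an illustration: it uses the three values shown on the obstacle course graph (the source does not state their units), and the function names `text_bar_chart` and `pairwise_gaps` are invented for this example. It draws a rough text-only bar chart and lists the gap between each pair of groups.

```python
# Completion times read from the bar graph above (units are not given in the source).
times = {
    "No Chemical Protection": 4.98,
    "Current Gear": 8.93,
    "Future Warrior Gear": 9.78,
}


def text_bar_chart(data, width=40):
    """Return one line per group, with a bar scaled to the largest value."""
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<{label_width}} | {bar} {value:.2f}")
    return lines


def pairwise_gaps(data):
    """Return the absolute difference between every pair of groups."""
    labels = list(data)
    gaps = {}
    for i, first in enumerate(labels):
        for second in labels[i + 1:]:
            gaps[(first, second)] = round(abs(data[first] - data[second]), 2)
    return gaps


chart_lines = text_bar_chart(times)
gaps = pairwise_gaps(times)

print("\n".join(chart_lines))
for (first, second), gap in gaps.items():
    print(f"{first} vs {second}: {gap:.2f}")
```

The two gaps involving the unprotected group come out several times larger than the gap between the two kinds of gear, which is the same story the bars tell at a glance.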
Table: presenting exact values or different types of data together, concisely communicating data.

We can easily compare the startup speed between Firefox and IE by glancing at the actual values and seeing that 11.6 is higher than 7.8. It is also useful that this table shows not only startup speed but also download speeds. Displaying multiple categories in the same space is useful if we want to determine which browser is best overall and need multiple types of data to decide.

Scatter plot (here with a superimposed trend line*): illustrating relationships between values.

We are generally interested in the spread (which shows how strongly correlated the variables on our axes are) and the general trend of the points on a scatter plot (which is where the trend line becomes useful). For example, here we can see the general upward trend of the points, suggesting that we generally expect higher test scores for longer studying times. We can also see that the points cluster tightly around the line, suggesting that the correlation is strong.

*It is up to you whether to add a trend line to your visualization. Some authorities don't like the extra clutter; others prefer the ease of access to the information.

Pie chart: showing proportions.
[Figure: Approximate Composition of the Air. Slices: Nitrogen, Oxygen, Other gases.]

Use this type of chart with caution. It only works for small numbers of groups that have noticeable differences between their proportions. From this chart, we can see that the bus is the most used transportation method, followed by the car.

Choosing your story

The graphical representation you choose for your data will depend on the story you want to tell (2). Each representation lends itself to telling a different story. You can hypothetically choose any graphical option for your data, but some will be more appropriate than others.

For example, consider the following data tables. Depending on the way you compile the data, you could argue that either the US or China won the Olympics. When total medal count is emphasized, it seems obvious that the US won. However, if the count of gold medals is foregrounded, it seems China is the winner. Neither interpretation is wrong. They each offer support for different arguments. It is up to you to decide what argument you are making and how your data visualization can help support it.

Choosing your Representation

Sometimes it is obvious that a particular representation will not work with a particular type of data (for example, when we look at data over time, we generally will not use a bar graph). However, some types of data can work well in more than one representation, and you can choose one depending upon the story you want to tell.
These two graphs are representations of the same data set, the unemployment rate in the United States from 1946 to 2013. But as you can see, they tell two rather different stories.

The first graph, a scatter plot, suggests that the unemployment rate is steadily increasing over time. From the spread of the points, we might gather that the rate varies around that steady incline.

The second graph, a line graph, suggests that the unemployment rate varies wildly, but it is not readily apparent that the rate is increasing.

If we wanted to show that the unemployment rate is on average rising over time, we would use the scatter plot. If we just wanted to show that the unemployment rate varies year by year, we would use the line graph.

Representing Data Effectively: Wisdom Before Beauty

You should certainly try to refine your data visualization, but try not to make it so pretty that it becomes unreadable. Privilege clarity of message over aesthetics. Make sure your visualization still effectively communicates your meaning. For example, consider the 3D bar graph. It might look neat, but it takes some effort to figure out what it is supposed to communicate.
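As a rough numerical companion to the unemployment example earlier in this section, the sketch below computes the two quantities that the two graphs emphasize: the slope of a least-squares trend line (the "on average rising" story) and the average size of the year-to-year jump (the "varies wildly" story). The yearly rates are invented for illustration and are not the real U.S. figures, and the function names `trend_slope` and `mean_abs_change` are hypothetical.

```python
from statistics import mean

# Hypothetical yearly rates (percent), invented for illustration only.
# They are NOT the real U.S. unemployment figures discussed above.
years = [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007]
rates = [4.0, 6.5, 4.5, 7.0, 5.0, 7.5, 5.5, 8.0]


def trend_slope(xs, ys):
    """Least-squares slope: the average change in y per unit of x."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


def mean_abs_change(ys):
    """Average size of the step from one year to the next, ignoring direction."""
    return mean([abs(later - earlier) for earlier, later in zip(ys, ys[1:])])


slope = trend_slope(years, rates)
swing = mean_abs_change(rates)

print(f"Trend: {slope:+.2f} points per year")
print(f"Typical year-to-year change: {swing:.2f} points")
```

In this made-up series the typical yearly swing is much larger than the yearly trend, so a line graph would foreground the swings while a scatter plot with a trend line would foreground the slow rise.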
Explaining the Data Visualization

Finally, even if you have an amazing data visualization, you still need to explain it. You should do this both in a caption and in the text of your paper. Your data visualization should facilitate reading, and it should be intuitive enough that someone could look at it and get a good feel for what the data is saying. However, you should still give an explanation in words that tells the reader the bottom line.

In a caption, you explain your visualization (???).
1. Name the figure or table (Table 1, Figure 1, etc.)
2. Describe the visualization
3. Identify the axes
4. State what the data represents

Within the text, you should interpret your visualization (4).

In the Results section:
1. Refer to your table or figure and state the main trend
2. Support the trend with data
3. (if needed) Note any additional trends and support them with data
4. (if needed) Note any exceptions to the main trends or unexpected outcomes

In the Discussion section:
1. (if needed) Provide an explanation for those exceptions
2. (optional) Compare to other research
3. (optional) Evaluate with respect to a hypothesis (explain whether the data supports or contradicts the hypothesis)
4. (optional) Mention further findings
5. State the bottom line: What does the data mean?

For example, consider the table on internet browser performance. In this table, the best value for each task is emphasized with red bolding.
Notice that the caption names the table and explains what it contains. We might interpret the table like this:

Table 1 suggests that Chrome is the best internet browser. It scored the best on two of the four measures. However, we should note that Chrome had the worst score on memory use. Therefore, we might conclude that Chrome is the best internet browser for users whose computers have enough memory to handle it. Users whose computers have less memory might prefer Firefox 25, which has the lowest memory use. Users who care most about speed could use IE 11, which is the fastest of the four browsers.
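The interpretation above follows a simple routine: find the best value for each measure, then count how often each browser comes out on top. A minimal sketch of that routine follows. All numbers, all measure names, and the fourth browser ("Browser D") are invented for illustration because the original table is not reproduced here; only the overall pattern (Chrome best on two of four measures but worst on memory) mirrors the text.

```python
# All numbers below are invented for illustration; they are not the source table.
# "Browser D" is a hypothetical stand-in for the fourth, unnamed browser.
scores = {
    "Chrome":     {"startup_s": 4.0, "page_load_s": 1.2, "download_mbps": 9.5, "memory_mb": 400},
    "Firefox 25": {"startup_s": 5.0, "page_load_s": 1.5, "download_mbps": 9.0, "memory_mb": 250},
    "IE 11":      {"startup_s": 3.0, "page_load_s": 1.4, "download_mbps": 8.8, "memory_mb": 300},
    "Browser D":  {"startup_s": 6.0, "page_load_s": 1.6, "download_mbps": 8.5, "memory_mb": 350},
}

# Whether a lower or a higher number is the better result for each measure.
lower_is_better = {
    "startup_s": True,
    "page_load_s": True,
    "download_mbps": False,
    "memory_mb": True,
}


def best_per_measure(table, lower_wins):
    """Return the name of the best-scoring row for every measure."""
    best = {}
    for measure, lower in lower_wins.items():
        pick = min if lower else max
        best[measure] = pick(table, key=lambda name: table[name][measure])
    return best


def count_wins(best):
    """Count how many measures each row wins."""
    wins = {}
    for winner in best.values():
        wins[winner] = wins.get(winner, 0) + 1
    return wins


best = best_per_measure(scores, lower_is_better)
wins = count_wins(best)

for measure, winner in best.items():
    print(f"{measure}: {winner}")
print(wins)
```

Counting wins gives the headline claim, while the per-measure winners supply the exceptions that a careful interpretation should also mention.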