Part 2: Data Visualization
How to communicate complex ideas with simple, efficient and accurate data graphics

Why visualize data?
The human eye is extremely sensitive to differences in:
- Pattern
- Colors
- Format
Because of our ability to decipher these differences instantly, representing complex data sets with data graphics is an efficient way to communicate what the numbers are saying. The visual display of quantitative information serves as a vehicle for traversing a complex data world. Graphics reveal data.

What is the best way to display the data?
Let the data instruct you.
- Do not have a prespecified mode of displaying the data.
- Do whatever it takes to display the data in the most appropriate way.
- Design should be content-driven, not methodology-driven.

CONTEXT, CONTEXT, CONTEXT!
Put the data into a human context. What are we comparing the data to?
- Previous rounds (historical context)
  - Has the clinic performance rate improved over time?
- Other similar clinics
  - How well is the clinic performing compared to other clinics:
    - In the same district/province/region (geographic context)
    - With the same caseload
    - With the same resources
[Figure: data pathway. Care Provided, Documented, Chart Selected, Data Collected, Data Analyzed, Data Visualized, Data Reported, Data Interpreted, Decisions Made]

Graphical Excellence
- Have the audience in mind.
- What is the purpose of the graphic? Description, exploration
- Make large data sets coherent
- Reveal the data at several levels of detail
- Induce the reader to think about the content, not the methodology
- Encourage the eye to compare different pieces of data
  - Spatial orientation, patterns, colors, formatting
- Avoid distortion of the data
  - Axes, scaling, labeling
- Clear and easy to read
- Integrate words and numbers with graphics
Tufte, Edward. The Visual Display of Quantitative Information. Connecticut: Graphics Press. Page 13.

Theory of Data Graphics
Above all else, show the data.
1) Maximize the data-ink ratio.
   I. Erase non-data-ink
   II. Erase redundant data-ink
2) Remove chart junk.
   I. Shadows
   II. 3D rendering
   III. Other ornaments
3) Avoid optical vibration
[Figure: Before and After versions of the bar chart "Clinical Visits: Percentage of adult patients who had at least one visit in each half of the year"; x-axis: Clinic; y-axis: Performance Rate]
Tufte, Edward. The Visual Display of Quantitative Information. Connecticut: Graphics Press. Page 13.

Examples

Bar Charts
Good for comparing a set of categorical values. Best when there are not too many categories and/or variables.
[Figure: bar chart, "Clinical Visits: Percentage of adult patients who had at least one visit in each half of the year"; x-axis: Clinic; y-axis: Performance Rate]
Tips:
- Organizing data from largest to smallest may help highlight the data.
- Keep it simple: do not use shadows or 3D rectangles.

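The two tips above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the clinic names and rates are invented, and `sorted_bars` is a hypothetical helper, not something from the slides.

```python
# Hypothetical clinic performance rates (%), invented for illustration.
rates = {"A": 62, "B": 88, "C": 45, "D": 75}

def sorted_bars(rates, width=40):
    """Return plain text bars ordered from the largest to the smallest rate."""
    ordered = sorted(rates.items(), key=lambda item: item[1], reverse=True)
    lines = []
    for clinic, rate in ordered:
        # A flat bar only: no shadows, no 3D effects, no extra ornaments.
        bar = "#" * round(rate / 100 * width)
        lines.append(f"Clinic {clinic} {bar} {rate}%")
    return lines

bars = sorted_bars(rates)
print("\n".join(bars))
```

Sorting happens before drawing, so the same ordering can be reused if the chart is later produced with a spreadsheet or plotting tool.
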
Too many categories can make bar charts messy.
When there are this many bars on a bar chart, ask yourself whether it is contextually appropriate to compare all of the values on the chart.
[Figure: bar chart, "Clinical visits (2011): Percentage of eligible adult patients who had at least one clinical visit in each half of the year"; x-axis: Clinic; y-axis: Performance Rate (%)]

Too many variables per category can also make bar charts messy.
Is it appropriate to compare all of the variables within a category?
[Figure: grouped bar chart, "Mean Clinic Scores by Indicator (2011)"; x-axis: Clinic (A to E); y-axis: Performance Rate (%); series: Clinical Visits, TB Screening, CTX, Nutritional Assessment, Prevention Education, Alcohol Screening]

Pie Charts
Work well if you want to compare individual slices of the pie with the whole pie. It may be difficult to compare different slices of a given pie chart, or to compare data across different pie charts. A bar chart (histogram or stacked chart) or a table may be more appropriate in that case.

Too many variables make a pie chart hard to manage.
If the variables are numerical, consider using a histogram instead. You can also consider combining categories, but remember that this could hide variation and alter how the data are interpreted.
[Figure: two pie charts, each titled "CD4 Count Distribution"]

Tables
Tables often work better than bar charts and pie charts when there are too many data points and too many descriptors of those data points. Many people may not consider a table a way to visualize data, but tables still use specific formatting and spatial orientation to communicate the data more easily. In terms of data-ink, every piece of a table is critical information. However, tables may not be good at showing patterns over time.
[Table: "CD4 Monitoring Mean Clinic Scores: Percentage of eligible patients who had at least one CD4 count during the review period"]

Table Formatting Tips
- Do not use gridlines. The space between the numbers visually separates the categories.
- Underline the column headers.
- Consider zebra striping: light shading to separate specific groups you want to highlight.

Before
CD4 Monitoring Indicator Results
Clinic   Performance Rate   Denominator
A        60%                100
B        75%                150
C        50%                120

After
CD4 Monitoring Indicator Results
Clinic   Performance Rate (%)   Denominator
------   --------------------   -----------
A                          60           100
B                          75           150
C                          50           120

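As a rough sketch of the first two tips, the snippet below lays out the CD4 monitoring numbers from this slide using whitespace only, with underlined headers. `format_table` is a hypothetical helper written for this example; zebra striping is left out because plain text has no shading.

```python
def format_table(headers, rows):
    """Lay out a table with whitespace only: no gridlines, underlined headers."""
    cells = [[str(value) for value in row] for row in rows]
    widths = [
        max([len(header)] + [len(row[i]) for row in cells])
        for i, header in enumerate(headers)
    ]

    def line(values):
        # Text column is left-aligned, numeric columns are right-aligned.
        first = values[0].ljust(widths[0])
        rest = [v.rjust(w) for v, w in zip(values[1:], widths[1:])]
        return "   ".join([first] + rest)

    underline = "   ".join("-" * w for w in widths)
    return [line(headers), underline] + [line(row) for row in cells]

headers = ["Clinic", "Performance Rate (%)", "Denominator"]
rows = [("A", 60, 100), ("B", 75, 150), ("C", 50, 120)]  # values from the slide
table = format_table(headers, rows)
print("\n".join(table))
```

Putting the unit in the column header, rather than repeating "%" in every cell, is one small way to erase redundant data-ink.
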
Line Charts
Line charts work well for showing trends over intervals of time (time series). The more data points, the better. Line charts show a continuous line even though the data may be discrete.
Tips:
- Use different colors to differentiate between lines.
- Remember that our eyes will naturally compare two different lines on the same chart. If two sets of data points are not comparable, they probably should not be on the same graph.
- Label the lines directly on the chart instead of using a legend.

Line charts are very prone to distortion.
[Figure: four line charts of the same series, "Percentage of eligible patients screened for tuberculosis"; x-axis: Jan to June; y-axis: Performance Rate (%); panel titles: "Y Axis Scale: 0 to 25", "Y Axis Scale: 0 to", "Y Axis Scale: 15 to 20", "Y Axis Scale: 0 to 25, Height > Width"]

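One way to see why the choice of y-axis scale matters is to compute how much of the axis height the same change occupies under different scales. The sketch below uses the 0 to 25 and 15 to 20 scales named in the panel titles; the monthly rates are invented for illustration and `axis_fraction` is a hypothetical helper.

```python
# Hypothetical monthly TB screening rates (%), Jan to June, invented for illustration.
rates = [16, 17, 17, 18, 18, 19]

def axis_fraction(values, y_min, y_max):
    """Fraction of the y-axis height spanned by the data's min-to-max change."""
    return (max(values) - min(values)) / (y_max - y_min)

for y_min, y_max in [(0, 25), (15, 20)]:
    share = axis_fraction(rates, y_min, y_max)
    print(f"Y axis {y_min} to {y_max}: the change fills {share:.0%} of the axis height")
```

The same 3-point rise looks five times steeper on the 15 to 20 axis than on the 0 to 25 axis, even though the data are identical. Changing the height-to-width ratio of the chart stretches the slope in a similar way.
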
Box-and-whisker Plots
Box-and-whisker plots are a great way to compare different sets of data. Several descriptive statistics can be compared: max, min, upper quartile, median, lower quartile, range and interquartile range.
[Figure: box-and-whisker plots, "Namibia Food Security"; x-axis: Review Period (periods shown include Jul - Dec 08, Jan - Jun 09, Jul - Dec 09, Jan - Jun 10 and Oct 10 - Mar 11); y-axis: Performance Rate (%)]

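The statistics that a box-and-whisker plot displays can be computed with Python's standard library. The following is a minimal sketch: the rates are invented, `box_summary` is a hypothetical helper, and the "inclusive" quartile method is only one of several conventions, so other tools may report slightly different quartiles.

```python
from statistics import quantiles

# Hypothetical clinic performance rates (%) for one review period.
rates = [40, 55, 60, 62, 70, 75, 80, 90, 95]

def box_summary(values):
    """Return the statistics a box-and-whisker plot displays."""
    data = sorted(values)
    # Quartiles using the "inclusive" method (data treated as the full population).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],
        "lower_quartile": q1,
        "median": median,
        "upper_quartile": q3,
        "max": data[-1],
        "range": data[-1] - data[0],
        "interquartile_range": q3 - q1,
    }

summary = box_summary(rates)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Computing one summary per review period and placing the results side by side gives the same comparison the plot encourages.
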
The next few examples illustrate how important labeling is.
Labeling provides more context to the data, allowing for more rigorous and accurate interpretations of the data.
[Figure: bar chart, "Mortality Rates of People Actively Playing Popular Sports"; x-axis: Soccer, Rugby, Cricket, Golf; y-axis: Mortality Rate (# deaths / 1000 people / year)]
Is playing golf more dangerous than other sports?

[Figure: the same bar chart with labels added, "Mortality Rate of People Actively Playing Popular Sports in 2011"; y-axis: Mortality Rate (# deaths / 1000 people / year); bars labeled Soccer (Average Age = 23), Rugby (Average Age = 20), Cricket (Average Age = 25), Golf (Average Age = 60)]

What can we conclude?
[Figure: bar chart, "Percent of Adults who received a TB assessment during the review period (Adult, 2008)"; x-axis: Clinic A, Clinic B, Clinic C; y-axis: Performance Rate (%)]

[Figure: the same bar chart with denominators added, "Percent of Adults who received a TB assessment during the review period (Adult, 2008)"; x-axis: Clinic A, Clinic B, Clinic C; y-axis: Performance Rate (%); bar labels: n = 2, n = 150, n = 200]
Clinic C has only 2 eligible patients!

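The point of this slide can be sketched as a small labeling routine that never prints a percentage without its denominator. Everything here is an assumption made for illustration: the assessed counts, the assignment of n = 150 and n = 200 to Clinics A and B, the cutoff of 30, and the helper name `label`.

```python
# (patients assessed, patients eligible). Only "Clinic C has 2 eligible
# patients" comes from the slide; the other pairings and counts are invented.
results = {"Clinic A": (120, 150), "Clinic B": (150, 200), "Clinic C": (2, 2)}

MIN_N = 30  # hypothetical cutoff below which a rate is flagged

def label(clinic, assessed, eligible, min_n=MIN_N):
    """Format a rate together with its denominator, flagging small samples."""
    rate = 100 * assessed / eligible
    note = " (small denominator, interpret with caution)" if eligible < min_n else ""
    return f"{clinic}: {rate:.0f}% (n = {eligible}){note}"

for clinic, (assessed, eligible) in results.items():
    print(label(clinic, assessed, eligible))
```

With these made-up counts, Clinic C shows the highest rate, yet the label makes it obvious that the rate rests on only two patients.
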
Write on Graphs: Use words, numbers and graphics in combination
Use words directly on graphs to provide more context. For example, on a clinic-level run chart, use words and arrows to denote when a QI project was implemented. Here's an example from Namibia.

Graph/Table Combinations
Graphs and tables can be used together. The table provides more context and detail, while the graph reveals any patterns in the data. Here's an example using data from Uganda.

Sparklines: Intense, Simple, Word-Sized Graphics
Invented by Edward Tufte, these powerful graphics add tremendously to the meaning of numbers. They provide context. For example, I can say that the current temperature is 30 degrees Celsius. However, if I include a sparkline that shows the temperature during the previous 24 hours, it immediately puts that 30 degrees into context.
The sparklines I showed in the previous slide show the spread of the data. Each little tick mark represents an individual clinic's score. The red mark is the mean of those scores. Since I placed the spreads in the same column, I can quickly see how the spread changes from round to round.

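The temperature example above can be approximated in plain text with Unicode block characters. This is a minimal sketch, not Tufte's own method: the hourly temperatures are invented and `sparkline` is a hypothetical helper.

```python
BLOCKS = "▁▂▃▄▅▆▇█"  # eight heights, lowest to highest

def sparkline(values):
    """Render a sequence of numbers as a word-sized line of block characters."""
    low, high = min(values), max(values)
    span = high - low
    if span == 0:
        return BLOCKS[0] * len(values)  # flat series: draw a flat line
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[round((v - low) * top / span)] for v in values)

# Hypothetical temperatures (degrees Celsius) over the previous 24 hours.
temps = [22, 21, 20, 20, 19, 19, 20, 22, 24, 26, 28, 29,
         30, 31, 32, 32, 31, 31, 30, 30, 30, 30, 30, 30]

print(f"Current temperature: {temps[-1]} °C {sparkline(temps)}")
```

The number alone says "30"; the sparkline next to it shows that the day peaked slightly higher and has since levelled off.
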
Small Multiples
When clinic-level data are aggregated, detail at the clinic level is lost. Individual clinic trends cannot be recovered from longitudinal mean clinic scores. Several visualization techniques encourage the eye to examine both clinic-level and aggregate-level patterns. Small multiples, a series of graphics that show the same combination of variables, are one such technique. Here is an example of what it would look like.
Created by Jorge Camoes

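To make the loss of detail concrete, the sketch below builds an aggregate that is perfectly flat while the individual clinics move in opposite directions, then prints one small text panel per clinic on a shared scale. The scores, the period labels R1 to R3, and the helper `panel` are all invented for illustration.

```python
from statistics import mean

periods = ["R1", "R2", "R3"]  # hypothetical review rounds
# Hypothetical clinic scores (%) per round, invented for illustration.
scores = {
    "Clinic A": [30, 50, 70],
    "Clinic B": [70, 50, 30],
    "Clinic C": [50, 50, 50],
}

# Mean clinic score per round: the aggregate view.
aggregate = [mean(column) for column in zip(*scores.values())]

def panel(name, series, labels):
    """One small panel: same variables and same 0-100 scale for every clinic."""
    rows = [f"  {label} {'#' * (score // 10)} {score}" for label, score in zip(labels, series)]
    return [name] + rows

print("Mean of all clinics:", aggregate)
for name, series in scores.items():
    print("\n".join(panel(name, series, periods)))
```

Because every panel uses the same layout and scale, the eye can compare clinics directly, which is the property that makes small multiples work.
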
Heat Maps
Use color to encourage the eye to examine both clinic-level and aggregate-level patterns. In this example, each color represents a range of performance rates. The redder the color, the closer the performance rate is to 0%. The greener the color, the closer the performance rate is to 100%.
[Figure: heat map, "Namibia Food Security Indicator Results: Percentage of eligible adult patients assessed for food security by clinic and review period"; rows: Clinic (A to P); columns: review periods; key to swatch colors: ten ranges of Rate (%) from 0 to 100]

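The color key described above can be sketched as two small functions: one that assigns a rate to a range, and one that picks a color for that range. The equal-width bins, the linear red-to-green blend, and the names `rate_to_bin` and `bin_to_rgb` are assumptions for this example; the actual swatch boundaries and colors used in the slide are not known.

```python
def rate_to_bin(rate, n_bins=10):
    """Map a performance rate (0-100%) to one of n_bins equal-width ranges."""
    if not 0 <= rate <= 100:
        raise ValueError("rate must be between 0 and 100")
    # 100% falls into the last bin rather than opening a new one.
    return min(int(rate * n_bins // 100), n_bins - 1)

def bin_to_rgb(index, n_bins=10):
    """Blend linearly from red (lowest bin) to green (highest bin)."""
    t = index / (n_bins - 1)
    return (round(255 * (1 - t)), round(255 * t), 0)

for rate in (0, 35, 72, 100):
    index = rate_to_bin(rate)
    print(f"{rate:3d}% -> bin {index} -> RGB {bin_to_rgb(index)}")
```

A pure red-to-green ramp is hard to read for viewers with red-green color blindness, so a different pair of end colors may be worth considering in practice.
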
Summary
- Context is essential for graphical integrity.
  - Provide historical data when available.
  - Label axes properly.
  - Always provide denominators for percentages.
- Do whatever it takes to display the data in the best way, with integrity and clarity.
  - Data visualization should be content-driven, not methodology-driven.
  - Use combinations of words, numbers and graphics.
  - Combine tables and charts.
- Creating an excellent data graphic takes time. Like good writing, it requires revising and editing.

Based on Chapter 11, Excel 2007 Dashboards & Reports (Alexander) and Create Dynamic Charts in Microsoft Office Excel 2007 and Beyond (Scheck)
Reporting Results: Part 2
Bullet Graph (pp. 200-205, Alexander,
