Author: Laura Thompson

Beyond Static Graphs: Engage Your Audience with Interactive Data Visualizations

Post author By Laura Thompson
Post date April 25, 2017

Interactive data visualizations allow you to display complex data in interesting ways and allow viewers to be active participants in exploring the data. Check out some of these great examples of interactive data visualizations for inspiration.

The Washington Post [i] presents a political visual that pairs graphics and a narrative that you can interact with as you scroll through.

Gapminder [ii] allows you to explore global statistics.

At Tableau Public’s gallery [iii] you can explore lots of visualizations people have created for various purposes.

Flowing Data allows you to visualize a day in the life of Americans.

This interactive visualization by NY Times [iv] allows you to draw what you guess a trend line might be, then see the actual trend line.

Interactive Data Visualization with Tableau

I wanted to experiment with visualization of Nebraska On-Farm Research data. After experimenting with several tools, I decided to work with Tableau software, developed by a company whose sole mission is to help people see and understand their data. You can learn more at http://www.tableau.com/. The data was collected by participants in the Nebraska On-Farm Research Network. The crop producers involved compared yields where no starter fertilizer was used to yields where 10-34-0 starter fertilizer was applied at planting. Phosphorus levels for the fields were recorded.

Interactive data visualization featuring Nebraska Extension On-Farm Research data on starter fertilizer use.

You can view the interactive data visualization here [v].

The data visualization has two parts. The first part is a scatter plot showing soil P versus the yield increase for starter fertilizer use. Hovering over each point brings up a tool tip that shows the study location, year, starter product tested and rate, yield of the control and starter treatments, statistical significance, and soil phosphorus level. The tool also allows farmers or agronomists to explore data by adjusting a slider to a range of soil phosphorus that is representative for their fields. The visual then displays the average yield increase for these sites.
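For readers who think in code, the slider's logic can be sketched in a few lines of Python. This is only an illustration of the filter-then-average idea, not how the Tableau dashboard is built: the records below are made up, and the ppm unit for soil phosphorus is an assumption.

```python
from statistics import mean

# Hypothetical study records: (soil P in ppm, yield increase in bu/acre).
# These values are invented for illustration; they are not the network's data.
studies = [
    (6.0, 4.1),
    (9.0, 2.8),
    (14.0, 1.5),
    (18.0, 0.9),
    (25.0, -0.3),
    (32.0, 0.2),
]


def average_increase(records, p_min, p_max):
    """Average yield increase for sites whose soil P lies in [p_min, p_max]."""
    selected = [inc for soil_p, inc in records if p_min <= soil_p <= p_max]
    if not selected:
        return None  # no sites fall inside the chosen range
    return mean(selected)


# Equivalent to dragging the slider to a 5-15 ppm range.
print(average_increase(studies, 5.0, 15.0))
```

Returning None for an empty range keeps the caller from mistaking "no sites" for "zero yield increase".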

Example of tooltip that appears when hovering over the data points.

The second part allows users to calculate economic impact by putting in their own soil phosphorus, starter fertilizer cost and expected corn price. The tool calculates the expected yield increase and expected return on investment based on the regression line that fits through the data.
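The calculation described above can be sketched roughly as follows. The data points, prices, and the choice to report net return in dollars per acre are all assumptions made for illustration; the dashboard's actual formula is not given here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def expected_return(soil_p, starter_cost, corn_price, slope, intercept):
    """Expected yield increase (bu/acre) and net return ($/acre)."""
    yield_increase = slope * soil_p + intercept
    net_return = yield_increase * corn_price - starter_cost
    return yield_increase, net_return


# Hypothetical (soil P, yield increase) pairs, not the real study data.
soil_p_values = [5.0, 10.0, 15.0, 20.0, 25.0]
yield_increases = [4.0, 3.0, 2.0, 1.0, 0.0]

slope, intercept = fit_line(soil_p_values, yield_increases)

# A user enters their own soil P, starter cost ($/acre), and corn price ($/bu).
increase, net = expected_return(10.0, 8.0, 3.5, slope, intercept)
print(f"Expected yield increase: {increase:.1f} bu/acre")
print(f"Expected net return: ${net:.2f}/acre")
```

Keeping the fit separate from the economic calculation means the same regression line can be reused as users change their cost and price inputs.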

Getting Started with Interactive Visualizations

A number of tools are available to enable you to create interactive and animated data visualizations. This article [vi] lists some design tools to look into and is a great place to get started exploring options. Some require more coding, but others can be done completely through a graphical user interface.

Tableau, the tool I used, has a great resource of instructional videos [vii] to help you get started. I recommend working with a simpler dataset, such as the one I used when learning the tool. You also may find it helpful, as I did, to sketch out a plan for the visualization prior to beginning work in Tableau.

Do you have ideas for how interactive data visualizations could be used in Extension or in your work? Share your ideas in the comments.

References

[i] https://www.washingtonpost.com/graphics/politics/endangered-seats/?tid=sm_tw&utm_term=.da23b96b06ce

[ii] https://www.gapminder.org/tools/#_locale_id=en;&chart-type=bubbles

[iii] https://public.tableau.com/en-us/s/gallery

[iv] https://www.nytimes.com/interactive/2017/04/14/upshot/drug-overdose-epidemic-you-draw-it.html?_r=0

[v] https://public.tableau.com/views/Test_StarterandSoilP/Dashboard2?:embed=y&:display_count=yes

[vi] http://www.creativebloq.com/design-tools/data-visualization-712402/2

[vii] https://www.tableau.com/learn/training

Tags big data, data visualization


Data Visualization Makeover

Post author By Laura Thompson
Post date April 13, 2017

This post is a little embarrassing.

I’m going to share some not-so-pretty (and not-so-effective) graphics that I have made…and presented…multiple times. Then I’ll walk you through the steps I took to create new and improved versions using my seven elements of good data visualization.

First, about the data. These data were collected by the Nebraska On-Farm Research Network. Farmers working with the network were trying to determine the optimum planting rate (population) for soybeans. Each picked several planting rates (treatments), replicated and randomized these treatments, and at the end of the growing season, recorded the yields. Their goal was to determine which planting rate maximized yield and, more importantly, profit.

Initially, this is how I presented the data, circa 2015.

Original presentation of soybean population data. — Original graph

Frightening, right?

When presenting the data in live presentations, I broke the first monster graph down into a series of three slides, one for each year.

Graph of 2006 soybean population data. — Original graph of 2006 data

Original presentation of 2007 soybean population data. — Original graph of 2007 data

Original presentation of 2008 soybean population data. — Original graph of 2008 data

Can you determine what the “take away” message of these graphs is?

My intended message was that research results had shown very little yield increase as soybean seeding rate increased, within the range of seeding rates we tested.

Were you able to come up with that message?

Overall, I think these graphs are much more difficult to interpret than necessary.

I set out to give these graphs a makeover for a 2017 presentation, using the seven elements of data visualization that I presented in my previous eXtension blog post. I took the following steps, which illustrate my thought process in improving this graph.

Step 1: What is the point?

After thinking through my point, I boiled down my message to this: “Soybean yields increased very minimally as seeding rates increased above 90,000 seeds per acre.” A second important point was: “The lowest seeding rate was most economical because the increase in yield realized by increasing seeding rates did not offset the increase in seed cost.”

Stating this message explicitly provided much-needed direction – and helped me determine what was not important. Because I was concerned with the overall pattern in the results, the precise location and year of each study were not important.

Step 2: Choosing the right chart

Scatterplots can be an excellent way to show a relationship between two things (like soybean planting rates and yield). Connecting lines imply the continuity of the data and can allow us to compare multiple series of the data (in this case, research sites). Because I wanted to show a lack of difference, I started experimenting with plotting the data in this way.

Displaying the data in a scatterplot with connecting lines — Scatterplot with connecting lines

This starts to better communicate the data and gets all the information onto one manageably sized graph, but it is still busy. The legend is extensive and doesn’t provide needed information. The Y-axis also goes from 35 to 80 (rather than 0 to 80), which is misleading and makes the differences appear to be greater than they are.
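A little arithmetic shows how much a truncated axis can exaggerate a difference. The sketch below compares how far two points sit above the bottom of the axis when it starts at 0 versus 35; the two yield values are hypothetical.

```python
def apparent_ratio(low, high, axis_min):
    """Ratio of plotted heights above the axis baseline."""
    return (high - axis_min) / (low - axis_min)


# Hypothetical yields in bu/acre, chosen only to illustrate the effect.
low_yield, high_yield = 60.0, 63.0

true_ratio = apparent_ratio(low_yield, high_yield, 0.0)        # axis starts at 0
truncated_ratio = apparent_ratio(low_yield, high_yield, 35.0)  # axis starts at 35

print(f"Axis from 0:  higher point looks {100 * (true_ratio - 1):.0f}% taller")
print(f"Axis from 35: higher point looks {100 * (truncated_ratio - 1):.0f}% taller")
```

With these numbers, a 5% real difference is drawn as a 12% difference in height once the axis starts at 35.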

Step 3: Less is More

Here I have removed the legend (and unnecessary site and year designations) and reset the y-axis to 0 to 80.

Modified scatterplot with y-axis from 0 to 80 and legend removed

At this point, after consultation with a statistician, I decided to include only the sites where the same four planting rates of 90,000, 120,000, 150,000, and 180,000 seeds/acre were tested. This provided a better fit for the data and let us perform a more appropriate statistical test. I also updated the data to include data that had been collected in 2016 from three additional sites. Since these sites tested planting rates of only 90,000 to 180,000, I updated the x-axis to include only this range.

Scatterplot with adjusted x-axis.

Now the trend is starting to be more evident.

Step 4: Use color intentionally

Since color is no longer connected to site and year designations in a legend, it can be used to emphasize other information. From presenting the data in the past, I knew that people often asked if the trends shown by the data were true for both non-irrigated and irrigated conditions. For this reason, I chose to use color to designate irrigated sites (blue) and non-irrigated sites (orange). I also made the gridlines and axis numbering a lighter grey and less prominent, since we are concerned with the overall trend rather than exact values. Already, this graph is much less overwhelming to look at.

Graph using color to designate irrigation — Use color to designate irrigation

I needed to designate what the orange and blue colors were indicating. Instead of a separate legend, I included text in colors that coordinated with the colored lines on the graph. This strategy allows viewers to get almost all the information they need immediately when they are looking at the graph rather than having to look back and forth between a legend and the chart.

Chart leveraging consistent color for labeling — Leverage consistent color for labeling

Step 5: Create pointed titles and call out key points with text

At this point the trend is fairly obvious and there is room to add in the average statistics. I used a black line and a larger font to make the average more prominent directly on the graph. I noted actual values for the average statistic only. (Think how cluttered it would be to show values on the chart for every data point.) Showing the values only for the average communicates the important information – the overall trend.

The last step was adding a title. Rather than a boring, uninformative title like “Yield versus planting rate for soybeans in Nebraska” I tried to bring my main point home using the title on the graph below.

The final addition I made was to include source information on the bottom right. This gives credibility, lets people know where to find more info, and, in the case of data you have collected yourself, is a great way to promote Extension or your university.

Step 6: Get feedback and iterate

As I presented this data at winter meetings, a common question was “What were the average final stands for each planting population?” or “How many soybeans do I need to have at the end of the year?” To try to address these questions, I created an iteration that displayed this information with the planting populations. This version seems a little more cluttered to me, but I think it is worth it since that information was being requested.

As you may recall, a secondary objective was to communicate that increasing soybean seeding rate did not pay off in terms of increased yield. Rather than create a separate graphic for this objective, I used the yield data presented in this graphic to demonstrate the very small yield increase, and then when presenting, provided a second slide with calculations of profit for each rate. This worked out well as a discussion slide.
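The kind of profit comparison used on that discussion slide can be sketched as below. The yields, soybean price, and seed cost are hypothetical placeholders rather than the study's results, and "partial return" here simply means revenue minus seed cost.

```python
# Hypothetical inputs, for illustration only.
SOYBEAN_PRICE = 9.00       # $/bu (assumed)
SEED_COST_PER_1000 = 0.40  # $ per 1,000 seeds (assumed)

# Seeding rate (seeds/acre) -> average yield (bu/acre); made-up values
# showing a very small yield gain as seeding rate increases.
average_yield = {
    90_000: 60.0,
    120_000: 60.5,
    150_000: 61.0,
    180_000: 61.2,
}


def partial_return(rate, yield_bu):
    """Revenue minus seed cost, in $/acre."""
    revenue = yield_bu * SOYBEAN_PRICE
    seed_cost = rate / 1000 * SEED_COST_PER_1000
    return revenue - seed_cost


returns = {rate: partial_return(rate, y) for rate, y in average_yield.items()}
best_rate = max(returns, key=returns.get)

for rate, value in returns.items():
    print(f"{rate:>7,} seeds/acre: ${value:.2f}/acre")
print(f"Most economical rate in this example: {best_rate:,} seeds/acre")
```

In this example the extra bushels at higher rates are worth less than the extra seed, so the lowest rate comes out ahead.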

Before and After

Here are the completed “before and after” graphics. What do you think? Does this graphic better communicate the main point? What changes would you make? Let me know in the comments!

Tags big data, data visualization, fellow, impact


Seven Elements of Good Data Visualization

Post author By Laura Thompson
Post date April 11, 2017

To prompt behavior change, we must be able to effectively communicate data. Not convinced? Read this post on why data visualization matters. The goal of this article is to dig in deeper and present some foundational concepts for creating good data visuals.

A recent eXtension webinar [i] described numerous tools and programs at our disposal for creating more engaging data visualizations, so I won’t address those sorts of resources in this post. What I hope to impart are fundamental concepts of data visualizations that are cross-cutting and applicable, regardless of which tool you choose to use to create them. I have distilled these into my top seven characteristics of good data visualization.

1. What’s your point?

Our goal is to present scientific data in a clear and simple way. But do not misunderstand me; I am not advocating for oversimplified, watered-down presentations of science. For example, Nate Silver’s FiveThirtyEight [ii] website, featuring data visualizations and journalism on topics of politics, economics, science, and sports, presents lots of complicated data, often using chart types that are unfamiliar and atypical. And even though we have probably all been warned against using visuals that depart from the norm, the site is ranked 618 in the U.S. according to Alexa[iii], and, according to Quantcast[iv], over 371,000 people visit the site each month in the U.S. Despite my lack of interest in sports in general, I have found myself browsing through numerous sports-related stories on Silver’s site. What makes these fairly complex and unfamiliar graphs engaging and worth spending time looking at? I propose that one key factor is that the authors know their point.

When you are presenting a graph or chart, do you think through what you want people to understand and walk away with? I know that I am guilty of approaching data visualization with the goal of displaying all of the data as neatly and completely as possible. While not a bad idea, more must be considered than whether I was able to fit all the information into the display. Before finalizing a graph to share with others, take a step back and ask yourself, “What is my point?” Then, determine if the graph actually conveys that, or if there is a way to make your point clearer. It could be that a different chart type or color scheme would help elucidate your point.

2. Choose the right chart

I suspect that by and large, bar charts are the most used chart type, and possibly for good reason: they are simple to read and people are familiar with them. However, they are not a one-size-fits-all solution, and numerous other options should be considered. The following charts are ones I have experimented with in the last year.

Consider trying one of these chart or graphic types out this year. All of these examples were made in Excel – not with any special software – using some creative “tricks” to make them possible. At the end of this post I provide some resources to help you learn these tricks.

Simple text

When you have one or two numbers that tell the story, highlighting a single number is a great option. A caution, however: do not overuse this simple tool or it will lose its impact.

Heatmap

Tables are rarely a good tool for showing data in a live presentation, but they do give you the ability to present a lot of detail and can be useful in printed materials. Combining a table with the technique used in a heatmap – that is, adding colors that vary in intensity to show relative performance – can help readers more quickly process and see patterns. In the example below, the darker colors represent higher yields, allowing the reader to see at a glance which combination of nitrogen application rate and seeding rate results in the best yield.
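The shading idea behind a heatmap can be sketched in code: scale each value between the table's minimum and maximum, then map that fraction to a color intensity. The table values and the grey color ramp below are assumptions for illustration; the Excel example in this post was not built this way.

```python
# Hypothetical yields (bu/acre): rows are nitrogen rates, columns are seeding rates.
yields = [
    [52.0, 55.0, 56.0],
    [58.0, 63.0, 64.0],
    [60.0, 68.0, 70.0],
]


def shade(value, lowest, highest):
    """Map a value in [lowest, highest] to a hex grey; higher values are darker."""
    if highest == lowest:
        fraction = 0.5  # flat table: use a mid-tone everywhere
    else:
        fraction = (value - lowest) / (highest - lowest)
    level = round(255 - fraction * 175)  # 255 is white, 80 is dark grey
    return f"#{level:02x}{level:02x}{level:02x}"


lowest = min(min(row) for row in yields)
highest = max(max(row) for row in yields)
shaded = [[shade(v, lowest, highest) for v in row] for row in yields]

for row_values, row_colors in zip(yields, shaded):
    print("  ".join(f"{v:5.1f} {c}" for v, c in zip(row_values, row_colors)))
```

Stopping the ramp at dark grey rather than black is a deliberate choice so that numbers printed on top of the darkest cells can remain legible with white text.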

Layered bar graph

A layered bar graph is essentially combining the two bars of a side-by-side double bar graph. Both the grey bars and red bars are assumed to start at the 0 point on the x-axis. This allows an easy comparison showing us how much more there is of the grey than the red. Combining the bars is a good technique for saving space and clearly illustrates the difference between the two things you are comparing, especially when you want to emphasize the relative difference and not necessarily the quantitative difference. Instead of a legend, color in the title and color of the bar segments are used to communicate information about the different elements being compared.

Small multiples

Small multiples is a great tool for breaking complex information into an array of manageable, comparable pieces. The technique uses multiple views to show different partitions of a dataset, using a series of similar charts or graphs with the same scale and axes that can be easily compared. There are numerous uses for small multiples, and many chart types can be broken down into small multiples; this example uses horizontal bar charts.
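The two ingredients of small multiples, partitioning the data and holding the scale constant, can be sketched with plain text bars. The site names and values below are hypothetical, and the text rendering stands in for whatever charting tool you use.

```python
from collections import defaultdict

# Hypothetical records: (panel, bar label, value in bu/acre).
records = [
    ("Site 1", "Early", 62.0),
    ("Site 1", "Late", 58.0),
    ("Site 2", "Early", 70.0),
    ("Site 2", "Late", 66.5),
    ("Site 3", "Early", 55.0),
    ("Site 3", "Late", 54.0),
]


def build_panels(rows):
    """Split rows into one list per panel and find the shared axis maximum."""
    panels = defaultdict(list)
    for panel, label, value in rows:
        panels[panel].append((label, value))
    shared_max = max(value for _, _, value in rows)
    return dict(panels), shared_max


def text_bar(value, shared_max, width=30):
    """Horizontal bar scaled against the maximum shared by all panels."""
    return "#" * round(value / shared_max * width)


panels, shared_max = build_panels(records)
for panel, bars in panels.items():
    print(panel)
    for label, value in bars:
        print(f"  {label:<6}{text_bar(value, shared_max)} {value}")
```

Scaling every bar against the same maximum is what makes the panels comparable at a glance; scaling each panel to its own maximum would hide differences between sites.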

3. Less is more

Eliminating unnecessary legends, gridlines, tick marks, and colors will clean up the graph and allow you to focus your learner’s attention on your point.

Eliminating the legend is a good strategy to clean up the graphic, and, if done well, makes interpretation of the graph quicker. Labeling bars directly, such as in the small multiples example, makes it easier for the viewer to process information because they do not have to look between the legend and the main part of the chart to determine what each color in the bar chart represents. Color also can be used in a similar way, such as in the layered bar example. Colors in the title and on the average lines indicate what the grey and red categories are, making a separate legend unnecessary.

Consider whether eliminating axis labels and instead labeling the points directly might be advantageous. When the actual numeric value is important, label the points directly; when the overall trend is important, leave the axis labels in place. To reduce redundancy, however, do not use both axis labels and individual point labels. One exception to this guideline would be to use the axis labels, but label a few key data points to draw attention to them.

4. Use color intentionally

I have seen many graphs like the following. In this case, each individual site was given a different color. The graph is bright and eye catching, yet the color is not used in a meaningful way. Separating the various sites with different colors is not important and only detracts from the overall point.

Example of color not used purposefully in a layered bar chart.

Color is a powerful tool and should always be used to convey a message. When I am developing a graphic, I like to first make as much of the graph as possible grey. Then I go back and begin using color to make the key point stand out. In the following graph I have used a lighter shade of red for the late planting date, and a darker shade of red for the early planting date. Color in the subtitle is used to designate what the different colors of bars represent and allows the legend to be eliminated.

Example of layered bar chart with purposeful use of color.

5. Create pointed titles and call out key points with text

The previous graph could be given a title along the lines of “Soybean Yield by Planting Date, 2008 to 2010.” However, a much more useful title could be leveraged to communicate the key point – in this case, “Planting Soybeans Early Resulted in an Average 2.7 bu/acre Yield Increase.”

Text also can be used in other strategic locations, such as the phrase “12 On-Farm Research Sites” to designate all the sites along the x-axis rather than labeling them each “site 1, site 2, etc.” A subtitle is used to designate what the different colors of bars represent and provides additional useful information about planting dates.

6. Get feedback and iterate

This process is dynamic and, at least for me, requires lots of trial and error. Utilize the back button. Or create a separate copy before trying a bold remake, which also allows you to compare the first and second versions. On a number of occasions, once I had gotten a graph cleaned up and presentable, I realized my point would be better displayed with a completely different graph type, and I ended up starting the process over again.

Starting with a quick sketch on a sheet of scratch paper can also be helpful. Sometimes you can save time by quickly drawing out some ideas of how to display your variables before beginning your computer work. This also forces you to think through the concept rather than just defaulting to one of Excel’s recommended charts.

Getting feedback can be very valuable. Ask other people to take a look at your graph. Ask them what they think the main point is, and what they notice first. Audiences also often provide great feedback. Take note of what questions your audience has and then determine if there is a way to make your graph more clearly communicate the information they need.

7. Read up and copy other visualizations

Many of the graphics I have experimented with came from examples that intrigued me by the effectiveness with which they communicated information. I encourage you to browse websites and follow Twitter accounts that routinely produce good data visualizations. If you see something that really communicates information well, take a few minutes to look at it and think about why it is effective, then try to incorporate that into your future designs.

Here are some suggestions to get you started:

Websites

Storytelling with Data [v] (by author Cole Knaflic)
Stephanie Evergreen [vi]
FiveThirtyEight [vii]
USDA ERS Chart Gallery [viii]
USDA ERS Data Visualizations [ix]

Twitter accounts

@BBGVisualData
@WSJGraphics
@NateSilver538
@evergreendata
@Rbloggers
@Seeing_Data
@538viz
@FiveThirtyEight
@USDA_ERS

This is admittedly a very brief introduction to the concept of data visualization. There are lots of great resources that discuss how to pick the right chart for your data – and even walk you through how to create them. Two of my favorites that are fairly comprehensive are “Storytelling with Data” by Cole Knaflic and “Effective Data Visualization” by Stephanie Evergreen.

Be patient with yourself – as with most things, learning to create good data visualizations takes time. Scott Berinato, author of “Good Charts: The HBR Guide to Making Smarter, More Persuasive Data Visualizations,” sums it up well: “Simplicity takes some discipline and courage to achieve. The impulse is to include everything you know. But busy charts communicate the idea that you’ve been just that – busy[x].”

These seven suggestions are meant to serve as a starting point and to encourage you to begin experimenting with the way you communicate data. In the next post, I will take you through a data visualization makeover using the elements I outlined in this post.

Please take a minute to answer these three questions. Your feedback helps direct future articles and resources.

https://docs.google.com/forms/d/e/1FAIpQLSfmAI4bHnnWZYSecowyCiFqkuI3knR8C77jsJaXoMpiU4H2LQ/viewform?usp=sf_link

[i] https://learn.extension.org/events/3007

[ii] www.fivethirtyeight.com

[iii] http://www.alexa.com/siteinfo/fivethirtyeight.com

[iv] https://www.quantcast.com/fivethirtyeight.com

[v] http://www.storytellingwithdata.com

[vi] http://stephanieevergreen.com/

[vii] www.fivethirtyeight.com

[viii] https://www.ers.usda.gov/data-products/chart-gallery/

[ix] https://www.ers.usda.gov/data-products/data-visualizations/

[x] https://hbr.org/2016/06/visualizations-that-really-work

Tags big data, data visualization, fellow, impact


Data Visualization for Extension Professionals: Why Does it Matter?

Post author By Laura Thompson
Post date April 4, 2017

“We face danger whenever information growth outpaces our understanding of how to process it.”[i]

The ability to generate data has greatly increased in recent years, across all sectors, including agriculture. In fact, according to VCloud News, 90% of the world’s data has been created in the last 2 years alone[ii]. This “big data” is harnessed to improve health, save money, and improve efficiencies. In this era of “big data,” challenges lie not only in storing and processing data, but distilling and presenting it so it becomes meaningful and offers insights for our intended audience. Scott Berinato, senior editor at Harvard Business Review, encapsulates this idea in “Visualizations That Really Work”: “Decision making increasingly relies on data, which comes at us with such overwhelming velocity, and in such volume, that we can’t comprehend it without some layer of abstraction.”[iii]

The goal of this post is to discuss how we, as scientists and educators, can present data in clear and concise ways.

Enter data visualization.

What is Data Visualization?

Simply put, data visualization is how we make sense of, and communicate, data.

However, this term can encompass a variety of things and varies by profession – computer programmers, statisticians, graphic designers, business analysts, scientists, journalists, and professional speakers all approach the topic of data visualization differently.

I am not a computer programmer, nor am I a graphic designer. I am a scientist by training, and therefore a practitioner of data visualization. I experiment, and I have much to learn.

I have been convinced of the importance of paying attention to how we visualize data, as much by my own struggles to decipher cluttered, burdensome graphics as by any well-crafted argument. Unfortunately, scientific data is often presented in overly complex charts – charts that make data hard to interpret and consequently remember. This is true for information delivered to both the scientific community and Extension audiences. In fact, it could be argued there is a tendency within the scientific community to over-complicate things, as if making our data more convoluted will impress people with our vast knowledge.

Thankfully, scientific data presentation does not have to be cumbersome and overly complex; effective visualizations can make the message clear and memorable.

Why Should Extension Professionals Worry about Data Visualization?

Intuitively, we know that good information, when poorly communicated, cannot prompt desired behavior change. You can’t act on information you don’t understand – and having information does not equal understanding.

There is research evidence that supports this. Pandey, Manivannan, Nov, Satterthwaite, and Bertini (2014)[iv] tested the assumption that “visualization leads to more persuasive messages” by showing participants data in both chart and table form. When participants didn’t have strong beliefs about a topic, the visual information presented in charts was more persuasive than textual information presented in tables in changing their attitudes. Simply, data visualizations lead to greater impact.

So why is there not more emphasis on this important aspect of how we communicate data?

A quick Google Trends [v] analysis shows a rapid increase in searches for “big data” since 2011, while searches for “data visualization” have stayed relatively flat. Why the lack of interest and emphasis on visualizing our data? Surely as we increase the quantity of data we collect, the need for effective data visualization increases correspondingly, if not faster.

Google Trends analysis of “Data Visualization” and “Big Data”

In Cooperative Extension, our goal is to have impact – for people to make behavior changes as a result of information we share. In order for this to happen, we need to effectively communicate data. Unfortunately, many obstacles get in the way of effective data communication. I believe one of these obstacles is simply ignorance of the fact that data can be communicated poorly.

Lack of awareness and attention to the issue may be partly to blame, but it may not be all our fault. After all, in the past, data visualization has been left to specialists such as data scientists and professional designers. But now, due to enhanced computing capabilities, new software and tools, and the ability to quickly collect and process massive quantities of data, most Extension professionals routinely produce charts and figures – without formal training in data visualization.

As a 2016 eXtension fellow [vi], my goal is to bring awareness and promote discussion of the topic of data visualization. If Extension is to fulfill the mission of bridging the gap between scientists and the public, so the public can act on the information scientists provide, we must communicate data well.

Fortunately, numerous books, videos, podcasts, and blogs are dedicated to the finer points of good data visualization. As a starting point, in my next post, I offer what I consider my top seven elements of good data visualization.

Please take a moment to complete the anonymous survey below. Information submitted will be used to guide my work during this fellowship.

https://docs.google.com/forms/d/e/1FAIpQLScYNJo9TjjiQ4yR7UYPYOZ6cFz0lzBe5hIjO6Qq6mjL9f447g/viewform

Endnotes

[i] Silver, N. (2012). The Signal and the Noise: Why So Many Predictions Fail but Some Don’t. New York: Penguin Press.

[ii] http://www.vcloudnews.com/every-day-big-data-statistics-2-5-quintillion-bytes-of-data-created-daily/

[iii] https://hbr.org/2016/06/visualizations-that-really-work

[iv] http://ieeexplore.ieee.org/xpls/icp.jsp?arnumber=6876023

[v] https://trends.google.com/trends/

[vi] https://www.extension.org/laura-thompson/

Tags big data, data visualization, fellow, impact