An Introduction to Data Visualization And Its Tools

Author’s Note – This post is adapted from a presentation given at the iSchool at Syracuse University on February 6, 2015. You can download those presentation slides here.

With the rise of the interactive web and our access to large sets of open information, data visualization has become a hot topic in recent years. Yet, we have been using data visualization for centuries. If you have ever created a graph in Excel, congratulations, you have at least dabbled in the visualization process.

Today, data visualization is evolving rapidly. We have access to an ever-growing library of tools to produce compelling, interactive graphics. Here, we’ll dive into the history of data visualization and some of the key tools needed to build your own graphics.

The Evolution of Data Visualization

Visualization is an extension of an innate human trait: we are visual creatures and we often use graphics to make sense of large sets of information. Modern data visualization, however, can be traced back to a man named William Playfair. In the late 1700s and early 1800s, Playfair introduced four tools still used to this day: the bar graph, the line graph, the circle graph, and the pie chart.

Playfair created the foundations for graphical statistics, which in turn helped power the Industrial Revolution. Throughout the 19th century, innovators used hand-drawn visualizations to make sense of their own technological progress.

Fast forward to the 20th century, and accessible computer software gave the power to visualize large sets of data to anyone with a computer. Remember, Excel was once (and arguably still is) a novel, powerful tool.

Today, the interactive web has led to a host of new visualization tools. We can now build visualizations that are dynamic and customizable, all atop growing sets of data.

The Data Visualization Process

At its core, data visualization is a process that requires patience and persistence. I like to take the many phases that go into successful visualizations and break them into two large categories: analysis and presentation.

In the analysis phase, we gather our data, clean up a likely messy dataset, and try to make sense of the numbers, text, or whatever else is in front of us. In the presentation phase, we actually visualize our analyzed dataset, taking care to design our final product.
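To make the analysis phase concrete, here is a minimal sketch in Python using only the standard library. The station names, the snowfall_in column, and every number are made up for illustration; the point is just the gather, clean, summarize sequence, not any particular dataset or toolchain.

```python
import csv
import io
import statistics

# Hypothetical messy input: stray whitespace, a blank value,
# a duplicate row, and a non-numeric placeholder.
RAW = """station,snowfall_in
Station A, 120.5
Station B,95
Station C,
Station A, 120.5
Station D,n/a
"""


def clean(raw_text):
    """Return {station: snowfall}, dropping blanks, placeholders, and duplicates."""
    cleaned = {}
    for row in csv.DictReader(io.StringIO(raw_text)):
        station = row["station"].strip()
        value = (row["snowfall_in"] or "").strip()
        try:
            # Repeated stations collapse into a single dictionary entry.
            cleaned[station] = float(value)
        except ValueError:
            continue  # skip blanks and placeholders such as "n/a"
    return cleaned


data = clean(RAW)
summary = {
    "count": len(data),
    "mean": statistics.mean(data.values()),
    "max_station": max(data, key=data.get),
}
print(data)
print(summary)
```

Only two of the five rows survive cleaning here, which is a useful reminder to check how much of a dataset is actually usable before designing anything around it.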

Throughout, it is imperative that you ask questions, recognize your context, and tell a story. It need not be profound or serious, but it is important to present a visualization in a way that is understandable. The most compelling visualizations have a mission.

Tools Highlight

There are many great visualization tools available. Several fully packaged tools exist that largely operate from a graphical interface, each having a range of capabilities. These include Excel, SPSS, and Tableau. The most advanced tools often involve writing code. I highlight two of my favorites below.

LEFT – A PDF output straight from R; RIGHT – A final rendering edited in Illustrator.

R is a statistical programming environment geared towards data analysis; it also has some powerful graphical capabilities built in. With the short R script hosted here, we can easily render a bar graph with axes, titles, and colors. We can then import this PDF output into Adobe Illustrator, and work with R’s generated layers to further style our visualization. R and Illustrator work together well to produce clean static visualizations.
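For a rough feel of what a "bar graph with axes, titles, and colors" involves, here is a sketch of the same idea. This is not the R script linked above: it uses Python's standard library, writes SVG rather than PDF, and the function name bar_chart_svg, the sizes, and the snowfall values are all assumptions made for illustration.

```python
from xml.sax.saxutils import escape


def bar_chart_svg(values, title, color="steelblue", width=400, height=240):
    """Render a {label: number} mapping as an SVG bar chart string.

    Assumes non-negative values with at least one value above zero.
    """
    if not values or max(values.values()) <= 0:
        raise ValueError("need at least one positive value")
    margin = 40  # room for the title above and the labels below
    plot_h = height - 2 * margin
    slot_w = (width - 2 * margin) / len(values)
    top = max(values.values())
    base_y = height - margin
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">',
        f'<text x="{width / 2}" y="20" text-anchor="middle">{escape(title)}</text>',
    ]
    for i, (label, value) in enumerate(values.items()):
        bar_h = plot_h * value / top  # the tallest bar fills the plot area
        x = margin + i * slot_w
        y = base_y - bar_h  # SVG's y axis grows downward
        parts.append(
            f'<rect x="{x:.1f}" y="{y:.1f}" width="{slot_w * 0.8:.1f}" '
            f'height="{bar_h:.1f}" fill="{color}"/>'
        )
        parts.append(
            f'<text x="{x + slot_w * 0.4:.1f}" y="{base_y + 15}" '
            f'text-anchor="middle">{escape(label)}</text>'
        )
    # A single baseline stands in for the x axis.
    parts.append(
        f'<line x1="{margin}" y1="{base_y}" x2="{width - margin}" '
        f'y2="{base_y}" stroke="black"/>'
    )
    parts.append("</svg>")
    return "\n".join(parts)


# Made-up monthly totals, purely for illustration.
svg = bar_chart_svg({"Dec": 30, "Jan": 45, "Feb": 60}, "Monthly snowfall (made-up values)")
print(svg)
```

Saving the printed string to a file with an .svg extension gives a vector graphic that can be restyled in a vector editor, much like the PDF-to-Illustrator workflow described above.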

D3, a visualization library for JavaScript, is one of the leading tools for interactive web-based visualization. It takes more technical work to produce the base elements that R handles for us behind the scenes, but D3 itself is a powerful JavaScript library.

In this first example, we recreate the snow totals bar graph, but this time with some added interactivity. In our second example, we've created an interactive map of drought patterns in the United States. Notice how our jump from bar graph to large heat map took only a few extra lines of code? That's the power of D3 at work!
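One of the base elements you build yourself in D3 is the scale: a function that maps data values onto pixel positions. D3 provides this through d3.scaleLinear; the sketch below is not D3 code but a Python approximation of the same idea, and the name linear_scale and the numbers are assumptions for illustration.

```python
def linear_scale(domain, pixel_range):
    """Return a function mapping values in `domain` linearly onto `pixel_range`."""
    d0, d1 = domain
    r0, r1 = pixel_range
    if d0 == d1:
        raise ValueError("domain must span more than a single value")

    def scale(value):
        t = (value - d0) / (d1 - d0)  # position within the domain, 0 to 1
        return r0 + t * (r1 - r0)

    return scale


# The pixel range is inverted because screen y coordinates grow downward,
# so larger data values must land nearer the top of the chart.
y = linear_scale((0, 60), (200, 40))
print(y(0), y(30), y(60))
```

Once a scale like this exists, the same function can position bars, axis ticks, and labels, which is part of why moving from one chart type to another can take relatively little extra code.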

Next Steps

Mastering the theories behind data visualization is arguably just as important as knowing the key toolsets. Work to build your design process and do not try to rush through your first tutorials.

In turn, make sure you pick the right tools for your case.  Each tool, from basic Excel graphs to advanced R statistical analysis outputs, serves a purpose. To learn more about R, check out DataCamp’s quick-start tutorial. Interested in D3? Check out Scott Murray’s fantastic beginner’s guide.

And finally, if you’re in the Syracuse area, join our local visualization community. The Syracuse InfoVis Meetup will be hosting its kickoff meeting on Tuesday, March 3 at 6 p.m. and we want to see you there.

Happy visualizing!

Billy Ceskavich

Billy is an Engagement Fellow and Master's candidate in the iSchool's Information Management program. He's an avid fan of consumer technology and the startup space, having worked in the industry both in Syracuse and Silicon Valley. He's active on Twitter (@ceskavich) or can be contacted directly via email (bceskavich@gmail.com).
