Big data. It’s a term we hear constantly in our workplace. But do we really know what big data is and how dynamic it can be? What is it used for and how does it impact academic libraries? Lots of questions, I know, but I’ll attempt to describe, both visually and verbally, what big data is.
Big data is built by recording the “what and where” of an object at consecutive moments in time and adding those records to an accumulating set of data on many objects. Put another way, it means measuring how multiple objects change from moment to moment, compiling those measurements into a data set in real time, and archiving all the results.
In some respects, the concept of big data is easier to understand if you can visualize it. What can it do and what are some of the real-world practical applications of big data? Seeing it in action will help explain how REALLY BIG big data can get and how powerful it can be.
How to think about big data
Think of a data set as a box. Data collected by sensors, either at prescribed moments in time or as a constant stream, is sent to the box. When a user sends a query to the data set, software reads the data and sends back visual outputs. As sensors in the field keep gathering data, the data set grows and has more and more information with which to answer users’ queries. And on it goes. The data set just gets bigger and bigger, all the while gathering and processing more data to feed back to the user in real time, since the data itself is gathered in real time.
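For readers who like to see an idea in code, here is a minimal Python sketch of the “box.” Everything in it is an assumption made for illustration: the names Reading and DataBox, the fields, and the sample bus readings are all made up, and a real big data system would use distributed storage rather than a list in memory.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Reading:
    """One captured moment: which object, when, where, and what was measured."""
    object_id: str
    timestamp: float  # seconds since an arbitrary starting point
    lat: float
    lon: float
    value: float      # the measurement, e.g. speed or heart rate


class DataBox:
    """Hypothetical in-memory stand-in for the 'box' that sensors feed."""

    def __init__(self) -> None:
        self._readings: List[Reading] = []

    def ingest(self, reading: Reading) -> None:
        # Readings are only ever appended, so the box keeps growing.
        self._readings.append(reading)

    def size(self) -> int:
        return len(self._readings)

    def query(self, object_id: str, since: float = 0.0) -> List[Reading]:
        # Answer a user's question from everything collected so far.
        return [r for r in self._readings
                if r.object_id == object_id and r.timestamp >= since]


# Made-up sensor readings from two city buses.
box = DataBox()
box.ingest(Reading("bus-12", 0.0, 43.0481, -76.1474, 0.0))
box.ingest(Reading("bus-12", 60.0, 43.0490, -76.1460, 18.5))
box.ingest(Reading("bus-40", 60.0, 43.0377, -76.1340, 22.0))

recent = box.query("bus-12", since=30.0)
print(box.size(), len(recent))
```

The point of the sketch is the shape of the process: sensors only ever add to the box, and every query is answered from whatever has accumulated so far.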
One key aspect of big data is the location component gathered by the sensors, which enables the data to be charted, mapped or visualized. So how can we benefit from visualizing or mapping this data? Decision makers can see events happen in real time. Time is the critical element in using big data, and as more devices come online, more data will be created and streamed to the data set.
So what are these sensors that gather data for the data set? We carry them around with us every day. A sensor is any device that has GPS capabilities, like your cell phone, smartphone, tablet, computer or camera; any item with an RFID tag; and wearable devices like smart watches, Fitbits and clothing with embedded sensors. The list goes on and on.
It’s predicted that by 2020 there will be 50 billion connected devices on Earth gathering information. (1) Imagine for one moment if everyone in the US wore a Fitbit. That would mean 324 million Fitbits sending vital signs every minute to a central data site to become part of a “Big Data” set that gathers and aggregates everyone’s vitals to produce a snapshot of the country’s health at any given moment in time. The data could also be aggregated by state, region or perhaps census tract to see which areas of the country are healthier than others.
This information could then be used to inform the CDC or Department of Health through charts, graphs or maps as tools for visualizing big data. Other examples could include tracking the occurrence and movement of outbreaks of the Zika virus, or showing where there are concentrations of flu within a specific city, state or region and how it migrates across neighborhoods or from town to town over time.
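To get a feel for the scale of the Fitbit scenario, and for what “aggregating by state” might look like, here is a small Python sketch. The device count and the once-a-minute rate come from the scenario above; the heart-rate readings and the function name average_by_state are made up purely for illustration.

```python
from collections import defaultdict
from statistics import mean

# Back-of-the-envelope volume: one reading per device per minute.
DEVICES = 324_000_000
MINUTES_PER_DAY = 24 * 60
readings_per_day = DEVICES * MINUTES_PER_DAY

# Made-up sample readings: (state, resting heart rate in beats per minute).
sample = [("NY", 62), ("NY", 70), ("CA", 58), ("CA", 66), ("TX", 75)]


def average_by_state(readings):
    """Group readings by state and average each group."""
    grouped = defaultdict(list)
    for state, bpm in readings:
        grouped[state].append(bpm)
    return {state: mean(values) for state, values in grouped.items()}


averages = average_by_state(sample)
print(f"{readings_per_day:,} readings per day")
print(averages)
```

Even at one reading per person per minute, the total works out to hundreds of billions of readings a day, which is why the data has to be summarized into charts and maps before anyone can make sense of it.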
Big Data Examples
Let’s look at some real-life examples. Below are three websites that allow you to interact with, use or visualize big data:
We’ll stick with the first example, “Flight Radar 24,” since I think it does a great job of showing what big data can do and visualizes it very well. This site also has Android and Apple apps for your phone or tablet, so you can connect to big data wherever you go.
At first glance, this website will display a multitude of little yellow (real-time) and orange (5-minute delay) airplanes. If you wait a few seconds you will see them move. Zoom in using the “+” button and you will see the planes moving steadily across the map.
This website runs on data streamed in real time from each plane’s GPS unit and transponder. Every time a plane moves, it sends data back to be stored and added to the data set. If you click on a plane (Figure 1), the site will show you the path from its point of origin to its current location. This is the streamed data being pulled from storage and displayed as the path the plane has taken so far. In the left-hand column you will also notice much more information about that plane, such as its current altitude and speed, all refreshed in real time.
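The idea of pulling a plane’s path out of storage can be sketched in a few lines of Python. This is not how the website itself is built; it is an illustration under my own assumptions, and the flight codes, coordinates and the function name track are all made up.

```python
from typing import List, NamedTuple


class Position(NamedTuple):
    """One stored position report from a plane."""
    flight: str
    timestamp: int    # seconds since departure, for simplicity
    lat: float
    lon: float
    altitude_ft: int


def track(reports: List[Position], flight: str) -> List[Position]:
    """Return one flight's stored positions in time order (its path so far)."""
    return sorted((p for p in reports if p.flight == flight),
                  key=lambda p: p.timestamp)


# Made-up reports, arriving out of order as streamed data often does.
archive = [
    Position("AB123", 120, 43.20, -76.00, 9000),
    Position("CD456", 60, 40.70, -73.90, 5000),
    Position("AB123", 0, 43.11, -76.11, 400),
    Position("AB123", 60, 43.15, -76.05, 4500),
]

path = track(archive, "AB123")
current = path[-1]  # the most recent report is the plane's current position
print([(p.lat, p.lon) for p in path], current.altitude_ft)
```

Drawing the path on a map is then a matter of connecting those points in order, and the latest report supplies the altitude and speed shown beside the map.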
But wait! There’s more. We can take this visualization one step further.
Click on the “3D” button just below the picture of the plane (Figure 2). You will be sent flying, virtually of course, alongside the plane as it moves over the landscape. Hold down the left mouse button and move your mouse around, and your visual relationship to the plane will change. If you click the playback button, you can search for and see the last seven days of flight paths for all flights. This is all being retrieved from the site’s big data archive.
You can replay any previous flight and see where it traveled. This should give you a sense of the power of big data and the capabilities of 3D visualization. The “How it Works” page has details and metadata about the data collected and how it’s used.
So what other uses are there? Big data can be used to model or create smart cities, manage city infrastructure, analyze and schedule city bus systems, assist with fire, safety and police operations, manage retail or medical practices, and support building automation and security. There are so many uses.
In short, big data is used to produce results that can assist with operational processes, or that help answer questions through visual displays that let users conduct analyses, draw conclusions, or monitor the progress of an operation or situation.
Big data in libraries
So how do we inform our patrons about big data and how to find and use it? How do we recognize it ourselves to be able to transfer that information to the patron?
Big data isn’t as out in the open as other types of data we’re used to using. It mostly runs behind the scenes, so it’s not obvious. It’s powering websites and search engines that we may already use. Only when we understand what big data is and how it’s used can we guide and educate our patrons. I hope the examples listed above serve as starting points and give you a visual understanding of which elements of big data to look for.
So where is big data taking us? I can only guess that it’s to a better place or maybe it’s time to jump on that plane used in the example.
ESRI 2016 Users Conference session on big data, Real-Time and Big Data: Real-Time GIS: The Internet of Things
A series of good articles on Data and Data visualization.
Editor’s note: This article first appeared on The Syracuse Libraries Research & Scholarship blog and is republished here with permission.