What is Big Data? How is it analyzed, and how do companies use it? Jason Mills ’95, G’96, who is the Director of Big Data Technology and Analytics at JPMorgan Chase, discussed Big Data at the School of Information Studies recently. His company was one of those affected by the financial crisis that began in 2008. The hope now is that, in part through the use of Big Data, similar situations will not happen again.
What is Big Data?
To some, Big Data is just a buzzword. To Mills, Big Data is “one of the most transformative areas of technology ever.” He defines Big Data as starting at the terabyte level. A terabyte, or 1 trillion bytes, is a huge amount of data – more than most personal computers can hold. Companies that deal with Big Data often hold many terabytes of data.
Big Data is often unstructured: it arrives in formats like emails, pictures, and video rather than in a clean format like an Excel spreadsheet with columns and rows. As a result, handling Big Data requires systems beyond the capabilities of typical database software.
The amount of data will continue to expand, likely exponentially, thanks to all the ways we can now capture data about people, whether through GPS tracking, the Internet of Things, or easy access to communication. Finding ways to make sense of all that data, determine patterns, and then create value based on the findings has enormous potential.
Why work with Big Data?
Mills began his talk by asking the audience to consider:
- what they love,
- what they are good at,
- what they can get paid for,
- and what the world needs.
He says that when people do what they love and what they are good at, they have found their passion. When they do what they are good at and what they can get paid for, they have a profession. When they get paid and do what the world needs, they have a vocation. When they do what the world needs and what they love, they have a mission. When all of those things come together, people have found their purpose. Mills believes that working in Big Data can be that purpose for many people because it has the power to be transformative.
How do companies analyze Big Data?
Because Big Data requires the power of more than just one personal computer, there are programs that allow multiple computers to process the data at the same time, making the analysis faster and more efficient.
Hadoop, an open-source project of the Apache Software Foundation, is one such program. Mills is a supporter of open-source development, saying, “the power of sharing code for free has changed the technology world.”
Using Hadoop allows many companies to analyze large amounts of data in a fraction of the time it would take on a single computer.
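To make the idea of splitting work across machines concrete, here is a minimal sketch in Python of the map-and-reduce pattern that tools like Hadoop popularized. It is not Hadoop's API, and it is not something Mills presented: the function names (`split_into_chunks`, `map_chunk`, `reduce_counts`, `count_words`) and the sample text are hypothetical, and threads on one machine stand in for the separate computers of a real cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(records, n_chunks):
    """Divide the records into roughly equal chunks, one per worker."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def map_chunk(chunk):
    """Map step: each worker counts the words in its own chunk only."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial word counts into one."""
    return left + right


def count_words(records, n_workers=4):
    """Count words by mapping over chunks concurrently, then reducing."""
    chunks = split_into_chunks(records, n_workers)
    # Assumption: threads stand in for the separate machines of a cluster.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    # Hypothetical sample data standing in for unstructured text.
    emails = ["big data is big", "data about data"]
    totals = count_words(emails, n_workers=2)
    print(totals["data"])  # 3
    print(totals["big"])   # 2
```

Each worker sees only its own chunk, so adding workers lets more of the data be processed at the same time. The partial results are then merged into a single answer.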
How can you get hired working with Big Data?
If you believe working with Big Data is your purpose, Mills has some advice about finding a job. First, companies are looking for people who take the initiative to learn about Big Data technologies on their own. Products like Hadoop are relatively new, so there are very few experts. This means there is an opportunity to learn and advance quickly. Mills says that if you learn about Hadoop and successfully build a cluster, you have a good chance of getting hired.
Learning skills like R and Python is “hugely important,” Mills says.
But knowing how to code is not the only way to get into Big Data. Mills has people on his team who are not programmers but who have a “passion for execution,” he says. When it comes to financial crime, exposing illegal practices can be scary. It is important for his employees to know how to organize information clearly and to be able to read a situation quickly.
Because the Big Data industry is constantly evolving, Mills says he attends many conferences and speaks with leaders in other industries that use Big Data, as they have a vision for where things are going. Finally, Mills recommends visiting Silicon Valley and spending time with startups. “You can learn a lot.”