Data at the heart of recovery.

Data science is kind of like a mystery novel. At least that’s what drew recent iSchool alumna Aarthe Jayaprakash ’20 to the discipline.

“You’re following clues, trying to find out who’s the bad guy,” she says. “It’s the same thing in a different context.”

That detective work led Aarthe straight to the winning team for MIT’s COVID-19 Datathon, highlighting how important and powerful data can be in mitigating, and recovering from, crises like the current coronavirus pandemic.

At the onset of the pandemic, data was a critical tool that policymakers and other decision-makers leaned on to understand how the disease was spreading and, more importantly, how to stop it. But the early data was piecemeal, which meant that the response had to be fairly broad. That led to the widespread lockdowns many cities around the country are still enduring.

As more data has become available, scientists have been able to produce more precise analyses that help leaders understand which communities and populations are most at risk, and as a result, be more targeted in their response.

Increase in COVID-19 cases in New York City shown by zip code.

Aarthe and her team used public datasets from the City of New York to identify which neighborhoods are more risk-prone than others, based on demographics as well as socio-economic and mobility factors. Those datasets included income and education levels by zip code, subway stop and ridership figures, city bicycle data, mobility reports, and more.

One of the most powerful uses for data lies in its capacity to help scholars identify root causes. Whether it’s the outcome of a natural disaster or the consequences of public policy, data can validate the effects of particular events and provide valuable insights for guiding future action. Aarthe notes that data analysis has expanded our ability to understand an event and react more quickly and effectively.

Stephen Wallace, Professor of Practice at the iSchool, agrees — especially when it comes to COVID-19. “Data is really at the heart of trying to understand the disease,” he says, “so we can take proactive action in order to contain it.”

Steve teaches courses in applied data science and natural language processing, so he knows a thing or two about building predictive models and using them in real-world applications. He mentions that, when it comes to using data to inform recovery efforts, dissemination is an integral part of the process.

Dissemination is the critical link between information and action, as was proven again during the rapid outbreak of COVID-19. Only after an effective educational dissemination campaign, Steve says, were policymakers able to take action and enact city-wide lockdowns. “If they had tried to do that first, it would have been a disaster.”

That point echoes sentiments Aarthe expresses about the role data plays in informing action. While she leaves specific actions up to decision-makers, she says her team’s goal in the Datathon was to identify the highest-impact target areas. “The idea is to point them in the right direction,” she says.

That also speaks to what she sees as one of the most powerful uses for data science: it allows us to be more efficient and effective in our collective response to crises. “We’re not waiting for a disaster to happen to act,” Aarthe notes. That’s because data generates insights that enable policymakers and the public to make quick decisions in real time.

As for Steve, he takes pride in Americans’ collective response to COVID-19. Early access to data and analysis demonstrated that our health care system would be overwhelmed if we did not act, which drove decisions to shelter in place. That has kept the disease localized and helped limit the magnitude of the spread.

With ever-increasing access to data in the public domain, Aarthe reminds us that anyone can harness the power of data analysis to explore and find patterns, even without a technical background. “You don’t have to know statistics in order to make a discovery,” she says. Given how many tools and resources are available, “you could be self-taught and discover something on your own.”

It was that attitude of humble discovery, like a detective on a mission, that allowed her and her team to win the Datathon and make a lasting contribution to the recovery effort. At the time of writing, Aarthe is working on publishing a co-authored paper about her findings that she hopes will help guide public policy.

“I just wanted to try something out,” she says. “It turned out to be such a valuable experience, and it showed me how integral data will continue to be in helping us recover from this crisis.”