When scientists and researchers apply to the National Science Foundation (NSF) for a grant, the agency is mandated to ensure that any research it funds is new and transformative in its field.

While this is an admirable goal in theory, in practice it is often very difficult for grant writers and researchers to know which fields and topics have not yet been covered and funded, which makes writing and researching a grant proposal a daunting task.

School of Information Studies (iSchool) faculty member Daniel Acuna hopes to change this with his work under a recently awarded NSF EAGER grant entitled “Improving grant reviewing and scientific innovation by linking funding and scholarly literature.”

The grant is part of an effort by the NSF’s Science of Science and Innovation Policy (SciSIP) program to use data-driven approaches to accelerate knowledge discovery and inform policy.

EAGER (“Early-concept Grants for Exploratory Research”) funding is awarded to researchers to support exploratory work in its early stages on untested, but potentially transformative, research ideas or approaches.

Acuna proposes to use artificial intelligence (A.I.) to merge several open datasets of publications and grants into a large, unified new dataset. He also proposes a search tool, built on the new dataset, to help the NSF and grant-seeking scientists evaluate proposals and find gaps in knowledge.

“This is basically a recommendation system for both researchers and NSF program officers,” explained Acuna. “It will help them find out what types of research have been proposed and funded across certain areas.”

“From my own experience in trying to submit this proposal, for example, it was very time consuming to go through several websites and sources to find out if researchers were working on such a thing, or if grants had already been distributed in a similar line of research,” said Acuna. 

Publications and research documents will be sourced from Microsoft Academic Graph, which covers nearly all fields of science; MEDLINE for biomedical literature; arXiv, which covers physics, mathematics, computer science, biology, finance and statistics; and the National Bureau of Economic Research.

Data on funding will come from the U.S. Department of Health and Human Services’ Federal RePORTER, containing nearly 3 million scientific awards from 14 federal agencies.

“Once we have this dataset established, it will enable more research and funding on scientific topics that might have otherwise been overlooked,” said Acuna. “And not only will we have the search tool, but I plan to make the dataset itself publicly available for researchers to study to find gaps or trends in sponsored research.”

Acuna also hopes to standardize the data across the various sources, making sure that records for the same researcher, institution, and funding agency map correctly across different databases. 
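
As a rough illustration of what this kind of standardization involves, the sketch below groups records from two sources when their names reduce to the same normalized key. It is not Acuna's method, which the article does not describe; the function names, the source labels, and the sample records are all hypothetical, and a real system would need far more than string normalization to match researchers and institutions reliably.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Reduce a name to a comparable key (hypothetical helper).

    Strips accents, lowercases, replaces punctuation with spaces,
    and collapses runs of whitespace.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", without_accents.lower())
    return " ".join(cleaned.split())


def link_records(records):
    """Group (source, raw_name) pairs that share a normalized key."""
    groups = defaultdict(list)
    for source, raw_name in records:
        groups[normalize_name(raw_name)].append((source, raw_name))
    return dict(groups)


# Made-up sample records; "publications" and "grants" are assumed labels.
records = [
    ("publications", "Syracuse University"),
    ("grants", "SYRACUSE  UNIVERSITY"),
    ("publications", "Université de Montréal"),
    ("grants", "Universite de Montreal"),
]

linked = link_records(records)
for key, matches in linked.items():
    print(key, "->", matches)
```

Exact matching on a normalized key is the simplest possible choice. It misses abbreviations and name variants, which is part of why mapping the same researcher or institution across databases is a research problem in its own right.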

When he talks to his colleagues about plans for the research database, Acuna says their reactions have been positive. “They think it will be very useful,” he said. “It resonates with them because of NSF’s emphasis on funding new research, and they need to know if their proposals fit this guideline.”

The project has received interest from both Acuna’s research colleagues and program officers at the NSF.

Acuna is the principal investigator on the grant, and the two-year award totals nearly $170,000. He plans to spend the first year building the infrastructure for the project and combining the datasets, and the second year building the website and search interface.

Collaborating with Acuna are iSchool Associate Professor Bei Yu, Konrad Kording (Northwestern University) and James Evans (University of Chicago).