Bei Yu

Associate Professor | PhD Program Director
320 Hinds Hall
Phone: 315.443.3613

My research areas are Natural Language Processing and Computational Social Science. More specifically, my research focuses on using machine learning and natural language processing techniques to improve information quality, organization, and access, especially in the science and health domains. I am particularly interested in linguistic patterns that characterize people’s opinions, emotions, and language styles, and their roles in information representation and sharing in science literature, news, and social media. My most recent work focuses on computational modeling of misinformation in science news. For example, I developed new methods for identifying exaggerated claims in science news by extracting and comparing claims from news articles and research papers. I also collaborate with social scientists to computationally operationalize social science theories and concepts to answer research questions in a variety of domains, such as political science, business, and mass communication. My work has been supported by funding sources including NSF, IMLS, and Microsoft.


Recently funded projects:

2020-2023, PI on NSF SciSIP Grant #1952353, $375,000, for “PreCheck: Understanding Press Release Exaggeration (PRE) of Scientific Research”

2020-2022, PI on Microsoft Investigator Fellowship Award, $200,000, for “Toward Teaching and Research on NLP for Social Good”

2020-2022, PI on CUSE Grant, $29,995, for “Tracking unverified health advice in science communication”

Recent publications from my research team:

Yu, B., Wang, J., Guo, L., and Li, Y. (2020). Measuring Correlation-to-Causation Exaggeration in Press Releases. Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020), 4860-4872.

Zhou, H. and Yu, B. (2020). Information Quality of Reddit Link Posts on Health News. In A. Sundqvist, G. Berget, J. Nolin, and K.I. Skjerdingstad (Eds.), Springer Lecture Notes in Computer Science: Vol. 12051. Sustainable Digital Communities – 15th iConference, 186-197.

Yu, B., Li, Y. and Wang, J. (2019). Detecting Causal Language Use in Science Findings. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 4656-4666.

Li, Y. and Yu, B. (2019). Identifying finding sentences in conclusion subsections of biomedical abstracts. In N. G. Taylor, C. Christian-Lamb, M. H. Martin, and B. Nardi (Eds.), Springer Lecture Notes in Computer Science: Vol. 11420. Information in Contemporary Society – 14th iConference, 679-689.

Yuan, S. and Yu, B. (2019). HClaimE: a tool to identify health claims in health news headlines. Information Processing and Management, 56(4), 1220-1233.

View Experts@Syracuse Profile

View Google Scholar Profile

Courses Taught
  • IST 736 – Text Mining (Professor of Record, 2014-current)
  • IST 707 – Data Analytics (Professor of Record, 2011-2019)
  • IST 700 – Deep Learning, NLP and Computational Social Science
  • IST 777 – Statistical Methods in IST
  • IST 659 – Data Administration Concepts and Database Management
  • IST 552 – Information Systems Analysis
  • IST 359 – Introduction to Database Management Systems
  • IST 800 – Proseminar on Content Analysis
  • IST 800 – Proseminar on Information Retrieval
  • IST 800 – Proseminar on Text as Data