CNLP and IST are jointly hosting a presentation by Andres Corrada-Emmanuel
February 13, 2004
Center for Science and Technology Room 4-201
by Andres Corrada-Emmanuel
Question answering research is enjoying a renaissance due to increased government interest in furthering the ability of analysts to obtain "answers" from a wide range of sources. While the leading systems in this research effort employ NLP techniques that actually try to understand the questions posed by a user, statistical approaches remain useful because they trade some performance for ease of implementation. In addition, current question answering systems typically contain an initial document or passage retrieval phase. Further natural language processing is then used to rank the most likely answer passages or to extract exact answers to a user's question. Statistical methods can be used to increase the accuracy of this initial retrieval. We will discuss how a bigram model can be incorporated to increase the accuracy of the passage retrieval phase. The model is trained on correct answers that have been annotated to identify named entities such as 'person', 'location', etc. Performance results will be presented using the 'factoid' questions of the TREC 2002 QA track.
Brief Biography
Dr. Andres Corrada-Emmanuel was trained as a physicist at Harvard and the University of Massachusetts at Amherst. He began a career in natural language processing when he joined the Research Department of Dragon Systems, a speech recognition company that counted his favorite undergraduate physics teacher among its founders. At Dragon he worked on the Large Vocabulary Continuous Speech Recognition (LVCSR) government initiative and on speaker identification. His work on speaker identification resulted in a state-of-the-art algorithm that has been patented. He now does research on statistical question answering at the Center for Intelligent Information Retrieval at UMASS/Amherst.