Open-Domain Question Answering from Large Text Collections

Many books have indexes, but most textual media have none. Newspapers, legal transcripts, conference proceedings, correspondence, video subtitles, and web pages are increasingly accessible with computers, yet are still without indexes or other sophisticated means of finding the excerpts most relevant to a reader's question. Better than an index, and much better than a keyword search, are the high-precision computerized question-answering systems explored in this book. Marius Pasca presents novel and robust methods for capturing the semantics of natural language questions and for finding the most relevant portions of texts. This research has led to a fully implemented and rigorously evaluated architecture that has produced experimental results showing great promise for the future of Internet search technology.