WordSieve: A Method for Real-Time Context Extraction

Bauer, T. and Leake, D., WordSieve: A Method for Real-Time Context Extraction. Modeling and Using Context: Proceedings of the Third International and Interdisciplinary Conference, Context 2001, Springer-Verlag, 2001.


In order to be useful, intelligent information retrieval agents must provide their users with context-relevant information. This paper presents WordSieve, an algorithm for automatically extracting information about the context in which documents are consulted during web browsing. Using information extracted from the stream of documents consulted by the user, WordSieve automatically builds context profiles which differentiate sets of documents that users tend to access in groups. These profiles are used in a research-aiding system to index documents consulted in the current context and pro-actively suggest them to users in similar future contexts. In initial experiments on the capability to match documents to the task contexts in which they were consulted, WordSieve indexing outperformed indexing based on Term Frequency/Inverse Document Frequency (TF-IDF), a common document indexing approach for intelligent agents in information retrieval.
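For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of one common Term Frequency/Inverse Document Frequency weighting. It is not the WordSieve algorithm and not the exact baseline used in the paper's experiments. The function name tf_idf, the whitespace tokenizer, the particular formula (relative term frequency times the natural log of N/df), and the sample documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document (hypothetical helper).

    Assumed weighting: (count / document length) * ln(N / document frequency).
    Other TF-IDF variants exist; this is only one textbook form.
    """
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Made-up sample documents, for illustration only.
docs = [
    "context extraction for web browsing",
    "web browsing agents suggest documents",
    "genetic algorithms for scheduling",
]
weights = tf_idf(docs)
```

In this sketch, a term that occurs in only one document (such as "context") receives a higher weight than a term shared by several documents (such as "web"), and a term that appears in every document receives a weight of zero. The weights depend only on the document collection, not on the order or grouping in which a user consults the documents.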

See http://www.cs.indiana.edu/~leake/INDEX.html for additional publications in the Artificial Intelligence/Cognitive Science report and reprint archive maintained by David Leake.