Computational Intelligence, 23, pp 3-14, 2007.
Department of Computer Science, University of Texas at El Paso
Abstract: When one tries to use the Web as a dictionary or encyclopedia, entering a single term into a search engine, the highly ranked pages in the results can include irrelevant or useless sites. The problem is that single-term queries, if taken literally, underspecify the type of page the user wants. For such problems, automatic query expansion, also known as pseudo-feedback, is often effective. In this method, the top n documents returned by an initial retrieval are used to provide terms for a second retrieval. This paper contributes, first, new normalization techniques for query expansion, and second, a new way of computing the similarity between an expanded query and a document, the "local relevance density" metric, which complements the standard vector product metric. Both techniques are shown to be useful for single-term queries in Japanese, in experiments conducted over the World Wide Web in early 2001.
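As a rough illustration of the pseudo-feedback loop described in the abstract, the following is a minimal sketch, not the paper's method. It assumes whitespace tokenization, a plain vector product between query weights and raw term counts, and a simple relative-frequency weighting for expansion terms; the function names and the parameters `n` and `k` are hypothetical. The paper's normalization techniques and the "local relevance density" metric are not reproduced here.

```python
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization; Japanese text would need a real segmenter.
    return text.lower().split()


def score(query_weights, doc_tokens):
    # Vector product of the query weights with the document's raw term counts.
    counts = Counter(doc_tokens)
    return sum(w * counts[t] for t, w in query_weights.items())


def retrieve(query_weights, docs):
    # Rank documents by descending score (ties broken by index); drop zero scores.
    scored = [(score(query_weights, tokenize(d)), i) for i, d in enumerate(docs)]
    ranked = sorted(scored, key=lambda p: (-p[0], p[1]))
    return [i for s, i in ranked if s > 0]


def expand_query(term, docs, n=2, k=3):
    # Initial retrieval with the single term, keeping the top n documents.
    top = retrieve({term: 1.0}, docs)[:n]
    # Pool term counts from those documents, excluding the original term.
    pooled = Counter()
    for i in top:
        pooled.update(tokenize(docs[i]))
    pooled.pop(term, None)
    total = sum(pooled.values()) or 1
    # Hypothetical weighting: relative frequency within the pooled documents.
    expanded = {term: 1.0}
    for t, c in pooled.most_common(k):
        expanded[t] = c / total
    return expanded


docs = ["jaguar cat habitat", "jaguar cat prey", "jaguar car dealer", "cat food"]
expanded = expand_query("jaguar", docs, n=2, k=1)
second_pass = retrieve(expanded, docs)
```

Here the second retrieval uses the expanded query, so documents sharing the pooled terms score higher than those matching the original term alone. The toy corpus is invented for illustration only.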
Nigel Ward's Publications