Personal Information Retrieval Visualization (PIRV): Clustering and Visualization of Web Document Search Results

Xiangyang Xu (1), Ernst L. Leiss (1)

e-mails: x228917@yahoo.com, coscel@cs.uh.edu

(1) University of Houston, Department of Computer Science, Houston, TX 77207, United States

Abstract

Conventional web search engines typically return their results as long ranked lists of documents. This text-only presentation has several limitations. Because only part of the list can be shown at a time, users cannot get a complete picture of the returned documents, and after reading the first few items they still do not know whether the rest of the list contains a document of interest. Given the imprecise nature of current web search engines and the explosive growth in the number of available documents, users are forced either to spend a significant amount of time going through the result list or to abandon the search.

In this project, we design and implement a system called PIRV (Personal Information Retrieval Visualization), which dynamically groups search results into clusters and presents these clusters as 2-dimensional graphics. After receiving a query from a user, PIRV sends it to the search engine, receives the returned documents, clusters them according to the similarity values between individual documents, transforms the data into a graphical representation, and displays the graphics to the user. With this visual display, a user can rely on visual perception to evaluate the clusters and make an intuitive judgment about the relevance of the documents without having to read a significant portion of each one. Furthermore, the user's search history is saved on the user's computer upon logout and is automatically retrieved by PIRV at the next login, so that it can assist in future searches. A user can also view previous search results when issuing multiple queries.
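The clustering step in this pipeline can be illustrated with a minimal sketch. The abstract does not state which similarity measure or clustering algorithm PIRV uses, so everything here is an assumption made for illustration only: cosine similarity over bag-of-words term counts, a greedy single-pass ("leader") clustering, the 0.3 threshold, and the hypothetical function names `term_vector`, `cosine_similarity`, and `cluster_results`.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercased bag-of-words term counts for one result snippet."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_results(snippets, threshold=0.3):
    """Greedy single-pass clustering (an assumed stand-in, not PIRV's method).

    Each snippet joins the first cluster whose first member (the "leader")
    is at least `threshold` similar to it; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of snippet indices.
    """
    vectors = [term_vector(s) for s in snippets]
    clusters = []
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine_similarity(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            # No existing leader was similar enough: start a new cluster.
            clusters.append([i])
    return clusters


# Made-up snippets standing in for documents returned by a search engine.
results = [
    "jaguar car dealer prices",
    "jaguar car review and prices",
    "jaguar big cat habitat",
    "big cat habitat conservation",
]
clusters = cluster_results(results)
print(clusters)
```

On these four made-up snippets the sketch separates the car-related results from the animal-related ones. A visualization layer in the spirit of PIRV would then map each cluster to a 2-dimensional graphic; that part is not sketched here because the abstract gives no detail about the drawing method.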

Keywords: Internet Search, Clustering of Results, Visualization


BibTeX

@INPROCEEDINGS{xu04:38,
                  AUTHOR       = {Xiangyang Xu and Ernst L. Leiss},
                  TITLE        = {Personal Information Retrieval Visualization (PIRV): Clustering and Visualization of Web Document Search Results},
                  BOOKTITLE    = {30ma Conferencia Latinoamericana de Informática (CLEI2004)},
                  YEAR         = {2004},
                  editor       = {Mauricio Solar and David Fernández-Baca and Ernesto Cuadros-Vargas},
                  pages        = {105--116},
                  address      = {},
                  month        = Sep,
                  organization = {Sociedad Peruana de Computación},
                  note         = {ISBN 9972-9876-2-0},
                  file         = {http://clei2004.spc.org.pe/es/html/pdfs/38.pdf}
}
