Scanning Compressed Full Text Files
DOI: https://doi.org/10.29173/cais748
Abstract
From the 1994 CAIS Conference:
The Information Industry in Transition
McGill University, Montreal, Quebec. May 25 - 27, 1994.
In this paper we discuss an application of compression whose goal is not to reduce disk space but to extend full text scan procedures to larger text files in on-line search environments. We present an alternative to inverted file generation for accessing full text data files of medium size (50-200 Mbytes), for which the cost of generating a full inverted file is not warranted. Full scan techniques, which are often useful in interactive situations for small files, become unacceptably slow for files above 10 Mbytes, so using compression to reduce the quantity of data scanned is an attractive alternative. Furthermore, an index that reduces search time further, to an acceptable 1-2 seconds, can be generated as a byproduct of the compression.
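The abstract does not say which compression scheme is used, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes a word-level dictionary coding in which each distinct word is replaced by an integer code, so that a scan compares short codes rather than strings, and it records, as a byproduct of compressing, which fixed-size blocks contain each code. The names compress, scan, indexed_scan, and BLOCK_SIZE are hypothetical and chosen for illustration.

```python
import re
from array import array

BLOCK_SIZE = 4  # hypothetical: tokens per block, kept tiny for illustration


def compress(text, block_size=BLOCK_SIZE):
    """Replace each word with an integer code and note the blocks it occurs in."""
    vocab = {}          # word -> integer code
    codes = array("I")  # the compressed token stream
    index = {}          # code -> ascending block numbers (byproduct of compression)
    for pos, word in enumerate(re.findall(r"\w+", text.lower())):
        code = vocab.setdefault(word, len(vocab))
        codes.append(code)
        block = pos // block_size
        blocks = index.setdefault(code, [])
        if not blocks or blocks[-1] != block:
            blocks.append(block)
    return vocab, codes, index


def scan(vocab, codes, word):
    """Full scan of the compressed stream, comparing integer codes."""
    code = vocab.get(word.lower())
    if code is None:
        return []  # a word absent from the vocabulary cannot occur in the text
    return [pos for pos, c in enumerate(codes) if c == code]


def indexed_scan(vocab, codes, index, word, block_size=BLOCK_SIZE):
    """Scan only the blocks that the byproduct index lists for the word."""
    code = vocab.get(word.lower())
    if code is None:
        return []
    hits = []
    for block in index[code]:
        start = block * block_size
        for pos in range(start, min(start + block_size, len(codes))):
            if codes[pos] == code:
                hits.append(pos)
    return hits


vocab, codes, index = compress("the cat sat on the mat and the dog sat too")
print(scan(vocab, codes, "sat"))                 # token positions of "sat"
print(indexed_scan(vocab, codes, index, "sat"))  # same positions, fewer blocks read
```

In this sketch both functions return the same token positions; the indexed variant simply skips blocks that cannot contain the query word, which is the kind of saving a block-level index would provide on a larger file.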