|
|
 |
|
Viewing report
|
|
 |
 |
Folding-up. Edition No. 1
VDM Publishing House, July 2008, Pages: 112
Latent semantic indexing (LSI) is an information retrieval method that represents a dataset as a term-document matrix. LSI uses a matrix factorization method known as the partial singular value decomposition (PSVD) to reduce noise in the data by projecting the term-document matrix into a lower-dimensional vector space. Calculating the PSVD of a large term-document matrix is computationally expensive. In a rapidly expanding environment, a term-document matrix is altered often as new documents and terms are added. Recomputing the PSVD of the term-document matrix each time changes occur can be too expensive. Folding-in is one method of adding new documents or terms to an LSI database; updating the PSVD of the database is another method. Folding-in is computationally inexpensive, but may cause a loss of accuracy in the PSVD. PSVD-updating is more expensive, but maintains the PSVD accuracy. We introduce folding-up, a new method that combines folding-in and PSVD-updating. Folding-up offers a significant improvement in computation time compared with recomputing the PSVD or PSVD-updating, while avoiding the degradation in the PSVD that can occur when the folding-in method alone is used.
|
 |
|
|