
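An Analysis of LSI Text Mining Techniques
Cite this article: CAI Hao-yuan. An Analysis of LSI Text Mining Techniques[J]. Journal of Library and Information Sciences in Agriculture, 2016, 28(7): 5-9.
Author: CAI Hao-yuan
Affiliation: Guangzhou Library, Guangzhou, Guangdong 510623
Abstract: This paper introduces the use of latent semantic indexing (LSI) in information retrieval. It describes three methods of term weighting, analyzes the role of the singular value decomposition (SVD) of a matrix in extracting the matrix's important information, and shows how a reduced-rank approximation of the term-document matrix simulates the way humans understand semantics. It compares the search algorithms of the vector space model and LSI and, through an example of text mining on a term-document matrix, points out the role LSI plays in analyzing the intrinsic connections between documents.

Keywords: latent semantic indexing; text mining; vector space model; singular value decomposition
Received: 2016-01-16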

Analysis of the Latent Semantic Indexing Text Mining Method
CAI Hao-yuan. Analysis of the Latent Semantic Indexing Text Mining Method[J]. Journal of Library and Information Sciences in Agriculture, 2016, 28(7): 5-9.
Author: CAI Hao-yuan
Institution: Guangzhou Library, Guangzhou, Guangdong 510623, China
Abstract: This paper introduces the application of latent semantic indexing (LSI) in information retrieval. It presents three ways to calculate term weights, analyzes the role of singular value decomposition (SVD) in capturing the important information in a matrix, and shows how a reduced-rank approximation of the term-document matrix simulates the process by which humans understand meaning. It compares the search algorithms of the vector space model (VSM) and LSI and, through a text mining example on a term-document matrix, indicates how LSI helps analyze the connections between documents.
Keywords: latent semantic indexing; text mining; vector space model; singular value decomposition
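
The pipeline summarized in the abstract (weight the term-document matrix, take its SVD, keep a reduced-rank approximation, then compare VSM and LSI search) can be sketched as follows. This is a minimal illustration and not the paper's own example: the toy corpus, the choice of tf-idf as the weighting scheme, the rank k = 2, and all function names are assumptions made here for demonstration.

```python
import numpy as np

# Toy corpus (hypothetical, not taken from the paper).
docs = [
    "car engine repair",
    "automobile engine maintenance",
    "car automobile dealer",
    "apple banana fruit salad",
    "banana apple juice",
]
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}


def term_document_matrix(texts):
    """Raw term counts: rows are terms, columns are documents."""
    A = np.zeros((len(vocab), len(texts)))
    for j, text in enumerate(texts):
        for w in text.split():
            if w in index:  # words outside the vocabulary are ignored
                A[index[w], j] += 1.0
    return A


counts = term_document_matrix(docs)

# Term weighting. tf-idf is assumed here; the abstract mentions three
# weighting methods but does not name them.
df = np.count_nonzero(counts, axis=1)   # document frequency per term
idf = np.log(len(docs) / df)
A = counts * idf[:, None]

# SVD and rank-k truncation: A_k = U_k S_k V_k^T.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                    # assumed rank for this toy example
Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]
A_k = Uk @ np.diag(sk) @ Vtk


def cosine(u, v):
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def vsm_scores(query):
    """Cosine similarity in the original weighted term space (VSM)."""
    q = term_document_matrix([query])[:, 0] * idf
    return [cosine(q, A[:, j]) for j in range(A.shape[1])]


def lsi_scores(query):
    """Cosine similarity in the k-dimensional latent space (LSI)."""
    q = term_document_matrix([query])[:, 0] * idf
    q_k = (Uk.T @ q) / sk                # fold the query in: S_k^{-1} U_k^T q
    doc_k = Vtk.T                        # one row of coordinates per document
    return [cosine(q_k, doc_k[j]) for j in range(doc_k.shape[0])]


if __name__ == "__main__":
    print("VSM:", np.round(vsm_scores("car"), 3))
    print("LSI:", np.round(lsi_scores("car"), 3))
```

In this toy setup the second document never contains the word "car", so its VSM score for the query "car" is zero, whereas the rank-2 LSI space places it alongside the other car-related documents. That difference is the kind of latent connection between documents that the abstract attributes to LSI.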
This article is indexed in Wanfang Data (万方数据) and other databases.