
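An Improved Long-Word-First Disambiguation Strategy for Reverse Maximum Matching Word Segmentation
Cite this article: TIAN Zhan-xiao, HAN Xian-zhong, WANG Ke-jian. An improved long-word-first disambiguation strategy for reverse maximum matching word segmentation[J]. Journal of Agricultural University of Hebei, 2009, 32(4)
Authors: TIAN Zhan-xiao  HAN Xian-zhong  WANG Ke-jian
Affiliation (all three authors): College of Information Science and Technology, Agricultural University of Hebei, Baoding, Hebei 071001
Funding: Hebei Province Science and Technology Research and Development Program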
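Abstract: To improve the segmentation accuracy of the reverse maximum matching (RMM) algorithm, this study applies a word-frequency threshold, a single-character function, and related methods, and achieves good disambiguation results. Experiments show that the segmentation algorithm both follows the long-word-first principle and further identifies and resolves covering ambiguity. The improved RMM retains a considerable speed advantage while further improving segmentation accuracy, and it has practical value for small and medium-sized search engines that use mechanical word segmentation and need better segmentation accuracy.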

Keywords: Chinese word segmentation  reverse maximum matching algorithm  single-character rate  word frequency

An improved ambiguity resolution of RMM based on long-term priority
TIAN Zhan-xiao, HAN Xian-zhong, WANG Ke-jian. An improved ambiguity resolution of RMM based on long-term priority[J]. Journal of Agricultural University of Hebei, 2009, 32(4)
Authors:TIAN Zhan-xiao  HAN Xian-zhong  WANG Ke-jian
Affiliation: TIAN Zhan-xiao, HAN Xian-zhong, WANG Ke-jian (College of Information Science and Technology, Agricultural University of Hebei, Baoding 071001, China)
Abstract: To enhance the accuracy of Chinese word segmentation, the present study uses a term-frequency threshold and a single-character function and achieves good results in ambiguity resolution. The experiments show that the method follows the long-word-first principle and can further detect and resolve ambiguity. The improved RMM not only retains a large speed advantage but also increases accuracy. It has practical value for ambiguity resolution in small and medium-scale search engines ...
Keywords: Chinese word segmentation  RMM  single-character rate  term frequency
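
For readers unfamiliar with reverse maximum matching, the following is a minimal sketch of the baseline algorithm that the abstract starts from. It is not the authors' improved strategy: the word-frequency threshold and single-character function are not specified on this page and are not reproduced here. The names `rmm_segment` and `single_char_rate`, the toy dictionary, and the definition of the single-character rate as the share of one-character tokens are all assumptions made for illustration.

```python
def rmm_segment(text, dictionary, max_len=None):
    """Baseline reverse maximum matching (RMM).

    Scans the text from right to left and, at each position, takes the
    longest dictionary word that ends there (long-word-first). A single
    character is emitted as a fallback when no dictionary word matches.
    """
    if max_len is None:
        max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    end = len(text)
    while end > 0:
        # Try the longest candidate ending at `end`, then shrink it.
        for size in range(min(max_len, end), 0, -1):
            candidate = text[end - size:end]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                end -= size
                break
    tokens.reverse()  # tokens were collected right to left
    return tokens


def single_char_rate(tokens):
    """Assumed definition: fraction of tokens that are one character long."""
    if not tokens:
        return 0.0
    return sum(1 for token in tokens if len(token) == 1) / len(tokens)


# Toy dictionary, for illustration only.
toy_dictionary = {"研究", "研究生", "生命", "起源"}
tokens = rmm_segment("研究生命的起源", toy_dictionary)
print(tokens)
print(single_char_rate(tokens))
```

On this toy input the right-to-left scan matches "生命" before it can consider "研究生", which is the kind of case where RMM tends to do better than forward matching. A lower single-character rate is often used as a rough signal of a better segmentation, though how the paper uses it is not stated here.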
This article is indexed in CNKI, Wanfang Data, and other databases.