Research on and Improvement of the Apriori Algorithm |
| |
Cite this article: | DING Li, SUN Gao-feng. Research on and Improvement of the Apriori Algorithm[J]. 张家口农专学报, 2013, 0(2): 16-21 |
| |
Authors: | DING Li, SUN Gao-feng |
| |
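Author affiliations: | [1] Department of Information Engineering, Bozhou Vocational and Technical College; [2] Power Dispatching Control Center, Bozhou Power Supply Company, Anhui Electric |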
| |
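Abstract: | Association analysis embodies the essence of data mining, and association rule mining is the process of finding regularities among the item sets in a large body of given data. The Apriori algorithm is one of the most important algorithms for mining frequent item sets in association rule mining, but it has certain shortcomings. Objective: to improve mining efficiency. Method: the classical Apriori algorithm was modified and the modification was evaluated experimentally. Results: the improved Apriori algorithm outperforms the classical one, and the advantage is more pronounced when the number of transactions is large. Conclusion: when computing support counts, the improved algorithm does not need to scan the entire database on each pass; it only needs to scan, in a reduced database table, the rows in which the relevant items occur, which saves considerable time. Support counting is also simpler and does not generate excessive redundancy, so the approach can substantially reduce the complexity of mining and improve the efficiency of the mining algorithm.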
|
Keywords: | data mining; association rules; Apriori algorithm; improved algorithm; support; confidence; frequent item sets |
Analysis and Improvement on Apriori Algorithm |
| |
Affiliation: | DING Li1, SUN Gao-feng2 (1. Department of Information Engineering, Bozhou Vocational and Technical College, Bozhou 236800, Anhui, China; 2. Power Dispatching Control Center, Bozhou Power Supply Company, Anhui Electric, Bozhou 236800, Anhui, China) |
| |
Abstract: | Association analysis is the essence of data mining. Mining association rules is the process of searching for regularities among the item sets in a large body of given data. The Apriori algorithm is one of the most important algorithms for mining frequent item sets in association rule mining, but it has certain shortcomings. The purpose of this study is to improve the efficiency of mining. The classical Apriori algorithm was modified and the modification was evaluated experimentally. The results showed that the modified Apriori algorithm outperformed the classical one, and the effect was more pronounced with a larger number of transactions. The conclusion is that the improved algorithm does not need to scan the whole database each time it calculates a support count; it only needs to scan the rows of a reduced database table in which the relevant items occur, which saves considerable time. Support counting is also relatively easy and does not produce much redundancy. The approach can therefore substantially reduce the complexity of mining and improve the efficiency of the mining algorithm. |
| |
Keywords: | data mining; association rules; Apriori algorithm; improved algorithm; support; confidence; frequent item sets |
This article is indexed in 维普 and other databases. |
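
The abstract states that the improved algorithm computes support counts by scanning only the rows of a reduced table in which the relevant items occur, but it does not give the procedure itself. The sketch below is therefore not the authors' method; it is a minimal illustration of one well-known way to get that behaviour, assuming a "vertical" layout in which each item maps to the set of transaction row ids that contain it. The function names (`build_tid_index`, `support_count`, `frequent_itemsets`), the `min_count` parameter, and the sample transactions are all hypothetical and chosen for illustration only.

```python
from itertools import combinations


def build_tid_index(transactions):
    """Scan the data once and map each item to the set of row ids containing it."""
    index = {}
    for tid, items in enumerate(transactions):
        for item in items:
            index.setdefault(item, set()).add(tid)
    return index


def support_count(itemset, index):
    """Support count = number of rows shared by every item in the itemset.

    Only the row sets of the items involved are touched; the full
    transaction list is never rescanned.
    """
    rows = [index.get(item, set()) for item in itemset]
    return len(set.intersection(*rows)) if rows else 0


def frequent_itemsets(transactions, min_count):
    """Level-wise (Apriori-style) search over a reduced item-to-rows table."""
    index = build_tid_index(transactions)
    # Assumption: "reduced table" is modelled here by dropping infrequent items.
    index = {i: rows for i, rows in index.items() if len(rows) >= min_count}

    result = {}
    level = [(item,) for item in sorted(index)]  # itemsets are sorted tuples
    k = 1
    while level:
        frequent = {}
        for candidate in level:
            n = support_count(candidate, index)
            if n >= min_count:
                frequent[candidate] = n
        result.update(frequent)

        # Join step: merge frequent k-itemsets into (k+1)-candidates, then
        # prune any candidate that has an infrequent k-subset.
        k += 1
        candidates = set()
        for a, b in combinations(sorted(frequent), 2):
            union = tuple(sorted(set(a) | set(b)))
            if len(union) == k and all(
                sub in frequent for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
        level = sorted(candidates)
    return result


# Hypothetical sample data: five market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
]
result = frequent_itemsets(transactions, min_count=3)
```

In this sketch the transaction list is read once to build the index; every later support count is a set intersection over the rows of the items involved, which is one way to avoid the repeated full-database scans the abstract attributes to the classical algorithm.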
|