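Details
Document type: Journal article
Chinese title: 自适应特征权重的K-means聚类算法
English title: K-means Clustering Algorithm Based on Adaptive Feature Weighted
Authors: Li Sihai (李四海)[1]; Man Zibin (满自斌)[2]
First author: Li Sihai (李四海)
Institutions: [1] Gansu College of Traditional Chinese Medicine, Lanzhou, Gansu 730000; [2] Lanzhou University of Technology, Lanzhou, Gansu 730050
First institution: Gansu University of Chinese Medicine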
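Year: 2013
Volume: 23
Issue: 6
Pages: 98
Journal name (Chinese): 计算机技术与发展
Journal name (English): Computer Technology and Development
Indexed in: CSTPCD
Funding: National Natural Science Foundation of China (51069004)
Language: Chinese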
Keywords: K-means; medical data clustering; adaptive feature weight (AFW); cluster evaluation; confusion matrix
Abstract: To improve the accuracy and stability of the traditional K-means algorithm on medical data clustering, the paper proposes an adaptive feature-weighted K-means clustering algorithm named AFW-K-means. The algorithm first selects the initial cluster centers by computing the mean square deviation of the attributes. Then, based on the result of the current iteration, it adjusts each attribute's feature weight in the distance formula according to the principle of minimum within-cluster distance and maximum between-cluster distance, so that the formula more accurately reflects the true distance between data points in Euclidean space. Finally, the algorithm is validated on UCI data sets, including the Breast Cancer Wisconsin (BCW) data set. The results show that both its accuracy and its stability are clearly better than those of the traditional K-means algorithm.
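The abstract names three steps (deviation-based initialization, weighted distance, per-iteration weight adjustment) but gives no formulas. The following is a minimal Python sketch of that outline, not the authors' implementation. Two choices are assumptions made only for illustration: initial centers are taken as chunk means after sorting along the attribute with the largest standard deviation, and each weight is set proportional to the ratio of between-cluster to within-cluster scatter of its attribute. The names `init_centers`, `update_weights`, and `afw_kmeans` are hypothetical.

```python
from statistics import pstdev


def init_centers(points, k):
    """Assumed rule: spread centers along the highest-deviation attribute."""
    d, n = len(points[0]), len(points)
    widest = max(range(d), key=lambda j: pstdev([p[j] for p in points]))
    ordered = sorted(points, key=lambda p: p[widest])
    centers = []
    for c in range(k):  # requires k <= n so that every chunk is non-empty
        chunk = ordered[c * n // k:(c + 1) * n // k]
        centers.append([sum(p[j] for p in chunk) / len(chunk) for j in range(d)])
    return centers


def weighted_sq_dist(p, q, w):
    """Squared Euclidean distance with one weight per attribute."""
    return sum(wj * (a - b) ** 2 for a, b, wj in zip(p, q, w))


def update_weights(points, labels, centers, eps=1e-12):
    """Assumed rule: weight_j ~ between-cluster / within-cluster scatter."""
    d = len(points[0])
    overall = [sum(p[j] for p in points) / len(points) for j in range(d)]
    within, between = [0.0] * d, [0.0] * d
    for p, c in zip(points, labels):
        for j in range(d):
            within[j] += (p[j] - centers[c][j]) ** 2
            between[j] += (centers[c][j] - overall[j]) ** 2
    ratio = [b / (w + eps) for b, w in zip(between, within)]
    total = sum(ratio)
    if total == 0:  # no attribute separates the clusters: fall back to uniform
        return [1.0 / d] * d
    return [r / total for r in ratio]  # normalized so the weights sum to 1


def afw_kmeans(points, k, max_iter=100):
    d = len(points[0])
    weights = [1.0 / d] * d  # start from equal weights
    centers = init_centers(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: weighted_sq_dist(p, centers[c], weights))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(m[j] for m in members) / len(members)
                              for j in range(d)]
        weights = update_weights(points, labels, centers)
    return labels, centers, weights


# Toy data: attribute 0 separates two groups, attribute 1 is noise.
data = [[0.0, 5.0], [0.2, 1.0], [0.1, 9.0],
        [10.0, 2.0], [10.1, 8.0], [9.9, 4.0]]
labels, centers, weights = afw_kmeans(data, k=2)
print(labels, weights)
```

On the toy data, the separating attribute should end up with nearly all of the weight, which is the behaviour the "compact within clusters, far apart between clusters" principle is meant to produce. Other weight-update rules would fit the abstract's description equally well.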