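Abstract: To address the problem of designing clustering and fuzzy analysis methods for overlapping cliques in complex networks, a new fuzzy metric and a corresponding fuzzy clustering method are presented. Based on the new metric, two new indicators for mining the fuzzy topological features of a network are designed: the closeness of connection between cliques, and the contribution of fuzzy nodes to the connection between overlapping cliques. These indicators are applied to the macro-level topological analysis of overlapping modules and to the extraction of key nodes between cliques. Experimental results show that the clustering and analysis method not only yields a fuzzy clique structure but also reveals new network features. The method offers a new perspective for post-clustering analysis of complex networks.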
Fuzzy clustering and information mining in complex networks
ZHAO Kun, ZHANG Shao-wu, PAN Quan
(School of Automation, Northwestern Polytechnical University, Xi’an 710072, China)
Abstract: Few methods can both cluster a network and analyze the resulting overlapping communities. To address this problem, this paper presents a novel fuzzy metric and a soft clustering algorithm. Based on the novel metric, two topological fuzzy metrics, the clique-clique closeness degree and the inter-clique connection contribution degree, are devised and applied to the macro-level topological analysis of overlapping communities and to the extraction of key nodes between them. Experimental results indicate that, as an attempt at post-clustering analysis, the new indicators and mechanisms can uncover new topological features hidden in the network.
Key words: network fuzzy clustering; clique-node similarity; clique-clique closeness degree; inter-clique connection contribution degree; symmetrical non-negative matrix factorization (s-NMF); network topology macrostructure
1 New fuzzy metric and optimization-based approximation method
$W^{T}W \to Y$  (2)
$\min_{W \ge 0} F_G(Y, W) = \|Y - W^{T}W\|_F = \frac{1}{2}\sum_{ij}\left[(Y - W^{T}W) \circ (Y - W^{T}W)\right]_{ij}$  (3)
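where $\|\cdot\|_F$ is the Euclidean distance and $A \circ B$ denotes the Hadamard (element-wise) product of matrices $A$ and $B$. Realizing the fuzzy metric $W$ thus becomes an optimization problem: find a $W$ that minimizes the objective function defined in Eq. (3).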
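Eq. (3) is in essence a matrix factorization, known as symmetrical non-negative matrix factorization (s-NMF). Solving s-NMF is very similar to solving non-negative matrix factorization (NMF) [11,12]. NMF decomposes the data into the product of two non-negative matrices, which gives a simplified description of the original data, and it is widely used in many areas of data analysis. By analogy with NMF, s-NMF can be viewed as NMF under the constraint $H = W$. The iterative update for s-NMF is: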
$W_{k+1} = W_k \circ [W_k Y] / [W_k W_k^{T} W_k]$  (4)
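The objective in Eq. (3) and the update in Eq. (4) can be sketched in NumPy as follows. This is a minimal illustration rather than the authors' implementation. It assumes $W$ has one row per clique and one column per node, so that $W^{T}W$ is node-by-node, and it reads the division in Eq. (4) as element-wise. The function names, the random initialization, the small constant `eps`, the toy matrix `Y_toy`, and the damping parameter `beta` are assumptions introduced here: `beta=1.0` reproduces Eq. (4) literally, while `beta=0.5` is a commonly used damped variant that tends to behave more stably in practice.

```python
import numpy as np

def snmf_objective(Y, W):
    """Half the sum of squared residual entries, the right-hand side of Eq. (3)."""
    R = Y - W.T @ W              # residual, nodes x nodes
    return 0.5 * np.sum(R * R)   # R * R is the Hadamard product

def snmf(Y, c, n_iter=500, beta=0.5, eps=1e-12, seed=0):
    """Multiplicative s-NMF updates; returns (initial W, final W).

    beta is an assumption of this sketch, not part of the source:
    beta=1.0 gives Eq. (4) exactly, beta=0.5 is a damped variant.
    """
    rng = np.random.default_rng(seed)
    n = Y.shape[0]
    W = rng.random((c, n)) + 0.1          # strictly positive start (assumed)
    W_init = W.copy()
    for _ in range(n_iter):
        # Element-wise ratio [W Y] / [W W^T W]; eps guards against division by zero.
        ratio = (W @ Y) / (W @ W.T @ W + eps)
        W = W * (1.0 - beta + beta * ratio)
    return W_init, W

# Hypothetical toy similarity matrix: two fully connected groups of three nodes.
Y_toy = np.kron(np.eye(2), np.ones((3, 3)))
W_init, W_fit = snmf(Y_toy, c=2)
```

Each row of `W_fit` can then be read as the membership strength of every node in one clique; comparing `snmf_objective(Y_toy, W_init)` with `snmf_objective(Y_toy, W_fit)` shows how far the iteration reduced the residual.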
2 Clique-clique relation metric
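The clique-node similarity $W$ makes it possible to quantify other topological relations in the network. Just as $W^{T}W$ can serve as an estimate of node-node similarity, $W$ can likewise be used to estimate the clique-clique relation: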
$Z = WW^{T}$  (8)
$Z = WW^{T} = \begin{pmatrix} 1.3376 & 0.0353 \\ 0.0353 & 1.3376 \end{pmatrix}$
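Because the network in Fig. 1 is symmetric, the two cliques have the same topological connection pattern: they share the same intra-clique density, 1.3376, while the inter-clique overlap is 0.0353.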
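The way Eq. (8) is read can be illustrated with a small NumPy sketch. The matrix `W` below is a hypothetical clique-node similarity for a mirror-symmetric six-node network; it is not the $W$ obtained for Fig. 1, so the numbers differ from those reported above. The diagonal of $Z$ is taken as the intra-clique density and the off-diagonal entry as the inter-clique overlap.

```python
import numpy as np

# Hypothetical clique-node similarity: 2 cliques (rows) x 6 nodes (columns).
# The second row mirrors the first, mimicking a symmetric network.
W = np.array([[0.8, 0.8, 0.8, 0.1, 0.0, 0.0],
              [0.0, 0.0, 0.1, 0.8, 0.8, 0.8]])

Z = W @ W.T                  # clique-clique relation, Eq. (8)
intra_density = np.diag(Z)   # one value per clique
inter_overlap = Z[0, 1]      # overlap between clique 0 and clique 1

print(Z)
```

As in the symmetric example of the text, the two diagonal entries coincide and the off-diagonal entry is much smaller, because only the two middle nodes carry non-zero similarity to both cliques.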
3 Inter-clique connection contribution degree
