Monday, May 19, 2008

On Spectral Clustering: Analysis and an algorithm

This paper proposes an extension of normalized cut to K-way simultaneous segmentation. The authors also study the conditions under which their algorithm works well, and demonstrate with a simple example what the eigenvectors physically represent. The algorithm is also quite easy to implement.
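The pipeline is indeed short: build an affinity matrix, take the top-k eigenvectors of its normalized version, row-normalize, and run k-means on the rows. A minimal NumPy sketch of this idea, assuming a Gaussian kernel and a tiny farthest-point-initialized k-means (the function name and parameter choices are mine, not from the paper):

```python
import numpy as np

def spectral_cluster(X, k, sigma=1.0, n_iter=20):
    """Sketch of spectral clustering in the Ng-Jordan-Weiss style."""
    # Affinity A_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)), with A_ii = 0.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    A = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    # Normalized affinity L = D^{-1/2} A D^{-1/2}.
    dinv = 1.0 / np.sqrt(np.maximum(A.sum(1), 1e-12))
    L = dinv[:, None] * A * dinv[None, :]
    # Top-k eigenvectors (eigh returns eigenvalues in ascending order).
    _, vecs = np.linalg.eigh(L)
    Y = vecs[:, -k:]
    # Row-normalize: each point becomes a unit vector on the k-sphere.
    Y = Y / np.maximum(np.linalg.norm(Y, axis=1, keepdims=True), 1e-12)
    # Tiny k-means on the rows, with deterministic farthest-point init.
    centers = [Y[0]]
    for _ in range(1, k):
        dist = ((Y[:, None, :] - np.array(centers)[None]) ** 2).sum(-1).min(1)
        centers.append(Y[dist.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = ((Y[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = Y[labels == j].mean(0)
    return labels
```

On toy data such as two well-separated 2D blobs, this recovers the two groups.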

 

The most important contribution of the paper is identifying the geometric structure of the top k eigenvectors. Specifically, given a perfect similarity kernel (i.e., f(i, j) = 0 whenever points i and j belong to different classes), the rows of the matrix formed by the top k eigenvectors cluster around k mutually orthogonal points on the unit k-sphere for k-class data, achieving maximum separation. In practice the situation is not so ideal: the similarity function is susceptible to noise, and the authors study this case with matrix perturbation theory, giving theoretical guarantees for their algorithm.
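This geometric claim is easy to check numerically. A small sketch, assuming a Gaussian affinity on two well-separated 2D blobs (all data and parameter values here are illustrative): the row-normalized rows of the top-2 eigenvector matrix should have within-cluster inner products near 1 and cross-cluster inner products near 0, i.e., the two groups land near two orthogonal points on the unit circle.

```python
import numpy as np

# Two tight, well-separated blobs so the affinity is near block-diagonal.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.05, (15, 2)), rng.normal(4, 0.05, (15, 2))])

# Gaussian affinity with zero diagonal, then D^{-1/2} A D^{-1/2}.
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
A = np.exp(-sq / 2.0)
np.fill_diagonal(A, 0.0)
dinv = 1.0 / np.sqrt(A.sum(1))
L = dinv[:, None] * A * dinv[None, :]

# Top-2 eigenvectors, then row-normalize onto the unit circle.
_, vecs = np.linalg.eigh(L)
Y = vecs[:, -2:]
Y = Y / np.linalg.norm(Y, axis=1, keepdims=True)

# Same-cluster rows align; cross-cluster rows are nearly orthogonal.
within = np.abs(Y[:15] @ Y[0])
cross = np.abs(Y[15:] @ Y[0])
```

Note the check is insensitive to the basis `eigh` picks inside a near-degenerate top eigenspace: rotating the eigenvector basis rotates all rows of Y by the same orthogonal matrix, which preserves their pairwise inner products.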

 

The simple experiments on a few 2D datasets give promising results, but whether the algorithm works well on higher-dimensional data remains a question. Indeed, I think the biggest problem in multi-class clustering is having to specify the number of classes before running. This is genuinely conflicting: if we already knew the number of clusters, we would probably already have a good idea of the partitioning. Perhaps that is why segmentation algorithms so often operate in an over-segmentation manner.
