Fast Co-Clustering by Ranking and Sampling
    Download PDF
Zhao Li,Xindong Wu. Fast Co-Clustering by Ranking and Sampling. International Journal of Software and Informatics, 2012,6(4):523~534
Hits: 1496
Download times: 2571
Fund:This work is sponsored by US National Science Foundation (NSF) under Grant No.CCF-0905337.
Abstract:Co-clustering treats a data matrix in a symmetric fashion that a partitioning of rows can induce a partitioning of columns, and vice versa. It has been shown advantageous over tradition clustering. However, the computational complexity of most co-clustering algorithms are costly, and thus limit their e?ectiveness on large datasets. A recently proposed sampling-based matrix decomposition method can achieve a linear computational complexity, but selected rows and columns can not effectively represent a large sparse dataset, and many unselected rows and columns can not be mapped to the selected rows and columns because they do not share features in common, thus its performance is impaired. To address this problem, we propose a fast co-clustering framework by ranking and sampling that only representative samples are selected for co-clustering, and the remaining samples can be easily labeled by their neighbors in clustered samples. Extensive experiments on large text datasets show that our approach is able to use very few samples to achieve comparable results in linear time compared to state-of-the-art co-clustering algorithms of nonlinear computational complexity.
keywords:co-clustering  ranking  sampling  nonlinear complexity
View Full Text  View/Add Comment  Download reader



Top Paper  |  FAQ  |  Guest Editors  |  Email Alert  |  Links  |  Copyright  |  Contact Us

© Copyright by Institute of Software, the Chinese Academy of Sciences

京公网安备 11040202500065号