Local contrast as an effective means to robust clustering against varying densities
- Authors: Chen, Bo; Ting, Kaiming; Washio, Takashi; Zhu, Ye
- Date: 2018
- Type: Text, Journal article
- Relation: Machine Learning Vol. 107, no. 8-10 (2018), p. 1621-1645
- Full Text:
- Reviewed:
- Description: Most density-based clustering methods have difficulty detecting clusters of hugely different densities in a dataset. A recent density-based clustering algorithm, CFSFDP, appears to have mitigated the issue. However, by formalising the condition under which it fails, we reveal that CFSFDP still has the same issue. To address this issue, we propose a new measure called Local Contrast, as an alternative to density, to find cluster centers and detect clusters. We then apply Local Contrast to CFSFDP, creating a new clustering method called LC-CFSFDP which is robust in the presence of varying densities. Our empirical evaluation shows that LC-CFSFDP outperforms CFSFDP and three other state-of-the-art variants of CFSFDP. © 2018, The Author(s).
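The abstract's central idea — scoring each point by how it contrasts with its neighbourhood, rather than by raw density — can be sketched roughly as follows. This is an illustrative reading only, not the authors' published algorithm: the function name, the inverse k-NN-distance density proxy, and the choice to count lower-density points among the k nearest neighbours are all assumptions.

```python
import numpy as np

def local_contrast(X, k=10):
    """Hypothetical sketch of a Local Contrast measure: for each point x,
    count how many of x's k nearest neighbours have a lower density than x.
    Density is approximated here by the inverse of the distance to the
    k-th nearest neighbour (an assumption, not the paper's definition)."""
    n = len(X)
    # pairwise Euclidean distance matrix
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    knn = order[:, 1:k + 1]                       # skip self at column 0
    density = 1.0 / d[np.arange(n), knn[:, -1]]   # inverse k-NN distance
    # LC(x): number of x's k nearest neighbours with lower density than x
    return np.array([(density[knn[i]] < density[i]).sum() for i in range(n)])
```

Under this reading, points near the centre of a cluster get a score close to k regardless of how dense the cluster is, which is why a contrast score can be more robust than raw density when cluster densities vary widely.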
LiNearN: A new approach to nearest neighbour density estimator
- Authors: Wells, Jonathan; Ting, Kaiming; Washio, Takashi
- Date: 2014
- Type: Text, Journal article
- Relation: Pattern Recognition Vol. 47, no. 8 (2014), p. 2702-2720
- Full Text: false
- Reviewed:
- Description: Despite their widespread use, nearest neighbour density estimators have two fundamental limitations: O(n²) time complexity and O(n) space complexity. Both limitations constrain nearest neighbour density estimators to small data sets only. Recent progress using indexing schemes has improved this to near-linear time complexity only. We propose a new approach, called LiNearN for Linear time Nearest Neighbour algorithm, that yields the first nearest neighbour density estimator having O(n) time complexity and constant space complexity, as far as we know. This is achieved without using any indexing scheme because LiNearN uses a subsampling approach in which the subsample sizes are significantly smaller than the data size. Like existing density estimators, our asymptotic analysis reveals that the new density estimator has a parameter to trade off between bias and variance. We show that algorithms based on the new nearest neighbour density estimator can easily scale up to data sets with millions of instances in anomaly detection and clustering tasks. Highlights:
• Reject the premise that a NN algorithm must find the NN for every instance.
• The first NN density estimator that has O(n) time complexity and O(1) space complexity.
• These complexities are achieved without using any indexing scheme.
• Our asymptotic analysis reveals that it trades off between bias and variance.
• Easily scales up to large data sets in anomaly detection and clustering tasks.
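The complexity claim in the abstract rests on one idea: instead of searching all n points for each query's nearest neighbour, search only small fixed-size random subsamples, so each point costs O(t·ψ) work for t subsamples of size ψ. The sketch below illustrates that idea under stated assumptions; it is not the authors' exact LiNearN procedure, and the function name, parameter names, and the inverse-distance density proxy are all hypothetical.

```python
import numpy as np

def subsample_nn_density(X, psi=16, t=10, seed=0):
    """Illustrative subsampled NN density estimate (not the published
    LiNearN algorithm): for each of t rounds, draw a random subsample
    of fixed size psi, measure every point's distance to its nearest
    subsample member, and average an inverse-distance proxy for density.
    Time is O(n * t * psi) and working space is O(t * psi) beyond the data."""
    rng = np.random.default_rng(seed)
    n = len(X)
    est = np.zeros(n)
    for _ in range(t):
        S = X[rng.choice(n, size=psi, replace=False)]
        # distance from every point to each subsample member
        d = np.linalg.norm(X[:, None, :] - S[None, :, :], axis=2)
        nn = np.sort(d, axis=1)
        # a sampled point is at distance 0 from itself; fall back to the
        # second-nearest subsample member in that case
        nearest = np.where(nn[:, 0] > 0, nn[:, 0], nn[:, 1])
        est += 1.0 / (nearest + 1e-12)   # inverse-distance density proxy
    return est / t
```

Because ψ and t are constants independent of n, the total cost grows linearly with the data size, which is the property the abstract emphasises; the bias–variance trade-off mentioned there corresponds to the choice of ψ in this sketch.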