Density-ratio based clustering for discovering clusters with varying densities
- Authors: Zhu, Ye , Ting, Kaiming , Carman, Mark
- Date: 2016
- Type: Text , Journal article
- Relation: Pattern Recognition Vol. 60, no. (2016), p. 983-997
- Full Text: false
- Reviewed:
- Description: Density-based clustering algorithms are able to identify clusters of arbitrary shapes and sizes in a dataset which contains noise. It is well-known that most of these algorithms, which use a global density threshold, have difficulty identifying all clusters in a dataset having clusters of greatly varying densities. This paper identifies and analyses the condition under which density-based clustering algorithms fail in this scenario. It proposes a density-ratio based method to overcome this weakness, and reveals that it can be implemented in two approaches. One approach is to modify a density-based clustering algorithm to do density-ratio based clustering by using its density estimator to compute density-ratio. The other approach involves rescaling the given dataset only. An existing density-based clustering algorithm, which is applied to the rescaled dataset, can find all clusters with varying densities that would otherwise impossible had the same algorithm been applied to the unscaled dataset. We provide an empirical evaluation using DBSCAN, OPTICS and SNN to show the effectiveness of these two approaches. © 2016 Elsevier Ltd
Local contrast as an effective means to robust clustering against varying densities
- Authors: Chen, Bo , Ting, Kaiming , Washio, Takashi , Zhu, Ye
- Date: 2018
- Type: Text , Journal article
- Relation: Machine Learning Vol. 107, no. 8-10 (2018), p. 1621-1645
- Full Text:
- Reviewed:
- Description: Most density-based clustering methods have difficulties detecting clusters of hugely different densities in a dataset. A recent density-based clustering CFSFDP appears to have mitigated the issue. However, through formalising the condition under which it fails, we reveal that CFSFDP still has the same issue. To address this issue, we propose a new measure called Local Contrast, as an alternative to density, to find cluster centers and detect clusters. We then apply Local Contrast to CFSFDP, and create a new clustering method called LC-CFSFDP which is robust in the presence of varying densities. Our empirical evaluation shows that LC-CFSFDP outperforms CFSFDP and three other state-of-the-art variants of CFSFDP. © 2018, The Author(s).