Local contrast as an effective means to robust clustering against varying densities
- Authors: Chen, Bo , Ting, Kaiming , Washio, Takashi , Zhu, Ye
- Date: 2018
- Type: Text , Journal article
- Relation: Machine Learning Vol. 107, no. 8-10 (2018), p. 1621-1645
- Full Text:
- Reviewed:
- Description: Most density-based clustering methods have difficulties detecting clusters of hugely different densities in a dataset. A recent density-based clustering CFSFDP appears to have mitigated the issue. However, through formalising the condition under which it fails, we reveal that CFSFDP still has the same issue. To address this issue, we propose a new measure called Local Contrast, as an alternative to density, to find cluster centers and detect clusters. We then apply Local Contrast to CFSFDP, and create a new clustering method called LC-CFSFDP which is robust in the presence of varying densities. Our empirical evaluation shows that LC-CFSFDP outperforms CFSFDP and three other state-of-the-art variants of CFSFDP. © 2018, The Author(s).
Nearest-neighbour-induced isolation similarity and its impact on density-based clustering
- Authors: Qin, Xiaoyu , Ting, Kai , Zhu, Ye , Lee, Vincent
- Date: 2019
- Type: Text , Conference paper
- Relation: 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Annual Conference on Innovative Applications of Artificial Intelligence, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, 27 January to 1 February 2019. p. 4755-4762
- Full Text:
- Reviewed:
- Description: A recent proposal of data dependent similarity called Isolation Kernel/Similarity has enabled SVM to produce better classification accuracy. We identify shortcomings of using a tree method to implement Isolation Similarity; and propose a nearest neighbour method instead. We formally prove the characteristic of Isolation Similarity with the use of the proposed method. The impact of Isolation Similarity on density-based clustering is studied here. We show for the first time that the clustering performance of the classic density-based clustering algorithm DBSCAN can be significantly uplifted to surpass that of the recent density-peak clustering algorithm DP. This is achieved by simply replacing the distance measure with the proposed nearest-neighbour-induced Isolation Similarity in DBSCAN, leaving the rest of the procedure unchanged. A new type of clusters called mass-connected clusters is formally defined. We show that DBSCAN, which detects density-connected clusters, becomes one which detects mass-connected clusters, when the distance measure is replaced with the proposed similarity. We also provide the condition under which mass-connected clusters can be detected, while density-connected clusters cannot. © 2019, Association for the Advancement of Artificial Intelligence (www.aaai.org).