- Title
- sGrid++ : revising simple grid based density estimator for mining outlying aspect
- Creator
- Samariya, D.; Ma, J.; Aryal, S.
- Date
- 2022
- Type
- Text; Conference paper
- Identifier
- http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/190306
- Identifier
- vital:17610
- Identifier
- https://doi.org/10.1007/978-3-031-20891-1_15
- Identifier
- ISSN: 0302-9743; ISBN: 9783031208904
- Abstract
- In this paper, we address the problem of outlying aspect mining, which aims to identify a set of features (subspaces, a.k.a. aspects) in which a given data object stands out from the rest of the data. To detect the most outlying aspect of a given data object, outlying aspect mining algorithms need to compare and rank subspaces of different dimensionality. Thus, they require a fast and dimensionally unbiased scoring measure. Existing measures use density or distance to compute the outlyingness of the query in each subspace. Density and distance are dimensionally biased, i.e., density decreases as the dimensionality of the subspace increases. To make them comparable (dimensionally unbiased), previous works use Z-score normalization. However, Z-score normalization requires computing the outlyingness of every data point in each subspace, which adds significant computational overhead on top of the already expensive density or distance computation. A recently developed measure called sGrid is a simple and efficient density estimator that allows a fast systematic search. While it is efficient compared to other distance- and density-based measures, it is also dimensionally biased and requires Z-score normalization to become dimensionally unbiased, which makes it computationally expensive. In this paper, we propose a simpler version of sGrid, called sGrid++, that is not only efficient and effective but also dimensionally unbiased. It does not require Z-score normalization. We demonstrate the effectiveness and efficiency of the proposed scoring measure in outlying aspect mining using synthetic and real-world datasets. © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
- Publisher
- Springer Science and Business Media Deutschland GmbH
- Relation
- 23rd International Conference on Web Information Systems Engineering, WISE 2022, Biarritz, France, 1-3 November 2022; Web Information Systems Engineering – WISE 2022: 23rd International Conference, Biarritz, France, November 1–3, 2022, Proceedings, Vol. 13724 LNCS, p. 194-208
- Rights
- All metadata describing materials held in, or linked to, the repository is freely available under a CC0 licence
- Rights
- Copyright © 2022, The Author(s)
- Subject
- Density estimation; Dimensionality-unbiased score; Histogram; Outlier explanation; Outlying aspect mining