- Title
- A new effective and efficient measure for outlying aspect mining
- Creator
- Samariya, Durgesh; Aryal, Sunil; Ting, Kai; Ma, Jiangang
- Date
- 2020
- Type
- Text; Conference paper
- Identifier
- http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/184605
- Identifier
- vital:16552
- Identifier
- https://doi.org/10.1007/978-3-030-62008-0_32
- Identifier
- ISSN: 0302-9743; ISBN: 9783030620073
- Abstract
- Outlying Aspect Mining (OAM) aims to find the subspaces (a.k.a. aspects) in which a given query is an outlier with respect to a given data set. Existing OAM algorithms use traditional distance/density-based outlier scores to rank subspaces. Because these distance/density-based scores depend on the dimensionality of subspaces, they cannot be compared directly between subspaces of different dimensionality. Z-score normalisation has been used to make them comparable. This requires computing the outlier scores of all instances in each subspace, which adds significant computational overhead on top of the already expensive density estimation and makes OAM algorithms infeasible to run on large and/or high-dimensional datasets. We also discover that Z-score normalisation is inappropriate for OAM in some cases. In this paper, we introduce a new score called Simple Isolation score using Nearest Neighbor Ensemble (SiNNE), which is independent of the dimensionality of subspaces. This enables the scores in subspaces with different dimensionalities to be compared directly without any additional normalisation. Our experimental results show that SiNNE produces results that are better than, or at least as good as, those of existing scores, and that it significantly improves the runtime of an existing OAM algorithm based on beam search. © 2020, Springer Nature Switzerland AG.
- Publisher
- Springer Science and Business Media Deutschland GmbH
- Relation
- 21st International Conference on Web Information Systems Engineering, WISE 2020, Amsterdam, 20-24 October 2020, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 12343 LNCS, p. 463-474
- Rights
- All metadata describing materials held in, or linked to, the repository is freely available under a CC0 licence
- Rights
- Copyright © 2020, Springer Nature Switzerland AG.
- Subject
- Dimensionality-unbiased score; Nearest neighbor ensemble; Outlier explanation; Outlying aspect mining
- Reviewed