Detection of anomalies and explanation in cybersecurity
- Authors: Samariya, Durgesh , Ma, Jiangang , Aryal, Sunil , Zhao, Xiaohui
- Date: 2024
- Type: Text , Conference paper
- Relation: 30th International Conference on Neural Information Processing, ICONIP 2023, Changsha, 20-23 November 2023, Neural Information Processing: 30th International Conference, ICONIP 2023, Changsha, China, November 20-23, 2023, Proceedings, Part XIII Vol. 1967 CCIS, p. 414-426
- Full Text: false
- Reviewed:
- Description: Histogram-based anomaly detectors have gained significant attention and application in the field of intrusion detection because of their high efficiency in identifying anomalous patterns. However, they fail to explain why a given data point is flagged as an anomaly. Outlying Aspect Mining (OAM) aims to detect aspects (a.k.a. subspaces) in which a given anomaly differs significantly from others. In this paper, we propose a simple yet effective and efficient histogram-based solution, HMass. In addition to detecting anomalies, HMass explains why the points are anomalous. The effectiveness and efficiency of HMass are evaluated through comparative analysis on seven cybersecurity datasets, covering the tasks of anomaly detection and outlying aspect mining. © 2024, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
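As a rough illustration of the general histogram-based idea mentioned in this abstract, the sketch below scores each point by how little data mass falls in its histogram bin, one feature at a time. It is a generic example and not the HMass method, whose details are not given here; the function name `histogram_scores`, the equal-width binning, and the `bins` parameter are all assumptions made for illustration.

```python
import numpy as np

def histogram_scores(X, bins=10):
    """Generic histogram-based anomaly scores (illustrative, NOT HMass).

    Returns (scores, contrib): contrib[i, j] is the negative log of the
    fraction of points sharing point i's bin in feature j, and scores is
    the row-wise sum. Higher values mean more anomalous.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    contrib = np.empty((n, d))
    for j in range(d):
        counts, edges = np.histogram(X[:, j], bins=bins)
        # Interior edges map each value to a bin index in 0..bins-1;
        # clipping keeps the column maximum inside the last bin.
        idx = np.clip(np.digitize(X[:, j], edges[1:-1]), 0, bins - 1)
        mass = np.maximum(counts[idx], 1) / n
        contrib[:, j] = -np.log(mass)
    return contrib.sum(axis=1), contrib

# Hypothetical data: 100 regular points plus one point that is unusual
# only in the second feature.
normal = np.column_stack([np.linspace(0.0, 1.0, 100),
                          np.linspace(0.0, 0.9, 100)])
X = np.vstack([normal, [[0.5, 10.0]]])
scores, contrib = histogram_scores(X)
print("most anomalous row:", scores.argmax())
print("feature contributing most:", contrib[scores.argmax()].argmax())
```

The per-feature contributions give only a crude hint of which feature makes a point stand out; outlying aspect mining as described in the abstract searches over subspaces rather than single features.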
Detection and explanation of anomalies in healthcare data
- Authors: Samariya, Durgesh , Ma, Jiangang , Aryal, Sunil , Zhao, Xiaohui
- Date: 2023
- Type: Text , Journal article
- Relation: Health Information Science and Systems Vol. 11, no. 1 (2023), p. 20-20
- Full Text: false
- Reviewed:
- Description: The growth of databases in the healthcare domain opens multiple doors for machine learning and artificial intelligence technology. Many medical devices are available in the medical field; however, medical errors remain a severe challenge. Different algorithms have been developed to identify and solve medical errors, such as detecting anomalous readings or anomalous health conditions of a patient. However, they fail to answer why those entries are considered anomalies. This research gap leads to the outlying aspect mining problem, which aims to discover the set of features (a.k.a. subspace) in which a given data point is dramatically different from others. In this paper, we present a framework that detects anomalies in healthcare data and then explains them. The framework aims to detect anomalies effectively and efficiently and to explain why they are considered anomalies by detecting their outlying aspects. First, we revisit four anomaly detection techniques and outlying aspect mining algorithms. Then, we evaluate the performance of the anomaly detection techniques and choose the best one. Next, we take the top-k anomalies as queries and detect their outlying aspects. Lastly, we evaluate performance on 16 real-world healthcare datasets. The experimental results show that the latest isolation-based outlying aspect mining measure, SiNNE, performs outstandingly on this task.
A new effective and efficient measure for outlying aspect mining
- Authors: Samariya, Durgesh , Aryal, Sunil , Ting, Kai , Ma, Jiangang
- Date: 2020
- Type: Text , Conference paper
- Relation: 21st International Conference on Web Information Systems Engineering, WISE 2020, Amsterdam, 20-24 October 2020, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 12343 LNCS, p. 463-474
- Full Text: false
- Reviewed:
- Description: Outlying Aspect Mining (OAM) aims to find the subspaces (a.k.a. aspects) in which a given query is an outlier with respect to a given data set. Existing OAM algorithms use traditional distance/density-based outlier scores to rank subspaces. Because these distance/density-based scores depend on the dimensionality of subspaces, they cannot be compared directly between subspaces of different dimensionality. Z-score normalisation has been used to make them comparable, but it requires computing the outlier scores of all instances in each subspace. This adds significant computational overhead on top of already expensive density estimation, making OAM algorithms infeasible to run on large and/or high-dimensional datasets. We also discover that Z-score normalisation is inappropriate for OAM in some cases. In this paper, we introduce a new score called Simple Isolation score using Nearest Neighbor Ensemble (SiNNE), which is independent of the dimensionality of subspaces. This enables the scores in subspaces with different dimensionalities to be compared directly without any additional normalisation. Our experimental results show that SiNNE produces results better than or at least equal to those of existing scores, and that it significantly improves the runtime of an existing OAM algorithm based on beam search. © 2020, Springer Nature Switzerland AG.
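The Z-score normalisation step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation and not SiNNE: the outlier score is taken to be the distance to the k-th nearest neighbour, candidate subspaces are enumerated exhaustively up to a small size instead of by beam search, and all function names and parameters (`knn_distance_scores`, `z_normalised_score`, `rank_subspaces`, `k`, `max_dim`) are hypothetical.

```python
from itertools import combinations
import numpy as np

def knn_distance_scores(X_sub, k=5):
    """Distance to the k-th nearest neighbour for every row (O(n^2))."""
    diff = X_sub[:, None, :] - X_sub[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    dist.sort(axis=1)          # column 0 is each point's distance to itself
    return dist[:, k]

def z_normalised_score(X, query_idx, subspace, k=5):
    """Z-score of the query's outlier score within one subspace.

    The scores of ALL instances are needed to obtain the mean and the
    standard deviation, which is the overhead the abstract refers to.
    """
    scores = knn_distance_scores(X[:, list(subspace)], k)
    std = scores.std()
    return (scores[query_idx] - scores.mean()) / std if std > 0 else 0.0

def rank_subspaces(X, query_idx, max_dim=2, k=5):
    """Rank all subspaces of up to max_dim features, most outlying first."""
    d = X.shape[1]
    candidates = [s for m in range(1, max_dim + 1)
                  for s in combinations(range(d), m)]
    return sorted(candidates,
                  key=lambda s: z_normalised_score(X, query_idx, s, k),
                  reverse=True)

# Hypothetical data: the query (row 0) is unusual only in feature 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X[0] = [0.0, 0.0, 8.0]
ranked = rank_subspaces(X, query_idx=0)
print("top-ranked subspace:", ranked[0])
```

Every candidate subspace triggers a full pass over the data set, so the cost grows with both the number of instances and the number of subspaces examined. A score that is comparable across dimensionalities without normalisation, as the abstract claims for SiNNE, would avoid that extra pass.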