Efficient anomaly detection by isolation using Nearest Neighbour Ensemble
- Authors: Bandaragoda, Tharindu , Ting, Kaiming , Albrecht, David , Liu, Fei , Wells, Jonathan
- Date: 2014
- Type: Text , Conference paper
- Relation: 14th IEEE International Conference on Data Mining Workshop (ICDMW 2014); Shenzhen, China; 14th December 2014 p. 698-705
- Full Text: false
- Reviewed:
- Description: This paper presents iNNE (isolation using Nearest Neighbour Ensemble), an efficient nearest neighbour-based method for detecting anomalies by isolation. iNNE runs significantly faster than existing nearest neighbour-based methods such as Local Outlier Factor, especially on data sets with thousands of dimensions or millions of instances, because it has linear time complexity and constant space complexity. Compared with the existing tree-based isolation method iForest, the proposed method overcomes three weaknesses of iForest that we have identified, i.e., its inability to detect local anomalies, anomalies with a low number of relevant attributes, and anomalies that are surrounded by normal instances.
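The isolation idea in the description above can be illustrated with a minimal Python sketch. The abstract does not state the scoring rule, so the construction below is an assumption about how such a method might work, not the authors' implementation: each sampled point gets a ball whose radius is the distance to its nearest neighbour within the sample, and a query is scored by the smallest ball that covers it. The function names, parameter values, and toy data are all hypothetical.

```python
import math
import random


def build_spheres(sample):
    """Return (centre, radius, radius of the centre's nearest neighbour) per sampled point."""
    radius, nn_index = [], []
    for i, c in enumerate(sample):
        # Radius = distance to the nearest other point in the same subsample.
        d, j = min((math.dist(c, o), j) for j, o in enumerate(sample) if j != i)
        radius.append(d)
        nn_index.append(j)
    return [(c, radius[i], radius[nn_index[i]]) for i, c in enumerate(sample)]


def isolation_score(x, spheres):
    """Score in [0, 1]; 1.0 means x is not covered by any ball (assumed rule)."""
    covering = [(r, r_nn) for c, r, r_nn in spheres if math.dist(x, c) < r]
    if not covering:
        return 1.0
    r, r_nn = min(covering)  # smallest ball that covers x
    return 1.0 - r_nn / r


def ensemble_scores(data, queries, n_sets=100, sample_size=16, seed=0):
    """Average the isolation score over several random subsamples."""
    rng = random.Random(seed)
    models = [build_spheres(rng.sample(data, sample_size)) for _ in range(n_sets)]
    return [sum(isolation_score(q, m) for m in models) / n_sets for q in queries]


# Toy data (synthetic): one Gaussian cluster plus a single distant point.
toy_rng = random.Random(1)
cluster = [(toy_rng.gauss(0, 1), toy_rng.gauss(0, 1)) for _ in range(200)]
outlier = (8.0, 8.0)
scores = ensemble_scores(cluster + [outlier], [(0.0, 0.0), outlier])
```

Each model stores only `sample_size` balls and scores a query by scanning them, which is consistent with the linear-time, constant-space behaviour the description claims.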
LeSiNN : Detecting anomalies by identifying least similar nearest neighbours
- Authors: Pang, Guansong , Ting, Kaiming , Albrecht, David
- Date: 2015
- Type: Text , Conference proceedings , Conference paper
- Relation: 15th IEEE International Conference on Data Mining Workshop, ICDMW 2015; Atlantic City, New Jersey; 14th-17th November 2015; published in Proceedings - 15th IEEE International Conference on Data Mining Workshop, ICDMW 2015
- Full Text: false
- Description: We introduce the concept of Least Similar Nearest Neighbours (LeSiNN) and use LeSiNN to detect anomalies directly. Although an existing method is a special case of LeSiNN, this paper is, as far as we know, the first to clearly articulate the underlying concept. LeSiNN is the first ensemble method that works well with models trained using samples of one instance. LeSiNN has linear time complexity with respect to data size and the number of dimensions, and it is one of the few anomaly detectors that can be applied directly to both numeric and categorical data sets. Our extensive empirical evaluation shows that LeSiNN is either competitive with or better than six state-of-the-art anomaly detectors in terms of detection accuracy and runtime. © 2015 IEEE.
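A rough Python sketch of the nearest-neighbour ensemble idea in the LeSiNN record above follows. It assumes the anomaly score is the distance from a query to its nearest neighbour in a small random subsample, averaged over many subsamples. The abstract does not spell out the scoring rule, and the Euclidean distance used here covers numeric data only, so this should be read as an illustration rather than the authors' method. All names and parameter values are hypothetical.

```python
import math
import random


def nn_ensemble_scores(data, queries, n_models=50, sample_size=8, seed=0):
    """Average nearest-neighbour distance from each query to random subsamples."""
    rng = random.Random(seed)
    # Each "model" is just a small random subsample of the data.
    samples = [rng.sample(data, sample_size) for _ in range(n_models)]
    scores = []
    for q in queries:
        total = sum(min(math.dist(q, s) for s in sample) for sample in samples)
        scores.append(total / n_models)
    return scores


# Toy data (synthetic): one Gaussian cluster plus a single distant point.
toy_rng = random.Random(1)
cluster = [(toy_rng.gauss(0, 1), toy_rng.gauss(0, 1)) for _ in range(200)]
outlier = (8.0, 8.0)
scores = nn_ensemble_scores(cluster + [outlier], [(0.0, 0.0), outlier])
```

Scoring one query costs `n_models * sample_size` distance computations regardless of the data size, which matches the linear overall complexity mentioned in the description. Setting `sample_size=1` gives the one-instance-per-model case.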
Improving iForest with relative mass
- Authors: Aryal, Sunil , Ting, Kaiming , Wells, Jonathan , Washio, Takashi
- Date: 2014
- Type: Text , Conference paper
- Relation: 18th Pacific-Asia Conference, PAKDD 2014: Advances in Knowledge Discovery and Data Mining; Tainan, Taiwan; 13th-16th May 2014; published in Lecture Notes in Artificial Intelligence (subseries of Lecture Notes in Computer Science) Vol. 8444, p. 510-521
- Full Text: false
- Reviewed:
- Description: iForest uses a collection of isolation trees to detect anomalies. While it is effective in detecting global anomalies, it fails to detect local anomalies in data sets having multiple clusters of normal instances, because the local anomalies are masked by normal clusters of similar density and become less susceptible to isolation. In this paper, we propose a very simple but effective solution to overcome this limitation: replacing the global ranking measure based on path length with a local ranking measure based on relative mass, which takes the local data distribution into consideration. We demonstrate the utility of relative mass by improving the task-specific performance of iForest in anomaly detection and information retrieval tasks.
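To make the contrast between path length and relative mass more concrete, here is a small Python sketch. It assumes that "relative mass" means the ratio of the number of sampled points in the parent of the leaf a query reaches to the number in the leaf itself, normalised by the subsample size. The abstract does not define the measure, so this is one plausible reading. The trees here are grown without a height limit for brevity, and all names and parameter values are hypothetical.

```python
import random


def build_tree(points, rng):
    """Grow a random isolation tree; every node records its mass (point count)."""
    node = {"mass": len(points)}
    if len(points) <= 1:
        return node
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return node
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    if not left or not right:  # degenerate split: keep this node as a leaf
        return node
    node.update(dim=dim, split=split,
                left=build_tree(left, rng), right=build_tree(right, rng))
    return node


def relative_mass(x, tree, sample_size):
    """Parent mass over leaf mass, normalised by the subsample size (assumed form)."""
    parent, node = None, tree
    while "split" in node:
        parent = node
        node = node["left"] if x[node["dim"]] < node["split"] else node["right"]
    if parent is None:
        return 1.0
    return parent["mass"] / (node["mass"] * sample_size)


def forest_scores(data, queries, n_trees=100, sample_size=128, seed=0):
    """Average the relative-mass score over a forest of random trees."""
    rng = random.Random(seed)
    trees = [build_tree(rng.sample(data, sample_size), rng) for _ in range(n_trees)]
    return [sum(relative_mass(q, t, sample_size) for t in trees) / n_trees
            for q in queries]


# Toy data (synthetic): one Gaussian cluster plus a single distant point.
toy_rng = random.Random(1)
cluster = [(toy_rng.gauss(0, 1), toy_rng.gauss(0, 1)) for _ in range(200)]
outlier = (8.0, 8.0)
scores = forest_scores(cluster + [outlier], [(0.0, 0.0), outlier])
```

Because the score compares a leaf only with its immediate parent, it depends on the data around the query rather than on the depth of the whole path, which is the local behaviour the description attributes to relative mass.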
An efficient hybrid system for anomaly detection in social networks
- Authors: Rahman, Md Shafiur , Halder, Sajal , Uddin, Ashraf , Acharjee, Uzzal
- Date: 2021
- Type: Text , Journal article
- Relation: Cybersecurity Vol. 4, no. 1 (2021), p.
- Full Text:
- Reviewed:
- Description: Anomaly detection is an essential and dynamic research area in data mining. A wide range of applications, including various social media platforms, have adopted state-of-the-art methods to identify anomalies and protect users’ security and privacy. A social network is a forum used by different groups of people to express their thoughts, communicate with each other, and share content. These social networks also facilitate abnormal activities such as spreading fake news, rumours, misinformation, unsolicited messages, and propaganda, and posting malicious links. Therefore, detecting abnormalities is an important data analysis activity for distinguishing normal from abnormal users on social networks. In this paper, we have developed a hybrid anomaly detection method named DT-SVMNB that cascades several machine learning algorithms, including a decision tree (C5.0), a Support Vector Machine (SVM), and a Naïve Bayesian classifier (NBC), to classify normal and abnormal users in social networks. We have extracted a list of unique features derived from users’ profiles and content. The proposed DT-SVMNB model is trained on two kinds of datasets using the selected features. Our model classifies users in the social network as depressed or suicidal. We have evaluated our model experimentally using synthetic and real datasets from social networks. The performance analysis shows around 98% accuracy, which demonstrates the effectiveness and efficiency of our proposed system. © 2021, The Author(s).
Exploring data mining techniques in medical data streams
- Authors: Sun, Le , Ma, Jiangang , Zhang, Yanchun , Wang, Hua
- Date: 2016
- Type: Text , Book chapter
- Relation: Databases Theory and Applications Chapter 25 p. 321-332
- Full Text: false
- Reviewed:
- Description: Data stream mining has been studied in diverse application domains. In recent years, population aging has been putting stress on national and international health care systems. Anomaly detection is a typical example of a data stream application. It is a dynamic process of finding abnormal behaviours in given data streams. In this paper, we discuss the existing anomaly detection techniques for medical data streams. In addition, we present a process of using the Autoregressive Integrated Moving Average (ARIMA) model to analyse ECG data streams.
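As a loose illustration of applying an ARIMA-style model to a signal for anomaly detection, the Python sketch below fits the simplest non-trivial case, ARIMA(1,1,0), by least squares and flags points whose one-step-ahead error is unusually large. The model order, the threshold of three standard deviations, and the synthetic sine wave standing in for an ECG trace are all assumptions. The chapter's actual procedure is not described in the record above.

```python
import math


def arima_110_residuals(series):
    """One-step-ahead residuals of an ARIMA(1,1,0) model fitted by least squares."""
    # d = 1: difference once to remove slow drift.
    diffs = [b - a for a, b in zip(series, series[1:])]
    # p = 1: regress each difference on the previous one (no intercept).
    num = sum(cur * prev for prev, cur in zip(diffs, diffs[1:]))
    den = sum(prev * prev for prev in diffs[:-1])
    phi = num / den if den else 0.0
    # residuals[k] is the error made when predicting series[k + 2].
    return [cur - phi * prev for prev, cur in zip(diffs, diffs[1:])]


def flag_anomalies(series, k=3.0):
    """Indices whose prediction error deviates more than k standard deviations."""
    residuals = arima_110_residuals(series)
    mean = sum(residuals) / len(residuals)
    spread = math.sqrt(sum((r - mean) ** 2 for r in residuals) / len(residuals))
    return [i + 2 for i, r in enumerate(residuals) if abs(r - mean) > k * spread]


# Synthetic stand-in for a periodic signal, with one injected spike at index 60.
signal = [math.sin(0.2 * t) for t in range(120)]
signal[60] += 5.0
flagged = flag_anomalies(signal)
```

The spike also distorts the fitted coefficient and the residuals that immediately follow it, so neighbouring indices may be flagged as well. A streaming setting would more likely fit the model on a trusted window and score new points against it.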
Higher-order structure based anomaly detection on attributed networks
- Authors: Yuan, Xu , Zhou, Na , Yu, Shuo , Huang, Huafei , Chen, Zhikui , Xia, Feng
- Date: 2021
- Type: Text , Conference paper
- Relation: 2021 IEEE International Conference on Big Data, Big Data 2021, virtual online, 15-18 December 2021, Proceedings - 2021 IEEE International Conference on Big Data, Big Data 2021 p. 2691-2700
- Full Text:
- Reviewed:
- Description: Anomaly detection (such as telecom fraud detection and medical image detection) has attracted increasing attention. Complex interactions between multiple entities are widespread in networks and can reflect specific human behavior patterns. Such patterns can be modeled by higher-order network structures, thus benefiting anomaly detection on attributed networks. However, because most existing graph learning methods lack an effective mechanism for exploiting them, these complex interaction patterns go unused in detecting anomalies, hindering the progress of anomaly detection to some extent. To address this issue, we present a higher-order structure based anomaly detection (GUIDE) method. We use an attribute autoencoder and a structure autoencoder to reconstruct node attributes and higher-order structures, respectively. Moreover, we design a graph attention layer to evaluate the significance of neighbors to nodes through their higher-order structure differences. Finally, we leverage node attribute and higher-order structure reconstruction errors to find anomalies. Extensive experiments on five real-world datasets (i.e., ACM, Citation, Cora, DBLP, and Pubmed) are conducted to verify the effectiveness of GUIDE. Experimental results in terms of ROC-AUC, PR-AUC, and Recall@K show that GUIDE significantly outperforms the state-of-the-art methods. © 2021 IEEE.