Simple supervised dissimilarity measure : bolstering iForest-induced similarity with class information without learning
- Authors: Wells, Jonathan , Aryal, Sunil , Ting, Kaiming
- Date: 2020
- Type: Text , Journal article
- Relation: Knowledge and Information Systems Vol. 62, no. 8 (2020), p. 3203-3216
- Full Text: false
- Reviewed:
- Description: Existing distance metric learning methods require optimisation to learn a feature space in which to transform data, which makes them computationally expensive on large datasets. In classification tasks, they use class information to learn an appropriate feature space. In this paper, we present a simple supervised dissimilarity measure that requires no learning or optimisation: it uses class information to measure the dissimilarity of two data instances directly in the input space. It is a supervised version of an existing data-dependent dissimilarity measure called m_e. Our empirical results on k-NN and LVQ classification tasks show that the proposed simple supervised dissimilarity measure generally produces predictive accuracy better than, or at least as good as, existing state-of-the-art supervised and unsupervised dissimilarity measures.
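The base measure m_e referenced above is, as the title suggests, an iForest-induced dissimilarity: roughly, the more massive the smallest random-tree region shared by two instances, the more dissimilar they are. Below is a minimal unsupervised sketch under that assumption; the paper's contribution is a supervised variant that folds class information into the measure, which is not reproduced here. Function names and parameters are illustrative.

```python
import numpy as np

def build_itree(X, idx, depth, max_depth, rng):
    # iForest-style tree: random attribute, random cut between min and max.
    if depth >= max_depth or len(idx) <= 1:
        return {"size": len(idx)}
    q = int(rng.integers(X.shape[1]))
    lo, hi = X[idx, q].min(), X[idx, q].max()
    if lo == hi:
        return {"size": len(idx)}
    p = rng.uniform(lo, hi)
    left, right = idx[X[idx, q] < p], idx[X[idx, q] >= p]
    return {"size": len(idx), "q": q, "p": p,
            "l": build_itree(X, left, depth + 1, max_depth, rng),
            "r": build_itree(X, right, depth + 1, max_depth, rng)}

def m_e(x, y, trees, n):
    # Dissimilarity = mean, over trees, of the relative mass of the
    # deepest node that still contains both x and y.
    masses = []
    for node in trees:
        while "q" in node:
            sx, sy = x[node["q"]] < node["p"], y[node["q"]] < node["p"]
            if sx != sy:
                break
            node = node["l"] if sx else node["r"]
        masses.append(node["size"] / n)
    return float(np.mean(masses))

# Fit t random trees over the data (iForest would subsample each tree).
X = np.random.default_rng(0).normal(size=(200, 4))
rng = np.random.default_rng(1)
trees = [build_itree(X, np.arange(len(X)), 0, 8, rng) for _ in range(50)]
d = m_e(X[0], X[1], trees, len(X))
```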
Isolation-based anomaly detection using nearest-neighbor ensembles
- Authors: Bandaragoda, Tharindu , Ting, Kaiming , Albrecht, David , Liu, Fei , Zhu, Ye , Wells, Jonathan
- Date: 2018
- Type: Text , Journal article
- Relation: Computational Intelligence Vol. 34, no. 4 (2018), p. 968-998
- Full Text: false
- Reviewed:
- Description: The first successful isolation-based anomaly detector, i.e., iForest, uses trees as a means to perform isolation. Although it has been shown to have advantages over existing anomaly detectors, we have identified four weaknesses: its inability to detect local anomalies, anomalies with a high percentage of irrelevant attributes, anomalies that are masked by axis-parallel clusters, and anomalies in multimodal data sets. To overcome these weaknesses, this paper shows that an alternative isolation mechanism is required and thus presents iNNE (isolation using Nearest Neighbor Ensemble). Although relying on nearest neighbors, iNNE runs significantly faster than existing nearest neighbor-based methods such as the local outlier factor, especially on data sets having thousands of dimensions or millions of instances, because the proposed method has linear time complexity and constant space complexity.
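A hedged sketch of the isolation mechanism as I read the paper: each ensemble member takes a small subsample, builds a hypersphere around every subsample point with radius equal to the distance to that point's nearest neighbour within the subsample, and scores a query by the relative radius of the smallest hypersphere covering it (queries covered by no hypersphere get the maximum score). Parameter names (t, psi) and the exact score normalisation are assumptions, not a verified reference implementation.

```python
import numpy as np

def fit_inne(X, t=100, psi=16, seed=0):
    rng = np.random.default_rng(seed)
    members = []
    for _ in range(t):
        S = X[rng.choice(len(X), size=psi, replace=False)]
        D = np.linalg.norm(S[:, None, :] - S[None, :, :], axis=2)
        np.fill_diagonal(D, np.inf)
        nn = D.argmin(axis=1)              # each centre's nearest neighbour in S
        tau = D[np.arange(psi), nn]        # hypersphere radii
        members.append((S, tau, nn))
    return members

def inne_score(x, members):
    # Higher average score = more isolated = more anomalous.
    scores = []
    for S, tau, nn in members:
        d = np.linalg.norm(S - x, axis=1)
        covered = d <= tau
        if not covered.any():
            scores.append(1.0)             # outside every hypersphere
        else:
            idx = np.where(covered)[0]
            c = idx[tau[idx].argmin()]     # smallest covering hypersphere
            scores.append(1.0 - tau[nn[c]] / tau[c])
    return float(np.mean(scores))
```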
Defying the gravity of learning curve : A characteristic of nearest neighbour anomaly detectors
- Authors: Ting, Kaiming , Washio, Takashi , Wells, Jonathan , Aryal, Sunil
- Date: 2017
- Type: Text , Journal article
- Relation: Machine Learning Vol. 106, no. 1 (2017), p. 55-91
- Full Text: false
- Reviewed:
- Description: Conventional wisdom in machine learning says that all algorithms are expected to follow the trajectory of a learning curve, often colloquially summarised as ‘the more data the better’. We call this ‘the gravity of learning curve’, and it is assumed that no learning algorithms are ‘gravity-defiant’. Contrary to this conventional wisdom, the paper provides theoretical analysis and empirical evidence that nearest neighbour anomaly detectors are gravity-defiant algorithms.
LiNearN : A new approach to nearest neighbour density estimator
- Authors: Wells, Jonathan , Ting, Kaiming , Washio, Takashi
- Date: 2014
- Type: Text , Journal article
- Relation: Pattern Recognition Vol. 47, no. 8 (2014), p. 2702-2720
- Full Text: false
- Reviewed:
- Description: Despite their widespread use, nearest neighbour density estimators have two fundamental limitations: O(n^2) time complexity and O(n) space complexity. Both limitations constrain nearest neighbour density estimators to small data sets only. Recent progress using indexing schemes has improved this to near-linear time complexity only. We propose a new approach, called LiNearN for Linear time Nearest Neighbour algorithm, that yields the first nearest neighbour density estimator having, as far as we know, O(n) time complexity and constant space complexity. This is achieved without any indexing scheme, because LiNearN uses a subsampling approach in which the subsample sizes are significantly smaller than the data size. Like existing density estimators, the new density estimator has a parameter that, as our asymptotic analysis reveals, trades off between bias and variance. We show that algorithms based on the new nearest neighbour density estimator easily scale up to data sets with millions of instances in anomaly detection and clustering tasks.
- Highlights: • Rejects the premise that a NN algorithm must find the NN for every instance. • The first NN density estimator with O(n) time complexity and O(1) space complexity. • These complexities are achieved without any indexing scheme. • Asymptotic analysis reveals a bias-variance trade-off. • Easily scales up to large data sets in anomaly detection and clustering tasks.
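The core trick the abstract describes, replacing a global NN search with NN searches inside many small fixed-size subsamples, can be sketched as below. The 1-NN density formula used is the textbook k-NN estimator with k = 1; the paper's exact estimator and normalisation may differ, so treat this as an illustration of the subsampling idea only.

```python
import numpy as np
from math import gamma, pi

def ball_volume(r, d):
    # Volume of a d-dimensional ball of radius r.
    return (pi ** (d / 2) / gamma(d / 2 + 1)) * r ** d

def subsampled_nn_density(x, X, t=100, psi=32, seed=0):
    """1-NN density estimate averaged over t subsamples of size psi.
    Each estimate touches only psi points, so the per-query cost is
    O(t * psi) regardless of len(X), and no index is ever built."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    estimates = []
    for _ in range(t):
        S = X[rng.choice(len(X), size=psi, replace=False)]
        r = np.linalg.norm(S - x, axis=1).min()   # distance to NN in subsample
        estimates.append(1.0 / (psi * ball_volume(max(r, 1e-12), d)))
    return float(np.mean(estimates))
```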
DEMass: a new density estimator for big data
- Authors: Ting, Kaiming , Washio, Takashi , Wells, Jonathan , Liu, Fei , Aryal, Sunil
- Date: 2013
- Type: Text , Journal article
- Relation: Knowledge and Information Systems Vol. 35, no. 3 (2013), p. 493-524
- Full Text: false
- Reviewed:
- Description: Density estimation is the ubiquitous base modelling mechanism employed for many tasks, including clustering, classification, anomaly detection and information retrieval. Commonly used density estimation methods such as the kernel density estimator and the k-nearest neighbour density estimator have high time and space complexities, which render them inapplicable in problems with big data. This weakness sets the fundamental limit in existing algorithms for all these tasks. We propose the first density estimation method with average-case sub-linear time complexity and constant space complexity in the number of instances, which stretches this fundamental limit to the extent that dealing with millions of data points can now be done easily and quickly. We provide an asymptotic analysis of the new density estimator and verify the generality of the method by replacing existing density estimators with the new one in three current density-based algorithms, namely DBSCAN, LOF and Bayesian classifiers, representing the three data mining tasks of clustering, anomaly detection and classification. Our empirical evaluation shows that the new density estimation method significantly improves their time and space complexities, while maintaining or improving their task-specific performances in clustering, anomaly detection and classification. The new method empowers these algorithms, currently limited to small data sizes, to process big data, setting a new benchmark for what density-based algorithms can achieve.
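A minimal sketch of the density-from-mass idea, under assumptions: random half-space trees of fixed depth h halve one side of the region at each split, so every leaf has volume V0/2^h, and the density estimate is (leaf mass × 2^h) / (n × V0), averaged over trees. The midpoint-split tree here is a plain stand-in; the paper's work-space randomisation and training-time mass counting are omitted for brevity, and non-degenerate attribute ranges are assumed.

```python
import numpy as np

def build_hs_tree(lo, hi, depth, rng):
    # Half-space tree: split a random attribute at the midpoint of its range.
    if depth == 0:
        return None
    q = int(rng.integers(len(lo)))
    mid = (lo[q] + hi[q]) / 2.0
    l_hi, r_lo = hi.copy(), lo.copy()
    l_hi[q], r_lo[q] = mid, mid
    return {"q": q, "mid": mid,
            "l": build_hs_tree(lo, l_hi, depth - 1, rng),
            "r": build_hs_tree(r_lo, hi, depth - 1, rng)}

def leaf_id(tree, x):
    path = []
    while tree is not None:
        go_left = x[tree["q"]] < tree["mid"]
        path.append(go_left)
        tree = tree["l"] if go_left else tree["r"]
    return tuple(path)

def demass_density(x, X, t=25, h=6, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    hi = np.where(hi > lo, hi, lo + 1e-9)     # guard degenerate ranges
    v0 = float(np.prod(hi - lo))              # volume of the root region
    n = len(X)
    estimates = []
    for _ in range(t):
        tree = build_hs_tree(lo, hi, h, rng)
        target = leaf_id(tree, x)
        # O(n) pass for clarity; the real method records leaf masses once
        # at training time, which is where the sub-linear query cost comes from.
        mass = sum(leaf_id(tree, xi) == target for xi in X)
        estimates.append(mass * 2 ** h / (n * v0))
    return float(np.mean(estimates))
```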
Local models - the key to boosting stable learners successfully
- Authors: Ting, Kaiming , Zhu, Lian , Wells, Jonathan
- Date: 2013
- Type: Text , Journal article
- Relation: Computational Intelligence Vol. 29, no. 2 (2013), p. 331-356
- Full Text: false
- Reviewed:
- Description: Boosting has been shown to improve the predictive performance of unstable learners such as decision trees, but not of stable learners like Support Vector Machines (SVM), k-nearest neighbours and Naive Bayes classifiers. In addition to the model stability problem, the high time complexity of some stable learners such as SVM prohibits them from generating multiple models to form an ensemble for large data sets. This paper introduces a simple method that not only enables Boosting to improve the predictive performance of stable learners, but also significantly reduces the computational time needed to generate an ensemble of stable learners such as SVM for large data sets that would otherwise be infeasible. The method builds local models instead of global models, and it is, to the best of our knowledge, the first method to solve both problems in Boosting stable learners at the same time. We implement the method by using a decision tree to define local regions and building a local model for each local region. We show that this implementation of the proposed method enables successful Boosting of three types of stable learners: SVM, k-nearest neighbours and Naive Bayes classifiers.
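A hedged sketch of the implementation the abstract describes: a shallow decision tree defines local regions, a stable learner is fitted inside each region, and this composite serves as the base learner inside AdaBoost. The scikit-learn wiring is illustrative; in particular, sample weights here only steer the region-defining tree (a simplification), and the `estimator` keyword is `base_estimator` on scikit-learn < 1.2.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

class LocalModelLearner(BaseEstimator, ClassifierMixin):
    """A shallow decision tree carves out local regions; one stable
    learner (here a linear SVM) is fitted per region."""
    def __init__(self, local=None, max_leaf_nodes=8):
        self.local = local
        self.max_leaf_nodes = max_leaf_nodes

    def fit(self, X, y, sample_weight=None):
        X, y = np.asarray(X), np.asarray(y)
        self.classes_ = np.unique(y)
        self.tree_ = DecisionTreeClassifier(max_leaf_nodes=self.max_leaf_nodes)
        self.tree_.fit(X, y, sample_weight=sample_weight)  # weights steer regions only
        leaves = self.tree_.apply(X)
        self.models_ = {}
        for leaf in np.unique(leaves):
            m = leaves == leaf
            if len(np.unique(y[m])) > 1:
                base = self.local if self.local is not None else SVC(kernel="linear")
                self.models_[leaf] = clone(base).fit(X[m], y[m])
            else:
                self.models_[leaf] = y[m][0]      # pure region: constant label
        return self

    def predict(self, X):
        X = np.asarray(X)
        leaves = self.tree_.apply(X)
        out = np.empty(len(X), dtype=self.classes_.dtype)
        for leaf in np.unique(leaves):
            m = leaves == leaf
            model = self.models_[leaf]
            out[m] = model.predict(X[m]) if hasattr(model, "predict") else model
        return out

# Boost the composite learner; Boosting NB or k-NN locally is analogous.
booster = AdaBoostClassifier(estimator=LocalModelLearner(), n_estimators=20,
                             algorithm="SAMME")
```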
Density estimation based on mass
- Authors: Ting, Kaiming , Washio, Takashi , Wells, Jonathan , Liu, Fei
- Date: 2011
- Type: Text , Conference paper
- Relation: 11th IEEE International Conference on Data Mining (ICDM 2011) p. 715-724
- Full Text: false
- Reviewed:
- Description: Density estimation is the ubiquitous base modelling mechanism employed for many tasks such as clustering, classification, anomaly detection and information retrieval. Commonly used density estimation methods such as the kernel density estimator and the k-nearest neighbour density estimator have high time and space complexities, which render them inapplicable in problems with large data sizes and even a moderate number of dimensions. This weakness sets the fundamental limit in existing algorithms for all these tasks. We propose the first density estimation method which stretches this fundamental limit to the extent that dealing with millions of data points can now be done easily and quickly. We analyse the error of the new estimator (from the true density) using a bias-variance analysis. We then perform an empirical evaluation of the proposed method by replacing existing density estimators with the new one in two current density-based algorithms, namely DBSCAN and LOF. The results show that the new density estimation method significantly improves the runtime of DBSCAN and LOF, while maintaining or improving their task-specific performances in clustering and anomaly detection, respectively. The new method empowers these algorithms, currently limited to small data sizes, to process very large databases, setting a new benchmark for what density-based algorithms can achieve.
Feature-subspace aggregating: ensembles for stable and unstable learners
- Authors: Ting, Kaiming , Wells, Jonathan , Tan, Swee , Teng, Shyh , Webb, Geoffrey
- Date: 2011
- Type: Text , Journal article
- Relation: Machine Learning Vol. 82, no. 3 (2011), p. 375-397
- Full Text: false
- Reviewed:
- Description: This paper introduces a new ensemble approach, Feature-Subspace Aggregating (Feating), which builds local models instead of global models. Feating is a generic ensemble approach that can enhance the predictive performance of both stable and unstable learners. In contrast, most existing ensemble approaches can improve the predictive performance of unstable learners only. Our analysis shows that the new approach reduces the execution time to generate a model in an ensemble through an increased level of localisation. Our empirical evaluation shows that Feating performs significantly better than Boosting, Random Subspace and Bagging in terms of predictive accuracy when a stable learner, SVM, is used as the base learner. The speed-up achieved by Feating makes SVM ensembles feasible for large data sets on which they would otherwise be infeasible. When SVM is the preferred base learner, we show that Feating SVM performs better than Boosting decision trees and Random Forests. We further demonstrate that Feating also substantially reduces the error of another stable learner, k-nearest neighbour, and an unstable learner, decision tree.
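A hedged sketch of the Feating idea: enumerate every subset of r attributes, partition each such feature-subspace into cells, fit one local model per cell, and aggregate predictions by majority vote. The quantile binning below is a stand-in for the paper's subdivision scheme, and the fallback to the global majority class for cells unseen at training time is my own simplification.

```python
import itertools
import numpy as np
from sklearn.base import clone
from sklearn.svm import SVC

class Feating:
    def __init__(self, base=None, r=2, bins=3):
        self.base, self.r, self.bins = base, r, bins

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        vals, counts = np.unique(y, return_counts=True)
        self.default_ = vals[counts.argmax()]          # global majority fallback
        qs = np.linspace(0, 1, self.bins + 1)[1:-1]    # interior quantile cuts
        self.edges_ = [np.quantile(X[:, j], qs) for j in range(X.shape[1])]
        self.members_ = []
        for combo in itertools.combinations(range(X.shape[1]), self.r):
            cells = self._cell(X, combo)
            models = {}
            for c in np.unique(cells):
                m = cells == c
                if len(np.unique(y[m])) > 1:
                    base = self.base if self.base is not None else SVC()
                    models[c] = clone(base).fit(X[m], y[m])
                else:
                    models[c] = y[m][0]                # pure cell: constant label
            self.members_.append((combo, models))
        return self

    def _cell(self, X, combo):
        # Grid-cell id over the chosen attribute subset.
        digits = [np.digitize(X[:, j], self.edges_[j]) for j in combo]
        return sum(d * self.bins ** k for k, d in enumerate(digits))

    def predict(self, X):
        X = np.asarray(X)
        votes = []
        for combo, models in self.members_:
            cells = self._cell(X, combo)
            pred = np.empty(len(X), dtype=object)
            for c in np.unique(cells):
                m = cells == c
                mod = models.get(c, self.default_)
                pred[m] = mod.predict(X[m]) if hasattr(mod, "predict") else mod
            votes.append(pred)
        V = np.array(votes, dtype=object)              # (n_members, n_queries)
        maj = []
        for i in range(V.shape[1]):                    # majority vote per query
            vals, counts = np.unique(V[:, i], return_counts=True)
            maj.append(vals[counts.argmax()])
        return np.array(maj)
```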
Multi-dimensional mass estimation and mass-based clustering
- Authors: Ting, Kaiming , Wells, Jonathan
- Date: 2010
- Type: Text , Conference paper
- Relation: Proceedings of the 10th IEEE International Conference on Data Mining (ICDM) p. 511-520
- Full Text: false
- Reviewed:
- Description: Mass estimation, an alternative to density estimation, has recently been shown to be an effective base modelling mechanism for three data mining tasks: regression, information retrieval and anomaly detection. This paper advances this work in two directions. First, we generalise the previously proposed one-dimensional mass estimation to multi-dimensional mass estimation, and significantly reduce the time complexity from O(ψ^h) to O(ψh), making it feasible for a full range of generic problems. Second, we introduce the first clustering method based on mass; it is unique because it employs no distance or density measure. The structure of the new mass model enables different parts of a cluster to be identified and merged without expensive evaluations. The characteristics of the new clustering method are: (i) it can identify arbitrary-shape clusters; (ii) it is significantly faster than existing density-based or distance-based methods; and (iii) it is noise-tolerant.
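The one-dimensional mass estimation that this paper generalises can be sketched very compactly: the mass of a point is the expected number of data points falling on its side of a random split, so core points get high mass and fringe points low mass. This is a hedged reading of the earlier mass-estimation work the abstract references; the exact split distribution and constants may differ.

```python
import numpy as np

def mass_1d(queries, data, t=1000, seed=0):
    """Level-1 one-dimensional mass: for each of t random split points s,
    a query left of s scores |{x <= s}|, otherwise |{x > s}|; the mass is
    the average score. Central points land on the heavier side more often,
    so mass peaks at the core and decays toward the fringe."""
    rng = np.random.default_rng(seed)
    data = np.sort(np.asarray(data))
    splits = rng.uniform(data[0], data[-1], size=t)
    queries = np.asarray(queries)
    mass = np.zeros(len(queries))
    for s in splits:
        n_left = np.searchsorted(data, s, side="right")
        left = queries <= s
        mass[left] += n_left
        mass[~left] += len(data) - n_left
    return mass / t

# Example: the sample's centre gets higher mass than an extreme point.
data = np.random.default_rng(1).normal(size=500)
print(mass_1d(np.array([0.0, 3.5]), data))
```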
FaSS : Ensembles for stable learners
- Authors: Ting, Kaiming , Wells, Jonathan , Tan, Swee , Teng, Shyh , Webb, Geoffrey
- Date: 2009
- Type: Text , Conference paper
- Relation: 8th International Workshop on Multiple Classifier Systems (MCS 2009)
- Full Text: false
- Reviewed:
- Description: This paper introduces a new ensemble approach, Feature-Space Subdivision (FaSS), which builds local models instead of global models. FaSS is a generic ensemble approach that can use either stable or unstable models as its base models. In contrast, existing ensemble approaches which employ randomisation can only use unstable models. Our analysis shows that the new approach reduces the execution time to generate a model in an ensemble as the level of localisation in FaSS increases. Our empirical evaluation shows that FaSS performs significantly better than Boosting in terms of predictive accuracy when a stable learner, SVM, is used as the base learner. The speed-up achieved by FaSS makes feasible SVM ensembles that would otherwise be infeasible for large data sets, and FaSS SVM performs better than Boosting J48 and Random Forests when SVM is the preferred base learner.