Defying the gravity of learning curve: A characteristic of nearest neighbour anomaly detectors
- Authors: Ting, Kaiming , Washio, Takashi , Wells, Jonathan , Aryal, Sunil
- Date: 2017
- Type: Text , Journal article
- Relation: Machine Learning Vol. 106, no. 1 (2017), p. 55-91
- Full Text: false
- Description: Conventional wisdom in machine learning holds that every algorithm follows the trajectory of a learning curve, colloquially summarised as ‘the more data, the better’. We call this ‘the gravity of learning curve’, and it is commonly assumed that no learning algorithm is ‘gravity-defiant’. Contrary to this conventional wisdom, this paper provides theoretical analysis and empirical evidence that nearest neighbour anomaly detectors are gravity-defiant algorithms.
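For orientation, a nearest neighbour anomaly detector can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the anomaly score is the Euclidean distance to the nearest point in a random training sample, and the names `nn_anomaly_score`, `make_detector` and `sample_size` are hypothetical. Varying `sample_size` is one way a reader could trace a learning curve for such a detector.

```python
import math
import random


def nn_anomaly_score(point, sample):
    """Return the Euclidean distance from `point` to its nearest neighbour in `sample`.

    Larger distances are read as "more anomalous" (an assumption of this sketch).
    """
    return min(math.dist(point, s) for s in sample)


def make_detector(data, sample_size, seed=0):
    """Build a scoring function from a random subsample of `data`.

    `sample_size` is the amount of training data the detector sees.
    """
    rng = random.Random(seed)
    sample = rng.sample(data, sample_size)
    return lambda point: nn_anomaly_score(point, sample)
```

A detector built with `make_detector(data, 16)` can be compared against one built with `make_detector(data, 128)` on the same labelled test points to see how detection quality changes with the training sample size.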
Local models - the key to boosting stable learners successfully
- Authors: Ting, Kaiming , Zhu, Lian , Wells, Jonathan
- Date: 2013
- Type: Text , Journal article
- Relation: Computational Intelligence Vol. 29, no. 2 (2013), p. 331-356
- Full Text: false
- Description: Boosting has been shown to improve the predictive performance of unstable learners such as decision trees, but not of stable learners such as Support Vector Machines (SVM), k-nearest neighbours and Naive Bayes classifiers. In addition to the model stability problem, the high time complexity of some stable learners, such as SVM, prohibits them from generating the multiple models needed to form an ensemble for large data sets. This paper introduces a simple method that not only enables Boosting to improve the predictive performance of stable learners, but also significantly reduces the computational time needed to generate an ensemble of stable learners such as SVM, making such ensembles feasible for large data sets where they would otherwise be infeasible. The method builds local models instead of global models, and it is the first method, to the best of our knowledge, to solve both problems in Boosting stable learners at the same time. We implement the method by using a decision tree to define local regions and building a local model for each region. We show that this implementation of the proposed method enables successful Boosting of three types of stable learners: SVM, k-nearest neighbours and Naive Bayes classifiers.
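The idea of using a tree to define local regions and fitting one local model per region can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' implementation: the regions come from plain median splits rather than a learned decision tree, the local model is a k-nearest-neighbour vote, and the Boosting step is omitted. The names `build_regions`, `find_region` and `local_knn_predict` are hypothetical.

```python
import math
from collections import Counter


def build_regions(rows, depth, axis=0):
    """Partition `rows` (pairs of feature tuple and label) into local regions.

    Assumption of this sketch: each split is at the median of one feature,
    cycling through the features, instead of a learned decision-tree split.
    """
    if depth == 0 or len(rows) < 4:
        return {"leaf": rows}
    ordered = sorted(rows, key=lambda r: r[0][axis])
    threshold = ordered[len(ordered) // 2][0][axis]
    left = [r for r in rows if r[0][axis] < threshold]
    right = [r for r in rows if r[0][axis] >= threshold]
    if not left or not right:
        # The split separates nothing, so keep this node as a single region.
        return {"leaf": rows}
    next_axis = (axis + 1) % len(rows[0][0])
    return {
        "axis": axis,
        "threshold": threshold,
        "left": build_regions(left, depth - 1, next_axis),
        "right": build_regions(right, depth - 1, next_axis),
    }


def find_region(tree, x):
    """Follow the splits down to the region (list of rows) that contains `x`."""
    while "leaf" not in tree:
        side = "left" if x[tree["axis"]] < tree["threshold"] else "right"
        tree = tree[side]
    return tree["leaf"]


def local_knn_predict(tree, x, k=3):
    """Classify `x` by a k-nearest-neighbour vote restricted to its own region."""
    region = find_region(tree, x)
    nearest = sorted(region, key=lambda r: math.dist(r[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

Because each local model only sees the rows in its own region, the cost of fitting or querying it depends on the region size rather than on the full data set, which is the source of the time saving the description refers to.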
Feature-subspace aggregating: ensembles for stable and unstable learners
- Authors: Ting, Kaiming , Wells, Jonathan , Tan, Swee , Teng, Shyh , Webb, Geoffrey
- Date: 2011
- Type: Text , Journal article
- Relation: Machine Learning Vol. 82, no. 3 (2011), p. 375-397
- Full Text: false
- Description: This paper introduces a new ensemble approach, Feature-Subspace Aggregating (Feating), which builds local models instead of global models. Feating is a generic ensemble approach that can enhance the predictive performance of both stable and unstable learners. In contrast, most existing ensemble approaches can improve the predictive performance of unstable learners only. Our analysis shows that the new approach reduces the execution time needed to generate a model in an ensemble as the level of localisation in Feating increases. Our empirical evaluation shows that Feating performs significantly better than Boosting, Random Subspace and Bagging in terms of predictive accuracy when a stable learner, SVM, is used as the base learner. The speed-up achieved by Feating makes SVM ensembles feasible for large data sets where they would otherwise be infeasible. When SVM is the preferred base learner, we show that Feating SVM performs better than Boosting decision trees and Random Forests. We further demonstrate that Feating also substantially reduces the error of another stable learner, k-nearest neighbour, and of an unstable learner, decision tree.