A computing perspective on scientific Chinese trinity
- Authors: Sun, Zhaohao , Wang, Paul
- Date: 2013
- Type: Text , Journal article
- Relation: New Mathematics and Natural Computation Vol. 9, no. 2 (2013), p. 129-152
- Full Text:
- Reviewed:
- Description: The unprecedented and rapid development of the Chinese economy has been vividly displayed for the whole world to see. The attention has been particularly acute among academics and career politicians alike. Ironically, this rapid economic miracle of China has been built on an unsound and often questionable foundation of Chinese words, language and culture, which we call the "Chinese trinity". This paper deals with the Chinese trinity from a computing science perspective. It argues that reform of the Chinese trinity, with emphasis on the word "scientific", ought to play a key role in further Chinese economic development and in launching a much improved contemporary Chinese society on a solid foundation. In addition, this paper proposes ten computing paradigms and critically examines their potential impacts on the scientific Chinese trinity. Finally, we feel the focused approaches proposed here might inspire, as well as provide, a much needed road map toward the goal of the scientific Chinese trinity. Judiciously chosen, vigorous research projects appear indispensable. The well-known and long overdue reform has finally been rescued by the pressure of the information revolution coming of age. © 2013 World Scientific Publishing Company.
A performance review of recent corner detectors
- Authors: Awrangjeb, Mohammad , Lu, Guojun
- Date: 2013
- Type: Text , Conference paper
- Relation: International Conference on Digital Image Computing: Techniques and Applications, 26 November 2013 to 28 November 2013 p. 157-164
- Full Text:
- Reviewed:
- Description: Contour-based corner detectors directly or indirectly estimate a significance measure (e.g., curvature) on the points of a planar curve and select the curvature extrema as corners. A number of promising contour-based corner detectors have recently been proposed. They mainly differ in how the curvature is estimated at each point of the given curve. As the curvature of a digital curve can only be approximated, it is important to use an estimate that remains stable against significant noise on the curve, for example from geometric transformations and compression. Moreover, in many applications, for instance content-based image retrieval, a fast corner detector is a prerequisite, so the time a detector takes to find corners in a given image is also a primary characteristic. In addition, different authors have evaluated their detectors on different platforms using different evaluation systems. Evaluation systems that depend on human judgement and visual identification of corners are manual and too subjective, and applying a manual system to a large test database is expensive. Therefore, it is important to evaluate the detectors on a common platform using an automatic evaluation system. This paper first reviews the six most recent and best-performing corner detectors and analyses their theoretical running time. It then uses an automatic evaluation system to analyse their performance. Both robustness to noise and efficiency are estimated to rank the detectors.
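The curvature-extrema idea this abstract describes can be illustrated with a minimal sketch: approximate the curvature at each point of a digitised curve and keep the local maxima above a threshold. This is an assumption-laden toy, not any of the six surveyed detectors; the function names and the finite-difference curvature estimate are invented for illustration.

```python
import numpy as np

def discrete_curvature(points):
    """Finite-difference curvature estimate at each point of a planar curve.

    A crude stand-in for the smoothed estimators real detectors use; actual
    detectors smooth the curve first to suppress digitisation noise.
    """
    pts = np.asarray(points, dtype=float)
    d1 = np.gradient(pts, axis=0)                      # (x', y') per point
    d2 = np.gradient(d1, axis=0)                       # (x'', y'') per point
    num = np.abs(d1[:, 0] * d2[:, 1] - d1[:, 1] * d2[:, 0])
    den = (d1[:, 0] ** 2 + d1[:, 1] ** 2) ** 1.5
    return num / np.maximum(den, 1e-12)                # |x'y'' - y'x''| / |v|^3

def detect_corners(points, threshold):
    """Indices of interior points whose curvature is a local maximum above threshold."""
    k = discrete_curvature(points)
    return [i for i in range(1, len(k) - 1)
            if k[i] > threshold and k[i] >= k[i - 1] and k[i] >= k[i + 1]]
```

On an L-shaped polyline this flags only the bend; the surveyed detectors differ precisely in how they replace the naive curvature estimate above with noise-robust ones.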
ACSP-Tree: A tree structure for mining behavioral patterns from wireless sensor networks
- Authors: Rashid, Md. Mamunur , Gondal, Iqbal , Kamruzzaman, Joarder
- Date: 2013
- Type: Text , Conference paper
- Relation: IEEE Conference on Local Computer Networks (LCN 2013) (21 October 2013 to 24 October 2013) p. 691-694
- Full Text: false
- Reviewed:
- Description: WSNs generate a large amount of data in the form of streams, and mining knowledge from these data streams can be extremely useful. Association rule mining from sensor data has been studied in the recent literature. However, sensor association rule mining often produces a huge number of rules, most of which are either redundant or fail to reflect the true correlations among data objects. In this paper, we address this problem and propose mining a new type of sensor behavioral pattern called associated-correlated sensor patterns. The proposed behavioral patterns capture not only association-like co-occurrences but also the substantial temporal correlations implied by such co-occurrences in the sensor data. We also use a prefix-tree-based structure called the associated-correlated sensor pattern tree (ACSP-tree), which facilitates a frequent pattern (FP) growth-based mining technique to generate all associated-correlated patterns from WSN data with only one scan over the sensor database. An extensive performance study shows that our approach is more time- and memory-efficient in finding associated-correlated patterns than the most efficient existing algorithms.
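The single-scan, prefix-tree idea can be sketched with a count-annotated trie over sensor "epochs" (the sets of sensors that trigger together). This toy only answers support queries for sorted prefixes of inserted epochs; the actual ACSP-tree adds FP-tree-style header links and correlation measures. All names here are illustrative.

```python
class Node:
    """A node in a count-annotated prefix tree over sorted sensor IDs."""
    __slots__ = ("children", "count")
    def __init__(self):
        self.children = {}
        self.count = 0

class PatternTree:
    """Toy prefix tree built in a single scan over sensor epochs.

    Each epoch is inserted once, so support counts accumulate in one pass
    over the database, mirroring the one-scan property claimed for the
    ACSP-tree (which is far richer than this sketch).
    """
    def __init__(self):
        self.root = Node()

    def insert(self, sensors):
        node = self.root
        for s in sorted(sensors):
            node = node.children.setdefault(s, Node())
            node.count += 1

    def support(self, sensors):
        """Epochs sharing `sensors` as a sorted prefix (0 if never seen)."""
        node = self.root
        for s in sorted(sensors):
            if s not in node.children:
                return 0
            node = node.children[s]
        return node.count
```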
An adaptive strategy for assortative mating in genetic algorithm
- Authors: Nazmul, Rumana , Chetty, Madhu
- Date: 2013
- Type: Text , Conference paper
- Relation: 2013 IEEE Congress on Evolutionary Computation p. 2237-2244
- Full Text: false
- Reviewed:
- Description: In any traditional Genetic Algorithm (GA), recombination is a dominant search operator, capable of exploring the search space by sharing genetic information among the individuals in the population. However, a simple application of recombination alone is insufficient to guide convergence to an optimal solution. The selection of parents for recombination has a significant role in guiding the evolution towards the optimal solution and in maintaining genetic diversity to avoid getting trapped in local minima. Non-random mating mimics the mechanism of reproduction in nature and is effective in maintaining diversity in the population. This paper proposes a new strategy for selecting mating pairs based on a type of non-random mating called assortative mating. The proposed mate selection scheme conserves the merits of both positive and negative assortative mating in a controlled manner by allowing mating between individuals with both similar and dissimilar phenotypes. For effective crossover, it maintains genetic diversity in the population by distributing recombination among dissimilar individuals. Furthermore, it ensures the preservation and propagation of useful genetic information to the later stages of the search by selecting mates with similar phenotypes. Experimental results, using not only the five widely used benchmark functions but also twenty newly developed modified functions, are reported. The results show significant improvements in the convergence characteristics of the proposed mating strategy over existing non-random mating techniques.
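A minimal sketch of controlled assortative mate selection, assuming a randomly drawn candidate pool and a user-supplied phenotype distance (both assumptions; the paper's adaptive scheme is more elaborate): with probability `p_positive` the most similar candidate is chosen (positive assortative mating), otherwise the most dissimilar one (negative assortative mating).

```python
import random

def select_mate(population, parent, n_candidates, p_positive, distance, rng=random):
    """Controlled assortative mate selection (illustrative sketch).

    Draws n_candidates individuals at random; with probability p_positive
    returns the one most similar to `parent` under `distance`, otherwise
    the most dissimilar one. Varying p_positive over the run would give an
    adaptive balance between exploitation and diversity.
    """
    pool = [ind for ind in population if ind is not parent]
    candidates = rng.sample(pool, n_candidates)
    if rng.random() < p_positive:
        return min(candidates, key=lambda ind: distance(parent, ind))  # similar mate
    return max(candidates, key=lambda ind: distance(parent, ind))      # dissimilar mate
```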
Attribute weighted Naive Bayes classifier using a local optimization
- Authors: Taheri, Sona , Yearwood, John , Mammadov, Musa , Seifollahi, Sattar
- Date: 2013
- Type: Text , Journal article
- Relation: Neural Computing & Applications Vol. 24, no. 5 (2013), p. 995-1002
- Full Text:
- Reviewed:
- Description: The Naive Bayes classifier is a popular classification technique for data mining and machine learning. It has been shown to be very effective on a variety of data classification problems. However, the strong assumption that all attributes are conditionally independent given the class is often violated in real-world applications. Numerous methods have been proposed to improve the performance of the Naive Bayes classifier by alleviating the attribute independence assumption; however, relaxing the assumption in this way can increase the expected error. Another alternative is to assign weights to the attributes. In this paper, we propose a novel attribute weighted Naive Bayes classifier that applies weights to the conditional probabilities. An objective function based on the structure of the Naive Bayes classifier and the attribute weights is modeled, and the optimal weights are determined by a local optimization method using the quasisecant method. In the proposed approach, the Naive Bayes classifier is taken as a starting point. We report the results of numerical experiments on several real-world data sets in binary classification, which show the efficiency of the proposed method.
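The weighting idea can be sketched by raising each conditional probability to an attribute weight, i.e. scoring a class as log P(c) + sum_i w_i * log P(x_i | c). This is a hand-rolled illustration with toy probability tables; the weight estimation itself (the paper uses a quasisecant local optimiser) is not shown.

```python
import math

def weighted_nb_predict(priors, cond_probs, weights, x):
    """Attribute-weighted Naive Bayes prediction (illustrative sketch).

    priors maps class -> P(c); cond_probs maps class -> a list of
    {value: P(value | c)} tables, one table per attribute. Each conditional
    probability is effectively raised to its attribute weight w_i, so
    weights of 1.0 everywhere recover the ordinary Naive Bayes classifier
    and a weight of 0.0 switches an attribute off.
    """
    best, best_score = None, -math.inf
    for c, prior in priors.items():
        score = math.log(prior)
        for i, v in enumerate(x):
            score += weights[i] * math.log(cond_probs[c][i][v])
        if score > best_score:
            best, best_score = c, score
    return best
```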
Automated unsupervised authorship analysis using evidence accumulation clustering
- Authors: Layton, Robert , Watters, Paul , Dazeley, Richard
- Date: 2013
- Type: Text , Journal article
- Relation: Natural Language Engineering Vol. 19, no. 1 (2013), p. 95-120
- Full Text:
- Reviewed:
- Description: Authorship Analysis aims to extract information about the authorship of documents from features within those documents. Typically, this is performed as a classification task with the aim of identifying the author of a document, given a set of documents of known authorship. Alternatively, unsupervised methods have been developed primarily as visualisation tools to assist the manual discovery of clusters of authorship within a corpus by analysts. However, there is a need in many fields for more sophisticated unsupervised methods to automate the discovery, profiling and organisation of related information through clustering of documents by authorship. An automated and unsupervised methodology for clustering documents by authorship is proposed in this paper. The methodology is named NUANCE, for n-gram Unsupervised Automated Natural Cluster Ensemble. Testing indicates that the derived clusters have a strong correlation to the true authorship of unseen documents. © 2011 Cambridge University Press.
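The character n-gram representation underlying methods like NUANCE can be sketched as follows: each document becomes a profile of relative n-gram frequencies, and an L1 distance between profiles serves as a simple authorship distance. The specific profile and distance choices here are illustrative assumptions, not NUANCE's exact pipeline.

```python
from collections import Counter

def ngram_profile(text, n=3):
    """Relative frequencies of the character n-grams occurring in `text`."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.items()}

def profile_distance(p, q):
    """L1 distance between two n-gram profiles (0.0 means identical usage)."""
    return sum(abs(p.get(g, 0.0) - q.get(g, 0.0)) for g in set(p) | set(q))
```

A clustering ensemble would then group documents whose pairwise profile distances are small, accumulating evidence across multiple runs.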
Backbreak prediction in the Chadormalu iron mine using artificial neural network
- Authors: Monjezi, Masoud , Ahmadi, Zabiholla , Yazdian-Varjani, Ali , Khandelwal, Manoj
- Date: 2013
- Type: Text , Journal article
- Relation: Neural Computing and Applications Vol. 23, no. 3-4 (2013), p. 1101-1107
- Full Text: false
- Reviewed:
- Description: Backbreak is one of the unfavorable blasting results and can be defined as the unwanted rock breakage behind the last row of blast holes. Blast pattern parameters, like stemming, burden, delay timing, stiffness ratio (bench height/burden) and rock mass conditions (e.g., geo-mechanical properties and joints), affect backbreak intensity. To date, with the exception of some qualitative guidelines, no specific method has been developed for predicting the phenomenon. In this paper, an effort has been made to apply artificial neural networks (ANNs) to predict backbreak in the blasting operation of the Chadormalu iron mine (Iran). A number of ANN models with different hidden layers and neurons were tried, and it was found that a network with a 10-7-7-1 architecture is the optimum model. A comparative study also confirmed the superiority of ANN modeling over conventional regression analysis. Mean square error (MSE), variance account for (VAF) and coefficient of determination (R²) between measured and predicted backbreak for the ANN model were calculated and found to be 89.46 %, 0.714 and 90.02 %, respectively. For the regression model, MSE, VAF and R² were computed and found to be 66.93 %, 1.46 and 68.10 %, respectively. Sensitivity analysis was also carried out to find the influence of each input parameter on backbreak, and it revealed that burden is the most influential parameter, whereas water content is the least effective parameter in this regard. © 2012 Springer-Verlag London Limited.
Building roof plane extraction from LIDAR data
- Authors: Awrangjeb, Mohammad , Lu, Guojun
- Date: 2013
- Type: Text , Conference paper
- Relation: 2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA)
- Full Text:
- Reviewed:
- Description: This paper presents a new segmentation technique to use LIDAR point cloud data for automatic extraction of building roof planes. The raw LIDAR points are first classified into two major groups: ground and non-ground points. The ground points are used to generate a 'building mask' in which the black areas represent the ground where there are no laser returns below a certain height. The non-ground points are segmented to extract the planar roof segments. First, the building mask is divided into small grid cells. The cells containing the black pixels are clustered such that each cluster represents an individual building or tree. Second, the non-ground points within a cluster are segmented based on their coplanarity and neighbourhood relations. Third, the planar segments are refined using a rule-based procedure that assigns the common points among the planar segments to the appropriate segments. Finally, another rule-based procedure is applied to remove tree planes which are generally small in size and randomly oriented. Experimental results on three Australian sites have shown that the proposed method offers high building detection and roof plane extraction rates.
Clustered memetic algorithm with local heuristics for ab initio protein structure prediction
- Authors: Islam, M. D. , Chetty, Madhu
- Date: 2013
- Type: Text , Journal article
- Relation: IEEE Transactions on Evolutionary Computation Vol. 17, no. 4 (2013), p. 558-576
- Full Text: false
- Reviewed:
- Description: Low-resolution protein models are often used within a hierarchical framework for structure prediction. However, even with these simplified but realistic protein models, the search for the optimal solution remains NP-complete. The complexity is further compounded by the multimodal nature of the search space. In this paper, we propose a systematic design of an evolutionary search technique, namely the memetic algorithm (MA), to effectively search the vast search space by exploiting the domain-specific knowledge and taking cognizance of the multimodal nature of the search space. The proposed MA achieves this by incorporating various novel features: 1) a modified fitness function includes two additional terms to account for the hydrophobic and polar nature of the residues; 2) a systematic (rather than random) generation of population automatically prevents an occurrence of invalid conformations; 3) a generalized nonisomorphic encoding scheme implicitly eliminates generation of twins (similar conformations) in the population; 4) the identification of a meme (protein substructures) during optimization from different basins of attraction - a process that is equivalent to implicit applications of threading principles; 5) a clustering of the population corresponds to basins of attraction that allows evolution to overcome the complexity of multimodal search space, thereby avoiding search getting trapped in a local optimum; and 6) a 2-stage framework gathers domain knowledge (i.e., substructures or memes) from different basins of attraction for a combined execution in the second stage. Experiments conducted with different lattice models using known benchmark protein sequences, and comparisons carried out with recently reported approaches in this journal, show the robustness, speed, accuracy and superior performance of the proposed algorithm. The approach is generic and can easily be extended for applications to other classes of problems.
DEMass: a new density estimator for big data
- Authors: Ting, Kaiming , Washio, Takashi , Wells, Jonathan , Liu, Fei , Aryal, Sunil
- Date: 2013
- Type: Text , Journal article
- Relation: Knowledge and Information Systems Vol. 35, no. 3 (2013), p. 493-524
- Full Text: false
- Reviewed:
- Description: Density estimation is the ubiquitous base modelling mechanism employed for many tasks including clustering, classification, anomaly detection and information retrieval. Commonly used density estimation methods such as kernel density estimator and k-nearest neighbour density estimator have high time and space complexities which render them inapplicable in problems with big data. This weakness sets the fundamental limit in existing algorithms for all these tasks. We propose the first density estimation method, having average case sub-linear time complexity and constant space complexity in the number of instances, that stretches this fundamental limit to an extent that dealing with millions of data can now be done easily and quickly. We provide an asymptotic analysis of the new density estimator and verify the generality of the method by replacing existing density estimators with the new one in three current density-based algorithms, namely DBSCAN, LOF and Bayesian classifiers, representing three different data mining tasks of clustering, anomaly detection and classification. Our empirical evaluation results show that the new density estimation method significantly improves their time and space complexities, while maintaining or improving their task-specific performances in clustering, anomaly detection and classification. The new method empowers these algorithms, currently limited to small data size only, to process big data—setting a new benchmark for what density-based algorithms can achieve.
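The mass-based intuition (density from the point count, or mass, of a randomly generated region, rather than from pairwise distances) can be caricatured in one dimension: cut the data range at a random point, keep the side containing the query, and average mass/(n * length) over many cuts. This is only the depth-1 intuition under invented names; DEMass itself uses multi-dimensional, multi-level random partitions.

```python
import random

def mass_density(data, x, trials=200, rng=None):
    """One-dimensional caricature of mass-based density estimation.

    For each trial, cut [lo, hi] at a uniform-random point c, keep the side
    containing the query x, and score mass / (n * length), where mass is
    the number of data points falling in that side. No pairwise distances
    are ever computed, which is the source of the efficiency claim.
    """
    rng = rng or random.Random(0)
    lo, hi = min(data), max(data)
    n = len(data)
    total = 0.0
    for _ in range(trials):
        c = rng.uniform(lo, hi)
        a, b = (lo, c) if x < c else (c, hi)                 # side containing x
        mass = sum(1 for v in data if a <= v < b or (b == hi and v == hi))
        total += mass / (n * (b - a))
    return total / trials
```

Queries near a dense cluster land in small, heavily populated regions and so receive higher estimates than queries in sparse territory.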
Efficient nonlinear classification via low-rank regularised least squares
- Authors: Fu, Zhouyu , Lu, Guojun , Ting, Kaiming , Zhang, Dengsheng
- Date: 2013
- Type: Text , Journal article
- Relation: Neural Computing and Applications Vol. 22, no. 7-8 (2013), p. 1279-1289
- Full Text: false
- Reviewed:
- Description: We revisit the classical technique of regularised least squares (RLS) for nonlinear classification in this paper. Specifically, we focus on a low-rank formulation of the RLS, which has linear time complexity in the size of the data set only, independent of both the number of classes and the number of features. This makes low-rank RLS particularly suitable for problems with large data and moderate feature dimensions. Moreover, we propose a general theorem for obtaining a closed-form estimate of the prediction values on a holdout validation set, given the low-rank RLS classifier trained on the whole training data. It is thus possible to obtain an error estimate for each parameter setting without retraining, greatly accelerating cross-validation for parameter selection. Experimental results on several large-scale benchmark data sets show that low-rank RLS achieves comparable classification performance while being much more efficient than standard kernel SVM for nonlinear classification. The improvement in efficiency is more evident for data sets with higher dimensions.
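For reference, plain RLS already has a closed-form solution; the sketch below shows only the linear ridge solution w = (X'X + lam*I)^-1 X'y that the low-rank kernel formulation builds on. The low-rank machinery and the holdout-prediction theorem of the paper are not reproduced here.

```python
import numpy as np

def rls_fit(X, y, lam):
    """Closed-form regularised least squares (plain linear version).

    Solves the normal equations (X'X + lam*I) w = X'y. The paper's low-rank
    kernel variant keeps this same structure but replaces the kernel matrix
    with a rank-m approximation to stay linear in the number of instances.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
```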
Evaluating authorship distance methods using the positive Silhouette coefficient
- Authors: Layton, Robert , Watters, Paul , Dazeley, Richard
- Date: 2013
- Type: Text , Journal article
- Relation: Natural Language Engineering Vol. 19, no. 4 (2013), p. 517-535
- Full Text:
- Reviewed:
- Description: Unsupervised Authorship Analysis (UAA) aims to cluster documents by authorship without knowing the authorship of any documents. An important factor in UAA is the method for calculating the distance between documents; this choice is considered more critical to the end result than the choice of cluster analysis algorithm. One method for measuring the correlation between a distance metric and a labelling (such as class values or clusters) is the Silhouette Coefficient (SC). The SC can be leveraged by measuring the correlation between the authorship distance method and the true authorship, evaluating the quality of the distance method. However, we show that the SC can be severely affected by outliers. To address this issue, we introduce the Positive Silhouette Coefficient (PSC), defined as the proportion of instances with a positive SC value. This measure is not easily altered by outliers and is therefore more robust. A large number of authorship distance methods are then compared using the PSC, and the findings are presented. This research provides insight into the efficacy of methods for UAA and presents a framework for testing authorship distance methods.
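The Positive Silhouette Coefficient is straightforward to compute from a distance matrix and a labelling. A sketch, assuming the standard silhouette definition s(i) = (b - a)/max(a, b), where a is the mean distance to i's own cluster and b the mean distance to the nearest other cluster:

```python
import numpy as np

def silhouette_values(D, labels):
    """Per-instance silhouette s(i) = (b - a) / max(a, b) from a full
    pairwise distance matrix D and integer cluster labels."""
    labels = np.asarray(labels)
    s = np.zeros(len(labels))
    for i in range(len(labels)):
        same = labels == labels[i]
        same[i] = False
        if not same.any():            # singleton cluster: silhouette left at 0
            continue
        a = D[i][same].mean()
        b = min(D[i][labels == c].mean()
                for c in set(labels.tolist()) if c != labels[i])
        s[i] = (b - a) / max(a, b)
    return s

def positive_silhouette(D, labels):
    """PSC: proportion of instances with a strictly positive silhouette."""
    return float((silhouette_values(D, labels) > 0).mean())
```

Because the PSC only counts the sign of each silhouette, a single extreme outlier shifts it by at most 1/n, which is the robustness property the abstract claims.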
Evaluation and prediction of blast-induced ground vibration at Shur River Dam, Iran, by artificial neural network
- Authors: Monjezi, Masoud , Hasanipanah, Mahdi , Khandelwal, Manoj
- Date: 2013
- Type: Text , Journal article
- Relation: Neural Computing and Applications Vol. 22, no. 7-8 (2013), p. 1637-1643
- Full Text: false
- Reviewed:
- Description: The purpose of this article is to evaluate and predict blast-induced ground vibration at Shur River Dam in Iran using different empirical vibration predictors and an artificial neural network (ANN) model. Ground vibration is a seismic wave that spreads out from the blasthole when an explosive charge is detonated in a confined manner. Ground vibrations were recorded and monitored in and around the Shur River Dam, Iran, at different vulnerable and strategic locations. A total of 20 blast vibration records were monitored, of which 16 data sets were used for training the ANN model as well as for determining the site constants of the various vibration predictors. The remaining 4 data sets were used for validation and comparison of the results of the ANN and the different empirical predictors. Performances of the different predictor models were assessed using standard statistical evaluation criteria. Finally, it was found that the ANN model is more accurate than the various available empirical models; a high conformity (R² = 0.927) was observed between the measured and predicted peak particle velocity from the developed ANN model. © 2012 Springer-Verlag London Limited.
Evaluation of effect of blast design parameters on flyrock using artificial neural networks
- Authors: Monjezi, Masoud , Mehrdanesh, Amirhossein , Malek, Alaeddin , Khandelwal, Manoj
- Date: 2013
- Type: Text , Journal article
- Relation: Neural Computing and Applications Vol. 23, no. 2 (2013), p. 349-356
- Full Text: false
- Reviewed:
- Description: Flyrock, rock fragments propelled beyond a specified limit, can be considered one of the most crucial and hazardous events in open pit blasting operations. The involvement of various effective parameters makes the problem complicated, and the available empirical methods are not proficient at predicting flyrock. To achieve more accurate results, new approaches such as artificial neural networks (ANNs) can be very helpful. In this paper, an attempt has been made to apply the ANN method to predict flyrock in the blasting operations of the Sungun copper mine, Iran. A number of ANN models were tried using various permutations and combinations, and it was observed that a model trained with the back-propagation algorithm and a 9-5-2-1 architecture is optimum. Flyrock was also computed from various available empirical models suggested by Lundborg. Statistical modeling was also carried out to compare the prediction capability of the ANN with the other methods. Comparison of the results showed the absolute superiority of ANN modeling over the empirical as well as statistical models. Sensitivity analysis was also performed to identify the most influential inputs on the output results. It was observed that powder factor, hole diameter, stemming and charge per delay are the most effective parameters on flyrock. © 2012 Springer-Verlag London Limited.
Exploiting spatial smoothness to recover undecoded coefficients for transform domain distributed video coding
- Authors: Ali, Mortuza , Murshed, Manzur
- Date: 2013
- Type: Text , Conference paper
- Relation: IEEE International Conference on Image Processing; Melbourne, Australia; 15th-18th September 2013, p. 1782-1786
- Relation: http://purl.org/au-research/grants/arc/DP1095487
- Full Text: false
- Reviewed:
- Description: In a transform domain distributed video coding scheme, the correlation between the current encoding unit, e.g. block and slice, and the corresponding side-information is modeled using a virtual channel. This correlation model is then used for rate allocation, quantization, and Wyner-Ziv coding. Since the encoder can only have an estimate of the correlation instead of the exact knowledge of the side-information, the decoder will fail to recover the quantized transformed coefficients with a nonzero probability. In this paper, we propose to integrate a scheme at the decoder to recover the undecoded coefficients using the spatial smoothness property of individual video frames. Simulation results demonstrated that, at different decoding failure probabilities, a transformed coefficient recovery scheme can significantly improve the quality of videos in terms of both PSNR and SSIM.
Extraction and processing of real time strain of embedded FBG sensors using a fixed filter FBG circuit and an artificial neural network
- Authors: Kahandawa, Gayan , Epaarachchi, Jayantha , Wang, Hao , Canning, John , Lau, Alan
- Date: 2013
- Type: Text , Journal article
- Relation: Measurement: Journal of the International Measurement Confederation Vol. 46, no. 10 (2013), p. 4045-4051
- Full Text:
- Reviewed:
- Description: Fibre Bragg Grating (FBG) sensors have been used in the development of structural health monitoring (SHM) and damage detection systems for advanced composite structures over several decades. Unfortunately, to date only a handful of appropriate configurations and algorithms suitable for use in SHM systems have been developed. This paper presents a novel configuration of FBG sensors to acquire strain readings, together with an integrated statistical approach to analyse the data in real time. The proposed configuration has proven its capability to overcome the practical constraints and engineering challenges associated with FBG-based SHM systems. A fixed filter decoding system and an integrated artificial neural network algorithm for extracting strain from embedded FBG sensors were proposed and experimentally validated. Furthermore, laboratory-level experimental data were used to verify the accuracy of the system, and the prediction error levels were found to be less than 0.3%. The developed SHM system using this technology has been submitted to the US patent office and will be available for use in aerospace applications in due course. © 2013 Elsevier Ltd. All rights reserved.
Inferring large scale genetic networks with S-System model
- Authors: Chowdhury, Ahsan , Chetty, Madhu , Nguyen, Vinh
- Date: 2013
- Type: Text , Conference paper
- Relation: Genetic and Evolutionary Computation Conference p. 271-278
- Full Text: false
- Reviewed:
- Description: Gene regulatory network (GRN) reconstruction from high-throughput microarray data is an important problem in systems biology. The S-System model, a differential-equation-based approach, is among the mainstream approaches for modeling GRNs.
Learning sparse kernel classifiers for multi-instance classification
- Authors: Fu, Zhouyu , Lu, Guojun , Ting, Kaiming , Zhang, Dengsheng
- Date: 2013
- Type: Text , Journal article
- Relation: IEEE Transactions on Neural Networks and Learning Systems Vol. 24, no. 9 (2013), p. 1377-1389
- Full Text: false
- Reviewed:
- Description: We propose a direct approach to learning sparse kernel classifiers for multi-instance (MI) classification to improve efficiency while maintaining predictive accuracy. The proposed method builds on a convex formulation for MI classification by considering the average score of individual instances for bag-level prediction. In contrast, existing formulations used the maximum score of individual instances in each bag, which leads to nonconvex optimization problems. Based on the convex MI framework, we formulate a sparse kernel learning algorithm by imposing additional constraints on the objective function to enforce the maximum number of expansions allowed in the prediction function. The formulated sparse learning problem for the MI classification is convex with respect to the classifier weights. Therefore, we can employ an effective optimization strategy to solve the optimization problem that involves the joint learning of both the classifier and the expansion vectors. In addition, the proposed formulation can explicitly control the complexity of the prediction model while still maintaining competitive predictive performance. Experimental results on benchmark data sets demonstrate that our proposed approach is effective in building very sparse kernel classifiers while achieving comparable performance to the state-of-the-art MI classifiers.
Local models - the key to boosting stable learners successfully
- Authors: Ting, Kaiming , Zhu, Lian , Wells, Jonathan
- Date: 2013
- Type: Text , Journal article
- Relation: Computational Intelligence Vol. 29, no. 2 (2013), p. 331-356
- Full Text: false
- Reviewed:
- Description: Boosting has been shown to improve the predictive performance of unstable learners such as decision trees, but not of stable learners like Support Vector Machines (SVM), k-nearest neighbours and Naive Bayes classifiers. In addition to the model stability problem, the high time complexity of some stable learners such as SVM prohibits them from generating multiple models to form an ensemble for large data sets. This paper introduces a simple method that not only enables Boosting to improve the predictive performance of stable learners, but also significantly reduces the computational time to generate an ensemble of stable learners such as SVM for large data sets that would otherwise be infeasible. The method proposes to build local models, instead of global models; and it is the first method, to the best of our knowledge, to solve the two problems in Boosting stable learners at the same time. We implement the method by using a decision tree to define local regions and build a local model for each local region. We show that this implementation of the proposed method enables successful Boosting of three types of stable learners: SVM, k-nearest neighbours and Naive Bayes classifiers.
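The local-model idea can be sketched with stand-ins: a depth-1 "tree" that splits on one feature's median defines two local regions, and a simple nearest-centroid learner (standing in for SVM, k-NN or Naive Bayes) is fit independently in each. All class and parameter names below are invented for illustration, not the paper's implementation.

```python
import numpy as np

class LocalModelClassifier:
    """Illustrative local-model classifier.

    fit() splits the training data at the median of one feature, then fits
    a nearest-centroid model per region; predict_one() routes a query to
    its region's model. In the paper, a decision tree defines the regions
    and a stable learner such as SVM is trained on each region's subset.
    """
    def __init__(self, split_feature=0):
        self.split_feature = split_feature

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.threshold = np.median(X[:, self.split_feature])
        self.centroids = {}
        for side in (False, True):
            mask = (X[:, self.split_feature] > self.threshold) == side
            self.centroids[side] = {c: X[mask & (y == c)].mean(axis=0)
                                    for c in np.unique(y[mask])}
        return self

    def predict_one(self, x):
        x = np.asarray(x, dtype=float)
        cents = self.centroids[bool(x[self.split_feature] > self.threshold)]
        return min(cents, key=lambda c: np.linalg.norm(x - cents[c]))
```

On XOR-like data a single global nearest-centroid model cannot separate the classes, while the two local models classify every training point correctly, which is the flavour of benefit local models bring to stable learners.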
Mass estimation
- Authors: Ting, Kaiming , Zhou, Guang , Liu, Fei , Tan, Swee
- Date: 2013
- Type: Text , Journal article
- Relation: Machine Learning Vol. 90, no. 1 (2013), p. 127-160
- Full Text: false
- Reviewed:
- Description: This paper introduces mass estimation—a base modelling mechanism that can be employed to solve various tasks in machine learning. We present the theoretical basis of mass and efficient methods to estimate mass. We show that mass estimation solves problems effectively in tasks such as information retrieval, regression and anomaly detection. The models, which use mass in these three tasks, perform at least as well as and often better than eight state-of-the-art methods in terms of task-specific performance measures. In addition, mass estimation has constant time and space complexities.