A new loss function for robust classification
- Authors: Zhao, Lei , Mammadov, Musa , Yearwood, John
- Date: 2014
- Type: Text , Journal article
- Relation: Intelligent Data Analysis Vol. 18, no. 4 (2014), p. 697-715
- Full Text: false
- Reviewed:
- Description: Loss function plays an important role in data classification. Manyloss functions have been proposed and applied to differentclassification problems. This paper proposes a new so called thesmoothed 0-1 loss function, that could be considered as anapproximation of the classical 0-1 loss function. Due to thenon-convexity property of the proposed loss function, globaloptimization methods are required to solve the correspondingoptimization problems. Together with the proposed loss function, wecompare the performance of several existing loss functions in theclassification of noisy data sets. In this comparison, differentoptimization problems are considered in regards to the convexity andsmoothness of different loss functions. The experimental resultsshow that the proposed smoothed 0-1 loss function works better ondata sets with noisy labels, noisy features, and outliers. © 2014 - IOS Press and the authors. All rights reserved.
A data mining application of the incidence semirings
- Authors: Abawajy, Jemal , Kelarev, Andrei , Yearwood, John , Turville, Christopher
- Date: 2013
- Type: Text , Journal article
- Relation: Houston Journal of Mathematics Vol. 39, no. 4 (2013), p. 1083-1093
- Relation: http://purl.org/au-research/grants/arc/LP0990908
- Full Text: false
- Reviewed:
- Description: This paper is devoted to a combinatorial problem for incidence semirings, which can be viewed as sets of polynomials over graphs, where the edges are the unknowns and the coefficients are taken from a semiring. The construction of incidence rings is very well known and has many useful applications. The present article is devoted to a novel application of the more general incidence semirings. Recent research on data mining has motivated the investigation of the sets of centroids that have largest weights in semiring constructions. These sets are valuable for the design of centroid-based classification systems, or classifiers, as well as for the design of multiple classifiers combining several individual classifiers. Our article gives a complete description of all sets of centroids with the largest weight in incidence semirings.
Attribute weighted Naive Bayes classifier using a local optimization
- Authors: Taheri, Sona , Yearwood, John , Mammadov, Musa , Seifollahi, Sattar
- Date: 2013
- Type: Text , Journal article
- Relation: Neural Computing & Applications Vol.24, no.5 (2013), p.995-1002
- Full Text:
- Reviewed:
- Description: The Naive Bayes classifier is a popular classification technique for data mining and machine learning. It has been shown to be very effective on a variety of data classification problems. However, the strong assumption that all attributes are conditionally independent given the class is often violated in real-world applications. Numerous methods have been proposed in order to improve the performance of the Naive Bayes classifier by alleviating the attribute independence assumption. However, violation of the independence assumption can increase the expected error. Another alternative is assigning the weights for attributes. In this paper, we propose a novel attribute weighted Naive Bayes classifier by considering weights to the conditional probabilities. An objective function is modeled and taken into account, which is based on the structure of the Naive Bayes classifier and the attribute weights. The optimal weights are determined by a local optimization method using the quasisecant method. In the proposed approach, the Naive Bayes classifier is taken as a starting point. We report the results of numerical experiments on several real-world data sets in binary classification, which show the efficiency of the proposed method.
Application of rank correlation, clustering and classification in information security
- Authors: Beliakov, Gleb , Yearwood, John , Kelarev, Andrei
- Date: 2012
- Type: Text , Journal article
- Relation: Journal of Networks Vol. 7, no. 6 (2012), p. 935-945
- Full Text:
- Reviewed:
- Description: This article is devoted to experimental investigation of a novel application of a clustering technique introduced by the authors recently in order to use robust and stable consensus functions in information security, where it is often necessary to process large data sets and monitor outcomes in real time, as it is required, for example, for intrusion detection. Here we concentrate on a particular case of application to profiling of phishing websites. First, we apply several independent clustering algorithms to a randomized sample of data to obtain independent initial clusterings. Silhouette index is used to determine the number of clusters. Second, rank correlation is used to select a subset of features for dimensionality reduction. We investigate the effectiveness of the Pearson Linear Correlation Coefficient, the Spearman Rank Correlation Coefficient and the Goodman-Kruskal Correlation Coefficient in this application. Third, we use a consensus function to combine independent initial clusterings into one consensus clustering. Fourth, we train fast supervised classification algorithms on the resulting consensus clustering in order to enable them to process the whole large data set as well as new data. The precision and recall of classifiers at the final stage of this scheme are critical for effectiveness of the whole procedure. We investigated various combinations of several correlation coefficients, consensus functions, and a variety of supervised classification algorithms. © 2012 Academy Publisher.
- Description: 2003010277
Rule-based classifiers and meta classifiers for identification of cardiac autonomic neuropathy progression
- Authors: Jelinek, Herbert , Kelarev, Andrei , Stranieri, Andrew , Yearwood, John
- Date: 2012
- Type: Text , Journal article
- Relation: International Journal of Information Science and Computer Mathematics Vol. 5, no. 2 (2012), p. 49-53
- Full Text:
- Reviewed:
- Description: We investigate and compare several rule-based classifiers and meta classifiers in their ability to obtain multi-class classifications of cardiac autonomic neuropathy (CAN) and its progression. The best results obtained in our experiments are significantly better than the outcomes published previously in the literature for analogous CAN identification tasks or simpler binary classification tasks.
A novel hybrid neural learning algorithm using simulated annealing and quasisecant method
- Authors: Yearwood, John , Bagirov, Adil , Seifollahi, Sattar
- Date: 2011
- Type: Text , Conference proceedings
- Full Text: false
- Description: In this paper, we propose a hybrid learning algorithm for the single hidden layer feedforward neural networks (SLFNs) for data classification. The proposed hybrid algorithm is a two-phase learning algorithm and is based on the quasisecant and the simulated annealing methods. First, the weights between the hidden layer and the output layer nodes (output layer weights) are adjusted by the quasisecant algorithm. Then the simulated annealing is applied for global attribute weighting. The weights between the input layer and the hidden layer nodes are fixed in advance and are not included in the learning process. The proposed two-phase learning of the network is a novel idea and is different from that of the existing ones. The numerical results on some benchmark data sets are also reported and these results are promising. © 2011, Australian Computer Society, Inc.
- Description: 2003009507
An application of novel clustering technique for information security
- Authors: Beliakov, Gleb , Yearwood, John , Kelarev, Andrei
- Date: 2011
- Type: Text , Conference paper
- Relation: Applications and Techniques in Information Security Workshop p. 5-11
- Full Text: false
- Reviewed:
- Description: This article presents experimental results devoted to a new application of the novel clustering technique introduced by the authors recently. Our aim is to facilitate the application of robust and stable consensus functions in information security, where it is often necessary to process large data sets and monitor outcomes in real time, as it is required, for example, for intrusion detection. Here we concentrate on the particular case of application to profiling of phishing websites. First, we apply several independent clustering algorithms to a randomized sample of data to obtain independent initial clusterings. Silhouette index is used to determine the number of clusters. Second, we use a consensus function to combine these independent clusterings into one consensus clustering . Feature ranking is used to select a subset of features for the consensus function. Third, we train fast supervised classification algorithms on the resulting consensus clustering in order to enable them to process the whole large data set as well as new data. The precision and recall of classifiers at the final stage of this scheme are critical for effectiveness of the whole procedure. We investigated various combinations of three consensus functions, Cluster-Based Graph Formulation (CBGF), Hybrid Bipartite Graph Formulation (HBGF), and Instance-Based Graph Formulation (IBGF) and a variety of supervised classification algorithms. The best precision and recall have been obtained by the combination of the HBGF consensus function and the SMO classifier with the polynomial kernel.
- Description: 2003009195
From convex to nonconvex: A loss function analysis for binary classification
- Authors: Zhao, Lei , Mammadov, Musa , Yearwood, John
- Date: 2010
- Type: Text , Conference paper
- Relation: Paper presented at10th IEEE International Conference on Data Mining Workshops, ICDMW 2010 p. 1281-1288
- Full Text:
- Reviewed:
- Description: Problems of data classification can be studied in the framework of regularization theory as ill-posed problems. In this framework, loss functions play an important role in the application of regularization theory to classification. In this paper, we review some important convex loss functions, including hinge loss, square loss, modified square loss, exponential loss, logistic regression loss, as well as some non-convex loss functions, such as sigmoid loss, ø-loss, ramp loss, normalized sigmoid loss, and the loss function of 2 layer neural network. Based on the analysis of these loss functions, we propose a new differentiable non-convex loss function, called smoothed 0-1 loss function, which is a natural approximation of the 0-1 loss function. To compare the performance of different loss functions, we propose two binary classification algorithms for binary classification, one for convex loss functions, the other for non-convex loss functions. A set of experiments are launched on several binary data sets from the UCI repository. The results show that the proposed smoothed 0-1 loss function is robust, especially for those noisy data sets with many outliers. © 2010 IEEE.
A classification algorithm that derives weighted sum scores for insight into disease
- Authors: Quinn, Anthony , Stranieri, Andrew , Yearwood, John , Hafen, Gaudenz
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at Third Australasian Workshop on Health Informatics and Knowledge Management (HIKM 2009), Wellington, New Zealand : Vol. 97, p. 13-17
- Full Text:
- Description: Data mining is often performed with datasets associated with diseases in order to increase insights that can ultimately lead to improved prevention or treatment. Classification algorithms can achieve high levels of predictive accuracy but have limited application for facilitating the insight that leads to deeper understanding of aspects of the disease. This is because the representation of knowledge that arises from classification algorithms is too opaque, too complex or too sparse to facilitate insight. Clustering, association and visualisation approaches enable greater scope for clinicians to be engaged in a way that leads to insight, however predictive accuracy is compromised or non-existent. This research investigates the practical applications of Automated Weighted Sum, (AWSum), a classification algorithm that provides accuracy comparable to other techniques whilst providing some insight into the data. This is achieved by calculating a weight for each feature value that represents its influence on the class value. Clinicians are very familiar with weighted sum scoring scales so the internal representation is intuitive and easily understood. This paper presents results from the use of the AWSum approach with data from patients suffering from Cystic Fibrosis.
A formula for multiple classifiers in data mining based on Brandt semigroups
- Authors: Kelarev, Andrei , Yearwood, John , Mammadov, Musa
- Date: 2009
- Type: Text , Journal article
- Relation: Semigroup Forum Vol. 78, no. 2 (2009), p. 293-309
- Full Text:
- Reviewed:
- Description: A general approach to designing multiple classifiers represents them as a combination of several binary classifiers in order to enable correction of classification errors and increase reliability. This method is explained, for example, in Witten and Frank (Data Mining: Practical Machine Learning Tools and Techniques, 2005, Sect. 7.5). The aim of this paper is to investigate representations of this sort based on Brandt semigroups. We give a formula for the maximum number of errors of binary classifiers, which can be corrected by a multiple classifier of this type. Examples show that our formula does not carry over to larger classes of semigroups. © 2008 Springer Science+Business Media, LLC.
A polynomial ring construction for the classification of data
- Authors: Kelarev, Andrei , Yearwood, John , Vamplew, Peter
- Date: 2009
- Type: Text , Journal article
- Relation: Bulletin of the Australian Mathematical Society Vol. 79, no. 2 (2009), p. 213-225
- Full Text:
- Reviewed:
- Description: Drensky and Lakatos (Lecture Notes in Computer Science, 357 (Springer, Berlin, 1989), pp. 181-188) have established a convenient property of certain ideals in polynomial quotient rings, which can now be used to determine error-correcting capabilities of combined multiple classifiers following a standard approach explained in the well-known monograph by Witten and Frank (Data Mining: Practical Machine Learning Tools and Techniques (Elsevier, Amsterdam, 2005)). We strengthen and generalise the result of Drensky and Lakatos by demonstrating that the corresponding nice property remains valid in a much larger variety of constructions and applies to more general types of ideals. Examples show that our theorems do not extend to larger classes of ring constructions and cannot be simplified or generalised.
Cayley graphs as classifiers for data mining : The influence of asymmetries
- Authors: Kelarev, Andrei , Ryan, Joe , Yearwood, John
- Date: 2009
- Type: Text , Journal article
- Relation: Discrete Mathematics Vol. 309, no. 17 (2009), p. 5360-5369
- Relation: http://purl.org/au-research/grants/arc/DP0211866
- Full Text:
- Reviewed:
- Description: The endomorphism monoids of graphs have been actively investigated. They are convenient tools expressing asymmetries of the graphs. One of the most important classes of graphs considered in this framework is that of Cayley graphs. Our paper proposes a new method of using Cayley graphs for classification of data. We give a survey of recent results devoted to the Cayley graphs also involving their endomorphism monoids. © 2008 Elsevier B.V. All rights reserved.
Experimental investigation of three machine learning algorithms for ITS dataset
- Authors: Yearwood, John , Kang, Byeongho , Kelarev, Andrei
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at First International Conference, FGIT 2009, Future Generation Information Technology, Jeju Island, Korea : 10th-12th December 2009 Vol. 5899, p. 308-316
- Full Text:
- Description: The present article is devoted to experimental investigation of the performance of three machine learning algorithms for ITS dataset in their ability to achieve agreement with classes published in the biologi cal literature before. The ITS dataset consists of nuclear ribosomal DNA sequences, where rather sophisticated alignment scores have to be used as a measure of distance. These scores do not form a Minkowski metric and the sequences cannot be regarded as points in a finite dimensional space. This is why it is necessary to develop novel machine learning ap proaches to the analysis of datasets of this sort. This paper introduces a k-committees classifier and compares it with the discrete k-means and Nearest Neighbour classifiers. It turns out that all three machine learning algorithms are efficient and can be used to automate future biologically significant classifications for datasets of this kind. A simplified version of a synthetic dataset, where the k-committees classifier outperforms k-means and Nearest Neighbour classifiers, is also presented.
- Description: 2003007844
AWSum -Combining classification with knowledge acquisition
- Authors: Quinn, Anthony , Stranieri, Andrew , Yearwood, John , Hafen, Gaudenz , Jelinek, Herbert
- Date: 2008
- Type: Text , Journal article
- Relation: International Journal of Software and Informatics Vol. 2, no. 2 (2008), p. 199-214
- Full Text: false
- Reviewed:
- Description: Many classifiers achieve high levels of accuracy but have limited applicability in real world situations because they do not lead to a greater understanding or insight into the way features influence the classification. In areas such as health informatics a classifier that clearly identifies the influences on classification can be used to direct research and formulate interventions. This research investigates the practical aplications of Automated Weighted Sum, (AWSum), a classifier that provides accuracy comparable to other techniques whist providing insight into the data. This is achieved by calculating a weight for each feature value that represents its influence on the class value. The merits of this approach in classification and insight are evaluated on a Cystic Fibrosis and diabetes datasets with positive results.
Experimental investigation of clasification algorithms for ITS dataset
- Authors: Yearwood, John , Kang, Byeongho , Kelarev, Andrei
- Date: 2008
- Type: Text , Conference paper
- Relation: PKAW-08, Pacific Rim Knowledge Acquisition Workshop 2008, as part of PRICAI 2008, Tenth Pacific Rim p. 262-272
- Full Text: false
- Reviewed:
- Description: This article is devoted to experimental investigation of classification algorithms for analysis of ITS dataset. We introduce and consider a novel k-committees alogorithm for classification and compare it with the discrete k- means and nearest neighbour algorithms. The ITS dataset consists of nuclear ribosomal DNA sequences, where rather sophisticated alignment scores have to be used as a measure of distance. These scores do not form Minkowski metric and the sequences cannot be regarded as points in a finite dimensional space. This is why it is necessary to develop novel algorithms and adjust familiar ones. We present the results of experiments comparing the efficiency of three classification methods in their ability to achieve agreement with classes published in the biological literature before. It turns out that our algorithms are efficient and can be used to obtain biologically significant classifications. A simplified version of a synthetic dataset, where the k-committees classifier out performs k-means and Nearest Neighbour classifiers, is also presented.
- Description: E1
Using links to aid web classification
- Authors: Xie, Wei , Mammadov, Musa , Yearwood, John
- Date: 2007
- Type: Text , Conference paper
- Relation: Paper presented at 6th IEEE/ACIS International Conference on Computer and Information Science, ICIS 2007, Melbourne, Victoria : 11th-13th July 2007 p. 981-986
- Full Text:
- Description: In this paper, we will present a new approach of using link information to improve the accuracy and efficiency of web classification. However, different from others, we only use the mappings between linked documents and their own class or classes. In this case, we only need to add a few features called linked-class features into the datasets. We apply SVM and BoosTexter for classification. We show that the classification accuracy can be improved based on mixtures of ordinary word features and out-linked-class features. We analyze and discuss the reason of this improvement.
- Description: 2003005438
A fuzzy derivative approach to classification of outcomes from the ADRAC database
- Authors: Mammadov, Musa , Saunders, Gary , Yearwood, John
- Date: 2004
- Type: Text , Journal article
- Relation: International Transactions in Operational Research Vol. 11, no. 2 (2004), p. 169-180
- Full Text: false
- Reviewed:
- Description: The Australian Adverse Drug Reaction Advisory Committee (ADRAC) database has been collected and maintained by the Therapeutic Goods Administration. In this paper we study a part of his database (Card2) which contains records having just reactions from the Cardiovascular group. Drug-reaction relationships are presented by a vector of degrees which shows the degree of association of a drug with each class of reactions. In this work we examine these relationships in the classification of reaction outcomes. A modified version of the fuzzy derivative method (FDM2) is used for classification.
- Description: C1
- Description: 2003000895
Multi label classification and drug-reaction associations using global optimization techniques
- Authors: Mammadov, Musa , Yearwood, John , Aliyea, Leyla
- Date: 2004
- Type: Text , Conference paper
- Relation: Paper presented at ICOTA6: 6th International Conference on Optimization - Techniques and Applications, Ballarat, Victoria : 9th December, 2004
- Full Text: false
- Reviewed:
- Description: E1
- Description: 2003000890
An algorithm for clustering based on non-smooth optimization techniques
- Authors: Bagirov, Adil , Rubinov, Alex , Sukhorukova, Nadezda , Yearwood, John
- Date: 2003
- Type: Text , Journal article
- Relation: International Transactions in Operational Research Vol. 10, no. 6 (2003), p. 611-617
- Full Text: false
- Reviewed:
- Description: The problem of cluster analysis is formulated as a problem of non-smooth, non-convex optimization, and an algorithm for solving the cluster analysis problem based on non-smooth optimization techniques is developed. We discuss applications of this algorithm in large databases. Results of numerical experiments are presented to demonstrate the effectiveness of this algorithm.
- Description: C1
- Description: 2003000422
New algorithms for multi-class cancer diagnosis using tumor gene expression signatures
- Authors: Bagirov, Adil , Ferguson, Brent , Ivkovic, Sasha , Saunders, Gary , Yearwood, John
- Date: 2003
- Type: Text , Journal article
- Relation: Bioinformatics Vol. 19, no. 14 (2003), p. 1800-1807
- Full Text:
- Reviewed:
- Description: Motivation: The increasing use of DNA microarray-based tumor gene expression profiles for cancer diagnosis requires mathematical methods with high accuracy for solving clustering, feature selection and classification problems of gene expression data. Results: New algorithms are developed for solving clustering, feature selection and classification problems of gene expression data. The clustering algorithm is based on optimization techniques and allows the calculation of clusters step-by-step. This approach allows us to find as many clusters as a data set contains with respect to some tolerance. Feature selection is crucial for a gene expression database. Our feature selection algorithm is based on calculating overlaps of different genes. The database used, contains over 16 000 genes and this number is considerably reduced by feature selection. We propose a classification algorithm where each tissue sample is considered as the center of a cluster which is a ball. The results of numerical experiments confirm that the classification algorithm in combination with the feature selection algorithm perform slightly better than the published results for multi-class classifiers based on support vector machines for this data set.
- Description: C1
- Description: 2003000439