A novel hybrid neural learning algorithm using simulated annealing and quasisecant method
- Authors: Yearwood, John , Bagirov, Adil , Seifollahi, Sattar
- Date: 2011
- Type: Text , Conference proceedings
- Full Text: false
- Description: In this paper, we propose a hybrid learning algorithm for the single hidden layer feedforward neural networks (SLFNs) for data classification. The proposed hybrid algorithm is a two-phase learning algorithm and is based on the quasisecant and the simulated annealing methods. First, the weights between the hidden layer and the output layer nodes (output layer weights) are adjusted by the quasisecant algorithm. Then the simulated annealing is applied for global attribute weighting. The weights between the input layer and the hidden layer nodes are fixed in advance and are not included in the learning process. The proposed two-phase learning of the network is a novel idea and is different from that of the existing ones. The numerical results on some benchmark data sets are also reported and these results are promising. © 2011, Australian Computer Society, Inc.
- Description: 2003009507
An algorithm for clustering based on non-smooth optimization techniques
- Authors: Bagirov, Adil , Rubinov, Alex , Sukhorukova, Nadezda , Yearwood, John
- Date: 2003
- Type: Text , Journal article
- Relation: International Transactions in Operational Research Vol. 10, no. 6 (2003), p. 611-617
- Full Text: false
- Reviewed:
- Description: The problem of cluster analysis is formulated as a problem of non-smooth, non-convex optimization, and an algorithm for solving the cluster analysis problem based on non-smooth optimization techniques is developed. We discuss applications of this algorithm in large databases. Results of numerical experiments are presented to demonstrate the effectiveness of this algorithm.
- Description: C1
- Description: 2003000422
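The abstract above does not spell out the objective, so the following is only a minimal sketch of one common way to pose clustering as a non-smooth, non-convex problem: minimise the average squared distance from each data point to its nearest cluster centre. The function name `cluster_objective` and the toy data are illustrative assumptions, not the authors' code.

```python
import numpy as np

def cluster_objective(centers, points):
    """Average squared distance from each point to its nearest center.

    The inner min over centers makes the function non-smooth, and it is
    non-convex in the centers. This is an assumed, textbook-style objective.
    """
    # Pairwise differences, shape (n_points, n_centers, n_features).
    diffs = points[:, None, :] - centers[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=2)
    # For each point keep only the distance to its closest center.
    return float(np.mean(np.min(sq_dists, axis=1)))

# Toy data: two well-separated pairs of points.
points = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]])
good_centers = np.array([[0.0, 1.0], [10.0, 1.0]])
bad_centers = np.array([[5.0, 1.0], [5.0, 1.0]])

print(cluster_objective(good_centers, points))
print(cluster_objective(bad_centers, points))
```

Any non-smooth optimisation method can then be applied to the centres; the sketch only evaluates the objective and does not implement a solver.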
An incremental approach for the construction of a piecewise linear classifier
- Authors: Bagirov, Adil , Ugon, Julien , Webb, Dean
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at XIIIth International Conference : Applied Stochastic Models and Data Analysis, ASMDA 2009, Vilnius, Lithuania : 30th June - 3rd July 2009 p. 507–511
- Relation: https://purl.org/au-research/grants/arc/DP0666061
- Full Text: false
- Description: In this paper the problem of finding piecewise linear boundaries between sets is considered and is applied to solving supervised data classification problems. An algorithm for the computation of piecewise linear boundaries, consisting of two main steps, is proposed. In the first step, sets are approximated by hyperboxes to find so-called “indeterminate” regions between sets. In the second step, sets are separated inside these “indeterminate” regions by piecewise linear functions. These functions are computed incrementally, starting with a linear function. Results of numerical experiments are reported. These results demonstrate that the new algorithm requires a reasonable training time and produces consistently good test set accuracy on most data sets compared with mainstream classifiers.
- Description: 2003007559
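The first step described above can be sketched as follows, assuming that each set is approximated by its axis-aligned bounding box and that the "indeterminate" region is the intersection of the two boxes. The paper's actual hyperbox construction may differ; the helper names `bounding_box`, `box_intersection` and `in_box` are hypothetical.

```python
import numpy as np

def bounding_box(points):
    """Axis-aligned hyperbox (lower corner, upper corner) of a point set."""
    return points.min(axis=0), points.max(axis=0)

def box_intersection(box_a, box_b):
    """Intersection of two hyperboxes, or None if they do not overlap."""
    lo = np.maximum(box_a[0], box_b[0])
    hi = np.minimum(box_a[1], box_b[1])
    if np.any(lo > hi):
        return None
    return lo, hi

def in_box(points, box):
    """Boolean mask of the points lying inside the hyperbox."""
    lo, hi = box
    return np.all((points >= lo) & (points <= hi), axis=1)

A = np.array([[0.0, 0.0], [2.0, 2.0], [3.0, 1.0]])
B = np.array([[2.5, 0.5], [5.0, 3.0], [4.0, 2.0]])

region = box_intersection(bounding_box(A), bounding_box(B))
if region is not None:
    # Only these points would need the piecewise linear separation step.
    print("indeterminate A:", A[in_box(A, region)])
    print("indeterminate B:", B[in_box(B, region)])
```

Points outside the overlap can be labelled by box membership alone, which is why restricting the second step to the overlap can reduce training effort.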
An incremental piecewise linear classifier based on polyhedral conic separation
- Authors: Ozturk, Gurkan , Bagirov, Adil , Kasimbeyli, Refail
- Date: 2015
- Type: Text , Journal article
- Relation: Machine Learning Vol. 101, no. 1-3 (2015), p. 397-413
- Relation: http://purl.org/au-research/grants/arc/DP140103213
- Full Text: false
- Reviewed:
- Description: In this paper, a piecewise linear classifier based on polyhedral conic separation is developed. This classifier builds nonlinear boundaries between classes using polyhedral conic functions. Since the number of polyhedral conic functions separating classes is not known a priori, an incremental approach is proposed to build separating functions. These functions are found by minimizing an error function which is nonsmooth and nonconvex. A special procedure is proposed to generate starting points to minimize the error function and this procedure is based on the incremental approach. The discrete gradient method, which is a derivative-free method for nonsmooth optimization, is applied to minimize the error function starting from those points. The proposed classifier is applied to solve classification problems on 12 publicly available data sets and compared with some mainstream and piecewise linear classifiers. © 2014, The Author(s).
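As an illustration of the building block named above, the sketch below evaluates a polyhedral conic function in the form commonly given in the literature, g(x) = w·(x − c) + ξ‖x − c‖₁ − γ, and labels a point by the minimum over several such functions. The abstract does not state this formula, and the sign convention (value ≤ 0 means "inside the class") and the names `pcf` and `classify` are assumptions.

```python
import numpy as np

def pcf(x, w, xi, gamma, center):
    """Polyhedral conic function g(x) = w.(x - c) + xi * ||x - c||_1 - gamma."""
    d = x - center
    return float(np.dot(w, d) + xi * np.sum(np.abs(d)) - gamma)

def classify(x, pcfs):
    """Assumed rule: x belongs to the class if at least one PCF is <= 0 there."""
    return min(pcf(x, *params) for params in pcfs) <= 0

# Two PCFs whose sublevel sets are L1 balls of radius 1 around two centers.
zero = np.zeros(2)
pcfs = [
    (zero, 1.0, 1.0, np.array([0.0, 0.0])),
    (zero, 1.0, 1.0, np.array([5.0, 5.0])),
]

print(classify(np.array([0.5, 0.2]), pcfs))
print(classify(np.array([2.5, 2.5]), pcfs))
```

Taking the minimum over several functions is what lets a union of simple polyhedral regions cover a non-convex class; the incremental approach in the abstract adds such functions one at a time.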
Classification through incremental max-min separability
- Authors: Bagirov, Adil , Ugon, Julien , Webb, Dean , Karasozen, Bulent
- Date: 2011
- Type: Text , Journal article
- Relation: Pattern Analysis and Applications Vol. 14, no. 2 (2011), p. 165-174
- Relation: http://purl.org/au-research/grants/arc/DP0666061
- Full Text: false
- Reviewed:
- Description: Piecewise linear functions can be used to approximate non-linear decision boundaries between pattern classes. Piecewise linear boundaries are known to provide efficient real-time classifiers. However, they require a long training time. Finding piecewise linear boundaries between sets is a difficult optimization problem. Most approaches use heuristics to avoid solving this problem, which may lead to suboptimal piecewise linear boundaries. In this paper, we propose an algorithm for globally training hyperplanes using an incremental approach. Such an approach allows one to find a near global minimizer of the classification error function and to compute as few hyperplanes as needed for separating sets. We apply this algorithm for solving supervised data classification problems and report the results of numerical experiments on real-world data sets. These results demonstrate that the new algorithm requires a reasonable training time and its test set accuracy is consistently good on most data sets compared with mainstream classifiers. © 2010 Springer-Verlag London Limited.
Data mining with combined use of optimization techniques and self-organizing maps for improving risk grouping rules : Application to prostate cancer patients
- Authors: Churilov, Leonid , Bagirov, Adil , Schwartz, Daniel , Smith, Kate , Dally, Michael
- Date: 2005
- Type: Text , Journal article
- Relation: Journal of Management Information Systems Vol. 21, no. 4 (2005), p. 85-100
- Full Text:
- Reviewed:
- Description: Data mining techniques provide a popular and powerful tool set to generate various data-driven classification systems. In this paper, we investigate the combined use of self-organizing maps (SOM) and nonsmooth nonconvex optimization techniques in order to produce a working case of a data-driven risk classification system. The optimization approach strengthens the validity of SOM results, and the improved classification system increases both the quality of prediction and the homogeneity within the risk groups. Accurate classification of prostate cancer patients into risk groups is important to assist in the identification of appropriate treatment paths. We start with the existing rules and aim to improve classification accuracy by identifying inconsistencies utilizing self-organizing maps as a data visualization tool. Then, we progress to the study of assigning prostate cancer patients into homogenous groups with the aim to support future clinical treatment decisions. Using the case of prostate cancer patients grouping, we demonstrate strong potential of data-driven risk classification schemes for addressing the risk grouping issues in more general organizational settings. © 2005 M.E. Sharpe, Inc.
- Description: C1
- Description: 2003001265
Diagnostic with incomplete nominal/discrete data
- Authors: Jelinek, Herbert , Yatsko, Andrew , Stranieri, Andrew , Venkatraman, Sitalakshmi , Bagirov, Adil
- Date: 2015
- Type: Text , Journal article
- Relation: Artificial Intelligence Research Vol. 4, no. 1 (2015), p. 22-35
- Full Text:
- Reviewed:
- Description: Missing values may be present in data without undermining its use for diagnostic / classification purposes, but they compromise the application of readily available software. Surrogate entries can remedy the situation, although the outcome is generally unknown. Discretization of continuous attributes renders all data nominal and is helpful in dealing with missing values; in particular, no special handling is required for different attribute types. A number of classifiers exist or can be reformulated for this representation. Some classifiers can be reinvented as data completion methods. In this work the Decision Tree, Nearest Neighbour, and Naive Bayesian methods are demonstrated to have the required aptness. An approach is implemented whereby the entered missing values are not necessarily a close match of the true data; however, they are intended to cause the least hindrance for classification. The proposed techniques find their application particularly in medical diagnostics. Where clinical data represents a number of related conditions, taking the Cartesian product of class values of the underlying sub-problems allows narrowing down of the selection of missing value substitutes. Real-world data examples, some publicly available, are enlisted for testing. The proposed and benchmark methods are compared by classifying the data before and after missing value imputation, indicating a significant improvement.
Feature selection using misclassification counts
- Authors: Bagirov, Adil , Yatsko, Andrew , Stranieri, Andrew
- Date: 2011
- Type: Conference proceedings , Unpublished work
- Relation: Proceedings of the 9th Australasian Data Mining Conference (AusDM 2011), 51-62. Conferences in Research and Practice in Information Technology (CRPIT), Vol. 121.
- Full Text:
- Description: Dimensionality reduction of the problem space, through detection and removal of variables that contribute little or nothing to classification, can relieve the computational load and the instance acquisition effort, considering that all the data attributes are accessed each time around. The approach to feature selection in this paper is based on the concept of coherent accumulation of data about class centers with respect to coordinates of informative features. Ranking is done on the degree to which different variables exhibit random characteristics. The results are verified using the Nearest Neighbor classifier. This also helps to address feature irrelevance and redundancy, which ranking does not immediately decide. Additionally, feature ranking methods from different independent sources are called in for direct comparison.
Max-min separability
- Authors: Bagirov, Adil
- Date: 2005
- Type: Text , Journal article
- Relation: Optimization Methods and Software Vol. 20, no. 2-3 (2005), p. 271-290
- Full Text:
- Reviewed:
- Description: We consider the problem of discriminating two finite point sets in the n-dimensional space by a finite number of hyperplanes generating a piecewise linear function. If the intersection of these sets is empty, then they can be strictly separated by a max-min of linear functions. An error function is introduced. This function is nonconvex piecewise linear. We discuss an algorithm for its minimization. The results of numerical experiments using some real-world datasets are presented, which show the effectiveness of the proposed approach.
- Description: C1
- Description: 2003001350
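To make the max-min idea concrete, the sketch below evaluates a max-min of linear functions, φ(x) = max over groups of (min over hyperplanes in the group of w_j·x − b_j), together with a hinge-style error that is zero when one set has φ ≤ −1 and the other has φ ≥ 1. The grouping of hyperplanes, the unit margin and the names `max_min` and `separation_error` are assumptions made for illustration; the paper's exact error function may differ.

```python
import numpy as np

def max_min(x, W, b, groups):
    """phi(x) = max_i min_{j in groups[i]} (W[j] . x - b[j])."""
    values = W @ x - b
    return float(max(min(values[j] for j in group) for group in groups))

def separation_error(A, B, W, b, groups):
    """Average hinge-style violation: wants phi <= -1 on A and phi >= 1 on B."""
    err_a = np.mean([max(0.0, max_min(a, W, b, groups) + 1.0) for a in A])
    err_b = np.mean([max(0.0, 1.0 - max_min(p, W, b, groups)) for p in B])
    return float(err_a + err_b)

# Set A lies on both sides of set B, so no single hyperplane separates them.
A = np.array([[-4.0, 0.0], [4.0, 0.0]])
B = np.array([[-1.0, 0.0], [0.0, 0.0], [1.0, 0.0]])

# One hyperplane x1 = 0.
W1, b1, g1 = np.array([[1.0, 0.0]]), np.array([0.0]), [[0]]
# Two hyperplanes x1 = -3 and x1 = 3 combined by a min.
W2 = np.array([[1.0, 0.0], [-1.0, 0.0]])
b2 = np.array([-3.0, -3.0])
g2 = [[0, 1]]

print(separation_error(A, B, W1, b1, g1))
print(separation_error(A, B, W2, b2, g2))
```

The error is piecewise linear in `W` and `b` and non-convex because of the nested max and min, which matches the properties stated in the abstract.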
Multi-source cyber-attacks detection using machine learning
- Authors: Taheri, Sona , Gondal, Iqbal , Bagirov, Adil , Harkness, Greg , Brown, Simon , Chi, Chihung
- Date: 2019
- Type: Text , Conference proceedings , Conference paper
- Relation: 2019 IEEE International Conference on Industrial Technology, ICIT 2019; Melbourne, Australia; 13th-15th February 2019 Vol. 2019-February, p. 1167-1172
- Full Text:
- Reviewed:
- Description: The Internet of Things (IoT) has significantly increased the number of devices connected to the Internet, ranging from sensors to multi-source data information. As the IoT continues to evolve with new technologies, the number of threats and attacks against IoT devices is on the increase. Analyzing and detecting these attacks, which originate from different sources, requires machine learning models. These models provide proactive solutions for detecting attacks and their sources. In this paper, we propose to apply a supervised machine learning classification technique to identify cyber-attacks from each source. More precisely, we apply the incremental piecewise linear classifier, which constructs the boundary between sources/classes incrementally, starting with one hyperplane and adding more hyperplanes at each iteration. The algorithm terminates when no further significant improvement of the separation of sources/classes is possible. The construction and usage of piecewise linear boundaries allows us to avoid any possible overfitting. We apply the incremental piecewise linear classifier to a multi-source real-world cyber security data set to identify cyber-attacks and their sources.
- Description: Proceedings of the IEEE International Conference on Industrial Technology
New algorithms for multi-class cancer diagnosis using tumor gene expression signatures
- Authors: Bagirov, Adil , Ferguson, Brent , Ivkovic, Sasha , Saunders, Gary , Yearwood, John
- Date: 2003
- Type: Text , Journal article
- Relation: Bioinformatics Vol. 19, no. 14 (2003), p. 1800-1807
- Full Text:
- Reviewed:
- Description: Motivation: The increasing use of DNA microarray-based tumor gene expression profiles for cancer diagnosis requires mathematical methods with high accuracy for solving clustering, feature selection and classification problems of gene expression data. Results: New algorithms are developed for solving clustering, feature selection and classification problems of gene expression data. The clustering algorithm is based on optimization techniques and allows the calculation of clusters step-by-step. This approach allows us to find as many clusters as a data set contains with respect to some tolerance. Feature selection is crucial for a gene expression database. Our feature selection algorithm is based on calculating overlaps of different genes. The database used contains over 16 000 genes, and this number is considerably reduced by feature selection. We propose a classification algorithm where each tissue sample is considered as the center of a cluster which is a ball. The results of numerical experiments confirm that the classification algorithm in combination with the feature selection algorithm performs slightly better than the published results for multi-class classifiers based on support vector machines for this data set.
- Description: C1
- Description: 2003000439
New gene selection algorithm using hyperboxes to improve performance of classifiers
- Authors: Bagirov, Adil , Mardaneh, Karim
- Date: 2020
- Type: Text , Journal article
- Relation: International Journal of Bioinformatics Research and Applications Vol. 16, no. 3 (2020), p. 269-289
- Full Text: false
- Reviewed:
- Description: The use of DNA microarray technology makes it possible to measure the expression levels of thousands of genes in a single experiment, which in turn makes it possible to apply classification techniques to classify tumours. However, the large number of genes and the relatively small number of tumours in gene expression datasets may diminish the accuracy of many classifiers, in some cases significantly. Therefore, efficient gene selection algorithms are required to identify the most informative genes or groups of genes to improve the performance of classifiers. In this paper, a new gene selection algorithm is developed using marginal hyperboxes of genes or groups of genes for each tumour type. Informative genes are defined using overlaps between hyperboxes. The results on six gene expression datasets demonstrate that the proposed algorithm is able to considerably reduce the number of genes and significantly improve the performance of classifiers. © 2020 Inderscience Enterprises Ltd.
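A rough sketch of overlap-based gene scoring, under assumptions not taken from the paper: each gene's "marginal hyperbox" for a tumour type is taken to be the interval between its minimum and maximum expression in that type, and a gene's score is the fraction of samples falling inside the interval shared by all types. A lower score suggests a more informative gene. The function name `overlap_scores` and the toy matrix are hypothetical.

```python
import numpy as np

def overlap_scores(X, y):
    """Per-gene fraction of samples lying in the range shared by all classes.

    X has shape (n_samples, n_genes); y holds one class label per sample.
    """
    classes = np.unique(y)
    # Per-class lower and upper bounds of every gene (the marginal intervals).
    lows = np.array([X[y == c].min(axis=0) for c in classes])
    highs = np.array([X[y == c].max(axis=0) for c in classes])
    # Intersection of the intervals; it is empty wherever lo > hi.
    lo = lows.max(axis=0)
    hi = highs.min(axis=0)
    inside = (X >= lo) & (X <= hi)
    return inside.mean(axis=0)

# Four samples, two genes: gene 0 separates the classes, gene 1 does not.
X = np.array([[1.0, 5.0], [2.0, 1.0], [8.0, 4.0], [9.0, 2.0]])
y = np.array([0, 0, 1, 1])

scores = overlap_scores(X, y)
ranking = np.argsort(scores)  # most informative genes first
print(scores, ranking)
```

Keeping only the lowest-scoring genes is one simple way to reduce dimensionality before training a classifier; scoring groups of genes would require multi-dimensional boxes rather than intervals.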
Nonsmooth optimisation approach to data classification
- Authors: Bagirov, Adil , Soukhoroukova, Nadejda
- Date: 2001
- Type: Text , Conference paper
- Relation: Paper presented at Post-graduate ADFA Conference for Computer Science, PACCS01, Canberra, Australian Capital Territory : 14th July 2001
- Full Text:
- Description: We reduce supervised classification to solving a nonsmooth optimization problem. The proposed method allows one to solve classification problems for databases with an arbitrary number of classes. Numerical experiments have been carried out with databases of small and medium size. We present their results and compare them with those obtained by other optimization-based classification algorithms. The results of the numerical experiments show the effectiveness of the proposed algorithms.
- Description: 2003003668