List of Titles

New gene selection algorithm using hypeboxes to improve performance of classifiers

Authors: Bagirov, Adil , Mardaneh, Karim
Date: 2020
Type: Text , Journal article
Relation: International Journal of Bioinformatics Research and Applications Vol. 16, no. 3 (2020), p. 269-289
Full Text: false
Reviewed:
Description: The use of DNA microarray technology allows to measure the expression levels of thousands of genes in one single experiment which makes possible to apply classification techniques to classify tumours. However, the large number of genes and relatively small number of tumours in gene expression datasets may (and in some cases significantly) diminish the accuracy of many classifiers. Therefore, efficient gene selection algorithms are required to identify most informative genes or groups of genes to improve the performance of classifiers. In this paper, a new gene selection algorithm is developed using marginal hyberboxes of genes or groups of genes for each tumour type. Informative genes are defined using overlaps between hyberboxes. The results on six gene expression datasets demonstrate that the proposed algorithm is able to considerably reduce the number of genes and significantly improve the performance of classifiers. © 2020 Inderscience Enterprises Ltd.

An incremental piecewise linear classifier based on polyhedral conic separation

Authors: Ozturk, Gurkan , Bagirov, Adil , Kasimbeyli, Refail
Date: 2015
Type: Text , Journal article
Relation: Machine Learning Vol. 101, no. 1-3 (2015), p. 397-413
Relation: http://purl.org/au-research/grants/arc/DP140103213
Full Text: false
Reviewed:
Description: In this paper, a piecewise linear classifier based on polyhedral conic separation is developed. This classifier builds nonlinear boundaries between classes using polyhedral conic functions. Since the number of polyhedral conic functions separating classes is not known a priori, an incremental approach is proposed to build separating functions. These functions are found by minimizing an error function which is nonsmooth and nonconvex. A special procedure is proposed to generate starting points to minimize the error function and this procedure is based on the incremental approach. The discrete gradient method, which is a derivative-free method for nonsmooth optimization, is applied to minimize the error function starting from those points. The proposed classifier is applied to solve classification problems on 12 publicly available data sets and compared with some mainstream and piecewise linear classifiers. © 2014, The Author(s).

Diagnostic with incomplete nominal/discrete data

Authors: Jelinek, Herbert , Yatsko, Andrew , Stranieri, Andrew , Venkatraman, Sitalakshmi , Bagirov, Adil
Date: 2015
Type: Text , Journal article
Relation: Artificial Intelligence Research Vol. 4, no. 1 (2015), p. 22-35
Full Text:
Reviewed:
Description: Missing values may be present in data without undermining its use for diagnostic / classification purposes but compromise application of readily available software. Surrogate entries can remedy the situation, although the outcome is generally unknown. Discretization of continuous attributes renders all data nominal and is helpful in dealing with missing values; particularly, no special handling is required for different attribute types. A number of classifiers exist or can be reformulated for this representation. Some classifiers can be reinvented as data completion methods. In this work the Decision Tree, Nearest Neighbour, and Naive Bayesian methods are demonstrated to have the required aptness. An approach is implemented whereby the entered missing values are not necessarily a close match of the true data; however, they intend to cause the least hindrance for classification. The proposed techniques find their application particularly in medical diagnostics. Where clinical data represents a number of related conditions, taking Cartesian product of class values of the underlying sub-problems allows narrowing down of the selection of missing value substitutes. Real-world data examples, some publically available, are enlisted for testing. The proposed and benchmark methods are compared by classifying the data before and after missing value imputation, indicating a significant improvement.

Classification through incremental max-min separability

Authors: Bagirov, Adil , Ugon, Julien , Webb, Dean , Karasozen, Bulent
Date: 2011
Type: Text , Journal article
Relation: Pattern Analysis and Applications Vol. 14, no. 2 (2011), p. 165-174
Relation: http://purl.org/au-research/grants/arc/DP0666061
Full Text: false
Reviewed:
Description: Piecewise linear functions can be used to approximate non-linear decision boundaries between pattern classes. Piecewise linear boundaries are known to provide efficient real-time classifiers. However, they require a long training time. Finding piecewise linear boundaries between sets is a difficult optimization problem. Most approaches use heuristics to avoid solving this problem, which may lead to suboptimal piecewise linear boundaries. In this paper, we propose an algorithm for globally training hyperplanes using an incremental approach. Such an approach allows one to find a near global minimizer of the classification error function and to compute as few hyperplanes as needed for separating sets. We apply this algorithm for solving supervised data classification problems and report the results of numerical experiments on real-world data sets. These results demonstrate that the new algorithm requires a reasonable training time and its test set accuracy is consistently good on most data sets compared with mainstream classifiers. © 2010 Springer-Verlag London Limited.

Data mining with combined use of optimization techniques and self-organizing maps for improving risk grouping rules : Application to prostate cancer patients

Authors: Churilov, Leonid , Bagirov, Adil , Schwartz, Daniel , Smith, Kate , Dally, Michael
Date: 2005
Type: Text , Journal article
Relation: Journal of Management Information Systems Vol. 21, no. 4 (2005), p. 85-100
Full Text:
Reviewed:
Description: Data mining techniques provide a popular and powerful tool set to generate various data-driven classification systems. In this paper, we investigate the combined use of self-organizing maps (SOM) and nonsmooth nonconvex optimization techniques in order to produce a working case of a data-driven risk classification system. The optimization approach strengthens the validity of SOM results, and the improved classification system increases both the quality of prediction and the homogeneity within the risk groups. Accurate classification of prostate cancer patients into risk groups is important to assist in the identification of appropriate treatment paths. We start with the existing rules and aim to improve classification accuracy by identifying inconsistencies utilizing self-organizing maps as a data visualization tool. Then, we progress to the study of assigning prostate cancer patients into homogenous groups with the aim to support future clinical treatment decisions. Using the case of prostate cancer patients grouping, we demonstrate strong potential of data-driven risk classification schemes for addressing the risk grouping issues in more general organizational settings. © 2005 M.E. Sharpe, Inc.
Description: C1
Description: 2003001265

Max-min separability

Authors: Bagirov, Adil
Date: 2005
Type: Text , Journal article
Relation: Optimization Methods and Software Vol. 20, no. 2-3 (2005), p. 271-290
Full Text:
Reviewed:
Description: We consider the problem of discriminating two finite point sets in the n-dimensional space by a finite number of hyperplanes generating a piecewise linear function. If the intersection of these sets is empty, then they can be strictly separated by a max-min of linear functions. An error function is introduced. This function is nonconvex piecewise linear. We discuss an algorithm for its minimization. The results of numerical experiments using some real-world datasets are presented, which show the effectiveness of the proposed approach.
Description: C1
Description: 2003001350

An algorithm for clustering based on non-smooth optimization techniques

Authors: Bagirov, Adil , Rubinov, Alex , Sukhorukova, Nadezda , Yearwood, John
Date: 2003
Type: Text , Journal article
Relation: International Transactions in Operational Research Vol. 10, no. 6 (2003), p. 611-617
Full Text: false
Reviewed:
Description: The problem of cluster analysis is formulated as a problem of non-smooth, non-convex optimization, and an algorithm for solving the cluster analysis problem based on non-smooth optimization techniques is developed. We discuss applications of this algorithm in large databases. Results of numerical experiments are presented to demonstrate the effectiveness of this algorithm.
Description: C1
Description: 2003000422

New algorithms for multi-class cancer diagnosis using tumor gene expression signatures

Authors: Bagirov, Adil , Ferguson, Brent , Ivkovic, Sasha , Saunders, Gary , Yearwood, John
Date: 2003
Type: Text , Journal article
Relation: Bioinformatics Vol. 19, no. 14 (2003), p. 1800-1807
Full Text:
Reviewed:
Description: Motivation: The increasing use of DNA microarray-based tumor gene expression profiles for cancer diagnosis requires mathematical methods with high accuracy for solving clustering, feature selection and classification problems of gene expression data. Results: New algorithms are developed for solving clustering, feature selection and classification problems of gene expression data. The clustering algorithm is based on optimization techniques and allows the calculation of clusters step-by-step. This approach allows us to find as many clusters as a data set contains with respect to some tolerance. Feature selection is crucial for a gene expression database. Our feature selection algorithm is based on calculating overlaps of different genes. The database used, contains over 16 000 genes and this number is considerably reduced by feature selection. We propose a classification algorithm where each tissue sample is considered as the center of a cluster which is a ball. The results of numerical experiments confirm that the classification algorithm in combination with the feature selection algorithm perform slightly better than the published results for multi-class classifiers based on support vector machines for this data set.
Description: C1
Description: 2003000439

Showing items 1 - 8 of 8

New gene selection algorithm using hypeboxes to improve performance of classifiers

An incremental piecewise linear classifier based on polyhedral conic separation

Diagnostic with incomplete nominal/discrete data

Classification through incremental max-min separability

Data mining with combined use of optimization techniques and self-organizing maps for improving risk grouping rules : Application to prostate cancer patients

Max-min separability

An algorithm for clustering based on non-smooth optimization techniques

New algorithms for multi-class cancer diagnosis using tumor gene expression signatures