Visual tools for analysing evolution, emergence, and error in data streams
- Authors: Hart, Sol , Yearwood, John , Bagirov, Adil
- Date: 2007
- Type: Text , Conference paper
- Relation: Paper presented at 6th IEEE/ACIS International Conference on Computer and Information Science, ICIS 2007, Melbourne, Victoria : 11th-13th July 2007 p. 987-992
- Full Text:
- Description: The relatively new field of stream mining has necessitated the development of robust drift-aware algorithms that provide accurate, real time, data handling capabilities. Tools are needed to assess and diagnose important trends and investigate drift evolution parameters. In this paper, we present two new and novel visualisation techniques, Pixie and Luna graphs, which incorporate salient group statistics coupled with intuitive visual representations of multidimensional groupings over time. Through the novel representations presented here, spatial interactions between temporal divisions can be diagnosed and overall distribution patterns identified. It provides a means of evaluating in non-constrained capacity, commonly constrained evolutionary problems.
- Description: 2003005432
New algorithms for multi-class cancer diagnosis using tumor gene expression signatures
- Authors: Bagirov, Adil , Ferguson, Brent , Ivkovic, Sasha , Saunders, Gary , Yearwood, John
- Date: 2003
- Type: Text , Journal article
- Relation: Bioinformatics Vol. 19, no. 14 (2003), p. 1800-1807
- Full Text:
- Reviewed:
- Description: Motivation: The increasing use of DNA microarray-based tumor gene expression profiles for cancer diagnosis requires mathematical methods with high accuracy for solving clustering, feature selection and classification problems of gene expression data. Results: New algorithms are developed for solving clustering, feature selection and classification problems of gene expression data. The clustering algorithm is based on optimization techniques and allows the calculation of clusters step-by-step. This approach allows us to find as many clusters as a data set contains with respect to some tolerance. Feature selection is crucial for a gene expression database. Our feature selection algorithm is based on calculating overlaps of different genes. The database used, contains over 16 000 genes and this number is considerably reduced by feature selection. We propose a classification algorithm where each tissue sample is considered as the center of a cluster which is a ball. The results of numerical experiments confirm that the classification algorithm in combination with the feature selection algorithm perform slightly better than the published results for multi-class classifiers based on support vector machines for this data set.
- Description: C1
- Description: 2003000439
Unsupervised and supervised data classification via nonsmooth and global optimisation
- Authors: Bagirov, Adil , Rubinov, Alex , Sukhorukova, Nadezda , Yearwood, John
- Date: 2003
- Type: Text , Journal article
- Relation: Top Vol. 11, no. 1 (2003), p. 1-92
- Full Text:
- Reviewed:
- Description: We examine various methods for data clustering and data classification that are based on the minimization of the so-called cluster function and its modications. These functions are nonsmooth and nonconvex. We use Discrete Gradient methods for their local minimization. We consider also a combination of this method with the cutting angle method for global minimization. We present and discuss results of numerical experiments.
- Description: C1
- Description: 2003000421
A global optimisation approach to classification in medical diagnosis and prognosis
- Authors: Bagirov, Adil , Rubinov, Alex , Yearwood, John , Stranieri, Andrew
- Date: 2001
- Type: Text , Conference paper
- Relation: Paper presented at 34th Hawaii International Conference on System Sciences, HICSS-34, Maui, Hawaii, USA : 3rd-6th January 2001
- Full Text:
- Description: In this paper global optimisation-based techniques are studied in order to increase the accuracy of medical diagnosis and prognosis with FNA image data from the Wisconsin Diagnostic and Prognostic Breast Cancer databases. First we discuss the problem of determining the most informative features for the classification of cancerous cases in the databases under consideration. Then we apply a technique based on convex and global optimisation to breast cancer diagnosis. It allows the classification of benign cases and malignant ones and the subsequent diagnosis of patients with very high accuracy. The third application of this technique is a method that calculates centres of clusters to predict when breast cancer is likely to recur in patients for which cancer has been removed. The technique achieves higher accuracy with these databases than reported elsewhere in the literature.
- Description: 2003003950