Classes and clusters in data analysis
- Authors: Rubinov, Alex , Sukhorukova, Nadezda , Ugon, Julien
- Date: 2006
- Type: Text , Journal article
- Relation: European Journal of Operational Research Vol. 173, no. 3 (Sep 2006), p. 849-865
- Full Text:
- Reviewed:
- Description: We discuss the relation between classes and clusters in datasets with given classes. We examine the distribution of classes within obtained clusters, using different clustering methods which are based on different techniques. We also study the structure of the obtained clusters. One of the main conclusions obtained in this research is that the notion of purity cannot always be used to evaluate the accuracy of clustering techniques. (c) 2005 Elsevier B.V. All rights reserved.
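The purity measure mentioned in the description above can be illustrated with a minimal sketch. It assumes the common definition of purity (the fraction of points that fall in the majority class of their cluster); the abstract does not give the paper's exact definition, and the function name `cluster_purity` and the toy labels are hypothetical. The second call shows one well-known weakness of the measure, which may or may not be the one the authors have in mind: putting every point in its own cluster yields a perfect score.

```python
from collections import Counter


def cluster_purity(class_labels, cluster_labels):
    """Fraction of points that belong to the majority class of their cluster."""
    if len(class_labels) != len(cluster_labels):
        raise ValueError("class_labels and cluster_labels must have equal length")
    if not class_labels:
        raise ValueError("at least one point is required")

    # Count how often each class occurs inside each cluster.
    per_cluster = {}
    for cls, clu in zip(class_labels, cluster_labels):
        per_cluster.setdefault(clu, Counter())[cls] += 1

    # Each cluster contributes the size of its largest class.
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(class_labels)


# Toy data (hypothetical): six points, two known classes.
classes = ["a", "a", "a", "b", "b", "b"]
clusters = [0, 0, 1, 1, 1, 1]

print(cluster_purity(classes, clusters))        # two clusters, one mixed
print(cluster_purity(classes, list(range(6))))  # one cluster per point
```

Because singleton clusters are trivially pure, a high purity value alone says little about whether a clustering is useful.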
A heuristic algorithm for solving the minimum sum-of-squares clustering problems
- Authors: Ordin, Burak , Bagirov, Adil
- Date: 2015
- Type: Text , Journal article
- Relation: Journal of Global Optimization Vol. 61, no. 2 (2015), p. 341-361
- Relation: http://purl.org/au-research/grants/arc/DP140103213
- Full Text: false
- Reviewed:
- Description: Clustering is an important task in data mining. It can be formulated as a global optimization problem which is challenging for existing global optimization techniques even for medium-sized data sets. Various heuristics were developed to solve the clustering problem. The global k-means and modified global k-means are among the most efficient heuristics for solving the minimum sum-of-squares clustering problem. However, these algorithms are not always accurate in finding global or near global solutions to the clustering problem. In this paper, we introduce a new algorithm to improve the accuracy of the modified global k-means algorithm in finding global solutions. We use an auxiliary cluster problem to generate a set of initial points and apply the k-means algorithm starting from these points to find the global solution to the clustering problem. Numerical results on 16 real-world data sets clearly demonstrate the superiority of the proposed algorithm over the global and modified global k-means algorithms in finding global solutions to clustering problems.
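The general idea in the description above (run k-means from several candidate starting points and keep the result with the lowest sum-of-squares value) can be sketched as follows. This is not the authors' algorithm: the auxiliary cluster problem that generates the starting points is not described in the abstract, so the starting points here are simply supplied by hand. All function names and the toy data are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def sse(points, centers):
    """Sum-of-squares objective: each point pays the squared distance
    to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def kmeans(points, centers, max_iter=100):
    """Plain k-means (Lloyd's iterations) from the given starting centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its points;
        # a center that attracted no points stays where it is.
        new_centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def best_of_starts(points, starts):
    """Run k-means from every starting set and keep the lowest objective."""
    results = [kmeans(points, s) for s in starts]
    return min(results, key=lambda c: sse(points, c))


# Toy data (hypothetical): the corners of a 4 x 1 rectangle.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]
bad_start = [(2, 0), (2, 1)]   # k-means stalls in a poor local solution
good_start = [(0, 0), (4, 0)]  # leads to the better left/right split

print(sse(points, kmeans(points, bad_start)))
print(sse(points, best_of_starts(points, [bad_start, good_start])))
```

The two printed values differ because k-means only converges to a local solution, which is why the choice of initial points matters for this problem.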
Using meta-regression data mining to improve predictions of performance based on heart rate dynamics for Australian football
- Authors: Jelinek, Herbert , Kelarev, Andrei , Robinson, Dean , Stranieri, Andrew , Cornforth, David
- Date: 2014
- Type: Text , Journal article
- Relation: Applied Soft Computing Vol. 14, no. PART A (2014), p. 81-87
- Full Text: false
- Reviewed:
- Description: This work investigates the effectiveness of using computer-based machine learning regression algorithms and meta-regression methods to predict performance data for Australian football players based on parameters collected during daily physiological tests. Three experiments are described. The first uses all available data with a variety of regression techniques. The second uses a subset of features selected from the available data using the Random Forest method. The third uses meta-regression with the selected feature subset. Our experiments demonstrate that feature selection and meta-regression methods improve the accuracy of predictions for match performance of Australian football players based on daily data of medical tests, compared to regression methods alone. Meta-regression methods and feature selection were able to obtain performance prediction outcomes with significant correlation coefficients. The best results were obtained by the additive regression based on isotonic regression for a set of most influential features selected by Random Forest. This model was able to predict athlete performance data with a correlation coefficient of 0.86 (p < 0.05). © 2013 Published by Elsevier B.V. All rights reserved.
Classification through incremental max-min separability
- Authors: Bagirov, Adil , Ugon, Julien , Webb, Dean , Karasozen, Bulent
- Date: 2011
- Type: Text , Journal article
- Relation: Pattern Analysis and Applications Vol. 14, no. 2 (2011), p. 165-174
- Relation: http://purl.org/au-research/grants/arc/DP0666061
- Full Text: false
- Reviewed:
- Description: Piecewise linear functions can be used to approximate non-linear decision boundaries between pattern classes. Piecewise linear boundaries are known to provide efficient real-time classifiers. However, they require a long training time. Finding piecewise linear boundaries between sets is a difficult optimization problem. Most approaches use heuristics to avoid solving this problem, which may lead to suboptimal piecewise linear boundaries. In this paper, we propose an algorithm for globally training hyperplanes using an incremental approach. Such an approach allows one to find a near global minimizer of the classification error function and to compute as few hyperplanes as needed for separating sets. We apply this algorithm for solving supervised data classification problems and report the results of numerical experiments on real-world data sets. These results demonstrate that the new algorithm requires a reasonable training time and its test set accuracy is consistently good on most data sets compared with mainstream classifiers. © 2010 Springer-Verlag London Limited.
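A piecewise linear decision boundary of the kind described above can be written as the zero level set of a max-min of affine functions. The sketch below only evaluates such a function for hand-picked hyperplanes. It does not perform the incremental training the paper proposes, and the grouping of hyperplanes, the class names and the function names are assumptions made for illustration.

```python
def max_min(x, groups):
    """Evaluate max over groups of min over hyperplanes of (w . x - b).

    `groups` is a list of groups; each group is a list of (w, b) pairs.
    """
    return max(
        min(sum(w_k * x_k for w_k, x_k in zip(w, x)) - b for w, b in group)
        for group in groups
    )


def classify(x, groups):
    """Assign class "A" where the max-min function is positive, else "B"."""
    return "A" if max_min(x, groups) > 0 else "B"


# Hand-picked hyperplanes (hypothetical). Class "A" occupies two opposite
# corner regions: (x0 > 1 and x1 > 1) or (x0 < -1 and x1 < -1).
groups = [
    [((1, 0), 1), ((0, 1), 1)],     # min(x0 - 1, x1 - 1)
    [((-1, 0), 1), ((0, -1), 1)],   # min(-x0 - 1, -x1 - 1)
]

for point in [(2, 2), (-2, -2), (2, -2), (0, 0)]:
    print(point, classify(point, groups))
```

Each inner minimum carves out an intersection of half-spaces and the outer maximum takes their union, so the region assigned to "A" need not be convex. A single hyperplane could not separate the two corner regions from the rest of the plane.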
Internet security applications of Gröbner-Shirshov bases
- Authors: Kelarev, Andrei , Yearwood, John , Watters, Paul
- Date: 2010
- Type: Text , Journal article
- Relation: Asian-European Journal of Mathematics Vol. 3, no. 3 (2010), p. 435-442
- Relation: http://purl.org/au-research/grants/arc/DP0211866
- Full Text: false
- Reviewed: