A feature selection approach for unsupervised classification based on clustering
- Authors: Rubinov, Alex , Soukhoroukova, Nadejda , Ugon, Julien
- Date: 2004
- Type: Text , Conference paper
- Relation: Paper presented at Sixth International Conference on Optimization: Techniques and Applications (ICOTA) , University of Ballarat, Ballarat, Victoria : 9th-11th December 2004
- Full Text: false
- Description: Data have been collected for many years in different scientific (industrial, medical) research groups. Very often these groups kept all the the they could collect. It is possible that the data contains a lot of noisy features which do not bring any information, but make the problem more complicated. The additional study of eliminating non-informative and selecting informative features is very important in the area of Data Mining. There are several feature selection methods which were developed for supervised classification. The area of feature selection for unsupervised classification is not so developed. In this paper we present a new feature selection approach for unsupervised classification, based on clustering and nonsmooth optimisation techniques.
- Description: 2003004085
A generalized subgradient method with piecewise linear subproblem
- Authors: Bagirov, Adil , Ganjehlou, Asef Nazari , Tor, Hakan , Ugon, Julien
- Date: 2010
- Type: Text , Journal article
- Relation: Dynamics of Continuous, Discrete and Impulsive Systems Series B: Applications and Algorithms Vol. 17, no. 5 (2010), p. 621-638
- Full Text: false
- Reviewed:
- Description: In this paper, a new version of the quasisecant method for nonsmooth nonconvex optimization is developed. Quasisecants are overestimates to the objective function in some neighborhood of a given point. Subgradients are used to obtain quasisecants. We describe classes of nonsmooth functions where quasisecants can be computed explicitly. We show that a descent direction with suffcient decrease must satisfy a set of linear inequalities. In the proposed algorithm this set of linear inequalities is solved by applying the subgradient algorithm to minimize a piecewise linear function. We compare results of numerical experiments between the proposed algorithm and subgradient method. Copyright © 2010 Watam Press.
A modified parallel optimization system for updating large-size time-evolving flow matrix
- Authors: Yu, Ting , Ugon, Julien , Yu, Wei
- Date: 2011
- Type: Text , Journal article
- Relation: Information Sciences Vol.194, no. (2011), p.57-67
- Full Text: false
- Reviewed:
- Description: Flow matrices are widely used in many disciplines, but few methods can estimate them. This paper presents a knowledge-based system as capable of estimating and updating large-size time-evolving flow matrix. The system in this paper consists of two major components with the purposes of matrix estimation and parallel optimization. The matrix estimation algorithm interprets and follows users' query scripts, retrieves data from various sources and integrates them for the matrix estimation. The parallel optimization component is built upon a supercomputing facility to utilize its computational power to efficiently process a large amount of data and estimate a large-size complex matrix. The experimental results demonstrate its outstanding performance and the acceptable accuracy by directly and indirectly comparing the estimation matrix with the actual matrix constructed by surveys. © 2011 Elsevier Inc. All rights reserved.
A new algorithm for the placement of WLAN access point based on nonsmooth optimization technique
- Authors: Kouhbor, Shahnaz , Ugon, Julien , Kruger, Alexander , Rubinov, Alex , Branch, Philip
- Date: 2005
- Type: Text , Conference paper
- Relation: Paper presented at the 7th International Conference on Advanced Communication Technology, Phoenix Park, Korea : 21st February, 2005
- Full Text:
- Reviewed:
- Description: In wireless local area network (WLAN), signal coverage is obtained by proper placement of access points (APs). The impact of incorrect placement of APs is significant. If they are placed too far apart, they generate a coverage gap but if they are too close to each other, this leads to excessive co-channel interferences. In this paper, we describe a mathematical model we have developed to find the optimal number and location of APs. To solve the problem, we use an optimization algorithm developed at the University of Ballarat called discrete gradient algorithm. Results indicate that our model is able to solve optimal coverage problems for different numbers of users.
- Description: E1
- Description: 2003001376
A new modified global k-means algorithm for clustering large data sets
- Authors: Bagirov, Adil , Ugon, Julien , Webb, Dean
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at XIIIth International Conference : Applied Stochastic Models and Data Analysis, ASMDA 2009, Vilnius, Lithuania : 30th June - 3rd July 2009 p. 1-5
- Full Text: false
- Description: The k-means algorithm and its variations are known to be fast clustering algorithms. However, they are sensitive to the choice of starting points and inefficient for solving clustering problems in large data sets. Recently, in order to resolve difficulties with the choice of starting points, incremental approaches have been developed. The modified global k-means algorithm is based on such an approach. It iteratively adds one cluster center at a time. Numerical experiments show that this algorithm considerably improve the k-means algorithm. However, this algorithm is not suitable for clustering very large data sets. In this paper, a new version of the modified global k-means algorithm is proposed. We introduce an auxiliary cluster function to generate a set of starting points spanning different parts of the data set. We exploit information gathered in previous iterations of the incremental algorithm to reduce its complexity.
- Description: 2003007558
A novel piecewise linear classifier based on polyhedral conic and max-min separabilities
- Authors: Bagirov, Adil , Ugon, Julien , Webb, Dean , Ozturk, Gurkan , Kasimbeyli, Refail
- Date: 2011
- Type: Text , Journal article
- Relation: TOP Vol.21, no.1 (2011), p. 1-22
- Full Text: false
- Reviewed:
- Description: In this paper, an algorithm for finding piecewise linear boundaries between pattern classes is developed. This algorithm consists of two main stages. In the first stage, a polyhedral conic set is used to identify data points which lie inside their classes, and in the second stage we exclude those points to compute a piecewise linear boundary using the remaining data points. Piecewise linear boundaries are computed incrementally starting with one hyperplane. Such an approach allows one to significantly reduce the computational effort in many large data sets. Results of numerical experiments are reported. These results demonstrate that the new algorithm consistently produces a good test set accuracy on most data sets comparing with a number of other mainstream classifiers. © 2011 Sociedad de EstadÃstica e Investigación Operativa.
Almost simplicial polytopes : the lower and upper bound theorems
- Authors: Nevo, Eran , Pineda-Villavicencio, Guillermo , Ugon, Julien , Yost, David
- Date: 2020
- Type: Text , Journal article
- Relation: Canadian Journal of Mathematics Vol. 72, no. 2 (2020), p. 537-556. http://purl.org/au-research/grants/arc/DP180100602
- Full Text:
- Reviewed:
- Description: We study -vertex -dimensional polytopes with at most one nonsimplex facet with, say, vertices, called almost simplicial polytopes. We provide tight lower and upper bound theorems for these polytopes as functions of, and, thus generalizing the classical Lower Bound Theorem by Barnette and the Upper Bound Theorem by McMullen, which treat the case where s = 0. We characterize the minimizers and provide examples of maximizers for any. Our construction of maximizers is a generalization of cyclic polytopes, based on a suitable variation of the moment curve, and is of independent interest. © 2018 Canadian Mathematical Society.
An Agile group aware process beyond CRISP-DM: A hospital data mining case study
- Authors: Sharma, Vishakha , Stranieri, Andrew , Ugon, Julien , Martin, Laura
- Date: 2017
- Type: Text , Conference proceedings
- Relation: ICCDA '17: Proceedings of the International Conference on Computer and Data Analysis May 2017 p. 109-113
- Full Text: false
- Reviewed:
- Description: The CRISP-DM methodology is commonly used in data analytics exercises within an organisation to provide system and structure to data mining processes. However, in providing a rigorous framework, CRISP-DM overlooks two facets of data analytics in organisational contexts; data mining exercises are far more agile and subject to change than presumed in CRISP-DM and central decisions regarding the interpretation of patterns discovered and the direction of analytics exercises are typically not made by individuals but by committees or groups within an organisation. The current study provides a case study of data mining in a hospital setting and suggests how the agile nature of an analytics exercise and the group reasoning inherent in key decisions can be accommodated within a CRISP-DM methodology.
An algorithm for clusterwise linear regression based on smoothing techniques
- Authors: Bagirov, Adil , Ugon, Julien , Mirzayeva, Hijran
- Date: 2014
- Type: Text , Journal article
- Relation: Optimization Letters Vol. 9, no. 2 (2014), p. 375-390
- Full Text: false
- Reviewed:
- Description: We propose an algorithm based on an incremental approach and smoothing techniques to solve clusterwise linear regression (CLR) problems. This algorithm incrementally divides the whole data set into groups which can be easily approximated by one linear regression function. A special procedure is introduced to generate an initial solution for solving global optimization problems at each iteration of the incremental algorithm. Such an approach allows one to find global or approximate global solutions to the CLR problems. The algorithm is tested using several data sets for regression analysis and compared with the multistart and incremental Spath algorithms.
An algorithm for minimizing clustering functions
- Authors: Bagirov, Adil , Ugon, Julien
- Date: 2005
- Type: Text , Journal article
- Relation: Optimization Vol. 54, no. 4-5 (Aug-Oct 2005), p. 351-368
- Full Text:
- Reviewed:
- Description: The problem of cluster analysis is formulated as a problem of nonsmooth, nonconvex optimization. An algorithm for solving the latter optimization problem is developed which allows one to significantly reduce the computational efforts. This algorithm is based on the so-called discrete gradient method. Results of numerical experiments are presented which demonstrate the effectiveness of the proposed algorithm.
- Description: C1
- Description: 2003001266
An efficient algorithm for the incremental construction of a piecewise linear classifier
- Authors: Bagirov, Adil , Ugon, Julien , Webb, Dean
- Date: 2011
- Type: Text , Journal article
- Relation: Information Systems Vol. 36, no. 4 (2011), p. 782-790
- Relation: http://purl.org/au-research/grants/arc/DP0666061
- Full Text: false
- Reviewed:
- Description: In this paper the problem of finding piecewise linear boundaries between sets is considered and is applied for solving supervised data classification problems. An algorithm for the computation of piecewise linear boundaries, consisting of two main steps, is proposed. In the first step sets are approximated by hyperboxes to find so-called "indeterminate" regions between sets. In the second step sets are separated inside these "indeterminate" regions by piecewise linear functions. These functions are computed incrementally starting with a linear function. Results of numerical experiments are reported. These results demonstrate that the new algorithm requires a reasonable training time and it produces consistently good test set accuracy on most data sets comparing with mainstream classifiers. © 2010 Elsevier B.V. All rights reserved.
An incremental approach for the construction of a piecewise linear classifier
- Authors: Bagirov, Adil , Ugon, Julien , Webb, Dean
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at XIIIth International Conference : Applied Stochastic Models and Data Analysis, ASMDA 2009, Vilnius, Lithuania : 30th June - 3rd July 2009 p. 507–511
- Relation: https://purl.org/au-research/grants/arc/DP0666061
- Full Text: false
- Description: In this paper the problem of finding piecewise linear boundaries between sets is considered and is applied for solving supervised data classification problems. An algorithm for the computation of piecewise linear boundaries, consisting of two main steps, is proposed. In the first step sets are approximated by hyperboxes to find so-called “indeterminate” regions between sets. In the second step sets are separated inside these “indeterminate” regions by piecewise linear functions. These functions are computed incrementally starting with a linear function. Results of numerical experiments are reported. These results demonstrate that the new algorithm requires a reasonable training time and it produces consistently good test set accuracy on most data sets comparing with mainstream classifiers.
- Description: 2003007559
Analysis and comparison of co-occurrence matrix and pixel n-gram features for mammographic images
- Authors: Kulkarni, Pradnya , Stranieri, Andrew , Kulkarni, Sid , Ugon, Julien , Mittal, Manish
- Date: 2015
- Type: Text , Conference paper
- Relation: International Conference on Communication and Computing p. 7-14
- Full Text: false
- Reviewed:
- Description: Mammography is a proven way of detecting breast cancer at an early stage. Various feature extraction techniques such as histograms, co-occurrence matrix, local binary patterns, Gabor filters, wavelet transforms are used for analysing mammograms. The novel pixel N-gram feature extraction technique has been inspired from the character N-gram concept of text retrieval. In this paper, we have compared the novel N-gram feature extraction technique with the co-occurrence matrix feature extraction technique. The experiments were conducted on the benchmark miniMIAS mammography database. Classification of mammograms into normal and abnormal category using N-gram features showed promising results with greater classification accuracy, sensitivity and specificity compared to classification using co-occurrence matrix features. Moreover, N-gram features computation are found to be considerably faster than co-occurrence matrix feature computation
Application of nonsmooth optimisation to data analysis
- Authors: Ugon, Julien
- Date: 2005
- Type: Text , Thesis , PhD
- Full Text:
- Description: The research presented in this thesis is two-fold: on the one hand, major data mining problems are reformulated as mathematical programming problems. These problems should be carefully designed, since from their formulation depends the efficiency, perhaps the existence, of the solvers. On the other hand, optimisation methods are adapted to solve these problems, most of which are nonsmooth and nonconvex. This part is delicate, as the solution is often required to be good and obtained fast. Numerical experiments on real-world datasets are presented and analysed.
- Description: Doctor of Philosophy
Automatic sleep stage identification: difficulties and possible solutions
- Authors: Sukhorukova, Nadezda , Stranieri, Andrew , Ofoghi, Bahadorreza , Vamplew, Peter , Saleem, Muhammad Saad , Ma, Liping , Ugon, Adrien , Ugon, Julien , Muecke, Nial , Amiel, Hélène , Philippe, Carole , Bani-Mustafa, Ahmed , Huda, Shamsul , Bertoli, Marcello , Levy, P , Ganascia, J.G
- Date: 2010
- Type: Text , Conference proceedings
- Full Text:
- Description: The diagnosis of many sleep disorders is a labour intensive task that involves the specialised interpretation of numerous signals including brain wave, breath and heart rate captured in overnight polysomnogram sessions. The automation of diagnoses is challenging for data mining algorithms because the data sets are extremely large and noisy, the signals are complex and specialist's analyses vary. This work reports on the adaptation of approaches from four fields; neural networks, mathematical optimisation, financial forecasting and frequency domain analysis to the problem of automatically determing a patient's stage of sleep. Results, though preliminary, are promising and indicate that combined approaches may prove more fruitful than the reliance on a approach.
Characterization theorem for best linear spline approximation with free knots
- Authors: Sukhorukova, Nadezda , Ugon, Julien
- Date: 2010
- Type: Text , Journal article
- Relation: Dynamics of Continuous, Discrete and Impulsive Systems Series B: Applications and Algorithms Vol. 17, no. 5 (2010), p. 687-708
- Full Text: false
- Reviewed:
- Description: A necessary condition for a best Chebyshev approximation by piecewise linear functions is derived using quasidifferential calculus. We first discover some properties of the knots joining the linear functions. Then we use these properties to obtain the optimality condition. This condition is stronger than existing results. We present an example of linear spline approximation where the existing optimality conditions are satisfied, but not the proposed one, which shows that it is not optimal. Copyright © 2010 Watam Press.
Characterization theorem for best polynomial spline approximation with free knots, variable degree and fixed tails
- Authors: Crouzeix, Jean-Pierre , Sukhorukova, Nadezda , Ugon, Julien
- Date: 2017
- Type: Text , Journal article
- Relation: Journal of Optimization Theory and Applications Vol. 172, no. 3 (2017), p. 950-964
- Full Text:
- Reviewed:
- Description: In this paper, we derive a necessary condition for a best approximation by piecewise polynomial functions of varying degree from one interval to another. Based on these results, we obtain a characterization theorem for the polynomial splines with fixed tails, that is the value of the spline is fixed in one or more knots (external or internal). We apply nonsmooth nonconvex analysis to obtain this result, which is also a necessary and sufficient condition for inf-stationarity in the sense of Demyanov-Rubinov. This paper is an extension of a paper where similar conditions were obtained for free tails splines. The main results of this paper are essential for the development of a Remez-type algorithm for free knot spline approximation.
Chebyshev approximation by linear combinations of fixed knot polynomial splines with weighting functions
- Authors: Sukhorukova, Nadezda , Ugon, Julien
- Date: 2016
- Type: Text , Journal article
- Relation: Journal of Optimization Theory and Applications Vol. 171, no. 2 (2016), p. 536-549
- Full Text: false
- Reviewed:
- Description: In this paper, we derive conditions for best uniform approximation by fixed knots polynomial splines with weighting functions. The theory of Chebyshev approximation for fixed knots polynomial functions is very elegant and complete. Necessary and sufficient optimality conditions have been developed leading to efficient algorithms for constructing optimal spline approximations. The optimality conditions are based on the notion of alternance (maximal deviation points with alternating deviation signs). In this paper, we extend these results to the case when the model function is a product of fixed knots polynomial splines (whose parameters are subject to optimization) and other functions (whose parameters are predefined). This problem is nonsmooth, and therefore, we make use of convex and nonsmooth analysis to solve it.
Chebyshev multivariate polynomial approximation : alternance interpretation
- Authors: Sukhorukova, Nadezda , Ugon, Julien , Yost, David
- Date: 2018
- Type: Text , Book chapter
- Relation: 2016 Matrix Annals p. 177-182
- Full Text:
- Reviewed:
- Description: In this paper, we derive optimality conditions for Chebyshev approximation of multivariate functions. The theory of Chebyshev (uniform) approximation for univariate functions was developed in the late nineteenth and twentieth century. The optimality conditions are based on the notion of alternance (maximal deviation points with alternating deviation signs). It is not clear, however, how to extend the notion of alternance to the case of multivariate functions. There have been several attempts to extend the theory of Chebyshev approximation to the case of multivariate functions. We propose an alternative approach, which is based on the notion of convexity and nonsmooth analysis.
Chebyshev multivariate polynomial approximation and point reduction procedure
- Authors: Sukhorukova, Nadezda , Ugon, Julien , Yost, David
- Date: 2021
- Type: Text , Journal article
- Relation: Constructive Approximation Vol. 53, no. 3 (2021), p. 529-544
- Relation: http://purl.org/au-research/grants/arc/DP180100602
- Full Text:
- Reviewed:
- Description: We apply the methods of nonsmooth and convex analysis to extend the study of Chebyshev (uniform) approximation for univariate polynomial functions to the case of general multivariate functions (not just polynomials). First of all, we give new necessary and sufficient optimality conditions for multivariate approximation, and a geometrical interpretation of them which reduces to the classical alternating sequence condition in the univariate case. Then, we present a procedure for verification of necessary and sufficient optimality conditions that is based on our generalization of the notion of alternating sequence to the case of multivariate polynomials. Finally, we develop an algorithm for fast verification of necessary optimality conditions in the multivariate polynomial case. © 2019, Springer Science+Business Media, LLC, part of Springer Nature.