Incremental DC optimization algorithm for large-scale clusterwise linear regression
- Authors: Bagirov, Adil , Taheri, Sona , Cimen, Emre
- Date: 2021
- Type: Text , Journal article
- Relation: Journal of Computational and Applied Mathematics Vol. 389 (2021), p. 1-17
- Relation: https://purl.org/au-research/grants/arc/DP190100580
- Full Text: false
- Reviewed:
- Description: The objective function in the nonsmooth optimization model of the clusterwise linear regression (CLR) problem with the squared regression error is represented as a difference of two convex functions. Then, using the difference of convex algorithm (DCA) approach, the CLR problem is replaced by a sequence of smooth unconstrained optimization subproblems. A new algorithm based on the DCA and the incremental approach is designed to solve the CLR problem, and a quasi-Newton method is applied to solve the subproblems. The proposed algorithm is evaluated on several synthetic and real-world regression data sets and compared with other CLR algorithms. Results demonstrate that the DCA-based algorithm is efficient for solving CLR problems with a large number of data points and, in particular, outperforms other algorithms when the number of input variables is small. © 2020 Elsevier B.V.
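The DCA iteration behind this abstract can be illustrated on a toy difference-of-convex function. This is a minimal sketch of the generic DCA scheme, not the paper's CLR algorithm: minimize f(x) = g(x) - h(x) with g(x) = x² and h(x) = 2|x|, whose local minima are x = ±1; each DCA step linearizes h at the current point and minimizes the resulting convex model.

```python
# Toy DCA sketch: minimize f(x) = g(x) - h(x), with
#   g(x) = x**2 (convex, smooth) and h(x) = 2*abs(x) (convex).
# DCA step: pick s_k in the subdifferential of h at x_k,
# then set x_{k+1} = argmin_x g(x) - s_k * x  (here: x = s_k / 2).

def dca(x0, iters=20):
    x = x0
    for _ in range(iters):
        s = 2.0 if x > 0 else (-2.0 if x < 0 else 0.0)  # s in subdiff of h at x
        x = s / 2.0  # exact minimizer of the convex model x**2 - s*x
    return x

print(dca(0.3))   # converges to the local minimum x = 1
print(dca(-5.0))  # converges to the local minimum x = -1
```

In the paper's setting the convex subproblems are smooth and high-dimensional, which is why a quasi-Newton solver replaces the closed-form step used in this toy example.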
New diagonal bundle method for clustering problems in large data sets
- Authors: Karmitsa, Napsu , Bagirov, Adil , Taheri, Sona
- Date: 2017
- Type: Text , Journal article
- Relation: European Journal of Operational Research Vol. 263, no. 2 (2017), p. 367-379
- Relation: http://purl.org/au-research/grants/arc/DP140103213
- Full Text: false
- Reviewed:
- Description: Clustering is one of the most important tasks in data mining. Recent developments in computer hardware allow us to store in random access memory (RAM) and repeatedly read data sets with hundreds of thousands and even millions of data points. This makes it possible to apply conventional clustering algorithms to such data sets. However, these algorithms may require prohibitively long computational times and fail to produce accurate solutions. Therefore, it is important to develop clustering algorithms which are accurate and can provide real-time clustering in large data sets. This paper introduces one such algorithm. Using a nonsmooth optimization formulation of the clustering problem, the objective function is represented as a difference of two convex (DC) functions. Then a new diagonal bundle algorithm that explicitly uses this structure is designed and combined with an incremental approach to solve the problem. The method is evaluated on real-world data sets with both a large number of attributes and a large number of data points, and numerical results are used to compare it with two other clustering algorithms. © 2017 Elsevier B.V.
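The DC representation mentioned in this abstract can be checked numerically for the sum-of-min-distances cluster function. The sketch below uses the standard identity min_j d_j = Σ_j d_j − max_k Σ_{j≠k} d_j (both resulting terms are convex in the centers); it is an illustration of the general DC split, not necessarily the paper's exact decomposition:

```python
# Numerical check of a DC split of the clustering objective
#   f(X) = sum_p min_j ||a_p - x_j||^2
# using: min_j d_j = sum_j d_j - max_k sum_{j != k} d_j,
# so f = f1 - f2 with f1 and f2 both convex in the centers.

points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (6.0, 5.0)]
centers = [(0.5, 0.0), (5.5, 5.0)]

def sq_dist(a, x):
    return sum((ai - xi) ** 2 for ai, xi in zip(a, x))

# the clustering objective itself
f = sum(min(sq_dist(a, x) for x in centers) for a in points)

# f1: sum of distances to ALL centers (convex)
f1 = sum(sum(sq_dist(a, x) for x in centers) for a in points)

# f2: for each point, max over k of the sum excluding center k (convex)
f2 = sum(
    max(sum(sq_dist(a, x) for j, x in enumerate(centers) if j != k)
        for k in range(len(centers)))
    for a in points
)

print(f, f1 - f2)  # the two values coincide
```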
A heuristic algorithm for solving the minimum sum-of-squares clustering problems
- Authors: Ordin, Burak , Bagirov, Adil
- Date: 2015
- Type: Text , Journal article
- Relation: Journal of Global Optimization Vol. 61, no. 2 (2015), p. 341-361
- Relation: http://purl.org/au-research/grants/arc/DP140103213
- Full Text: false
- Reviewed:
- Description: Clustering is an important task in data mining. It can be formulated as a global optimization problem which is challenging for existing global optimization techniques even in medium-sized data sets. Various heuristics have been developed to solve the clustering problem. The global k-means and modified global k-means algorithms are among the most efficient heuristics for solving the minimum sum-of-squares clustering problem. However, these algorithms are not always accurate in finding global or near-global solutions to the clustering problem. In this paper, we introduce a new algorithm to improve the accuracy of the modified global k-means algorithm in finding global solutions. We use an auxiliary cluster problem to generate a set of initial points and apply the k-means algorithm starting from these points to find the global solution to the clustering problem. Numerical results on 16 real-world data sets clearly demonstrate the superiority of the proposed algorithm over the global and modified global k-means algorithms in finding global solutions to clustering problems.
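The incremental idea underlying this family of methods, solving the k-cluster problem by adding one center at a time and seeding each run from candidate points, can be sketched on a toy 1-D data set. This is a simplified global-k-means-style illustration (every data point tried as the candidate new center), not the paper's auxiliary-cluster-problem algorithm:

```python
# Simplified incremental seeding in the spirit of global k-means:
# grow the set of centers one at a time; at each stage, try every
# data point as the new center, run k-means, and keep the best result.

data = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8, 15.0, 15.3, 14.7]

def kmeans(data, centers, iters=50):
    centers = list(centers)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in data:
            j = min(range(len(centers)), key=lambda j: (p - centers[j]) ** 2)
            groups[j].append(p)
        # empty groups keep their old center
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

def cost(data, centers):
    # minimum sum-of-squares clustering objective
    return sum(min((p - c) ** 2 for c in centers) for p in data)

def incremental_kmeans(data, k):
    centers = [sum(data) / len(data)]  # the 1-cluster solution is the mean
    for _ in range(2, k + 1):
        candidates = (kmeans(data, centers + [p]) for p in data)
        centers = min(candidates, key=lambda cs: cost(data, cs))
    return sorted(centers)

print(incremental_kmeans(data, 3))  # close to [1.0, 8.0, 15.0]
```

Trying every data point as a seed is what makes the plain global k-means expensive; the paper's contribution is precisely a cheaper, more accurate way of generating those starting points.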
Piecewise linear classifiers based on nonsmooth optimization approaches
- Authors: Bagirov, Adil , Kasimbeyli, Refail , Ozturk, Gurkan , Ugon, Julien
- Date: 2014
- Type: Text , Book chapter
- Relation: Optimization in Science and Engineering p. 1-32
- Full Text: false
- Reviewed:
- Description: Nonsmooth optimization provides efficient algorithms for solving many machine learning problems. In particular, nonsmooth optimization approaches to supervised data classification problems lead to the design of very efficient algorithms for their solution. In this chapter, we demonstrate how nonsmooth optimization algorithms can be applied to design efficient piecewise linear classifiers for supervised data classification problems. Such classifiers are developed using max–min and polyhedral conic separabilities as well as an incremental approach. We report results of numerical experiments and compare the piecewise linear classifiers with a number of other mainstream classifiers.
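A max–min piecewise linear classifier of the kind named here has the decision function sign(max_i min_j (⟨w_ij, x⟩ + b_ij)). The sketch below is a hand-constructed toy (one group of four hyperplanes carving out a box); the hyperplanes are chosen by hand for illustration, not learned by the chapter's incremental nonsmooth optimization algorithm:

```python
# Max-min piecewise linear classifier: the decision function is
#   f(x) = max over groups of (min over the group's hyperplanes of w.x + b);
# predict class +1 when f(x) > 0, else -1.
# The single group below is positive exactly inside [-1, 1] x [-1, 1].

hyperplanes = [
    [((1.0, 0.0), 1.0), ((-1.0, 0.0), 1.0),
     ((0.0, 1.0), 1.0), ((0.0, -1.0), 1.0)],
]

def classify(x):
    score = max(
        min(sum(wi * xi for wi, xi in zip(w, x)) + b for w, b in group)
        for group in hyperplanes
    )
    return 1 if score > 0 else -1

print(classify((0.0, 0.0)))   # inside the box: 1
print(classify((3.0, 0.5)))   # outside the box: -1
```

With several groups, the positive region becomes a union of polyhedra, which is what lets these classifiers separate classes that are not linearly separable.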
Comparing different nonsmooth minimization methods and software
- Authors: Karmitsa, Napsu , Bagirov, Adil , Makela, Marko
- Date: 2012
- Type: Text , Journal article
- Relation: Optimization Methods and Software Vol. 27, no. 1 (2012), p. 131-153
- Relation: http://purl.org/au-research/grants/arc/DP0666061
- Full Text: false
- Reviewed:
- Description: Most nonsmooth optimization (NSO) methods can be divided into two main groups: subgradient methods and bundle methods. In this paper, we test and compare different methods from both groups as well as some methods which may be considered hybrids of these two and/or some others. All the solvers tested are so-called general black-box methods which, at least in theory, can be applied to solve almost all NSO problems. The test set includes a large number of unconstrained nonsmooth convex and nonconvex problems of different sizes; in particular, it includes piecewise linear and quadratic problems. The aim of this work is not to promote some methods over others but to gain some insight into which method to select for certain types of problems. © 2012 Taylor and Francis Group, LLC.
Fast modified global k-means algorithm for incremental cluster construction
- Authors: Bagirov, Adil , Ugon, Julien , Webb, Dean
- Date: 2011
- Type: Text , Journal article
- Relation: Pattern Recognition Vol. 44, no. 4 (2011), p. 866-876
- Relation: http://purl.org/au-research/grants/arc/DP0666061
- Full Text: false
- Reviewed:
- Description: The k-means algorithm and its variations are known to be fast clustering algorithms. However, they are sensitive to the choice of starting points and are inefficient for solving clustering problems in large datasets. Recently, incremental approaches have been developed to resolve difficulties with the choice of starting points. The global k-means and the modified global k-means algorithms are based on such an approach; they iteratively add one cluster center at a time. Numerical experiments show that these algorithms considerably improve on the k-means algorithm. However, they require storing the whole affinity matrix or computing this matrix at each iteration. This makes both algorithms time-consuming and memory-demanding for clustering even moderately large datasets. In this paper, a new version of the modified global k-means algorithm is proposed. We introduce an auxiliary cluster function to generate a set of starting points lying in different parts of the dataset. We exploit information gathered in previous iterations of the incremental algorithm to eliminate the need to compute or store the whole affinity matrix, thereby reducing computational effort and memory usage. Results of numerical experiments on six standard datasets demonstrate that the new algorithm is more efficient than the global and the modified global k-means algorithms. © 2010 Elsevier Ltd. All rights reserved.
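The bookkeeping that avoids the affinity matrix amounts to keeping, for each data point, its squared distance to the closest center found so far and refreshing that value with a single comparison per point whenever a center is added. A small sketch under that assumption (this is the generic caching trick, not the paper's full algorithm):

```python
# When the incremental algorithm adds a new center, the distance of each
# point to its closest center need not be recomputed against all centers:
# keep a running cache d[i] = min over current centers and update it with
# one comparison per point, so no affinity matrix is ever stored.

points = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0), (9.0, 9.0)]

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

centers = [(0.0, 0.0)]
d = [sq_dist(p, centers[0]) for p in points]  # distance cache

def add_center(c):
    centers.append(c)
    for i, p in enumerate(points):
        d[i] = min(d[i], sq_dist(p, c))  # one pass over the data

add_center((5.0, 5.0))
print(d)  # [0.0, 2.0, 2.0, 32.0]
```

The cache costs O(m) memory for m points instead of the O(m²) affinity matrix, which is what makes the incremental algorithm practical on moderately large datasets.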
Integrated production system optimization using global optimization techniques
- Authors: Mason, T. L. , Emelle, C. , Van Berkel, J. , Bagirov, Adil , Kampas, F. , Pinter, J. D.
- Date: 2007
- Type: Text , Journal article
- Relation: Journal of Industrial and Management Optimization Vol. 3, no. 2 (May 2007), p. 257-277
- Full Text:
- Reviewed:
- Description: Many optimization problems related to integrated oil and gas production systems are nonconvex and multimodal. Additionally, apart from the innate nonsmoothness of many optimization problems, nonsmooth functions such as minimum and maximum functions may be used to model flow/pressure controllers and cascade mass in the gas gathering and blending networks. In this paper we study the application of different versions of the derivative-free Discrete Gradient Method (DGM) as well as the Lipschitz Global Optimizer (LGO) suite to production optimization in integrated oil and gas production systems, and compare them with various local and global solvers used with the General Algebraic Modeling System (GAMS). Four nonconvex and nonsmooth test cases were constructed from a small but realistic integrated gas production system optimization problem. The derivation of the system of equations for the various test cases is also presented. Results demonstrate that DGM is especially effective for solving nonsmooth optimization problems and that its two versions are capable global optimization algorithms. We also demonstrate that LGO successfully solves the presented test problems (as well as other related real-world problems).
- Description: C1
- Description: 2003004725
Improving risk grouping rules for prostate cancer patients with optimization
- Authors: Churilov, Leonid , Bagirov, Adil , Schwartz, Daniel , Smith, Kate , Dally, Michael
- Date: 2004
- Type: Text , Conference paper
- Relation: Paper presented at the 37th Annual Hawaii International Conference on System Sciences, Big Island, Hawaii : 5th January, 2004
- Full Text:
- Reviewed:
- Description: Data mining techniques provide a popular and powerful toolset for addressing both clinical and management issues in the area of health care. This paper describes a study of assigning prostate cancer patients to homogeneous groups with the aim of supporting future clinical treatment decisions. A cluster-analysis-based model is proposed, and the application of non-smooth, non-convex optimization techniques to solve this model is discussed. It is demonstrated that using the optimization-based approach to data mining of a prostate cancer patient database can generate a significant amount of new knowledge that can be effectively utilized to enhance clinical decision making.
- Description: E1
- Description: 2003000846
Optimization based clustering algorithms in multicast group hierarchies
- Authors: Jia, Long , Ouveysi, Iradj , Rubinov, Alex , Bagirov, Adil
- Date: 2003
- Type: Text , Conference paper
- Relation: Paper presented at the 2003 Australian Telecommunications Networks and Applications Conference, Melbourne : 8th - 10th December, 2003
- Full Text:
- Reviewed:
- Description: In this paper we propose the use of optimization based clustering algorithms to determine hierarchical multicast trees. This problem is formulated as an optimization problem with a non-smooth, non-convex objective function. Different algorithms are examined for solving this problem, and results of numerical experiments using artificial and real-world databases are reported. We compare several optimization based clustering methods and their combinations with the k-means method. The results demonstrate the effectiveness of these algorithms.
- Description: E1
- Description: 2003000382