### A heuristic algorithm for solving the minimum sum-of-squares clustering problems

**Authors:**Ordin, Burak , Bagirov, Adil**Date:**2015**Type:**Text , Journal article**Relation:**Journal of Global Optimization Vol. 61, no. 2 (2015), p. 341-361**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**Clustering is an important task in data mining. It can be formulated as a global optimization problem which is challenging for existing global optimization techniques even in medium size data sets. Various heuristics were developed to solve the clustering problem. The global k-means and modified global k-means are among the most efficient heuristics for solving the minimum sum-of-squares clustering problem. However, these algorithms are not always accurate in finding global or near global solutions to the clustering problem. In this paper, we introduce a new algorithm to improve the accuracy of the modified global k-means algorithm in finding global solutions. We use an auxiliary cluster problem to generate a set of initial points and apply the k-means algorithm starting from these points to find the global solution to the clustering problems. Numerical results on 16 real-world data sets clearly demonstrate the superiority of the proposed algorithm over the global and modified global k-means algorithms in finding global solutions to clustering problems.
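
The step "apply the k-means algorithm starting from these points" can be sketched as follows. This is a minimal illustration only, assuming the candidate starting centers are already given; the paper's auxiliary cluster problem that generates them is not reproduced here, and the names `kmeans_from` and `best_of_starts` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_from(points: List[Point], centers: List[Point],
                max_iter: int = 100) -> Tuple[List[List[float]], float]:
    """Plain Lloyd-style k-means started from the given centers.

    Returns the final centers and the sum of squared distances.
    """
    current = [list(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        groups: List[List[Point]] = [[] for _ in current]
        for p in points:
            j = min(range(len(current)), key=lambda i: sq_dist(p, current[i]))
            groups[j].append(p)
        # Update step: move each center to the mean of its group.
        updated = []
        for c, g in zip(current, groups):
            if g:
                updated.append([sum(col) / len(g) for col in zip(*g)])
            else:
                updated.append(c)  # an empty cluster keeps its center
        if updated == current:
            break
        current = updated
    sse = sum(min(sq_dist(p, c) for c in current) for p in points)
    return current, sse


def best_of_starts(points: List[Point], starts: List[List[Point]]):
    """Run k-means from every candidate start and keep the lowest objective."""
    return min((kmeans_from(points, s) for s in starts), key=lambda r: r[1])


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
starts = [[(0.0, 0.0), (0.0, 1.0)], [(0.0, 0.0), (10.0, 10.0)]]
centers, sse = best_of_starts(points, starts)
```

Keeping the run with the lowest objective value is one simple way to use several starting points; the selection rule in the paper may differ.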

### A convolutional recursive modified Self Organizing Map for handwritten digits recognition

**Authors:**Mohebi, Ehsan , Bagirov, Adil**Date:**2014**Type:**Text , Journal article**Relation:**Neural Networks Vol. 60, no. (2014), p. 104-118**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**It is well known that the handwritten digits recognition is a challenging problem. Different classification algorithms have been applied to solve it. Among them, the Self Organizing Maps (SOM) produced promising results. In this paper, first we introduce a Modified SOM for the vector quantization problem with improved initialization process and topology preservation. Then we develop a Convolutional Recursive Modified SOM and apply it to the problem of handwritten digits recognition. The computational results obtained using the well known MNIST dataset demonstrate the superiority of the proposed algorithm over the existing SOM-based algorithms.

### An algorithm for clustering using L1-norm based on hyperbolic smoothing technique

**Authors:**Bagirov, Adil , Mohebi, Ehsan**Date:**2016**Type:**Text , Journal article**Relation:**Computational Intelligence Vol. 32, no. 3 (2016), p. 439-457**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**Cluster analysis deals with the problem of organization of a collection of objects into clusters based on a similarity measure, which can be defined using various distance functions. The use of different similarity measures allows one to find different cluster structures in a data set. In this article, an algorithm is developed to solve clustering problems where the similarity measure is defined using the L1-norm. The algorithm is designed using the nonsmooth optimization approach to the clustering problem. Smoothing techniques are applied to smooth both the clustering function and the L1-norm. The algorithm computes clusters sequentially and finds global or near global solutions to the clustering problem. Results of numerical experiments using 12 real-world data sets are reported, and the proposed algorithm is compared with two other clustering algorithms. ©2015 Wiley Periodicals, Inc.
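
To illustrate what smoothing the L1-norm can look like, the sketch below replaces each absolute value with a hyperbolic surrogate, sqrt(t^2 + tau^2). This is an assumption chosen for illustration: the abstract does not state the exact smoothing function or how the precision parameter is driven to zero, and the names `smooth_abs`, `smooth_l1` and `tau` are hypothetical.

```python
import math
from typing import List, Sequence


def smooth_abs(t: float, tau: float) -> float:
    """Smooth surrogate of |t|; satisfies |t| <= value <= |t| + tau for tau > 0."""
    return math.sqrt(t * t + tau * tau)


def smooth_l1(x: Sequence[float], y: Sequence[float], tau: float) -> float:
    """Smoothed L1 distance between x and y."""
    return sum(smooth_abs(a - b, tau) for a, b in zip(x, y))


def smooth_l1_grad(x: Sequence[float], y: Sequence[float], tau: float) -> List[float]:
    """Gradient of the smoothed distance with respect to x (defined everywhere)."""
    return [(a - b) / smooth_abs(a - b, tau) for a, b in zip(x, y)]


value = smooth_l1((1.0, 2.0), (1.0, 4.0), tau=1e-6)
grad = smooth_l1_grad((1.0, 2.0), (1.0, 4.0), tau=1e-6)
```

Unlike the exact L1 distance, the surrogate is differentiable where a coordinate difference is zero, which is what allows gradient-based solvers to be applied.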

### Prediction of monthly rainfall in Victoria, Australia : Clusterwise linear regression approach

**Authors:**Bagirov, Adil , Mahmood, Arshad , Barton, Andrew**Date:**2017**Type:**Text , Journal article**Relation:**Atmospheric Research Vol. 188, no. (2017), p. 20-29**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**This paper develops the Clusterwise Linear Regression (CLR) technique for prediction of monthly rainfall. The CLR is a combination of clustering and regression techniques. It is formulated as an optimization problem and an incremental algorithm is designed to solve it. The algorithm is applied to predict monthly rainfall in Victoria, Australia using rainfall data with five input meteorological variables over the period of 1889–2014 from eight geographically diverse weather stations. The prediction performance of the CLR method is evaluated by comparing observed and predicted rainfall values using four measures of forecast accuracy. The proposed method is also compared with the CLR using the maximum likelihood framework by the expectation-maximization algorithm, multiple linear regression, artificial neural networks and the support vector machines for regression models using computational results. The results demonstrate that the proposed algorithm outperforms other methods in most locations. © 2017 Elsevier B.V.

### An incremental piecewise linear classifier based on polyhedral conic separation

**Authors:**Ozturk, Gurkan , Bagirov, Adil , Kasimbeyli, Refail**Date:**2015**Type:**Text , Journal article**Relation:**Machine Learning Vol. 101, no. 1-3 (2015), p. 397-413**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**In this paper, a piecewise linear classifier based on polyhedral conic separation is developed. This classifier builds nonlinear boundaries between classes using polyhedral conic functions. Since the number of polyhedral conic functions separating classes is not known a priori, an incremental approach is proposed to build separating functions. These functions are found by minimizing an error function which is nonsmooth and nonconvex. A special procedure is proposed to generate starting points to minimize the error function and this procedure is based on the incremental approach. The discrete gradient method, which is a derivative-free method for nonsmooth optimization, is applied to minimize the error function starting from those points. The proposed classifier is applied to solve classification problems on 12 publicly available data sets and compared with some mainstream and piecewise linear classifiers. © 2014, The Author(s).

### Solving DC programs using the cutting angle method

**Authors:**Ferrer, Albert , Bagirov, Adil , Beliakov, Gleb**Date:**2015**Type:**Text , Journal article**Relation:**Journal of Global Optimization Vol. 61, no. 1 (2015), p. 71-89**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**In this paper, we propose a new algorithm for global minimization of functions represented as a difference of two convex functions. The proposed method is a derivative free method and it is designed by adapting the extended cutting angle method. We present preliminary results of numerical experiments using test problems with difference of convex objective functions and box-constraints. We also compare the proposed algorithm with a classical one that uses prismatical subdivisions.

### A proximal bundle method for nonsmooth DC optimization utilizing nonconvex cutting planes

**Authors:**Joki, Kaisa , Bagirov, Adil , Karmitsa, Napsu , Makela, Marko**Date:**2017**Type:**Text , Journal article**Relation:**Journal of Global Optimization Vol. 68, no. 3 (2017), p. 501-535**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**In this paper, we develop a version of the bundle method to solve unconstrained difference of convex (DC) programming problems. It is assumed that a DC representation of the objective function is available. Our main idea is to utilize subgradients of both the first and second components in the DC representation. This subgradient information is gathered from some neighborhood of the current iteration point and it is used to build separately an approximation for each component in the DC representation. By combining these approximations we obtain a new nonconvex cutting plane model of the original objective function, which takes into account explicitly both the convex and the concave behavior of the objective function. We design the proximal bundle method for DC programming based on this new approach and prove the convergence of the method to an ε-critical point. The algorithm is tested using some academic test problems and the preliminary numerical results have shown the good performance of the new bundle method. An interesting fact is that the new algorithm nearly always finds the global solution in our test problems.

### New diagonal bundle method for clustering problems in large data sets

**Authors:**Karmitsa, Napsu , Bagirov, Adil , Taheri, Sona**Date:**2017**Type:**Text , Journal article**Relation:**European Journal of Operational Research Vol. 263, no. 2 (2017), p. 367-379**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**Clustering is one of the most important tasks in data mining. Recent developments in computer hardware allow us to store in random access memory (RAM) and repeatedly read data sets with hundreds of thousands and even millions of data points. This makes it possible to use conventional clustering algorithms in such data sets. However, these algorithms may need prohibitively large computational time and fail to produce accurate solutions. Therefore, it is important to develop clustering algorithms which are accurate and can provide real time clustering in large data sets. This paper introduces one of them. Using a nonsmooth optimization formulation of the clustering problem, the objective function is represented as a difference of two convex (DC) functions. Then a new diagonal bundle algorithm that explicitly uses this structure is designed and combined with an incremental approach to solve this problem. The method is evaluated using real world data sets with both a large number of attributes and a large number of data points. The proposed method is compared with two other clustering algorithms using numerical results. © 2017 Elsevier B.V.

### Clustering in large data sets with the limited memory bundle method

**Authors:**Karmitsa, Napsu , Bagirov, Adil , Taheri, Sona**Date:**2018**Type:**Text , Journal article**Relation:**Pattern Recognition Vol. 83, no. (2018), p. 245-259**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**The aim of this paper is to design an algorithm based on nonsmooth optimization techniques to solve the minimum sum-of-squares clustering problems in very large data sets. First, the clustering problem is formulated as a nonsmooth optimization problem. Then the limited memory bundle method [Haarala et al., 2007] is modified and combined with an incremental approach to design a new clustering algorithm. The algorithm is evaluated using real world data sets with both the large number of attributes and the large number of data points. It is also compared with some other optimization based clustering algorithms. The numerical results demonstrate the efficiency of the proposed algorithm for clustering in very large data sets.

### Nonsmooth DC programming approach to the minimum sum-of-squares clustering problems

**Authors:**Bagirov, Adil , Taheri, Sona , Ugon, Julien**Date:**2016**Type:**Text , Journal article**Relation:**Pattern Recognition Vol. 53, no. (2016), p. 12-24**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**This paper introduces an algorithm for solving the minimum sum-of-squares clustering problems using their difference of convex representations. A non-smooth non-convex optimization formulation of the clustering problem is used to design the algorithm. Characterizations of critical points, stationary points in the sense of generalized gradients and inf-stationary points of the clustering problem are given. The proposed algorithm is tested and compared with other clustering algorithms using large real world data sets. © 2015 Elsevier Ltd. All rights reserved.
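
For orientation, one standard way to write the minimum sum-of-squares clustering objective as a difference of convex functions is shown below. The notation (data set $A$ with $m$ points, $k$ centers $x_1, \dots, x_k$) is assumed here and is not taken from the abstract, so the paper's own formulation may differ in scaling or symbols.

```latex
f_k(x) = \frac{1}{m} \sum_{a \in A} \min_{j=1,\dots,k} \|x_j - a\|^2
       = f_k^1(x) - f_k^2(x),
\qquad
f_k^1(x) = \frac{1}{m} \sum_{a \in A} \sum_{j=1}^{k} \|x_j - a\|^2,
\qquad
f_k^2(x) = \frac{1}{m} \sum_{a \in A} \max_{j=1,\dots,k} \sum_{s \neq j} \|x_s - a\|^2 .
```

The identity follows from $\min_j t_j = \sum_s t_s - \max_j \sum_{s \neq j} t_s$. Both $f_k^1$ and $f_k^2$ are convex, since each is built from sums and maxima of convex quadratics.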

### An incremental clustering algorithm based on hyperbolic smoothing

**Authors:**Bagirov, Adil , Ordin, Burak , Ozturk, Gurkan , Xavier, Adilson**Date:**2015**Type:**Text , Journal article**Relation:**Computational Optimization and Applications Vol. 61, no. 1 (2015), p. 219-241**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**Clustering is an important problem in data mining. It can be formulated as a nonsmooth, nonconvex optimization problem. For most global optimization techniques this problem is challenging even in medium size data sets. In this paper, we propose an approach that allows one to apply local methods of smooth optimization to solve the clustering problems. We apply an incremental approach to generate starting points for cluster centers which enables us to deal with nonconvexity of the problem. The hyperbolic smoothing technique is applied to handle nonsmoothness of the clustering problems and to make possible the application of smooth optimization algorithms to solve them. Results of numerical experiments with eleven real-world data sets and the comparison with state-of-the-art incremental clustering algorithms demonstrate that the smooth optimization algorithms in combination with the incremental approach are a powerful alternative to existing clustering algorithms.

### Optimization based clustering algorithms for authorship analysis of phishing emails

**Authors:**Seifollahi, Sattar , Bagirov, Adil , Layton, Robert , Gondal, Iqbal**Date:**2017**Type:**Text , Journal article**Relation:**Neural Processing Letters Vol. 46, no. 2 (2017), p. 411-425**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**Phishing has given attackers the power to masquerade as legitimate users of organizations, such as banks, to scam money and private information from victims. Phishing is so widespread that combating the phishing attacks could overwhelm the victim organization. It is important to group the phishing attacks to formulate an effective defence mechanism. In this paper, we use clustering methods to analyze and characterize phishing emails and perform their relative attribution. Emails are first tokenized to a bag-of-words space and, then, transformed to a numeric vector space using frequencies of words in documents. WordNet vocabulary is used to take effects of similar words into account and to reduce sparsity. The word similarity measure is combined with the term frequencies to introduce a novel text transformation into numeric features. To improve the accuracy, we apply inverse document frequency weighting, which gives higher weights to features used by fewer authors. The k-means algorithm and three recently introduced optimization based algorithms (MS-MGKM, INCA and DCClust) are applied for clustering purposes. The optimization based algorithms indicate the existence of well separated clusters in the phishing emails dataset. © 2017, Springer Science+Business Media New York.
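
The term-frequency and inverse-document-frequency steps described above can be sketched as follows. This is a textbook tf-idf weighting over tokenized documents, given only as an illustration: it omits the WordNet-based similarity step, it counts documents rather than authors, and the function name `tfidf` and the sample tokens are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Weight each term by relative frequency times log(N / document frequency)."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


emails = [
    ["verify", "account", "now"],
    ["verify", "password"],
    ["verify", "account"],
]
vectors = tfidf(emails)
```

Terms that occur in every document receive weight zero, while rarer terms are weighted up, which mirrors the idea of giving higher weights to features used by fewer authors.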

### A comparative assessment of models to predict monthly rainfall in Australia

**Authors:**Bagirov, Adil , Mahmood, Arshad**Date:**2018**Type:**Text , Journal article**Relation:**Water Resources Management Vol. 32, no. 5 (2018), p. 1777-1794**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**Accurate rainfall prediction is a challenging task. It is especially challenging in Australia where the climate is highly variable. Australia’s climatic zones range from high rainfall tropical regions in the north to the driest desert region in the interior. The performance of prediction models may vary depending on climatic conditions. It is, therefore, important to assess and compare the performance of these models in different climatic zones. This paper examines the performance of data driven models such as the support vector machines for regression, the multiple linear regression, the k-nearest neighbors and the artificial neural networks for monthly rainfall prediction in Australia depending on climatic conditions. Rainfall data with five meteorological variables over the period of 1970–2014 from 24 geographically diverse weather stations are used for this purpose. The prediction performance of each model was evaluated by comparing observed and predicted rainfall using various measures for prediction accuracy. © 2018, Springer Science+Business Media B.V., part of Springer Nature.

### A difference of convex optimization algorithm for piecewise linear regression

**Authors:**Bagirov, Adil , Taheri, Sona , Asadi, Soodabeh**Date:**2019**Type:**Text , Journal article**Relation:**Journal of Industrial and Management Optimization Vol. 15, no. 2 (2019), p. 909-932**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**The problem of finding a continuous piecewise linear function approximating a regression function is considered. This problem is formulated as a nonconvex nonsmooth optimization problem where the objective function is represented as a difference of convex (DC) functions. Subdifferentials of DC components are computed and an algorithm is designed based on these subdifferentials to find piecewise linear functions. The algorithm is tested using some synthetic and real world data sets and compared with other regression algorithms.

### Nonsmooth DC programming approach to clusterwise linear regression : Optimality conditions and algorithms

**Authors:**Bagirov, Adil , Ugon, Julien**Date:**2018**Type:**Text , Journal article**Relation:**Optimization Methods and Software Vol. 33, no. 1 (2018), p. 194-219**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**The clusterwise linear regression problem is formulated as a nonsmooth nonconvex optimization problem using the squared regression error function. The objective function in this problem is represented as a difference of convex functions. Optimality conditions are derived, and an algorithm is designed based on such a representation. An incremental approach is proposed to generate starting solutions. The algorithm is tested on small to large data sets. © 2017 Informa UK Limited, trading as Taylor & Francis Group.

### Nonsmooth optimization algorithm for solving clusterwise linear regression problems

**Authors:**Bagirov, Adil , Ugon, Julien , Mirzayeva, Hijran**Date:**2015**Type:**Text , Journal article**Relation:**Journal of Optimization Theory and Applications Vol. 164, no. 3 (2015), p. 755-780**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**Clusterwise linear regression consists of finding a number of linear regression functions each approximating a subset of the data. In this paper, the clusterwise linear regression problem is formulated as a nonsmooth nonconvex optimization problem and an algorithm based on an incremental approach and on the discrete gradient method of nonsmooth optimization is designed to solve it. This algorithm incrementally divides the whole dataset into groups which can be easily approximated by one linear regression function. A special procedure is introduced to generate good starting points for solving global optimization problems at each iteration of the incremental algorithm. The algorithm is compared with the multi-start Spath and the incremental algorithms on several publicly available datasets for regression analysis.
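
The clusterwise linear regression objective described above, in which each data point is charged the error of its best-fitting linear function, can be evaluated as in the sketch below. The squared error and the `(weights, intercept)` model layout are assumptions made for illustration; the incremental algorithm and the discrete gradient method from the paper are not implemented here.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]  # (input vector, observed response)
Model = Tuple[Sequence[float], float]   # (weights, intercept)


def predict(model: Model, x: Sequence[float]) -> float:
    """Value of one linear regression function at x."""
    weights, intercept = model
    return sum(w * xi for w, xi in zip(weights, x)) + intercept


def clr_objective(data: List[Sample], models: List[Model]) -> float:
    """Sum over samples of the smallest squared error among all models."""
    return sum(min((predict(m, x) - y) ** 2 for m in models) for x, y in data)


# Two groups of points lying on the lines y = x and y = 5 - x.
data = [((0.0,), 0.0), ((1.0,), 1.0), ((0.0,), 5.0), ((1.0,), 4.0)]
two_lines = [((1.0,), 0.0), ((-1.0,), 5.0)]
one_line = [((0.0,), 2.5)]

fit_two = clr_objective(data, two_lines)
fit_one = clr_objective(data, one_line)
```

The inner minimum over models is what makes the objective nonsmooth and nonconvex, even though each individual squared error is a smooth convex function of the model parameters.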

### Double bundle method for finding Clarke stationary points in nonsmooth DC programming

**Authors:**Joki, Kaisa , Bagirov, Adil , Karmitsa, Napsu , Makela, Marko , Taheri, Sona**Date:**2018**Type:**Text , Journal article**Relation:**SIAM Journal on Optimization Vol. 28, no. 2 (2018), p. 1892-1919**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:****Reviewed:****Description:**The aim of this paper is to introduce a new proximal double bundle method for unconstrained nonsmooth optimization, where the objective function is presented as a difference of two convex (DC) functions. The novelty in our method is a new escape procedure which enables us to guarantee approximate Clarke stationarity for solutions by utilizing the DC components of the objective function. This optimality condition is stronger than the criticality condition typically used in DC programming. Moreover, if a candidate solution is not approximate Clarke stationary, then the escape procedure returns a descent direction. With this escape procedure, we can avoid some shortcomings encountered when criticality is used. The finite termination of the double bundle method to an approximate Clarke stationary point is proved by assuming that the subdifferentials of DC components are polytopes. Finally, some encouraging numerical results are presented.

### DC programming algorithm for clusterwise linear L1 regression

**Authors:**Bagirov, Adil , Taheri, Sona**Date:**2017**Type:**Text , Journal article**Relation:**Journal of the Operations Research Society of China Vol. 5, no. 2 (2017), p. 233-256**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**The aim of this paper is to develop an algorithm for solving the clusterwise linear least absolute deviations regression problem. This problem is formulated as a nonsmooth nonconvex optimization problem, and the objective function is represented as a difference of convex functions. Optimality conditions are derived by using this representation. An algorithm is designed based on the difference of convex representation and an incremental approach. The proposed algorithm is tested using small to large artificial and real-world data sets. © 2017, Operations Research Society of China, Periodicals Agency of Shanghai University, Science Press, and Springer-Verlag Berlin Heidelberg.