Your selections:

Show More

Show Less

A heuristic algorithm for solving the minimum sum-of-squares clustering problems

**Authors:**Ordin, Burak , Bagirov, Adil**Date:**2015**Type:**Text , Journal article**Relation:**Journal of Global Optimization Vol. 61, no. 2 (2015), p. 341-361**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**Clustering is an important task in data mining. It can be formulated as a global optimization problem which is challenging for existing global optimization techniques even in medium size data sets. Various heuristics were developed to solve the clustering problem. The global k-means and modified global k-means are among most efficient heuristics for solving the minimum sum-of-squares clustering problem. However, these algorithms are not always accurate in finding global or near global solutions to the clustering problem. In this paper, we introduce a new algorithm to improve the accuracy of the modified global k-means algorithm in finding global solutions. We use an auxiliary cluster problem to generate a set of initial points and apply the k-means algorithm starting from these points to find the global solution to the clustering problems. Numerical results on 16 real-world data sets clearly demonstrate the superiority of the proposed algorithm over the global and modified global k-means algorithms in finding global solutions to clustering problems.

A novel piecewise linear classifier based on polyhedral conic and max-min separabilities

- Bagirov, Adil, Ugon, Julien, Webb, Dean, Ozturk, Gurkan, Kasimbeyli, Refail

**Authors:**Bagirov, Adil , Ugon, Julien , Webb, Dean , Ozturk, Gurkan , Kasimbeyli, Refail**Date:**2011**Type:**Text , Journal article**Relation:**TOP Vol.21, no.1 (2011), p. 1-22**Full Text:**false**Reviewed:****Description:**In this paper, an algorithm for finding piecewise linear boundaries between pattern classes is developed. This algorithm consists of two main stages. In the first stage, a polyhedral conic set is used to identify data points which lie inside their classes, and in the second stage we exclude those points to compute a piecewise linear boundary using the remaining data points. Piecewise linear boundaries are computed incrementally starting with one hyperplane. Such an approach allows one to significantly reduce the computational effort in many large data sets. Results of numerical experiments are reported. These results demonstrate that the new algorithm consistently produces a good test set accuracy on most data sets comparing with a number of other mainstream classifiers. Â© 2011 Sociedad de EstadÃstica e InvestigaciÃ³n Operativa.

New diagonal bundle method for clustering problems in large data sets

- Karmitsa, Napsu, Bagirov, Adil, Taheri, Sona

**Authors:**Karmitsa, Napsu , Bagirov, Adil , Taheri, Sona**Date:**2017**Type:**Text , Journal article**Relation:**European Journal of Operational Research Vol. 263, no. 2 (2017), p. 367-379**Relation:**http://purl.org/au-research/grants/arc/DP140103213**Full Text:**false**Reviewed:****Description:**Clustering is one of the most important tasks in data mining. Recent developments in computer hardware allow us to store in random access memory (RAM) and repeatedly read data sets with hundreds of thousands and even millions of data points. This makes it possible to use conventional clustering algorithms in such data sets. However, these algorithms may need prohibitively large computational time and fail to produce accurate solutions. Therefore, it is important to develop clustering algorithms which are accurate and can provide real time clustering in large data sets. This paper introduces one of them. Using nonsmooth optimization formulation of the clustering problem the objective function is represented as a difference of two convex (DC) functions. Then a new diagonal bundle algorithm that explicitly uses this structure is designed and combined with an incremental approach to solve this problem. The method is evaluated using real world data sets with both large number of attributes and large number of data points. The proposed method is compared with two other clustering algorithms using numerical results. © 2017 Elsevier B.V.

Clusterwise support vector linear regression

- Joki, Kaisa, Bagirov, Adil, Karmitsa, Napsu, Mäkelä, Marko, Taheri, Sona

**Authors:**Joki, Kaisa , Bagirov, Adil , Karmitsa, Napsu , Mäkelä, Marko , Taheri, Sona**Date:**2020**Type:**Text , Journal article**Relation:**European Journal of Operational Research Vol. 287, no. 1 (2020), p. 19-35**Full Text:**false**Reviewed:****Description:**In clusterwise linear regression (CLR), the aim is to simultaneously partition data into a given number of clusters and to find regression coefficients for each cluster. In this paper, we propose a novel approach to model and solve the CLR problem. The main idea is to utilize the support vector machine (SVM) approach to model the CLR problem by using the SVM for regression to approximate each cluster. This new formulation of the CLR problem is represented as an unconstrained nonsmooth optimization problem, where we minimize a difference of two convex (DC) functions. To solve this problem, a method based on the combination of the incremental algorithm and the double bundle method for DC optimization is designed. Numerical experiments are performed to validate the reliability of the new formulation for CLR and the efficiency of the proposed method. The results show that the SVM approach is suitable for solving CLR problems, especially, when there are outliers in data. © 2020 Elsevier B.V.**Description:**Funding details: Academy of Finland, 289500, 294002, 319274 Funding details: Turun Yliopisto Funding details: Australian Research Council, ARC, (Project no. DP190100580 ).

- «
- ‹
- 1
- ›
- »

Are you sure you would like to clear your session, including search history and login status?