An algorithm for clusterwise linear regression based on smoothing techniques
- Authors: Bagirov, Adil , Ugon, Julien , Mirzayeva, Hijran
- Date: 2014
- Type: Text , Journal article
- Relation: Optimization Letters Vol. 9, no. 2 (2014), p. 375-390
- Full Text: false
- Reviewed:
- Description: We propose an algorithm based on an incremental approach and smoothing techniques to solve clusterwise linear regression (CLR) problems. This algorithm incrementally divides the whole data set into groups which can be easily approximated by one linear regression function. A special procedure is introduced to generate an initial solution for solving global optimization problems at each iteration of the incremental algorithm. Such an approach allows one to find global or approximate global solutions to the CLR problems. The algorithm is tested using several data sets for regression analysis and compared with the multistart and incremental Späth algorithms.
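The abstract above does not say which smoothing function the paper uses. As a rough illustration of the general idea only, the sketch below assumes the usual CLR objective (each point is charged the smallest squared error over the k regression lines) and replaces the nonsmooth minimum with a log-sum-exp soft minimum. The function names and the parameter `t` are hypothetical and are not taken from the paper.

```python
import numpy as np

def squared_errors(A, b, W, c):
    """Squared error of every point under every line.

    A: (n, d) inputs, b: (n,) outputs, W: (k, d) slopes, c: (k,) intercepts.
    Returns an (n, k) array.
    """
    return (A @ W.T + c - b[:, None]) ** 2

def exact_objective(A, b, W, c):
    # Nonsmooth: each point is charged the error of its best-fitting line.
    return squared_errors(A, b, W, c).min(axis=1).sum()

def smoothed_objective(A, b, W, c, t=10.0):
    # Soft minimum via log-sum-exp (an assumed smoothing, not the paper's).
    # Larger t gives a tighter approximation of the exact minimum.
    E = squared_errors(A, b, W, c)
    m = E.min(axis=1, keepdims=True)  # shift for numerical stability
    soft = m[:, 0] - np.log(np.exp(-t * (E - m)).sum(axis=1)) / t
    return soft.sum()
```

The smoothed value never exceeds the exact one and differs from it by at most n·log(k)/t, so a smooth nonlinear programming solver can be applied to it for increasing values of `t`.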
Nonsmooth optimization algorithms for clusterwise linear regression
- Authors: Mirzayeva, Hijran
- Date: 2013
- Type: Text , Thesis , PhD
- Full Text: false
- Description: Data mining is about solving problems by analyzing data that is present in databases. Supervised and unsupervised data classification (clustering) are among the most important techniques in data mining. Regression analysis is the process of fitting a function (often linear) to the data to discover how one or more variables vary as a function of another. The aim of clusterwise regression is to combine both of these techniques to discover trends within data when more than one trend is likely to exist. Clusterwise regression has applications, for instance, in market segmentation, where it allows one to gather information on customer behaviors for several unknown groups of customers. There exist different methods for solving clusterwise linear regression problems. In spite of that, the development of efficient algorithms for solving clusterwise linear regression problems is still an important research topic. In this thesis our aim is to develop new algorithms for solving clusterwise linear regression problems in large data sets based on incremental and nonsmooth optimization approaches. Three new methods for solving clusterwise linear regression problems are developed and numerically tested on publicly available data sets for regression analysis. The first method is a new incremental algorithm for solving clusterwise linear regression problems based on their nonsmooth nonconvex formulation. The second method is a nonsmooth optimization algorithm for solving clusterwise linear regression problems. Nonsmooth optimization techniques are proposed for use in place of the Späth algorithm to solve the optimization problems at each iteration of the incremental algorithm; specifically, the discrete gradient method is used to solve these nonsmooth optimization problems. This approach allows one to reduce the CPU time and the number of regression problems solved in comparison with the first incremental algorithm. The third algorithm is based on an incremental approach and on smoothing techniques for solving clusterwise linear regression problems. The use of smoothing techniques allows one to apply powerful methods of smooth nonlinear programming to solve clusterwise linear regression problems. Numerical results are presented for all three algorithms using small to large data sets. The new algorithms are also compared with the multi-start Späth algorithm for clusterwise linear regression.
- Description: Doctor of Philosophy
Nonsmooth nonconvex optimization approach to clusterwise linear regression problems
- Authors: Bagirov, Adil , Ugon, Julien , Mirzayeva, Hijran
- Date: 2013
- Type: Text , Journal article
- Relation: European Journal of Operational Research Vol. 229, no. 1 (2013), p. 132-142
- Full Text: false
- Reviewed:
- Description: Clusterwise regression consists of finding a number of regression functions each approximating a subset of the data. In this paper, a new approach for solving clusterwise linear regression problems is proposed based on a nonsmooth nonconvex formulation. We present an algorithm for minimizing this nonsmooth nonconvex function. This algorithm incrementally divides the whole data set into groups which can be easily approximated by one linear regression function. A special procedure is introduced to generate a good starting point for solving global optimization problems at each iteration of the incremental algorithm. Such an approach allows one to find global or near-global solutions to the problem when the data sets are sufficiently dense. The algorithm is compared with the multistart Späth algorithm on several publicly available data sets for regression analysis. © 2013 Elsevier B.V. All rights reserved.
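The abstract above does not give the formula for the nonsmooth nonconvex objective. A common way to write a clusterwise linear regression objective, assumed here for illustration, charges each data point the smallest squared residual among the k regression lines. The function name `clr_objective` and the toy data are hypothetical.

```python
import numpy as np

def clr_objective(A, b, W, c):
    """Sum over points of the smallest squared residual among k lines.

    A: (n, d) inputs, b: (n,) targets, W: (k, d) slopes, c: (k,) intercepts.
    Returns (objective value, index of the best line for each point).
    """
    residuals = A @ W.T + c - b[:, None]   # shape (n, k)
    errors = residuals ** 2
    labels = errors.argmin(axis=1)         # implied cluster assignment
    return errors.min(axis=1).sum(), labels

# Toy data: two noiseless trends, b = a and b = -a + 4.
A = np.array([[0.0], [1.0], [3.0], [0.0], [1.0], [3.0]])
b = np.array([0.0, 1.0, 3.0, 4.0, 3.0, 1.0])

# Two lines that match the two trends.
value, labels = clr_objective(
    A, b, W=np.array([[1.0], [-1.0]]), c=np.array([0.0, 4.0])
)

# A single line can only follow one of the trends.
single_value, _ = clr_objective(A, b, W=np.array([[1.0]]), c=np.array([0.0]))
```

The minimum over lines is what makes the function nonsmooth, and because the cluster assignment changes with the line parameters, the function is also nonconvex in those parameters.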
Nonsmooth optimization algorithm for solving clusterwise linear regression problems
- Authors: Bagirov, Adil , Ugon, Julien , Mirzayeva, Hijran
- Date: 2015
- Type: Text , Journal article
- Relation: Journal of Optimization Theory and Applications Vol. 164, no. 3 (2015), p. 755-780
- Relation: http://purl.org/au-research/grants/arc/DP140103213
- Full Text: false
- Reviewed:
- Description: Clusterwise linear regression consists of finding a number of linear regression functions each approximating a subset of the data. In this paper, the clusterwise linear regression problem is formulated as a nonsmooth nonconvex optimization problem and an algorithm based on an incremental approach and on the discrete gradient method of nonsmooth optimization is designed to solve it. This algorithm incrementally divides the whole dataset into groups which can be easily approximated by one linear regression function. A special procedure is introduced to generate good starting points for solving global optimization problems at each iteration of the incremental algorithm. The algorithm is compared with the multi-start Späth and the incremental algorithms on several publicly available datasets for regression analysis.
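For context on the multi-start baseline mentioned above, here is a minimal sketch of a k-means-style alternating scheme: refit each line by least squares, then reassign each point to its best line. It is a simplification in the spirit of the Späth algorithm, not Späth's exact exchange procedure and not the algorithm proposed in the paper. All function names and defaults are hypothetical.

```python
import numpy as np

def fit_line(A, b):
    # Ordinary least squares with an intercept column appended.
    X = np.column_stack([A, np.ones(len(A))])
    coef, *_ = np.linalg.lstsq(X, b, rcond=None)
    return coef  # last entry is the intercept

def alternating_clr(A, b, k, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    labels = rng.integers(k, size=len(A))      # random initial partition
    X = np.column_stack([A, np.ones(len(A))])
    coefs = np.zeros((k, X.shape[1]))
    history = []                               # objective after each refit
    for _ in range(n_iter):
        for j in range(k):
            mask = labels == j
            if mask.any():                     # keep the old line if a cluster is empty
                coefs[j] = fit_line(A[mask], b[mask])
        errors = (X @ coefs.T - b[:, None]) ** 2
        new_labels = errors.argmin(axis=1)     # reassign points to their best line
        history.append(errors.min(axis=1).sum())
        if np.array_equal(new_labels, labels):
            break                              # partition is stable
        labels = new_labels
    return coefs, labels, history

def multistart_clr(A, b, k, n_starts=10):
    # Run from several random partitions and keep the lowest final objective.
    runs = [alternating_clr(A, b, k, seed=s) for s in range(n_starts)]
    return min(runs, key=lambda run: run[2][-1])
```

Each pass can only lower or preserve the objective, but the result depends on the starting partition, which is why such schemes are usually restarted many times and why better starting points matter.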