- Title
- Nonsmooth optimization algorithms for clusterwise linear regression
- Creator
- Mirzayeva, Hijran
- Date
- 2013
- Type
- Text; Thesis; PhD
- Identifier
- http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/41975
- Identifier
- vital:5317
- Abstract
- Data mining is about solving problems by analyzing data stored in databases. Supervised and unsupervised data classification (clustering) are among the most important techniques in data mining. Regression analysis is the process of fitting a function (often linear) to data in order to discover how one or more variables vary as a function of another. Clusterwise regression combines both techniques to discover trends within data when more than one trend is likely to exist. It has applications, for instance, in market segmentation, where it allows one to gather information on the behaviors of several unknown groups of customers. Although different methods exist for solving clusterwise linear regression problems, the development of efficient algorithms remains an important research topic. The aim of this thesis is to develop new algorithms for solving clusterwise linear regression problems in large data sets based on incremental and nonsmooth optimization approaches. Three new methods are developed and numerically tested on publicly available data sets for regression analysis. The first method is an incremental algorithm for solving clusterwise linear regression problems based on their nonsmooth nonconvex formulation. The second method is a nonsmooth optimization algorithm in which nonsmooth optimization techniques are used instead of the Späth algorithm to solve the optimization problems arising at each iteration of the incremental algorithm; the discrete gradient method is applied to these nonsmooth problems. This approach reduces the CPU time and the number of regression problems solved in comparison with the first incremental algorithm.
The third algorithm is based on an incremental approach and on smoothing techniques, whose use allows one to apply powerful methods of smooth nonlinear programming to clusterwise linear regression problems. Numerical results are presented for all three algorithms using small to large data sets. The new algorithms are also compared with the multi-start Späth algorithm for clusterwise linear regression.; Doctor of Philosophy
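To make the problem setting concrete, the Späth-style alternating scheme and the multi-start strategy mentioned in the abstract can be sketched as follows. This is a minimal illustrative implementation, not the thesis's code: the function name `clusterwise_linreg` and all parameters are assumptions, and it alternates least-squares refitting of each cluster's line with reassignment of points to their best-fitting line.

```python
import numpy as np

def clusterwise_linreg(X, y, k, n_iter=100, seed=0):
    """A minimal Spath-style alternating sketch (hypothetical helper, not the
    thesis code): repeat (1) fit each cluster's line by least squares,
    (2) reassign each point to the line with the smallest squared residual."""
    rng = np.random.default_rng(seed)
    n = len(y)
    # Design matrix with an intercept column
    A = np.column_stack([np.asarray(X).reshape(n, -1), np.ones(n)])
    labels = rng.integers(0, k, n)                # random initial partition
    coefs = np.zeros((k, A.shape[1]))
    for _ in range(n_iter):
        for j in range(k):
            idx = labels == j
            if idx.sum() >= A.shape[1]:           # skip (near-)empty clusters
                coefs[j], *_ = np.linalg.lstsq(A[idx], y[idx], rcond=None)
        resid = (A @ coefs.T - y[:, None]) ** 2   # each point's residual to each line
        new_labels = resid.argmin(axis=1)
        if np.array_equal(new_labels, labels):    # partition stable: converged
            break
        labels = new_labels
    sse = float(resid.min(axis=1).sum())          # clusterwise sum of squared errors
    return coefs, labels, sse

# Two noiseless linear trends mixed in one synthetic data set
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 200)
y = np.where(np.arange(200) < 100, 2 * x + 1, -3 * x + 5)

# Multi-start: keep the best of several random initial partitions,
# since the alternating scheme only finds a local minimum.
coefs, labels, sse = min((clusterwise_linreg(x, y, 2, seed=s) for s in range(5)),
                         key=lambda t: t[2])
```

Because the objective is nonconvex, a single run can stall in a poor local minimum; restarting from several random partitions and keeping the lowest-error solution is the multi-start variant the thesis uses as a baseline.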
- Publisher
- University of Ballarat
- Rights
- Open Access
- Rights
- This metadata is freely available under a CC0 license
- Subject
- Regression analysis; Clusterwise regression; Data mining