Clusterwise support vector linear regression
- Authors: Joki, Kaisa , Bagirov, Adil , Karmitsa, Napsu , Mäkelä, Marko , Taheri, Sona
- Date: 2020
- Type: Text , Journal article
- Relation: European Journal of Operational Research Vol. 287, no. 1 (2020), p. 19-35
- Description: In clusterwise linear regression (CLR), the aim is to simultaneously partition data into a given number of clusters and to find regression coefficients for each cluster. In this paper, we propose a novel approach to model and solve the CLR problem. The main idea is to utilize the support vector machine (SVM) approach to model the CLR problem by using the SVM for regression to approximate each cluster. This new formulation of the CLR problem is represented as an unconstrained nonsmooth optimization problem, where we minimize a difference of two convex (DC) functions. To solve this problem, a method based on the combination of the incremental algorithm and the double bundle method for DC optimization is designed. Numerical experiments are performed to validate the reliability of the new formulation for CLR and the efficiency of the proposed method. The results show that the SVM approach is suitable for solving CLR problems, especially when there are outliers in the data. © 2020 Elsevier B.V.
- Description: Funding details: Academy of Finland (289500, 294002, 319274); Turun Yliopisto; Australian Research Council, ARC (Project no. DP190100580).
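- Note: The CLR idea in the abstract above (each point is assigned to whichever of k linear models fits it best, with errors measured in the SVM-regression style) can be illustrated with a minimal sketch. This is not the paper's exact formulation, which may include further terms and relies on a DC decomposition and a bundle method; the function names `eps_insensitive` and `clr_objective`, the tolerance `eps`, and the toy data are all assumptions made for illustration.

```python
def eps_insensitive(residual, eps):
    """Epsilon-insensitive loss used in SVM regression: errors below eps cost nothing."""
    return max(0.0, abs(residual) - eps)


def clr_objective(points, targets, models, eps=0.1):
    """Hypothetical helper: evaluate a clusterwise regression objective.

    points  : list of feature vectors (lists of floats)
    targets : list of observed responses
    models  : list of (weights, intercept) pairs, one per cluster
    Returns the summed loss and the index of the best-fitting model per point.
    """
    total = 0.0
    assignment = []
    for x, y in zip(points, targets):
        # Loss of this point under every cluster's linear model.
        losses = [
            eps_insensitive(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b - y, eps)
            for (w, b) in models
        ]
        # Each point contributes only its smallest loss (the "min" over clusters).
        best = min(range(len(models)), key=losses.__getitem__)
        assignment.append(best)
        total += losses[best]
    return total, assignment


# Toy data (assumed): three points on y = 2x and three on y = -x + 1.
points = [[0.0], [1.0], [2.0], [0.0], [1.0], [2.0]]
targets = [0.0, 2.0, 4.0, 1.0, 0.0, -1.0]
models = [([2.0], 0.0), ([-1.0], 1.0)]

value, assignment = clr_objective(points, targets, models)
print(value, assignment)
```

The inner minimum over clusters is what makes the objective nonsmooth and nonconvex; the sketch only evaluates it for fixed models and does not attempt the optimization itself.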
Visual tools for analysing evolution, emergence, and error in data streams
- Authors: Hart, Sol , Yearwood, John , Bagirov, Adil
- Date: 2007
- Type: Text , Conference paper
- Relation: Paper presented at 6th IEEE/ACIS International Conference on Computer and Information Science, ICIS 2007, Melbourne, Victoria : 11th-13th July 2007 p. 987-992
- Description: The relatively new field of stream mining has necessitated the development of robust drift-aware algorithms that provide accurate, real-time data handling capabilities. Tools are needed to assess and diagnose important trends and investigate drift evolution parameters. In this paper, we present two novel visualisation techniques, Pixie and Luna graphs, which incorporate salient group statistics coupled with intuitive visual representations of multidimensional groupings over time. Through the representations presented here, spatial interactions between temporal divisions can be diagnosed and overall distribution patterns identified. These techniques provide a means of evaluating commonly constrained evolutionary problems in a non-constrained capacity.
Data mining with combined use of optimization techniques and self-organizing maps for improving risk grouping rules : Application to prostate cancer patients
- Authors: Churilov, Leonid , Bagirov, Adil , Schwartz, Daniel , Smith, Kate , Dally, Michael
- Date: 2005
- Type: Text , Journal article
- Relation: Journal of Management Information Systems Vol. 21, no. 4 (2005), p. 85-100
- Description: Data mining techniques provide a popular and powerful tool set to generate various data-driven classification systems. In this paper, we investigate the combined use of self-organizing maps (SOM) and nonsmooth nonconvex optimization techniques in order to produce a working case of a data-driven risk classification system. The optimization approach strengthens the validity of SOM results, and the improved classification system increases both the quality of prediction and the homogeneity within the risk groups. Accurate classification of prostate cancer patients into risk groups is important to assist in the identification of appropriate treatment paths. We start with the existing rules and aim to improve classification accuracy by identifying inconsistencies, utilizing self-organizing maps as a data visualization tool. Then, we progress to the study of assigning prostate cancer patients into homogeneous groups with the aim of supporting future clinical treatment decisions. Using the case of prostate cancer patient grouping, we demonstrate the strong potential of data-driven risk classification schemes for addressing risk grouping issues in more general organizational settings. © 2005 M.E. Sharpe, Inc.