- Title
- Missing value imputation via clusterwise linear regression
- Creator
- Karmitsa, Napsu; Taheri, Sona; Bagirov, Adil; Makinen, Pauliina
- Date
- 2022
- Type
- Text; Journal article
- Identifier
- http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/174096
- Identifier
- vital:14773
- Identifier
- https://doi.org/10.1109/TKDE.2020.3001694
- Identifier
- ISSN:1041-4347
- Abstract
In this paper, a new method for preprocessing incomplete data is introduced. The method is based on clusterwise linear regression and combines two well-known approaches to missing value imputation: linear regression and clustering. The idea is to approximate missing values using only those data points that are somewhat similar to the incomplete data point. A similar idea is also used in clustering-based imputation methods. Here, however, linear regression is applied within each cluster to predict the missing values accurately, and this is done simultaneously with the clustering. The proposed method is tested on synthetic and real-world data sets and compared with other algorithms for missing value imputation. Numerical results demonstrate that the proposed method produces the most accurate imputations in MCAR and MAR data sets with a clear structure and with no more than 25% of the data missing.
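The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it swaps in a simple Späth-style alternating heuristic (fit one regression per cluster, then reassign points to the best-fitting regression) and assumes a single target column with missing values and fully observed features. The function names `fit_clr` and `impute_column`, the parameter defaults, and the nearest-neighbour rule for assigning incomplete rows to a cluster are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_clr(X, y, k=2, n_iter=20, seed=0):
    """Alternating clusterwise linear regression (illustrative heuristic)."""
    rng = np.random.default_rng(seed)
    A = np.column_stack([X, np.ones(len(X))])   # features plus intercept column
    labels = rng.integers(k, size=len(X))       # random initial partition
    coefs = np.zeros((k, A.shape[1]))
    for _ in range(n_iter):
        for j in range(k):
            mask = labels == j
            if mask.sum() >= A.shape[1]:        # refit only with enough points
                coefs[j] = np.linalg.lstsq(A[mask], y[mask], rcond=None)[0]
        # reassign every point to the regression with the smallest squared error
        resid = (A @ coefs.T - y[:, None]) ** 2
        new_labels = resid.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return coefs, labels

def impute_column(X, y, k=2):
    """Fill NaNs in y using the regression of the most similar complete row's cluster."""
    obs = ~np.isnan(y)
    coefs, labels = fit_clr(X[obs], y[obs], k)
    y_filled = y.copy()
    for i in np.flatnonzero(~obs):
        # assumption: the nearest complete row (Euclidean) decides the cluster
        nearest = np.argmin(np.linalg.norm(X[obs] - X[i], axis=1))
        y_filled[i] = np.append(X[i], 1.0) @ coefs[labels[nearest]]
    return y_filled
```

Calling `impute_column(X, y, k=2)` with `X` as an `(n, d)` array and `y` as a length-`n` array containing `np.nan` returns a copy of `y` with the gaps filled. The alternating scheme can stop in a local minimum, so results depend on the initial partition.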
- Relation
- IEEE transactions on knowledge and data engineering Vol. 34, no. 4 (2020), p. 1889-1901
- Rights
- Copyright © 2020 IEEE
- Rights
- This metadata is freely available under a CC0 license
- Subject
- 08 Information and Computing Sciences; Data Analysis; Incomplete Data; Imputation; Clusterwise Linear Regression; Nonsmooth Optimization; Engineering; Computer Science