This paper introduces a new method for preprocessing incomplete data. The method is based on clusterwise linear regression and combines two well-known approaches to missing value imputation: linear regression and clustering. The idea is to approximate missing values using only those data points that are similar to the incomplete data point. Clustering-based imputation methods rely on a similar idea. Here, however, linear regression is applied within each cluster to predict the missing values, and the regression models are fitted simultaneously with the clustering. The proposed method is tested on synthetic and real-world data sets and compared with other algorithms for missing value imputation. Numerical results demonstrate that the proposed method produces the most accurate imputations on missing completely at random (MCAR) and missing at random (MAR) data sets that have a clear structure and in which no more than 25% of the data are missing.
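To make the general idea concrete, the following is a minimal sketch of imputation with clusterwise linear regression. It is not the authors' algorithm: it uses a simple alternating heuristic (refit one least-squares model per cluster, then reassign each point to the best-fitting model) and assumes that only one column contains missing values. The function names `fit_clusterwise_regression` and `impute_column`, and the rule of assigning an incomplete point to the cluster with the nearest centroid, are illustrative assumptions.

```python
import numpy as np


def fit_clusterwise_regression(X, y, k=2, n_iter=50, seed=0):
    """Alternate between per-cluster least-squares fits and reassignment.

    Illustrative heuristic only; the paper's own optimisation scheme
    is not reproduced here.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    A = np.column_stack([X, np.ones(n)])   # append an intercept column
    labels = rng.integers(0, k, size=n)    # random initial partition
    coefs = np.zeros((k, A.shape[1]))
    for _ in range(n_iter):
        for j in range(k):
            mask = labels == j
            # Refit only when the cluster has enough points for a fit.
            if mask.sum() >= A.shape[1]:
                coefs[j] = np.linalg.lstsq(A[mask], y[mask], rcond=None)[0]
        # Reassign each point to the model with the smallest squared residual.
        residuals = (A @ coefs.T - y[:, None]) ** 2
        new_labels = residuals.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return coefs, labels


def impute_column(data, target, k=2):
    """Fill NaNs in column `target`; all other columns are assumed complete."""
    data = np.asarray(data, dtype=float)
    missing = np.isnan(data[:, target])
    features = [c for c in range(data.shape[1]) if c != target]
    X_obs = data[~missing][:, features]
    y_obs = data[~missing, target]
    coefs, labels = fit_clusterwise_regression(X_obs, y_obs, k=k)

    # Assumption: an incomplete point joins the cluster whose centroid
    # (over the observed features) is nearest. Empty clusters get an
    # infinitely distant centroid so they are never selected.
    centroids = np.array([
        X_obs[labels == j].mean(axis=0) if np.any(labels == j)
        else np.full(len(features), np.inf)
        for j in range(k)
    ])
    X_mis = data[missing][:, features]
    dists = ((X_mis[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)

    # Predict each missing value with the regression model of its cluster.
    A_mis = np.column_stack([X_mis, np.ones(len(X_mis))])
    filled = data.copy()
    filled[missing, target] = np.einsum("ij,ij->i", A_mis, coefs[nearest])
    return filled
```

With `k=1` the sketch reduces to ordinary regression imputation; larger `k` lets each group of similar points have its own linear model, which is the property the abstract attributes to the proposed method.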
Partial undersampling of imbalanced data for cyber threats detection
The non-smooth and bi-objective team orienteering problem with soft constraints
A simulated annealing-based maximum-margin clustering algorithm
Multi-source cyber-attacks detection using machine learning