Multi-label learning with emerging new labels
- Authors: Zhu, Yue; Ting, Kaiming; Zhou, Zhi-Hua
- Date: 2018
- Type: Text, Journal article
- Relation: IEEE Transactions on Knowledge and Data Engineering Vol. 30, no. 10 (2018), p. 1901-1914
- Description: In a multi-label learning task, an object possesses multiple concepts, each represented by a class label. Previous studies on multi-label learning have focused on a fixed set of class labels, i.e., the class label set of the test data is the same as that of the training set. In many applications, however, the environment is dynamic and new concepts may emerge in a data stream. To maintain good predictive performance in such an environment, a multi-label learning method must be able to detect and classify instances with emerging new labels. To this end, we propose a new approach called Multi-label learning with Emerging New Labels (MuENL). It has three functions: classifying instances on currently known labels, detecting the emergence of a new label, and constructing a new classifier for each new label that works collaboratively with the classifier for known labels. In addition, we show that MuENL can be easily extended to handle sparse high-dimensional data streams by reducing the original dimensionality and then applying MuENL in the reduced-dimensional space. Our empirical evaluation shows the effectiveness of MuENL on several benchmark datasets and of its high-dimensional extension, MuENLHD, on the sparse high-dimensional Weibo dataset.
Missing value imputation via clusterwise linear regression
- Authors: Karmitsa, Napsu; Taheri, Sona; Bagirov, Adil; Makinen, Pauliina
- Date: 2022
- Type: Text, Journal article
- Relation: IEEE Transactions on Knowledge and Data Engineering Vol. 34, no. 4 (2022), p. 1889-1901
- Full Text: false
- Description: In this paper, a new method for preprocessing incomplete data is introduced. The method is based on clusterwise linear regression and combines two well-known approaches to missing value imputation: linear regression and clustering. The idea is to approximate missing values using only those data points that are somewhat similar to the incomplete data point. A similar idea is also used in clustering-based imputation methods. Here, however, linear regression is applied within each cluster to predict the missing values accurately, and this is done simultaneously with the clustering. The proposed method is tested on synthetic and real-world data sets and compared with other algorithms for missing value imputation. Numerical results demonstrate that the proposed method produces the most accurate imputations in missing completely at random (MCAR) and missing at random (MAR) data sets that have a clear structure and no more than 25% missing data.
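The idea of imputing from similar points with a per-cluster regression can be illustrated with a simplified two-stage sketch. This is not the authors' algorithm, which solves the clustering and regression problems simultaneously; here a plain k-means step is followed by a separate least-squares fit in each cluster. The function names (`kmeans_labels`, `fit_linear`, `impute_clusterwise`) are hypothetical, and the sketch assumes that only one column contains missing values (encoded as NaN) and that all other columns are fully observed.

```python
import numpy as np


def kmeans_labels(Z, n_clusters, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [Z[0]]
    for _ in range(1, n_clusters):
        # Squared distance of every point to its nearest chosen centre.
        d = np.min([np.sum((Z - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):  # keep the old centre if a cluster empties
                centers[k] = Z[labels == k].mean(axis=0)
    return labels


def fit_linear(Z, y):
    """Least-squares fit of y ~ Z with an intercept; returns [weights..., bias]."""
    A = np.column_stack([Z, np.ones(len(Z))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def impute_clusterwise(X, target_col, n_clusters=2):
    """Fill NaNs in one column using a separate linear model per cluster.

    Assumption: every column other than `target_col` is fully observed.
    """
    X = np.asarray(X, dtype=float).copy()
    pred_cols = [j for j in range(X.shape[1]) if j != target_col]
    Z, y = X[:, pred_cols], X[:, target_col]
    missing = np.isnan(y)
    labels = kmeans_labels(Z, n_clusters)          # cluster on observed columns
    global_coef = fit_linear(Z[~missing], y[~missing])
    for k in range(n_clusters):
        train = (labels == k) & ~missing
        fill = (labels == k) & missing
        if not fill.any():
            continue
        # Fall back to the global model if the cluster has too few complete rows.
        coef = fit_linear(Z[train], y[train]) if train.sum() > Z.shape[1] else global_coef
        A = np.column_stack([Z[fill], np.ones(fill.sum())])
        X[fill, target_col] = A @ coef
    return X
```

Calling `impute_clusterwise(X, target_col=1, n_clusters=2)` on a two-column array returns a copy in which the NaN entries of column 1 are replaced by predictions from the regression fitted in the row's own cluster. Fitting clusters and regressions jointly, as the abstract describes, would let the regression error influence cluster membership, which this two-stage sketch does not do.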