Incremental DC optimization algorithm for large-scale clusterwise linear regression
- Authors: Bagirov, Adil , Taheri, Sona , Cimen, Emre
- Date: 2021
- Type: Text , Journal article
- Relation: Journal of Computational and Applied Mathematics Vol. 389, no. (2021), p. 1-17
- Relation: https://purl.org/au-research/grants/arc/DP190100580
- Full Text: false
- Reviewed:
- Description: The objective function in the nonsmooth optimization model of the clusterwise linear regression (CLR) problem with the squared regression error is represented as a difference of two convex functions. Then using the difference of convex algorithm (DCA) approach the CLR problem is replaced by the sequence of smooth unconstrained optimization subproblems. A new algorithm based on the DCA and the incremental approach is designed to solve the CLR problem. We apply the Quasi-Newton method to solve the subproblems. The proposed algorithm is evaluated using several synthetic and real-world data sets for regression and compared with other algorithms for CLR. Results demonstrate that the DCA based algorithm is efficient for solving CLR problems with the large number of data points and in particular, outperforms other algorithms when the number of input variables is small. © 2020 Elsevier B.V.
Multi-source cyber-attacks detection using machine learning
- Authors: Taheri, Sona , Gondal, Iqbal , Bagirov, Adil , Harkness, Greg , Brown, Simon , Chi, Chihung
- Date: 2019
- Type: Text , Conference proceedings , Conference paper
- Relation: 2019 IEEE International Conference on Industrial Technology, ICIT 2019; Melbourne, Australia; 13th-15th February 2019 Vol. 2019-February, p. 1167-1172
- Full Text:
- Reviewed:
- Description: The Internet of Things (IoT) has significantly increased the number of devices connected to the Internet ranging from sensors to multi-source data information. As the IoT continues to evolve with new technologies number of threats and attacks against IoT devices are on the increase. Analyzing and detecting these attacks originating from different sources needs machine learning models. These models provide proactive solutions for detecting attacks and their sources. In this paper, we propose to apply a supervised machine learning classification technique to identify cyber-attacks from each source. More precisely, we apply the incremental piecewise linear classifier that constructs boundary between sources/classes incrementally starting with one hyperplane and adding more hyperplanes at each iteration. The algorithm terminates when no further significant improvement of the separation of sources/classes is possible. The construction and usage of piecewise linear boundaries allows us to avoid any possible overfitting. We apply the incremental piecewise linear classifier on the multi-source real world cyber security data set to identify cyber-attacks and their sources.
- Description: Proceedings of the IEEE International Conference on Industrial Technology
New diagonal bundle method for clustering problems in large data sets
- Authors: Karmitsa, Napsu , Bagirov, Adil , Taheri, Sona
- Date: 2017
- Type: Text , Journal article
- Relation: European Journal of Operational Research Vol. 263, no. 2 (2017), p. 367-379
- Relation: http://purl.org/au-research/grants/arc/DP140103213
- Full Text: false
- Reviewed:
- Description: Clustering is one of the most important tasks in data mining. Recent developments in computer hardware allow us to store in random access memory (RAM) and repeatedly read data sets with hundreds of thousands and even millions of data points. This makes it possible to use conventional clustering algorithms in such data sets. However, these algorithms may need prohibitively large computational time and fail to produce accurate solutions. Therefore, it is important to develop clustering algorithms which are accurate and can provide real time clustering in large data sets. This paper introduces one of them. Using nonsmooth optimization formulation of the clustering problem the objective function is represented as a difference of two convex (DC) functions. Then a new diagonal bundle algorithm that explicitly uses this structure is designed and combined with an incremental approach to solve this problem. The method is evaluated using real world data sets with both large number of attributes and large number of data points. The proposed method is compared with two other clustering algorithms using numerical results. © 2017 Elsevier B.V.