Bundle enrichment method for nonsmooth difference of convex programming problems
- Authors: Gaudioso, Manlio, Taheri, Sona, Bagirov, Adil, Karmitsa, Napsu
- Date: 2023
- Type: Text , Journal article
- Relation: Algorithms Vol. 16, no. 8 (2023), p.
- Relation: http://purl.org/au-research/grants/arc/DP190100580
- Full Text:
- Reviewed:
- Description: The Bundle Enrichment Method (BEM-DC) is introduced for solving nonsmooth difference of convex (DC) programming problems. The novelty of the method consists of the dynamic management of the bundle. More specifically, a DC model, being the difference of two convex piecewise affine functions, is formulated. The (global) minimization of the model is tackled by solving a set of convex problems whose cardinality depends on the number of linearizations adopted to approximate the second DC component function. The new bundle management policy distributes the information coming from previous iterations to separately model the DC components of the objective function. Such a distribution is driven by the sign of linearization errors. If the displacement suggested by the model minimization provides no sufficient decrease of the objective function, then the temporary enrichment of the cutting plane approximation of just the first DC component function takes place until either the termination of the algorithm is certified or a sufficient decrease is achieved. The convergence of the BEM-DC method is studied, and computational results on a set of academic test problems with nonsmooth DC objective functions are provided. © 2023 by the authors.
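The "convex piecewise affine functions" in the abstract are cutting-plane models built from linearizations. A minimal sketch of that idea (the function, bundle points, and code below are illustrative only, not the BEM-DC algorithm):

```python
# Cutting-plane sketch: approximate the convex function f(x) = x**2 from
# below by the maximum of its linearizations at a few bundle points.
def cutting_plane_model(bundle_points):
    # Linearization at x0: f(x0) + f'(x0)*(x - x0) = 2*x0*x - x0**2.
    cuts = [(2.0 * x0, -x0 * x0) for x0 in bundle_points]  # (slope, intercept)
    return lambda x: max(a * x + b for a, b in cuts)

model = cutting_plane_model([-1.0, 0.0, 2.0])
```

The model is exact at the bundle points and never exceeds the true function, which is the property bundle methods exploit when they enrich the bundle with further linearizations.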
Methods and applications of clusterwise linear regression : a survey and comparison
- Authors: Long, Qiang, Bagirov, Adil, Taheri, Sona, Sultanova, Nargiz, Wu, Xue
- Date: 2023
- Type: Text , Journal article
- Relation: ACM Transactions on Knowledge Discovery from Data Vol. 17, no. 3 (2023), p.
- Relation: https://purl.org/au-research/grants/arc/DP190100580
- Full Text:
- Reviewed:
- Description: Clusterwise linear regression (CLR) is a well-known technique for approximating data using more than one linear function. It is based on the combination of clustering and multiple linear regression methods. This article provides a comprehensive survey and comparative assessment of CLR, including model formulations, descriptions of algorithms, and their performance on small- to large-scale synthetic and real-world datasets. Some applications of the CLR algorithms and possible future research directions are also discussed. © 2023 Association for Computing Machinery.
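The combination of clustering and regression described above can be sketched as a simple alternating heuristic. This toy code (not any specific surveyed algorithm; the data and initial lines are made up) assigns each point to the line with the smallest residual, then refits each line by least squares:

```python
# Illustrative CLR sketch: alternate point-to-line assignment and
# per-cluster ordinary least squares.
def fit_line(pts):
    # Ordinary least squares for y = a*x + b over (x, y) pairs.
    n = len(pts)
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts)
    sxy = sum((x - mx) * (y - my) for x, y in pts)
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx

def clr_objective(points, lines):
    # Sum over points of the squared residual to the best-fitting line.
    return sum(min((y - a * x - b) ** 2 for a, b in lines) for x, y in points)

def clr(points, lines, iters=20):
    for _ in range(iters):
        groups = [[] for _ in lines]
        for x, y in points:
            best = min(range(len(lines)),
                       key=lambda j: (y - lines[j][0] * x - lines[j][1]) ** 2)
            groups[best].append((x, y))
        lines = [fit_line(g) if g else lines[k] for k, g in enumerate(groups)]
    return lines
```

Both steps can only decrease the objective, so the heuristic converges, though (as the survey discusses for CLR in general) only to a local solution.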
Nonsmooth optimization-based hyperparameter-free neural networks for large-scale regression
- Authors: Karmitsa, Napsu, Taheri, Sona, Joki, Kaisa, Paasivirta, Pauliina, Defterdarovic, J., Bagirov, Adil, Mäkelä, Marko
- Date: 2023
- Type: Text , Journal article
- Relation: Algorithms Vol. 16, no. 9 (2023), p.
- Relation: http://purl.org/au-research/grants/arc/DP190100580
- Full Text:
- Reviewed:
- Description: In this paper, a new nonsmooth optimization-based algorithm for solving large-scale regression problems is introduced. The regression problem is modeled as fully-connected feedforward neural networks with one hidden layer, piecewise linear activation, and the (Formula presented.) -loss functions. A modified version of the limited memory bundle method is applied to minimize this nonsmooth objective. In addition, a novel constructive approach for automated determination of the proper number of hidden nodes is developed. Finally, large real-world data sets are used to evaluate the proposed algorithm and to compare it with some state-of-the-art neural network algorithms for regression. The results demonstrate the superiority of the proposed algorithm as a predictive tool in most data sets used in numerical experiments. © 2023 by the authors.
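A minimal sketch of the model family described above: one hidden layer with a piecewise linear (ReLU) activation. The abstract's "(Formula presented.)" placeholder hides the exact loss, so an L1 loss appears here only as a stand-in, and the weights are fixed by hand rather than trained by the paper's bundle method:

```python
# One-hidden-layer network with ReLU activation and an L1-type loss
# (illustrative stand-in; weights are hand-picked, not trained).
def predict(x, w1, b1, w2, b2):
    # Hidden layer: ReLU of affine maps; output: linear combination.
    hidden = [max(0.0, sum(wi * xi for wi, xi in zip(row, x)) + bi)
              for row, bi in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

def l1_loss(data, params):
    return sum(abs(y - predict(x, *params)) for x, y in data)

# With these weights the network computes |x| exactly.
params = ([[1.0], [-1.0]], [0.0, 0.0], [1.0, 1.0], 0.0)
```

Because both the ReLU activation and the absolute-value loss are nonsmooth, the resulting training objective is nonsmooth as well, which is why a nonsmooth solver is needed.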
High activity and high functional connectivity are mutually exclusive in resting state zebrafish and human brains
- Authors: Zarei, Mahdi, Xie, Dan, Jiang, Fei, Bagirov, Adil, Huang, Bo, Raj, Ashish, Nagarajan, Srikantan, Guo, Su
- Date: 2022
- Type: Text , Journal article
- Relation: BMC Biology Vol. 20, no. 1 (2022), p. 84-84
- Full Text:
- Reviewed:
- Description: The structural connectivity of neurons in the brain allows active neurons to impact the physiology of target neuron types with which they are functionally connected. While the structural connectome is at the basis of the functional connectome, it is the functional connectivity measured through correlations between time series of individual neurophysiological events that underlies behavioral and mental states. However, in light of the diverse neuronal cell types populating the brain and their unique connectivity properties, both neuronal activity and functional connectivity are heterogeneous across the brain, and the nature of their relationship is not clear. Here, we employ brain-wide calcium imaging at cellular resolution in larval zebrafish to understand the principles of resting state functional connectivity. We recorded the spontaneous activity of >12,000 neurons in the awake resting state forebrain. By classifying their activity (i.e., variances of ΔF/F across time) and functional connectivity into three levels (high, medium, low), we find that highly active neurons have few functional connections and highly connected neurons show low activity. This finding holds true when neuronal activity and functional connectivity data are classified into five instead of three levels, and in whole-brain spontaneous activity datasets. Moreover, such an activity-connectivity relationship is not observed in randomly shuffled, noise-added, or simulated datasets, suggesting that it reflects an intrinsic brain network property. Intriguingly, deploying the same analytical tools on functional magnetic resonance imaging (fMRI) data from the resting state human brain, we uncover a similar relationship between activity (signal variance over time) and functional connectivity, that is, regions of high activity are non-overlapping with those of high connectivity.
We found a mutually exclusive relationship between high activity (signal variance over time) and high functional connectivity of neurons in zebrafish and human brains. These findings reveal a previously unknown and evolutionarily conserved brain organizational principle, which has implications for understanding disease states and designing artificial neuronal networks.
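The level-based analysis described above can be sketched in a few lines: rank units by activity (variance over time) and by connectivity, take the top tertile of each ranking, and check the overlap. The numbers below are fabricated for illustration:

```python
# Toy tertile-overlap check (fabricated data, not the study's measurements).
def top_tertile(values):
    # Indices of the highest third of values.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return set(order[: len(values) // 3])

activity = [9.0, 8.0, 7.0, 1.0, 2.0, 3.0, 0.5, 0.6, 0.7]
connectivity = [0.1, 0.2, 0.1, 0.9, 0.8, 0.7, 0.3, 0.2, 0.4]
overlap = top_tertile(activity) & top_tertile(connectivity)
```

In this toy example the high-activity and high-connectivity sets are disjoint, mirroring the mutually exclusive pattern the study reports.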
Aggregate subgradient method for nonsmooth DC optimization
- Authors: Bagirov, Adil, Taheri, Sona, Joki, Kaisa, Karmitsa, Napsu, Mäkelä, Marko
- Date: 2021
- Type: Text , Journal article
- Relation: Optimization Letters Vol. 15, no. 1 (2021), p. 83-96
- Relation: http://purl.org/au-research/grants/arc/DP190100580
- Full Text:
- Reviewed:
- Description: The aggregate subgradient method is developed for solving unconstrained nonsmooth difference of convex (DC) optimization problems. The proposed method shares some similarities with both the subgradient and the bundle methods. Aggregate subgradients are defined as a convex combination of subgradients computed at null steps between two serious steps. At each iteration search directions are found using only two subgradients: the aggregate subgradient and a subgradient computed at the current null step. It is proved that the proposed method converges to a critical point of the DC optimization problem and also that the number of null steps between two serious steps is finite. The new method is tested using some academic test problems and compared with several other nonsmooth DC optimization solvers. © 2020, Springer-Verlag GmbH Germany, part of Springer Nature.
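The idea of combining past subgradients into an aggregate can be shown on a toy problem. This sketch is far simpler than the paper's method (no null/serious step logic): it minimizes the nonsmooth convex function f(x) = max(x, -2x) by stepping along a running convex combination of subgradients, with an illustrative mixing weight and step rule:

```python
# Toy subgradient-aggregation descent on f(x) = max(x, -2x), minimized at 0.
def f(x):
    return max(x, -2.0 * x)

def subgrad(x):
    # A valid subgradient of f at x (1 on the right branch, -2 on the left).
    return 1.0 if x >= 0 else -2.0

def aggregate_subgradient_descent(x, iters=200, mix=0.5):
    g_agg = subgrad(x)
    best_x, best_f = x, f(x)
    for k in range(1, iters + 1):
        x = x - (1.0 / k) * g_agg                        # diminishing step
        g_agg = mix * g_agg + (1.0 - mix) * subgrad(x)   # aggregation
        if f(x) < best_f:
            best_x, best_f = x, f(x)
    return best_x, best_f
```

Tracking the best point found is standard for subgradient-type methods, since individual steps need not decrease the objective.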
Malware variant identification using incremental clustering
- Authors: Black, Paul, Gondal, Iqbal, Bagirov, Adil, Moniruzzaman, Md
- Date: 2021
- Type: Text , Journal article
- Relation: Electronics Vol. 10, no. 14 (2021), p.
- Relation: http://purl.org/au-research/grants/arc/DP190100580
- Full Text:
- Reviewed:
Clusterwise support vector linear regression
- Authors: Joki, Kaisa, Bagirov, Adil, Karmitsa, Napsu, Mäkelä, Marko, Taheri, Sona
- Date: 2020
- Type: Text , Journal article
- Relation: European Journal of Operational Research Vol. 287, no. 1 (2020), p. 19-35
- Full Text:
- Reviewed:
- Description: In clusterwise linear regression (CLR), the aim is to simultaneously partition data into a given number of clusters and to find regression coefficients for each cluster. In this paper, we propose a novel approach to model and solve the CLR problem. The main idea is to utilize the support vector machine (SVM) approach to model the CLR problem by using the SVM for regression to approximate each cluster. This new formulation of the CLR problem is represented as an unconstrained nonsmooth optimization problem, where we minimize a difference of two convex (DC) functions. To solve this problem, a method based on the combination of the incremental algorithm and the double bundle method for DC optimization is designed. Numerical experiments are performed to validate the reliability of the new formulation for CLR and the efficiency of the proposed method. The results show that the SVM approach is suitable for solving CLR problems, especially when there are outliers in the data. © 2020 Elsevier B.V.
- Description: Funding details: Academy of Finland, 289500, 294002, 319274 Funding details: Turun Yliopisto Funding details: Australian Research Council, ARC, (Project no. DP190100580 ).
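Support vector regression, which the formulation above applies within each cluster, rests on the epsilon-insensitive loss: residuals smaller than a tolerance eps cost nothing, which is what makes the approach robust to outliers. A one-line sketch (eps and the sample values are illustrative):

```python
# Epsilon-insensitive loss used in support vector regression:
# zero inside the eps-tube, linear outside it.
def eps_insensitive(y_true, y_pred, eps=0.5):
    return max(0.0, abs(y_true - y_pred) - eps)
```

The max of zero and an affine expression is convex but nonsmooth, which is one source of the DC structure the paper exploits.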
Cyberattack triage using incremental clustering for intrusion detection systems
- Authors: Taheri, Sona, Bagirov, Adil, Gondal, Iqbal, Brown, Simon
- Date: 2020
- Type: Text , Journal article
- Relation: International Journal of Information Security Vol. 19, no. 5 (2020), p. 597-607
- Relation: http://purl.org/au-research/grants/arc/DP190100580
- Full Text:
- Reviewed:
- Description: Intrusion detection systems (IDSs) are devices or software applications that monitor networks or systems for malicious activities and signal alerts/alarms when such activity is discovered. However, an IDS may generate many false alerts which affect its accuracy. In this paper, we develop a cyberattack triage algorithm to detect these alerts (so-called outliers). The proposed algorithm is designed using clustering, optimization, and distance-based approaches. An optimization-based incremental clustering algorithm is proposed to find clusters of different types of cyberattacks. Using a special procedure, a set of clusters is divided into two subsets: normal and stable clusters. Then, outliers are found among stable clusters using an average distance between centroids of normal clusters. The proposed algorithm is evaluated using the well-known IDS data sets—Knowledge Discovery and Data mining Cup 1999 and UNSW-NB15—and compared with some other existing algorithms. Results show that the proposed algorithm has a high detection accuracy and its false negative rate is very low. © 2019, Springer-Verlag GmbH Germany, part of Springer Nature.
- Description: This research was conducted in Internet Commerce Security Laboratory (ICSL) funded by Westpac Banking Corporation Australia. In addition, the research by Dr. Sona Taheri and A/Prof. Adil Bagirov was supported by the Australian Government through the Australian Research Council’s Discovery Projects funding scheme (DP190100580).
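The triage step described above can be sketched with a simplified rule (details differ from the paper; min_size and factor are hypothetical thresholds): split clusters into "normal" (large) and "stable" (small), then flag a small cluster when its centroid is far from every normal centroid relative to the normal clusters' average spread:

```python
import math

# Simplified centroid-distance triage sketch (illustrative thresholds).
def flag_outlier_clusters(centroids, sizes, min_size=5, factor=2.0):
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    normal = [c for c, s in zip(centroids, sizes) if s >= min_size]
    # Average pairwise distance among normal centroids ("spread").
    pairs = [dist(a, b) for i, a in enumerate(normal) for b in normal[i + 1:]]
    spread = sum(pairs) / len(pairs)
    return [i for i, (c, s) in enumerate(zip(centroids, sizes))
            if s < min_size and min(dist(c, n) for n in normal) > factor * spread]
```

A small cluster near the normal ones is treated as ordinary traffic; only a small, distant cluster is triaged as an outlier.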
The non-smooth and bi-objective team orienteering problem with soft constraints
- Authors: Estrada-Moreno, Alejandro, Ferrer, Albert, Juan, Angel, Panadero, Javier, Bagirov, Adil
- Date: 2020
- Type: Text , Journal article
- Relation: Mathematics Vol. 8, no. 9 (2020), p.
- Full Text:
- Reviewed:
- Description: In the classical team orienteering problem (TOP), a fixed fleet of vehicles is employed, each of them with a limited driving range. The manager has to decide about the subset of customers to visit, as well as the visiting order (routes). Each customer offers a different reward, which is gathered the first time that it is visited. The goal is then to maximize the total reward collected without exceeding the driving range constraint. This paper analyzes a more realistic version of the TOP in which the driving range limitation is considered as a soft constraint: every time that this range is exceeded, a penalty cost is triggered. This cost is modeled as a piece-wise function, which depends on factors such as the distance of the vehicle to the destination depot. As a result, the traditional reward-maximization objective becomes a non-smooth function. In addition, a second objective, regarding the design of balanced routing plans, is considered as well. A mathematical model for this non-smooth and bi-objective TOP is provided, and a biased-randomized algorithm is proposed as a solving approach. © 2020 by the authors.
- Description: This work has been partially supported by the Spanish Ministry of Economy and Competitiveness & FEDER (SEV-2015-0563), the Spanish Ministry of Science (PID2019-111100RB-C21, RED2018-102642-T), and the Erasmus+ Program (2019-I-ES01-KA103-062602).
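The soft driving-range constraint can be illustrated with a simplified stand-in for the paper's penalty (in the paper the penalty also depends on factors such as distance to the depot; the rates and kink below are hypothetical): a route's reward minus a piecewise linear cost that grows with the range excess, making the score nonsmooth at the breakpoints:

```python
# Simplified soft-range scoring: zero penalty within range, then a
# piecewise linear penalty with a steeper second segment.
def route_score(reward, length, max_range, rate1=1.0, rate2=3.0, kink=5.0):
    excess = max(0.0, length - max_range)
    if excess <= kink:
        penalty = rate1 * excess
    else:
        penalty = rate1 * kink + rate2 * (excess - kink)
    return reward - penalty
```

The kinks at zero excess and at the breakpoint are exactly what makes the objective non-smooth, as the abstract notes.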
A simulated annealing-based maximum-margin clustering algorithm
- Authors: Seifollahi, Sattar, Bagirov, Adil, Borzeshi, Ehsan, Piccardi, Massimo
- Date: 2019
- Type: Text , Journal article
- Relation: Computational Intelligence Vol. 35, no. 1 (2019), p. 23-41
- Full Text:
- Reviewed:
- Description: Maximum-margin clustering is an extension of the support vector machine (SVM) to clustering. It partitions a set of unlabeled data into multiple groups by finding hyperplanes with the largest margins. Although existing algorithms have shown promising results, there is no guarantee of convergence of these algorithms to global solutions due to the nonconvexity of the optimization problem. In this paper, we propose a simulated annealing-based algorithm that is able to mitigate the issue of local minima in the maximum-margin clustering problem. The novelty of our algorithm is twofold, i.e., (i) it comprises a comprehensive cluster modification scheme based on simulated annealing, and (ii) it introduces a new approach based on the combination of k-means++ and SVM at each step of the annealing process. More precisely, k-means++ is initially applied to extract subsets of the data points. Then, an unsupervised SVM is applied to improve the clustering results. Experimental results on various benchmark data sets (of up to over a million points) give evidence that the proposed algorithm is more effective at solving the clustering problem than a number of popular clustering algorithms.
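The annealing loop at the heart of such methods can be sketched on a toy clustering problem. Here the objective is plain within-cluster sum of squares, a stand-in for the max-margin criterion, and all parameters (temperature, cooling rate, step budget) are illustrative:

```python
import math
import random

def wcss(points, labels, k):
    # Within-cluster sum of squares (stand-in objective).
    cost = 0.0
    for j in range(k):
        members = [p for p, l in zip(points, labels) if l == j]
        if not members:
            continue
        cx = sum(x for x, _ in members) / len(members)
        cy = sum(y for _, y in members) / len(members)
        cost += sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in members)
    return cost

def anneal(points, k, steps=2000, t0=1.0, cooling=0.995, seed=0):
    rng = random.Random(seed)
    labels = [0] * len(points)              # deterministic start
    cost = best_cost = wcss(points, labels, k)
    best = list(labels)
    t = t0
    for _ in range(steps):
        i = rng.randrange(len(points))
        old = labels[i]
        labels[i] = rng.randrange(k)        # propose relabeling one point
        new_cost = wcss(points, labels, k)
        # Accept improvements always; accept worse moves with a probability
        # that shrinks as the temperature cools.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / t):
            cost = new_cost
            if cost < best_cost:
                best_cost, best = cost, list(labels)
        else:
            labels[i] = old                 # reject: restore the label
        t *= cooling
    return best, best_cost
```

Occasionally accepting uphill moves is what lets annealing escape the local minima that the abstract identifies as the weakness of purely greedy schemes.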
Double bundle method for finding Clarke stationary points in nonsmooth DC programming
- Authors: Joki, Kaisa, Bagirov, Adil, Karmitsa, Napsu, Mäkelä, Marko, Taheri, Sona
- Date: 2018
- Type: Text , Journal article
- Relation: SIAM Journal on Optimization Vol. 28, no. 2 (2018), p. 1892-1919
- Relation: http://purl.org/au-research/grants/arc/DP140103213
- Full Text:
- Reviewed:
- Description: The aim of this paper is to introduce a new proximal double bundle method for unconstrained nonsmooth optimization, where the objective function is presented as a difference of two convex (DC) functions. The novelty in our method is a new escape procedure which enables us to guarantee approximate Clarke stationarity for solutions by utilizing the DC components of the objective function. This optimality condition is stronger than the criticality condition typically used in DC programming. Moreover, if a candidate solution is not approximate Clarke stationary, then the escape procedure returns a descent direction. With this escape procedure, we can avoid some shortcomings encountered when criticality is used. The finite termination of the double bundle method to an approximate Clarke stationary point is proved by assuming that the subdifferentials of DC components are polytopes. Finally, some encouraging numerical results are presented.
Diagnostic with incomplete nominal/discrete data
- Authors: Jelinek, Herbert, Yatsko, Andrew, Stranieri, Andrew, Venkatraman, Sitalakshmi, Bagirov, Adil
- Date: 2015
- Type: Text , Journal article
- Relation: Artificial Intelligence Research Vol. 4, no. 1 (2015), p. 22-35
- Full Text:
- Reviewed:
- Description: Missing values may be present in data without undermining its use for diagnostic/classification purposes, but they compromise the application of readily available software. Surrogate entries can remedy the situation, although the outcome is generally unknown. Discretization of continuous attributes renders all data nominal and is helpful in dealing with missing values; in particular, no special handling is required for different attribute types. A number of classifiers exist or can be reformulated for this representation, and some classifiers can be reinvented as data completion methods. In this work the Decision Tree, Nearest Neighbour, and Naive Bayesian methods are demonstrated to have the required aptness. An approach is implemented whereby the entered missing values are not necessarily a close match of the true data; rather, they are intended to cause the least hindrance to classification. The proposed techniques find their application particularly in medical diagnostics. Where clinical data represent a number of related conditions, taking the Cartesian product of the class values of the underlying sub-problems allows narrowing down the selection of missing-value substitutes. Real-world data examples, some publicly available, are enlisted for testing. The proposed and benchmark methods are compared by classifying the data before and after missing-value imputation, indicating a significant improvement.
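The idea of reinventing a classifier as a data completion method can be illustrated with a minimal, Naive-Bayes-flavoured sketch: each missing nominal value is replaced by the value most frequently observed for the record's own class, so the substitute hinders classification as little as possible. All function and variable names here are hypothetical illustrations, not taken from the paper:

```python
from collections import Counter

def impute_least_hindrance(records, labels, missing=None):
    """Replace each missing nominal attribute with the value most often
    seen for the record's own class label -- a minimal stand-in for the
    idea that imputed values should hinder classification least."""
    n_attrs = len(records[0])
    # class-conditional value counts, one dict of Counters per attribute
    counts = [{} for _ in range(n_attrs)]
    for rec, lab in zip(records, labels):
        for j, v in enumerate(rec):
            if v is not missing:
                counts[j].setdefault(lab, Counter())[v] += 1
    completed = []
    for rec, lab in zip(records, labels):
        new = list(rec)
        for j, v in enumerate(rec):
            if v is missing and lab in counts[j]:
                new[j] = counts[j][lab].most_common(1)[0][0]
        completed.append(new)
    return completed
```

In a full Naive Bayesian completion the substitute would maximise the posterior over all attributes jointly; the per-attribute mode used here is the simplest special case.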
An L-2-Boosting Algorithm for Estimation of a Regression Function
- Bagirov, Adil, Clausen, Conny, Kohler, Michael
- Authors: Bagirov, Adil , Clausen, Conny , Kohler, Michael
- Date: 2010
- Type: Text , Journal article
- Relation: IEEE Transactions on Information Theory Vol. 56, no. 3 (2010), p. 1417-1429
- Full Text:
- Reviewed:
- Description: An L-2-boosting algorithm for estimation of a regression function from random design is presented, which consists of repeatedly fitting a function from a fixed nonlinear function space to the residuals of the data by least squares and defining the estimate as a linear combination of the resulting least squares estimates. Splitting of the sample is used to decide after how many iterations of smoothing of the residuals the algorithm terminates. The rate of convergence of the algorithm is analyzed in the case of an unbounded response variable. The method is used to fit a sum of maxima of minima of linear functions to a given data set, and is compared with other nonparametric regression estimates using simulated data.
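The boosting loop described above — repeatedly fitting a base learner to the current residuals by least squares and summing the shrunken fits — can be sketched as follows. A one-split regression stump stands in for the paper's max-min-of-linear-functions base class, and the shrinkage factor is an assumption made for the illustration:

```python
import numpy as np

def l2_boost(x, y, n_iter=50, shrinkage=0.5):
    """L2-boosting sketch: at each iteration fit a regression stump to
    the residuals by least squares, shrink it, and add it to the fit."""
    pred = np.zeros_like(y, dtype=float)
    fits = []
    for _ in range(n_iter):
        resid = y - pred
        best = None
        for t in np.unique(x):                  # candidate split points
            left, right = x <= t, x > t
            if not left.any() or not right.any():
                continue
            ml, mr = resid[left].mean(), resid[right].mean()
            sse = ((resid[left] - ml) ** 2).sum() + ((resid[right] - mr) ** 2).sum()
            if best is None or sse < best[0]:
                best = (sse, t, ml, mr)
        if best is None:                        # no admissible split
            break
        _, t, ml, mr = best
        fits.append((t, shrinkage * ml, shrinkage * mr))
        pred += np.where(x <= t, shrinkage * ml, shrinkage * mr)

    def estimate(x_new):
        """The estimate is the sum (a linear combination) of the fits."""
        out = np.zeros_like(x_new, dtype=float)
        for t, ml, mr in fits:
            out += np.where(x_new <= t, ml, mr)
        return out
    return estimate
```

In the paper the number of iterations is chosen by sample splitting rather than fixed in advance.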
Application of optimisation-based data mining techniques to tobacco control dataset
- Dzalilov, Zari, Zhang, J, Bagirov, Adil, Mammadov, Musa
- Authors: Dzalilov, Zari , Zhang, J , Bagirov, Adil , Mammadov, Musa
- Date: 2010
- Type: Text , Journal article
- Relation: International Journal of Lean Thinking Vol. 1, no. 1 (2010), p. 27-41
- Full Text: false
- Reviewed:
- Description: Tobacco smoking is one of the leading causes of death around the world. Consequently, control of tobacco use is an important global public health issue. Tobacco control may be aided by development of theoretical and methodological frameworks for describing and understanding complex tobacco control systems. Linear regression and logistic regression are currently very popular statistical techniques for modeling and analyzing complex data in tobacco control systems. However, in tobacco markets, numerous interrelated factors nontrivially interact with tobacco control policies, such that policies and control outcomes are nonlinearly related.
Estimation of a regression function by maxima of minima of linear functions
- Bagirov, Adil, Clausen, Conny, Kohler, Michael
- Authors: Bagirov, Adil , Clausen, Conny , Kohler, Michael
- Date: 2009
- Type: Text , Journal article
- Relation: IEEE Transactions on Information Theory Vol. 55, no. 2 (2009), p. 833-845
- Full Text:
- Reviewed:
- Description: In this paper, estimation of a regression function from independent and identically distributed random variables is considered. Estimates are defined by minimization of the empirical L2 risk over a class of functions, which are defined as maxima of minima of linear functions. Results concerning the rate of convergence of the estimates are derived. In particular, it is shown that for smooth regression functions satisfying the assumption of single index models, the estimate is able to achieve (up to some logarithmic factor) the corresponding optimal one-dimensional rate of convergence. Hence, under these assumptions, the estimate is able to circumvent the so-called curse of dimensionality. The small sample behavior of the estimates is illustrated by applying them to simulated data. © 2009 IEEE.
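The function class in the abstract above — maxima of minima of linear functions — can be illustrated with a short evaluation routine. This is a minimal sketch of the class itself; the paper's estimator is obtained by minimizing the empirical L2 risk over this class, which is not shown here:

```python
import numpy as np

def maxmin_eval(x, coeffs):
    """Evaluate f(x) = max_k min_{j in group k} (a_kj . x + b_kj).
    `coeffs` is a list of groups; each group is a list of (a, b) pairs,
    with a a weight vector and b an intercept."""
    mins = []
    for group in coeffs:
        vals = [np.dot(a, x) + b for a, b in group]
        mins.append(min(vals))
    return max(mins)
```

For example, |x| = max(x, -x) is represented by two singleton groups, so the class contains simple nonsmooth functions with few parameters — one reason it can adapt to low-dimensional structure such as single index models.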
An algorithm for the estimation of a regression function by continuous piecewise linear functions
- Bagirov, Adil, Clausen, Conny, Kohler, Michael
- Authors: Bagirov, Adil , Clausen, Conny , Kohler, Michael
- Date: 2008
- Type: Text , Journal article
- Relation: Computational Optimization and Applications Vol. 45, no. (2008), p. 159-179
- Relation: http://purl.org/au-research/grants/arc/DP0666061
- Full Text:
- Reviewed:
- Description: The problem of the estimation of a regression function by continuous piecewise linear functions is formulated as a nonconvex, nonsmooth optimization problem. Estimates are defined by minimization of the empirical L2 risk over a class of functions, which are defined as maxima of minima of linear functions. An algorithm for finding continuous piecewise linear functions is presented. We observe that the objective function in the optimization problem is semismooth, quasidifferentiable and piecewise partially separable. The use of these properties allows us to design an efficient algorithm for approximation of subgradients of the objective function and to apply the discrete gradient method for its minimization. We present computational results with some simulated data and compare the new estimator with a number of existing ones.
An approximate subgradient algorithm for unconstrained nonsmooth, nonconvex optimization
- Bagirov, Adil, Ganjehlou, Asef Nazari
- Authors: Bagirov, Adil , Ganjehlou, Asef Nazari
- Date: 2008
- Type: Text , Journal article
- Relation: Mathematical Methods of Operations Research Vol. 67, no. 2 (2008), p. 187-206
- Relation: http://purl.org/au-research/grants/arc/DP0666061
- Full Text:
- Reviewed:
- Description: In this paper a new algorithm for minimizing locally Lipschitz functions is developed. Descent directions in this algorithm are computed by solving a system of linear inequalities. The convergence of the algorithm is proved for quasidifferentiable semismooth functions. We present the results of numerical experiments with both regular and nonregular objective functions. We also compare the proposed algorithm with two different versions of the subgradient method using the results of numerical experiments. These results demonstrate the superiority of the proposed algorithm over the subgradient method. © 2007 Springer-Verlag.
Discrete gradient method: Derivative-free method for nonsmooth optimization
- Bagirov, Adil, Karasozen, Bulent, Sezer, Monsalve
- Authors: Bagirov, Adil , Karasozen, Bulent , Sezer, Monsalve
- Date: 2008
- Type: Text , Journal article
- Relation: Journal of Optimization Theory and Applications Vol. 137, no. 2 (2008), p. 317-334
- Relation: http://purl.org/au-research/grants/arc/DP0666061
- Full Text:
- Reviewed:
- Description: A new derivative-free method is developed for solving unconstrained nonsmooth optimization problems. This method is based on the notion of a discrete gradient. It is demonstrated that the discrete gradients can be used to approximate subgradients of a broad class of nonsmooth functions. It is also shown that the discrete gradients can be applied to find descent directions of nonsmooth functions. The preliminary results of numerical experiments with unconstrained nonsmooth optimization problems as well as the comparison of the proposed method with the nonsmooth optimization solver DNLP from CONOPT-GAMS and the derivative-free optimization solver CONDOR are presented. © 2007 Springer Science+Business Media, LLC.
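The discrete-gradient idea — building a derivative-free approximation of a subgradient from difference quotients and descending along it — can be sketched as follows. This is a simplified illustration, not the full construction from the paper; the sequential difference quotients and the fixed-length step rule are assumptions made for brevity:

```python
import numpy as np

def discrete_gradient(f, x, h=1e-3):
    """Sequential forward difference quotients along the coordinates --
    a simplified, derivative-free stand-in for the discrete gradient,
    computed from function values only."""
    v = np.empty_like(x, dtype=float)
    prev = np.array(x, dtype=float)
    for i in range(len(x)):
        nxt = prev.copy()
        nxt[i] += h
        v[i] = (f(nxt) - f(prev)) / h
        prev = nxt
    return v

def dgm_minimize(f, x0, steps=200, h=1e-3, lr=0.1):
    """Derivative-free descent: take fixed-length steps against the
    discrete-gradient direction until it (nearly) vanishes."""
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        v = discrete_gradient(f, x, h)
        norm = np.linalg.norm(v)
        if norm < 1e-8:
            break
        x -= lr * v / norm
    return x
```

Because only function values are used, the scheme applies to nonsmooth objectives where analytic subgradients are unavailable — the setting the method targets.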
Modified global k-means algorithm for minimum sum-of-squares clustering problems
- Authors: Bagirov, Adil
- Date: 2008
- Type: Text , Journal article
- Relation: Pattern Recognition Vol. 41, no. 10 (2008), p. 3192-3199
- Relation: http://purl.org/au-research/grants/arc/DP0666061
- Full Text:
- Reviewed:
- Description: The k-means algorithm and its variations are known to be fast clustering algorithms. However, they are sensitive to the choice of starting points and inefficient for solving clustering problems in large data sets. Recently, a new version of the k-means algorithm, the global k-means algorithm, has been developed. It is an incremental algorithm that dynamically adds one cluster center at a time and uses each data point as a candidate for the k-th cluster center. Results of numerical experiments show that the global k-means algorithm considerably outperforms the k-means algorithm. In this paper, a new version of the global k-means algorithm is proposed. A starting point for the k-th cluster center in this algorithm is computed by minimizing an auxiliary cluster function. Results of numerical experiments on 14 data sets demonstrate the superiority of the new algorithm; however, it requires more computational time than the global k-means algorithm.
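The incremental scheme described above can be sketched in a few lines. For brevity, this sketch restricts the candidate starting points to the data points themselves, as in the original global k-means; the modified algorithm instead minimizes the auxiliary cluster function f(y) = sum_i min(d_i, ||a_i - y||^2) over all y, where d_i is point i's squared distance to its current nearest center:

```python
import numpy as np

def kmeans(data, centers, iters=50):
    """Standard Lloyd iterations from given starting centers."""
    centers = centers.copy()
    for _ in range(iters):
        d = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        lab = d.argmin(axis=1)
        for j in range(len(centers)):
            pts = data[lab == j]
            if len(pts):
                centers[j] = pts.mean(axis=0)
    return centers

def global_kmeans(data, k):
    """Incremental clustering: grow from one center, choosing the next
    starting center as the candidate minimizing the auxiliary cluster
    function, then refining all centers with k-means."""
    centers = data.mean(axis=0, keepdims=True)   # best 1-center solution
    for _ in range(2, k + 1):
        d = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2).min(axis=1)
        # auxiliary function value for each candidate starting point y
        aux = [np.minimum(d, ((data - y) ** 2).sum(axis=1)).sum() for y in data]
        start = data[int(np.argmin(aux))]
        centers = kmeans(data, np.vstack([centers, start]))
    return centers
```

The extra cost reported in the abstract comes from this starting-point search, which touches every data point for every candidate at each increment.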
Integrated production system optimization using global optimization techniques
- Mason, T. L., Emelle, C., Van Berkel, J., Bagirov, Adil, Kampas, F., Pinter, J. D.
- Authors: Mason, T. L. , Emelle, C. , Van Berkel, J. , Bagirov, Adil , Kampas, F. , Pinter, J. D.
- Date: 2007
- Type: Text , Journal article
- Relation: Journal of Industrial and Management Optimization Vol. 3, no. 2 (May 2007), p. 257-277
- Full Text:
- Reviewed:
- Description: Many optimization problems related to integrated oil and gas production systems are nonconvex and multimodal. Additionally, apart from the innate nonsmoothness of many optimization problems, nonsmooth functions such as minimum and maximum functions may be used to model flow/pressure controllers and cascade mass in the gas gathering and blending networks. In this paper we study the application of different versions of the derivative-free Discrete Gradient Method (DGM) as well as the Lipschitz Global Optimizer (LGO) suite to production optimization in integrated oil and gas production systems, and their comparison with various local and global solvers used with the General Algebraic Modeling System (GAMS). Four nonconvex and nonsmooth test cases were constructed from a small but realistic integrated gas production system optimization problem. The derivation of the system of equations for the various test cases is also presented. Results demonstrate that DGM is especially effective for solving nonsmooth optimization problems and that its two versions are capable global optimization algorithms. We also demonstrate that LGO successfully solves the presented test (as well as other related real-world) problems.