### Incremental DC optimization algorithm for large-scale clusterwise linear regression

**Authors:** Bagirov, Adil; Taheri, Sona; Cimen, Emre
**Date:** 2021
**Type:** Text, Journal article
**Relation:** Journal of Computational and Applied Mathematics Vol. 389, no. (2021), p. 1-17
**Relation:** https://purl.org/au-research/grants/arc/DP190100580
**Full Text:** false

**Description:** The objective function in the nonsmooth optimization model of the clusterwise linear regression (CLR) problem with the squared regression error is represented as a difference of two convex functions. Then, using the difference of convex algorithm (DCA) approach, the CLR problem is replaced by a sequence of smooth unconstrained optimization subproblems. A new algorithm based on the DCA and the incremental approach is designed to solve the CLR problem. We apply the quasi-Newton method to solve the subproblems. The proposed algorithm is evaluated using several synthetic and real-world data sets for regression and compared with other algorithms for CLR. Results demonstrate that the DCA-based algorithm is efficient for solving CLR problems with a large number of data points and, in particular, outperforms other algorithms when the number of input variables is small. © 2020 Elsevier B.V.
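The difference-of-convex structure mentioned in this description can be illustrated with a short derivation. The notation is assumed for illustration and may differ from the paper's: data points $(a_i, b_i)$, $i = 1, \dots, m$, with $k \ge 2$ clusters, regression coefficients $x_j$ and intercepts $y_j$. The derivation relies only on the elementary identity $\min_j E_j = \sum_j E_j - \max_j \sum_{t \ne j} E_t$.

$$
\begin{aligned}
E_j(a_i, b_i) &= \bigl(\langle x_j, a_i \rangle + y_j - b_i\bigr)^2, \qquad j = 1, \dots, k, \\
f(x, y) &= \sum_{i=1}^{m} \min_{j=1,\dots,k} E_j(a_i, b_i) \;=\; f_1(x, y) - f_2(x, y), \\
f_1(x, y) &= \sum_{i=1}^{m} \sum_{j=1}^{k} E_j(a_i, b_i), \\
f_2(x, y) &= \sum_{i=1}^{m} \max_{j=1,\dots,k} \sum_{t \ne j} E_t(a_i, b_i).
\end{aligned}
$$

Each $E_j$ is the square of an affine function of $(x_j, y_j)$ and is therefore convex, so $f_1$ (a sum of convex functions) and $f_2$ (a sum of maxima of convex functions) are both convex. Here $f_1$ is smooth while $f_2$ is in general nonsmooth, which is consistent with a DCA step that linearizes $f_2$ and leaves a smooth subproblem.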

### Attribute weighted Naive Bayes classifier using a local optimization

**Authors:** Taheri, Sona; Yearwood, John; Mammadov, Musa; Seifollahi, Sattar
**Date:** 2013
**Type:** Text, Journal article
**Relation:** Neural Computing & Applications Vol. 24, no. 5 (2013), p. 995-1002

**Description:** The Naive Bayes classifier is a popular classification technique for data mining and machine learning. It has been shown to be very effective on a variety of data classification problems. However, the strong assumption that all attributes are conditionally independent given the class is often violated in real-world applications. Numerous methods have been proposed in order to improve the performance of the Naive Bayes classifier by alleviating the attribute independence assumption. However, violation of the independence assumption can increase the expected error. Another alternative is assigning weights to attributes. In this paper, we propose a novel attribute weighted Naive Bayes classifier by applying weights to the conditional probabilities. An objective function is modeled and taken into account, which is based on the structure of the Naive Bayes classifier and the attribute weights. The optimal weights are determined by a local optimization method using the quasisecant method. In the proposed approach, the Naive Bayes classifier is taken as a starting point. We report the results of numerical experiments on several real-world data sets in binary classification, which show the efficiency of the proposed method.

### Improving Naive Bayes classifier using conditional probabilities

**Authors:** Taheri, Sona; Mammadov, Musa; Bagirov, Adil
**Date:** 2010
**Type:** Text, Conference proceedings

**Description:** The Naive Bayes classifier is the simplest among Bayesian Network classifiers. It has been shown to be very efficient on a variety of data classification problems. However, the strong assumption that all features are conditionally independent given the class is often violated in many real-world applications. Therefore, improvement of the Naive Bayes classifier by alleviating the feature independence assumption has attracted much attention. In this paper, we develop a new version of the Naive Bayes classifier without assuming independence of features. The proposed algorithm approximates the interactions between features by using conditional probabilities. We present results of numerical experiments on several real-world data sets, where continuous features are discretized by applying two different methods. These results demonstrate that the proposed algorithm significantly improves the performance of the Naive Bayes classifier, yet at the same time maintains its robustness. © 2011, Australian Computer Society, Inc.

### A globally optimization algorithm for systems of nonlinear equations

**Authors:** Mammadov, Musa; Taheri, Sona
**Date:** 2010
**Type:** Text, Conference proceedings
**Full Text:** false

**Description:** In this paper, a new algorithm is proposed for the solution of systems of nonlinear equations. This algorithm uses a combination of the gradient and Newton's methods. A novel dynamic combinator is developed to determine the contribution of the methods in the combination. Also, by using some parameters in the proposed algorithm, this contribution is adjusted. The efficiency of the algorithm is studied in solving systems of nonlinear equations.

### Structure learning of Bayesian networks using a new unrestricted dependency algorithm

**Authors:** Taheri, Sona; Mammadov, Musa
**Date:** 2012
**Type:** Text, Conference proceedings

**Description:** Bayesian Networks have attracted extensive attention in data mining due to their efficiency and reasonable predictive accuracy. A Bayesian Network is a directed acyclic graph in which each node represents a variable and each arc a probabilistic dependency between two variables. Constructing a Bayesian Network from data is a learning process that is divided into two steps: learning the structure and learning the parameters. In many domains, the structure is not known a priori and must be inferred from data. This paper presents an iterative unrestricted dependency algorithm for learning the structure of Bayesian Networks for binary classification problems. Numerical experiments are conducted on several real-world data sets, where continuous features are discretized by applying two different methods. The performance of the proposed algorithm is compared with the Naive Bayes, the Tree Augmented Naive Bayes, and the k

### A globally optimization algorithm for systems of nonlinear equations

**Authors:** Mammadov, Musa; Taheri, Sona
**Date:** 2010
**Type:** Text, Conference paper
**Relation:** Proceedings of PCO 2010, The Third International Conference on Power Control and Optimization 2010 Gold Coast p. 214-234
**Full Text:** false

**Description:** In this paper, a new algorithm is proposed for the solution of systems of nonlinear equations. This algorithm uses a combination of the gradient and Newton's methods. A novel dynamic combinator is developed to determine the contribution of the methods in the combination. Also, by using some parameters in the proposed algorithm, this contribution is adjusted. The efficiency of the algorithm is studied in solving systems of nonlinear equations.

### Solving systems of nonlinear equations using a globally convergent optimization algorithm

**Authors:** Taheri, Sona; Mammadov, Musa
**Date:** 2012
**Type:** Text, Journal article
**Relation:** Global Journal of Technology & Optimization Vol. 3, no. (2012), p. 132-138

**Description:** Solving systems of nonlinear equations is a relatively complicated problem for which a number of different approaches have been presented. In this paper, a new algorithm is proposed for the solution of systems of nonlinear equations. This algorithm uses a combination of the gradient and Newton’s methods. A novel dynamic combinatory is developed to determine the contribution of the methods in the combination. Also, by using some parameters in the proposed algorithm, this contribution is adjusted. We use the gradient method due to its global convergence property, and Newton’s method to speed up the convergence rate. We consider two different combinations. In the first one, a step length is determined only along the gradient direction. The second one finds a step length along both the gradient and the Newton directions. The performance of the proposed algorithm in comparison to Newton’s method, the gradient method and an existing combination method is explored on several well-known test problems in solving systems of nonlinear equations. The numerical results provide evidence that the proposed combination algorithm is generally more robust and efficient than the other mentioned methods on some important and difficult problems.
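The general idea of blending a gradient direction with a Newton direction can be sketched as follows. This is a minimal illustration, not the algorithm from the paper: the weighting rule `lam`, the single Armijo step length along the blended direction, and the names `combined_solve`, `F_example` and `J_example` are all assumptions made for this example. The system is solved by minimizing $f(x) = \tfrac{1}{2}\lVert F(x)\rVert^2$.

```python
import numpy as np


def combined_solve(F, J, x0, tol=1e-10, max_iter=500):
    """Solve F(x) = 0 by minimizing f(x) = 0.5 * ||F(x)||^2 with a blend of
    the steepest-descent and Newton directions (illustrative sketch only)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fx = F(x)
        norm_F = np.linalg.norm(Fx)
        if norm_F < tol:
            break
        Jx = J(x)
        grad = Jx.T @ Fx                    # gradient of f at x
        newton = np.linalg.solve(Jx, -Fx)   # Newton direction for F(x) = 0
        # Hypothetical weighting rule (an assumption, not from the paper):
        # far from a root the gradient dominates, near a root Newton dominates.
        lam = norm_F / (1.0 + norm_F)
        d = -lam * grad + (1.0 - lam) * newton
        # Backtracking (Armijo) line search along the blended direction.
        f0 = 0.5 * norm_F ** 2
        slope = grad @ d                    # negative when Jx is nonsingular
        t = 1.0
        while t > 1e-12 and 0.5 * np.linalg.norm(F(x + t * d)) ** 2 > f0 + 1e-4 * t * slope:
            t *= 0.5
        x = x + t * d
    return x


def F_example(x):
    # Circle of radius 2 intersected with the line x0 = x1.
    return np.array([x[0] ** 2 + x[1] ** 2 - 4.0, x[0] - x[1]])


def J_example(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])


root = combined_solve(F_example, J_example, x0=[1.0, 0.5])
print(root, np.linalg.norm(F_example(root)))
```

Both the negative gradient and the Newton direction are descent directions for $f$ when the Jacobian is nonsingular, so any convex combination of them is as well, which is what lets a standard line search be applied to the blend.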

### Learning the naive Bayes classifier with optimization models

**Authors:** Taheri, Sona; Mammadov, Musa
**Date:** 2013
**Type:** Text, Journal article
**Relation:** International Journal of Applied Mathematics and Computer Science Vol. 23, no. 4 (2013), p. 787-795

**Description:** Naive Bayes is among the simplest probabilistic classifiers. It often performs surprisingly well in many real world applications, despite the strong assumption that all features are conditionally independent given the class. In the learning process of this classifier with the known structure, class probabilities and conditional probabilities are calculated using training data, and then values of these probabilities are used to classify new observations. In this paper, we introduce three novel optimization models for the naive Bayes classifier where both class probabilities and conditional probabilities are considered as variables. The values of these variables are found by solving the corresponding optimization problems. Numerical experiments are conducted on several real world binary classification data sets, where continuous features are discretized by applying three different methods. The performances of these models are compared with the naive Bayes classifier, tree augmented naive Bayes, the SVM, C4.5 and the nearest neighbor classifier. The obtained results demonstrate that the proposed models can significantly improve the performance of the naive Bayes classifier, yet at the same time maintain its simple structure.

### Globally convergent algorithms for solving unconstrained optimization problems

**Authors:** Taheri, Sona; Mammadov, Musa; Seifollahi, Sattar
**Date:** 2013
**Type:** Text, Journal article
**Relation:** Optimization Vol. , no. (2013), p. 1-15

**Description:** New algorithms for solving unconstrained optimization problems are presented based on the idea of combining two types of descent directions: the direction of the anti-gradient and either the Newton or quasi-Newton directions. The use of the latter directions allows one to improve the convergence rate. Global and superlinear convergence properties of these algorithms are established. Numerical experiments using some unconstrained test problems are reported. Also, the proposed algorithms are compared with some existing similar methods using the results of experiments. This comparison demonstrates the efficiency of the proposed combined methods.

### Tree augmented naive Bayes based on optimization

**Authors:** Taheri, Sona; Mammadov, Musa
**Date:** 2011
**Type:** Text, Conference paper
**Relation:** 42nd Annual Iranian Mathematics Conference, Vali-a-Asr University of Rafsanjan, 5th-8th September, 2011, p. 594-598
**Full Text:** false

**Description:** Tree augmented naive Bayes is a semi-naive Bayesian learning method. It relaxes the naive Bayes attribute independence assumption by employing a tree structure, in which each attribute only depends on the class and one other attribute. A maximum weighted spanning tree that maximizes the likelihood of the training data is used to perform classification.

### Solving minimax problems : Local smoothing versus global smoothing

**Authors:** Bagirov, Adil; Sultanova, Nargiz; Al Nuaimat, Alia; Taheri, Sona
**Date:** 2018
**Type:** Text, Conference proceedings
**Relation:** 4th International Conference on Numerical Analysis and Optimization, NAO-IV 2017; Muscat, Oman; 2nd-5th January 2017; published in Numerical Analysis and Optimization NAO-IV (part of the Springer Proceedings in Mathematics and Statistics book series PROMS, volume 235) Vol. 235, p. 23-43
**Full Text:** false

**Description:** The aim of this chapter is to compare different smoothing techniques for solving finite minimax problems. We consider the local smoothing technique which approximates the function in some neighborhood of a point of nondifferentiability and also global smoothing techniques such as the exponential and hyperbolic smoothing which approximate the function in the whole domain. Computational results on the collection of academic test problems are used to compare different smoothing techniques. Results show the superiority of the local smoothing technique for convex problems and global smoothing techniques for nonconvex problems. © 2018, Springer International Publishing AG, part of Springer Nature.

**Description:** Springer Proceedings in Mathematics and Statistics
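As an illustration of what a global smoothing technique does, the sketch below implements the standard exponential (log-sum-exp) approximation of a finite maximum. It is a generic textbook construction rather than code from the chapter; the function name `exp_smooth_max`, the precision parameter `p` and the sample values are assumptions made for this example.

```python
import numpy as np


def exp_smooth_max(values, p):
    """Exponential (log-sum-exp) smoothing of max(values), with p > 0.

    For m values it satisfies max <= result <= max + ln(m) / p, so the
    approximation tightens as p grows.
    """
    v = np.asarray(values, dtype=float)
    vmax = v.max()
    # Shift by the maximum so the exponentials cannot overflow.
    return vmax + np.log(np.sum(np.exp(p * (v - vmax)))) / p


values = [1.0, 2.5, 2.4, -3.0]
for p in (1.0, 10.0, 100.0):
    approx = exp_smooth_max(values, p)
    print(p, approx, approx - max(values))
```

In a minimax problem the values would be the component functions $f_i(x)$ evaluated at the current point, and the smooth surrogate can then be handed to a gradient-based solver, typically while increasing `p` between solves.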

### Multi-source cyber-attacks detection using machine learning

**Authors:** Taheri, Sona; Gondal, Iqbal; Bagirov, Adil; Harkness, Greg; Brown, Simon; Chi, Chihung
**Date:** 2019
**Type:** Text, Conference proceedings, Conference paper
**Relation:** 2019 IEEE International Conference on Industrial Technology, ICIT 2019; Melbourne, Australia; 13th-15th February 2019 Vol. 2019-February, p. 1167-1172

**Description:** The Internet of Things (IoT) has significantly increased the number of devices connected to the Internet ranging from sensors to multi-source data information. As the IoT continues to evolve with new technologies, the number of threats and attacks against IoT devices is on the increase. Analyzing and detecting these attacks originating from different sources needs machine learning models. These models provide proactive solutions for detecting attacks and their sources. In this paper, we propose to apply a supervised machine learning classification technique to identify cyber-attacks from each source. More precisely, we apply the incremental piecewise linear classifier that constructs a boundary between sources/classes incrementally, starting with one hyperplane and adding more hyperplanes at each iteration. The algorithm terminates when no further significant improvement of the separation of sources/classes is possible. The construction and usage of piecewise linear boundaries allows us to avoid any possible overfitting. We apply the incremental piecewise linear classifier on the multi-source real world cyber security data set to identify cyber-attacks and their sources.

**Description:** Proceedings of the IEEE International Conference on Industrial Technology

### New diagonal bundle method for clustering problems in large data sets

**Authors:** Karmitsa, Napsu; Bagirov, Adil; Taheri, Sona
**Date:** 2017
**Type:** Text, Journal article
**Relation:** European Journal of Operational Research Vol. 263, no. 2 (2017), p. 367-379
**Relation:** http://purl.org/au-research/grants/arc/DP140103213
**Full Text:** false

**Description:** Clustering is one of the most important tasks in data mining. Recent developments in computer hardware allow us to store in random access memory (RAM) and repeatedly read data sets with hundreds of thousands and even millions of data points. This makes it possible to use conventional clustering algorithms in such data sets. However, these algorithms may need prohibitively large computational time and fail to produce accurate solutions. Therefore, it is important to develop clustering algorithms which are accurate and can provide real time clustering in large data sets. This paper introduces one of them. Using the nonsmooth optimization formulation of the clustering problem, the objective function is represented as a difference of two convex (DC) functions. Then a new diagonal bundle algorithm that explicitly uses this structure is designed and combined with an incremental approach to solve this problem. The method is evaluated using real world data sets with both a large number of attributes and a large number of data points. The proposed method is compared with two other clustering algorithms using numerical results. © 2017 Elsevier B.V.

### Clustering in large data sets with the limited memory bundle method

**Authors:** Karmitsa, Napsu; Bagirov, Adil; Taheri, Sona
**Date:** 2018
**Type:** Text, Journal article
**Relation:** Pattern Recognition Vol. 83, no. (2018), p. 245-259
**Relation:** http://purl.org/au-research/grants/arc/DP140103213
**Full Text:** false

**Description:** The aim of this paper is to design an algorithm based on nonsmooth optimization techniques to solve the minimum sum-of-squares clustering problems in very large data sets. First, the clustering problem is formulated as a nonsmooth optimization problem. Then the limited memory bundle method [Haarala et al., 2007] is modified and combined with an incremental approach to design a new clustering algorithm. The algorithm is evaluated using real world data sets with both the large number of attributes and the large number of data points. It is also compared with some other optimization based clustering algorithms. The numerical results demonstrate the efficiency of the proposed algorithm for clustering in very large data sets.

### Nonsmooth DC programming approach to the minimum sum-of-squares clustering problems

**Authors:** Bagirov, Adil; Taheri, Sona; Ugon, Julien
**Date:** 2016
**Type:** Text, Journal article
**Relation:** Pattern Recognition Vol. 53, no. (2016), p. 12-24
**Relation:** http://purl.org/au-research/grants/arc/DP140103213
**Full Text:** false

**Description:** This paper introduces an algorithm for solving the minimum sum-of-squares clustering problems using their difference of convex representations. A non-smooth non-convex optimization formulation of the clustering problem is used to design the algorithm. Characterizations of critical points, stationary points in the sense of generalized gradients and inf-stationary points of the clustering problem are given. The proposed algorithm is tested and compared with other clustering algorithms using large real world data sets. © 2015 Elsevier Ltd. All rights reserved.

### Structure learning of Bayesian Networks using global optimization with applications in data classification

**Authors:** Taheri, Sona; Mammadov, Musa
**Date:** 2014
**Type:** Text, Journal article
**Relation:** Optimization Letters Vol. 9, no. 5 (2014), p. 931-948

**Description:** Bayesian Networks are increasingly popular methods of modeling uncertainty in artificial intelligence and machine learning. A Bayesian Network consists of a directed acyclic graph in which each node represents a variable and each arc represents a probabilistic dependency between two variables. Constructing a Bayesian Network from data is a learning process that consists of two steps: learning the structure and learning the parameters. Learning a network structure from data is the most difficult task in this process. This paper presents a new algorithm for constructing an optimal structure for Bayesian Networks based on optimization. The algorithm has two major parts. First, we define an optimization model to find the better network graphs. Then, we apply an optimization approach for removing possible cycles from the directed graphs obtained in the first part, which is the first of its kind in the literature. The main advantage of the proposed method is that the maximal number of parents for variables is not fixed a priori and is defined during the optimization procedure. It also considers all networks, including cyclic ones, and then chooses the best structure by applying a global optimization method. To show the efficiency of the algorithm, several closely related algorithms, including the unrestricted dependency Bayesian Network algorithm, as well as the benchmark algorithms SVM and C4.5, are employed for comparison. We apply these algorithms to data classification; data sets are taken from the UCI machine learning repository and the LIBSVM. © 2014, Springer-Verlag Berlin Heidelberg.

### A difference of convex optimization algorithm for piecewise linear regression

**Authors:** Bagirov, Adil; Taheri, Sona; Asadi, Soodabeh
**Date:** 2019
**Type:** Text, Journal article
**Relation:** Journal of Industrial and Management Optimization Vol. 15, no. 2 (2019), p. 909-932
**Relation:** http://purl.org/au-research/grants/arc/DP140103213
**Full Text:** false

**Description:** The problem of finding a continuous piecewise linear function approximating a regression function is considered. This problem is formulated as a nonconvex nonsmooth optimization problem where the objective function is represented as a difference of convex (DC) functions. Subdifferentials of DC components are computed and an algorithm is designed based on these subdifferentials to find piecewise linear functions. The algorithm is tested using some synthetic and real world data sets and compared with other regression algorithms.

### Discrete gradient methods

**Authors:** Bagirov, Adil; Taheri, Sona; Karmitsa, Napsu
**Date:** 2020
**Type:** Text, Book chapter
**Relation:** Numerical Nonsmooth Optimization: State of the Art Algorithms p. 621-654
**Full Text:** false

**Description:** In this chapter, the notion of a discrete gradient is introduced and it is shown that the discrete gradients can be used to approximate subdifferentials of a broad class of nonsmooth functions. Two methods based on such approximations, more specifically, the discrete gradient method (DGM) and its limited memory version (LDGB), are described. These methods are semi derivative-free methods for solving nonsmooth and, in general, nonconvex optimization problems. The performance of the methods is demonstrated using some academic test problems. © Springer Nature Switzerland AG 2020.

### An approximate ADMM for solving linearly constrained nonsmooth optimization problems with two blocks of variables

**Authors:** Bagirov, Adil; Taheri, Sona; Bai, Fusheng; Wu, Zhiyou
**Date:** 2019
**Type:** Text, Book chapter
**Relation:** Nonsmooth Optimization and Its Applications (part of the International Series of Numerical Mathematics book series) Chapter 2 p. 17-44
**Full Text:** false

**Description:** Nonsmooth convex optimization problems with two blocks of variables subject to linear constraints are considered. A new version of the alternating direction method of multipliers is developed for solving these problems. In this method the subproblems are solved approximately. The convergence of the method is studied. New test problems are designed and used to verify the efficiency of the proposed method and to compare it with two versions of the proximal bundle method.

### Double bundle method for finding Clarke stationary points in nonsmooth DC programming

**Authors:** Joki, Kaisa; Bagirov, Adil; Karmitsa, Napsu; Makela, Marko; Taheri, Sona
**Date:** 2018
**Type:** Text, Journal article
**Relation:** SIAM Journal on Optimization Vol. 28, no. 2 (2018), p. 1892-1919
**Relation:** http://purl.org/au-research/grants/arc/DP140103213

**Description:** The aim of this paper is to introduce a new proximal double bundle method for unconstrained nonsmooth optimization, where the objective function is presented as a difference of two convex (DC) functions. The novelty in our method is a new escape procedure which enables us to guarantee approximate Clarke stationarity for solutions by utilizing the DC components of the objective function. This optimality condition is stronger than the criticality condition typically used in DC programming. Moreover, if a candidate solution is not approximate Clarke stationary, then the escape procedure returns a descent direction. With this escape procedure, we can avoid some shortcomings encountered when criticality is used. The finite termination of the double bundle method to an approximate Clarke stationary point is proved by assuming that the subdifferentials of DC components are polytopes. Finally, some encouraging numerical results are presented.