A comparative study of unsupervised classification algorithms in multi-sized data sets
- Authors: Quddus, Syed , Bagirov, Adil
- Date: 2019
- Type: Text , Conference paper
- Relation: 2nd Artificial Intelligence and Cloud Computing Conference, AICCC 2019, Kobe, 21-23 December 2019 p. 26-32
- Full Text: false
- Reviewed:
- Description: The ability to mine and extract useful information automatically, from large data sets, is a common concern for organizations, for the last few decades. Over the internet, data is vastly increasing gradually and consequently the capacity to collect and store very large data is significantly increasing. Existing clustering algorithms are not always efficient and accurate in solving clustering problems for large data sets. However, the development of accurate and fast data classification algorithms for very large scale data sets is still a challenge. In this paper, we present an overview of various algorithms and approaches which are recently being used for Clustering of large data and E-document. In this paper, a comparative study of the performance of various algorithms: the global kmeans algorithm (GKM), the multi-start modified global kmeans algorithm (MS-MGKM), the multi-start kmeans algorithm (MS-KM), the difference of convex clustering algorithm (DCA), the clustering algorithm based on the difference of convex representation of the cluster function and non-smooth optimization (DC-L2), is carried out using C++. © 2019 ACM.
A convolutional recursive modified Self Organizing Map for handwritten digits recognition
- Authors: Mohebi, Ehsan , Bagirov, Adil
- Date: 2014
- Type: Text , Journal article
- Relation: Neural Networks Vol. 60, no. (2014), p. 104-118
- Relation: http://purl.org/au-research/grants/arc/DP140103213
- Full Text: false
- Reviewed:
- Description: It is well known that the handwritten digits recognition is a challenging problem. Different classification algorithms have been applied to solve it. Among them, the Self Organizing Maps (SOM) produced promising results. In this paper, first we introduce a Modified SOM for the vector quantization problem with improved initialization process and topology preservation. Then we develop a Convolutional Recursive Modified SOM and apply it to the problem of handwritten digits recognition. The computational results obtained using the well known MNIST dataset demonstrate the superiority of the proposed algorithm over the existing SOM-based algorithms.
A heuristic algorithm for solving the minimum sum-of-squares clustering problems
- Authors: Ordin, Burak , Bagirov, Adil
- Date: 2015
- Type: Text , Journal article
- Relation: Journal of Global Optimization Vol. 61, no. 2 (2015), p. 341-361
- Relation: http://purl.org/au-research/grants/arc/DP140103213
- Full Text: false
- Reviewed:
- Description: Clustering is an important task in data mining. It can be formulated as a global optimization problem which is challenging for existing global optimization techniques even in medium size data sets. Various heuristics were developed to solve the clustering problem. The global k-means and modified global k-means are among most efficient heuristics for solving the minimum sum-of-squares clustering problem. However, these algorithms are not always accurate in finding global or near global solutions to the clustering problem. In this paper, we introduce a new algorithm to improve the accuracy of the modified global k-means algorithm in finding global solutions. We use an auxiliary cluster problem to generate a set of initial points and apply the k-means algorithm starting from these points to find the global solution to the clustering problems. Numerical results on 16 real-world data sets clearly demonstrate the superiority of the proposed algorithm over the global and modified global k-means algorithms in finding global solutions to clustering problems.
A hybrid neural learning algorithm using evolutionary learning and derivative free local search method
- Authors: Ghosh, Ranadhir , Yearwood, John , Ghosh, Moumita , Bagirov, Adil
- Date: 2006
- Type: Text , Journal article
- Relation: International Journal of Neural Systems Vol. 16, no. 3 (2006), p. 201-213
- Full Text: false
- Reviewed:
- Description: In this paper we investigate a hybrid model based on the Discrete Gradient method and an evolutionary strategy for determining the weights in a feed forward artificial neural network. Also we discuss different variants for hybrid models using the Discrete Gradient method and an evolutionary strategy for determining the weights in a feed forward artificial neural network. The Discrete Gradient method has the advantage of being able to jump over many local minima and find very deep local minima. However, earlier research has shown that a good starting point for the discrete gradient method can improve the quality of the solution point. Evolutionary algorithms are best suited for global optimisation problems. Nevertheless they are cursed with longer training times and often unsuitable for real world application. For optimisation problems such as weight optimisation for ANNs in real world applications the dimensions are large and time complexity is critical. Hence the idea of a hybrid model can be a suitable option. In this paper we propose different fusion strategies for hybrid models combining the evolutionary strategy with the discrete gradient method to obtain an optimal solution much quicker. Three different fusion strategies are discussed: a linear hybrid model, an iterative hybrid model and a restricted local search hybrid model. Comparative results on a range of standard datasets are provided for different fusion hybrid models. © World Scientific Publishing Company.
- Description: C1
- Description: 2003001712
An algorithm for minimization of pumping costs in water distribution systems using a novel approach to pump scheduling
- Authors: Bagirov, Adil , Barton, Andrew , Mala-Jetmarova, Helena , Al Nuaimat, Alia , Ahmed, S. T. , Sultanova, Nargiz , Yearwood, John
- Date: 2013
- Type: Text , Journal article
- Relation: Mathematical and Computer Modelling Vol. 57, no. 3-4 (2013), p. 873-886
- Relation: http://purl.org/au-research/grants/arc/LP0990908
- Full Text: false
- Reviewed:
- Description: The operation of a water distribution system is a complex task which involves scheduling of pumps, regulating water levels of storages, and providing satisfactory water quality to customers at required flow and pressure. Pump scheduling is one of the most important tasks of the operation of a water distribution system as it represents the major part of its operating costs. In this paper, a novel approach for modeling of explicit pump scheduling to minimize energy consumption by pumps is introduced which uses the pump start/end run times as continuous variables, and binary integer variables to describe the pump status at the beginning of the scheduling period. This is different from other approaches where binary integer variables for each hour are typically used, which is considered very impractical from an operational perspective. The problem is formulated as a mixed integer nonlinear programming problem, and a new algorithm is developed for its solution. This algorithm is based on the combination of the grid search with the Hooke-Jeeves pattern search method. The performance of the algorithm is evaluated using literature test problems applying the hydraulic simulation model EPANet. © 2012 Elsevier Ltd.
- Description: 2003010583
An approximate subgradient algorithm for unconstrained nonsmooth, nonconvex optimization
- Authors: Bagirov, Adil , Ganjehlou, Asef Nazari
- Date: 2008
- Type: Text , Journal article
- Relation: Mathematical Methods of Operations Research Vol. 67, no. 2 (2008), p. 187-206
- Relation: http://purl.org/au-research/grants/arc/DP0666061
- Full Text:
- Reviewed:
- Description: In this paper a new algorithm for minimizing locally Lipschitz functions is developed. Descent directions in this algorithm are computed by solving a system of linear inequalities. The convergence of the algorithm is proved for quasidifferentiable semismooth functions. We present the results of numerical experiments with both regular and nonregular objective functions. We also compare the proposed algorithm with two different versions of the subgradient method using the results of numerical experiments. These results demonstrate the superiority of the proposed algorithm over the subgradient method. © 2007 Springer-Verlag.
- Description: C1
Global optimization in the summarization of text documents
- Authors: Alyguliev, R. M. , Bagirov, Adil
- Date: 2005
- Type: Text , Journal article
- Relation: Automatic Control and Computer Sciences Vol. 39, no. 6 (2005), p. 42-47
- Full Text: false
- Reviewed:
- Description: In order to ensure minimal redundancy in the summary of a document and the greatest possible coverage of its content a method for the construction of summaries (summarization) based on the clustering of sentences is proposed in the article. Clustering of sentences reduces to a determination of cluster centroids the mathematical realization of which relies on a problem of global optimization. A determination of the number of clusters is one of the complex problems in the clustering procedure. Therefore, an algorithm of stepwise determination of the number of clusters is also proposed in the present study. © 2006 by Allerton Press, Inc.
- Description: C1
Improving Naive Bayes classifier using conditional probabilities
- Authors: Taheri, Sona , Mammadov, Musa , Bagirov, Adil
- Date: 2010
- Type: Text , Conference proceedings
- Full Text:
- Description: Naive Bayes classifier is the simplest among Bayesian Network classifiers. It has shown to be very efficient on a variety of data classification problems. However, the strong assumption that all features are conditionally independent given the class is often violated on many real world applications. Therefore, improvement of the Naive Bayes classifier by alleviating the feature independence assumption has attracted much attention. In this paper, we develop a new version of the Naive Bayes classifier without assuming independence of features. The proposed algorithm approximates the interactions between features by using conditional probabilities. We present results of numerical experiments on several real world data sets, where continuous features are discretized by applying two different methods. These results demonstrate that the proposed algorithm significantly improve the performance of the Naive Bayes classifier, yet at the same time maintains its robustness. © 2011, Australian Computer Society, Inc.
- Description: 2003009505
Local optimization method with global multidimensional search
- Authors: Bagirov, Adil , Rubinov, Alex , Zhang, Jiapu
- Date: 2005
- Type: Text , Journal article
- Relation: Journal of Global Optimization Vol. 32, no. 2 (2005), p. 161-179
- Full Text:
- Reviewed:
- Description: This paper presents a new method for solving global optimization problems. We use a local technique based on the notion of discrete gradients for finding a cone of descent directions and then we use a global cutting angle algorithm for finding global minimum within the intersection of the cone and the feasible region. We present results of numerical experiments with well-known test problems and with the so-called cluster function. These results confirm that the proposed algorithms allows one to find a global minimizer or at least a deep local minimizer of a function with a huge amount of shallow local minima. © Springer 2005.
- Description: C1
- Description: 2003001351
Modified self-organising maps with a new topology and initialisation algorithm
- Authors: Mohebi, Ehsan , Bagirov, Adil
- Date: 2015
- Type: Text , Journal article
- Relation: Journal of Experimental and Theoretical Artificial Intelligence Vol. 27, no. 3 (2015), p. 351-372
- Full Text: false
- Reviewed:
- Description: Mapping quality of the self-organising maps (SOMs) is sensitive to the map topology and initialisation of neurons. In this article, in order to improve the convergence of the SOM, an algorithm based on split and merge of clusters to initialise neurons is introduced. The initialisation algorithm speeds up the learning process in large high-dimensional data sets. We also develop a topology based on this initialisation to optimise the vector quantisation error and topology preservation of the SOMs. Such an approach allows to find more accurate data visualisation and consequently clustering problem. The numerical results on eight small-to-large real-world data sets are reported to demonstrate the performance of the proposed algorithm in the sense of vector quantisation, topology preservation and CPU time requirement. © 2014 Taylor & Francis.
Nonsmooth nonconvex optimization approach to clusterwise linear regression problems
- Authors: Bagirov, Adil , Ugon, Julien , Mirzayeva, Hijran
- Date: 2013
- Type: Text , Journal article
- Relation: European Journal of Operational Research Vol. 229, no. 1 (2013), p. 132-142
- Full Text: false
- Reviewed:
- Description: Clusterwise regression consists of finding a number of regression functions each approximating a subset of the data. In this paper, a new approach for solving the clusterwise linear regression problems is proposed based on a nonsmooth nonconvex formulation. We present an algorithm for minimizing this nonsmooth nonconvex function. This algorithm incrementally divides the whole data set into groups which can be easily approximated by one linear regression function. A special procedure is introduced to generate a good starting point for solving global optimization problems at each iteration of the incremental algorithm. Such an approach allows one to find global or near global solution to the problem when the data sets are sufficiently dense. The algorithm is compared with the multistart Späth algorithm on several publicly available data sets for regression analysis. © 2013 Elsevier B.V. All rights reserved.
- Description: 2003011018
Using global optimization to improve classification for medical diagnosis and prognosis
- Authors: Bagirov, Adil , Rubinov, Alex , Yearwood, John
- Date: 2001
- Type: Text , Journal article
- Relation: Topics in health information management Vol. 22, no. 1 (2001), p. 65-74
- Full Text: false
- Description: Global optimization-based techniques are studied in order to increase the accuracy of medical diagnosis and prognosis with data from various databases. First, we discuss feature selection, the problem of determining the most informative features for classification in the databases under consideration. Then, we apply a technique based on convex and global optimization for classification in these databases. The third application of this technique is a method that calculates centers of clusters to predict when breast cancer is likely to recur in patients for which cancer has been removed. The technique achieves high accuracy with these databases. Better classifiers will lead to improved assistance in making medical diagnostic and prognostic decisions.
- Description: 2003003662