A novel hybrid neural learning algorithm using simulated annealing and quasisecant method
- Authors: Yearwood, John , Bagirov, Adil , Seifollahi, Sattar
- Date: 2011
- Type: Text , Conference proceedings
- Full Text: false
- Description: In this paper, we propose a hybrid learning algorithm for the single hidden layer feedforward neural networks (SLFNs) for data classification. The proposed hybrid algorithm is a two-phase learning algorithm and is based on the quasisecant and the simulated annealing methods. First, the weights between the hidden layer and the output layer nodes (output layer weights) are adjusted by the quasisecant algorithm. Then the simulated annealing is applied for global attribute weighting. The weights between the input layer and the hidden layer nodes are fixed in advance and are not included in the learning process. The proposed two-phase learning of the network is a novel idea and is different from that of the existing ones. The numerical results on some benchmark data sets are also reported and these results are promising. © 2011, Australian Computer Society, Inc.
- Description: 2003009507
A simulated annealing-based maximum-margin clustering algorithm
- Authors: Seifollahi, Sattar , Bagirov, Adil , Borzeshi, Ehsan , Piccardi, Massimo
- Date: 2019
- Type: Text , Journal article
- Relation: Computational Intelligence Vol. 35, no. 1 (2019), p. 23-41
- Full Text:
- Reviewed:
- Description: Maximum-margin clustering is an extension of the support vector machine (SVM) to clustering. It partitions a set of unlabeled data into multiple groups by finding hyperplanes with the largest margins. Although existing algorithms have shown promising results, there is no guarantee of convergence of these algorithms to global solutions due to the nonconvexity of the optimization problem. In this paper, we propose a simulated annealing-based algorithm that is able to mitigate the issue of local minima in the maximum-margin clustering problem. The novelty of our algorithm is twofold, ie, (i) it comprises a comprehensive cluster modification scheme based on simulated annealing, and (ii) it introduces a new approach based on the combination of k-means++ and SVM at each step of the annealing process. More precisely, k-means++ is initially applied to extract subsets of the data points. Then, an unsupervised SVM is applied to improve the clustering results. Experimental results on various benchmark data sets (of up to over a million points) give evidence that the proposed algorithm is more effective at solving the clustering problem than a number of popular clustering algorithms.
Optimization based clustering algorithms for authorship analysis of phishing emails
- Authors: Seifollahi, Sattar , Bagirov, Adil , Layton, Robert , Gondal, Iqbal
- Date: 2017
- Type: Text , Journal article
- Relation: Neural Processing Letters Vol. 46, no. 2 (2017), p. 411-425
- Relation: http://purl.org/au-research/grants/arc/DP140103213
- Full Text: false
- Reviewed:
- Description: Phishing has given attackers power to masquerade as legitimate users of organizations, such as banks, to scam money and private information from victims. Phishing is so widespread that combating the phishing attacks could overwhelm the victim organization. It is important to group the phishing attacks to formulate effective defence mechanism. In this paper, we use clustering methods to analyze and characterize phishing emails and perform their relative attribution. Emails are first tokenized to a bag-of-word space and, then, transformed to a numeric vector space using frequencies of words in documents. Wordnet vocabulary is used to take effects of similar words into account and to reduce sparsity. The word similarity measure is combined with the term frequencies to introduce a novel text transformation into numeric features. To improve the accuracy, we apply inverse document frequency weighting, which gives higher weights to features used by fewer authors. The k-means and recently introduced three optimization based algorithms: MS-MGKM, INCA and DCClust are applied for clustering purposes. The optimization based algorithms indicate the existence of well separated clusters in the phishing emails dataset. © 2017, Springer Science+Business Media New York.
SMGKM : an efficient incremental algorithm for clustering document collections
- Authors: Bagirov, Adil , Seifollahi, Sattar , Piccardi, Massimo , Zare Borzeshi, Ehsan , Kruger, Bernie
- Date: 2023
- Type: Text , Conference paper
- Relation: 19th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2018, Hanoi, Vietnam, 18-24 March 2018, Computational Linguistics and Intelligent Text Processing Vol. 13397 LNCS, p. 314-328
- Full Text: false
- Reviewed:
- Description: Given a large unlabeled document collection, the aim of this paper is to develop an accurate and efficient algorithm for solving the clustering problem over this collection. Document collections typically contain tens or hundreds of thousands of documents, with thousands or tens of thousands of features (i.e., distinct words). Most existing clustering algorithms struggle to find accurate solutions on such large data sets. The proposed algorithm overcomes this difficulty by an incremental approach, incrementing the number of clusters progressively from an initial value of one to a set value. At each iteration, the new candidate cluster is initialized using a partitioning approach which is guaranteed to minimize the objective function. Experiments have been carried out over six, diverse datasets and with different evaluation criteria, showing that the proposed algorithm has outperformed comparable state-of-the-art clustering algorithms in all cases. © 2023, Springer Nature Switzerland AG.