List of Titles

Languages recognized by two-sided automata of graphs

Authors: Miller, Mirka , Kelarev, Andrei , Sokratova, Olga
Date: 2005
Type: Text , Journal article
Relation: Proceedings of the Estonian Academy of Sciences, Physics Mathematic Vol. 51, no. 1 (2005), p. 46-54
Full Text: false
Reviewed:
Description: We introduce two-sided automata defined by directed graphs and describe all languages recognized by these automata.
Description: C1
Description: 2003001399

A description of incidence rings of group automata

Authors: Kelarev, Andrei , Passman, D.S.
Date: 2008
Type: Text , Book chapter
Relation: Contemporary Mathematics : Noncommutative Rings, Group Rings, Diagram Algebras and Their Applications : International Conference December 18-22, 2006, University of Madras, Chennai, India Chapter p. 27-33
Full Text:
Description: Group automata occur in the Krohn-Rhodes Decomposition Theorem and have been extensively investigated in the literature. The incidence rings of group automata were introduced by the first author in analogy with group rings and incidence rings of graphs. The main theorem of the present paper gives a complete description of the structure of incidence rings of group automata in terms of matrix rings over group rings and their natural modules. As a consequence, when the ground ring is a field, we can use known group algebra results to determine when the incidence algebra is prime, semiprime, Artinian or semisimple. We also offer sufficient conditions for the algebra to be semiprimitive.
Description: 2003006588

An algorithm for BCH codes extended with finite state automata

Authors: Kelarev, Andrei
Date: 2008
Type: Text , Journal article
Relation: Fundamenta Informaticae Vol. 84, no. 1 (2008), p. 51-60
Full Text: false
Reviewed:
Description: This articles develops a combinatorial algorithm for a class of codes extending BCH codes and constructed with finite state automata. Ou algorithm computes the largest number of errors that the extended codes can correct, and finds a generator for each optimal code in this class of extensions. The question of finding codes with largest possible information rates remains open.
Description: C1

Experimental investigation of clasification algorithms for ITS dataset

Authors: Yearwood, John , Kang, Byeongho , Kelarev, Andrei
Date: 2008
Type: Text , Conference paper
Relation: PKAW-08, Pacific Rim Knowledge Acquisition Workshop 2008, as part of PRICAI 2008, Tenth Pacific Rim p. 262-272
Full Text: false
Reviewed:
Description: This article is devoted to experimental investigation of classification algorithms for analysis of ITS dataset. We introduce and consider a novel k-committees alogorithm for classification and compare it with the discrete k- means and nearest neighbour algorithms. The ITS dataset consists of nuclear ribosomal DNA sequences, where rather sophisticated alignment scores have to be used as a measure of distance. These scores do not form Minkowski metric and the sequences cannot be regarded as points in a finite dimensional space. This is why it is necessary to develop novel algorithms and adjust familiar ones. We present the results of experiments comparing the efficiency of three classification methods in their ability to achieve agreement with classes published in the biological literature before. It turns out that our algorithms are efficient and can be used to obtain biologically significant classifications. A simplified version of a synthetic dataset, where the k-committees classifier out performs k-means and Nearest Neighbour classifiers, is also presented.
Description: E1

A formula for multiple classifiers in data mining based on Brandt semigroups

Authors: Kelarev, Andrei , Yearwood, John , Mammadov, Musa
Date: 2009
Type: Text , Journal article
Relation: Semigroup Forum Vol. 78, no. 2 (2009), p. 293-309
Full Text:
Reviewed:
Description: A general approach to designing multiple classifiers represents them as a combination of several binary classifiers in order to enable correction of classification errors and increase reliability. This method is explained, for example, in Witten and Frank (Data Mining: Practical Machine Learning Tools and Techniques, 2005, Sect. 7.5). The aim of this paper is to investigate representations of this sort based on Brandt semigroups. We give a formula for the maximum number of errors of binary classifiers, which can be corrected by a multiple classifier of this type. Examples show that our formula does not carry over to larger classes of semigroups. © 2008 Springer Science+Business Media, LLC.

A polynomial ring construction for the classification of data

Authors: Kelarev, Andrei , Yearwood, John , Vamplew, Peter
Date: 2009
Type: Text , Journal article
Relation: Bulletin of the Australian Mathematical Society Vol. 79, no. 2 (2009), p. 213-225
Full Text:
Reviewed:
Description: Drensky and Lakatos (Lecture Notes in Computer Science, 357 (Springer, Berlin, 1989), pp. 181-188) have established a convenient property of certain ideals in polynomial quotient rings, which can now be used to determine error-correcting capabilities of combined multiple classifiers following a standard approach explained in the well-known monograph by Witten and Frank (Data Mining: Practical Machine Learning Tools and Techniques (Elsevier, Amsterdam, 2005)). We strengthen and generalise the result of Drensky and Lakatos by demonstrating that the corresponding nice property remains valid in a much larger variety of constructions and applies to more general types of ideals. Examples show that our theorems do not extend to larger classes of ring constructions and cannot be simplified or generalised.

An algorithm for the optimization of multiple classifers in data mining based on graphs

Authors: Kelarev, Andrei , Ryan, Joe , Yearwood, John
Date: 2009
Type: Text , Journal article
Relation: The Journal of Combinatorial Mathematics and Combinatorial Computing Vol. 71, no. (2009), p. 65-85
Full Text: false
Reviewed:
Description: This article develops an efficient combinatorial algorithm based on labeled directed graphs and motivated by applications in data mining for designing multiple classifiers. Our method originates from the standard approach described in [37]. It defines a representation of a multiclass classifier in terms of several binary classifiers. We are using labeled graphs to introduce additional structure on the classifier. Representations of this sort are known to have serious advantages. An important property of these representations is their ability to correct errors of individual binary classifiers and produce correct combined output. For every representation like this we develop a combinatorial algorithm with quadratic running time to compute the largest number of errors of individual binary classifiers which can be corrected by the combined multiple classifier. In addition, we consider the question of optimizing the classifiers of this type and find all optimal representations for these multiple classifiers.
Description: 2003007563

Applying clustering and ensemble clustering approaches to phishing profiling

Authors: Webb, Dean , Yearwood, John , Vamplew, Peter , Ma, Liping , Ofoghi, Bahadorreza , Kelarev, Andrei
Date: 2009
Type: Text , Conference paper
Relation: Paper presented at Eighth Australasian Data Mining Conference, AusDM 2009, University of Melbourne, Melbourne, Victoria : 1st–4th December 2009
Full Text:
Description: 2003007911

Cayley graphs as classifiers for data mining : The influence of asymmetries

Authors: Kelarev, Andrei , Ryan, Joe , Yearwood, John
Date: 2009
Type: Text , Journal article
Relation: Discrete Mathematics Vol. 309, no. 17 (2009), p. 5360-5369
Relation: http://purl.org/au-research/grants/arc/DP0211866
Full Text:
Reviewed:
Description: The endomorphism monoids of graphs have been actively investigated. They are convenient tools expressing asymmetries of the graphs. One of the most important classes of graphs considered in this framework is that of Cayley graphs. Our paper proposes a new method of using Cayley graphs for classification of data. We give a survey of recent results devoted to the Cayley graphs also involving their endomorphism monoids. Â© 2008 Elsevier B.V. All rights reserved.

Constructing stochastic mixture policies for episodic multiobjective reinforcement learning tasks

Authors: Vamplew, Peter , Dazeley, Richard , Barker, Ewan , Kelarev, Andrei
Date: 2009
Type: Text , Book chapter
Relation: AI 2009 : Advances in Artificial Intelligence : 22nd Australasian Joint Conference, Melbourne, Australia, December 1-4, 2009. Proceedings Chapter p. 340-349
Full Text:
Description: Multiobjective reinforcement learning algorithms extend reinforcement learning techniques to problems with multiple conflicting objectives. This paper discusses the advantages gained from applying stochastic policies to multiobjective tasks and examines a particular form of stochastic policy known as a mixture policy. Two methods are proposed for deriving mixture policies for episodic multiobjective tasks from deterministic base policies found via scalarised reinforcement learning. It is shown that these approaches are an efficient means of identifying solutions which offer a superior match to the user’s preferences than can be achieved by methods based strictly on deterministic policies.
Description: 2003007906

Experimental investigation of three machine learning algorithms for ITS dataset

Authors: Yearwood, John , Kang, Byeongho , Kelarev, Andrei
Date: 2009
Type: Text , Conference paper
Relation: Paper presented at First International Conference, FGIT 2009, Future Generation Information Technology, Jeju Island, Korea : 10th-12th December 2009 Vol. 5899, p. 308-316
Full Text:
Description: The present article is devoted to experimental investigation of the performance of three machine learning algorithms for ITS dataset in their ability to achieve agreement with classes published in the biologi cal literature before. The ITS dataset consists of nuclear ribosomal DNA sequences, where rather sophisticated alignment scores have to be used as a measure of distance. These scores do not form a Minkowski metric and the sequences cannot be regarded as points in a finite dimensional space. This is why it is necessary to develop novel machine learning ap proaches to the analysis of datasets of this sort. This paper introduces a k-committees classifier and compares it with the discrete k-means and Nearest Neighbour classifiers. It turns out that all three machine learning algorithms are efficient and can be used to automate future biologically significant classifications for datasets of this kind. A simplified version of a synthetic dataset, where the k-committees classifier outperforms k-means and Nearest Neighbour classifiers, is also presented.
Description: 2003007844

Optimization methods and the k-committees algorithm for clustering of sequence data

Authors: Yearwood, John , Bagirov, Adil , Kelarev, Andrei
Date: 2009
Type: Text , Journal article
Relation: Applied and Computational Mathematics Vol. 8, no. 1 (2009), p. 92-101
Relation: http://purl.org/au-research/grants/arc/DP0211866
Relation: http://purl.org/au-research/grants/arc/DP0666061
Full Text: false
Description: The present paper is devoted to new algorithms for unsupervised clustering based on the optimization approaches due to [2], [3] and [4]. We consider a novel situation, where the datasets consist of nucleotide or protein sequences and rather sophisticated biologically significant alignment scores have to be used as a measure of distance. Sequences of this kind cannot be regarded as points in a finite dimensional space. Besides, the alignment scores do not satisfy properties of Minkowski metrics. Nevertheless the optimization approaches have made it possible to introduce a new k-committees algorithm and compare its performance with previous algorithms for two datasets. Our experimental results show that the k-committees algorithms achieves intermediate accuracy for a dataset of ITS sequences, and it can perform better than the discrete k-means and Nearest Neighbour algorithms for certain datasets. All three algorithms achieve good agreement with clusters published in the biological literature before and can be used to obtain biologically significant clusterings.

Optimization of multiple classifiers in data mining based on string rewriting systems

Authors: Dazeley, Richard , Kelarev, Andrei , Yearwood, John , Mammadov, Musa
Date: 2009
Type: Text , Journal article
Relation: Asian-European Journal of Mathematics Vol. 2, no. 1 (2009), p. 41-56
Relation: https://purl.org/au-research/grants/arc/DP0211866
Relation: https://purl.org/au-research/grants/arc/LP0669752
Full Text:
Description: Optimization of multiple classifiers is an important problem in data mining. We introduce additional structure on the class sets of the classifiers using string rewriting systems with a convenient matrix representation. The aim of the present paper is to develop an efficient algorithm for the optimization of the number of errors of individual classifiers, which can be corrected by these multiple classifiers.

Rees matrix constructions for clustering of data

Authors: Kelarev, Andrei , Watters, Paul , Yearwood, John
Date: 2009
Type: Journal article
Relation: Journal of the Australian Mathematical Society Vol. 87, no. 3 (2009), p. 377-393
Relation: http://purl.org/au-research/grants/arc/DP0211866
Full Text:
Reviewed:
Description: This paper continues the investigation of semigroup constructions motivated by applications in data mining. We give a complete description of the error-correcting capabilities of a large family of clusterers based on Rees matrix semigroups well known in semigroup theory. This result strengthens and complements previous formulas recently obtained in the literature. Examples show that our theorems do not generalize to other classes of semigroups.

An application of consensus clustering for DDoS attacks detection

Authors: Zi, Lifang , Yearwood, John , Kelarev, Andrei
Date: 2010
Type: Text , Conference proceedings
Full Text:
Description: The detection of Distributed Denial of Service (DDos) attacks is very important for maintaining the security of networks and the Internet. This paper introduces a novel iterative consensus process based on Hybrid Bipartite Graph Formulation (HGBF) consensus function for DDos attacks detection. First, the features are extracted during feature extraction process based on the analysis of network traffic. Second, several clustering algorithms are applied in combination with the silhouette index to obtain a collection of independent initial clusterings. Third, the HGBF consensus function and silhouette index are used to find an appropriate consensus clustering of the initial clusterings. Fourth, this new consensus clustering is added to the pool of initial clusterings replacing another clustering with the worst Silhouette index. Fifth, the process continues iteratively until the Silhouette index of the resulting consensus clusterings stabilizes. This iterative consensus clustering process can improve the quality of the clusters. The experimental results demonstrate that our iterative consensus process is effective and can be used in practice for detecting the separate phased of DDos attacks.

Consensus clustering and supervised classification for profiling phishing emails in internet commerce security

Authors: Dazeley, Richard , Yearwood, John , Kang, Byeongho , Kelarev, Andrei
Date: 2010
Type: Text , Conference paper
Relation: Paper presented at 11th International Workshop on Knowledge Management and Acquisition for Smart Systems and Services, PKAW 2010 Vol. 6232 LNAI, p. 235-246
Full Text:
Reviewed:
Description: This article investigates internet commerce security applications of a novel combined method, which uses unsupervised consensus clustering algorithms in combination with supervised classification methods. First, a variety of independent clustering algorithms are applied to a randomized sample of data. Second, several consensus functions and sophisticated algorithms are used to combine these independent clusterings into one final consensus clustering. Third, the consensus clustering of the randomized sample is used as a training set to train several fast supervised classification algorithms. Finally, these fast classification algorithms are used to classify the whole large data set. One of the advantages of this approach is in its ability to facilitate the inclusion of contributions from domain experts in order to adjust the training set created by consensus clustering. We apply this approach to profiling phishing emails selected from a very large data set supplied by the industry partners of the Centre for Informatics and Applied Optimization. Our experiments compare the performance of several classification algorithms incorporated in this scheme. © 2010 Springer-Verlag Berlin Heidelberg.

Internet security applications of Grobner-Shirvov bases

Authors: Kelarev, Andrei , Yearwood, John , Watters, Paul
Date: 2010
Type: Text , Journal article
Relation: Asian-European Journal of Mathematics Vol. 3, no. 3 (2010), p. 435-442
Relation: http://purl.org/au-research/grants/arc/DP0211866
Full Text: false
Reviewed:

Internet security applications of the Munn rings

Authors: Kelarev, Andrei , Yearwood, John , Watters, Paul , Wu, Xinwen , Abawajy, Jemal , Pan, L.
Date: 2010
Type: Text , Journal article
Relation: Semigroup Forum Vol. 81, no. 1 (2010), p. 162-171
Full Text:
Reviewed:
Description: Effective multiple clustering systems, or clusterers, have important applications in information security. The aim of the present article is to introduce a new method of designing multiple clusterers based on the Munn rings and describe a class of optimal clusterers which can be obtained in this construction.

A Grobner-Shirshov Algorithm for Applications in Internet Security

Authors: Kelarev, Andrei , Yearwood, John , Watters, Paul , Wu, Xinwen , Ma, Liping , Abawajy, Jemal , Pan, L.
Date: 2011
Type: Text , Journal article
Relation: Southeast Asian Bulletin of Mathematics Vol. 35, no. (2011), p. 807-820
Full Text: false
Reviewed:
Description: The design of multiple classication and clustering systems for the detection of malware is an important problem in internet security. Grobner-Shirshov bases have been used recently by Dazeley et al. [15] to develop an algorithm for constructions with certain restrictions on the sandwich-matrices. We develop a new Grobner-Shirshov algorithm which applies to a larger variety of constructions based on combinatorial Rees matrix semigroups without any restrictions on the sandwich-matrices.

An application of novel clustering technique for information security

Authors: Beliakov, Gleb , Yearwood, John , Kelarev, Andrei
Date: 2011
Type: Text , Conference paper
Relation: Applications and Techniques in Information Security Workshop p. 5-11
Full Text: false
Reviewed:
Description: This article presents experimental results devoted to a new application of the novel clustering technique introduced by the authors recently. Our aim is to facilitate the application of robust and stable consensus functions in information security, where it is often necessary to process large data sets and monitor outcomes in real time, as it is required, for example, for intrusion detection. Here we concentrate on the particular case of application to profiling of phishing websites. First, we apply several independent clustering algorithms to a randomized sample of data to obtain independent initial clusterings. Silhouette index is used to determine the number of clusters. Second, we use a consensus function to combine these independent clusterings into one consensus clustering . Feature ranking is used to select a subset of features for the consensus function. Third, we train fast supervised classification algorithms on the resulting consensus clustering in order to enable them to process the whole large data set as well as new data. The precision and recall of classifiers at the final stage of this scheme are critical for effectiveness of the whole procedure. We investigated various combinations of three consensus functions, Cluster-Based Graph Formulation (CBGF), Hybrid Bipartite Graph Formulation (HBGF), and Instance-Based Graph Formulation (IBGF) and a variety of supervised classification algorithms. The best precision and recall have been obtained by the combination of the HBGF consensus function and the SMO classifier with the polynomial kernel.
Description: 2003009195

Showing items 1 - 20 of 45

A description of incidence rings of group automata

A formula for multiple classifiers in data mining based on Brandt semigroups

A polynomial ring construction for the classification of data

Applying clustering and ensemble clustering approaches to phishing profiling

Cayley graphs as classifiers for data mining : The influence of asymmetries

Constructing stochastic mixture policies for episodic multiobjective reinforcement learning tasks

Experimental investigation of three machine learning algorithms for ITS dataset

Optimization of multiple classifiers in data mining based on string rewriting systems

Rees matrix constructions for clustering of data

An application of consensus clustering for DDoS attacks detection

Consensus clustering and supervised classification for profiling phishing emails in internet commerce security

Internet security applications of the Munn rings