Internet security applications of the Munn rings
- Authors: Kelarev, Andrei , Yearwood, John , Watters, Paul , Wu, Xinwen , Abawajy, Jemal , Pan, L.
- Date: 2010
- Type: Text , Journal article
- Relation: Semigroup Forum Vol. 81, no. 1 (2010), p. 162-171
- Full Text:
- Reviewed:
- Description: Effective multiple clustering systems, or clusterers, have important applications in information security. The aim of the present article is to introduce a new method of designing multiple clusterers based on the Munn rings and describe a class of optimal clusterers which can be obtained in this construction.
Predicting cardiac autonomic neuropathy category for diabetic data with missing values
- Authors: Abawajy, Jemal , Kelarev, Andrei , Chowdhury, Morshed , Stranieri, Andrew , Jelinek, Herbert
- Date: 2013
- Type: Text , Journal article
- Relation: Computers in Biology and Medicine Vol. 43, no. 10 (2013), p. 1328-1333
- Full Text:
- Reviewed:
- Description: Cardiovascular autonomic neuropathy (CAN) is a serious and well known complication of diabetes. Previous articles circumvented the problem of missing values in CAN data by deleting all records and fields with missing values and applying classifiers trained on different sets of features that were complete. Most of them also added alternative features to compensate for the deleted ones. Here we introduce and investigate a new method for classifying CAN data with missing values. In contrast to all previous papers, our new method does not delete attributes with missing values, does not use classifiers, and does not add features. Instead it is based on regression and meta-regression combined with the Ewing formula for identifying the classes of CAN. This is the first article using the Ewing formula and regression to classify CAN. We carried out extensive experiments to determine the best combination of regression and meta-regression techniques for classifying CAN data with missing values. The best outcomes have been obtained by the additive regression meta-learner based on M5Rules and combined with the Ewing formula. It has achieved the best accuracy of 99.78% for two classes of CAN, and 98.98% for three classes of CAN. These outcomes are substantially better than previous results obtained in the literature by deleting all missing attributes and applying traditional classifiers to different sets of features without regression. Another advantage of our method is that it does not require practitioners to perform more tests collecting additional alternative features. © 2013 Elsevier Ltd.
Performance evaluation of multi-tier ensemble classifiers for phishing websites
- Authors: Abawajy, Jemal , Beliakov, Gleb , Kelarev, Andrei , Yearwood, John
- Date: 2012
- Type: Text , Conference proceedings
- Full Text:
- Description: This article is devoted to large multi-tier ensemble classifiers generated as ensembles of ensembles and applied to phishing websites. Our new ensemble construction is a special case of the general and productive multi-tier approach well known in information security. Many efficient multi-tier classifiers have been considered in the literature. Our new contribution is in generating new large systems as ensembles of ensembles by linking a top-tier ensemble to another middle-tier ensemble instead of a base classifier so that the top-tier ensemble can generate the whole system. This automatic generation capability includes many large ensemble classifiers in two tiers simultaneously and automatically combines them into one hierarchical unified system so that one ensemble is an integral part of another one. This new construction makes it easy to set up and run such large systems. The present article concentrates on the investigation of performance of these new multi-tier ensembles for the example of detection of phishing websites. We carried out systematic experiments evaluating several essential ensemble techniques as well as more recent approaches and studying their performance as parts of multi-level ensembles with three tiers. The results presented here demonstrate that new three-tier ensemble classifiers performed better than the base classifiers and standard ensembles included in the system. This example of application to the classification of phishing websites shows that the new method of combining diverse ensemble techniques into a unified hierarchical three-tier ensemble can be applied to increase the performance of classifiers in situations where data can be processed on a large computer.
An approach for Ewing test selection to support the clinical assessment of cardiac autonomic neuropathy
- Authors: Stranieri, Andrew , Abawajy, Jemal , Kelarev, Andrei , Huda, Shamsul , Chowdhury, Morshed , Jelinek, Herbert
- Date: 2013
- Type: Text , Journal article
- Relation: Artificial Intelligence in Medicine Vol. 58, no. 3 (2013), p. 185-193
- Full Text:
- Reviewed:
- Description: Objective: This article addresses the problem of determining optimal sequences of tests for the clinical assessment of cardiac autonomic neuropathy (CAN). We investigate the accuracy of using only one of the recommended Ewing tests to classify CAN and the additional accuracy obtained by adding the remaining tests of the Ewing battery. This is important as not all five Ewing tests can always be applied in each situation in practice. Methods and material: We used a new and unique database of the diabetes screening research initiative project, which is more than ten times larger than the data set used by Ewing in his original investigation of CAN. We utilized decision trees and the optimal decision path finder (ODPF) procedure for identifying optimal sequences of tests. Results: We present experimental results on the accuracy of using each one of the recommended Ewing tests to classify CAN and the additional accuracy that can be achieved by adding the remaining tests of the Ewing battery. We found the best sequences of tests for the cost-function equal to the number of tests. The accuracies achieved by the initial segments of the optimal sequences for 2, 3 and 4 categories of CAN are, respectively, 80.80, 91.33, 93.97 and 94.14; 79.86, 89.29, 91.16 and 91.76; and 78.90, 86.21, 88.15 and 88.93. They show significant improvement compared to the sequence considered previously in the literature and the mathematical expectations of the accuracies of a random sequence of tests. The complete outcomes obtained for all subsets of the Ewing features are required for determining optimal sequences of tests for any cost-function with the use of the ODPF procedure. We have also found two most significant additional features that can increase the accuracy when some of the Ewing attributes cannot be obtained. Conclusions: The outcomes obtained can be used to determine the optimal sequences of tests for each individual cost-function by following the ODPF procedure. The results show that the best single Ewing test for diagnosing CAN is the deep breathing heart rate variation test. Optimal sequences found for the cost-function equal to the number of tests guarantee that the best accuracy is achieved after any number of tests and provide an improvement in comparison with the previous ordering of tests or a random sequence. © 2013 Elsevier B.V.
Empirical investigation of multi-tier ensembles for the detection of cardiac autonomic neuropathy using subsets of the Ewing features
- Authors: Abawajy, Jemal , Kelarev, Andrei , Stranieri, Andrew , Jelinek, Herbert
- Date: 2012
- Type: Text , Conference proceedings
- Full Text:
- Description: This article is devoted to an empirical investigation of the performance of several new large multi-tier ensembles for the detection of cardiac autonomic neuropathy (CAN) in diabetes patients using subsets of the Ewing features. We used new data collected by the diabetes screening research initiative (DiScRi) project, which is more than ten times larger than the data set originally used by Ewing in the investigation of CAN. The results show that the new multi-tier ensembles achieved better performance than the outcomes previously published in the literature. The best accuracy for the detection of CAN, 97.74%, was achieved by the novel multi-tier combination of AdaBoost and Bagging, where AdaBoost is used at the top tier and Bagging is used at the middle tier, for the set consisting of the following four Ewing features: the deep breathing heart rate change, the Valsalva manoeuvre heart rate change, the hand grip blood pressure change and the lying to standing blood pressure change.
Optimization and matrix constructions for classification of data
- Authors: Kelarev, Andrei , Yearwood, John , Vamplew, Peter , Abawajy, Jemal , Chowdhury, Morshed
- Date: 2011
- Type: Journal article
- Relation: New Zealand Journal of Mathematics Vol. 41, no. 2011 (2011), p. 65-73
- Full Text:
- Reviewed:
- Description: Max-plus algebras and more general semirings have many useful applications and have been actively investigated. On the other hand, structural matrix rings are also well known and have been considered by many authors. The main theorem of this article completely describes all optimal ideals in the more general structural matrix semirings. Originally, our investigation of these ideals was motivated by applications in data mining for the design of multiple classification systems combining several individual classifiers.
Classification systems based on combinatorial semigroups
- Authors: Abawajy, Jemal , Kelarev, Andrei
- Date: 2013
- Type: Text , Journal article
- Relation: Semigroup Forum Vol. 86, no. 3 (2013), p. 603-612
- Full Text:
- Reviewed:
- Description: The present article continues the investigation of constructions essential for applications of combinatorial semigroups to the design of multiple classification systems in data mining. Our main theorem gives a complete description of all optimal classification systems defined by one-sided ideals in a construction based on combinatorial Rees matrix semigroups. It strengthens and generalizes previous results, which handled the narrower case of two-sided ideals. © 2012 Springer Science+Business Media New York.