Empirical investigation of consensus clustering for large ECG data sets
- Authors: Kelarev, Andrei , Stranieri, Andrew , Yearwood, John , Jelinek, Herbert
- Date: 2012
- Type: Text , Conference proceedings
- Full Text: false
- Description: This article investigates a novel machine learning approach applying consensus clustering in conjunction with classification for the data mining of very large and highly dimensional ECG data sets. To obtain robust and stable clusterings, consensus functions can be applied for clustering ensembles combining a multitude of independent initial clusterings. Direct applications of consensus functions to highly dimensional ECG data sets remain computationally expensive and impracticable. We introduce a multistage scheme including various procedures for dimensionality reduction, consensus clustering of randomized samples, followed by the use of a fast supervised classification algorithm. Applying the Hybrid Bipartite Graph Formulation combined with rank ordering and SMO we obtained an area under the receiver operating curve of 0.987. The performance of the classification algorithm at the final stage is crucial for the effectiveness of this technique. It can be regarded as an indication of the reliability, quality and stability of the combined consensus clustering. © 2012 IEEE.
Empirical study of decision trees and ensemble classifiers for monitoring of diabetes patients in pervasive healthcare
- Authors: Kelarev, Andrei , Stranieri, Andrew , Yearwood, John , Jelinek, Herbert
- Date: 2012
- Type: Text , Conference proceedings
- Full Text: false
- Description: Diabetes is a condition requiring continuous everyday monitoring of health related tests. To monitor specific clinical complications one has to find a small set of features to be collected from the sensors and efficient resource-aware algorithms for their processing. This article is concerned with the detection and monitoring of cardiovascular autonomic neuropathy, CAN, in diabetes patients. Using a small set of features identified previously, we carry out an empirical investigation and comparison of several ensemble methods based on decision trees for a novel application of the processing of sensor data from diabetes patients for pervasive health monitoring of CAN. Our experiments relied on an extensive database collected by the Diabetes Complications Screening Research Initiative at Charles Sturt University and concentrated on the particular task of the detection and monitoring of cardiovascular autonomic neuropathy. Most of the features in the database can now be collected using wearable sensors. Our experiments included several essential ensemble methods, a few more advanced and recent techniques, and a novel consensus function. The results show that our novel application of the decision trees in ensemble classifiers for the detection and monitoring of CAN in diabetes patients achieved better performance parameters compared with the outcomes obtained previously in the literature. © 2012 IEEE.
- Description: 2003009675
Integrating online social networks with e-Commerce : A CBR approach
- Authors: Sun, Zhaohao , Firmin, Sally , Yearwood, John
- Date: 2012
- Type: Text , Conference proceedings
- Full Text:
- Description: Integrating online social networks (OSN) with e-commerce is a part of Enterprise 2.0 and social media and is of significance for development of e-commerce and online social networking services. However, how to integrate online social networks including Facebook with e-commerce is still a big issue for companies. Case based reasoning (CBR) has a number of successful applications in e-commerce and web services. This article examines how to integrate OSN with e-commerce, how to integrate CBR with e-commerce and how to integrate CBR with OSN. This article also proposes a CBR architecture for integrating online social networks with e-commerce using CBR as an intelligent intermediary. One of the research findings indicates that the principle of CBR is a useful marketing strategy for integrating e-commerce and OSN. The approach proposed in this research will facilitate the development of e-commerce, Enterprise 3.0 and online social networking services. Sun, Firmin, & Yearwood © 2012.
- Description: 2003010901
Performance evaluation of multi-tier ensemble classifiers for phishing websites
- Authors: Abawajy, Jemal , Beliakov, Gleb , Kelarev, Andrei , Yearwood, John
- Date: 2012
- Type: Text , Conference proceedings
- Full Text:
- Description: This article is devoted to large multi-tier ensemble classifiers generated as ensembles of ensembles and applied to phishing websites. Our new ensemble construction is a special case of the general and productive multi-tier approach well known in information security. Many efficient multi-tier classifiers have been considered in the literature. Our new contribution is in generating new large systems as ensembles of ensembles by linking a top-tier ensemble to another middletier ensemble instead of a base classifier so that the toptier ensemble can generate the whole system. This automatic generation capability includes many large ensemble classifiers in two tiers simultaneously and automatically combines them into one hierarchical unified system so that one ensemble is an integral part of another one. This new construction makes it easy to set up and run such large systems. The present article concentrates on the investigation of performance of these new multi-tier ensembles for the example of detection of phishing websites. We carried out systematic experiments evaluating several essential ensemble techniques as well as more recent approaches and studying their performance as parts of multi-level ensembles with three tiers. The results presented here demonstrate that new three-tier ensemble classifiers performed better than the base classifiers and standard ensembles included in the system. This example of application to the classification of phishing websites shows that the new method of combining diverse ensemble techniques into a unified hierarchical three-tier ensemble can be applied to increase the performance of classifiers in situations where data can be processed on a large computer.
A novel hybrid neural learning algorithm using simulated annealing and quasisecant method
- Authors: Yearwood, John , Bagirov, Adil , Seifollahi, Sattar
- Date: 2011
- Type: Text , Conference proceedings
- Full Text: false
- Description: In this paper, we propose a hybrid learning algorithm for the single hidden layer feedforward neural networks (SLFNs) for data classification. The proposed hybrid algorithm is a two-phase learning algorithm and is based on the quasisecant and the simulated annealing methods. First, the weights between the hidden layer and the output layer nodes (output layer weights) are adjusted by the quasisecant algorithm. Then the simulated annealing is applied for global attribute weighting. The weights between the input layer and the hidden layer nodes are fixed in advance and are not included in the learning process. The proposed two-phase learning of the network is a novel idea and is different from that of the existing ones. The numerical results on some benchmark data sets are also reported and these results are promising. © 2011, Australian Computer Society, Inc.
- Description: 2003009507
Child face detection using age specific luminance invariant geometric descriptor
- Authors: Islam, Mofakharul , Watters, Paul , Yearwood, John
- Date: 2011
- Type: Text , Conference proceedings
- Full Text: false
- Description: While considerable research have been conducted on age-wise age estimation using skin detection most often with other visual cues, relatively little research has looked closely at the subject. In this paper, we present a new framework for interpreting facial image patterns that can be employed in categorical age estimation. The aim is to propose a novel approach to investigate and implement a child face detection technique that is able to estimate age categorically adult or child based on a new hybrid feature descriptor. The novel hybrid feature descriptor LIGD (the luminance invariant geometric descriptor) is composed of some low and high level features, which are found to be effective in characterizing the local appearance. In local appearance estimation, chromaticity, texture, and positional information of few facial visual cues can be employed simultaneously. Compared to the results published in a recent work, our proposed approach yields the highest precision and recall, and overall accuracy in recognition. © 2011 IEEE.
Reinforcement learning approach to AIBO robot's decision making process in Robosoccer's goal keeper problem
- Authors: Mukherjee, Subhasis , Yearwood, John , Vamplew, Peter , Huda, Shamsul
- Date: 2011
- Type: Text , Conference proceedings
- Full Text: false
- Description: Robocup is a popular test bed for AI programs around the world. Robosoccer is one of the two major parts of Robocup, in which AIBO entertainment robots take part in the middle sized soccer event. The three key challenges that robots need to face in this event are manoeuvrability, image recognition and decision making skills. This paper focuses on the decision making problem in Robosoccer - The goal keeper problem. We investigate whether reinforcement learning (RL) as a form of semi-supervised learning can effectively contribute to the goal keeper's decision making process when penalty shot and two attacker problem are considered. Currently, the decision making process in Robosoccer is carried out using rule-base system. RL also is used for quadruped locomotion and navigation purpose in Robosoccer using AIBO. In this paper, we propose a reinforcement learning based approach that uses a dynamic state-action mapping using back propagation of reward and space quantized Q-learning (SQQL) for the choice of high level functions in order to save the goal. The novelty of our approach is that the agent learns while playing and can take independent decision which overcomes the limitations of rule-base system due to fixed and limited predefined decision rules. Performance of the proposed method has been verified against the bench mark data set made with Upenn'03 code logic. It was found that the efficiency of our SQQL approach in goalkeeping was better than the rule based approach. The SQQL develops a semi-supervised learning process over the rule-base system's input-output mapping process, given in the Upenn'03 code. © 2011 IEEE.
Adaptive clustering with feature ranking for DDoS attacks detection
- Authors: Zi, Lifang , Yearwood, John , Wu, Xin
- Date: 2010
- Type: Text , Conference proceedings
- Full Text:
- Description: Distributed Denial of Service (DDoS) attacks pose an increasing threat to the current internet. The detection of such attacks plays an important role in maintaining the security of networks. In this paper, we propose a novel adaptive clustering method combined with feature ranking for DDoS attacks detection. First, based on the analysis of network traffic, preliminary variables are selected. Second, the Modified Global K-means algorithm (MGKM) is used as the basic incremental clustering algorithm to identify the cluster structure of the target data. Third, the linear correlation coefficient is used for feature ranking. Lastly, the feature ranking result is used to inform and recalculate the clusters. This adaptive process can make worthwhile adjustments to the working feature vector according to different patterns of DDoS attacks, and can improve the quality of the clusters and the effectiveness of the clustering algorithm. The experimental results demonstrate that our method is effective and adaptive in detecting the separate phases of DDoS attacks. © 2010 IEEE.
An application of consensus clustering for DDoS attacks detection
- Authors: Zi, Lifang , Yearwood, John , Kelarev, Andrei
- Date: 2010
- Type: Text , Conference proceedings
- Full Text:
- Description: The detection of Distributed Denial of Service (DDos) attacks is very important for maintaining the security of networks and the Internet. This paper introduces a novel iterative consensus process based on Hybrid Bipartite Graph Formulation (HGBF) consensus function for DDos attacks detection. First, the features are extracted during feature extraction process based on the analysis of network traffic. Second, several clustering algorithms are applied in combination with the silhouette index to obtain a collection of independent initial clusterings. Third, the HGBF consensus function and silhouette index are used to find an appropriate consensus clustering of the initial clusterings. Fourth, this new consensus clustering is added to the pool of initial clusterings replacing another clustering with the worst Silhouette index. Fifth, the process continues iteratively until the Silhouette index of the resulting consensus clusterings stabilizes. This iterative consensus clustering process can improve the quality of the clusters. The experimental results demonstrate that our iterative consensus process is effective and can be used in practice for detecting the separate phased of DDos attacks.
Cluster based rule discovery model for enhancement of government's tobacco control strategy
- Authors: Huda, Shamsul , Yearwood, John , Borland, Ron
- Date: 2010
- Type: Text , Conference proceedings
- Full Text:
- Description: Discovery of interesting rules describing the behavioural patterns of smokers' quitting intentions is an important task in the determination of an effective tobacco control strategy. In this paper, we investigate a compact and simplified rule discovery process for predicting smokers' quitting behaviour that can provide feedback to build an scientific evidence-based adaptive tobacco control policy. Standard decision tree (SDT) based rule discovery depends on decision boundaries in the feature space which are orthogonal to the axis of the feature of a particular decision node. This may limit the ability of SDT to learn intermediate concepts for high dimensional large datasets such as tobacco control. In this paper, we propose a cluster based rule discovery model (CRDM) for generation of more compact and simplified rules for the enhancement of tobacco control policy. The clusterbased approach builds conceptual groups from which a set of decision trees (a decision forest) are constructed. Experimental results on the tobacco control data set show that decision rules from the decision forest constructed by CRDM are simpler and can predict smokers' quitting intention more accurately than a single decision tree. © 2010 IEEE.
Hybrid wrapper-filter approaches for input feature selection using maximum relevance and Artificial Neural Network Input Gain Measurement Approximation (ANNIGMA)
- Authors: Huda, Shamsul , Yearwood, John , Stranieri, Andrew
- Date: 2010
- Type: Text , Conference proceedings
- Full Text:
- Description: Feature selection is an important research problem in machine learning and data mining applications. This paper proposes a hybrid wrapper and filter feature selection algorithm by introducing the filter's feature ranking score in the wrapper stage to speed up the search process for wrapper and thereby finding a more compact feature subset. The approach hybridizes a Mutual Information (MI) based Maximum Relevance (MR) filter ranking heuristic with an Artificial Neural Network (ANN) based wrapper approach where Artificial Neural Network Input Gain Measurement Approximation (ANNIGMA) has been combined with MR (MR-ANNIGMA) to guide the search process in the wrapper. The novelty of our approach is that we use hybrid of wrapper and filter methods that combines filter's ranking score with the wrapper-heuristic's score to take advantages of both filter and wrapper heuristics. Performance of the proposed MRANNIGMA has been verified using bench mark data sets and compared to both independent filter and wrapper based approaches. Experimental results show that MR-ANNIGMA achieves more compact feature sets and higher accuracies than both filter and wrapper approaches alone. © 2010 IEEE.
Understanding victims of identity theft: Preliminary insights
- Authors: Turville, Kylie , Yearwood, John , Miller, Charlynn
- Date: 2010
- Type: Text , Conference proceedings
- Full Text:
- Description: Identity theft is not a new crime, however changes in society and the way that business is conducted have made it an easier, attractive and more lucrative crime. When a victim discovers the misuse of their identity they must then begin the process of recovery, including fixing any issues that may have been created by the misuse. For some victims this may only take a small amount of time and effort, however for others they may continue to experience issues for many years after the initial moment of discovery. To date, little research has been conducted within Australia or internationally regarding what a victim experiences as they work through the recovery process. This paper presents a summary of the identity theft domain with an emphasis on research conducted within Australia, and identifies a number of issues regarding research in this area. The paper also provides an overview of the research project currently being undertaken by the authors in obtaining an understanding of what victims of identity theft experience during the recovery process; particularly their experiences when dealing with organizations. Finally, it reports on some of the preliminary work that has already been conducted for the research project. © 2010 IEEE.
Explicit representations of reasoning to support deliberation within groups
- Authors: Stranieri, Andrew , Yearwood, John , Mays, Heather
- Date: 2008
- Type: Text , Conference proceedings
- Full Text: false
- Description: In practice, the reasoning that underpins problem solving and decision making is rarely performed by an individual in isolation from others but involves a communicative exchanges between participants in a community that can range in size from two to many thousands. Dialogue theories describe patterns in dialogues comprising many dialectical exchanges and often advance deliberation, the kind of dialogue that ensues when participants actively seek to understand all views and collectively arrive at the rationally optimal solution. This study reports on the use of argument maps for structuring reasoning by groups of secondary students. The study aimed to discover whether different maps facilitate deliberation and enhance understanding of the issues by providing an explicit representation of reasoning. An explicit representation of reasoning is a model that encapsulates all relevant claims, evidence, statutes and principles pertinent to an issue. Schemes that have been used to provide explicit representations of reasoning include the Issue Based Information System (IBIS) map, variants of the Toulmin argument structure (TAS) and other knowledge representation schemes used for intelligent computational systems. Results indicate that an explicit representation of reasoning facilitates a depth of understanding of complex issues and there is some indication that the deliberative quality of discussions is enhanced depending on the level of abstraction of the map. Copyright © 2008 COSI.
- Description: 2003006482
A semantic method to information extraction for decision support systems
- Authors: Ofoghi, Bahadorreza , Yearwood, John , Ghosh, Ranadhir
- Date: 2006
- Type: Text , Conference proceedings
- Full Text: false
- Description: In this paper, we describe a novel schema for a more semantic text mining process which results in more comprehensive decision making activity by decision support systems via providing more effective and accurate textual information. The utility of two semantic lexical resources; Frame Net and Word Net, in extracting required text snippets from unstructured free texts yields a better and more accurate information extraction process to deliver more precise information either to a DSS or to a decision maker. We explain how the usage of these lexical resources could elevate a focused text mining process which could be applied to an information provider system in a decision support paradigm. The preliminary results obtained after a starter experiment show that the hybrid information extraction schema performs well on some semantic failure situations.
- Description: 2003010644
Parallel selection of multi-category features for online handwritten character recognition
- Authors: Ghosh, Ranadhir , Ghosh, Moumita , Yearwood, John
- Date: 2005
- Type: Text , Conference proceedings
- Full Text: false
- Description: Online handwritten recognition is gaining more interest due to the increasing popularity of hand-held computers, digital notebooks and advanced cellular phones. The large number of writing styles and the variability between them makes the handwriting recognition problem a very challenging area for researchers. Many previous efforts have utilized many different approaches for recognition in online handwriting using various ANN classifier-modeling techniques. Different types of feature extraction techniques have also been used. It has been observed that, beyond a certain point, the inclusion of additional features leads to a worse rather than better performance. Moreover, the choice of features to represent the patterns affects several aspects of pattern recognition problems such as accuracy, required learning time and a necessary number of samples. A common problem with the multi-category feature classification is the conflict between the categories. None of the feasible solutions allow simultaneous optimal solution for all categories. In order to find an optimal solution the search space can be divided based on an individual category in each sub region and finally merging them through decision spport system. In this paper we propose a canonical GA based modular feature selection approach combined with standard MLP for multi category feature selection in online handwriting recognition.