Editorial
- Authors: Yearwood, John
- Date: 2010
- Type: Text , Journal article
- Relation: Journal of Research and Practice in Information Technology Vol. 42, no. 1 (2010), p. 1
- Full Text: false
- Reviewed:
Profiling phishing emails based on hyperlink information
- Authors: Yearwood, John , Mammadov, Musa , Banerjee, Arunava
- Date: 2010
- Type: Text , Conference paper
- Relation: Paper presented at 2010 International Conference on Advances in Social Network Analysis and Mining, ASONAM 2010, Odense : 9th-11th August 2010 p. 120-127
- Full Text:
- Description: In this paper, a novel method for profiling phishing activity from an analysis of phishing emails is proposed. Profiling is useful in determining the activity of an individual or a particular group of phishers. Work in the area of phishing is usually aimed at detection of phishing emails. In this paper, we concentrate on profiling as distinct from detection of phishing emails. We formulate the profiling problem as a multi-label classification problem using the hyperlinks in the phishing emails as features and structural properties of emails along with whois (i.e. DNS) information on hyperlinks as profile classes. Further, we generate profiles based on classifier predictions. Thus, classes become elements of profiles. We employ a boosting algorithm (AdaBoost) as well as SVM to generate multi-label class predictions on three different datasets created from hyperlink information in phishing emails. These predictions are further utilized to generate complete profiles of these emails. Results show that profiling can be done with quite high accuracy using hyperlink information. © 2010 Crown Copyright.
Scenario-based learning environments
- Authors: Yearwood, John , Stranieri, Andrew
- Date: 2004
- Type: Text , Conference paper
- Relation: Paper presented at the Narrative and Interactive Learning Environments conference, NILE 2004, Edinburgh, Scotland : 10th August, 2004
- Full Text: false
- Reviewed:
- Description: E1
- Description: 2003000832
A reasoning community perspective on deliberate democracy
- Authors: Yearwood, John , Stranieri, Andrew
- Date: 2011
- Type: Text , Book chapter
- Relation: Technologies for supporting reasoning communities and collaborative decision making: Cooperative approaches p.237-246
- Full Text: false
- Reviewed:
- Description: This chapter describes some of the current approaches to deliberative democracy and then considers them from the perspective of a reasoning community framework. This approach highlights important tasks, processes and structures that can be used to enhance the process of groups engaging in deliberative democracy approaches. In particular, it focuses attention on the potential for technologies to support groups in achieving broad, agreed, structured reasoning bases that capture the scope of an issue from multiple perspectives.
Machine learning algorithms for analysis of DNA data sets
- Authors: Yearwood, John , Bagirov, Adil , Kelarev, Andrei
- Date: 2012
- Type: Text , Book chapter
- Relation: Machine Learning Algorithms for Problem Solving in Computational Applications: Intelligent Techniques p. 47-58
- Relation: http://purl.org/au-research/grants/arc/LP0990908
- Full Text: false
- Reviewed:
- Description: The applications of machine learning algorithms to the analysis of data sets of DNA sequences are very important. The present chapter is devoted to the experimental investigation of applications of several machine learning algorithms for the analysis of a JLA data set consisting of DNA sequences derived from non-coding segments in the junction of the large single copy region and inverted repeat A of the chloroplast genome in Eucalyptus collected by Australian biologists. Data sets of this sort represent a new situation, where sophisticated alignment scores have to be used as a measure of similarity. The alignment scores do not satisfy properties of the Minkowski metric, and new machine learning approaches have to be investigated. The authors' experiments show that machine learning algorithms based on local alignment scores achieve very good agreement with known biological classes for this data set. A new machine learning algorithm based on graph partitioning performed best for clustering of the JLA data set. The authors' novel k-committees algorithm produced the most accurate results for classification. Two new examples of synthetic data sets demonstrate that the authors' k-committees algorithm can outperform both the Nearest Neighbour and k-medoids algorithms simultaneously.
Optimization methods and the k-committees algorithm for clustering of sequence data
- Authors: Yearwood, John , Bagirov, Adil , Kelarev, Andrei
- Date: 2009
- Type: Text , Journal article
- Relation: Applied and Computational Mathematics Vol. 8, no. 1 (2009), p. 92-101
- Relation: http://purl.org/au-research/grants/arc/DP0211866
- Relation: http://purl.org/au-research/grants/arc/DP0666061
- Full Text: false
- Description: The present paper is devoted to new algorithms for unsupervised clustering based on the optimization approaches due to [2], [3] and [4]. We consider a novel situation, where the datasets consist of nucleotide or protein sequences and rather sophisticated biologically significant alignment scores have to be used as a measure of distance. Sequences of this kind cannot be regarded as points in a finite dimensional space. Besides, the alignment scores do not satisfy properties of Minkowski metrics. Nevertheless, the optimization approaches have made it possible to introduce a new k-committees algorithm and compare its performance with previous algorithms for two datasets. Our experimental results show that the k-committees algorithm achieves intermediate accuracy for a dataset of ITS sequences, and it can perform better than the discrete k-means and Nearest Neighbour algorithms for certain datasets. All three algorithms achieve good agreement with clusters previously published in the biological literature and can be used to obtain biologically significant clusterings.
A web-based Narrative construction environment
- Authors: Yearwood, John , Stranieri, Andrew , Osman, Deanna
- Date: 2008
- Type: Text , Conference paper
- Relation: Paper presented at NILE 2008: 5th International Conference on Narrative and Interactive Learning Environments, Edinburgh, Scotland : 6th-8th August 2008 p. 78-81
- Full Text:
- Description: This paper describes a web-based environment for constructing narrative from story snippets contributed by a community of interest. The underlying model uses an argument based structure to infer the next event in the narrative sequence. The approach makes use of both events and higher level story elements derived from Polti’s dramatic situations. Dramatic situations used are consistent with a theme, and events are generally constrained by the dramatic situation. The narrative generated is a function of the event history, the dramatic situations chosen and the plausible inferences about next events that are contributed by a community of interest in the theme. At this stage, a player’s actions are simulated using a random selection from a set and the implementation of a nonsense filter. Example outputs from the system are provided and discussed.
- Description: 2003006499
Detection of child exploiting chats from a mixed chat dataset as a text classification task
- Authors: Yearwood, John , Miah, Md Waliur Rahman , Kulkarni, Siddhivinayak
- Date: 2011
- Type: Text , Conference paper
- Relation: Proceedings of Australasian Language Technology Association Workshop
- Full Text: false
- Reviewed:
- Description: There is a rapidly growing body of work in the use of Embodied Conversational Agents (ECA) to convey complex contextual relationships through verbal and non-verbal communication, in domains ranging from military C2 to e-learning. In these applications the subject matter expert is often naive to the technical requirements of ECAs. ENGAGE (the Extensible Natural Gesture Animation Generation Engine) is designed to automatically generate appropriate and 'realistic' animation for ECAs based on the content provided to them. It employs syntactic analysis of the surface text and uses predefined behaviours for the ECA. We discuss the design of this system, its current applications and plans for its future development.
A novel hybrid neural learning algorithm using simulated annealing and quasisecant method
- Authors: Yearwood, John , Bagirov, Adil , Seifollahi, Sattar
- Date: 2011
- Type: Text , Conference proceedings
- Full Text: false
- Description: In this paper, we propose a hybrid learning algorithm for the single hidden layer feedforward neural networks (SLFNs) for data classification. The proposed hybrid algorithm is a two-phase learning algorithm and is based on the quasisecant and the simulated annealing methods. First, the weights between the hidden layer and the output layer nodes (output layer weights) are adjusted by the quasisecant algorithm. Then the simulated annealing is applied for global attribute weighting. The weights between the input layer and the hidden layer nodes are fixed in advance and are not included in the learning process. The proposed two-phase learning of the network is a novel idea and is different from that of the existing ones. The numerical results on some benchmark data sets are also reported and these results are promising. © 2011, Australian Computer Society, Inc.
- Description: 2003009507
Experimental investigation of classification algorithms for ITS dataset
- Authors: Yearwood, John , Kang, Byeongho , Kelarev, Andrei
- Date: 2008
- Type: Text , Conference paper
- Relation: PKAW-08, Pacific Rim Knowledge Acquisition Workshop 2008, as part of PRICAI 2008, Tenth Pacific Rim p. 262-272
- Full Text: false
- Reviewed:
- Description: This article is devoted to the experimental investigation of classification algorithms for analysis of the ITS dataset. We introduce and consider a novel k-committees algorithm for classification and compare it with the discrete k-means and nearest neighbour algorithms. The ITS dataset consists of nuclear ribosomal DNA sequences, where rather sophisticated alignment scores have to be used as a measure of distance. These scores do not form a Minkowski metric and the sequences cannot be regarded as points in a finite dimensional space. This is why it is necessary to develop novel algorithms and adjust familiar ones. We present the results of experiments comparing the efficiency of three classification methods in their ability to achieve agreement with classes previously published in the biological literature. It turns out that our algorithms are efficient and can be used to obtain biologically significant classifications. A simplified version of a synthetic dataset, where the k-committees classifier outperforms the k-means and Nearest Neighbour classifiers, is also presented.
- Description: E1
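The abstract above notes that sequences compared by alignment scores cannot be treated as points in a finite dimensional space. A nearest neighbour classifier, one of the baselines it names, only needs a pairwise dissimilarity, so it still applies in that setting. The following is a minimal sketch under stated assumptions: plain edit distance is used as a toy stand-in for the alignment scores (which are not specified here), and the function names and sample sequences are hypothetical. It does not implement the k-committees algorithm.

```python
from typing import Callable, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance; a toy stand-in for an alignment-based score."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def nearest_neighbour(query: str,
                      labelled: List[Tuple[str, str]],
                      dissimilarity: Callable[[str, str], float] = edit_distance) -> str:
    """Return the label of the labelled sequence least dissimilar to the query.

    Only pairwise scores are used, so the dissimilarity does not have to be
    a metric or come from a vector space.
    """
    _, label = min(labelled, key=lambda item: dissimilarity(query, item[0]))
    return label


# Hypothetical labelled sequences (not taken from the ITS dataset).
training = [("ACGTACGT", "A"), ("ACGTACGA", "A"),
            ("TTGGCCAA", "B"), ("TTGGCCAT", "B")]
print(nearest_neighbour("ACGTTCGT", training))
```

Any other scoring function with the same two-argument signature can be passed as `dissimilarity`.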
Approaches for community decision making and collective reasoning: Knowledge technology support
- Authors: Yearwood, John , Stranieri, Andrew
- Date: 2012
- Type: Text , Book
- Relation: Approaches for Community Decision Making and Collective Reasoning: Knowledge Technology Support
- Full Text: false
- Reviewed:
- Description: Technology currently encourages the capture and storage of vast quantities of data and information and so thinkers, reasoners, and decision-makers have available large resources to support their tasks. At the same time, there is a need to engage with an enormous range of complex issues that require reasoning and decisions that are actionable to address them. Approaches for Community Decision Making and Collective Reasoning: Knowledge Technology Support acts to provide knowledge for each individual in a group with the broad structural wealth of reasoning. It also acts as an explicit structure that technological devices for supporting reasoning within a group can hook onto. If you are interested in how groups can structure their activities towards making better decisions or in developing technologies for the support of decision-making in groups, then this book is an excellent way to understand the state of the art and possible ways forward.
Technologies for supporting reasoning communities and collaborative decision making: Cooperative approaches
- Authors: Yearwood, John , Stranieri, Andrew
- Date: 2011
- Type: Text , Book
- Full Text: false
- Reviewed:
- Description: The information age has enabled unprecedented levels of data to be collected and stored. At the same time, society and organizations have become increasingly complex. Consequently, decisions in many facets have become increasingly complex but have the potential to be better informed. Technologies for Supporting Reasoning Communities and Collaborative Decision Making: Cooperative Approaches includes chapters from diverse fields of enquiry including decision science, political science, argumentation, knowledge management, cognitive psychology and business intelligence. Each chapter illustrates a perspective on group reasoning that ultimately aims to lead to a greater understanding of reasoning communities and inform technological developments.
Profiling phishing activity based on hyperlinks extracted from phishing emails
- Authors: Yearwood, John , Mammadov, Musa , Webb, Dean
- Date: 2012
- Type: Text , Journal article
- Relation: Social Network Analysis and Mining Vol. 2, no. 1 (2012), p. 5-16
- Full Text: false
- Reviewed:
- Description: Phishing activity has recently been focused on social networking sites as a more effective way of exploiting not only the technology but also the trust that may exist between members in a social network. In this paper, a novel method for profiling phishing activity from an analysis of phishing emails is proposed. Profiling is useful in determining the activity of an individual or a particular group of phishers. Work in the area of phishing is usually aimed at detection of phishing emails. In this paper, we concentrate on profiling as distinct from detection of phishing emails. We formulate the profiling problem as a multi-label classification problem using the hyperlinks in the phishing emails as features and structural properties of emails along with whois (i.e. DNS) information on hyperlinks as profile classes. Further, we generate profiles based on the classifier predictions. Thus, classes become elements of profiles. We employ a boosting algorithm (AdaBoost) as well as SVM to generate multi-label class predictions on three different datasets created from hyperlink information in phishing emails. These predictions are further utilized to generate complete profiles of these emails. Results show that profiling can be done with quite high accuracy using hyperlink information.
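The abstract above describes using hyperlinks as features and treating the predicted classes as the elements of a profile. The following is a minimal sketch of those two ideas, assuming a handful of illustrative hyperlink features and label names that are hypothetical and not taken from the paper. The classifiers themselves (AdaBoost, SVM) are not shown.

```python
import re
from typing import Dict, List, Set
from urllib.parse import urlparse

# Simple pattern for http(s) links; an assumption, not the paper's extractor.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")
IPV4_PATTERN = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def hyperlink_features(email_body: str) -> List[Dict[str, object]]:
    """Extract a few illustrative features from each hyperlink in an email."""
    features = []
    for url in URL_PATTERN.findall(email_body):
        parsed = urlparse(url)
        host = parsed.hostname or ""
        features.append({
            "host": host,
            "host_is_ip": bool(IPV4_PATTERN.fullmatch(host)),
            "num_dots": host.count("."),
            "path_depth": parsed.path.count("/"),
        })
    return features


def build_profile(label_predictions: Dict[str, bool]) -> Set[str]:
    """A profile is the set of class labels predicted as present."""
    return {label for label, predicted in label_predictions.items() if predicted}


body = ('Verify at http://192.0.2.10/login/update and '
        '<a href="https://secure.example.com/account">here</a>')
for row in hyperlink_features(body):
    print(row)

# Hypothetical per-label outputs of a multi-label classifier.
print(build_profile({"has_html_form": True, "registrar_A": False, "uses_ip_host": True}))
```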
Narrative-based interactive learning environments from modelling reasoning
- Authors: Yearwood, John , Stranieri, Andrew
- Date: 2007
- Type: Text , Journal article
- Relation: Educational Technology and Society Vol. 10, no. 3 (2007), p. 192-208
- Full Text:
- Reviewed:
- Description: Narrative and story telling has a long history of use in structuring, organising and communicating human experience. This paper describes a narrative based interactive intelligent learning environment which aims to elucidate practical reasoning using interactive emergent narratives that can be used in training novices in decision making. Its design is based on an approach to generating narrative from knowledge that has been modelled in specific decision/reasoning domains. The approach uses a narrative model that is guided partially by inference and contextual information contained in the particular knowledge representation used, the Generic/Actual argument model of structured reasoning. The approach is described with examples in the area of critical care nursing training and positive learning outcomes are reported. © International Forum of Educational Technology & Society (IFETS).
- Description: C1
- Description: 2003002522
A novel canonical dual computational approach for prion AGAAAAGA amyloid fibril molecular modeling
- Authors: Zhang, Jiapu , Gao, David , Yearwood, John
- Date: 2011
- Type: Text , Journal article
- Relation: Journal of Theoretical Biology Vol. 284, no. 1 (2011), p. 149-157
- Full Text: false
- Reviewed:
- Description: Many experimental studies have shown that the prion AGAAAAGA palindrome hydrophobic region (113-120) has amyloid fibril forming properties and plays an important role in prion diseases. However, due to the unstable, noncrystalline and insoluble nature of the amyloid fibril, to date structural information on the AGAAAAGA region (113-120) has been very limited. This region falls just within the N-terminal unstructured region PrP (1-123) of prion proteins. Traditional X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy experimental methods cannot be used to get its structural information. Against this background, this paper introduces a novel approach of the canonical dual theory to address the 3D atomic-resolution structure of prion AGAAAAGA amyloid fibrils. The novel and powerful canonical dual computational approach introduced in this paper is for the molecular modeling of prion AGAAAAGA amyloid fibrils, and the optimal atomic-resolution structures of prion AGAAAAGA amyloid fibrils presented in this paper are useful for the drive to find treatments for prion diseases in the field of medicinal chemistry. Overall, this paper presents an important method and provides useful information for treatments of prion diseases. © 2011.
From convex to nonconvex: A loss function analysis for binary classification
- Authors: Zhao, Lei , Mammadov, Musa , Yearwood, John
- Date: 2010
- Type: Text , Conference paper
- Relation: Paper presented at the 10th IEEE International Conference on Data Mining Workshops, ICDMW 2010 p. 1281-1288
- Full Text:
- Reviewed:
- Description: Problems of data classification can be studied in the framework of regularization theory as ill-posed problems. In this framework, loss functions play an important role in the application of regularization theory to classification. In this paper, we review some important convex loss functions, including hinge loss, square loss, modified square loss, exponential loss, logistic regression loss, as well as some non-convex loss functions, such as sigmoid loss, ø-loss, ramp loss, normalized sigmoid loss, and the loss function of a 2-layer neural network. Based on the analysis of these loss functions, we propose a new differentiable non-convex loss function, called the smoothed 0-1 loss function, which is a natural approximation of the 0-1 loss function. To compare the performance of different loss functions, we propose two binary classification algorithms, one for convex loss functions, the other for non-convex loss functions. A set of experiments is run on several binary data sets from the UCI repository. The results show that the proposed smoothed 0-1 loss function is robust, especially for noisy data sets with many outliers. © 2010 IEEE.
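Several of the losses named in the abstract above can be written as functions of the margin m = y * f(x), with y in {-1, +1}. The following is a minimal sketch using common textbook forms, which may differ in scaling from the definitions used in the paper. The ramp and sigmoid parameterizations shown are assumptions, and the smoothed 0-1 loss is not reproduced because its formula is not given here.

```python
import math


def zero_one(m: float) -> float:
    """0-1 loss: 1 for a misclassified (or boundary) point, else 0."""
    return 1.0 if m <= 0 else 0.0


# Convex surrogates.
def hinge(m: float) -> float:
    return max(0.0, 1.0 - m)


def square(m: float) -> float:
    return (1.0 - m) ** 2


def exponential(m: float) -> float:
    return math.exp(-m)


def logistic(m: float) -> float:
    return math.log(1.0 + math.exp(-m))


# Non-convex, bounded losses (one common parameterization each).
def ramp(m: float) -> float:
    """Hinge loss truncated at 1."""
    return min(1.0, max(0.0, 1.0 - m))


def sigmoid(m: float) -> float:
    return 1.0 / (1.0 + math.exp(m))


# Compare a well-classified point with a badly misclassified outlier.
for name, loss in [("zero_one", zero_one), ("hinge", hinge), ("square", square),
                   ("exponential", exponential), ("logistic", logistic),
                   ("ramp", ramp), ("sigmoid", sigmoid)]:
    print(f"{name:12s} m=2: {loss(2.0):.4f}   m=-5: {loss(-5.0):.4f}")
```

The convex losses grow without bound as the margin becomes more negative, while the bounded non-convex ones cap the contribution of a single outlier. This is one way to read the robustness claim in the abstract.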
A new loss function for robust classification
- Authors: Zhao, Lei , Mammadov, Musa , Yearwood, John
- Date: 2014
- Type: Text , Journal article
- Relation: Intelligent Data Analysis Vol. 18, no. 4 (2014), p. 697-715
- Full Text: false
- Reviewed:
- Description: The loss function plays an important role in data classification. Many loss functions have been proposed and applied to different classification problems. This paper proposes a new, so-called smoothed 0-1 loss function, which can be considered as an approximation of the classical 0-1 loss function. Due to the non-convexity of the proposed loss function, global optimization methods are required to solve the corresponding optimization problems. Together with the proposed loss function, we compare the performance of several existing loss functions in the classification of noisy data sets. In this comparison, different optimization problems are considered with regard to the convexity and smoothness of different loss functions. The experimental results show that the proposed smoothed 0-1 loss function works better on data sets with noisy labels, noisy features, and outliers. © 2014 - IOS Press and the authors. All rights reserved.
Adaptive clustering with feature ranking for DDoS attacks detection
- Authors: Zi, Lifang , Yearwood, John , Wu, Xin
- Date: 2010
- Type: Text , Conference proceedings
- Full Text:
- Description: Distributed Denial of Service (DDoS) attacks pose an increasing threat to the current internet. The detection of such attacks plays an important role in maintaining the security of networks. In this paper, we propose a novel adaptive clustering method combined with feature ranking for DDoS attacks detection. First, based on the analysis of network traffic, preliminary variables are selected. Second, the Modified Global K-means algorithm (MGKM) is used as the basic incremental clustering algorithm to identify the cluster structure of the target data. Third, the linear correlation coefficient is used for feature ranking. Lastly, the feature ranking result is used to inform and recalculate the clusters. This adaptive process can make worthwhile adjustments to the working feature vector according to different patterns of DDoS attacks, and can improve the quality of the clusters and the effectiveness of the clustering algorithm. The experimental results demonstrate that our method is effective and adaptive in detecting the separate phases of DDoS attacks. © 2010 IEEE.
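The feature-ranking step mentioned in the abstract above relies on the linear correlation coefficient. The following is a minimal sketch assuming that features are ranked by the absolute Pearson correlation with a numeric target such as a cluster index. The abstract does not say what the features are correlated against, so that choice, the feature names and the data are all hypothetical. The MGKM clustering step is not shown.

```python
import math
from typing import Dict, List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson linear correlation coefficient; 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(columns: Dict[str, Sequence[float]],
                  target: Sequence[float]) -> List[str]:
    """Order feature names by decreasing absolute correlation with the target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical traffic features and a hypothetical cluster index per sample.
traffic = {
    "packet_rate": [10, 12, 90, 95],
    "ttl": [64, 63, 64, 63],
    "constant_flag": [1, 1, 1, 1],
}
cluster_index = [0, 0, 1, 1]
print(rank_features(traffic, cluster_index))
```

In an adaptive loop of the kind the abstract outlines, the top-ranked features would then form the working feature vector for the next clustering pass.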
An application of consensus clustering for DDoS attacks detection
- Authors: Zi, Lifang , Yearwood, John , Kelarev, Andrei
- Date: 2010
- Type: Text , Conference proceedings
- Full Text:
- Description: The detection of Distributed Denial of Service (DDoS) attacks is very important for maintaining the security of networks and the Internet. This paper introduces a novel iterative consensus process based on the Hybrid Bipartite Graph Formulation (HBGF) consensus function for DDoS attack detection. First, the features are extracted during a feature extraction process based on the analysis of network traffic. Second, several clustering algorithms are applied in combination with the silhouette index to obtain a collection of independent initial clusterings. Third, the HBGF consensus function and silhouette index are used to find an appropriate consensus clustering of the initial clusterings. Fourth, this new consensus clustering is added to the pool of initial clusterings, replacing the clustering with the worst silhouette index. Fifth, the process continues iteratively until the silhouette index of the resulting consensus clusterings stabilizes. This iterative consensus clustering process can improve the quality of the clusters. The experimental results demonstrate that our iterative consensus process is effective and can be used in practice for detecting the separate phases of DDoS attacks.
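The silhouette index is used throughout the process above to compare clusterings and to decide which one to replace. The following is a minimal sketch of the mean silhouette index under stated assumptions: Euclidean distance is used, singleton clusters score 0, and the data points and candidate clusterings are hypothetical. The consensus function itself is not implemented.

```python
import math
from typing import Dict, List, Sequence


def silhouette_index(points: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Mean silhouette over all points, using Euclidean distance."""
    clusters: Dict[int, List[int]] = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(points[i], points[j]) for j in members) / len(members)
                for other, members in clusters.items() if other != label)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical data and pool of candidate clusterings.
data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
pool = {"compact": [0, 0, 1, 1], "mixed": [0, 1, 0, 1]}
scores = {name: silhouette_index(data, labels) for name, labels in pool.items()}
worst = min(scores, key=scores.get)  # candidate to replace in the pool
print(scores, "replace:", worst)
```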