List of Titles

Optimization of multiple classifiers in data mining based on string rewriting systems

Authors: Dazeley, Richard , Kelarev, Andrei , Yearwood, John , Mammadov, Musa
Date: 2009
Type: Text , Journal article
Relation: Asian-European Journal of Mathematics Vol. 2, no. 1 (2009), p. 41-56
Relation: https://purl.org/au-research/grants/arc/DP0211866
Relation: https://purl.org/au-research/grants/arc/LP0669752
Full Text:
Description: Optimization of multiple classifiers is an important problem in data mining. We introduce additional structure on the class sets of the classifiers using string rewriting systems with a convenient matrix representation. The aim of the present paper is to develop an efficient algorithm for the optimization of the number of errors of individual classifiers, which can be corrected by these multiple classifiers.

A hybrid clustering algorithm using two level of abstraction

Authors: Ghosh, Ranadhir , Mammadov, Musa , Ghosh, Moumita , Yearwood, John
Date: 2005
Type: Text , Conference paper
Relation: Paper presented at Fuzzy Logic, Soft Computing, and Computational Intelligence, 11th International Fuzzy Systems Association World Congress, Beijing, China : 28th - 31st July, 2005
Full Text: false
Reviewed:
Description: E1
Description: 2003001360

Two level clustering using SOM and dynamical systems

Authors: Ghosh, Ranadhir , Mammadov, Musa , Ghosh, Moumita , Yearwood, John
Date: 2004
Type: Text , Conference paper
Relation: Paper presented at ICOTA6: 6th International Conference on Optimization - Techniques and Applications, Ballarat, Victoria : 9th December, 2004
Full Text: false
Reviewed:
Description: E1
Description: 2003000871

A hybrid wrapper-filter approach to detect the source(s) of out-of-control signals in multivariate manufacturing process

Authors: Huda, Shamsul , Abdollahian, Mali , Mammadov, Musa , Yearwood, John , Ahmed, Shafiq , Sultan, Ibrahim
Date: 2014
Type: Text , Journal article
Relation: European Journal of Operational Research Vol. 237, no. 3 (2014), p. 857-870
Full Text: false
Reviewed:
Description: With modern data-acquisition equipment and on-line computers used during production, it is now common to monitor several correlated quality characteristics simultaneously in multivariate processes. Multivariate control charts (MCC) are important tools for monitoring multivariate processes. One difficulty encountered with multivariate control charts is the identification of the variable or group of variables that causes an out-of-control signal. Expert knowledge, either in combination with a wrapper-based supervised classifier or with a pre-filter and wrapper, is the standard approach to detecting the sources of an out-of-control signal. However, gathering expert knowledge for source identification is costly and may introduce human error. Individual univariate control charts (UCC) and decomposition of T² statistics are also often used simultaneously to identify the sources, but these either ignore the correlations between the sources or take more time as the number of dimensions increases. The aim of this paper is to develop a source identification approach that does not need any expert knowledge and can detect out-of-control signals with lower computational complexity. We propose a hybrid wrapper-filter based source identification approach that hybridizes a Mutual Information (MI) based Maximum Relevance (MR) filter ranking heuristic with an Artificial Neural Network (ANN) based wrapper. The Artificial Neural Network Input Gain Measurement Approximation (ANNIGMA) has been combined with MR (MR-ANNIGMA) to utilize the knowledge about the intrinsic pattern of the quality characteristics computed by the filter for directing the wrapper search process. To compute the optimal ANNIGMA score, we also propose a Global MR-ANNIGMA using a non-functional relationship between variables, which is independent of the derivative of the objective function and has the potential to overcome the local optimization problem of ANN training.
The novelty of the proposed approaches is that they combine the advantages of both filter and wrapper approaches and do not require any expert knowledge about the sources of the out-of-control signals. The heuristic score based subset generation process also reduces the search space to polynomial growth, which in turn reduces computational time. The proposed approaches were tested by exhaustive experiments using both simulated and real manufacturing data and compared to existing methods including independent filter, wrapper and Multivariate EWMA (MEWMA) methods. The results indicate that the proposed approaches can identify the sources of out-of-control signals more accurately than existing approaches. © 2014 Elsevier B.V. All rights reserved.

A formula for multiple classifiers in data mining based on Brandt semigroups

Authors: Kelarev, Andrei , Yearwood, John , Mammadov, Musa
Date: 2009
Type: Text , Journal article
Relation: Semigroup Forum Vol. 78, no. 2 (2009), p. 293-309
Full Text:
Reviewed:
Description: A general approach to designing multiple classifiers represents them as a combination of several binary classifiers in order to enable correction of classification errors and increase reliability. This method is explained, for example, in Witten and Frank (Data Mining: Practical Machine Learning Tools and Techniques, 2005, Sect. 7.5). The aim of this paper is to investigate representations of this sort based on Brandt semigroups. We give a formula for the maximum number of errors of binary classifiers, which can be corrected by a multiple classifier of this type. Examples show that our formula does not carry over to larger classes of semigroups. © 2008 Springer Science+Business Media, LLC.

Multi label classification and drug-reaction associations using global optimization techniques

Authors: Mammadov, Musa , Yearwood, John , Aliyea, Leyla
Date: 2004
Type: Text , Conference paper
Relation: Paper presented at ICOTA6: 6th International Conference on Optimization - Techniques and Applications, Ballarat, Victoria : 9th December, 2004
Full Text: false
Reviewed:
Description: E1
Description: 2003000890

Dynamical systems based on a fuzzy derivative and its applications to data classification

Authors: Mammadov, Musa , Rubinov, Alex , Yearwood, John
Date: 2003
Type: Text , Conference paper
Relation: Paper presented at the Industrial Optimisation 2003 Conference, Perth : 30th September, 2002
Full Text: false
Reviewed:
Description: E1
Description: 2003000339

A fuzzy derivative approach to classification of outcomes from the ADRAC database

Authors: Mammadov, Musa , Saunders, Gary , Yearwood, John
Date: 2004
Type: Text , Journal article
Relation: International Transactions in Operational Research Vol. 11, no. 2 (2004), p. 169-180
Full Text: false
Reviewed:
Description: The Australian Adverse Drug Reaction Advisory Committee (ADRAC) database has been collected and maintained by the Therapeutic Goods Administration. In this paper we study a part of this database (Card2) which contains records having just reactions from the Cardiovascular group. Drug-reaction relationships are presented by a vector of degrees which shows the degree of association of a drug with each class of reactions. In this work we examine these relationships in the classification of reaction outcomes. A modified version of the fuzzy derivative method (FDM2) is used for classification.
Description: C1
Description: 2003000895

Relationships between different Australian interest rate swap markets

Authors: Mammadov, Musa , Yearwood, John
Date: 2004
Type: Text , Conference paper
Relation: Paper presented at ICOTA6: 6th International Conference on Optimization - Techniques and Applications, Ballarat, Victoria : 9th December, 2004
Full Text: false
Reviewed:
Description: E1
Description: 2003000893

Relationships between different term structures of Australian interest rate swap markets

Authors: Mammadov, Musa , Yearwood, John
Date: 2004
Type: Text , Conference paper
Relation: Paper presented at the First International Workshop on Intelligent Finance, IWIF 2004, Melbourne : 13th December, 2004
Full Text: false
Reviewed:
Description: E1
Description: 2003000894

The study of drug-reaction relationships using global optimization techniques

Authors: Mammadov, Musa , Rubinov, Alex , Yearwood, John
Date: 2007
Type: Text , Journal article
Relation: Optimization Methods and Software Vol. 22, no. 1 (2007), p. 99-126
Full Text: false
Reviewed:
Description: In this paper we develop an optimization approach for the study of adverse drug reaction (ADR) problems. This approach is based on drug-reaction relationships represented in the form of a vector of weights, which can be defined as a solution to some global optimization problem. Although it can be used for solving many ADR problems, we concentrate on two of them here: the accurate identification of drugs that are responsible for reactions that have occurred, and drug-drug interactions. Based on drug-reaction relationships, we formulate these problems as an optimization problem. The approach is applied to cardiovascular-type reactions from the Australian Adverse Drug Reaction Advisory Committee (ADRAC) database. Software based on this approach has been developed and could have beneficial use in prescribing.
Description: C1
Description: 2003002217

An introduction algorithm with selection significance based on a fuzzy derivative

Authors: Mammadov, Musa , Yearwood, John
Date: 2002
Type: Text , Conference paper
Relation: Paper presented at Hybrid Information Systems (Advances in Soft Computing), Adelaide : 11th December, 2001
Full Text: false
Reviewed:
Description: E1
Description: 2003000076

A new supervised term ranking method for text categorization

Authors: Mammadov, Musa , Yearwood, John , Zhao, Lei
Date: 2010
Type: Text , Conference paper
Relation: Paper presented at 23rd Australasian Joint Conference on Artificial Intelligence, AI 2010 Vol. 6464 LNAI, p. 102-111
Full Text:
Reviewed:
Description: In text categorization, different supervised term weighting methods have been applied to improve classification performance by weighting terms with respect to different categories, for example, Information Gain, χ² statistic, and Odds Ratio. From the literature there are three term ranking methods to summarize term weights of different categories for multi-class text categorization. They are Summation, Average, and Maximum methods. In this paper we present a new term ranking method to summarize term weights, i.e. Maximum Gap. Using two different methods of information gain and χ² statistic, we set up controlled experiments for different term ranking methods. The Reuters-21578 text corpus is used as the dataset. Two popular classification algorithms, SVM and BoosTexter, are adopted to evaluate the performance of different term ranking methods. Experimental results show that the new term ranking method performs better. © 2010 Springer-Verlag.

Dynamical systems described by relational elasticities with applications to global optimization

Authors: Mammadov, Musa , Rubinov, Alex , Yearwood, John
Date: 2005
Type: Text , Book chapter
Relation: Continuous Optimization: Current Trends and Modern Applications Chapter p. 365-385
Full Text: false
Reviewed:
Description: B1

Attribute weighted Naive Bayes classifier using a local optimization

Authors: Taheri, Sona , Yearwood, John , Mammadov, Musa , Seifollahi, Sattar
Date: 2013
Type: Text , Journal article
Relation: Neural Computing & Applications Vol. 24, no. 5 (2013), p. 995-1002
Full Text:
Reviewed:
Description: The Naive Bayes classifier is a popular classification technique for data mining and machine learning. It has been shown to be very effective on a variety of data classification problems. However, the strong assumption that all attributes are conditionally independent given the class is often violated in real-world applications. Numerous methods have been proposed in order to improve the performance of the Naive Bayes classifier by alleviating the attribute independence assumption. However, violation of the independence assumption can increase the expected error. An alternative is to assign weights to the attributes. In this paper, we propose a novel attribute weighted Naive Bayes classifier by applying weights to the conditional probabilities. An objective function is modeled and taken into account, which is based on the structure of the Naive Bayes classifier and the attribute weights. The optimal weights are determined by a local optimization method using the quasisecant method. In the proposed approach, the Naive Bayes classifier is taken as a starting point. We report the results of numerical experiments on several real-world data sets in binary classification, which show the efficiency of the proposed method.

Using links to aid web classification

Authors: Xie, Wei , Mammadov, Musa , Yearwood, John
Date: 2007
Type: Text , Conference paper
Relation: Paper presented at 6th IEEE/ACIS International Conference on Computer and Information Science, ICIS 2007, Melbourne, Victoria : 11th-13th July 2007 p. 981-986
Full Text:
Description: In this paper, we present a new approach that uses link information to improve the accuracy and efficiency of web classification. Unlike other approaches, we use only the mappings between linked documents and their own class or classes. In this case, we only need to add a few features, called linked-class features, to the datasets. We apply SVM and BoosTexter for classification. We show that the classification accuracy can be improved based on mixtures of ordinary word features and out-linked-class features. We analyze and discuss the reason for this improvement.
Description: 2003005438

Profiling phishing emails based on hyperlink information

Authors: Yearwood, John , Mammadov, Musa , Banerjee, Arunava
Date: 2010
Type: Text , Conference paper
Relation: Paper presented at 2010 International Conference on Advances in Social Network Analysis and Mining, ASONAM 2010, Odense : 9th-11th August 2010 p. 120-127
Full Text:
Description: In this paper, a novel method for profiling phishing activity from an analysis of phishing emails is proposed. Profiling is useful in determining the activity of an individual or a particular group of phishers. Work in the area of phishing is usually aimed at detection of phishing emails. In this paper, we concentrate on profiling as distinct from detection of phishing emails. We formulate the profiling problem as a multi-label classification problem using the hyperlinks in the phishing emails as features and structural properties of emails along with whois (i.e. DNS) information on hyperlinks as profile classes. Further, we generate profiles based on classifier predictions. Thus, classes become elements of profiles. We employ a boosting algorithm (AdaBoost) as well as SVM to generate multi-label class predictions on three different datasets created from hyperlink information in phishing emails. These predictions are further utilized to generate complete profiles of these emails. Results show that profiling can be done with quite high accuracy using hyperlink information. © 2010 Crown Copyright.

Profiling phishing activity based on hyperlinks extracted from phishing emails

Authors: Yearwood, John , Mammadov, Musa , Webb, Dean
Date: 2012
Type: Text , Journal article
Relation: Social Network Analysis and Mining Vol. 2, no. 1 (2012), p. 5-16
Full Text: false
Reviewed:
Description: Phishing activity has recently been focused on social networking sites as a more effective way of exploiting not only the technology but also the trust that may exist between members in a social network. In this paper, a novel method for profiling phishing activity from an analysis of phishing emails is proposed. Profiling is useful in determining the activity of an individual or a particular group of phishers. Work in the area of phishing is usually aimed at detection of phishing emails. In this paper, we concentrate on profiling as distinct from detection of phishing emails. We formulate the profiling problem as a multi-label classification problem using the hyperlinks in the phishing emails as features and structural properties of emails along with whois (i.e. DNS) information on hyperlinks as profile classes. Further, we generate profiles based on the classifier predictions. Thus, classes become elements of profiles. We employ a boosting algorithm (AdaBoost) as well as SVM to generate multi-label class predictions on three different datasets created from hyperlink information in phishing emails. These predictions are further utilized to generate complete profiles of these emails. Results show that profiling can be done with quite high accuracy using hyperlink information.

From convex to nonconvex: A loss function analysis for binary classification

Authors: Zhao, Lei , Mammadov, Musa , Yearwood, John
Date: 2010
Type: Text , Conference paper
Relation: Paper presented at 10th IEEE International Conference on Data Mining Workshops, ICDMW 2010 p. 1281-1288
Full Text:
Reviewed:
Description: Problems of data classification can be studied in the framework of regularization theory as ill-posed problems. In this framework, loss functions play an important role in the application of regularization theory to classification. In this paper, we review some important convex loss functions, including hinge loss, square loss, modified square loss, exponential loss, logistic regression loss, as well as some non-convex loss functions, such as sigmoid loss, ø-loss, ramp loss, normalized sigmoid loss, and the loss function of a 2-layer neural network. Based on the analysis of these loss functions, we propose a new differentiable non-convex loss function, called the smoothed 0-1 loss function, which is a natural approximation of the 0-1 loss function. To compare the performance of different loss functions, we propose two binary classification algorithms, one for convex loss functions, the other for non-convex loss functions. A set of experiments is run on several binary data sets from the UCI repository. The results show that the proposed smoothed 0-1 loss function is robust, especially for noisy data sets with many outliers. © 2010 IEEE.

A new loss function for robust classification

Authors: Zhao, Lei , Mammadov, Musa , Yearwood, John
Date: 2014
Type: Text , Journal article
Relation: Intelligent Data Analysis Vol. 18, no. 4 (2014), p. 697-715
Full Text: false
Reviewed:
Description: Loss functions play an important role in data classification. Many loss functions have been proposed and applied to different classification problems. This paper proposes a new loss function, the so-called smoothed 0-1 loss function, which can be considered an approximation of the classical 0-1 loss function. Due to the non-convexity of the proposed loss function, global optimization methods are required to solve the corresponding optimization problems. Together with the proposed loss function, we compare the performance of several existing loss functions in the classification of noisy data sets. In this comparison, different optimization problems are considered with regard to the convexity and smoothness of different loss functions. The experimental results show that the proposed smoothed 0-1 loss function works better on data sets with noisy labels, noisy features, and outliers. © 2014 - IOS Press and the authors. All rights reserved.
