A new loss function for robust classification
- Authors: Zhao, Lei , Mammadov, Musa , Yearwood, John
- Date: 2014
- Type: Text , Journal article
- Relation: Intelligent Data Analysis Vol. 18, no. 4 (2014), p. 697-715
- Full Text: false
- Reviewed:
- Description: The loss function plays an important role in data classification. Many loss functions have been proposed and applied to different classification problems. This paper proposes a new, so-called smoothed 0-1 loss function, which can be considered an approximation of the classical 0-1 loss function. Because the proposed loss function is non-convex, global optimization methods are required to solve the corresponding optimization problems. Together with the proposed loss function, we compare the performance of several existing loss functions in the classification of noisy data sets. In this comparison, different optimization problems are considered with regard to the convexity and smoothness of the different loss functions. The experimental results show that the proposed smoothed 0-1 loss function works better on data sets with noisy labels, noisy features, and outliers. © 2014 - IOS Press and the authors. All rights reserved.
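As a rough illustration of what a smooth approximation of the 0-1 loss can look like, the sketch below uses a logistic (sigmoid) curve over the classification margin. This is an assumption made for illustration only: the abstract does not give the authors' formula, and the function names and the steepness parameter `k` are hypothetical.

```python
import math

def zero_one_loss(margin):
    """Classical 0-1 loss: 1 for a misclassified point (margin <= 0), else 0."""
    return 1.0 if margin <= 0 else 0.0

def sigmoid_smoothed_loss(margin, k=10.0):
    """Illustrative smooth surrogate 1 / (1 + exp(k * margin)).

    Not the paper's definition. Larger k approaches the 0-1 step.
    Written in a numerically stable form to avoid overflow in exp().
    """
    z = k * margin
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))

if __name__ == "__main__":
    for m in (-1000.0, -1.0, 0.0, 1.0):
        print(m, zero_one_loss(m), sigmoid_smoothed_loss(m))
```

Like the 0-1 loss, this surrogate is bounded by 1, so a single extreme outlier cannot dominate the total loss. It is also non-convex, which is consistent with the abstract's remark that global optimization methods are needed.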
Classification for accuracy and insight : A weighted sum approach
- Authors: Quinn, Anthony , Stranieri, Andrew , Yearwood, John
- Date: 2007
- Type: Text , Conference paper
- Relation: Paper presented at Sixth Australasian Data Mining Conference, AusDM 2007, Gold Coast, Queensland, Victoria : 3rd-4th December 2007 p. 203-208
- Full Text:
- Description: This research presents a classifier that aims to provide insight into a dataset in addition to achieving classification accuracies comparable to other algorithms. The classifier, called Automated Weighted Sum (AWSum), uses a weighted sum approach in which feature values are assigned weights that are summed and compared to a threshold in order to classify an example. Though naive, this approach is scalable, achieves accurate classifications on standard datasets, and also provides a degree of insight. By insight we mean that the technique provides an appreciation of the influence that feature values have on class values, relative to each other. AWSum focuses on the feature value space, which allows the technique to identify feature values and combinations of feature values that are sensitive and important for a classification. This is particularly useful in fields such as medicine, where this sort of micro-focus and understanding is critical in classification.
- Description: 2003005504
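The weighted sum idea described in this record can be sketched as follows. The weights, feature names, class labels, and threshold are all hypothetical; the abstract does not say how AWSum derives its weights, so this only shows the sum-and-threshold step.

```python
# Hypothetical weights per (feature, value) pair; positive values push
# toward the "positive" class, negative values toward "negative".
WEIGHTS = {
    ("smoker", "yes"): 0.6,
    ("smoker", "no"): -0.4,
    ("age", "over_60"): 0.5,
    ("age", "under_60"): -0.3,
}

def weighted_sum_classify(example, weights, threshold=0.0):
    """Sum the weights of the example's feature values and compare to a threshold."""
    score = sum(weights.get(item, 0.0) for item in example.items())
    label = "positive" if score > threshold else "negative"
    return label, score

if __name__ == "__main__":
    print(weighted_sum_classify({"smoker": "yes", "age": "under_60"}, WEIGHTS))
```

Because each feature value carries its own signed weight, the table itself can be read as a statement of how strongly each value influences the class, which is the kind of insight the abstract refers to.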
Opinion search in web logs
- Authors: Osman, Deanna , Yearwood, John
- Date: 2007
- Type: Text , Conference paper
- Relation: Paper presented at Eighteenth Australasian Database Conference, ADC 2007, Ballarat, Victoria : 29th January-2nd February 2007 p. 133-139
- Full Text:
- Description: Web logs (blogs) are a fast-growing forum for people of all ages to express their feelings and opinions on topics of interest. The entries are often written in informal language without the structure found in newswire or published articles. One blog entry may contain many topics, and each topic may express an opinion or a fact. This research is in contrast to work on opinion detection that has been carried out on more formally authored texts and on segments that are either whole documents or sentences. Whole web logs are divided into topics using a simple text segmentation approach. Similarity scores are used to distinguish where topic changes occur. The results are compared to human-evaluated topic changes, and the most accurate algorithm is used in the remainder of the research. Words within each topic-block are allocated weightings depending on their opinion-bearing strength. Two approaches to using these weights, the sum and the maximum, are used to determine whether the topic-block is opinion-bearing or non-opinion-bearing. The opinion-bearing topic-blocks are rated by human evaluators as either opinion-bearing or non-opinion-bearing, with precision of 67% for approach A and 70% for approach B. These results are compared with two approaches on published text to identify the difference between web logs and published articles.
- Description: 2003004895
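The two ways of combining word weights mentioned in this record, the sum and the maximum, can be sketched as below. The lexicon, its weights, and both thresholds are hypothetical, and the sketch does not claim which of the two corresponds to the abstract's approach A or approach B.

```python
# Hypothetical opinion-strength weights; words not listed count as 0.
OPINION_WEIGHTS = {"love": 0.9, "terrible": 0.8, "think": 0.4, "maybe": 0.2}

def block_scores(words, weights=OPINION_WEIGHTS):
    """Return (sum, maximum) of the opinion weights of the words in a topic-block."""
    strengths = [weights.get(w.lower(), 0.0) for w in words]
    return sum(strengths), max(strengths, default=0.0)

def is_opinion_bearing(words, sum_threshold=1.0, max_threshold=0.7):
    """Classify a topic-block under each rule; thresholds are illustrative."""
    total, strongest = block_scores(words)
    return {"sum": total >= sum_threshold, "max": strongest >= max_threshold}

if __name__ == "__main__":
    print(is_opinion_bearing("I love it but the ending was terrible".split()))
    print(is_opinion_bearing("think maybe think maybe".split()))
```

The two rules can disagree: a block with many weakly opinionated words may pass the sum rule but not the maximum rule, while a block with a single strong word may do the opposite.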
Using corpus analysis to inform research into opinion detection in blogs
- Authors: Osman, Deanna , Yearwood, John , Vamplew, Peter
- Date: 2007
- Type: Text , Conference paper
- Relation: Paper presented at Sixth Australasian Data Mining Conference, AusDM 2007, Gold Coast, Queensland, Victoria : 3rd-4th December 2007 p. 65-75
- Full Text:
- Description: Opinion detection research relies on labeled documents for training data, obtained either by assumptions based on the document's origin or by using human assessors to categorise the documents. In recent years, blogs have become a source for opinion identification research (TREC Blog06). This study analyses the part-of-speech proportions and the words used within various corpora, determining key differences and similarities useful when preparing for opinion identification research. The resulting comparisons between the characteristics of the various corpora are detailed and discussed. In particular, opinion-bearing and non-opinion Blog06 documents were found to display a high level of similarity, indicating that blog documents assessed at the document level cannot be used as training data in opinion identification research.
- Description: 2003004892
Comparative analysis of genetic algorithm, simulated annealing and cutting angle method for artificial neural networks
- Authors: Ghosh, Ranadhir , Ghosh, Moumita , Yearwood, John , Bagirov, Adil
- Date: 2005
- Type: Text , Journal article
- Relation: Machine Learning and Data Mining in Pattern Recognition, Proceedings Vol. 3587, no. (2005), p. 62-70
- Full Text: false
- Reviewed:
- Description: Learning is the essence of an artificial neural network (ANN). Many problems are associated with the multiple local minima in neural networks. Global optimization methods are capable of finding a globally optimal solution. In this paper we investigate and present a comparative study of the effects of probabilistic and deterministic global search methods for artificial neural networks using a fully connected feed-forward multi-layered perceptron architecture. We investigate two probabilistic global search methods, namely the genetic algorithm and simulated annealing, and a deterministic cutting angle method to find weights in the neural network. Experiments were carried out on UCI benchmark dataset.
- Description: C1
- Description: 2003003398
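To give a sense of how a probabilistic global search such as simulated annealing explores a weight space with many local minima, here is a minimal sketch. It is not the authors' implementation: the toy objective (a Rastrigin-style function standing in for a network's training error), the Gaussian proposal, the geometric cooling schedule, and all parameter values are assumptions.

```python
import math
import random

def toy_loss(weights):
    """Rastrigin-style surface with many local minima; global minimum 0 at the origin."""
    return sum(w * w - 10.0 * math.cos(2.0 * math.pi * w) + 10.0 for w in weights)

def simulated_annealing(loss, start, steps=2000, temp=1.0, cooling=0.995,
                        step_size=0.5, seed=0):
    """Minimise `loss` from `start`, sometimes accepting worse moves while temp is high."""
    rng = random.Random(seed)
    current = list(start)
    current_loss = loss(current)
    best, best_loss = list(current), current_loss
    for _ in range(steps):
        # Propose a random Gaussian perturbation of every weight.
        candidate = [w + rng.gauss(0.0, step_size) for w in current]
        cand_loss = loss(candidate)
        delta = cand_loss - current_loss
        # Always accept improvements; accept worse moves with probability exp(-delta / temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_loss = candidate, cand_loss
            if current_loss < best_loss:
                best, best_loss = list(current), current_loss
        temp = max(temp * cooling, 1e-12)  # geometric cooling with a small floor
    return best, best_loss

if __name__ == "__main__":
    start = [3.3, -2.7]
    print(toy_loss(start), simulated_annealing(toy_loss, start)[1])
```

Accepting occasional uphill moves is what lets the search leave a local minimum, in contrast to plain gradient descent, which stops at the first one it reaches.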