Can shallow semantic class information help answer passage retrieval?
- Authors: Ofoghi, Bahadorreza , Yearwood, John
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at 22nd Australasian Joint Conference, AI 2009: Advances in Artificial Intelligence, Melbourne, Victoria : 1st-4th December 2009 p. 587–596
- Full Text: false
- Description: In this paper, the effect of using semantic class overlap evidence in enhancing the passage retrieval effectiveness of question answering (QA) systems is tested. The semantic class overlap between questions and passages is measured by evoking FrameNet semantic frames using a shallow term-lookup procedure. We use the semantic class overlap evidence in two ways: i) fusing passage scores obtained from a baseline retrieval system with those obtained from the analysis of semantic class overlap (fusion-based approach), and ii) revising the passage scoring function of the baseline system by incorporating semantic class overlap evidence (revision-based approach). Our experiments with the TREC 2004 and 2006 datasets show that the revision-based approach significantly improves the passage retrieval effectiveness of the baseline system.
- Description: 2003007254
Cayley graphs as classifiers for data mining : The influence of asymmetries
- Authors: Kelarev, Andrei , Ryan, Joe , Yearwood, John
- Date: 2009
- Type: Text , Journal article
- Relation: Discrete Mathematics Vol. 309, no. 17 (2009), p. 5360-5369
- Relation: http://purl.org/au-research/grants/arc/DP0211866
- Full Text:
- Reviewed:
- Description: The endomorphism monoids of graphs have been actively investigated. They are convenient tools expressing asymmetries of the graphs. One of the most important classes of graphs considered in this framework is that of Cayley graphs. Our paper proposes a new method of using Cayley graphs for classification of data. We give a survey of recent results devoted to the Cayley graphs also involving their endomorphism monoids. © 2008 Elsevier B.V. All rights reserved.
Deliberative discourse and reasoning from generic argument structures
- Authors: Yearwood, John , Stranieri, Andrew
- Date: 2009
- Type: Text , Journal article
- Relation: AI and Society Vol. 23, no. 3 (2009), p. 353-377
- Full Text: false
- Reviewed:
- Description: In this article a dialectical model for practical reasoning within a community, based on the Generic/Actual Argument Model (GAAM), is advanced and its application to deliberative dialogue discussed. The GAAM offers a dynamic template for structuring knowledge within a domain of discourse that is connected to and regulated by a community. The paper demonstrates how the community-accepted generic argument structure acts to normatively influence both admissible reasoning and the progression of dialectical reasoning between participants. It is further demonstrated that these types of deliberation dialogues supported by the GAAM comply with criteria for normative principles for deliberation, specifically, Alexy's rules for discourse ethics and Hitchcock's Principles of Rational Mutual Inquiry. The connection of reasoning to the community in a documented and transparent structure assists in providing best justified reasons, principles of deliberation and ethical discourse which are important advantages for reasoning communities. © Springer-Verlag London Limited 2006.
Establishing phishing provenance using orthographic features
- Authors: Liping, Ma , Yearwood, John , Watters, Paul
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at 2009 eCrime Researchers Summit, eCRIME '09, Tacoma, Washington : 20th-21st October 2009
- Full Text:
- Description: After phishing message detection, determining the provenance of phishing messages and Websites is the second step to tracing cybercriminals. In this paper, we present a novel method to cluster phishing emails automatically using orthographic features. In particular, we develop an algorithm to cluster documents and remove redundant features at the same time. After collecting all the possible features based on observation, we adapt the modified global k-means method repeatedly, and generate the objective function values over a range of tolerance values across different subsets of features. Finally, we identify the appropriate clusters based on studying the distribution of the objective function values. Experimental evaluation of a large number of computations demonstrates that our clustering and feature selection techniques are highly effective and achieve reliable results.
- Description: 2003007842
Experimental investigation of three machine learning algorithms for ITS dataset
- Authors: Yearwood, John , Kang, Byeongho , Kelarev, Andrei
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at First International Conference, FGIT 2009, Future Generation Information Technology, Jeju Island, Korea : 10th-12th December 2009 Vol. 5899, p. 308-316
- Full Text:
- Description: The present article is devoted to experimental investigation of the performance of three machine learning algorithms for ITS dataset in their ability to achieve agreement with classes published in the biological literature before. The ITS dataset consists of nuclear ribosomal DNA sequences, where rather sophisticated alignment scores have to be used as a measure of distance. These scores do not form a Minkowski metric and the sequences cannot be regarded as points in a finite dimensional space. This is why it is necessary to develop novel machine learning approaches to the analysis of datasets of this sort. This paper introduces a k-committees classifier and compares it with the discrete k-means and Nearest Neighbour classifiers. It turns out that all three machine learning algorithms are efficient and can be used to automate future biologically significant classifications for datasets of this kind. A simplified version of a synthetic dataset, where the k-committees classifier outperforms k-means and Nearest Neighbour classifiers, is also presented.
- Description: 2003007844
From lexical entailment to recognizing textual entailment using linguistic resources
- Authors: Ofoghi, Bahadorreza , Yearwood, John
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at Australasian Language Technology Association Workshop 2009, Sydney, New South Wales : 3rd-4th December 2009 p. 119–123
- Full Text:
- Description: In this paper, we introduce our Recognizing Textual Entailment (RTE) system developed on the basis of Lexical Entailment between two text excerpts, namely the hypothesis and the text. To extract atomic parts of hypotheses and texts, we carry out syntactic parsing on the sentences. We then utilize WordNet and FrameNet lexical resources for estimating lexical coverage of the text on the hypothesis. We report the results of our RTE runs on the Text Analysis Conference RTE datasets. Using a failure analysis process, we also show that the main difficulty of our RTE system relates to the underlying difficulty of syntactic analysis of sentences.
- Description: 2003007910
Group structured reasoning for coalescing group decisions
- Authors: Yearwood, John , Stranieri, Andrew
- Date: 2009
- Type: Text , Journal article
- Relation: Group Decision and Negotiation Vol. , no. (2009), p. 1-29
- Full Text:
- Reviewed:
- Description: In this paper we present the notion of structured reasoning through a model, called the Generic/Actual Argument Model (GAAM). The model which has been used as a computational representation for machine modelling of reasoning and for hybrid combinations of human and machine reasoning can be used as a coalescent framework for decision making. Whilst the notion of structuring reasoning is not new, structured reasoning is advanced as a technique where group consensus on reasoning structures at various levels can be used to facilitate the comprehension of complex reasoning particularly where there are multiple perspectives. For an issue, the approach provides a scaffolding structure for cognitive co-operation and a normative reasoning structure against which group participants can identify points of difference and points in common as well as the nature of the differences and similarities. Intra-group transparency characterized by the ability to recognise points in common and understand the nature of differences is important to the process of coalescing group decisions that carry maximum group support. © 2009 Springer Science+Business Media B.V.
Key public sector individuals as ICT change agents : An analysis of Australian and German experience
- Authors: Jagodic, Jana , Courvisanos, Jerry , Yearwood, John , Braun, Patrice
- Date: 2009
- Type: Text , Journal article
- Relation: The Asia Pacific Journal of Public Administration Vol. 31, no. 2 (2009), p. 197-212
- Full Text:
- Reviewed:
- Description: The increasing demand for technology-enabled public sector services drives state agencies to launch information and communication technology (ICT) projects. The Australian and German state agencies are taking a proactive role towards technological change by employing so-called ICT change agents. These ICT change agents introduce, diffuse, manage and implement ICT within projects. Despite the mobilisation of change agents, there is scant research on the formal and informal roles of these key individuals within public sector projects. This article bridges that gap by providing valuable insights into the activities of public sector ICT change agents. It is based on empirical research from six case studies in Australian and German state agencies. Findings from these studies indicate that public sector ICT change agents position organisations to take advantage of cutting edge technologies by performing a great variety of formal and informal roles. Formal roles are performed in order to accomplish set formal project tasks, while informal roles help to speed up rapid ICT adoption and innovation through the change agents’ informal networks. The findings are delineated in a framework for future research which shows that formal and informal roles impact on the outcomes of public sector ICT projects.
- Description: 2003007371
MRF model based unsupervised color textured image segmentation using multidimensional spatially variant finite mixture model
- Authors: Islam, Mofakharul , Vamplew, Peter , Yearwood, John
- Date: 2009
- Type: Text , Book chapter
- Relation: Technological Developments in Education and Automation p. 375-380
- Full Text: false
- Reviewed:
- Description: We investigate and propose a novel approach to implement an unsupervised color image segmentation model that segments a color image meaningfully and partitions it into its constituent parts automatically. The aim is to devise a robust unsupervised segmentation approach that can segment a color textured image more accurately. Here, color and texture information of each individual pixel along with the spatial relationship within its neighborhood have been considered for producing more accuracy in segmentation. In this particular work, the problem we want to investigate is to implement a robust unsupervised Multidimensional Spatially Variant Finite Mixture Model (MSVFMM) based color image segmentation approach using Cluster Ensembles and MRF model along with Daubechies wavelet transforms for increasing the content sensitivity of the segmentation model in order to get a better accuracy in segmentation. Here, Cluster Ensemble has been utilized as a robust automatic tool for finding the number of components in an image. The main idea behind this work is introducing a Bayesian inference based approach to estimate the Maximum a Posteriori (MAP) to identify the different objects/components in a color image. Markov Random Field (MRF) plays a crucial role in capturing the relationships among the neighboring pixels. An Expectation Maximization (EM) model fitting MAP algorithm segments the image utilizing the pixel’s color and texture features and the captured neighborhood relationships among them. The algorithm simultaneously calculates the model parameters and segments the pixels iteratively in an interleaved manner. Finally, it converges to a solution where the model parameters and pixel labels are stabilized within a specified criterion. We have also compared our results with another recent segmentation approach [10], which is similar in nature. The experimental results reveal that the proposed approach is capable of producing more accurate and faithful segmentation and can be employed in different practical image content understanding applications.
Online group deliberation for the elicitation of shared values to underpin decision making
- Authors: Feldman, Yishai , Kraft, Donald , Kuflik, Tsvi , Afshar, Faezeh , Stranieri, Andrew , Yearwood, John
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at 7th International Conference, NGITS 2009, Next generation information technologies and systems, Haifa, Israel : 16th-18th June 2009 Vol. 5831, p. 158-168
- Full Text: false
- Description: Values have been shown to underpin our attitudes, behaviour and motivate our decisions. Values do not exist in isolation but have meaning in relation to other values. However, values are not solely the purview of individuals as communities and organisations have core values implicit in their culture, policies and practices. Values for a group can be determined by a minority in power, derived by algorithmically merging values each group member holds, or set by deliberative consensus. The elicitation of values for the group by deliberation is likely to lead to widespread acceptance of values arrived at, however enticing individuals to engage in face to face discussion about values has been found to be very difficult. We present an online deliberative communication approach for the anonymous deliberation of values and claim that the framework has the elements required for the elicitation of shared values.
- Description: 2003007509
Optimization methods and the k-committees algorithm for clustering of sequence data
- Authors: Yearwood, John , Bagirov, Adil , Kelarev, Andrei
- Date: 2009
- Type: Text , Journal article
- Relation: Applied and Computational Mathematics Vol. 8, no. 1 (2009), p. 92-101
- Relation: http://purl.org/au-research/grants/arc/DP0211866
- Relation: http://purl.org/au-research/grants/arc/DP0666061
- Full Text: false
- Description: The present paper is devoted to new algorithms for unsupervised clustering based on the optimization approaches due to [2], [3] and [4]. We consider a novel situation, where the datasets consist of nucleotide or protein sequences and rather sophisticated biologically significant alignment scores have to be used as a measure of distance. Sequences of this kind cannot be regarded as points in a finite dimensional space. Besides, the alignment scores do not satisfy properties of Minkowski metrics. Nevertheless the optimization approaches have made it possible to introduce a new k-committees algorithm and compare its performance with previous algorithms for two datasets. Our experimental results show that the k-committees algorithm achieves intermediate accuracy for a dataset of ITS sequences, and it can perform better than the discrete k-means and Nearest Neighbour algorithms for certain datasets. All three algorithms achieve good agreement with clusters published in the biological literature before and can be used to obtain biologically significant clusterings.
Optimization of multiple classifiers in data mining based on string rewriting systems
- Authors: Dazeley, Richard , Kelarev, Andrei , Yearwood, John , Mammadov, Musa
- Date: 2009
- Type: Text , Journal article
- Relation: Asian-European Journal of Mathematics Vol. 2, no. 1 (2009), p. 41-56
- Relation: https://purl.org/au-research/grants/arc/DP0211866
- Relation: https://purl.org/au-research/grants/arc/LP0669752
- Full Text:
- Description: Optimization of multiple classifiers is an important problem in data mining. We introduce additional structure on the class sets of the classifiers using string rewriting systems with a convenient matrix representation. The aim of the present paper is to develop an efficient algorithm for the optimization of the number of errors of individual classifiers, which can be corrected by these multiple classifiers.
Rees matrix constructions for clustering of data
- Authors: Kelarev, Andrei , Watters, Paul , Yearwood, John
- Date: 2009
- Type: Text , Journal article
- Relation: Journal of the Australian Mathematical Society Vol. 87, no. 3 (2009), p. 377-393
- Relation: http://purl.org/au-research/grants/arc/DP0211866
- Full Text:
- Reviewed:
- Description: This paper continues the investigation of semigroup constructions motivated by applications in data mining. We give a complete description of the error-correcting capabilities of a large family of clusterers based on Rees matrix semigroups well known in semigroup theory. This result strengthens and complements previous formulas recently obtained in the literature. Examples show that our theorems do not generalize to other classes of semigroups.
The impact of frame semantic annotation levels, frame-alignment techniques, and fusion methods on factoid answer processing
- Authors: Ofoghi, Bahadorreza , Yearwood, John , Liping, Ma
- Date: 2009
- Type: Text , Journal article
- Relation: Journal of the American Society for Information Science and Technology Vol. 60, no. 2 (2009), p. 247-263
- Full Text: false
- Reviewed:
- Description: The impact of frame semantic enrichment of texts on the task of factoid question answering (QA) is studied in this paper. In particular, we consider different techniques for answer processing with frame semantics: the level of semantic class identification and role assignment to texts, and the fusion of frame semantic-based answer-processing approaches with other methods used in the Text REtrieval Conference (TREC). The impact of each of these aspects on the overall performance of a QA system is analyzed in this paper. The TREC 2004 and TREC 2006 factoid question sets were used for the experiments. These demonstrate that the exploitation of encapsulated frame semantics in FrameNet in a shallow semantic parsing process can enhance answer-processing performance in factoid QA systems. This improvement is dependent on the level of semantic annotation, the frame semantic alignment method, and the method of fusing frame semantic-based answer-processing models with other existing models. A more comprehensively annotated environment with all different part-of-speech target predicates provides a higher chance of correct factoid answer retrieval where semantic alignment is based on both semantic classes and a relaxed set of semantic roles for answer span identification. Our experiments on fusion techniques of frame semantic-based and entity-based answer-processing models show that merging answer lists with respect to their scores and redundancy by exploiting a fusion function leads to a more effective overall factoid QA system compared to the use of individual models.
The processes of ICT diffusion in technology projects
- Authors: Jagodic, Jana , Courvisanos, Jerry , Yearwood, John
- Date: 2009
- Type: Text , Journal article
- Relation: Innovation: Management Policy & Practice Vol. 11, no. 3 (2009), p. 291-303
- Full Text:
- Reviewed:
- Description: Delivering technology projects on time with a specified budget and resources has emerged as a strategic imperative in the highly competitive business world. One of the project challenges is increasingly tied to diffusion (spread) of Information and Communication Technology (ICT) innovation. This paper presents an empirical study that examines how ICT innovation is diffused within technology projects. Based on the case study methodology within 12 organisations in Australia and Germany, it emerged that ICT innovation is diffused formally alongside standard project management phases and informally within informal networks. The findings are synthesised in a new framework that seeks to inform theory and practice about formal and informal processes of ICT diffusion in technology projects.
- Description: 2003007370
Unsupervised segmentation of Industrial Images using Markov Random Field Model
- Authors: Islam, Mofakharul , Yearwood, John , Vamplew, Peter
- Date: 2009
- Type: Text , Book chapter
- Relation: Technological Developments in Education and Automation p. 369-374
- Full Text: false
- Reviewed:
- Description: We propose a novel approach to investigate and implement unsupervised image content understanding and segmentation of color industrial images like medical imaging, forensic imaging, security and surveillance imaging, biotechnical imaging, biometrics, mineral and mining imaging, material science imaging, and many more. In this particular work, our focus will be on medical images only. The aim is to develop a computer aided diagnosis (CAD) system based on a newly developed Multidimensional Spatially Variant Finite Mixture Model (MSVFMM) using Markov Random Fields (MRF) Model. Unsupervised means automatic discovery of classes or clusters in images rather than generating the class or cluster descriptions from training image sets. The aim of this work is to produce precise segmentation of color medical images on the basis of subtle color and texture variation. Finer segmentation of images has tremendous potential in medical imaging where subtle information related to color and texture is required to analyze the image accurately. In this particular work, we have used CIE-Luv and Daubechies wavelet transforms as color and texture descriptors respectively. Using the combined effect of a CIE-Luv color model and Daubechies transforms, we can segment color medical images precisely in a meaningful manner. The evaluation of the results is done through comparison of the segmentation quality with another similar alternative approach and it is found that the proposed approach is capable of producing more faithful segmentation.
Weblogs for market research : Finding more relevant opinion documents using system fusion
- Authors: Osman, Deanna , Yearwood, John , Vamplew, Peter
- Date: 2009
- Type: Text , Journal article
- Relation: Online Information Review Vol. 33, no. 5 (2009), p. 873-888
- Full Text: false
- Reviewed:
- Description: Purpose - The purpose of this paper is to examine the usefulness of fusion as a means of improving the precision of automated opinion detection. Design/methodology/approach - Five system fusion methods are proposed and tested using runs submitted by the Text REtrieval Conference (TREC) Blog06 participants as input. The methods include a voting method, an inverse rank method (IRM), a linear-normalised score method and two weighted methods that use a weighted IRM score to rank the document. Findings - Mean average precision (MAP) is used as an indicator of the performance of the runs in this study. The best system fusion method achieves a 55.5 percent higher MAP result compared with the highest MAP result of any individual run submitted by the Blog06 participants. This equates to an increase in detection of 2,398 relevant opinion documents (21 percent). Practical implications - System fusion can be used to improve upon the results achieved by existing individual opinion detection systems. On the other hand, multiple opinion detection approaches can be combined into one system and fusion used to combine the results to build in diversity. Diversity within fusion inputs can increase the improvements achieved by fusion methods. The improved output from a diverse opinion detection system will then contain a higher number of relevant documents and reduce the incidence of high-ranking non-relevant documents and low-ranking relevant documents. Originality/value - The fusion methods proposed in this study demonstrate that simple fusion of opinion detection systems can improve performance.
Workload coverage through nonsmooth optimization
- Authors: Sukhorukova, Nadezda , Ugon, Julien , Yearwood, John
- Date: 2009
- Type: Text , Journal article
- Relation: Optimization Methods and Software Vol. 24, no. 2 (2009), p. 285-298
- Full Text: false
- Reviewed:
- Description: In this paper, workload coverage is the problem of identifying a pattern of days worked and days off, along with the number of hours worked on each work day. This pattern must satisfy certain work-related constraints and fit best to a predefined workload. In our study, we formulate the problem of workload coverage as an optimization problem. We propose a number of models which take into consideration various staffing constraints. For each of these models, our study aims to find a compromise between an accurate workload coverage and the ability to solve the corresponding optimization problems in a reasonable time. Numerical experiments on each model are carried out and the results are presented. Interestingly, the nonlinear programming approaches are found to be competitive with linear programming ones. © 2009 Taylor & Francis.
A new scoring system in Cystic Fibrosis : Statistical tools for database analysis - A preliminary report
- Authors: Hafen, Gaudenz , Hurst, Cameron , Yearwood, John , Smith, Julie , Dzalilov, Zari , Robinson, P. J.
- Date: 2008
- Type: Text , Journal article
- Relation: BMC Medical Informatics and Decision Making Vol. 8, no. 44 (2008), p. 1-11
- Full Text:
- Reviewed:
- Description: Background. Cystic fibrosis is the most common fatal genetic disorder in the Caucasian population. Scoring systems for assessment of Cystic fibrosis disease severity have been used for almost 50 years, without being adapted to the milder phenotype of the disease in the 21st century. The aim of this current project is to develop a new scoring system using a database and employing various statistical tools. This study protocol reports the development of the statistical tools in order to create such a scoring system. Methods. The evaluation is based on the Cystic Fibrosis database from the cohort at the Royal Children's Hospital in Melbourne. Initially, unsupervised clustering of all the data records was performed using a range of clustering algorithms. In particular incremental clustering algorithms were used. The clusters obtained were characterised using rules from decision trees and the results examined by clinicians. In order to obtain a clearer definition of classes expert opinion of each individual's clinical severity was sought. After data preparation including expert-opinion of an individual's clinical severity on a 3 point-scale (mild, moderate and severe disease), two multivariate techniques were used throughout the analysis to establish a method that would have a better success in feature selection and model derivation: 'Canonical Analysis of Principal Coordinates' and 'Linear Discriminant Analysis'. A 3-step procedure was performed with (1) selection of features, (2) extracting 5 severity classes out of a 3 severity class as defined per expert-opinion and (3) establishment of calibration datasets. Results. (1) Feature selection: CAP has a more effective "modelling" focus than DA. (2) Extraction of 5 severity classes: after variables were identified as important in discriminating contiguous CF severity groups on the 3-point scale as mild/moderate and moderate/severe, Discriminant Function (DF) was used to determine the new groups mild, intermediate moderate, moderate, intermediate severe and severe disease. (3) Generated confusion tables showed a misclassification rate of 19.1% for males and 16.5% for females, with a majority of misallocations into adjacent severity classes particularly for males. Conclusion. Our preliminary data show that using CAP for detection of selection features and Linear DA to derive the actual model in a CF database might be helpful in developing a scoring system. However, there are several limitations, particularly more data entry points are needed to finalize a score and the statistical tools have further to be refined and validated, with re-running the statistical methods in the larger dataset. © 2008 Hafen et al; licensee BioMed Central Ltd.
A study of the use of structured reasoning frameworks for improving students' reasoning quality
- Authors: Yearwood, John , Stranieri, Andrew
- Date: 2008
- Type: Text , Journal article
- Relation: Learning and Teaching: an international journal in classroom pedagogy Vol. 1, no. 1 (2008), p. 71-90
- Full Text: false
- Reviewed:
- Description: C1
- Description: 2003006498