Applying clustering and ensemble clustering approaches to phishing profiling
- Authors: Webb, Dean , Yearwood, John , Vamplew, Peter , Ma, Liping , Ofoghi, Bahadorreza , Kelarev, Andrei
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at Eighth Australasian Data Mining Conference, AusDM 2009, University of Melbourne, Melbourne, Victoria : 1st–4th December 2009
- Full Text:
- Description: 2003007911
Automatically generating classifier for phishing email prediction
- Authors: Ma, Liping , Torney, Rosemary , Watters, Paul , Brown, Simon
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at I-SPAN 2009 - The 10th International Symposium on Pervasive Systems, Algorithms, and Networks, Kaohsiung, Taiwan : 14th-16th December 2009 p. 779-783
- Full Text:
- Description: Phishing is a form of online identity theft that employs both social engineering and technical subterfuge to steal consumers' personal identity data and financial account credentials. Phishing email prediction has drawn a lot of attention from many researchers. According to current anti-phishing research, a classifier generated by a decision tree produces the most accurate predictions. However, there appears to be no open-source tool for converting such a decision tree into an implementable classifier. The work presented in this paper builds a decision tree parser which automatically translates a decision tree into code in an implementable programming language so that the decision tree is useful in real-world applications. Experimental results show that the parser performs as well as the original decision tree. © 2009 IEEE.
- Description: 2003007989
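The idea of translating a decision tree into an implementable program can be illustrated with a minimal sketch. This is not the parser from the paper: the tuple-based tree format, the feature names `num_links` and `has_ip_url`, and the helpers `predict`, `to_source` and `compile_tree` are all assumptions made for illustration.

```python
# Hypothetical tree format (an assumption, not the paper's):
# internal nodes are (feature, threshold, left, right); leaves are label strings.
TREE = ("num_links", 3,
        ("has_ip_url", 0.5, "ham", "phishing"),
        "phishing")


def predict(tree, sample):
    """Walk the tree directly; this is the reference behaviour."""
    while not isinstance(tree, str):
        feature, threshold, left, right = tree
        tree = left if sample[feature] <= threshold else right
    return tree


def to_source(tree, name="classify"):
    """Translate the tree into the source text of a standalone function."""
    lines = [f"def {name}(sample):"]

    def emit(node, depth):
        pad = "    " * depth
        if isinstance(node, str):
            lines.append(f"{pad}return {node!r}")
            return
        feature, threshold, left, right = node
        lines.append(f"{pad}if sample[{feature!r}] <= {threshold!r}:")
        emit(left, depth + 1)
        lines.append(f"{pad}else:")
        emit(right, depth + 1)

    emit(tree, 1)
    return "\n".join(lines) + "\n"


def compile_tree(tree):
    """Compile the generated source and return the resulting function."""
    namespace = {}
    exec(to_source(tree), namespace)
    return namespace["classify"]
```

Calling `compile_tree(TREE)` returns an ordinary function built from nested `if`/`else` statements, which can be checked against `predict` on the same samples to confirm that the generated code agrees with the original tree.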
The impact of semantic class identification and semantic role labeling on natural language answer extraction
- Authors: Ofoghi, Bahadorreza , Yearwood, John , Ma, Liping
- Date: 2008
- Type: Text , Conference paper
- Relation: Paper presented at 30th European Conference on IR Research, ECIR 2008, Glasgow, UK : 30th March - 3rd April 2008 p. 430-437
- Full Text: false
- Description: In satisfying an information need by a Question Answering (QA) system, there are text understanding approaches which can enhance the performance of final answer extraction. Exploiting the FrameNet lexical resource in this process inspires analysis of the levels of semantic representation in the automated practice where the task of semantic class and role labeling takes place. In this paper, we analyze the impact of different levels of semantic parsing on answer extraction with respect to the individual sub-tasks of frame evocation and frame element assignment.
- Description: 2003006587
Detecting phishing emails using hybrid features
- Authors: Ma, Liping , Ofoghi, Bahadorreza , Watters, Paul , Brown, Simon
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at 2009 Symposia and Workshops on Ubiquitous, Autonomic and Trusted Computing, UIC-ATC '09, Brisbane, Queensland : 7th-9th July 2009 p. 493-497
- Full Text:
- Description: Phishing emails have been used widely to defraud financial organizations and their customers. Phishing email detection has drawn a lot of attention from many researchers, and malicious-email detection devices are installed in email servers. However, phishing has become more and more complicated and sophisticated, and attacks can bypass the filters set by anti-phishing techniques. In this paper, we present a method to build a robust classifier to detect phishing emails using hybrid features and to select features using information gain. We run 10 cross-validations to build an initial classifier which performs well. The experiment also analyses the quality of each feature using information gain, and the best feature set is selected after a recursive learning process. Experimental results show that the selected features perform as well as the original features. Finally, we test five machine learning algorithms and compare the performance of each. The results show that the decision tree builds the best classifier.
- Description: 2003007857
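Ranking features by information gain, as mentioned in the abstract above, can be sketched as follows. The toy labels and the feature names `has_ip_url` and `has_greeting` are made up for illustration and are not taken from the paper's data; the functions use the standard entropy-based definition of information gain for discrete features.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data (assumed): four emails, two hypothetical binary features.
labels = ["phishing", "phishing", "ham", "ham"]
features = {
    "has_ip_url": [1, 1, 0, 0],    # separates the two classes perfectly
    "has_greeting": [1, 0, 1, 0],  # carries no information about the class
}

# Features sorted from most to least informative.
ranking = sorted(features, key=lambda f: information_gain(features[f], labels), reverse=True)
```

A feature-selection loop would keep the top-ranked features and retrain the classifier; how many to keep is a design choice that this sketch leaves open.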
Grid-based information retrieval for the aggregation of legal datasets in online dispute resolution
- Authors: Saeed, Ather , Stranieri, Andrew , Dazeley, Richard , Ma, Liping
- Date: 2009
- Type: Text , Journal article
- Relation: Communications of SIWN Vol. 6, no. April (2009), p. 16-22
- Full Text: false
- Description: The Web is a stateless and complex environment when it comes to the retrieval of information from millions of computers connected to the Internet via WWW servers. Information Retrieval (IR) from heterogeneous data sources poses a great challenge as the information of interest is stored in a variety of different formats. Answering an enormous number of queries is a resource- and computation-intensive task in ODR (Online Dispute Resolution). Information availability also poses a challenge when it comes to the mediation and arbitration processes in resolving eCommerce and legal disputes. A new Grid-based information retrieval model is proposed for the aggregation and replication of legal datasets from remote machines with an index-based search facility. Datasets of interest will be indexed with a slight modification to the existing indexing scheme. A new strategy is proposed to deal with similar queries posted over and over again, in which the commonality among the XML query trees is exploited and the trees are merged for the efficient retrieval of information.
FrameNet-based fact-seeking answer processing : A study of semantic alignment techniques and lexical coverage
- Authors: Ofoghi, Bahadorreza , Yearwood, John , Ma, Liping
- Date: 2008
- Type: Text , Journal article
- Relation: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 5360 LNAI (1 December 2008 through 5 December 2008), p. 192-201
- Full Text: false
- Description: In this paper, we consider two aspects which affect the performance of factoid FrameNet-based Question Answering (QA): i) the frame semantic-based answer processing technique based on frame semantic alignment between questions and passages to identify answer candidates and score them, and ii) the lexical coverage of FrameNet over the predicates which represent the main actions in question and passage events. These are studied using a frame semantic-based QA run over the TREC 2004 and TREC 2006 factoid question sets. © 2008 Springer Berlin Heidelberg.
Scalable continuous query architecture for eCommerce and legal disputes
- Authors: Saeed, Ather , Stranieri, Andrew , Dazeley, Richard , Ma, Liping
- Date: 2008
- Type: Text , Journal article
- Relation: Communications of SIWN Vol. 3, no. (2008), p. 1-6
- Full Text: false
- Reviewed:
- Description: Continuous Queries (CQ) are persistent, content sensitive and time dependent. Once a CQ is installed, it will continuously poll the data sources and monitor updates of interest. This paper discusses major problems and issues with the existing CQ techniques for monitoring updates of interest on the web. A new Continuous Query based architecture is proposed to deal with the context-sensitive problems of negotiation, mediation and arbitration to resolve eCommerce and legal disputes. A business process model is given to automate mediation and arbitration processes in ODR (Online Dispute Resolution) to resolve disputes efficiently and in a timely manner. In the proposed CQ-Mediator architecture, partial page update and web services are integrated for efficient monitoring and notification of updates to the disputants, mediators and arbitrators. Performance results of the proposed architecture and business process model for CQ-based ODR are also discussed in the experiment section.
- Description: 2003006852
Two-step comprehensive open domain text annotation with frame semantics
- Authors: Ofoghi, Bahadorreza , Yearwood, John , Ma, Liping
- Date: 2007
- Type: Text , Conference paper
- Relation: Paper presented at Australasian Language Technology Workshop 2007, Melbourne Zoo, Melbourne, Victoria : 10th-11th December 2007 p. 83-91
- Full Text:
- Description: With shallow semantic parsing tasks receiving more attention in many natural language applications, there is a need for labelled corpora for learning the specific tags under consideration. In this paper, we discuss a two-step semantic class and semantic role assignment based on the FrameNet elements over a subset of the AQUAINT collection with a reasonable coverage over the semantic frames in FrameNet. The quality of the annotation task is examined through inter-annotator agreement. The methodology described in this work for measuring inter-annotator agreement can be adapted for similar tasks. Some central aspects of the task are also detailed in this paper.
- Description: 2003005522
A Gröbner-Shirshov Algorithm for Applications in Internet Security
- Authors: Kelarev, Andrei , Yearwood, John , Watters, Paul , Wu, Xinwen , Ma, Liping , Abawajy, Jemal , Pan, L.
- Date: 2011
- Type: Text , Journal article
- Relation: Southeast Asian Bulletin of Mathematics Vol. 35, no. (2011), p. 807-820
- Full Text: false
- Reviewed:
- Description: The design of multiple classification and clustering systems for the detection of malware is an important problem in internet security. Gröbner-Shirshov bases have been used recently by Dazeley et al. [15] to develop an algorithm for constructions with certain restrictions on the sandwich-matrices. We develop a new Gröbner-Shirshov algorithm which applies to a larger variety of constructions based on combinatorial Rees matrix semigroups without any restrictions on the sandwich-matrices.
Automatic sleep stage identification: difficulties and possible solutions
- Authors: Sukhorukova, Nadezda , Stranieri, Andrew , Ofoghi, Bahadorreza , Vamplew, Peter , Saleem, Muhammad Saad , Ma, Liping , Ugon, Adrien , Ugon, Julien , Muecke, Nial , Amiel, Hélène , Philippe, Carole , Bani-Mustafa, Ahmed , Huda, Shamsul , Bertoli, Marcello , Levy, P , Ganascia, J.G
- Date: 2010
- Type: Text , Conference proceedings
- Full Text:
- Description: The diagnosis of many sleep disorders is a labour-intensive task that involves the specialised interpretation of numerous signals, including brain wave, breath and heart rate, captured in overnight polysomnogram sessions. The automation of diagnoses is challenging for data mining algorithms because the data sets are extremely large and noisy, the signals are complex and specialists' analyses vary. This work reports on the adaptation of approaches from four fields (neural networks, mathematical optimisation, financial forecasting and frequency domain analysis) to the problem of automatically determining a patient's stage of sleep. Results, though preliminary, are promising and indicate that combined approaches may prove more fruitful than reliance on a single approach.