A server side solution for detecting webInject : A machine learning approach
- Authors: Moniruzzaman, Md , Bagirov, Adil , Gondal, Iqbal , Brown, Simon
- Date: 2018
- Type: Text , Conference proceedings , Conference paper
- Relation: 22nd Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2018; Melbourne, Australia; 3rd June 2018; published in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 11154 LNAI, p. 162-167
- Description: With the advancement of client-side, on-the-fly web content generation techniques, it has become easier for attackers to modify the content of a website dynamically and gain access to valuable information. The majority of online attacks are now carried out through WebInject. End users are not always skilled enough to differentiate between injected content and the actual content of a webpage. Some existing solutions are designed for the client side, so every user has to install them on their system, which is a challenging task. In addition, individuals use various platforms and tools, so different solutions need to be designed. Existing server-side solutions often focus on sanitizing and filtering inputs, and they fail to detect obfuscated and hidden scripts. In this paper, we propose a server-side solution using a machine learning approach to detect WebInject in banking websites. Unlike other techniques, our method collects features of a Document Object Model (DOM) and classifies it with the help of a pre-trained model.
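- Illustrative sketch (not from the paper): the abstract does not say which DOM features are collected, so the feature names below (`scripts`, `iframes`, `forms`, `hidden_inputs`) and the helper `dom_features` are assumptions chosen only to show what "collecting features of a DOM" on the server side might look like. The pre-trained classifier that would consume these features is not shown.

```python
from collections import Counter
from html.parser import HTMLParser


class DomFeatureCollector(HTMLParser):
    """Counts start tags and hidden inputs while parsing an HTML string."""

    def __init__(self):
        super().__init__()
        self.tag_counts = Counter()
        self.hidden_inputs = 0

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag names before calling this hook.
        self.tag_counts[tag] += 1
        if tag == "input" and dict(attrs).get("type") == "hidden":
            self.hidden_inputs += 1


def dom_features(html):
    """Return a small, hypothetical feature dictionary for one page."""
    collector = DomFeatureCollector()
    collector.feed(html)
    collector.close()
    return {
        "scripts": collector.tag_counts["script"],
        "iframes": collector.tag_counts["iframe"],
        "forms": collector.tag_counts["form"],
        "hidden_inputs": collector.hidden_inputs,
    }


page = '<form><input type="hidden" name="t"><script>x()</script></form>'
features = dom_features(page)
```

A feature dictionary like this would then be turned into a numeric vector and passed to whatever pre-trained model the server uses.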
Automatically generating classifier for phishing email prediction
- Authors: Ma, Liping , Torney, Rosemary , Watters, Paul , Brown, Simon
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at I-SPAN 2009 - The 10th International Symposium on Pervasive Systems, Algorithms, and Networks, Kaohsiung, Taiwan : 14th-16th December 2009 p. 779-783
- Description: Phishing is a form of online identity theft that employs both social engineering and technical subterfuge to steal consumers' personal identity data and financial account credentials. Phishing email prediction has drawn a lot of attention from many researchers. According to current anti-phishing research, a classifier generated by a decision tree produces the most accurate predictions. However, there appears to be no open-source tool available to translate such a decision tree into an implementable classifier. The work presented in this paper builds a decision tree parser which automatically translates a decision tree into an implementable programming language so that the decision tree is useful in real-world applications. Experimental results show that the parser performs as well as the original decision tree. © 2009 IEEE.
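- Illustrative sketch (not from the paper): the idea of translating a decision tree into program code can be shown with a few lines of Python. The nested-dictionary tree format, the feature names (`num_links`, `has_ip_url`) and the functions `tree_to_source` and `compile_tree` are all hypothetical; the paper's parser, its input format and its target language are not described in the abstract.

```python
def tree_to_source(node, indent=1):
    """Recursively emit if/else source code for a nested-dict decision tree."""
    pad = "    " * indent
    if not isinstance(node, dict):
        # Leaf: the node itself is the predicted class label.
        return f"{pad}return {node!r}\n"
    cond = f"features[{node['feature']!r}] <= {node['threshold']!r}"
    return (
        f"{pad}if {cond}:\n"
        + tree_to_source(node["left"], indent + 1)
        + f"{pad}else:\n"
        + tree_to_source(node["right"], indent + 1)
    )


def compile_tree(tree, name="classify"):
    """Generate the source of a classifier function and compile it."""
    source = f"def {name}(features):\n" + tree_to_source(tree)
    namespace = {}
    exec(source, namespace)  # acceptable here: the source is generated locally
    return namespace[name], source


# Hypothetical tree: "left" is taken when feature <= threshold.
example_tree = {
    "feature": "num_links",
    "threshold": 3,
    "left": "ham",
    "right": {
        "feature": "has_ip_url",
        "threshold": 0,
        "left": "ham",
        "right": "phishing",
    },
}

classify, source = compile_tree(example_tree)
```

Printing `source` shows the generated function, which can be saved and deployed without the original tree or the library that trained it.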
Categorical features transformation with compact one-hot encoder for fraud detection in distributed environment
- Authors: Ul Haq, Ikram , Gondal, Iqbal , Vamplew, Peter , Brown, Simon
- Date: 2019
- Type: Text , Conference proceedings , Conference paper
- Relation: 2019 16th Australasian Conference on Data Mining, AusDM 2018; Bathurst, NSW; 28 November 2018 through 30 November 2018 Vol. 996, p. 69-80
- Description: Fraud detection for online banking is an important research area, but one of the challenges is the heterogeneous nature of transaction data, i.e. a combination of numeric and mixed attributes. Classification, regression and clustering algorithms usually perform better on numeric data. However, many machine learning problems have categorical (nominal) features rather than numeric features only. In addition, some machine learning platforms such as Apache Spark accept numeric data only. One-hot Encoding (OHE) is a widely used approach for transforming categorical features to numerical features in traditional data mining tasks. The one-hot approach has its own challenges: the transformed data is sparse, and the distinct values of an attribute are not always known in advance. Besides model accuracy, the compactness of machine learning models is equally important due to growing memory and storage needs. This paper presents an innovative technique to transform categorical features to numeric features by compacting sparse data even if all the distinct values are not known. The transformed data can be used for the development of fraud detection systems. The accuracy of the results has been validated on synthetic and real bank fraud data and a publicly available anomaly detection (KDD-99) dataset on a multi-node data cluster. © Springer Nature Singapore Pte Ltd. 2019.
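- Illustrative sketch (not from the paper): the two challenges named in the abstract, sparseness and values not known in advance, can be seen in a plain one-hot encoder. This is standard one-hot encoding with an extra "unknown" slot, not the paper's compact encoder, whose details the abstract does not give. The `channels` data and the function names are assumptions.

```python
def fit_vocabulary(values):
    """Map each distinct category seen during fitting to a column index."""
    vocab = {}
    for value in values:
        if value not in vocab:
            vocab[value] = len(vocab)
    return vocab


def one_hot(value, vocab):
    """Dense one-hot vector; unseen values fall into a final 'unknown' slot."""
    vector = [0] * (len(vocab) + 1)
    vector[vocab.get(value, len(vocab))] = 1
    return vector


def one_hot_index(value, vocab):
    """Sparse form: only the position of the single non-zero entry."""
    return vocab.get(value, len(vocab))


channels = ["web", "mobile", "atm", "web"]  # hypothetical transaction channel
vocab = fit_vocabulary(channels)

dense_known = one_hot("mobile", vocab)
dense_unseen = one_hot("branch", vocab)  # "branch" was never seen in fitting
```

Each dense vector holds exactly one non-zero entry, so its length grows with the number of distinct values while the information content stays the same; storing only the index, as `one_hot_index` does, is the usual way to avoid that waste.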
Cyberattack triage using incremental clustering for intrusion detection systems
- Authors: Taheri, Sona , Bagirov, Adil , Gondal, Iqbal , Brown, Simon
- Date: 2020
- Type: Text , Journal article
- Relation: International Journal of Information Security Vol. 19, no. 5 (2020), p. 597-607
- Relation: http://purl.org/au-research/grants/arc/DP190100580
- Description: Intrusion detection systems (IDSs) are devices or software applications that monitor networks or systems for malicious activities and signal alerts/alarms when such activity is discovered. However, an IDS may generate many false alerts which affect its accuracy. In this paper, we develop a cyberattack triage algorithm to detect these alerts (so-called outliers). The proposed algorithm is designed using clustering, optimization and distance-based approaches. An optimization-based incremental clustering algorithm is proposed to find clusters of different types of cyberattacks. Using a special procedure, a set of clusters is divided into two subsets: normal and stable clusters. Then, outliers are found among stable clusters using an average distance between centroids of normal clusters. The proposed algorithm is evaluated using two well-known IDS data sets, Knowledge Discovery and Data Mining Cup 1999 and UNSW-NB15, and compared with some other existing algorithms. Results show that the proposed algorithm has high detection accuracy and a very low false negative rate. © 2019, Springer-Verlag GmbH Germany, part of Springer Nature.
- Description: This research was conducted in the Internet Commerce Security Laboratory (ICSL), funded by Westpac Banking Corporation Australia. In addition, the research by Dr. Sona Taheri and A/Prof. Adil Bagirov was supported by the Australian Government through the Australian Research Council’s Discovery Projects funding scheme (DP190100580).
Detecting phishing emails using hybrid features
- Authors: Ma, Liping , Ofoghi, Bahadorreza , Watters, Paul , Brown, Simon
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at 2009 Symposia and Workshops on Ubiquitous, Autonomic and Trusted Computing, UIC-ATC '09, Brisbane, Queensland : 7th-9th July 2009 p. 493-497
- Description: Phishing emails have been used widely to defraud financial organizations and their customers. Phishing email detection has drawn a lot of attention from many researchers, and malicious-email detection devices are installed in email servers. However, phishing has become more and more complicated and sophisticated, and attacks can bypass the filters set by anti-phishing techniques. In this paper, we present a method to build a robust classifier to detect phishing emails using hybrid features and to select features using information gain. We use 10-fold cross-validation to build an initial classifier which performs well. The experiment also analyses the quality of each feature using information gain, and the best feature set is selected after a recursive learning process. Experimental results show that the selected features perform as well as the original features. Finally, we test five machine learning algorithms and compare the performance of each. The results show that the decision tree builds the best classifier.
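- Illustrative sketch (not from the paper): information gain, the feature-quality measure named in the abstract, is the reduction in label entropy obtained by splitting the data on a feature. The toy labels and the two features below (`has_ip_url`, `has_greeting`) are assumptions made up for the example; the paper's hybrid features are not listed in the abstract.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after the split."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


labels = ["phish", "phish", "ham", "ham"]
has_ip_url = [1, 1, 0, 0]    # separates the two classes perfectly
has_greeting = [1, 0, 1, 0]  # tells us nothing about the class

gain_ip = information_gain(has_ip_url, labels)
gain_greeting = information_gain(has_greeting, labels)
```

Ranking features by this score and keeping the highest-scoring ones is one simple way to select a feature set; here `has_ip_url` scores the full 1 bit and `has_greeting` scores 0.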
Establishing reasoning communities of security experts for Internet Commerce Security
- Authors: Kelarev, Andrei , Brown, Simon , Watters, Paul , Wu, Xinwen , Dazeley, Richard
- Date: 2011
- Type: Text , Book chapter
- Relation: Technologies for supporting reasoning communities and collaborative decision making : Cooperative approaches p. 380-396
- Description: The highly sophisticated and rapidly evolving area of internet commerce security presents many novel challenges for the organization of discourse in reasoning communities. This chapter suggests appropriate reasoning methods and demonstrates how establishing reasoning communities of security experts and enabling productive group discourse among them can play a crucial role in successful resolution of problems concerning the implementation, integration, deployment and maintenance of flexible local security systems for defense against malware threats in internet security. Local security systems of this sort may combine several ready open source or commercial software packages behind a common front-end and may enhance and supplement their facilities with additional plug-ins. To illustrate the diverse character of challenges the reasoning communities in internet security are likely to be faced with, this chapter concentrates on defense against phishing attacks. This example was selected as it is one of the newest and most rapidly changing application domains for the principles of organizing reasoning communities. The major group discourse methods suggested for the reasoning communities of security experts in this chapter include the Delphi Method, the Wideband Delphi Process, the Generic/Actual Argument Model of Structured Reasoning, Brainstorming, Reverse Brainstorming, Consensus Decision Making, Voting, Open Delphi and Open Brainstorming Methods. The Delphi Method and Wideband Delphi Process are suggested as tools for organizing a cohesive reasoning architecture, for coordinating other methods, and for preparing and allocating other methods to particular issues.
Multi-source cyber-attacks detection using machine learning
- Authors: Taheri, Sona , Gondal, Iqbal , Bagirov, Adil , Harkness, Greg , Brown, Simon , Chi, Chihung
- Date: 2019
- Type: Text , Conference proceedings , Conference paper
- Relation: 2019 IEEE International Conference on Industrial Technology, ICIT 2019; Melbourne, Australia; 13th-15th February 2019 Vol. 2019-February, p. 1167-1172
- Description: The Internet of Things (IoT) has significantly increased the number of devices connected to the Internet ranging from sensors to multi-source data information. As the IoT continues to evolve with new technologies, the number of threats and attacks against IoT devices is increasing. Analyzing and detecting these attacks originating from different sources requires machine learning models. These models provide proactive solutions for detecting attacks and their sources. In this paper, we propose to apply a supervised machine learning classification technique to identify cyber-attacks from each source. More precisely, we apply the incremental piecewise linear classifier, which constructs a boundary between sources/classes incrementally, starting with one hyperplane and adding more hyperplanes at each iteration. The algorithm terminates when no further significant improvement of the separation of sources/classes is possible. The construction and usage of piecewise linear boundaries allow us to avoid any possible overfitting. We apply the incremental piecewise linear classifier on a multi-source real-world cyber security data set to identify cyber-attacks and their sources.
- Description: Proceedings of the IEEE International Conference on Industrial Technology
The case for a consistent cyberscam classification framework (CCCF)
- Authors: Stabek, Amber , Brown, Simon , Watters, Paul
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at UIC-ATC 2009 - Symposia and Workshops on Ubiquitous, Autonomic and Trusted Computing in Conjunction with the UIC'09 and ATC'09 Conferences, Brisbane : 7th-9th July 2009 p. 525-530
- Description: Cyberscam classification schemes developed by international statistical reporting bodies, including the Bureau of Statistics (Australia), the Internet Crime Complaint Center (US), and the Environics Research Group (Canada), are diverse and largely incompatible. This makes comparisons of cyberscam incidence across jurisdictions very difficult. This paper argues that the critical first step towards the development of an inter-jurisdictional and global approach to identify and intercept cyberscams - and prosecute scammers - is a uniform classification system. © 2009 IEEE.
Using differencing to increase distinctiveness for phishing website clustering
- Authors: Layton, Robert , Brown, Simon , Watters, Paul
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at UIC-ATC 2009 - Symposia and Workshops on Ubiquitous, Autonomic and Trusted Computing in Conjunction with the UIC'09 and ATC'09 Conferences, Brisbane : 7th-9th July 2009 p. 488-492
- Description: Phishing webpages present a previously underused resource for information on determining the provenance of phishing attacks. Phishing webpages aim to impersonate a legitimate website in order to trick their potential victims into revealing their confidential data, such as usernames and passwords. However, different phishing webpages often contain small differences, and these differences can provide a great deal of evidence on the provenance of phishing attacks. When impersonating a webpage, there is often a large amount of 'redundant' information, as much of the original, impersonated website is found in phishing websites, making phishing websites across different attacks very similar. To overcome this issue, a diff can be used which takes the phishing and original websites as input and returns the differences between the two. These differences present a new view on the data that was previously unused and a novel way to increase the ability of clustering algorithms to find good, distinct and separated clusters within the data. The research presented here outlines this diff process and shows that, for the data used, comparable results were obtained while the dimensionality of the dataset was reduced. This reduction allows clustering algorithms to complete faster. © 2009 IEEE.
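- Illustrative sketch (not from the paper): the diff step described in the abstract can be approximated with Python's standard `difflib` module. The line-based comparison, the function name `page_diff` and the two example pages are assumptions; the abstract does not say which diff tool or granularity the authors used.

```python
import difflib


def page_diff(original_html, phishing_html):
    """Return only the lines that differ between the two pages.

    Lines removed from the original are prefixed with '-', lines added
    by the phishing page with '+'.
    """
    diff = difflib.unified_diff(
        original_html.splitlines(),
        phishing_html.splitlines(),
        lineterm="",
        n=0,  # no context lines: keep only the changes themselves
    )
    # The first two lines are the '---' / '+++' file headers; hunk headers
    # start with '@@' and are dropped by the prefix filter.
    body = list(diff)[2:]
    return [line for line in body if line.startswith(("+", "-"))]


original = (
    '<form action="https://bank.example/login">\n'
    '<input name="user">\n'
    "</form>"
)
phishing = (
    '<form action="http://evil.example/steal">\n'
    '<input name="user">\n'
    "</form>"
)

changes = page_diff(original, phishing)
```

Only the changed `form` line survives, once with each prefix, so a clustering algorithm fed with `changes` sees far fewer tokens than one fed with the full page.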