Zero-day malware detection based on supervised learning algorithms of API call signatures
- Authors: Alazab, Mamoun , Venkatraman, Sitalakshmi , Watters, Paul , Alazab, Moutaz
- Date: 2011
- Type: Text , Conference proceedings
- Full Text:
- Description: Zero-day or unknown malware are created using code obfuscation techniques that can modify the parent code to produce offspring copies which have the same functionality but with different signatures. Current techniques reported in literature lack the capability of detecting zero-day malware with the required accuracy and efficiency. In this paper, we have proposed and evaluated a novel method of employing several data mining techniques to detect and classify zero-day malware with high levels of accuracy and efficiency based on the frequency of Windows API calls. This paper describes the methodology employed for the collection of large data sets to train the classifiers, and analyses the performance results of the various data mining algorithms adopted for the study using a fully automated tool developed in this research to conduct the various experimental investigations and evaluation. Through the performance results of these algorithms from our experimental analysis, we are able to evaluate and discuss the advantages of one data mining algorithm over the other for accurately detecting zero-day malware successfully. The data mining framework employed in this research learns through analysing the behavior of existing malicious and benign codes in large datasets. We have employed robust classifiers, namely Naïve Bayes (NB) Algorithm, k-Nearest Neighbor (kNN) Algorithm, Sequential Minimal Optimization (SMO) Algorithm with 4 differents kernels (SMO - Normalized PolyKernel, SMO - PolyKernel, SMO - Puk, and SMO- Radial Basis Function (RBF)), Backpropagation Neural Networks Algorithm, and J48 decision tree and have evaluated their performance. Overall, the automated data mining system implemented for this study has achieved high true positive (TP) rate of more than 98.5%, and low false positive (FP) rate of less than 0.025, which has not been achieved in literature so far. This is much higher than the required commercial acceptance level indicating that our novel technique is a major leap forward in detecting zero-day malware. This paper also offers future directions for researchers in exploring different aspects of obfuscations that are affecting the IT world today. © 2011, Australian Computer Society, Inc.
- Description: 2003009506
Authorship attribution for Twitter in 140 characters or less
- Authors: Layton, Robert , Watters, Paul , Dazeley, Richard
- Date: 2010
- Type: Text , Conference paper
- Relation: Paper presented at - 2nd Cybercrime and Trustworthy Computing Workshop, CTC 2010 p. 1-8
- Full Text:
- Reviewed:
- Description: Authorship attribution is a growing field, moving from beginnings in linguistics to recent advances in text mining. Through this change came an increase in the capability of authorship attribution methods both in their accuracy and the ability to consider more difficult problems. Research into authorship attribution in the 19th century considered it difficult to determine the authorship of a document of fewer than 1000 words. By the 1990s this values had decreased to less than 500 words and in the early 21 st century it was considered possible to determine the authorship of a document in 250 words. The need for this ever decreasing limit is exemplified by the trend towards many shorter communications rather than fewer longer communications, such as the move from traditional multi-page handwritten letters to shorter, more focused emails. This trend has also been shown in online crime, where many attacks such as phishing or bullying are performed using very concise language. Cybercrime messages have long been hosted on Internet Relay Chats (IRCs) which have allowed members to hide behind screen names and connect anonymously. More recently, Twitter and other short message based web services have been used as a hosting ground for online crimes. This paper presents some evaluations of current techniques and identifies some new preprocessing methods that can be used to enable authorship to be determined at rates significantly better than chance for documents of 140 characters or less, a format popularised by the micro-blogging website Twitter1. We show that the SCAP methodology performs extremely well on twitter messages and even with restrictions on the types of information allowed, such as the recipient of directed messages, still perform significantly higher than chance. Further to this, we show that 120 tweets per user is an important threshold, at which point adding more tweets per user gives a small but non-significant increase in accuracy. © 2010 IEEE.
Automatically determining phishing campaigns using the USCAP methodology
- Authors: Layton, Robert , Watters, Paul , Dazeley, Richard
- Date: 2010
- Type: Text , Conference paper
- Relation: Paper presented at General Members Meeting and eCrime Researchers Summit, eCrime 2010 p. 1-8
- Full Text:
- Reviewed:
- Description: Phishing fraudsters attempt to create an environment which looks and feels like a legitimate institution, while at the same time attempting to bypass filters and suspicions of their targets. This is a difficult compromise for the phishers and presents a weakness in the process of conducting this fraud. In this research, a methodology is presented that looks at the differences that occur between phishing websites from an authorship analysis perspective and is able to determine different phishing campaigns undertaken by phishing groups. The methodology is named USCAP, for Unsupervised SCAP, which builds on the SCAP methodology from supervised authorship and extends it for unsupervised learning problems. The phishing website source code is examined to generate a model that gives the size and scope of each of the recognized phishing campaigns. The USCAP methodology introduces the first time that phishing websites have been clustered by campaign in an automatic and reliable way, compared to previous methods which relied on costly expert analysis of phishing websites. Evaluation of these clusters indicates that each cluster is strongly consistent with a high stability and reliability when analyzed using new information about the attacks, such as the dates that the attack occurred on. The clusters found are indicative of different phishing campaigns, presenting a step towards an automated phishing authorship analysis methodology. © 2010 IEEE.
The seven scam types: Mapping the terrain of cybercrime
- Authors: Stabek, Amber , Watters, Paul , Layton, Robert
- Date: 2010
- Type: Text , Conference proceedings
- Full Text:
- Description: Threat of cybercrime is a growing danger to the economy. Industries and businesses are targeted by cyber-criminals along with members of the general public. Since cybercrime is often a symptom of more complex criminological regimes such as laundering, trafficking and terrorism, the true damage caused to society is unknown. Dissimilarities in reporting procedures and non-uniform cybercrime classifications lead international reporting bodies to produce incompatible results which cause difficulties in making valid comparisons. A cybercrime classification framework has been identified as necessary for the development of an inter-jurisdictional, transnational, and global approach to identify, intercept, and prosecute cyber-criminals. Outlined in this paper is a cybercrime classification framework which has been applied to the incidence of scams. Content analysis was performed on over 250 scam descriptions stemming from in excess of 35 scamming categories and over 80 static features derived. Using hierarchical cluster and discriminant function analysis, the sample was reduced from over 35 ambiguous categories into 7 scam types and the top four scamming functions - identified as scamming business processes, revealed. The results of this research bear significant ramifications to the current state of scam and cybercrime classification, research and analysis, as well as offer significant insight into the business processes and applications adopted by scammers and cyber-criminals. © 2010 IEEE.
Windows rootkits: Attacks and countermeasures
- Authors: Lobo, Desmond , Watters, Paul , Wu, Xin , Sun, Li
- Date: 2010
- Type: Text , Conference proceedings
- Full Text:
- Description: Windows XP is the dominant operating system in the world today and rootkits have been a major concern for XP users. This paper provides an in-depth analysis of the rootkits that target that operating system, while focusing on those that use various hooking techniques to hide malware on a machine. We identify some of the weaknesses in the Windows XP architecture that rootkits exploit and then evaluate some of the anti-rootkit security features that Microsoft has unveiled in Vista and 7. To reduce the number of rootkit infections in the future, we suggest that Microsoft should take full advantage of Intel's four distinct privilege levels. © 2010 IEEE.
A preliminary profiling of internet money mules : An Australian perspective
- Authors: Aston, Manny , McCombie, Stephen , Reardon, Ben , Watters, Paul
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at 2009 Symposia and Workshops on Ubiquitous, Autonomic and Trusted Computing, UIC-ATC '09, Brisbane, Queensland : 7th-9th July 2009 p. 482-487
- Full Text:
- Description: Along with the massive growth in Internet commerce over the last ten years there has been a corresponding boom in Internet related crime, or cybercrime. According to research recently released by the Australian Bureau of Statistics in 2006 57,000 Australians aged 15 years and over fell victim to phishing and related Internet scams. Of all the victims of cybercrime, only one group is potentially subject to criminal prosecution: `Internet money mules'-those who, either knowingly or unknowingly, launder money. This paper examines the demographic profile-specifically age, gender and postcode-related to 660 confirmed money mule incidents recorded during the calendar year 2007, for a major Australian financial institution. This data is compared to ABS statistics of Internet usage in 2006. There is clear evidence of a strong gender bias towards males, particularly in the older age group. This is directly relevant when considering education and training programs for both corporations and the community on the issues surrounding Internet money mule scams and in ultimately understanding the problem of Internet banking fraud.
- Description: 2003007858
Detecting phishing emails using hybrid features
- Authors: Ma, Liping , Ofoghi, Bahadorreza , Watters, Paul , Brown, Simon
- Date: 2009
- Type: Text , Conference paper
- Relation: Paper presented at 2009 Symposia and Workshops on Ubiquitous, Autonomic and Trusted Computing, UIC-ATC '09, Brisbane, Queensland : 7th-9th July 2009 p. 493-497
- Full Text:
- Description: Phishing emails have been used widely in fraud of financial organizations and customers. Phishing email detection has drawn a lot attention for many researchers and malicious detection devices are installed in email servers. However, phishing has become more and more complicated and sophisticated and attack can bypass the filter set by anti-phishing techniques. In this paper, we present a method to build a robust classifier to detect phishing emails using hybrid features and to select features using information gain. We experiment on 10 cross-validations to build an initial classifier which performs well. The experiment also analyses the quality of each feature using information gain and best feature set is selected after a recursive learning process. Experimental result shows the selected features perform as well as the original features. Finally, we test five machine learning algorithms and compare the performance of each. The result shows that decision tree builds the best classifier.
- Description: 2003007857