Security and blockchain convergence with Internet of Multimedia Things : current trends, research challenges and future directions
- Authors: Jan, Mian , Cai, Jinjin , Gao, Xiang-Chuan , Khan, Fazlullah , Mastorakis, Spyridon , Usman, Muhammad , Alazab, Mamoun , Watters, Paul
- Date: 2021
- Type: Text , Journal article
- Relation: Journal of Network and Computer Applications Vol. 175 (2021)
- Full Text:
- Reviewed:
- Description: The Internet of Multimedia Things (IoMT) orchestration enables the integration of systems, software, cloud, and smart sensors into a single platform. The IoMT deals with scalar as well as multimedia data. In these networks, sensor-embedded devices and their data face numerous security challenges. In this paper, a comprehensive review of the existing literature for IoMT is presented in the context of security and blockchain. The latest literature on all three aspects of security, i.e., authentication, privacy, and trust, is provided to explore the challenges experienced by multimedia data. The convergence of blockchain and IoMT, along with multimedia-enabled blockchain platforms, is discussed for emerging applications. To highlight the significance of this survey, large-scale commercial projects focused on security and blockchain for multimedia applications are reviewed. The shortcomings of these projects are explored and suggestions for further improvement are provided. Based on this discussion, we present our own case study for the healthcare industry: a theoretical framework with security and blockchain as key enablers. The case study reflects the importance of security and blockchain in multimedia applications of the healthcare sector. Finally, we discuss the convergence of emerging technologies with security, blockchain and IoMT to visualize the future of tomorrow's applications. © 2020 Elsevier Ltd
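The survey's healthcare case study treats blockchain as an integrity layer for multimedia sensor records. As a minimal illustration of that general idea only (not the paper's framework; all names and fields below are hypothetical), the following Python sketch chains SHA-256 hashes of record payloads so that any later tampering is detectable:

```python
import hashlib
import time

def record_hash(prev_hash: str, payload: bytes) -> str:
    """Hash a record's payload together with the previous block's hash."""
    return hashlib.sha256(prev_hash.encode() + payload).hexdigest()

class IntegrityChain:
    """A toy hash chain for multimedia sensor records (integrity linkage
    only; a real blockchain adds distribution and consensus)."""

    def __init__(self):
        self.blocks = [{"hash": "0" * 64, "meta": "genesis"}]

    def append(self, payload: bytes, meta: dict) -> dict:
        block = {
            "meta": meta,
            "timestamp": time.time(),
            "hash": record_hash(self.blocks[-1]["hash"], payload),
        }
        self.blocks.append(block)
        return block

    def verify(self, payloads: list) -> bool:
        """Recompute every hash from the stored payloads, in order."""
        h = self.blocks[0]["hash"]
        for block, payload in zip(self.blocks[1:], payloads):
            h = record_hash(h, payload)
            if h != block["hash"]:
                return False
        return True

chain = IntegrityChain()
frame = b"...jpeg bytes from a ward camera..."
chain.append(frame, {"sensor": "cam-12", "patient": "anon-7"})
assert chain.verify([frame])
```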
REPLOT : REtrieving Profile Links on Twitter for malicious campaign discovery
- Authors: Perez, Charles , Birregah, Babiga , Layton, Robert , Lemercier, Marc , Watters, Paul
- Date: 2015
- Type: Text , Journal article
- Relation: AI Communications Vol. 29, no. 1 (2015), p. 107-122
- Full Text:
- Reviewed:
- Description: Social networking sites are increasingly subject to malicious activities such as self-propagating worms, confidence scams and drive-by-download malware. The large number of users, combined with the presence of sensitive data such as personal or professional information, presents an unprecedented opportunity for attackers, who are moving away from previous platforms of attack, such as email, towards social networking websites. In this paper, we present a full-stack methodology for the identification of campaigns of malicious profiles on social networking sites, composed of maliciousness classification, campaign discovery and attack profiling. The methodology, named REPLOT, for REtrieving Profile Links On Twitter, contains three major phases. First, profiles are analysed to determine whether they are more likely to be malicious or benign. Second, connections between suspected malicious profiles are retrieved using a late data fusion approach, consisting of models based on temporal and authorship analysis, to discover campaigns. Third, the discovered campaigns are analysed to investigate the attacks. In this paper, we apply this methodology to a real-world dataset with a view to understanding the links between malicious profiles, their attack methods and their connections. Our analysis identifies a cluster of linked profiles focusing on propagating malicious links, and profiles two other major clusters of attacking campaigns. © 2016 - IOS Press and the authors. All rights reserved.
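REPLOT's second phase links suspected malicious profiles through late data fusion of temporal and authorship-analysis scores. A minimal sketch of that fusion-then-cluster idea, assuming precomputed similarity matrices; the weights, threshold and toy values are illustrative, not the paper's:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def fuse_and_cluster(temporal_sim, authorship_sim, w=0.5, threshold=0.7):
    """Late data fusion: combine two profile-similarity matrices, keep
    strong links, and read campaigns off as connected components."""
    fused = w * temporal_sim + (1 - w) * authorship_sim
    adjacency = csr_matrix(fused >= threshold)
    n_campaigns, labels = connected_components(adjacency, directed=False)
    return n_campaigns, labels

# Toy similarities for 4 suspected profiles: 0-1 and 2-3 look linked.
temporal = np.array([[1, .9, .1, 0], [.9, 1, 0, .1],
                     [.1, 0, 1, .8], [0, .1, .8, 1]])
authorship = np.array([[1, .8, 0, 0], [.8, 1, 0, 0],
                       [0, 0, 1, .9], [0, 0, .9, 1]])
n, labels = fuse_and_cluster(temporal, authorship)
print(n, labels)  # 2 campaigns: [0 0 1 1]
```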
Automated unsupervised authorship analysis using evidence accumulation clustering
- Authors: Layton, Robert , Watters, Paul , Dazeley, Richard
- Date: 2013
- Type: Text , Journal article
- Relation: Natural Language Engineering Vol. 19, no. 1 (2013), p. 95-120
- Full Text:
- Reviewed:
- Description: Authorship Analysis aims to extract information about the authorship of documents from features within those documents. Typically, this is performed as a classification task with the aim of identifying the author of a document, given a set of documents of known authorship. Alternatively, unsupervised methods have been developed primarily as visualisation tools to assist the manual discovery of clusters of authorship within a corpus by analysts. However, there is a need in many fields for more sophisticated unsupervised methods to automate the discovery, profiling and organisation of related information through clustering of documents by authorship. An automated and unsupervised methodology for clustering documents by authorship is proposed in this paper. The methodology is named NUANCE, for n-gram Unsupervised Automated Natural Cluster Ensemble. Testing indicates that the derived clusters have a strong correlation to the true authorship of unseen documents. © 2011 Cambridge University Press.
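NUANCE, as described above, clusters documents by authorship using character n-gram features and an ensemble of clusterings. A compact sketch of the generic evidence accumulation clustering recipe on such features (ensemble size, the range of k and the linkage method are illustrative assumptions, not the paper's tuned settings):

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def eac_cluster(docs, n_runs=30, n_clusters_final=3, seed=0):
    """Evidence accumulation: many k-means runs vote on co-membership,
    then a final hierarchical cut is made on the co-association matrix."""
    X = TfidfVectorizer(analyzer="char", ngram_range=(3, 3)).fit_transform(docs)
    rng = np.random.default_rng(seed)
    n = len(docs)
    coassoc = np.zeros((n, n))
    for _ in range(n_runs):
        k = rng.integers(2, max(3, n // 2))  # random k per run
        labels = KMeans(n_clusters=int(k), n_init=5,
                        random_state=int(rng.integers(1 << 30))).fit_predict(X)
        coassoc += labels[:, None] == labels[None, :]
    dist = 1.0 - coassoc / n_runs          # co-association to distance
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist), method="average")
    return fcluster(Z, t=n_clusters_final, criterion="maxclust")
```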
Evaluating authorship distance methods using the positive Silhouette coefficient
- Authors: Layton, Robert , Watters, Paul , Dazeley, Richard
- Date: 2013
- Type: Text , Journal article
- Relation: Natural Language Engineering Vol. 19, no. 4 (2013), p. 517-535
- Full Text:
- Reviewed:
- Description: Unsupervised Authorship Analysis (UAA) aims to cluster documents by authorship without knowing the authorship of any documents. An important factor in UAA is the method for calculating the distance between documents; this choice of authorship distance method is considered more critical to the end result than the choice of cluster analysis algorithm. One method for measuring the correlation between a distance metric and a labelling (such as class values or clusters) is the Silhouette Coefficient (SC). The SC can be leveraged to measure the correlation between the authorship distance method and the true authorship, thereby evaluating the quality of the distance method. However, we show that the SC can be severely affected by outliers. To address this issue, we introduce the Positive Silhouette Coefficient (PSC), defined as the proportion of instances with a positive SC value. This measure is not easily skewed by outliers and is therefore more robust. A large number of authorship distance methods are then compared using the PSC, and the findings are presented. This research provides insight into the efficacy of methods for UAA and presents a framework for testing authorship distance methods.
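Because the Positive Silhouette Coefficient is simply the proportion of instances whose per-sample silhouette value is positive, it is easy to compute with standard tooling. A minimal sketch using scikit-learn's silhouette_samples (the toy distance matrix is illustrative):

```python
import numpy as np
from sklearn.metrics import silhouette_samples

def positive_silhouette_coefficient(distance_matrix, labels):
    """Proportion of instances with a positive silhouette value,
    which is less sensitive to outliers than the mean silhouette."""
    s = silhouette_samples(distance_matrix, labels, metric="precomputed")
    return float(np.mean(s > 0))

# Toy distance matrix for 4 documents with authorship labels [0, 0, 1, 1].
D = np.array([[0.0, 0.2, 0.9, 0.8],
              [0.2, 0.0, 0.7, 0.9],
              [0.9, 0.7, 0.0, 0.1],
              [0.8, 0.9, 0.1, 0.0]])
print(positive_silhouette_coefficient(D, [0, 0, 1, 1]))  # 1.0
```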
Illicit image detection : An MRF model based stochastic approach
- Authors: Islam, Mofakharul , Watters, Paul , Yearwood, John , Hussain, Mazher , Swarna, Lubaba
- Date: 2013
- Type: Text , Book chapter
- Relation: Innovations and Advances in Computer, Information, Systems Sciences, and Engineering p. 467-479
- Full Text:
- Reviewed:
- Description: The steady growth of the Internet, sophisticated digital image processing technology, the cheap availability of storage devices and users' ever-increasing interest in images have combined to make the Internet an unprecedentedly large image library. As a result, the Internet quickly became the principal medium for the distribution of pornographic content, favouring pornography's emergence as a drug of the millennium. With the arrival of GPRS mobile telephone technology and the large-scale rollout of 3G networks, along with the cheap availability of the latest handsets and a variety of wireless connections, the Internet has already gone mobile, driving us toward a new degree of complexity. In this paper, we propose a novel stochastic-model-based approach to investigate and implement a pornography detection technique, working towards a framework for automated detection of pornography based on contextual constraints that are representative of actual pornographic activity. Compared to the results published in recent works, our proposed approach yields the highest detection accuracy. © 2013 Springer Science+Business Media.
Illicit image detection using erotic pose estimation based on kinematic constraints
- Authors: Islam, Mofakharul , Watters, Paul , Yearwood, John , Hussain, Mazher , Swarna, Lubaba
- Date: 2013
- Type: Text , Book chapter
- Relation: Innovations and Advances in Computer, Information, Systems Sciences, and Engineering p. 481-495
- Full Text:
- Reviewed:
- Description: With the advent of the Internet and sophisticated digital image processing technology, the Internet quickly became the principal medium for the distribution of pornographic content, favouring pornography's emergence as a drug of the millennium. With the advent of GPRS mobile telephone networks and the large-scale arrival of 3G networks, along with the cheap availability of the latest handsets and a variety of wireless connections, the Internet has already gone mobile, driving us toward a new degree of complexity. The detection of pornography remains an important and significant research problem, since there is great potential to minimize harm to the community. In this paper, we propose a novel approach to investigate and implement a pornography detection technique, working towards a framework for automated detection of pornography based on the most commonly found erotic poses. Compared to the results published in recent works, our proposed approach yields the highest recognition accuracy. © 2013 Springer Science+Business Media.
Local n-grams for author identification : Notebook for PAN at CLEF 2013
- Authors: Layton, Robert , Watters, Paul , Dazeley, Richard
- Date: 2013
- Type: Text , Conference proceedings
- Relation: CEUR Workshop Proceedings
- Full Text:
- Description: Our approach to the author identification task combines existing local n-gram (LNG) authorship attribution methods in a weighted ensemble. This approach came third in this year's competition, using a relatively simple scheme that weights each method by its training-set accuracy. LNG models create profiles consisting of a list of character n-grams that best represent a particular author's writing. The weighted ensemble improved the accuracy of the method without reducing the speed of the algorithm; the submitted solution was not only near the top of the leaderboard in terms of accuracy, but was also one of the faster algorithms submitted.
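The ensemble above weights each base attribution method by its training-set accuracy and takes a weighted vote over predicted authors. A minimal sketch of that combination rule (the base LNG methods are stubbed out; all names and accuracies are hypothetical):

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Combine per-method author predictions by training-accuracy weight.

    predictions: {method_name: predicted_author}
    weights:     {method_name: training_set_accuracy}
    """
    scores = defaultdict(float)
    for method, author in predictions.items():
        scores[author] += weights[method]
    return max(scores, key=scores.get)

# Hypothetical base LNG methods and their training accuracies.
weights = {"scap": 0.81, "cng": 0.74, "rlp": 0.78}
predictions = {"scap": "author_A", "cng": "author_B", "rlp": "author_A"}
print(weighted_vote(predictions, weights))  # author_A (0.81 + 0.78 > 0.74)
```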
Detecting illicit drugs on social media using Automated Social Media Intelligence Analysis (ASMIA)
- Authors: Watters, Paul , Phair, Nigel
- Date: 2012
- Type: Text , Conference paper
- Relation: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Vol. 7672 LNCS, p. 66-76
- Full Text:
- Reviewed:
- Description: While social media is a new and exciting technology, it has the potential to be misused by organized crime groups and individuals involved in the illicit drugs trade. In particular, social media provides a means to create new marketing and distribution opportunities in a global marketplace, often exploiting jurisdictional gaps between buyer and seller. The sheer volume of postings presents investigational barriers, but the platform is amenable to the partial automation of open source intelligence. This paper presents a new methodology for automating the analysis of social media data, and reports two pilot studies of its use for detecting the marketing and distribution of illicit drugs targeted at Australians. Key technical challenges are identified, and the policy implications of the ease of access to illicit drugs are discussed. © 2012 Springer-Verlag.
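The abstract describes partially automating open source intelligence over high-volume post streams. As one deliberately simplistic illustration of such triage, and not the paper's ASMIA methodology, a weighted keyword pass can rank posts for human review:

```python
import re

# Hypothetical watch list; a real system would use curated lexicons,
# slang variants and context models rather than bare keywords.
WATCH_TERMS = {"mdma": 3, "pills": 1, "shipping": 1, "discreet": 2}

def triage_score(post: str) -> int:
    """Crude relevance score: sum the weights of watch terms present."""
    text = post.lower()
    return sum(w for term, w in WATCH_TERMS.items()
               if re.search(rf"\b{re.escape(term)}\b", text))

posts = [
    "Great gig last night, no pills needed!",
    "MDMA available, discreet shipping to AU",
]
for post in posts:
    print(triage_score(post), post)  # scores 1 and 6: flag the second
```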
Recentred local profiles for authorship attribution
- Authors: Layton, Robert , Watters, Paul , Dazeley, Richard
- Date: 2012
- Type: Text , Journal article
- Relation: Natural Language Engineering Vol. 18, no. 3 (2012), p. 293-312
- Full Text:
- Reviewed:
- Description: Authorship attribution methods aim to determine the author of a document using information gathered from a set of documents with known authors. One way of performing this task is to create profiles containing distinctive features known to be used by each author. In this paper, a new method of creating an author or document profile is presented that detects features considered distinctive compared to normal language usage. This recentring approach creates more accurate profiles than previous methods, as demonstrated empirically using a known corpus of authorship problems. The method, named recentred local profiles, determines authorship accurately using a simple 'best matching author' classification rule, and is shown to be more stable than related methods as parameter values change. Using a weighted voting scheme, recentred local profiles outperforms other methods in authorship attribution, with an overall accuracy of 69.9% on the ad-hoc authorship attribution competition corpus, representing a significant improvement over related methods. Copyright © Cambridge University Press 2011.
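The core idea above is to profile an author by how their character n-gram usage deviates from normal language usage ('recentring'), then attribute a document to the best-matching author. A minimal sketch under that description; the trigram length and the similarity function are illustrative assumptions, not the paper's exact formulation:

```python
from collections import Counter

def ngram_freqs(text: str, n: int = 3) -> Counter:
    """Relative frequencies of character n-grams."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values())
    return Counter({g: c / total for g, c in grams.items()})

def recentred_profile(text: str, background: Counter) -> dict:
    """Deviation of a text's n-gram usage from normal language usage."""
    freqs = ngram_freqs(text)
    grams = set(freqs) | set(background)
    return {g: freqs[g] - background[g] for g in grams}

def best_matching_author(doc: str, author_texts: dict, background: Counter):
    """Attribute doc to the author whose recentred profile it matches best."""
    doc_profile = recentred_profile(doc, background)

    def similarity(profile):
        shared = set(profile) & set(doc_profile)
        return sum(profile[g] * doc_profile[g] for g in shared)

    return max(author_texts,
               key=lambda a: similarity(recentred_profile(author_texts[a],
                                                          background)))

# background would be estimated from a large reference corpus, e.g.:
# background = ngram_freqs(" ".join(all_corpus_texts))
```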
Towards an implementation of information flow security using semantic web technologies
- Authors: Ureche, Oana , Layton, Robert , Watters, Paul
- Date: 2012
- Type: Text , Conference proceedings
- Full Text:
- Description: Controlling the flow of sensitive data has been widely acknowledged as a critical aspect of securing web information systems. A common limitation of previous approaches to implementing information flow control is that they propose new scripting languages. This makes them infeasible to apply to existing systems written in traditional programming languages, as those systems would need to be redeveloped in the proposed scripting language. This paper proposes a methodology that offers a common interlingua, through the use of Semantic Web technologies, for securing web information systems independently of their programming language. © 2012 IEEE.
Unsupervised authorship analysis of phishing webpages
- Authors: Layton, Robert , Watters, Paul , Dazeley, Richard
- Date: 2012
- Type: Text , Conference proceedings
- Full Text:
- Description: Authorship analysis of phishing websites enables the investigation of phishing attacks beyond basic analysis. In authorship analysis, salient features from documents are used to determine properties of the author, such as which of a set of candidate authors wrote a given document. In unsupervised authorship analysis, the aim is to group documents such that all documents by one author are grouped together. Applying this to cyber-attacks reveals the size and scope of attacks from specific groups, which in turn allows investigators to focus their attention on specific attacking groups rather than trying to profile multiple independent attackers. In this paper, we analyse phishing websites using the state-of-the-art unsupervised authorship analysis method NUANCE. The results indicate that the method produces clusters that correlate strongly with authorship, as evaluated using expert knowledge and external information, and show an improvement over a previous approach with known flaws. © 2012 IEEE.
Optimization of classifiers for data mining based on combinatorial semigroups
- Authors: Kelarev, Andrei , Yearwood, John , Watters, Paul
- Date: 2011
- Type: Text , Journal article
- Relation: Semigroup Forum Vol. 82, no. 2 (2011), p. 1-10
- Full Text:
- Reviewed:
- Description: The aim of the present article is to obtain a theoretical result essential for applications of combinatorial semigroups for the design of multiple classification systems in data mining. We consider a novel construction of multiple classification systems, or classifiers, combining several binary classifiers. The construction is based on combinatorial Rees matrix semigroups without any restrictions on the sandwich-matrix. Our main theorem gives a complete description of all optimal classifiers in this novel construction. © 2011 Springer Science+Business Media, LLC.
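For readers unfamiliar with the construction named in the abstract, the textbook definition of a Rees matrix semigroup is sketched below; this is the standard definition only, not the paper's specific classifier construction or its sandwich-matrix conventions:

```latex
% Rees matrix semigroup M(S; I, \Lambda; P) over a semigroup S,
% with index sets I, \Lambda and sandwich matrix P = (p_{\lambda i}).
% Elements are triples (i, s, \lambda); multiplication inserts a
% sandwich-matrix entry between the two middle components:
\[
  (i, s, \lambda)\,(j, t, \mu) \;=\; (i,\; s\, p_{\lambda j}\, t,\; \mu),
  \qquad i, j \in I,\quad \lambda, \mu \in \Lambda,\quad s, t \in S.
\]
% The paper combines binary classifiers via such a combinatorial
% construction and characterizes the optimal combinations for arbitrary P.
```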
The factor structure of cybersickness
- Authors: Bruck, Susan , Watters, Paul
- Date: 2011
- Type: Text , Journal article
- Relation: Displays Vol. 32, no. 4 (2011), p. 153-158
- Full Text:
- Reviewed:
- Description: Cybersickness embraces a range of clinical symptoms reported in response to simulated motion in a computer generated, virtual reality environment. The Simulator Sickness Questionnaire (SSQ) has been the standard tool for measuring observed responses; however, many of the observed SSQ variables are highly correlated, so it is not clear which ones are appropriate to use as a basis for building an explanatory model. In this study, responses to the SSQ were analyzed using principal components analysis, and four significant factors - general cybersickness, vision, arousal and fatigue - were identified. An initial interpretation of these factors is provided in the context of a broader cybersickness model, with a view to constructing a new questionnaire with fewer, more focused questions. © 2011 Elsevier B.V. All rights reserved.
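The analysis described above is a principal components analysis of SSQ item responses with four retained components. A minimal scikit-learn sketch of that workflow; the response matrix is a random placeholder, and standardization is an assumed preprocessing choice:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Placeholder: rows = participants, columns = SSQ symptom items (0-3).
responses = np.random.default_rng(0).integers(0, 4, size=(120, 16))

# Standardize items, then extract four components as in the study.
X = StandardScaler().fit_transform(responses.astype(float))
pca = PCA(n_components=4).fit(X)

print("explained variance ratios:", pca.explained_variance_ratio_)
# Loadings (rows of components_) indicate which symptom items define each
# factor, e.g. general cybersickness, vision, arousal and fatigue.
for i, comp in enumerate(pca.components_):
    top_items = np.argsort(np.abs(comp))[::-1][:3]
    print(f"factor {i + 1}: strongest items {top_items.tolist()}")
```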
Zero-day malware detection based on supervised learning algorithms of API call signatures
- Authors: Alazab, Mamoun , Venkatraman, Sitalakshmi , Watters, Paul , Alazab, Moutaz
- Date: 2011
- Type: Text , Conference proceedings
- Full Text:
- Description: Zero-day or unknown malware is created using code obfuscation techniques that modify the parent code to produce offspring copies with the same functionality but different signatures. Techniques reported in the literature to date lack the capability to detect zero-day malware with the required accuracy and efficiency. In this paper, we propose and evaluate a novel method that employs several data mining techniques to detect and classify zero-day malware, with high levels of accuracy and efficiency, based on the frequency of Windows API calls. The paper describes the methodology employed to collect large data sets for training the classifiers, and analyses the performance results of the various data mining algorithms adopted for the study, using a fully automated tool developed in this research to conduct the experimental investigations and evaluation. Through the performance results of these algorithms from our experimental analysis, we evaluate and discuss the advantages of one data mining algorithm over another for accurately detecting zero-day malware. The data mining framework employed in this research learns by analysing the behavior of existing malicious and benign code in large datasets. We employed robust classifiers, namely the Naïve Bayes (NB) algorithm, the k-Nearest Neighbor (kNN) algorithm, the Sequential Minimal Optimization (SMO) algorithm with four different kernels (SMO with normalized polynomial, polynomial, Puk, and radial basis function (RBF) kernels), the backpropagation neural network algorithm, and the J48 decision tree, and evaluated their performance. Overall, the automated data mining system implemented for this study achieved a high true positive (TP) rate of more than 98.5% and a low false positive (FP) rate of less than 0.025, which has not previously been achieved in the literature. This is much higher than the required commercial acceptance level, indicating that our novel technique is a major leap forward in detecting zero-day malware. The paper also offers future directions for researchers exploring different aspects of the obfuscations that are affecting the IT world today. © 2011, Australian Computer Society, Inc.
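The study builds frequency features over Windows API call names and compares several supervised classifiers. A scikit-learn re-sketch of that setup under stand-in models (an RBF SVM for SMO and an entropy-criterion tree for J48; the API-call traces and labels are toy placeholders):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Placeholder traces: space-separated API call names per sample.
traces = [
    "CreateFileW WriteFile RegSetValueExW CreateRemoteThread",
    "CreateFileW ReadFile CloseHandle",
    "VirtualAllocEx WriteProcessMemory CreateRemoteThread",
    "GetMessageW DispatchMessageW CloseHandle",
] * 10  # repeat so cross-validation has enough samples
labels = [1, 0, 1, 0] * 10  # 1 = malicious, 0 = benign (toy labels)

classifiers = {
    "NB": MultinomialNB(),
    "kNN": KNeighborsClassifier(n_neighbors=3),
    "SVM (SMO stand-in)": SVC(kernel="rbf"),
    "entropy tree (J48 stand-in)": DecisionTreeClassifier(criterion="entropy"),
}
for name, clf in classifiers.items():
    # Count API-call frequencies, then train and cross-validate.
    pipe = make_pipeline(CountVectorizer(token_pattern=r"\S+"), clf)
    scores = cross_val_score(pipe, traces, labels, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```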
A new procedure to help system/network administrators identify multiple rootkit infections
- Authors: Lobo, Desmond , Watters, Paul , Wu, Xinwen
- Date: 2010
- Type: Text , Conference paper
- Relation: Paper presented at 2nd International Conference on Communication Software and Networks, ICCSN 2010, Singapore : 26th-28th February 2010 p. 124-128
- Full Text:
- Description: Rootkits refer to software that is used to hide the presence of malware from system/network administrators and permit an attacker to take control of a computer. In our previous work, we designed a system that would categorize rootkits based on the hooks that had been created. Focusing on rootkits that use inline function hooking techniques, we showed that our system could successfully categorize a sample of rootkits using unsupervised EM clustering. In this paper, we extend our previous work by outlining a new procedure to help system/network administrators identify the rootkits that have infected their machines. Using a logistic regression model for profiling families of rootkits, we were able to identify at least one of the rootkits that had infected each of the systems that we tested. © 2010 IEEE.
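The procedure above profiles rootkit families with a logistic regression model over the hooks observed on an infected machine. A minimal sketch of that idea; the hook names, feature encoding and training data are illustrative assumptions:

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each training sample: which hooks a known rootkit created (hypothetical).
hook_profiles = [
    {"inline:NtQuerySystemInformation": 1, "inline:NtQueryDirectoryFile": 1},
    {"ssdt:NtOpenProcess": 1, "ssdt:NtQuerySystemInformation": 1},
    {"iat:FindNextFileW": 1, "inline:NtQueryDirectoryFile": 1},
] * 10
families = ["family_A", "family_B", "family_C"] * 10

model = make_pipeline(DictVectorizer(), LogisticRegression(max_iter=1000))
model.fit(hook_profiles, families)

# Hooks observed on a newly infected system -> ranked family probabilities.
observed = {"inline:NtQueryDirectoryFile": 1, "iat:FindNextFileW": 1}
probs = model.predict_proba([observed])[0]
for family, p in sorted(zip(model.classes_, probs), key=lambda t: -t[1]):
    print(f"{family}: {p:.2f}")
```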
Authorship attribution for Twitter in 140 characters or less
- Authors: Layton, Robert , Watters, Paul , Dazeley, Richard
- Date: 2010
- Type: Text , Conference paper
- Relation: Paper presented at - 2nd Cybercrime and Trustworthy Computing Workshop, CTC 2010 p. 1-8
- Full Text:
- Reviewed:
- Description: Authorship attribution is a growing field, moving from its beginnings in linguistics to recent advances in text mining. Through this change came an increase in the capability of authorship attribution methods, both in their accuracy and in their ability to handle more difficult problems. Research into authorship attribution in the 19th century considered it difficult to determine the authorship of a document of fewer than 1000 words. By the 1990s this value had decreased to less than 500 words, and in the early 21st century it was considered possible to determine the authorship of a document of 250 words. The need for this ever-decreasing limit is exemplified by the trend towards many shorter communications rather than fewer longer ones, such as the move from traditional multi-page handwritten letters to shorter, more focused emails. This trend is also evident in online crime, where many attacks, such as phishing or bullying, are performed using very concise language. Cybercrime messages have long been hosted on Internet Relay Chat (IRC) networks, which allow members to hide behind screen names and connect anonymously. More recently, Twitter and other short-message web services have been used as hosting grounds for online crime. This paper presents evaluations of current techniques and identifies new preprocessing methods that enable authorship to be determined, at rates significantly better than chance, for documents of 140 characters or less, a format popularised by the micro-blogging website Twitter. We show that the SCAP methodology performs extremely well on Twitter messages and, even with restrictions on the types of information allowed, such as the recipients of directed messages, still performs significantly better than chance. Further, we show that 120 tweets per user is an important threshold, at which point adding more tweets per user gives a small but non-significant increase in accuracy. © 2010 IEEE.
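SCAP, referenced above, represents each author by the set of their most frequent character n-grams and attributes a message to the author whose profile overlaps it most. A minimal sketch under that standard definition (profile size and n-gram length are illustrative):

```python
from collections import Counter

def scap_profile(texts, n=3, top_k=500):
    """An author's profile: the set of their top-k character n-grams."""
    counts = Counter()
    for text in texts:
        counts.update(text[i:i + n] for i in range(len(text) - n + 1))
    return {g for g, _ in counts.most_common(top_k)}

def attribute(message, author_profiles, n=3, top_k=500):
    """Assign the message to the author with the largest profile overlap."""
    msg_profile = scap_profile([message], n, top_k)
    return max(author_profiles,
               key=lambda a: len(author_profiles[a] & msg_profile))

profiles = {
    "alice": scap_profile(["heading to the lab, results look promising!"]),
    "bob": scap_profile(["lol no way m8, thats mental haha"]),
}
print(attribute("new results from the lab today", profiles))  # alice
```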
Automatically determining phishing campaigns using the USCAP methodology
- Authors: Layton, Robert , Watters, Paul , Dazeley, Richard
- Date: 2010
- Type: Text , Conference paper
- Relation: Paper presented at General Members Meeting and eCrime Researchers Summit, eCrime 2010 p. 1-8
- Full Text:
- Reviewed:
- Description: Phishing fraudsters attempt to create an environment which looks and feels like a legitimate institution, while at the same time attempting to bypass the filters and suspicions of their targets. This is a difficult compromise for the phishers and presents a weakness in the process of conducting this fraud. In this research, a methodology is presented that examines the differences between phishing websites from an authorship analysis perspective and is able to determine the different phishing campaigns undertaken by phishing groups. The methodology is named USCAP, for Unsupervised SCAP; it builds on the SCAP methodology from supervised authorship analysis and extends it to unsupervised learning problems. The phishing website source code is examined to generate a model that gives the size and scope of each of the recognized phishing campaigns. USCAP represents the first time that phishing websites have been clustered by campaign in an automatic and reliable way, in contrast to previous methods which relied on costly expert analysis of phishing websites. Evaluation indicates that each cluster is strongly consistent, with high stability and reliability when analyzed using new information about the attacks, such as the dates on which the attacks occurred. The clusters found are indicative of different phishing campaigns, presenting a step towards an automated phishing authorship analysis methodology. © 2010 IEEE.
Identifying rootkit infections using data mining
- Authors: Lobo, Desmond , Watters, Paul , Wu, Xinwen
- Date: 2010
- Type: Text , Conference paper
- Relation: Paper presented at 2010 International Conference on Information Science and Applications, ICISA 2010, Seoul, Korea : p. 1-7
- Full Text:
- Description: Rootkits refer to software that is used to hide the presence and activity of malware and permit an attacker to take control of a computer system. In our previous work, we focused strictly on identifying rootkits that use inline function hooking techniques to remain hidden. In this paper, we extend our previous work by including rootkits that use other types of hooking techniques, such as those that hook the IATs (Import Address Tables) and SSDTs (System Service Descriptor Tables). Unlike other malware identification techniques, our approach involved conducting dynamic analyses of various rootkits and then determining the family of each rootkit based on the hooks that had been created on the system. We demonstrated the effectiveness of this approach by first using the CLOPE (Clustering with sLOPE) algorithm to cluster a sample of rootkits into several families; next, the ID3 (Iterative Dichotomiser 3) algorithm was utilized to generate a decision tree for identifying the rootkit that had infected a machine. ©2010 IEEE.
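The two-stage pipeline above first clusters rootkits into families from their hooks and then derives a decision tree to identify which family has infected a machine. A rough analogue with common tooling (scikit-learn has no CLOPE, so k-means stands in for the clustering stage, and an entropy-criterion tree stands in for ID3; hook names are hypothetical):

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction import DictVectorizer
from sklearn.tree import DecisionTreeClassifier

hook_profiles = [
    {"ssdt:NtOpenProcess": 1, "ssdt:NtCreateFile": 1},
    {"iat:FindNextFileW": 1},
    {"ssdt:NtOpenProcess": 1, "ssdt:NtQuerySystemInformation": 1},
    {"iat:FindNextFileW": 1, "iat:RegEnumValueW": 1},
] * 5

vec = DictVectorizer()
X = vec.fit_transform(hook_profiles)

# Stage 1: group rootkits into families (k-means as a CLOPE stand-in).
families = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Stage 2: an interpretable tree (entropy, as in ID3) maps hooks -> family.
tree = DecisionTreeClassifier(criterion="entropy").fit(X.toarray(), families)

new_infection = vec.transform([{"ssdt:NtOpenProcess": 1}])
print("predicted family:", tree.predict(new_infection.toarray())[0])
```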
Internet security applications of the Munn rings
- Authors: Kelarev, Andrei , Yearwood, John , Watters, Paul , Wu, Xinwen , Abawajy, Jemal , Pan, L.
- Date: 2010
- Type: Text , Journal article
- Relation: Semigroup Forum Vol. 81, no. 1 (2010), p. 162-171
- Full Text:
- Reviewed:
- Description: Effective multiple clustering systems, or clusterers, have important applications in information security. The aim of the present article is to introduce a new method of designing multiple clusterers based on the Munn rings and describe a class of optimal clusterers which can be obtained in this construction.
RBACS : Rootkit behavioral analysis and classification system
- Authors: Lobo, Desmond , Watters, Paul , Wu, Xinwen
- Date: 2010
- Type: Text , Conference paper
- Relation: Paper presented at 3rd International Conference on Knowledge Discovery and Data Mining, WKDD 2010, Phuket : 9th-10th January 2010 p. 75-80
- Full Text:
- Description: In this paper, we focus on rootkits, a special type of malicious software (malware) that operates in an obfuscated and stealthy mode to evade detection. Categorizing these rootkits will help in detecting future attacks against the business community. We first developed a theoretical framework for classifying rootkits. Based on our theoretical framework, we then proposed a new rootkit classification system and tested our system on a sample of rootkits that use inline function hooking. Our experimental results showed that our system could successfully categorize the sample using unsupervised clustering. © 2010 IEEE.